Performance Monitoring and Analysis of Cluster Systems


Monday 26.08.2019, 14:00 - 17:30


Heyne-Haus, Papendieck 16, 37073 Göttingen, room 2/right


For a long time, hardware performance monitoring was used on a small scale to measure and analyze data from single application runs in order to detect performance limitations caused by hardware and/or software. Monitoring the whole cluster for hardware failures has been the duty of system administrators, with an emphasis on system operation and changes in system parameters. In recent years, many HPC providers have extended or replaced their monitoring systems to additionally track performance data from hardware monitoring facilities and even from the applications themselves. Analysis of this data provides deeper insight into resource utilization and software quality. In addition, system administrators use performance data to trace the causes of system instabilities back to specific user codes. Due to the diversity of HPC centers, many tailored solutions for collection, storage, evaluation and visualization exist today. The workshop aims to bring together developers and users of such infrastructure in order to find ways to collaborate and to exchange ideas for further development.

Workshop Chairs

  • Thomas Gruber, Friedrich-Alexander-University Erlangen-Nuremberg (FAU), Erlangen Regional Computing Center (RRZE), Erlangen, Germany
  • Anthony Danalis, University of Tennessee, Department of Electrical Engineering and Computer Science, Knoxville, USA


14:00 - 15:00
Stephane Eranian, Google, Mountain View, CA, USA
Keynote TBA

15:00 - 15:30
Luka Stanisic and Klaus Reuter
MPCDF HPC Performance Monitoring System: Enabling Insight via Job-Specific Analysis 

15:30 - 16:00 Coffee Break

16:00 - 16:30
Philipp Neumann
Sparse Grid Regression for Performance Prediction Using High-Dimensional Run Time Data

16:30 - 17:00
Gence Ozer, Sarthak Garg, Neda Davoudi, Gabrielle Poerwawinata, Matthias Maiterth, Alessio Netti and Daniele Tafani
Towards a Predictive Energy Model for HPC Runtime Systems Using Supervised Learning

17:00 - 17:30
Saurav Nanda, Ganapathy Parthasarathy, Parivesh Choudhary and Arun Venkatachar
Resource Aware Scheduling for EDA Regression Jobs



