Whitepaper on Job-specific Performance Monitoring - RRZE-HPC/DFG-PE GitHub Wiki
Introduction
Requirements
Functional
Technical
Available solutions
Components
Data sources
Hardware Performance Monitoring
Kernel file system
Other
Node agents
Node agents are responsible for triggering measurements and collecting the measured data on each node.
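As an illustration of what a node agent does, the following is a minimal sketch, not a reference implementation. It assumes a Linux node where `/proc/meminfo` serves as one example of a kernel file system data source. The names `parse_meminfo`, `read_meminfo`, and `run_agent`, the record layout, and the fixed sampling interval are all hypothetical choices made for this example.

```python
import time
from typing import Callable, Dict, List


def parse_meminfo(text: str) -> Dict[str, int]:
    """Parse /proc/meminfo-style 'Key:  value [unit]' lines into integers."""
    values: Dict[str, int] = {}
    for line in text.splitlines():
        key, sep, rest = line.partition(":")
        fields = rest.split()
        # Keep only well-formed lines whose first field is a plain number.
        if sep and fields and fields[0].isdigit():
            values[key.strip()] = int(fields[0])
    return values


def read_meminfo(path: str = "/proc/meminfo") -> Dict[str, int]:
    """Example data source (assumption: Linux node with procfs mounted)."""
    with open(path) as handle:
        return parse_meminfo(handle.read())


def run_agent(
    source: Callable[[], Dict[str, int]],
    samples: int,
    interval_s: float,
    clock: Callable[[], float] = time.time,
    sleep: Callable[[float], None] = time.sleep,
) -> List[dict]:
    """Trigger `samples` measurements, `interval_s` seconds apart.

    Each record pairs a timestamp with the metrics returned by `source`.
    """
    collected: List[dict] = []
    for index in range(samples):
        collected.append({"timestamp": clock(), "metrics": source()})
        if index + 1 < samples:
            sleep(interval_s)
    return collected
```

On a Linux node, `run_agent(read_meminfo, samples=3, interval_s=10.0)` would return three timestamped records. Passing the clock and sleep functions as parameters keeps the sampling loop testable without waiting in real time. Forwarding the records to a central collector is left out of the sketch.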
Data collection
Protocols and solutions for collecting the measured data from the nodes.
Databases
Database solutions and variants.