2025.03.24 - ovis-hpc/ovis-wiki GitHub Wiki

Slides: https://github.com/ovis-hpc/ovis-wiki/blob/main/LDMS-UG-03-25-2024.pdf

LDMSCON2025 News

LDMS V4.5.1

  • Selected New Features
    • LDMS Rails
    • Multiple Configuration Plugins
    • Logging Infrastructure Improvement
    • IPv6 Support in LDMS Transports
    • LDMS Message (ldms_msg)
  • Additional Improvements
    • LDMS Daemon Performance Monitoring -- Stats commands (e.g., store_time_stats, thread_stats) provide more information
    • Exclusive thread feature for samplers that take longer to execute
  • In development
    • Storage Scalability Improvement: Draft PR1632
    • Introduce Read the Docs to the source tree: PR1599

Slurm Job ID Tracking Discrepancy

  • Ben S. found that the JOB IDs stored in the LDMS database for Slurm jobs differ from the JOB IDs shown by sacct
  • A side conclusion: the JOB ID should be a string
  • A meeting will be scheduled to discuss JOB information in ldmsd, covering:
    • What is the description of the JOB ID field, i.e., what does it represent? Answering this will help determine which data from job schedulers ldmsd should use to fill the field
    • The JOB ID field should be a string. What guidelines and policies should handle different situations, e.g., a repeated JOB ID when a Slurm job is rescheduled, duplicate JOB IDs produced by Flux, and hierarchical job scheduling?
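
As input to that discussion, here is a minimal Python sketch of one way a string JOB ID could be combined with other fields so that repeated or duplicate IDs stay distinguishable. It is not how ldmsd works today. The `JobKey` type, its fields, the `index_jobs` helper, and all sample values are hypothetical and made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class JobKey:
    """Hypothetical composite key; none of these names exist in ldmsd."""
    scheduler: str   # e.g. "slurm" or "flux" (illustrative labels)
    job_id: str      # kept as an opaque string, never parsed as an integer
    start_time: int  # epoch seconds; separates reuses of the same job_id


def index_jobs(records):
    """Group (scheduler, job_id, start_time, payload) tuples by JobKey."""
    jobs = {}
    for scheduler, job_id, start_time, payload in records:
        key = JobKey(scheduler, str(job_id), int(start_time))
        jobs.setdefault(key, []).append(payload)
    return jobs


# Made-up records: the ID text is not a plain integer, and it repeats.
records = [
    ("slurm", "1234_5", 1000, "sample-a"),
    ("slurm", "1234_5", 1000, "sample-b"),  # same run, same key
    ("slurm", "1234_5", 2000, "sample-c"),  # same ID after a reschedule
    ("flux", "1234_5", 1000, "sample-d"),   # same ID text, other scheduler
]
jobs = index_jobs(records)
```

Keeping `job_id` as a string avoids assuming any scheduler's ID format. The extra fields are only one possible policy for telling apart a rescheduled job and IDs from different schedulers. Which fields actually belong in the key is the open question listed above.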