2023.12.18 - ovis-hpc/ovis-wiki GitHub Wiki
Version 4.4.1 Release
- A decision has been made to release OVIS-4.4.1 with a known race condition. The code has been running stably on production machines for at least 6 months. V4.4.1 has bug fixes and improvements in stats commands. It has no core functionality changes from the previous version, v4.3.1.
- The known bug is a race condition observed in a test at a scale of 140,000 sets aggregated to one LDMSD, with three aggregation levels, while the connections between the sampler daemons and the 1st-level aggregators were repeatedly disconnected and reconnected. It did not surface when there was only one aggregation level.
Version 4.5.1 Testing Plan
- We plan to address the race condition mentioned above in V4.5.1.
- V4.5.1 contains significant new capabilities; hence, the version moves from V4.4.1 to V4.5.1.
A path forward for LDMSD's samplers
- We will brainstorm a path forward for LDMSD's samplers in a future Best Practices discussion.
- Chris brought up the direction of LDMSD's samplers for handling multiple devices or jobs:
  - Creating a single metric set that contains arrays and/or lists of records.
  - Creating a separate metric set for each device or job.
- Jim mentioned that we are adding support for configuring a plugin with multiple configurations. This could be another option for LDMSD's samplers.
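
To make the two layouts Chris raised easier to compare, here is a minimal sketch in plain Python. It does not use the LDMS API; the function names, the `set_name` convention, and the device and metric names (`gpu0`, `util`, `mem_used`) are all hypothetical and chosen only for illustration.

```python
# Illustrative only: plain Python data structures, not the LDMS API.
# All names below (functions, set names, devices, metrics) are made up.

def single_set_layout(producer, readings):
    """One metric set whose 'records' list holds one record per device."""
    return {
        "set_name": f"{producer}/devices",
        "records": [
            {"device": dev, **metrics}
            for dev, metrics in sorted(readings.items())
        ],
    }


def per_device_layout(producer, readings):
    """One metric set per device, each with a flat list of metrics."""
    return [
        {"set_name": f"{producer}/{dev}", **metrics}
        for dev, metrics in sorted(readings.items())
    ]


# Hypothetical readings from two devices on one node.
readings = {
    "gpu0": {"util": 91, "mem_used": 1024},
    "gpu1": {"util": 12, "mem_used": 256},
}

one = single_set_layout("node1", readings)
many = per_device_layout("node1", readings)

# The single-set layout always yields one set; the per-device layout
# yields as many sets as there are devices.
print(len(one["records"]), len(many))
```

In this sketch the single-set layout keeps the number of sets fixed while the record list grows, whereas the per-device layout keeps each set flat but grows the number of sets with the number of devices.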