Next week we plan to upgrade gitlab.spack.io to the latest patch release (v16.1.2).
Metrics & Dashboarding
Ryan's PR to add more fine-grained timers to spack install is going well. We are hoping to merge it soon.
We are also working on ingesting this new data into OpenSearch and using it to publish new Grafana dashboards.
Jake and Alec are going to meet next week to discuss strategies for storing CI metrics more centrally.
cache.spack.io now shows results for our weekly snapshot mirrors.
CI Status
Scott opened PR #38866 to unconditionally run the protected-publish job in our protected pipelines. This will fix the problem where the top-level mirror is not always up-to-date with the results from the individual stack-specific mirrors.
Scott also discovered that many of the no-binary-for-spec failures we've seen lately may be due to a DeleteOldObjects lifecycle policy that was configured to delete objects from the PR mirror after 14 days. This has since been disabled.
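To catch this kind of expiration rule in the future, a bucket's lifecycle configuration can be scanned for enabled rules that delete objects after a fixed number of days. The sketch below is only an illustration: it assumes the configuration is a dict in the shape that boto3's get_bucket_lifecycle_configuration returns, and the function name, the "pr/" prefix, and the second rule in the sample data are hypothetical.

```python
def find_expiring_rules(lifecycle_config, max_days=None):
    """Return (rule_id, prefix, days) for enabled rules that expire objects.

    lifecycle_config is assumed to look like the response of boto3's
    s3.get_bucket_lifecycle_configuration(Bucket=...).
    If max_days is given, only rules expiring within that many days are kept.
    """
    hits = []
    for rule in lifecycle_config.get("Rules", []):
        if rule.get("Status") != "Enabled":
            continue  # disabled rules never delete anything
        days = rule.get("Expiration", {}).get("Days")
        if days is None:
            continue  # rule does not expire objects by age
        if max_days is not None and days > max_days:
            continue
        # Newer rules carry the prefix under "Filter"; older ones at top level.
        prefix = rule.get("Filter", {}).get("Prefix", rule.get("Prefix", ""))
        hits.append((rule.get("ID", "<unnamed>"), prefix, days))
    return hits


# Hypothetical sample data; the prefix and the second rule are made up.
example_config = {
    "Rules": [
        {
            "ID": "DeleteOldObjects",
            "Status": "Enabled",
            "Filter": {"Prefix": "pr/"},
            "Expiration": {"Days": 14},
        },
        {
            "ID": "AbortStaleUploads",
            "Status": "Enabled",
            "Filter": {"Prefix": ""},
            "AbortIncompleteMultipartUpload": {"DaysAfterInitiation": 7},
        },
    ]
}

for rule_id, prefix, days in find_expiring_rules(example_config):
    print(f"{rule_id}: deletes objects under '{prefix}' after {days} days")
```

Running a check like this against each mirror bucket would flag any rule that could remove binaries that PR pipelines still expect to find.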
Buildcache Pruning
Ryan discovered that his pruning script was sometimes receiving incomplete results from the GitLab API. He's updating this script to directly query GitLab's database for the list of jobs to fetch instead.
Other topics
Alec will be working with a student this summer to investigate job scheduling & performance in our GitLab CI pipelines.
Priorities
Finish timing data PR and start working on subsequent dashboards
Upgrade gitlab.spack.io to the latest patch release
Migrate GitLab's minio volume from gp2 to gp3
Manually run the pruning script for our develop buildcache and continue to work on automating this task.
Investigate why our GitLab Sidekiq pods die and get restarted somewhat frequently. Perhaps increasing resource requests will reduce this error rate?
Update gitlab.spack.io to use S3 and ElastiCache rather than minio and redis.
Update the sync script to merge topic branches against their base branch instead of assuming that it is always develop (necessary for release branch PRs).
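For the sync script item, one possible approach is to read the merge target from the pull request metadata instead of hard-coding develop. The sketch below is not the actual sync script: it assumes the PR is available as a dict shaped like a GitHub pull request API response (which includes base.ref and head.sha), and the function name, remote name, and sample values are hypothetical.

```python
def merge_commands(pr, remote="origin"):
    """Build the git commands that merge a PR head into its own base branch.

    pr is assumed to be a dict shaped like a GitHub pull request API
    response. Falls back to "develop" only when no base is recorded.
    """
    base = pr.get("base", {}).get("ref") or "develop"
    head_sha = pr["head"]["sha"]
    return [
        ["git", "fetch", remote, base],
        # Start from the tip of the base branch without touching local branches.
        ["git", "checkout", "--detach", f"{remote}/{base}"],
        # Create the merge commit that CI should test.
        ["git", "merge", "--no-ff", "--no-edit", head_sha],
    ]


# Hypothetical PR targeting a release branch; the values are illustrative only.
release_pr = {"base": {"ref": "releases/v0.20"}, "head": {"sha": "abc123"}}

for cmd in merge_commands(release_pr):
    print(" ".join(cmd))
```

The commands are returned as argument lists so they could be passed to subprocess.run one at a time, and so the branch selection can be checked without a git checkout.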