Maintenance - AndersenLab/CAENDR GitHub Wiki

This page describes ongoing maintenance tasks for the site.

Clear Out Old Cloud Run Jobs

Problem

Currently, to run one of the site tools, we create a new Cloud Run Job that computes that specific tool for that specific dataset. This effectively creates a new Job for each tool submission. These Jobs simply run the tool code - all of their data is stored elsewhere in GCP. GCP does not preserve logs for older jobs, so these old Job records are essentially empty shells.

GCP Cloud Run Jobs are used to run a set of code to completion, as opposed to Cloud Run Services, which are available continuously to handle requests as needed. For a more detailed explanation, see the section GCP: Cloud Run on the page Cloud-Integration.

GCP limits the number of Cloud Run Jobs that can be created in one region / account. Eventually, our account will fill up with old Job records that hold no job data. Once this happens, you will not be able to run new jobs through the site.

Current Solution

For now, the best solution is to delete old Cloud Run Job records:

  • In the GCP Console, navigate to "Cloud Run" > "Jobs".
  • Select the jobs you want to delete. Some thoughts:
      • When I cleared this earlier, I sorted by creation date and selected all "Succeeded" jobs from 2023.
      • You might instead consider sorting by name (and therefore by tool), last execution date, etc., and manually selecting everything older than 30 days.
  • Click "Delete".
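Before deleting in the console, it can help to generate the candidate list programmatically. Below is a minimal sketch, assuming the `google-cloud-run` client library; the `is_stale` helper, the 30-day cutoff, and the `YOUR_PROJECT` / `YOUR_REGION` placeholders are illustrative, not part of the CAENDR codebase:

```python
# Sketch: print the names of Cloud Run Jobs created more than 30 days ago,
# as candidates for manual deletion in the console. Read-only - nothing is
# deleted here.
from datetime import datetime, timedelta, timezone

MAX_AGE_DAYS = 30

def is_stale(create_time, now=None, max_age_days=MAX_AGE_DAYS):
    """Return True if a job's (tz-aware) creation time is older than the cutoff."""
    now = now or datetime.now(timezone.utc)
    return create_time < now - timedelta(days=max_age_days)

if __name__ == "__main__":
    from google.cloud import run_v2  # pip install google-cloud-run

    client = run_v2.JobsClient()
    parent = "projects/YOUR_PROJECT/locations/YOUR_REGION"  # placeholders
    for job in client.list_jobs(parent=parent):
        # job.create_time is a tz-aware datetime from the Jobs API
        if is_stale(job.create_time):
            print(job.name)  # candidate for deletion
```

Note this filters by age only; the console's "Succeeded" column reflects the last execution, which would require a separate execution lookup to check programmatically.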

Suggestions

Some thoughts for how this could be automated in the future:

  • In the API call that checks whether a job is complete, you could delete the Cloud Run Job once it has succeeded.
  • You could run a scheduled job every month that clears out everything created more than 30 days ago.
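The monthly-sweep idea above could look roughly like the following. This is a hedged sketch, not existing CAENDR code: `sweep_old_jobs` is a hypothetical name, the `google-cloud-run` (`run_v2`) client is assumed, and the project/region values are placeholders.

```python
# Sketch: delete every Cloud Run Job under `parent` created before the cutoff.
# Could be triggered monthly, e.g. by Cloud Scheduler.
from datetime import datetime, timedelta, timezone

def sweep_old_jobs(client, parent, max_age_days=30, now=None):
    """Delete stale jobs and return the deleted job names.

    `client` is expected to expose list_jobs(parent=...) and
    delete_job(name=...), matching run_v2.JobsClient.
    """
    now = now or datetime.now(timezone.utc)
    cutoff = now - timedelta(days=max_age_days)
    deleted = []
    for job in client.list_jobs(parent=parent):
        if job.create_time < cutoff:
            # Safe to delete: the Job record stores nothing - tool output
            # lives elsewhere in GCP.
            client.delete_job(name=job.name)
            deleted.append(job.name)
    return deleted

if __name__ == "__main__":
    from google.cloud import run_v2  # pip install google-cloud-run

    sweep_old_jobs(
        run_v2.JobsClient(),
        parent="projects/YOUR_PROJECT/locations/YOUR_REGION",  # placeholders
    )
```

Because the client is passed in, the sweep logic can be exercised with a fake client in tests before pointing it at the real account.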

Again - the Cloud Run Job objects don't store anything important; everything is stored elsewhere in GCP.
