Jenkins For Biocache Store And Other LA Tasks - AtlasOfLivingAustralia/documentation GitHub Wiki

Introduction

This page explains how to use Jenkins as a task manager for biocache-store jobs and other LA operations.

Running biocache-store from jenkins has some advantages:

  • You have a history of biocache-store commands, so it is easy to compare durations, errors and outputs.
  • Other team members can see which operations were run or are running in your biocache-store, along with their logs.
  • You can enrich the biocache-store output by adding color to logs, summaries, etc.
  • You can receive notifications (HTML, email, etc.) when a job fails or ends successfully, or trigger other jobs.
  • You can automate your different biocache-store jobs.

Installation

You can install Jenkins on Ubuntu, Debian and derivatives with:

wget -q -O - https://pkg.jenkins.io/debian/jenkins.io.key | sudo apt-key add -
sudo sh -c 'echo deb http://pkg.jenkins.io/debian-stable binary/ > /etc/apt/sources.list.d/jenkins.list'
sudo apt update
sudo apt install jenkins

Jenkins can be installed on the same server as biocache-store or on a different one. In the latter case, you will need to call biocache-store tasks through ssh, a Jenkins agent or similar.
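For the remote case, a job's shell step could wrap the ssh call roughly as follows. This is a minimal sketch, not a tested setup: the host `biocache.example.org`, the remote user and the `DRY_RUN` switch are hypothetical, and `biocache ingest -dr dr615` is only an example invocation to check against your installed biocache-store version.

```shell
#!/bin/sh
set -eu

# Hypothetical values: adjust to your own infrastructure.
REMOTE_HOST="${REMOTE_HOST:-biocache.example.org}"
REMOTE_USER="${REMOTE_USER:-jenkins}"
DRY_RUN="${DRY_RUN:-1}"   # 1 = only print the ssh command, 0 = execute it

# Run a biocache-store command on the remote server through ssh.
run_remote_biocache() {
  remote_cmd="biocache $*"
  if [ "$DRY_RUN" = "1" ]; then
    echo "ssh ${REMOTE_USER}@${REMOTE_HOST} ${remote_cmd}"
  else
    # BatchMode makes ssh fail instead of waiting for a password prompt
    # that nobody can answer inside a Jenkins build.
    ssh -o BatchMode=yes "${REMOTE_USER}@${REMOTE_HOST}" "${remote_cmd}"
  fi
}

run_remote_biocache ingest -dr dr615
```

With `DRY_RUN=0` the same step needs key-based ssh access from the Jenkins server to the biocache-store server.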

Permissions

If you run biocache-store jobs as the jenkins user, you will need to change the ownership of these LA directories:

  • /data/biocache/
  • /data/solr/
  • /data/biocache-load/

This is not needed if you use a different approach (like sudo).
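Before changing anything, it can help to see which of those directories already belong to the Jenkins account. The sketch below only reports and suggests a `chown` command; it assumes the account is called `jenkins` and that a recursive ownership change is what you want, which may not fit every installation.

```shell
#!/bin/sh
set -eu

# Assumption: Jenkins runs as the "jenkins" system user.
JENKINS_USER="${JENKINS_USER:-jenkins}"

# Report whether a directory is owned by the Jenkins user, without changing anything.
check_owner() {
  dir="$1"
  if [ ! -d "$dir" ]; then
    echo "MISSING $dir"
    return 0
  fi
  # Third column of "ls -ld" is the owner name.
  owner="$(ls -ld "$dir" | awk '{print $3}')"
  if [ "$owner" = "$JENKINS_USER" ]; then
    echo "OK $dir"
  else
    echo "FIX $dir (owner is $owner): sudo chown -R $JENKINS_USER $dir"
  fi
}

for d in /data/biocache/ /data/solr/ /data/biocache-load/; do
  check_owner "$d"
done
```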

Tasks

In general, we add new jobs to Jenkins through "New item" in the sidebar, choosing "Freestyle project".

As many biocache-store tasks need parameters (such as a data resource number), you will need to use a parameterized build.
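Jenkins passes build parameters to shell build steps as environment variables, so a parameterized job can validate its input before calling biocache-store. The sketch below assumes a string parameter named `DR_UID` (a hypothetical name) and only prints the command it would run; the `biocache ingest -dr` invocation is an example to adapt.

```shell
#!/bin/sh
set -eu

# DR_UID is a hypothetical string parameter defined in the job configuration.
# The default is only there so the sketch can be tried outside Jenkins.
DR_UID="${DR_UID:-dr615}"

# Accept only identifiers such as dr615, so a typo or stray text
# never reaches the biocache-store command line.
is_valid_dr() {
  printf '%s\n' "$1" | grep -Eq '^dr[0-9]+$'
}

if ! is_valid_dr "$DR_UID"; then
  echo "Invalid data resource: $DR_UID" >&2
  exit 1
fi

# Printed rather than executed; drop the echo on a server with biocache-store.
echo biocache ingest -dr "$DR_UID"
```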

Some ALA jenkins ingestion tasks

Quoting @djtfmartin on the ALA task workflow:

ALA doesn't use the ingest for large datasets (...) The way ALA uses it is we load datasets during the week, but we have jenkins jobs that twice a week that run processing, sampling and index everything
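A scheduled job along those lines could run the three stages in sequence and stop at the first failure. This is only a sketch of the idea, not ALA's actual job: the subcommand names `process`, `sample` and `index` and the absence of extra flags are assumptions to check against your biocache-store version, and a schedule such as `H 2 * * 2,5` in "Build periodically" is just one hypothetical way to get two runs a week.

```shell
#!/bin/sh
set -eu

# Assumption: "biocache" is the biocache-store CLI. The default below only
# echoes each command so the sketch can be tried without an installation.
BIOCACHE_CMD="${BIOCACHE_CMD:-echo biocache}"

# Run the stages in order for everything loaded during the week.
run_stages() {
  for stage in process sample index; do
    echo "== $stage started"
    # Stop at the first failing stage, so indexing never runs
    # on top of a failed processing or sampling step.
    $BIOCACHE_CMD "$stage" || { echo "== $stage failed" >&2; return 1; }
    echo "== $stage finished"
  done
}

run_stages
```

Setting `BIOCACHE_CMD=biocache` in the job would turn the dry run into real calls.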

Recommended plugins

Some Jenkins plugins we recommend to improve its use with biocache-store and similar jobs:

| Plugin | Description | Comments |
|---|---|---|
| AnsiColor | Adds ANSI coloring to the Console Output | Useful to enrich the biocache jobs with colors using grc, so it is easier to spot ERROR, WARN, etc. log messages |
| Build Name and Description Setter | This plug-in sets the display name and description of a build to something other than #1, #2, #3 | Instead of #number, we can name a job "ingest dr615" or similar |
| HTML5 Notifier Plugin | The HTML5 Notifier Plugin provides W3C Web Notifications support for builds. | You can receive job notifications in your browser thanks to this plugin |
| Log Parser Plugin | Parses the console log generated by a build | Generates a summary of the ERROR, WARN and INFO messages of a job |
| Mailer Plugin | This plugin allows you to configure email notifications for build results | For email notifications to your team |
| Email Extension Plugin | This plugin is a replacement for Jenkins's email publisher. It allows to configure every aspect of email notifications: when an email is sent, who should receive it and what the email says | A more advanced alternative to the Mailer Plugin |
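The AnsiColor plugin renders ANSI escape sequences found in the console output. grc produces them from its own configuration files; as a dependency-free alternative (not the grc setup mentioned in the table), a small awk filter can color lines by log level. The sample log lines below are made up for illustration.

```shell
#!/bin/sh
set -eu

# Wrap lines containing ERROR in red and lines containing WARN in yellow,
# using ANSI escape codes; all other lines pass through unchanged.
colorize_levels() {
  awk '
    /ERROR/ { printf "\033[31m%s\033[0m\n", $0; next }
    /WARN/  { printf "\033[33m%s\033[0m\n", $0; next }
            { print }
  '
}

# In a real job you would pipe the biocache-store command into the filter.
printf '%s\n' 'INFO  loading dr615' 'WARN  missing field' 'ERROR record rejected' | colorize_levels
```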

Adding tests to our jobs

You can add extra tests to your jobs, for instance to your ingest jobs, so that data processing tasks get additional checks.

The scripts we use for this are unofficial and replace manual checks; please feel free to improve them, or add more and document them here.
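As an illustration of what such a check might look like (a hypothetical example, not one of the scripts referred to above), a final build step could fail the job when the log contains more ERROR lines than a chosen threshold. The log content and the threshold are made up.

```shell
#!/bin/sh
set -eu

# Count the lines of a log file that contain a given level keyword.
count_level() {
  # grep -c exits with 1 when the count is 0, so neutralise that for set -e.
  grep -c "$1" "$2" || true
}

# Return non-zero when the log holds more ERROR lines than allowed.
check_log() {
  log_file="$1"
  max_errors="${2:-0}"
  errors="$(count_level ERROR "$log_file")"
  warnings="$(count_level WARN "$log_file")"
  echo "errors=$errors warnings=$warnings"
  [ "$errors" -le "$max_errors" ]
}

# Demo on a made-up log; a real job would pass its own log file.
demo_log="$(mktemp)"
printf '%s\n' 'INFO start' 'WARN slow query' 'INFO done' > "$demo_log"
check_log "$demo_log" 0
rm -f "$demo_log"
```

Because a non-zero exit status marks a shell build step as failed, `check_log` as the last command is enough to turn the build red.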

Screenshots

Sample of jobs in gbif.es jenkins:

Ingestion logs with Build Name Setter plugin:

Logs with colors:

Logs summary per job: