Jenkins - GiselleSerate/pandorica GitHub Wiki

The SP-Solutions Jenkins server automates running the code daily and testing it when new changes are pushed.

Autorun

Background

The run_pandorica job on the SP-Solutions Jenkins server runs in a master-slave configuration, with an agent configured on the SafeNetworking server. Each step first sources the virtual environment, then runs one part of the pipeline. Taken together, the major steps are equivalent to running python src/pandorica.py, but I run each file separately in the Jenkins job to isolate each piece (for rerunning, analyzing failures, etc.).
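As a rough illustration of why the steps are split, here is a minimal Python sketch that runs the three scripts one at a time and reports which step failed first. It is not taken from the Jenkinsfile: the script names come from the Process section of this page, while the src/ prefix, the run_steps helper, and the runner parameter are assumptions made for the example.

```python
import subprocess
import sys

# Script names are from this page; the src/ prefix is an assumption.
STEPS = [
    ("parse", "src/notes_parser.py"),
    ("tag", "src/domain_processor.py"),
    ("calculate intervals", "src/interval_calculator.py"),
]


def run_steps(steps=STEPS, runner=None):
    """Run each step in order and stop at the first failure.

    Returns the name of the failed step, or None if every step succeeded.
    `runner` takes a script path and returns an exit code; it is a
    hypothetical hook that makes the function easy to try without
    actually launching the scripts.
    """
    if runner is None:
        def runner(script):
            # Launch the script with the current interpreter.
            return subprocess.run([sys.executable, script]).returncode
    for name, script in steps:
        if runner(script) != 0:
            return name
    return None
```

Calling run_steps() with no arguments would launch the real scripts. Because each step is a separate invocation, a failure in tagging, for example, can be rerun without parsing again.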

The job runs twice daily so that, regardless of when AV updates are deployed, we always promptly parse the latest update and write it to Elasticsearch. It is always safe to run this script more often than necessary: the only cost is extra compute time. Do not run multiple instances in parallel, though. You might be able to tag and calculate residence intervals at the same time, but I strongly recommend against it because race conditions are almost guaranteed.
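If you run the script by hand outside Jenkins, one way to guard against accidental parallel runs is a lock file. The sketch below is an assumption on my part, not something Pandorica ships: single_instance and the lock path are hypothetical names, and a run that is killed abruptly would leave a stale lock behind that has to be deleted manually.

```python
import os
from contextlib import contextmanager


@contextmanager
def single_instance(lock_path):
    """Hold an exclusive lock file while the body of the with-block runs.

    Raises RuntimeError if the lock file already exists.
    """
    try:
        # O_CREAT | O_EXCL fails atomically if the file already exists.
        fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        raise RuntimeError(f"another run appears to hold {lock_path}")
    try:
        # Record the owning process ID to help debug stale locks.
        os.write(fd, str(os.getpid()).encode())
        yield
    finally:
        os.close(fd)
        os.remove(lock_path)
```

A run would then wrap its work in with single_instance("/tmp/pandorica.lock"): (path hypothetical), so a second run started in the meantime fails immediately instead of racing the first.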

This job is configured through pandorica/Jenkinsfile.

Process

Parse

Calls notes_parser.py, which downloads the latest notes and parses/writes to Elasticsearch anything that hasn’t already been written.
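The "anything that hasn't already been written" behavior is what makes reruns safe. A minimal sketch of that kind of filter, assuming each update can be identified by some version string (the function name and the identifiers in the usage note are hypothetical, and the real parser may decide this differently):

```python
def versions_to_write(available, already_written):
    """Return the versions in `available` that are not yet written.

    Order of `available` is preserved so updates are written in sequence.
    """
    written = set(already_written)
    return [version for version in available if version not in written]
```

For example, versions_to_write(["v1", "v2", "v3"], ["v1", "v2"]) returns ["v3"], and running it again after "v3" is written returns an empty list.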

Tag

Calls domain_processor.py to tag all untagged domains in the database with information from AutoFocus.

Calculate intervals

Calls interval_calculator.py to calculate all residence/reinsert intervals of domains in the database that haven’t had this information calculated yet.

Test

Background

The test_pandorica job on the SP-Solutions Jenkins server runs an integration test on the latest changes to the develop branch. (We poll the develop branch of https://github.com/GiselleSerate/pandorica for changes every 15 minutes.)

The test itself is configured through pandorica/src/test/Jenkinsfile (the Jenkinsfile at root is for actually running Pandorica).

Process

Setup/build

We build a Pandorica Docker image from the Dockerfile at root. We also build a custom ELK image (based on sebp/elk with some extra config on top). Then, we start both of them with Docker Compose (according to pandorica/src/test/docker-compose.yaml).

We don’t load a .panrc, but we add the AUTOFOCUS_API_KEY as a Jenkins environment variable.

Test

We scrape domains out of the predownloaded, abridged version notes found in pandorica/src/test/Updates_3026-3536.html. We write them to the ELK instance in the Docker container and check that they are there. Finally, we use AutoFocus to tag everything that we can and assert that enough domains in the ELK instance were tagged.

(Note that we do not test downloading new notes off the engineering tools server.)
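The final "enough domains were tagged" check can be pictured as a threshold on the fraction of tagged documents. This is only a sketch: the "tags" field name, the enough_tagged helper, and the 0.5 default are assumptions for illustration, not the values the real test uses.

```python
def enough_tagged(domains, min_fraction=0.5):
    """Return True if at least `min_fraction` of the domains have a tag.

    `domains` is a list of dicts; a domain counts as tagged when its
    hypothetical "tags" field is present and non-empty.
    """
    if not domains:
        # Nothing was written, so the tagging step cannot have succeeded.
        return False
    tagged = sum(1 for domain in domains if domain.get("tags"))
    return tagged / len(domains) >= min_fraction
```

A threshold rather than an exact count makes sense here because AutoFocus may not have information for every domain in the sample notes.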

Post

As cleanup, we remove both Docker containers to make sure everything is clean for the next run.