Continuous Integration and Deployment (CI/CD)

Overview

The open-source version of PC2v9 is housed at GitHub in the pc2v9 repository under the pc2ccs organization. When any change is merged into the PC2v9 repo (whether on the master branch or the develop branch) a new "build" of the system is automatically generated. However, the build process does not take place on GitHub.

Instead, there is an equivalent pc2ccs project on GitLab. The GitLab project constantly monitors the PC2v9 project on GitHub, and whenever a change is committed it pulls a copy of the updated GitHub repo to GitLab and then invokes a GitLab Continuous Integration/Deployment (CI/CD) pipeline to perform a build.

The PC2v9 GitLab CI/CD pipeline is controlled by a configuration file named .gitlab-ci.yml; this file is part of the PC2v9 repository (on both GitHub and GitLab). The file contains GitLab YAML instructions which build the system from source, run a series of JUnit tests, package the resulting files into a "PC² Distribution", and copy the distribution to the appropriate repository within the PC2CCS project on GitHub.

Builds resulting from changes to the master branch are called releases, and are inserted into the pc2ccs/builds repository on GitHub; builds resulting from changes to the develop branch are called nightly builds and are inserted into the pc2ccs/nightly-builds repository (also on GitHub). As a backup, every build is also copied to the older PC² website maintained in the College of Engineering & Computer Science infrastructure at CSUS.

The PC² GitLab pipeline also invokes a tool called Hugo to generate a new version of the PC² GitHub website containing updated references to the newly-built distribution; see the separate wiki article on Building the PC2 GitHub Website for details.

CI/CD Details

As described above, the CI/CD process for building and deploying a PC² distribution to the PC² webpages runs in a GitLab pipeline driven by the commands (YAML statements) in the .gitlab-ci.yml file. This file defines the process as being split into multiple GitLab pipeline jobs run in three sequential stages: the build stage, the deploy stage, and the update website stage.[1] These stages and their jobs run on a GitLab Ubuntu Linux platform which includes openjdk-8, Hugo, Ant, curl, jq, and httpie.

Initially, several "access tokens" were generated under the PC2CCS group on GitLab, following the instructions given at this GitLab page (note that accessing the PC2CCS GitLab page to generate/manipulate things such as access tokens requires GitLab PC2CCS login credentials). The access tokens were then saved as Variables under the GitLab PC2CCS settings. Access tokens saved ahead of time in this manner include AUTHKEY, used to access the PC² ECS repository; GITHUB_TOKEN, used during HTTP accesses to the GitHub PC² repository; and SSH_PRIVATE_KEY, used during git clone operations which clone PC² GitHub repositories onto the GitLab machine.

Build Stage

The build stage runs in a single GitLab "job" named "job-build". This job runs a script which does the following:

  • Runs Ant on the PC² package.xml file. This command does the following things:
    • Compiles all the PC² project source (including the source for sub-projects such as the Web Team Interface (WTI)).
    • Runs all the project-defined JUnits.
    • Creates a set of distribution artifacts (.zip, .jar, .gz, and .txt files) in a folder named dist.
  • Writes a job-specific ID (called CI_JOB_ID) to the dist folder (for use in the subsequent deploy stage jobs).
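The hand-off of the job ID from the build stage to the deploy stage can be sketched as follows. This is only an illustration: CI_JOB_ID is a predefined GitLab CI variable, but the file name build_job_id.txt and the helper names are hypothetical, and the real pipeline does this from the shell script in .gitlab-ci.yml rather than from Python.

```python
import os
from pathlib import Path

# Hypothetical file name; the actual name used by the pipeline is not given here.
JOB_ID_FILE = "build_job_id.txt"


def save_job_id(dist_dir, env=os.environ):
    """Write the current job's CI_JOB_ID into the dist folder (build stage)."""
    path = Path(dist_dir) / JOB_ID_FILE
    path.write_text(env["CI_JOB_ID"] + "\n")
    return path


def load_job_id(dist_dir):
    """Read the saved job ID back (deploy stage), dropping the trailing newline."""
    return (Path(dist_dir) / JOB_ID_FILE).read_text().strip()
```

Saving the ID as a file inside dist means it travels with the build artifacts, so later jobs can refer back to the job that produced them.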

Deploy Stage

The deploy stage comprises three separate jobs, each of which runs independently (that is, in parallel with the others). The deploy stage jobs run only after the build stage has completed. The three jobs in the deploy stage are named:

  • push release to GitHub
  • push nightly to GitHub
  • push to ECS at CSUS

Each of these jobs is described below.

Job: Push Release to GitHub

This job starts with a GitLab pipeline YAML rule that checks whether the Git commit that triggered this build was a commit on the GitHub master branch. If not, the rule terminates the job (in other words, this job only continues for commits to the master branch -- that is, for changes which result in creating a new "pre-release/release-candidate").

If the job continues, it next executes a series of commands as follows:

  • Ensure that the PC² private SSH key for accessing GitHub is contained in the .ssh folder on the GitLab machine.
  • Use the program ssh-keyscan to fetch the github.com SSH public key and store it in the local known_hosts file.
  • Set values for the user.name and user.email global Git parameters. The user.name value is PC2 bot. Currently, the user.email address is [email protected]; this is planned to change to [email protected] if/when approval for that domain is obtained. [2]
  • Use sed to construct a PC² VERSION number out of the just-built distribution.
  • Use Git to clone the pc2ccs/builds repo from GitHub onto the local machine.
  • Add a new release item to the local builds repo, then push it back to the GitHub builds repo.
  • Use httpie (a tool for doing command-line HTTP operations) to execute an HTTP POST via the GitHub API. This posts a new release, tagged with the PC² version number, in the /releases folder of the PC² GitHub builds repo. The posted release is marked as a prerelease (release candidate).
  • Execute a "for loop" which uses httpie to copy (POST) each .zip, .tar.gz, and (if it exists) sha256.txt and sha512.txt file to the /assets folder under the /releases folder for the current release on the PC² builds repo. jq (a tool for command-line JSON processing, similar to sed) is used to generate the JSON form of the uploaded data.

This completes the upload of the release artifacts to GitHub.
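The two GitHub API requests described above (create a prerelease, then upload each asset) might be prepared roughly as shown below. This is a sketch under assumptions: the pipeline itself uses httpie and jq, not Python, and the helper names are hypothetical. The field names tag_name, name, and prerelease, and the "{?name,label}" suffix on the upload_url returned by GitHub, come from the GitHub REST API for releases.

```python
from urllib.parse import quote


def release_payload(tag):
    """JSON body for GitHub's "create a release" call, marked as a prerelease."""
    return {"tag_name": tag, "name": tag, "prerelease": True}


def asset_upload_url(upload_url_template, filename):
    """Turn the upload_url template from the release response into a concrete URL.

    GitHub returns a URI template ending in "{?name,label}"; the template part
    is dropped here and the asset file name is added as a query parameter.
    """
    base = upload_url_template.split("{", 1)[0]
    return f"{base}?name={quote(filename)}"
```

Each asset is then sent as the body of a POST to the URL returned by asset_upload_url, authenticated with the GITHUB_TOKEN variable.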

Job: Push Nightly to GitHub

This job is nearly identical to the Push Release to GitHub job. The differences are as follows:

  • This job starts with a rule which checks whether the current build was committed to the develop branch (instead of the master branch). If not, the job terminates.
  • Whereas the Push Release job clones the GitHub builds repo (into which it pushes the tagged release and its artifacts), this job clones and uses the nightly-builds GitHub repo.
  • This job uses a slightly different form for the tag that it assigns to the distribution it pushes to the GitHub repo. (The job uses precisely the same sed command to build the VERSION string as the Push Release to GitHub job, but then uses that VERSION string to generate a different variable, TAG_NAME, for the distribution.)
  • Whereas the Push Release job uses the pattern *.tar.gz when selecting artifacts to upload, this job uses the pattern *.gz. (The difference thus being that the Push Release job will only upload tar gzipped files while this job will upload all gzipped files. It's not clear whether there's a good reason for this or not.)

In all other respects the Push Nightly to GitHub job is identical to the Push Release to GitHub job.
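The practical effect of the two glob patterns can be seen with a small sketch. The file list is invented for illustration (only the pc2-9.7build-5183 naming follows the example used elsewhere in this article), and Python's fnmatch stands in for the shell's pattern matching.

```python
from fnmatch import fnmatchcase


def matching(filenames, pattern):
    """Return the file names that match a shell-style glob pattern."""
    return [name for name in filenames if fnmatchcase(name, pattern)]


# Hypothetical contents of the dist folder.
dist_files = [
    "pc2-9.7build-5183.zip",
    "pc2-9.7build-5183.tar.gz",
    "notes.gz",
]

release_uploads = matching(dist_files, "*.tar.gz")  # pattern used by Push Release
nightly_uploads = matching(dist_files, "*.gz")      # pattern used by Push Nightly
```

Here release_uploads contains only the .tar.gz file, while nightly_uploads also picks up notes.gz. If the dist folder never contains a gzipped file that is not a tarball, the two patterns select the same files.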

Job: Push to ECS at CSUS

The purpose of this job (as mentioned above) is to push a backup copy of the newly-built PC² distribution to the ECS infrastructure at CSUS. It performs the following tasks:

  • Uses the sed stream editor to examine the .zip file contained in the just-built distribution in the /dist folder, and from that creates a variable BUILDNUMBER holding the current PC² build number.
  • Constructs a variable BUILD_JOB_ID holding the GitLab job id saved during the build stage.
  • Uses curl to invoke a script named grabgitlabartifacts.cgi residing on the PC² ECS machine, passing to the script the AUTHKEY and the URL back to the GitLab location containing the artifacts for the specified BUILDNUMBER and BUILD_JOB_ID.
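The build-number extraction in the first step can be sketched as below. The regular expression is an assumption based on the zip file naming shown in this article (for example pc2-9.7build-5183.zip); the actual sed expression in .gitlab-ci.yml may differ, and the function name is hypothetical.

```python
import re


def build_number(zip_filename):
    """Extract the digits following "build-" from a distribution zip file name."""
    match = re.search(r"build-(\d+)", zip_filename)
    if match is None:
        raise ValueError(f"no build number found in {zip_filename!r}")
    return match.group(1)
```

The result is kept as a string because it is only ever substituted into URLs and folder names.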

At this point the Push to ECS at CSUS job finishes. However, the operations which it has triggered continue.

Specifically, the grabgitlabartifacts.cgi script, which is stored in the cgi-bin folder on the ECS machine (at https://pc2.ecs.csus.edu/cgi-bin/grabgitlabartifacts.cgi), uses the provided access token and URL to fetch the build artifacts from GitLab. It copies them to a new "build page" housed in a folder named for the build number under the ECS GitLab Builds page, and adds a new entry pointing to the new build page onto the ECS GitLab Builds page. This completes the backup.

Note that each new build is housed in a folder named for the build number (e.g., Build 5183 will be housed in https://pc2.ecs.csus.edu/pc2tug/builds/gitlab/5183). However, the GitLab Builds page displays additional information if the build was done on a branch other than master: for every such build, the links to the build artifacts include a tilde (~) character followed by a representation of the branch name. That representation is the GitLab CI_COMMIT_REF_SLUG, a "URL-friendly" form of the full branch name.

As an example, if Build 5183 was generated as a result of a commit made on a branch named "issue-9", then the name of the downloadable zip file for the build will be pc2-9.7build-5183~issue-9.zip. Branches with more complex names may be altered as defined by the specification for GitLab CI_COMMIT_REF_SLUG values (see this GitLab page for details).
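The naming scheme can be approximated with the sketch below. ref_slug follows GitLab's documented definition of CI_COMMIT_REF_SLUG (lowercase, every character other than a-z and 0-9 replaced with "-", shortened to 63 characters, no leading or trailing "-"). zip_name is a hypothetical helper that simply reproduces the example above; it is not taken from the pipeline.

```python
import re


def ref_slug(ref_name):
    """Approximate GitLab's CI_COMMIT_REF_SLUG for a branch name."""
    slug = re.sub(r"[^a-z0-9]", "-", ref_name.lower())
    return slug[:63].strip("-")


def zip_name(version, build_number, branch):
    """Build the zip file name, adding "~<slug>" for non-master branches."""
    base = f"pc2-{version}build-{build_number}"
    if branch != "master":
        base += "~" + ref_slug(branch)
    return base + ".zip"
```

For example, a branch named "Feature/Add_Scoreboard" would be represented as feature-add-scoreboard after the tilde.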

Update Website Stage

Once all the jobs in the deploy stage have completed, the GitLab pipeline runs a stage named website (a clearer name might be update-website, since that's what happens in this stage). The website stage contains a single job, whose name actually is update website.

Like the deploy stage jobs, the update website job starts by checking to see what branch the commit which generated the current build occurred on. If the branch was either develop OR master, the job continues; otherwise it terminates without doing anything else.
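Taken together, the branch rules described for the deploy and website jobs can be summarized in a small model. This is a sketch, not the pipeline's own logic: it assumes that job-build and the Push to ECS job carry no branch rule (which the ECS naming for non-master branches implies but this article does not state outright), and the function name is hypothetical.

```python
def jobs_for_branch(branch):
    """Return, per stage, the jobs that would do work for a commit on `branch`."""
    jobs = {"build": ["job-build"], "deploy": [], "website": []}
    if branch == "master":
        jobs["deploy"].append("push release to GitHub")
    if branch == "develop":
        jobs["deploy"].append("push nightly to GitHub")
    # Assumed to run for every branch (see the lead-in above).
    jobs["deploy"].append("push to ECS at CSUS")
    if branch in ("master", "develop"):
        jobs["website"].append("update website")
    return jobs
```

Under these assumptions a commit on a feature branch such as "issue-9" produces only a build and an ECS backup, with no GitHub release and no website update.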

The next part of update website operates analogously to the deploy jobs described above: it updates the SSH key store (with a private key allowing access to the PC² github.io website) along with updating the Git user.name and user.email values. After that, update website does the following things:

  • Executes a Python script named populate-releases which populates the JSON files used by Hugo with information on the currently available releases.
  • Clones the PC² GitHub.io website repository onto the local machine in a folder named website.
  • Changes to the website/public folder and runs Hugo in that folder. Hugo reads the contents of the folder and uses the template information therein to update the (local) website contents.
  • Runs sed to fix references to "&" in various HTML files in the local website files.
  • Commits all the changed website files to the (local) git repository and then pushes the changes back to the pc2ccs/github.io repository.

The effect of this job is to update all of the files defining the PC² github.io web site; refreshing any web page in that site will cause it to display links to the newly-created distribution.


[1] Stages in a GitLab CI/CD pipeline run serially (that is, one after the other); the .gitlab-ci.yml file specifies the order in which stages are to be executed. Jobs within a given stage can (and typically do) run in parallel, on separate virtual machines designed for that purpose.

[2] As of this writing (May 2020), the domain pc2.ccs.icpc.global has yet to be created...