For administrators - SydneyBioX/SC_CPC_workshop GitHub Wiki

Google Cloud access

Contact either Kevin or Jean for access to Google Cloud. Set up a Gmail account before you do so.

Workshop flowchart

This flowchart gives an overview of the whole pipeline that Kevin has designed as of October 2019.

Understanding the Google Cloud build

At the time of writing, every push to the master branch of this GitHub repo triggers a Docker image build in Google Cloud. Inside the Dockerfile, you should see that the entire build consists of running three scripts:

  1. docker_setup.sh
  2. install.R
  3. docker_test.R

To ensure reproducibility, only builds that successfully execute all three scripts are saved to Google Cloud.

We will go through each one of these below.

Updating docker_setup.sh

This is a shell script that does two things:

  1. git clone https://github.com/SydneyBioX/SC_CPC_workshop to /home/CPC
  2. wget https://storage.googleapis.com/scp_data/data.zip to /home/CPC

All commands are shell commands. While we could integrate each shell command into the Dockerfile, it is much easier to have a dedicated script with consistent syntax.
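The two steps above can be sketched as a small shell function. This is a minimal illustration rather than the actual contents of docker_setup.sh: the function name setup_workshop is hypothetical, and the use of wget -P to choose the download directory is an assumption about how the file ends up in /home/CPC.

```shell
#!/bin/sh
# Sketch of the two steps in docker_setup.sh (not the actual script).
set -e

REPO_URL="https://github.com/SydneyBioX/SC_CPC_workshop"
DATA_URL="https://storage.googleapis.com/scp_data/data.zip"
TARGET_DIR="/home/CPC"   # destination named in the steps above

# Hypothetical helper: runs both setup steps in order
setup_workshop() {
  # Step 1: clone the workshop repository into the target directory
  git clone "$REPO_URL" "$TARGET_DIR"
  # Step 2: download data.zip into the same directory
  # (-P sets the directory that wget saves into)
  wget -P "$TARGET_DIR" "$DATA_URL"
}
```

Calling setup_workshop as the last line of the script would run both steps. Because of set -e, a failed clone or download stops the script with a non-zero exit status, which in turn fails the image build.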

Updating data.zip

data.zip should be a simple zip file of the data/ folder. Note that this folder is not git-tracked (because the data are larger than git can handle). To update the data.zip file that workshop attendees have access to, simply move the old data.zip file to the bin folder through the GC console and upload the new data.zip file.
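For administrators who prefer the command line over the GC console, the same update might look like the sketch below. Everything here is an assumption to check before use: the bucket name gs://scp_data is inferred from the download URL, the location of the bin folder inside that bucket is a guess, the function name update_data_zip is hypothetical, and the zip and gsutil tools must be installed and authenticated.

```shell
#!/bin/sh
# Command-line sketch of the data.zip update (the text above uses the GC console).
set -e

BUCKET="gs://scp_data"       # assumed: inferred from the download URL
ARCHIVE_DIR="$BUCKET/bin"    # assumed location of the "bin" folder

# Hypothetical helper: run from the directory that contains data/
update_data_zip() {
  # Start from a clean archive so files deleted from data/ do not linger in it
  rm -f data.zip
  zip -r data.zip data/
  # Move the old archive aside, then upload the new one
  gsutil mv "$BUCKET/data.zip" "$ARCHIVE_DIR/data.zip"
  gsutil cp data.zip "$BUCKET/data.zip"
}
```

Note that the gsutil mv step overwrites any earlier data.zip already sitting in the bin folder; add a date to the destination name if older copies need to be kept.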

Updating install.R

Developers of the workshop should already have added the necessary package installation instructions to install.R. This file uses BiocManager::install to perform all installations. CRAN and BioC packages can be installed as usual. However, GitHub packages should be written as username/repo. If a specific version of a GitHub package needs to be installed, then the ref argument should be specified.
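To make the three cases concrete, the sketch below writes an illustrative R file from the shell. It is not the real install.R: the package names ggplot2 and scMerge are examples only, username/repo and the ref value v1.0.0 are placeholders, and the function name write_install_sketch is hypothetical. Installing from GitHub through BiocManager::install also assumes that the remotes package is available.

```shell
#!/bin/sh
# Writes an illustrative install script to the path given as $1.
set -e

write_install_sketch() {
  cat > "$1" <<'EOF'
# Illustrative only: package names and the ref value are placeholders.
if (!requireNamespace("BiocManager", quietly = TRUE)) {
  install.packages("BiocManager")
}
# CRAN and Bioconductor packages are listed by name
BiocManager::install(c("ggplot2", "scMerge"))
# GitHub packages are written as username/repo (needs the remotes package)
BiocManager::install("username/repo")
# Pin a GitHub package to a branch, tag or commit with ref
BiocManager::install("username/repo", ref = "v1.0.0")
EOF
}
```

Running write_install_sketch install_sketch.R produces a file that can be compared against the real install.R when a new package is added.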

Updating docker_test.R

One major objective of the workshop design is to ensure that everything we make available online - be it data or the GitHub materials - is executable during the workshop. There is no use in material that can only be run on one of our laptops! And we have gone to painstaking lengths to make sure this is the case.

At the time of writing, docker_test.R renders each of the three main computational .Rmd files (qc.Rmd, scMerge.Rmd and downstream.Rmd) on an N1-highcpu-8 Google Cloud machine.
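The same check can be approximated from a shell for local testing before a push. This is a sketch under assumptions, not a copy of docker_test.R: it assumes the three files are in the current directory, that rendering is done with rmarkdown::render, and that Rscript and the rmarkdown package are installed. The function name render_all is hypothetical.

```shell
#!/bin/sh
# Sketch of a render check similar in spirit to docker_test.R.
set -e

# Hypothetical helper: renders each document and stops at the first failure
render_all() {
  for rmd in qc.Rmd scMerge.Rmd downstream.Rmd; do
    echo "Rendering $rmd"
    # A failed render returns non-zero so that the caller (or a build) fails too
    Rscript -e "rmarkdown::render('$rmd')" || {
      echo "Rendering failed: $rmd" >&2
      return 1
    }
  done
  echo "All documents rendered"
}
```

Calling render_all from the folder that holds the .Rmd files gives a quick signal of whether a push is likely to pass the build, although a laptop will not match the memory and CPU of the cloud machine.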

Deployment

In our shared folder, there is a deployment folder. This folder is git-ignored for security reasons. Briefly, it contains a script that generates passwords for attendees and a script that deploys the Docker container via a virtual machine.