Starter Kits (Part 1 DRS and WES) - ga4gh/ismb-2022-ga4gh-tutorial Wiki

Time: Sunday, July 10th, 2022 @ 3:30 pm - 4:45 pm

Slides: link

Outline

In this session, participants will gain familiarity with the GA4GH Starter Kits for the Data Repository Service (DRS) and Workflow Execution Service (WES) standards. Using Docker and Docker Compose, participants will download and run DRS and WES API instances on their local machine. They will then populate the DRS instance with references to public genomics datasets, such as 1000 Genomes CRAM and CRAI files. Lastly, participants will execute containerized workflows (stored on GitHub) using WES.

Participants will play the following roles:

  • Data Provider

    • System/Platform admin: Configure, start, and stop GA4GH Starter Kit services
    • Data Access Committee (DAC): Grant/revoke researcher access to datasets
  • Researcher / Data Consumer

    • Interact with the GA4GH services to accomplish a scientific objective
    • Search, authorize, access, and analyze using the hosted services

NOTE: We use Postman for all HTTP requests.

Tutorial Steps

1. Instructions to Set Up Environment

Follow these instructions to set up your environment.

If you have not already done so, clone the tutorial repository. Then, with the "ismb-tutorial" virtual environment activated, change into the "ismb-2022-ga4gh-tutorial" directory and install the requirements:

git clone https://github.com/ga4gh/ismb-2022-ga4gh-tutorial.git
cd ismb-2022-ga4gh-tutorial
pip3 install -r starterkit-requirements.txt

2. Check the environment

1. Python installation

This command should print the Python version installed on your machine:

python --version

2. Docker Desktop and command line

docker run hello-world

3. Postman (check -> GET httpbin.org/ip)

Run this GET request in Postman. It returns your public IP address.

GET https://httpbin.org/ip
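The command-line part of these checks can also be scripted. The sketch below is only an illustration: the helper name `missing_tools` and the `REQUIRED_TOOLS` list are assumptions made for this example, not part of the tutorial repository. It reports which of the commands used in this tutorial cannot be found on your PATH.

```python
import shutil

# Commands this tutorial runs from a terminal (list assembled for illustration).
REQUIRED_TOOLS = ["python3", "pip3", "git", "docker", "docker-compose"]


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on the PATH."""
    return [tool for tool in tools if shutil.which(tool) is None]
```

Calling `missing_tools(REQUIRED_TOOLS)` returns an empty list when every command is available. It only checks that the executables exist, not that the Docker daemon is running, so `docker run hello-world` remains the real test.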

3. Run DRS and WES Starter Kit docker containers using docker-compose

cd sessions/StarterKits/Part_1_DRS_WES

Deploy DRS and WES Starter Kits using docker-compose

docker-compose up -d

4. Check if the docker containers are running & confirm the service-info endpoints

List all running Docker containers. You should see two containers, part_1_wes and part_1_drs:

docker ps

You can also get this information from the Docker Desktop application: check the list of running containers under the part_1_drs_wes stack.
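If you prefer to verify this from a script, a minimal sketch follows. It assumes the container names match part_1_wes and part_1_drs exactly; the function names are invented for this example. `docker ps --format "{{.Names}}"` prints one container name per line, which the helper compares against the expected set.

```python
import subprocess

# Container names stated in this step; adjust if your names differ.
EXPECTED_CONTAINERS = {"part_1_wes", "part_1_drs"}


def missing_containers(ps_output, expected=EXPECTED_CONTAINERS):
    """Return expected container names absent from `docker ps` name output."""
    running = {line.strip() for line in ps_output.splitlines() if line.strip()}
    return sorted(set(expected) - running)


def docker_ps_names():
    """Run `docker ps` and return its output, one container name per line."""
    result = subprocess.run(
        ["docker", "ps", "--format", "{{.Names}}"],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout
```

`missing_containers(docker_ps_names())` should return an empty list once both services are up.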

Inspect the service-info endpoints of the DRS and WES containers. Each service-info endpoint returns details about the running server implementation.

GET http://localhost:5000/ga4gh/drs/v1/service-info
GET http://localhost:6000/ga4gh/wes/v1/service-info
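The same two requests can be made without Postman. The sketch below uses only the Python standard library and the ports shown above; the names `SERVICE_PORTS`, `service_info_url`, and `fetch_service_info` are assumptions made for this example.

```python
import json
import urllib.request

# Host ports used in this tutorial for each API.
SERVICE_PORTS = {"drs": 5000, "wes": 6000}


def service_info_url(api, host="localhost"):
    """Build the service-info URL for "drs" or "wes"."""
    port = SERVICE_PORTS[api]
    return f"http://{host}:{port}/ga4gh/{api}/v1/service-info"


def fetch_service_info(api, timeout=10):
    """GET the service-info document and return it as a dict."""
    with urllib.request.urlopen(service_info_url(api), timeout=timeout) as resp:
        return json.load(resp)
```

With the containers running, `fetch_service_info("drs")` and `fetch_service_info("wes")` return the parsed JSON that Postman displays.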

5. Load 1000 Genomes data into DRS

python3 scripts/populate-drs.py

6. Explore DRS endpoints

i. GET service-info

GET http://localhost:5000/ga4gh/drs/v1/service-info

ii. GET object by id

GET http://localhost:5000/ga4gh/drs/v1/objects/HG00449.1000genomes.highcov.downsampled.cram
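The response is a DRS object. As a rough guide to reading it, the sketch below pulls out the fields that tell you where the bytes live. The field names (`id`, `size`, `access_methods`, `access_url`, `access_id`) come from the DRS specification; the function name and the example response are hypothetical placeholders, not output from the tutorial server.

```python
def summarize_drs_object(obj):
    """Pick out the fields most useful for locating an object's bytes."""
    access = []
    for method in obj.get("access_methods", []):
        # A method carries either a direct access_url, or an access_id that
        # must be exchanged for a URL via /objects/{id}/access/{access_id}.
        url = (method.get("access_url") or {}).get("url")
        access.append((method.get("type"), url or method.get("access_id")))
    return {"id": obj.get("id"), "size": obj.get("size"), "access": access}


# Hypothetical, trimmed response; real values come from your DRS instance.
example = {
    "id": "HG00449.1000genomes.highcov.downsampled.cram",
    "size": 1024,
    "access_methods": [
        {"type": "s3", "access_url": {"url": "s3://example-bucket/placeholder.cram"}}
    ],
}
```

Passing the parsed JSON from Postman (or `example` above) to `summarize_drs_object` gives a compact view of the object's id, size, and access locations.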

iii. Bulk request objects

POST http://localhost:5000/ga4gh/drs/v1/objects

REQUEST BODY (raw, JSON):
{
    "selection":
        [
            "HG00449.1000genomes.highcov.downsampled.cram",
            "HG00449.1000genomes.highcov.downsampled.crai"
        ]
}
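For reference, the same bulk request can be assembled with the Python standard library. This is a sketch that mirrors the request body shown above; the helper name `build_bulk_request` is an assumption made for this example.

```python
import json
import urllib.request

DRS_OBJECTS_URL = "http://localhost:5000/ga4gh/drs/v1/objects"


def build_bulk_request(object_ids):
    """Build (but do not send) the POST request for a bulk object lookup."""
    body = json.dumps({"selection": list(object_ids)}).encode("utf-8")
    return urllib.request.Request(
        DRS_OBJECTS_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )
```

To send it, pass the result to `urllib.request.urlopen` while the DRS container is running.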

iv. OPTIONS object by id

This endpoint describes any authorization required to access the object.

OPTIONS http://localhost:5000/ga4gh/drs/v1/objects/HG00449.1000genomes.highcov.downsampled.cram 

You can learn more about the DRS specification here.
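The WES step that follows passes its input as a hostname-based DRS URI (drs://localhost:5000/...). Under the DRS specification, such a URI refers to the objects endpoint on the named host. The sketch below shows that mapping. The function name is hypothetical, and the `scheme` parameter is an assumption added because the specification expects HTTPS while this tutorial's local server is reached over plain HTTP.

```python
from urllib.parse import quote, urlsplit


def drs_uri_to_url(drs_uri, scheme="https"):
    """Translate drs://<host>/<id> into the matching DRS objects URL."""
    parts = urlsplit(drs_uri)
    object_id = parts.path.lstrip("/")
    if parts.scheme != "drs" or not parts.netloc or not object_id:
        raise ValueError(f"not a hostname-based DRS URI: {drs_uri!r}")
    # Percent-encode the id so it is safe to use as a single path segment.
    return f"{scheme}://{parts.netloc}/ga4gh/drs/v1/objects/{quote(object_id, safe='')}"
```

For the local server, `drs_uri_to_url("drs://localhost:5000/HG00445.1000genomes.highcov.downsampled.cram", scheme="http")` yields the same kind of URL used in the "GET object by id" request.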

7. Explore WES Endpoints

i. GET service-info

GET http://localhost:6000/ga4gh/wes/v1/service-info

ii. GET runs list

GET http://localhost:6000/ga4gh/wes/v1/runs

iii. POST a Nextflow workflow

Run samtools view on drs://localhost:5000/HG00445.1000genomes.highcov.downsampled.cram

POST http://localhost:6000/ga4gh/wes/v1/runs

REQUEST HEADER: 
Content-Type: multipart/form-data

REQUEST BODY (form-data):
workflow_type:NEXTFLOW
workflow_type_version:21.04.0
workflow_url:https://github.com/jb-adams/samtools-view-count-nf
workflow_params:{"input":"drs://localhost:5000/HG00445.1000genomes.highcov.downsampled.cram"}

The response to this request contains the "run_id", which you can use to monitor the run.
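Postman builds the multipart body for you. To show what is actually sent, here is a standard-library sketch that encodes the same four form fields by hand. The helper names are assumptions made for this example. Note that the real Content-Type header also carries a boundary parameter, which Postman normally generates automatically.

```python
import json
import urllib.request
import uuid

WES_RUNS_URL = "http://localhost:6000/ga4gh/wes/v1/runs"


def wes_run_fields(drs_uri):
    """Form fields for the samtools-view-count workflow used in this step."""
    return {
        "workflow_type": "NEXTFLOW",
        "workflow_type_version": "21.04.0",
        "workflow_url": "https://github.com/jb-adams/samtools-view-count-nf",
        "workflow_params": json.dumps({"input": drs_uri}),
    }


def encode_multipart(fields, boundary=None):
    """Encode text fields as multipart/form-data; return (body, content_type)."""
    boundary = boundary or uuid.uuid4().hex
    parts = []
    for name, value in fields.items():
        parts.append(
            f"--{boundary}\r\n"
            f'Content-Disposition: form-data; name="{name}"\r\n'
            "\r\n"
            f"{value}\r\n"
        )
    parts.append(f"--{boundary}--\r\n")  # closing delimiter
    body = "".join(parts).encode("utf-8")
    return body, f"multipart/form-data; boundary={boundary}"


def build_run_request(drs_uri):
    """Build (but do not send) the POST request that submits the run."""
    body, content_type = encode_multipart(wes_run_fields(drs_uri))
    return urllib.request.Request(
        WES_RUNS_URL,
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
```

Sending the result of `build_run_request(...)` with `urllib.request.urlopen` and parsing the JSON response should give you the same "run_id" field that Postman shows.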

iv. Monitor the run

GET http://localhost:6000/ga4gh/wes/v1/runs/<run_id>

It takes about 5 minutes to complete this run. Do not stop your docker containers until then.
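Rather than re-sending the GET by hand, you can poll until the run finishes. The sketch below is one possible approach, not part of the tutorial repository. It assumes the run log returned by the endpoint above contains a `state` field, as defined in the WES specification, and treats COMPLETE, EXECUTOR_ERROR, SYSTEM_ERROR, and CANCELED as terminal states. The function names, polling interval, and timeout are illustrative choices.

```python
import json
import time
import urllib.request

TERMINAL_STATES = {"COMPLETE", "EXECUTOR_ERROR", "SYSTEM_ERROR", "CANCELED"}


def fetch_run_state(run_id, base="http://localhost:6000/ga4gh/wes/v1"):
    """GET the run log for `run_id` and return its `state` field."""
    with urllib.request.urlopen(f"{base}/runs/{run_id}", timeout=10) as resp:
        return json.load(resp).get("state")


def wait_for_run(get_state, interval=15.0, timeout=600.0, sleep=time.sleep):
    """Call `get_state` repeatedly until it returns a terminal state."""
    waited = 0.0
    while True:
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        if waited >= timeout:
            raise TimeoutError(f"run still in state {state!r} after {waited:.0f}s")
        sleep(interval)
        waited += interval
```

A typical call is `wait_for_run(lambda: fetch_run_state(run_id))`, where `run_id` is the value returned when the workflow was submitted. The state getter and the sleep function are passed in as arguments so the loop can be exercised without a running server.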

You can learn more about the WES specification here.

8. Stop and remove all Docker containers, networks, volumes, and images created in step 3

docker-compose down