How to download dbGaP data? - hakyimlab/external GitHub Wiki
Author: Jiamao Zheng
This wiki describes how to download and decrypt data from NCBI’s database of Genotypes and Phenotypes (dbGaP, https://dbgap.ncbi.nlm.nih.gov/). To download and use dbGaP data, you first need to apply to NIH for a dbGaP access account. Be aware that the application can take anywhere from one month to several months.
-
Request data
- Log in to your dbGaP account
- Go to "My Requests"
- Scroll down to the section "PI: Hae Kyung Im, Project #6981: Transcriptome Prediction"
- Select "29789-3 Genotype-Tissue Expression (GTEx) Common Fund Project (phs000424.v6.p1) Exchange Area (phs000424.v6.p1.c999), NHGRI"
- Click "Request Files" on the left panel
-
Download data
- First, install Aspera Connect. Download it from http://downloads.asperasoft.com/downloads (select Aspera Connect, then "show all installers"). No configuration is needed; keep all the default settings.
- Go to "Download".
- Click "Download" for Download request #50852.
- A window will pop up where you can choose either "Browser download" or "Command line download". Command line download is recommended. Download the data into the appropriate folder on Tarbell according to the instructions.
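Before pasting the command that dbGaP displays for "Command line download", it can help to confirm that the `ascp` binary shipped with Aspera Connect is available. The following is a minimal sketch, not part of the dbGaP instructions: the helper name `find_ascp`, the macOS install path, and the destination folder `DEST` are all assumptions made for illustration.

```shell
#!/bin/sh
# Print the first executable found among the given candidate paths.
# Returns non-zero if none of the candidates is an executable file.
find_ascp() {
    for candidate in "$@"; do
        if [ -x "$candidate" ]; then
            printf '%s\n' "$candidate"
            return 0
        fi
    done
    return 1
}

# Usage (assumed locations: ascp on PATH, or the per-user Aspera Connect
# install on macOS; DEST is a hypothetical destination folder):
#   ASCP=$(find_ascp "$(command -v ascp)" \
#          "$HOME/Applications/Aspera Connect.app/Contents/Resources/ascp") \
#       || echo "ascp not found; install Aspera Connect first" >&2
#   DEST="$HOME/dbGaP/50852"
#   mkdir -p "$DEST" && cd "$DEST"
#   ...then paste the command shown by "Command line download".
```

Passing the candidate paths as arguments keeps the check independent of any one install layout, so the same helper works if Aspera Connect lives somewhere else on your machine.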
-
Get the repository key from Haky (a file named something like pro_6981.ngc).
-
Download decryption tool
- Download the decryption tool (decryption.2.6.3-mac64) from dbGaP (http://www.ncbi.nlm.nih.gov/Traces/sra/?view=software) into "~/bin".
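If the download arrives as a gzipped tar archive, unpacking it and checking for the two binaries used later might look like the sketch below. The archive file name and the helper name `unpack_toolkit` are assumptions based on the directory name used in this wiki, not something dbGaP documents here.

```shell
#!/bin/sh
# Unpack a toolkit archive and verify that the expected binaries exist.
#   $1  path to the .tar.gz archive
#   $2  destination directory (this wiki uses ~/bin)
#   $3  name of the top-level directory inside the archive
# On success, prints the path of the toolkit's bin directory.
unpack_toolkit() {
    archive=$1; dest=$2; name=$3
    mkdir -p "$dest" || return 1
    tar -xzf "$archive" -C "$dest" || return 1
    for tool in vdb-config vdb-decrypt; do
        if [ ! -x "$dest/$name/bin/$tool" ]; then
            echo "missing $dest/$name/bin/$tool" >&2
            return 1
        fi
    done
    printf '%s\n' "$dest/$name/bin"
}

# Usage (the archive name is an assumption):
#   unpack_toolkit "$HOME/Downloads/decryption.2.6.3-mac64.tar.gz" \
#       "$HOME/bin" decryption.2.6.3-mac64
```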
-
Configure decryption tool
- Go to the "bin" subdirectory for the decryption.2.6.3-mac64 ("/Users/jiamaozheng/bin/decryption.2.6.3-mac64/bin").
- Run
./vdb-config -i
- A blue window will pop up.
- Press 4 and this will bring up the file navigation dialog.
- Navigate to the location of the repository key file that you got from Haky, and import it (see more detailed instructions at http://ncbi.github.io/sra-tools/pd_usage_guide.html). This will create a working directory, something like "/Users/jiamaozheng/ncbi/dbGaP-6981".
- Press 5, and this will bring up another file navigation dialog. Please confirm that the directory is the correct working directory.
- Tell vdb-decrypt where to find your dbGaP password (this is your dbGaP account password): open the file below in vi, save your dbGaP password in it, then restrict its permissions.
vi ~/.ncbi/vdb-passwd
chmod 400 ~/.ncbi/vdb-passwd
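The password file can also be written without opening an editor. The sketch below reads one line from standard input and stores it with owner-only read permission. The helper name `save_vdb_passwd` is hypothetical, and the file location simply follows the path given in this wiki.

```shell
#!/bin/sh
# Read one line from standard input and store it as the password file
# in the given directory, readable only by the owner.
#   $1  directory holding the password file (this wiki uses ~/.ncbi)
save_vdb_passwd() {
    dir=$1
    mkdir -p "$dir" || return 1
    file="$dir/vdb-passwd"
    rm -f "$file"    # an existing copy may be read-only
    # Accept a final line even if it lacks a trailing newline.
    IFS= read -r password || [ -n "$password" ] || return 1
    # Create the file under a restrictive umask so it is never world-readable.
    ( umask 077; printf '%s\n' "$password" > "$file" )
    status=$?
    unset password
    [ "$status" -eq 0 ] || return 1
    chmod 400 "$file"
}

# Usage: type the password, then press Enter (echo is turned off meanwhile).
#   stty -echo; save_vdb_passwd "$HOME/.ncbi"; stty echo
```

Reading from standard input rather than taking the password as an argument keeps it out of the shell history and the process list.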
-
Decrypt data
- Navigate to the project working directory ("/Users/jiamaozheng/ncbi/dbGaP-6981")
- Run the following decrypt command:
~/bin/decryption.2.6.3-mac64/bin/vdb-decrypt /Volumes/im-lab/nas40t2/jiamao/Raw_Dataset/dbGaP/50852/gtex/exchange/GTEx_phs000424/exchange/subgroups/eqtl
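The decryption step can be wrapped in a small script that also reports whether anything is left to decrypt. This is a sketch under two assumptions not stated in this wiki: that encrypted dbGaP downloads carry the `.ncbi_enc` extension, and that vdb-decrypt drops that extension once a file is decrypted. The helper names are hypothetical.

```shell
#!/bin/sh
# Count files under a directory that still carry the (assumed) .ncbi_enc
# extension of encrypted dbGaP downloads.
count_encrypted() {
    find "$1" -type f -name '*.ncbi_enc' | wc -l | tr -d ' '
}

# Run the decryption tool from inside the project working directory,
# then report how many encrypted files remain.
#   $1  path to vdb-decrypt
#   $2  project working directory created by vdb-config
#   $3  absolute path of the file or directory to decrypt
decrypt_in_workspace() {
    ( cd "$2" && "$1" "$3" ) || return 1
    echo "files still encrypted: $(count_encrypted "$3")"
}

# Usage (TOOLKIT is the toolkit directory installed earlier; DATA is the
# absolute path of the downloaded folder to decrypt):
#   decrypt_in_workspace "$TOOLKIT/bin/vdb-decrypt" \
#       /Users/jiamaozheng/ncbi/dbGaP-6981 "$DATA"
```

Running the tool in a subshell means the `cd` into the working directory does not change the directory of the calling shell.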
References: