Download test data - kundajelab/chrombpnet GitHub Wiki
Step 1
We start by creating a directory (~/chrombpnet_tutorial/data/downloads) to store the downloaded data.
mkdir -p ~/chrombpnet_tutorial/data/downloads
Step 2
Download the hg38 human reference genome data: the FASTA file, the chromosome sizes file, and the blacklist regions in BED format.
# download reference data
wget https://www.encodeproject.org/files/GRCh38_no_alt_analysis_set_GCA_000001405.15/@@download/GRCh38_no_alt_analysis_set_GCA_000001405.15.fasta.gz -O ~/chrombpnet_tutorial/data/downloads/hg38.fa.gz
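# decompress the fasta file; `yes n` answers "no" if gunzip asks whether to overwrite an existing hg38.fa
yes n | gunzip ~/chrombpnet_tutorial/data/downloads/hg38.fa.gz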
# download reference chromosome sizes
wget https://www.encodeproject.org/files/GRCh38_EBV.chrom.sizes/@@download/GRCh38_EBV.chrom.sizes.tsv -O ~/chrombpnet_tutorial/data/downloads/hg38.chrom.sizes
# download reference blacklist regions
wget https://www.encodeproject.org/files/ENCFF356LFX/@@download/ENCFF356LFX.bed.gz -O ~/chrombpnet_tutorial/data/downloads/blacklist.bed.gz
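Large downloads can be interrupted without an obvious error, so it may help to confirm that the Step 2 files arrived before moving on. The following is a minimal sketch, not part of chrombpnet: `check_download` is a hypothetical helper and `DOWNLOADS` is an assumed variable that defaults to the tutorial directory. It reports whether each file exists and is non-empty, and runs `gzip -t` on files ending in `.gz`.

```shell
# Hypothetical helper (not part of chrombpnet): report whether a download looks usable.
check_download() {
    path="$1"
    # -s is true only if the file exists and has a size greater than zero
    if [ ! -s "$path" ]; then
        echo "missing or empty: $path"
        return 1
    fi
    case "$path" in
        *.gz)
            # gzip -t tests the integrity of the compressed file without extracting it
            if ! gzip -t "$path" 2>/dev/null; then
                echo "corrupt gzip: $path"
                return 1
            fi
            ;;
    esac
    echo "ok: $path"
}

# Assumed location: the directory created in Step 1.
DOWNLOADS="${DOWNLOADS:-$HOME/chrombpnet_tutorial/data/downloads}"

# File names match the -O targets used in Step 2 (hg38.fa is the gunzipped fasta).
for name in hg38.fa hg38.chrom.sizes blacklist.bed.gz; do
    check_download "$DOWNLOADS/$name" || true
done
```

Any line that starts with `missing or empty` or `corrupt gzip` suggests that the corresponding `wget` command should be rerun.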
Step 3
Next, download the ATAC-seq reads for ENCODE experiment ENCSR868FGK in BAM format (three replicate files) using the commands below.
# download bam files
wget https://www.encodeproject.org/files/ENCFF077FBI/@@download/ENCFF077FBI.bam -O ~/chrombpnet_tutorial/data/downloads/rep1.bam
wget https://www.encodeproject.org/files/ENCFF128WZG/@@download/ENCFF128WZG.bam -O ~/chrombpnet_tutorial/data/downloads/rep2.bam
wget https://www.encodeproject.org/files/ENCFF534DCE/@@download/ENCFF534DCE.bam -O ~/chrombpnet_tutorial/data/downloads/rep3.bam
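The three commands above differ only in the file accession and the replicate number, so they can also be written as a loop. This is a sketch under that assumption: `encode_bam_url` and `fetch_replicates` are hypothetical helper names introduced here for illustration, and the URL pattern is simply the one visible in the commands above.

```shell
# Hypothetical helper: build the download URL for a BAM file accession,
# following the pattern used by the wget commands in this step.
encode_bam_url() {
    echo "https://www.encodeproject.org/files/$1/@@download/$1.bam"
}

# Hypothetical helper: download the three replicates into the given directory
# as rep1.bam, rep2.bam and rep3.bam, in the same order as the commands above.
fetch_replicates() {
    out_dir="$1"
    i=1
    for acc in ENCFF077FBI ENCFF128WZG ENCFF534DCE; do
        wget "$(encode_bam_url "$acc")" -O "$out_dir/rep$i.bam"
        i=$((i + 1))
    done
}

# Uncomment to run the downloads:
# fetch_replicates ~/chrombpnet_tutorial/data/downloads
```

The loop form makes it easier to add or swap accessions, but it downloads exactly the same files as the three explicit commands.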