3D‐DNA and Interacting with Hi‐C - heelsplitter/Grootmyers_EPP_531_Applied_Genome_Analytics GitHub Wiki
10. 3D-DNA
Create new folder
cd /pickett_sphinx/projects/EPP531_AGA/dgrootmy
mkdir 3D-DNA
cd 3D-DNA
source ~/.bashrc
Link the FASTA file and the Juicer output into the current directory.
ln -s /pickett_sphinx/projects/EPP531_AGA/jesseparker/hifiasm_data/Juicer/aligned/merged_nodups.txt .
ln -s /pickett_sphinx/projects/EPP531_AGA/jesseparker/hifiasm_data/Juicer/references/assembly-CP_filtered.fasta.filtered .
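A broken symlink only shows up once the pipeline tries to read it, so it can be worth confirming that both links resolve to readable, non-empty files first. The following is a minimal sketch; the `check_input` helper is hypothetical and not part of 3D-DNA or Juicer.

```shell
# Hypothetical helper: succeed only if the path (symlinks are followed)
# points to a readable, non-empty regular file.
check_input() {
    if [ -f "$1" ] && [ -r "$1" ] && [ -s "$1" ]; then
        echo "OK: $1"
    else
        echo "MISSING OR EMPTY: $1" >&2
        return 1
    fi
}

# Example calls from inside the 3D-DNA directory:
# check_input merged_nodups.txt
# check_input assembly-CP_filtered.fasta.filtered
```

A non-zero return status from either call means the link target should be fixed before starting the run.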
Load GNU Parallel
spack load /acgsl7y
Run 3D-DNA
bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline.sh assembly-CP_filtered.fasta.filtered merged_nodups.txt
or, with custom misjoin-editor settings:
bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline.sh --editor-repeat-coverage 8 --editor-coarse-resolution 50000 --editor-coarse-region 250000 assembly-CP_filtered.fasta.filtered merged_nodups.txt
#Did not run either of these personally.
Finalize 3D-DNA
#-c 12: number of chromosomes; -s 15000: tiny threshold
bash /pickett_shared/software/3d-dna-201008/finalize/finalize-output-w-stats.sh \
-c 12 \
-s 15000 \
-l Your_Assembly \
Your_Assembly.final.cprops \
Your_Assembly.final.asm \
Your_Assembly.final.fasta \
final \
>& Your_Assembly.3d_dna_final.out
#Did not run this personally.
Download the final .hic and .assembly files for Juicebox
Juicebox Documentation: https://github.com/aidenlab/Juicebox
Juicebox Tutorial
Optional step (case by case): rerun the pipeline after manual review in Juicebox
bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline-post-review.sh -r \
Your_Assembly.final.review.assembly \
Your_Assembly.fasta \
merged_nodups.txt
#Did not run this personally.
11. Assessing Completeness
Run BUSCO on your new genome assembly.
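The wiki does not record the BUSCO command that was used, so the following is only a sketch under stated assumptions: BUSCO v5 is available on the PATH as `busco`, the assembly file name is a placeholder, and the lineage `eudicots_odb10` is a guess (chosen because the SyRI section below works with dogwood). Pick the lineage dataset that fits your organism. The `run_busco` wrapper itself is hypothetical.

```shell
# Hypothetical wrapper around a standard BUSCO genome-mode run.
# Arguments: assembly FASTA, lineage dataset, output name, threads (default 8).
run_busco() {
    busco_assembly=$1
    busco_lineage=$2
    busco_out=$3
    busco_threads=${4:-8}
    busco -i "$busco_assembly" -l "$busco_lineage" -m genome \
          -o "$busco_out" -c "$busco_threads"
}

# Example call (file name and lineage are assumptions, not from the wiki):
# run_busco Your_Assembly.fasta eudicots_odb10 busco_Your_Assembly 8
```

BUSCO writes a short summary file into the output directory; the percentage of complete single-copy and duplicated BUSCOs is the usual figure to report.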
12. SyRI: Visualizing the genome
Make new directory
screen -S syri
cd ..
mkdir Syri
cd Syri
Copy the FASTA files and the existing BAM file into the current directory
cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/*.fa .
cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/Dogwood_hap1-vs-hap2_Chr9.bam .
#Edited the FASTA headers to be "Hap1" and "Hap2" respectively, then undid this.
Align the two fasta sequences
spack load minimap2
minimap2 -ax asm5 -t 5 --eqx hap1_subset_9.fa hap2_subset_9.fa > hap1-vs-hap2.sam
Convert SAM to BAM
spack load /r67sol
samtools view -b -@ 1 hap1-vs-hap2.sam > Dogwood_hap1-vs-hap2_Chr9.bam
Delete the SAM file to save space
rm hap1-vs-hap2.sam
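If disk space is the main concern, the intermediate SAM file can be avoided altogether by piping minimap2 straight into samtools. This is a sketch using the same options as the commands above; the `align_to_bam` wrapper name is hypothetical.

```shell
# Hypothetical wrapper: align query to reference and write BAM directly,
# without creating an intermediate SAM file.
# Arguments: reference FASTA, query FASTA, output BAM.
align_to_bam() {
    aln_ref=$1
    aln_qry=$2
    aln_bam=$3
    minimap2 -ax asm5 -t 5 --eqx "$aln_ref" "$aln_qry" \
        | samtools view -b -@ 1 - > "$aln_bam"
}

# Example call with the file names used above:
# align_to_bam hap1_subset_9.fa hap2_subset_9.fa Dogwood_hap1-vs-hap2_Chr9.bam
```

The first FASTA passed to minimap2 is treated as the reference, so the same file should later be given to SyRI with `-r`.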
Run SyRI on the BAM file
singularity exec -B $PWD /sphinx_local/images/syri_1.6.3--py38hdbdd923_2.sif syri -c Dogwood_hap1-vs-hap2_Chr9.bam -r hap1_subset_9.fa -q hap2_subset_9.fa -F B --cigar --nc 3
#Finally ran correctly
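By default SyRI writes its annotations to `syri.out` in the working directory. A quick sanity check is to tally the annotation types it reports. This sketch assumes the usual tab-separated layout in which column 11 holds the annotation type (SYN, INV, TRANS, and so on); if the counts look odd, check the output description for your SyRI version. The helper name `count_syri_types` is hypothetical.

```shell
# Hypothetical helper: count how often each annotation type occurs
# in a SyRI output file (assumes the type is in tab-separated column 11).
count_syri_types() {
    awk -F '\t' '{ n[$11]++ } END { for (t in n) print t "\t" n[t] }' "$1" | sort
}

# Example call:
# count_syri_types syri.out
```

An output dominated by syntenic (SYN) entries with a handful of rearrangements is what one would generally expect between two haplotypes of the same chromosome.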
SyRI troubleshooting with Arabidopsis data:
screen -S SyRi_Arabidopsis
cd /pickett_sphinx/projects/EPP531_AGA/dgrootmy
mkdir SyRi_Arabidopsis
cd SyRi_Arabidopsis
source ~/.bashrc
cp /pickett_sphinx/projects/EPP531_AGA/dgrootmy/Lab_2/TAIR10_chr_all.fas .
curl --cookie jgi_session=/api/sessions/8c83f65cc6732774fe59faf20d359467 --output download.20240408.215232.zip -d "{\"ids\":{\"Phytozome-384\":[\"585486967ded5e78cff8c52c\"]}}" -H "Content-Type: application/json" https://files-download.jgi.doe.gov/filedownload/
#Downloaded Arabidopsis lyrata genome
unzip download.20240408.215232.zip
cp /pickett_sphinx/projects/EPP531_AGA/dgrootmy/SyRi_Arabidopsis/Phytozome/PhytozomeV12/Alyrata/assembly/Alyrata_384_v1.fa.gz .
gunzip Alyrata_384_v1.fa.gz
conda install minimap2
minimap2 -ax asm5 -t 5 --eqx TAIR10_chr_all.fas Alyrata_384_v1.fa > Ar1-vs-Ar2.sam
spack load /r67sol
samtools view -b -@ 1 Ar1-vs-Ar2.sam > Ar1-vs-Ar2.bam
rm Ar1-vs-Ar2.sam
singularity exec -B $PWD /sphinx_local/images/syri_1.6.3--py38hdbdd923_2.sif syri -c Ar1-vs-Ar2.bam -r TAIR10_chr_all.fas -q Alyrata_384_v1.fa -F B --cigar --nc 3
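#SyRI stopped with an unequal chromosome number error, which is expected here because A. thaliana has 5 chromosomes and A. lyrata has 8. This should not be an issue for the actual data.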
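A chromosome-count mismatch like the one noted above can be spotted before running SyRI by listing the sequence IDs in each FASTA file and comparing them. A minimal sketch; the helper names `fasta_ids` and `count_seqs` are hypothetical.

```shell
# Hypothetical helper: print the sequence ID (first word of each header line).
fasta_ids() {
    awk '/^>/ { sub(/^>/, "", $1); print $1 }' "$1"
}

# Hypothetical helper: print the number of sequences in a FASTA file.
count_seqs() {
    fasta_ids "$1" | wc -l | tr -d ' '
}

# Example calls with the files used above:
# count_seqs TAIR10_chr_all.fas
# count_seqs Alyrata_384_v1.fa
# fasta_ids TAIR10_chr_all.fas
```

If the two counts or ID lists differ, the FASTA files need to be subset or renamed so that homologous chromosomes line up before SyRI is run.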
Make a text file that assigns a name to each FASTA file
nano genomes.txt
"genomes.txt" file format (one genome per line: file path, then name)
hap1_subset_9.fa	Hap1_Chr09
hap2_subset_9.fa	Hap2_Chr09
#The two columns must be separated by a tab, not spaces
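Since it is easy to end up with spaces instead of tabs when typing the file in an editor, one alternative is to write it with `printf`, which emits real tab characters, and then check the column count. A sketch using the file and label names from above:

```shell
# Write one genome per line: FASTA path, a tab, then the display name.
printf '%s\t%s\n' \
    hap1_subset_9.fa Hap1_Chr09 \
    hap2_subset_9.fa Hap2_Chr09 > genomes.txt

# Report any line that does not have exactly two tab-separated fields;
# the exit status is non-zero if such a line is found.
awk -F '\t' 'NF != 2 { bad = 1; print "line " NR ": expected 2 tab-separated fields" }
             END { exit bad }' genomes.txt
```

The check prints nothing when the file is well formed.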
Plot the figure with plotsr
#Install plotsr with conda
source ~/.bashrc
conda create -n plotsr
conda activate plotsr
conda install bioconda::plotsr
#Run plotsr
#cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/syri.out .
plotsr --sr syri.out --genomes genomes.txt -o ChBrave_chr9-vs-K2_chr9.png -H 8 -W 10 -d 300
conda deactivate