3D‐DNA and Interacting with Hi‐C - heelsplitter/Grootmyers_EPP_531_Applied_Genome_Analytics GitHub Wiki

10. 3D-DNA

Create new folder

cd /pickett_sphinx/projects/EPP531_AGA/dgrootmy
mkdir 3D-DNA
cd 3D-DNA
source ~/.bashrc

Link the Fasta and Juicer output.

ln -s /pickett_sphinx/projects/EPP531_AGA/jesseparker/hifiasm_data/Juicer/aligned/merged_nodups.txt .
ln -s /pickett_sphinx/projects/EPP531_AGA/jesseparker/hifiasm_data/Juicer/references/assembly-CP_filtered.fasta.filtered .
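Links into another user's directory can end up dangling if the target is moved or unreadable. A minimal sketch for confirming that both inputs resolve before starting a long 3D-DNA run; `check_inputs` is a hypothetical helper written for these notes, not part of 3D-DNA or Juicer:

```shell
# check_inputs: report any argument that is missing, a dangling symlink, or empty.
# (hypothetical helper for illustration; not part of 3D-DNA or Juicer)
check_inputs() {
    rc=0
    for f in "$@"; do
        if [ ! -e "$f" ]; then      # -e follows symlinks, so dangling links fail here
            echo "missing or broken link: $f" >&2
            rc=1
        elif [ ! -s "$f" ]; then    # -s is false for zero-length files
            echo "empty file: $f" >&2
            rc=1
        fi
    done
    return "$rc"
}
```

Usage would look like: check_inputs merged_nodups.txt assembly-CP_filtered.fasta.filtered && echo "inputs OK"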

Load GNU Parallel (used by 3D-DNA)

spack load /acgsl7y

Run 3D-DNA

bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline.sh assembly-CP_filtered.fasta.filtered merged_nodups.txt

or

bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline.sh --editor-repeat-coverage 8 --editor-coarse-resolution 50000 --editor-coarse-region 250000 assembly-CP_filtered.fasta.filtered merged_nodups.txt

#Did not run either of these personally.

Finalize 3D-DNA

#-c = no. of chromosomes, -s = tiny_thresholds
#(comments cannot follow a trailing backslash, so they are kept up here)
bash /pickett_shared/software/3d-dna-201008/finalize/finalize-output-w-stats.sh \
        -c 12 \
        -s 15000 \
        -l Your_Assembly \
        Your_Assembly.final.cprops \
        Your_Assembly.final.asm \
        Your_Assembly.final.fasta \
        final \
        >& Your_Assembly.3d_dna_final.out

#Did not run this personally.
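Since the finalize step is told to expect 12 chromosomes, one quick check afterwards is to list sequence lengths in the resulting FASTA and see whether the 12 largest look chromosome-scale. A rough sketch using only awk and sort; `fasta_lengths` is a hypothetical helper, and the output FASTA name depends on the 3D-DNA version, so it is passed in as an argument rather than assumed:

```shell
# fasta_lengths: print "name<TAB>length" for every sequence, longest first.
# (hypothetical helper; the sequence name is the first word of each header line)
fasta_lengths() {
    awk '
        /^>/ { if (name != "") print name "\t" len
               name = substr($1, 2); len = 0; next }
             { len += length($0) }
        END  { if (name != "") print name "\t" len }
    ' "$1" | sort -k2,2nr
}
```

For example, piping the result through head -n 12 shows the twelve longest scaffolds.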

Download Final .hic and .assembly files for Juicebox

Juicebox Documentation: https://github.com/aidenlab/Juicebox

Juicebox Tutorial

Optional Step (on case by case basis)

bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline-post-review.sh -r \
Your_Assembly.final.review.assembly \
Your_Assembly.fasta \
merged_nodups.txt

#Did not run this personally.

11. Assessing Completeness

Run BUSCO on your new Genome Assembly
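No BUSCO command is recorded in these notes, so the following is only a sketch of what a genome-mode run might look like. The lineage dataset (embryophyta_odb10) and the CPU count are assumptions and should be swapped for whatever suits the species and the node; `run_busco` is a hypothetical wrapper. Setting DRY_RUN=1 prints the command instead of running it, which is handy for checking before submitting:

```shell
# run_busco: assemble (and optionally run) a BUSCO genome-mode command.
# The lineage and CPU defaults are assumptions, not values from these notes.
run_busco() {
    assembly=$1
    out_name=$2
    lineage=${3:-embryophyta_odb10}   # assumed lineage; choose one suited to your species
    cpus=${4:-8}                      # assumed CPU count
    set -- busco -i "$assembly" -l "$lineage" -m genome -o "$out_name" -c "$cpus"
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"                     # print the command instead of running it
    else
        "$@"
    fi
}
```

Usage would look like: run_busco Your_Assembly.fasta Your_Assembly_busco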

12. SyRI: Visualizing the genome

Make new directory

screen -S syri
cd ..
mkdir Syri
cd Syri

Copy the FASTA files (and the existing BAM) into the current directory

cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/*.fa .
cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/Dogwood_hap1-vs-hap2_Chr9.bam .

#Edited fasta headers to be "Hap1" and "Hap2" respectively ##Undid this

Align the two fasta sequences

spack load minimap2
minimap2 -ax asm5 -t 5 --eqx hap1_subset_9.fa hap2_subset_9.fa > hap1-vs-hap2.sam

Convert SAM to BAM

spack load /r67sol
samtools view -b -@ 1 hap1-vs-hap2.sam > Dogwood_hap1-vs-hap2_Chr9.bam

Delete the SAM file to save space

rm hap1-vs-hap2.sam
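Deleting the SAM is only safe if the BAM was actually written. A small sketch that refuses to remove the SAM when the BAM is missing or empty; `remove_sam_if_bam_ok` is a hypothetical helper, and the size test is a deliberately simple stand-in for a stricter check:

```shell
# remove_sam_if_bam_ok: delete the SAM only when the BAM exists and is non-empty.
# (hypothetical helper; checks file size only, not BAM integrity)
remove_sam_if_bam_ok() {
    sam=$1
    bam=$2
    if [ -s "$bam" ]; then
        rm -- "$sam"
    else
        echo "BAM missing or empty, keeping $sam" >&2
        return 1
    fi
}
```

Usage would look like: remove_sam_if_bam_ok hap1-vs-hap2.sam Dogwood_hap1-vs-hap2_Chr9.bam. With samtools loaded, running samtools quickcheck on the BAM first gives a stronger guarantee than the size test.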

Run SyRI on the BAM file

#-r must be the genome given first to minimap2 (the alignment target), -q the second
singularity exec -B $PWD /sphinx_local/images/syri_1.6.3--py38hdbdd923_2.sif syri -c Dogwood_hap1-vs-hap2_Chr9.bam -r hap1_subset_9.fa -q hap2_subset_9.fa -F B --cigar --nc 3

#Finally ran correctly

SyRI troubleshooting with Arabidopsis data:

screen -S SyRi_Arabidopsis
cd /pickett_sphinx/projects/EPP531_AGA/dgrootmy
mkdir SyRi_Arabidopsis
cd SyRi_Arabidopsis
source ~/.bashrc
cp /pickett_sphinx/projects/EPP531_AGA/dgrootmy/Lab_2/TAIR10_chr_all.fas .
curl --cookie jgi_session=/api/sessions/8c83f65cc6732774fe59faf20d359467 --output download.20240408.215232.zip -d "{\"ids\":{\"Phytozome-384\":[\"585486967ded5e78cff8c52c\"]}}" -H "Content-Type: application/json" https://files-download.jgi.doe.gov/filedownload/
#Downloaded Arabidopsis lyrata genome
unzip download.20240408.215232.zip
cp /pickett_sphinx/projects/EPP531_AGA/dgrootmy/SyRi_Arabidopsis/Phytozome/PhytozomeV12/Alyrata/assembly/Alyrata_384_v1.fa.gz .
gunzip Alyrata_384_v1.fa.gz
conda install minimap2
minimap2 -ax asm5 -t 5 --eqx TAIR10_chr_all.fas Alyrata_384_v1.fa > Ar1-vs-Ar2.sam
spack load /r67sol
samtools view -b -@ 1 Ar1-vs-Ar2.sam > Ar1-vs-Ar2.bam
rm Ar1-vs-Ar2.sam
singularity exec -B $PWD /sphinx_local/images/syri_1.6.3--py38hdbdd923_2.sif syri -c Ar1-vs-Ar2.bam -r TAIR10_chr_all.fas -q Alyrata_384_v1.fa -F B --cigar --nc 3
#Failed with an error about unequal chromosome numbers between the two genomes. Should not be an issue for actual data.
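The chromosome-number error can be anticipated before aligning by counting FASTA headers in each genome. A minimal sketch, assuming that every header line corresponds to one sequence SyRI will see; `count_seqs` and `same_seq_count` are hypothetical helpers:

```shell
# count_seqs: number of sequences (header lines) in a FASTA file.
count_seqs() {
    grep -c '^>' "$1"
}

# same_seq_count: print both counts and succeed only if they match.
# (hypothetical helper; compares counts only, not sequence names)
same_seq_count() {
    ref_n=$(count_seqs "$1")
    qry_n=$(count_seqs "$2")
    echo "reference: $ref_n sequences, query: $qry_n sequences"
    [ "$ref_n" -eq "$qry_n" ]
}
```

Usage would look like: same_seq_count TAIR10_chr_all.fas Alyrata_384_v1.fa || echo "subset the genomes to matching chromosomes first"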

Make a text file to name Fasta files

nano genomes.txt

"genomes.txt" file format

hap1_subset_9.fa Hap1_Chr09
hap2_subset_9.fa Hap2_Chr09

#One genome per line, reference first. Make sure to replace spaces with tabs
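Typing tabs in nano is easy to get wrong, so as an alternative the file can be written with printf, which guarantees real tab characters. A sketch using the same file names and labels as above, followed by a check that every line has exactly two tab-separated fields:

```shell
# Write genomes.txt with real tabs: one genome per line, path then display name.
# printf reuses the format string for each pair of arguments.
printf '%s\t%s\n' \
    hap1_subset_9.fa Hap1_Chr09 \
    hap2_subset_9.fa Hap2_Chr09 \
    > genomes.txt

# Sanity check: fail if any line does not have exactly two tab-separated fields.
awk -F '\t' 'NF != 2 { bad = 1 } END { exit bad }' genomes.txt && echo "genomes.txt OK"
```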

Plot Figure with plotsr

#Install plotsr with conda

source ~/.bashrc
conda create -n plotsr
conda activate plotsr
conda install bioconda::plotsr

#Run Plotsr

#cp /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/syri.out .
plotsr --sr syri.out --genomes genomes.txt -o ChBrave_chr9-vs-K2_chr9.png -H 8 -W 10 -d 300
conda deactivate