Module 2 Lab 4: 3D‐DNA, Juicer, and SyRi - jacksonhturner/epp_531 GitHub Wiki

This lab uses a software pipeline to generate a polished sassafras assembly from Hi-C data (Part 1) and to create a synteny map between two haplotypes of the dogwood cultivar Cherokee Brave (Part 2).

Part 1: 3D-DNA

Make a directory for 3D-DNA analysis.

mkdir 3D-DNA
cd 3D-DNA

Soft link Juicer output and reference fasta files into the analysis directory.

ln -s Path_to/Juicer/aligned/merged_nodups.txt .
ln -s Path_to/Juicer/references/*fasta .
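
Because ln -s does not check that its target exists, a mistyped Path_to silently produces a dangling link that only fails later inside 3D-DNA. The sketch below is one way to catch that early. The helper name check_links is hypothetical and is not part of Juicer or 3D-DNA.

```shell
# Report every symbolic link in a directory whose target is missing.
# check_links is a hypothetical helper name used only for this illustration.
check_links() {
    dir="${1:-.}"
    broken=0
    for link in "$dir"/*; do
        # -L: the entry is a symlink; ! -e: its target does not exist
        if [ -L "$link" ] && [ ! -e "$link" ]; then
            echo "broken link: $link" >&2
            broken=1
        fi
    done
    return "$broken"
}

# Check the current analysis directory.
check_links . && echo "all links resolve"
```

A non-zero return status means at least one link needs to be recreated with the correct path.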

Load GNU Parallel through Spack. The string after the slash is the hash of the specific installed build.

spack load /acgsl7y
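
It can save a failed run to confirm that the loaded tools are actually on PATH before starting the pipeline. The following is a minimal sketch. require_tools is a hypothetical helper, and the list of tools passed to it (parallel, awk) is an assumption about what this step needs rather than an official 3D-DNA requirement list.

```shell
# Return non-zero if any named command is not found on PATH.
# require_tools is a hypothetical helper name used only for this illustration.
require_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing tool: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}

# Assumed tool list; adjust to whatever your environment provides.
require_tools parallel awk || echo "load the missing modules before running 3D-DNA" >&2
```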

Run 3D-DNA on the assembly using the Juicer output.

bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline.sh ../MT_CP_removal/Sassafras_hap1_no_organelles.fasta merged_nodups.txt

Use the following script to finalize the 3D-DNA output.

# -c: number of chromosomes; -s: tiny threshold; -l: output label
# (comments must not follow a trailing backslash, or the command is cut short)
bash /pickett_shared/software/3d-dna-201008/finalize/finalize-output-w-stats.sh \
        -c 12 \
        -s 15000 \
        -l Your_Assembly \
        Your_Assembly.final.cprops \
        Your_Assembly.final.asm \
        Your_Assembly.final.fasta \
        final \
        >& assembly-CP_filtered.fasta.filtered.out
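
One quick sanity check on a finalized assembly is whether its longest scaffolds match the expected chromosome count (12 here, per -c). The sketch below lists the N longest sequences in a FASTA file using awk and sort. longest_seqs is a hypothetical helper, not part of 3D-DNA, and the name of the FASTA file to pass is whichever file the finalize script wrote in your run.

```shell
# Print the N longest sequences in a FASTA file as "name<TAB>length".
# longest_seqs is a hypothetical helper name used only for this illustration.
longest_seqs() {
    fasta="$1"
    n="${2:-12}"
    awk '
        # Header line: emit the previous record, then start a new one.
        /^>/ { if (name != "") print name "\t" len
               name = substr($1, 2); len = 0; next }
        # Sequence line: accumulate its length.
        { len += length($0) }
        END  { if (name != "") print name "\t" len }
    ' "$fasta" | sort -k2,2nr | head -n "$n"
}

# Example call (the path is a placeholder):
# longest_seqs path/to/finalized.fasta 12
```

If the twelfth and thirteenth lengths differ sharply, the chromosome-scale scaffolds were probably separated cleanly from the small debris.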

Review the assembly in Juicebox (documentation: https://github.com/aidenlab/Juicebox) and correct any misjoins you find. If you made corrections, apply them by running the following script on the reviewed assembly file:

bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline-post-review.sh -r \
Your_Assembly.final.review.assembly \
Your_Assembly.fasta \
merged_nodups.txt

Part 2: SyRi

Create a new analysis directory for SyRi, which identifies syntenic regions and structural rearrangements between two genomes so that they can be visualized.

mkdir Syri
cd Syri

Link the requisite SyRi files into the analysis directory.

ln -s /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/*.fa .

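Use minimap2 to align the hap2 sequence (query) against the hap1 sequence (reference). The --eqx option writes =/X CIGAR operators, which SyRi expects in SAM/BAM input.

spack load minimap2
minimap2 -ax asm5 --eqx -t 5 hap1_subset_9.fa hap2_subset_9.fa > hap1-vs-hap2.sam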

Use samtools to convert the SAM file into BAM format. Delete the SAM file to save space.

spack load /r67sol
samtools view -b -@ 1 hap1-vs-hap2.sam > Dogwood_hap1-vs-hap2_Chr9.bam
rm hap1-vs-hap2.sam
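
The plotting step later in this lab reads a file named syri.out, which SyRi itself writes from the alignment. A minimal sketch of that call follows. It assumes syri is on PATH (for example from a conda environment or a Spack module, neither of which is specified here) and that the BAM file from the previous command is in the current directory. The flags used are -c (alignment), -r (reference), -q (query), -F B (BAM input), and -k (keep intermediate files).

```shell
# Inputs produced by the earlier steps; names taken from the commands above.
ref=hap1_subset_9.fa
qry=hap2_subset_9.fa
bam=Dogwood_hap1-vs-hap2_Chr9.bam

# Run SyRi only if the program and a non-empty BAM file are both available.
# With no --prefix given, the main result is written to syri.out in this directory.
if command -v syri >/dev/null 2>&1 && [ -s "$bam" ]; then
    syri -c "$bam" -r "$ref" -q "$qry" -F B -k
else
    echo "syri or $bam not available; skipping" >&2
fi
```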

Create a new text file called genomes.txt with nano.

nano genomes.txt

Populate the text file with the following two columns (genome file and display name), separated by a single tab:

hap1_subset_9.fa        Hap1_Chr09
hap2_subset_9.fa        Hap2_Chr09
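
plotsr reads genomes.txt as a tab-separated table, and tabs are easy to lose when typing or pasting into an editor. As an alternative to nano, the sketch below writes the same two rows with printf and then confirms that each row has exactly two tab-separated fields.

```shell
# Write both rows with an explicit tab (\t) between file name and label.
# printf reuses the format string for each pair of arguments.
printf '%s\t%s\n' \
    hap1_subset_9.fa Hap1_Chr09 \
    hap2_subset_9.fa Hap2_Chr09 > genomes.txt

# Report the number of tab-separated fields on each row (expected: 2).
awk -F '\t' '{ print NF " fields: " $0 }' genomes.txt
```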

Plot the synteny and structural rearrangements between the two haplotypes with plotsr.

#Install plotsr with conda
conda create -n plotsr
conda activate plotsr
conda install bioconda::plotsr

#Run Plotsr
plotsr --sr syri.out --genomes genomes.txt -o ChBrave_chr9-vs-K2_chr9.png -H 8 -W 10 -d 300

conda deactivate

The figure below shows the synteny map plotted by plotsr from the SyRi output.