Module 2 Lab 4: 3D‐DNA, Juicer, and SyRi - jacksonhturner/epp_531 GitHub Wiki
This lab uses a software pipeline to generate a polished sassafras assembly from Hi-C data and to create a synteny map between two haplotypes of the dogwood cultivar Cherokee Brave.
Part 1: 3D-DNA
Make a directory for the 3D-DNA analysis.
mkdir 3D-DNA
cd 3D-DNA
Soft link the Juicer output and reference FASTA files into the analysis directory.
ln -s Path_to/Juicer/aligned/merged_nodups.txt .
ln -s Path_to/Juicer/references/*fasta .
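Because `ln -s` succeeds even when the target path is wrong, it can help to confirm that the links resolve before starting a long run. The following is a minimal sketch; `check_links` is a hypothetical helper name and is not part of Juicer or 3D-DNA.

```shell
# check_links (hypothetical helper): report symbolic links in a directory
# whose targets do not exist. Returns 0 if every link resolves, 1 otherwise.
check_links() {
    cl_dir=${1:-.}
    cl_status=0
    for cl_file in "$cl_dir"/*; do
        # -L: the path is a symlink; ! -e: its target cannot be resolved
        if [ -L "$cl_file" ] && [ ! -e "$cl_file" ]; then
            echo "broken link: $cl_file" >&2
            cl_status=1
        fi
    done
    return $cl_status
}
```

For example, `check_links . && echo "all links resolve"` run inside the 3D-DNA directory would flag a mistyped `Path_to`.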
Load the Parallel package through spack.
spack load /acgsl7y
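A quick way to confirm that a `spack load` took effect is to check that the expected executable is on the PATH. This sketch assumes the package provides an executable named `parallel` (as GNU Parallel does); `require_tools` is a hypothetical helper name.

```shell
# require_tools (hypothetical helper): fail if any named command is not on PATH.
require_tools() {
    rt_missing=0
    for rt_tool in "$@"; do
        if ! command -v "$rt_tool" >/dev/null 2>&1; then
            echo "not found on PATH: $rt_tool" >&2
            rt_missing=1
        fi
    done
    return $rt_missing
}
```

For example, `require_tools parallel || echo "load Parallel before running 3D-DNA"`. The same helper could be reused later for `minimap2` and `samtools`.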
Run 3D-DNA on the assembly using the Juicer output.
bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline.sh ../MT_CP_removal/Sassafras_hap1_no_organelles.fasta merged_nodups.txt
Use the following script to finalize the 3D-DNA output.
# -c 12: no. of chromosomes; -s 15000: tiny threshold
# (comments cannot follow a line-continuation backslash, so they are placed here)
bash /pickett_shared/software/3d-dna-201008/finalize/finalize-output-w-stats.sh \
-c 12 \
-s 15000 \
-l Your_Assembly \
Your_Assembly.final.cprops \
Your_Assembly.final.asm \
Your_Assembly.final.fasta \
final \
>& assembly-CP_filtered.fasta.filtered.out
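One way to sanity-check the finalized assembly is to list sequence lengths and see whether the longest records match the expected chromosome count (12 here). The sketch below is an illustration only; `fasta_lengths` is a hypothetical helper, and the FASTA file name in the usage note is a placeholder because the source does not state the output name.

```shell
# fasta_lengths (hypothetical helper): print "name<TAB>length" for each
# record in a FASTA file, using the first word of the header as the name.
fasta_lengths() {
    awk '
        /^>/ {
            if (name != "") print name "\t" len   # flush the previous record
            name = substr($1, 2)                  # drop the leading ">"
            len = 0
            next
        }
        { len += length($0) }                     # accumulate sequence length
        END { if (name != "") print name "\t" len }
    ' "$1"
}
```

For example, `fasta_lengths Your_Final_Assembly.fasta | sort -k2,2nr | head -n 12` would show the 12 longest sequences, where `Your_Final_Assembly.fasta` stands in for whatever FASTA the finalize step produced.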
Review the Juicebox documentation available at https://github.com/aidenlab/Juicebox. Use Juicebox to manually correct the assembly, then, if needed, run the following script on the reviewed assembly:
bash /pickett_shared/software/3d-dna-201008/run-asm-pipeline-post-review.sh -r \
Your_Assembly.final.review.assembly \
Your_Assembly.fasta \
merged_nodups.txt
Part 2: SyRi
Create a new analysis directory for SyRi, which will be used to identify and visualize synteny between the two haplotypes.
mkdir Syri
cd Syri
Link the requisite SyRi files into the analysis directory.
ln -s /pickett_sphinx/projects/EPP531_AGA/lyadav_EPPAGA/Syri/*.fa .
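Use minimap2 to align the two haplotypes, with hap1 as the reference and hap2 as the query. The --eqx flag writes =/X CIGAR operators, which SyRi expects in SAM/BAM input.
spack load minimap2
minimap2 -ax asm5 --eqx -t 5 hap1_subset_9.fa hap2_subset_9.fa > hap1-vs-hap2.sam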
Use samtools to convert the SAM file into BAM format. Delete the SAM file to save space.
spack load /r67sol
samtools view -b -@ 1 hap1-vs-hap2.sam > Dogwood_hap1-vs-hap2_Chr9.bam
rm hap1-vs-hap2.sam
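The plotting step further down reads a file named syri.out, which SyRi writes when it is run on the alignment. The source does not show that command, so the following is a sketch under assumptions: `syri` is assumed to be installed and on the PATH (the source does not say how it was installed), `run_syri` is a hypothetical wrapper name, and the flags follow SyRi's documented command-line usage.

```shell
# run_syri (hypothetical wrapper): check the inputs, then call syri.
# Usage: run_syri ALIGNMENT.bam REFERENCE.fa QUERY.fa
run_syri() {
    rs_bam=$1; rs_ref=$2; rs_qry=$3
    for rs_file in "$rs_bam" "$rs_ref" "$rs_qry"; do
        # -s: the file exists and is not empty
        if [ ! -s "$rs_file" ]; then
            echo "missing or empty input: $rs_file" >&2
            return 1
        fi
    done
    # -c alignment, -r reference genome, -q query genome,
    # -k keep intermediate files, -F B input alignment is BAM
    syri -c "$rs_bam" -r "$rs_ref" -q "$rs_qry" -k -F B
}
```

With the file names used in this lab, the call would be `run_syri Dogwood_hap1-vs-hap2_Chr9.bam hap1_subset_9.fa hap2_subset_9.fa`, which by default should leave syri.out in the current directory.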
Create a new text file called genomes.txt with nano.
nano genomes.txt
Populate the text file with the following, one genome per line, with the FASTA path and the genome name separated by a tab:
hap1_subset_9.fa Hap1_Chr09
hap2_subset_9.fa Hap2_Chr09
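As an alternative to editing in nano, the same file can be written non-interactively, which avoids accidentally typing spaces where tabs are expected. This is a minimal sketch that assumes the two-column, tab-separated layout described in the plotsr documentation.

```shell
# Write genomes.txt with one "path<TAB>name" pair per line.
# printf reuses the format string for each pair of arguments.
printf '%s\t%s\n' \
    hap1_subset_9.fa Hap1_Chr09 \
    hap2_subset_9.fa Hap2_Chr09 > genomes.txt
```

Running `cat -A genomes.txt` (GNU coreutils) shows tabs as `^I`, which makes it easy to confirm the separator.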
Plot the synteny figure with Plotsr. This step reads the SyRi output file syri.out, so it must be present in the current directory.
#Install plotsr with conda
conda create -n plotsr
conda activate plotsr
conda install bioconda::plotsr
#Run Plotsr
plotsr --sr syri.out --genomes genomes.txt -o ChBrave_chr9-vs-K2_chr9.png -H 8 -W 10 -d 300
conda deactivate
The image below shows the resulting synteny map, plotted by Plotsr from the SyRi output.