2 Nuclear Genes Assembly from Multi‐omics Data - PhyloAI/Ortho2Web GitHub Wiki

HybPiper-nf is a Nextflow-based extension of the original HybPiper tool, developed to streamline the assembly of target enrichment sequencing data, particularly for single- or low-copy genes. The pipeline handles multiple samples, generates assembly statistics, and produces visualizations within a single workflow launched with one command, which reduces manual intervention and the user error that comes with it. This makes HybPiper-nf well suited to phylogenomic studies, especially those involving small datasets. Running under Nextflow also improves scalability and reproducibility relative to the original HybPiper.

2.1 Installation

conda install bioconda::nextflow
conda install conda-forge::singularity
singularity pull library://chrisjackson-pellicle/collection/hybpiper-paragone:latest

# Clone the hybpiper-nf repository
git clone https://github.com/chrisjackson-pellicle/hybpiper-nf.git
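
Before moving on, it can help to confirm that the installed tools are actually visible in the current shell. The following is a minimal sketch, not part of HybPiper-nf itself: `check_tools` is a hypothetical helper name, and it only tests that each command is on `PATH`, not that the version is suitable.

```shell
# Hypothetical helper (not shipped with hybpiper-nf): report any
# required command that cannot be found on PATH.
# Returns 0 when every command is found, 1 otherwise.
check_tools() {
    missing=0
    for tool in "$@"; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool" >&2
            missing=1
        fi
    done
    return "$missing"
}
```

After installation it could be called as `check_tools nextflow singularity git`. A non-zero exit status means at least one tool still needs to be installed or its conda environment activated.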

2.2 Running HybPiper with Nextflow

nextflow run hybpiper.nf \
  -c hybpiper.config \
  -entry assemble \
  -profile standard_singularity \
  --illumina_reads_directory directory/path/to/raw/data \
  --targetfile_dna target_sequence.fasta \
  --bwa \
  --outdir directory/path/to/results \
  --cov_cutoff 5 \
  -with-trace pipeline_trace.txt
  • --targetfile_dna: FASTA file of nucleotide target (reference) sequences used in the assembly.
  • --outdir: Output directory in which results are stored.
  • -with-trace: Nextflow option that writes execution details for every step of the pipeline to the given trace file.
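
The trace file written by -with-trace is a tab-separated table with a header row, so a finished run can be checked from the shell. The sketch below assumes the trace contains the `name` and `status` columns that Nextflow writes by default. `failed_tasks` is a hypothetical helper name, not part of the pipeline.

```shell
# Hypothetical helper: list every task in a Nextflow trace file whose
# status is not COMPLETED. Columns are located by header name, so the
# column order in the trace file does not matter.
failed_tasks() {
    awk -F'\t' '
        NR == 1 {
            for (i = 1; i <= NF; i++) col[$i] = i
            s = col["status"]; n = col["name"]
            if (!s || !n) {
                print "trace file lacks name/status columns" | "cat 1>&2"
                exit 2
            }
            next
        }
        # Print task name and status for anything that did not complete.
        $s != "COMPLETED" { print $n "\t" $s }
    ' "$1"
}
```

Running `failed_tasks pipeline_trace.txt` after the command above prints one line per task that did not complete. Empty output means every recorded task finished with status COMPLETED.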