Pangenome generation - SegataLab/panphlan GitHub Wiki
Generates the Bowtie2 indexes needed for mapping.
Example:
./panphlan/panphlan_new_pangenome_generation.py -c erectale --i_fna reference_genomes/ -o database/ --verbose
- Since PanPhlAn version 1.3, pangenome generation uses a ChocoPhlAn export as input. The ChocoPhlAn export is a pangenome file (`panphlan_[NCBI_TAX_ID]_pangenome.csv`).
- Reference genomes must be provided as `.fna` files in the folder given by the `--i_fna` argument.
- Use `-c CLADE_NAME` to specify the clade or species database name; PanPhlAn will search for a file named `panphlan_CLADE_NAME_pangenome.csv`.
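Before launching the generation, it can help to confirm that both inputs are in place. The following is a minimal sketch, not part of PanPhlAn: the helper names `pangenome_csv_name` and `check_inputs` are hypothetical, and `csv_folder` is an assumption, since this page does not state where PanPhlAn looks for the CSV file.

```python
from pathlib import Path
from typing import List


def pangenome_csv_name(clade_name: str) -> str:
    """File name PanPhlAn searches for, following the naming rule above."""
    return f"panphlan_{clade_name}_pangenome.csv"


def check_inputs(fna_folder: str, csv_folder: str, clade_name: str) -> List[str]:
    """Return a list of problems found; an empty list means the inputs look fine.

    csv_folder is an assumption: adjust it to wherever your ChocoPhlAn export lives.
    """
    problems = []

    # The genome folder must exist and contain at least one .fna file.
    fna_dir = Path(fna_folder)
    if not fna_dir.is_dir():
        problems.append(f"genome folder not found: {fna_dir}")
    elif not list(fna_dir.glob("*.fna")):
        problems.append(f"no .fna files in {fna_dir}")

    # The pangenome CSV must match the clade name passed with -c.
    csv_path = Path(csv_folder) / pangenome_csv_name(clade_name)
    if not csv_path.is_file():
        problems.append(f"pangenome file not found: {csv_path}")

    return problems


if __name__ == "__main__":
    for problem in check_inputs("reference_genomes/", ".", "erectale"):
        print(problem)
```

An empty result only means the files are present; it does not validate their content.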
If no `--output` argument is provided, the default value `database` leads to the creation of the `database/` folder. This folder contains:
- a `CLADE_NAME_ref_genomes.fna` file containing the concatenation of the `.fna` files from the input folder
- 6 Bowtie2 index files named `panphlan_CLADE_NAME.[1-4].bt2` and `panphlan_CLADE_NAME.rev.[1-2].bt2`
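The output layout described above can be sketched in a few lines of Python. This is an illustration, not PanPhlAn's own code: the function names are hypothetical, and the `bowtie2-build` command shown is an assumption about how such indexes are typically built (`bowtie2-build <reference_in> <bt2_base>`), not a statement of what the script runs internally.

```python
from pathlib import Path
from typing import List


def expected_outputs(clade_name: str, output_folder: str = "database") -> List[Path]:
    """List the 7 files described above: 1 concatenated .fna and 6 index files."""
    out = Path(output_folder)
    names = [f"{clade_name}_ref_genomes.fna"]
    names += [f"panphlan_{clade_name}.{i}.bt2" for i in range(1, 5)]
    names += [f"panphlan_{clade_name}.rev.{i}.bt2" for i in range(1, 3)]
    return [out / name for name in names]


def concatenate_fna(fna_folder: str, clade_name: str, output_folder: str = "database") -> Path:
    """Concatenate all .fna files of the input folder into one reference file."""
    out_dir = Path(output_folder)
    out_dir.mkdir(parents=True, exist_ok=True)
    target = out_dir / f"{clade_name}_ref_genomes.fna"
    # Collect sources first, and skip the target in case both folders are the same.
    sources = [p for p in sorted(Path(fna_folder).glob("*.fna"))
               if p.resolve() != target.resolve()]
    with open(target, "wb") as dst:
        for fna in sources:
            data = fna.read_bytes()
            dst.write(data)
            # Keep the next FASTA header on its own line.
            if data and not data.endswith(b"\n"):
                dst.write(b"\n")
    return target


def bowtie2_build_command(reference: Path, clade_name: str,
                          output_folder: str = "database") -> List[str]:
    """Build (but do not run) a bowtie2-build command for the index base name."""
    base = Path(output_folder) / f"panphlan_{clade_name}"
    return ["bowtie2-build", str(reference), str(base)]
```

After a run, checking `all(p.is_file() for p in expected_outputs("erectale"))` is a quick way to confirm that the database folder is complete.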
./panphlan/panphlan_pangenome_generation.py -h
  -h, --help            show this help message and exit
  -i INPUT_FNA_FOLDER, --i_fna INPUT_FNA_FOLDER
                        Folder containing the .fna genome sequence files
  -c CLADE_NAME, --clade CLADE_NAME
                        Name of the species pangenome database, for example:
                        -c ecoli17
  -o OUTPUT_FOLDER, --output OUTPUT_FOLDER
                        Result folder for all database files
  --verbose             Show progress information
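For readers who want to wrap or test the command line, the options listed above can be mirrored with `argparse`. This is a sketch based only on the help text on this page, not PanPhlAn's actual source; `build_parser` is a hypothetical name, and whether `-i` and `-c` are mandatory in the real script is not stated here, so they are left optional.

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """Recreate the documented options (a sketch, not PanPhlAn's own parser)."""
    parser = argparse.ArgumentParser(
        description="Sketch of the pangenome generation options")
    parser.add_argument("-i", "--i_fna", metavar="INPUT_FNA_FOLDER",
                        help="Folder containing the .fna genome sequence files")
    parser.add_argument("-c", "--clade", metavar="CLADE_NAME",
                        help="Name of the species pangenome database, "
                             "for example: -c ecoli17")
    # 'database' is the default output folder named on this page.
    parser.add_argument("-o", "--output", metavar="OUTPUT_FOLDER",
                        default="database",
                        help="Result folder for all database files")
    parser.add_argument("--verbose", action="store_true",
                        help="Show progress information")
    return parser


if __name__ == "__main__":
    args = build_parser().parse_args()
    print(args)
```

Parsing the arguments of the example command, `-c erectale --i_fna reference_genomes/ -o database/ --verbose`, gives a namespace with `clade`, `i_fna`, `output` and `verbose` attributes.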
For details on the old pangenome generation (PanPhlAn <= 1.2), see the older version.