5. Custom and External Kraken2 databases - FOI-Bioinformatics/nanometa_live GitHub Wiki
Usage of Custom and External Kraken2 Databases
In Nanometa Live, users have the flexibility to utilize custom and external Kraken2 databases for taxonomic classification. This feature is essential for tailoring the analysis to specific research needs, such as focusing on particular taxonomic groups or utilizing databases that are more relevant to the project's scope.
Setting Up Custom Kraken2 Databases
- Custom Database Path: Specify the path to your custom Kraken2 database in the configuration file under `kraken_db`. This path should lead to the directory where your custom Kraken2 database files are located.
- Taxonomy Type: Alongside the custom database path, specify the taxonomy type (`gtdb` or `ncbi`) under `kraken_taxonomy` to ensure compatibility with the chosen database.
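The two settings above can be sanity-checked before starting a run. The following is a minimal sketch, not part of Nanometa Live: it assumes the configuration has already been loaded into a Python dictionary, and the helper name `check_custom_db` is hypothetical. It relies only on the fact that a built Kraken2 database directory contains the files `hash.k2d`, `opts.k2d`, and `taxo.k2d`.

```python
from pathlib import Path

# Files that make up a built Kraken2 database directory.
REQUIRED_K2_FILES = ("hash.k2d", "opts.k2d", "taxo.k2d")
# Taxonomy types named in this guide.
VALID_TAXONOMIES = ("gtdb", "ncbi")


def check_custom_db(config):
    """Return a list of problems found in the custom-database settings.

    `config` is assumed to be a dict holding the parsed configuration
    (hypothetical helper, not a Nanometa Live function).
    """
    problems = []

    db_path = config.get("kraken_db")
    if not db_path:
        problems.append("kraken_db is not set")
    else:
        db_dir = Path(db_path)
        if not db_dir.is_dir():
            problems.append(f"kraken_db is not a directory: {db_dir}")
        else:
            # Report every missing file rather than stopping at the first.
            for name in REQUIRED_K2_FILES:
                if not (db_dir / name).is_file():
                    problems.append(f"missing database file: {name}")

    taxonomy = config.get("kraken_taxonomy")
    if taxonomy not in VALID_TAXONOMIES:
        problems.append(
            f"kraken_taxonomy must be one of {VALID_TAXONOMIES}, got {taxonomy!r}"
        )

    return problems
```

An empty list means both settings look plausible. The check cannot tell whether the declared taxonomy actually matches the one the database was built with.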
Utilizing External Kraken2 Databases
- External Database Selection: In the configuration file, use the `external_kraken2_db` field to specify the key of the desired pre-configured external database (e.g. 'Standard', 'PlusPF', 'PlusPFP').
- Database Information: Each external database option in `external_kraken2_info` is accompanied by a description, database URL, inspect URL, and the associated taxonomy. This information helps in understanding the content and structure of the database.
- Automatic Download and Setup: Upon selecting an external database and running the workflow, Nanometa Live automatically downloads and sets up the database for use in the analysis.
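The relationship between `external_kraken2_db` and `external_kraken2_info` can be pictured as a key lookup. The sketch below is illustrative only: the field names inside each entry (`description`, `db_url`, `inspect_url`, `taxonomy`), the placeholder URLs and values, and the function `resolve_external_db` are assumptions made for this example and may not match the actual configuration layout.

```python
# Hypothetical stand-in for the external_kraken2_info section of the config.
# Field names, URLs, and values are placeholders, not the real entries.
EXTERNAL_KRAKEN2_INFO = {
    "Standard": {
        "description": "placeholder description",
        "db_url": "https://example.org/standard.tar.gz",
        "inspect_url": "https://example.org/standard_inspect.txt",
        "taxonomy": "ncbi",
    },
}


def resolve_external_db(config, info):
    """Return the info entry selected by external_kraken2_db, or None.

    Raises ValueError when the key is not one of the pre-configured options.
    """
    key = config.get("external_kraken2_db")
    if key is None:
        # No external database requested.
        return None
    if key not in info:
        known = ", ".join(sorted(info))
        raise ValueError(f"unknown external database {key!r}; known keys: {known}")
    return info[key]


entry = resolve_external_db({"external_kraken2_db": "Standard"}, EXTERNAL_KRAKEN2_INFO)
print(entry["taxonomy"])
```

Failing early on an unknown key is a design choice for this sketch; it surfaces a typo in the configuration before any download is attempted.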
Important Considerations
- Storage and Network Requirements: Ensure that your system has sufficient storage and network capacity for downloading and storing the databases, especially when working with large external databases.
- System Performance: The choice of database and its size can impact system performance. Larger databases may require more RAM and processing power.
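A storage check like the one described above can be scripted with the Python standard library. This is a rough sketch under stated assumptions: the function name `has_free_space` is hypothetical, the required size must be supplied by the user (it is not read from any Nanometa Live setting), and the default safety factor of 2 is an arbitrary allowance for holding a compressed archive alongside its extracted files.

```python
import shutil


def has_free_space(path, required_bytes, safety_factor=2.0):
    """Return True if the filesystem holding `path` has enough free space.

    `required_bytes` is the expected database size, supplied by the caller.
    `safety_factor` adds headroom (assumed value, adjust as needed).
    """
    free = shutil.disk_usage(path).free
    return free >= required_bytes * safety_factor


# Example: check the current directory for a database of a size you expect.
expected_size = 8 * 1024**3  # illustrative figure only, not a real database size
if not has_free_space(".", expected_size):
    print("Not enough free disk space for the selected database.")
```

This covers disk space only. Available RAM is a separate constraint, since larger databases may require more memory during classification.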
Advantages of Custom and External Databases
- Flexibility: Cater to specific research needs by selecting databases that best align with your project.
- Accuracy: Using up-to-date and relevant databases enhances the accuracy of taxonomic classification.
- Convenience: The ability to automatically download and configure external databases streamlines the setup process, making it user-friendly and efficient.
Together, these features let researchers match the reference database to their study, so that taxonomic classification in Nanometa Live is as specific and accurate as the chosen database allows.
Creation of Custom Kraken2 Databases
For a detailed guide on creating custom Kraken2 databases using FlexTaxD with GTDB taxonomy, refer to the official FlexTaxD documentation. The guide gives step-by-step instructions covering environment setup, downloading and processing the taxonomy files, and building and purging the database.
To access this guide, visit the FlexTaxD Wiki page on Building a GTDB Database and Optional Modification.