Metagenome database - jsgounot/metagenomic-pipelines GitHub Wiki
I have not updated this page in a while, and some elements might be outdated.
Databases of MAGs and related data. Some notes:
- A species is usually defined by a distance of 0.05 (95% identity), mostly computed with Mash first and then with ANI values.
- A non-redundant genome (strain?) is defined (in UHGG) by a Mash distance of 0.001 (99.9% identity).
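The two thresholds above can be sketched as a small helper. This is only an illustration of the arithmetic (identity = 1 - distance), not code from the pipeline: the function names, the label strings, and the choice to treat both cutoffs as inclusive are assumptions.

```python
# Thresholds taken from the notes above.
SPECIES_DISTANCE = 0.05        # 95% identity, usual species-level cutoff
NON_REDUNDANT_DISTANCE = 0.001  # 99.9% identity, UHGG non-redundancy cutoff


def distance_to_identity(distance: float) -> float:
    """Convert a distance in [0, 1] to a percent identity."""
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must be between 0 and 1")
    return (1.0 - distance) * 100.0


def classify_pair(distance: float) -> str:
    """Label a genome pair from its distance (hypothetical helper).

    Both cutoffs are treated as inclusive, which is an assumption.
    """
    if distance <= NON_REDUNDANT_DISTANCE:
        return "redundant"  # would collapse into one non-redundant genome
    if distance <= SPECIES_DISTANCE:
        return "same_species"
    return "different_species"


if __name__ == "__main__":
    for d in (0.0005, 0.02, 0.12):
        print(d, round(distance_to_identity(d), 2), classify_pair(d))
```

Distances would come from a tool such as Mash; parsing its output is left out here.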
General databases
- GTDB: Genome Taxonomy DataBase [47,894 species clusters]
- ProGenomes: Annotated bacterial and archaeal genomes from over 12,000 species
Gut databases
- UHGG: A unified catalog of gut metagenomes (July 2020 - paper) [4,644 prokaryotic representatives]
- HumGut: a comprehensive human gut prokaryotic genomes collection filtered by metagenome data (using both RefSeq and UHGG) (July 2021 - paper) (useful?)
- HRGM: Updated version of UHGG with additional Asian genomes (August 2021 - paper) [5,414 prokaryotic representatives]
Gene, protein and functional databases
- ProGenomes: Annotated bacterial and archaeal genomes from over 12,000 species
- UHGP: The UHGG (gut metagenome) protein database
- GMGC: The Global Microbial Gene Catalog is an integrated, consistently-processed, gene catalog of the microbial world, combining metagenomics and high-quality sequenced isolates