Metagenome database - jsgounot/metagenomic-pipelines GitHub Wiki

I have not updated this page in a while, and some elements might be outdated.

Databases of MAGs and related data. Some notes:

  • A species is usually defined by a distance threshold of 0.05 (95% identity), mostly computed with Mash first and then with an ANI value.
  • A non-redundant genome (strain?) is defined (in UHGG) by a Mash distance threshold of 0.001 (99.9% identity).
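
As a rough illustration of the two thresholds above, here is a minimal Python sketch that converts a distance to a percent identity and labels a genome pair. The function names, the labels, and the choice to treat each threshold as inclusive are assumptions made for this example; they are not taken from UHGG or from any tool.

```python
# Thresholds from the notes above.
SPECIES_MAX_DISTANCE = 0.05     # 95% identity: usual species boundary
REDUNDANT_MAX_DISTANCE = 0.001  # 99.9% identity: UHGG non-redundant boundary


def distance_to_identity(distance):
    """Convert a distance in [0, 1] to a percent identity."""
    return (1.0 - distance) * 100.0


def classify_pair(distance):
    """Label a genome pair from its pairwise distance.

    Assumption: thresholds are inclusive (<=). Real pipelines may differ.
    """
    if not 0.0 <= distance <= 1.0:
        raise ValueError("distance must be between 0 and 1")
    if distance <= REDUNDANT_MAX_DISTANCE:
        return "redundant"
    if distance <= SPECIES_MAX_DISTANCE:
        return "same species"
    return "different species"


if __name__ == "__main__":
    for d in (0.0005, 0.02, 0.12):
        print(d, round(distance_to_identity(d), 2), classify_pair(d))
```

The distances would typically come from a tool such as Mash, with ANI used as a second, more precise check near the species boundary.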

General databases

  • GTDB: Genome Taxonomy DataBase [47,894 species clusters]
  • ProGenomes: Annotated bacterial and archaeal genomes from over 12000 species

Gut databases

  • UHGG: A unified catalog of gut metagenomes (July 2020 - paper) [4,644 prokaryotic representatives]
  • HumGut: a comprehensive human gut prokaryotic genomes collection filtered by metagenome data (using both RefSeq and UHGG) (July 2021 - paper) (useful?)
  • HRGM: Updated version of UHGG with additional Asian genomes (August 2021 - paper) [5,414 prokaryotic representatives]

Gene, protein and functional databases

  • ProGenomes: Annotated bacterial and archaeal genomes from over 12000 species
  • UHGP: The UHGG (gut metagenome) protein database
  • GMGC: The Global Microbial Gene Catalog is an integrated, consistently-processed, gene catalog of the microbial world, combining metagenomics and high-quality sequenced isolates