Home - Ecogenomics/CheckM GitHub Wiki

Introduction

Bugs and Feature Requests

Installation

The latest release of CheckM is v1.1.6 (April 9, 2022). CheckM >=1.1.0 requires Python 3.

  • System Requirements - What sort of computer is required to run CheckM?
  • Installation - How can I install CheckM?
  • Upgrade - How can I upgrade CheckM to the latest release?
  • Unit Tests - How can I test my CheckM installation?
  • KBase - CheckM is available online through KBase!

Quick Start

Command Line Overview

  • Overview - list of all CheckM features

Workflow Overview

Genome Quality Commands

  • tree - place bins in the reference genome tree
  • tree_qa - assess phylogenetic markers found in each bin
  • lineage_set - infer lineage-specific marker sets for each bin
  • taxon_list - list available taxon-specific marker sets
  • taxon_set - infer taxon-specific marker set
  • analyze - identify marker genes in bins
  • qa - assess bins for contamination and completeness

Reported Statistics

  • qa - description of statistics reported by qa command

Plots

  • gc_plot - create GC histogram and delta-GC plot
  • coding_plot - create coding density (CD) histogram and delta-CD plot
  • tetra_plot - create tetranucleotide distance (TD) histogram and delta-TD plot
  • dist_plot - create image with GC, CD, and TD distribution plots together
  • nx_plot - create Nx-plots
  • len_hist - create sequence length histogram
  • marker_plot - plot position of marker genes on sequences
  • gc_bias_plot - plot bin coverage as a function of GC

Bin Exploration and Modification

  • unique - ensure no sequences are assigned to multiple bins
  • merge - identify bins with complementary sets of marker genes
  • outliers - identify outliers in bins relative to reference distributions
  • modify - modify sequences in a bin

Utility Commands

  • unbinned - identify unbinned sequences
  • coverage - calculate coverage of sequences
  • tetra - calculate tetranucleotide signature of sequences
  • profile - calculate percentage of reads mapped to each bin
  • join_tables - join tab-separated value tables containing bin information
  • ssu_finder - identify SSU (16S/18S) rRNAs in sequences