GenomeContamination - aechchiki/SIB_LongReadsWorkshop_Zurich18 GitHub Wiki

Genome contamination

Section: Genome assembly assessment [4/5].

In many cases, isolating DNA from a single organism is impossible. In these cases it is important to remove contaminants from the assembly.

Contaminants are usually identified by sequence similarity to known contaminants, deviant nucleotide composition, coverage that differs from the rest of the assembly, or some combination of these.
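As a rough illustration of the composition and coverage criteria, the sketch below flags contigs whose GC content or read coverage lies far from the assembly-wide median. Everything in it is an assumption made for the example: the toy contigs and coverage values are invented, the function names are hypothetical, and the outlier rule (a modified z-score based on the median absolute deviation, with the conventional cutoff of 3.5) is one simple choice among many, not the method used by any particular tool.

```python
from statistics import median


def gc_fraction(seq):
    """Fraction of G and C among the unambiguous bases (A, C, G, T)."""
    seq = seq.upper()
    gc = seq.count("G") + seq.count("C")
    at = seq.count("A") + seq.count("T")
    total = gc + at
    return gc / total if total else 0.0


def flag_outliers(values, threshold=3.5):
    """Return one boolean per value: True if it lies far from the median.

    Uses the modified z-score 0.6745 * |x - median| / MAD.
    The threshold of 3.5 is an assumed, conventional default.
    """
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # No spread at all: anything different from the median stands out.
        return [v != med for v in values]
    return [0.6745 * abs(v - med) / mad > threshold for v in values]


def suspicious_contigs(contigs):
    """Map contig name -> list of reasons ("gc", "coverage") it was flagged."""
    names = list(contigs)
    gc_flags = flag_outliers([gc_fraction(contigs[n][0]) for n in names])
    cov_flags = flag_outliers([contigs[n][1] for n in names])
    result = {}
    for name, gc_odd, cov_odd in zip(names, gc_flags, cov_flags):
        reasons = []
        if gc_odd:
            reasons.append("gc")
        if cov_odd:
            reasons.append("coverage")
        if reasons:
            result[name] = reasons
    return result


# Invented toy data: contig name -> (sequence, mean read coverage).
contigs = {
    "contig_1": ("ATGCATGCAT", 50.0),
    "contig_2": ("ATGCATGCTA", 48.0),
    "contig_3": ("ATTCATGCAT", 52.0),
    "contig_4": ("ATGCAGGCAT", 51.0),
    "contig_5": ("GCGCGCGCGC", 49.0),   # unusually GC-rich
    "contig_6": ("ATGCATGAAT", 400.0),  # unusually high coverage
}

for name, reasons in suspicious_contigs(contigs).items():
    print(name, reasons)
```

On real data, coverage would come from mapping reads back to the assembly, and flagged contigs would still need to be checked by sequence similarity before being removed.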

Blobology is a method that uses all of the properties mentioned above to create a blob plot.
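To make the idea concrete: a blob plot is commonly drawn with one point per contig, GC content on one axis, coverage on the other, and colour given by the taxon of the best similarity hit. That layout is an assumption about the usual convention, not something taken from this page. The sketch below does not draw anything; it only groups a hypothetical per-contig table by taxon and reports the total span and the length-weighted mean GC and coverage of each group, which is roughly what each "blob" represents. The records, taxon labels, and function name are all invented for the example.

```python
from collections import defaultdict

# Invented per-contig records: (name, length_bp, gc_fraction, coverage, taxon).
# In practice the taxon would come from a similarity search of each contig.
records = [
    ("contig_1", 120000, 0.36, 55.0, "taxon_A"),
    ("contig_2", 80000, 0.38, 50.0, "taxon_A"),
    ("contig_3", 40000, 0.65, 12.0, "taxon_B"),
    ("contig_4", 10000, 0.66, 8.0, "taxon_B"),
    ("contig_5", 5000, 0.50, 30.0, "no-hit"),
]


def summarise_by_taxon(records):
    """Return taxon -> total span, length-weighted mean GC and coverage."""
    totals = defaultdict(lambda: {"span": 0, "gc_sum": 0.0, "cov_sum": 0.0})
    for _name, length, gc, coverage, taxon in records:
        if length <= 0:
            continue  # skip empty contigs so the weighted means stay defined
        group = totals[taxon]
        group["span"] += length
        group["gc_sum"] += gc * length
        group["cov_sum"] += coverage * length
    return {
        taxon: {
            "span": group["span"],
            "gc": group["gc_sum"] / group["span"],
            "coverage": group["cov_sum"] / group["span"],
        }
        for taxon, group in totals.items()
    }


summary = summarise_by_taxon(records)
for taxon, stats in summary.items():
    print(f"{taxon}: span={stats['span']} bp, "
          f"GC={stats['gc']:.3f}, coverage={stats['coverage']:.1f}x")
```

Groups that sit far apart in both GC and coverage, as taxon_A and taxon_B do in this toy table, are the kind of pattern that shows up as separate blobs.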


From the plot it is usually apparent whether the assembly contains obvious contamination. The example picture shows a very clear separation of three genomes in what was originally intended to be an assembly of a single strain.

There is also a tool, Anvi’o, that focuses on metagenomic assemblies.

