Home - graph-genome/Schematize GitHub Wiki

Pantograph User Guide


Quick Start

If you are already familiar with graph genomes, there are just six things you need to know to interpret a Pantograph visualization. If any terms are unfamiliar, keep reading for a more detailed explanation.

  1. The graph nodes containing sequence are sorted left to right in their rough linear order. Alternative nodes are placed in their genomic context.

  2. Deletions / insertions appear in the MSA Matrix as columns with blank cells in most individuals.

  3. SNPs appear as a mostly blank column where individuals with the SNP have a cell filled in for the alternative allele.

  4. Pantograph identifies co-linear syntenic regions and encapsulates them in Components.

  5. Linear Components are connected by non-linear Links: edges representing inversions or translocations. This lets Pantograph treat all individuals who share a rearrangement as a single “variant” called a Link Column, which contains colored cells indicating the presence or absence of the rearrangement in each individual.

  6. You can zoom out by increasing “Bin Width”: the number of ‘adjacent’ nucleotides binned together in the pangenome. This tends to lump SNPs with their surrounding sequence while preserving large deletions. Bin widths of kilobases will show you only the large scale rearrangements of the pangenome.
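
To make the effect of Bin Width concrete, here is a minimal sketch of binning one individual's row of the matrix. It is an illustration only, not Pantograph's actual implementation: the function name `bin_coverage`, the 0/1 presence encoding, and the example row are all assumptions made for this example.

```python
def bin_coverage(presence, bin_width):
    """Return the fraction of covered positions in each bin.

    presence  -- hypothetical 0/1 list: 1 if the individual has sequence
                 at that pangenome position, 0 if the cell is blank
    bin_width -- number of adjacent positions lumped into one bin
    """
    if bin_width < 1:
        raise ValueError("bin_width must be at least 1")
    bins = []
    for start in range(0, len(presence), bin_width):
        window = presence[start:start + bin_width]
        bins.append(sum(window) / len(window))
    return bins


# Hypothetical row: a single blank cell at position 2 (a small variant)
# and a 4-position deletion at positions 4-7.
individual = [1, 1, 0, 1, 0, 0, 0, 0, 1, 1, 1, 1]

# Bin width 1 shows every cell; bin width 4 merges each run of 4 cells.
fine = bin_coverage(individual, 1)
coarse = bin_coverage(individual, 4)
print(coarse)
```

At bin width 4 the single blank cell is absorbed into a mostly covered bin, while the deletion still shows up as an empty bin. This mirrors how wider bins lump SNPs with their surrounding sequence but preserve large deletions.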

Browse Public Sequence Resource Data

If you want to skip everything else, just go to Visualization of Public Sequence Resource Data to take a look at a graphical pangenome of COVID-19 data.

Next: Introduction