Bin Size - graph-genome/Schematize GitHub Wiki

Bin Size as Zoom Level

Because a whole genome is too long to show in one view, modern genome browsers support zooming in and out. Existing graph genome browsers have struggled with this. The Pantograph pipeline explicitly defines the zoom level as the bin width, so you can choose any bin width when building a pangenome.

The two figures below show how the zoom level works. When the bin width is 1, the full nucleotide sequence is shown at the top of Pangenome Sequence. You can read every nucleotide, but the whole sequence usually won’t fit on the screen. To zoom out, select a larger bin width.

For example, the second figure below uses a bin width of 4, that is, 4 nucleotides per column. Individual nucleotides are no longer shown, but a wider range of the pangenome is visible in the browser.
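As a rough illustration of what "nucleotides per column" means, the sketch below splits a sequence into fixed-width bins and counts the resulting columns. The function names `bin_columns` and `column_count` are hypothetical and are not part of the Pantograph or Schematize code; the real pipeline bins paths through a graph rather than a single string.

```python
from math import ceil


def bin_columns(sequence: str, bin_width: int) -> list[str]:
    """Split a linear sequence into consecutive bins of `bin_width` nucleotides.

    Hypothetical helper for illustration only. The last bin may be shorter
    when the sequence length is not a multiple of the bin width.
    """
    if bin_width < 1:
        raise ValueError("bin_width must be a positive integer")
    return [sequence[i:i + bin_width] for i in range(0, len(sequence), bin_width)]


def column_count(sequence_length: int, bin_width: int) -> int:
    """Number of columns needed to display a sequence at a given bin width."""
    if bin_width < 1:
        raise ValueError("bin_width must be a positive integer")
    return ceil(sequence_length / bin_width)


if __name__ == "__main__":
    seq = "ACGTACGTAC"
    print(bin_columns(seq, 1))  # one nucleotide per column
    print(bin_columns(seq, 4))  # four nucleotides per column
    print(column_count(30000, 1), column_count(30000, 4))
```

Under these assumptions, a sequence of 30,000 nucleotides needs 30,000 columns at bin width 1 but only 7,500 at bin width 4, which is why a larger bin width fits more of the pangenome on screen.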

Notice that short variants tend to be compressed as the bin width grows, so a larger bin width lets you focus on larger rearrangements. If there are links across a component, component_segmentation bundles all links in the component and moves them to the start or end of the component.
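One way to see why short variants fade at larger bin widths is to look at per-bin coverage. The sketch below is a simplified model, not the actual Pantograph computation: it assumes each bin is summarized by the fraction of its positions that a path covers, and that a bin is drawn as absent only when that fraction falls below a hypothetical `threshold` of 0.5. The names `bin_coverage` and `visible_gaps` are also hypothetical.

```python
def bin_coverage(present: list[bool], bin_width: int) -> list[float]:
    """Fraction of positions in each bin that the path covers.

    `present[i]` is True when the path contains pangenome position i.
    Simplified model for illustration only.
    """
    if bin_width < 1:
        raise ValueError("bin_width must be a positive integer")
    coverage = []
    for start in range(0, len(present), bin_width):
        chunk = present[start:start + bin_width]
        coverage.append(sum(chunk) / len(chunk))
    return coverage


def visible_gaps(present: list[bool], bin_width: int, threshold: float = 0.5) -> int:
    """Count bins whose coverage is below the (assumed) display threshold."""
    return sum(1 for c in bin_coverage(present, bin_width) if c < threshold)


if __name__ == "__main__":
    # A path with a single-nucleotide deletion at position 5.
    short_variant = [True] * 12
    short_variant[5] = False

    # A path with an 8-nucleotide deletion at positions 4..11.
    large_variant = [True] * 16
    for i in range(4, 12):
        large_variant[i] = False

    for width in (1, 4):
        print(width, visible_gaps(short_variant, width), visible_gaps(large_variant, width))
```

In this model the single-nucleotide deletion is visible at bin width 1 but is absorbed into a mostly covered bin at bin width 4, while the 8-nucleotide deletion still leaves empty bins at both widths.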

Figure: SARS-CoV-b data with bin width = 1

Figure: SARS-CoV-b data with bin width = 4

Next: Example of SARS and SARS-CoV2