Crop regions and output DPI - arklumpus/TreeViewer GitHub Wiki

This tutorial shows how to use crop regions to export plots of different parts of a tree, and how to determine the appropriate resolution when rendering raster images.

A large tree

If you are working with a very large tree, the plot may be too big to print or to include in a manuscript. For example, the file Cyanobacteria.tre contains a - probably not very accurate - phylogenetic tree of 2063 cyanobacterial strains. If you were to print this tree on paper, with the tip labels at a real size of 7.5pt (i.e., fairly small), the paper would have to be more than 7.5 metres (25 feet) long.

One way to deal with this is to concentrate on smaller groups within the tree, one at a time. To do this, you have two options: you can either divide the tree into smaller sub-trees, or use crop regions to focus on different parts of the same plot, which is what we are going to do here.

You can start by downloading the tree file from the link above and opening it in TreeViewer (it might be a bit overwhelming at first). When you open it, the tree should be displayed in a rectangular (rooted) style; depending on your screen resolution, you may need to zoom in to see anything, because of the large number of taxa. The tree has branch labels showing the branch lengths; to remove them, click on the Modules tab at the top and then on Plot actions; this will open the plot actions panel on the left. There, you can remove the second Labels module by clicking on the X button. Now, click on the Coordinates module button, which will open the settings for the Coordinates module. Expand the Parameters, and set the Width to something more manageable (e.g., 5000), then click on the Apply button.

The plot will now become very narrow; you can click on the Fit button at the bottom-right of the interface to make it fit on the screen. It should look something like this:

Defining crop regions

We are now ready to focus on some specific groups. To begin with, let us focus on Planktothrix, a genus of toxic filamentous cyanobacteria that are involved in algal blooms. First of all, we need to figure out where in the tree the Planktothrix strains are. To do this, open the Search panel, either by clicking on the Search button in the Actions tab, or by pressing CTRL+F (CMD+F on macOS). Then, write Planktothrix in the Search box.

Note that the behaviour of the search function has been improved starting from TreeViewer v2.1.0. While you write the name in the box, the matching taxa will be highlighted in yellow; you can click on the Find button, press enter while the Search box is focused, or press F3 to select the taxa one at a time.

You should notice that most of the highlighted Planktothrix strains belong to the same monophyletic group, while only a couple of them are in a different position in the tree (likely due to mis-identification, though as mentioned the tree is not particularly accurate):

We will ignore the two strains outside of the main Planktothrix group; therefore, zoom in to focus on the actual Planktothrix genus:

To create a crop region centred on this group, start by clicking on the branch leading to the last common ancestor of the group to select it. Then, click on the Apply crop button in the Edit tab. This will add a rectangle with some arrows, highlighting the extent of the crop region (which will correspond to the region currently shown on screen):

If you now click on the Modules tab and then on the Plot actions button, you will notice that a new Plot action module has been added, called Crop region. The options for this module can be used to determine the extent of the crop region; for example, to make this tighter around the Planktothrix genus, you can set the X coordinate of the Top left point to -20 and the Y to -345, and then set the X coordinate of the Bottom right to 1100 and the Y to 77.

You should also give a name to the crop region by setting the Region name to e.g. Planktothrix, as this will make it easier to identify the region in the next steps. The settings should look similar to the ones on the right.

When you are satisfied with the selected crop region, you can uncheck the Show guides checkbox. This will cause the rectangle and the arrows to disappear, but the (now invisible) crop region will still exist. This is necessary if you do not want the rectangle and arrows to appear in the exported plot.

Exporting a crop region

To export the crop region you have just defined, click on File, then Export, and select a file format (PDF, SVG, or PNG/TIFF). If you have defined at least one crop region in the plot, you will notice a new option in the export panel, called Crop region. If you set this to Entire plot, the whole plot will be saved as a PDF/SVG/PNG/TIFF file, while if you select a crop region (identified by the name defined in the module parameters) only that region will be exported.

For example, if you select the Planktothrix region that we have just defined, this is what should be exported:

Adding another crop region

Following this approach, you can define and export any number of crop regions from the same plot. For example, say we also want to focus on Fischerella, a genus of filamentous nitrogen-fixing cyanobacteria. You can use the Search function to locate the strains belonging to this genus, as before:

Then, select the last common ancestor of this group and click on the Apply crop button in the Edit tab. Finally, open the options for the new Crop region Plot action module that has been added to the plot and change the Region name to Fischerella. Set the X coordinate of the Top left point to -20 and the Y to -405, set the X coordinate of the Bottom right point to 1700 and the Y to 85, and uncheck the Show guides checkbox.

You can now export this new crop region, and you should get something similar to the following:

Specifying the resolution (DPI) of an exported raster image

This is also a good time to explore producing raster images at a defined output resolution. Many journals will accept figures in PDF or SVG format, but some will insist that you provide raster images in TIFF or PNG formats; in this case, you need to make sure that the raster image you produce has a high enough resolution, otherwise it might be hard to read labels or distinguish taxa on the tree.

When exporting a raster image using TreeViewer, there are a number of parameters that you can use to ensure that the resolution of the image is appropriate.

Specifying the size in pixels of the image

The easiest case is to create an image with a defined pixel size (e.g., 1200px wide), as might be needed for an online-only figure. To produce such a figure in TreeViewer, click on File, then Export, select Export PNG/TIFF, select a crop region (if appropriate) and then change the Width in pixels of the image. This will automatically update both the Height of the image (to preserve the aspect ratio) and the DPI resolution.

For example, if you take the Cyanobacteria tree we have been using until now and export the Planktothrix region at a Width of 1200px, the values should look similar to the following:

Naturally, you can also use this approach to produce an image with a defined height rather than width. You can then click on the Export PNG or Export TIFF buttons to create the raster image file.
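The proportional update described above can be sketched in a few lines of Python. This is only an illustration, not TreeViewer's actual code: the CropRegion class and the size_from_pixel_width function are hypothetical names, and the sketch assumes that the crop region coordinates are expressed in plot units of 1/72 inch. The values shown in the export window may therefore differ.

```python
from dataclasses import dataclass

# Assumption: one plot unit corresponds to one typographic point (1/72 inch).
UNITS_PER_INCH = 72.0


@dataclass(frozen=True)
class CropRegion:
    """Hypothetical container for the corner coordinates of a crop region."""
    left: float
    top: float
    right: float
    bottom: float

    @property
    def width(self) -> float:
        return self.right - self.left

    @property
    def height(self) -> float:
        return self.bottom - self.top


def size_from_pixel_width(region: CropRegion, width_px: int):
    """Return (width_px, height_px, dpi) for a requested width in pixels."""
    scale = width_px / region.width            # pixels per plot unit
    height_px = round(region.height * scale)   # keep the aspect ratio
    dpi = scale * UNITS_PER_INCH               # pixels per inch
    return width_px, height_px, dpi


# Corner coordinates used for the Planktothrix region in this tutorial.
planktothrix = CropRegion(left=-20, top=-345, right=1100, bottom=77)
width_px, height_px, dpi = size_from_pixel_width(planktothrix, 1200)
print(width_px, height_px, round(dpi, 1))
```

Fixing the height instead of the width works the same way: divide the requested height by the region height to obtain the scale, then derive the width and the DPI from it.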

Specifying the physical size and resolution of the image

In other situations, you may want to produce an image with a specified physical size - e.g., if it needs to be printed, or even just included in an A4 page for publication. The first step is determining how big (in terms of millimetres/centimetres/inches) the image needs to be for your purpose; for example, BMC Bioinformatics requires a width of 85 mm for half-page figures and 170 mm for full-page figures.

You can input this size in the TreeViewer export window, by choosing the correct measurement unit (mm) and entering e.g. 170 in the corresponding number box.

Then, you need to figure out the resolution that you need for the image: this is specified in dots per inch (DPI), i.e. the number of pixels that represent a 1-inch length. The higher the DPI, the sharper the image; however, the image file will also become larger as you increase the resolution.
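The relationship between physical size, resolution and pixel size is simple arithmetic: one inch is 25.4 mm, so the pixel size is the physical size in inches multiplied by the DPI. The sketch below works this out for a 170 mm wide figure at 300 DPI; the function names are hypothetical, and TreeViewer may round the result differently.

```python
MM_PER_INCH = 25.4


def pixels_for_physical_size(size_mm: float, dpi: float) -> int:
    """Number of pixels needed to cover size_mm at the given resolution."""
    return round(size_mm / MM_PER_INCH * dpi)


def dpi_for_pixels(size_mm: float, pixels: int) -> float:
    """Resolution obtained when `pixels` are spread over size_mm."""
    return pixels / (size_mm / MM_PER_INCH)


full_page_px = pixels_for_physical_size(170, 300)  # full-page width
half_page_px = pixels_for_physical_size(85, 300)   # half-page width
print(full_page_px, half_page_px)                  # 2008 1004
```

Because the pixel count has to be a whole number, the effective resolution of the exported file can differ very slightly from the requested one (for example, 2008 pixels over 170 mm is marginally above 300 DPI).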

For example, BMC Bioinformatics recommends that figures be produced at a resolution of 300 DPI at the final size; hence, enter 300 in the Resolution box in the export window. This will keep the physical size of the image unchanged at a width of 170 mm, and increase the pixel size to reach the required resolution. If you do this for the Planktothrix crop region of our tree file, you should get values similar to the following:

You can then click on the Export PNG or Export TIFF buttons to create the raster image file (note that it might take some time to export a high-resolution image); if you inspect it (e.g. by right-clicking it and checking its properties on Windows, or by using ImageMagick's identify -verbose on other operating systems), you should see that the pixel size, the physical (print) size and the DPI resolution correspond to the expected values.
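If you would rather check a PNG file programmatically, the pixel size and resolution can be read from the file itself using only the Python standard library. The sketch below assumes that the resolution is stored in the standard PNG pHYs chunk (in pixels per metre); it does not handle TIFF files, and the function name is hypothetical. To keep the example self-contained, it builds a small sample PNG in memory rather than reading an exported file.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
METRES_PER_INCH = 0.0254


def png_size_and_dpi(data: bytes):
    """Return (width_px, height_px, dpi_x, dpi_y); DPI is None without pHYs."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("not a PNG file")
    width = height = dpi_x = dpi_y = None
    pos = 8
    while pos + 8 <= len(data):
        length, kind = struct.unpack(">I4s", data[pos:pos + 8])
        body = data[pos + 8:pos + 8 + length]
        if kind == b"IHDR":
            width, height = struct.unpack(">II", body[:8])
        elif kind == b"pHYs":
            ppu_x, ppu_y, unit = struct.unpack(">IIB", body)
            if unit == 1:  # 1 means the values are pixels per metre
                dpi_x = ppu_x * METRES_PER_INCH
                dpi_y = ppu_y * METRES_PER_INCH
        elif kind == b"IDAT":
            break  # pHYs, if present, always precedes the image data
        pos += 12 + length  # 4 (length) + 4 (type) + body + 4 (CRC)
    return width, height, dpi_x, dpi_y


def _chunk(kind: bytes, body: bytes) -> bytes:
    """Serialise one PNG chunk (used only to build the sample below)."""
    crc = zlib.crc32(kind + body)
    return struct.pack(">I", len(body)) + kind + body + struct.pack(">I", crc)


# Sample: a 2008x1 greyscale image tagged with 11811 pixels per metre.
sample = (
    PNG_SIGNATURE
    + _chunk(b"IHDR", struct.pack(">IIBBBBB", 2008, 1, 8, 0, 0, 0, 0))
    + _chunk(b"pHYs", struct.pack(">IIB", 11811, 11811, 1))
    + _chunk(b"IDAT", zlib.compress(b"\x00" + b"\x00" * 2008))
    + _chunk(b"IEND", b"")
)
width, height, dpi_x, dpi_y = png_size_and_dpi(sample)
print(width, height, round(dpi_x), round(dpi_y))
```

For a real file, read its bytes with open(path, "rb").read() and pass them to the function; dividing the pixel width by the DPI and multiplying by 25.4 then gives the print width in millimetres.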

This concludes the tutorial; if you wish, you can save the tree file from the File menu, or you can download the Cyanobacteria.tbi tree file, which contains the final tree with all the modules, including the crop regions.

Tips

  • The area within a crop region does not need to correspond to a monophyletic group.

  • Crop regions are "anchored" to a node, which is used as a reference point. For example, during this tutorial we used the last common ancestors of the groups we were interested in as anchors. This allows the crop region to move around the plot if you rearrange it (e.g. by rerooting or by collapsing some nodes). You might still need to do some fine-tuning, though - thus it is recommended that crop regions be defined as the last step before exporting the image.

  • When you create a crop region by clicking on the Apply crop button, by default the crop region will be anchored to the selected node (if any) and will include the area currently shown on screen by TreeViewer (Crop selection and current view). You can also anchor the crop region to the root node instead (Crop current view) or include only the selected node and its descendants within the crop region (Crop selection).

    • Note that this latter option will include all the nodes within the crop region, but not the associated paraphernalia (e.g. the tip labels). This can still be a good starting point for manual fine-tuning of the crop region.

  • From the parameters of the Crop region module you can also update the crop region to include the area of the plot that is currently on screen.

  • If you do not hide the crop regions (and maybe change their colours), you can use them to create a "mini map" of the large tree, to show where the groups originally were. For example, the following figure was created in Inkscape by merging the whole tree with the exported images of the two subtrees: