Displaying character states on a tree

This guide will provide instructions on how to draw a phylogenetic tree displaying character states together with the tree topology.

The tree file Bacteria.tre contains a rooted tree with 166 bacterial strains. The tree was created mainly based on the NCBI taxonomy, so the topology might not be particularly accurate; this is OK for this example, because we are only interested in showing the distribution of some features in the tree, without the pretense of providing an accurate bacterial tree of life. The root of the tree is also placed rather arbitrarily.

When the tree file is opened in TreeViewer, it should look similar to the following figure:

Cleaning up the tree for display

The tree is rather hard to read, owing to the many strains and the different branch lengths. To improve it and make it easier to read, we can first of all transform the branch lengths, so that the tree looks more like a cladogram.

To do this, open the modules panel by clicking on the Further transformations button in the Modules tab, then click on the Add module button to add a new Transform lengths module. This module can make all the branch lengths in the tree equal to each other, or transform the tree into a cladogram. For the latter, expand the options for the Transform lengths module and set the Transform parameter to Cladogram.

At this point, the branch length labels are not particularly useful and can be removed from the plot by clicking on the Plot actions button in the Modules tab and then on the × button next to the second Labels module in the Plot elements. The tip labels can also be removed, since with so many taxa the tree would be very hard to read anyway.

To increase the vertical spacing between the branches of the tree, click on the Coordinates module button in the Modules tab, then expand the parameters of the module and set the Height to 4000. Finally, you can make the branches thicker and easier to see by expanding the options of the Branches module in the plot elements section and setting the Line weight to 5.

The tree should now look similar to the following image:

Adding the character state data

A tree file only contains the phylogenetic relationships between the various taxa, and does not include information about additional character state data. This information can be included by adding an "Attachment" to the tree. The file Bacteria_data.tsv contains a table that specifies, for each bacterial strain in the tree:

  • The kind of photosynthetic reaction centre that the bacterium has; this character has 4 possible states:

    • N, if the bacterium does not have any photosynthetic reaction centre genes.
    • A, if the bacterium has genes for a Type I reaction centre.
    • B, if the bacterium has genes for a Type II reaction centre.
    • C, if the bacterium has genes for both Type I and Type II reaction centres (i.e. cyanobacteria).
  • The presence or absence of a norV nitric oxide reductase gene:

    • Y if this gene is present.
    • N if this gene is absent.
  • The presence or absence of the hmpA gene (which is another gene involved in nitric oxide detoxification):

    • Y if this gene is present.
    • N if this gene is absent.
  • A colour in RGB hex format (based on the group to which the bacterium belongs).

  • The group to which the bacterium belongs.

If you open the file in a text editor, you will see that it looks similar to this:

Genome                                          Photosynthesis    norV    hmpA    Color     Group
Abditibacterium_utsteinense_GCF_002973605.1           N            N       N     #B4B4B4    N/A
Achromobacter_sp._KAs_3-5_GCF_001975485.1             N            N       Y     #117733    Betaproteobacteria
Amphiplicatus_metriothermophilus_GCF_900199215.1      N            N       N     #CC6677    Alphaproteobacteria
Anabaena_sp._YBS01_GCF_009498015.1                    C            Y       N     #44AA99    Cyanobacteria
Anaerococcus_octavius_GCF_002847745.1                 N            Y       N     #999933    Firmicutes
Anaerocolumna_cellulosilytica_GCF_014218335.1         N            Y       N     #999933    Firmicutes
Anaerolinea_thermolimosa_GCF_001050195.2              N            N       N     #882255    Chloroflexi
...
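If you want to check a data file like this before attaching it, the following Python sketch reads a tab-separated table and verifies that each state column only contains the states listed above. It is only an illustration and is not part of TreeViewer: the function name `load_states` is hypothetical, the three sample rows are copied from the excerpt above, and the file is assumed to be tab-separated (as the .tsv extension suggests).

```python
import csv
import io

# Three rows copied from the excerpt above; tabs are assumed as separators.
SAMPLE = (
    "Genome\tPhotosynthesis\tnorV\thmpA\tColor\tGroup\n"
    "Achromobacter_sp._KAs_3-5_GCF_001975485.1\tN\tN\tY\t#117733\tBetaproteobacteria\n"
    "Anabaena_sp._YBS01_GCF_009498015.1\tC\tY\tN\t#44AA99\tCyanobacteria\n"
    "Anaerococcus_octavius_GCF_002847745.1\tN\tY\tN\t#999933\tFirmicutes\n"
)

# States documented for each character in this guide.
ALLOWED = {
    "Photosynthesis": {"N", "A", "B", "C"},
    "norV": {"Y", "N"},
    "hmpA": {"Y", "N"},
}


def load_states(handle):
    """Return {genome name: {column: value}}, rejecting undocumented states."""
    table = {}
    for row in csv.DictReader(handle, delimiter="\t"):
        name = row.pop("Genome")
        for column, allowed in ALLOWED.items():
            if row[column] not in allowed:
                raise ValueError(f"{name}: unexpected {column} state {row[column]!r}")
        table[name] = row
    return table


states = load_states(io.StringIO(SAMPLE))
print(states["Anabaena_sp._YBS01_GCF_009498015.1"]["Photosynthesis"])
```

To run the same check on the real file, pass `open("Bacteria_data.tsv", newline="")` instead of the `io.StringIO` object.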

This file can be added as an attachment to the TreeViewer plot by clicking on the Add attachment button in the Attachments tab. After selecting the file, click OK on the dialog window that opens to load the Attachment with the default settings. The attachment is shown as a paperclip in the Attachments tab; by clicking on the paperclip you can perform various actions (such as deleting the attachment or exporting it in order to recreate the original file). Note that once a file is added as an attachment to the tree, it becomes completely independent of the original file on your computer. One of the actions you can do by clicking on the paperclip button is to select the Spreadsheet editor option: this will open the attachment in a spreadsheet interface, which can be useful to make quick changes to the file within TreeViewer. In cells that contain a colour (like the Color column), a small square shows a colour preview; you can double-click on these to open a colour picker that lets you easily change the colour.

After adding the file to the plot, we need to tell TreeViewer how to interpret the data contained in it and associate it with the tips of the tree. This can be done by adding a new Parse node states module to the Further transformations. Once the module has been added, expand the options for this module and select the Bacteria_data attachment as the Data file, then check the Use first row as header check box in the New attribute section. This parameter tells TreeViewer to use the first row of the file to determine the column headers. If you click on the Preview button, you will get a preview of the data that will be parsed from the file (so that you can check that everything is in order).

Finally, click on the Apply button to add the attributes to the tree. Now, if you click on any leaf of the tree and open the attributes section of the selection panel, you will see the new attributes Photosynthesis, norV, hmpA, Color and Group that have been loaded from the file. The colour of the terminal branches will also change to reflect the colour contained in the data file.

The tree should now look similar to the following figure:

Currently, all internal branches in the tree are black, even those within coloured groups. To ensure that these internal branches also take the colour of the group they belong to, we can use the Propagate attribute module. Add the module to the Further transformations by clicking on the Add module button, then expand its options and set the Attribute to Color and the Attribute type to String. Now, set the Default value to #B4B4B4 (i.e. the same grey that is specified in the file for strains that do not belong to any of the highlighted groups). You can then click on the Apply button to update the tree.

This module traverses the tree from the tips towards the root; at each node, it checks whether all the children have the same value for the specified attribute (in this case, Color): if they do (which is the case e.g. for two strains in the same group), it applies the same value to the ancestor node as well. If they have different values, it does not propagate the attribute further, and uses the specified Default value instead.
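The behaviour described above can be sketched in a few lines of Python. This is an illustration of the idea only, not TreeViewer's implementation: the `Node` class and the `propagate` function are hypothetical, and the sketch assumes that a node whose children disagree simply receives the default value.

```python
from dataclasses import dataclass, field


@dataclass
class Node:
    # Hypothetical minimal tree node: a name, child nodes and string attributes.
    name: str
    children: list["Node"] = field(default_factory=list)
    attributes: dict[str, str] = field(default_factory=dict)


def propagate(node, attribute, default):
    """Visit the tree from the tips towards the root, setting `attribute` on internal nodes."""
    if not node.children:
        # Tips keep the value loaded from the data file (or the default if they have none).
        return node.attributes.get(attribute, default)
    child_values = {propagate(child, attribute, default) for child in node.children}
    if len(child_values) == 1:
        value = next(iter(child_values))  # all children agree: inherit their value
    else:
        value = default  # children disagree: fall back to the default
    node.attributes[attribute] = value
    return value


# Two cyanobacterial tips (same colour) plus one tip with a different colour.
cyano = Node("cyano_clade", [
    Node("tip1", attributes={"Color": "#44AA99"}),
    Node("tip2", attributes={"Color": "#44AA99"}),
])
root = Node("root", [cyano, Node("tip3", attributes={"Color": "#999933"})])

propagate(root, "Color", "#B4B4B4")
print(cyano.attributes["Color"])  # colour shared by both tips
print(root.attributes["Color"])   # default grey, because the children differ
```

Because the traversal is post-order, every child is resolved before its parent, so a single pass over the tree is enough.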

Plotting the character state data

We can now plot the actual character state data on the tree. To do so, you need to use a Node states Plot action module, which can be added to the tree by clicking on the Add module button in the Plot elements section. By default, the node states will be drawn as circles on the tips of the tree (you may want to zoom in to see them better).

To set up the way that the node states are displayed, open the options for the module and, first of all, set the Attribute (at the bottom) to Photosynthesis. This will update the plot, so that the circles reflect the kind of photosynthetic reaction centre each strain has. To change the colour associated with each reaction centre type, click on the Wizard edit state colours button. This will open a new window, in which you can specify the colour for the A, B, C, and N states. For example, you can set the following colours:

  • For the A state (i.e. Type I reaction centres), a dark orange colour (#D55E00).
  • For the B state (i.e. Type II reaction centres), a light blue colour (#56B4E9).
  • For the C state (i.e. cyanobacterial Photosystems I and II), a green colour (#009E73).
  • For the N state (i.e. non-photosynthetic strains), a grey colour (#DCDCDC).

We now need to position the character state plots appropriately. To move them to the right of the tree labels, you can set the X component of the Position to 50. We can also change the plot Type to Rectangle and increase the Width and Height to 50 and 27, respectively. This will make it easier to see the state associated with each strain.

The plot should now look similar to the following figure:

We can add information about the other character states (i.e. the presence or absence of hmpA and norV) by adding more instances of the Node states Plot action module.

To show the data for norV, after adding the new Node states module, set the Attribute to norV, the X component of the Position to 120, the plot Type to Rectangle, the Width to 50 and the Height to 27. Finally, click on the Wizard edit state colours button and set the colour for the Y state (i.e. presence of the gene) to a red-pink hue (#CC79A7) and the colour for the N state (i.e. absence of the gene) to grey (#DCDCDC).

Similarly, for the hmpA gene, set the Attribute to hmpA, the X component of the Position to 190, the plot Type to Rectangle, the Width to 50 and the Height to 27. In this case, we can use an orange colour (#E69F00) for the Y state, and the same grey colour for the N state (#DCDCDC).
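The X values used for the three modules (50, 120 and 190) are evenly spaced: each column is offset from the previous one by the rectangle Width (50) plus a gap of 20. If you want to add further columns, the small Python sketch below computes the offsets; the function name `column_positions` is hypothetical and the gap of 20 is simply inferred from the values used in this guide.

```python
def column_positions(first_x, width, gap, count):
    """Return the X offset of each character-state column, evenly spaced."""
    return [first_x + i * (width + gap) for i in range(count)]


# Width 50 and gap 20 reproduce the offsets used above for the three characters.
print(column_positions(50, 50, 20, 3))  # [50, 120, 190]
```

With the same settings, a fourth character would go at an X offset of 260.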

The tree should now look similar to the following image:

Highlighting groups of taxa on the tree

Right now, while the various groups of bacteria have different colours, the tree does not show their names. To make the tree easier to interpret, we can show the group names.

In order to do this, first of all we need to propagate the Group attribute the same way that we propagated the Color attribute: add another instance of the Propagate attribute module to the Further transformations, and set the Attribute to Group, the Attribute type to String and then delete the Default value (so that the text box is empty). Then, click on the Apply button.

Now, you can add a Group labels Plot action module to the tree plot. In the options for this module, set the Attribute to Group and make sure that the Only on last ancestor check box is checked (otherwise, a separate label will be shown for each strain). You should notice that the labels are currently drawn on the left side of the tree (near the root). To move them to the right, you should increase the Distance parameter to 2520.

The text of the labels is a bit too small; to adjust this, you should change the Height to 100, and then click on the Font button to increase the font size to 75. The labels should now become larger and more visible. You will notice that the program does its best to prevent the labels from overlapping each other. To increase the spacing between the "rows" of labels, increase the Row margin to 25.

As you can see, there are multiple labels that read N/A: this is because, in the data file, strains that do not belong to any of the groups highlighted on the tree have this value in the Group column. To prevent these labels from showing up on the tree, we can "delete" them by replacing the N/A values with empty strings. There are multiple ways to do this. For example, if you click on the Search button in the Actions tab, a search bar will appear. Now, if you click on Advanced to open the advanced search options, you can select Group as the Attribute to search. Then, enter N/A in the Search box and leave the Replace with box empty (notice how all the strains that match the criterion become highlighted). Finally, click on the Replace all button. This will add a new instance of the Replace attribute module to the Further transformations that performs the required substitution. The N/A labels should now disappear.
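Conceptually, the substitution amounts to the following Python sketch. It is not the code of the Replace attribute module: `replace_attribute` is a hypothetical helper, the tips are represented as plain dictionaries, and an exact match on the attribute value is assumed.

```python
def replace_attribute(tips, attribute, search, replacement=""):
    """Replace `search` with `replacement` in `attribute`; return the names of the matching tips."""
    matched = []
    for name, attributes in tips.items():
        if attributes.get(attribute) == search:
            attributes[attribute] = replacement
            matched.append(name)
    return matched


# Two strains from the data file excerpt: one ungrouped, one cyanobacterium.
tips = {
    "Abditibacterium_utsteinense_GCF_002973605.1": {"Group": "N/A"},
    "Anabaena_sp._YBS01_GCF_009498015.1": {"Group": "Cyanobacteria"},
}

changed = replace_attribute(tips, "Group", "N/A")
print(changed)  # only the ungrouped strain matches
```

After the replacement, the ungrouped strain has an empty Group value, so no label is drawn for it.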

The tree should now look similar to the following figure:

Adding a legend

The final touch that is missing is a legend to identify the meaning of the various colours for the tip states. This can be added using the Legend module.

To add a legend, add a new Plot action module, and select the Legend module. A legend will appear at the bottom of the tree, with a red circle, a blue square and a green star. Expand the options of this module, and click on the Edit button for the Markdown source parameter. In the Markdown editor window, enter the following Markdown code:

### **Legend**

**Photosynthesis** \
![](square://8,#DCDCDC) Non-photosynthetic taxon \
![](square://8,#D55E00) Type I reaction centre \
![](square://8,#56B4E9) Type II reaction centre \
![](square://8,#009E73) Photosystems I and II

**_norV_ Gene** \
![](square://8,#DCDCDC) Absent \
![](square://8,#CC79A7) Present

**_hmpA_ Gene** \
![](square://8,#DCDCDC) Absent \
![](square://8,#E69F00) Present

See the readme for the Legend module for more information about the "special image" syntax used to draw the coloured squares.
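If you change the state colours often, you may prefer to generate the legend source from the same colour table rather than editing it by hand. The Python sketch below rebuilds the Markdown shown above; the helper names `legend_section` and `build_legend` are hypothetical, and the only syntax it relies on is the `square://size,colour` form already used in this guide.

```python
def legend_section(title, entries, size=8):
    """Return one legend block: a bold title, then one coloured square per state."""
    rows = [f"**{title}**"]
    rows += [f"![](square://{size},{colour}) {label}" for colour, label in entries]
    # A trailing backslash forces a line break in Markdown; the last row needs none.
    return " \\\n".join(rows)


def build_legend(sections):
    """Join the heading and all sections, separated by blank lines."""
    blocks = [legend_section(title, entries) for title, entries in sections]
    return "### **Legend**\n\n" + "\n\n".join(blocks)


# Colours and labels as chosen earlier in this guide.
SECTIONS = [
    ("Photosynthesis", [
        ("#DCDCDC", "Non-photosynthetic taxon"),
        ("#D55E00", "Type I reaction centre"),
        ("#56B4E9", "Type II reaction centre"),
        ("#009E73", "Photosystems I and II"),
    ]),
    ("_norV_ Gene", [("#DCDCDC", "Absent"), ("#CC79A7", "Present")]),
    ("_hmpA_ Gene", [("#DCDCDC", "Absent"), ("#E69F00", "Present")]),
]

legend = build_legend(SECTIONS)
print(legend)
```

The printed text can be pasted directly into the Markdown editor window of the Legend module.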

We now need to position the legend appropriately and to increase the font size. To align the top-left corner of the legend with the top-left corner of the tree plot, set both the Anchor and Alignment to Top-left. Then, increase the Width of the legend to 1000 and the Font size to 70. Finally, to prevent the legend from hiding the branches of the tree, you can change the Background colour to a completely transparent colour (or, alternatively, drag the Legend module up until it appears in the Plot elements list before the Branches).

The final tree figure should be similar to the following:

You can now save the tree file or the plot as a PDF or SVG file using the items from the File menu. You can also download the Bacteria.tbi tree file, which contains the tree along with all the modules.

A plot like this makes it easy to identify the character states for the various groups in the tree, as well as to highlight the [lack of] co-occurrence of different features (e.g., from this tree it can be easily seen that most strains with the hmpA gene do not have the norV gene and vice versa).
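The same pattern can also be checked numerically by counting how often each combination of states occurs. The Python sketch below does this for the seven strains shown in the data file excerpt only, so its counts say nothing about the full set of 166 strains; the function name `cooccurrence` is hypothetical.

```python
from collections import Counter

# (norV, hmpA) states of the seven strains listed in the data file excerpt.
ROWS = [
    ("N", "N"),  # Abditibacterium_utsteinense
    ("N", "Y"),  # Achromobacter_sp._KAs_3-5
    ("N", "N"),  # Amphiplicatus_metriothermophilus
    ("Y", "N"),  # Anabaena_sp._YBS01
    ("Y", "N"),  # Anaerococcus_octavius
    ("Y", "N"),  # Anaerocolumna_cellulosilytica
    ("N", "N"),  # Anaerolinea_thermolimosa
]


def cooccurrence(pairs):
    """Count every (norV, hmpA) combination, including those that never occur."""
    counts = Counter(pairs)
    return {(a, b): counts[(a, b)] for a in "YN" for b in "YN"}


table = cooccurrence(ROWS)
for (norv, hmpa), n in table.items():
    print(f"norV={norv} hmpA={hmpa}: {n}")
```

In this small excerpt, no strain carries both genes; to repeat the count on the whole data set, build `ROWS` from the norV and hmpA columns of Bacteria_data.tsv.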

Tips

Rotating the tree

If you do not like the vertical layout (or the fact that the text for the group labels is rotated), you can rotate the tree by expanding the options for the Coordinates module and setting the Rotation to e.g. 270°. Then, you just need to reposition the legend by changing its Anchor and Alignment to Bottom-left and setting the Y component of the Position to 300 (you can of course specify a different position, if you wish). The rest of the plot will be updated automatically to fit the new layout! The figure will look similar to the following:

Drawing a circular tree

This kind of plot is also particularly well suited to a circular layout. To change the layout to a circular tree, do not click on the Circular style button (because this will reset the plot elements that we have spent so much time customising to the default). Instead, click on the Reshape tree button and select Circular style from the drop-down menu. This will update the plot settings without removing our customisations. Then expand the options for the Coordinates module and set the Outer radius to 1500, then click on Apply.

Then, open the options for each Node states module and set the Anchor to Origin. Increase the X component of the position for each module (e.g. set it to 1550 for the first module, 1620 for the second module and 1690 for the third module). Now, set the plot Type to Wedge and update the Width so that it is equal to the Suggested width. You can also increase the Height to 50 for all three modules.

In the options for the Group labels module, reduce the distance to e.g. 1750. You can also click on the Font button and increase the font size to 100.

Finally, reposition the legend by changing its Anchor to Middle-right and the Alignment to Middle-left and set the Position so that the X is equal to 500 and the Y is set at 0. The new plot should look similar to this figure:

The Bacteria_circular.tbi tree file contains the circular tree plot with all the modules. The finished tree file is also available in the Examples section of the welcome page, which is found in TreeViewer's File page.