Visual test system instructions - wch/ggplot2 GitHub Wiki

Visual test system instructions

This is a primer on how to install and run the visual test code.

System requirements:

  • ImageMagick, to do image comparisons
  • Ghostscript, to convert from PDF to PNG (ImageMagick's convert function apparently uses Ghostscript to do this conversion but I had weird color consistency problems with it).
  • If you have ImageMagick installed, without a separate Ghostscript installation, you can use IM's convert utility by setting a flag in the convert_pdf2png function (this will be easier in the future).
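Before running anything, it can help to confirm that the external tools are on your PATH. The following is a minimal sketch, not part of vtest; the executable names (`compare` and `convert` for ImageMagick, `gs` for Ghostscript) are assumptions that hold on typical Unix-like installations but may differ on your system.

```shell
# Succeeds if the named command can be found on the PATH.
have_tool() {
    command -v "$1" >/dev/null 2>&1
}

# Assumed executable names: "compare" and "convert" (ImageMagick), "gs" (Ghostscript).
for tool in compare convert gs; do
    if have_tool "$tool"; then
        echo "$tool: $(command -v "$tool")"
    else
        echo "$tool: not found"
    fi
done
```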

I don't know if this will work on Windows right now -- it's possible I've handled path names and done other things that work only on Unix-like systems. If you try this on Windows, please let me know what works and doesn't work.

Also see the Visual test system overview page for further information.

Get the code

First, install the package:

library(devtools)
# Recent versions of devtools take the repository as "username/repo"
install_github("wch/vtest")

Run visual tests

In the ggplot2 directory, start up R.

library(vtest)

# Run all the visual tests
vtest(".")
# If you started in ggplot2's parent directory, you could use vtest("ggplot2")

This will run all the visual test scripts in ggplot2/visual_test/. The results from vtest will be saved in ggplot2/visual_test/vtest/.

Whenever you run vtest(), the results are automatically saved in the lasttest results table, which is a temporary data store, separate from the test result database. If the working tree of your project is clean (you haven't modified files since the last git commit), then it will ask you if you want to add the results to the test database.
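If you want to know ahead of time whether your working tree counts as clean, you can ask git directly. This is a sketch of the conventional check, assuming that "clean" means `git status --porcelain` prints nothing; how vtest itself decides (for example, whether untracked files count) is not stated here. The function name `tree_is_clean` is made up for this example.

```shell
# Exit status 0 if the git work tree at $1 (default ".") has no modified,
# staged, or untracked files; 1 if it has any; 2 if $1 is not a git work tree.
tree_is_clean() {
    dir="${1:-.}"
    git -C "$dir" rev-parse --is-inside-work-tree >/dev/null 2>&1 || return 2
    [ -z "$(git -C "$dir" status --porcelain)" ]
}

if tree_is_clean .; then
    echo "working tree is clean"
else
    echo "working tree is not clean (or this is not a git repository)"
fi
```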

Visualize test results

After you run the tests, you can generate web pages that illustrate the results, highlighting which tests had warnings, and which had errors.

vtest_webpage(pkg=".")
# Writing /some_path/ggplot2-vtest/html/index.html
# ... [A bunch of other output] ...
# Open webpage in browser? (y/n)

Answer y to open the result page in a browser.

By default, vtest_webpage() generates pages for the lasttest results. You can generate web pages for any given commit (for which there are test results) by specifying ref. Any valid git commit ref can be used:

vtest_webpage(ref = "HEAD^")
vtest_webpage(ref = "master")
vtest_webpage(ref = "9c6542")
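To see which commit a ref points at before passing it to vtest_webpage(), you can resolve it with git. This is a small sketch using `git rev-parse`; the helper name `resolve_ref` is hypothetical, and the refs in the loop are just the ones from the example above.

```shell
# Print the full commit hash that ref $1 resolves to; fail if it does not
# name a commit in the repository at $2 (default: current directory).
resolve_ref() {
    git -C "${2:-.}" rev-parse --verify --quiet "$1^{commit}" 2>/dev/null
}

for ref in "HEAD^" master 9c6542; do
    if sha=$(resolve_ref "$ref"); then
        echo "$ref -> $sha"
    else
        echo "$ref does not resolve to a commit here"
    fi
done
```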

Make changes to your code

Now you can make changes that will result in different test images. You can modify the code for the visual tests. You can also add, remove, or rename tests. Or if you're serious, you can edit "real" code in your project.

After you make the changes, re-run the tests:

# ... edit code ...

vtest(".")

Compare test results to a set of results from another test run

Each commit of your project has a set of tests in it, and the results of the tests for a given commit (the output images and any warnings/errors) are called a resultset. The resultset for each commit can be stored in the test database. The current (uncommitted) working tree of your project can also have a resultset, but if the working tree is dirty (has been changed since the last commit), this resultset can't be saved to the database. Each resultset is identified with an MD5 hash; if two resultsets are the same, they will have the same hash, and if they differ at all, they will have different hashes.
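The idea of identifying a whole set of results by one hash can be illustrated at the shell. This is only a sketch: the exact data vtest feeds into the MD5 hash is not described here, so `combined_hash` is a hypothetical scheme that simply hashes a list of per-image hashes. It assumes the `md5sum` utility (GNU coreutils); other systems may provide a different MD5 command.

```shell
# Combine per-image hashes into a single MD5 identifier (hypothetical scheme).
combined_hash() {
    printf '%s\n' "$@" | md5sum | cut -d' ' -f1
}

same_1=$(combined_hash ecce89 06ae72)
same_2=$(combined_hash ecce89 06ae72)
changed=$(combined_hash ecce89 a1249e)   # second image differs

[ "$same_1" = "$same_2" ] && echo "identical inputs: identical hash"
[ "$same_1" != "$changed" ] && echo "one image changed: different hash"
```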

If you would like to compare the last resultset to another resultset (based on tests in another commit), there are three functions that will be useful:

  • recent_vtest() returns a data frame of recent commits to your project, and, if they have test results stored in the database, their corresponding resultset_hash.
  • vdiffstat() returns a data frame of changes in test results between one commit and another.
  • vdiff_webpage() will generate web pages that show the differences between two commits.

recent_vtest()

This is how to see which recent commits have test results that are in the database. An NA value for resultset_hash indicates that there are no results for that commit in the database. Two resultsets are identical if and only if their hashes match. (Actually, there is a possibility of hash collisions, but it's extremely improbable.)

recent_vtest()
#  commit                   resultset_hash
#  3ee536                             <NA>
#  d23fba 852d1d5b3ec94c73a733062903b324a8
#  9fc863                             <NA>
#  9c6542 852d1d5b3ec94c73a733062903b324a8
#  bd2905                             <NA>
#  aac84d                             <NA>
#    ...

By default, recent_vtest() starts with the currently checked-out commit and lists the commits going backward from there. It shows only those commits on the current branch (and not merged-in branches); to change this, use main_branch=FALSE. For example, if you are currently on the master branch, it won't show any commits on branches that were merged in; however, it will show the merge commits.

It also assumes that the git repository is in the current directory ("."). See the help page for information on how to change this.
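The default commit listing (current branch only, merge commits included, merged-in commits excluded) resembles what `git log --first-parent` prints. The sketch below shows that git command for comparison; whether recent_vtest() uses it internally is an assumption, and `first_parent_commits` is a name invented for this example.

```shell
# List abbreviated commit hashes reachable from HEAD, following only the
# first parent of each merge. $1 = repository path, $2 = number of commits.
first_parent_commits() {
    git -C "${1:-.}" log --first-parent --format=%h -n "${2:-10}"
}

# Run only when the current directory is inside a git repository.
if git rev-parse --git-dir >/dev/null 2>&1; then
    first_parent_commits . 6
fi
```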

vdiffstat()

The command below will compare the most recent saved resultset to the resultset from the last-run test (which was not necessarily run on committed code). In the example here, the most recent saved resultset is d23fba, which is also HEAD^. Any valid git ref can be used, such as the HEAD^ used here, a branch name like mybranch, or a commit SHA hash like 04afe8. Additionally, there are two special ref strings:

  • "recent" refers to the most-recent saved resultset.
  • An empty string, "", refers to the last-run resultset.

Here is what the output might look like (with shortened descriptions and hashes for readability):

vdiffstat("recent", "")
#   context                      desc status  hash1  hash2 order1 order2
#   dotplot           multiple groups      D ecce89   <NA>     15     NA
#   dotplot stackgroups with 3 groups      A   <NA> 910c60     NA     26
#   dotplot bin y, dodging, stackgrou      A   <NA> 3ab86b     NA     29
# geom-path             lines, colour      C 06ae72 a1249e      3      3

This means that one test was (D) deleted, two tests were (A) added, and the output image for one test (C) changed.

The default is to compare HEAD to the last tests, and it assumes that the package is in the current directory (".").

vdiff_webpage()

This will generate webpages for viewing the differences:

vdiff_webpage("recent", "")
# Writing /some_path/ggplot2-vtest/diff/index.html
# ... [A bunch of other output] ...
# Open webpage in browser? (y/n)

Answer y to open the pages in a browser. They highlight added, deleted, and changed images. This differs from the vtest_webpage() pages, which, instead of doing comparisons, show test result images and highlight those that raised warnings and errors.

In most cases, you will want to commit the changes to your code if there were no changes to the test results. If you are fixing a bug or adding/removing tests, then you will of course want there to be changes to the results.


Using visual tests in code development

One of the important uses of visual tests is to check that your code doesn't break things, before you commit it to the repository. This can be done in two ways: by visually inspecting the test images for your commit, and by comparing the test results to those from a previous commit. The first method, visual inspection, is straightforward. I'll discuss the second method, comparing results, here.

Suppose that this is the structure of your commit tree:

 A---B---C---D---E---F master
              \
               G---H topic

If you are working on the master branch, and it is at commit F, then you can run the tests on F (and save the results), change your code, then run the tests again and check for changes. Typically, if there are no changes, then your code is good, and you can commit the changes. Instructions for this process are given above.

Suppose you are working on the topic branch and you are at commit H. You want to see if the test results from H are different from the results at D. To do this, you can check out the code at D, run the tests and save them, then check out the code at H and compare its results to the results from D. The commands would look something like this:

At the command shell, check out commit D:

git checkout D

Run the tests in a new R session:

library(vtest)
vtest('.')
# When asked, choose to add the results to the test database

Now check out commit H:

git checkout H

Run the tests again in a new R session:

library(vtest)
vtest('.')
# You can save the tests but it's not required

Now compare the results:

# Print table of changes
vdiffstat("D", "")

# If you saved the test results for H, you can also run
# vdiffstat("D", "H")

# Generate diff web pages
vdiff_webpage("D", "")
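In the commit tree above, D is the commit where topic forked from master. If you don't know that commit offhand, git can find it for you. This is a sketch using `git merge-base`; the helper name `fork_point` is made up here, and the branch names master and topic are taken from the diagram.

```shell
# Print the commit where two branches diverged (their best common ancestor).
# $1 and $2 are branch names or other refs; $3 is the repository path.
fork_point() {
    git -C "${3:-.}" merge-base "$1" "$2"
}

# With the tree above, this would print the hash of commit D.
if git rev-parse --verify --quiet master >/dev/null 2>&1 &&
   git rev-parse --verify --quiet topic >/dev/null 2>&1; then
    fork_point master topic
fi
```

The printed hash can be used with git checkout and in place of "D" in the vdiffstat() call.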

Downloading existing resultsets

Instead of running tests on each commit yourself, you can download existing resultset data from a git repository. They should go into the ggplot2/visual_test/vtest directory, and they can be retrieved this way:

cd ggplot2/visual_test/
git clone https://github.com/wch/ggplot2-vtest vtest
# Cloning into 'vtest'...

After that, you can retrieve new results by doing a git pull, provided that you do not modify your local database.

cd ggplot2/visual_test/vtest/
git pull

If you run tests locally, you can also push the changes and submit a pull request. (To submit pull requests, you'll first have to fork the repo on GitHub, then clone it to your local machine.) If you do this, please only push tests on the master branch -- don't push tests on side branches.

If you want to save tests on a side branch while developing, you can do that, and as long as you don't run git commit on the saved results, you can later discard the results from the branch by running git reset --hard on the vtest repo.
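One thing to keep in mind is that git reset --hard only restores tracked files; files that are new since the last commit stay in place as untracked files until they are removed, for example with git clean. The sketch below combines both steps. It is destructive, the function name `discard_uncommitted` is hypothetical, and whether saved results show up as untracked files in your vtest directory is an assumption you should check with git status first.

```shell
# Throw away all uncommitted changes in the repository at $1.
# DESTRUCTIVE: modified tracked files are restored and untracked files deleted.
discard_uncommitted() {
    git -C "$1" reset -q --hard &&
    git -C "$1" clean -q -fd
}

# Preview what would be discarded, then discard it (uncomment to run):
# git -C ggplot2/visual_test/vtest status --short
# discard_uncommitted ggplot2/visual_test/vtest
```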


Database maintenance

The integrity of the database and the associated image files can be checked with check_vtest_db().

check_vtest_db()
# Checking that commits don't have multiple entries in commit table... OK (3 unique commits).
# Checking that resultset hashes are correct... OK (1 resultset hashes checked).
# Checking that all resultset hashes in commit table are also in resultset table... OK (1 resultset hashes checked).
# Checking that all resultset hashes in resultset table are also in commit table... OK (1 resultset hashes checked).
# Checking that all commits in commits table are also in git history... OK (3 commits checked).
# Checking that all result image hashes referenced in resultsets table have files in images/... OK (110 image hashes checked).
# Checking that all files in images/ have matching image hash entries in resultsets table... OK (110 files).
# Checking that all images have correct hashes... OK (110 images checked).