Development notes and instructions - ncbi/vadr GitHub Wiki


Installation instructions for users

The file vadr-install.sh is an executable shell script that will install VADR and its dependencies and then output important instructions for updating your environment variables so you can run the vadr scripts.

More detailed instructions can be found in VADR's install.md.


Installation instructions for developers

The instructions below are relevant only to developers who wish to develop VADR or modify it in some way. For users interested only in running VADR, see the Installation instructions for users above. The developer installation is broken down into three steps:

steps for initial checkout

VADR depends on several other repositories at GitHub: sequip, Bio-Easel, infernal, hmmer and easel (all cloned in the steps below).

Further, Bio-Easel, infernal and hmmer also depend on easel.

vadr also requires NCBI BLAST version 2.12.0+.

To clone a vadr repository for the first time and get it set up for development, follow the steps below.

First, move into the directory where you want to keep all the code. Later, you will set the $VADRINSTALLDIR environment variable to this directory. That directory is referred to as !PATH-TO-VADR-INSTALL-DIR! below:

   $ cd !PATH-TO-VADR-INSTALL-DIR!
   $ git clone https://github.com/ncbi/vadr.git
   $ git clone https://github.com/nawrockie/sequip.git
   $ git clone https://github.com/nawrockie/Bio-Easel.git
   $ (cd Bio-Easel; mkdir src; cd src; git clone https://github.com/EddyRivasLab/easel.git easel)
   $ git clone https://github.com/EddyRivasLab/infernal.git infernal
   $ cd infernal
   $ git clone https://github.com/EddyRivasLab/hmmer
   $ git clone https://github.com/EddyRivasLab/easel

This will set you up on the master branches for all packages.

To do development, you'll now want to check out the develop branches in all of the git repos you just cloned, using the commands listed below. Alternatively, you can skip this step to build the stable master branches.

   $ cd !PATH-TO-VADR-INSTALL-DIR!
   $ (cd vadr; git checkout develop;)
   $ (cd sequip; git checkout develop;)
   $ (cd Bio-Easel; git checkout develop;)
   $ (cd Bio-Easel/src/easel; git checkout develop;)
   $ (cd infernal; git checkout develop;)
   $ (cd infernal/easel; git checkout develop;)
   $ (cd infernal/hmmer; git checkout develop;)
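
The seven checkouts above can also be written as a loop. The following is a minimal sketch, not part of VADR: the function name checkout_all is hypothetical, and it assumes you call it from !PATH-TO-VADR-INSTALL-DIR! with the repository paths used in the commands above.

```shell
# Hypothetical helper (not part of VADR): check out one branch in several repos.
# Usage: checkout_all <branch> <repo-dir>...
checkout_all() {
    branch=$1
    shift
    for repo in "$@"; do
        if [ -e "$repo/.git" ]; then
            # Run in a subshell so the caller's working directory is unchanged.
            (cd "$repo" && git checkout "$branch") || return 1
        else
            echo "skipping $repo: not a git checkout" >&2
        fi
    done
}
```

For example: `checkout_all develop vadr sequip Bio-Easel Bio-Easel/src/easel infernal infernal/easel infernal/hmmer`. Directories that were never cloned are reported and skipped rather than treated as an error.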

You'll also want to download the BLAST 2.12.0+ distribution with pre-compiled binaries, either for Linux:

   $ curl -k -L -o blast.tar.gz https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0/ncbi-blast-2.12.0+-x64-linux.tar.gz

or for Mac/OSX:

   $ curl -k -L -o blast.tar.gz https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0/ncbi-blast-2.12.0+-x64-macosx.tar.gz

And then unpack it:

   $ tar xfz blast.tar.gz
   $ rm blast.tar.gz
   $ mv ncbi-blast-2.12.0+ ncbi-blast
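
If you script this step, the platform-specific URL can be chosen from the output of `uname -s`. This is a sketch under assumptions: the function name blast_url is hypothetical, and it only knows the two tarball names shown above.

```shell
# Hypothetical helper: print the BLAST+ tarball URL for a version and OS name.
# Usage: blast_url <version> <os>, where <os> is the output of `uname -s`.
blast_url() {
    version=$1
    case "$2" in
        Linux)  platform=linux ;;
        Darwin) platform=macosx ;;
        *)      echo "unsupported OS: $2" >&2; return 1 ;;
    esac
    echo "https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/${version}/ncbi-blast-${version}+-x64-${platform}.tar.gz"
}
```

It could then be used as `curl -k -L -o blast.tar.gz "$(blast_url 2.12.0 "$(uname -s)")"`, followed by the unpack commands above.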

And you'll want to download the VADR virus libraries:

   $ curl -k -L -o vadr-models-flavi-1.2-1.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/flaviviridae/1.2-1/vadr-models-flavi-1.2-1.tar.gz
   $ tar xfz vadr-models-flavi-1.2-1.tar.gz
   $ rm vadr-models-flavi-1.2-1.tar.gz
   $ curl -k -L -o vadr-models-calici-1.2-1.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/caliciviridae/1.2-1/vadr-models-calici-1.2-1.tar.gz
   $ tar xfz vadr-models-calici-1.2-1.tar.gz
   $ rm vadr-models-calici-1.2-1.tar.gz

And optionally, the VADR cox1 models:

   $ curl -k -L -o vadr-models-cox1-1.2-1.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/cox1/1.2-1/vadr-models-cox1-1.2-1.tar.gz
   $ tar xfz vadr-models-cox1-1.2-1.tar.gz
   $ rm vadr-models-cox1-1.2-1.tar.gz

And optionally, the VADR SARS-CoV-2 models:

   $ curl -k -L -o vadr-models-sarscov2-1.3-2.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/sarscov2/1.3-2/vadr-models-sarscov2-1.3-2.tar.gz
   $ tar xfz vadr-models-sarscov2-1.3-2.tar.gz
   $ rm vadr-models-sarscov2-1.3-2.tar.gz
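
All four model downloads follow the same download, unpack, remove pattern, so they can be wrapped in one function. This is only a sketch: model_url and fetch_model are hypothetical names, and the URL layout is inferred from the four URLs above rather than from any documented guarantee.

```shell
# Hypothetical helpers (not part of VADR).
# model_url <ftp-dir> <short-name> <version> prints the tarball URL.
model_url() {
    echo "https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/$1/$3/vadr-models-$2-$3.tar.gz"
}

# fetch_model <ftp-dir> <short-name> <version> downloads, unpacks and
# removes the tarball, mirroring the three commands used above.
fetch_model() {
    tarball="vadr-models-$2-$3.tar.gz"
    curl -k -L -o "$tarball" "$(model_url "$1" "$2" "$3")" &&
        tar xfz "$tarball" &&
        rm "$tarball"
}
```

For example, `fetch_model flaviviridae flavi 1.2-1` and `fetch_model sarscov2 sarscov2 1.3-2` would reproduce the first and last downloads above.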

steps for building

To build Bio-Easel and infernal:

   $ cd !PATH-TO-VADR-INSTALL-DIR!
   $ cd Bio-Easel
   $ (cd src/easel; autoconf)
   $ perl Makefile.PL
   $ make
   $ make test 
   $ cd !PATH-TO-VADR-INSTALL-DIR!
   $ cd infernal
   $ autoconf
   $ sh ./configure
   $ make
   $ make check
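
Before starting the build, it can save time to confirm that the command-line tools used on this page are on your PATH. The sketch below is an assumption-laden convenience, not part of VADR: missing_tools is a hypothetical name, and the tool list is simply the set of commands that appear in the steps above (a C compiler is also needed by make, but is not checked here).

```shell
# Hypothetical helper: print the name of each listed tool that is not on PATH.
missing_tools() {
    for tool in "$@"; do
        command -v "$tool" >/dev/null 2>&1 || echo "$tool"
    done
}

# Tools used by the checkout and build steps on this page.
missing=$(missing_tools git autoconf perl make curl tar)
if [ -n "$missing" ]; then
    echo "missing tools:" $missing >&2
fi
```

If nothing is printed, all of the listed tools were found.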

steps for setting your environment variables

To set up your environment, if you use bash shell, add the following to your ~/.bashrc file:

export VADRINSTALLDIR=!PATH-TO-VADR-INSTALL-DIR!
export VADRSCRIPTSDIR="$VADRINSTALLDIR/vadr"
export VADRMODELDIR="$VADRINSTALLDIR/vadr-models"
export VADRINFERNALDIR="$VADRINSTALLDIR/infernal/src"
export VADRHMMERDIR="$VADRINSTALLDIR/infernal/hmmer/src"
export VADREASELDIR="$VADRINSTALLDIR/infernal/easel/miniapps"
export VADRBIOEASELDIR="$VADRINSTALLDIR/Bio-Easel"
export VADRSEQUIPDIR="$VADRINSTALLDIR/sequip"
export VADRBLASTDIR="$VADRINSTALLDIR/ncbi-blast/bin"
export PERL5LIB="$VADRSCRIPTSDIR":"$VADRSEQUIPDIR":"$VADRBIOEASELDIR/blib/lib":"$VADRBIOEASELDIR/blib/arch":"$PERL5LIB"
export PATH="$VADRSCRIPTSDIR":"$PATH"

or if you use C shell, add the following to your ~/.cshrc file:

setenv VADRINSTALLDIR !PATH-TO-VADR-INSTALL-DIR!
setenv VADRSCRIPTSDIR "$VADRINSTALLDIR/vadr"
setenv VADRMODELDIR "$VADRINSTALLDIR/vadr-models"
setenv VADRINFERNALDIR "$VADRINSTALLDIR/infernal/src"
setenv VADRHMMERDIR "$VADRINSTALLDIR/infernal/hmmer/src"
setenv VADREASELDIR "$VADRINSTALLDIR/infernal/easel/miniapps"
setenv VADRBIOEASELDIR "$VADRINSTALLDIR/Bio-Easel"
setenv VADRSEQUIPDIR "$VADRINSTALLDIR/sequip"
setenv VADRBLASTDIR "$VADRINSTALLDIR/ncbi-blast/bin"
setenv PERL5LIB "$VADRSCRIPTSDIR":"$VADRSEQUIPDIR":"$VADRBIOEASELDIR/blib/lib":"$VADRBIOEASELDIR/blib/arch":"$PERL5LIB"
setenv PATH "$VADRSCRIPTSDIR":"$PATH"

Replace !PATH-TO-VADR-INSTALL-DIR! above with the actual path.

Then, source one of those files with

source ~/.bashrc

Or:

source ~/.cshrc
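
After sourcing the file, a quick check that each variable is set and points to an existing directory can catch typos in the path. The following is a minimal sketch for sh-compatible shells (not csh); the function name check_vadr_env is hypothetical and the variable list is copied from the export lines above.

```shell
# Hypothetical helper: report VADR variables that are unset or that do not
# point to an existing directory. Returns non-zero if any problem is found.
check_vadr_env() {
    status=0
    for name in VADRINSTALLDIR VADRSCRIPTSDIR VADRMODELDIR VADRINFERNALDIR \
                VADRHMMERDIR VADREASELDIR VADRBIOEASELDIR VADRSEQUIPDIR \
                VADRBLASTDIR; do
        # Look up the variable whose name is stored in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            status=1
        elif [ ! -d "$value" ]; then
            echo "$name=$value is not a directory"
            status=1
        fi
    done
    return $status
}
```

Running `check_vadr_env` prints one line per problem and prints nothing when every variable resolves to a directory.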

For information about our git workflow, read on.


git workflow

VADR uses the popular git workflow that's often just called "git flow". Go read the 2010 blog post by Vincent Driessen that describes it. We use it with one difference: we don't mind having feature branches on origin.

In what follows, first we'll give concise-ish examples of the flow for normal development, making a release, and making a "hotfix". A summary of the principles and rationale follows the examples.

Normal development

Generally, you will make any changes to our code on a feature branch off of develop. So first you create your branch:

   $ git checkout -b myfeature develop            

Now you work, for however long it takes. You can make commits on your myfeature branch locally, and/or you can push your branch up to the origin and commit there too, as you see fit.

When you're done, and you've tested your new feature, you merge it to develop (using --no-ff, which makes sure a clean new commit object gets created), and delete your feature branch:

   $ git checkout develop                     
   $ git merge --no-ff -m "Merges myfeature branch into develop" myfeature
   $ git branch -d myfeature
   $ git push origin --delete myfeature
   $ git push origin develop                  
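
To see what --no-ff does, the sketch below replays the feature-branch flow in a throwaway repository with empty commits and then counts the parents of the resulting commit on develop. Everything in it (the temporary directory, the demo identity, the commit messages for the empty commits) is made up for illustration. It never touches origin, so the two push commands are omitted.

```shell
# Throwaway demo: a --no-ff merge records a merge commit even though a
# fast-forward would have been possible.
demo=$(mktemp -d)
cd "$demo"
# Demo identity so the commits work without any global git configuration.
export GIT_AUTHOR_NAME=demo GIT_AUTHOR_EMAIL=demo@example.com
export GIT_COMMITTER_NAME=demo GIT_COMMITTER_EMAIL=demo@example.com

git init -q
git symbolic-ref HEAD refs/heads/develop      # name the initial branch develop
git commit -q --allow-empty -m "Initial commit on develop"

git checkout -q -b myfeature develop
git commit -q --allow-empty -m "Work on myfeature"

git checkout -q develop
git merge -q --no-ff -m "Merges myfeature branch into develop" myfeature
git branch -q -d myfeature

# rev-list prints the commit followed by its parents; subtract the commit itself.
parents=$(( $(git rev-list --parents -n 1 HEAD | wc -w) - 1 ))
echo "parents of HEAD: $parents"
```

The merge commit has two parents (the previous tip of develop and the tip of myfeature), which is what keeps the feature visible as a unit in the history of develop.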

Small features: single commits can be made to develop

Alternatively, if you're sure your change is going to be a single commit, you can work directly on the develop branch.

   $ git checkout develop                  
     # make your changes
   $ git commit
   $ git push origin develop               

Big features: keeping up to date with develop

If your work on a feature is taking a long time (days, weeks...), and if the develop trunk is accumulating changes you want, you might want to periodically merge them in:

   $ git checkout myfeature
   $ git merge --no-ff -m "Merges develop branch into myfeature branch" develop           

Making a release

To make a release, you're going to make a release branch of the code, and of the sequip repo if you made changes there as well. You assign appropriate version numbers to each, test and stabilize. When everything is ready, you merge to master and tag that commit with the version number; then you also merge back to develop, and delete the release branch.

For example, here's the git flow for a VADR release, depending on sequip and Bio-Easel. Suppose vadr is currently at 0.16, sequip is at 0.02 and Bio-Easel is at 0.05. Suppose we decide this release will be vadr 0.2, and it does not depend on any new features in sequip or Bio-Easel, so we can use the last stable sequip and Bio-Easel releases as they are (these are the heads of the master branches of the sequip and Bio-Easel git repos, which is what you should be using unless you made changes in sequip or Bio-Easel). To proceed, we first go over to sequip and Bio-Easel and just make a tag:

   $ cd ../sequip
   $ git checkout master
   $ git tag -a -m "Tags sequip 0.02 for vadr-0.2 release" vadr-0.2
   $ git push origin vadr-0.2
   $ cd ../Bio-Easel
   $ git checkout master
   $ git tag -a -m "Tags Bio-Easel 0.05 for vadr-0.2 release" vadr-0.2
   $ git push origin vadr-0.2

then go over and make a new release from vadr's develop branch:

   $ cd ../vadr
   $ git checkout develop # only necessary if you're not already on develop
   $ git checkout -b release-0.2 develop
     # update version numbers in: 3 .pl scripts, miniscripts/*pl (1), parse_blast.pl, vadr.pm and vadr_seed.pm, README.md, INSTALL, RELEASE-NOTES, vadr-install.sh

Then commit the version bump, and update documentation and install script:

   $ git commit -a -m "Bumps version to 0.2"
     # do and commit any other work needed to test/stabilize vadr release.
     # Then, when code is ready:
     # update RELEASE-NOTES.md: look at commit logs on github and at jira tracking ticket
     # update examples in documentation/*.md (version) 
     # update vadr-install.sh (versions of all software *and* models)
     # test vadr-install.sh (but change to checkout vadr git repo instead of archived release)
     # run anecdotal and large tests on installed version
   $ git commit -a -m "Updates install file and documentation for 0.2 release"

When you're finished, merge the vadr release branch as follows:

   $ git checkout master
   $ git merge --no-ff -m "Merges release-0.2 branch into master" release-0.2
   $ git push
   $ git tag -a -m "Tags vadr 0.2 release" vadr-0.2
   $ git push origin vadr-0.2
      # Now merge release branch back to develop...
   $ git checkout develop
   $ git merge --no-ff -m "Merges release-0.2 branch into develop" release-0.2
   $ git push
   $ git branch -d release-0.2
      # and if you had pushed release-0.2 to origin:
   $ git push origin --delete release-0.2

Release process with new features in dependencies

Alternatively, suppose our new release depends on some new features in Bio-Easel (but not sequip). In this case, we first create a tag in sequip just like we did in the example above, but then we need to make a new Bio-Easel 0.06 release:

   $ cd ../Bio-Easel
   $ git checkout develop # only necessary if you're not already on develop
   $ git checkout -b release-0.06 develop
     # change version numbers to 0.06; also dates, copyrights
     # list of files with versions and dates and copyrights is in Bio-Easel/dev-README
   $ git commit -a -m "Version number bumped to 0.06"
     # do and commit any other work needed to test/stabilize Bio-Easel release

then go over and make the vadr release branch (but don't actually release) as explained above in the example that bundled stable sequip 0.02 and Bio-Easel 0.05.

When the vadr release is ready, we need to merge the Bio-Easel release branch:

   $ cd ../Bio-Easel
   $ git checkout master
   $ git merge --no-ff -m "Merges release-0.06 branch into master" release-0.06
   $ git push
   $ git tag -a -m "Tags Bio-Easel 0.06 release" Bio-Easel-0.06
   $ git push origin Bio-Easel-0.06
   $ git tag -a -m "Tags Bio-Easel 0.06 for vadr-0.2 release" vadr-0.2  # This records that vadr-0.2 depends on Bio-Easel 0.06
   $ git push origin vadr-0.2
       # Now merge release branch back to develop...
   $ git checkout develop
   $ git merge --no-ff -m "Merges release-0.06 branch into develop" release-0.06
   $ git push
   $ git branch -d release-0.06
      # and if you had pushed release-0.06 to origin:
   $ git push origin --delete release-0.06

and finally, update documentation and install script and merge the vadr release branch to master (see 'update documentation and install script' above).

Dependencies always have a tag for their own release (Bio-Easel 0.05), and may have additional tags for packages that depend on them (vadr 0.2 bundles sequip 0.02? Then there's a vadr-0.2 tag pointing to that sequip commit object).


Making a hotfix

If you need to fix a critical bug and make a new release immediately, you create a hotfix release with an updated version number, and the hotfix release is named accordingly: for example, if we screwed up vadr 0.03, hotfix-0.04 is the updated 0.04 release.

A hotfix branch comes off master, but otherwise is much like a release branch.

   $ cd vadr
   $ git checkout -b hotfix-0.04 master                 
     # update version numbers in: 3 .pl scripts, parse_blast.pl, miniscripts/*pl (1), vadr.pm and vadr_seed.pm, README.md, INSTALL, RELEASE-NOTES, vadr-install.sh
   $ git commit -a -m "Version number bumped to 0.04"

Now you fix the bug(s), in one or more commits. When you're done, the finishing procedure is just like a release:

     # update examples in documentation/*.md (version) 
     # update vadr-install.sh (versions of all software *and* models)
     # test vadr-install.sh (but change to checkout vadr git repo instead of archived release)
    $ git checkout master              
    $ git merge --no-ff -m "Merges hotfix-0.04 branch into master" hotfix-0.04
    $ git push
    $ git tag -a -m "Tags vadr 0.04 release" vadr-0.04
    $ git push origin vadr-0.04
    $ git checkout develop              
    $ git merge --no-ff -m "Merges hotfix-0.04 branch into develop" hotfix-0.04
    $ git push
    $ git branch -d hotfix-0.04
      # and if you had pushed hotfix-0.04 to origin:
    $ git push origin --delete hotfix-0.04

And make a tag in all the dependencies:

   $ cd ../sequip
   $ git checkout master
   $ git tag -a -m "Tags sequip 0.02 for vadr-0.04 release" vadr-0.04
   $ git push origin vadr-0.04
   $ cd ../Bio-Easel
   $ git checkout master
   $ git tag -a -m "Tags Bio-Easel 0.05 for vadr-0.04 release" vadr-0.04
   $ git push origin vadr-0.04

And finally, test the vadr-install.sh script to make sure it works.


Summary of principles

There are two long-lived vadr branches: origin/master, and origin/develop. All other branches have limited lifetimes.

master is stable. Every commit object on master is a tagged release, and vice versa.

develop is for ongoing development destined to be in the next release. develop should be in a close-to-release state, because another package (e.g. vadr) may need to create a release of one of its dependencies (e.g. sequip) at short notice. Therefore, commit objects on develop are either small features in a single commit, or a merge of a finished feature branch.

We make a feature branch off develop for any nontrivial new work -- anything that you aren't sure will be a single commit on develop. A feature branch:

  • comes from develop
  • is named anything informative (except master, develop, hotfix-* or release-*)
  • is merged back to develop when you're done
  • is deleted once merged

We make a release branch off develop when we're making a release. A release branch:

  • comes from develop
  • is named release-<version>, such as release-1.2
  • first commit on the release branch consists of bumping version/date/copyright
  • is merged to master when you're done, and that new commit gets tagged as a release
  • is then merged back to develop too
  • is deleted once merged

We make a hotfix branch off master for a critical immediate fix to the current release. A hotfix branch:

  • comes from master
  • is named hotfix-<version>, such as hotfix-1.2.1
  • first commit on the hotfix branch consists of bumping version/date/copyright
  • is merged back to master when you're done, and that new commit object gets tagged as a release
  • is then merged back to develop too
  • is deleted once merged

The text above was borrowed and modified with permission from Sean Eddy, from the HMMER github repository.
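
As a small illustration of the branch naming rules summarized above (this sketch is not part of the borrowed text, and branch_kind is a hypothetical name), a shell case statement can classify a branch name by its prefix:

```shell
# Hypothetical helper: classify a branch name according to the naming rules
# in the summary (master/develop, release-*, hotfix-*, anything else).
branch_kind() {
    case "$1" in
        master|develop) echo "long-lived" ;;
        release-*)      echo "release" ;;
        hotfix-*)       echo "hotfix" ;;
        *)              echo "feature" ;;
    esac
}

branch_kind release-1.2     # prints: release
branch_kind hotfix-1.2.1    # prints: hotfix
branch_kind myfeature       # prints: feature
```

A check like this could be used in a local script to warn before a feature branch is accidentally given a reserved name.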


Questions, comments or feature requests? Send a mail to [email protected].
