Development notes and instructions
The file vadr-install.sh is an executable shell script that will install VADR and its dependencies and then output important instructions for updating your environment variables so you can run the vadr scripts.
More detailed instructions can be found in VADR's install.md
The instructions below are relevant only to developers who wish to develop VADR or modify it in some way. For users interested only in running VADR, see the Installation instructions for users above. The developer installation is broken down into three steps:
VADR depends on several other repositories at github:
Further, Bio-Easel, infernal and hmmer also depend on easel.
vadr also requires NCBI BLAST version 2.12.0+.
To clone the vadr repository for the first time and set it up for development, follow the steps below.
First, move into the directory in which you want to keep all the code. Later, you will set the $VADRINSTALLDIR environment variable to this directory. That directory is referred to as !PATH-TO-VADR-INSTALL-DIR! below:
$ cd !PATH-TO-VADR-INSTALL-DIR!
$ git clone https://github.com/ncbi/vadr.git
$ git clone https://github.com/nawrockie/sequip.git
$ git clone https://github.com/nawrockie/Bio-Easel.git
$ (cd Bio-Easel; mkdir src; cd src; git clone https://github.com/EddyRivasLab/easel.git easel)
$ git clone https://github.com/EddyRivasLab/infernal.git infernal
$ cd infernal
$ git clone https://github.com/EddyRivasLab/hmmer
$ git clone https://github.com/EddyRivasLab/easel
This will set you up on the master branches for all packages.
To do development, you'll now want to check out the develop branches in all of the git repos you just cloned, using the commands listed below. Alternatively, you can skip this step to build the stable master branches.
$ cd !PATH-TO-VADR-INSTALL-DIR!
$ (cd vadr; git checkout develop;)
$ (cd sequip; git checkout develop;)
$ (cd Bio-Easel; git checkout develop;)
$ (cd Bio-Easel/src/easel; git checkout develop;)
$ (cd infernal; git checkout develop;)
$ (cd infernal/easel; git checkout develop;)
$ (cd infernal/hmmer; git checkout develop;)
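The per-repository checkouts above can also be written as a loop. The following is a minimal sketch, not part of the VADR distribution: the function name checkout_develop_all is hypothetical, and it assumes the directory layout produced by the clone commands above. Repositories that have not been cloned are skipped and counted rather than treated as errors.

```shell
#!/bin/sh
# Hypothetical helper: check out the develop branch in every repository
# cloned above. The directory layout is the one created by the clone steps.
checkout_develop_all() {
    base=$1
    skipped=0
    for repo in vadr sequip Bio-Easel Bio-Easel/src/easel \
                infernal infernal/easel infernal/hmmer; do
        if [ -e "$base/$repo/.git" ]; then
            # Run in a subshell so the caller's working directory is unchanged.
            (cd "$base/$repo" && git checkout develop) || return 1
        else
            echo "skipping $repo: not cloned under $base" >&2
            skipped=$((skipped + 1))
        fi
    done
    echo "skipped=$skipped"
}
```

After defining the function, it could be run as checkout_develop_all !PATH-TO-VADR-INSTALL-DIR! with the actual path substituted.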
You'll also want to download the BLAST-2.12.0+ distribution with pre-compiled binaries either for Linux:
$ curl -k -L -o blast.tar.gz https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0/ncbi-blast-2.12.0+-x64-linux.tar.gz
or for Mac/OSX:
$ curl -k -L -o blast.tar.gz https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0/ncbi-blast-2.12.0+-x64-macosx.tar.gz
And then unpack it:
$ tar xfz blast.tar.gz
$ rm blast.tar.gz
$ mv ncbi-blast-2.12.0+ ncbi-blast
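If you script this step, the choice between the two BLAST archives can be made from the output of uname -s. This is a sketch only: the function name blast_url is hypothetical, and the two URLs are copied from the curl commands above.

```shell
#!/bin/sh
# Hypothetical helper: print the BLAST 2.12.0+ download URL for a platform
# name as reported by `uname -s` (Linux, or Darwin for Mac/OSX).
blast_url() {
    base="https://ftp.ncbi.nlm.nih.gov/blast/executables/blast+/2.12.0"
    case "$1" in
        Linux)  echo "$base/ncbi-blast-2.12.0+-x64-linux.tar.gz" ;;
        Darwin) echo "$base/ncbi-blast-2.12.0+-x64-macosx.tar.gz" ;;
        *)      echo "unsupported platform: $1" >&2; return 1 ;;
    esac
}

# Print the URL for this machine; pass it to the curl command shown above.
blast_url "$(uname -s)" || echo "no pre-compiled BLAST URL known for this platform"
```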
And you'll want to download the VADR virus libraries:
$ curl -k -L -o vadr-models-flavi-1.2-1.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/flaviviridae/1.2-1/vadr-models-flavi-1.2-1.tar.gz
$ tar xfz vadr-models-flavi-1.2-1.tar.gz
$ rm vadr-models-flavi-1.2-1.tar.gz
$ curl -k -L -o vadr-models-calici-1.2-1.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/caliciviridae/1.2-1/vadr-models-calici-1.2-1.tar.gz
$ tar xfz vadr-models-calici-1.2-1.tar.gz
$ rm vadr-models-calici-1.2-1.tar.gz
And optionally, the VADR cox1 models:
$ curl -k -L -o vadr-models-cox1-1.2-1.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/cox1/1.2-1/vadr-models-cox1-1.2-1.tar.gz
$ tar xfz vadr-models-cox1-1.2-1.tar.gz
$ rm vadr-models-cox1-1.2-1.tar.gz
And optionally, the VADR SARS-CoV-2 models:
$ curl -k -L -o vadr-models-sarscov2-1.3-2.tar.gz https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/sarscov2/1.3-2/vadr-models-sarscov2-1.3-2.tar.gz
$ tar xfz vadr-models-sarscov2-1.3-2.tar.gz
$ rm vadr-models-sarscov2-1.3-2.tar.gz
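The four model downloads above follow one URL pattern, so they can be generated from a short table. The sketch below only prints the commands so that they can be reviewed before anything is downloaded. The function name vadr_model_url is hypothetical, and the directory names, short names, and versions are copied from the commands above.

```shell
#!/bin/sh
# Hypothetical helper: build a model tarball URL from the directory name,
# the short name used in the file name, and the version, following the
# pattern of the four URLs above.
vadr_model_url() {
    dir=$1; short=$2; version=$3
    echo "https://ftp.ncbi.nlm.nih.gov/pub/nawrocki/vadr-models/$dir/$version/vadr-models-$short-$version.tar.gz"
}

# One "directory short-name version" triple per line.
models="flaviviridae flavi 1.2-1
caliciviridae calici 1.2-1
cox1 cox1 1.2-1
sarscov2 sarscov2 1.3-2"

# Print the commands instead of running them.
echo "$models" | while read -r dir short version; do
    url=$(vadr_model_url "$dir" "$short" "$version")
    file=${url##*/}   # strip everything up to the last slash
    echo "curl -k -L -o $file $url && tar xfz $file && rm $file"
done
```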
To build Bio-Easel and infernal:
$ cd !PATH-TO-VADR-INSTALL-DIR!
$ cd Bio-Easel
$ (cd src/easel; autoconf)
$ perl Makefile.PL
$ make
$ make test
$ cd !PATH-TO-VADR-INSTALL-DIR!
$ cd infernal
$ autoconf
$ sh ./configure;
$ make
$ make check
To set up your environment, if you use bash shell, add the following to your ~/.bashrc file:
export VADRINSTALLDIR=!PATH-TO-VADR-INSTALL-DIR!
export VADRSCRIPTSDIR="$VADRINSTALLDIR/vadr"
export VADRMODELDIR="$VADRINSTALLDIR/vadr-models"
export VADRINFERNALDIR="$VADRINSTALLDIR/infernal/src"
export VADRHMMERDIR="$VADRINSTALLDIR/infernal/hmmer/src"
export VADREASELDIR="$VADRINSTALLDIR/infernal/easel/miniapps"
export VADRBIOEASELDIR="$VADRINSTALLDIR/Bio-Easel"
export VADRSEQUIPDIR="$VADRINSTALLDIR/sequip"
export VADRBLASTDIR="$VADRINSTALLDIR/ncbi-blast/bin"
export PERL5LIB="$VADRSCRIPTSDIR":"$VADRSEQUIPDIR":"$VADRBIOEASELDIR/blib/lib":"$VADRBIOEASELDIR/blib/arch":"$PERL5LIB"
export PATH="$VADRSCRIPTSDIR":"$PATH"
or if you use C shell, add the following to your ~/.cshrc file:
setenv VADRINSTALLDIR !PATH-TO-VADR-INSTALL-DIR!
setenv VADRSCRIPTSDIR "$VADRINSTALLDIR/vadr"
setenv VADRMODELDIR "$VADRINSTALLDIR/vadr-models"
setenv VADRINFERNALDIR "$VADRINSTALLDIR/infernal/src"
setenv VADRHMMERDIR "$VADRINSTALLDIR/infernal/hmmer/src"
setenv VADREASELDIR "$VADRINSTALLDIR/infernal/easel/miniapps"
setenv VADRBIOEASELDIR "$VADRINSTALLDIR/Bio-Easel"
setenv VADRSEQUIPDIR "$VADRINSTALLDIR/sequip"
setenv VADRBLASTDIR "$VADRINSTALLDIR/ncbi-blast/bin"
setenv PERL5LIB "$VADRSCRIPTSDIR":"$VADRSEQUIPDIR":"$VADRBIOEASELDIR/blib/lib":"$VADRBIOEASELDIR/blib/arch":"$PERL5LIB"
setenv PATH "$VADRSCRIPTSDIR":"$PATH"
With the actual path replacing !PATH-TO-VADR-INSTALL-DIR! above.
Then, source one of those files with
source ~/.bashrc
Or:
source ~/.cshrc
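After sourcing the file, it can be worth confirming that each directory variable points at something that exists. The following sketch for sh/bash is not part of VADR: the function name check_vadr_env is hypothetical, and it checks only the directory variables listed above, not PERL5LIB or PATH.

```shell
#!/bin/sh
# Hypothetical helper: report any VADR directory variable that is unset or
# does not point at an existing directory. Returns nonzero if any is missing.
check_vadr_env() {
    missing=0
    for name in VADRINSTALLDIR VADRSCRIPTSDIR VADRMODELDIR VADRINFERNALDIR \
                VADRHMMERDIR VADREASELDIR VADRBIOEASELDIR VADRSEQUIPDIR \
                VADRBLASTDIR; do
        # Indirect lookup: copy the value of the variable named in $name.
        eval "value=\${$name:-}"
        if [ -z "$value" ]; then
            echo "$name is not set"
            missing=$((missing + 1))
        elif [ ! -d "$value" ]; then
            echo "$name points to a missing directory: $value"
            missing=$((missing + 1))
        fi
    done
    [ "$missing" -eq 0 ]
}
```

Running check_vadr_env in a new shell prints one line per problem and prints nothing when every directory is in place.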
For information about our git workflow, read on.
VADR uses the popular git workflow that's often just called "git flow". Go read the 2010 blog post by Vincent Driessen that describes it. But we use it with the difference that we don't mind having feature branches on origin.
In what follows, first we'll give concise-ish examples of the flow for normal development, making a release, and making a "hotfix". A summary of the principles and rationale follows the examples.
Generally, you will make any changes to our code on a feature branch off of develop. So first you create your branch:
$ git checkout -b myfeature develop
Now you work, for however long it takes. You can make commits on your myfeature branch locally, and/or you can push your branch up to the origin and commit there too, as you see fit.
When you're done, and you've tested your new feature, you merge it to develop (using --no-ff, which makes sure a clean new commit object gets created), and delete your feature branch:
$ git checkout develop
$ git merge --no-ff -m "Merges myfeature branch into develop" myfeature
$ git branch -d myfeature
$ git push origin --delete myfeature
$ git push origin develop
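To see what --no-ff does, the feature-branch flow above can be replayed in a throwaway repository. This is a sketch under stated assumptions: the scratch directory, file name, and demo identity are made up for illustration, and the two git push steps are omitted because the scratch repository has no origin. A merge commit has two parents, whereas a fast-forward would have left HEAD with one.

```shell
#!/bin/sh
# Sketch: replay the feature-branch flow in a scratch repository and count
# the parents of the resulting commit on develop.
demo=$(mktemp -d)
cd "$demo" || exit 1
git init -q .
git symbolic-ref HEAD refs/heads/develop   # name the initial branch develop
git config user.name "Demo User"           # identity for this scratch repo only
git config user.email "demo@example.com"

echo one > file.txt
git add file.txt
git commit -q -m "Initial commit on develop"

git checkout -q -b myfeature develop
echo two >> file.txt
git commit -q -a -m "Adds a line on myfeature"

git checkout -q develop
git merge -q --no-ff -m "Merges myfeature branch into develop" myfeature
git branch -q -d myfeature

# %P prints the parent hashes of the commit, separated by spaces.
parents=$(git log -1 --format=%P | wc -w)
parents=$((parents + 0))   # normalize any padding added by wc
echo "parents of HEAD: $parents"
```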
Alternatively, if you're sure your change is going to be a single commit, you can work directly on the develop branch.
$ git checkout develop
# make your changes
$ git commit
$ git push origin develop
If your work on a feature is taking a long time (days, weeks...), and if the develop trunk is accumulating changes you want, you might want to periodically merge them in:
$ git checkout myfeature
$ git merge --no-ff -m "Merges develop branch into myfeature branch" develop
To make a release, you're going to make a release branch of the code, and of the sequip repo if you made changes there as well. You assign appropriate version numbers to each, test and stabilize. When everything is ready, you merge to master and tag that commit with the version number; then you also merge back to develop, and delete the release branch.
For example, here's the git flow for a VADR release, depending on sequip and Bio-Easel. Suppose vadr is currently at 0.16, sequip is currently at 0.02, and Bio-Easel is currently at 0.05. Suppose we decide this release will be vadr 0.2, and it does not depend on any new features in sequip or Bio-Easel, so we can use the last stable sequip and Bio-Easel releases as they are (these will be the heads of the master branches of the sequip and Bio-Easel git repos, which is what you should be using unless you made changes in sequip or Bio-Easel). To proceed, we first go over to sequip and Bio-Easel and just make a tag:
$ cd ../sequip
$ git checkout master
$ git tag -a -m "Tags sequip 0.02 for vadr-0.2 release" vadr-0.2
$ git push origin vadr-0.2
$ cd ../Bio-Easel
$ git checkout master
$ git tag -a -m "Tags Bio-Easel 0.05 for vadr-0.2 release" vadr-0.2
$ git push origin vadr-0.2
then go over and make a new release from vadr's develop
branch:
$ cd ../vadr
$ git checkout develop # only necessary if you're not already on develop
$ git checkout -b release-0.2 develop
# bump version:
# 3 .pl, miniscripts/*pl (1), parse_blast.pl, vadr.pm and vadr_seed.pm, README.md, INSTALL, RELEASE-NOTES, vadr-install.sh
And then update documentation and install script:
$ git commit -a -m "Bumps version to 0.2"
# do and commit any other work needed to test/stabilize vadr release.
# Then, when code is ready:
# update RELEASE-NOTES.md: look at commit logs on github and at jira tracking ticket
# update examples in documentation/*.md (version)
# update vadr-install.sh (versions of all software *and* models)
# test vadr-install.sh (but change to checkout vadr git repo instead of archived release)
# run anecdotal and large tests on installed version
$ git commit -a -m "Updates install file and documentation for 0.2 release"
When you're finished, merge the vadr release branch as follows:
$ git checkout master
$ git merge --no-ff -m "Merges release-0.2 branch into master" release-0.2
$ git push
$ git tag -a -m "Tags vadr 0.2 release" vadr-0.2
$ git push origin vadr-0.2
# Now merge release branch back to develop...
$ git checkout develop
$ git merge --no-ff -m "Merges release-0.2 branch into develop" release-0.2
$ git push
$ git branch -d release-0.2
# and if you had pushed release-0.2 to origin:
$ git push origin --delete release-0.2
# update the vadr-extra/MODEL-VERSIONS.txt file here: https://github.com/nawrockie/vadr-extra
$ cd PATH-TO/vadr-extra
# update vadr and model versions to latest
$ git commit -a -m "Bumps versions for vadr-0.2 release"
$ git push
$ git tag -a -m "Tags vadr-extra for vadr 0.2 release" vadr-0.2
$ git push origin vadr-0.2
Alternatively, what if our new release depends on some new features in Bio-Easel (but not sequip)? In this case, we first create a tag in sequip just like we did in the example above, but then we need to make a new Bio-Easel 0.06 release:
$ cd ../Bio-Easel
$ git checkout develop # only necessary if you're not already on develop
$ git checkout -b release-0.06 develop
# change version numbers to 0.06; also dates, copyrights
# list of files with versions and dates and copyrights is in Bio-Easel/dev-README
$ git commit -a -m "Version number bumped to 0.06"
# do and commit any other work needed to test/stabilize Bio-Easel release
then go over and make the vadr release branch (but don't actually release) as explained above in the example that bundled stable sequip 0.02 and Bio-Easel 0.05.
When the vadr release is ready, we need to merge the Bio-Easel release branch:
$ cd ../Bio-Easel
$ git checkout master
$ git merge --no-ff -m "Merges release-0.06 branch into master" release-0.06
$ git push
$ git tag -a -m "Tags Bio-Easel 0.06 release" Bio-Easel-0.06
$ git push origin Bio-Easel-0.06
$ git tag -a -m "Tags Bio-Easel 0.06 for vadr-0.2 release" vadr-0.2 # This records that vadr-0.2 depends on Bio-Easel 0.06
$ git push origin vadr-0.2
# Now merge release branch back to develop...
$ git checkout develop
$ git merge --no-ff -m "Merges release-0.06 branch into develop" release-0.06
$ git push
$ git branch -d release-0.06
# and if you had pushed release-0.06 to origin:
$ git push origin --delete release-0.06
and finally, update documentation and install script and merge vadr release branch to master (see 'update documentation and install script' above).
Dependencies always have a tag for their own release (Bio-Easel 0.05), and may have additional tags for packages that depend on them (for example, if vadr 0.2 bundles sequip 0.02, then there's a vadr-0.2 tag pointing to that sequip commit object).
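The two kinds of tag on a dependency can be inspected with git tag --points-at. The sketch below uses a scratch repository standing in for Bio-Easel; the VERSION file and the demo identity are made up for illustration, and the tag names and messages follow the Bio-Easel 0.06 example above.

```shell
#!/bin/sh
# Sketch in a scratch repository standing in for Bio-Easel: one commit gets
# both its own release tag and a tag for the vadr release that bundles it.
scratch=$(mktemp -d)
cd "$scratch" || exit 1
git init -q .
git config user.name "Demo User"        # identity for this scratch repo only
git config user.email "demo@example.com"

echo "0.06" > VERSION                   # placeholder file, not a real Bio-Easel file
git add VERSION
git commit -q -m "Version number bumped to 0.06"

git tag -a -m "Tags Bio-Easel 0.06 release" Bio-Easel-0.06
git tag -a -m "Tags Bio-Easel 0.06 for vadr-0.2 release" vadr-0.2

# List every tag that points at the current commit.
tags=$(git tag --points-at HEAD)
echo "$tags"
```

Both tags resolve to the same commit object, which is how the vadr-0.2 tag records the Bio-Easel version that the vadr release was built against.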
If you need to fix a critical bug and make a new release immediately, you create a hotfix release with an updated version number, and the hotfix release is named accordingly: for example, if we screwed up vadr 0.03, hotfix-0.04 is the updated 0.04 release.
A hotfix branch comes off master, but otherwise is much like a release branch.
$ cd vadr
$ git checkout -b hotfix-0.04 master
# bump version:
# 3 .pl, parse_blast.pl, miniscripts/*pl (1), vadr.pm and vadr_seed.pm, README.md, INSTALL, RELEASE-NOTES, vadr-install.sh
$ git commit -a -m "Version number bumped to 0.04"
Now you fix the bug(s), in one or more commits. When you're done, the finishing procedure is just like a release:
# update examples in documentation/*.md (version)
# update vadr-install.sh (versions of all software *and* models)
# test vadr-install.sh (but change to checkout vadr git repo instead of archived release)
$ git checkout master
$ git merge --no-ff -m "Merges hotfix-0.04 branch into master" hotfix-0.04
$ git push
$ git tag -a -m "Tags vadr 0.04 release" vadr-0.04
$ git push origin vadr-0.04
$ git checkout develop
$ git merge --no-ff -m "Merges hotfix-0.04 branch into develop" hotfix-0.04
$ git push
$ git branch -d hotfix-0.04
# and if you had pushed hotfix-0.04 to origin:
$ git push origin --delete hotfix-0.04
# update the vadr-extra/MODEL-VERSIONS.txt file here: https://github.com/nawrockie/vadr-extra
$ cd PATH-TO/vadr-extra
# update vadr and model versions to latest
$ git commit -a -m "Bumps versions for vadr-0.04 release"
$ git push
$ git tag -a -m "Tags vadr-extra for vadr 0.04 release" vadr-0.04
$ git push origin vadr-0.04
And make a tag in all the dependencies:
$ cd ../sequip
$ git checkout master
$ git tag -a -m "Tags sequip 0.02 for vadr-0.04 release" vadr-0.04
$ git push origin vadr-0.04
$ cd ../Bio-Easel
$ git checkout master
$ git tag -a -m "Tags Bio-Easel 0.05 for vadr-0.04 release" vadr-0.04
$ git push origin vadr-0.04
And finally, test the vadr-install.sh script to make sure it works.
There are two long-lived vadr branches: origin/master, and origin/develop. All other branches have limited lifetimes.
master is stable. Every commit object on master is a tagged release, and vice versa.
develop is for ongoing development destined to be in the next release. develop should be in a close-to-release state. Another package (e.g. vadr) may need to create a release of a downstream dependency (e.g. sequip) at short notice. Therefore, commit objects on develop are either small features in a single commit, or a merge of a finished feature branch.
We make a feature branch off develop for any nontrivial new work -- anything that you aren't sure will be a single commit on develop. A feature branch:
- comes from develop
- is named anything informative (except master, develop, hotfix-* or release-*)
- is merged back to develop (and deleted) when you're done
- is deleted once merged
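The naming rule for feature branches can be checked mechanically before a branch is created. This is a minimal sketch, not part of the VADR tooling: the function name is_valid_feature_name is hypothetical, and it encodes only the reserved names listed above.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if a proposed feature branch name avoids
# the names reserved by the workflow (master, develop, hotfix-*, release-*).
is_valid_feature_name() {
    case "$1" in
        ""|master|develop|hotfix-*|release-*) return 1 ;;
        *) return 0 ;;
    esac
}

# Try a few candidate names.
for name in myfeature develop release-0.2 hotfix-0.04; do
    if is_valid_feature_name "$name"; then
        echo "$name: ok for a feature branch"
    else
        echo "$name: reserved"
    fi
done
```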
We make a release branch off develop when we're making a release. A release branch:
- comes from develop
- is named release-<version>, such as release-1.2
- first commit on the release branch consists of bumping version/date/copyright
- is merged to master when you're done, and that new commit gets tagged as a release
- is then merged back to develop too
- is deleted once merged
We make a hotfix branch off master for a critical immediate fix to the current release. A hotfix branch:
- comes from master
- is named hotfix-<version>, such as hotfix-1.2.1
- first commit on the hotfix branch consists of bumping version/date/copyright
- is merged back to master when you're done, and that new commit object gets tagged as a release
- is then merged back to develop too
- is deleted once merged