If you are following the tutorial, switch to the tutorial branch.
This is a very simple workflow, using two simple python script to illustrate, as an example, how to bring several commandline application into a Docker container, and then how to express them into CWL.
The workflow is composed by two scripts:
transcribe.py
which gets in input a dna.txt file composed by nucleotides and writes a rna.txt as outputtranslate.py
which accepts a rna.txt as input and produce a peptide.txt
We create an image, called dna2protein
, and build it using the Dockerfile
.
Note: mmattioni
is the username on the Seven Bridges Platform, dna2protein
is the
development project we have created there. Change them to yours otherwise won't work.
docker build -t dna2protein .
docker run -it dna2protein bash
Transcribe example:
$ docker run dna2protein transcribe.py -h
usage: transcribe.py [-h] [-v] [--version] dna
Translates a DNA input test into a RNA
positional arguments:
dna DNA input file to transcribe
optional arguments:
-h, --help show this help message and exit
-v, --verbose
--version show program's version number and exit
Translate example:
$ docker run dna2protein translate.py -h
usage: translate.py [-h] [--verbose] [--version] mRNA
Transcribe the provided mRNA into a peptide.
positional arguments:
mRNA mRNA to transcribe
optional arguments:
-h, --help show this help message and exit
--verbose, -v Run in verbose mode (default: False)
--version show program's version number and exit
Now our scripts are dockerized and we can use it directly from the image.
First we need to tag the image. Make sure you have create a dev project called dna2protein
, otherwise you will
not be able to push the image.
docker tag dna2protein images.sbgenomics.com/mmattioni/dna2protein
Now we login and push it to the registry. Note: you have to provide you API TOKEN, not the password you use to log on the Seven Bridges Platform.
docker login --username <SBG_USERNAME> --email <SBG_EMAIL> --password <SBG_API_TOKEN> images.sbgenomics.com
docker push images.sbgenomics.com/mmattioni/dna2protein
The CWL associated for the whole workflow is in the root of the directory and it's called
dna2protein.cwl.json
you can run it locally with rabix (grab the latest release from https://github.com/rabix/bunny/releases)
rabix --verbose dna2protein.cwl.json -- --dna dna.txt