This Java project aims to develop tools for streaming assembly, as an extension of Japsa. In particular, the assembly pipeline has been migrated here from its code base in Japsa to make developing and maintaining the modules more convenient. In the future, the project might include additional analysis pipelines for Nanopore data as well.
- Linux operating system; not tested on macOS or Windows.
- Java 11+.
- For npGraph, if the pipeline requires an aligner (e.g. raw sequences in FASTA/FASTQ are provided instead of SAM/BAM), minimap2 (recommended) or bwa (later than 0.7.11) must be available.
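The requirements above can be checked programmatically before running the pipeline. The following is a minimal sketch, not part of the project: the class name `PrerequisiteCheck` and its helper methods are hypothetical, and it assumes the aligners are found through the `PATH` environment variable. It does not query bwa for its version; `isNewerThan` only shows how a version string you already have could be compared against 0.7.11.

```java
import java.io.File;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Optional;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the project.
class PrerequisiteCheck {

    /** True if the running JVM has at least the given feature version (e.g. 11). */
    static boolean javaIsAtLeast(int feature) {
        return Runtime.version().feature() >= feature;
    }

    /** Searches the directories of a PATH-style string for an executable named tool. */
    static Optional<Path> findOnPath(String tool, String pathVar) {
        if (pathVar == null || pathVar.isEmpty()) {
            return Optional.empty();
        }
        for (String dir : pathVar.split(Pattern.quote(File.pathSeparator))) {
            if (dir.isEmpty()) {
                continue;
            }
            Path candidate = Paths.get(dir, tool);
            if (Files.isRegularFile(candidate) && Files.isExecutable(candidate)) {
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    /**
     * True if version is strictly newer than baseline. Components are compared
     * numerically; a suffix after '-' and trailing letters are ignored
     * (assumption: "0.7.17-r1188" is treated as 0.7.17, "0.7.5a" as 0.7.5).
     */
    static boolean isNewerThan(String version, String baseline) {
        int[] a = parse(version);
        int[] b = parse(baseline);
        int n = Math.max(a.length, b.length);
        for (int i = 0; i < n; i++) {
            int x = i < a.length ? a[i] : 0;
            int y = i < b.length ? b[i] : 0;
            if (x != y) {
                return x > y;
            }
        }
        return false;
    }

    private static int[] parse(String version) {
        String[] parts = version.split("-", 2)[0].split("\\.");
        int[] nums = new int[parts.length];
        for (int i = 0; i < parts.length; i++) {
            nums[i] = leadingInt(parts[i]);
        }
        return nums;
    }

    /** Parses the leading digits of s; returns 0 if there are none. */
    private static int leadingInt(String s) {
        int end = 0;
        while (end < s.length() && Character.isDigit(s.charAt(end))) {
            end++;
        }
        return end == 0 ? 0 : Integer.parseInt(s.substring(0, end));
    }

    public static void main(String[] args) {
        String path = System.getenv("PATH");
        System.out.println("Java 11+: " + javaIsAtLeast(11));
        for (String tool : new String[] {"minimap2", "bwa"}) {
            String where = findOnPath(tool, path)
                    .map(Path::toString)
                    .orElse("not found on PATH");
            System.out.println(tool + ": " + where);
        }
    }
}
```

Only one of the two aligners needs to be found; an aligner is not needed at all when SAM/BAM input is provided.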
After cloning the project, the tool can be built with Maven (3.6.0) using the following commands:
git clone https://github.com/hsnguyen/assembly.git
cd assembly
mvn clean package
to generate a JAR file containing application modules (target/assembly-x.x.x-SNAPSHOT.jar).
Alternatively, you can download the JAR file directly from a release version without having to compile the source.
Users can build a Docker image directly from the Dockerfile; the image also includes bwa and minimap2 as aligners, and kalign3 and spoa for MSA-based consensus calling.
docker build -t npgraph .
The image is also available on Docker Hub:
docker pull nguyenhoangson/npgraph
Once you have the Docker image, you can run npGraph, which by default uses the GUI, minimap2 and kalign v3:
docker run --rm -it -e DISPLAY -v $HOME/.Xauthority:/home/developer/.Xauthority -v <local_data_folder>:/data --net=host npgraph
You might need to run xhost + to disable X server access control on your local machine before running npGraph with the GUI using the command above.
If you want to disable GUI mode, override the default executable behaviour, or invoke a module other than npGraph, provide the appropriate parameters, e.g.
docker run --rm -it -v <local_data_folder>:/data npgraph org.rtassembly.NPGraphCmd ...
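The two invocations above differ only in the X11-related options and the trailing module arguments. A minimal sketch of assembling either command line from Java is shown below; the class `NpGraphDockerCommand` and its `build` method are hypothetical, the options are copied from the commands above, and any module parameters are passed through unchanged rather than assumed.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of the project.
class NpGraphDockerCommand {

    /**
     * Builds the argument list for "docker run".
     *
     * @param dataFolder local folder mounted at /data in the container
     * @param gui        true adds the X11 options and host networking of the GUI command
     * @param home       home directory that contains .Xauthority (used only when gui is true)
     * @param moduleArgs optional entry class and its parameters, appended as given
     */
    static List<String> build(String dataFolder, boolean gui, String home, String... moduleArgs) {
        List<String> cmd = new ArrayList<>(Arrays.asList("docker", "run", "--rm", "-it"));
        if (gui) {
            cmd.addAll(Arrays.asList(
                    "-e", "DISPLAY",
                    "-v", home + "/.Xauthority:/home/developer/.Xauthority"));
        }
        cmd.addAll(Arrays.asList("-v", dataFolder + ":/data"));
        if (gui) {
            cmd.add("--net=host");
        }
        cmd.add("npgraph"); // image name as built with "docker build -t npgraph ."
        cmd.addAll(Arrays.asList(moduleArgs));
        return cmd;
    }

    public static void main(String[] args) {
        String home = System.getProperty("user.home");
        // "/path/to/data" is a placeholder for <local_data_folder>.
        System.out.println(String.join(" ", build("/path/to/data", true, home)));
        System.out.println(String.join(" ",
                build("/path/to/data", false, home, "org.rtassembly.NPGraphCmd")));
    }
}
```

The resulting list could be handed to `new ProcessBuilder(cmd).inheritIO().start()` from a terminal session. If the image was pulled from Docker Hub rather than built locally, the image name would be `nguyenhoangson/npgraph` instead.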
For now, two modules are under development and a third has been abandoned:
- npGraph: streaming hybrid assembly using Nanopore data
- npGraph server for ReadUntil: combined with ReadUntil (ReadFish) for selective sequencing oriented to complete assembly (dev)
- npConcatemer: using signal processing to detect concatemeric reads for viral genomes (abandoned).
As with the Japsa project, the tools included in this repo are available under a BSD-like license.