Merge pull request #25 from Illumina/GT-804
GT-804 v2.2b release
traxexx authored Jun 15, 2019
2 parents 6a1863b + f912119 commit 9648877
Showing 5 changed files with 139 additions and 144 deletions.
116 changes: 2 additions & 114 deletions README.md
@@ -2,13 +2,7 @@

<!-- vscode-markdown-toc -->
* [Introduction](#Introduction)
* [System Requirements](#SystemRequirements)
* [Hardware](#Hardware)
* [Operating systems](#Operatingsystems)
* [Third-party libraries](#ThirdPartyLibraries)
* [Installation](#Installation)
* [Native build](#NativeBuild)
* [From Docker image](#FromDockerImage)
* [Run Paragraph from VCF](#RunParagraphFromVCF)
* [Example](#Example)
* [Input requirements](#InputRequirements)
@@ -38,115 +32,9 @@ Please reference Paragraph using:

Genotyping calls in this paper can be found at [paper-data/download-instructions.txt](paper-data/download-instructions.txt)

## <a name='SystemRequirements'></a>System Requirements

### <a name='Hardware'></a>Hardware

A standard workstation with at least 8GB of RAM should be sufficient for compilation and testing of the program.

### <a name='Operatingsystems'></a>Operating systems

Paragraph is supported on the following systems:

- Ubuntu 16.04 and CentOS 5-7,
- macOS 10.11+,

Python 3.4+ is required.

We recommend using g++ (6.0+), or a recent version of Clang.

We use the C++11 standard; any POSIX-compliant compiler supporting this standard
should be usable.

### <a name='ThirdPartyLibraries'></a>Third-party libraries

Please check [requirements.txt](requirements) for required python modules.

[Boost libraries](http://www.boost.org) version >= 1.5 is required.
- We prefer to statically link Boost libraries to Paragraph executables:

```bash
cd ~
wget http://downloads.sourceforge.net/project/boost/boost/1.65.0/boost_1_65_0.tar.bz2
tar xf boost_1_65_0.tar.bz2
cd boost_1_65_0
./bootstrap.sh
./b2 --prefix=$HOME/boost_1_65_0_install link=static install
```

- To point CMake to your version of Boost, use the `BOOST_ROOT` environment variable:

```bash
export BOOST_ROOT=$HOME/boost_1_65_0_install
```

We have included copies of other dependent libraries in external/. They are:
- Google Test and Google Mock (v1.8.0)
- Htslib (v1.9)
- Spdlog

## <a name='Installation'></a>Installation

### <a name='NativeBuild'></a>Native build
First, check out the repository like so:

```bash
git clone https://github.com/Illumina/paragraph.git paragraph-tools
cd paragraph-tools
```

Then create a new directory for the program and compile it there:

```bash
# Create a separate build folder.
cd ..
mkdir paragraph-tools-build
cd paragraph-tools-build

# Configure
# optional:
# export BOOST_ROOT=<path-to-boost-installation>
cmake ../paragraph-tools
# if this doesn't work, run this instead:
# cmake ../paragraph-tools -DCMAKE_CXX_COMPILER=`which g++` -DCMAKE_C_COMPILER=`which gcc` -DBOOST_ROOT=$BOOST_ROOT

# Make, use -j <n> to use n parallel jobs to build, e.g. make -j4
make
```

### <a name='FromDockerImage'></a>From Docker Image
We also provide a [Dockerfile](Dockerfile). To build a Docker image, run the following command inside the source
checkout folder:

```bash
docker build .
```

Once the image is built you can find out its ID like this:

```bash
docker images
```
```
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
<none> <none> 54c7d4015330 16 seconds ago 1.76 GB
```

Check the below section for how to run Paragraph, and execute this before running:

```bash
sudo docker run -v `pwd`:/data 54c7d4015330
```

The current directory can be accessed as `/data` inside the Docker container.

The default entry point is `multigrmpy.py`.

To override the default entrypoint and get an interactive shell, run:

```bash
sudo docker run --entrypoint /bin/bash -it 54c7d4015330
```
Please check [doc/Installation.md](doc/Installation.md) for system requirements and installation instructions.

## <a name='RunParagraphFromVCF'></a>Run Paragraph from VCF
### <a name='Example'></a>Example
@@ -161,7 +49,7 @@ python3 bin/multigrmpy.py -i share/test-data/round-trip-genotyping/candidates.vc

This runs a simple genotyping example for two test samples.
* **candidates.vcf**: this specifies candidate SV events in a vcf format.
* **samples.txt**: Manifest that specifies some test BAM files. Tab delimited.
* **samples.txt**: Manifest that specifies some test BAM files. Tab or comma delimited.
* **dummy.fa** a short dummy reference which only contains `chr1`

The output folder `test` then contains gzipped json for final genotypes:
18 changes: 7 additions & 11 deletions RELEASES.md
@@ -1,20 +1,22 @@
# Paragraph Release Notes / Change Log

# Version 2.2b

| Date Y-m-d | Ticket | Description |
|------------|---------|----------------------------------------------------------------------|
| 2019-06-14 | GT-804 | Simplify README and add static build |

# Version 2.2a

| 2019-05-27 | GT-802 | Update license to Apache and fix docker entry |

#Version 2.2
# Version 2.2

| Date Y-m-d | Ticket | Description |
|------------|---------|----------------------------------------------------------------------|
| 2019-05-11 | GT-743 | Update interface and error handling |
| 2018-12-11 | GT-696 | Fix newlines in validation scripts (public repo already fixed) |

# Version 2.1

| Date Y-m-d | Ticket | Description |
|------------|---------|----------------------------------------------------------------------|
| 2018-12-06 | GT-675 | Fix filters and alignment stats. Change depth test threshold on lower end |
| 2018-11-08 | GT-660 | Optimize GQ for variant genotypes |
| 2018-11-02 | GT-656 | Improvement for simple SV genotyping |
@@ -24,8 +26,6 @@

# Version 2.0

| Date Y-m-d | Ticket | Description |
|------------|---------|----------------------------------------------------------------------|
| 2018-06-27 | GT-490 | Paragraph 2.0 release; disable Poisson depth test by default |
| 2018-06-27 | GT-495 | Improved output of phasing information and paths |
| 2018-06-26 | GT-402 | support genotyping on male chrX |
@@ -59,8 +59,6 @@

# Version 1.2

| Date Y-m-d | Ticket | Description |
|------------|---------|----------------------------------------------------------------------|
| 2018-04-05 | GT-429 | option to turn off exact and graph aligners in grmpy |
| 2018-04-05 | GT-428 | upgrade htslib to version 1.8 |
| 2018-04-04 | GT-427 | GT-427 multigrmpy to generate graph ID if vc2toparagraph does not provide it|
@@ -81,8 +79,6 @@

# Version 1.1

| Date Y-m-d | Ticket | Description |
|------------|---------|----------------------------------------------------------------------|
| 2018-02-21 | GT-374 | support for read-level validation |
| 2018-02-19 | GT-379 | configure tool for installation |
| 2018-02-15 | GT-373 | Speedup bam processing by keeping the file open between the graphs |
16 changes: 0 additions & 16 deletions data/download-instructions.txt

This file was deleted.

127 changes: 127 additions & 0 deletions doc/Installation.md
@@ -0,0 +1,127 @@
# Installation of Paragraph

* [System Requirements](#SystemRequirements)
* [Hardware](#Hardware)
* [Operating systems](#Operatingsystems)
* [Third-party libraries](#ThirdPartyLibraries)
* [Static Build](#StaticBuild)
* [Installation](#Installation)
* [Native build](#NativeBuild)
* [From Docker image](#FromDockerImage)

## <a name='SystemRequirements'></a>System Requirements

### <a name='Hardware'></a>Hardware

A standard workstation with at least 8GB of RAM should be sufficient for compilation and testing of the program.

### <a name='Operatingsystems'></a>Operating systems

Paragraph is supported on the following systems:

- Ubuntu 16.04 and CentOS 5-7,
- macOS 10.11+,

Python 3.6+ is required.

We recommend using g++ (6.0+), or a recent version of Clang.

We use the C++11 standard; any POSIX-compliant compiler supporting this standard
should be usable.

### <a name='ThirdPartyLibraries'></a>Third-party libraries

Please check [requirements](../requirements.txt) for required python modules.

We have included copies of other dependent libraries in external/. They are:
- Google Test and Google Mock (v1.8.0)
- Htslib (v1.9)
- Spdlog

## <a name='StaticBuild'></a>Static Build

We provide a static build that works with GCC 5.2+ on Linux. No installation is required for the static build.

Download the static build under the "release" tag of the GitHub repo.

## <a name='Installation'></a>Installation

### <a name='NativeBuild'></a>Native build

[Boost libraries](http://www.boost.org) version >= 1.5 is required.
- We prefer to statically link Boost libraries to Paragraph executables:

```bash
cd ~
wget http://downloads.sourceforge.net/project/boost/boost/1.65.0/boost_1_65_0.tar.bz2
tar xf boost_1_65_0.tar.bz2
cd boost_1_65_0
./bootstrap.sh
./b2 --prefix=$HOME/boost_1_65_0_install link=static install
```

- To point CMake to your version of Boost, use the `BOOST_ROOT` environment variable:

```bash
export BOOST_ROOT=$HOME/boost_1_65_0_install
```

Once you have Boost installed, check out the repository like so:

```bash
git clone https://github.com/Illumina/paragraph.git paragraph-tools
cd paragraph-tools
```

Then create a new directory for the program and compile it there:

```bash
# Create a separate build folder.
cd ..
mkdir paragraph-tools-build
cd paragraph-tools-build

# Configure
# optional:
# export BOOST_ROOT=<path-to-boost-installation>
cmake ../paragraph-tools
# if this doesn't work, run this instead:
# cmake ../paragraph-tools -DCMAKE_CXX_COMPILER=`which g++` -DCMAKE_C_COMPILER=`which gcc` -DBOOST_ROOT=$BOOST_ROOT

# Make, use -j <n> to use n parallel jobs to build, e.g. make -j4
make
```

### <a name='FromDockerImage'></a>From Docker Image
We also provide a [Dockerfile](../Dockerfile). To build a Docker image, run the following command inside the source
checkout folder:

```bash
docker build .
```

Once the image is built you can find out its ID like this:

```bash
docker images
```
```
REPOSITORY TAG IMAGE ID CREATED VIRTUAL SIZE
<none> <none> 54c7d4015330 16 seconds ago 1.76 GB
```

Check the below section for how to run Paragraph, and execute this before running:

```bash
sudo docker run -v `pwd`:/data 54c7d4015330
```

The current directory can be accessed as `/data` inside the Docker container.

The default entry point is `multigrmpy.py`.

To override the default entrypoint and get an interactive shell, run:

```bash
sudo docker run --entrypoint /bin/bash -it 54c7d4015330
```
6 changes: 3 additions & 3 deletions src/python/bin/multigrmpy.py
@@ -326,11 +326,11 @@ def run(args):
line = line.rstrip()
if line.startswith('#'):
line = line[1:]
f = line.split('\t')
fields = re.split('\t|,', line)
if id_index == -1:
id_index = f.index("id")
id_index = fields.index("id")
continue
sample_names.append(f[id_index])
sample_names.append(fields[id_index])
if args.input.endswith("vcf") or args.input.endswith("vcf.gz"):
grmpyOutput = vcfupdate.read_grmpy(result_json_path)
result_vcf_path = os.path.join(args.output, "genotypes.vcf.gz")
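
The manifest-parsing change in the hunk above (splitting each line on tab or comma instead of tab only) can be sketched as a standalone function. This is a minimal illustration, not the code in `multigrmpy.py`: the function name `read_sample_ids`, the blank-line skip, and the sample rows and column names other than `id` are assumptions made for the example.

```python
import re


def read_sample_ids(lines):
    """Collect the `id` column from manifest lines (hypothetical helper).

    Each line may be tab- or comma-delimited; the first line is treated
    as the header and must contain a column named "id".
    """
    sample_names = []
    id_index = -1
    for line in lines:
        line = line.rstrip()
        if not line:
            # Assumption for this sketch: skip blank lines.
            continue
        if line.startswith('#'):
            # A leading '#' is dropped before splitting, as in the hunk.
            line = line[1:]
        # Split on either a tab or a comma, as in the updated code.
        fields = re.split('\t|,', line)
        if id_index == -1:
            # First non-empty line is the header: locate the "id" column.
            id_index = fields.index("id")
            continue
        sample_names.append(fields[id_index])
    return sample_names


# Illustrative manifests (made-up rows), one per delimiter style.
tab_manifest = ["#id\tpath\tdepth", "sample1\ts1.bam\t30"]
comma_manifest = ["id,path,depth", "sample1,s1.bam,30", "sample2,s2.bam,30"]

print(read_sample_ids(tab_manifest))
print(read_sample_ids(comma_manifest))
```

One consequence of splitting on both delimiters at once is that a comma inside a tab-delimited field (for example in a file path) would also be treated as a separator, so paths containing commas are best avoided in the manifest.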