Merge pull request #53 from phac-nml/development
Release v2.1.0
peterk87 authored Jul 31, 2018
2 parents e73b7fc + 9790624 commit b333f2e
Showing 14 changed files with 664 additions and 352 deletions.
2 changes: 1 addition & 1 deletion .travis.yml
@@ -10,7 +10,6 @@ notifications: # set notification options

language: python
python:
- '3.5'
- '3.6'
os:
- linux
@@ -20,6 +19,7 @@ os:
branches:
only:
- master
- development

# Blacklist of branches to not run CI testing on.
# branches:
66 changes: 51 additions & 15 deletions README.rst
@@ -1,17 +1,20 @@
|logo|

|conda| |nbsp| |pypi| |nbsp| |license| |nbsp| |nbsp| Master:|citest-master| |nbsp| Development:|citest-dev|

|license| |nbsp| |citest| |nbsp| |pypi| |nbsp| |conda|

.. |logo| image:: https://i.imgur.com/yYOkFlH.png
:target: https://github.com/phac-nml/bio_hansel

.. |logo| image:: logo.png
:target: https://github.com/phac-nml/biohansel
.. |pypi| image:: https://badge.fury.io/py/bio-hansel.svg
:target: https://pypi.python.org/pypi/bio_hansel/
.. |license| image:: https://img.shields.io/badge/License-Apache%20v2.0-blue.svg
:target: http://www.apache.org/licenses/LICENSE-2.0
.. |citest| image:: https://travis-ci.org/phac-nml/bio_hansel.svg?branch=master
:target: https://travis-ci.org/phac-nml/bio_hansel
.. |conda| image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg?style=flat-square
.. |citest-dev| image:: https://travis-ci.org/phac-nml/biohansel.svg?branch=development
:target: https://travis-ci.org/phac-nml/biohansel
.. |citest-master| image:: https://travis-ci.org/phac-nml/biohansel.svg?branch=master
:target: https://travis-ci.org/phac-nml/biohansel
.. |conda| image:: https://img.shields.io/badge/install%20with-bioconda-brightgreen.svg
:target: https://bioconda.github.io/recipes/bio_hansel/README.html
.. |nbsp| unicode:: 0xA0
:trim:
@@ -26,7 +29,7 @@ Works on genome assemblies (FASTA files) or reads (FASTQ files)! Accepts Gzipped
Citation
========

If you find this tool useful, please cite as:
If you find the ``biohansel`` tool useful, please cite as:

.. epigraph::

@@ -38,11 +41,11 @@ If you find this tool useful, please cite as:
Requirements and Dependencies
=============================

This tool has only been tested on Linux (specifically Arch Linux). It may or may not work on OSX.
Each new build of ``biohansel`` is automatically tested on Linux using `Continuous Integration <https://travis-ci.org/phac-nml/bio_hansel/branches>`_. ``biohansel`` has been confirmed to work on Mac OSX (versions 10.13.5 Beta and 10.12.6) when installed with Conda_.

These are the dependencies required for ``bio_hansel``:
These are the dependencies required for ``biohansel``:

- Python_ (>=v3.5)
- Python_ (>=v3.6)
- numpy_ >=1.12.1
- pandas_ >=0.20.1
- pyahocorasick_ >=1.1.6
@@ -55,21 +58,21 @@ Installation
With Conda_
-----------

Install ``bio_hansel`` from Bioconda_ with Conda_ (`Conda installation instructions <https://bioconda.github.io/#install-conda>`_):
Install ``biohansel`` from Bioconda_ with Conda_ (`Conda installation instructions <https://bioconda.github.io/#install-conda>`_):

.. code-block:: bash

    # setup Conda channels for Bioconda and Conda-Forge (https://bioconda.github.io/#set-up-channels)
    conda config --add channels defaults
    conda config --add channels conda-forge
    conda config --add channels bioconda
    # install bio_hansel
    # install biohansel
    conda install bio_hansel

With pip_ from PyPI_
---------------------

Install ``bio_hansel`` from PyPI_ with pip_:
Install ``biohansel`` from PyPI_ with pip_:

.. code-block:: bash
@@ -82,12 +85,12 @@ Or install the latest master branch version directly from GitHub:

.. code-block:: bash

    pip install git+https://github.com/phac-nml/bio_hansel.git@master
    pip install git+https://github.com/phac-nml/biohansel.git@master

Install into Galaxy_ (version >= 17.01)
---------------------------------------

Install ``bio_hansel`` from the main Galaxy_ toolshed:
Install ``biohansel`` from the main Galaxy_ toolshed:

https://toolshed.g2.bx.psu.edu/repository?repository_id=59b90ef18cc5dbbc&changeset_revision=4654c51dae72

@@ -246,6 +249,38 @@ Analysis of all FASTA/FASTQ files in a directory
``hansel`` will only attempt to analyze the FASTA/FASTQ files within the specified directory and will not descend into any subdirectories!


Development
===========


Get the latest development code using Git from GitHub:

.. code-block:: bash

    git clone https://github.com/phac-nml/biohansel.git
    cd biohansel/
    git checkout development
    # Create a virtual environment (virtualenv) for development
    virtualenv -p python3 .venv
    # Activate the newly created virtualenv
    source .venv/bin/activate
    # Install biohansel into the virtualenv in "editable" mode
    pip install -e .

Run tests with pytest_:

.. code-block:: bash

    # In the biohansel/ root directory, install pytest for running tests
    pip install pytest
    # Run all tests in tests/ directory
    pytest
    # Or run a specific test module
    pytest -s tests/test_qc.py

Legal
=====

@@ -280,3 +315,4 @@ Contact
.. _attrs: http://www.attrs.org/en/stable/
.. _Python: https://www.python.org/
.. _Galaxy: https://galaxyproject.org/
.. _pytest: https://docs.pytest.org/en/latest/
2 changes: 1 addition & 1 deletion bio_hansel/__init__.py
@@ -1,6 +1,6 @@
# -*- coding: utf-8 -*-

__version__ = '2.0.0'
__version__ = '2.1.0'
program_name = 'bio_hansel'
program_summary = 'Subtype microbial genomes using SNV targeting k-mer subtyping schemes.'
program_desc = program_summary + '''
4 changes: 2 additions & 2 deletions bio_hansel/const.py
@@ -10,7 +10,7 @@
'version': '0.5.0',
'subtyping_params': SubtypingParams(low_coverage_depth_freq=20)},
'enteritidis': {'file': resource_filename(program_name, 'data/enteritidis/tiles.fasta'),
'version': '0.7.0',
'version': '0.8.0',
'subtyping_params': SubtypingParams(low_coverage_depth_freq=50)}}

COLUMNS_TO_REMOVE = '''
@@ -65,4 +65,4 @@
REGEX_FASTQ = re.compile(r'^(.+)\.(fastq|fq|fastqsanger)(\.gz)?$')
REGEX_FASTA = re.compile(r'^.+\.(fasta|fa|fna|fas)(\.gz)?$')

JSON_EXT_TMPL = '{}.json'
JSON_EXT_TMPL = '{}.json'
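
The two filename patterns in this hunk decide whether an input is treated as reads or as an assembly. As a minimal sketch of how they behave, the snippet below copies both regexes verbatim from ``const.py``; the ``classify`` helper is hypothetical and only illustrates the matching, it is not a function in ``bio_hansel``.

```python
import re

# Copied verbatim from bio_hansel/const.py in this commit.
REGEX_FASTQ = re.compile(r'^(.+)\.(fastq|fq|fastqsanger)(\.gz)?$')
REGEX_FASTA = re.compile(r'^.+\.(fasta|fa|fna|fas)(\.gz)?$')


def classify(filename):
    """Hypothetical helper: label a filename by which pattern it matches."""
    if REGEX_FASTQ.match(filename):
        return 'fastq'
    if REGEX_FASTA.match(filename):
        return 'fasta'
    return None


for name in ['sample_1.fastq.gz', 'genome.fna', 'notes.txt']:
    print(name, '->', classify(name))
```

Both patterns accept an optional ``.gz`` suffix, which is consistent with the README's note that gzipped inputs are accepted.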
54 changes: 28 additions & 26 deletions bio_hansel/main.py
@@ -4,18 +4,20 @@
import argparse
import logging
import sys
import os
import re
import os
from typing import Optional, List, Any, Tuple

import attr
import pandas as pd

from . import program_desc, __version__
from .const import SUBTYPE_SUMMARY_COLS, REGEX_FASTQ, REGEX_FASTA, JSON_EXT_TMPL
from .subtype import Subtype
from .subtype_stats import subtype_counts
from .subtyper import \
query_contigs_ac, \
query_reads_ac
subtype_contigs_samples, \
subtype_reads_samples
from .metadata import read_metadata_table, merge_metadata_with_summary_results
from .utils import \
genome_name_from_fasta_path, \
@@ -192,36 +194,36 @@ def main():
scheme_subtype_counts = subtype_counts(scheme_fasta)
logging.debug(args)
subtyping_params = init_subtyping_params(args, scheme)
input_genomes, reads = collect_inputs(args)
if len(input_genomes) == 0 and len(reads) == 0:
input_contigs, input_reads = collect_inputs(args)
if len(input_contigs) == 0 and len(input_reads) == 0:
raise Exception('No input files specified!')
df_md = None
if args.scheme_metadata:
df_md = read_metadata_table(args.scheme_metadata)
n_threads = args.threads

subtype_results = [] # type: List[Subtype]
dfs = [] # type: List[pd.DataFrame]
if len(input_genomes) > 0:
query_contigs_ac(subtype_results=subtype_results,
dfs=dfs,
input_genomes=input_genomes,
scheme=scheme,
scheme_name=scheme_name,
subtyping_params=subtyping_params,
scheme_subtype_counts=scheme_subtype_counts,
n_threads=n_threads)
if len(reads) > 0:
query_reads_ac(subtype_results=subtype_results,
dfs=dfs,
reads=reads,
scheme=scheme,
scheme_name=scheme_name,
subtyping_params=subtyping_params,
scheme_subtype_counts=scheme_subtype_counts,
n_threads=n_threads)
subtype_results = [] # type: List[Tuple[Subtype, pd.DataFrame]]
if len(input_contigs) > 0:
contigs_results = subtype_contigs_samples(input_genomes=input_contigs,
scheme=scheme,
scheme_name=scheme_name,
subtyping_params=subtyping_params,
scheme_subtype_counts=scheme_subtype_counts,
n_threads=n_threads)
logging.info('Generated %s subtyping results from %s contigs samples', len(contigs_results), len(input_contigs))
subtype_results += contigs_results
if len(input_reads) > 0:
reads_results = subtype_reads_samples(reads=input_reads,
scheme=scheme,
scheme_name=scheme_name,
subtyping_params=subtyping_params,
scheme_subtype_counts=scheme_subtype_counts,
n_threads=n_threads)
logging.info('Generated %s subtyping results from %s reads samples', len(reads_results), len(input_reads))
subtype_results += reads_results

dfsummary = pd.DataFrame(subtype_results)
dfs = [df for st, df in subtype_results] # type: List[pd.DataFrame]
dfsummary = pd.DataFrame([attr.asdict(st) for st, df in subtype_results])
dfsummary = dfsummary[SUBTYPE_SUMMARY_COLS]

if dfsummary['avg_tile_coverage'].isnull().all():
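
The refactored ``main()`` in this hunk collects ``(Subtype, DataFrame)`` pairs from both subtyping functions, then splits them into per-sample tables and a summary built from ``attr.asdict`` records. The sketch below illustrates that split with the standard library only, under loud assumptions: ``dataclasses.asdict`` stands in for ``attr.asdict``, plain lists of dicts stand in for pandas DataFrames, and the ``SubtypeResult`` class with its two fields is hypothetical rather than the real ``Subtype`` class.

```python
from dataclasses import dataclass, asdict
from typing import Any, Dict, List, Tuple


@dataclass
class SubtypeResult:
    """Hypothetical stand-in for bio_hansel's attrs-based Subtype class."""
    sample: str
    subtype: str


# Stand-in for a per-sample pandas DataFrame of k-mer matches.
Table = List[Dict[str, Any]]


def split_results(
    results: List[Tuple[SubtypeResult, Table]]
) -> Tuple[List[Dict[str, Any]], List[Table]]:
    """Split (result, table) pairs into summary rows and detail tables."""
    summary_rows = [asdict(st) for st, _ in results]
    detail_tables = [table for _, table in results]
    return summary_rows, detail_tables


# Results from the contigs and reads paths are concatenated, as in main().
contigs_results = [(SubtypeResult('sampleA', '1.1'), [{'kmer': 'k1', 'freq': 30}])]
reads_results = [(SubtypeResult('sampleB', '2.2'), [{'kmer': 'k2', 'freq': 12}])]
subtype_results = contigs_results + reads_results

summary_rows, detail_tables = split_results(subtype_results)
print(summary_rows)
```

Returning pairs instead of appending to caller-supplied lists keeps each summary record aligned with its detail table, which appears to be the motivation for dropping the ``subtype_results`` and ``dfs`` output arguments.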