Commit

Merge branch 'main' into impl-validate-command
aryarm authored Oct 1, 2023
2 parents d16f7bd + 7856638 commit 9234bef
Showing 12 changed files with 6 additions and 70 deletions.
7 changes: 0 additions & 7 deletions docs/api/data.rst
@@ -193,13 +193,6 @@ The :class:`GenotypesPLINK` class offers experimental support for reading and wr

The time required to load various genotype file formats.

-.. warning::
-This class depends on the ``Pgenlib`` python library. This can be installed automatically with ``haptools`` if you specify the "files" extra requirements during installation.
-
-.. code-block:: bash
-pip install haptools[files]
The :class:`GenotypesPLINK` class inherits from the :class:`GenotypesVCF` class, so it has all the same methods and properties. Loading genotypes is the exact same, for example.

.. code-block:: python
7 changes: 0 additions & 7 deletions docs/formats/genotypes.rst
@@ -22,13 +22,6 @@ There is also experimental support for `PLINK2 PGEN <https://github.com/chrchang

If you run out of memory when using PGEN files, consider reading/writing variants from the file in chunks via the ``--chunk-size`` parameter.

-.. note::
-PLINK2 support depends on the ``Pgenlib`` python library. This can be installed automatically with ``haptools`` if you specify the "files" extra requirements during installation.
-
-.. code-block:: bash
-pip install haptools[files]
Converting from VCF to PGEN
---------------------------
To convert a VCF containing only biallelic SNPs to PGEN, use the following command.
4 changes: 2 additions & 2 deletions docs/project_info/contributing.rst
@@ -13,7 +13,7 @@ Types of Contributions
~~~~~~~~~~~~
Report a bug
~~~~~~~~~~~~
-If you have found a bug, please report it on `our issues page <https://github.com/aryarm/haptools/issues>`_ rather than emailing us directly. Others may have the same issue and this helps us get that information to them.
+If you have found a bug, please report it on `our issues page <https://github.com/CAST-genomics/haptools/issues>`_ rather than emailing us directly. Others may have the same issue and this helps us get that information to them.

Before you submit a bug, please search through our issues to ensure it hasn't already been reported. If you encounter an issue that has already been reported, please upvote it by reacting with a thumbs-up emoji. This helps us prioritize the issue.

@@ -80,7 +80,7 @@ Follow these steps to set up a development environment.

.. code-block:: bash
-poetry install -E docs -E tests -E files
+poetry install -E docs -E tests
Now, try importing ``haptools`` or running it on the command line.

28 changes: 2 additions & 26 deletions docs/project_info/installation.rst
@@ -9,8 +9,8 @@ Using pip

You can install ``haptools`` from PyPI using ``pip``.

-.. warning::
-We recommend using ``pip >= 20.3`` because of `an issue in pysam <https://github.com/pysam-developers/pysam/issues/1132>`_.
+.. note::
+We recommend using ``pip >= 20.3``.

.. code-block:: bash
@@ -20,27 +20,6 @@ You can install ``haptools`` from PyPI using ``pip``.
pip install haptools
-Installing ``haptools`` with the "files" extra requirements enables automatic support for a variety of additional file formats, like PLINK2 PGEN files.
-
-.. note::
-The "files" extra requirement requires ``gcc`` and a few other compiler tools. Please make sure that they are installed first. To install with conda, for example, please execute the following:
-
-.. code-block:: bash
-conda install -c conda-forge gxx_linux-64
-Alternatively, you can use the following on Ubuntu:
-
-.. code-block:: bash
-sudo apt install build-essential
-See `issue 217 <https://github.com/chrchang/plink-ng/issues/217>`_ for current progress on this problem.
-
-.. code-block:: bash
-pip install 'haptools[files]'
Using conda
-----------

@@ -50,9 +29,6 @@ We also support installing ``haptools`` from bioconda using ``conda``.
conda install -c conda-forge -c bioconda haptools
-.. note::
-Installing ``haptools`` from bioconda with PGEN support is not yet possible. See `issue 228 <https://github.com/chrchang/plink-ng/issues/228>`_ for current progress on this challenge.

Installing the latest, unreleased version
-----------------------------------------
Can't wait for us to tag and release our most recent updates? You can install ``haptools`` directly from the ``main`` branch of our Github repository using ``pip``.
14 changes: 1 addition & 13 deletions haptools/data/genotypes.py
@@ -7,6 +7,7 @@
from logging import getLogger, Logger
from collections import namedtuple, Counter

+import pgenlib
import numpy as np
import numpy.typing as npt
from cyvcf2 import VCF, Variant
@@ -958,15 +959,6 @@ class GenotypesPLINK(GenotypesVCF):
def __init__(self, fname: Path | str, log: Logger = None, chunk_size: int = None):
super().__init__(fname, log)
self.chunk_size = chunk_size
-try:
-global pgenlib
-import pgenlib
-except ImportError:
-raise ImportError(
-f"We cannot read PGEN files without the pgenlib library. Please "
-f"reinstall haptools with the 'files' extra requirement via\n"
-f"pip install haptools[files]"
-)

def read_samples(self, samples: list[str] = None):
"""
@@ -1238,7 +1230,6 @@ def read(
See documentation for :py:attr:`~.GenotypesVCF.read`
"""
super(Genotypes, self).read()
-import pgenlib

sample_idxs = self.read_samples(samples)
pv = pgenlib.PvarReader(bytes(str(self.fname.with_suffix(".pvar")), "utf8"))
@@ -1395,7 +1386,6 @@ def __iter__(
See documentation for :py:meth:`~.GenotypesPLINK._iterate`
"""
super(Genotypes, self).read()
-import pgenlib

pv = pgenlib.PvarReader(bytes(str(self.fname.with_suffix(".pvar")), "utf8"))

@@ -1484,8 +1474,6 @@ def write(self):
Write the variants in this class to PLINK2 files at
:py:attr:`~.GenotypesPLINK.fname`
"""
-import pgenlib

# write the psam and pvar files
self.write_samples()
self.write_variants()
1 change: 0 additions & 1 deletion poetry.lock

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 1 addition & 5 deletions pyproject.toml
@@ -16,7 +16,7 @@ click = ">=8.0.3"
pysam = ">=0.19.0"
cyvcf2 = ">=0.30.14"
matplotlib = ">=3.5.1"
-Pgenlib = {version = ">=0.90.1", optional = true, extras = ["files"]}
+Pgenlib = ">=0.90.1"

# docs
# these belong in dev-dependencies, but RTD doesn't support that yet -- see
@@ -61,10 +61,6 @@ tests = [
"nox-poetry"
]

-files = [
-"Pgenlib"
-]

[tool.poetry.scripts]
haptools = 'haptools.__main__:main'

4 changes: 0 additions & 4 deletions tests/test_data.py
@@ -313,7 +313,6 @@ def test_merge_variants(self):

class TestGenotypesPLINK:
def _get_fake_genotypes_plink(self):
-pgenlib = pytest.importorskip("pgenlib")
gts_ref_alt = TestGenotypesVCF()._get_fake_genotypes_refalt()
gts = GenotypesPLINK(gts_ref_alt.fname)
gts.data = gts_ref_alt.data
@@ -322,7 +321,6 @@ def _get_fake_genotypes_plink(self):
return gts

def _get_fake_genotypes_multiallelic(self):
-pgenlib = pytest.importorskip("pgenlib")
gts_ref_alt = TestGenotypesVCF()._get_fake_genotypes_multiallelic()
gts = GenotypesPLINK(gts_ref_alt.fname)
gts.data = gts_ref_alt.data
@@ -331,8 +329,6 @@ def _get_fake_genotypes_multiallelic(self):
return gts

def _get_fake_genotypes_multiallelic_tr(self):
-pgenlib = pytest.importorskip("pgenlib")

gts_tr = GenotypesTR(DATADIR / "simple-tr.vcf")
gts_tr.read()

2 changes: 0 additions & 2 deletions tests/test_outputvcf.py
@@ -21,8 +21,6 @@

def _get_files(plink_input=False, plink_output=False):
log = getLogger(name="test")
-if plink_input or plink_output:
-pytest.importorskip("pgenlib")
bkp_file = DATADIR / "outvcf_test.bp"
model_file = DATADIR / "outvcf_gen.dat"
vcf_file = DATADIR / ("outvcf_test" + (".pgen" if plink_input else ".vcf.gz"))
1 change: 0 additions & 1 deletion tests/test_simgenotype.py
@@ -43,7 +43,6 @@ def test_basic(capfd):


def test_pgen_output(capfd):
-pytest.importorskip("pgenlib")
prefix = DATADIR / "example_simgenotype.pgen"
dat_file = DATADIR / "outvcf_gen.dat"
map_dir = DATADIR / "map"
1 change: 0 additions & 1 deletion tests/test_simphenotype.py
@@ -526,7 +526,6 @@ def test_ancestry(self, capfd):
tmp_transform.unlink()

def test_pgen(self, capfd):
-pytest.importorskip("pgenlib")
# first, create a temporary file containing the output of transform
tmp_tsfm = Path("simple-haps.pgen")
gt_file = DATADIR / "simple.pgen"
1 change: 0 additions & 1 deletion tests/test_transform.py
@@ -261,7 +261,6 @@ def test_basic_subset(capfd):


def test_basic_pgen_input(capfd):
-pytest.importorskip("pgenlib")
expected = """##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=1>

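For readers unfamiliar with the guarded-import pattern that this diff drops from `GenotypesPLINK.__init__`, the following is a minimal, generic sketch of it. The helper name `require_optional` and the `example`/`files` package and extra names are hypothetical and are not part of haptools; only the standard library's `importlib` is used.

```python
import importlib
from types import ModuleType


def require_optional(module_name: str, extra: str, package: str) -> ModuleType:
    """Import module_name, or raise an ImportError that names the extra to install.

    Hypothetical helper for illustration only; it is not haptools code.
    """
    try:
        return importlib.import_module(module_name)
    except ImportError as err:
        raise ImportError(
            f"The '{module_name}' library is required for this feature. "
            f"Please reinstall via\npip install '{package}[{extra}]'"
        ) from err


# A module that is installed is returned like a normal import
json_mod = require_optional("json", extra="files", package="example")

# A module that is missing produces the more helpful message
try:
    require_optional("not_a_real_module_xyz", extra="files", package="example")
except ImportError as err:
    message = str(err)
```

Once a dependency is declared as required, as `Pgenlib` is in the `pyproject.toml` hunk above, a guard like this is no longer needed and a plain module-level `import` is enough.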