Commit

docs(.): Add documentation for lut_check
Adds proper documentation for code introduced by
9324872. Updates docstrings in relevant
`scripts` files and modifies the necessary sphinx rst files. Ensures
doctests continue to pass by adding pandas as a dep to the relevant
environment.

See: #2
rbpatt2019 committed Jul 6, 2021
1 parent bd0f62e commit 4042dbd
Showing 7 changed files with 34 additions and 3 deletions.
4 changes: 2 additions & 2 deletions README.rst
@@ -103,7 +103,7 @@ The following will do the trick:
xargs -n1 curl -sL |
tar xzf -
After querying the github api to ge the most recent release information,
After querying the github api to get the most recent release information,
we grep for the desired URL,
split the line and extract the field,
trim superfluous characters,
@@ -195,7 +195,7 @@ as we ran it,
with the software versions,
as we used them.
To further aid in this effort,
`nox`_ and `pre-commit` are used,
`nox`_ and `pre-commit`_ are used,
which also ensures that development happens in reproducible environments.

Unfortunately,
7 changes: 7 additions & 0 deletions docs/data_handling.rst
@@ -5,6 +5,13 @@ data_handling Module

.. automodule:: scripts.data_handling

data_handling.request
---------------------

.. automodule:: scripts.data_handling.request
   :members:
   :private-members:

data_handling.process
---------------------

6 changes: 6 additions & 0 deletions docs/data_handling_tests.rst
@@ -5,6 +5,12 @@ Tests for the data_handling Module

.. automodule:: tests.data_handling

Tests for the data_handling.request Submodule
---------------------------------------------

.. automodule:: tests.data_handling.test_request
   :members:

Tests for the data_handling.process Submodule
---------------------------------------------

1 change: 1 addition & 0 deletions environments/doc_tests.txt
@@ -1,2 +1,3 @@
xdoctest==0.15.4
pytest==6.2.4
pandas==1.2.4
8 changes: 7 additions & 1 deletion scripts/data_handling/request.py
@@ -29,7 +29,6 @@ def lut_check(gene: str, lut: pd.DataFrame) -> Optional[str]: # type: ignore
Common reasons (at least for me!) that a gene might not be found include
spelling errors and name errors (e.g., using NGN2 instead of NEUROG2).
Parameters
----------
gene : str
@@ -41,6 +40,13 @@ def lut_check(gene: str, lut: pd.DataFrame) -> Optional[str]: # type: ignore
-------
Optional[str]
Example
-------
>>> lut = pd.DataFrame.from_dict({"name": ["ASCL1"], "id": ["ENSG00000139352.3"]})
>>> lut_check("ASCL1", lut)
'ENSG00000139352.3'
>>> lut_check("NotAGene", lut)
"""
    with contextlib.suppress(IndexError):
        return lut.loc[lut["name"] == gene, "id"].values[0]
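The doctest added above can also be run as a standalone script. The following is a minimal sketch that reassembles `lut_check` from the lines visible in this hunk; the explicit `return None` and the module-level driver code are additions for illustration and are not part of the commit.

```python
import contextlib
from typing import Optional

import pandas as pd


def lut_check(gene: str, lut: pd.DataFrame) -> Optional[str]:
    """Return the ID stored for ``gene``, or None if it is absent from ``lut``."""
    # An empty selection makes ``.values[0]`` raise IndexError. Suppressing it
    # lets execution fall through to the explicit ``return None`` below.
    with contextlib.suppress(IndexError):
        return lut.loc[lut["name"] == gene, "id"].values[0]
    return None  # Added for readability; the original returns None implicitly.


# Same single-row lookup table as the doctest in the diff.
lut = pd.DataFrame.from_dict({"name": ["ASCL1"], "id": ["ENSG00000139352.3"]})
found = lut_check("ASCL1", lut)
missing = lut_check("NotAGene", lut)
print(found, missing)
```

Returning `None` instead of raising lets the caller decide how to handle a missing gene, which matches the `Optional[str]` return annotation.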
3 changes: 3 additions & 0 deletions scripts/multithreading/request.py
@@ -69,6 +69,9 @@ def _get_session(region: str) -> requests.Session:
def gtex_request(region: str, gene: str, output: str) -> None:
"""Make a thread-safe GTEx request against medianTranscriptExpression.
If gene is a str, then a query is made to GTEx; however, if gene is None, then
a blank file is created and no query is performed.
A thread-local session is provided by a call to ``_get_session``.
This allows the reuse of sessions, which, among other things,
provides significant speedups.
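The two behaviours described in this docstring, one session per thread and a blank file when `gene` is None, can be sketched without network access. Everything below is an assumption made for illustration: `StubSession` stands in for `requests.Session`, and `get_session`, `request_or_blank`, and the placeholder URL are hypothetical names, not the repository's code.

```python
import tempfile
import threading
from pathlib import Path
from typing import Optional

_thread_local = threading.local()


class StubSession:
    """Hypothetical offline stand-in for ``requests.Session``."""

    def get(self, url: str, params: dict) -> str:
        return f"response for {params['gene']}"


def get_session() -> StubSession:
    # Create one session per thread on first use, then reuse it.
    if not hasattr(_thread_local, "session"):
        _thread_local.session = StubSession()
    return _thread_local.session


def request_or_blank(gene: Optional[str], output: Path) -> None:
    if gene is None:
        # No query is made; an empty file marks the miss for downstream steps.
        output.touch()
        return
    # Placeholder URL: the stub never opens a connection.
    output.write_text(get_session().get("https://example.invalid", {"gene": gene}))


with tempfile.TemporaryDirectory() as tmp:
    hit, miss = Path(tmp) / "hit.txt", Path(tmp) / "miss.txt"
    request_or_blank("ENSG00000139352.3", hit)
    request_or_blank(None, miss)
    hit_text = hit.read_text()
    miss_size = miss.stat().st_size

# Repeated calls in one thread return the same object.
same_thread_reuse = get_session() is get_session()

# A different thread gets its own session.
other = []
worker = threading.Thread(target=lambda: other.append(get_session()))
worker.start()
worker.join()
distinct_across_threads = other[0] is not get_session()
```

`threading.local` gives each thread its own attribute namespace, so no locking is needed around session creation.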
8 changes: 8 additions & 0 deletions scripts/request.py
@@ -4,6 +4,14 @@
This step queries the GTEx API for transcript expression data in the region
specified by the user,
using a user-provided list of gene names.
To simplify the user experience,
the user should provide common gene names.
These are then automatically converted to the GTEx-required Ensembl IDs
by referencing Gencode v26, as this is the version used by GTEx.
If a gene name is not found in Gencode,
then a warning is dumped to the logs,
and a blank file is created to propagate this error downstream.
As the GTEx API is quite straightforward,
these queries can be made using the standard `requests.session`_ object.
Data were pulled from the ``gtex_v8`` dataset limited to the
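The name-to-ID conversion step described in this docstring might look roughly like the sketch below. It swaps the pandas lookup table for a plain dict to stay dependency-free; `resolve_genes`, the logger name, and the warning text are hypothetical and are not taken from the repository.

```python
import logging
from typing import Dict, List, Optional

logger = logging.getLogger("request_sketch")  # hypothetical logger name


def resolve_genes(names: List[str], lut: Dict[str, str]) -> Dict[str, Optional[str]]:
    """Map each common gene name to its ID, or to None when it is unknown."""
    resolved: Dict[str, Optional[str]] = {}
    for name in names:
        gene_id = lut.get(name)
        if gene_id is None:
            # Mirrors the documented behaviour: warn, then let the caller
            # write a blank file so the miss propagates downstream.
            logger.warning("%s not found in lookup table", name)
        resolved[name] = gene_id
    return resolved


# Single known entry, reusing the ID from the doctest in this commit.
lut = {"ASCL1": "ENSG00000139352.3"}
resolved = resolve_genes(["ASCL1", "NotAGene"], lut)
```

Keeping unknown names in the result with a `None` value, instead of dropping them, preserves one output per requested gene.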