Refactor GenomicRangesList & update all docstrings (#23)
* Rewrite `GenomicRangesList`
* Use simple typehints
* Revamp docstrings across all classes
* Update readme

---------

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
jkanche and pre-commit-ci[bot] authored Sep 20, 2023
1 parent b74bc11 commit 69bbaa4
Showing 13 changed files with 1,234 additions and 842 deletions.
16 changes: 8 additions & 8 deletions README.md
@@ -4,8 +4,7 @@

# GenomicRanges

Container class to represent genomic locations and support genomic analysis in Python similar to Bioconductor's [GenomicRanges](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html).

GenomicRanges is a Python container class designed to represent genomic locations and support genomic analysis. It is similar to Bioconductor's [GenomicRanges](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html).

## Install

@@ -17,13 +16,13 @@ pip install genomicranges

## Usage

The package provide several ways to represent genomic annotations and intervals.
The package provides several ways to represent genomic annotations and intervals.

### Initialize a `GenomicRanges` object

#### From UCSC or GTF file

Methods are available to easily access UCSC genomes or load a genome annotation from GTF
You can easily access UCSC genomes or load a genome annotation from a GTF file using the following methods:

```python
import genomicranges
@@ -34,9 +33,10 @@ gr = genomicranges.from_ucsc(genome="hg19")
```
#### Pandas DataFrame

A common representation in Python is a pandas DataFrame for all tabular datasets. One can convert this into `GenomicRanges`. ***Intervals are inclusive on both ends.***
A common representation in Python is a pandas DataFrame for all tabular datasets. You can convert a DataFrame into a `GenomicRanges` object. Please note that intervals are inclusive on both ends, and your DataFrame must contain columns seqnames, starts, and ends to represent genomic coordinates.

Here's an example:

***Note: DataFrame must contain columns `seqnames`, `starts` and `ends` to represent genomic coordinates.***
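
The two requirements stated above (the `seqnames`, `starts` and `ends` columns, and intervals that are inclusive on both ends) can be illustrated with a minimal, library-free sketch. The helper names `missing_columns` and `inclusive_widths` and the sample values are hypothetical, introduced only for illustration; they are not part of the `genomicranges` API. The `table` dict has the column-oriented shape that `pandas.DataFrame(...)` accepts.

```python
# Column names the README says the DataFrame must contain.
REQUIRED_COLUMNS = ("seqnames", "starts", "ends")

# Made-up illustration data, laid out column by column
# (the same shape can be passed to pandas.DataFrame).
table = {
    "seqnames": ["chr1", "chr2", "chr2"],
    "starts": [101, 102, 103],
    "ends": [112, 103, 128],
}


def missing_columns(columns):
    """Return the required column names that are absent from `columns`."""
    return [name for name in REQUIRED_COLUMNS if name not in columns]


def inclusive_widths(starts, ends):
    """Width of each interval when both endpoints belong to the interval."""
    return [end - start + 1 for start, end in zip(starts, ends)]


missing = missing_columns(table)
widths = inclusive_widths(table["starts"], table["ends"])
print(missing)  # no required column is missing
print(widths)   # widths count both endpoints
```

Because both ends are included, an interval with `starts=102` and `ends=103` covers two positions, not one.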

```python
import genomicranges
@@ -58,7 +58,7 @@ gr = genomicranges.from_pandas(df)
```

### Interval Operations

Currently supports most commonly used [interval based operations](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html).
GenomicRanges currently supports most commonly used [interval based operations](https://bioconductor.org/packages/release/bioc/html/GenomicRanges.html).

```python
subject = genomicranges.from_ucsc(genome="hg38")
@@ -77,7 +77,7 @@ hits = subject.nearest(query)
print(hits)
```
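
As a rough illustration of what a nearest-interval lookup involves, here is a minimal sketch in plain Python. It is an assumption-laden toy, not the library's implementation: it ignores `seqnames` and strand, treats intervals as inclusive `(start, end)` tuples, breaks ties by taking the first subject interval, and returns plain indices rather than whatever `subject.nearest(query)` returns. The names `gap` and `nearest_index` are hypothetical.

```python
def gap(a, b):
    """Positions strictly between two inclusive intervals; 0 if they overlap or are adjacent."""
    (a_start, a_end), (b_start, b_end) = a, b
    if a_end < b_start:
        return b_start - a_end - 1
    if b_end < a_start:
        return a_start - b_end - 1
    return 0


def nearest_index(subject, query_interval):
    """Index of the subject interval with the smallest gap (first one wins on ties)."""
    return min(range(len(subject)), key=lambda i: gap(subject[i], query_interval))


# Made-up intervals on a single, implicit sequence.
subject = [(1, 10), (20, 30), (50, 60)]
query = [(12, 15), (31, 40), (55, 58)]

hits = [nearest_index(subject, q) for q in query]
print(hits)
```

This brute-force scan compares every query against every subject interval; it is meant to show the idea, not to be efficient on genome-scale data.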

Checkout the [documentation](https://biocpy.github.io/GenomicRanges/) for more usecases.
For more usage examples, check out the [documentation](https://biocpy.github.io/GenomicRanges/).


<!-- pyscaffold-notes -->
2 changes: 1 addition & 1 deletion setup.cfg
@@ -50,7 +50,7 @@ package_dir =
install_requires =
importlib-metadata; python_version<"3.8"
pandas
biocframe>=0.3.5
biocframe>=0.3.9
numpy
prettytable
