Python implementation of the IRanges Bioconductor package.
To get started, install the package from PyPI:
pip install iranges
# To install optional dependencies
pip install iranges[optional]
An IRanges holds a start position and a width, and is most typically used to represent coordinates along some genomic sequence. The interpretation of the start position depends on the application; for sequences, the start is usually a 1-based position, but other use cases may allow zero or even negative values.
Note: Ends are inclusive.
from iranges import IRanges
starts = [1, 2, 3, 4]
widths = [4, 5, 6, 7]
x = IRanges(starts, widths)
print(x)
## output
IRanges object with 4 ranges and 0 metadata columns
start end width
<ndarray[int32]> <ndarray[int32]> <ndarray[int32]>
[0] 1 4 4
[1] 2 6 5
[2] 3 8 6
[3] 4 10 7
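Since ends are inclusive, the end column follows from the start and width as end = start + width - 1. The snippet below is a minimal plain-Python sketch of that arithmetic; `inclusive_end` is a hypothetical helper written for illustration and is not part of the iranges API.

```python
def inclusive_end(start: int, width: int) -> int:
    """Return the last position covered by a range, under inclusive ends."""
    return start + width - 1


starts = [1, 2, 3, 4]
widths = [4, 5, 6, 7]

# One end per (start, width) pair.
ends = [inclusive_end(s, w) for s, w in zip(starts, widths)]
print(ends)  # prints [4, 6, 8, 10]
```

Under this convention a zero-width range has an end one less than its start, for example a start of 6 with a width of 0 gives an end of 5.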
IRanges supports most interval-based operations. For example, to compute gaps:
x = IRanges([-2, 6, 9, -4, 1, 0, -6, 10], [5, 0, 6, 1, 4, 3, 2, 3])
gaps = x.gaps()
print(gaps)
## output
IRanges object with 2 ranges and 0 metadata columns
start end width
<ndarray[int32]> <ndarray[int32]> <ndarray[int32]>
[0] -3 -3 1
[1] 5 8 4
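To make the result easier to check by hand, here is a rough plain-Python sketch of one way to compute gaps: merge the ranges into disjoint spans, then report the positions between consecutive spans. The functions `merge` and `gaps` are hypothetical helpers for illustration, not the library's implementation, and the sketch assumes zero-width ranges can simply be ignored.

```python
def merge(ranges):
    """Merge (start, width) ranges into sorted, disjoint inclusive (start, end) spans."""
    # Assumption: zero-width ranges cover no positions, so they are dropped.
    spans = sorted((s, s + w - 1) for s, w in ranges if w > 0)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1] + 1:
            # Overlaps or touches the previous span: extend it.
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


def gaps(ranges):
    """Return the inclusive (start, end) spans lying between merged ranges."""
    merged = merge(ranges)
    return [
        (prev_end + 1, next_start - 1)
        for (_, prev_end), (next_start, _) in zip(merged, merged[1:])
    ]


starts = [-2, 6, 9, -4, 1, 0, -6, 10]
widths = [5, 0, 6, 1, 4, 3, 2, 3]
result = gaps(list(zip(starts, widths)))
print(result)  # prints [(-3, -3), (5, 8)]
```

The two spans correspond to the two rows of the output above: position -3 on its own (width 1) and positions 5 through 8 (width 4).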
Or perform interval set operations:
x = IRanges([1, 5, -2, 0, 14], [10, 5, 6, 12, 4])
y = IRanges([14, 0, -5, 6, 18], [7, 3, 8, 3, 3])
intersection = x.intersect(y)
print(intersection)
## output
IRanges object with 3 ranges and 0 metadata columns
start end width
<ndarray[int32]> <ndarray[int32]> <ndarray[int32]>
[0] -2 2 5
[1] 6 8 3
[2] 14 17 4
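The intersection can also be worked out by hand. The following is a minimal plain-Python sketch, assuming a merge step followed by a two-pointer sweep over the disjoint spans; `merge` and `intersect` are hypothetical helpers for illustration and do not reflect how iranges implements the operation.

```python
def merge(ranges):
    """Merge (start, width) ranges into sorted, disjoint inclusive (start, end) spans."""
    # Assumption: zero-width ranges cover no positions, so they are dropped.
    spans = sorted((s, s + w - 1) for s, w in ranges if w > 0)
    merged = []
    for start, end in spans:
        if merged and start <= merged[-1][1] + 1:
            merged[-1][1] = max(merged[-1][1], end)
        else:
            merged.append([start, end])
    return [tuple(span) for span in merged]


def intersect(xs, ys):
    """Return inclusive (start, end) spans covered by both sets of ranges."""
    a, b = merge(xs), merge(ys)
    out = []
    i = j = 0
    while i < len(a) and j < len(b):
        lo = max(a[i][0], b[j][0])
        hi = min(a[i][1], b[j][1])
        if lo <= hi:
            out.append((lo, hi))
        # Advance whichever span ends first.
        if a[i][1] < b[j][1]:
            i += 1
        else:
            j += 1
    return out


x = list(zip([1, 5, -2, 0, 14], [10, 5, 6, 12, 4]))
y = list(zip([14, 0, -5, 6, 18], [7, 3, 8, 3, 3]))
common = intersect(x, y)
print(common)  # prints [(-2, 2), (6, 8), (14, 17)]
```

Here `x` merges to the spans (-2, 11) and (14, 17), `y` merges to (-5, 2), (6, 8) and (14, 20), and the shared positions have widths 5, 3 and 4.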
IRanges uses nested containment lists under the hood to perform fast overlap and search-based operations. These methods typically return a hits-like BiocFrame.
subject = IRanges([2, 2, 10], [1, 2, 3])
query = IRanges([1, 4, 9], [5, 4, 2])
overlap = subject.find_overlaps(query)
print(overlap)
## output
BiocFrame with 3 rows and 2 columns
self_hits query_hits
<ndarray[int64]> <ndarray[int64]>
[0] 1 0
[1] 0 0
[2] 2 2
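For small inputs, the hits can be cross-checked with a brute-force comparison of every pair. This sketch deliberately swaps the nested containment list for a quadratic loop, so it only illustrates what counts as an overlap under inclusive ends; `find_overlaps_naive` is a hypothetical helper, and it assumes all widths are positive.

```python
def find_overlaps_naive(subject, query):
    """Return (subject_index, query_index) pairs whose ranges share a position.

    Ranges are (start, width) tuples with inclusive ends.
    Assumption: all widths are positive; empty ranges are not handled.
    """
    hits = []
    for qi, (q_start, q_width) in enumerate(query):
        q_end = q_start + q_width - 1
        for si, (s_start, s_width) in enumerate(subject):
            s_end = s_start + s_width - 1
            # Two inclusive ranges overlap when each starts before the other ends.
            if q_start <= s_end and s_start <= q_end:
                hits.append((si, qi))
    return hits


subject = [(2, 1), (2, 2), (10, 3)]
query = [(1, 5), (4, 4), (9, 2)]
hits = find_overlaps_naive(subject, query)
print(hits)  # prints [(0, 0), (1, 0), (2, 2)]
```

The pairs are the same as the (self_hits, query_hits) rows above, although the row order differs. A nested containment list avoids comparing every pair, which matters once the number of ranges grows.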
Similarly, one can perform search operations such as follow, precede, or nearest:
query = IRanges([1, 3, 9], [2, 5, 2])
subject = IRanges([3, 5, 12], [1, 2, 1])
nearest = subject.nearest(query, select="all")
print(nearest)
## output
BiocFrame with 4 rows and 2 columns
query_hits self_hits
<ndarray[int64]> <ndarray[int64]>
[0] 0 0
[1] 0 1
[2] 1 1
[3] 2 2
This project has been set up using PyScaffold 4.5. For details and usage information on PyScaffold see https://pyscaffold.org/.