MRG: misc updates #1

Merged 10 commits on Sep 1, 2024
20 changes: 20 additions & 0 deletions .github/dependabot.yml
@@ -0,0 +1,20 @@
version: 2
updates:
  - package-ecosystem: pip
    directory: "/"
    schedule:
      interval: weekly
    open-pull-requests-limit: 10
  - package-ecosystem: cargo
    directory: "/"
    schedule:
      interval: weekly
    allow:
      - dependency-type: "direct"
    open-pull-requests-limit: 10
    ignore:
      - dependency-name: "zip"
  - package-ecosystem: "github-actions"
    directory: "/"
    schedule:
      interval: weekly
61 changes: 61 additions & 0 deletions .github/workflows/lint.yml
@@ -0,0 +1,61 @@
# This file is autogenerated by maturin v1.4.0
# To update, run
#
#    maturin generate-ci github
#

name: "lint"
on:
  pull_request:
  push:
    branches: [latest]
jobs:
  tests_on_linux:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
        with:
          fetch-depth: 0

      - name: cache conda
        uses: actions/cache@v4
        env:
          CACHE_NUMBER: 1
        with:
          path: ~/conda_pkgs_dir
          key:
            ${{ runner.os }}-conda-${{ env.CACHE_NUMBER }}-${{ hashFiles('environment.yml') }}

      - name: cache rust
        uses: Swatinem/rust-cache@v2

      - name: setup conda
        uses: conda-incubator/setup-miniconda@v3
        with:
          auto-update-conda: true
          python-version: 3.12
          channels: conda-forge,bioconda
          miniforge-variant: Mambaforge
          miniforge-version: latest
          use-mamba: true
          mamba-version: "*"
          activate-environment: sourmash_dev
          auto-activate-base: false
          # use-only-tar-bz2: true

      - run: conda info
      - run: conda list
      - run: conda config --show

      - run: mamba search rust

      - name: install dependencies
        shell: bash -l {0}
        run: mamba install rust==1.75.0

      - name: install dependencies 2
        shell: bash -l {0}
        run: mamba install compilers maturin pytest pandas

      - name: Run cargo fmt
        run: cargo fmt --all -- --check --verbose
59 changes: 58 additions & 1 deletion README.md
@@ -1,2 +1,59 @@
# oxli
k-mers and the like

Author: C. Titus Brown (@ctb), [email protected]

oxli is a simple Rust library + Python interface for counting k-mers
in genomic sequencing data.

## Installation

You can try building it yourself:
```
mamba env create -f environment.yml -n oxli
make wheel
```
and then install the resulting wheel.

We are working on packaging via conda-forge.

## Documentation

Please see [the API documentation](doc/api.md).

## Is there anything I should know about oxli?

Two things -

First, oxli is channeling
[khmer](https://khmer.readthedocs.io/en/latest/), a package written by
@ctb and many others. You shouldn't be too surprised to see useful
functionality from khmer making an appearance in oxli.

Second, it's written on top of the
[sourmash](https://sourmash.readthedocs.io/)
[Rust library](https://sourmash.readthedocs.io/), and the underlying
code for dealing with sequence data is pretty well tested.

## What's the history here?

The history is a bit convoluted:

* the khmer package was useful for inspecting large collections of
k-mers, but was hard to maintain and evolve.

* in ~2016 @ctb's lab more or less switched over to developing
sourmash, which was initially built on a similar tech stack to khmer
(Python & C++).

* at some point, @luizirber rewrote the sourmash C++ code in Rust.

* this forced @ctb to learn Rust to maintain sourmash.

* @ctb then decided he liked Rust an awful lot, and missed some of the
khmer functionality.

* voilà, oxli was born.

---

(Sep 2024)
33 changes: 33 additions & 0 deletions doc/api.md
@@ -0,0 +1,33 @@
# A simple example of the API

Import necessary modules:

```python
>>> import screed
>>> import oxli

```

Create a KmerCountTable with a k-mer size of 31:

```python
>>> counts = oxli.KmerCountTable(31)

```

Open a FASTA file and consume k-mers from all the sequences within:

```python
>>> for record in screed.open('example.fa'):
...     counts.consume(record.sequence)
349900

```
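
For orientation: a sequence of length n contains n - k + 1 overlapping k-mers. The example above does not say what the printed number means; assuming it is the number of k-mers consumed, the arithmetic can be sketched in plain Python. The helper name `num_kmers` is hypothetical and is not part of oxli.

```python
def num_kmers(seq_len: int, k: int) -> int:
    """Number of overlapping k-mers in a sequence of length seq_len.

    Hypothetical helper for illustration only; not an oxli function.
    """
    # Sequences shorter than k contain no k-mers at all.
    return max(seq_len - k + 1, 0)


# A 40-base sequence yields 40 - 31 + 1 = 10 k-mers at k=31.
print(num_kmers(40, 31))
```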

Get the count of `CGGAGGAAGCAAGAACAAAATATTTTTTCAT` in the data:

```python
>>> counts.get('CGGAGGAAGCAAGAACAAAATATTTTTTCAT')
1

```
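
To make the consume/get pattern above concrete, here is a minimal pure-Python sketch of a k-mer count table. It is an illustration only, not oxli's implementation: the class name `ToyKmerCountTable` is hypothetical, it stores k-mers as plain strings, and it ignores hashing and reverse complements, which a real library may handle differently.

```python
from collections import Counter


class ToyKmerCountTable:
    """Illustrative stand-in for a k-mer count table; not oxli's code."""

    def __init__(self, ksize):
        self.ksize = ksize
        self._counts = Counter()

    def consume(self, sequence):
        """Count every overlapping k-mer; return how many were counted."""
        n = max(len(sequence) - self.ksize + 1, 0)
        for i in range(n):
            self._counts[sequence[i:i + self.ksize]] += 1
        return n

    def get(self, kmer):
        """Return the stored count for kmer (0 if it was never seen)."""
        return self._counts[kmer]


table = ToyKmerCountTable(4)
# The 4-mers of "ACGTACGT" are ACGT, CGTA, GTAC, TACG, ACGT.
table.consume("ACGTACGT")
print(table.get("ACGT"))  # → 2
```

A dictionary of strings like this is easy to follow but memory-hungry on real sequencing data, which is one reason libraries in this space tend to store hashed k-mers instead.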