Commit

more docs and prep for v0.3.1
alisandra committed Feb 13, 2023
1 parent d6a283c commit e69efe3
Showing 2 changed files with 5 additions and 12 deletions.
15 changes: 4 additions & 11 deletions README.md
@@ -5,7 +5,6 @@ Gene calling with Deep Neural Networks.

## Disclaimer
This software is undergoing active testing and development.
Build on it at your own risk.

## Goal
Setup and train models for _de novo_ prediction of gene structure.
@@ -46,16 +45,9 @@ This example focuses only on applying trained models for gene calling, only.
Information on training and evaluating the models can be found in `docs`.

### Using trained models
> NOTE: the extensively evaluated models from the paper are available by
> running `git checkout v0.2.0` and following the instructions
> there in. But they were not yet _applicable_ for generating gff3 files.
We are working towards training another round of models w/ the current
architecture. For now a preliminary land plant model is available and
will be used for the rest of the example.

#### Acquire models
The best models for each or all lineages can automatically
The best models for each or all lineages can automatically be
downloaded with the `fetch_helixer_models.py` script.

The available lineages are `land_plant`, `vertebrate`, `invertebrate`,
@@ -111,8 +103,9 @@ that generalize well to your target species. When in doubt selection via `--line
this will use the best available model for that lineage.

##### `--subsequence-length` and overlapping parameters
> From v0.3.1 onwards these paramters are set to reasonable defaults when `--lineage`
> is used, but `--subsequence-length` will still need to be specified when using `--model-filepath`.
> From v0.3.1 onwards these parameters are set to reasonable defaults when `--lineage`
> is used, but `--subsequence-length` will still need to be specified when using `--model-filepath`,
> while the overlapping parameters can be derived automatically.
Subsequence length controls how much of the genome the Neural Network can see at once, and should
ideally be comfortably longer than the typical gene.
2 changes: 1 addition & 1 deletion setup.py
@@ -2,7 +2,7 @@

setup(
name='helixer',
version='0.3.0',
version='0.3.1',
description='Deep Learning fun on gene structure data',
packages=['helixer', 'helixer.core', 'helixer.prediction', 'helixer.evaluation', 'helixer.tests', 'helixer.export'],
package_data={'helixer': ['testdata/*.fa', 'testdata/*.gff']},
