Releases: zellerlab/GECCO
v0.9.10
Fixed
- Progress reading display when reading from compressed files.
- Change labeling routine to use broad overlaps when annotating genes with cluster tables (#15).
Changed
- Bump supported `polars` dependency to `v0.20`.
- Bump supported `statsmodels` dependency to `v0.14`.
- Report identifier of sequences with uni-valued labels when training.
v0.9.9
Added
- Support for `gzip`, `bzip2`, `lz4` and `xz`-compressed input files.
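A minimal sketch of how compressed inputs can be opened transparently, by inspecting the first bytes of the file. This is not GECCO's actual implementation: the `open_maybe_compressed` helper is hypothetical, and `lz4` is left out because it requires a third-party package rather than the standard library.

```python
import bz2
import gzip
import lzma

# Magic-byte prefixes for the formats the standard library can decode.
# lz4 is intentionally omitted: decoding it needs a third-party package.
_MAGIC = {
    b"\x1f\x8b": gzip.open,        # gzip
    b"BZh": bz2.open,              # bzip2
    b"\xfd7zXZ\x00": lzma.open,    # xz
}


def open_maybe_compressed(path):
    """Open `path` for binary reading, decompressing if a known magic is found.

    Hypothetical helper for illustration only.
    """
    with open(path, "rb") as handle:
        head = handle.read(6)
    for magic, opener in _MAGIC.items():
        if head.startswith(magic):
            return opener(path, "rb")
    # No known magic bytes: treat the file as uncompressed.
    return open(path, "rb")
```

Detecting the format from the file content rather than from the extension means that a misnamed file is still read correctly.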
Fixed
- Outdated use of `pandas` API in `gecco cv` command.
Changed
- Bump `pyhmmer` dependency to `v0.10.0`.
- Bump `pyrodigal` dependency to `v3.0.0`.
- Make `gecco cv` output a gene table with a ground truth column.
v0.9.8
Fixed
- `ClusterTable.from_clusters` extracting cluster IDs in the wrong column.
- Deprecation warnings in `polars.read_csv` and `polars.write_csv` with recent `polars` versions.
- Deprecation warnings in `importlib_resources` with recent Python versions.
v0.9.7
Added
- Command line option to annotate proteins using bitscore cutoffs from HMMs.
- Command line option to disentangle overlapping domains after HMM annotation.
Changed
- Bump `pyhmmer` dependency to `v0.8.0`.
- Bump `pyrodigal` dependency to `v2.1.0`.
- Rewrite `gecco.model` to use `polars` for managing tabular data.
- Replace `pandas` dependencies with `polars`.
- Update `gecco run` to skip type classification for tasks without an assigned cluster type.
Fixed
- `Cluster.to_seq_record` crashing when called on a cluster with `types` attribute unset.
- Progress bar resetting when performing domain annotation with multiple HMMs.
Removed
- Support for Python 3.7.
v0.9.6
Added
- Gene Ontology annotations to `gecco.interpro` local metadata.
- Reference to Gene Ontology terms and derived functions to `gecco.model.Domain` objects.
- Gene color based on predicted function in `gecco.model.Gene.to_seq_feature`.
Fixed
- Missing `gzip` import in the CLI preventing usage of gzip-compressed inputs.
- Invalid coordinates of domains found in reverse-strand genes.
- Detection of entry points with `importlib.metadata` on older Python versions.
Changed
- `bgc_id` columns of cluster tables are renamed `cluster_id`.
- `gecco.model.ProductType` is renamed to `gecco.model.ClusterType`.
- Bumped `pyrodigal` dependency to `v2.0`.
- Bumped `pyhmmer` dependency to `v0.7`.
v0.9.5
v0.9.4
Added
- `classes_` property to `TypeClassifier` to access the `classes_` attribute of the `TypeBinarizer`.
- Alternative ORF finder `CDSFinder` which simply extracts CDS features from input sequences (#8).
- Support for annotating domains with "exclusive" HMMs to annotate genes with at most one HMM from the library.
Changed
- `ProductType` is not restricted to MIBiG types anymore and can support any string as a base type identifier.
- `PyrodigalFinder` now uses `multiprocessing.pool.ThreadPool` instead of custom thread code thanks to `OrfFinder.find_genes` reentrancy introduced in Pyrodigal `v1.0`.
- `PyrodigalFinder` can now be used in single / non-meta mode from the API.
- Bumped minimum `rich` version to `12.3` to use `None` total in progress bars when the size of an HMM library is unknown.
Fixed
- Broken MyPy type annotations in the `gecco.model` and `gecco.cli` modules.
v0.9.3
Changed
- `--format` flag of `gecco annotate` and `gecco run` CLI commands is now lowercased before its value is passed to `Bio.SeqIO`.
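A short sketch of this kind of normalization, written with `argparse` purely for illustration. It is not GECCO's CLI code, and the `build_parser` function and `example` program name are hypothetical. The idea is to lowercase the value once at parse time, so that `GenBank` and `genbank` select the same format downstream.

```python
import argparse


def build_parser():
    """Build a toy parser with a case-insensitive --format option."""
    parser = argparse.ArgumentParser(prog="example")
    # `type=str.lower` lowercases the raw string during parsing, so every
    # consumer of `args.format` sees a normalized value.
    parser.add_argument("--format", type=str.lower, default=None)
    return parser


args = build_parser().parse_args(["--format", "GenBank"])
print(args.format)  # prints "genbank"
```

Normalizing at the parsing boundary keeps the rest of the code free of repeated `.lower()` calls.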
Fixed
- Genes with duplicate IDs being silently ignored in `HMMER.run`.
v0.9.2
Added
- Padding of short sequences with empty genes when predicting probabilities in `ClusterCRF`.
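The general idea of padding can be sketched as follows. This is an illustration only, not the `ClusterCRF` code: the `pad_to_window` and `unpad` helpers, the right-side padding, and the use of `None` as an "empty gene" placeholder are all assumptions.

```python
def pad_to_window(genes, window_size, filler=None):
    """Return a copy of `genes` padded on the right up to `window_size` items.

    Hypothetical helper: sequences already long enough are returned unchanged.
    """
    missing = max(0, window_size - len(genes))
    return list(genes) + [filler] * missing


def unpad(values, original_length):
    """Drop the values that correspond to padding items."""
    return values[:original_length]


padded = pad_to_window(["gene_1", "gene_2"], window_size=5)
print(len(padded))  # prints 5
```

After prediction, the values computed for the placeholder items are discarded with `unpad`, so only the real genes keep a probability.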
v0.9.1
Changed
- Make the `genes.tsv` and `features.tsv` tables contain all genes even when they come from a contig too short to be processed by the CRF sliding window.
- Replaced the `--force-clusters-tsv` flag with a `--force-tsv` flag to force writing TSV tables even when no genes or clusters were found in `gecco run` or `gecco annotate`.