Releases: zellerlab/GECCO
0.6.0
Changed
- Updated internal model with a cleaned-up version of the MIBiG-2.0 Pfam-33.1/Tigrfam-15.0 embedding.
- Updated internal InterPro catalog.
Fixed
- Features not being grouped together in `gecco cv` and `gecco train` when provided with a feature table where rows were not sorted by protein IDs.
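
To illustrate why row order matters for this kind of grouping, here is a minimal sketch. It is not GECCO's actual code: the row layout and the `group_by_protein` helper are hypothetical. It assumes grouping is done on adjacent rows, as `itertools.groupby` does, in which case an unsorted table splits one protein into several groups unless the rows are sorted first.

```python
from itertools import groupby
from operator import itemgetter

# Hypothetical feature rows: (protein_id, domain) pairs, deliberately unsorted.
rows = [
    ("prot_2", "PF00001"),
    ("prot_1", "PF00002"),
    ("prot_2", "PF00003"),
]

def group_by_protein(rows, presort):
    """Group domains by protein ID, optionally sorting the rows first."""
    if presort:
        # sorted() is stable, so domains keep their relative order per protein.
        rows = sorted(rows, key=itemgetter(0))
    return [
        (protein_id, [domain for _, domain in group])
        for protein_id, group in groupby(rows, key=itemgetter(0))
    ]

# Without sorting, "prot_2" ends up in two separate groups.
unsorted_groups = group_by_protein(rows, presort=False)
# With sorting, each protein appears exactly once.
sorted_groups = group_by_protein(rows, presort=True)
```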
0.5.5
0.5.4
0.5.3
0.5.2
Added
- Support for downloading HMM files directly from GitHub releases assets.
- Validation of filtered HMMs with MD5 checksum.
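
As a rough sketch of what checksum validation of a downloaded file can look like, the snippet below uses only the standard `hashlib` module. The `md5_matches` helper and its parameters are hypothetical, not part of GECCO's API.

```python
import hashlib

def md5_matches(path, expected_hexdigest, chunk_size=1 << 20):
    """Return True if the file at `path` has the expected MD5 hex digest."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read in chunks so large HMM files are never fully loaded in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_hexdigest.lower()
```

A caller would typically delete the file and retry or abort when the function returns `False`.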
Fixed
- Invalid coordinates of protein domains in GenBank output files.
- `gecco.interpro` module not being added to wheel distribution.
Changed
- Bump required `pyhmmer` version to `v0.2.1`.
0.5.1
0.5.0
Added
- Explicit support for Python 3.9.
Changed
- `pyhmmer` is used to annotate protein sequences instead of HMMER3 binary `hmmsearch`.
- HMM files are stored in binary format to speed up parsing and reduce storage size.
- `tqdm` is now a training-only dependency.
- `gecco cv` now requires training dependencies.
0.4.5
Added
- Additional `fold` column to cross-validation table output.
Changed
- Use sequence ID instead of protein ID to extract type from cluster in `gecco cv`.
- Install HMM data in pre-pressed format to make `hmmsearch` runs faster on short sequences.
- `gecco.orf` was rewritten to extract genes from input sequences in parallel.
0.4.4
Added
- `gecco cv loto` command to run LOTO cross-validation using BGC types for stratification.
- `header` keyword argument to `FeatureTable.dump` and `ClusterTable.dump` to write the table without the column header, so that it can be appended to an existing table.
- `__getitem__` implementation for `FeatureTable` and `ClusterTable` that returns a single row or a sub-table from a table.
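
The row-or-sub-table indexing behaviour can be sketched as follows. `MiniTable` is a hypothetical stand-in, not GECCO's `FeatureTable`, and the mapping of integer indices to rows and slices to sub-tables is an assumption about how such an interface is commonly designed.

```python
from dataclasses import dataclass
from typing import List

@dataclass
class MiniTable:
    """Hypothetical column-oriented table used only for illustration."""
    protein_id: List[str]
    domain: List[str]

    def __len__(self):
        return len(self.protein_id)

    def __getitem__(self, item):
        if isinstance(item, slice):
            # A slice yields a sub-table with the same columns.
            return MiniTable(self.protein_id[item], self.domain[item])
        # An integer index yields a single row as a tuple.
        return (self.protein_id[item], self.domain[item])

table = MiniTable(["p1", "p2", "p3"], ["PF1", "PF2", "PF3"])
row = table[0]    # single row
sub = table[1:]   # sub-table holding the last two rows
```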
Fixed
- `gecco cv` command now writes results iteratively instead of holding the tables for every fold in memory.
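
A minimal sketch of the iterative-writing pattern, assuming tab-separated output with the column header written only for the first fold. The `write_folds` and `fake_folds` names and the column layout are hypothetical and do not reflect GECCO's internals.

```python
import csv
import io

def write_folds(folds, handle):
    """Write each fold's rows as soon as they are produced."""
    writer = csv.writer(handle, dialect="excel-tab", lineterminator="\n")
    for index, sequence_ids in enumerate(folds):
        if index == 0:
            # The header is written once; later folds are appended without it.
            writer.writerow(["sequence_id", "fold"])
        for sequence_id in sequence_ids:
            writer.writerow([sequence_id, index])

def fake_folds():
    # Hypothetical generator standing in for per-fold predictions.
    yield ["seq1", "seq2"]
    yield ["seq3"]

out = io.StringIO()
write_folds(fake_folds(), out)
result = out.getvalue()
```

Because `folds` is consumed lazily, only one fold's rows need to be held in memory at a time.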
Changed
- Bumped `pandas` training dependency to `v1.0`.