Releases: ksahlin/isONcorrect
v0.1.3.5
This release restructures the folders to work well with Python's new recommended way of building packages for PyPI (.toml files).
In essence:
- A build script, pyproject.toml, was added to the repo.
- A src/isoncorrect folder was created to replace the previous `modules` folder.
- The scripts `run_isoncorrect` and `isONcorrect` were placed in the src/isoncorrect folder and given `.py` file endings so that they behave as modules included in the `isoncorrect` library.
- The build instructions now produce the binaries `run_isoncorrect` and `isONcorrect` automatically from the `run_isoncorrect.py` and `isONcorrect.py` modules by giving the entry point function `main()` in each file.
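As a rough illustration of the entry-point mechanism described above, the sketch below shows the shape a module needs so that a build backend can generate a command from it. It is a stand-in, not the contents of `run_isoncorrect.py`: the parser description and the help strings are assumptions, and only the `--k` and `--w` option names are taken from these release notes. The `[project.scripts]` line quoted in the comment is the standard pyproject.toml form and is likewise an assumption about how this repo wires it up.

```python
import argparse


def main(argv=None):
    """Entry point that an installed console script would call.

    Assumption: pyproject.toml maps a command name to this function with a
    line such as
        run_isoncorrect = "isoncorrect.run_isoncorrect:main"
    under [project.scripts]. The real repo may configure this differently.
    """
    parser = argparse.ArgumentParser(
        description="Illustrative stand-in, not the real isONcorrect CLI"
    )
    # Option names come from the release notes; help texts are assumptions.
    parser.add_argument("--k", type=int, default=9, help="k-mer size")
    parser.add_argument("--w", type=int, default=20, help="window size")
    args = parser.parse_args(argv)
    print(f"k={args.k} w={args.w}")
    return 0
```

Because `main()` accepts an optional argument list, it can be called directly, for example `main(["--k", "9", "--w", "10"])`, without going through the installed command.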
The new structure requires `isONcorrect` to be installed with a package manager (conda or pip).
For development (downloading the GitHub source), one needs to temporarily modify line 21 in isONcorrect.py from `from isoncorrect import create_augmented_reference, help_functions, correct_seqs` to `import create_augmented_reference, help_functions, correct_seqs`.
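One possible way to avoid editing the import line by hand is to try the packaged import first and fall back to a top-level import. The helper below is only a sketch of that idea and is not part of isONcorrect; the function name `import_helpers` is hypothetical, and it assumes the helper modules sit next to the script on the import path when running from a source checkout.

```python
import importlib


def import_helpers(names, package="isoncorrect"):
    """Import each module from `package`, falling back to a top-level import.

    Hypothetical helper: returns a dict mapping module name to module object.
    """
    modules = {}
    for name in names:
        try:
            # Installed layout: isoncorrect.<name>
            modules[name] = importlib.import_module(f"{package}.{name}")
        except ImportError:
            # Source checkout: <name> is importable directly
            modules[name] = importlib.import_module(name)
    return modules
```

A call such as `import_helpers(["create_augmented_reference", "help_functions", "correct_seqs"])` would then cover both layouts. Note that the fallback also hides an `ImportError` raised inside an installed module, so this is best kept to development use.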
(The version number had to be increased by several increments after several unsuccessful attempts to get the new build to install properly.)
v0.1.0
This version adds the following over previous versions:
- An over-correction checker: The original read and the corrected read are aligned, and any structural over-corrections are removed. Such events should be rare. We never observed any such event with the previous defaults `--k 9 --w 10`, but rare occurrences happened with the new defaults `--k 9 --w 20` introduced in v0.0.8. This should be fixed now. This check adds negligible time (~1-2%) to the overall runtime.
- Better (sparser) minimizer sampling in poly-A/C/G/T regions with two new rules: (1) sample the last minimizer if there are ties, and (2) do not resample a minimizer if the last minimizer is still in the window. This greatly reduces repetitive anchors in poly-regions and improves runtime for instances where long poly-regions are frequent.
- Related to point 2: An upper limit on how repetitive a paired-minimizer anchor can be in the data (at most 10x the number of reads). I have not observed such cases yet in ONT data, but I am setting this just in case, as it happened for some degenerate PacBio reads (for which isONcorrect does not typically need to be run anyway).
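The two tie-handling rules for minimizer sampling can be sketched as follows. This is a minimal illustration and not the isONcorrect implementation: it assumes that k-mers are compared lexicographically and that the window `w` is counted in consecutive k-mers, and the function name `sample_minimizers` is hypothetical.

```python
def sample_minimizers(seq, k, w):
    """Return (kmer, position) pairs sampled with the two tie rules.

    Assumptions: lexicographic k-mer order; w = number of k-mers per window.
    """
    kmers = [seq[i:i + k] for i in range(len(seq) - k + 1)]
    sampled = []
    cur_kmer, cur_pos = None, -1
    for start in range(len(kmers) - w + 1):
        end = start + w  # window covers k-mer positions start .. end - 1
        newest = kmers[end - 1]
        if cur_pos < start:
            # First window, or the previous minimizer left the window: rescan.
            window = kmers[start:end]
            cur_kmer = min(window)
            # Rule 1: on ties, take the last (rightmost) occurrence.
            cur_pos = end - 1 - window[::-1].index(cur_kmer)
            sampled.append((cur_kmer, cur_pos))
        elif newest < cur_kmer:
            # A strictly smaller k-mer entered the window and takes over.
            cur_kmer, cur_pos = newest, end - 1
            sampled.append((cur_kmer, cur_pos))
        # Rule 2: if newest only ties with the current minimizer, which is
        # still in the window, nothing new is sampled.
    return sampled
```

On a homopolymer such as `"A" * 12` with `k=3` and `w=4`, this samples only positions 3 and 7, which is the kind of sparser sampling in poly-regions that the release note describes.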