Commit
Make skipmers robust, but keep #3395 functional in the meantime.

This PR:
- enables a second skipmer type, so we have m1n3 in addition to m2n3
- switches to a reading frame approach for both translation and skipmers, which means we first build the reading frame, then kmerize, rather than building k-mers and translating/skipping on the fly
- avoids the "extended length" needed for skipping on the fly

Since this changes the `SeqToHashes` strategy a bit, there's one Python test where we now see a different error.

Future thoughts:
- with the new structure, it would be straightforward to add validation to exclude protein k-mers with invalid amino acids (`X`). I'm not entirely sure what happens to those at the moment...
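
The "build the reading frame first, then kmerize" idea from the message above can be illustrated with a minimal Python sketch. This is not the code from this commit: the function names `skipmer_frame`, `kmerize`, and `skipmers` are hypothetical, and the sketch assumes the common skipmer convention in which, for a given offset, the first `m` bases of every cycle of `n` bases are kept.

```python
def skipmer_frame(seq: str, m: int, n: int, offset: int) -> str:
    """Build one skipmer reading frame (hypothetical helper).

    Assumed convention: starting at `offset`, keep the first `m` bases
    of every cycle of `n` bases and drop the rest.
    """
    if not 0 < m <= n:
        raise ValueError("require 0 < m <= n")
    if not 0 <= offset < n:
        raise ValueError("require 0 <= offset < n")
    return "".join(base for i, base in enumerate(seq[offset:]) if i % n < m)


def kmerize(frame: str, k: int) -> list[str]:
    """Slide a window of length k over an already-built frame."""
    # range() is empty when the frame is shorter than k, so no k-mers result.
    return [frame[i:i + k] for i in range(len(frame) - k + 1)]


def skipmers(seq: str, k: int, m: int, n: int) -> list[str]:
    """Collect k-mers from all n frames: build each frame, then kmerize it."""
    out = []
    for offset in range(n):
        out.extend(kmerize(skipmer_frame(seq, m, n, offset), k))
    return out


if __name__ == "__main__":
    seq = "ACGTACGTAC"
    print(skipmer_frame(seq, 2, 3, 0))  # m2n3, frame 0
    print(skipmer_frame(seq, 1, 3, 0))  # m1n3, frame 0
    print(skipmers(seq, 3, 2, 3))
```

Because the frame is materialized before windowing, the window itself always has length `k`; the longer span of input that each k-mer covers never has to be computed, which is one plausible reading of how the "extended length" bookkeeping goes away. A filter that drops k-mers containing `X` could be applied to the output of `kmerize` in the same style.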
Showing 5 changed files with 707 additions and 272 deletions.