add documentation for skipmers #549

Open
bluegenes opened this issue Dec 20, 2024 · 1 comment

Comments

@bluegenes
Contributor

ref sourmash-bio/sourmash#3449

Skipmers are being added as an experimental feature for testing (#531).
Add documentation for use.

Also, with sourmash-bio/sourmash#3446, we should be able to use the sourmash Python layer to evaluate skipmer signatures instead of looking in the manifest and sig files directly. Update the tests accordingly.

@bluegenes
Contributor Author

from #531 description:

Skipmers are something we've considered adding for quite some time: a DNA k-mer variant that ~allows "mismatches", which increases entropy and sensitivity.

Over in sourmash-bio/sourmash#3395, I added the skipm1n3 and skipm2n3 moltypes, as well as code in SeqToHashes to build them. In sourmash-bio/sourmash#3446 I also added the ability for the sourmash Python functions to read skipmer sigs, so sig cat, sig summarize, etc. should now work.

There are two types of skipmers available: keep-2, skip-1 ("skipm2n3") and keep-1, skip-2 ("skipm1n3"). To sketch with skipmers, specify skipm2n3 or skipm1n3 in the parameter string. The skipmer ksize is the "final" size of the k-mer, i.e. the number of bases kept after skipping. For example, with ksize 3, the sequence ACTAG produces two skipmers for m2n3: ACA and CTG.
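
To make the keep/skip pattern concrete, here is a rough Python sketch that reproduces the ACTAG example above. It is only an illustration, not the SeqToHashes code in sourmash: the function name `skipmers` and its arguments are made up for this example, it assumes a skipmer is started at every position of the sequence (as the example implies), and it ignores reverse complements and hashing entirely.

```python
def skipmers(seq, ksize, m=2, n=3):
    """Yield skipmers from seq (hypothetical helper, not part of sourmash).

    In every cycle of n bases, the first m are kept and the rest are
    skipped, until ksize bases have been collected.
    """
    if not 0 < m <= n:
        raise ValueError("need 0 < m <= n")

    # Offsets of the kept bases, relative to the start of the window.
    offsets = []
    pos = 0
    while len(offsets) < ksize:
        if pos % n < m:
            offsets.append(pos)
        pos += 1

    # Number of sequence bases one skipmer spans, including skipped ones.
    span = offsets[-1] + 1

    for start in range(len(seq) - span + 1):
        yield "".join(seq[start + o] for o in offsets)


# m2n3 (keep 2, skip 1), ksize 3, as in the example above
print(list(skipmers("ACTAG", 3, m=2, n=3)))  # ['ACA', 'CTG']
```

Note that a ksize-3 m2n3 skipmer spans 4 bases of the input, which is why a 5-base sequence gives only two skipmers.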

example sketching commands:

manysketch:

sourmash scripts manysketch -p skipm2n3,k=21,scaled=1000 ms.csv -o output.zip

singlesketch:

sourmash scripts singlesketch -p skipm2n3,k=21,scaled=1000 myfile.fasta -o myfile.zip

Skipmer References:
