Some users may wish to count kmers from the forward and reverse strands separately. Discussed in #74.
Propose adding an option (stranded/strand_aware?) to make a KmerCountTable store fwd/rev kmers separately.
Option 1: Would disable canonical kmer selection and instead store +1 for both the fwd and reverse strands.
Option 2: Count only from fwd strand. Do not canonicalise.
@dr-joe-wirth would Opt 1 suit your use case? Or do you need the counts in separate tables (Opt 2)? That is, count the fwd strand only, then revcomp the sequence and count again in another table.
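To make the difference between the two options concrete, here is a minimal pure-Python sketch using `collections.Counter` in place of a KmerCountTable. The function names are hypothetical and are not part of any existing API. It also assumes that "canonical" means the lexicographically smaller of a kmer and its reverse complement, which is a common convention but may not match the actual hashing scheme.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase ACGT sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def kmers(seq: str, k: int):
    """Yield every overlapping kmer on the forward strand."""
    return (seq[i:i + k] for i in range(len(seq) - k + 1))


def count_canonical(seq: str, k: int) -> Counter:
    # Assumed current behaviour: a kmer and its revcomp share one entry.
    return Counter(min(km, revcomp(km)) for km in kmers(seq, k))


def count_both_strands(seq: str, k: int) -> Counter:
    # Option 1 (hypothetical): +1 for the fwd kmer and +1 for its revcomp.
    counts = Counter()
    for km in kmers(seq, k):
        counts[km] += 1
        counts[revcomp(km)] += 1
    return counts


def count_forward(seq: str, k: int) -> Counter:
    # Option 2 (hypothetical): fwd strand only, no canonicalisation.
    return Counter(kmers(seq, k))


seq, k = "AAACGT", 3
canonical = count_canonical(seq, k)
both = count_both_strands(seq, k)
forward = count_forward(seq, k)
```

For this toy sequence, `ACG` and `CGT` are reverse complements of each other, so they collapse into one canonical entry, appear once each under Option 2, and twice each under Option 1.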
Option 1: store all the kmers for both the forward and reverse strands, instead of storing the canonical kmers for both strands.
Option 2: store all the kmers from the forward strand only.
For my purposes, I want to get the kmers that appear exactly once. Currently, I use khmer like this:
get kmers from the forward strand that appear once
flip the sequence, then get the kmers that appear once in the flipped sequence (reverse strand kmers)
keep only the kmers that are not shared on the two strands (symmetric difference of sets)
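The three steps above can be sketched in plain Python. This is not the khmer code referred to in the comment. It is an illustration that assumes "flip" means reverse complement, and it uses hypothetical helper names.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase ACGT sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def singleton_kmers(seq: str, k: int) -> set:
    """Kmers that occur exactly once on the given strand."""
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    return {km for km, n in counts.items() if n == 1}


def strand_specific_singletons(seq: str, k: int) -> set:
    fwd = singleton_kmers(seq, k)           # step 1: forward strand
    rev = singleton_kmers(revcomp(seq), k)  # step 2: reverse strand
    return fwd ^ rev                        # step 3: symmetric difference


unique = strand_specific_singletons("AAACGT", 3)
```

With `"AAACGT"` and `k = 3`, the kmers `ACG` and `CGT` occur once on each strand, so the symmetric difference drops them and keeps `AAA`, `AAC`, `GTT`, and `TTT`.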
So option 1 can work for my purposes if the count method reports the number of times a kmer appears in total, not just the number of times it appears on one strand. If that is not feasible, then I would prefer option 2.
Ok, I think option 2 is probably best here. Keeps things consistent with khmer and allows for cases where users just want the fwd strand kmers. If you want "all kmers", revcomp the sequence and consume that too.
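As a rough illustration of the "revcomp the sequence and consume that too" suggestion, the sketch below uses a `Counter` as a stand-in table and a hypothetical `consume_forward` helper (not an existing method). Consuming a sequence and then its reverse complement into the same table gives each kmer its total count across both strands.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of an uppercase ACGT sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def consume_forward(table: Counter, seq: str, k: int) -> None:
    """Hypothetical forward-only consume: no canonicalisation."""
    for i in range(len(seq) - k + 1):
        table[seq[i:i + k]] += 1


seq, k = "AAACGT", 3
table = Counter()
consume_forward(table, seq, k)           # forward strand kmers
consume_forward(table, revcomp(seq), k)  # reverse strand kmers
```

Here `table["ACG"]` ends up as 2 (once from each strand), while `table["TTT"]` is 1 because it only occurs on the reverse strand. Using two separate tables instead of one would keep the per-strand counts apart.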
@ctb I will add this once the PRs that change kmer counting/hashing behaviour are wrapped up: #10, #83, #87.