The existing `Sequence.gc` method purposefully ignores characters other than G/C and uses the sequence length as a denominator to produce "fraction g/c". This has a few benefits:

- `len(sequence)` is fast to compute vs. counting more occurrences of characters

The downside is that any non-GCAT characters may be included in the denominator:

pyfaidx/pyfaidx/__init__.py
Lines 254 to 266 in 7b4d8d7
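
To make the denominator issue concrete, here is a minimal standalone sketch of the behavior described above. It is not the code at the referenced lines; `gc_over_length` is a hypothetical helper name, and returning `0.0` for an empty sequence is an assumption made for the example.

```python
def gc_over_length(seq: str) -> float:
    """Count G/C (either case) and divide by the full sequence length.

    Hypothetical helper mirroring the behavior described in this issue;
    it is not copied from pyfaidx.
    """
    if not seq:
        return 0.0  # assumption: an empty sequence reports 0.0
    upper = seq.upper()
    return (upper.count("G") + upper.count("C")) / len(seq)


print(gc_over_length("GCGC"))      # 1.0
print(gc_over_length("GCGCNNNN"))  # 0.5, the four N characters inflate the denominator
```

The second call shows the downside: the called bases are all G/C, yet the reported fraction is halved because `N` still counts toward the length.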
I'd welcome any pull request to implement something like:

- `Sequence.gc_iupac` method that counts e.g. S=GC and W=AT, and also considers K=GT. This is considerably more difficult than the current method and requires some validation of the sequence to confirm that it only contains valid IUPAC letters.
- `Sequence.gc_strict` method that counts G/C and A/T, implicitly ignoring all other characters. This is probably closest to what people expect as GC content.
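
As a starting point for discussion, the two proposals could be sketched as plain functions like the ones below. Everything here is an assumption rather than an existing pyfaidx API: the function names follow the proposed method names, ambiguity codes are weighted by the share of their possible bases that are G or C (so K, which is G or T, counts as 0.5), `N` is kept in the denominator with weight 0.5, and an empty denominator returns `0.0`.

```python
from collections import Counter

# Assumed weighting: fraction of each IUPAC code's possible bases that are G or C.
IUPAC_GC_WEIGHT = {
    "A": 0.0, "T": 0.0, "U": 0.0, "W": 0.0,   # W = A or T
    "G": 1.0, "C": 1.0, "S": 1.0,             # S = G or C
    "R": 0.5, "Y": 0.5, "K": 0.5, "M": 0.5,   # two-base codes with one G/C option
    "N": 0.5,                                 # any base
    "B": 2 / 3, "V": 2 / 3,                   # B = C/G/T, V = A/C/G
    "D": 1 / 3, "H": 1 / 3,                   # D = A/G/T, H = A/C/T
}


def gc_strict(seq: str) -> float:
    """G+C over A+C+G+T, ignoring every other character."""
    counts = Counter(seq.upper())
    gc = counts["G"] + counts["C"]
    total = gc + counts["A"] + counts["T"]
    return gc / total if total else 0.0  # assumption: no ACGT bases reports 0.0


def gc_iupac(seq: str) -> float:
    """Expected G/C fraction, validating that only IUPAC letters are present."""
    counts = Counter(seq.upper())
    invalid = set(counts) - set(IUPAC_GC_WEIGHT)
    if invalid:
        raise ValueError(f"non-IUPAC characters: {sorted(invalid)}")
    total = sum(counts.values())
    if total == 0:
        return 0.0  # assumption: an empty sequence reports 0.0
    return sum(IUPAC_GC_WEIGHT[base] * n for base, n in counts.items()) / total


print(gc_strict("GCGCNNNN"))  # 1.0
print(gc_iupac("SWKK"))       # 0.5
```

Counting all characters once with `Counter` keeps validation and weighting in a single pass, though it is likely slower than the current `len(sequence)` denominator on long sequences.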