Add histo for frequency counts #29

Merged: 4 commits into main from dev_histo on Sep 18, 2024

Adamtaranto (Collaborator)

Closes #23

New functions:

  • histo(): Calculates frequency counts from a KmerCountTable, returned as (frequency, count) tuples. With zero=True, frequencies that no k-mer has are also included (with a count of 0), from 0 up to the maximum observed value.
  • min(): Returns the minimum observed k-mer count.
  • max(): Returns the maximum observed k-mer count.
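To make the intended semantics of histo() concrete, here is a minimal pure-Python sketch of the same calculation. It is not the Rust code in src/lib.rs: `histo_from_counts` is a hypothetical helper operating on a plain dict, and both the sorted output order and the empty-table behaviour are assumptions made for illustration.

```python
from collections import Counter


def histo_from_counts(kmer_counts, zero=False):
    """Return (frequency, n_kmers) tuples for a {kmer: count} mapping.

    Hypothetical stand-in for KmerCountTable.histo(); assumes sorted output.
    """
    # Tally how many distinct k-mers share each observed count.
    freq = Counter(kmer_counts.values())
    if not freq:
        # Assumption: an empty table gives an empty histogram.
        return []
    if zero:
        # Fill in every frequency from 0 up to the maximum observed count.
        return [(f, freq.get(f, 0)) for f in range(max(freq) + 1)]
    return sorted(freq.items())


counts = {"AAA": 4, "AAC": 1}
print(histo_from_counts(counts))             # [(1, 1), (4, 1)]
print(histo_from_counts(counts, zero=True))  # [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]
```

The zero=True result matches the output shown in the example for this PR; the zero=False result is what the sketch produces and has not been checked against the real method.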

Example use:

import oxli
import pandas as pd

# Create new table
kct = oxli.KmerCountTable(ksize=3)

# Count some k-mers
kct.consume('AAAAA') # count 'AAA' x 3
kct.count('TTT') # count as revcomp 'AAA' + 1
kct.count('AAC') # count 1

# histo() returns a list of (frequency, count) tuples: count = number of distinct k-mers seen that many times
histo_output = kct.histo(zero=True) # [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]

# Create a Pandas DataFrame from the list of tuples
df = pd.DataFrame(histo_output, columns=['Frequency', 'Count'])

print(df)
# Prints:
"""
   Frequency  Count
0          0      0
1          1      1
2          2      0
3          3      0
4          4      1
"""
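For readers wondering why the example ends up with one k-mer at frequency 4 and one at frequency 1, the counting can be reproduced without oxli. The sketch below assumes the canonical form of a k-mer is the lexicographically smaller of the k-mer and its reverse complement; that choice, and the helper names `canonical` and `consume`, are assumptions for illustration and not oxli's actual implementation.

```python
from collections import Counter

# Translation table for complementing DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def canonical(kmer):
    """Return the smaller of a k-mer and its reverse complement (assumed rule)."""
    revcomp = kmer.translate(COMPLEMENT)[::-1]
    return min(kmer, revcomp)


def consume(table, seq, k):
    """Count every k-length window of seq under its canonical form."""
    for i in range(len(seq) - k + 1):
        table[canonical(seq[i:i + k])] += 1


table = Counter()
consume(table, "AAAAA", 3)    # three 'AAA' windows
table[canonical("TTT")] += 1  # 'TTT' is the reverse complement of 'AAA'
table[canonical("AAC")] += 1  # 'AAC' sorts before its reverse complement 'GTT'

print(dict(table))  # {'AAA': 4, 'AAC': 1}
```

One k-mer with count 1 and one with count 4 gives the (1, 1) and (4, 1) rows of the histogram, with zeros everywhere else.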

Adamtaranto requested a review from ctb September 15, 2024 02:23
Adamtaranto added the enhancement (New feature or request) label Sep 15, 2024
Adamtaranto (Collaborator, Author)

min() and max() don't take any arguments and should probably be attributes, i.e. accessed as .min instead of called as .min().
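As a rough illustration of the attribute-style access suggested here, this is how it would look in plain Python using properties. `CountTableSketch` is a hypothetical class, not the KmerCountTable binding, and returning 0 for an empty table is an assumption.

```python
class CountTableSketch:
    """Hypothetical stand-in for a count table exposing min/max as attributes."""

    def __init__(self, counts):
        self._counts = dict(counts)

    @property
    def min(self):
        # Inside a method, min() still resolves to the builtin.
        # Assumption: report 0 when nothing has been counted.
        return min(self._counts.values(), default=0)

    @property
    def max(self):
        # Assumption: report 0 when nothing has been counted.
        return max(self._counts.values(), default=0)


table = CountTableSketch({"AAA": 4, "AAC": 1})
print(table.min, table.max)  # 1 4
```

With properties, the values are read as table.min and table.max, with no parentheses, while still being computed on access.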

src/lib.rs (review thread resolved)
Adamtaranto merged commit 116d1e5 into main Sep 18, 2024
13 checks passed
Adamtaranto deleted the dev_histo branch September 18, 2024 02:15
ctb mentioned this pull request Sep 23, 2024
Linked issue: Add histo function to get frequency counts