Dump Hash Count Pairs

Use the .dump() method to return hash:count pairs from a KmerCountTable as a list, or to write them to a tab-delimited output file.

Example data:

import oxli

# Demo table
kct = oxli.KmerCountTable(ksize=4)
kct.count("AAAA")  # Count 'AAAA'
kct.count("TTTT")  # Count revcomp of 'AAAA'
kct.count("AATT")  # Count 'AATT'
kct.count("GGGG")  # Count 'GGGG'
kct.count("GGGG")  # Count again.

# Hashes
# 17832910516274425539 = AAAA/TTTT
# 382727017318141683 = AATT
# 73459868045630124 = GGGG

By default, dump() returns records unsorted, so their order can vary between runs.

kct.dump()
>>> [(17832910516274425539, 2), (382727017318141683, 1), (73459868045630124, 2)]

Use the sortcounts option to sort records by count, then by hash key (both ascending):

kct.dump(sortcounts=True)
>>> [(382727017318141683, 1), (73459868045630124, 2), (17832910516274425539, 2)]

Use the sortkeys option to sort records by hash key (ascending):

kct.dump(sortkeys=True)
>>> [(73459868045630124, 2), (382727017318141683, 1), (17832910516274425539, 2)]
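
Both orderings can be reproduced in plain Python, which may help when a different ordering is needed. The sketch below uses the example records as a literal list so that it runs without oxli. It mirrors the ordering seen in the outputs above and makes no claim about how oxli sorts internally.

```python
# Records as returned by kct.dump() in the example above (order not guaranteed).
records = [
    (17832910516274425539, 2),
    (382727017318141683, 1),
    (73459868045630124, 2),
]

# Same order as sortcounts=True: ascending by count, ties broken by hash key.
by_count = sorted(records, key=lambda rec: (rec[1], rec[0]))

# Same order as sortkeys=True: ascending by hash key.
by_key = sorted(records, key=lambda rec: rec[0])

print(by_count)
print(by_key)
```

The same pattern covers orderings that dump() does not offer here, for example `sorted(records, key=lambda rec: rec[1], reverse=True)` for highest counts first.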

Sorted hash:count pairs can be written to a tab-delimited text file by specifying an output target:

# Write tab-delimited records to kct.dump
kct.dump(sortcounts=True, file="kct.dump")
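
To read a dump file back into Python, the standard csv module should be enough. The sketch below assumes each line holds a hash and a count separated by a single tab, with no header row; check a real dump file before relying on that layout. So that it runs without oxli, it first writes a stand-in file containing the example records. `read_dump` is a hypothetical helper, not part of oxli.

```python
import csv
import os
import tempfile


def read_dump(path):
    """Parse a tab-delimited hash/count file into a list of (int, int) tuples."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        # Skip empty rows; convert both fields to Python ints.
        return [(int(row[0]), int(row[1])) for row in reader if row]


# Stand-in for a file written by kct.dump(sortcounts=True, file="kct.dump").
# Assumed layout: hash<TAB>count, one record per line, no header.
example = (
    "382727017318141683\t1\n"
    "73459868045630124\t2\n"
    "17832910516274425539\t2\n"
)

with tempfile.TemporaryDirectory() as tmpdir:
    path = os.path.join(tmpdir, "kct.dump")
    with open(path, "w") as handle:
        handle.write(example)
    records = read_dump(path)

print(records)
```

Python integers have arbitrary precision, so hashes above the signed 64-bit range are parsed without loss.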

If no output file is specified, records are returned as a list of (hash, count) tuples (as above).

This list can be converted to a pandas dataframe:

import pandas as pd
table_dump = kct.dump(sortcounts=True)
df = pd.DataFrame(table_dump, columns=['Hash', 'Count'])
print(df)
>>>
  '''
                     Hash  Count
  0    382727017318141683      1
  1     73459868045630124      2
  2  17832910516274425539      2
  '''
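
Where pandas is not available, the same list of tuples can be turned into a dictionary for lookups by hash. A minimal sketch, using the example records as a literal list in place of a real kct.dump() call:

```python
# Stand-in for table_dump = kct.dump(sortcounts=True)
table_dump = [
    (382727017318141683, 1),
    (73459868045630124, 2),
    (17832910516274425539, 2),
]

# Each (hash, count) tuple becomes one key/value pair.
counts_by_hash = dict(table_dump)

# Count stored for the GGGG hash from the example data.
gggg_count = counts_by_hash[73459868045630124]

# Sum of all counts in the table.
total_count = sum(counts_by_hash.values())

print(gggg_count, total_count)
```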

If the table is empty, dump() returns an empty list:

empty_kct = oxli.KmerCountTable(ksize=4)

empty_kct.dump()
>>> []
