Export Frequency Counts

Adam Taranto edited this page Sep 23, 2024 · 4 revisions

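This section describes how to export k-mer frequency counts using the histo() method. It returns a list of (frequency, count) tuples, where count is the number of distinct k-mers observed exactly frequency times.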

import oxli
kct = oxli.KmerCountTable(ksize=3)

# Count some k-mers
kct.consume('AAAAA') # 'AAA' x 3
kct.count('TTT') # reverse complement of 'AAA', so 'AAA' + 1 (now 4)
kct.count('AAC') # 'AAC' x 1

# Export (frequency, count) tuples
kct.histo(zero=False) # [(1, 1), (4, 1)]

# Include frequencies with 0 counts
kct.histo(zero=True) # [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]
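
To make the meaning of the tuples concrete, here is a minimal pure-Python sketch that reproduces the output above from a plain dictionary of k-mer counts. The function name `histo_from_counts` is hypothetical and is not part of oxli; it only illustrates the semantics shown in the example, and the empty-table behaviour is an assumption.

```python
from collections import Counter

def histo_from_counts(kmer_counts, zero=True):
    """Return sorted (frequency, count) tuples from a {kmer: count} mapping.

    Illustrative only: mimics the output shown for kct.histo(), not its
    actual implementation.
    """
    # Tally how many distinct k-mers share each count value.
    freq_counts = Counter(kmer_counts.values())
    if not freq_counts:
        return []  # assumption: an empty table yields an empty list
    if zero:
        # Fill in every frequency from 0 up to the maximum observed.
        max_freq = max(freq_counts)
        return [(f, freq_counts.get(f, 0)) for f in range(max_freq + 1)]
    return sorted(freq_counts.items())

counts = {"AAA": 4, "AAC": 1}
print(histo_from_counts(counts, zero=False))  # [(1, 1), (4, 1)]
print(histo_from_counts(counts, zero=True))   # [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]
```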

Convert the frequency count list into a pandas DataFrame:

import pandas as pd
histo_output = kct.histo(zero=True)

# Create a Pandas DataFrame from the list of tuples
df = pd.DataFrame(histo_output, columns=['Frequency', 'Count'])

print(df)
   Frequency  Count
0          0      0
1          1      1
2          2      0
3          3      0
4          4      1
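
As a quick sanity check on an exported histogram, the tuples can be summarised without pandas. The sketch below uses two hypothetical helper names, `distinct_kmers` and `total_kmers`, which are not oxli functions; it assumes the input has the same (frequency, count) shape as the histo() output above.

```python
def distinct_kmers(histo):
    """Number of distinct k-mers seen at least once."""
    return sum(count for freq, count in histo if freq > 0)

def total_kmers(histo):
    """Total k-mer occurrences: each frequency weighted by its count."""
    return sum(freq * count for freq, count in histo)

histo_output = [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]
print(distinct_kmers(histo_output))  # 2 ('AAA' and 'AAC')
print(total_kmers(histo_output))     # 5 (4 x 'AAA' + 1 x 'AAC')
```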

Write the histo dataframe to a tab-delimited file:

# Save the histo DataFrame to a tab-delimited file
df.to_csv('histo.tab', sep='\t', index=False)
# Pass `header=False` to also omit the header row.

# Load the tab-delimited file back into a DataFrame
new_df = pd.read_csv('histo.tab', sep='\t')
# Pass `header=None` when the file was written without a header row.
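
If pandas is not available, the same tab-delimited file can be written and read with Python's standard `csv` module. This is a minimal sketch: the helper names `write_histo_tsv` and `read_histo_tsv` are hypothetical, and the example writes to a temporary directory rather than the working directory.

```python
import csv
import os
import tempfile

def write_histo_tsv(histo, path):
    """Write (frequency, count) tuples to a tab-delimited file with a header."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t")
        writer.writerow(["Frequency", "Count"])
        writer.writerows(histo)

def read_histo_tsv(path):
    """Read the file back into a list of (frequency, count) integer tuples."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle, delimiter="\t")
        next(reader)  # skip the header row
        return [(int(freq), int(count)) for freq, count in reader]

# Same values as kct.histo(zero=True) in the example above
histo_output = [(0, 0), (1, 1), (2, 0), (3, 0), (4, 1)]
path = os.path.join(tempfile.mkdtemp(), "histo.tab")
write_histo_tsv(histo_output, path)
print(read_histo_tsv(path) == histo_output)  # True
```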