Basic Counting

Adam Taranto edited this page Sep 23, 2024 · 3 revisions

This section explains basic k-mer counting.

The basics

Import necessary modules:

import oxli   # This package!

Create a KmerCountTable with a k-mer size of 4:

# New KmerCountTable object that will count 4-mers
kct = oxli.KmerCountTable(ksize=4)

You can count a single k-mer using count():

# Increment count of 'AAAA' by 1
kct.count('AAAA')
# 1 # Returns the current count

# Count again
kct.count('AAAA')
# 2 # Count is now 2

A k-mer and its reverse complement are counted as a single record, so they always have the same value.

kct.count('TTTT') #revcomp of 'AAAA'
# 3 # AAAA/TTTT has been counted 3 times
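
To make the reverse-complement pairing concrete, here is a minimal pure-Python sketch of how a k-mer and its reverse complement can be mapped to one shared key. It does not use oxli, and the helper names reverse_complement and canonical are hypothetical. Taking the lexicographically smaller string is one common convention, assumed here for illustration; oxli's internal representation may differ.

```python
# Illustrative sketch only; this is not oxli's implementation.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(kmer: str) -> str:
    """Complement each base (A<->T, C<->G), then reverse the string."""
    return kmer.upper().translate(_COMPLEMENT)[::-1]


def canonical(kmer: str) -> str:
    """Pick one representative for a k-mer and its reverse complement.

    Assumption: the lexicographically smaller of the two is used.
    """
    kmer = kmer.upper()
    return min(kmer, reverse_complement(kmer))


print(reverse_complement("GGGA"))  # TCCC
print(canonical("AAAA"), canonical("TTTT"))  # AAAA AAAA
```

Because both strands map to the same key, incrementing either one updates the same record.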

Use .get() to look up the count of AAAA/TTTT in the count table:

kct.get('AAAA')
# 3 # get() returns k-mer count

kct.get('TTTT')
# 3 # The revcomp retrieves the same count record

You can count all the 4-mers in a longer sequence string using consume():

kct.consume('GGGGAA') # Contains the 4-mers GGGG, GGGA, GGAA

kct.get('GGGG') # GGGG/CCCC
# 1
kct.get('GGGA') # GGGA/TCCC
# 1
kct.get('GGAA') # GGAA/TTCC
# 1
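
The list of 4-mers in the comment above comes from sliding a window of width k along the sequence one base at a time, which yields len(sequence) - k + 1 k-mers. A small sketch of that windowing step, written without oxli (the generator name kmers is hypothetical):

```python
def kmers(sequence: str, ksize: int):
    """Yield every overlapping k-mer of length ksize, left to right."""
    # A sequence shorter than ksize gives an empty range, so nothing is yielded.
    for start in range(len(sequence) - ksize + 1):
        yield sequence[start:start + ksize]


print(list(kmers("GGGGAA", 4)))  # ['GGGG', 'GGGA', 'GGAA']
```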

You can also manually set the count for a specific k-mer:

kct['GGGG'] = 1000

kct.get('GGGG')
# 1000
kct.get('CCCC')
# 1000
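
As a mental model for the behaviour shown on this page, the sketch below replays the same steps against a toy, dictionary-backed table. ToyKmerTable is a hypothetical stand-in written for illustration, not oxli's KmerCountTable, and it assumes the lexicographically smaller strand is used as the stored key.

```python
from collections import Counter

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


class ToyKmerTable:
    """Hypothetical dict-backed stand-in; not oxli's KmerCountTable."""

    def __init__(self, ksize: int):
        self.ksize = ksize
        self._counts = Counter()

    def _key(self, kmer: str) -> str:
        # Assumed convention: store the smaller of the k-mer and its revcomp.
        kmer = kmer.upper()
        if len(kmer) != self.ksize:
            raise ValueError(f"expected a {self.ksize}-mer, got {kmer!r}")
        return min(kmer, kmer.translate(_COMPLEMENT)[::-1])

    def count(self, kmer: str) -> int:
        """Increment the shared record and return the new count."""
        key = self._key(kmer)
        self._counts[key] += 1
        return self._counts[key]

    def get(self, kmer: str) -> int:
        """Return the current count (0 if the k-mer was never seen)."""
        return self._counts[self._key(kmer)]

    def consume(self, sequence: str) -> None:
        """Count every overlapping k-mer in a longer sequence."""
        for start in range(len(sequence) - self.ksize + 1):
            self.count(sequence[start:start + self.ksize])

    def __setitem__(self, kmer: str, value: int) -> None:
        """Overwrite the count for a k-mer and its reverse complement."""
        self._counts[self._key(kmer)] = value


toy = ToyKmerTable(ksize=4)
toy.count("AAAA")
toy.count("AAAA")
toy.count("TTTT")
toy.consume("GGGGAA")
toy["GGGG"] = 1000
print(toy.get("TTTT"), toy.get("TCCC"), toy.get("CCCC"))  # 3 1 1000
```

Routing every operation through a single key function is what keeps both strands in step, whether the count is incremented, read, or overwritten.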