This is a modified version of KMC that compares k-mers from short-read data against an assembly and plots the result.
git clone && cd KMC
make -j 16
If the source code compiles successfully, there will be a bin directory containing all the executable files you need. Otherwise, please refer to the old README.
Given an assembly in fasta/fasta.gz format and a list of short-read files in fastq/fastq.gz format, you can use the following commands to make a comparison plot.
bin/kmc -k21 -ci0 -fm -t12 -m20 -sm $asm $asm.prefix tmp
bin/kmc -k21 -ci0 -t12 -m20 -sm @$reads $reads.prefix tmp
bin/kmc_tools analyze $reads.prefix $asm.prefix $output.matrix
python3 $output.matrix $output.png
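The commands above rely on three shell variables and a working directory named tmp. A minimal setup sketch follows; every file name in it is a placeholder rather than something shipped with this repository, and creating tmp beforehand is an assumption that the working directory should already exist when kmc starts.

```shell
# All names below are placeholders; substitute your own paths.
asm=assembly.fasta      # assembly in fasta/fasta.gz format
reads=reads.lst         # read file list, passed to kmc as @$reads
output=comparison       # prefix shared by the matrix and the plot

# Working directory given as the last argument to bin/kmc
# (assumed to be required to exist before kmc runs).
mkdir -p tmp
```

With these values the k-mer databases are written under the prefixes assembly.fasta.prefix and reads.lst.prefix, and the matrix is written to comparison.matrix.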
When all the commands have finished, you will see a figure like this:
- How to make a list of short reads files (fastq/fastq.gz)
The read file list is a tab-delimited text file, one read file per line, following a simple syntax: <READ_FILE_PATH><tab>[TRIM_NUMBER]. Note that if TRIM_NUMBER is not set, it is treated as 0. If you only have one read file, you can run the KMC command directly without a read file list and use -d to set the number of bases to trim off.
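As a rough sketch, a read file list in this format can be generated with a short loop. The directory, the file names, and the single trim value shared by all files are assumptions made for illustration; the scratch directory and empty placeholder files exist only so the example runs on its own.

```shell
# Hypothetical layout: all read files sit in one directory. A scratch
# directory with empty placeholder files stands in for real data here.
reads_dir=$(mktemp -d)
touch "$reads_dir/sample_R1.fastq.gz" "$reads_dir/sample_R2.fastq.gz"

trim=0                          # TRIM_NUMBER applied to every file; 0 = no trimming
reads="$reads_dir/reads.lst"    # the list later passed to kmc as @$reads

# One line per read file: <READ_FILE_PATH><tab>[TRIM_NUMBER]
for f in "$reads_dir"/*.fastq.gz; do
    printf '%s\t%s\n' "$f" "$trim"
done > "$reads"

cat "$reads"
```

If the files need different trim values, write the lines by hand or replace the fixed trim variable with a per-file lookup.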
This plot is modeled on one small part of the K-mer Analysis Toolkit (KAT). If you'd like to know more, please visit their website: kat-web.