Analysis of mutations in engineered strains of Clostridium thermocellum
This repository contains a set of tools for analyzing mutations from resequencing data. Here is the basic outline for how to use them:
- Process raw sequence data with CLC Genomics Workbench to generate CSV files with mutations
- Run the "process_clc_files.py" script to compile the mutations into a pandas DataFrame for further analysis.
- If desired, the mutations can be annotated with nearby protein coding regions (CDS) by running "annotate_df_with_CDS.py".
- Finally, a table of mutations can be created by running the "output_pivot_table()" function in "annotate_df_with_CDS.py"
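
The steps above can be sketched roughly as follows. This is only an illustration of the general idea, not the code in "process_clc_files.py" or "annotate_df_with_CDS.py". The column names (`Region`, `Type`, `Strain`, `Locus`, `Start`, `End`), the inline sample data, and the helper functions `compile_mutations`, `annotate_nearest_cds`, and `mutation_pivot` are all assumptions made for the example. Real CLC exports may use different columns and region formats.

```python
import io

import pandas as pd

# Hypothetical stand-ins for per-strain CSV exports (made-up values).
CSV_BY_STRAIN = {
    "strain_A": "Region,Type,Reference,Allele\n100,SNV,A,G\n250,Deletion,T,-\n",
    "strain_B": "Region,Type,Reference,Allele\n100,SNV,A,G\n",
}

# Hypothetical CDS coordinates (made-up values).
CDS = pd.DataFrame(
    {"Locus": ["geneX", "geneY"], "Start": [50, 300], "End": [150, 400]}
)


def compile_mutations(csv_by_strain):
    """Concatenate per-strain mutation tables into one long DataFrame."""
    frames = []
    for strain, text in csv_by_strain.items():
        df = pd.read_csv(io.StringIO(text))
        df["Strain"] = strain  # remember which strain each row came from
        frames.append(df)
    return pd.concat(frames, ignore_index=True)


def annotate_nearest_cds(mutations, cds):
    """Add a Locus column naming the CDS that contains or is closest to each mutation."""

    def nearest(pos):
        # Distance is 0 inside a CDS, otherwise the gap to its nearer end.
        dist = (cds["Start"] - pos).clip(lower=0) + (pos - cds["End"]).clip(lower=0)
        return cds.loc[dist.idxmin(), "Locus"]

    return mutations.assign(Locus=mutations["Region"].map(nearest))


def mutation_pivot(mutations):
    """Build a presence/absence table: one row per mutation, one column per strain."""
    return mutations.assign(Present=1).pivot_table(
        index=["Region", "Type", "Locus"],
        columns="Strain",
        values="Present",
        aggfunc="max",
        fill_value=0,
    )


mutations = compile_mutations(CSV_BY_STRAIN)
annotated = annotate_nearest_cds(mutations, CDS)
table = mutation_pivot(annotated)
print(table)
```

In a table laid out this way, mutations shared by every strain and mutations unique to one strain are easy to tell apart by scanning across a row.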