msjimc/GeneMatrix

Gene matrix

Introduction

GeneMatrix allows the rapid extraction of gene-specific sequences from either a folder of GenBank sequence files, a single GenBank file containing a series of entries, or a combination of the two. The program is specifically designed to process files downloaded from the NCBI site in the GenBank format. While a number of applications export data in this format, differences in formatting and annotation style may lead to errors during processing. GeneMatrix requires both the annotation and sequence to be present in the file.
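To illustrate the requirement that both annotation and sequence be present, here is a minimal Python sketch that splits a multi-entry GenBank file into records and checks each one for a feature table and a non-empty sequence. It is not GeneMatrix's own code; the function names `split_genbank_records` and `has_annotation_and_sequence` are hypothetical, and the check relies only on the standard `FEATURES`, `ORIGIN` and `//` markers of the GenBank flat-file format.

```python
def split_genbank_records(text):
    """Split GenBank flat-file text into records, each ending with '//'."""
    records, current = [], []
    for line in text.splitlines():
        current.append(line)
        if line.strip() == "//":  # '//' terminates a GenBank entry
            records.append("\n".join(current))
            current = []
    return records


def has_annotation_and_sequence(record):
    """Return True if the record has a FEATURES table and sequence after ORIGIN."""
    lines = record.splitlines()
    has_features = any(line.startswith("FEATURES") for line in lines)
    origin_index = None
    for i, line in enumerate(lines):
        if line.startswith("ORIGIN"):
            origin_index = i
            break
    if origin_index is None:
        return False
    # Sequence lines hold a position number followed by blocks of bases.
    bases = "".join(
        ch for line in lines[origin_index + 1:] for ch in line if ch.isalpha()
    )
    return has_features and bool(bases)
```

Records that fail this check (for example, entries downloaded without their sequence) are the kind of input that would be expected to cause problems during processing.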

Once imported, GeneMatrix extracts each DNA and/or protein sequence linked to a CDS, tRNA or rRNA feature and allows sequences with the same or related names to be exported as a single multi-sequence FASTA file, so that each file contains all the sequences linked to a specific gene. The program can then direct the alignment of these files by ClustalW2, PRANK, MAFFT or Muscle (if present on the same computer) and, if more than one gene feature was exported, combine the resulting alignments into a super-alignment that can be used in fields such as phylogenetic analysis.
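The idea of a super-alignment can be pictured as joining the per-gene alignments end to end for each taxon. The Python sketch below shows one way to do this; it is an illustration under stated assumptions, not GeneMatrix's actual method. The function name `concatenate_alignments` is hypothetical, and padding a taxon with gap characters where it lacks a gene is an assumption about how missing data might be handled.

```python
def concatenate_alignments(alignments):
    """Join per-gene alignments into one super-alignment.

    alignments: list of dicts mapping taxon name -> aligned sequence.
    Taxa missing from a gene are padded with '-' (an assumption).
    """
    taxa = sorted({taxon for aln in alignments for taxon in aln})
    parts = {taxon: [] for taxon in taxa}
    for aln in alignments:
        lengths = {len(seq) for seq in aln.values()}
        if len(lengths) != 1:
            raise ValueError("all sequences in an alignment must share one length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(aln.get(taxon, "-" * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}
```

For example, joining a four-column alignment of taxa A and B with a two-column alignment of taxa A and C gives three rows of six columns each, with B and C gap-padded for the gene they lack.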

If requested, GeneMatrix will also direct the cleaning of the alignments by GBlocks (if present on the computer). GeneMatrix can also convert a set of alignments into a single Phylip-formatted alignment file and then guide the selection of parameters for downstream phylogenetic analysis with PartitionFinder2.
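As a rough picture of what a Phylip-formatted alignment looks like, the following Python sketch writes an alignment in sequential, relaxed Phylip layout: a header line with the number of taxa and the number of columns, then one line per taxon. The function name `to_phylip` is hypothetical, and the choice of the relaxed variant (names of any length, spaces replaced by underscores) is an assumption; the exact layout GeneMatrix writes may differ.

```python
def to_phylip(alignment):
    """Format a dict of taxon name -> aligned sequence as relaxed sequential Phylip."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all sequences must have the same aligned length")
    nchar = lengths.pop()
    # Names must not contain whitespace; pad them to a common width.
    names = {name: name.replace(" ", "_") for name in alignment}
    width = max(len(safe) for safe in names.values()) + 2
    lines = [f"{len(alignment)} {nchar}"]
    for name, seq in alignment.items():
        lines.append(f"{names[name]:<{width}}{seq}")
    return "\n".join(lines) + "\n"
```

Calling `to_phylip` on a concatenated alignment produces text that can be saved to a `.phy` file for downstream tools.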

Guide

The user guide is here.

Download

The prebuilt program can be downloaded here.

Running on a Linux computer

GeneMatrix can run on Linux with the help of Wine, as described here.
