This repository contains the script I used to extract the sequences for building new profile hidden Markov models (pHMMs). It also contains the seed sequences and alignments used to create the pHMMs.
Motifsearch.py extracts the sequences that contain a predefined motif from a FASTA file. It takes a FASTA file and a regex search pattern as input and outputs a list of the sequences from the original FASTA file that contain the requested motif.
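The idea behind this kind of motif filtering can be illustrated with a minimal sketch. This is not the code in Motifsearch.py: the function names parse_fasta and filter_by_motif, the example sequences, and the motif C..C are all hypothetical and chosen only for illustration. The sketch assumes that sequences may be wrapped over several lines and that the regex is matched against the joined sequence, not against the header.

```python
import re
from typing import Iterator, List, Tuple


def parse_fasta(text: str) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from FASTA-formatted text."""
    header = None
    chunks: List[str] = []
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header = line[1:]
            chunks = []
        elif header is not None:
            # Sequence lines are joined so motifs spanning line breaks match.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def filter_by_motif(text: str, pattern: str) -> List[Tuple[str, str]]:
    """Return the records whose sequence contains the regex motif."""
    motif = re.compile(pattern)
    return [(h, s) for h, s in parse_fasta(text) if motif.search(s)]


# Hypothetical input: only seq1 contains two C residues separated by
# exactly two arbitrary residues.
example = """>seq1
MKTAYCAAC
GG
>seq2
MKTLLVV
"""

hits = filter_by_motif(example, r"C..C")
for header, seq in hits:
    print(f">{header}\n{seq}")
```

To use the sketch on a real file, read the file contents into a string first (for example with open(path).read()) and pass that string to filter_by_motif together with the regex.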
The Seeds directory contains the lists of sequences used to build the alignments from which the pHMMs were built.
The Alignments directory contains said alignments.