forked from biopython/biopython
Home
bow edited this page Jun 29, 2012 · 5 revisions
The purpose of this Wiki is to serve as a 'sandbox' for the future SearchIO documentation. For now, I will try to note the important points that should later be covered in the real documentation.
Outline:
- Main functions: parse, read, to_dict, index, index_db, convert, write
- Supported programs and formats
- Object model: QueryResult, Hit, HSP
- Other things to note: coordinate base (0 instead of 1), DNA vs. protein search coordinates, ID and Desc behavior, shallow vs. deep copies
- Format-specific usage guide
- Contributing your parser
- Attributes of common objects (e.g. seq_len or acc)
- Custom parsing / writing behavior (e.g. parsing custom columns in blast-tab, writing PSL files with headers)
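
The coordinate-base point in the outline can be illustrated with a minimal sketch. It assumes that search programs report 1-based, inclusive coordinates and that the 0-based form is half-open, as in Python slices. The outline only says "0 instead of 1", so the half-open convention is an assumption here. The function names `to_zero_based` and `to_one_based` are hypothetical and not part of SearchIO.

```python
def to_zero_based(start, end):
    """Convert a 1-based, inclusive range to a 0-based, half-open range.

    Assumption: the 0-based form is half-open, so only the start shifts.
    """
    if start < 1 or end < start:
        raise ValueError("expected 1 <= start <= end, got %r, %r" % (start, end))
    return start - 1, end


def to_one_based(start, end):
    """Convert a 0-based, half-open range back to 1-based, inclusive."""
    if start < 0 or end <= start:
        raise ValueError("expected 0 <= start < end, got %r, %r" % (start, end))
    return start + 1, end


# A program reports that a hit covers residues 3..6 (1-based, inclusive).
seq = "MKTAYIAKQR"
start, end = to_zero_based(3, 6)

# The converted pair can be used directly as a Python slice.
fragment = seq[start:end]
```

With half-open coordinates, `end - start` equals the length of the covered region, so no off-by-one adjustment is needed when computing span lengths.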
For each of the formats + programs below, we should mention:
- Which program flavors and versions we support (or are sure are supported)
- The custom attributes present in that format (e.g. HSP.z_score in fasta-m10 or HSP.cluster_num in hmmer-tab)
- Custom behavior not covered by the main API (e.g. custom blast-tab nonstandard column parsing)
- Gotchas / tricks that could be useful
- blast-xml ~ dealing with blast-generated Query and/or Hit IDs
- blast-tab ~ parsing + writing files with custom column order
- blast-tabc ~ writing files with custom column order
- blast-text ~ the extent of Biopython's support
- blat-psl ~ reading files with track lines (used in Ensembl), dealing with non-DNA sequence searches
- blat-pslx
- fasta-m10 ~ dealing with custom output (full alignment display), dealing with E2() values
- hmmer ~ general note on hmm{from,to} and ali{from,to} coordinates
- hmmer-text ~ dealing with alignment annotations
- hmmer-tab
- hmmer-domtab (hmmscan-domtab, hmmsearch-domtab, phmmer-domtab)
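
For the blast-tab item in the list above, parsing a file with a custom column order might be sketched as follows. This is a plain-Python illustration of the idea, not SearchIO's implementation. The function `parse_tab_line` and the `CASTERS` table are hypothetical, and the default column names are assumed to follow the standard BLAST+ tabular output.

```python
# Assumed default column order of BLAST+ tabular output.
DEFAULT_FIELDS = (
    "qseqid", "sseqid", "pident", "length", "mismatch", "gapopen",
    "qstart", "qend", "sstart", "send", "evalue", "bitscore",
)

# Hypothetical type table: columns not listed here stay as strings.
CASTERS = {
    "pident": float, "length": int, "mismatch": int, "gapopen": int,
    "qstart": int, "qend": int, "sstart": int, "send": int,
    "evalue": float, "bitscore": float,
}


def parse_tab_line(line, fields=DEFAULT_FIELDS):
    """Parse one tab-separated line into a dict keyed by column name.

    `fields` gives the column order used when the file was written.
    """
    values = line.rstrip("\n").split("\t")
    if len(values) != len(fields):
        raise ValueError(
            "expected %d columns, found %d" % (len(fields), len(values))
        )
    row = {}
    for name, value in zip(fields, values):
        row[name] = CASTERS.get(name, str)(value)
    return row


# A file written with a custom, shortened column order.
custom_fields = ("qseqid", "sseqid", "evalue", "bitscore")
row = parse_tab_line("query1\tsubj7\t1e-30\t115.0\n", custom_fields)
```

Passing the column order explicitly keeps the parser independent of the program's defaults, which is the behavior the documentation should describe for nonstandard blast-tab files.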