Skip to content

AnnotationExtensionConfig

James Seager edited this page Jun 9, 2020 · 12 revisions

Configuring annotation extensions in Canto

The annotation config files are in pombase-config Git repository.

Config file format

Each row of the files configures a separate extension. The files are tab-delimited with these columns:

  • domain term ID - The extension configured on this row applies to this term and its descendants
  • subset relation - The relation used to find domain term descendants (usually "is_a")
  • allowed extension - The extension relation name
  • range - See below
  • text for display - The text to show in the user interface for this relation
  • cardinality - Possible values: "0","1", "2" or "*"
  • role - Possible values "user" and "admin". User extensions are shown to all, admin extensions just to admins.

Domain term ID

After an annotation is made the configurations are searched using the annotated term ID. Any configuration where the annotated ID is the domain term ID or a descendant is shown to the user. Normally this column contains just a term ID like: "GO:0031399"

If needed, subsets of terms can be excluded from the domain with syntax like:

"GO:0031399-is_a(GO:0043666)&is_a(GO:0045859)"

Which means:

This config line applies to terms where TERM is_a GO:0031399 and:
      TERM is not a GO:0043666
  AND TERM is not a GO:0045859

Annotation extension ranges

The range column must contain a combination of these, separated by pipes ("|"):

  • An ontology term (and its descendants) like "GO:0005575" or several terms
  • A specifier giving the type of feature that is allowed. Currently these all just mean that the range will be a gene from the session:
    • "GeneID"
    • "ProteinID"
    • "TranscriptID"
  • The string "Text" which will allow free text for this extension range

If there are multiple ontology terms configured for the range, the term name box with complete on those terms and any of their descendants.

Setting configuration file names

"extension_conf_files" should be set in canto_deploy.yaml to set the configuration file names. This setting is a list. eg.

extension_conf_files:
  - file_one.tsv
  - file_two.tsv

Loading the configuration

The extension configuration is loaded by adding the --process-extension-config flag to the canto_load.pl command:

 ./script/canto_load.pl --process-extension-config --ontology <file.obo> --ontology <file.obo> ...

See http://curation.pombase.org/docs/canto_admin/setup

The files from extension_conf_files are read to find all term IDs mentioned as domains or ranges.

This script then uses the output of "owtools --save-closure-for-chado" to find the child terms of all terms from the configuration. owtools must be in the path.

A cvtermprop named "canto_subset" is added to each domain or range term and to all descendant terms.

Usage in Canto

The Canto web app loads these configuration on start up. When a term is used in an annotation, its "canto_subset" properties are read. Those property values refer to domain or range term IDs in the config. If a domain ID matches the GUI code shows that relation and constrains the range.

Further information