Expander specification file
The expander specification file is an XML file that specifies how to build the expander index for a search engine.
The expander index is built using the alvisir-index-expander command installed with AlvisIR:
alvisir-index-expander INDEX_DIR SPEC.xml
INDEX_DIR is the directory where the index will be created. Any previous index in this directory is cleared.
SPEC.xml is the expander specification file.
The top-level tag indicates the type of resource; each type has a different set of properties.
<compound>
<spec1>
...
</spec1>
<spec2>
....
</spec2>
...
</compound>
A compound expander is an aggregate of several resource expanders. Use a compound expander if you have several expansion types.
Each sub-element indicates a resource to include in the expander.
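As an illustration, a compound specification that combines two resources might look like the sketch below. The file names, type names and prefixes are hypothetical placeholders, not values defined by AlvisIR. Writing each resource as a self-closing element with all properties given as attributes is an assumption, based on the attribute forms described in this page.

```xml
<!-- Sketch only: ontology.obo, synonyms.tsv, "taxon" and "gene" are hypothetical. -->
<compound>
  <!-- First resource: an OBO ontology -->
  <obo source="ontology.obo" type="taxon" prefix="{taxon}"/>
  <!-- Second resource: a sorted-vertical synonym table -->
  <sorted-vertical source="synonyms.tsv" type="gene" prefix="{gene}"/>
</compound>
```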
<obo
normalization="..."
source="..."
type="..."
prefix="..."
>
<source>...</source>
<type>...</type>
<prefix>...</prefix>
<property name="...">...</property>
<json-property root-id="">...</json-property>
</obo>
normalization: Sets the normalization filters for labels found in this resource. Normalization filters are described in the Search configuration page. The default normalization filter is lowercase,ascii,english-stemming.
source: The OBO file containing the ontology. This property can be set either as an attribute or as a sub-element.
type: Type of the expansion. This type can be referenced in the Search configuration. This property can be set either as an attribute or as a sub-element.
prefix: Prefix to add to expanded query terms. Usually the prefix takes the form {entitytype}. This property can be set either as an attribute or as a sub-element.
property: Stores a property in the expander index with the specified name. Several properties can be stored.
json-property: Stores the ontology sub-tree rooted at root-id in JSON format as a property. The content of the element is the property name. This property can be referenced in the Web UI configuration.
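A filled-in OBO specification could look like the following sketch, which uses the sub-element form for source, type and prefix. The file name, the type, the root identifier EX:0000001 and the property name taxon-tree are hypothetical values chosen for illustration. The normalization value simply restates the documented default.

```xml
<!-- Sketch only: all values below are hypothetical placeholders. -->
<obo normalization="lowercase,ascii,english-stemming">
  <source>ontology.obo</source>
  <type>taxon</type>
  <prefix>{taxon}</prefix>
  <!-- Store the sub-tree under EX:0000001 as a JSON property named "taxon-tree" -->
  <json-property root-id="EX:0000001">taxon-tree</json-property>
</obo>
```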
Sorted-vertical resources are tab-separated tabular text files in which each line represents a synonym. The synonym, the label and the identifier are each contained in a specific column.
The lines are assumed to be sorted: all synonyms of the same canonical form must be adjacent.
Column numbers start at zero.
<sorted-vertical
normalization="..."
source="..."
type="..."
type-column="..."
prefix="..."
suffix="..."
synonym="..."
canonical="..."
label="..."
>
<source>...</source>
<type>...</type>
<type-column>...</type-column>
<prefix>...</prefix>
<suffix>...</suffix>
<synonym>...</synonym>
<canonical>...</canonical>
<label>...</label>
<property name="...">...</property>
</sorted-vertical>
normalization: Sets the normalization filters for labels found in this resource. Normalization filters are described in the Search configuration page. The default normalization filter is lowercase,ascii,english-stemming.
source: The sorted-vertical tabular file containing the terms. This property can be set either as an attribute or as a sub-element.
type, type-column: Type of the expansion. This type can be referenced in the Search configuration. These properties can be set either as an attribute or as a sub-element. It is mandatory to set one of the two properties. If type is set, then all labels are of the same type. Otherwise, the type of each label is the value of the column specified by type-column.
prefix: Prefix to add to expanded query terms. Usually the prefix takes the form {entitytype}. This property can be set either as an attribute or as a sub-element.
suffix: Suffix to add to expanded query terms. Usually the suffix is used in path expansions to append the separator (e.g. ,). This property can be set either as an attribute or as a sub-element.
synonym: Column that contains the synonym. By default, the synonym column is 0. This property can be set either as an attribute or as a sub-element.
canonical: Column that contains the canonical form (identifier). By default, the canonical column is 1. This property can be set either as an attribute or as a sub-element.
label: Column that contains the label. By default, the label column is 2. This property can be set either as an attribute or as a sub-element.
property: Stores a property in the expander index with the specified name. Several properties can be stored.
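To make the column layout concrete, here is a sketch of a sorted-vertical specification together with a matching data file, shown inside an XML comment. The file name, identifiers, terms and type are invented for illustration. The column attributes restate the documented defaults and are written out only for readability.

```xml
<!--
  Hypothetical contents of synonyms.tsv (tab characters shown as <TAB>).
  Column 0 = synonym, column 1 = canonical form, column 2 = label.
  All lines sharing the same canonical form are adjacent.

  MI<TAB>EX:001<TAB>myocardial infarction
  heart attack<TAB>EX:001<TAB>myocardial infarction
  myocardial infarction<TAB>EX:001<TAB>myocardial infarction
  cerebrovascular accident<TAB>EX:002<TAB>stroke
  stroke<TAB>EX:002<TAB>stroke
-->
<sorted-vertical
    source="synonyms.tsv"
    type="disease"
    prefix="{disease}"
    synonym="0"
    canonical="1"
    label="2"/>
```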
Horizontal resources are tab-separated tabular text files in which each line represents a term. The label and the identifier are each contained in a specific column. All synonyms of a term are contained in the last columns of its line.
Column numbers start at zero.
<sorted-horizontal
normalization="..."
source="..."
type="..."
type-column="..."
prefix="..."
suffix="..."
first-synonym="..."
canonical="..."
label="..."
>
<source>...</source>
<type>...</type>
<type-column>...</type-column>
<prefix>...</prefix>
<suffix>...</suffix>
<first-synonym>...</first-synonym>
<canonical>...</canonical>
<label>...</label>
<property name="...">...</property>
</sorted-horizontal>
normalization: Sets the normalization filters for labels found in this resource. Normalization filters are described in the Search configuration page. The default normalization filter is lowercase,ascii,english-stemming.
source: The horizontal tabular file containing the terms. This property can be set either as an attribute or as a sub-element.
type, type-column: Type of the expansion. This type can be referenced in the Search configuration. These properties can be set either as an attribute or as a sub-element. It is mandatory to set one of the two properties. If type is set, then all labels are of the same type. Otherwise, the type of each label is the value of the column specified by type-column.
prefix: Prefix to add to expanded query terms. Usually the prefix takes the form {entitytype}. This property can be set either as an attribute or as a sub-element.
suffix: Suffix to add to expanded query terms. Usually the suffix is used in path expansions to append the separator (e.g. ,). This property can be set either as an attribute or as a sub-element.
first-synonym: Column that contains the first synonym; the remaining columns up to the last one are treated as additional synonyms. By default, the first synonym column is 3. This property can be set either as an attribute or as a sub-element.
canonical: Column that contains the canonical form (identifier). By default, the canonical column is 1. This property can be set either as an attribute or as a sub-element.
label: Column that contains the label. By default, the label column is 2. This property can be set either as an attribute or as a sub-element.
property: Stores a property in the expander index with the specified name. Several properties can be stored.
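The horizontal layout can be sketched in the same way. In the example below the column numbers are overridden so that the identifier is in column 0, the label in column 1 and the synonyms start at column 2. The file name, identifiers, terms and type are hypothetical, and the element name follows the template shown above.

```xml
<!--
  Hypothetical contents of terms.tsv (tab characters shown as <TAB>).
  Column 0 = canonical form, column 1 = label, columns 2 and up = synonyms.

  EX:001<TAB>myocardial infarction<TAB>heart attack<TAB>MI
  EX:002<TAB>stroke<TAB>cerebrovascular accident
-->
<sorted-horizontal
    source="terms.tsv"
    type="disease"
    prefix="{disease}"
    canonical="0"
    label="1"
    first-synonym="2"/>
```

Each line may carry a different number of synonyms, since every column from first-synonym to the end of the line is read as a synonym.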