Some code to install and test various named entity recognition (NER) systems on different datasets, allowing for both in-domain and out-of-domain evaluation, using standard evaluation metrics and automatic error analysis.
Installation and testing scripts are provided for the following systems:
- CRF++ (version 0.58)
- Stanford NER (version 3.9.1)
- Illinois NER, i.e. the NER package in CogComp-NLP (version 4.0.9)
- NeuroNER
- SpaCy
- The baseline used for the CoNLL-2003 shared task (see paper).
The systems may be trained and tuned on in-domain data, out-of-domain data or both. See `exp/test_scripts/exp.cfg` for details.
Some information on this project can be found in the slides (PDF file) in this directory.
Requirements:

- Python 3 (some scripts aren't compatible with Python 2 at the moment, and NeuroNER requires Python 3)
- numpy
- matplotlib, if you want to use `eval/plot_scores.py`
- Bash
- Perl (for the `conlleval` evaluation script)
- Maven, to compile Illinois NER
To install and test one or more NER systems, do the following:
- Obtain and prepare datasets. Datasets must be text files containing 2 whitespace-separated columns, with tokens in the first column and labels in the second, and with empty lines between sentences. Labels should be encoded using the BIO-2 format. Dataset names must match a specific pattern, as explained in `exp/test_scripts/exp.cfg`. The directory `data_utils` contains various scripts used to preprocess, analyze, and transform datasets.
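For illustration, here is a minimal sketch of what this two-column BIO-2 layout looks like, together with a small check of the column count and label prefixes. The example tokens, labels and script are not part of this repository.

```python
# Sketch of the expected dataset layout (tokens and labels are illustrative):
#
#   West     B-MISC
#   German   I-MISC
#   exports  O
#
#   Peter     B-PER
#   Blackburn I-PER
#
# Minimal check that a file follows this two-column BIO-2 layout.
import sys

def check_bio2(path):
    with open(path, encoding="utf-8") as f:
        for lineno, line in enumerate(f, 1):
            if not line.strip():
                continue  # blank line = sentence boundary
            cols = line.split()
            if len(cols) != 2:
                print(f"line {lineno}: expected 2 columns, found {len(cols)}")
                continue
            token, label = cols
            if label != "O" and not label.startswith(("B-", "I-")):
                print(f"line {lineno}: label '{label}' is not BIO-2")

if __name__ == "__main__":
    check_bio2(sys.argv[1])
```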
- Install the systems you would like to evaluate. You might want to review the installation scripts first. Note: if you install NeuroNER, make sure your default Python interpreter uses Python 3 (you can use a virtual environment if you don't want to change your system's default interpreter).

```
chmod a+x install/install*
install/install_crfpp.sh install-directory
install/install_stanford_ner.sh install-directory
install/install_illinois_ner.sh install-directory
install/install_neuroner.sh install-directory
install/install_spacy.sh
```
- If using Stanford NER, prepare the configuration file you will use to train the model. You can use one of the configuration files used to train the models provided with Stanford NER, which are located in the sub-directory `classifiers` (e.g. `english.conll.4class.distsim.prop`). The values of `trainFile` and `serializeTo` can be left blank, as they will be replaced automatically during the tests. Also, if you want to use distributional features, you will need to obtain some distributional clusters and provide their path in the configuration file -- see Christopher Manning's answer here for details, and note that Alex Clark's code can be used to create distributional clusters. Otherwise, set `useDistSim = false`.
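For orientation, the sketch below writes a stripped-down training configuration. It is an illustration only, not a file from this repository: the property names follow the stock configuration files shipped with Stanford NER, but check the Stanford NER documentation for the authoritative list and recommended values.

```python
# Sketch: write a minimal Stanford NER training .prop file.
# trainFile and serializeTo are left blank because the test scripts fill
# them in; useDistSim is set to false since no distributional clusters
# are assumed here.
MINIMAL_PROPS = """\
trainFile =
serializeTo =
map = word=0,answer=1
useClassFeature = true
useWord = true
useNGrams = true
noMidNGrams = true
maxNGramLeng = 6
usePrev = true
useNext = true
useSequences = true
usePrevSequences = true
maxLeft = 1
useTypeSeqs = true
useTypeSeqs2 = true
useTypeySequences = true
wordShape = chris2useLC
useDistSim = false
"""

with open("my_ner.prop", "w", encoding="utf-8") as f:
    f.write(MINIMAL_PROPS)
```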
- If using SpaCy or NeuroNER, get pre-trained GloVe word embeddings. You can use any other pre-trained word embeddings, but they must be in a text file. SpaCy expects that file to have a header as in word2vec's text format, whereas NeuroNER expects no header (as in GloVe's text format).

```
chmod a+x install/get_glove_embeddings.sh
install/get_glove_embeddings.sh download-directory
```
- If using SpaCy, the text file containing the pre-trained word embeddings must have a header as in word2vec's text format, so if it does not (as in GloVe's format), you must add one. You can use the script `exp/print_glove_shape.py` to get this header. The following 2 commands write the header to `path-output` and then append the embeddings.

```
python exp/print_glove_shape.py path-embeddings > path-output
cat path-embeddings >> path-output
```
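In case it helps to see what that header contains, here is a minimal standalone sketch (not the repository script) that computes it: the word2vec-style header is simply the number of vectors followed by their dimensionality.

```python
# Sketch: print a word2vec-style header ("<n_vectors> <dim>") for a
# GloVe-style text file in which each line is "word v1 v2 ... vd".
import sys

def glove_shape(path):
    n_vectors, dim = 0, 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            parts = line.rstrip().split(" ")
            if len(parts) < 2:
                continue
            n_vectors += 1
            dim = len(parts) - 1  # all rows should share the same dimensionality
    return n_vectors, dim

if __name__ == "__main__":
    n, d = glove_shape(sys.argv[1])
    print(f"{n} {d}")
```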
- If using SpaCy, initialize a model containing the pre-trained word embeddings. Note: the parameter `nb-vectors-kept` specifies the number of unique embeddings that the vocabulary is pruned to (-1 for no pruning) -- see SpaCy's doc.

```
chmod a+x exp/spacy_init_model.sh
exp/spacy_init_model.sh language path-embeddings nb-vectors-kept path-model
```
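As a quick sanity check, you can load the resulting model and confirm that the vectors were imported. This is a sketch assuming the model directory created above (`path-model`) and a spaCy version that exposes `nlp.vocab.vectors`.

```python
# Sketch: load the initialized spaCy model and check that pre-trained
# vectors were imported.
import spacy

nlp = spacy.load("path-model")
# Shape is (number of vectors kept, embedding dimensionality).
print(nlp.vocab.vectors.shape)
```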
- Test the baseline system and the `conlleval` evaluation script.

```
python exp/compute_baseline.py path-training-file path-test-file path-output
chmod a+x eval/conlleval
eval/conlleval < path-output
```
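For reference, `conlleval` scores files in which each line holds a token followed by the gold label and the predicted label as the last two columns, with blank lines between sentences (presumably the format `exp/compute_baseline.py` produces, since its output is piped straight to `conlleval`). A minimal sketch with made-up tokens and labels:

```python
# Sketch: write a tiny file in the format conlleval expects
# (whitespace-separated columns; the last two columns are the gold label
# and the predicted label; blank lines separate sentences).
rows = [
    ("West", "B-MISC", "B-MISC"),
    ("German", "I-MISC", "O"),
    ("exports", "O", "O"),
    (),  # sentence boundary
    ("John", "B-PER", "B-PER"),
    ("lives", "O", "O"),
    ("here", "O", "O"),
]

with open("toy_predictions.txt", "w") as f:
    for row in rows:
        f.write(" ".join(row) + "\n" if row else "\n")

# Evaluate with: eval/conlleval < toy_predictions.txt
```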
- Review the configuration file `exp/test_scripts/exp.cfg`.
- Run the tests for a given system.

```
cd exp/test_scripts
chmod a+x exp_*.sh
./exp_baseline.sh
```
The predictions of the system on each test set and the evaluation results will be written in a time-stamped sub-directory. These include the results of the `conlleval` evaluation script, as well as the output of the script `eval/error_analysis.py`. You can also evaluate the predictions using `eval/hardeval.py`.
You can also copy the `test_scripts` directory elsewhere, modify the configuration file, and run the tests there, e.g. to keep separate directories for different experimental configurations (such as training in-domain vs. out-of-domain).