Evaluation
André Pires edited this page Apr 21, 2017 · 5 revisions
To obtain stable, comparable estimates of NER performance across all the tools, each tool's output was converted to the CoNLL format and evaluated with the conlleval script. Repeated 10-fold cross-validation was performed, and the results were then averaged.
All results can be accessed here.
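As an illustration of the conversion step, here is a minimal sketch of how a tool's output could be joined with the gold data before running conlleval. It assumes that both inputs are already in a two-column `token tag` layout with blank lines between sentences, and it relies on conlleval reading the gold tag and the predicted tag from the last two columns of each line. The function name `join_gold_and_predicted` is hypothetical and is not taken from the project's scripts.

```python
def join_gold_and_predicted(gold_lines, predicted_lines):
    """Merge gold and predicted CoNLL-style lines into conlleval input.

    Assumption: every non-blank line is "token tag" and blank lines
    mark sentence boundaries in both inputs.
    """
    if len(gold_lines) != len(predicted_lines):
        raise ValueError("gold and predicted files differ in length")

    joined = []
    for number, (gold, pred) in enumerate(zip(gold_lines, predicted_lines), 1):
        gold, pred = gold.strip(), pred.strip()
        if not gold and not pred:
            joined.append("")  # keep the sentence boundary
            continue
        if not gold or not pred:
            raise ValueError(f"sentence boundaries differ at line {number}")

        gold_fields, pred_fields = gold.split(), pred.split()
        if gold_fields[0] != pred_fields[0]:
            raise ValueError(f"token mismatch at line {number}")

        # Output columns: token, gold tag, predicted tag.
        joined.append(f"{gold_fields[0]} {gold_fields[-1]} {pred_fields[-1]}")
    return joined
```

The mismatch checks are a design choice of this sketch: failing early on misaligned tokens avoids silently scoring a tool against the wrong gold labels.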
- Get output from tools
- Get the gold-standard data for each fold
- Join both (script)
- Evaluate each fold (scripts)
- Compute average for each repeat (script)
- Compute global average (script)
- Get result files for each fold
- Save each result into a list with the accuracy and each category
- Create a dictionary keyed by category, where each value is a list with precision, recall and FB1
- For each result
  - Save averages
  - Save measures for each category
- Calculate average
- Calculate the macro-average for FB1
- Print to file
Check script here.
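The averaging steps above can be sketched roughly as follows. This is not the linked script, only a minimal illustration under stated assumptions: each result is the text printed by conlleval, with an overall line of the form `accuracy: ..%; precision: ..%; recall: ..%; FB1: ..` followed by one `CATEGORY: precision: ..%; recall: ..%; FB1: ..` line per category. The macro-average is assumed here to be the unweighted mean of the per-category average FB1. The names `parse_result` and `average_results` are hypothetical.

```python
import re
from collections import defaultdict
from statistics import mean

# Assumed layout of the conlleval summary lines.
OVERALL_RE = re.compile(
    r"accuracy:\s*([\d.]+)%; precision:\s*([\d.]+)%; "
    r"recall:\s*([\d.]+)%; FB1:\s*([\d.]+)"
)
CATEGORY_RE = re.compile(
    r"^\s*(\S+): precision:\s*([\d.]+)%; "
    r"recall:\s*([\d.]+)%; FB1:\s*([\d.]+)"
)


def parse_result(text):
    """Return ([accuracy, precision, recall, fb1], {category: [p, r, fb1]})."""
    overall, categories = None, {}
    for line in text.splitlines():
        match = OVERALL_RE.search(line)
        if match:
            overall = [float(value) for value in match.groups()]
            continue
        match = CATEGORY_RE.match(line)
        if match:
            categories[match.group(1)] = [float(v) for v in match.groups()[1:]]
    return overall, categories


def average_results(texts):
    """Average overall and per-category measures over all fold results."""
    overall_rows = []
    per_category = defaultdict(list)  # category -> list of [p, r, fb1]
    for text in texts:
        overall, categories = parse_result(text)
        if overall is None:
            raise ValueError("no overall line found in result")
        overall_rows.append(overall)
        for name, measures in categories.items():
            per_category[name].append(measures)

    # Column-wise means; a category is averaged over the folds it appears in.
    avg_overall = [mean(column) for column in zip(*overall_rows)]
    avg_categories = {
        name: [mean(column) for column in zip(*rows)]
        for name, rows in per_category.items()
    }
    # Macro-average: unweighted mean of the per-category average FB1.
    macro_fb1 = (
        mean(m[2] for m in avg_categories.values()) if avg_categories else 0.0
    )
    return avg_overall, avg_categories, macro_fb1
```

Calling `average_results` once per repeat and then averaging those outputs again would correspond to the per-repeat and global averages listed in the steps.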