There are no tags in the system response for the column... #8

creat89 · 2020-05-25T12:36:50Z

Hello,

Currently I'm having an issue doing some internal evaluations when my system only predicts one type of NER. The scorer stops when I do not provide labels for all the column types. In my opinion, if the user does not provide labels for a specific column, the scorer should return zero in the evaluation of that column rather than stopping.

aflueckiger · 2020-05-25T14:05:10Z

Hello @creat89,

Thanks for your report. Yet, I cannot reproduce the error. What do you mean by "stopping"? Do you get an error? The following test seems to work as expected:

I replaced the NE-FINE-LIT column with X using:

awk -vOFS='\t'  '{$4 = "X"; print}' data/release/v1.2/de/HIPE-data-v1.2-test-de.tsv > issue_8.tsv

Then I ran the scorer with:

python ../CLEF-HIPE-2020-scorer/clef_evaluation.py --ref data/release/v1.2/de/HIPE-data-v1.2-test-de.tsv --pred issue_8.tsv --task nerc_fine --outdir data/system-evaluations --log issue_8.log

The scorer correctly complains about the missing column in this case:

The provided annotation columns ['NE-FINE-LIT'] are not available in both the gold standard and the system response 'issue_8.tsv'.

However, it runs through when you just provide an empty column with all the required fieldnames (e.g. 'NE-FINE-LIT').

Please provide more information if you think something is wrong.

creat89 · 2020-05-25T14:21:32Z

For instance, I have the following file:

TOKEN	NE-COARSE-LIT	NE-COARSE-METO	NE-FINE-LIT	NE-FINE-METO	NE-FINE-COMP	NE-NESTED	NEL-LIT	NEL-METO	MISC
# language = de
# newspaper = NZZ
# date = 1798-01-17
# document_id = NZZ-1798-01-17-a-p0002
# segment_iiif_link = _
Rußland	B-loc	O	O	O	O	O	_	_	_
.	O	O	O	O	O	O	_	_	_
Petersburg	B-loc	O	O	O	O	O	_	_	_

where my system only predicted labels for the column NE-COARSE-LIT but not for NE-COARSE-METO. In other words, for NE-COARSE-METO I just printed O. If I run the script:

python clef_evaluation.py --ref /home/HIPE-data-v1.0-dev-de.tsv --pred /home/Predictions_dev.tsv --skip_check --task nerc_coarse

The script tells me:

There are no tags in the system response file '/home/adrian/Programs/NER_BERT_News/Server_Models3/NER_models_europeana_german_fixed_nofixtags_weightedlosslogAlpha/Impresso_de_v1,0_8_5e-05//Predictions_test.tsv' for the column: ['NE-COARSE-METO']

It does not produce any output. In this case, however, it should report a score of zero for NE-COARSE-METO rather than stopping the script.

You wrote: "However, it runs through when you just provide an empty column with all the required fieldnames (e.g. 'NE-FINE-LIT')."

So does this mean that instead of putting O in the second column, I just need to leave it empty?
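
To make the situation concrete, here is a minimal sketch of how one could check which annotation columns of a prediction file contain no entity tags at all. It is not part of the scorer: the function name `columns_without_tags`, the `EMPTY_TAGS` set, and the rule that metadata lines start with "# " are assumptions made for illustration.

```python
# Assumption: these values count as "no tag" in an annotation column.
EMPTY_TAGS = {"O", "_", ""}


def columns_without_tags(tsv_text, columns):
    """Return the requested columns in which no row carries an entity tag."""
    lines = [ln for ln in tsv_text.splitlines() if ln.strip()]
    header = lines[0].split("\t")
    seen_tag = {col: False for col in columns}
    for line in lines[1:]:
        # Assumption: metadata lines look like "# language = de".
        if line.startswith("# "):
            continue
        row = dict(zip(header, line.split("\t")))
        for col in columns:
            if row.get(col, "") not in EMPTY_TAGS:
                seen_tag[col] = True
    return [col for col in columns if not seen_tag[col]]


# Sample mirroring the file excerpt shown in this comment.
HEADER = ["TOKEN", "NE-COARSE-LIT", "NE-COARSE-METO", "NE-FINE-LIT",
          "NE-FINE-METO", "NE-FINE-COMP", "NE-NESTED", "NEL-LIT",
          "NEL-METO", "MISC"]
ROWS = [
    ["Rußland", "B-loc", "O", "O", "O", "O", "O", "_", "_", "_"],
    [".", "O", "O", "O", "O", "O", "O", "_", "_", "_"],
    ["Petersburg", "B-loc", "O", "O", "O", "O", "O", "_", "_", "_"],
]
sample = "\n".join(
    ["\t".join(HEADER), "# language = de"] + ["\t".join(r) for r in ROWS]
)

untagged = columns_without_tags(sample, ["NE-COARSE-LIT", "NE-COARSE-METO"])
print(untagged)
```

For the excerpt above, only NE-COARSE-METO is reported, which matches the column named in the scorer's message.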

aflueckiger · 2020-05-25T15:03:03Z

Thanks for clarifying. With b8bdbf8, the behavior has changed: the scorer now only logs missing tags and no longer quits the evaluation.
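
As a rough illustration of the "log and continue" idea, the sketch below scores one column and returns zeros when the system response has no tags, instead of aborting. This is not the code from b8bdbf8; the names `evaluate_column` and `prf`, the logger name, and the exact-match comparison of `(start, end, type)` spans are all assumptions made for the example.

```python
import logging

logger = logging.getLogger("scorer_sketch")  # hypothetical logger name


def prf(true_positives, n_predicted, n_gold):
    """Precision, recall and F1, with every empty denominator mapped to 0.0."""
    precision = true_positives / n_predicted if n_predicted else 0.0
    recall = true_positives / n_gold if n_gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return {"P": precision, "R": recall, "F1": f1}


def evaluate_column(column, gold_spans, pred_spans):
    """Score one annotation column; spans are (start, end, type) tuples."""
    if not pred_spans:
        # Log the missing tags and keep going instead of raising an error.
        logger.warning(
            "There are no tags in the system response for the column: %s",
            column,
        )
    true_positives = len(set(gold_spans) & set(pred_spans))
    return prf(true_positives, len(pred_spans), len(gold_spans))


gold = [(0, 0, "loc"), (2, 2, "loc")]
empty_result = evaluate_column("NE-COARSE-METO", gold, [])
full_result = evaluate_column("NE-COARSE-LIT", gold, gold)
```

With this approach, a column without predictions yields a score of zero, while the remaining columns are still evaluated.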

aflueckiger added a commit that referenced this issue May 25, 2020

do not interrupt when tags are missing (issue #8)

b8bdbf8

aflueckiger closed this as completed May 25, 2020
