Use original (or at least consistent) outcome labels in id2outcome.txt #252

logological · 2015-07-13T09:07:09Z

Currently, id2outcome.txt uses numeric IDs for the classification outcomes, but (at least for cross-validation experiments) these IDs are not consistent from file to file. For example, BrownPosDemo does a two-fold cross-validation. One of the id2outcome.txt files uses the following mapping of numeric IDs to the original labels:

0=NPg 2=JJ 1=(null) 3=RB 5=TO 4=PPS 6=RP 7=NP 8=NN 10=VBN 9=VB 11=pct 12=PPO 13=BE 14=MD 15=DTS 16=VBZ 17=AT 18=IN 19=CS 20=VBG 21=VBD 22=BEDZ 23=NNS 24=CC 25=CD 26=AP 27=PPg

The other id2outcome.txt file uses a slightly different mapping:

0=NPg 2=(null) 1=JJ 3=RB 5=PPS 4=TO 6=RP 7=NP 8=NN 10=VB 9=VBN 11=pct 12=PPO 13=BE 14=MD 15=DTS 16=VBZ 17=AT 18=IN 19=CS 20=VBG 21=VBD 22=BEDZ 23=NNS 24=CC 25=CD 26=AP 27=PPg

In order to get the raw classifications, not only do I need to combine the two files, but I first have to manually un-map all the numeric IDs to their original labels.

It would be better if id2outcome.txt didn't use numeric IDs at all, but rather used the original label IDs. If for some reason the mapping to numeric IDs is necessary, it would be helpful if the mapping were consistent across files.
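
Until then, the manual un-mapping can be scripted. The following is a minimal sketch, not part of the framework: it assumes each fold's mapping is available as a whitespace-separated string of `id=label` pairs, as quoted above. The function names `parse_label_map` and `harmonize` and the prediction lists are hypothetical, chosen for illustration only.

```python
def parse_label_map(line):
    """Parse a string such as '0=NPg 2=JJ 1=(null)' into {0: 'NPg', 2: 'JJ', 1: '(null)'}."""
    mapping = {}
    for token in line.split():
        num, label = token.split("=", 1)
        mapping[int(num)] = label
    return mapping


def harmonize(*maps):
    """Give every label one stable ID by numbering the labels in sorted order."""
    labels = sorted({label for m in maps for label in m.values()})
    return {label: i for i, label in enumerate(labels)}


# Shortened versions of the two per-fold mappings quoted in this issue.
fold_a = parse_label_map("0=NPg 2=JJ 1=(null) 3=RB")
fold_b = parse_label_map("0=NPg 2=(null) 1=JJ 3=RB")

# Hypothetical numeric outcomes read from each fold's file.
preds_a = [2, 1, 3]
preds_b = [1, 2, 3]

# Translate each fold with its OWN mapping before combining the results.
labels_a = [fold_a[i] for i in preds_a]
labels_b = [fold_b[i] for i in preds_b]

# One consistent label-to-ID mapping shared by both folds.
shared_ids = harmonize(fold_a, fold_b)
```

Here the same numeric ID means different things in the two folds (ID 1 is `(null)` in the first and `JJ` in the second), so the per-fold translation has to happen before the files are merged.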

daxenberger · 2015-10-20T06:22:15Z

InnerBatchUsingTCEvaluationReport now generates a file "id2harmonizedOutcome.txt" in the crossvalidation context folder.

logological mentioned this issue Jul 13, 2015

Unit classification mode: Integrate TextClassificationUnit.setSuffix() with externalD #251

Open

daxenberger assigned andriyNadolskyy Aug 6, 2015

reckart added the bug label Sep 6, 2015

andriyNadolskyy added a commit that referenced this issue Oct 18, 2015

Fixes issue #252

2c1a4f5

daxenberger added a commit that referenced this issue Oct 20, 2015

Issue #252: adding demo for single label document classification

e700a54

daxenberger closed this as completed Oct 20, 2015