🐛 Bug
Information
Model: distilbert
Language: English
The problem arises when using: QA inference via pipeline
The pipeline throws an exception when the model predicts a token that is not part of the document (e.g. final special token).
In the example below, the model predicts token 13 to be the end of the answer span.
The context, however, ends at token 12, and token 13 is the final [SEP] token. Therefore, we get a KeyError when trying to access feature.token_to_orig_map[13] here: transformers/src/transformers/pipelines.py, lines 1370 to 1380 (at ce374ba).
To reproduce
from transformers import pipeline

nlp = pipeline("question-answering",
               model="distilbert-base-uncased-distilled-squad",
               tokenizer="distilbert-base-uncased",
               device=-1)
nlp(question="test finding", context="My name is Carla and I live in Berlin")
results in
Traceback (most recent call last):
File "/home/mp/deepset/dev/haystack/debug.py", line 16, in <module>
nlp(question="test finding", context="My name is Carla and I live in Berlin")
File "/home/mp/miniconda3/envs/py37/lib/python3.7/site-packages/transformers/pipelines.py", line 1316, in __call__
for s, e, score in zip(starts, ends, scores)
File "/home/mp/miniconda3/envs/py37/lib/python3.7/site-packages/transformers/pipelines.py", line 1316, in <listcomp>
for s, e, score in zip(starts, ends, scores)
KeyError: 13
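For intuition, here is a minimal sketch of the failure mode (the dict below is a toy stand-in for feature.token_to_orig_map, not the library's actual data, and the positions are made up):
# token_to_orig_map only has entries for tokens that came from the context,
# so an index pointing at the trailing [SEP] has no key.
token_to_orig_map = {i: i - 5 for i in range(5, 13)}  # context tokens at positions 5..12
end_index = 13  # model predicts the final [SEP] as the answer end
token_to_orig_map[end_index]  # raises KeyError: 13, matching the traceback above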
Expected behavior
Predictions that point to tokens that are not part of the context (here: the last [SEP] token) should be filtered out of the possible answers.
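One way such filtering could look (a hedged sketch reusing the names from the traceback; starts, ends, scores, and feature are the pipeline's internal variables, not verified against master):
# Keep only start/end candidates that map back to an original context token.
valid = set(feature.token_to_orig_map)  # token indices that belong to the context
answers = [
    (s, e, score)
    for s, e, score in zip(starts, ends, scores)
    if s in valid and e in valid
]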
Environment info
transformers version: 3.0.2
Platform: Ubuntu 18.04
Python version: 3.7.6
PyTorch version (GPU?): 1.5.1, CPU
Using GPU in script?: No
Using distributed or parallel set-up in script?: No
We did have an issue where predictions were going out of bounds in the QA pipeline, and it has been fixed on master:
>>> nlp = pipeline("question-answering",
...                model="distilbert-base-uncased-distilled-squad",
...                tokenizer="distilbert-base-uncased",
...                device=-1)
>>> nlp(question="test finding", context="My name is Carla and I live in Berlin")
{'score': 0.41493675112724304, 'start': 11, 'end': 16, 'answer': 'Carla'}
If you are able to check out the master branch, I would be happy to hear back from you to make sure it's working as expected on your side as well.
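In case it's useful, one common way to try the master branch is the standard pip-from-git route (adjust to your setup):
pip install git+https://github.com/huggingface/transformers.git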