
Whisper pipeline raises error when using return_timestamps (ValueError: The following model_kwargs are not used by the model: ['return_timestamps']) #905

Closed
xenova opened this issue Mar 21, 2023 · 1 comment · Fixed by #919
Labels
bug Something isn't working

xenova commented Mar 21, 2023

System Info

optimum: 1.7.1
Python: 3.8.3
transformers: 4.27.2
platform: Windows 10

Who can help?

@philschmid

Information

  • The official example scripts
  • My own modified scripts

Tasks

  • An officially supported task in the examples folder (such as GLUE/SQuAD, ...)
  • My own task or dataset (give details below)

Reproduction

This is a working example using the transformers pipeline function:

from transformers import pipeline

transcriber = pipeline('automatic-speech-recognition', 'openai/whisper-tiny.en')

text = transcriber(
    'https://xenova.github.io/transformers.js/assets/audio/ted_60.wav',
    return_timestamps=True,
    chunk_length_s=30,
    stride_length_s=5
)

print(f'{text=}')
# outputs correctly
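
For reference, with return_timestamps=True the transformers pipeline returns the transcript together with per-chunk timestamps, roughly shaped as below (illustrative structure only, not the actual output of the run above):

# text == {
#     'text': ' ... full transcript ... ',
#     'chunks': [
#         {'timestamp': (0.0, 5.4), 'text': ' ... first segment ... '},
#         ...
#     ]
# }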

After converting the model to ONNX using this command:

python -m optimum.exporters.onnx --model openai/whisper-tiny.en whisper_onnx/

and running the equivalent code:

import onnxruntime
from transformers import pipeline, AutoProcessor
from optimum.onnxruntime import ORTModelForSpeechSeq2Seq

whisper_model_name = './whisper_onnx/'
processor = AutoProcessor.from_pretrained(whisper_model_name)
session_options = onnxruntime.SessionOptions()

model_ort = ORTModelForSpeechSeq2Seq.from_pretrained(
    whisper_model_name,
    use_io_binding=True,
    session_options=session_options
)
generator_ort = pipeline(
    task="automatic-speech-recognition",
    model=model_ort,
    feature_extractor=processor.feature_extractor,
    tokenizer=processor.tokenizer,
)

out = generator_ort(
    'https://xenova.github.io/transformers.js/assets/audio/ted_60.wav',
    return_timestamps=True,
    chunk_length_s=30,
    stride_length_s=5
)

print(f'{out=}')

I get the error:

ValueError: The following `model_kwargs` are not used by the model: ['return_timestamps'] (note: typos in the generate arguments will also show up in this list)
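
As far as I can tell, the ValueError is raised by transformers' GenerationMixin._validate_model_kwargs before any ONNX session is run: the pipeline forwards return_timestamps into generate(), and ORTModelForSpeechSeq2Seq's generate() does not accept it the way WhisperForConditionalGeneration's does. A minimal sketch that appears to hit the same check, reusing model_ort from above (dummy input; my assumption about the failure path):

import torch

# Whisper expects log-mel features of shape (batch, 80 mel bins, 3000 frames).
# A zero tensor is enough here because the kwarg validation runs before any forward pass.
dummy_features = torch.zeros(1, 80, 3000)

# Raises the same ValueError about unused model_kwargs: ['return_timestamps']
model_ort.generate(dummy_features, return_timestamps=True)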

Expected behavior

The pipeline using the ONNX model should behave the same as the transformers version: return the transcription with timestamps instead of raising an error.
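
Until this is fixed (presumably by #919), the only workaround I see on the ONNX side is to drop return_timestamps and accept plain text output. A sketch under that assumption, reusing generator_ort from above (chunking is handled by the pipeline itself, so it should still work):

out = generator_ort(
    'https://xenova.github.io/transformers.js/assets/audio/ted_60.wav',
    chunk_length_s=30,
    stride_length_s=5
)
print(f'{out=}')  # transcript without timestamps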
