
Add IPU support for HF pipelines to Whisper #368

Merged: 11 commits into huggingface:main on Jun 6, 2023

Conversation

paolot-gc
Collaborator

Add support for HF pipelines

This PR makes it possible to use the HF pipeline for Whisper on the IPU.

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you make sure to update the documentation with your changes?
  • Did you write any new necessary tests?

@paolot-gc paolot-gc requested review from jimypbr and katalinic-gc May 5, 2023 09:24
@paolot-gc paolot-gc changed the title Add support for HF pipelines Add IPU support for HF pipelines to Whisper May 5, 2023
@HuggingFaceDocBuilderDev

HuggingFaceDocBuilderDev commented May 5, 2023

The documentation is not available anymore as the PR was closed or merged.

@katalinic-gc
Collaborator

For automatic_speech_recognition.py, can you indicate which parts are actual modifications? E.g. today in generation/utils.py we have comments like # Change: ...; perhaps # IPU Change: ... is clearer. The file seems to be mostly a copy-and-paste of upstream, so it's unclear what can be re-used and where we need IPU-specific patches.

@paolot-gc paolot-gc marked this pull request as ready for review May 18, 2023 09:49
paolot-gc and others added 4 commits June 2, 2023 17:32
  • Proper way of calling generate(): with support for optimisations
  • Noted changes, used US spelling, avoided warning
  • Subclassed AutomaticSpeechRecognitionPipeline
  • Parameter passing, better example
  • Pipeline support for batch: adds batch_size support to pipeline
  • Adapt to new IPUConfig code
  • Fix chunking with batching
  • Avoid hardcoding of max_new_tokens
  • Batching using pipeline (with list arg)
  • Fix black complaints
  • Fix isort complaints
  • Fix for batch_size=1
  • Avoid UnicodeEncodeError when outputting text
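Several of the commits above concern chunking with batching. For long-form audio, the upstream HF ASR pipeline splits the input into overlapping windows so the model sees full context at every chunk boundary, then merges the decoded text. A minimal, self-contained re-derivation of that windowing arithmetic follows; the function name and signature are ours for illustration, not the pipeline's actual API.

```python
def chunk_windows(n_samples, chunk_len, stride_left, stride_right):
    """Split a long audio buffer into overlapping (start, end) windows.

    Consecutive windows overlap by stride_left + stride_right samples,
    mirroring how chunked ASR pipelines keep context at chunk edges.
    Illustrative only: names and signature are not the pipeline's API.
    """
    step = chunk_len - stride_left - stride_right
    assert step > 0, "strides must leave a positive step size"
    windows = []
    start = 0
    while start < n_samples:
        end = min(start + chunk_len, n_samples)
        windows.append((start, end))
        if end == n_samples:
            break
        start += step
    return windows

print(chunk_windows(100, 40, 5, 5))  # → [(0, 40), (30, 70), (60, 100)]
```

Every sample index falls inside at least one window, and each interior boundary is covered by two windows, which is what lets the decoder's outputs be stitched together without gaps.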
@katalinic-gc katalinic-gc merged commit c2b349b into huggingface:main Jun 6, 2023
4 participants