Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use new parametrization based weight norm if available #24030

Merged
merged 3 commits into from
Jun 6, 2023

Conversation

ezyang
Copy link
Contributor

@ezyang ezyang commented Jun 5, 2023

What does this PR do?

In pytorch/pytorch#103001 I introduce a new
parametrization based version of weight_norm. One big benefit of
the new API is that the resulting model is deepcopy'able; today, you can't
deepcopy Wav2Vec2 models.

Since the new API isn't even in PyTorch main yet, I'd like to feature
gate it here, so that it gets used whenever PyTorch is recent enough to
support it. It would be a big help for me if you could take this change
earlier rather than later; otherwise I will have to patch transformers
in our own CI to get our benchmark harness working on Wav2Vec2.

Signed-off-by: Edward Z. Yang [email protected]

Before submitting

  • This PR fixes a typo or improves the docs (you can dismiss the other checks if that's the case).
  • Did you read the contributor guideline,
    Pull Request section?
  • Was this discussed/approved via a Github issue or the forum? Please add a link
    to it if that's the case.
  • Did you make sure to update the documentation with your changes? Here are the
    documentation guidelines, and
    here are tips on formatting docstrings.
  • Did you write any new necessary tests?

Who can review?

cc @sanchit-gandhi

Copy link
Collaborator

@sgugger sgugger left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM as it's properly gated. @sanchit-gandhi can you have a second look?
You'll just need to run make style and make fix-copies on your branch to make the CI happy.

@HuggingFaceDocBuilderDev
Copy link

HuggingFaceDocBuilderDev commented Jun 5, 2023

The documentation is not available anymore as the PR was closed or merged.

ezyang added 2 commits June 5, 2023 13:49
Signed-off-by: Edward Z. Yang <[email protected]>
Signed-off-by: Edward Z. Yang <[email protected]>
@ezyang
Copy link
Contributor Author

ezyang commented Jun 6, 2023

PyTorch side PR has landed!

Copy link
Contributor

@sanchit-gandhi sanchit-gandhi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM too - thanks for the PR @ezyang! Here's the PR in the PyTorch repo for reference: pytorch/pytorch#103001

@sgugger sgugger merged commit bc9ecef into huggingface:main Jun 6, 2023
novice03 pushed a commit to novice03/transformers that referenced this pull request Jun 23, 2023
…4030)

* Use new parametrization based weight norm if available

See pytorch/pytorch#103001

Signed-off-by: Edward Z. Yang <[email protected]>

* handle copies

Signed-off-by: Edward Z. Yang <[email protected]>

* black

Signed-off-by: Edward Z. Yang <[email protected]>

---------

Signed-off-by: Edward Z. Yang <[email protected]>
@ydshieh
Copy link
Collaborator

ydshieh commented Jun 28, 2023

@ezyang

With nightly pytorch, we get

AttributeError: encoder.pos_conv_embed.conv.weight_v not found in PyTorch model

when trying to load a pytorch model into a TF model.

The TF model is looking for encoder.pos_conv_embed.conv.weight_v but we now have encoder.pos_conv_embed.conv.parametrizations.weight.original0. (This is from our (TF)Wav2Vec2Model model class).

Question: In your PR pytorch/pytorch#103001, is this part

def _weight_norm_compat_hook()

that deals with the backward compatibility? If so, we will copy it :-)

ydshieh added a commit that referenced this pull request Jun 28, 2023
* fix

* fix

* fix

---------

Co-authored-by: ydshieh <[email protected]>
@ezyang
Copy link
Contributor Author

ezyang commented Jun 28, 2023

Yep. The change here is not FC so the ingester needs updating.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants