
Allow gelu approximations #1911
Merged: 5 commits merged into apple:main on Jul 14, 2023

Conversation

@pcuenca (Contributor) commented on Jul 11, 2023

The MIL gelu implementation accepts tanh or sigmoid approximations, but the frontend asserts that no approximation is requested.

This PR allows the supported approximations to be specified. The tanh approximation is used in models such as those with the gpt_bigcode architecture, so conversion can now proceed by applying the same approximation.
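
For context, a minimal sketch of converting a model that uses the tanh-approximated GELU (hypothetical module and input shape, assuming a coremltools build that includes this change):

import torch
import coremltools as ct

class Block(torch.nn.Module):
    def __init__(self):
        super().__init__()
        # gpt_bigcode-style activation: GELU with the tanh approximation
        self.act = torch.nn.GELU(approximate="tanh")

    def forward(self, x):
        return self.act(x)

traced = torch.jit.trace(Block().eval(), torch.rand(1, 8))
# Before this change, the frontend asserted approximate == "none" and failed here.
mlmodel = ct.convert(traced, inputs=[ct.TensorType(shape=(1, 8))])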

@pcuenca (Contributor, Author) commented on Jul 12, 2023

I added a test for tanh and removed support for sigmoid in the frontend, as I realized it's not yet supported in PyTorch: pytorch/pytorch#102447
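
For reference, torch's gelu currently accepts only the "none" and "tanh" modes; a quick sketch (standalone PyTorch, no coremltools involved):

import torch
import torch.nn.functional as F

x = torch.randn(4)
F.gelu(x, approximate="none")  # exact GELU (the default)
# tanh approximation: 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x**3)))
F.gelu(x, approximate="tanh")
# F.gelu(x, approximate="sigmoid") would raise a RuntimeError;
# sigmoid support is tracked in pytorch/pytorch#102447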

res = mb.gelu(x=inputs[0], name=node.name)
if approximate == "tanh":
    approximate = "TANH_APPROXIMATION"
elif approximate == "none":

A Collaborator commented on the diff above:
If PyTorch only supports two modes right now, it would be better to do:

else:
    assert approximate == "none"

to handle possible future changes in the torch frontend (for example, if they add support for more modes).

@pcuenca (Author) replied:

I didn't do it because an unsupported approximation would still fail in the mb.gelu implementation. But you are right that this is a better place to signal where the conversion needs to happen; changing it now!
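
A sketch of what the updated check might look like after that change (a hypothetical fragment continuing the snippet above; the exact code in the PR may differ):

if approximate == "tanh":
    mode = "TANH_APPROXIMATION"
else:
    # Fail here, at the conversion layer, if torch ever adds an
    # approximation mode the frontend doesn't handle yet.
    assert approximate == "none", f"unsupported gelu approximation: {approximate}"
    mode = "EXACT"
res = mb.gelu(x=inputs[0], mode=mode, name=node.name)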

@jakesabathia2 (Collaborator) commented:

https://gitlab.com/coremltools1/coremltools/-/pipelines/929504950

@pcuenca (Contributor, Author) commented on Jul 13, 2023

Not sure why the Python 3.10 tests fail; I checked some (the Spectrogram ones, for instance), and they passed locally.

@TobyRoseman (Collaborator) commented:

> Not sure why the Python 3.10 tests fail; I checked some (the Spectrogram ones, for instance), and they passed locally.

This is curious. I also cannot reproduce it locally using this pull request.

I've restarted the failed job. Perhaps this is a non-deterministic issue.

A similar unit test is also inexplicably failing in an unrelated pull request (#1897). I cannot reproduce that one locally either, although it passes on main.

The failure doesn't seem very serious: only one element is mismatched (out of thousands).

I don't think this unit test failure should block merging this pull request.

@jakesabathia2 (Collaborator) commented:

I think it might be some flaky issue that is triggered by certain random inputs...

@jakesabathia2 (Collaborator) commented:

@TobyRoseman I can put up another PR to fix that; I'm guessing we just need to do some input data "surgery" for that particular test.
@pcuenca Your PR looks awesome, I am going to merge it.

@jakesabathia2 merged commit 36544e2 into apple:main on Jul 14, 2023
@pcuenca deleted the allow-gelu-approximations branch on Jul 15, 2023
pcuenca added a commit to huggingface/exporters that referenced this pull request on Aug 23, 2023