Allow gelu approximations #1911
Conversation
I added a test for it (not yet supported in PyTorch).

```python
res = mb.gelu(x=inputs[0], name=node.name)
if approximate == "tanh":
    approximate = "TANH_APPROXIMATION"
elif approximate == "none":
```
If PyTorch only supports two modes right now, it would be better to do:

```python
else:
    assert approximate == "none"
```

to guard against possible future changes in the torch frontend (for instance, if they add support for more modes).
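A minimal sketch of the suggested pattern (the helper name is illustrative, not the coremltools API): map torch's `approximate` string to the MIL mode and fail loudly on anything unrecognized.

```python
from typing import Optional

def map_approximate(approximate: str) -> Optional[str]:
    # Hypothetical helper: translate torch's gelu `approximate` mode
    # to the MIL mode string, or None for the exact gelu.
    if approximate == "tanh":
        return "TANH_APPROXIMATION"
    # Fail loudly if torch ever adds a mode this converter doesn't handle
    assert approximate == "none", f"unsupported gelu mode: {approximate}"
    return None
```

This way a newly added torch mode surfaces as an assertion at conversion time instead of a confusing error deeper in `mb.gelu`.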
I didn't do it because an unsupported approximation would still fail in the `mb.gelu` implementation. But you are right that this is a better place to signal where the conversion fails; changing it now!

Not sure why the Python 3.10 tests fail. I checked some (the Spectrogram ones, for instance), and they passed locally.
This is curious. I also cannot reproduce it locally using this pull request. I've restarted the failed job; perhaps this is a non-deterministic issue. A similar unit test is also inexplicably failing in an unrelated pull request (#1897), and I cannot reproduce it locally using that pull request either. The failure doesn't seem very serious: only one element is mismatched (out of thousands). I don't think this unit test failure should block merging this pull request.
I think it might be some flaky issue, triggered by certain random inputs.
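One way to chase a failure like this is to pin the RNG seed so the random inputs that trigger the mismatch can be regenerated exactly; a small sketch (the helper and shape are hypothetical, not the test suite's actual code):

```python
import numpy as np

def make_inputs(seed: int, shape=(2, 3)) -> np.ndarray:
    # Hypothetical repro helper: a seeded generator yields the same
    # "random" inputs on every run, so a flaky case can be replayed.
    rng = np.random.default_rng(seed)
    return rng.standard_normal(shape)

a = make_inputs(1911)
b = make_inputs(1911)
# Same seed, identical inputs: the failing case becomes deterministic
```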
@TobyRoseman I can put up another PR to fix that.
PRs apple/coremltools#1910 and apple/coremltools#1911 are now released in coremltools 7.0b2.
The MIL `gelu` implementation accepts `tanh` or `sigmoid` approximations, but the frontend asserts that no approximation is requested. This PR allows the supported approximations to be specified. The `tanh` approximation is used in models such as those with the `gpt_bigcode` architecture, so conversion can now proceed by applying the same approximation.
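For reference, the exact GELU and the two approximations mentioned above can be sketched in NumPy (these are the standard textbook formulas, not the coremltools implementation):

```python
import math
import numpy as np

def gelu_exact(x: np.ndarray) -> np.ndarray:
    # Exact GELU: x * Phi(x), where Phi is the standard normal CDF
    return x * 0.5 * (1.0 + np.vectorize(math.erf)(x / math.sqrt(2.0)))

def gelu_tanh(x: np.ndarray) -> np.ndarray:
    # tanh approximation (what PyTorch calls approximate="tanh")
    return 0.5 * x * (1.0 + np.tanh(math.sqrt(2.0 / math.pi)
                                    * (x + 0.044715 * x ** 3)))

def gelu_sigmoid(x: np.ndarray) -> np.ndarray:
    # sigmoid approximation: x * sigmoid(1.702 * x)
    return x / (1.0 + np.exp(-1.702 * x))

x = np.linspace(-4.0, 4.0, 101)
```

Both approximations track the exact GELU closely, which is why converting a `tanh`-approximated torch gelu to the MIL `TANH_APPROXIMATION` mode preserves the model's behavior.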