Support BFloat16 in convolution_backward #7807
Conversation
Stack from ghstack (oldest at bottom):
🔗 Helpful Links: 🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/executorch/7807
Note: Links to docs will display an error until the docs builds have been completed.
❌ 1 New Failure as of commit 11f4d2d with merge base 466d98f.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
auto expected_grad_weight = tf.make({4, 3, 4, 2}, expected_grad_weight_data);
auto expected_grad_bias = tf.make({4}, expected_grad_bias_data);
if (DTYPE == ScalarType::Half || DTYPE == ScalarType::BFloat16) {
  EXPECT_TENSOR_CLOSE_WITH_TOL(grad_input, expected_grad_input, 1e-2, 1e-8);
Why not use defaults here? EXPECT_TENSOR_CLOSE_WITH_TOL should apply the right tolerance given the type.
because the default rtol is 1e-5; rtol and atol are different
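For context, a minimal sketch of the allclose-style criterion such macros typically implement (the exact formula used by EXPECT_TENSOR_CLOSE_WITH_TOL is an assumption here); it illustrates why rtol and atol play different roles:

```cpp
#include <cmath>

// Assumed allclose-style predicate: rtol scales with the magnitude of the
// expected value, while atol is an absolute floor that only matters for
// values near zero. With the default rtol of 1e-5, BFloat16 results (roughly
// two to three significant decimal digits) would routinely fail the check.
bool element_close(double actual, double expected, double rtol, double atol) {
  return std::abs(actual - expected) <= atol + rtol * std::abs(expected);
}
```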
Right, but in the same way that we have kDefaultHalfAtol and kDefaultBFloat16Atol, I think we should have kDefaultHalfRtol and kDefaultBFloat16Rtol and set them to a proper value. You seem to be using 1e-2 for most of these tests. Why not introduce kDefaultHalfRtol and kDefaultBFloat16Rtol with value 1e-2?
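A minimal sketch of what that could look like, modeled on the existing kDefaultHalfAtol / kDefaultBFloat16Atol pattern; the rtol constant names, the 1e-2 value, and the helper below are the reviewer's proposal plus illustrative assumptions, not code that exists in the repo:

```cpp
// Hypothetical companions to the existing kDefaultHalfAtol /
// kDefaultBFloat16Atol constants (names and value per the suggestion above).
constexpr double kDefaultHalfRtol = 1e-2;
constexpr double kDefaultBFloat16Rtol = 1e-2;

// Illustrative helper the tolerance macros could consult so that callers
// no longer need to pass rtol explicitly for reduced-precision dtypes.
// ScalarType here is the same type used in the test snippet above.
inline double default_rtol(ScalarType dtype) {
  switch (dtype) {
    case ScalarType::Half:
      return kDefaultHalfRtol;
    case ScalarType::BFloat16:
      return kDefaultBFloat16Rtol;
    default:
      return 1e-5; // the existing default rtol mentioned above
  }
}
```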
Why not introduce kDefaultHalfRtol and kDefaultBFloat16Rtol with value 1e-2?
Because not all operators require the higher rtol.
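To illustrate the trade-off (the second expectation and its tensors are hypothetical): making 1e-2 the type-wide default would relax the check for every operator, while passing the tolerance explicitly keeps a tighter bound for operators that can meet it.

```cpp
// The convolution_backward tests in this PR use the relaxed tolerance.
EXPECT_TENSOR_CLOSE_WITH_TOL(grad_input, expected_grad_input, 1e-2, 1e-8);

// A hypothetical operator that stays accurate in BFloat16: a 1e-2 type-wide
// default would silently accept results ~1% off, so an explicit, stricter
// tolerance (or the existing default) is preferable.
EXPECT_TENSOR_CLOSE_WITH_TOL(other_out, expected_other_out, 1e-3, 1e-8);
```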
It is not particularly uncommon to need to set rtol in pytorch core: https://github.com/search?q=repo%3Apytorch%2Fpytorch+%2Frtol%3D%5B1-9%5D%2F&type=code
Partial fix for #7748.