Add BF16 in FP8 quantize ops #1961
Conversation
This pull request was exported from Phabricator. Differential Revision: D47904459
Summary: Pull Request resolved: pytorch#1961

- Added output_dtype for half, bfloat16, and float as output in the dequantization functions; currently it's an integer value defined by Sparse_dtype (float: 0, half: 1, bfloat16: 5).
- Added type conversion in the quant and dequant kernels by using native CUDA/HIP functions for half-to-float conversion and writing everything explicitly.

Reviewed By: jianyuh
Differential Revision: D47904459
fbshipit-source-id: 41d3f0c50365d0482aab912c202f458a787419d8
Force-pushed 8e44a0d to facb7ed
Force-pushed facb7ed to 56e870d
Force-pushed 56e870d to 7d5b278
Force-pushed 7d5b278 to a6ee85a
Force-pushed a6ee85a to 19fb8e1
This pull request has been merged in 4920770.