INT4 XPU enabling #1577

Draft · wants to merge 1 commit into main

Conversation

airMeng (Contributor) commented Jan 17, 2025

This PR is currently a draft.

This PR adds two kinds of INT4 support on XPU, floating zero points and integer zero points, following the discussion in #1264.

Integer zero points, which are natively supported via oneDNN, are planned to be merged into the PyTorch main repo in pytorch/pytorch#137566.

Floating zero points are the default behaviour in this repo; the initial work has been done in the XPU operators (intel/torch-xpu-ops#1130), with more implementations on the way.
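
For context, here is a minimal sketch (not part of this PR's diff) of how the two zero-point flavours could be selected through torchao's int4 weight-only API once the XPU path lands. `Int4XPULayout` and `zero_point_domain` support on the XPU path are assumptions drawn from the description above, not confirmed APIs:

```python
# A minimal sketch, assuming the XPU path mirrors the existing CPU/CUDA API.
import torch
from torchao.quantization import int4_weight_only, quantize_
from torchao.quantization.quant_primitives import ZeroPointDomain
from torchao.dtypes import Int4XPULayout  # hypothetical: the layout this work would add

# Requires a PyTorch build with XPU support.
model = torch.nn.Sequential(torch.nn.Linear(1024, 1024)).to("xpu", torch.bfloat16)

# Integer zero points: the variant natively supported by oneDNN
# (pytorch/pytorch#137566).
quantize_(
    model,
    int4_weight_only(
        group_size=128,
        layout=Int4XPULayout(),  # hypothetical XPU layout
        zero_point_domain=ZeroPointDomain.INT,
    ),
)

# Floating zero points: the default behaviour in torchao, backed by the XPU
# operators from intel/torch-xpu-ops#1130.
# quantize_(model, int4_weight_only(group_size=128, layout=Int4XPULayout(),
#                                   zero_point_domain=ZeroPointDomain.FLOAT))
```

The only difference between the two configurations is where the zero point lives: `ZeroPointDomain.INT` matches what oneDNN consumes natively, while `ZeroPointDomain.FLOAT` matches the repo's existing default convention.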


pytorch-bot bot commented Jan 17, 2025

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1577

Note: Links to docs will display an error until the docs builds have been completed.

This comment was automatically generated by Dr. CI and updates every 15 minutes.

@airMeng marked this pull request as a draft on January 17, 2025 at 03:20.
@facebook-github-bot added the CLA Signed label on Jan 17, 2025 (this label is managed by the Facebook bot; authors need to sign the CLA before a PR can be reviewed).
@@ -46,6 +47,18 @@ def get_quantization_functions(
zero_point_domain=ZeroPointDomain.INT,
)
)
elif device == "xpu" and TORCH_VERSION_AT_LEAST_2_6:
Contributor: 2_7 or 2_6?

Author (airMeng): This depends on pytorch/pytorch#137566. It's really just a draft, not yet ready for review. I will ping you when it's ready :)

@@ -1079,6 +1084,8 @@ def test_int4_weight_only_quant_subclass_api_grouped(self, device, dtype):
layout_list = []
if device == "cpu" and TORCH_VERSION_AT_LEAST_2_6:
layout_list.append(Int4CPULayout())
elif device == "xpu" and TORCH_VERSION_AT_LEAST_2_6:
Contributor: Here as well, 2_6 or 2_7?


__torch_function__ = torch._C._disabled_torch_function_impl

def get_plain(self) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
@jerryzh168 (Contributor) commented Jan 17, 2025:

By the way, for this one we have an unpacking op for the tensor-core-tiled layout that we should really be using:

m.impl("torchao::unpack_tensor_core_tiled_layout", &_unpack_tensor_core_tiled_layout);
m.impl("torchao::dequantize_tensor_core_tiled_layout", &_dequantize_tensor_core_tiled_layout);

It might be better to do the same here instead of hacking with the quantize ops.
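
A minimal sketch of that suggestion (not code from this PR): call the dedicated unpack op directly in get_plain() instead of round-tripping through quantize/dequantize. The attribute names on `self` and the exact packing of scales and zero points are assumptions, and an XPU layout would need an analogous unpack op:

```python
# Sketch only: attribute names on `self` and the scales/zero-points storage
# are assumptions for illustration, not taken from this PR.
from typing import Tuple

import torch


def get_plain(self) -> Tuple[torch.Tensor, torch.Tensor, torch.Tensor]:
    # Recover the plain int4 values with the dedicated unpack op
    # (registered as torchao::unpack_tensor_core_tiled_layout) instead of
    # dequantizing and then re-quantizing the weight.
    int_data = torch.ops.torchao.unpack_tensor_core_tiled_layout(
        self.packed_weight, self.inner_k_tiles
    )
    # Scales and zero points are assumed to be stored together with a trailing
    # dimension of size 2, so unbind splits them back out.
    scale, zero_point = self.scale_and_zero.unbind(-1)
    return int_data, scale, zero_point
```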

Author (airMeng): Sure, I will check.

@jerryzh168 (Contributor): By the way, why is the op added in pytorch/pytorch#137566 instead of in torchao? Any plans to move it to torchao?

@airMeng (Contributor, Author) commented Jan 17, 2025:

> By the way, why is the op added in pytorch/pytorch#137566 instead of in torchao? Any plans to move it to torchao?

@mingfeima @EikanWang can you comment?

@mingfeima:

> By the way, why is the op added in pytorch/pytorch#137566 instead of in torchao? Any plans to move it to torchao?
>
> @mingfeima @EikanWang can you comment?

The situation for XPU (the Intel GPUs) is different from the CPU and CUDA cases here. I am not sure whether providing SYCL or oneDNN XPU ops in ao is a feasible solution.
