[bc-breaking] enable direct configuration in quantize_ #1595

vkuzo · 2025-01-22T16:49:12Z

summary

This PR enables passing per-workflow arguments to quantize_ directly, without wrapping them in a Callable.

Motivation: passing configuration directly is intuitive and widely used in similar contexts across various projects. Passing configuration wrapped in a callable is IMO not intuitive, hard to understand and debug, and we have evidence that it pushes a portion of users away from building on top of torchao.

We will keep the old callable syntax supported by quantize_ for one release cycle, and delete it afterwards. We will keep the old names as aliases for new names going forward (example: int4_weight_only as an alias of Int4WeightOnlyConfig) to keep existing callsites working without changes.
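As a rough illustration of the aliasing idea, the sketch below binds the old name to the new config class, so an existing callsite produces the same object as a new one. It assumes the alias is a plain name binding, which this description does not spell out, and it uses a simplified `Int4WeightOnlyConfig` with only the `group_size` field.

```python
from dataclasses import dataclass


@dataclass
class Int4WeightOnlyConfig:
    # Simplified stand-in: only the group_size field from the PR is modeled.
    group_size: int = 128


# Assumption: the BC alias is a plain name binding to the config class.
int4_weight_only = Int4WeightOnlyConfig

# An old-style callsite and a new-style callsite build equal config objects.
old_style = int4_weight_only(group_size=32)
new_style = Int4WeightOnlyConfig(group_size=32)

print(old_style == new_style)
print(isinstance(old_style, Int4WeightOnlyConfig))
```

With an alias of this kind, callers do not need to change anything for the old name to keep working.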

user facing API changes

signature of quantize_

#
# before
#
def quantize_(
    model: torch.nn.Module,
    apply_tensor_subclass: Callable[[torch.nn.Module], torch.nn.Module],
    ...,
): ...

#
# after - intermediate state, support both old and new for one release
#
def quantize_(
    model: torch.nn.Module,
    config: Union[AOBaseWorkflowConfig, Callable[[torch.nn.Module], torch.nn.Module]],
    ...,
): ...

#
# after - long term state
#
def quantize_(
    model: torch.nn.Module,
    config: AOBaseWorkflowConfig,
    ...,
): ...
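The intermediate signature implies that `quantize_` has to tell a config object apart from a legacy callable. The sketch below shows one way that could look. It is not the torchao implementation: `ToyModule` and `_apply_config` are hypothetical stand-ins so the example runs without torch, and the deprecation warning is an assumption about how the one-release transition might be signaled.

```python
import warnings
from typing import Callable, List, Union


class AOBaseWorkflowConfig:
    """Stand-in for the config base class named in this PR."""


class ToyModule:
    """Hypothetical stand-in for torch.nn.Module."""

    def __init__(self) -> None:
        self.applied: List[str] = []


def _apply_config(model: ToyModule, config: AOBaseWorkflowConfig) -> None:
    # Hypothetical helper: just record which config type was applied.
    model.applied.append(type(config).__name__)


def quantize_(
    model: ToyModule,
    config: Union[AOBaseWorkflowConfig, Callable[[ToyModule], ToyModule]],
) -> None:
    if isinstance(config, AOBaseWorkflowConfig):
        # New path: the configuration is plain data.
        _apply_config(model, config)
    elif callable(config):
        # Old path: kept for one release (the warning is an assumption).
        warnings.warn(
            "passing a callable to quantize_ is deprecated",
            DeprecationWarning,
            stacklevel=2,
        )
        config(model)
    else:
        raise TypeError(f"unsupported config: {config!r}")


def legacy_callable(module: ToyModule) -> ToyModule:
    module.applied.append("legacy callable")
    return module


m = ToyModule()
quantize_(m, AOBaseWorkflowConfig())  # new style
quantize_(m, legacy_callable)         # old style, still accepted
print(m.applied)
```

In the long-term state the `elif` branch would disappear and only config instances would be accepted.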

usage example

An example for int4_weight_only

#
# before
#
quantize_(m, int4_weight_only(group_size=32))

#
# after, with new user facing names
#
quantize_(m, Int4WeightOnlyConfig(group_size=32))

#
# AND, after, with BC names
#
quantize_(m, int4_weight_only(group_size=32))
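One practical difference between the two styles is how easy the configuration is to inspect. The sketch below contrasts a closure-based factory with a dataclass config. `int4_weight_only_old` is a hypothetical simplification of the old pattern, not the real torchao function, and the config class is reduced to a single field.

```python
from dataclasses import asdict, dataclass


def int4_weight_only_old(group_size: int = 128):
    # Old pattern (simplified): configuration lives in a closure variable.
    def apply(weight):
        return ("int4", group_size, weight)

    return apply


@dataclass
class Int4WeightOnlyConfig:
    # New pattern (simplified): configuration is plain, inspectable data.
    group_size: int = 128


old = int4_weight_only_old(group_size=32)
new = Int4WeightOnlyConfig(group_size=32)

print(old)          # opaque function repr, the group size is not visible
print(new)          # dataclass repr lists each field with its value
print(asdict(new))  # fields are also available as a plain dict
```

The closure still works as a transform, but its settings can only be recovered by digging into the function object, whereas the dataclass exposes them directly.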

developer facing changes

See the PR details for examples, but they can be summarized as:

#
# old
#

# quantize_ applies the callable returned by this function to each module of the model
def int4_weight_only(group_size: int, ...) -> Callable:

    def new_callable(weight: torch.Tensor):
        # configuration is captured here via local variables
        ...
        
    # return type is a Callable
    return _get_linear_subclass_inserter(new_callable)

#
# new
#

# config base class
class AOBaseWorkflowConfig(abc.ABC):
    pass

# user facing configuration of a workflow
@dataclass
class Int4WeightOnlyConfig(AOBaseWorkflowConfig):
    group_size: int = 128
    ...

# not user facing transform of a module according to a workflow's configuration
@register_quantize_module_handler(Int4WeightOnlyConfig)
def _int4_weight_only_transform(
    module: torch.nn.Module, 
    config: Int4WeightOnlyConfig,
) -> torch.nn.Module:
    # map to AQT, not user facing
    ...

current status

The current PR migrates three user facing workflows:

PTQ's int4_weight_only
QAT's intx_quantization_aware_training and from_intx_quantization_aware_training

I've chosen to migrate one PTQ and two QAT workflows to prove generality of the new flow, but avoid a high LOC in this PR to make it easier to review. We will migrate the rest of the workflows in future PRs, detailed below:

int8_dynamic_activation_int4_weight
int8_dynamic_activation_int8_weight
int8_dynamic_activation_int8_semi_sparse_weight
int8_weight_only
float8_weight_only
float8_dynamic_activation_float8_weight
float8_static_activation_float8_weight
uintx_weight_only
fpx_weight_only
gemlite_uintx_weight_only
callsites from the prototype folder

After a release cycle, we will delete the old callable syntax.

Test Plan:

pytest test/quantization/test_quant_api.py -s -x -k test_int4_weight_only_numerics
pytest test/quantization/test_qat.py -s -x -k test_quantize_api_standalone
pytest test/quantization/test_qat.py -s -x -k test_quantize_api_convert_path

vkuzo · 2025-01-22T16:49:13Z

Stack from ghstack (oldest at bottom):

-> [bc-breaking] enable direct configuration in quantize_ #1595

pytorch-bot · 2025-01-22T16:49:16Z

🔗 Helpful Links

🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/ao/1595

📄 Preview Python docs built from this PR

Note: Links to docs will display an error until the docs builds have been completed.

✅ No Failures

As of commit 94d9426 with merge base 32d9b0b:
💚 Looks good so far! There are no failures yet. 💚

This comment was automatically generated by Dr. CI and updates every 15 minutes.

Summary:

POC for:
* decoupling configuration from transformation
* stop passing obscure stateful callables around
* enable printing of configuration
* reduce amount of context switching to navigate the logic from `quantize_` to quantizing a single module

TODO more polish before wider discussion.

Test Plan:

```
pytest test/quantization/test_quant_api.py -s -x -k test_int4_weight_only_numerics
pytest test/quantization/test_qat.py -s -x -k test_quantize_api_standalone
pytest test/quantization/test_qat.py -s -x -k test_quantize_api_convert_path
```

ghstack-source-id: fb0703f88413bc06962dacde24ff6bb7cf0f3b19
ghstack-comment-id: 2607756510
Pull Request resolved: #1595

drisspg · 2025-01-23T16:57:29Z

torchao/core/config.py

+
+
+# directory location for this might need more polish
+class AOBaseWorkflowConfig(abc.ABC):


Super Nit: maybe just AOBaseConfig

Update

24114ce

[ghstack-poisoned]

facebook-github-bot added the CLA Signed label Jan 22, 2025

vkuzo changed the title ~~[wip] configs configs configs!~~ [rfc] enable direct configuration in quantize_, v2 Jan 22, 2025

vkuzo added the topic: bc-breaking label Jan 22, 2025

vkuzo mentioned this pull request Jan 22, 2025

[rfc] enable direct configuration in quantize_ #1585

Closed

Update

5b9d876

[ghstack-poisoned]

Update

1cea42f

[ghstack-poisoned]

Update

138883b

[ghstack-poisoned]

Update

ba045ea

[ghstack-poisoned]

Update

94d9426

[ghstack-poisoned]

vkuzo requested review from andrewor14, jerryzh168, drisspg and HDCharles January 23, 2025 16:15

vkuzo changed the title ~~[rfc] enable direct configuration in quantize_, v2~~ [bc-breaking] enable direct configuration in quantize_, v2 Jan 23, 2025

vkuzo changed the title ~~[bc-breaking] enable direct configuration in quantize_, v2~~ [bc-breaking] enable direct configuration in quantize_ Jan 23, 2025

drisspg reviewed Jan 23, 2025
