-
Notes
-
Task List
-
About the tuning space and tuning order.
Case 1. Use the default tuning space and tuning order
# for rtn
default_rtn_tuning_config = TuningConfig(quant_configs=get_default_rtn_quant_configs(), max_trials=100)
Case 2. The user specifies the tuning space and tuning order
# specify the tuning space
customized_rtn_configs = RTNWeightQuantConfig(weight_bits=[2, 4, 6, 8])
tuning_config = TuningConfig(quant_configs=customized_rtn_configs, max_trials=100)
# specify the tuning order with a customized sampler
def customized_sampler(config: RTNWeightQuantConfig) -> List[RTNWeightQuantConfig]:
    ...
customized_rtn_configs = RTNWeightQuantConfig(weight_bits=[2, 4, 6, 8])
customized_rtn_configs.set_sampler(customized_sampler)
tuning_config = TuningConfig(quant_configs=customized_rtn_configs, max_trials=100)
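To make Case 2 more concrete, here is a minimal sketch of how a list-valued config could be expanded into individual trial configs, and how a customized sampler could change the order in which they are tried. `ToyRTNConfig`, `expand`, and `high_bits_first` are hypothetical stand-ins written for illustration only; they are not the INC classes, and the real `set_sampler` contract may differ.

```python
from dataclasses import dataclass, field
from itertools import product
from typing import Callable, List, Optional, Union


@dataclass
class ToyRTNConfig:
    """Hypothetical stand-in for RTNWeightQuantConfig (not the INC class)."""

    weight_bits: Union[int, List[int]] = 4
    weight_sym: Union[bool, List[bool]] = True
    sampler: Optional[Callable[["ToyRTNConfig"], List["ToyRTNConfig"]]] = field(
        default=None, repr=False, compare=False
    )

    def set_sampler(self, sampler) -> None:
        self.sampler = sampler

    def expand(self) -> List["ToyRTNConfig"]:
        """Return one concrete config per point in the tuning space."""
        if self.sampler is not None:
            # A customized sampler fully decides the tuning order.
            return self.sampler(self)
        bits = self.weight_bits if isinstance(self.weight_bits, list) else [self.weight_bits]
        syms = self.weight_sym if isinstance(self.weight_sym, list) else [self.weight_sym]
        # Default order: plain cartesian product of the listed values.
        return [ToyRTNConfig(weight_bits=b, weight_sym=s) for b, s in product(bits, syms)]


def high_bits_first(config: ToyRTNConfig) -> List[ToyRTNConfig]:
    """Customized tuning order: try the widest bit width first."""
    bits = sorted(config.weight_bits, reverse=True)
    return [ToyRTNConfig(weight_bits=b, weight_sym=config.weight_sym) for b in bits]


space = ToyRTNConfig(weight_bits=[2, 4, 6, 8])
default_order = [c.weight_bits for c in space.expand()]

space.set_sampler(high_bits_first)
custom_order = [c.weight_bits for c in space.expand()]
```

In this toy version the default order follows the list as written, while the sampler reverses it so that the least aggressive setting is tried first.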
-
ConfigRegistry
@register_config(framework_name=FRAMEWORK_NAME, algo_name=GPTQ, priority=100)
# registered_configs
class ConfigRegistry:
    FRAMEWORK_NAME:
        ALGORITHM_NAME:
            PRIORITY
            CLS
config_registry = ConfigRegistry
config_registry.get_all_configs()
config_registry.get_sorted_configs()
1. Add `priority` to `register_config`.
2. Replace `registered_configs` with `config_registry`.
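A rough sketch of what such a registry could look like is given below. This is only an illustration of the idea under some assumptions: the nested-dict layout, the `framework_name` argument of `get_sorted_configs`, and the "highest priority first" ordering are my guesses, and `RTNConfig` / `GPTQConfig` are empty placeholder classes.

```python
from collections import defaultdict
from typing import Dict, List, NamedTuple, Type


class RegisteredConfig(NamedTuple):
    priority: int
    cls: Type


class ConfigRegistry:
    """Toy registry: framework name -> algorithm name -> (priority, class)."""

    _configs: Dict[str, Dict[str, RegisteredConfig]] = defaultdict(dict)

    @classmethod
    def register_config(cls, framework_name: str, algo_name: str, priority: int = 0):
        """Class decorator that records a config class together with its priority."""

        def decorator(config_cls: Type) -> Type:
            cls._configs[framework_name][algo_name] = RegisteredConfig(priority, config_cls)
            return config_cls

        return decorator

    @classmethod
    def get_all_configs(cls) -> Dict[str, Dict[str, RegisteredConfig]]:
        return {fwk: dict(algos) for fwk, algos in cls._configs.items()}

    @classmethod
    def get_sorted_configs(cls, framework_name: str) -> List[Type]:
        """Config classes of one framework, highest priority first (assumed order)."""
        entries = cls._configs.get(framework_name, {}).values()
        return [e.cls for e in sorted(entries, key=lambda e: e.priority, reverse=True)]


@ConfigRegistry.register_config(framework_name="torch", algo_name="rtn", priority=80)
class RTNConfig:
    """Placeholder config class."""


@ConfigRegistry.register_config(framework_name="torch", algo_name="gptq", priority=100)
class GPTQConfig:
    """Placeholder config class."""


sorted_names = [c.__name__ for c in ConfigRegistry.get_sorted_configs("torch")]
```

With the priorities above, `GPTQConfig` is returned before `RTNConfig`, which is the kind of ordering a default tuning space could rely on.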
-
About the implementation details of
-
Workspace
-
Enhance the
-
Default tuning space
-
Task list: https://github.com/orgs/intel/projects/54/views/1?filterQuery=
-
Divide the
-
Idea about simplifying the CI for common components.
# torch/3x/torch/test_common.py
Tests for common components.
Owner(s): ["module: common & auto-tune"]
These tests aim to assess the fundamental functionalities of common components and enhance code coverage.
Currently, there are three replicas of these tests, one in each framework's test folder. We could move them into a single folder
such as 'test/3x/common' and update the CI scripts so that each framework's CI includes it.
The folder structure:
.
├── 3x
│   ├── common  # <---- Newly added
│   ├── onnxrt
│   ├── tensorflow
│   └── torch
For each fwk CI:
onnxrt_included_folder:
├── 3x
│   ├── common
│   ├── onnxrt
tensorflow_included_folder:
├── 3x
│   ├── common
│   ├── tensorflow
torch_included_folder:
├── 3x
│   ├── common
│   ├── torch
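The per-framework selection above could be expressed in a CI helper roughly like this. It is only a sketch: the `test/3x` root and the helper name `included_folders` are assumptions, and the real CI scripts may pick up folders in a different way.

```python
from pathlib import PurePosixPath
from typing import List

TEST_ROOT = PurePosixPath("test/3x")  # assumed location of the 3.x tests
FRAMEWORKS = ("onnxrt", "tensorflow", "torch")


def included_folders(framework: str) -> List[str]:
    """Return the test folders one framework's CI should run: common + its own."""
    if framework not in FRAMEWORKS:
        raise ValueError(f"unknown framework: {framework!r}")
    return [str(TEST_ROOT / "common"), str(TEST_ROOT / framework)]


torch_included_folder = included_folders("torch")
```

The shared tests then live in one place, and each framework's CI still exercises them.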
-
Folder structure of INC 3.X
!!! Avoid creating a folder for just a single file !!!
├── fwk_name
│   ├── __init__.py
│   ├── quantization
│   │   ├── algorithm_entry.py
│   │   ├── autotune.py
│   │   ├── config.py
│   │   ├── __init__.py
│   │   └── quantize.py
│   ├── algorithms
│   │   ├── __init__.py
│   │   ├── smooth_quant
│   │   │   ├── __init__.py
│   │   │   ├── smooth_quant.py
│   │   │   └── utility.py
│   │   ├── static_quant
│   │   │   ├── __init__.py
│   │   │   ├── static_quant.py
│   │   │   └── utility.py
│   │   └── weight_only
│   │       ├── gptq.py
│   │       ├── __init__.py
│   │       └── rtn.py
│   └── utils
│       ├── constants.py
│       ├── __init__.py
│       └── utility.py
└── __init__.py
# * Note some code snippets
# neural_compressor/fwk_name/quantization/algorithm_entry.py
@register_algo(RTN)
def rtn_algo_entry():
    # import the algorithm inside the entry so it is only loaded when used
    from neural_compressor.fwk_name.algorithms import rtn
    ...
@register_algo(SMOOTH_QUANT)
def smooth_quant_entry():
    from neural_compressor.fwk_name.algorithms import smooth_quant
    ...
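For readers who have not seen this pattern, a decorator like `register_algo` can be sketched as follows. This is a self-contained toy under my own assumptions: `algos_mapping`, the entry signature `(model, config)`, and the string returned by the entry are all hypothetical and only serve to show the registration and lookup flow.

```python
from typing import Callable, Dict

# Hypothetical registry: algorithm name -> entry function.
algos_mapping: Dict[str, Callable] = {}

RTN = "rtn"


def register_algo(name: str):
    """Decorator that registers an algorithm entry under `name`."""

    def decorator(entry: Callable) -> Callable:
        algos_mapping[name] = entry
        return entry

    return decorator


@register_algo(RTN)
def rtn_algo_entry(model, config):
    # A real entry would import its algorithm module here (inside the function),
    # so the import cost is only paid when the algorithm is actually used.
    return f"{model} quantized with {RTN} using {config}"


# The quantize API can then dispatch on the algorithm name alone.
result = algos_mapping[RTN]("float_model", {"weight_bits": 4})
```

Keeping all entries in one `algorithm_entry.py` file means the dispatch table is filled by importing a single module.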
-
Enhance the autotune UT to replace log checking
-
Separate Smooth and Quant from smooth quantization [WIP]
# Option 1 (Current implementation)
from neural_compressor.torch.quantization import SmoothQuantConfig, quantize
sq_config = SmoothQuantConfig(alpha=0.5)
q_model = quantize(model=float_model, quant_config=sq_config)
# Option 2 (Recommended for v2.5)
# usage 1
from neural_compressor.torch.quantization import SmoothConfig, StaticQuantConfig, quantize
sq_config = SmoothConfig(alpha=0.5)
static_config = StaticQuantConfig(w_sym=False, w_algo="minmax", white_list=["linear1"])
q_model = quantize(model=float_model, quant_config=[sq_config, static_config])
# usage 2
from neural_compressor.torch.quantization import quantize, get_default_smooth_quant_config
q_model = quantize(model=float_model, quant_config=get_default_smooth_quant_config())
# Option 3 (Recommended for the future)
from neural_compressor.torch import SmoothConfig, optimize
sq_config = SmoothConfig(alpha=0.5)
# optimize: float model -> float model
optimized_model = optimize(model=float_model, optimize_config=sq_config)
from neural_compressor.torch import StaticQuantConfig, quantize
# quantize: float model -> quantized model
static_config = StaticQuantConfig(w_sym=False, w_algo="minmax", white_list=["linear1"])
q_model = quantize(model=optimized_model, quant_config=static_config)
-
Unify and extend
-
INC3 autotune Module Design
Goal
Design Overview
TuningConfig: Used by users to set the tuning space, tuning order, and stop conditions.
TuningLogger: Records the tuning process log so that validation teams can collect tuning results automatically.
ConfigLoader: Takes the config set and sampler, and yields quantization configurations one by one.
TuningMonitor: Records trial information and provides interfaces to check stop conditions.
Evaluator: Wraps user-provided evaluation functions into a unified interface to obtain the final evaluation score.
The autotune API
One of the major changes in INC 3 is the separation of quantization and autotune into two distinct APIs. Quantization APIs align with stock frameworks. The autotune API is the tuning interface used by all framework extensions, with framework-specific arguments. It accepts two common arguments: tune_config and eval_fns.
Usage Examples
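To show how the components listed in the design overview could fit together, here is a toy tuning loop. It is a sketch under stated assumptions, not the INC implementation: every `Toy*` class, the `tolerable_loss` stop condition, the `quantize_fn` argument, and the fake accuracy table are made up for illustration, and the toy returns the best config and score rather than a quantized model.

```python
from dataclasses import dataclass
from typing import Callable, Iterator, List, Optional, Tuple


@dataclass
class ToyTuningConfig:
    quant_configs: List[dict]      # the tuning space, already in tuning order
    max_trials: int = 100          # stop condition 1
    tolerable_loss: float = 0.01   # stop condition 2 (assumed)


def toy_config_loader(tune_config: ToyTuningConfig) -> Iterator[dict]:
    """Yield quantization configs one by one."""
    yield from tune_config.quant_configs


class ToyTuningMonitor:
    """Records trials and decides when tuning may stop."""

    def __init__(self, tune_config: ToyTuningConfig, baseline: float):
        self.tune_config = tune_config
        self.baseline = baseline
        self.history: List[Tuple[dict, float]] = []

    def add_trial(self, config: dict, score: float) -> None:
        self.history.append((config, score))

    def need_stop(self) -> bool:
        if not self.history:
            return False
        if len(self.history) >= self.tune_config.max_trials:
            return True
        _, last_score = self.history[-1]
        return self.baseline - last_score <= self.tune_config.tolerable_loss

    def best(self) -> Optional[Tuple[dict, float]]:
        return max(self.history, key=lambda t: t[1]) if self.history else None


def toy_autotune(model, tune_config: ToyTuningConfig, eval_fn: Callable, quantize_fn: Callable):
    monitor = ToyTuningMonitor(tune_config, baseline=eval_fn(model))
    for config in toy_config_loader(tune_config):
        monitor.add_trial(config, eval_fn(quantize_fn(model, config)))
        if monitor.need_stop():
            break
    return monitor.best()


# Fake model and metric: "accuracy" only depends on the bit width.
FAKE_ACCURACY = {32: 0.80, 8: 0.795, 4: 0.78, 2: 0.50}
best_config, best_score = toy_autotune(
    model={"bits": 32},
    tune_config=ToyTuningConfig(quant_configs=[{"weight_bits": b} for b in (2, 4, 8)]),
    eval_fn=lambda m: FAKE_ACCURACY[m["bits"]],
    quantize_fn=lambda m, cfg: {"bits": cfg["weight_bits"]},
)
```

In this run the 2-bit and 4-bit trials lose too much accuracy, so the loop continues until the 8-bit trial falls within the tolerable loss and tuning stops.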
Advanced Topics [WIP]
Define multiple evaluation functions
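One possible way to support several evaluation functions is to wrap them in an evaluator that combines their scores into a single number. The sketch below assumes a weighted sum and a list-of-dicts input with `eval_fn`, `weight`, and `name` keys; both the combination rule and the key names are assumptions for illustration, and `ToyEvaluator` is not the INC `Evaluator`.

```python
from typing import Callable, Dict, List, Union

EvalFn = Callable[[object], float]


class ToyEvaluator:
    """Wraps one or more evaluation functions behind a single evaluate() call."""

    def __init__(self, eval_fns: Union[EvalFn, List[Dict]]):
        if callable(eval_fns):
            # A single function is treated as a one-element list with weight 1.0.
            eval_fns = [{"eval_fn": eval_fns}]
        self.eval_fns = [
            {
                "eval_fn": item["eval_fn"],
                "weight": item.get("weight", 1.0),
                "name": item.get("name", item["eval_fn"].__name__),
            }
            for item in eval_fns
        ]

    def evaluate(self, model) -> float:
        """Final score: weighted sum of all evaluation functions (assumed rule)."""
        return sum(item["weight"] * item["eval_fn"](model) for item in self.eval_fns)


def eval_acc(model) -> float:
    return 0.75  # dummy metric


def eval_f1(model) -> float:
    return 0.5  # dummy metric


evaluator = ToyEvaluator(
    [
        {"eval_fn": eval_acc, "weight": 0.5, "name": "accuracy"},
        {"eval_fn": eval_f1, "weight": 0.5},
    ]
)
final_score = evaluator.evaluate(model=None)
```

Because the tuning loop only ever sees the final score, the stop conditions do not need to know how many evaluation functions the user supplied.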
The relationship between ConfigLoader, Sampler, ConfigSet, and XxxAlgoConfig.
Customize the tuning order
-- END