
Commit

Update README.md to add python formatting in code examples (microsoft…
imJunaidAfzal authored and LeetJoe committed Sep 15, 2023
1 parent 98954fd commit 841bfa9
Showing 1 changed file with 2 additions and 2 deletions.
4 changes: 2 additions & 2 deletions inference/huggingface/zero_inference/README.md
@@ -115,7 +115,7 @@ INT4 weight quantization can be easily enabled with a few lines of configuration

### Quantized Initialization
This is the easiest way to get started. By providing a few lines of hints in ds_config, the model is quantized on the fly during model initialization (e.g., AutoModel.from_pretrained). All candidate layers will be automatically quantized.
-```
+```python
ds_config = {
'weight_quantization': {
'quantized_initialization': {
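
The diff truncates the rest of this block. For reference, a complete quantized-initialization config might look roughly like the sketch below; the inner fields (num_bits, group_size, group_dim, symmetric) are assumptions for illustration, not part of this commit.

```python
# Sketch of a complete quantized-initialization config. The truncated diff
# above only shows the outer keys; the inner fields here are illustrative
# assumptions about the weight-quantization options.
ds_config = {
    'weight_quantization': {
        'quantized_initialization': {
            'num_bits': 4,       # INT4 weights
            'group_size': 64,    # size of each quantization group
            'group_dim': 1,      # dimension along which groups are formed
            'symmetric': False,  # asymmetric quantization with a zero point
        },
    },
}
```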
@@ -135,7 +135,7 @@ Currently, ZeRO-inference can quantize the weight matrix of nn.Embedding and nn.
### Post Initialization Quantization
In this mode, the model is first loaded in FP16 format and then converted into INT4. The advantage of enabling this mode is that users get an overview of the model architecture and therefore fine-grained control over quantization decisions, e.g., which layer should be quantized with which quantization configuration. Only a few lines of code changes are needed. Note that we plan to expand this mode to accommodate more formats in the near future.
-```
+```python
from deepspeed.compression.inference.quantization import _init_group_wise_weight_quantization
ds_config = {
'weight_quantization': {
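
The diff also truncates this block. Below is a hedged sketch of how the post-initialization path might be used end to end; only the import line is taken from the diff, while the 'post_init_quant' key, the layer-name pattern, and the model chosen here are illustrative assumptions.

```python
# Sketch of post-initialization quantization. The import comes from the
# diff above; the 'post_init_quant' key, the 'fc' layer pattern, and the
# model are assumptions for illustration.
import torch
from transformers import AutoModelForCausalLM

from deepspeed.compression.inference.quantization import (
    _init_group_wise_weight_quantization,
)

# Load the model in FP16 first, so its architecture can be inspected
# before deciding which layers to quantize.
model = AutoModelForCausalLM.from_pretrained(
    'facebook/opt-1.3b', torch_dtype=torch.float16
)

ds_config = {
    'weight_quantization': {
        'post_init_quant': {
            # Quantize only layers whose names match this pattern.
            'fc': {
                'num_bits': 4,
                'group_size': 64,
                'group_dim': 1,
                'symmetric': False,
            },
        },
    },
}

# Convert the selected FP16 weight matrices to INT4 in place.
model = _init_group_wise_weight_quantization(model, ds_config)
```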
