From 841bfa9cd78ffdf68915e62edc40aa6682b0c3c9 Mon Sep 17 00:00:00 2001
From: Junaid Afzal <54235418+imJunaidAfzal@users.noreply.github.com>
Date: Tue, 12 Sep 2023 20:50:58 +0500
Subject: [PATCH] Update README.md to add python formatting in code examples
 (#729)

---
 inference/huggingface/zero_inference/README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/inference/huggingface/zero_inference/README.md b/inference/huggingface/zero_inference/README.md
index f720414b8..d76de89cd 100644
--- a/inference/huggingface/zero_inference/README.md
+++ b/inference/huggingface/zero_inference/README.md
@@ -115,7 +115,7 @@ INT4 weight quantization can be easily enabled with a few lines of configuration
 
 ### Quantized Initialization
 This is the easiest way to getting started. By providing a few lines of hints in ds_config, the model will be on-the-fly quantized during model initialization (e.g., AutoModel.from_pretrained). All candidate layers will be automatically quantized.
-```
+```python
 ds_config = {
     'weight_quantization': {
         'quantized_initialization': {
@@ -135,7 +135,7 @@ Currently, ZeRO-inference can quantize the weight matrix of nn.Embedding and nn.
 
 ### Post Initialization Quantization
 In this mode, model is first loaded in FP16 format and then convert into INT4. The advantage of enabling this mode is that users will have an overview of the model architecture. Thus, they will have fine-grained control over the quantization decision. For example, which layer should be quantized with which quantization configuration can be controlled. Only a few lines of code changes are needed. Note that we plan to expand this mode to accommodate more formats in the near future.
-```
+```python
 from deepspeed.compression.inference.quantization import _init_group_wise_weight_quantization
 ds_config = {
     'weight_quantization': {
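
For context on what the retagged snippets configure, here is a minimal sketch of how the quantized-initialization hints shown in the first hunk might be used end to end. The `weight_quantization`/`quantized_initialization` keys come straight from the diff; the inner parameter values (`num_bits`, `group_size`), the `zero_optimization` block, and the model checkpoint are illustrative assumptions, not taken from the patch.

```python
# Illustrative sketch only: shows where the ds_config hints from the patched
# README would plug in. The 'weight_quantization'/'quantized_initialization'
# keys appear in the diff; the inner values, the zero_optimization block, and
# the model name are assumptions made for the sake of a runnable example.
import torch
from transformers import AutoModelForCausalLM
from transformers.deepspeed import HfDeepSpeedConfig

model_name = "facebook/opt-1.3b"  # assumed example checkpoint

ds_config = {
    'weight_quantization': {
        'quantized_initialization': {
            'num_bits': 4,     # assumed: INT4 weights
            'group_size': 64,  # assumed: per-group quantization granularity
        },
    },
    'zero_optimization': {'stage': 3},  # assumed: ZeRO-3 so init-time hooks fire
    'train_micro_batch_size_per_gpu': 1,
}

# Keeping the HfDeepSpeedConfig object alive signals transformers to apply the
# DeepSpeed hints during from_pretrained, so candidate layers are quantized
# on the fly while the weights are being loaded.
dschf = HfDeepSpeedConfig(ds_config)  # must remain in scope
with torch.no_grad():
    model = AutoModelForCausalLM.from_pretrained(model_name, torch_dtype=torch.float16)
```

The second hunk covers the post-initialization path: the README there loads the FP16 model first and then converts it using the imported `_init_group_wise_weight_quantization` helper together with a similar ds_config; the exact call signature is not shown in this patch and should be checked against the README and the DeepSpeed source.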