From aa4a5bc86c29575684ba19836ee34a8bed9b2237 Mon Sep 17 00:00:00 2001
From: wenhuach21 <wenhua.cheng@intel.com>
Date: Mon, 25 Nov 2024 09:14:23 +0800
Subject: [PATCH 1/6] fix typo

---
 README.md                                      | 18 +++++++++---------
 auto_round/mllm/README.md                      | 10 +++++-----
 ...md => Llama-3.2-11B-Vision-Instruct-sym.md} |  2 +-
 ...t_sym.md => Phi-3.5-vision-instruct-sym.md} |  2 +-
 ...ruct_sym.md => Qwen2-VL-7B-Instruct-sym.md} |  0
 ...ruct_sym.md => Qwen2.5-14B-Instruct-sym.md} |  0
 ...ruct_sym.md => Qwen2.5-32B-Instruct-sym.md} |  2 +-
 ...ruct_sym.md => Qwen2.5-72B-Instruct-sym.md} |  0
 ...truct_sym.md => Qwen2.5-7B-Instruct-sym.md} |  0
 ...B_sym.md => cogvlm2-llama3-chat-19B-sym.md} |  2 +-
 ...ava-v1.5-7b_sym.md => llava-v1.5-7b-sym.md} |  2 +-
 11 files changed, 19 insertions(+), 19 deletions(-)
 rename docs/{Llama-3.2-11B-Vision-Instruct_sym.md => Llama-3.2-11B-Vision-Instruct-sym.md} (99%)
 rename docs/{Phi-3.5-vision-instruct_sym.md => Phi-3.5-vision-instruct-sym.md} (99%)
 rename docs/{Qwen2-VL-7B-Instruct_sym.md => Qwen2-VL-7B-Instruct-sym.md} (100%)
 rename docs/{Qwen2.5-14B-Instruct_sym.md => Qwen2.5-14B-Instruct-sym.md} (100%)
 rename docs/{Qwen2.5-32B-Instruct_sym.md => Qwen2.5-32B-Instruct-sym.md} (99%)
 rename docs/{Qwen2.5-72B-Instruct_sym.md => Qwen2.5-72B-Instruct-sym.md} (100%)
 rename docs/{Qwen2.5-7B-Instruct_sym.md => Qwen2.5-7B-Instruct-sym.md} (100%)
 rename docs/{cogvlm2-llama3-chat-19B_sym.md => cogvlm2-llama3-chat-19B-sym.md} (99%)
 rename docs/{llava-v1.5-7b_sym.md => llava-v1.5-7b-sym.md} (99%)

diff --git a/README.md b/README.md
index fb272d9f..27aaaa04 100644
--- a/README.md
+++ b/README.md
@@ -312,16 +312,16 @@ release most of the models ourselves.
  Model | Supported |
 |----------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|
-| THUDM/cogvlm2-llama3-chinese-chat-19B | [recipe](./docs/cogvlm2-llama3-chat-19B_sym.md) |
-| Qwen/Qwen2-VL-Instruct | [recipe](./docs/Qwen2-VL-7B-Instruct_sym.md) |
-| meta-llama/Llama-3.2-11B-Vision | [recipe](./docs/Llama-3.2-11B-Vision-Instruct_sym.md) |
-| microsoft/Phi-3.5-vision-instruct | [recipe](./docs/Phi-3.5-vision-instruct_sym.md) |
-| liuhaotian/llava-v1.5-7b | [recipe](./docs/llava-v1.5-7b_sym.md) |
-| Qwen/Qwen2.5-7B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-7B-Instruct-AutoRound-GPTQ-asym-4bit), [recipe](./docs/Qwen2.5-7B-Instruct_sym.md) |
-| Qwen/Qwen2.5-14B-Instruct |[recipe](./docs/Qwen2.5-14B-Instruct_sym.md) |
-| Qwen/Qwen2.5-32B-Instruct |[recipe](./docs/Qwen2.5-32B-Instruct_sym.md) |
+| THUDM/cogvlm2-llama3-chinese-chat-19B | [recipe](./docs/cogvlm2-llama3-chat-19B-sym) |
+| Qwen/Qwen2-VL-Instruct | [recipe](./docs/Qwen2-VL-7B-Instruct-sym) |
+| meta-llama/Llama-3.2-11B-Vision | [recipe](./docs/Llama-3.2-11B-Vision-Instruct-sym) |
+| microsoft/Phi-3.5-vision-instruct | [recipe](./docs/Phi-3.5-vision-instruct-sym) |
+| liuhaotian/llava-v1.5-7b | [recipe](./docs/llava-v1.5-7b-sym) |
+| Qwen/Qwen2.5-7B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-7B-Instruct-AutoRound-GPTQ-asym-4bit), [recipe](./docs/Qwen2.5-7B-Instruct-sym) |
+| Qwen/Qwen2.5-14B-Instruct |[recipe](./docs/Qwen2.5-14B-Instruct-sym) |
+| Qwen/Qwen2.5-32B-Instruct |[recipe](./docs/Qwen2.5-32B-Instruct-sym) |
 | Qwen/Qwen2.5-Coder-32B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-Coder-32B-Instruct-AutoRound-GPTQ-4bit) |
-| Qwen/Qwen2.5-72B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-4bit), [model-kaitchup-autogptq-int2*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-2bit), [recipe](./docs/Qwen2.5-72B-Instruct_sym.md) |
+| Qwen/Qwen2.5-72B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-4bit), [model-kaitchup-autogptq-int2*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-2bit), [recipe](./docs/Qwen2.5-72B-Instruct-sym) |
 | meta-llama/Meta-Llama-3.1-70B-Instruct | [recipe](https://huggingface.co/Intel/Meta-Llama-3.1-70B-Instruct-int4-inc) |
 | meta-llama/Meta-Llama-3.1-8B-Instruct | [model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Meta-Llama-3.1-8B-Instruct-autoround-gptq-4bit-asym), [model-kaitchup-autogptq-sym-int4*](https://huggingface.co/kaitchup/Meta-Llama-3.1-8B-Instruct-autoround-gptq-4bit-sym), [recipe](https://huggingface.co/Intel/Meta-Llama-3.1-8B-Instruct-int4-inc) |
 | meta-llama/Meta-Llama-3.1-8B | [model-kaitchup-autogptq-sym-int4*](https://huggingface.co/kaitchup/Meta-Llama-3.1-8B-autoround-gptq-4bit-sym) |
diff --git a/auto_round/mllm/README.md b/auto_round/mllm/README.md
index 80078715..ca3ef68e 100644
--- a/auto_round/mllm/README.md
+++ b/auto_round/mllm/README.md
@@ -125,11 +125,11 @@ from auto_round import AutoRoundConfig ## must import for auto-round format
 
 For more details on quantization, inference, evaluation, and environment, see the following recipe:
 
-- [Qwen2-VL-7B-Instruct](../../docs/Qwen2-VL-7B-Instruct_sym.md)
-- [Llama-3.2-11B-Vision](../../docs/Llama-3.2-11B-Vision-Instruct_sym.md)
-- [Phi-3.5-vision-instruct](../../docs/Phi-3.5-vision-instruct_sym.md)
-- [llava-v1.5-7b](../../docs/llava-v1.5-7b_sym.md)
-- [cogvlm2-llama3-chat-19B](../../docs/cogvlm2-llama3-chat-19B_sym.md)
+- [Qwen2-VL-7B-Instruct](../../docs/Qwen2-VL-7B-Instruct-sym)
+- [Llama-3.2-11B-Vision](../../docs/Llama-3.2-11B-Vision-Instruct-sym)
+- [Phi-3.5-vision-instruct](../../docs/Phi-3.5-vision-instruct-sym)
+- [llava-v1.5-7b](../../docs/llava-v1.5-7b-sym)
+- [cogvlm2-llama3-chat-19B](../../docs/cogvlm2-llama3-chat-19B-sym)
diff --git a/docs/Llama-3.2-11B-Vision-Instruct_sym.md b/docs/Llama-3.2-11B-Vision-Instruct-sym.md
similarity index 99%
rename from docs/Llama-3.2-11B-Vision-Instruct_sym.md
rename to docs/Llama-3.2-11B-Vision-Instruct-sym.md
index 86a17ab9..c41a853a 100644
--- a/docs/Llama-3.2-11B-Vision-Instruct_sym.md
+++ b/docs/Llama-3.2-11B-Vision-Instruct-sym.md
@@ -106,7 +106,7 @@ auto-round-mllm --eval --model Intel/Llama-3.2-11B-Vision-Instruct-inc-private -
 ### Generate the model
 Here is the sample command to reproduce the model.
 ```bash
-pip install auto_round
+pip install auto-round
 
 auto-round-mllm --model meta-llama/Llama-3.2-11B-Vision-Instruct \
 --device 0 \
diff --git a/docs/Phi-3.5-vision-instruct_sym.md b/docs/Phi-3.5-vision-instruct-sym.md
similarity index 99%
rename from docs/Phi-3.5-vision-instruct_sym.md
rename to docs/Phi-3.5-vision-instruct-sym.md
index bb1f9423..3141f00c 100644
--- a/docs/Phi-3.5-vision-instruct_sym.md
+++ b/docs/Phi-3.5-vision-instruct-sym.md
@@ -118,7 +118,7 @@ auto-round-mllm --eval --model Intel/Qwen2-VL-7B-Instruct-inc-private --tasks MM
 ### Generate the model
 Here is the sample command to reproduce the model.
 ```bash
-pip install auto_round
+pip install auto-round
 
 auto-round-mllm --model microsoft/Phi-3.5-vision-instruct \
 --device 0 \
diff --git a/docs/Qwen2-VL-7B-Instruct_sym.md b/docs/Qwen2-VL-7B-Instruct-sym.md
similarity index 100%
rename from docs/Qwen2-VL-7B-Instruct_sym.md
rename to docs/Qwen2-VL-7B-Instruct-sym.md
diff --git a/docs/Qwen2.5-14B-Instruct_sym.md b/docs/Qwen2.5-14B-Instruct-sym.md
similarity index 100%
rename from docs/Qwen2.5-14B-Instruct_sym.md
rename to docs/Qwen2.5-14B-Instruct-sym.md
diff --git a/docs/Qwen2.5-32B-Instruct_sym.md b/docs/Qwen2.5-32B-Instruct-sym.md
similarity index 99%
rename from docs/Qwen2.5-32B-Instruct_sym.md
rename to docs/Qwen2.5-32B-Instruct-sym.md
index 7d7f24fb..277b2ab2 100644
--- a/docs/Qwen2.5-32B-Instruct_sym.md
+++ b/docs/Qwen2.5-32B-Instruct-sym.md
@@ -141,7 +141,7 @@ auto-round --model "Intel/Qwen2.5-32B-Instruct-int4-inc" --eval --eval_bs 16 --
 
 Here is the sample command to generate the model.
 
-For symmetric quantization, we found overflow/NAN will occur for some backends, so better fallback some layers. auto_round requires version >0.4.1
+For symmetric quantization, we found overflow/NAN will occur for some backends, so better fallback some layers. auto_round requires version > 0.3.1
 
 ```bash
 auto-round \
diff --git a/docs/Qwen2.5-72B-Instruct_sym.md b/docs/Qwen2.5-72B-Instruct-sym.md
similarity index 100%
rename from docs/Qwen2.5-72B-Instruct_sym.md
rename to docs/Qwen2.5-72B-Instruct-sym.md
diff --git a/docs/Qwen2.5-7B-Instruct_sym.md b/docs/Qwen2.5-7B-Instruct-sym.md
similarity index 100%
rename from docs/Qwen2.5-7B-Instruct_sym.md
rename to docs/Qwen2.5-7B-Instruct-sym.md
diff --git a/docs/cogvlm2-llama3-chat-19B_sym.md b/docs/cogvlm2-llama3-chat-19B-sym.md
similarity index 99%
rename from docs/cogvlm2-llama3-chat-19B_sym.md
rename to docs/cogvlm2-llama3-chat-19B-sym.md
index bb1601e0..b58b3e87 100644
--- a/docs/cogvlm2-llama3-chat-19B_sym.md
+++ b/docs/cogvlm2-llama3-chat-19B-sym.md
@@ -89,7 +89,7 @@ auto-round-mllm --lmms --model Intel/cogvlm2-llama3-chat-19B-inc-private --tasks
 ### Generate the model
 Here is the sample command to reproduce the model.
 ```bash
-pip install auto_round
+pip install auto-round
 
 auto-round-mllm --model THUDM/cogvlm2-llama3-chat-19B \
 --device 0 \
diff --git a/docs/llava-v1.5-7b_sym.md b/docs/llava-v1.5-7b-sym.md
similarity index 99%
rename from docs/llava-v1.5-7b_sym.md
rename to docs/llava-v1.5-7b-sym.md
index cbb614e0..633a65e0 100644
--- a/docs/llava-v1.5-7b_sym.md
+++ b/docs/llava-v1.5-7b-sym.md
@@ -95,7 +95,7 @@ auto-round-mllm --lmms --model Intel/llava-v1.5-7b-inc-private --tasks pope,text
 ### Generate the model
 Here is the sample command to reproduce the model.
 ```bash
-pip install auto_round
+pip install auto-round
 
 auto-round-mllm --model liuhaotian/llava-v1.5-7b \
 --device 0 \

From b647731835b2ce59384592780b8e679c9a1f35ba Mon Sep 17 00:00:00 2001
From: wenhuach21 <wenhua.cheng@intel.com>
Date: Mon, 25 Nov 2024 09:26:07 +0800
Subject: [PATCH 2/6] fix typo

---
 README.md | 84 +++++++++++++++++++++++++++----------------------------
 1 file changed, 42 insertions(+), 42 deletions(-)

diff --git a/README.md b/README.md
index 27aaaa04..18bb9e70 100644
--- a/README.md
+++ b/README.md
@@ -310,49 +310,49 @@ Please note that an asterisk (*) indicates third-party quantized models, which m
 different recipe. We greatly appreciate their efforts and encourage more users to share their models, as we cannot
 release most of the models ourselves.
 
- Model | Supported |
-|----------------------------------------|--------------------------------------------------------------------------------------------------------------------------------------------------------|
-| THUDM/cogvlm2-llama3-chinese-chat-19B | [recipe](./docs/cogvlm2-llama3-chat-19B-sym) |
-| Qwen/Qwen2-VL-Instruct | [recipe](./docs/Qwen2-VL-7B-Instruct-sym) |
-| meta-llama/Llama-3.2-11B-Vision | [recipe](./docs/Llama-3.2-11B-Vision-Instruct-sym) |
-| microsoft/Phi-3.5-vision-instruct | [recipe](./docs/Phi-3.5-vision-instruct-sym) |
-| liuhaotian/llava-v1.5-7b | [recipe](./docs/llava-v1.5-7b-sym) |
-| Qwen/Qwen2.5-7B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-7B-Instruct-AutoRound-GPTQ-asym-4bit), [recipe](./docs/Qwen2.5-7B-Instruct-sym) |
-| Qwen/Qwen2.5-14B-Instruct |[recipe](./docs/Qwen2.5-14B-Instruct-sym) |
-| Qwen/Qwen2.5-32B-Instruct |[recipe](./docs/Qwen2.5-32B-Instruct-sym) |
-| Qwen/Qwen2.5-Coder-32B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-Coder-32B-Instruct-AutoRound-GPTQ-4bit) |
-| Qwen/Qwen2.5-72B-Instruct |[model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-4bit), [model-kaitchup-autogptq-int2*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-2bit), [recipe](./docs/Qwen2.5-72B-Instruct-sym) |
-| meta-llama/Meta-Llama-3.1-70B-Instruct | [recipe](https://huggingface.co/Intel/Meta-Llama-3.1-70B-Instruct-int4-inc) |
+ Model | Supported |
+|----------------------------------------|-----------------------------------------------------------------------------------------------------------------------------------------------------------------|
+| THUDM/cogvlm2-llama3-chinese-chat-19B | [recipe](./docs/cogvlm2-llama3-chat-19B-sym.md) |
+| Qwen/Qwen2-VL-Instruct | [recipe](./docs/Qwen2-VL-7B-Instruct-sym.md) |
+| meta-llama/Llama-3.2-11B-Vision | [recipe](./docs/Llama-3.2-11B-Vision-Instruct-sym.md) |
+| microsoft/Phi-3.5-vision-instruct | [recipe](./docs/Phi-3.5-vision-instruct-sym.md) |
+| liuhaotian/llava-v1.5-7b | [recipe](./docs/llava-v1.5-7b-sym.md) |
+| Qwen/Qwen2.5-7B-Instruct | [model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-7B-Instruct-AutoRound-GPTQ-asym-4bit), [recipe](./docs/Qwen2.5-7B-Instruct-sym.md) |
+| Qwen/Qwen2.5-14B-Instruct | [recipe](./docs/Qwen2.5-14B-Instruct-sym.md) |
+| Qwen/Qwen2.5-32B-Instruct | [recipe](./docs/Qwen2.5-32B-Instruct-sym.md) |
+| Qwen/Qwen2.5-Coder-32B-Instruct | [model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-Coder-32B-Instruct-AutoRound-GPTQ-4bit) |
+| Qwen/Qwen2.5-72B-Instruct | [model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-4bit), [model-kaitchup-autogptq-int2*](https://huggingface.co/kaitchup/Qwen2.5-72B-Instruct-AutoRound-GPTQ-2bit), [recipe](./docs/Qwen2.5-72B-Instruct-sym.md) |
+| meta-llama/Meta-Llama-3.1-70B-Instruct | [recipe](https://huggingface.co/Intel/Meta-Llama-3.1-70B-Instruct-int4-inc) |
 | meta-llama/Meta-Llama-3.1-8B-Instruct | [model-kaitchup-autogptq-int4*](https://huggingface.co/kaitchup/Meta-Llama-3.1-8B-Instruct-autoround-gptq-4bit-asym), [model-kaitchup-autogptq-sym-int4*](https://huggingface.co/kaitchup/Meta-Llama-3.1-8B-Instruct-autoround-gptq-4bit-sym), [recipe](https://huggingface.co/Intel/Meta-Llama-3.1-8B-Instruct-int4-inc) |
-| meta-llama/Meta-Llama-3.1-8B | [model-kaitchup-autogptq-sym-int4*](https://huggingface.co/kaitchup/Meta-Llama-3.1-8B-autoround-gptq-4bit-sym) |
-| Qwen/Qwen-VL | [accuracy](./examples/multimodal-modeling/Qwen-VL/README.md), [recipe](./examples/multimodal-modeling/Qwen-VL/run_autoround.sh)
-| Qwen/Qwen2-7B | [model-autoround-sym-int4](https://huggingface.co/Intel/Qwen2-7B-int4-inc), [model-autogptq-sym-int4](https://huggingface.co/Intel/Qwen2-7B-int4-inc) |
-| THUDM/glm-4-9b-chat | [recipe](./docs/glm-4-9b-chat-recipe.md) |
-| Qwen/Qwen2-57B-A14B-Instruct | [model-autoround-sym-int4](https://huggingface.co/Intel/Qwen2-57B-A14B-Instruct-int4-inc),[model-autogptq-sym-int4](https://huggingface.co/Intel/Qwen2-57B-A14B-Instruct-int4-inc) |
-| 01-ai/Yi-1.5-9B | [model-LnL-AI-autogptq-int4*](https://huggingface.co/LnL-AI/Yi-1.5-9B-4bit-gptq-autoround) |
-| 01-ai/Yi-1.5-9B-Chat | [model-LnL-AI-autogptq-int4*](https://huggingface.co/LnL-AI/Yi-1.5-9B-Chat-4bit-gptq-autoround) |
-| Intel/neural-chat-7b-v3-3 | [model-autogptq-int4](https://huggingface.co/Intel/neural-chat-7b-v3-3-int4-inc) |
-| Intel/neural-chat-7b-v3-1 | [model-autogptq-int4](https://huggingface.co/Intel/neural-chat-7b-v3-1-int4-inc) |
-| TinyLlama-1.1B-intermediate | [model-LnL-AI-autogptq-int4*](https://huggingface.co/LnL-AI/TinyLlama-1.1B-intermediate-step-1341k-3T-autoround-lm_head-symFalse) |
-| mistralai/Mistral-7B-v0.1 | [model-autogptq-lmhead-int4](https://huggingface.co/Intel/Mistral-7B-v0.1-int4-inc-lmhead), [model-autogptq-int4](https://huggingface.co/Intel/Mistral-7B-v0.1-int4-inc) |
-| google/gemma-2b | [model-autogptq-int4](https://huggingface.co/Intel/gemma-2b-int4-inc) |
-| tiiuae/falcon-7b | [model-autogptq-int4-G64](https://huggingface.co/Intel/falcon-7b-int4-inc) |
-| sapienzanlp/modello-italia-9b | [model-fbaldassarri-autogptq-int4*](https://huggingface.co/fbaldassarri/modello-italia-9b-autoround-w4g128-cpu) |
-| microsoft/phi-2 | [model-autoround-sym-int4](https://huggingface.co/Intel/phi-2-int4-inc) [model-autogptq-sym-int4](https://huggingface.co/Intel/phi-2-int4-inc) |
-| microsoft/Phi-3.5-mini-instruct | [model-kaitchup-autogptq-sym-int4*](https://huggingface.co/kaitchup/Phi-3.5-Mini-instruct-AutoRound-4bit) |
-| microsoft/Phi-3-vision-128k-instruct | [recipe](./examples/multimodal-modeling/Phi-3-vision/run_autoround.sh)
-| mistralai/Mistral-7B-Instruct-v0.2 | [accuracy](./docs/Mistral-7B-Instruct-v0.2-acc.md), [recipe](./examples/language-modeling/scripts/Mistral-7B-Instruct-v0.2.sh) |
-| mistralai/Mixtral-8x7B-Instruct-v0.1 | [accuracy](./docs/Mixtral-8x7B-Instruct-v0.1-acc.md), [recipe](./examples/language-modeling/scripts/Mixtral-8x7B-Instruct-v0.1.sh) |
-| mistralai/Mixtral-8x7B-v0.1 | [accuracy](./docs/Mixtral-8x7B-v0.1-acc.md), [recipe](./examples/language-modeling/scripts/Mixtral-8x7B-v0.1.sh) |
-| meta-llama/Meta-Llama-3-8B-Instruct | [accuracy](./docs/Meta-Llama-3-8B-Instruct-acc.md), [recipe](./examples/language-modeling/scripts/Meta-Llama-3-8B-Instruct.sh) |
-| google/gemma-7b | [accuracy](./docs/gemma-7b-acc.md), [recipe](./examples/language-modeling/scripts/gemma-7b.sh) |
-| meta-llama/Llama-2-7b-chat-hf | [accuracy](./docs/Llama-2-7b-chat-hf-acc.md), [recipe](./examples/language-modeling/scripts/Llama-2-7b-chat-hf.sh) |
-| Qwen/Qwen1.5-7B-Chat | [accuracy](./docs/Qwen1.5-7B-Chat-acc.md), [sym recipe](./examples/language-modeling/scripts/Qwen1.5-7B-Chat-sym.sh), [asym recipe ](./examples/language-modeling/scripts/Qwen1.5-7B-Chat-asym.sh) |
-| baichuan-inc/Baichuan2-7B-Chat | [accuracy](./docs/baichuan2-7b-chat-acc.md), [recipe](./examples/language-modeling/scripts/baichuan2-7b-chat.sh) |
-| 01-ai/Yi-6B-Chat | [accuracy](./docs/Yi-6B-Chat-acc.md), [recipe](./examples/language-modeling/scripts/Yi-6B-Chat.sh) |
-| facebook/opt-2.7b | [accuracy](./docs/opt-2.7b-acc.md), [recipe](./examples/language-modeling/scripts/opt-2.7b.sh) |
-| bigscience/bloom-3b | [accuracy](./docs/bloom-3B-acc.md), [recipe](./examples/language-modeling/scripts/bloom-3b.sh) |
-| EleutherAI/gpt-j-6b | [accuracy](./docs/gpt-j-6B-acc.md), [recipe](./examples/language-modeling/scripts/gpt-j-6b.sh) |
+| meta-llama/Meta-Llama-3.1-8B | [model-kaitchup-autogptq-sym-int4*](https://huggingface.co/kaitchup/Meta-Llama-3.1-8B-autoround-gptq-4bit-sym) |
+| Qwen/Qwen-VL | [accuracy](./examples/multimodal-modeling/Qwen-VL/README.md), [recipe](./examples/multimodal-modeling/Qwen-VL/run_autoround.sh)
+| Qwen/Qwen2-7B | [model-autoround-sym-int4](https://huggingface.co/Intel/Qwen2-7B-int4-inc), [model-autogptq-sym-int4](https://huggingface.co/Intel/Qwen2-7B-int4-inc) |
+| THUDM/glm-4-9b-chat | [recipe](./docs/glm-4-9b-chat-recipe.md) |
+| Qwen/Qwen2-57B-A14B-Instruct | [model-autoround-sym-int4](https://huggingface.co/Intel/Qwen2-57B-A14B-Instruct-int4-inc),[model-autogptq-sym-int4](https://huggingface.co/Intel/Qwen2-57B-A14B-Instruct-int4-inc) |
+| 01-ai/Yi-1.5-9B | [model-LnL-AI-autogptq-int4*](https://huggingface.co/LnL-AI/Yi-1.5-9B-4bit-gptq-autoround) |
+| 01-ai/Yi-1.5-9B-Chat | [model-LnL-AI-autogptq-int4*](https://huggingface.co/LnL-AI/Yi-1.5-9B-Chat-4bit-gptq-autoround) |
+| Intel/neural-chat-7b-v3-3 | [model-autogptq-int4](https://huggingface.co/Intel/neural-chat-7b-v3-3-int4-inc) |
+| Intel/neural-chat-7b-v3-1 | [model-autogptq-int4](https://huggingface.co/Intel/neural-chat-7b-v3-1-int4-inc) |
+| TinyLlama-1.1B-intermediate | [model-LnL-AI-autogptq-int4*](https://huggingface.co/LnL-AI/TinyLlama-1.1B-intermediate-step-1341k-3T-autoround-lm_head-symFalse) |
+| mistralai/Mistral-7B-v0.1 | [model-autogptq-lmhead-int4](https://huggingface.co/Intel/Mistral-7B-v0.1-int4-inc-lmhead), [model-autogptq-int4](https://huggingface.co/Intel/Mistral-7B-v0.1-int4-inc) |
+| google/gemma-2b | [model-autogptq-int4](https://huggingface.co/Intel/gemma-2b-int4-inc) |
+| tiiuae/falcon-7b | [model-autogptq-int4-G64](https://huggingface.co/Intel/falcon-7b-int4-inc) |
+| sapienzanlp/modello-italia-9b | [model-fbaldassarri-autogptq-int4*](https://huggingface.co/fbaldassarri/modello-italia-9b-autoround-w4g128-cpu) |
+| microsoft/phi-2 | [model-autoround-sym-int4](https://huggingface.co/Intel/phi-2-int4-inc) [model-autogptq-sym-int4](https://huggingface.co/Intel/phi-2-int4-inc) |
+| microsoft/Phi-3.5-mini-instruct | [model-kaitchup-autogptq-sym-int4*](https://huggingface.co/kaitchup/Phi-3.5-Mini-instruct-AutoRound-4bit) |
+| microsoft/Phi-3-vision-128k-instruct | [recipe](./examples/multimodal-modeling/Phi-3-vision/run_autoround.sh)
+| mistralai/Mistral-7B-Instruct-v0.2 | [accuracy](./docs/Mistral-7B-Instruct-v0.2-acc.md), [recipe](./examples/language-modeling/scripts/Mistral-7B-Instruct-v0.2.sh) |
+| mistralai/Mixtral-8x7B-Instruct-v0.1 | [accuracy](./docs/Mixtral-8x7B-Instruct-v0.1-acc.md), [recipe](./examples/language-modeling/scripts/Mixtral-8x7B-Instruct-v0.1.sh) |
+| mistralai/Mixtral-8x7B-v0.1 | [accuracy](./docs/Mixtral-8x7B-v0.1-acc.md), [recipe](./examples/language-modeling/scripts/Mixtral-8x7B-v0.1.sh) |
+| meta-llama/Meta-Llama-3-8B-Instruct | [accuracy](./docs/Meta-Llama-3-8B-Instruct-acc.md), [recipe](./examples/language-modeling/scripts/Meta-Llama-3-8B-Instruct.sh) |
+| google/gemma-7b | [accuracy](./docs/gemma-7b-acc.md), [recipe](./examples/language-modeling/scripts/gemma-7b.sh) |
+| meta-llama/Llama-2-7b-chat-hf | [accuracy](./docs/Llama-2-7b-chat-hf-acc.md), [recipe](./examples/language-modeling/scripts/Llama-2-7b-chat-hf.sh) |
+| Qwen/Qwen1.5-7B-Chat | [accuracy](./docs/Qwen1.5-7B-Chat-acc.md), [sym recipe](./examples/language-modeling/scripts/Qwen1.5-7B-Chat-sym.sh), [asym recipe ](./examples/language-modeling/scripts/Qwen1.5-7B-Chat-asym.sh) |
+| baichuan-inc/Baichuan2-7B-Chat | [accuracy](./docs/baichuan2-7b-chat-acc.md), [recipe](./examples/language-modeling/scripts/baichuan2-7b-chat.sh) |
+| 01-ai/Yi-6B-Chat | [accuracy](./docs/Yi-6B-Chat-acc.md), [recipe](./examples/language-modeling/scripts/Yi-6B-Chat.sh) |
+| facebook/opt-2.7b | [accuracy](./docs/opt-2.7b-acc.md), [recipe](./examples/language-modeling/scripts/opt-2.7b.sh) |
+| bigscience/bloom-3b | [accuracy](./docs/bloom-3B-acc.md), [recipe](./examples/language-modeling/scripts/bloom-3b.sh) |
+| EleutherAI/gpt-j-6b | [accuracy](./docs/gpt-j-6B-acc.md), [recipe](./examples/language-modeling/scripts/gpt-j-6b.sh) |
 
 ## Integration

From 8fa6d62e689433555ac4045dc2cd24487868d930 Mon Sep 17 00:00:00 2001
From: wenhuach21 <wenhua.cheng@intel.com>
Date: Mon, 25 Nov 2024 09:31:38 +0800
Subject: [PATCH 3/6] fix typo

---
 auto_round/mllm/README.md | 10 +++++-----
 1 file changed, 5 insertions(+), 5 deletions(-)

diff --git a/auto_round/mllm/README.md b/auto_round/mllm/README.md
index ca3ef68e..dc7ff69b 100644
--- a/auto_round/mllm/README.md
+++ b/auto_round/mllm/README.md
@@ -125,11 +125,11 @@ from auto_round import AutoRoundConfig ## must import for auto-round format
 
 For more details on quantization, inference, evaluation, and environment, see the following recipe:
 
-- [Qwen2-VL-7B-Instruct](../../docs/Qwen2-VL-7B-Instruct-sym)
-- [Llama-3.2-11B-Vision](../../docs/Llama-3.2-11B-Vision-Instruct-sym)
-- [Phi-3.5-vision-instruct](../../docs/Phi-3.5-vision-instruct-sym)
-- [llava-v1.5-7b](../../docs/llava-v1.5-7b-sym)
-- [cogvlm2-llama3-chat-19B](../../docs/cogvlm2-llama3-chat-19B-sym)
+- [Qwen2-VL-7B-Instruct](../../docs/Qwen2-VL-7B-Instruct-sym.md)
+- [Llama-3.2-11B-Vision](../../docs/Llama-3.2-11B-Vision-Instruct-sym.md)
+- [Phi-3.5-vision-instruct](../../docs/Phi-3.5-vision-instruct-sym.md)
+- [llava-v1.5-7b](../../docs/llava-v1.5-7b-sym.md)
+- [cogvlm2-llama3-chat-19B](../../docs/cogvlm2-llama3-chat-19B-sym.md)

From b7ee3e29bc00214af0c84bf04e70e01933a74e8e Mon Sep 17 00:00:00 2001
From: wenhuach21 <wenhua.cheng@intel.com>
Date: Mon, 25 Nov 2024 09:52:22 +0800
Subject: [PATCH 4/6] update blog

---
 README.md | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/README.md b/README.md
index 18bb9e70..0bf08ac6 100644
--- a/README.md
+++ b/README.md
@@ -27,7 +27,7 @@ more accuracy data and recipes across various models.
 
 ## What's New
 * [2024/11] We provide experimental support for VLLM quantization, please check out [MLLM README](./auto_round/mllm/README.md)
-* [2024/11] We provide some tips and tricks for LLM&VLM quantization, please check out [this file](./docs/tips_and_tricks.md)
+* [2024/11] We provide some tips and tricks for LLM&VLM quantization, please check out [this blog](https://medium.com/@NeuralCompressor/10-tips-for-quantizing-llms-and-vlms-with-autoround-923e733879a7)
 * [2024/10] AutoRound has been integrated to [torch/ao](https://github.com/pytorch/ao), check out their [release note](https://github.com/pytorch/ao/releases/tag/v0.6.1)
 * [2024/10] Important update: We now support full-range symmetric quantization and have made it the default

From 9e52a9e49ef8807496dd86af30d78eba61a01811 Mon Sep 17 00:00:00 2001
From: wenhuach21 <wenhua.cheng@intel.com>
Date: Mon, 25 Nov 2024 09:53:57 +0800
Subject: [PATCH 5/6] update

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 0bf08ac6..9fea247f 100644
--- a/README.md
+++ b/README.md
@@ -10,7 +10,7 @@ AutoRound
 ---
 <div align="left">
 
-AutoRound is an advanced quantization algorithm for low-bits LLM inference. It's tailored for a wide range
+AutoRound is an advanced quantization algorithm for low-bits LLM/VLM inference. It's tailored for a wide range
 of models. AutoRound adopts sign gradient descent to fine-tune rounding values and minmax values of weights in just 200
 steps, which competes impressively against recent methods without introducing any additional inference overhead and
 keeping low
@@ -26,7 +26,7 @@ more accuracy data and recipes across various models.
 <div align="left">
 
 ## What's New
-* [2024/11] We provide experimental support for VLLM quantization, please check out [MLLM README](./auto_round/mllm/README.md)
+* [2024/11] We provide experimental support for VLLM quantization, please check out the [README](./auto_round/mllm/README.md)
 * [2024/11] We provide some tips and tricks for LLM&VLM quantization, please check out [this blog](https://medium.com/@NeuralCompressor/10-tips-for-quantizing-llms-and-vlms-with-autoround-923e733879a7)
 * [2024/10] AutoRound has been integrated to [torch/ao](https://github.com/pytorch/ao), check out their [release note](https://github.com/pytorch/ao/releases/tag/v0.6.1)

From 21ea08df4a31e3b4a49289744ad86465f011efe4 Mon Sep 17 00:00:00 2001
From: wenhuach21 <wenhua.cheng@intel.com>
Date: Mon, 25 Nov 2024 09:55:53 +0800
Subject: [PATCH 6/6] update

---
 README.md | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/README.md b/README.md
index 9fea247f..7bf68812 100644
--- a/README.md
+++ b/README.md
@@ -76,8 +76,8 @@ pip install auto-round[hpu]
 
 ### Basic Usage (Gaudi2/CPU/GPU)
 
-[//]: # (A user guide detailing the full list of supported arguments is provided by calling ```auto-round -h``` on the terminal.)
-Alternatively, you can use ```auto_round``` instead of ```auto-round```. Set the format you want in `format` and
+A user guide detailing the full list of supported arguments is provided by calling ```auto-round -h``` on the terminal.
+Set the format you want in `format` and
 multiple formats exporting has been supported. Please check out [step-by-step-instruction](./docs/step_by_step.md) for more details about calibration dataset or evaluation.
 
 ```bash
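The Qwen2.5-32B recipe touched by patch 1 notes that symmetric quantization can overflow or produce NaN on some backends, so some layers are better left at higher precision. A minimal sketch of full-range symmetric int4 quantization shows where the extreme value comes from; this is a simplified illustration in plain Python with made-up weights, not AutoRound's actual implementation (which works per-group on tensors with learned rounding):

```python
# Simplified sketch of full-range symmetric int4 quantization.
# Illustrative only: the weights below are hypothetical, and real
# quantizers operate per-group on tensors, not on Python lists.

def quantize_sym_int4(weights):
    """Map floats to int4 codes in [-8, 7] with one shared scale."""
    max_abs = max(abs(w) for w in weights)
    # Full-range symmetric: the scale targets -8, so the most negative
    # code -8 is used even though +8 is not representable.
    scale = max_abs / 8.0
    codes = [max(-8, min(7, round(w / scale))) for w in weights]
    return codes, scale

def dequantize(codes, scale):
    """Reconstruct approximate floats from int4 codes."""
    return [c * scale for c in codes]

weights = [0.06, -0.8, 0.31, -0.02, 0.64]
codes, scale = quantize_sym_int4(weights)
recon = dequantize(codes, scale)
print(codes)
# The weight with the largest magnitude lands exactly on code -8; a
# backend that effectively handles only [-7, 7] can overflow there,
# which is one reason a recipe may fall back some layers.
```

For the weights above this yields codes `[1, -8, 3, 0, 6]` with scale `0.1`; the `-8` code on the extreme weight is the value that narrow-range backends mishandle.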