initial check for the mi300x workload tuning guide update
minor fixes to formatting

fix spelling errors

more spelling

fixes

quantization update

fix format

simplify wording in tunableops and format fix

Apply suggestions from code review

review feedback by Peter

Co-authored-by: Peter Park <[email protected]>

Apply suggestions from code review

addressing feedback

Co-authored-by: Peter Park <[email protected]>

Apply suggestions from code review

feedback again

Co-authored-by: Peter Park <[email protected]>

add hipblaslt yaml file figure

feedback and minor formatting

formatting

update wordlist.txt

remove outdated sentence regarding fsdp and rccl

fmt

more fmt

fmt
hongxiayang authored and peterjunpark committed Dec 6, 2024
1 parent f53faa1 commit 87fa9fd
Showing 5 changed files with 610 additions and 300 deletions.
.wordlist.txt: 3 additions & 0 deletions
@@ -158,6 +158,8 @@ HWS
 Haswell
 Higgs
 Hyperparameters
+Huggingface
+ICD
 ICV
 IDE
 IDEs
@@ -455,6 +457,7 @@ avx
 awk
 backend
 backends
+benchmarked
 benchmarking
 bfloat
 bilinear
@@ -135,11 +135,13 @@ Installing vLLM
 {"text":["What is AMD Instinct?\nAmd Instinct is a brand new line of high-performance computing (HPC) processors from Advanced Micro Devices (AMD). These processors are designed to deliver unparalleled performance for HPC workloads, including scientific simulations, data analytics, and machine learning.\nThe Instinct lineup includes a range of processors, from the entry-level Inst"]}

-Refer to :ref:`mi300x-vllm-optimization` for performance optimization tips.

 .. seealso::

-   ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM
-   on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in the CSV
-   format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.
+   See :ref:`mi300x-vllm-optimization` for performance optimization tips.
+
+   ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM
+   on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in CSV
+   format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.

 .. _fine-tuning-llms-tgi:
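The sample JSON response in the diff above is the kind of output a vLLM API server returns when queried with a prompt such as "What is AMD Instinct?". Below is a minimal sketch of how such a request could be issued from Python, assuming a vLLM server is already running locally and exposing a /generate endpoint on port 8000; the URL, port, and payload fields depend on how the server was launched, so treat them as placeholders and adjust them for your setup.

```python
# Minimal sketch: send a prompt to a locally running vLLM API server and
# print its reply. The URL, port, and payload fields are assumptions that
# depend on how the server was launched; adjust them for your environment.
import requests

# Hypothetical local endpoint for a vLLM server listening on port 8000.
url = "http://localhost:8000/generate"
payload = {
    "prompt": "What is AMD Instinct?",
    "max_tokens": 128,
    "temperature": 0.0,
}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()

# The server responds with a JSON body of the form {"text": ["<prompt + completion>"]},
# matching the shape of the sample output shown in the diff above.
for text in response.json().get("text", []):
    print(text)
```

If the request succeeds, the printed text should correspond to the string inside the "text" list of the sample response shown in the diff.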