initial check for the mi300x workload tuning guide update
minor fixes to formatting

fix spelling errors

more spelling

fixes

quantization update

fix format

simplify wording in tunableops and format fix

Apply suggestions from code review

review feedback by Peter

Co-authored-by: Peter Park <[email protected]>

Apply suggestions from code review

addressing feedback

Co-authored-by: Peter Park <[email protected]>

Apply suggestions from code review

feedback again

Co-authored-by: Peter Park <[email protected]>

add hipblaslt yaml file figure

feedback and minor formatting

formatting

update wordlist.txt

remove outdated sentence regarding fsdp and rccl

fmt

more fmt

fmt
hongxiayang authored and peterjunpark committed Dec 6, 2024
1 parent f53faa1 commit 87fa9fd
Showing 5 changed files with 610 additions and 300 deletions.
.wordlist.txt: 3 additions & 0 deletions
@@ -158,6 +158,8 @@ HWS
 Haswell
 Higgs
 Hyperparameters
+Huggingface
+ICD
 ICV
 IDE
 IDEs
@@ -455,6 +457,7 @@ avx
 awk
 backend
 backends
+benchmarked
 benchmarking
 bfloat
 bilinear
@@ -135,11 +135,13 @@ Installing vLLM
 {"text":["What is AMD Instinct?\nAmd Instinct is a brand new line of high-performance computing (HPC) processors from Advanced Micro Devices (AMD). These processors are designed to deliver unparalleled performance for HPC workloads, including scientific simulations, data analytics, and machine learning.\nThe Instinct lineup includes a range of processors, from the entry-level Inst"]}

-Refer to :ref:`mi300x-vllm-optimization` for performance optimization tips.

 .. seealso::

-   ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM
-   on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in the CSV
-   format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.
+   See :ref:`mi300x-vllm-optimization` for performance optimization tips.
+
+   ROCm provides a prebuilt optimized Docker image for validating the performance of LLM inference with vLLM
+   on the MI300X accelerator. The Docker image includes ROCm, vLLM, PyTorch, and tuning files in CSV
+   format. For more information, see :doc:`/how-to/performance-validation/mi300x/vllm-benchmark`.

 .. _fine-tuning-llms-tgi:
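The sample JSON response in the diff above is the kind of output a vLLM API server returns when queried with a prompt such as "What is AMD Instinct?". Below is a minimal sketch of how such a request could be issued from Python, assuming a vLLM server is already running locally and exposing a /generate endpoint on port 8000; the URL, port, and payload fields depend on how the server was launched, so treat them as placeholders and adjust them for your setup.

```python
# Minimal sketch: send a prompt to a locally running vLLM API server and
# print its reply. The URL, port, and payload fields are assumptions that
# depend on how the server was launched; adjust them for your environment.
import requests

# Hypothetical local endpoint for a vLLM server listening on port 8000.
url = "http://localhost:8000/generate"
payload = {
    "prompt": "What is AMD Instinct?",
    "max_tokens": 128,
    "temperature": 0.0,
}

response = requests.post(url, json=payload, timeout=60)
response.raise_for_status()

# The server responds with a JSON body of the form {"text": ["<prompt + completion>"]},
# matching the shape of the sample output shown in the diff above.
for text in response.json().get("text", []):
    print(text)
```

If the request succeeds, the printed text should correspond to the string inside the "text" list of the sample response shown in the diff.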