MD lint the /docs/* dir (NVIDIA#597)
MD lint the /docs dir

Signed-off-by: Brent Salisbury <[email protected]>
nerdalert authored Mar 27, 2024
1 parent 275b858 commit 3652034
Showing 5 changed files with 184 additions and 66 deletions.
4 changes: 3 additions & 1 deletion .github/PULL_REQUEST_TEMPLATE.md
@@ -1,4 +1,6 @@
# Changes

**Which issue is resolved by this Pull Request:**
Resolves #

**Description of your changes:**
15 changes: 11 additions & 4 deletions docs/README.md
@@ -1,10 +1,17 @@
# Workflow PlantUML

The workflow figure is generated using [PlantUML](https://plantuml.com/ditaa)
with [ditaa](https://ditaa.sourceforge.net).
To generate it yourself, the easiest way is to install the
[PlantUML plugin in VS Code](https://marketplace.visualstudio.com/items?itemName=jebbs.plantuml)
(with its prerequisites installed), open the file, and click preview.

If you don't want to install the dependencies locally, you can use the following
settings to make the preview work with a remote renderer:

```json
"plantuml.render": "PlantUMLServer",
"plantuml.server": "https://www.plantuml.com/plantuml",
```
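
If you do have PlantUML (and a Java runtime) installed locally, a command-line
render along these lines should also work; `docs/workflow.puml` is a placeholder
name here, not necessarily the actual source file in this repository:

```shell
# Render the diagram next to its source file (placeholder path).
plantuml -tpng docs/workflow.puml
```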

[ASCIIFlow](https://asciiflow.com/#/) is a helpful tool to edit the source code.
36 changes: 24 additions & 12 deletions docs/containerization.md
@@ -1,10 +1,12 @@
# Putting `lab` in a Container AND making it go fast

Containerization of `lab` allows for portability and ease of setup. With this,
users can now run `lab` on OpenShift to test the speed of `lab train` and `generate`
using dedicated GPUs. This guide shows you how to put the `lab` CLI, all of its
dependencies, and your GPU into a container for an isolated and easily reproducible
experience.

## Steps to build an image then run a container

**Containerfile:**

@@ -30,25 +32,35 @@ CMD ["/bin/bash"]

Or image: TBD (am I allowed to have a public image with references to lab in it?)

This containerfile is based on Nvidia's CUDA image, which, luckily for us, plugs
directly into Podman via their `nvidia-container-toolkit`! The ubi9 base image
does not have most packages installed, so the bulk of the `containerfile` is spent
configuring your system so `lab` can be installed and run properly. Unlike Ubuntu,
ubi9 cannot install the entire nvidia-12-4 toolkit; this did not impact
performance during testing.

```shell
1. podman build --ssh=default -f <Containerfile_Path>
2. curl -s -L https://nvidia.github.io/libnvidia-container/stable/rpm/nvidia-container-toolkit.repo | sudo tee /etc/yum.repos.d/nvidia-container-toolkit.repo
3. sudo yum-config-manager --enable nvidia-container-toolkit-experimental
4. sudo dnf install -y nvidia-container-toolkit
5. sudo nvidia-ctk cdi generate --output=/etc/cdi/nvidia.yaml
6. nvidia-ctk cdi list
Example output:
INFO[0000] Found 2 CDI devices
nvidia.com/gpu=0
nvidia.com/gpu=all
7. podman run --device nvidia.com/gpu=0 --security-opt=label=disable -it <IMAGE_ID>
```

Voila! You now have a container with CUDA and GPUs enabled!
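
As a quick sanity check (a sketch, assuming the CDI setup above succeeded and
`<IMAGE_ID>` is the image you built in step 1), the toolkit injects `nvidia-smi`
into the container, so it should list the same GPU you see on the host:

```shell
# Should print the host's GPU table from inside the container.
podman run --rm --device nvidia.com/gpu=0 --security-opt=label=disable <IMAGE_ID> nvidia-smi
```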

### Sources

[Nvidia Container Toolkit Install Guide](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/install-guide.html)

[Podman Support for Container Device Interface](https://docs.nvidia.com/datacenter/cloud-native/container-toolkit/latest/cdi-support.html)

### Notes

Thanks to Taj Salawu for figuring out how to pass the git ssh keys properly!
29 changes: 18 additions & 11 deletions docs/converting_GGUF.md
@@ -1,9 +1,12 @@
<a name="model-convert-quant"></a>

# Optional: Converting a Model to GGUF and Quantizing

The latest [llama.cpp](https://github.com/ggerganov/llama.cpp) framework
requires the model to be converted into [GGUF](https://medium.com/@sandyeep70/ggml-to-gguf-a-leap-in-language-model-file-formats-cd5d3a6058f9)
format, a binary file format for storing (often quantized) models.
[Quantization](https://www.tensorops.ai/post/what-are-quantized-llms) is a
technique used to reduce the size of large neural networks, including large
language models (LLMs), by modifying the precision of their weights. If you have a
model already in GGUF format, you can skip this step.
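
As a rough illustration (approximate numbers, not taken from this repository): a
7-billion-parameter model stored at 16-bit precision occupies about 7B × 2 bytes
≈ 14 GB, while a 4-bit quantization of the same weights comes to roughly
7B × 0.5 bytes ≈ 3.5 GB, plus a small amount of metadata.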

## Clone the llama.cpp repository

@@ -42,7 +45,8 @@ def write(self):

## Convert a model to GGUF

The following command converts a Hugging Face model (safetensors) to [GGUF](https://github.com/ggerganov/ggml/blob/master/docs/gguf.md)
format and saves it in your model directory with a `.gguf` extension.

```shell
export MODEL_DIR={model_directory}
python convert-hf-to-gguf.py $MODEL_DIR --outtype f16
```

@@ -53,9 +57,10 @@

## Quantize

Optionally, for smaller and faster models (with a varying loss of quality),
use a quantized model.

### Make the llama.cpp binaries

Build binaries like `quantize` etc. for your environment.

@@ -65,15 +70,17 @@ make

#### Run quantize command

```shell
./quantize {model_directory}/{f16_gguf_model} <type>
```

For example, the following command converts the f16 GGUF model to a Q4_K_M
quantized model and saves it in your model directory with a `<type>.gguf`
suffix (e.g. ggml-model-Q4_K_M.gguf).

```shell
./quantize $MODEL_DIR/ggml-model-f16.gguf Q4_K_M
```

> Tip: Use `./quantize help` for a list of quantization types with their
> relative size and output quality along with additional usage parameters.
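
To see the size reduction on disk (assuming the file names produced by the
commands above), a quick comparison could look like this:

```shell
# Compare the f16 model against its Q4_K_M quantization.
ls -lh $MODEL_DIR/ggml-model-f16.gguf $MODEL_DIR/ggml-model-Q4_K_M.gguf
```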