diff --git a/.github/scripts/spellcheck_conf/wordlist.txt b/.github/scripts/spellcheck_conf/wordlist.txt
index 8bbc8da4a..ec01ea247 100644
--- a/.github/scripts/spellcheck_conf/wordlist.txt
+++ b/.github/scripts/spellcheck_conf/wordlist.txt
@@ -1432,4 +1432,4 @@ CPUs
 modelUpgradeExample
 guardrailing
 MaaS
-
+MFU
\ No newline at end of file
diff --git a/README.md b/README.md
index 007b3df37..aa08f1eef 100644
--- a/README.md
+++ b/README.md
@@ -1,6 +1,6 @@
 # Llama Recipes: Examples to get started using the Llama models from Meta
 
-The 'llama-recipes' repository is a companion to the [Meta Llama](https://github.com/meta-llama/llama-models) models. We support the latest version, [Llama 3.1](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md), in this repository. The goal is to provide a scalable library for fine-tuning Meta Llama models, along with some example scripts and notebooks to quickly get started with using the models in a variety of use-cases, including fine-tuning for domain adaptation and building LLM-based applications with Llama and other tools in the LLM ecosystem. The examples here showcase how to run Llama locally, in the cloud, and on-prem.
+The 'llama-recipes' repository is a companion to the [Meta Llama](https://github.com/meta-llama/llama-models) models. We support the latest version, [Llama 3.1](https://github.com/meta-llama/llama-models/blob/main/models/llama3_1/MODEL_CARD.md), in this repository. The goal is to provide a scalable library for fine-tuning Meta Llama models, along with some example scripts and notebooks to quickly get started with using the models in a variety of use-cases, including fine-tuning for domain adaptation and building LLM-based applications with Llama and other tools in the LLM ecosystem. The examples here showcase how to run Llama locally, in the cloud, and on-prem.
 
 > [!IMPORTANT]
@@ -31,7 +31,7 @@ The 'llama-recipes' repository is a companion to the [Meta Llama](https://github
 > ```
 > Each message gets trailed by an `<|eot_id|>` token before a new header is started, signaling a role change.
 >
-> More details on the new tokenizer and prompt template can be found [here](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1).
+> More details on the new tokenizer and prompt template can be found [here](https://llama.meta.com/docs/model-cards-and-prompt-formats/llama3_1).
 >
 > [!NOTE]
@@ -55,6 +55,7 @@ The 'llama-recipes' repository is a companion to the [Meta Llama](https://github
   - [Repository Organization](#repository-organization)
     - [`recipes/`](#recipes)
     - [`src/`](#src)
+  - [Supported Features](#supported-features)
   - [Contributing](#contributing)
   - [License](#license)
@@ -160,6 +161,30 @@ Contains modules which support the example recipes:
 | [utils](src/llama_recipes/utils/) | Utility files for:<br> - `train_utils.py` provides training/eval loop and more train utils.<br> - `dataset_utils.py` to get preprocessed datasets.<br> - `config_utils.py` to override the configs received from CLI.<br> - `fsdp_utils.py` provides FSDP wrapping policy for PEFT methods.<br> - `memory_utils.py` context manager to track different memory stats in train loop. |
 
+## Supported Features
+The recipes and modules in this repository support the following features:
+
+| Feature | Supported |
+| ---------------------------------------------- | --------- |
+| HF support for inference | ✅ |
+| HF support for finetuning | ✅ |
+| PEFT | ✅ |
+| Deferred initialization (meta init) | ✅ |
+| Low CPU mode for multi-GPU | ✅ |
+| Mixed precision | ✅ |
+| Single node quantization | ✅ |
+| Flash attention | ✅ |
+| Activation checkpointing (FSDP) | ✅ |
+| Hybrid Sharded Data Parallel (HSDP) | ✅ |
+| Dataset packing & padding | ✅ |
+| BF16 Optimizer (Pure BF16) | ✅ |
+| Profiling & MFU tracking | ✅ |
+| Gradient accumulation | ✅ |
+| CPU offloading | ✅ |
+| FSDP checkpoint conversion to HF for inference | ✅ |
+| W&B experiment tracker | ✅ |
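+
+Most of these features are enabled through flags on the finetuning entry point. The sketch below is illustrative rather than a tested command: flag names are assumed from the configs in [src/llama_recipes/configs](src/llama_recipes/configs/), and the model name and paths are placeholders. It combines PEFT (LoRA), FSDP with pure BF16, dataset packing, gradient accumulation, and W&B tracking on four GPUs:
+
+```bash
+# Illustrative multi-GPU launch; verify flag names against
+# src/llama_recipes/configs/ in your installed version.
+torchrun --nnodes 1 --nproc_per_node 4 recipes/quickstart/finetuning/finetuning.py \
+    --enable_fsdp \
+    --use_peft --peft_method lora \
+    --model_name meta-llama/Meta-Llama-3.1-8B \
+    --fsdp_config.pure_bf16 \
+    --batching_strategy packing \
+    --gradient_accumulation_steps 4 \
+    --use_wandb \
+    --output_dir ./peft_output
+```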
+
+
 ## Contributing
 Please read [CONTRIBUTING.md](CONTRIBUTING.md) for details on our code of conduct, and the process for submitting pull requests to us.
diff --git a/recipes/README.md b/recipes/README.md
index 9b5234eec..86d90b7e0 100644
--- a/recipes/README.md
+++ b/recipes/README.md
@@ -4,8 +4,8 @@ This folder contains examples organized by topic:
 | Subfolder | Description |
 |---|---|
-[quickstart](./quickstart)|The "Hello World" of using Llama 3, start here if you are new to using Llama 3
-[use_cases](./use_cases)|Scripts showing common applications of Llama 3
-[3p_integrations](./3p_integrations)|Partner-owned folder showing Meta Llama 3 usage along with third-party tools
+[quickstart](./quickstart)|The "Hello World" of using Llama, start here if you are new to using Llama
+[use_cases](./use_cases)|Scripts showing common applications of Llama
+[3p_integrations](./3p_integrations)|Partner-owned folder showing Llama usage along with third-party tools
 [responsible_ai](./responsible_ai)|Scripts to use PurpleLlama for safeguarding model outputs
-[experimental](./experimental)|Meta Llama implementations of experimental LLM techniques
+[experimental](./experimental)|Llama implementations of experimental LLM techniques
diff --git a/recipes/quickstart/README.md b/recipes/quickstart/README.md
index 4c82bfbbd..326cbdb29 100644
--- a/recipes/quickstart/README.md
+++ b/recipes/quickstart/README.md
@@ -2,28 +2,8 @@
 If you are new to developing with Meta Llama models, this is where you should start. This folder contains introductory-level notebooks across different techniques relating to Meta Llama.
 
-* The [Running_Llama3_Anywhere](./Running_Llama3_Anywhere/) notebooks demonstrate how to run Llama inference across Linux, Mac and Windows platforms using the appropriate tooling.
-* The [Prompt_Engineering_with_Llama_3](./Prompt_Engineering_with_Llama_3.ipynb) notebook showcases the various ways to elicit appropriate outputs from Llama. Take this notebook for a spin to get a feel for how Llama responds to different inputs and generation parameters.
+* The [Running_Llama_Anywhere](./Running_Llama3_Anywhere/) notebooks demonstrate how to run Llama inference across Linux, Mac and Windows platforms using the appropriate tooling.
+* The [Prompt_Engineering_with_Llama](./Prompt_Engineering_with_Llama_3.ipynb) notebook showcases the various ways to elicit appropriate outputs from Llama. Take this notebook for a spin to get a feel for how Llama responds to different inputs and generation parameters.
 * The [inference](./inference/) folder contains scripts to deploy Llama for inference on server and mobile. See also [3p_integrations/vllm](../3p_integrations/vllm/) and [3p_integrations/tgi](../3p_integrations/tgi/) for hosting Llama on open-source model servers.
-* The [RAG](./RAG/) folder contains a simple Retrieval-Augmented Generation application using Llama 3.
-* The [finetuning](./finetuning/) folder contains resources to help you finetune Llama 3 on your custom datasets, for both single- and multi-GPU setups. The scripts use the native llama-recipes finetuning code found in [finetuning.py](../../src/llama_recipes/finetuning.py) which supports these features:
-
-| Feature | |
-| ---------------------------------------------- | - |
-| HF support for finetuning | ✅ |
-| Deferred initialization ( meta init) | ✅ |
-| HF support for inference | ✅ |
-| Low CPU mode for multi GPU | ✅ |
-| Mixed precision | ✅ |
-| Single node quantization | ✅ |
-| Flash attention | ✅ |
-| PEFT | ✅ |
-| Activation checkpointing FSDP | ✅ |
-| Hybrid Sharded Data Parallel (HSDP) | ✅ |
-| Dataset packing & padding | ✅ |
-| BF16 Optimizer ( Pure BF16) | ✅ |
-| Profiling & MFU tracking | ✅ |
-| Gradient accumulation | ✅ |
-| CPU offloading | ✅ |
-| FSDP checkpoint conversion to HF for inference | ✅ |
-| W&B experiment tracker | ✅ |
+* The [RAG](./RAG/) folder contains a simple Retrieval-Augmented Generation application using Llama.
+* The [finetuning](./finetuning/) folder contains resources to help you finetune Llama on your custom datasets, for both single- and multi-GPU setups. The scripts use the native llama-recipes finetuning code found in [finetuning.py](../../src/llama_recipes/finetuning.py), which supports the features listed under [Supported Features](../../README.md#supported-features) in the main README; a minimal launch is sketched below.
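+
+A minimal sketch of a single-GPU LoRA finetuning run is shown below. The flag names are assumed from the training config in [src/llama_recipes/configs](../../src/llama_recipes/configs/), and the model name and output path are placeholders, so adjust them to your setup:
+
+```bash
+# Illustrative sketch, not a tested command: single-GPU LoRA finetuning
+# with the repository's default dataset settings. Verify flag names against
+# src/llama_recipes/configs/training.py in your installed version.
+python -m llama_recipes.finetuning \
+    --use_peft --peft_method lora \
+    --model_name meta-llama/Meta-Llama-3.1-8B-Instruct \
+    --output_dir ./peft_checkpoint \
+    --num_epochs 1 \
+    --batch_size_training 1
+```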