
[LLM] Llama 3.1 serving example #3780

Merged: 10 commits, Jul 25, 2024
Conversation

romilbhardwaj (Collaborator):
Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh

@concretevitamin (Member) left a comment:
Awesome @romilbhardwaj, some quick comments.

  • Add to the docs and repo README's LLM lists too, for both finetuning and serving?

@@ -39,6 +39,7 @@ Contents
DBRX (Databricks) <llms/dbrx>
Llama-2 (Meta) <llms/llama-2>
Llama-3 (Meta) <llms/llama-3>
Llama-3.1 (Meta) <llms/llama-3_1>
Member (inline comment):
We should add to docs home page's LLM list too.

Resolved review threads:

  • docs/source/_static/custom.js
  • llm/llama-3_1/README.md (3 threads, outdated)
romilbhardwaj (Collaborator, Author):
Thanks, resolved comments.


## Serving Llama 3.1 on your infra

We will first test the model on a GPU dev node, then package it for deployment using SkyPilot.
Collaborator (inline comment):
Can we add a highlighted sentence linking to the complete deployment YAML, so people can jump directly to that section?

Suggested change
We will first test the model on a GPU dev node, then package it for deployment using SkyPilot.
We offer a step-by-step guide on using SkyPilot to first test a new model on a GPU dev node, and then package it for deployment. **For the complete deployment guide for Llama 3.1, see [Step 3: Package and deploy using SkyPilot](#Step-3-Package-and-deploy-using-SkyPilot).**
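For context, the serving example this PR adds is packaged as a SkyPilot task YAML. The following is a minimal illustrative sketch of such a task, not the PR's actual `llm/llama-3_1` files; the model name, accelerator count, and port are assumptions:

```yaml
# Illustrative SkyPilot task sketch (model, GPU, and port are assumptions).
envs:
  MODEL_NAME: meta-llama/Meta-Llama-3.1-8B-Instruct
  HF_TOKEN: ""          # set via: sky launch --env HF_TOKEN=<your token>

resources:
  accelerators: A100:1  # assumption; larger Llama 3.1 variants need more GPUs
  ports: 8081           # expose the inference endpoint

setup: |
  pip install vllm

run: |
  # Serve the model behind vLLM's OpenAI-compatible API server.
  python -m vllm.entrypoints.openai.api_server \
    --model $MODEL_NAME --port 8081
```

A task like this can be launched with `sky launch <task.yaml>`, which is the "package it for deployment" step the README excerpt above refers to.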

@romilbhardwaj romilbhardwaj added this pull request to the merge queue Jul 25, 2024
Merged via the queue into master with commit b02a66d Jul 25, 2024
20 checks passed
@romilbhardwaj romilbhardwaj deleted the llama-31 branch July 25, 2024 01:34