
[LLM] Llama 3.1 serving example #3780

Merged: 10 commits, Jul 25, 2024
Conversation

romilbhardwaj (Collaborator):
Tested (run the relevant ones):

  • Code formatting: bash format.sh
  • Any manual or new tests for this PR (please specify below)
  • All smoke tests: pytest tests/test_smoke.py
  • Relevant individual smoke tests: pytest tests/test_smoke.py::test_fill_in_the_name
  • Backward compatibility tests: conda deactivate; bash -i tests/backward_compatibility_tests.sh

@concretevitamin (Member) left a comment:
Awesome @romilbhardwaj, some quick comments.

  • Add to the docs and repo README's LLM lists too, for both finetuning and serving?

@@ -39,6 +39,7 @@ Contents
DBRX (Databricks) <llms/dbrx>
Llama-2 (Meta) <llms/llama-2>
Llama-3 (Meta) <llms/llama-3>
Llama-3.1 (Meta) <llms/llama-3_1>
Member (inline comment):
We should add to docs home page's LLM list too.

Resolved review threads:

  • docs/source/_static/custom.js
  • llm/llama-3_1/README.md (3 threads, outdated)
romilbhardwaj (Collaborator, Author):
Thanks, resolved comments.


## Serving Llama 3.1 on your infra

We will first test the model on a GPU dev node, then package it for deployment using SkyPilot.
Collaborator (inline comment):
Can we add a highlighted sentence linking to the complete deployment YAML, so people can jump directly to that section?

Suggested change
We will first test the model on a GPU dev node, then package it for deployment using SkyPilot.
We offer a step-by-step guide on using SkyPilot to first test a new model on a GPU dev node, and then package it for deployment. **For the complete deployment guide for Llama 3.1, see [Step 3: Package and deploy using SkyPilot](#Step-3-Package-and-deploy-using-SkyPilot).**
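For context, the serving example this PR adds is packaged as a SkyPilot task YAML. The following is a minimal illustrative sketch of such a task, not the PR's actual `llm/llama-3_1` files; the model name, accelerator count, and port are assumptions:

```yaml
# Illustrative SkyPilot task sketch (model, GPU, and port are assumptions).
envs:
  MODEL_NAME: meta-llama/Meta-Llama-3.1-8B-Instruct
  HF_TOKEN: ""          # set via: sky launch --env HF_TOKEN=<your token>

resources:
  accelerators: A100:1  # assumption; larger Llama 3.1 variants need more GPUs
  ports: 8081           # expose the inference endpoint

setup: |
  pip install vllm

run: |
  # Serve the model behind vLLM's OpenAI-compatible API server.
  python -m vllm.entrypoints.openai.api_server \
    --model $MODEL_NAME --port 8081
```

A task like this can be launched with `sky launch <task.yaml>`, which is the "package it for deployment" step the README excerpt above refers to.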

@romilbhardwaj romilbhardwaj added this pull request to the merge queue Jul 25, 2024
Merged via the queue into master with commit b02a66d Jul 25, 2024
20 checks passed
@romilbhardwaj romilbhardwaj deleted the llama-31 branch July 25, 2024 01:34