Change post training run.yaml inference config #710
Merged
Context
Colab notebooks provide limited free access to a T4 GPU.
Making the post-training template work end to end on a Colab T4 is critical for early adoption of the stack's post-training APIs. However, we found that the existing LlamaModelParallelGenerator (https://github.com/meta-llama/llama-stack/blob/main/llama_stack/providers/inline/inference/meta_reference/inference.py#L82) in the meta-reference inference implementation isn't compatible with a T4 machine.
In this PR, we disable create_distributed_process_group for the inference API in the post-training run.yaml config and set up the distributed environment variables in the notebook, which makes meta-reference inference compatible with the free T4 machine.
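
For reference, a minimal sketch of the run.yaml change. The create_distributed_process_group flag comes from the meta-reference inference config; the surrounding provider fields and the model name here are illustrative, not the exact contents of the post-training template:

```yaml
providers:
  inference:
    - provider_id: meta-reference-inference   # illustrative provider id
      provider_type: inline::meta-reference
      config:
        model: Llama3.2-3B-Instruct           # illustrative model choice
        # Skip spawning the model-parallel process group, which does not
        # work on a single free-tier T4.
        create_distributed_process_group: false
```

The distributed environment variables can then be set in the notebook before starting the stack. This is a sketch using the standard torch.distributed variables for a single-process, single-GPU run; the exact set the notebook exports may differ:

```python
import os

# Standard torch.distributed env vars for a single-process, single-GPU setup,
# assumed to be needed because the single-process inference path still reads
# them even with create_distributed_process_group disabled.
os.environ["MASTER_ADDR"] = "localhost"
os.environ["MASTER_PORT"] = "29500"  # any free port
os.environ["RANK"] = "0"
os.environ["WORLD_SIZE"] = "1"
os.environ["LOCAL_RANK"] = "0"
```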
Test
Tested with the WIP post-training showcase Colab notebook: https://colab.research.google.com/drive/1K4Q2wZq232_Bpy2ud4zL9aRxvCWAwyQs?usp=sharing