[WIP] Generate base class for better integration of distributed inference #1355
Conversation
🔗 Helpful Links
🧪 See artifacts and rendered test results at hud.pytorch.org/pr/pytorch/torchchat/1355
Note: Links to docs will display an error until the docs builds have been completed.
✅ No Failures as of commit 144c81c with merge base 2fcc37c.
This comment was automatically generated by Dr. CI and updates every 15 minutes.
Rename SingleGpuGenerator -> LocalGenerator, or some other name that reflects the broader use of this class, please?
Hi @Jack-Khuu, it would be great to get your opinion on this PR as well, because it does some more aggressive refactoring to integrate distributed inference. It is only a step in that direction, to keep the PR size small. The final goal is to have LocalGenerator and DistributedGenerator use a unified interface. Then we can reuse the code for preparing the data etc. by sharing the chat() method, which would eventually be raised into the base class as well. This branch contains the next PR, where I aligned the generate() interface. The next step is to refactor chat() to support both LocalGenerator and DistributedGenerator.
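For orientation, here is a minimal sketch of the hierarchy this is aiming for. Apart from the class names Generator, LocalGenerator, and DistributedGenerator, all method signatures below are illustrative assumptions rather than the actual torchchat API:

from abc import ABC, abstractmethod
from typing import Iterator

import torch


class Generator(ABC):
    """Owns the shared flow; subclasses provide backend-specific generation."""

    @abstractmethod
    def generate(self, prompt: str, max_new_tokens: int) -> Iterator[torch.Tensor]:
        ...

    def chat(self, prompt: str, max_new_tokens: int = 256) -> None:
        # Prompt preparation and output handling would be shared here once the
        # generate() interfaces of both backends are aligned.
        for _token in self.generate(prompt, max_new_tokens):
            pass  # decode/print tokens


class LocalGenerator(Generator):
    # Single-device generation (the class previously named SingleGpuGenerator).
    def generate(self, prompt: str, max_new_tokens: int) -> Iterator[torch.Tensor]:
        yield from ()  # placeholder for the existing local decoding loop


class DistributedGenerator(Generator):
    # Generation coordinated across distributed ranks.
    def generate(self, prompt: str, max_new_tokens: int) -> Iterator[torch.Tensor]:
        yield from ()  # placeholder for the distributed decoding loop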
Thanks for putting up the PR @mreso. Conceptually this PR makes sense and will make the distributed integration smoother, though I have ideas to refactor the overall Generator architecture during Q1. The main idea is that I want to avoid abstraction classes wherever possible, since they hurt "copy and pastability". To temporarily avoid code duplication and move the distributed integration along, we can ride with the LocalGenerator concept for now. The CI is currently borked by what I suspect is an Nvidia bug: huggingface/diffusers#9704. But I'll force-land this PR if I find the CI fix non-trivial.
CI is fixed. The changes with LocalGenerator look intuitive to me, and I see that generate is going to be pushed up a layer into Generator in your next PR, which is great. Left one question about _gen_model_input below.
def _gen_model_input(
    self,
    prompt: Union[str | List[Any]],
    image_prompts: Optional[List[str | Image.Image]] = None,
    max_new_tokens: Optional[int] = None,
    max_seq_len: Optional[int] = 2048,
) -> Tuple[torch.Tensor, Optional[Dict[str, Any]]]:
Something similar to a generator base class was something we were looking into for Q4/H1, so I'm glad to see the initial work here lining up with my mental model.
Is _gen_model_input used in the DistributedGenerator? If not, let's keep it in the LocalGenerator so that Generator is more succinct.
We can be more aggressive about leaving things in LocalGenerator and keeping Generator thin, but that's something we can do later (_gen_model_input should be moved, though, unless there's a need for it in the base class).
Thanks for the review and feedback. The "copy and pastability" design aspect of TorchChat got through to me via the recent discussions. Regarding _gen_model_input: I was planning to move into Generator only the things I could reuse in the distributed path. In my dev branch I actually pulled chat() into the base class to reuse it, so _gen_model_input would need to live in Generator as well. I think in the end you could have only load_model, decode_one_token, decode_n_tokens, prefill, and some model properties in the Local/DistributedGenerator, which in some sense would make it "copy and pastable" again, as everything else lives in Generator.
Let's hold off on merging this, and let me look into a more "copy and pastable" approach. As we'll end up with only model specifics in the Local/DistributedGenerator, we might get away with just touching the model level.
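As a rough illustration of that end state (the hook signatures below and the builder_args parameter are hypothetical assumptions, not the final API), the subclasses would shrink to the model-facing hooks while everything else sits in the base class:

from abc import ABC, abstractmethod


class Generator(ABC):
    # Shared here: chat(), generate(), _gen_model_input(), prompt/output handling.

    @abstractmethod
    def load_model(self, builder_args):
        ...

    @abstractmethod
    def prefill(self, tokens, input_pos):
        ...

    @abstractmethod
    def decode_one_token(self, cur_token, input_pos):
        ...

    @abstractmethod
    def decode_n_tokens(self, cur_token, input_pos, num_new_tokens):
        ...


class LocalGenerator(Generator):
    ...  # single-device implementations of the four hooks plus model properties


class DistributedGenerator(Generator):
    ...  # distributed implementations of the same hooks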
Feel free to spin up an RFC/doc or notes. We'll gladly take a look, since it's something we've been thinking about doing for a while now. cc @Gasoonjia
Closing this in favor of #1381
This PR introduces a new Generator base class to better integrate distributed inference with the existing infrastructure.