Example/Test Model Benchmarks (Canonical WandB runs) #74

cat-state · 2022-10-27T17:23:28Z

🚀 The feature, motivation, and pitch

If we had links to benchmarks for the example (and/or test) models, it would be easier to add new models and keep track of improvements in method implementations. Additionally, during refactoring, it would allow checking that no performance-degrading changes were introduced.

This can be a minimal version of #13

Alternatives

No response

Additional context

No response

albertsun1 · 2022-11-07T18:35:53Z

Hey! I'm new to contributing to trlx, would it be worth it for me to give this a go for the ppo/ilql sentiment examples?

maxreciprocate · 2022-11-07T21:53:50Z

@cat-state something like that? vwxyzjn/cleanrl#307

cat-state · 2022-11-11T01:37:29Z

@albertsun1

Hey! I'm new to contributing to trlx, would it be worth it for me to give this a go for the ppo/ilql sentiment examples?

Sure, although you might need compute?
@reciprocated maybe we should make a single-node config version that can be finetuned on a single gpu fast?

cat-state · 2022-11-11T02:18:44Z

So I see that WandB actually lists the commit hash used for a run. So if we could find/tag TRLX runs in WandB, then each run could be matched up to a specific state of the repository.
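A rough sketch of that matching, assuming each run's name, tags, and recorded commit hash have already been pulled out of the tracker. `RunRecord`, the `canonical` tag, the run names, and the hashes below are all hypothetical placeholders, not anything that exists in trlx or WandB:

```python
from collections import defaultdict
from dataclasses import dataclass, field


@dataclass
class RunRecord:
    """Hypothetical stand-in for the metadata of one tracked run."""
    name: str
    commit: str  # git commit hash recorded for the run ("" if unknown)
    tags: list = field(default_factory=list)


def group_tagged_runs_by_commit(runs, tag="canonical"):
    """Map each commit hash to the names of the runs carrying `tag`."""
    by_commit = defaultdict(list)
    for run in runs:
        # Skip untagged runs and runs that have no recorded commit.
        if tag in run.tags and run.commit:
            by_commit[run.commit].append(run.name)
    return dict(by_commit)


# Made-up example data: two tagged runs on one commit, one untagged run.
runs = [
    RunRecord("ppo_sentiments", "a1b2c3d", ["canonical"]),
    RunRecord("ilql_sentiments", "a1b2c3d", ["canonical"]),
    RunRecord("ppo_debug", "e4f5a6b", []),
]
index = group_tagged_runs_by_commit(runs)
```

With an index like this, looking up a commit gives the benchmark runs that were produced from exactly that state of the repository.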

maxreciprocate · 2022-11-11T21:39:21Z

Sure, although you might need compute? @reciprocated maybe we should make a single-node config version that can be finetuned on a single gpu fast?

fwiw {ppo,ilql}_config.yml were meant to run on a single gpu (you may need to adjust batch_size), since they both use gpt2 small

maxreciprocate · 2023-09-01T10:49:07Z

Resolved with #357

maxreciprocate mentioned this issue Nov 21, 2022

Restructure sweeps for reuse #102

Merged

4 tasks

cat-state added the "feature request" label Feb 2, 2023

maxreciprocate closed this as completed Sep 1, 2023
