[Train V2] Adding Ray Train V2 Codebase, implementing the "Train + Tune API Revamp" REP #49376

Merged: 4 commits merged into ray-project:master from hpguo/v2/train_v2_init on Dec 23, 2024

@hongpeng-guo hongpeng-guo commented Dec 20, 2024

Summary

Ray Tune and Ray Train have been tightly coupled since Ray 2.0, when Ray Tune became the common execution engine for both libraries.

Ray Train execution invokes Tune’s execution logic under the hood, which leads to a complex, layered system. The original intention behind this was to increase the interoperability of the two libraries, but the dependency of Ray Train on Ray Tune has led to many usability and stability issues, and it has stalled feature development.

ray-project/enhancements#57 proposes a much clearer design that improves Usability, Extensibility, Interoperability, and Testability.

This PR implements the above REP for the revamped Ray Train. The implementation lives in the python/ray/train/v2 directory and paves the way for faster feature development and a better user experience. Please refer to the REP for details on the design and on the remaining changes, which will be added shortly in follow-up PRs.

Ray Train V2 can be enabled by setting the RAY_TRAIN_V2_ENABLED=1 environment variable on the driver process.
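
As a rough illustration of the opt-in described above, the flag can be set from Python in the driver script before any training code runs. This is a minimal sketch, not part of the PR: only the variable name `RAY_TRAIN_V2_ENABLED` and the value `1` come from the description, while the helper names `enable_train_v2` and `is_train_v2_enabled` are hypothetical. Setting the flag before importing `ray.train` is an assumption made to stay on the safe side, since the description does not say when the flag is read.

```python
import os

# Flag name and value are taken from the PR description.
V2_FLAG = "RAY_TRAIN_V2_ENABLED"


def enable_train_v2(env=os.environ):
    """Hypothetical helper: set the V2 flag in the given environment mapping."""
    env[V2_FLAG] = "1"
    return env


def is_train_v2_enabled(env=os.environ):
    """Hypothetical helper: report whether the flag is set to "1"."""
    return env.get(V2_FLAG, "0") == "1"


# Typical use at the top of a driver script (assumed ordering):
#   enable_train_v2()
#   import ray.train  # imported only after the flag is set
```

Exporting the variable in the shell that launches the driver should have the same effect, because the driver process inherits that environment.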

Signed-off-by: Hongpeng Guo <[email protected]>
Signed-off-by: Hongpeng Guo <[email protected]>
@hongpeng-guo hongpeng-guo force-pushed the hpguo/v2/train_v2_init branch from 3896f50 to 654a7ff Compare December 20, 2024 01:55
Signed-off-by: Hongpeng Guo <[email protected]>
@hongpeng-guo

Good to review. @matthewdeng @justinvyu

@hongpeng-guo hongpeng-guo added the "go" label (add ONLY when ready to merge, run all tests) Dec 20, 2024

@justinvyu justinvyu left a comment

Thanks! 🐳

@justinvyu justinvyu merged commit 3048c66 into ray-project:master Dec 23, 2024
6 checks passed
@hongpeng-guo hongpeng-guo deleted the hpguo/v2/train_v2_init branch December 24, 2024 08:23
srinathk10 pushed a commit that referenced this pull request Jan 3, 2025
…ne API Revamp" REP (#49376)

Signed-off-by: Hongpeng Guo <[email protected]>
anyadontfly pushed a commit to anyadontfly/ray that referenced this pull request Feb 13, 2025
…ne API Revamp" REP (ray-project#49376)

Signed-off-by: Hongpeng Guo <[email protected]>
Signed-off-by: Puyuan Yao <[email protected]>