Enable Deepspeed on Windows #6651

Closed
guillochon opened this issue Mar 23, 2021 · 6 comments · Fixed by #8488
Labels
feature, help wanted, priority: 2

Comments

@guillochon
Contributor

🚀 Feature

Currently, deepspeed is explicitly disabled on Windows systems:

_DEEPSPEED_AVAILABLE = not _IS_WINDOWS and _module_available('deepspeed')

I confirmed, by commenting out the not _IS_WINDOWS part, that deepspeed indeed does not currently work on Windows. The following error is emitted:

    595         elif backend == Backend.NCCL:
    596             if not is_nccl_available():
--> 597                 raise RuntimeError("Distributed package doesn't have NCCL "
    598                                    "built in")
    599             pg = ProcessGroupNCCL(
RuntimeError: Distributed package doesn't have NCCL built in
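
For readers unfamiliar with this kind of guard, here is a minimal sketch of how an availability flag like the one quoted above could be computed. The helper names module_available and deepspeed_available are hypothetical stand-ins, not Lightning's actual _module_available implementation; the sketch only assumes the flag combines a platform check with an import lookup.

```python
import importlib.util
import platform
from typing import Optional


def module_available(name: str) -> bool:
    """Return True if `name` can be located for import, without importing it.

    Hypothetical stand-in for a helper like `_module_available`.
    """
    try:
        return importlib.util.find_spec(name) is not None
    except (ModuleNotFoundError, ValueError):
        # find_spec raises if a parent package is missing or a spec is unset.
        return False


def deepspeed_available(system: Optional[str] = None) -> bool:
    """Mirror the quoted flag: never available on Windows, else check the import."""
    if system is None:
        system = platform.system()
    return system != "Windows" and module_available("deepspeed")


if __name__ == "__main__":
    print("deepspeed usable here:", deepspeed_available())
```

Passing the platform name as an argument keeps the Windows branch easy to exercise on any machine, e.g. deepspeed_available("Windows") is always False regardless of what is installed.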

Motivation

It would be awesome to get the speed and memory benefits of deepspeed on Windows machines too!

Pitch

My understanding is that enabling this requires using our own implementation of parallelism via MPI rather than PyTorch distributed, which doesn't support Windows. I'm not sure how involved this is.

Alternatives

N/A

Additional context

Discussion with @SeanNaren about it here: https://pytorch-lightning.slack.com/archives/CRBLFHY79/p1616436343124400

@guillochon added the feature and help wanted labels Mar 23, 2021
@aribornstein
Contributor

@SeanNaren is this something the deep speed team can support?

@SeanNaren
Contributor

DeepSpeed already supports MPI, which is Windows-compatible! To integrate this, we'll need to refactor our code so that it uses DeepSpeed's initialization function: https://deepspeed.readthedocs.io/en/latest/initialize.html#deepspeed.init_distributed
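
As a rough illustration of the refactor described in this comment, the sketch below picks a backend by platform and hands it to deepspeed.init_distributed through its dist_backend argument. The helper names pick_backend and init_distributed_for_platform are hypothetical, and falling back to "mpi" is an assumption drawn from this comment rather than what #8488 actually does; whether the MPI backend is usable also depends on how PyTorch was built.

```python
import platform


def pick_backend(system: str, nccl_available: bool, fallback: str = "mpi") -> str:
    """Choose a distributed backend name (hypothetical helper).

    NCCL is used only off Windows and only when PyTorch reports it as built in;
    otherwise the assumed fallback ("mpi", per the comment above) is returned.
    """
    if system != "Windows" and nccl_available:
        return "nccl"
    return fallback


def init_distributed_for_platform() -> str:
    """Initialize distributed state through DeepSpeed instead of torch directly.

    Imports are local so the module loads even without torch or deepspeed.
    """
    import deepspeed
    import torch.distributed as dist

    nccl_ok = dist.is_available() and dist.is_nccl_available()
    backend = pick_backend(platform.system(), nccl_ok)
    # Let DeepSpeed set up the process group with the chosen backend.
    deepspeed.init_distributed(dist_backend=backend)
    return backend
```

Keeping the backend choice in a pure function means the Windows path can be checked without a Windows machine or a GPU; only init_distributed_for_platform needs the real libraries.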

@stale

stale bot commented May 5, 2021

This issue has been automatically marked as stale because it hasn't had any recent activity. This issue will be closed in 7 days if no further activity occurs. Thank you for your contributions, Pytorch Lightning Team!

@stale bot added the won't fix label May 5, 2021
@SeanNaren removed the won't fix label May 5, 2021
@edenlightning added the priority: 2 label May 9, 2021
@stale bot added the won't fix label Jun 8, 2021
@SeanNaren removed the won't fix label Jun 9, 2021
@stale bot added the won't fix label Jul 9, 2021
@stale bot closed this as completed Jul 16, 2021
@SeanNaren reopened this Jul 16, 2021
@stale bot removed the won't fix label Jul 16, 2021
@SeanNaren
Contributor

Support has been tentatively added in #8488, but without end-to-end testing. If anyone gets the chance to try this out end to end on Windows and report back, it would be awesome!
