Including support for Deepspeed 0.8.0 #14506
Can you please test it locally to make sure both 0.7.3 and 0.8.0 run through your code without issue?
@ytaous Yes, I have already verified that it works for both 0.7.3 and 0.8.0.
    pass
else:
    if not get_accelerator().device_name().startswith("cuda"):
        return False
For better debugging, please also add a warnings.warn here to state the reason for the failure.
If possible, let's avoid warning unless it is really a warning. For debugging, we should probably add logging capabilities and log as debug or info.
I think it's closer to a warning or an error. Reaching this branch means the user asked for the optimizer to be modified; if we cannot do that, we should warn the user and give the reason.
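One way to reconcile the two positions above is sketched below: warn only when a requested modification is actually skipped, and log at debug level when nothing is wrong. This is a minimal illustration, not the code in this PR. The function name `device_allows_modification`, the logger name, and the convention that `None` means "the accelerator API is unavailable" are all assumptions made for the example.

```python
import logging
import warnings

# Hypothetical logger name, chosen only for this sketch.
logger = logging.getLogger("optimizer_modifier")


def device_allows_modification(device_name):
    """Return True if the optimizer may be modified on this device.

    `device_name` is the string reported by the accelerator (for example
    "cuda"), or None when no accelerator API is available (assumed
    convention for this sketch).
    """
    if device_name is None:
        # Not a problem for the user, so a debug log is enough.
        logger.debug("Accelerator API unavailable; assuming a CUDA device.")
        return True
    if not device_name.startswith("cuda"):
        # The user asked for a modification we cannot perform: say why.
        warnings.warn(
            "Skip modifying optimizer because the accelerator device "
            "is '{}', not CUDA.".format(device_name)
        )
        return False
    return True
```

With this split, a run on a non-CUDA device surfaces the reason through `warnings.warn`, while the fallback path stays quiet unless debug logging is enabled.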
@@ -39,14 +39,22 @@ def can_be_modified(self):
         # it's safe to update the version supporting list. Otherwise, or the file is moved or renamed,
         # we need to check the implementation of these functions in detail.
         ds_version = Version(deepspeed.__version__)
-        if ds_version > Version("0.7.3") or ds_version < Version("0.4.0"):
+        if ds_version > Version("0.8.0") or ds_version < Version("0.4.0"):
             warnings.warn(
                 "Skip modifying optimizer because of unsupported DeepSpeed version {}, "
                 "supported version: 0.4.0 - 0.7.3.".format(deepspeed.__version__),
Do you need to update this to 0.8.0? Can we create a variable with version number for start and end versions and reference that instead of making changes everywhere?
There's only one place that needs the update, so there's no need to create start/end variables.
The warning message has 0.7.3 in it. I think we missed updating that.
-                "supported version: 0.4.0 - 0.7.3.".format(deepspeed.__version__),
+                "supported version: 0.4.0 - 0.8.0.".format(deepspeed.__version__),
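The missed message update is the kind of drift that the start/end variables suggested earlier in this thread would prevent. A minimal sketch of that idea follows, assuming hypothetical names `MIN_SUPPORTED`, `MAX_SUPPORTED`, `release_tuple`, and `is_supported`. To keep the example dependency-free it compares plain numeric release tuples instead of the `Version` objects used in the PR, so it only handles dotted numeric versions and ignores pre-release ordering.

```python
import re
import warnings

# Single source of truth for the supported range (hypothetical names).
MIN_SUPPORTED = "0.4.0"
MAX_SUPPORTED = "0.8.0"


def release_tuple(version):
    """Parse the leading dotted-numeric part, e.g. "0.8.0" -> (0, 8, 0)."""
    match = re.match(r"\d+(?:\.\d+)*", version)
    if match is None:
        raise ValueError("Unrecognized version string: {!r}".format(version))
    return tuple(int(part) for part in match.group(0).split("."))


def is_supported(version, minimum=MIN_SUPPORTED, maximum=MAX_SUPPORTED):
    """Return True if `version` lies in [minimum, maximum], else warn."""
    parsed = release_tuple(version)
    if parsed < release_tuple(minimum) or parsed > release_tuple(maximum):
        # The message is built from the same constants as the check,
        # so the two cannot fall out of sync.
        warnings.warn(
            "Skip modifying optimizer because of unsupported DeepSpeed "
            "version {}, supported version: {} - {}.".format(
                version, minimum, maximum
            )
        )
        return False
    return True
```

Bumping the upper bound then means editing `MAX_SUPPORTED` once, and both the comparison and the warning text follow.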
except ImportError as e:
    pass
Under what circumstances would this import fail?
The import fails on older DeepSpeed versions (I think anything earlier than 0.8.0), because get_accelerator was only introduced recently in DeepSpeed.
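The guarded import described here can be sketched as a small helper that returns the `get_accelerator` callable when the installed DeepSpeed provides `deepspeed.accelerator`, and `None` otherwise. The helper name `load_get_accelerator` is hypothetical and the snippet is an illustration of the pattern, not the code under review.

```python
import importlib


def load_get_accelerator():
    """Return deepspeed.accelerator.get_accelerator, or None if unavailable.

    None covers both cases: DeepSpeed is not installed, or the installed
    version predates the accelerator module.
    """
    try:
        module = importlib.import_module("deepspeed.accelerator")
    except ImportError:
        # ModuleNotFoundError is a subclass of ImportError, so a missing
        # package or a missing submodule both land here.
        return None
    return getattr(module, "get_accelerator", None)


get_accelerator = load_get_accelerator()
if get_accelerator is None:
    print("deepspeed.accelerator not available; skipping the device check")
else:
    print("accelerator device:", get_accelerator().device_name())
```

Returning `None` instead of silently passing makes the fallback explicit at the call site, which also gives a natural place to log why the device check was skipped.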
The new DS code uses a factory to create the accelerator:
https://github.com/delock/DeepSpeedSYCLSupport/blob/e9687254eae72608ed8c76a185d5f6cffe3fd6b6/accelerator/real_accelerator.py#L49
There might be cases where the "XPU_Accelerator" package is unavailable. Adding you to a chat thread.
Description
Including support for DeepSpeed 0.8.0.
Motivation and Context
DeepSpeed 0.8.0 has a bug fix and MLflow integration.