
Switches BBS example to use mbsize=3 and gas=2 to fit in 16GB of memory. #341

Merged
2 commits merged on Sep 1, 2020

Conversation

ShadenSmith
Contributor

No description provided.

@tjruwase
Contributor

tjruwase commented Sep 1, 2020

What issue is this PR related to?

@ShadenSmith
Contributor Author

ShadenSmith commented Sep 1, 2020

We are ramping up additional testing machines with 16GB GPUs. BBS is the only test that requires more than 16GB of memory to complete. As an added bonus, this also brings model test coverage for ZeRO-2 with gradient accumulation :-).

Do you see any issues with decreasing the microbatch size for these model tests? The effective batch size is untouched.
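The "effective batch size is untouched" point comes down to simple arithmetic: the effective (global) batch size is the micro-batch size times the number of gradient accumulation steps times the number of GPUs. A minimal sketch of that relation is below; the original mbsize/gas values are not stated in this PR, so the "before" setting of `mbsize=6, gas=1` is purely a hypothetical pairing that yields the same effective batch size as the new `mbsize=3, gas=2`.

```python
def effective_batch_size(micro_batch_per_gpu: int,
                         gradient_accumulation_steps: int,
                         world_size: int = 1) -> int:
    """Effective batch size = micro-batch size * accumulation steps * GPU count."""
    return micro_batch_per_gpu * gradient_accumulation_steps * world_size

# Hypothetical "before": mbsize=6, gas=1 (not stated in the PR).
before = effective_batch_size(6, 1)
# Setting from this PR: mbsize=3, gas=2.
after = effective_batch_size(3, 2)

assert before == after  # effective batch size is unchanged
print(after)  # → 6
```

Halving the micro-batch while doubling accumulation steps trades per-step activation memory for an extra forward/backward pass, which is why the model fits in 16GB without changing the training dynamics tied to the effective batch size.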

@tjruwase
Contributor

tjruwase commented Sep 1, 2020

Thanks for the clarification. I don't see any issues with decreasing micro batch size. I was just curious about the motivation.

@ShadenSmith
Contributor Author

Sure sure, was just wondering too :-). I haven't worked with this test much in the past.

@ShadenSmith ShadenSmith merged commit 838f53b into deepspeedai:master Sep 1, 2020
@ShadenSmith ShadenSmith deleted the bbs-16gb branch September 1, 2020 15:09