DeepSpeed-Chat Llama2/stability release #4240

Merged: 120 commits into master on Aug 31, 2023

awan-10 commented on Aug 31, 2023

DeepSpeed-Chat is a general system framework for RLHF training, publicly released on GitHub, that enables easy, fast, affordable, and scalable training of ChatGPT-style models. Its performance and capabilities are described in detail in our blog post and arXiv paper.

We are happy to share that today we are improving DeepSpeed-Chat in three areas: (i) system support for the Llama/Llama-2 family of models, (ii) system features for improved efficiency and accessibility, and (iii) stability and software enhancements.

HeyangQin and others added 30 commits June 21, 2023 11:51
* fix conv_flops_compute when padding is a str when stride=1

* fix error

* change type of paddings to tuple

* fix padding calculation

* apply formatting check

---------

Co-authored-by: Cheng Li
Co-authored-by: Olatunji Ruwase
* Update profiler.py

* pre-commit run --all-files

* Delete .DS_Store

* Delete .DS_Store

* Delete .DS_Store

---------

Co-authored-by: Jeff Rasley
Co-authored-by: Cheng Li
* zeropp chinese blog

* try better quality images

* make title larger

* even larger...

* various fix

* center captions

* more fixes

* fix format
Co-authored-by: Stephen Youn
Co-authored-by: Arash Bakhtiari
Co-authored-by: Cheng Li
Co-authored-by: Ethan Doe
Co-authored-by: yidoe
Co-authored-by: Jeff Rasley
Co-authored-by: HeyangQin
Co-authored-by: GuanhuaWang
Co-authored-by: cmikeh2
Co-authored-by: Ammar Ahmad Awan
Co-authored-by: Jeff Rasley
Co-authored-by: Michael Wyatt
Co-authored-by: Olatunji Ruwase
Co-authored-by: Reza Yazdani
* zeropp chinese blog

* try better quality images

* make title larger

* even larger...

* various fix

* center captions

* more fixes

* fix format

* add ZeRO++ Japanese blog

* add links

---------

Co-authored-by: HeyangQin
Co-authored-by: Conglong Li
* fix autotuner when backward is not called

* fix format

---------

Co-authored-by: Olatunji Ruwase
Co-authored-by: Ammar Ahmad Awan
Co-authored-by: Jeff Rasley
Co-authored-by: Logan Adams
* Bug fix

* Fixed formatting error

---------

Co-authored-by: Logan Adams
Co-authored-by: Stephen Youn
Co-authored-by: Jeff Rasley
awan-10 merged commit 2420e23 into master on Aug 31, 2023