Issues: deepspeedai/DeepSpeed
Issues list
[BUG] DeepSpeed ZeRO-2 training hangs and times out after a fixed step
bug
Something isn't working
training
#7044
opened Feb 17, 2025 by
leeruibin
Getting requirements to build wheel: finished with status 'error'
windows
Questions or PRs relating to running DeepSpeed on Windows
#7043
opened Feb 17, 2025 by
Avroboros
[REQUEST] Runnable solution of RTX 5090 GPU + Linux driver version + PyTorch version + DeepSpeed version for LLM fine-tuning?
enhancement
New feature or request
#7042
opened Feb 17, 2025 by
0781532
[REQUEST] activation checkpoint API should have parity with PyTorch, keyword arguments not supported
enhancement
New feature or request
#7038
opened Feb 15, 2025 by
AndreasMadsen
[REQUEST] Why is the column linear layer with all-gather not implemented in DeepSpeed Inference?
enhancement
New feature or request
#7037
opened Feb 14, 2025 by
zhangvia
Fix - Update DeepSpeed to be PEP517 compliant, update to pyproject.toml
build
Improvements to the build and testing systems.
install
Installation and package dependencies
#7031
opened Feb 13, 2025 by
loadams
[BUG] import deepspeed crashes on deepspeed==0.16.3 with triton==3.2.0 on CPU machine
bug
Something isn't working
training
#7028
opened Feb 13, 2025 by
hongpeng-guo
[BUG] Issues with Running DeepSpeed Zero2 & Zero3 Not Taking Effect
bug
Something isn't working
training
#7026
opened Feb 12, 2025 by
fengdian8564
[REQUEST] Can the Mamba model be supported?
enhancement
New feature or request
#7022
opened Feb 11, 2025 by
fxnie
Out-of-Memory (OOM) Error with CPU Offload Using ZeRO Stage 3
bug
Something isn't working
inference
#7021
opened Feb 10, 2025 by
lorenaromerom02
[REQUEST] option to shard weights only in each node
enhancement
New feature or request
#7019
opened Feb 8, 2025 by
cyr0930
[REQUEST] Support Offload deepspeed engine in RLHF training
enhancement
New feature or request
#7013
opened Feb 7, 2025 by
hijkzzz
[REQUEST] Possibility of integrating LongVU with DeepSpeed
enhancement
New feature or request
#7006
opened Feb 5, 2025 by
xiaoqian-shen
[BUG] mpi based training error
bug
Something isn't working
training
#6997
opened Feb 4, 2025 by
cyr0930
[BUG] loading model error
bug
Something isn't working
training
#6994
opened Feb 3, 2025 by
tengwang0318
[REQUEST] adding type hints and py.typed metadata
enhancement
New feature or request
#6988
opened Jan 31, 2025 by
jamesbraza
[BUG] pdsh runner doesn't work with tqdm bar
bug
Something isn't working
training
#6978
opened Jan 29, 2025 by
Superskyyy
[BUG] Errors in GPT-MoE model inference
bug
Something isn't working
inference
#6973
opened Jan 25, 2025 by
1155157110