Pull requests: NVIDIA/Fuser
Pull requests list
Check that warps are only accessing the subpartition of TMem that it can access
#4016 opened Mar 6, 2025 by zasdfgbnm
Relax BFS target constraint in getInnerMmaLoopGroup
#4012 opened Mar 5, 2025 by jacobhinkle
Run a DeepSeek-V3 transformer block from Hugging Face
#4009 opened Mar 4, 2025 by wujingyue
Translate MatmulOp and LinearOp on Hopper without AxisMapping
#3986 opened Feb 27, 2025 by jacobhinkle
don't need to sync before write to smem in warp reduction runtime func
#3958 opened Feb 25, 2025 by liqiangxl
DID loop split for reshape without pre-sharding reshape propagation
#3953 opened Feb 24, 2025 by Priya2698
Expose MultiDeviceExecutor and overlapped AG+matmul to python API
#3923 opened Feb 19, 2025 by samnordmann