Can llama use TP (tensor parallelism)? #11937
Unanswered
Deeperfinder asked this question in Q&A
Replies: 0 comments
When I run DeepSeek-R1 671B on NVIDIA A100 GPUs, I see that the default parallel strategy is PP (pipeline parallelism)!