FSDP oneshot #1939
Conversation
See updated PR comment :( device_map="auto" doesn't seem to be compatible with quantization, so I'm leaving it off the default. It can still be specified on the CLI for a non-quantized one-shot.
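For context, a minimal sketch of the device handling being described, assuming a Hugging Face-style loader (the helper name and arguments are illustrative, not this PR's exact code):

```python
from transformers import AutoModelForCausalLM

def load_oneshot_model(model_path: str, device: str = "cuda:0"):
    """Hypothetical loader: the default stays "cuda:0" because
    device_map="auto" is incompatible with quantization; "auto" can be
    requested for a non-quantized one-shot to shard across GPUs."""
    return AutoModelForCausalLM.from_pretrained(
        model_path,
        device_map=device,  # "cuda:0" by default, or "auto" via the CLI
        torch_dtype="auto",
    )
```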
This PR updates the one-shot modifiers SparseGPT, Wanda, SmoothQuant, and Quantization to be compatible with FSDP. This enables us to run alternating one-shot/finetuning flows with FSDP (sketched below).
**NOTE:** #1912 should be merged first; it covers the initial alternating flow implementation.
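As a rough, framework-agnostic sketch of the alternating flow this unblocks (the function names are placeholders, not the repo's actual API):

```python
def alternating_flow(model, oneshot_step, finetune_step, num_cycles: int = 3):
    """Alternate one-shot compression and finetuning; with this PR the
    one-shot passes can run on an FSDP-wrapped model."""
    for _ in range(num_cycles):
        oneshot_step(model)   # e.g. SparseGPT / Wanda / SmoothQuant / Quantization
        finetune_step(model)  # recover accuracy after compression
    return model
```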
Summary of Changes
- Remove any references to specific devices from the one-shot modifiers; device placement is now handled by `SparseCausalLM`. `device_map="auto"` turned out to be incompatible with quantization, so the default stays `"cuda:0"`, but `"auto"` can still be passed through the CLI for a non-quantized one-shot to split the model across multiple GPUs (this isn't FSDP related; we can split the model even outside of FSDP).
- Update the `SparseGPT` class to be a module wrapper, so that we can update weights using `module.apply` as required for FSDP compatibility (a rough sketch follows this list).
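A minimal sketch of that wrapper pattern, with hypothetical names (`SparseGptWrapper`, `apply_pruned_weights`) standing in for the PR's actual classes:

```python
import torch

class SparseGptWrapper(torch.nn.Module):
    """Wraps a target layer so the pruned weight computed by the one-shot
    pass can be written back later via module.apply."""

    def __init__(self, layer: torch.nn.Module):
        super().__init__()
        self.layer = layer
        self.pruned_weight = None  # populated by the pruning pass

    def forward(self, *args, **kwargs):
        return self.layer(*args, **kwargs)

    @torch.no_grad()
    def update_weight(self):
        if self.pruned_weight is not None:
            self.layer.weight.copy_(self.pruned_weight)


def apply_pruned_weights(module: torch.nn.Module) -> None:
    # Intended for model.apply(apply_pruned_weights), which visits every
    # submodule recursively, so each wrapper writes back its own weight.
    if isinstance(module, SparseGptWrapper):
        module.update_weight()
```

Under FSDP, the write-back would typically run inside `FullyShardedDataParallel.summon_full_params(model)` so the full unsharded weights are materialized while copying.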