bumping DSE commit for pillow security fix #312
Merged
jeffra added a commit that referenced this pull request on Jul 18, 2022
* Staging compression library v1 (#314)
* prototype
* add sparse/row/head pruning
* add bert test examples, not testing yet
* rm moq
* add deepspeed based glue example to test compression
* add get/set attr
* tested replacement module
* Custimized Linear Layer Accuracy Checked without any compression technique
* sparse pruning tested
* head pruning tested
* row pruning tested
* enable act dy quantization
* change l1 mask to buffer for better resume training
* add final model saving helper function, only for sparse prunin now
* tested sparse pruning resume training and final model saving
* row pruning resume training and final saving checked
* head pruning resuming training / final model saving
* rm bert from deepspeed
* restruct the code
* add mixed-precision quantization support
* add binary/ternary support
* add weight quantization FP16 assert
* add conv2d
* add compression function
* move config generation to deepspeed side, need elton to take a look
* add activation quantization support
* add sparse pruning support
* add row pruning
* add head pruning
* add channel pruning
* support matching patterns for module names
* update
* fix typo in fix_compression
* add compression scheduler, rm the offset scheduler from MoQ
* fix some errors in head pruning, support redudent clearning (naive version)
* add dim-reduction redudent clearning
* update linear layer
* make cnn example work
* add bn2d
* fix bias issue
* add static act quantization
* support mpu row/colomn parallel linear layer
* add skip_bias_add for mpu linear layers
* make mpu compress work, remove_redundent is not tested yet
* fix several small errors
* add conv1d to linear converter function
* add conv1d to linear converter function
* add conv1d to linear converter function
* make dy-act-quantization per-token or per-image
* cleaning part of the code; more is coming
* enable forward weight quantization which supports both FP32 and some tricky settings
* update readme
* Update README.md
* naming cleaning
* fix static activation loading issue
* update parameter
* Update utils.py fix a typo
* fix typo
* fix typo
* replace expand_as with view
* Zheweiyao/compression library (#304)
* add forward weight quantization constraint
* add quantize_weight_in_forward warning: a lot of features are not supported
* offset 0 fixing
* add forward weight quantization constraint
* add quantize_weight_in_forward warning: a lot of features are not supported
* offset 0 fixing
* fix a small issue
* omit bias if the model does not have bias
* add contiguous to aviod memory issue
* add scale associated to weight, so people can quantize the weight after training
* add fix weight quantization, change name based on constant.py file
* disable eigen-based MoQ
* When a method is disable (enable: false), we do not need to initialize its related parameters
* weight quantization cleaning
* fix get_quantize_enabled missing problem
* fix redundent cleaning issue, make sure we either get mask from related-module or we enable the method in config
* sort the redundent cleaning step, so we always do quantization, then sparse pruning, then others
* a lot of comment cleaning and args explanation
* add args in config-json.md
* fix format issue
* fix quantization offset step=1 with FP16 optimizer
* Zheweiyao/compression library from s1 (#305)
* add binary/ternary support for FP32 training; this is used to resolve FP16 unstable extreme compression training
* add embedding quantization support
* Xiaoxia/compression library v1 (#307)
* add layer reduction (Xiaoxia/Zhewei)
* fixing bug for sym activation and clean layer reduction (Xiaoxia)
* fixing compression initialization (Xiaoxia/Zhewei)
* fix format issue (#310)
* Xiaoxia/compression library v1 (#311)
* add layer reduction
* fixing bug for sym activation and clean layer reduction
* fixingn compression initialization
* pre-commit...
* Zheweiyao/compression library from s1 (#312)
* fix format issue
* fix the accuracy mismatch after quantization cleaning
* fix clean_model bug and add layer_reduction configuration

Co-authored-by: yaozhewei <[email protected]>
Co-authored-by: Elton Zheng <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>

* switch to deepspeed comm
* dummy tutorial
* improve config json
* Zheweiyao/compression library based on s2 (#315)
* change the name and merge layer reduction to init_compression
* add conv1d to linear test unit, fix errors introduced by merging studient initialtization to init_compression
* Update config-json.md
* fix for cifar10 channel pruning
* fix the block_eigenvalue is None bug
* fix the block_eigenvalue is None bug
* move compression-related constants and configs to compression
* tutorial and json config

Co-authored-by: Xiaoxia (Shirley) Wu <[email protected]>
Co-authored-by: yaozhewei <[email protected]>
Co-authored-by: Elton Zheng <[email protected]>
Co-authored-by: Jeff Rasley <[email protected]>
Co-authored-by: xiaoxiawu <[email protected]>
Co-authored-by: xiaoxiawu <[email protected]>
No description provided.