Implement FracBitsQuantizationBuilder and Controller #1234
Conversation
@vinnamkim, thanks for your contribution! I have major comments/questions about this PR and #1231. Could you share what results you have achieved with the FracBits quantization algorithm? What use cases are you going to cover with this algorithm? What is the benefit?
According to the contribution guide for NNCF, only PRs with fully implemented features are approved. All research must be done on forks.
cc' @AlexKoff88
First of all, as far as I know, this item was agreed with Lee, Wonju and Kozlov, Alexander on the understanding that results would be shown after a quick implementation. I'm currently working on obtaining experiment results on MobileNetV2 and other models. Basically, it extends the user's choice of QAT algorithms. Currently, we provide only two mixed-precision algorithms, HAWQ and AutoQ, both of which require an additional exploration stage aside from QAT. FracBits provides fully differentiable training of bit-widths, which is completely different from HAWQ and AutoQ, so it delivers the same kind of benefit as HAWQ and AutoQ through a different mechanism.

On top of that, I think mixed precision will become more important in the future. The vision models that NNCF has dealt with so far are quite small compared to Transformers (BERT-Large is ~1.4 GB), but Transformers have become the major players in the AI scene, and I believe there is demand from users to compress the size of their deployment models. We all know that commercial hardware acceleration for NNs supports at most 8-bit, and the OpenVINO runtime does not allow saving model weights in mixed precision, but I think that is easy to implement at the software level.
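For context on what "fully differentiable training of bit-widths" means here: FracBits relaxes the integer bit-width into a continuous learnable parameter and interpolates between the two neighboring integer bit-widths, so the bit-width itself receives gradients during QAT. Below is a minimal, self-contained sketch of that relaxation; the helper names (`fake_quantize`, `FracBitsQuantizer`), the clamping range, and the straight-through handling are illustrative assumptions, not the actual NNCF code in this PR.

```python
import torch
import torch.nn as nn


def fake_quantize(x: torch.Tensor, num_bits: int) -> torch.Tensor:
    # Symmetric uniform fake-quantization to `num_bits` (illustrative only).
    levels = 2 ** (num_bits - 1) - 1
    scale = x.abs().max().clamp(min=1e-8) / levels
    q = torch.round(x / scale).clamp(-levels, levels) * scale
    # Straight-through estimator: quantized forward pass, identity backward pass.
    return x + (q - x).detach()


class FracBitsQuantizer(nn.Module):
    """Quantizer with a continuous, learnable bit-width (hypothetical sketch)."""

    def __init__(self, init_bits: float = 8.0):
        super().__init__()
        # Fractional bit-width, trained jointly with the model weights.
        self.bits = nn.Parameter(torch.tensor(init_bits))

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b = self.bits.clamp(2.0, 8.0)
        lo, hi = torch.floor(b), torch.ceil(b)
        frac = b - lo
        # Linear interpolation between the two neighboring integer bit-widths,
        # which keeps the output differentiable w.r.t. the fractional bit-width.
        return (1.0 - frac) * fake_quantize(x, int(lo.item())) + \
               frac * fake_quantize(x, int(hi.item()))
```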
I understand that this was against the rules. How about adopting a branching strategy that uses a feature branch? We can update and review code in fast cycles on the feature branch, and once all the work on that feature branch is finished, we can try to merge it into the develop branch. The reason is that merging large PRs all at once should be avoided from a software development standpoint. What do you think?
Hi, @vinnamkim. Thank you for your response! I have no concerns about research in the mixed-precision area, in particular implementing the FracBits quantization algorithm. My concern is that a PR must meet the contribution guide and be justified in order to be accepted for merging into develop. The fork strategy is the default strategy for NNCF and is used in many repositories under the OpenVINO umbrella. If you do not like the fork strategy, I think you can create a branch in the NNCF repository to implement the research feature.
Hi @alexsu52, the FracBits feature should be merged into the develop branch from the viewpoint of "implement a recently published compression algorithm" in the contribution guide. If only a fully implemented feature, including an example and a recipe, can be merged into develop, then @vinnamkim, could you work on your forked branch, make separate PRs (small PRs are preferred) to implement everything related to FracBits, and then merge that branch into develop in the upstream repository?
I changed this PR's target branch to `feature/fracbits`.
Hi, @wonjuleee. Thanks for your comment. I totally agree with you.
LGTM.
@vinnamkim, will you make a new PR for FracBits algorithm implementation?
Is it okay to merge this one first?
Yes, the remaining work may be uploaded as PRs targeting the feature/fracbits branch. Once all of it is done, we can try to merge feature/fracbits into the develop branch.
@alexsu52
…#1234)
* Implement FracBitsQuantizationBuilder and Controller
  - Implement Builder and Controller
  - Add and test ModelSizeCompressionLoss

Signed-off-by: Kim, Vinnam <[email protected]>
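For reference, a model-size compression loss in this setting typically penalizes the bit-width-weighted parameter count against a target size derived from an 8-bit baseline, so the learnable bit-widths are pushed toward a desired compression ratio. The sketch below only illustrates that idea under an assumed interface (each quantizer exposing a learnable `bits` tensor and a `num_weights` count); it is not the ModelSizeCompressionLoss added in this PR.

```python
import torch


def model_size_loss(quantizers, target_ratio: float = 0.5, base_bits: float = 8.0) -> torch.Tensor:
    """Penalty pushing the bit-weighted model size toward a target fraction of the
    8-bit baseline. `quantizers` is assumed to expose `bits` (learnable tensor) and
    `num_weights` (int) per quantized layer -- an illustrative interface only."""
    current = sum(q.bits * q.num_weights for q in quantizers)      # bits actually used
    baseline = sum(base_bits * q.num_weights for q in quantizers)  # 8-bit reference size
    target = target_ratio * baseline
    # Only penalize while the model is still larger than the target size.
    return torch.relu(current - target) / baseline
```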
Changes
Reason for changes
Related tickets
87888, 87889
Tests
Related unit tests added