
Is the model really Quantized? #13

Open
navinranjan7 opened this issue May 27, 2023 · 3 comments

Comments

@navinranjan7

No description provided.

@navinranjan7
Author

  1. GFLOPs remain the same after quantization.
  2. Compared to full-precision training, memory requirements increase during quantized training.
  3. The quantized model file is larger, not smaller: quantized Swin-T is 380 MB, compared to 109 MB at full precision.

Please help me understand: how did you calculate the GFLOPs, and is the model really quantized?
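
For reference, here is a minimal sketch of simulated ("fake") quantization in PyTorch, which is what I assume this repo implements; the class and method names below are mine, not taken from this code. If that assumption is right, it would explain all three observations: the matmul still runs in FP32 (GFLOPs unchanged), the checkpoint still stores FP32 tensors (no size reduction), and the extra quantizer tensors add memory during training.

```python
import torch
import torch.nn as nn

class FakeQuantLinear(nn.Linear):
    """Linear layer with simulated (quantize-dequantize) weight/activation quantization."""

    def __init__(self, in_features, out_features, n_bits=8, bias=True):
        super().__init__(in_features, out_features, bias=bias)
        self.n_bits = n_bits

    def _fake_quant(self, x):
        qmax = 2 ** (self.n_bits - 1) - 1
        scale = x.detach().abs().max() / qmax                      # per-tensor scale
        q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
        return x + (q - x).detach()                                # straight-through estimator

    def forward(self, x):
        # Both operands are quantized *values* stored in FP32 tensors,
        # so the matmul below is still an ordinary FP32 GEMM.
        w_q = self._fake_quant(self.weight)
        x_q = self._fake_quant(x)
        return nn.functional.linear(x_q, w_q, self.bias)
```

If checkpoints are saved from a model like this, `state_dict()` still contains FP32 weights, so the file should be roughly the same size as the full-precision one; the 380 MB figure suggests additional tensors (quantizer parameters or other state) are being saved as well, but that is only a guess.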

@spbob0418

Same here. I had to reduce the batch size because of the GPU memory limit, which is not the case with the full-precision DeiT model.
Also, I found that the weights and activations are rescaled right after quantization. Why not after the multiplication?
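
To illustrate the question about the order of scaling (function names are mine, not from this repo): with per-tensor scales the two orderings are mathematically equivalent, so dequantizing right after quantization is just the simulated-quantization convention; only the second form corresponds to an actual integer matmul with a single rescale at the end.

```python
import torch

def matmul_scale_early(xq, wq, sx, sw):
    # What the code seems to do: apply the scales immediately after quantization,
    # so the matmul itself runs on floating-point values.
    return (xq * sx) @ (wq * sw).t()

def matmul_scale_late(xq, wq, sx, sw):
    # Integer-style alternative: multiply the rounded values first,
    # then apply the combined scale once after accumulation.
    return (xq @ wq.t()) * (sx * sw)

# With per-tensor scales both orderings give the same result:
xq = torch.randint(-128, 127, (4, 16)).float()    # rounded activation values
wq = torch.randint(-128, 127, (8, 16)).float()    # rounded weight values
sx, sw = 0.02, 0.005                              # per-tensor scales
assert torch.allclose(matmul_scale_early(xq, wq, sx, sw),
                      matmul_scale_late(xq, wq, sx, sw))
```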

@spbob0418

Also, I cannot find the entropy computation in the quantization code that is described in the paper's method section.
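
In case it helps the discussion, my reading of that part of the paper is an entropy computed from the histogram of quantized values; a rough reconstruction (my own sketch, not code from this repo) would look like:

```python
import torch

def quantized_entropy(q_values, n_bits=8):
    """Entropy (in nats) of the empirical distribution over quantization bins."""
    n_levels = 2 ** n_bits
    hist = torch.histc(q_values.float(), bins=n_levels)
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins to avoid log(0)
    return -(p * p.log()).sum()
```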

