
Is the model really Quantized? #13

Open
navinranjan7 opened this issue May 27, 2023 · 3 comments

Comments

@navinranjan7

No description provided.

@navinranjan7
Author

  1. GFLOPs remain the same after quantization.
  2. Compared to full-precision training, memory requirements increase during quantized training.
  3. The quantized model file is larger, not smaller: quantized Swin-T is 380 MB, compared to 109 MB at full precision.

Please help me understand: how did you calculate the GFLOPs, and is the model really quantized?
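
For reference, here is a minimal sketch of simulated ("fake") quantization in PyTorch, which is what I assume this repo implements; the class and method names below are mine, not taken from this code. If that assumption is right, it would explain all three observations: the matmul still runs in FP32 (GFLOPs unchanged), the checkpoint still stores FP32 tensors (no size reduction), and the extra quantizer tensors add memory during training.

```python
import torch
import torch.nn as nn

class FakeQuantLinear(nn.Linear):
    """Linear layer with simulated (quantize-dequantize) weight/activation quantization."""

    def __init__(self, in_features, out_features, n_bits=8, bias=True):
        super().__init__(in_features, out_features, bias=bias)
        self.n_bits = n_bits

    def _fake_quant(self, x):
        qmax = 2 ** (self.n_bits - 1) - 1
        scale = x.detach().abs().max() / qmax                      # per-tensor scale
        q = torch.clamp(torch.round(x / scale), -qmax - 1, qmax) * scale
        return x + (q - x).detach()                                # straight-through estimator

    def forward(self, x):
        # Both operands are quantized *values* stored in FP32 tensors,
        # so the matmul below is still an ordinary FP32 GEMM.
        w_q = self._fake_quant(self.weight)
        x_q = self._fake_quant(x)
        return nn.functional.linear(x_q, w_q, self.bias)
```

If checkpoints are saved from a model like this, `state_dict()` still contains FP32 weights, so the file should be roughly the same size as the full-precision one; the 380 MB figure suggests additional tensors (quantizer parameters or other state) are being saved as well, but that is only a guess.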

@spbob0418

Same here. I had to reduce the batch size because of the GPU memory limit, which is not the case with the full-precision DeiT model.
Also, I found that the weights and activations are rescaled right after quantization. Why not after the multiplication?
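
To illustrate the question about the order of scaling (function names are mine, not from this repo): with per-tensor scales the two orderings are mathematically equivalent, so dequantizing right after quantization is just the simulated-quantization convention; only the second form corresponds to an actual integer matmul with a single rescale at the end.

```python
import torch

def matmul_scale_early(xq, wq, sx, sw):
    # What the code seems to do: apply the scales immediately after quantization,
    # so the matmul itself runs on floating-point values.
    return (xq * sx) @ (wq * sw).t()

def matmul_scale_late(xq, wq, sx, sw):
    # Integer-style alternative: multiply the rounded values first,
    # then apply the combined scale once after accumulation.
    return (xq @ wq.t()) * (sx * sw)

# With per-tensor scales both orderings give the same result:
xq = torch.randint(-128, 127, (4, 16)).float()    # rounded activation values
wq = torch.randint(-128, 127, (8, 16)).float()    # rounded weight values
sx, sw = 0.02, 0.005                              # per-tensor scales
assert torch.allclose(matmul_scale_early(xq, wq, sx, sw),
                      matmul_scale_late(xq, wq, sx, sw))
```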

@spbob0418

Also, I cannot find the entropy computation in the quantization code that is described in the paper's method section.
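
In case it helps the discussion, my reading of that part of the paper is an entropy computed from the histogram of quantized values; a rough reconstruction (my own sketch, not code from this repo) would look like:

```python
import torch

def quantized_entropy(q_values, n_bits=8):
    """Entropy (in nats) of the empirical distribution over quantization bins."""
    n_levels = 2 ** n_bits
    hist = torch.histc(q_values.float(), bins=n_levels)
    p = hist / hist.sum()
    p = p[p > 0]                       # drop empty bins to avoid log(0)
    return -(p * p.log()).sum()
```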

