Get scale value after Full Integer Quantization #132

Open
torukskywalker opened this issue Apr 18, 2024 · 5 comments
Comments

@torukskywalker

torukskywalker commented Apr 18, 2024

There is a related issue #111, but I'm still confused about how to get the scale value after full integer quantization. Does anyone know how to get it?

@jurevreca12

The quantizers (e.g. quantized_bits) have a .scale attribute, so you can get the scale after running some inference simply by reading that attribute.
Note that scaling factors can be separate for each neuron, and in convolutional networks they can differ per channel, so in general the scale attribute returns an array of scaling factors.
Here is an example function that returns the scaling factors from quantizers:
https://github.com/cs-jsi/chisel4ml/blob/f78192562fe00633a9f5320e93001dc0fd802a2f/chisel4ml/transforms/qkeras_util.py#L358
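
A minimal sketch of reading those scaling factors, assuming a QKeras model built with quantized_bits kernel quantizers (the model, layer names, and the kernel_quantizer_internal attribute access are illustrative of typical QKeras usage, not code from this repository):

```python
import numpy as np
import tensorflow as tf
from qkeras import QDense, quantized_bits

# Illustrative model; alpha="auto_po2" makes the quantizer compute power-of-two scales.
model = tf.keras.Sequential([
    tf.keras.layers.Input(shape=(16,)),
    QDense(
        8,
        kernel_quantizer=quantized_bits(bits=8, integer=0, alpha="auto_po2"),
        bias_quantizer=quantized_bits(bits=8, integer=0, alpha="auto_po2"),
        name="dense0",
    ),
])

# Run some inference so the quantizers compute their scaling factors.
_ = model(np.random.rand(4, 16).astype(np.float32))

# Read the per-neuron (or per-channel) scale array from each quantized layer.
for layer in model.layers:
    quantizer = getattr(layer, "kernel_quantizer_internal", None)
    if quantizer is not None and hasattr(quantizer, "scale"):
        print(layer.name, np.array(quantizer.scale).ravel())
```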

@torukskywalker

A naive question: for INT8 quantization using the MNIST example, is the input divided by 255 (normalized to [0, 1]) or not?

@jurevreca12

Well, I usually don't divide the input to normalize it to [0, 1]; that way the inputs stay integer. You could divide it, in which case you get non-integer inputs, which work fine with floating point on a PC. If you then want to deploy this to custom hardware without floating-point units, you can use fixed point instead. As for how this affects training, in my experience it does not, at least for the (relatively) shallow networks I usually train.
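
A small sketch of the two input conventions on MNIST (purely illustrative):

```python
import numpy as np
import tensorflow as tf

(x_train, _), _ = tf.keras.datasets.mnist.load_data()

# Option A (what is described above): keep the inputs integer-valued in [0, 255].
x_int = x_train.astype(np.float32)          # values 0, 1, ..., 255

# Option B: normalize to [0, 1]; inputs become non-integer, so hardware
# without floating-point units would need a fixed-point representation.
x_norm = x_train.astype(np.float32) / 255.0
```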

@torukskywalker

When doing inference on an edge device with INT8 weights/biases, the accuracy is very low.
After tracing the C++ code step by step, we found the error is caused by the Softmax after the last Dense layer:
the Softmax may output INF since some of the input integers are quite large.
We tried dividing the output of the Dense layer by e.g. 255, but it did not help.
Is there a solution or a replacement for Softmax in QKeras for INT8 training?
(So we could also replace the current Softmax in inference with INT8 weights.)
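
For reference, a tiny sketch of why a float32 softmax can overflow for large integer logits (the numbers here are made up); the max-subtraction at the end is the standard numerically-stable formulation, not something specific to QKeras:

```python
import numpy as np

logits = np.array([4000, 1200, -300], dtype=np.float32)  # hypothetical Dense outputs

# np.exp overflows float32 for inputs above ~88.7, so the naive softmax yields inf/nan.
naive = np.exp(logits) / np.sum(np.exp(logits))
print(naive)   # nan entries, because inf / inf is nan

# Standard stable form: subtract the max logit before exponentiating.
stable = np.exp(logits - logits.max()) / np.sum(np.exp(logits - logits.max()))
print(stable)  # [1. 0. 0.] -- finite
```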

@jurevreca12

Hm, this is a bit unusual. I am not exactly sure what is causing your issues, but in general you can also skip the softmax at inference time if you are only using it to pick the best class.
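
A minimal sketch of skipping the softmax at inference time (the logits are made-up integer accumulator outputs): since softmax is monotonic, the argmax of the raw logits picks the same class as the argmax of the softmax outputs.

```python
import numpy as np

int_logits = np.array([1200, -340, 987, 4501], dtype=np.int32)  # hypothetical Dense outputs

# Softmax preserves ordering, so the predicted class can be taken directly
# from the integer logits without ever computing softmax.
predicted_class = int(np.argmax(int_logits))
print(predicted_class)  # 3
```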

Regarding large integers: one thing you could do (if you are not doing it yet) is to saturate the intermediate activations, e.g. use a ReLU with saturation.
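
For example, a sketch of a saturating activation with QKeras (the bit-width parameters are illustrative):

```python
from qkeras import QActivation, quantized_relu

# 8-bit unsigned ReLU with 3 integer bits: outputs saturate just below 8.0,
# so intermediate activations cannot grow into arbitrarily large integers.
saturating_relu = QActivation(quantized_relu(bits=8, integer=3))
```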
