MX Quantization About Subnorm #35

Jzz24 · 2025-01-03T10:06:08Z

Hi, great work! I have a question about the choice of private_exp. Subnormal and normal values should use different quantization scales. Why is private_exp clipped to min_exp? I think it should be clipped to 1.0.

As shown in the figure:
alpha = 2**(shared_exp - emax), where alpha is the scaling factor
private_exp = floor(log2(abs(A/alpha))).clip(1 or min_exp), where A is the input tensor
quantize_scale = 2**(private_exp - m)

    if exp_bits != 0:
        # Add 1 where A == 0 so that log2 is never evaluated at zero
        private_exp = torch.floor(torch.log2(torch.abs(A) + (A == 0).type(A.dtype)))

        # Original clipping: the minimum representable exponent for 8 exp bits is -126
        # min_exp = -(2 ** (exp_bits - 1)) + 2
        # private_exp = private_exp.clip(min=min_exp)

        # Proposed clipping: the subnormal and normal parts have different scales
        # private_exp >= 1: normal scale
        # private_exp < 1: subnormal scale
        private_exp = private_exp.clip(min=1.0)
    else:
        private_exp = None
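To make the comparison concrete, here is a minimal pure-Python sketch of the three formulas above, applied to single elements that are assumed to be already divided by alpha. The helper names `private_exp` and `quantize`, the `clip_exp` parameter, and the choice of exp_bits = 2 with m = 1 (an E2M1-like format) are assumptions made for illustration; this is not the library's implementation, and it ignores saturation and the shared exponent.

```python
import math

def private_exp(x, clip_exp):
    """floor(log2(|x|)) clipped from below; x is an element already divided by alpha."""
    # Guard log2(0) the same way the snippet does: treat 0 as 1, so the exponent is 0.
    mag = abs(x) if x != 0 else 1.0
    return max(math.floor(math.log2(mag)), clip_exp)

def quantize(x, m, clip_exp):
    """Round x to the nearest multiple of quantize_scale = 2**(private_exp - m)."""
    step = 2.0 ** (private_exp(x, clip_exp) - m)
    # Python's round() resolves ties to even; none of the values below are ties.
    return round(x / step) * step

# Assumed format for illustration: exp_bits = 2, m = 1 (E2M1-like).
exp_bits, m = 2, 1
min_exp = -(2 ** (exp_bits - 1)) + 2  # evaluates to 0 for exp_bits = 2

# Columns: input, result with clip(min=min_exp), result with clip(min=1)
for x in (0.3, 1.3, 2.6):
    print(x, quantize(x, m, min_exp), quantize(x, m, 1))
# 0.3 0.5 0.0
# 1.3 1.5 1.0
# 2.6 3.0 3.0
```

Under these assumptions the two clip values only differ for inputs with magnitude below 2: clipping at min_exp (0 here) gives a step of 0.5 there, while clipping at 1 gives a step of 1.0.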

[Attached images: code, image]
