
Unique Alphas - how many per layer #10

Open
wanderingweights opened this issue Apr 28, 2023 · 3 comments

Comments

wanderingweights commented Apr 28, 2023

Hi again,

The checkpoints you have released don't seem to include the self.nbit parameters, which makes it impossible to reproduce your results.

Let me know if I'm missing something.

wanderingweights changed the title from "Round pass issue" to "nbit parameters not saved with model releases" on Apr 28, 2023
wanderingweights commented Apr 28, 2023

Ah, apologies. Another issue:

In the released code, there seems to be an issue with the number of unique alphas.

I expected one per attention head, but there appears to be one per input channel:

576 in the first blocks.0.attn.qkv, etc.

Does this seem correct to you? As it stands, there are far too many alphas.

wanderingweights changed the title from "nbit parameters not saved with model releases" to "Unique Alphas - how many per layer" on Apr 28, 2023
YanjingLi0202 (Owner) commented:

> Ah, apologies. Another issue:
>
> In the released code, there seems to be an issue with the number of unique alphas.
>
> I expected one per attention head, but there appears to be one per input channel:
>
> 576 in the first blocks.0.attn.qkv, etc.
>
> Does this seem correct to you? As it stands, there are far too many alphas.

“blocks.0.attn.qkv” is actually the linear layer that produces the query, key, and value. For all linear and conv2d layers, we apply channel-wise activation quantization. Head-wise quantization is used for the query, key, and value themselves (i.e., the activations after the qkv linear layer).
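
As a minimal sketch of the distinction (illustrative only, not this repository's exact code; the module names, tensor shapes, and 4-bit default are placeholder assumptions):

```python
import torch
import torch.nn as nn


class ChannelWiseActQuant(nn.Module):
    """One learnable alpha per channel of the activation feeding a
    linear/conv2d layer (e.g. the input of blocks.0.attn.qkv)."""

    def __init__(self, num_channels, nbits=4):
        super().__init__()
        self.nbits = nbits
        self.alpha = nn.Parameter(torch.ones(num_channels))  # shape [C]

    def forward(self, x):  # x: [B, N, C]
        qmax = 2 ** (self.nbits - 1) - 1
        scale = self.alpha / qmax                      # broadcasts over the channel dim
        x_div = x / scale
        # straight-through estimator for the round/clamp step
        x_q = x_div + (x_div.round().clamp(-qmax - 1, qmax) - x_div).detach()
        return x_q * scale


class HeadWiseActQuant(nn.Module):
    """One learnable alpha per attention head, applied to the query/key/value
    activations *after* the qkv linear layer."""

    def __init__(self, num_heads, nbits=4):
        super().__init__()
        self.nbits = nbits
        self.alpha = nn.Parameter(torch.ones(num_heads))  # shape [H]

    def forward(self, x):  # x: [B, H, N, head_dim]
        qmax = 2 ** (self.nbits - 1) - 1
        scale = (self.alpha / qmax).view(1, -1, 1, 1)  # one scale per head
        x_div = x / scale
        x_q = x_div + (x_div.round().clamp(-qmax - 1, qmax) - x_div).detach()
        return x_q * scale
```

In this sketch, the channel-wise quantizer holds one alpha per activation channel, while the head-wise quantizer holds only num_heads alphas, which is why a qkv linear layer can legitimately carry many more alphas than there are attention heads.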

YanjingLi0202 (Owner) commented:

> Hi again,
>
> The checkpoints you have released don't seem to include the self.nbit parameters, which makes it impossible to reproduce your results.
>
> Let me know if I'm missing something.

To reproduce our results, there is no need to save the nbits in the checkpoint. You can use the "--model" option to select the bit-width of the quantized model.
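
As a rough sketch of that mechanism (the variant names and builder below are placeholders, not this repository's actual model registry): the bit-width is fixed by the model variant chosen with --model, so it is recreated when the model is constructed rather than restored from the checkpoint.

```python
import torch.nn as nn

# Placeholder registry: the variant name chosen via --model fixes the bit-width,
# so nbits never needs to be stored in (or restored from) the checkpoint.
EXAMPLE_VARIANTS = {"example_vit_2bit": 2, "example_vit_3bit": 3, "example_vit_4bit": 4}


def build_model(name):
    nbits = EXAMPLE_VARIANTS[name]
    model = nn.Linear(192, 576)   # stand-in for the full quantized ViT
    model.nbits = nbits           # recreated from the chosen variant, not loaded
    return model


model = build_model("example_vit_4bit")
print(model.nbits)                     # 4, without reading nbits from a checkpoint
print("nbits" in model.state_dict())   # False: the checkpoint only stores weights
```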
