
Unique Alphas - how many per layer #10

Open
wanderingweights opened this issue Apr 28, 2023 · 3 comments

Comments

wanderingweights commented Apr 28, 2023

Hi again,

The checkpoints you have released don't seem to include the self.nbit parameters, which makes it impossible to reproduce your results.

Let me know if I'm missing something.

wanderingweights changed the title from "Round pass issue" to "nbit parameters not saved with model releases" on Apr 28, 2023
wanderingweights commented Apr 28, 2023

Ah, apologies. Another issue:

In the released code, there seems to be an issue with the number of unique alphas.

I expected one per attention head, but there appears to be one per input channel:

576 in the first blocks.0.attn.qkv, etc.

Does this seem correct to you? As it stands, there are far too many alphas.

wanderingweights changed the title from "nbit parameters not saved with model releases" to "Unique Alphas - how many per layer" on Apr 28, 2023
YanjingLi0202 (Owner) commented:

> Ah, apologies. Another issue:
>
> In the released code, there seems to be an issue with the number of unique alphas.
>
> I expected one per attention head, but there appears to be one per input channel:
>
> 576 in the first blocks.0.attn.qkv, etc.
>
> Does this seem correct to you? As it stands, there are far too many alphas.

“blocks.0.attn.qkv” is actually the linear layer that produces the query, key, and value. For all linear and conv2d layers, we apply channel-wise activation quantization. Head-wise quantization is used for the query, key, and value themselves (i.e., the activations after the qkv linear layer).
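
As a minimal sketch of the distinction (illustrative only, not this repository's exact code; the module names, tensor shapes, and 4-bit default are placeholder assumptions):

```python
import torch
import torch.nn as nn


class ChannelWiseActQuant(nn.Module):
    """One learnable alpha per channel of the activation feeding a
    linear/conv2d layer (e.g. the input of blocks.0.attn.qkv)."""

    def __init__(self, num_channels, nbits=4):
        super().__init__()
        self.nbits = nbits
        self.alpha = nn.Parameter(torch.ones(num_channels))  # shape [C]

    def forward(self, x):  # x: [B, N, C]
        qmax = 2 ** (self.nbits - 1) - 1
        scale = self.alpha / qmax                      # broadcasts over the channel dim
        x_div = x / scale
        # straight-through estimator for the round/clamp step
        x_q = x_div + (x_div.round().clamp(-qmax - 1, qmax) - x_div).detach()
        return x_q * scale


class HeadWiseActQuant(nn.Module):
    """One learnable alpha per attention head, applied to the query/key/value
    activations *after* the qkv linear layer."""

    def __init__(self, num_heads, nbits=4):
        super().__init__()
        self.nbits = nbits
        self.alpha = nn.Parameter(torch.ones(num_heads))  # shape [H]

    def forward(self, x):  # x: [B, H, N, head_dim]
        qmax = 2 ** (self.nbits - 1) - 1
        scale = (self.alpha / qmax).view(1, -1, 1, 1)  # one scale per head
        x_div = x / scale
        x_q = x_div + (x_div.round().clamp(-qmax - 1, qmax) - x_div).detach()
        return x_q * scale
```

In this sketch, the channel-wise quantizer holds one alpha per activation channel, while the head-wise quantizer holds only num_heads alphas, which is why a qkv linear layer can legitimately carry many more alphas than there are attention heads.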

YanjingLi0202 (Owner) commented:

> Hi again,
>
> The checkpoints you have released don't seem to include the self.nbit parameters, which makes it impossible to reproduce your results.
>
> Let me know if I'm missing something.

To reproduce our results, there is no need to save the nbits in the checkpoint. You can use the "--model" option to select the bit-width of the quantized model.
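
As a rough sketch of that mechanism (the variant names and builder below are placeholders, not this repository's actual model registry): the bit-width is fixed by the model variant chosen with --model, so it is recreated when the model is constructed rather than restored from the checkpoint.

```python
import torch.nn as nn

# Placeholder registry: the variant name chosen via --model fixes the bit-width,
# so nbits never needs to be stored in (or restored from) the checkpoint.
EXAMPLE_VARIANTS = {"example_vit_2bit": 2, "example_vit_3bit": 3, "example_vit_4bit": 4}


def build_model(name):
    nbits = EXAMPLE_VARIANTS[name]
    model = nn.Linear(192, 576)   # stand-in for the full quantized ViT
    model.nbits = nbits           # recreated from the chosen variant, not loaded
    return model


model = build_model("example_vit_4bit")
print(model.nbits)                     # 4, without reading nbits from a checkpoint
print("nbits" in model.state_dict())   # False: the checkpoint only stores weights
```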
