
score of softmax on Text4k; linformer-256 & nystrom-64 don't work #15

Open
ZiweiHe opened this issue Mar 24, 2022 · 1 comment


ZiweiHe commented Mar 24, 2022

Hi,

Thanks for the excellent work!

I found a few issues in my trials (I didn't change anything in the code):

  1. Using softmax attention on Text4k, I got ~63.7 accuracy instead of the 65.02 reported in your paper.
  2. With linear attention on Text4k, I got ~64 accuracy, which is even higher than the vanilla transformer. Did you get the same result on your side?
  3. The attention types linformer-256 and nystrom-64 don't work; the errors are either dimension mismatches or config key errors. It seems that not all attention types run successfully in the released code. I didn't try all the choices, though.

Thank you for your time; I look forward to your reply.

Ziwei

mlpen (Owner) commented Mar 28, 2022

Are you using the code from LRA? That config file is an example. To run LRA with other attention mechanisms, you can modify "attn_type" (see the possible attention methods in the code) and add the settings specific to that attention type.
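For illustration, a minimal sketch of what such a config change might look like, assuming a Python config dict in the style of the LRA code; the key names below ("attn_type", "linformer_k", "num_landmarks", and the rest) are assumptions, not the repository's actual schema, so check them against the real config file in the repo.

```python
# Hypothetical sketch of switching attention types in an LRA-style config.
# All key names here are illustrative; verify them against the repository's
# actual LRA config file before running.

base_model = {
    "attn_type": "softmax",   # vanilla transformer attention
    "num_layers": 2,
    "num_head": 2,
    "embedding_dim": 64,
    "transformer_dim": 64,
    "max_seq_len": 4096,      # Text4k uses 4k-token sequences
}

# Linformer needs its projection dimension in addition to attn_type.
linformer_256 = dict(base_model, attn_type="linformer", linformer_k=256)

# Nystrom attention needs the number of landmark points.
nystrom_64 = dict(base_model, attn_type="nystrom", num_landmarks=64)
```

The point is that each attention variant carries its own extra hyperparameters, so changing "attn_type" alone is usually not enough; missing variant-specific settings would explain the config key errors reported above.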
