As can be seen in `from esm.utils.constants import esm3 as C`, there are two kinds of tokens:

from esm.utils.constants import esm3 as C

SS8_UNK_TOKEN = 2
SS8_PAD_TOKEN = 0

`ss8_tokens` will be 2 (UNK) by default, but `default_protein_tensor` sets it to 0 (PAD). Although the PAD tokens may be learned as UNK tokens during training, I wonder which is the best "none" token when extracting embeddings from the sequence alone.

Besides, is this inconsistency a bug? Will you fix it in a later release?
The default ss8_token should be 0 for an all masked sequence. Will fix in the next release.
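A minimal sketch of that recommendation, with the constant values copied from the issue above (the helper name `all_masked_ss8` is hypothetical, not part of the esm API; verify the values against your installed version of `esm.utils.constants.esm3`):

```python
# Values quoted in the issue from esm.utils.constants.esm3
SS8_UNK_TOKEN = 2  # "unknown" secondary-structure token
SS8_PAD_TOKEN = 0  # padding token

def all_masked_ss8(seq_len: int) -> list[int]:
    """Build an ss8 track for sequence-only embedding extraction.

    Per the maintainer's comment, an all-masked sequence should use
    token 0 at every position, matching default_protein_tensor.
    """
    return [SS8_PAD_TOKEN] * seq_len

print(all_masked_ss8(5))  # [0, 0, 0, 0, 0]
```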
This should be fixed on main, expect a release to pip in the next week.
ebetica