Hi, thank you for sharing the code and the weights with the community and congrats on your results and subsequent work.
I wanted to ask about the single-chain model. You state in your paper:
"We first sought to improve performance of the model on recovering the amino acid sequences of native monomeric proteins given their backbone structures, using as training and validation sets 19.7k high resolution single-chain structures from the PDB split based on the CATH protein classification"
Could you make this list of 19.7k PDB IDs available? If not, could you clarify the following points:
What does "high resolution" mean exactly? Is it the same resolution cutoff as for the rest of the paper, 3.5 Å?
Are there any sequence length constraints? For the multi-chain model you state: less than 10,000 residues.
Is the cutoff date the same as for the rest of the paper, Aug 02, 2021?
Are there any other filters in place, such as discarding chains that have too many missing residues or too high a coil content?
Using the guidelines in your paper I end up with over 60,000 distinct PDB IDs. I am not sure how to reach your 19.7k set.
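To make the criteria above concrete, here is a minimal sketch of the kind of filter I have in mind. The record fields (`pdb_id`, `num_chains`, `resolution`, `num_residues`, `deposit_date`), the function name, and the toy entries are all hypothetical, and the thresholds are only the ones quoted above for the rest of the paper. Whether the cutoffs are inclusive is also an assumption. This is not the authors' pipeline.

```python
from datetime import date

# Thresholds quoted in the question; inclusivity of each bound is an assumption.
MAX_RESOLUTION = 3.5            # in angstroms
MAX_RESIDUES = 10_000           # "less than 10,000 residues"
CUTOFF_DATE = date(2021, 8, 2)  # Aug 02, 2021


def passes_filters(entry):
    """Return True if a (hypothetical) metadata record meets all criteria."""
    return (
        entry["num_chains"] == 1
        and entry["resolution"] is not None  # e.g. NMR entries have no resolution
        and entry["resolution"] <= MAX_RESOLUTION
        and entry["num_residues"] < MAX_RESIDUES
        and entry["deposit_date"] <= CUTOFF_DATE
    )


# Toy records with made-up IDs and values, for illustration only.
entries = [
    {"pdb_id": "1abc", "num_chains": 1, "resolution": 1.8,
     "num_residues": 150, "deposit_date": date(2010, 5, 1)},
    {"pdb_id": "2xyz", "num_chains": 2, "resolution": 2.1,
     "num_residues": 400, "deposit_date": date(2015, 3, 9)},
    {"pdb_id": "3low", "num_chains": 1, "resolution": 4.2,
     "num_residues": 220, "deposit_date": date(2012, 7, 20)},
    {"pdb_id": "4new", "num_chains": 1, "resolution": 2.0,
     "num_residues": 300, "deposit_date": date(2022, 1, 15)},
    {"pdb_id": "5nmr", "num_chains": 1, "resolution": None,
     "num_residues": 90, "deposit_date": date(2008, 11, 2)},
]

# Collect distinct IDs that pass every filter.
selected_ids = sorted({e["pdb_id"] for e in entries if passes_filters(e)})
print(selected_ids)
```

A filter like this, with no extra criteria such as missing-residue limits or CATH-based redundancy reduction, is what leaves me with far more than 19.7k entries.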
Thank you