protein.pdb files are not valid PDB files #20
Comments
I've changed the issue title to reflect the fact that many of these We will likely have to remediate all of these files in order for them to be useful. Here's another example: the CDK2
@dfhahn: Can you point me to the scripts you used to generate these files? I can see if I can find a different route that uses much the same geometry/models but produces valid PDB files that can be processed by programs that expect PDB files to comply with the PDB format specification.
@jchodera I do not have the scripts that generated these files. They come from public sources to ensure compatibility with former calculations. E.g. for thrombin, it is Vytas Gapsys's work. I think they were generated with GROMACS.
I agree these files should comply with the PDB format specification. It would be great if we changed the format without touching the coordinates.
The PDBs in the repository were generated by pdb2gmx and are compatible with the GROMACS-based topologies that are also in the same GitHub repository. This is why the connectivities are not present, and why the residue numbering and some nonstandard residue names are there: this information is in the topology files.
@vgapsys: Do we have the original topology and coordinate files from which these were generated? I wonder if there is a way to generate new PDB files from the source information that are compliant with the PDB standard, so that other packages could use these files as well.
This issue is blocking several others in the 0.3.0 milestone. Is it possible to resolve it this week, or at least before EOW next week? We are tentatively aiming for the 0.3.0 release by 2022.05.31, and there will be follow-up work required after this issue is resolved.
Shooting for end of this week for fixed PDBs and re-docked ligands.
The thrombin protein.pdb file appears to have several defects that make it noncompliant with the PDB format specification.
- SEQRES sequence information
- ACE and NME, but residues are numbered sequentially
- TER records denoting the chain breaks
- CONECT records that would be required by ACE and NME since these are nonstandard residues.

I'm not quite sure where the current file comes from. Was it generated by Spruce?
Is there another alternative PDB file that is more compliant with the PDB format that others have been using?
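
For anyone who wants to screen the other protein.pdb files for the same symptoms, here is a minimal sketch of a record-level check. It is not a full PDB validator and is not taken from this repository: the function names `summarize_pdb_records` and `find_possible_defects` are hypothetical, and the three heuristics (no SEQRES, no TER, ACE/NME present without CONECT) are only an assumption about what a quick check might look for. It relies on two fixed-column facts from the PDB format: the record name occupies columns 1-6 and the residue name occupies columns 18-20 of ATOM/HETATM lines.

```python
from collections import Counter


def summarize_pdb_records(pdb_text):
    """Count record names (columns 1-6) and collect residue names
    (columns 18-20) from ATOM/HETATM lines of a PDB-format string."""
    records = Counter()
    residue_names = set()
    for line in pdb_text.splitlines():
        record = line[:6].strip()
        if not record:
            continue  # skip blank lines
        records[record] += 1
        if record in ("ATOM", "HETATM"):
            residue_names.add(line[17:20].strip())
    return records, residue_names


def find_possible_defects(pdb_text):
    """Return a list of human-readable warnings (heuristic only)."""
    records, residue_names = summarize_pdb_records(pdb_text)
    defects = []
    # Record types that the file is expected to carry.
    for required in ("SEQRES", "TER"):
        if records[required] == 0:
            defects.append(f"no {required} records")
    # Capping groups are flagged only when no CONECT records exist at all.
    caps = residue_names & {"ACE", "NME"}
    if caps and records["CONECT"] == 0:
        defects.append("ACE/NME caps present but no CONECT records")
    return defects
```

To use it, read a file and print the warnings, e.g. `print(find_possible_defects(open("protein.pdb").read()))`. An empty list only means none of these three symptoms was found, not that the file is fully compliant.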