Clashing binding poses #24

Closed
msuruzhon opened this issue Feb 28, 2022 · 14 comments · Fixed by #52

@msuruzhon

Hello,

I have been looking into the CDK2 test set and ran into numerical issues caused by severe steric clashes involving some of the ligands (1h1s, 28 and 29). I have attached an example picture for reference:

image

One of the protein hydrogen atoms is in very close proximity to one of the ligand oxygen atoms. Is this a known issue with this part of the test set? It might make sense to change the input coordinates because as it stands some of the systems are not possible to run "out of the box". Any suggestions will be appreciated.

Many thanks.

@ppxasjsm
Contributor

ppxasjsm commented Mar 2, 2022

@jchodera, @dfhahn any ideas what might be going on here? Has this input not been used for a whole benchmark run?

@davidlmobley

I agree that looks funky. This may be what was actually run, though; in general there’s a lot of inherited stuff here that is only gradually getting filtered out. For example, in some of the earlier benchmarking work from others that this built on, there were missing loops and residues in some of the protein structures, varied handling of water across targets, etc. In other words, all kinds of issues. HOWEVER, free energy calculations often gave reasonable results anyway. We (as a community) are beginning to get some of those problems removed, curated out, etc., but there’s likely still a lot more to be done.

@davidlmobley

I'll leave it to @dfhahn and @ldamore to comment specifically.

@dfhahn
Collaborator

dfhahn commented Mar 10, 2022

Hi @msuruzhon, @ppxasjsm, thanks for raising this issue. It is indeed not an ideal starting pose, and it presumably originates from aligning the core of the ligands to the crystal structure ligands.
This was the structure used for the previous benchmark runs, which is why it is included here as well. An energy minimization, at least in the Gromacs/pmx workflow, resolved the clash.
As @davidlmobley pointed out, there are still many inherited issues in this set which need to be removed.

@ppxasjsm
Contributor

I am struggling to see how this is a benchmark dataset if we can't use the inputs as benchmarks. I can understand that there may be some inherited issues, but a dataset with steric clashes that don't easily resolve in a minimization doesn't really seem sensible to push in the first place. Why not use the minimized/equilibrated structures that work with Gromacs? Is there anyone from OpenFF working on this at the moment? Does OpenFF not run automated benchmarks at the moment?

@davidlmobley

I believe we normally start with an energy minimization.

@ppxasjsm
Contributor

I agree, but what if the minimisation doesn’t resolve the clashes?

@davidlmobley

That would be a problem, but all of these ARE successfully used for our binding free energy benchmarking. I'm also surprised by the clash, and I suppose it could be resolved by depositing the minimized structure instead, but is that what we want? I'm not sure.

@ppxasjsm
Contributor

I see! Is all the information needed to reproduce your benchmarks successfully in the repo? What would be the approach to propose alternative input used, that worked in a different set of benchmarks. Would it be helpful to have the input and successfully run protocols and final outputs (not trajectories) available in this case?

@dfhahn
Collaborator

dfhahn commented Mar 11, 2022

All steric clashes that are present were easily resolved by energy minimization. But I agree it would be more sensible to provide the minimized structures, although that could lead to less well-aligned ligand sets and could (presumably only slightly) break compatibility with previously run benchmarks.

> What would be the approach to propose alternative input used, that worked in a different set of benchmarks.

I guess we want to have only one input, not alternative ones, as that might be confusing. I would suggest creating PRs with better structures, which will go into the next releases. Then you can point to the release used when reporting results.

> Would it be helpful to have the input and successfully run protocols and final outputs (not trajectories) available in this case?

What do you mean by having successfully run protocols? Just naming them, or linking to a repo with the protocol? Having the output could be an option, but does it add value? Will people use it for something?

@bcossins

Hi,
Just to follow on from Miro's post: we have found that some clashes for CDK2 were not resolvable, and we tried minimising with a few different protocols, including GMX-2021 on a CPU. The image shows a clash that goes through a lysine side chain. We are using standard, well-tested settings for minimisation.

There were various other clashes for a few other systems that made us want to adjust the inputs to our minimisations. Doing so would make reproducibility and good comparisons more difficult. Removing these clashes will make these files more usable, since at the moment anyone who encounters the same problems as us would have to make up their own alternative inputs.

image (12)

@davidlmobley

Propagating this to #binding-benchmarks on OpenFF Slack; I'm thinking maybe we should also have minimized and/or cleaned up structures here and deprecate those with clashes... The important thing is to document, I think.

@dotsdl dotsdl added this to the Release 0.3.0 milestone Mar 23, 2022
@dotsdl dotsdl assigned IAlibay and unassigned ldamore Apr 19, 2022
@dotsdl
Member

dotsdl commented Apr 20, 2022

Closing this out may be dependent on resolving #20 first, but we can still find a solution to the clashing problem in the meantime before writing out new sets of structure files in a PR.

@IAlibay
Collaborator

IAlibay commented May 3, 2022

Note: To be updated with more information

Affected systems

Using a distance cutoff of 0.5 A, the following systems have at least one clash for at least one of their ligands:

  • JNK1
  • PDE2
  • P38
  • CDK2

MCL1

  • lig_41: atom 1: 3HD1, LEU, atom 2: C38, LIG, distance: 0.49 A

BACE_hunt

  • lig_41: atom 1: HA, GLN, atom 2: H40, LIG, distance: 0.29 A

PFKFB3

  • lig_42: atom 1: 1HG2, VAL, atom 2: H45, LIG, distance: 0.48 A
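
For anyone who wants to run a similar check on their own inputs, here is a minimal sketch of a distance-cutoff clash scan. It is not the script used to produce the list above; the `Atom` class, the `find_clashes` function and the example coordinates are all hypothetical, and it assumes you have already extracted protein and ligand atoms (name, residue name, coordinates in A) from the structure files.

```python
from dataclasses import dataclass
from math import dist


@dataclass(frozen=True)
class Atom:
    """Hypothetical minimal atom record: name, residue name, xyz in Angstrom."""
    name: str
    resname: str
    xyz: tuple


def find_clashes(protein, ligand, cutoff=0.5):
    """Return (protein_atom, ligand_atom, distance) for every protein-ligand
    pair closer than `cutoff` (Angstrom), sorted by increasing distance."""
    clashes = []
    for p in protein:
        for l in ligand:
            d = dist(p.xyz, l.xyz)
            if d < cutoff:
                clashes.append((p, l, d))
    return sorted(clashes, key=lambda c: c[2])


# Made-up coordinates, chosen only to illustrate the report format.
protein = [
    Atom("HA", "GLN", (0.0, 0.0, 0.0)),
    Atom("CA", "GLN", (1.0, 0.0, 0.0)),
]
ligand = [Atom("H40", "LIG", (0.29, 0.0, 0.0))]

for p, l, d in find_clashes(protein, ligand):
    print(f"atom 1: {p.name}, {p.resname}, "
          f"atom 2: {l.name}, {l.resname}, distance: {d:.2f} A")
```

The all-pairs loop is quadratic in the number of atoms, which is fine for a single binding site but slow for whole solvated systems; a neighbour search (for example a KD-tree) would be the usual replacement there.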
