Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add example notebook for computing conformer energies #533

Merged
merged 12 commits into from
May 1, 2020

Conversation

jchodera
Copy link
Member

@jchodera jchodera commented Mar 1, 2020

This PR adds an example illustrating computing the vacuum energies of different conformers of one or more molecules.

@jchodera jchodera requested a review from j-wags March 1, 2020 00:57
@jchodera
Copy link
Member Author

jchodera commented Mar 1, 2020

@j-wags : In preparing this example, I noticed three opportunities for improving our API. We could split these into separate issues if you agree they might be useful:

  1. Molecule.load_file() does not automatically coalesce multiconformer molecules into a single Molecule object. We could change the default or optional behavior to do this, integrating the small code snippet I use in the notebook to do so.
  2. Computing the vacuum energy of a molecule in a force field is a common operation we might want to make simpler, such as molecule.compute_energy(forcefield) or forcefield.compute_vacuum_energy(molecule).
  3. Minimizing a molecule conformation might also be made easier, such as Molecule.optimize_conformations(forcefield)

(1) seems like a clear win; (2) and (3) might benefit from waiting for our new portable system container object.

Copy link
Member

@j-wags j-wags left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great. Thanks! In response to your points:

Molecule.load_file() does not automatically coalesce multiconformer molecules into a single Molecule object. We could change the default or optional behavior to do this, integrating the small code snippet I use in the notebook to do so.

I started realizing the trouble with multiconformer molecules in #281 -- Basically, if they have associated properties, it becomes hard to determine whether the user wants them to be multiple conformers of the same molecule, or multiple molecules. At least in the case of SDF, I'm going to err on the side of "always treat them like different molecules". We could open this discussion up on different formats (like mol2), though.

Computing the vacuum energy of a molecule in a force field is a common operation we might want to make simpler, such as molecule.compute_energy(forcefield) or forcefield.compute_vacuum_energy(molecule).
Minimizing a molecule conformation might also be made easier, such as Molecule.optimize_conformations(forcefield)

I agree that these functions would be useful. This is harkening back to the decision about openforcefield.utils.structure -- it's handy stuff to provide the user with, but it requires us to take ownership over a lot of things that aren't in the pathway of "just parameterize the molecule", and it starts a slippery slope of "just throw every handy little function in here", which ends up with a large maintenance burden. So, if we can draw a clear dividing line between what we will and won't bundle, I'm fine with it. I saw your post of where to bundle these functions, and will continue that discussion on Slack.

@jchodera
Copy link
Member Author

jchodera commented Mar 3, 2020

I started realizing the trouble with multiconformer molecules in #281 -- Basically, if they have associated properties, it becomes hard to determine whether the user wants them to be multiple conformers of the same molecule, or multiple molecules. At least in the case of SDF, I'm going to err on the side of "always treat them like different molecules". We could open this discussion up on different formats (like mol2), though.

What about a simple optional API extension, like

from openforcefield.topology import Molecule
# Collapse sequential multiple conformers into one molecule when possible
multiconformer_molecule = Molecule.from_file('conformers.sdf', use_multiconformer_molecules=True)
# Load them as separate molecules
separate_molecules = Molecule.from_file('conformers.sdf')

A little bit of code at the end of Molecule.from_file() could simply coalesce sequential conformers using the same code that I use above when use_multiconformer_molecules=True, and this would work with any toolkit and file format.

I agree that these functions would be useful. This is harkening back to the decision about openforcefield.utils.structure -- it's handy stuff to provide the user with, but it requires us to take ownership over a lot of things that aren't in the pathway of "just parameterize the molecule", and it starts a slippery slope of "just throw every handy little function in here", which ends up with a large maintenance burden. So, if we can draw a clear dividing line between what we will and won't bundle, I'm fine with it.

I think I'd be happy to have these functions make it into the portable system object, since we could then do things like

potential = forcefield.create_system(topology).potential_energy
minimized_conformers = forcefield.create_system(topology).minimize().conformers

and hide all the complexity.

@davidlmobley
Copy link
Contributor

@jchodera are you adding tests for this one? I believe all of our other notebooks are actually tested by the CI.

@j-wags
Copy link
Member

j-wags commented Mar 10, 2020 via email

@davidlmobley
Copy link
Contributor

OK, so we should be able to go ahead and check off the add-tests box, then update and merge?

@jchodera
Copy link
Member Author

@davidlmobley : Yes, it was automatically being tested, but some tests were failing because they did not include the OpenEye toolkit and didn't support Molecule.from_iupac(). I've changed this to use Molecule.from_smiles().

@jchodera
Copy link
Member Author

Just waiting for tests to pass now!

@jchodera
Copy link
Member Author

@j-wags : It looks like the RDKit backend automatically collates multiple conformers from loaded SDF molecules into a multi-conformer molecule (?), while RDKit loads them as separate molecules. I'll file an issue.

@j-wags j-wags merged commit 9c33fad into master May 1, 2020
@j-wags j-wags deleted the compute-energies-example branch May 1, 2020 15:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants