Version 1.2.0 "Parsley" Update
Description
We are pleased to announce the release of Parsley version 1.2.0
! The major change we made in the new release is a new quantum chemical dataset, which is utilized in training valence parameters in the force field. Parsley 1.2.0
has been fitted to the new, carefully designed quantum chemical dataset and the successive benchmark results showed significantly improved performance in reproducing QM optimized geometries especially for certain phosphonate-containing molecules and for exocyclic divalent and trivalent nitrogen-containing molecules over Parsley 1.1.0
.
Fitting Data and Results
-
Fitting targets: 2nd generation training sets (For the details of the training set generation scheme: Training dataset selection from OFF May virtual meeting 2020.
-
Input force field :
SMIRNOFF99Frosst
w/ some parameter modification (same initial guess used inParsley 1.1.0
fitting) -
The objective function decreased from
3.619e+04
to6.877e+03
in 57 steps. After the optimization, additional Removal of redundancy in t108 SMIRKS pattern was done:[#6X3:1]-@[#16X2,#16X1-1,#16X3+1:2]-@[#6X3,#7X2;r5:3]=@[#6,#7;r5:4]
→[#6X3:1]-@[#16X2,#16X3+1:2]-@[#6X3,#7X2;r5:3]=@[#6,#7;r5:4]
Note: forcebalance v1.7.1 and openforcefield v0.6.0 were used for the target generation and the optimization.
Benchmark Results
For the calculation, full benchmark set was used (25168 optimized geometries, relative energies of 2005 molecules). Details of the test set can be found Here.
Two types of benchmarks were done to compare the performances: (1) QM vs MM optimized geometries and (2) the relative energies between conformers at “QM optimized geometries”. The final objective function value(X2) from FB single point calculation gives a brief overview of the agreement between QM and MM. The lower X2 is, the better the force field reproduces QM structures and energetics.
(1) Reproduction of QM optimized geometries
To investigate the improved performance in reproducing QM optimized geometries, the residual of each molecule, a sum of squared weighted residual of internal coordinates was calculated and plotted.
Equation of residual (demon. for bond, angle, improper torsion are set to be 0.05 Angstrom, 8 degrees and 20 degrees respectively.)
scatter plot of residual from `v1.2.0` - residual from `v1.1.0`
Each y value in the plot (delta residual) is residual from v1.2.0
- residual from v1.1.0
. Negative y value means residual from v1.2.0
is lower than that from v1.1.0
, indicating an improved performance over v1.1.0
. The mean of residual difference is -1.364.
Especially, molecules having deprotonated phosphonate group(RP(=O)(OH)(O^-))
showed significant improvement in v1.2.0, fixing a known issue with protonated phosphorus-connected oxygens which has plagued AMBER-family force fields.
QM optimized geometry of CC(O)(P@@(O)[O-])P@(O)[O-]. ( transparent orange: MM optimized geometry with v1.1.0, transparent green: v1.2.0)
Here’s one of molecules which has two deprotonated phosphonate groups. In the optimized geometry from the v1.1.0
(transparent orange in the figure), the hydroxyl hydrogen is located much closer to the negatively charged oxygen (1.10 Å) than in the QM optimized geometry (> 2.4 Å), forming an overly strong intramolecular hydrogen bond. The v1.2.0
corrects this error and gives a closer agreement with QM optimized geometry by having a larger equilibrium angle for phosphorus-centered angle term(a38
: [*:1]~[#15:2]~[*:3]
).
QM optimized geometry of CC1(O[C@@h]2CO[C@@]3([C@H]([C@@h]2O1)OC(O3)(C)C)COS(=O)(=O)N)C. ( transparent orange in the left figure: MM optimized geometry with v1.1.0, transparent green in the right figure: MM optimized geometry with v1.2.0))
For some conformers of CC1(O[C@@H]2CO[C@@]3([C@H]([C@@H]2O1)OC(O3)(C)C)COS(=O)(=O)N)C
, v1.1.0
introduced a weird spatial orientation of covalent bonds to tetravalent sulfur, resulting in a much smaller angle for N-S-O(63.3 degree) than in QM optimized geometry (97.0 degree). This issue is fixed in v1.2.0
.
(2) Reproduction of QM relative energies between conformers at QM optimized geometries
To investigate the improved performance of the new parameter set in reproducing QM relative energies between conformers, QM vs MM relative energies between conformers “at QM optimized geometries” were calculated. Two different ways to calculate MM relative energies were used. For the left figure, MM relative energies were calculated by taking a difference between MM energy at each point and MM energy at the QM minimum. And for the right figure, MM relative energies were obtained by taking a difference between MM energy at each point and MM energy at the MM minimum. Both distribution plots show lower MAD(mean absolute deviation) compared to v1.1.0
, indicating a slight, but not notable improvement of the new parameter set in reproducing QM energetics.
QM vs MM relative energies between conformers at QM optimized geometries.
Benchmark analysis done by Victoria Lim shows that the v1.1.0 fails to describe (1) molecules having `#7X2-#7X3` (single bond between divalent nitrogen and trivalent nitrogen); (2) azetidines; and (3) octahydrotetracenes. For further details of the benchmark analysis, please see [here](http://doi.org/10.5281/zenodo.3774680)We speculated that the poor performances on the chemical moieties are due to the poor chemical coverage of the training set used in Parsley fitting. In the previous release, periodicities for some N-N rotations were modified to properly describe N-N bonds which are under conjugation effect and the change included the modification of components t128
. However since there were no exocyclic #7X2-#7X3
rotations in the training set, a proper fitting of t128
couldn’t be carried out, resulting in unphysical optimized k values.
ddE w.r.t. RMSD and ddE w.r.t. TFD of divalent and trivalent nitrogen-containing molecules
Our new training set for this release now includes exocyclic rotations about #7X2-#7X3
and the new parameter set fitted to this new set shows clear improvement in lowering TFD and RMSD which indicates it is improved in reproducing QM optimized geometries while showing slight sacrifice in ddE.
QM optimized geometry of Cc1c(sc(n1)N/N=C/c2ccccn2)C. ( transparent orange: MM optimized geometry with v1.1.0, transparent green: MM optimized geometry with v1.2.0)
One example of molecules with exocyclic #7X2-#7X3
. v1.2.0
could successfully reproduce planar structure of Cc1c(sc(n1)N/N=C/c2ccccn2)C
.