Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

QChem: add CMIRS solvent model support #2215

Merged
merged 40 commits into from
Nov 22, 2022
Merged
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
40 commits
Select commit Hold shift + click to select a range
ef5d3e7
QChem: translate DMSO name in smd_solvent
rkingsbury Aug 11, 2021
970b085
QCInput: support $svp and $pcm_nonels
rkingsbury Aug 11, 2021
f02a4c2
QCinput more thorough docstring for svp and pcm_nonels
rkingsbury Aug 11, 2021
bf4832c
QChemDictSet: support svp and pcm_nonels
rkingsbury Aug 11, 2021
bf95315
bugfix
rkingsbury Aug 11, 2021
6de0663
bugfix
rkingsbury Aug 11, 2021
286efce
don't lowercase svp block
rkingsbury Aug 11, 2021
f2773ee
Merge branch 'master' of https://github.com/materialsproject/pymatgen…
rkingsbury Aug 7, 2022
f343d1d
Merge branch 'master' of https://github.com/materialsproject/pymatgen…
rkingsbury Sep 21, 2022
9ad3ed7
QCInput: dmso bugfix
rkingsbury Sep 21, 2022
23147b6
QCInput: format svp as dict; update authors+license
rkingsbury Sep 21, 2022
eb0a6fe
QCInput: fix read_svp for dict format
rkingsbury Sep 21, 2022
805b71e
QCInput: type conversion fix in read_svp
rkingsbury Sep 21, 2022
aeabaeb
QCInput: make everything a str for consistency
rkingsbury Sep 21, 2022
e5da4ea
utils: update docstring of lower_and_check_unique
rkingsbury Sep 21, 2022
f114e86
QChemDictSet: ISOSVP/CMIRS support; more coming
rkingsbury Sep 21, 2022
e24e701
QChemDictSet: add warnings and tests for bad inputs
rkingsbury Sep 22, 2022
e7c7c97
QChemDictSet: add CMIRS 0.0005 params; type hinting
rkingsbury Sep 22, 2022
8f241e1
QChem: refactor and test lower_and_check_unique
rkingsbury Sep 22, 2022
f667018
QChem: propagate CMIRS/ISOSVP to all InputSet
rkingsbury Sep 22, 2022
6a15781
QChem: CMIRS and ISOSVP tests for SinglePointSet
rkingsbury Sep 22, 2022
5b7c065
QChem: author/credit/copyright updates
rkingsbury Sep 22, 2022
668ea14
QCInput: desc for pcm_nonels and svp templates etc.
rkingsbury Sep 23, 2022
5023658
Merge branch 'master' of https://github.com/materialsproject/pymatgen…
rkingsbury Sep 23, 2022
7e4e511
QCInput: revert is not None for svp and pcm_nonels
rkingsbury Sep 23, 2022
38de6d2
QCOutput: WIP CMIRS and ISOSVP parsing
rkingsbury Sep 23, 2022
9865c77
QChem: restore __credits__; author update in test_outputs
rkingsbury Sep 26, 2022
db79c93
Merge branch 'master' of https://github.com/materialsproject/pymatgen…
rkingsbury Nov 1, 2022
29ed510
revert minor author changes
rkingsbury Nov 1, 2022
8210eb2
QCOutput: revert typo
rkingsbury Nov 1, 2022
68db373
QCOutput: revert comment about possible bug
rkingsbury Nov 1, 2022
dd69c8a
QCOutput: update multi test files for CMIRS/ISOSVP
rkingsbury Nov 1, 2022
1a16d22
QCOutput: add unit tests for CMIRS / ISOSVP
rkingsbury Nov 2, 2022
01b0744
QCOutput: ensure all solvent keys populated; upd tests
rkingsbury Nov 2, 2022
8b04d30
QCOutput: nest ISOSVP/CMIRS data in solvent_data
rkingsbury Nov 2, 2022
a873aaa
QCOutput: bugfix for PCM and SMD solvent_data
rkingsbury Nov 4, 2022
35f31f8
QChem: disable gen_scfman in ISOSVP calcs
rkingsbury Nov 8, 2022
40f1056
QChem: add gen_scfman comment to docstring
rkingsbury Nov 8, 2022
03ce038
QChem: add ipnrf flag for CMIRS
rkingsbury Nov 9, 2022
df3a879
QChem: update test for ipnrf
rkingsbury Nov 9, 2022
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
128 changes: 122 additions & 6 deletions pymatgen/io/qchem/inputs.py
Original file line number Diff line number Diff line change
Expand Up @@ -18,10 +18,8 @@

from .utils import lower_and_check_unique, read_pattern, read_table_pattern

__author__ = "Brandon Wood, Samuel Blau, Shyam Dwaraknath, Julian Self, Evan Spotte-Smith"
__copyright__ = "Copyright 2018, The Materials Project"
__version__ = "0.1"
__email__ = "[email protected]"
__author__ = "Brandon Wood, Samuel Blau, Shyam Dwaraknath, Evan Spotte-Smith, Ryan Kingsbury"
__copyright__ = "Copyright 2018-2022, The Materials Project"
__credits__ = "Xiaohui Qu"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Probably Xiaohui should still get credit. His original work inspired what's now in pymatgen.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I guess I'm not clear on the distinction between __credits__ and __authors__ (I might just be ignorant). Those seemed equivalent to me. Happy to revert though if __credits__ implies origination or some other special status.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I did make sure to add Xiaohui to the __authors__ list for any file where he was listed in __credits__, in case that wasn't obvious


logger = logging.getLogger(__name__)
Expand Down Expand Up @@ -50,6 +48,8 @@ def __init__(
plots: dict | None = None,
nbo: dict | None = None,
geom_opt: dict | None = None,
svp: dict | None = None,
pcm_nonels: dict | None = None,
):
"""
Args:
Expand Down Expand Up @@ -110,6 +110,8 @@ def __init__(
self.plots = lower_and_check_unique(plots)
self.nbo = lower_and_check_unique(nbo)
self.geom_opt = lower_and_check_unique(geom_opt)
self.svp = lower_and_check_unique(svp)
self.pcm_nonels = lower_and_check_unique(pcm_nonels)

# Make sure rem is valid:
# - Has a basis
Expand Down Expand Up @@ -194,6 +196,13 @@ def __str__(self):
if self.geom_opt is not None:
combined_list.append(self.geom_opt_template(self.geom_opt))
combined_list.append("")
# svp section
if self.svp:
rkingsbury marked this conversation as resolved.
Show resolved Hide resolved
combined_list.append(self.svp_template(self.svp))
combined_list.append("")
# pcm_nonels section
if self.pcm_nonels:
rkingsbury marked this conversation as resolved.
Show resolved Hide resolved
combined_list.append(self.pcm_nonels_template(self.pcm_nonels))
return "\n".join(combined_list)

@staticmethod
Expand Down Expand Up @@ -238,6 +247,8 @@ def from_string(cls, string: str) -> QCInput:
plots = None
nbo = None
geom_opt = None
svp = None
pcm_nonels = None
if "opt" in sections:
opt = cls.read_opt(string)
if "pcm" in sections:
Expand All @@ -256,6 +267,10 @@ def from_string(cls, string: str) -> QCInput:
nbo = cls.read_nbo(string)
if "geom_opt" in sections:
geom_opt = cls.read_geom_opt(string)
if "svp" in sections:
svp = cls.read_svp(string)
if "pcm_nonels" in sections:
pcm_nonels = cls.read_pcm_nonels(string)
return cls(
molecule,
rem,
Expand All @@ -269,6 +284,8 @@ def from_string(cls, string: str) -> QCInput:
plots=plots,
nbo=nbo,
geom_opt=geom_opt,
svp=svp,
pcm_nonels=pcm_nonels,
)

@staticmethod
Expand Down Expand Up @@ -429,6 +446,9 @@ def smx_template(smx: dict) -> str:
for key, value in smx.items():
if value == "tetrahydrofuran":
smx_list.append(f" {key} thf")
# Q-Chem bug, see https://talk.q-chem.com/t/smd-unrecognized-solvent/204
elif value == "dimethyl sulfoxide":
smx_list.append(f" {key} dmso")
else:
smx_list.append(f" {key} {value}")
smx_list.append("$end")
Expand Down Expand Up @@ -519,14 +539,35 @@ def nbo_template(nbo: dict) -> str:
nbo_list.append("$end")
return "\n".join(nbo_list)

@staticmethod
def svp_template(svp: dict) -> str:
"""
Template for the $svp section.

Args:
svp: dict of SVP parameters, e.g.
{"rhoiso": "0.001", "nptleb": "1202", "itrngr": "2", "irotgr": "2"}

Returns:
str: the $svp section. Note that all parameters will be concatenated onto
a single line formatted as a FORTRAN namelist. This is necessary
because the isodensity SS(V)PE model in Q-Chem calls a secondary code.
"""
svp_list = []
svp_list.append("$svp")
param_list = [f"{_key}={value}" for _key, value in svp.items()]
svp_list.append(", ".join(param_list))
svp_list.append("$end")
return "\n".join(svp_list)

@staticmethod
def geom_opt_template(geom_opt: dict) -> str:
"""
Args:
geom_opt ():

Returns:
(str)
(str) geom_opt parameters.
"""
geom_opt_list = []
geom_opt_list.append("$geom_opt")
Expand All @@ -535,6 +576,36 @@ def geom_opt_template(geom_opt: dict) -> str:
geom_opt_list.append("$end")
return "\n".join(geom_opt_list)

@staticmethod
def pcm_nonels_template(pcm_nonels: dict) -> str:
"""
Template for the $pcm_nonels section.

Arg
pcm_nonels: dict of CMIRS parameters, e.g.
{
"a": "-0.006736",
"b": "0.032698",
"c": "-1249.6",
"d": "-21.405",
"gamma": "3.7",
"solvrho": "0.05",
"delta": 7,
"gaulag_n": 40,
}

Returns:
(str)
"""
pcm_nonels_list = []
pcm_nonels_list.append("$pcm_nonels")
for key, value in pcm_nonels.items():
# if the value is None, don't write it to output
rkingsbury marked this conversation as resolved.
Show resolved Hide resolved
if value is not None:
pcm_nonels_list.append(f" {key} {value}")
pcm_nonels_list.append("$end")
return "\n".join(pcm_nonels_list)

@staticmethod
def find_sections(string: str) -> list:
"""
Expand Down Expand Up @@ -767,6 +838,9 @@ def read_smx(string: str) -> dict:
smx[key] = val
if smx["solvent"] == "tetrahydrofuran":
smx["solvent"] = "thf"
# Q-Chem bug, see https://talk.q-chem.com/t/smd-unrecognized-solvent/204
elif smx["solvent"] == "dimethyl sulfoxide":
smx["solvent"] = "dmso"
return smx

@staticmethod
Expand Down Expand Up @@ -859,7 +933,7 @@ def read_geom_opt(string: str) -> dict:
string (str): String

Returns:
(dict) geom_opt parameters.
(dict) nbo parameters.
rkingsbury marked this conversation as resolved.
Show resolved Hide resolved
"""
header = r"^\s*\$geom_opt"
row = r"\s*([a-zA-Z\_]+)\s*=?\s*(\S+)"
Expand All @@ -872,3 +946,45 @@ def read_geom_opt(string: str) -> dict:
for key, val in geom_opt_table[0]:
geom_opt[key] = val
return geom_opt

@staticmethod
def read_svp(string: str) -> dict:
"""
Read svp parameters from string.
"""
header = r"^\s*\$svp"
row = r"(\w.*)\n"
footer = r"^\s*\$end"
svp_table = read_table_pattern(string, header_pattern=header, row_pattern=row, footer_pattern=footer)
if svp_table == []:
print("No valid svp inputs found.")
return {}
svp_list = svp_table[0][0][0].split(", ")
svp_dict = {}
for s in svp_list:
svp_dict[s.split("=")[0]] = s.split("=")[1]
return svp_dict

@staticmethod
def read_pcm_nonels(string: str) -> dict:
"""
Read pcm_nonels parameters from string.

Args:
string (str): String

Returns:
(dict) PCM parameters
"""
header = r"^\s*\$pcm_nonels"
row = r"\s*([a-zA-Z\_]+)\s+(.+)"
footer = r"^\s*\$end"
pcm_nonels_table = read_table_pattern(string, header_pattern=header, row_pattern=row, footer_pattern=footer)
if not pcm_nonels_table:
print(
"No valid $pcm_nonels inputs found. Note that there should be no '=' \
chracters in $pcm_nonels input lines."
)
return {}

return dict(pcm_nonels_table[0])
132 changes: 129 additions & 3 deletions pymatgen/io/qchem/outputs.py
Original file line number Diff line number Diff line change
Expand Up @@ -38,9 +38,8 @@
have_babel = False


__author__ = "Samuel Blau, Brandon Wood, Shyam Dwaraknath, Evan Spotte-Smith"
__copyright__ = "Copyright 2018, The Materials Project"
__version__ = "0.1"
__author__ = "Brandon Wood, Samuel Blau, Shyam Dwaraknath, Evan Spotte-Smith, Ryan Kingsbury"
rkingsbury marked this conversation as resolved.
Show resolved Hide resolved
__copyright__ = "Copyright 2018-2022, The Materials Project"
__credits__ = "Gabe Gomes"

logger = logging.getLogger(__name__)
Expand Down Expand Up @@ -144,6 +143,10 @@ def __init__(self, filename: str):
self.data["solvent_method"] = "PCM"
if read_pattern(self.text, {"key": r"solvent_method\s*=?\s*smd"}, terminate_on_match=True).get("key") == [[]]:
self.data["solvent_method"] = "SMD"
if read_pattern(self.text, {"key": r"solvent_method\s*=?\s*isosvp"}, terminate_on_match=True).get("key") == [
[]
]:
self.data["solvent_method"] = "ISOSVP"

# Parse information specific to a solvent model
if self.data["solvent_method"] == "PCM":
Expand All @@ -154,6 +157,9 @@ def __init__(self, filename: str):
self.data["solvent_data"]["PCM_dielectric"] = float(temp_dielectric[0][0])
self._read_pcm_information()
elif self.data["solvent_method"] == "SMD":
# TODO - is this a bug? It appears that the code block below if the string
rkingsbury marked this conversation as resolved.
Show resolved Hide resolved
# 'Unrecognized solvent' is NOT found in the output, but it seems like it
# should be the reverse
if read_pattern(self.text, {"key": r"Unrecognized solvent"}, terminate_on_match=True).get("key") == [[]]:
if not self.data.get("completion", []):
self.data["errors"] += ["unrecognized_solvent"]
Expand All @@ -173,6 +179,22 @@ def __init__(self, filename: str):
self.data["warnings"]["questionable_SMD_parsing"] = True
self.data["solvent_data"]["SMD_solvent"] = temp_solvent[0][0]
self._read_smd_information()
elif self.data["solvent_method"] == "ISOSVP":
self.data["solvent_data"] = {"CMIRS_enabled": False}
self._read_isosvp_information()
if read_pattern(
self.text,
{"cmirs": r"DEFESR calculation with single-center isodensity surface"},
terminate_on_match=True,
).get("cmirs") != [[]]:
# this is a CMIRS calc
# note that all other outputs follow the same format as ISOSVP
self._read_cmirs_information()
self.data["solvent_data"]["CMIRS_enabled"] = True

# TODO - parsing to identify the CMIRS solvent?
# this would require parsing the $pcm_nonels section of input and comparing
# parameters to the contents of CMIRS_SETTINGS in pymatgen.io.qchem.sets

# Parse the final energy
temp_final_energy = read_pattern(self.text, {"key": r"Final\senergy\sis\s+([\d\-\.]+)"}).get("key")
Expand Down Expand Up @@ -1315,6 +1337,110 @@ def _read_smd_information(self):
for key in pcm_keys:
self.data["solvent_data"][key] = None

def _read_isosvp_information(self):
"""
Parses information from ISOSVP solvent calculations.

There are 5 energies output, as in the example below

--------------------------------------------------------------------------------
The Final SS(V)PE energies and Properties
--------------------------------------------------------------------------------

Energies
--------------------
The Final Solution-Phase Energy = -40.4850599390
The Solute Internal Energy = -40.4846329759
The Change in Solute Internal Energy = 0.0000121970 ( 0.00765 KCAL/MOL)
The Reaction Field Free Energy = -0.0004269631 ( -0.26792 KCAL/MOL)
The Total Solvation Free Energy = -0.0004147661 ( -0.26027 KCAL/MOL)

In addition, we need to parse the DIELST fortran variable to get the dielectric
constant used.
"""
temp_dict = read_pattern(
self.text,
{
"final_soln_phase_e": r"\s*The Final Solution-Phase Energy\s+=\s+([\d\-\.]+)\s*",
"solute_internal_e": r"\s*The Solute Internal Energy\s+=\s+([\d\-\.]+)\s*",
"total_solvation_free_e": r"\s*The Total Solvation Free Energy\s+=\s+([\d\-\.]+)\s*",
"change_solute_internal_e": r"\s*The Change in Solute Internal Energy\s+=\s+(\s+[\d\-\.]+)\s+\(\s+([\d\-\.]+)\s+KCAL/MOL\)\s*", # pylint: disable=line-too-long
"reaction_field_free_e": r"\s*The Reaction Field Free Energy\s+=\s+(\s+[\d\-\.]+)\s+\(\s+([\d\-\.]+)\s+KCAL/MOL\)\s*", # pylint: disable=line-too-long
"isosvp_dielectric": r"\s*DIELST=\s+(\s+[\d\-\.]+)\s*",
},
)

for key, _v in temp_dict.items():
if temp_dict.get(key) is None:
self.data["solvent_data"][key] = None
elif len(temp_dict.get(key)) == 1:
self.data["solvent_data"][key] = float(temp_dict.get(key)[0][0])
else:
temp_result = np.zeros(len(temp_dict.get(key)))
for ii, entry in enumerate(temp_dict.get(key)):
temp_result[ii] = float(entry[0])
self.data["solvent_data"][key] = temp_result

pcm_keys = [
"g_electrostatic",
"g_cavitation",
"g_dispersion",
"g_repulsion",
"total_contribution_pcm",
"solute_internal_energy",
]
for key in pcm_keys:
self.data["solvent_data"][key] = None

smd_keys = ["smd0", "smd3", "smd4", "smd6", "smd9"]
for key in smd_keys:
self.data["solvent_data"][key] = None

def _read_cmirs_information(self):
"""
Parses information from CMIRS solvent calculations.

In addition to the 5 energies returned by ISOSVP (and read separately in
_read_isosvp_information), there are 4 additional energies reported, as shown
in the example below

--------------------------------------------------------------------------------
The Final SS(V)PE energies and Properties
--------------------------------------------------------------------------------

Energies
--------------------
The Final Solution-Phase Energy = -40.4751881546
The Solute Internal Energy = -40.4748568841
The Change in Solute Internal Energy = 0.0000089729 ( 0.00563 KCAL/MOL)
The Reaction Field Free Energy = -0.0003312705 ( -0.20788 KCAL/MOL)
The Dispersion Energy = 0.6955550107 ( -2.27836 KCAL/MOL)
The Exchange Energy = 0.2652679507 ( 2.15397 KCAL/MOL)
Min. Negative Field Energy = 0.0005235850 ( 0.00000 KCAL/MOL)
Max. Positive Field Energy = 0.0179866718 ( 0.00000 KCAL/MOL)
The Total Solvation Free Energy = -0.0005205275 ( -0.32664 KCAL/MOL)
"""
temp_dict = read_pattern(
self.text,
{
"dispersion_e": r"\s*The Dispersion Energy\s+=\s+(\s+[\d\-\.]+)\s+\(\s+([\d\-\.]+)\s+KCAL/MOL\)\s*",
"exchange_e": r"\s*The Exchange Energy\s+=\s+(\s+[\d\-\.]+)\s+\(\s+([\d\-\.]+)\s+KCAL/MOL\)\s*",
"min_neg_field_e": r"\s*Min. Negative Field Energy\s+=\s+(\s+[\d\-\.]+)\s+\(\s+([\d\-\.]+)\s+KCAL/MOL\)\s*", # pylint: disable=line-too-long
"max_pos_field_e": r"\s*Max. Positive Field Energy\s+=\s+(\s+[\d\-\.]+)\s+\(\s+([\d\-\.]+)\s+KCAL/MOL\)\s*", # pylint: disable=line-too-long
},
)

for key, _v in temp_dict.items():
if temp_dict.get(key) is None:
self.data["solvent_data"][key] = None
elif len(temp_dict.get(key)) == 1:
self.data["solvent_data"][key] = float(temp_dict.get(key)[0][0])
else:
temp_result = np.zeros(len(temp_dict.get(key)))
for ii, entry in enumerate(temp_dict.get(key)):
temp_result[ii] = float(entry[0])
self.data["solvent_data"][key] = temp_result

def _read_nbo_data(self):
"""
Parses NBO output
Expand Down
Loading