censo 2.1.1 (#63)

* prepare reload capability
* reload capability
* tm proc basic setup
* tm proc basic setup
* tm prep almost done
* sm/prog compatibility check
* settings case insensitive
* tm solvation cosmo + dcosmors done, cosmors basics
* optimization additional sort
* cosmors gsolv calculation basics
* tm gsolv sp calls
* change on how to copy mo
* turbomole single-point
* cosmors done
* tm xtb_opt done
* tm nmr done
* property calculations boilerplate class
* cosmors gsolv calculation done
* property boilerplate
* hopefully removed all faulty format strings
* fixed missing tm import
* next try
* next try 2
* fixed cml parser
* parallel call fix
* loglevel setting cml
* proper environment variable setting (no strings allowed)
* fixed missing cefine call
* fixed outdated code?
* fixed usage of 'solvent'
* fixed solvation in tm prep
* tm cefine prep change
* formatting
* formatting again
* log level adjustable via cml
* cefine fix?
* added basis lookup for turbomole
* tm encoding
* cefine proper error handling
* subprocess error reraise + removed tm file encoding
* coord file generation order fix
* turbomole gcp + d4 bug avoided
* turbomole gcp + d4 bug avoided for real now
* r2scan-3c turbomole grid fix
* tm solv prep missing newlines fix
* tm copy mo moved before prep
* tm xtbopt fixed
* new handling for solvent and dfa availability check
* small fix 1
* small fix 2
* small fix 3
* small fix 4
* small fix 5
* small fix 6
* small fix 7
* 'legacy' load balancing for using tm
* part json dump contains part name
* read_output reads new jsons correctly
* read_output error fixed
* small fix
* small fix 2
* improved _set_energy for property_calculator
* tm nmr prep fix
* tm nmr nucsel fix
* tm gcp+d4 bug switching disp/gcp off decision
* typo fix
* nmr grid settings fix
* cosmors redone
* copy_mo/prep order never violated (hopefully)
* write_results for screening fixed when using gsolv
* screening fixes
* better check for explicit gsolv inclusion
* screening refienement writeresults final fix
* small fix
* fixed dummy functionality + not available for tm
* tm nmr fconly
* finalized basic tm implementation
* small fix
* setting up tests
* refactoring and restructuring part architecture 1
* some dfa modifications + part_no references fixed
* some changes to code structure for less clutter
* small fix
* small fix 2
* small fix 3
* small fix 4
* small fix 5
* small fix 6
* small fix 7
* some fixes w.r.t. part execution
* refactoring and restructuring part architecture 2
* small fixes
* small fix
* small fix 2
* Revert "small fix 2" This reverts commit dd8bcae.
* Revert "small fix" This reverts commit 19297b6.
* Revert "small fixes" This reverts commit 1df8332.
* Revert "refactoring and restructuring part architecture 2" This reverts commit a059dc0.
* Revert "some fixes w.r.t. part execution" This reverts commit 10305b8.
* Revert "small fix 6" This reverts commit fa4cb55.
* Revert "small fix 5" This reverts commit 1b7394f.
* Revert "small fix 4" This reverts commit b9a2610.
* Revert "small fix 3" This reverts commit 1b40624.
* Revert "small fix 2" This reverts commit 24c0870.
* Revert "small fix" This reverts commit 22a77b6.
* Revert "some changes to code structure for less clutter" This reverts commit 916fd4f.
* big part/results refactor try 1
* big refactor stragglers
* big refactor stragglers 2
* big refactor stragglers 3
* setup conformers fix
* calc boltzmannweights fix
* print comparison fix
* ensemble cutting fix
* some small fixes
* optimization key error fix
* orca proc fix
* small fix
* small fix 2
* optimization key error fix
* print_comparison fix
* moved _write_results call and fixed results appending to ensemble
* small fix
* moving around xtb energies etc
* small fix
* small fix 2
* small fix 3
* output writing/printing refactor
* small fixes
* small fix
* small print_comparison fix
* small fix
* small fix 2
* small fix 3
* small fix 4
* small fix 5
* small fix 6
* properties calculator fix
* small fix
* some changes to read_input
* ensembledata read_input fix
grimme-lab · Nov 5, 2024 · 6459005
1 parent a77f2d5
commit 6459005
Showing 46 changed files with 1,774 additions and 1,998 deletions.
diff --git a/README.md b/README.md
@@ -1,5 +1,5 @@
 # NEW: CENSO 2.0
-This is the updated version of the former CENSO 1.3 program. New features include the possibility to use ``CENSO`` as a package from within Python, template files, dummy functionals, and more! For more information about the use and the capabilities of CENSO 2.0 visit the documentation [here](https://xtb-docs.readthedocs.io/en/latest/CENSO_docs/censo.html).
+This is the updated version of the former CENSO 1.3 program. New features include the possibility to use CENSO as a package from within Python, template files, dummy functionals, json outputs, and more! For more information about the use and the capabilities of CENSO 2.0 visit the documentation [here](https://xtb-docs.readthedocs.io/en/latest/CENSO_docs/censo.html).
 
 # Installation
 Can be installed using `pip` by running
@@ -21,36 +21,63 @@ If you want to run it via helper script after adding it to your `$PATH`:
 
 Please note that the ``--maxcores`` option is required for every run.
 
-CENSO can also be used as a package. A basic setup for a CENSO run in a Python file would look like this:
+CENSO can also be used as a package. A basic setup for a CENSO run in a Python file could look like this:
 ```python
 from censo.ensembledata import EnsembleData
 from censo.configuration import configure
 from censo.ensembleopt import Prescreening, Screening, Optimization
 from censo.properties import NMR
+from censo.params import Config
 
-workdir = "/absolute/path/to/your/workdir" # CENSO will put all files in this directory
+# CENSO will put all files in the current working directory (os.getcwd())
 input_path = "rel/path/to/your/inputfile" # path relative to the working directory
-ensemble = EnsembleData(workdir)
-ensemble.read_input(input_path, charge=0, unpaired=0)
-ncores = os.cpu_count() # Could be set to any other positive integer
+ensemble = EnsembleData(input_file=input_path) 
+# the above can be used if your molecule is neutral and closed shell, otherwise
+# it is necessary to proceed with e.g.
+# ensemble = EnsembleData()
+# ensemble.read_input(input_path, charge=-1, unpaired=1)
 
 # If the user wants to use a specific rcfile:
-configure(rcpath="/abs/path/to/rcfile")
-
-# Setup all the parts that the user wants to run
-parts = [
-    part(ensemble) for part in [Prescreening, Screening, Optimization, NMR]
-]
-
-# Run all the parts and collect their runtimes
-part_timings = []
-for part in parts:
-    part_timings.append(part.run(ncores))
-
-# If no Exceptions were raised, all the output can now be found in 'workdir'
-# Data is given in a formatted plain text format (*.out) and and json format
-# The files used in the computations for each conformer can be found in the folders 
-# generated by each part, respectively (e.g. '0_PRESCREENING/CONF2/...')
+configure("/abs/path/to/rcfile")
+
+# Get the number of available cpu cores on this machine
+# This is also the default value that CENSO uses
+# This number can also be set to any other integer value; it is automatically checked for validity
+Config.NCORES = os.cpu_count()
+
+# Another possibly important setting is OMP, which is used if you disable the automatic
+# load balancing in the settings
+Config.OMP = 4
+
+# The user can also choose to change specific settings of the parts
+# Please take note of the following:
+# - the settings of certain parts, e.g. Prescreening are changed using set_setting(name, value)
+# - general settings are changed by using set_general_setting(name, value) (it does not matter which part you call it from)
+# - the values you want to set must comply with limits and the type of the setting
+Prescreening.set_setting("threshold", 5.0)
+Prescreening.set_general_setting("solvent", "dmso")
+
+# It is also possible to use a dict to set multiple values in one step
+settings = {
+    "threshold": 3.5,
+    "func": "pbeh-3c",
+    "implicit": True,
+}
+Screening.set_settings(settings, complete=False)  
+# the complete kwarg tells the method whether to set the undefined settings using defaults or leave them on their current value
+
+
+# Setup and run all the parts that the user wants to run
+# The parts are run in order here, but it is also possible to use a custom order or to run some parts multiple times
+# Running a part will return an instance of the respective type
+# References to the resulting part instances will be appended to a list in the EnsembleData object (ensemble.results)
+# Note though, that currently this will lead to results being overwritten in your working directory
+# (you could circumvent this by moving/renaming the folders)
+results, timings = zip(*[part.run(ensemble) for part in [Prescreening, Screening, Optimization, NMR]])
+
+# You access the results using the ensemble object
+# You can also find all the results in the <part>.json output files
+print(ensemble.results[0].data["results"]["CONF5"]["sp"]["energy"])
 ```
 
 # License
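
Note on the new README example: `part.run(ensemble)` is now called on the class and returns a pair (part instance, runtime), so `zip(*...)` is what splits the list of pairs into one tuple of results and one tuple of timings. The sketch below illustrates only that idiom. `DummyPart`, `PartA`, `PartB`, and the plain list standing in for `ensemble.results` are hypothetical stand-ins, not CENSO classes.

```python
import time


class DummyPart:
    """Hypothetical stand-in for a CENSO part (not the real API)."""

    name = "dummy"

    @classmethod
    def run(cls, results_list):
        # Mimic the documented behaviour: create an instance, register it
        # with the ensemble-like container, and return (instance, runtime).
        start = time.perf_counter()
        instance = cls()
        results_list.append(instance)
        return instance, time.perf_counter() - start


class PartA(DummyPart):
    name = "a"


class PartB(DummyPart):
    name = "b"


ensemble_results = []  # plays the role of ensemble.results in this sketch

# Each run() call yields one (instance, runtime) pair; zip(*pairs) transposes
# the list of pairs into a tuple of instances and a tuple of runtimes.
results, timings = zip(*[part.run(ensemble_results) for part in [PartA, PartB]])

print([r.name for r in results], f"{sum(timings):.6f} s total")
```

Because `zip(*...)` over an empty list yields nothing to unpack, this pattern assumes at least one part is run.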

diff --git a/pyproject.toml b/pyproject.toml
@@ -12,7 +12,7 @@ homepage = "https://github.com/grimme-lab/CENSO"
 documentation = "https://xtb-docs.readthedocs.io/en/latest/CENSO_docs/censo.html"
 
 [project.optional-dependencies]
-dev = []
+dev = ["black", "pytest", "tox"]
 scripts = [
     "numpy",
     "matplotlib",
@@ -24,3 +24,7 @@ readme = {file = "README.md"}
 
 [tool.setuptools_scm]
 version_file = "src/censo/__version__.py"
+
+[tool.pytest.ini_options]
+testpaths = ["test"]
+pythonpath = ["src"]
diff --git a/src/censo/__init__.py b/src/censo/__init__.py
@@ -6,6 +6,15 @@
 configure()
 
 from .cli import interface, cml_parser
-from . import (configuration, ensembledata, datastructure, orca_processor,
-               parallel, part, procfact, qm_processor, utilities, ensembleopt,
-               properties)
+from . import (
+    configuration,
+    ensembledata,
+    datastructure,
+    orca_processor,
+    parallel,
+    part,
+    qm_processor,
+    utilities,
+    ensembleopt,
+    properties,
+)
diff --git a/src/censo/assets/censo_dfa_settings.json b/src/censo/assets/censo_dfa_settings.json
@@ -159,13 +159,13 @@
     "b97-d3": {
       "tm": "b97-d",
       "orca": "b97-d3",
-      "disp": "d3bj",
+      "disp": "included",
       "type": "gga"
     },
-    "b97-d3(0)": {
-      "tm": "b97-d",
-      "orca": null,
-      "disp": "d3(0)",
+    "b97-d4": {
+      "tm": null,
+      "orca": "b97",
+      "disp": "d4",
       "type": "gga"
     },
     "kt1-novdw": {
@@ -294,6 +294,24 @@
       "disp": "included",
       "type": "rs_hybrid"
     },
+    "chyf-b95-novdw": {
+      "tm": "chyf-b95",
+      "orca": null,
+      "disp": "novdw",
+      "type": "local_hybrid"
+    },
+    "chyf-b95-d3": {
+      "tm": "chyf-b95",
+      "orca": null,
+      "disp": "d3bj",
+      "type": "local_hybrid"
+    },
+    "chyf-b95-d4": {
+      "tm": "chyf-b95",
+      "orca": null,
+      "disp": "d4",
+      "type": "local_hybrid"
+    },
     "dsd-blyp-d3": {
       "tm": null,
       "orca": "ri-dsd-blyp",
@@ -308,4 +326,3 @@
     }
   }
 }
-
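
Note on the entry layout: each functional maps a CENSO-side name to per-program keywords (`tm`, `orca`) plus a dispersion tag and a type. A minimal sketch of how such entries could be queried follows, assuming that `null` marks a functional as unavailable for that program. The flat `DFAS` dict and the `program_name` helper are hypothetical and only mirror three entries from this diff; they are not the `DfaHelper` implementation.

```python
from typing import Optional

# Three entries copied from the diff above (JSON null becomes Python None).
DFAS = {
    "b97-d3": {"tm": "b97-d", "orca": "b97-d3", "disp": "included", "type": "gga"},
    "b97-d4": {"tm": None, "orca": "b97", "disp": "d4", "type": "gga"},
    "chyf-b95-d4": {
        "tm": "chyf-b95",
        "orca": None,
        "disp": "d4",
        "type": "local_hybrid",
    },
}


def program_name(func: str, prog: str) -> Optional[str]:
    """Return the keyword used by `prog` for `func`, or None if unavailable.

    Hypothetical helper: raises KeyError for unknown functionals.
    """
    entry = DFAS[func.lower()]
    return entry.get(prog)


for func in DFAS:
    print(func, {prog: program_name(func, prog) for prog in ("tm", "orca")})
```

Under that assumption, `b97-d4` can only be requested with ORCA, while the new `chyf-b95-*` local hybrids are Turbomole-only.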
diff --git a/src/censo/cli/cml_parser.py b/src/censo/cli/cml_parser.py
@@ -3,6 +3,7 @@
 cml parsing
 """
 
+from ..params import START_DESCR
 import argparse
 
 
@@ -31,7 +32,7 @@ def check_soft_requirements(args: argparse.Namespace) -> bool:
         return True
 
 
-def parse(startup_description, argv=None) -> argparse.Namespace:
+def parse(argv=None) -> argparse.Namespace:
     """
     Process commandline arguments
 
@@ -40,7 +41,7 @@ def parse(startup_description, argv=None) -> argparse.Namespace:
     """
 
     parser = argparse.ArgumentParser(
-        description=startup_description,
+        description=START_DESCR,
         prog="censo",
     )
 
@@ -118,6 +119,14 @@ def parse(startup_description, argv=None) -> argparse.Namespace:
         help="Number of cores that should be used for CENSO on the machine. If this is not provided CENSO will use "
         "the maximum number available. For a default run this is REQUIRED.",
     )
+    groups[0].add_argument(
+        "-O",
+        "--omp",
+        dest="omp",
+        type=int,
+        help="Number of OpenMP threads, e.g. 4. Effectively translates to the number of cores used per calculation "
+        "if load balancing is disabled.",
+    )
     groups[0].add_argument(
         "--loglevel",
         dest="loglevel",
@@ -200,14 +209,6 @@ def parse(startup_description, argv=None) -> argparse.Namespace:
         const=True,
         help="Run calculation in gas-phase, overriding all solvation settings.",
     )
-    groups[1].add_argument(
-        "-O",
-        "--omp",
-        dest="omp",
-        type=int,
-        help="Number of OpenMP threads, e.g. 4. Effectively translates to the number of cores used per calculation "
-        "if load balancing is disabled.",
-    )
     groups[1].add_argument(
         "--imagthr",
         dest="imagthr",
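
Note on the moved option: `-O/--omp` now sits in the first argument group next to the core-count option instead of the second one. Argument groups in `argparse` only affect how `--help` is laid out, so parsing behaviour should be unchanged. The standalone sketch below shows this with a reduced parser; the group title and the shortened help texts are placeholders, not the ones used in `cml_parser.py`.

```python
import argparse

parser = argparse.ArgumentParser(prog="censo")

# Placeholder group title; groups only change the layout of the help output.
general = parser.add_argument_group("general options")
general.add_argument(
    "--maxcores",
    dest="maxcores",
    type=int,
    help="Number of cores that should be used on the machine.",
)
general.add_argument(
    "-O",
    "--omp",
    dest="omp",
    type=int,
    help="Number of OpenMP threads, e.g. 4.",
)

args = parser.parse_args(["--maxcores", "8", "-O", "4"])
print(args.maxcores, args.omp)

# Options that are not given default to None, which is why the caller can
# guard with `if args.omp:` before overriding a default.
defaults = parser.parse_args([])
print(defaults.maxcores, defaults.omp)
```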

diff --git a/src/censo/cli/interface.py b/src/censo/cli/interface.py
@@ -11,7 +11,7 @@
 from ..ensembleopt import Prescreening, Screening, Optimization, Refinement
 from ..part import CensoPart
 from ..properties import NMR, UVVis
-from ..params import START_DESCR, __version__
+from ..params import __version__, Config
 from ..utilities import print
 from ..logging import setup_logger, set_loglevel
 
@@ -23,7 +23,7 @@ def entry_point(argv: list[str] | None = None) -> int:
     Console entry point to execute CENSO from the command line.
     """
     try:
-        args = parse(START_DESCR, argv)
+        args = parse(argv=argv)
     except ArgumentError as e:
         print(e.message)
         return 1
@@ -43,22 +43,17 @@ def entry_point(argv: list[str] | None = None) -> int:
         return 0
 
     # Print general settings once
-    CensoPart(ensemble).print_info()
+    CensoPart(ensemble, print_info=True)
 
     run = filter(
         lambda x: x.get_settings()["run"],
         [Prescreening, Screening, Optimization, Refinement, NMR, UVVis],
     )
 
-    ncores = 4
-    if args.maxcores:
-        ncores = args.maxcores
-
     time = 0.0
     for part in run:
-        p = part(ensemble)
-        runtime = p.run(ncores)
-        print(f"Ran {p._name} in {runtime:.2f} seconds!")
+        res, runtime = part.run(ensemble)
+        print(f"Ran {res.name} in {runtime:.2f} seconds!")
         time += runtime
 
     time = timedelta(seconds=int(time))
@@ -96,14 +91,30 @@ def startup(args) -> EnsembleData | None:
     elif args.inprcpath is not None:
         configure(args.inprcpath)
 
+    if args.loglevel:
+        set_loglevel(args.loglevel)
+
     # Override settings with command line arguments
     override_rc(args)
 
     # initialize ensemble, constructor get runinfo from args
-    ensemble = EnsembleData(cwd, args=args)
+    ensemble = EnsembleData()
 
     # read input and setup conformers
-    ensemble.read_input(args.inp)
+    ensemble.read_input(
+        args.inp, charge=args.charge, unpaired=args.unpaired, nconf=args.nconf
+    )
+
+    # if data should be reloaded, do it here
+    if args.reload:
+        for filename in args.reload:
+            ensemble.read_output(os.path.join(cwd, filename))
+
+    if args.maxcores:
+        Config.NCORES = args.maxcores
+
+    if args.omp:
+        Config.OMP = args.omp
 
     # if data should be reloaded, do it here
     if args.reload:

diff --git a/src/censo/configuration.py b/src/censo/configuration.py
@@ -3,7 +3,7 @@
 import configparser
 from argparse import Namespace
 
-from .params import CENSORCNAME, ASSETS_PATH, USER_ASSETS_PATH
+from .params import Config
 from .qm_processor import QmProc
 from .utilities import DfaHelper, SolventHelper, print
 
@@ -33,10 +33,12 @@ def configure(rcpath: str = None, create_new: bool = False):
         censorc_path = rcpath
 
     # Set up the DFAHelper
-    DfaHelper.set_dfa_dict(os.path.join(ASSETS_PATH, "censo_dfa_settings.json"))
+    DfaHelper.set_dfa_dict(os.path.join(Config.ASSETS_PATH, "censo_dfa_settings.json"))
 
     # Set up the SolventHelper
-    SolventHelper.set_solvent_dict(os.path.join(ASSETS_PATH, "censo_solvents_db.json"))
+    SolventHelper.set_solvent_dict(
+        os.path.join(Config.ASSETS_PATH, "censo_solvents_db.json")
+    )
 
     # map the part names to their respective classes
     # NOTE: the DFAHelper and the databases should be setup before the parts are imported,
@@ -94,8 +96,8 @@ def configure(rcpath: str = None, create_new: bool = False):
         QmProc._paths.update(paths)
 
     # create user assets folder if it does not exist
-    if not os.path.isdir(USER_ASSETS_PATH):
-        os.mkdir(USER_ASSETS_PATH)
+    if not os.path.isdir(Config.USER_ASSETS_PATH):
+        os.mkdir(Config.USER_ASSETS_PATH)
 
 
 def read_rcfile(path: str, silent: bool = True) -> dict[str, dict[str, any]]:
@@ -138,7 +140,7 @@ def write_rcfile(path: str) -> None:
     if os.path.isfile(path):
         print(
             f"An existing configuration file has been found at {path}.\n",
-            f"Renaming existing file to {CENSORCNAME}_OLD.\n",
+            f"Renaming existing file to {Config.CENSORCNAME}_OLD.\n",
         )
         # Read program paths from the existing configuration file
         print("Reading program paths from existing configuration file ...")
@@ -151,7 +153,6 @@ def write_rcfile(path: str) -> None:
         parser = configparser.ConfigParser()
 
         # collect all default settings from parts and feed them into the parser
-        global parts
         from .part import CensoPart
 
         parts["general"] = CensoPart
@@ -181,9 +182,9 @@ def write_rcfile(path: str) -> None:
         "Right now the settings are at their default values.\n"
     )
 
-    if CENSORCNAME not in path:
+    if Config.CENSORCNAME not in path:
         print(
-            f"Additionally make sure that the file name is '{CENSORCNAME}'.\n"
+            f"Additionally make sure that the file name is '{Config.CENSORCNAME}'.\n"
             f"Currently it is '{os.path.split(path)[-1]}'.\n"
         )
 
@@ -250,8 +251,8 @@ def find_rcfile() -> str | None:
 
     rcpath = None
     # check for .censorc in $home
-    if os.path.isfile(os.path.join(os.path.expanduser("~"), CENSORCNAME)):
-        rcpath = os.path.join(os.path.expanduser("~"), CENSORCNAME)
+    if os.path.isfile(os.path.join(os.path.expanduser("~"), Config.CENSORCNAME)):
+        rcpath = os.path.join(os.path.expanduser("~"), Config.CENSORCNAME)
 
     return rcpath
 
@@ -267,8 +268,6 @@ def override_rc(args: Namespace) -> None:
         None
     """
     # Override general and part specific settings
-    # TODO - might be made nicer by using the argument groups?
-    global parts
     from .part import CensoPart
 
     for part in list(parts.values()) + [CensoPart]: