Skip to content

aspuru-guzik-group/gemini

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

3 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

Gemini: Dynamic Bias Correction for Autonomous Experimentation and Molecular Simulation

Build Status codecov GitHub license

Gemini is an open-source Python package which provides scalable multi-fidelity machine learning targeting the design and discovery of functional molecules and advanced materials. (https://arxiv.org/abs/2103.03391v1)

Installation

Install Gemini from source,

git clone https://github.com/rileyhickman/gemini.git
cd matter-gemini
pip install -e .

GPU use is optional. We recommend using the following

tensorflow-gpu            2.4.1
CUDA Version: 11.1
cuda-toolkit-11-1         11.1.1-1
Latest cuDNN

Usage

Supervised learning tasks with multi-fidelity data

Gemini can be easily trained given 2D (# samples, # dimensions) NumPy arrays containing features (x) and targets (y) for exp and cheap datasets. Predictions using Gemini are furnished with frequentist uncertainty estimates.

from gemini import Gemini

gemini = Gemini()

gemini.train(x_exp, y_exp,
	 x_cheap, y_cheap)

pred_mu, pred_std = gemini.predict(x_exp_test)

Scalable multi-fidelity Bayesian optimization

Gemini's predictions of expensive-to-evaluate objective functions can be used to reduce the number of expensive black-box evaluations necessary to achieve a desired target value.

The deep Bayesian optimizer Gryffin currently supports Gemini as a built-in predictive model. After installing Gemini and Gryffin,

from gryffin import Gryffin

# instantiate Gryffin
gryffin = Gryffin('config_file.json')

# optimization loop
while num_eval < budget:

    samples = gryffin.recommend(observations,
				proxy_observations)

The Gryffin config file must include a section specifying the predictive model, i.e.

...
"predictive_model": {
		"model_kind": "gemini"
},
...

Alternatively, you can train Gemini in an external manner, this gives the user greater flexibility in their expreiment. Gryffin allows for the optional passing of a callable object to its recommend method.

from gryffin import Gryffin
from gemini import GeminiOpt as Gemini

# instantiate Gryffin
gryffin = Gryffin('config_file.json')

# instantiate Gemini
gemini = Gemini()

# optimization loop
while num_eval < budget:

    if len(observations) >= 2 and len(proxy_observations) >= 2:

        # construct training set with current observations
        training_set = gryffin.construct_training_set(observations, proxy_observations)

        # train Gemini
        gemini.train(training_set['train_features'], training_set['train_targets'],
                     training_set['proxy_train_features'], training_set['proxy_train_targets'],
                     num_folds=3)

    # pass callable when asking Gryffin for new samples
    samples = gryffin.recommend(observations,
                                predictive_model=gemini)

In this external trianing case, you need only provide a Gryffin config file (i.e. no predictive model entries)

Applications of Gemini (so far...)

  • Inverse design of hybrid organic inorganic perovskites
  • Inverse design of multi-component metal-oxide catalysts for the oxygen evolution reaction
  • Inverse design of non-fullerene acceptor molecules for light harvesting applications

Datasets

We provide methods for facile multi-fidelity data preprocessing/testing for 4 datasets reported in the literature.

  • dataset_perovskites (10.1038/sdata.2017.57)
  • dataset_freesolv (10.1007/s10822-014-9747-x)
  • dataset_photobleaching (10.1002/adma.201907801)
  • dataset_cat_oer_1_4 (10.1039/C9SC05999G)

Contributing

Academic collaborations and extensions/improvements to the code are encouraged. Please reach out to Riley via email if you have questions/concerns.

Developers

Citation

Gemini is an open-source research software. If you use Gemini in a scientific report, please cite the following article

@misc{gemini,
      title={Gemini: Dynamic Bias Correction for Autonomous Experimentation},
      author={Riley J. Hickman and Florian Häse and Loïc M. Roch and Alán Aspuru-Guzik},
      year={2021},
      eprint={2103.03391},
      archivePrefix={arXiv},
      primaryClass={stat.ML}
}

License

MIT

About

scalable multi-fidelity machine learning

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published