
Implementation for ICLR2024 Oral paper "Unified Generative Modeling of 3D Molecules with Bayesian Flow Networks"


AlgoMole/GeoBFN

GeoBFN

Official implementation of the ICLR2024 Oral paper "Unified Generative Modeling of 3D Molecules with Bayesian Flow Networks".

Update

For an application of GeoBFN to Structure-Based Drug Design (SBDD), please refer to our recent work MolCRAFT: Structure-Based Drug Design in Continuous Parameter Space (ICML2024), with code available at https://github.com/AlgoMole/MolCRAFT.

Prerequisite

You will need a host machine with a GPU, and Docker with nvidia-container-runtime enabled.

Tip

  • This repo provides an easy-to-use script for installing Docker and nvidia-container-runtime: in ./GeoBFN/docker, run sudo ./setup_docker_for_host.sh to set up your host machine.
  • If you don't have them installed, you can also refer to the install guide.
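
Before running the setup script, it can help to see which prerequisites are already present on the host. The following is a minimal sketch and is not part of the GeoBFN repo: the helper names have_cmd and check_prereqs are hypothetical, and treating nvidia-smi as a sign of a working NVIDIA driver is an assumption. It only checks that the commands are on PATH; it does not verify that nvidia-container-runtime is registered with Docker.

```shell
# Hypothetical preflight helpers; not part of the GeoBFN repo.

# Return 0 if the named command is found on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

# Print one status line per prerequisite command.
# Assumption: nvidia-smi being present indicates an installed NVIDIA driver.
check_prereqs() {
    for tool in docker nvidia-smi; do
        if have_cmd "$tool"; then
            echo "ok: $tool"
        else
            echo "missing: $tool"
        fi
    done
}

check_prereqs
```

If either line reports "missing", the setup script or the install guide mentioned above is the next step.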

Quick start

Environment setup

Clone the repo with git clone:

git clone https://github.com/AlgoMole/GeoBFN.git

Set up the environment with Docker:

cd ./GeoBFN/docker

make # a make is all you need

Note

  • make automatically builds the Docker image and runs the container, with your host home directory mounted at ${HOME}/home inside the container. This route is highly recommended.

  • If you need to set up the environment manually, please refer to the files docker/Dockerfile, docker/asset/requirements.txt and docker/asset/apt_packages.txt.

Train a model on qm9 dataset

Inside the container, find the path to your repo. Inside GeoBFN/, run

make -f train.mk

Note

  • This command automatically attempts to download the dataset if it does not exist, then runs the training script python geobfn_train.py --config_file configs/bfn4molgen.yaml --epochs 3000 on the default GPU. To use a different GPU, run export CUDA_VISIBLE_DEVICES=<gpu_id> before the make -f train.mk command.
  • Comment out or delete the --no_wandb option in train.mk if you want to use wandb to log the training process. You will probably be prompted to enter your wandb API key.
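
As a sketch of the GPU selection described above, the export and the make call can be wrapped so that the variable does not stay set in your shell afterwards. The function name train_on_gpu is hypothetical and not part of the repo; the only commands it relies on are the export CUDA_VISIBLE_DEVICES=<gpu_id> and make -f train.mk steps from the note.

```shell
# Hypothetical wrapper, not part of the repo: run training on one chosen GPU.
# The subshell body "( ... )" keeps the export from leaking into your session.
train_on_gpu() (
    export CUDA_VISIBLE_DEVICES="$1"   # GPU index, e.g. 0 or 1
    make -f train.mk                   # must be run from inside GeoBFN/
)

# Usage (inside the container, from GeoBFN/):
#   train_on_gpu 1
```

Running the two commands by hand has the same effect; the wrapper only scopes the environment variable to a single training run.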

Caution

  • You may encounter a connection error if your server is in China. In that case, manually download the dataset from baidu netdisk and put it in the ./GeoBFN directory with scp <path/to/local/qm9.tar.gz> <username>@<remotehost>:<path/to/remote/GeoBFN/>. Run make -f train.mk again once the dataset is in place.

  • Alternatively, you can use a proxy to allow the script to download the dataset automatically.
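
One possible way to apply the proxy option is sketched below. This is an illustration under assumptions, not a documented feature of the repo: the helper name with_proxy is hypothetical, the proxy address is a placeholder, and it assumes the download step honors the conventional http_proxy and https_proxy environment variables.

```shell
# Hypothetical helper, not part of the repo: run one command with proxy
# variables set. Assumes the download step honors http_proxy/https_proxy.
with_proxy() (
    proxy_url="$1"
    shift
    export http_proxy="$proxy_url" https_proxy="$proxy_url"
    "$@"    # run the remaining arguments as the command
)

# Usage (the address is a placeholder for your own proxy):
#   with_proxy http://<proxy_host>:<port> make -f train.mk
```

If the download still fails through the proxy, the manual scp route above remains the fallback.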

Tip

  • It is best to run the training command inside a tmux session, as training takes a long time to finish.

  • Exiting the container won't stop it. Run make from the host in GeoBFN/docker to log in to the running container again. If you really need to kill the container, run make kill from GeoBFN/docker.
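
The tmux tip can be sketched as follows. The helper name start_training_session and the default session name geobfn are hypothetical, and the sketch assumes tmux is installed where you run it. It starts a detached session and types the training command into it, so the session's shell and its output remain available after training ends.

```shell
# Hypothetical helper, not part of the repo. Assumes tmux is installed.
# Starts a detached session, then sends the training command to its shell.
start_training_session() {
    session="${1:-geobfn}"     # session name; "geobfn" is an arbitrary default
    tmux new-session -d -s "$session" &&
        tmux send-keys -t "$session" 'make -f train.mk' Enter
}

# Usage (inside the container, from GeoBFN/):
#   start_training_session        # start training in the background
#   tmux attach -t geobfn         # watch progress; detach with Ctrl-b then d
```

Because the session is detached, closing your terminal or SSH connection does not interrupt training.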

Citations

If you find the idea or code useful for your research, please consider citing

@article{song2024unified,
  title={Unified Generative Modeling of 3D Molecules via Bayesian Flow Networks},
  author={Song, Yuxuan and Gong, Jingjing and Qu, Yanru and Zhou, Hao and Zheng, Mingyue and Liu, Jingjing and Ma, Wei-Ying},
  journal={arXiv preprint arXiv:2403.15441},
  year={2024}}
