- 02/09/2024 - The preprint of our paper "Triplet Interaction Improves Graph Transformers: Accurate Molecular Graph Learning with Triplet Graph Transformers" is now available on arXiv. We achieved SOTA results on PCQM4Mv2, OC20 IS2RE, QM9, MOLPCBA, and LIT-PCBA. We will soon include our new results and methods, along with model weights, at https://github.com/shamim-hussain/tgt.
- 11/23/2023 - We achieved SOTA results (again) on the PCQM4M-V2 dataset by incorporating triangular attention and 3D molecular structure. You can find the new implementation at https://github.com/shamim-hussain/egt_triangular and a technical report (full paper coming soon!).
- 06/21/2022 - The trained checkpoints for the PCQM4M-V2 dataset have been released. They are available at https://zenodo.org/record/6680242. For additional information, see the "Download Trained Model Checkpoints" section below.
- 06/05/2022 - The accepted preprint of our KDD '22 paper is now available on arXiv. It includes discussions of dynamic centrality scalers, random masking, attention dropout, and other details about the latest experiments and results. Note that the title has been changed to "Global Self-Attention as a Replacement for Graph Convolution".
- 05/18/2022 - Our paper "Global Self-Attention as a Replacement for Graph Convolution" has been accepted at KDD'22. The preprint at arXiv will be updated soon with the latest version of the paper.
This is the official PyTorch implementation of the Edge-augmented Graph Transformer (EGT), as described in https://arxiv.org/abs/2108.03348, which augments the Transformer architecture with residual edge channels. The resulting architecture can directly process graph-structured data. For a TensorFlow implementation, see https://github.com/shamim-hussain/egt.
This implementation focuses on the OGB-Mol and OGB-LSC datasets. (The OGB-Mol experiments utilize transfer learning from the PCQM4M-V2 dataset.)
Dataset | #layers | #params | Metric | Valid | Test |
---|---|---|---|---|---|
PCQM4M | 18 | 47.4M | MAE | 0.1225 | -- |
PCQM4M-V2 | 18 | 47.4M | MAE | 0.0883 | -- |
PCQM4M-V2 | 24 | 89.3M | MAE | 0.0857 | 0.0862 |
OGBG-MolPCBA | 30 | 110.8M | Avg. Precision | 0.3021 ± 0.0053 | 0.2961 ± 0.0024 |
OGBG-MolHIV | 30 | 110.8M | ROC-AUC | 0.8060 ± 0.0065 | 0.8051 ± 0.0030 |
The trained model checkpoints on the PCQM4M-V2 dataset are available at https://zenodo.org/record/6680242. Each model is downloadable as an individual zip file. The extracted folders can be placed under the models/pcqm4mv2 directory. See the config_input.yaml file contained within each folder for the training configuration.
We found that the results can be further improved by freezing the node channel layers and training the edge channel layers for a few additional epochs. The corresponding tuned models are given the suffix -T and achieve better results than their untuned counterparts. However, the effect of this tuning on transfer learning has not yet been studied, which is why we include checkpoints for both tuned and untuned models. An example of evaluating a downloaded checkpoint is given after the table below.
Model | #layers | #params | Valid MAE | Test MAE | Comment |
---|---|---|---|---|---|
EGT-48M-SIMPLE | 18 | 47.2M | 0.0872 | -- | EGT-Simple (lightweight variant of EGT) |
EGT-48M-SIMPLE-T | 18 | 47.2M | 0.0860 | -- | Tuned version of above |
EGT-90M | 24 | 89.3M | 0.0869 | 0.0872 | Submitted to the leaderboard |
EGT-90M-T | 24 | 89.3M | 0.0857 | 0.0862 | Submitted tuned version of above |
EGT-110M | 30 | 110.8M | 0.0870 | -- | Used for transfer learning |
EGT-110M-T | 30 | 110.8M | 0.0859 | -- | Tuned version of above |
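As an illustration (the directory name egt_90m below is hypothetical; use the name of the folder you actually extracted), a checkpoint placed under models/pcqm4mv2 might be evaluated on the validation set by pointing the do_evaluations.py script (described below) at its config_input.yaml:
python do_evaluations.py models/pcqm4mv2/egt_90m/config_input.yaml 'evaluate_on: ["val"]'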
python >= 3.7
pytorch >= 1.6.0
numpy >= 1.18.4
numba >= 0.50.1
ogb >= 1.3.2
rdkit >= 2019.03.1
yaml >= 5.3.1
You can specify the training/prediction/evaluation configurations by creating a YAML config file and/or by passing a series of YAML-readable arguments. (Any additional config passed as an argument will override the config specified in the file.)
- To run training:
python run_training.py [config_file.yaml] ['config1: value1'] ['config2: value2'] ...
- To make predictions:
python make_predictions.py [config_file.yaml] ['config1: value1'] ['config2: value2'] ...
- To perform evaluations:
python do_evaluations.py [config_file.yaml] ['config1: value1'] ['config2: value2'] ...
Config files for the results can be found in the configs directory. Examples:
python run_training.py configs/pcqm4m/egt_47m.yaml
python run_training.py 'scheme: pcqm4m' 'model_height: 6'
python make_predictions.py configs/pcqm4m/egt_47m.yaml 'evaluate_on: ["val"]'
Once training is started, a model folder will be created in the models directory, under the specified dataset name. This folder will contain a copy of the input config file, for the convenience of resuming training/evaluation. It will also contain a config.yaml file listing all configs used for training, including unspecified default values. Training is checkpointed once per epoch. In case of any interruption, you can resume training by running run_training.py with this config.yaml file again.
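For example, assuming a hypothetical model directory models/pcqm4mv2/egt_47m was created by an interrupted run, training can be resumed with:
python run_training.py models/pcqm4mv2/egt_47m/config.yaml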
There are many different configurations. The only required configuration is scheme, which specifies the training scheme. If the other configurations are not specified, default values are assumed. Here are some of the commonly used configurations, followed by an example of overriding several of them on the command line:
- scheme: The training scheme, one of pcqm4m/pcqm4mv2/molpcba/molhiv.
- dataset_path: Where the downloaded OGB datasets will be saved.
- model_name: Serves as an identifier for the model; also specifies the default path of the model directory, weight files, etc.
- save_path: The training process will create a model directory containing the logs, checkpoints, configs, model summary, and predictions/evaluations. By default it creates a folder at models/<dataset_name>, but this can be changed via this config.
- cache_dir: During the first training/evaluation run, the data will be cached. The default path is cache_data/<dataset_name>, but it can be changed via this config.
- distributed: In a multi-GPU setting, set this to True for distributed training. Note that the batch size should also be adjusted accordingly.
- batch_size: Batch size. In the case of distributed training, it is the local batch size, so the total batch size = batch_size x number of available GPUs.
- num_epochs: Maximum number of epochs.
- max_lr: Maximum learning rate.
- min_lr: Minimum learning rate.
- lr_warmup_steps: Number of initial linear learning rate warmup steps.
- lr_total_steps: Total number of gradient updates to be performed, including linear warmup and cosine decay.
- model_height: The number of layers L.
- node_width: The dimensionality of the node channels d_h.
- edge_width: The dimensionality of the edge channels d_e.
- num_heads: The number of attention heads. Default is 8.
- node_ffn_multiplier: FFN multiplier for the node channels.
- edge_ffn_multiplier: FFN multiplier for the edge channels.
- virtual_nodes: Number of virtual nodes. 0 (default) results in global average pooling being used instead of virtual nodes.
- upto_hop: Clipping value of the input distance matrix.
- attn_dropout: Dropout rate for the attention matrix.
- node_dropout: Dropout rate for the node channel's MHA and FFN blocks.
- edge_dropout: Dropout rate for the edge channel's MHA and FFN blocks.
- sel_svd_features: Rank of the SVD encodings r.
- svd_calculated_dim: Number of left and right singular vectors calculated and cached for the SVD encodings.
- svd_output_dim: Number of left and right singular vectors used as SVD encodings.
- svd_random_neg: Whether to randomly flip the signs of the singular vectors. Default: true.
- pretrained_weights_file: Used to specify the learned weights of an already trained model.
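As an illustration (the values below are arbitrary, not recommended settings), several of these configurations can be overridden on the command line in the same way as in the earlier examples:
python run_training.py 'scheme: pcqm4mv2' 'model_name: egt_demo' 'model_height: 12' 'node_width: 512' 'edge_width: 64' 'batch_size: 128' 'max_lr: 5.0e-4' 'virtual_nodes: 4'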
The Anaconda environment in which the experiments were conducted is specified in the environment.yml file.
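For example, the environment can be recreated with conda:
conda env create -f environment.yml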
Please cite the following paper if you find the code useful:
@article{hussain2021global,
title={Global Self-Attention as a Replacement for Graph Convolution},
author={Hussain, Md Shamim and Zaki, Mohammed J and Subramanian, Dharmashankar},
journal={arXiv preprint arXiv:2108.03348},
year={2021}
}