This repository contains a PyTorch implementation of the paper "Hierarchical Graph Representation Learning for the Prediction of Drug-Target Binding Affinity".
- `materials/`: contains the raw materials of the Davis dataset and the KIBA dataset.
- `data/`: contains the input data of our model.
- `metrics.py`: contains the evaluation metrics used in our experiments (i.e., MSE, CI, $r_m^2$, Pearson, and AUPR); the CI formula is sketched after this list.
- `GraphInput.py`: contains the construction processes of the affinity graph, the drug molecule graph, and the target molecule graph.
- `model.py`: contains our HGRL-DTA model and its variants.
- `train_test_S1.py`: contains the training and testing processes under the S1 setting.
- `train_test_S2.py`: contains the training and testing processes under the S2 setting.
- `train_test_S3.py`: contains the training and testing processes under the S3 setting.
- `train_test_S4.py`: contains the training and testing processes under the S4 setting.
- `utils.py`: contains utility functions.
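For reference, the concordance index (CI) computed in `metrics.py` follows the standard definition used in the DTA literature (stated here for convenience; the exact implementation details are in the code):

$$
\mathrm{CI} = \frac{1}{Z} \sum_{\delta_i > \delta_j} h(b_i - b_j), \qquad
h(x) =
\begin{cases}
1, & x > 0 \\
0.5, & x = 0 \\
0, & x < 0
\end{cases}
$$

where $\delta_i > \delta_j$ are the true affinities of a comparable pair, $b_i$ and $b_j$ are the corresponding predicted affinities, and $Z$ is the number of comparable pairs.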
- numpy == 1.17.4
- scikit-learn == 0.22.2
- rdkit == 2017.09.1
- networkx == 2.5
- torch == 1.4.0
- torch-geometric == 1.7.0
- lifelines == 0.25.6
- argparse == 1.4.0
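One possible way to set up the environment is sketched below (the package sources are assumptions, not taken from the original instructions; in particular, rdkit is usually installed from conda, and torch-geometric 1.7.0 needs torch-scatter/torch-sparse builds matching torch 1.4.0):

```bash
# Environment setup sketch -- adapt channels and wheels to your platform and CUDA version.
conda install -c rdkit rdkit=2017.09.1
pip install numpy==1.17.4 scikit-learn==0.22.2 networkx==2.5 lifelines==0.25.6 torch==1.4.0
# torch-geometric also requires torch-scatter and torch-sparse wheels built against torch 1.4.0;
# see the PyTorch Geometric installation notes for the matching commands.
pip install torch-geometric==1.7.0
```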
To prepare target molecule graphs, please refer to Prepare Target Molecule Graphs.
Cross-validate our model on the Davis dataset:
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --fold 0
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --fold 1
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --fold 2
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --fold 3
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --fold 4
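Since the five runs above differ only in `--fold`, they can equivalently be launched in a loop (a convenience sketch using the same flags as above):

```bash
# Run all five cross-validation folds on Davis with identical hyperparameters.
for fold in 0 1 2 3 4; do
    python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 \
        --batch_size 512 --lr 0.0005 --model 0 --fold ${fold}
done
```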
Cross-validation under the other experimental settings is similar.
- Train and test our model on the Davis dataset:
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2
- Train and test our model on the KIBA dataset:
python train_test_S1.py --dataset kiba --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2 --drug_aff_k 40 --target_aff_k 150
Ablation study on the Davis dataset:
- HGRL-DTA (w/o CAG):
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 2 --dropedge_rate 0.2
- HGRL-DTA (w/o FMG):
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 1 --dropedge_rate 0.2
- HGRL-DTA (w/o WA):
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 0 --weighted --dropedge_rate 0.2
- HGRL-DTA-L:
python train_test_S1.py --dataset davis --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 3 --dropedge_rate 0.2
Ablation study on the KIBA dataset is similar.
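For illustration, the w/o CAG variant on KIBA could be run as follows (a sketch that simply combines the KIBA-specific flags used above with `--model 2`; not taken verbatim from the original instructions):

```bash
python train_test_S1.py --dataset kiba --cuda_id 0 --num_epochs 2000 --batch_size 512 --lr 0.0005 --model 2 --dropedge_rate 0.2 --drug_aff_k 40 --target_aff_k 150
```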
- Train and test our model on the Davis dataset:
python train_test_S2.py --dataset davis --cuda_id 0 --num_epochs 200 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2 --drug_sim_k 2 --skip
- Train and test our model on the KIBA dataset:
python train_test_S2.py --dataset kiba --cuda_id 0 --num_epochs 200 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2 --drug_aff_k 40 --target_aff_k 90 --drug_sim_k 2 --skip
To prevent overfitting, the same setting (i.e., 200 epochs, a batch size of 512, and a learning rate of 0.0005) is used when running the other compared methods.
- Train and test our model on the Davis dataset:
python train_test_S3.py --dataset davis --cuda_id 0 --num_epochs 200 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2 --target_sim_k 7 --skip
- Train and test our model on the KIBA dataset:
python train_test_S3.py --dataset kiba --cuda_id 0 --num_epochs 200 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2 --drug_aff_k 40 --target_aff_k 150 --target_sim_k 7 --skip
To prevent overfitting, the same setting (i.e., 200 epochs, a batch size of 512, and a learning rate of 0.0005) is used when running the other compared methods.
- Train and test our model on the Davis dataset:
python train_test_S4.py --dataset davis --cuda_id 0 --num_epochs 200 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2 --drug_sim_k 2 --target_sim_k 7 --skip
- Train and test our model on the KIBA dataset:
python train_test_S4.py --dataset kiba --cuda_id 0 --num_epochs 200 --batch_size 512 --lr 0.0005 --model 0 --dropedge_rate 0.2 --drug_aff_k 40 --target_aff_k 90 --drug_sim_k 2 --target_sim_k 7 --skip
To prevent overfitting, the same setting (i.e., 200 epochs, a batch size of 512, and a learning rate of 0.0005) is used when running the other compared methods.