DTi2Vec: Drug-Target interaction prediction using network embedding and ensemble learning

Submitted: 16 December 2020 Accepted: 05 September 2021 Published: 22 September 2021

Link: https://jcheminf.biomedcentral.com/articles/10.1186/s13321-021-00552-w

This code is implemented using Python 3.7.

For any questions, please contact the first author:

Maha Thafar

Computer, Electrical and Mathematical Sciences and Engineering Division (CEMSE), Computational Bioscience Research Center (CBRC), King Abdullah University of Science and Technology (KAUST) - College of Computers and Information Technology, Taif University (TU)

Getting Started

DTi2Vec Workflow

Prerequisites:

Running the code requires the following Python packages:

gensim (for node2vec code)
numpy
scikit-learn
imblearn
pandas
xgboost

These packages can be installed using pip or conda, as in the following example:

pip install -r requirements.txt
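
After installation, a quick check can confirm that the listed packages are importable. The following is a minimal sketch and not part of the repository; the helper name `missing_modules` is hypothetical, and the list uses import names (scikit-learn is imported as `sklearn`).

```python
from importlib.util import find_spec

# Import names for the packages listed above (assumption: each package is
# checked by its top-level import name; scikit-learn is imported as "sklearn").
REQUIRED_MODULES = ["gensim", "numpy", "sklearn", "imblearn", "pandas", "xgboost"]


def missing_modules(names):
    """Return the top-level module names from `names` that cannot be found."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED_MODULES)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages are importable.")
```

Any name reported as missing can then be installed with pip or conda before running the main scripts.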

Files Description:

The important folders (needed to run the code) are:

1. (Input) folder: contains the input data for five datasets:

Nuclear Receptor (nr),
G-protein-coupled receptor (gpcr),
Ion Channel (ic),
Enzyme (e),
FDA_DrugBank (DrugBank).

Each dataset provides all the required data: drug-target interactions (in adjacency matrix and edge list formats), plus drug-drug similarity and target-target similarity (in square matrix format).
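
To illustrate how the two interaction formats relate, here is a minimal sketch that converts an adjacency matrix into an edge list. It is not taken from the repository: the function name, the toy identifiers, and the orientation (rows as drugs, columns as targets) are assumptions, and plain Python lists are used only to keep the example dependency-free.

```python
def adjacency_to_edgelist(matrix, drug_ids, target_ids):
    """List the (drug, target) pairs whose adjacency entry equals 1.

    Assumption: rows correspond to drugs and columns to targets.
    """
    edges = []
    for i, row in enumerate(matrix):
        for j, value in enumerate(row):
            if value == 1:
                edges.append((drug_ids[i], target_ids[j]))
    return edges


# Toy example: 2 drugs x 3 targets (identifiers are made up).
adjacency = [
    [1, 0, 1],
    [0, 1, 0],
]
drugs = ["D1", "D2"]
targets = ["T1", "T2", "T3"]

edges = adjacency_to_edgelist(adjacency, drugs, targets)
print(edges)  # [('D1', 'T1'), ('D1', 'T3'), ('D2', 'T2')]
```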

2. (EMBED) folder: also contains five folders, one per dataset. Each folder holds the node2vec embedding file generated for each fold of the training data (corresponding to the same CV seed used in the main node2vec code). The main node2vec code is available at https://github.com/aditya-grover/node2vec; alternatively, the node2vec library can be installed with pip install node2vec.
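
The reference node2vec script saves embeddings in the word2vec text format: a header line with the node count and dimension, followed by one line per node. Whether the files in EMBED follow exactly this layout is an assumption. Under that assumption, a minimal parser (with a hypothetical function name and made-up sample data) might look like this:

```python
def parse_word2vec_text(lines):
    """Parse word2vec text format into a dict {node_id: [float, ...]}.

    Expected layout (assumed): the first line is "<n_nodes> <dim>", and each
    following line is "<node_id> <v1> <v2> ... <v_dim>".
    """
    lines = iter(lines)
    header = next(lines).split()
    n_nodes, dim = int(header[0]), int(header[1])

    embeddings = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # skip blank lines
        node_id = parts[0]
        values = [float(x) for x in parts[1:]]
        if len(values) != dim:
            raise ValueError(f"node {node_id}: expected {dim} values, got {len(values)}")
        embeddings[node_id] = values

    if len(embeddings) != n_nodes:
        raise ValueError(f"expected {n_nodes} nodes, got {len(embeddings)}")
    return embeddings


# Made-up sample with two nodes and three dimensions.
sample = ["2 3", "D1 0.1 0.2 0.3", "T1 -0.5 0.0 1.5"]
emb = parse_word2vec_text(sample)
print(emb["T1"])  # [-0.5, 0.0, 1.5]
```

To read from disk, an open file object can be passed in place of the list, since the function only iterates over lines.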

The implementation consists of 3 files:

load_datasets.py --> reads the input data for each dataset separately
Two main scripts:
1- Random CV setting --> DTi2vec_main.py
2- New drug setting --> DTi2vec_newDrug_seting_generatedEMBED.py

Installing:

To get the development environment running, the code takes 3 parameters from the user:

the dataset name, data: (nr, gpcr, ic, e, DrugBank)
the boosting classifier, classifier: AdaBoost (ab), XGBoost (xgbc)
the fusion function, func: (Concat, Hadamard, AVG, WL1, WL2)
(the default values are: dataset: nr, classifier: ab, fusion function: Hadamard)
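
The fusion function combines a drug embedding and a target embedding into one feature vector for the classifier. The sketch below uses the standard definitions of these operators from the node2vec paper (average, Hadamard, weighted-L1, weighted-L2) plus concatenation; whether the repository implements them in exactly this way is an assumption, and the function name `fuse` is hypothetical. Plain lists are used to keep the example dependency-free.

```python
def fuse(drug_vec, target_vec, func="Hadamard"):
    """Combine two embedding vectors into one feature vector.

    Concat   : drug vector followed by target vector (length doubles)
    Hadamard : element-wise product
    AVG      : element-wise mean
    WL1      : element-wise absolute difference
    WL2      : element-wise squared difference
    """
    if func == "Concat":
        return list(drug_vec) + list(target_vec)

    if len(drug_vec) != len(target_vec):
        raise ValueError("element-wise fusion needs vectors of equal length")

    pairs = list(zip(drug_vec, target_vec))
    if func == "Hadamard":
        return [d * t for d, t in pairs]
    if func == "AVG":
        return [(d + t) / 2 for d, t in pairs]
    if func == "WL1":
        return [abs(d - t) for d, t in pairs]
    if func == "WL2":
        return [(d - t) ** 2 for d, t in pairs]
    raise ValueError(f"unknown fusion function: {func}")


# Made-up two-dimensional embeddings.
drug = [1.0, 2.0]
target = [3.0, -1.0]
print(fuse(drug, target, "Hadamard"))  # [3.0, -2.0]
print(fuse(drug, target, "WL1"))       # [2.0, 3.0]
```

All element-wise operators keep the embedding dimension, while Concat doubles it, which changes the number of features the classifier sees.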
To run the code for the random CV setting (with the configuration that gives the best results for each dataset), run the following:

python DTi2vec_main.py --data nr --classifier ab --func WL1

python DTi2vec_main.py --data gpcr --classifier xgbc --func Hadamard

python DTi2vec_main.py --data ic --classifier xgbc --func Concat

python DTi2vec_main.py --data e --classifier xgbc --func Concat

python DTi2vec_main.py --data DrugBank --classifier xgbc --func Hadamard
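
The commands above pass three flags. As an illustration of how such flags can be declared with defaults, here is a minimal argparse sketch. It is not the repository's parser: the function name `build_parser`, the help strings, and the use of `choices` are assumptions, and the spelling Hadamard follows the commands above.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the commands above."""
    parser = argparse.ArgumentParser(description="DTi2Vec random CV setting (sketch)")
    parser.add_argument("--data", default="nr",
                        choices=["nr", "gpcr", "ic", "e", "DrugBank"],
                        help="dataset name")
    parser.add_argument("--classifier", default="ab",
                        choices=["ab", "xgbc"],
                        help="boosting classifier: AdaBoost (ab) or XGBoost (xgbc)")
    parser.add_argument("--func", default="Hadamard",
                        choices=["Concat", "Hadamard", "AVG", "WL1", "WL2"],
                        help="fusion function")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--data", "gpcr", "--classifier", "xgbc", "--func", "Hadamard"]
)
print(args.data, args.classifier, args.func)  # gpcr xgbc Hadamard
```

Calling `parse_args([])` on this sketch falls back to the documented defaults (nr, ab, Hadamard).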

--

For new drug setting: (It takes 2 args: dataset name and the fusion function)

The classifier is XGBoost for all datasets.
The embeddings are generated using node2vec for the new-drug CV and can be found in 'EMBED/newDrug_EMBED';
this code then reads the generated embeddings.
The best results were obtained using WL1 for all datasets except DrugBank, which uses Hadamard.

Examples to run the new drug setting:

python DTi2vec_newDrug_seting_generatedEMBED.py --data nr --func WL1

python DTi2vec_newDrug_seting_generatedEMBED.py --data DrugBank --func Hadamard
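
A new drug setting is commonly understood as cross-validation in which the drugs in the test fold have no interactions in the training data. The sketch below illustrates that idea only; it is an assumption about the setting rather than the repository's code, and the function names, the fold count, and the seed are all hypothetical.

```python
import random


def new_drug_folds(drug_ids, n_folds=5, seed=0):
    """Partition the distinct drugs into `n_folds` disjoint groups."""
    drugs = sorted(set(drug_ids))
    random.Random(seed).shuffle(drugs)
    return [drugs[i::n_folds] for i in range(n_folds)]


def split_pairs(pairs, test_drugs):
    """Send every (drug, target) pair of a held-out drug to the test set."""
    test_drugs = set(test_drugs)
    train = [p for p in pairs if p[0] not in test_drugs]
    test = [p for p in pairs if p[0] in test_drugs]
    return train, test


# Made-up drug-target pairs.
pairs = [("D1", "T1"), ("D1", "T2"), ("D2", "T1"), ("D3", "T3"), ("D4", "T2")]
folds = new_drug_folds([d for d, _ in pairs], n_folds=2, seed=0)
train, test = split_pairs(pairs, folds[0])

# No drug appears on both sides of the split.
print({d for d, _ in train}.isdisjoint({d for d, _ in test}))  # True
```

Splitting by drug rather than by pair is what distinguishes this setting from random CV, where pairs of the same drug can land in both training and test data.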

For citation:

Thafar, Maha A., Rawan S. Olayan, Somayah Albaradei, Vladimir B. Bajic, Takashi Gojobori, Magbubah Essack, and Xin Gao. "DTi2Vec: Drug–target interaction prediction using network embedding and ensemble learning." Journal of Cheminformatics 13, no. 1 (2021): 1-18.

