DiSyn

This is a Python implementation of the Disentangled Synthesis Transfer Network (DiSyn). DiSyn improves the generalizability of drug response prediction by separating features related to drug response from those unrelated to it, then using them to synthesize new training samples. This improves prediction accuracy in label-scarce target domains.

1. Quick Start

1.1 Installation

DiSyn depends on PyTorch (1.13), NumPy, scikit-learn, and pandas.

Use the provided configuration file environment.yaml in /code to create the required conda environment.

$ cd /code
$ conda env create -f environment.yaml

Running the command above creates the disyn environment. To activate it, use:

$ conda activate disyn
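
After activating the environment, a quick check like the one below can confirm that the dependencies listed above are visible to Python. This is a minimal sketch and not part of the repository. The distribution names (torch, numpy, scikit-learn, pandas) are the usual PyPI names and are assumed to match what environment.yaml installs.

```python
from importlib import metadata

# Distribution names assumed to correspond to the dependencies listed above.
REQUIRED = ["torch", "numpy", "scikit-learn", "pandas"]


def installed_versions(names):
    """Return {name: version string, or None if the package is not installed}."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


if __name__ == "__main__":
    for name, version in installed_versions(REQUIRED).items():
        print(f"{name}: {version or 'NOT FOUND'}")
```

If torch is reported with a version other than 1.13, the environment may differ from the one the authors used.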

1.2 Re-evaluation and inference

For re-evaluation or inference with your own data, please refer to inference.py in code/:

$ cd /code
$ python inference.py --drug=$drug
# You might need to adjust the format of your data to align with the current schema.

We provide the model parameters that achieved the highest AUROC when trained on the GDSC and TCGA datasets.

Drug	SHA256
5-Fluorouracil	23f73c2b9e15af1fdc03ee081084b47b928433df40ed8dc86ff1b231b684f13d
Bicalutamide	534968be4b0fa70dedb82ea78cfd50272b83aecf2f8fc7c480a34fcd17d2bc4c
Bleomycin	5948efc4c3508943847c11d75d17ea3e09230433d89445660fa745ba377e2deb
Cisplatin	02ad2b62dfd965fce5eaf80ed792ea1acebe61997cff17aa721d774f9d3dc2b8
Docetaxel	b4c815b847f70fd97364551c0380f007fe45cfcb8cb2e661121118148375207d
Doxorubicin	3b21e1a08f1fe3e61c83c0f1213e320a1d4d8e4f4b4bb8ada92f1634726a6888
Etoposide	9114ee71b5ef014a18d6e9f8bbd7bddd3f5360bbff13afb221e76f739a95716f
Gemcitabine	e6b658c88b0925ff27b236a5aaa4e028fa01a505a8d904979acbd74bb75584c1
Methotrexate	d058ddfc92dfeaf62cefe799bead5b45dae87492dc4d97026af2fdbecfdc16f7
Paclitaxel	b0dabc57efc7fd349770e9eb627f16b7cb1079461093147870b18d1a3557e998
Pemetrexed	2bd588f30c75d24e55afa300307443950ebea958efe62f00fc0511179baff549
Sorafenib	b8d7dc953472538d87fb6ed8a3bd571fe49e148bb69e61157fca8ad000ec4651
Tamoxifen	ac65f639b980f3bfb363176bc50745be515a18828b3e0d25715077a97f716f4e
Temozolomide	b65378082288d2ee9901a7aef54eff845e0b9be158ecc0cbce58deb132653480
Vinblastine	a6a1ed7f9b32da3424e4a354ef2036e1d4b4370f7a2258b06a88c683eba70f13
Vinorelbine	0f8dea963a249397d86d353025d58c63d1d3af19833c8648bbdea5e40c941750
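
Assuming the SHA256 values above are checksums of the provided parameter files, a downloaded file can be verified before use. The sketch below uses only the standard library. The file path in the usage comment is hypothetical, since the README does not state where the parameter files are stored or how they are named.

```python
import hashlib
from pathlib import Path


def sha256_of_file(path, chunk_size=1 << 20):
    """Compute the SHA256 hex digest of a file, reading it in chunks."""
    digest = hashlib.sha256()
    with Path(path).open("rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def matches_checksum(path, expected_hex):
    """Return True if the file's SHA256 equals the expected value from the table."""
    return sha256_of_file(path) == expected_hex.strip().lower()


# Usage (hypothetical path; the hash is the Cisplatin entry from the table):
# matches_checksum(
#     "path/to/cisplatin_parameters",
#     "02ad2b62dfd965fce5eaf80ed792ea1acebe61997cff17aa721d774f9d3dc2b8",
# )
```

Reading in chunks keeps memory use constant regardless of file size.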

2. Model Retraining

DiSyn adopts a composite architecture: an unsupervised pretraining process, followed by iterations that alternate between a disentanglement stage and a task-specific training stage that involves synthetic data. If you're looking to retrain the DiSyn models, please refer to the following instructions.

2.1 Data

The raw data used for model training and figure reproduction are available on Google Drive. To retrain the models, download these files and extract them into /data.

2.2 Model Retraining

initial steps

2.2.1 Model pre-training

$ python main_pretrain.py --nums_recon=$nums_recon --nums_critic=$nums_critic --drop_out=$drop_out

For the parameter ranges used in the paper, please refer to the supplementary materials.

2.2.2 Task-specific training

$ python main_task_specific_train.py --drug_name=$drug --nums_recon=$nums_recon --nums_critic=$nums_critic --drop_out=$drop_out

where $nums_recon, $nums_critic, and $drop_out are the same parameter values used in the pretraining step.
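
Because the task-specific step reuses the pretraining parameters, it can help to define them once and derive both command lines from the same values. The sketch below only builds the argument lists shown above. The numeric values and the spelling of the drug name are placeholders, not the settings used in the paper.

```python
def pretrain_cmd(nums_recon, nums_critic, drop_out):
    """Argument list for main_pretrain.py, mirroring the command in 2.2.1."""
    return [
        "python", "main_pretrain.py",
        f"--nums_recon={nums_recon}",
        f"--nums_critic={nums_critic}",
        f"--drop_out={drop_out}",
    ]


def task_specific_cmd(drug, nums_recon, nums_critic, drop_out):
    """Argument list for main_task_specific_train.py, mirroring 2.2.2."""
    return [
        "python", "main_task_specific_train.py",
        f"--drug_name={drug}",
        f"--nums_recon={nums_recon}",
        f"--nums_critic={nums_critic}",
        f"--drop_out={drop_out}",
    ]


if __name__ == "__main__":
    # Placeholder values for illustration only.
    params = {"nums_recon": 100, "nums_critic": 100, "drop_out": 0.1}
    for cmd in (pretrain_cmd(**params), task_specific_cmd("Cisplatin", **params)):
        # Print instead of executing; pass cmd to subprocess.run to launch it.
        print(" ".join(cmd))
```

Keeping the shared values in one dictionary avoids the two steps drifting apart when a parameter is changed.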

iterative steps

Depending on the number of iterations, it might be necessary to loop through the following two steps.

2.2.3 Disentanglement

$ python main_task_specific_train_step2_recon.py --drug_name=$drug --step=$step --recon_epochs=$recon_epochs --clsadv_alpha=$clsadv_alpha --drop_out=$drop_out

2.2.4 Task-specific training in iteration

$ python main_task_specific_train_step2.py --drug_name=$drug --recon_epochs=$recon_epochs --clsadv_alpha=$clsadv_alpha --drop_out=$drop_out
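
The alternation described under "iterative steps" can be sketched as a loop that runs the disentanglement command and then the task-specific command once per iteration. This sketch rests on several assumptions: passing the loop index as --step, numbering steps from 1, the number of iterations, and the hyperparameter values are all illustrative guesses rather than documented behavior. Check the scripts' argument parsers before relying on it.

```python
import subprocess


def iteration_cmds(drug, step, recon_epochs, clsadv_alpha, drop_out):
    """Build the two commands of one iteration, mirroring 2.2.3 and 2.2.4."""
    shared = [
        f"--recon_epochs={recon_epochs}",
        f"--clsadv_alpha={clsadv_alpha}",
        f"--drop_out={drop_out}",
    ]
    disentangle = ["python", "main_task_specific_train_step2_recon.py",
                   f"--drug_name={drug}", f"--step={step}", *shared]
    # The README's command for this script takes no --step flag, so none is passed.
    train = ["python", "main_task_specific_train_step2.py",
             f"--drug_name={drug}", *shared]
    return [disentangle, train]


def run_iterations(drug, n_iterations, dry_run=True, **hyperparams):
    """Run (or, with dry_run=True, only collect) the commands for each iteration."""
    commands = []
    for step in range(1, n_iterations + 1):  # assumption: steps are numbered from 1
        for cmd in iteration_cmds(drug, step, **hyperparams):
            if not dry_run:
                subprocess.run(cmd, check=True)  # stop at the first failing script
            commands.append(cmd)
    return commands
```

With the default dry_run=True, nothing is executed; the function just returns the command lists so they can be inspected first.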

3. Figure Reproduction

To reproduce the figures, please refer to /code/fig2_reproduce.py and /code/fig3_reproduce_AUROC.py.
Only the AUROC reproduction process is listed here for now.

$ python /code/fig2_reproduce.py
$ python /code/fig3_reproduce_AUROC.py

4. Others

Parts of this project's implementation come from

https://github.com/XieResearchGroup/CODE-AE
https://github.com/fungtion/DANN_py3

We thank the contributors of the open-source community.
