This is a Python implementation of the Disentangled Synthesis Transfer Network (DiSyn), which improves the generalizability of drug response prediction. DiSyn separates features related to drug response from unrelated ones, uses them to synthesize new training samples, and thereby improves prediction accuracy in label-scarce target domains.
DiSyn depends on PyTorch (1.13), NumPy, scikit-learn, and pandas.
Use the provided configuration file environment.yaml in /code to create the required conda environment.
$ cd /code
$ conda env create -f environment.yaml
Running the command above will create the environment disyn. To activate the disyn environment, use:
$ conda activate disyn
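After activating the environment, it can be useful to confirm that the listed dependencies are visible to the interpreter. The following is a minimal sketch, not part of the repository: the helper name `check_dependencies` is hypothetical, and the module names are the usual import names for the packages listed above.

```python
import importlib.util

# Import names for the dependencies listed in this README.
REQUIRED = {
    "torch": "PyTorch",
    "numpy": "NumPy",
    "sklearn": "scikit-learn",
    "pandas": "pandas",
}


def check_dependencies(modules=REQUIRED):
    """Return {module_name: True/False} without importing the modules."""
    status = {}
    for module_name in modules:
        # find_spec returns None when a top-level module cannot be located.
        status[module_name] = importlib.util.find_spec(module_name) is not None
    return status


if __name__ == "__main__":
    for name, found in check_dependencies().items():
        print(f"{REQUIRED[name]:<14} {'ok' if found else 'MISSING'}")
```

Any package reported as missing suggests the disyn environment is not active or was not created successfully.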
For re-evaluation or inference with your own data, please refer to inference.py in code/:
$ cd /code
$ python inference.py --drug=$drug
# You might need to adjust the format of your data to align with the current schema.
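One common kind of format adjustment is putting your features in the order the model expects. The sketch below is only an illustration under stated assumptions: the schema actually required by inference.py is not described here, so the feature names, the helper `align_features`, and the choice to fill missing features with 0.0 are all hypothetical. Check inference.py for the real input format.

```python
def align_features(sample, expected_features, fill_value=0.0):
    """Order one sample's values to match `expected_features`.

    `sample` maps feature name -> value. Features absent from the sample
    are filled with `fill_value` (an assumption; dropping the sample or
    imputing differently may be more appropriate for your data).
    Returns (aligned_values, missing_feature_names).
    """
    missing = [name for name in expected_features if name not in sample]
    aligned = [float(sample.get(name, fill_value)) for name in expected_features]
    return aligned, missing


if __name__ == "__main__":
    # Hypothetical feature names, for illustration only.
    expected = ["GENE_A", "GENE_B", "GENE_C"]
    sample = {"GENE_B": 2, "GENE_A": 1, "EXTRA": 9}
    values, missing = align_features(sample, expected)
    print(values, missing)
```

Features in the sample that the model does not expect (`EXTRA` here) are simply ignored, and the list of missing names lets you decide whether the sample is usable.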
We provide the model parameters that achieved the highest AUROC when trained on the GDSC and TCGA datasets.
DiSyn adopts a composite architecture: an unsupervised pretraining process, followed by iterations that alternate between a disentanglement stage and a task-specific training stage involving synthetic data. If you want to retrain the DiSyn models, please follow the instructions below.
The raw data used for model training and figure reproduction is available on Google Drive.
If you want to retrain the models, please download these files and extract them into /data.
$ python main_pretrain.py --nums_recon=$nums_recon --nums_critic=$nums_critic --drop_out=$drop_out
For the parameter ranges used in the paper, please refer to the supplementary materials.
$ python main_task_specific_train.py --drug_name=$drug --nums_recon=$nums_recon --nums_critic=$nums_critic --drop_out=$drop_out
where $nums_recon, $nums_critic, and $drop_out are the parameters employed in the pretraining process of the models.
Depending on the number of iterations, it might be necessary to loop through the following two steps.
$ python main_task_specific_train_step2_recon.py --drug_name=$drug --step=$step --recon_epochs=$recon_epochs --clsadv_alpha=$clsadv_alpha --drop_out=$drop_out
$ python main_task_specific_train_step2.py --drug_name=$drug --recon_epochs=$recon_epochs --clsadv_alpha=$clsadv_alpha --drop_out=$drop_out
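The loop over these two steps can be sketched as a small driver script. This is an illustration under stated assumptions, not the authors' workflow: the helper names are hypothetical, and the sketch assumes `--step` is an iteration counter starting at 1, which this README does not specify. By default it only prints the commands.

```python
import shlex
import subprocess


def iteration_commands(drug, n_iterations, recon_epochs, clsadv_alpha, drop_out):
    """Return the alternating step-2 commands for `n_iterations` rounds."""
    shared = [
        f"--recon_epochs={recon_epochs}",
        f"--clsadv_alpha={clsadv_alpha}",
        f"--drop_out={drop_out}",
    ]
    commands = []
    # Assumption: --step counts iterations from 1; verify against the script.
    for step in range(1, n_iterations + 1):
        commands.append([
            "python", "main_task_specific_train_step2_recon.py",
            f"--drug_name={drug}", f"--step={step}", *shared,
        ])
        commands.append([
            "python", "main_task_specific_train_step2.py",
            f"--drug_name={drug}", *shared,
        ])
    return commands


def run_all(commands, dry_run=True):
    """Print each command; execute it only when dry_run is False."""
    for cmd in commands:
        print(shlex.join(cmd))
        if not dry_run:
            subprocess.run(cmd, check=True)  # stop at the first failure


if __name__ == "__main__":
    # Placeholder values for illustration only.
    run_all(iteration_commands("example_drug", 2, 10, 1.0, 0.1))
```

Calling `run_all(..., dry_run=False)` from the directory containing the scripts would execute the steps in sequence.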
For figure reproduction, please refer to /code/fig2_reproduce.py and /code/fig3_reproduce_AUROC.py.
For now, only the reproduction process for the AUROC results is listed here.
$ python /code/fig2_reproduce.py
$ python /code/fig3_reproduce_AUROC.py
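For readers less familiar with the reported metric, AUROC can be read as the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below is a generic pure-Python illustration using made-up labels and scores; it is not taken from fig3_reproduce_AUROC.py, which may compute the metric differently (for example with scikit-learn).

```python
def auroc(labels, scores):
    """Pairwise AUROC: fraction of (positive, negative) pairs ranked correctly.

    `labels` holds 0/1 ground truth and `scores` the predicted scores.
    Ties between a positive and a negative score count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative sample")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


if __name__ == "__main__":
    # Toy data: 3 of the 4 positive/negative pairs are ordered correctly.
    print(auroc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

This pairwise form is quadratic in the number of samples, so it is meant for explanation rather than for large evaluations.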
Parts of this project's implementation come from:
https://github.com/XieResearchGroup/CODE-AE
https://github.com/fungtion/DANN_py3
Our respect goes to the contributors of the open-source community.