This repository contains data and code for the paper Zhang*, Hou*, et al. Polygenic enrichment distinguishes disease associations of individual cells in single-cell RNA-seq data.
Also see scDRS software and documentation.
- Final version (rv_final): current version (using scdrs v1.0.0).
- Revision 1 (rv1): revision 1 (using scdrs v1.0.0).
- Initial submission: see readme_inital_sub.md file (using scdrs v0.1).
The code is in ./job.reproduce. To run it, set DATA_PATH (if it appears in the code) to your local copy of the main data folder scDRS_data_release_{RELEASE_DATE}, and set SCORE_FILE_PATH (if it appears in the code) to your local copy of the scDRS score file folder scDRS_data_release_{RELEASE_DATE}.score_file_tmsfacs.
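As a minimal sketch of the path setup described above, the two variables can be derived from a single download location and a release date tag. The helper name `release_paths` and the directory `/path/to/downloads` are hypothetical and are not part of the repository; only the folder naming pattern comes from this README.

```python
from pathlib import Path


def release_paths(base_dir, release_date):
    """Return (DATA_PATH, SCORE_FILE_PATH) for one release date tag.

    Follows the folder naming pattern given in this README; the helper
    itself is illustrative and not part of the repository.
    """
    base = Path(base_dir)
    data_path = base / f"scDRS_data_release_{release_date}"
    score_file_path = base / f"scDRS_data_release_{release_date}.score_file_tmsfacs"
    return data_path, score_file_path


# Hypothetical download location; replace with your own folder.
DATA_PATH, SCORE_FILE_PATH = release_paths("/path/to/downloads", "030122")

# Report whether each folder is present before running a notebook.
for p in (DATA_PATH, SCORE_FILE_PATH):
    print(p, "(found)" if p.is_dir() else "(missing)")
```

Use the release tag 030122 for rv_final and rv1, and 092121 for the initial submission.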
- Final version (rv_final) and revision 1 (rv1): Main data scDRS_data_release_030122 (3.8 GB). scDRS score files for TMS FACS + 74 diseases/traits scDRS_data_release_030122.score_file_tmsfacs (36.8 GB).
- Initial submission: Main data scDRS_data_release_092121 (3.6 GB). scDRS score files for TMS FACS + 74 diseases/traits scDRS_data_release_092121.score_file_tmsfacs (36.3 GB).
Compute scDRS scores for TMS FACS + 74 diseases/traits
- Final version (rv_final) and revision 1 (rv1): Score files are in scDRS_data_release_030122.score_file_tmsfacs. Compute them yourself using reproduce_compute_score.tms_facs_with_cov.magma_10kb_1000.rv1.sh
- Initial submission: Score files were included in scDRS_data_release_092121.score_file_tmsfacs. Compute them yourself using reproduce_compute_score.tms_facs_with_cov.magma_10kb_1000.sh
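To check a downloaded or recomputed score folder, something like the following sketch may help. It assumes that each trait is stored as a gzipped, tab-separated file named `<trait>.score.gz` with a header row. That layout is an assumption about the scDRS output, not something stated in this README, so adjust the pattern if your files differ. Both function names are hypothetical.

```python
import csv
import gzip
from pathlib import Path


def list_traits(score_dir):
    """List trait names from files named <trait>.score.gz (assumed naming)."""
    suffix = ".score.gz"
    return sorted(p.name[: -len(suffix)] for p in Path(score_dir).glob("*" + suffix))


def read_score_file(path):
    """Read one gzipped, tab-separated score file into a list of row dicts.

    Column names are taken from the header row; values are kept as strings.
    """
    with gzip.open(path, "rt", newline="") as f:
        return list(csv.DictReader(f, delimiter="\t"))
```

For example, `list_traits(SCORE_FILE_PATH)` would list the traits found in the score folder, and `read_score_file` can then be applied to any one of those files.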
Cell type-level analysis (Fig. 3)
- Final version (rv_final) and revision 1 (rv1): reproduce_celltype.rv1.ipynb.
- Initial submission: reproduce_celltype.ipynb.
T cell analysis (Fig. 4)
- Final version (rv_final) and revision 1 (rv1) (Fig. 4A-E): reproduce_tcell.rv1.ipynb.
- Initial submission (Fig. 4A-C): reproduce_tcell.ipynb.
T cell gene prioritization (Fig. 4)
- Final version (rv_final) and revision 1 (rv1) (Fig. 4F): reproduce_tcell_gene.rv1.ipynb.
- Initial submission (Fig. 4D): reproduce_tcell_gene.ipynb.
Neuron analysis (Fig. 5A,B)
- Final version (rv_final) and revision 1 (rv1): reproduce_neuron.rv1.ipynb.
- Initial submission: reproduce_neuron.ipynb.
Hepatocyte analysis (Fig. 5C,D)
- Final version (rv_final) and revision 1 (rv1): reproduce_hep.rv1.ipynb.
- Initial submission: reproduce_hep.ipynb.
Curate information for 74 diseases/traits:
- Curate information for the 74 diseases: job.curate_data/get_trait_list.ipynb
- Curate information for the 74 diseases (rv1): job.curate_data/get_trait_list.rv1.ipynb
Curate gene set (.gs) files:
- .gs file for 74 diseases: job.curate_data/curate_gs_file.ipynb
- .gs file for 74 diseases (rv1): job.curate_data/curate_gs_file.rv1.ipynb
- .gs file for T cell signatures: job.curate_data/curate_gs.tcell_signature.ipynb
- .gs file for ploidy signatures: job.curate_data/curate_ploidy_gs.ipynb
- .gs file for zonation signatures: job.curate_data/curate_zonation_gs.ipynb
- .gs file for metabolic pathways: job.curate_data/curate_gs.metabolic.ipynb
Curate scRNA-seq data sets:
- TS FACS: job.curate_data/curate_ts_data.ipynb
- Cano-Gamez & Soskic et al.: job.curate_data/curate_canogamez_tcell_data.ipynb
- Nathan et al.: job.curate_data/curate_nathan_tcell_data.ipynb
- Aizarani et al.: job.curate_data/curate_aizarani_liver_atlas_data.ipynb
- Halpern & Shenhav et al.: job.curate_data/curate_halpern_mouse_liver_data.ipynb
- Richter & Deligiannis et al.: job.curate_data/curate_richter_hepatocyte_data.ipynb
- Compute gene-level statistics: compute_data_stats.py and compute_data_stats.sh
- Get meta information of data sets: get_data_info.ipynb
- Get meta information of data sets (rv1): get_data_info.rv1.ipynb
- Get meta information of data sets (rv_final): get_data_info.rv_final.ipynb
- TMS FACS + 74 diseases: job.compute_score/compute_score.tms_facs_with_cov.magma_10kb_1000.sh
- TMS FACS + T cell signatures (using scDRS scripts instead of CLI): job.compute_score/compute_score.tms_facs_with_cov.tcell_sig.sh
- TMS FACS + metabolic (using scDRS scripts instead of CLI): job.compute_score/compute_score.tms_facs_with_cov.hep_metabolic.sh
- TMS droplet + 74 diseases: job.compute_score/compute_score.tms_droplet_with_cov.magma_10kb_1000.sh
- TS FACS + 74 diseases: job.compute_score/compute_score.ts_facs_with_cov.magma_10kb_1000.sh
Make schematic figures.
Data generation:
- Generate the TMS FACS 10K subsampled data and null gene sets: job.simulation/generate_null_simulation_data.ipynb
- Generate the TMS FACS 10K subsampled data and null gene sets (rv1): job.simulation/generate_null_simulation_data.rv1.ipynb
- Generate location-matched gene sets and make figures (rv1): job.simulation/simulation.other_null.ipynb
- Generate location-matched gene sets and make figures (rv_final): job.simulation/simulation.other_null.rv_final.ipynb
- Generate causal gene sets and perturbation configurations: job.simulation/generate_causal_simulation_data.ipynb
- Generate 20 reps of subsampled TMS FACS 10K data (rv1): job.simulation/generate_subsampled_tms_data.rv1.ipynb
Compute results:
- Compute scDRS scores for null simulations: job.simulation/compute_simu_score.sh
- Compute scDRS scores for null simulations (rv1): job.simulation/compute_simu_score.rv1.sh
- Compute scDRS scores for null simulations using the -adj-prop option (rv1): job.simulation/compute_simu_score.adj_prop.rv1.sh
- Compute Seurat scores for null simulations: job.simulation/compute_simu_score_scanpy.sh
- Compute Seurat scores for null simulations (rv1): job.simulation/compute_simu_score_scanpy.rv1.sh
- Compute Vision scores for null simulations: job.simulation/compute_simu_score_vision.sh
- Compute Vision scores for null simulations (rv1): job.simulation/compute_simu_score_vision.rv1.sh
- Compute VAM scores for null simulations: job.simulation/call_R_vam.sh
- Compute VAM scores for null simulations (rv1): job.simulation/call_R_vam.rv1.sh
- Compute scores (scDRS/Seurat/Vision) for causal simulations (500 random causal cells): job.simulation/compute_perturb_simu_score.sh
- Compute scores (scDRS/Seurat/Vision) for causal simulations (B cells causal): job.simulation/compute_perturb_simu_score_Bcell.sh
Make figures:
- Make figures for null simulations: job.simulation/make_figure.null_simulation.ipynb
- Make figures for null simulations (rv1): job.simulation/make_figure.null_simulation.rv1.ipynb
- Make figures for null simulations (rv_final): job.simulation/make_figure.null_simulation.rv_final.ipynb
- Make figures for causal simulations (500 random causal cells): job.simulation/make_figure.causal_simulation.ipynb
- Make figures for causal simulations (500 random causal cells) (rv_final): job.simulation/make_figure.causal_simulation.rv_final.ipynb
- Make figures for causal simulations (B cells causal): job.simulation/make_figure.causal_simulation_Bcell.ipynb
- Make figures for causal simulations (B cells causal) (rv_final): job.simulation/make_figure.causal_simulation_Bcell.rv_final.ipynb
- Make figures for subsampled UKB (rv1): job.simulation/make_figure.UKB_subsample.rv1.ipynb
- Make figures for subsampled UKB (rv_final): job.simulation/make_figure.UKB_subsample.rv_final.ipynb
- Summary of the cell-type association results: job.celltype_association/summary_ct.ipynb
- Main analysis: job.celltype_association/main_figure.ipynb
- Main analysis (rv1): job.celltype_association/main_figure.rv1.ipynb
- Comparison of cell-type association for three atlas datasets: TMS FACS, TMS droplet, TS FACS: job.celltype_association/atlas_compare.ipynb
- Comparison of cell-type association for three atlas datasets: TMS FACS, TMS droplet, TS FACS (rv1): job.celltype_association/atlas_compare.rv1.ipynb
- Relationship between scDRS power and heritability, polygenicity: job.celltype_association/optimal_param.ipynb
- Comparison of cell-type association to LDSC-SEG: job.celltype_association/ldsc_compare.ipynb
- Comparison of cell-type association to alternative methods (rv1): job.celltype_association/methods_compare.rv1.ipynb
- Effects of gene sets for scDRS power: job.celltype_association/vary_geneset.ipynb
- Examples of within-cell-type heterogeneity (including correlated genes and covariates) (rv_final): job.celltype_association/hetero_examples.rv_final.ipynb
- Evaluation of alternative versions of scDRS using control traits and cell types (rv1): job.continuous_score/ (see the directory for more details)
- Reprocess TMS T cells and assign effectorness gradients: job.case_tcell/s1_reprocess_tms_tcell.ipynb
- Main analysis: job.case_tcell/s3_analysis_tcell.ipynb
- Main analysis (rv1): job.case_tcell/s3_analysis_tcell.rv1.ipynb
- Main analysis (rv_final): job.case_tcell/s3_analysis_tcell.rv_final.ipynb
- Replication in Cano-Gamez & Soskic et al. and Nathan et al. data: job.case_tcell/s4_analysis_tcell.replication.ipynb
- Replication in Cano-Gamez & Soskic et al. and Nathan et al. data (rv1): job.case_tcell/s4_analysis_tcell.replication.rv1.ipynb
- Cluster-level LDSC-SEG analysis: job.case_tcell/s5_compare_ldsc_cluster_4res.ipynb
- Cluster-level LDSC-SEG analysis (rv1): job.case_tcell/s5_compare_ldsc_cluster_4res.rv1.ipynb
- Cluster-level LDSC-SEG analysis (rv_final): job.case_tcell/s5_compare_ldsc_cluster_4res.rv_final.ipynb
- Disease gene prioritization: job.case_tcell/s6_gene_prioritization.ipynb
- Disease gene prioritization (rv1): job.case_tcell/s6_gene_prioritization.rv1.ipynb
- Disease gene prioritization (rv_final): job.case_tcell/s5_compare_ldsc_cluster_4res.rv_final.ipynb
- Gene set overlap and disease score correlation between traits (rv1): job.case_tcell/s7_contrast_traits.rv1.ipynb
- Automatic annotation using ProjecTILs in R (rv1): job.case_tcell/s8_map_tms_tcell_ProjecTILs.rv1.ipynb
- Main analysis (Fig. 5A,B): job.ca1_pyramidal/main_figure.ipynb
- Main analysis (Fig. 5A,B) (rv1): job.ca1_pyramidal/main_figure.rv1.ipynb
- Analysis of neurons in TMS FACS dataset: job.ca1_pyramidal/tms.ipynb
- Analysis of neurons in TMS FACS dataset (rv1): job.ca1_pyramidal/tms.rv1.ipynb
- Analysis of Zeisel et al. 2015 dataset: job.ca1_pyramidal/zeisel.ipynb
- Analysis of Zeisel et al. 2015 dataset (rv1): job.ca1_pyramidal/zeisel.rv1.ipynb
- Verification of the inferred spatial coordinates: job.ca1_pyramidal/spatial_verify.ipynb
- Reprocess TMS hepatocytes: job.case_hepatocyte/s1_reprocess_tms_hep.ipynb
- Main analysis: job.case_hepatocyte/s3_analysis_hep.ipynb
- Main analysis (rv1): job.case_hepatocyte/s3_analysis_hep.rv1.ipynb
- Main analysis (rv_final): job.case_hepatocyte/s3_analysis_hep.rv_final.ipynb