Workflow to construct SQLite files of LRBase.XXX.eg.db-type packages.
- Bash: GNU bash, version 4.2.46(1)-release (x86_64-redhat-linux-gnu)
- Snakemake: 6.0.5
- Singularity: 3.5.3
- DLRP: L-R list in DLRP database
- IUPHAR: L-R list in IUPHAR database
- HPMR: L-R list in HPMR database
- CELLPHONEDB: L-R list in CellPhoneDB database
- SINGLECELLSIGNALR: L-R list in SingleCellSignalR database
- ENSEMBL_DLRP: L-R list in DLRP based on the ortholog of Human genes in Ensembl Protein trees
- ENSEMBL_IUPHAR: L-R list in IUPHAR based on the ortholog of Human genes in Ensembl Protein trees
- ENSEMBL_HPMR: L-R list in HPMR based on the ortholog of Human genes in Ensembl Protein trees
- ENSEMBL_CELLPHONEDB: L-R list in CellPhoneDB database based on the ortholog of Human genes in Ensembl Protein trees
- ENSEMBL_SINGLECELLSIGNALR: L-R list in SingleCellSignalR database based on the ortholog of Human genes in Ensembl Protein trees
- NCBI_DLRP: L-R list in DLRP based on the ortholog of Human genes in NCBI Homologene
- NCBI_IUPHAR: L-R list in IUPHAR based on the ortholog of Human genes in NCBI Homologene
- NCBI_HPMR: L-R list in HPMR based on the ortholog of Human genes in NCBI Homologene
- NCBI_CELLPHONEDB: L-R list in CellPhoneDB database based on the ortholog of Human genes in NCBI Homologene
- NCBI_SINGLECELLSIGNALR: L-R list in SingleCellSignalR database based on the ortholog of Human genes in NCBI Homologene
- RBBH_DLRP: L-R list in DLRP based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- RBBH_IUPHAR: L-R list in IUPHAR based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- RBBH_HPMR: L-R list in HPMR based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- RBBH_CELLPHONEDB: L-R list in CellPhoneDB database based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- RBBH_SINGLECELLSIGNALR: L-R list in SingleCellSignalR database based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- SWISSPROT_HPRD: Known subcellular localization in Swiss-Prot and PPI list in HPRD
- TREMBL_HPRD: Predicted subcellular localization in TrEMBL and PPI list in HPRD
- FANTOM5: Predicted L-R list used in the FANTOM5 project
- BADERLAB: Predicted L-R list used in the Bader Lab
- ENSEMBL_SWISSPROT_HPRD: Known subcellular localization in Swiss-Prot and PPI list in HPRD based on the ortholog of Human genes in Ensembl Protein trees
- ENSEMBL_TREMBL_HPRD: Predicted subcellular localization in TrEMBL and PPI list in HPRD based on the ortholog of Human genes in Ensembl Protein trees
- ENSEMBL_FANTOM5: Predicted L-R list used in the FANTOM5 project based on the ortholog of Human genes in Ensembl Protein trees
- ENSEMBL_BADERLAB: Predicted L-R list used in the Bader Lab based on the ortholog of Human genes in Ensembl Protein trees
- NCBI_SWISSPROT_HPRD: Known subcellular localization in Swiss-Prot and PPI list in HPRD based on the ortholog of Human genes in NCBI Homologene
- NCBI_TREMBL_HPRD: Predicted subcellular localization in TrEMBL and PPI list in HPRD based on the ortholog of Human genes in NCBI Homologene
- NCBI_FANTOM5: Predicted L-R list used in the FANTOM5 project based on the ortholog of Human genes in NCBI Homologene
- NCBI_BADERLAB: Predicted L-R list used in the Bader Lab based on the ortholog of Human genes in NCBI Homologene
- RBBH_SWISSPROT_HPRD: Known subcellular localization in Swiss-Prot and PPI list in HPRD based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- RBBH_TREMBL_HPRD: Predicted subcellular localization in TrEMBL and PPI list in HPRD based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- RBBH_FANTOM5: Predicted L-R list used in the FANTOM5 project based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- RBBH_BADERLAB: Predicted L-R list used in the Bader Lab based on the ortholog of Human genes in Reciprocal BLAST Best Hit used in MeSH.XXX.eg.db workflow
- SWISSPROT_SPRING: Known subcellular localization in Swiss-Prot and PPI list in SPRING
- TREMBL_SPRING: Predicted subcellular localization in TrEMBL and PPI list in SPRING
- data/rbbh/*.txt: Download the files from mesh-workflow/output/rbbh/*.txt and place them in the data/rbbh/ directory.
- config.yaml: Check the latest version of the STRING database (e.g., v11.0 as of 2021/3/26; https://string-db.org) and update the value of VERSION_STRING if needed. Also, specify the version of LRBase to create.
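The two preparation steps above could be scripted roughly as follows. This is only a sketch: `stage_rbbh`, `show_string_version`, and the `MESH_WORKFLOW_DIR` variable are hypothetical names that are not part of this repository, and the default location of the mesh-workflow checkout is an assumption.

```shell
# Sketch of the two preparation steps. MESH_WORKFLOW_DIR is a hypothetical
# variable pointing at a local mesh-workflow checkout whose output/rbbh/
# directory has already been generated.
stage_rbbh() {
  src="${1:-${MESH_WORKFLOW_DIR:-../mesh-workflow}}/output/rbbh"
  mkdir -p data/rbbh
  # Copy every RBBH table into the directory this workflow reads from.
  cp "$src"/*.txt data/rbbh/
}

# Print the line(s) of the config file that mention VERSION_STRING,
# so the current value can be compared with https://string-db.org.
show_string_version() {
  grep -n 'VERSION_STRING' "${1:-config.yaml}"
}
```

Run `stage_rbbh /path/to/mesh-workflow` from the root of this repository, then `show_string_version` to see the value that is currently set before editing config.yaml by hand.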
The workflow is split into multiple Snakemake workflows under workflow/. Run them one after another, in the order listed below.
On a local machine:
snakemake -s workflow/download.smk -j 4 --use-singularity
snakemake -s workflow/preprocess_known_human.smk -j 4 --use-singularity
snakemake -s workflow/preprocess_known_otherspecies.smk -j 4 --use-singularity
snakemake -s workflow/preprocess_putative_human.smk -j 4 --use-singularity
snakemake -s workflow/preprocess_putative_otherspecies.smk -j 4 --use-singularity
snakemake -s workflow/preprocess_putative_allspecies.smk -j 4 --use-singularity
snakemake -s workflow/csv.smk -j 4 --use-singularity
snakemake -s workflow/sqlite.smk -j 4 --use-singularity
snakemake -s workflow/metadata.smk -j 4 --use-singularity
snakemake -s workflow/plot.smk -j 4 --use-singularity
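Because each step depends on the output of the previous ones, the local commands above can be wrapped in a loop that stops at the first failure. The following is a minimal sketch; `run_local_steps` and the `SNAKEMAKE` and `JOBS` override variables are hypothetical helpers introduced here for illustration, and the step list is simply copied from the commands above.

```shell
# Sketch: run the local workflow steps in order, stopping at the first failure.
# SNAKEMAKE and JOBS are hypothetical overrides, not part of the workflow itself.
SNAKEMAKE="${SNAKEMAKE:-snakemake}"
JOBS="${JOBS:-4}"

# Step names taken from the local-machine commands, in the same order.
STEPS="download preprocess_known_human preprocess_known_otherspecies preprocess_putative_human preprocess_putative_otherspecies preprocess_putative_allspecies csv sqlite metadata plot"

run_local_steps() {
  for step in $STEPS; do
    echo "== ${step} ==" >&2
    # Abort the whole sequence as soon as one workflow fails.
    "$SNAKEMAKE" -s "workflow/${step}.smk" -j "$JOBS" --use-singularity || return 1
  done
}
```

Calling `run_local_steps` from the repository root runs all steps; a non-zero return value means the step named in the last `== ... ==` message failed.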
In a parallel environment (GridEngine):
snakemake -s workflow/download.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_known_human.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_known_otherspecies.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_known_allspecies.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_putative_human.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_putative_otherspecies.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/csv.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/sqlite.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/metadata.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
snakemake -s workflow/plot.smk -j 32 --cluster "qsub -l nc=4 -p -50 -r yes -q node.q" --latency-wait 600 --use-singularity
In a parallel environment (Slurm):
snakemake -s workflow/download.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_known_human.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_known_otherspecies.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_putative_human.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_putative_otherspecies.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/preprocess_putative_allspecies.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/csv.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/sqlite.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/metadata.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
snakemake -s workflow/plot.smk -j 32 --cluster "sbatch -n 4 --nice=50 --requeue -p node03-06" --latency-wait 600 --use-singularity
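The GridEngine and Slurm commands differ only in the string passed to `--cluster`, so the submit command can be treated as a parameter. The sketch below assumes a hypothetical helper `run_cluster_steps` (not part of this repository) whose first argument is the submit command and whose remaining arguments are step names; the other options are copied from the commands above.

```shell
# Sketch: submit the given workflow steps through a cluster scheduler.
# run_cluster_steps and the SNAKEMAKE override are hypothetical helpers.
SNAKEMAKE="${SNAKEMAKE:-snakemake}"

run_cluster_steps() {
  submit="$1"   # e.g. the qsub or sbatch string used in the commands above
  shift
  for step in "$@"; do
    # Stop at the first step that fails so later steps never see partial input.
    "$SNAKEMAKE" -s "workflow/${step}.smk" -j 32 \
      --cluster "$submit" --latency-wait 600 --use-singularity || return 1
  done
}
```

For example, `run_cluster_steps "sbatch -n 4 --nice=50 --requeue -p node03-06" download csv` would submit the download and csv steps to Slurm, and passing the `qsub ...` string instead would target GridEngine.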
Copyright (c) 2021 Koki Tsuyuzaki and RIKEN Bioinformatics Research Unit.
Released under the Artistic License 2.0.
- Koki Tsuyuzaki
- Manabu Ishii
- Itoshi Nikaido