feat: enhance analysis and visualization capabilities for scRNA-seq data
- Update service description for comprehensive single-cell RNA sequencing analysis.
- Add new input parameters: max_features, resolution, species, min_pct, logfc_threshold, gsea_min_size, gsea_max_size, and category.
- Implement GF-ICF for single-cell GSEA and integrate pathway analysis.
- Enhance output files to include detailed cluster information and pathway scores.
- Introduce docker-compose setup for RStudio environment.
- Optimize Dockerfile with streamlined package installations using `pak`.
1 parent 110f47c, commit c9df03d
Showing 4 changed files with 266 additions and 210 deletions.
@@ -3,7 +3,8 @@ key: simcore/services/comp/osparc-differential-expression
 type: computational
 integration-version: 1.0.0
 version: 0.1.0
-description: Easily generate differential expression results from OSparc data
+description: |
+  Easily generate differential expression results from OSparc data. This service performs comprehensive analysis of single-cell RNA sequencing data, including data normalization, clustering, dimensionality reduction, and pathway activity scoring. It provides visualizations such as t-SNE and UMAP plots for both gene expression and pathway activity, along with detailed cluster information and pathway scores. The service is designed to work with standard input formats and offers flexibility in analysis parameters.
 contact: [email protected]
 thumbnail: https://github.com/ITISFoundation/osparc-assets/blob/cb43207b6be2f4311c93cd963538d5718b41a023/assets/default-thumbnail-cookiecutter-osparc-service.png?raw=true
 authors:
@@ -16,11 +17,12 @@ inputs:
     label: Input Folder
     description: Folder containing scRNA-seq data (matrix.mtx, features.tsv, barcodes.tsv)
     type: data:*/*
-  output_prefix:
+  name:
     displayOrder: 2
-    label: Output Prefix
-    description: Prefix for output files (optional)
+    label: Project Name
+    description: The name of the dataset being analyzed
     type: string
+    defaultValue: sPARcRNA
   min_cells:
     displayOrder: 3
     label: Minimum Cells
@@ -33,11 +35,73 @@ inputs:
     description: Minimum number of features (genes) per cell
     type: integer
     defaultValue: 200
+  max_features:
+    displayOrder: 5
+    label: Maximum Features
+    description: Maximum number of features (genes) per cell
+    type: integer
+    defaultValue: 2500
+  resolution:
+    displayOrder: 6
+    label: Resolution
+    description: Resolution parameter for clustering
+    type: number
+    defaultValue: 0.8
+  species:
+    displayOrder: 7
+    label: Species
+    description: Species for GSEA (e.g., "Homo sapiens" or "Mus musculus")
+    type: string
+    defaultValue: Homo sapiens
+  min_pct:
+    displayOrder: 8
+    label: Minimum Percentage
+    description: Minimum percentage for FindAllMarkers
+    type: number
+    defaultValue: 0.25
+  logfc_threshold:
+    displayOrder: 9
+    label: Log Fold-Change Threshold
+    description: Log fold-change threshold for FindAllMarkers
+    type: number
+    defaultValue: 0.25
+  gsea_min_size:
+    displayOrder: 10
+    label: GSEA Minimum Size
+    description: Minimum gene set size for GSEA
+    type: integer
+    defaultValue: 15
+  gsea_max_size:
+    displayOrder: 11
+    label: GSEA Maximum Size
+    description: Maximum gene set size for GSEA
+    type: integer
+    defaultValue: 500
+  category:
+    displayOrder: 12
+    label: MSigDB Category
+    description: MSigDB category for GSEA (e.g., "H" for hallmark gene sets)
+    type: string
+    defaultValue: H
 outputs:
   output_file:
     displayOrder: 1
     label: Processed Data
-    description: Zipped file containing processed scRNA-seq data (AnnData objects) and metadata
+    description: |
+      Zipped file containing processed scRNA-seq data and analysis results. The zip file includes:
+      - seurat_object.rds: Serialized R object containing the complete Seurat analysis
+      - tsne_plot.png: t-SNE plot of cell clusters based on gene expression
+      - dim_reduction_data.csv: CSV file with UMAP and t-SNE coordinates for both gene expression and pathway activity, along with cluster assignments
+      - pathway_scores.csv: CSV file with pathway activity scores for each cell
+      - cluster_info_genes.csv: CSV file with cluster information based on gene expression, including centroids and top pathways
+      - cluster_info_pathways.csv: CSV file with cluster information based on pathway activity, including centroids and top pathways
+      - outputs.json: JSON file containing summary statistics and file paths, including:
+        - initial_cell_count: Number of cells in the input data
+        - final_cell_count: Number of cells after filtering
+        - gene_count: Number of genes in the analysis
+        - project_name: Name of the analyzed dataset
+        - cluster_count_genes: Number of clusters based on gene expression
+        - cluster_count_pathways: Number of clusters based on pathway activity
     type: data:application/zip
     fileToKeyMap:
       final_output.zip: output_file
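
Review note: the new `output_file` description lists the summary keys expected in `outputs.json`. A downstream consumer might check them along the lines of the sketch below. This is only an illustration in Python, not the service's own code: it assumes `outputs.json` sits at the root of `final_output.zip`, and the helper names `read_summary` and `build_demo_zip` as well as all demo values are hypothetical.

```python
import io
import json
import zipfile

# Keys named in the output_file description of this commit. Whether the real
# outputs.json contains exactly these keys is an assumption.
EXPECTED_KEYS = {
    "initial_cell_count",
    "final_cell_count",
    "gene_count",
    "project_name",
    "cluster_count_genes",
    "cluster_count_pathways",
}


def read_summary(zip_source):
    """Return (summary, missing_keys) for outputs.json inside the archive.

    Assumes outputs.json is stored at the archive root (hypothetical layout).
    """
    with zipfile.ZipFile(zip_source) as archive:
        raw = archive.read("outputs.json")
    summary = json.loads(raw)
    missing = sorted(EXPECTED_KEYS - set(summary))
    return summary, missing


def build_demo_zip(summary):
    """Build an in-memory stand-in for final_output.zip from a summary dict."""
    buffer = io.BytesIO()
    with zipfile.ZipFile(buffer, "w") as archive:
        archive.writestr("outputs.json", json.dumps(summary))
    buffer.seek(0)
    return buffer


if __name__ == "__main__":
    # Made-up numbers, for demonstration only.
    demo = {
        "initial_cell_count": 3000,
        "final_cell_count": 2700,
        "gene_count": 13000,
        "project_name": "sPARcRNA",
        "cluster_count_genes": 9,
        "cluster_count_pathways": 6,
    }
    summary, missing = read_summary(build_demo_zip(demo))
    print("missing keys:", missing)
    print("cells kept:", summary["final_cell_count"], "of", summary["initial_cell_count"])
```

In practice `read_summary` would be given the path to the downloaded `final_output.zip` instead of the in-memory demo archive.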
@@ -0,0 +1,18 @@
+services:
+  rstudio:
+    image: rocker/rstudio:4.2.0
+    container_name: rstudio_instance
+    environment:
+      - PASSWORD=yourpassword
+      - USER=scu
+      - INPUT_FOLDER=/input
+      - OUTPUT_FOLDER=/output
+    volumes:
+      - ./src/pipeline:/home/scu/pipeline
+      - ./src/astro:/home/scu/astro
+      - ./input:/input
+      - ./output:/output
+    ports:
+      - "8787:8787"
+    user: "root"
+    command: ["--server-user", "scu"]
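
Review note: the compose file passes `INPUT_FOLDER` and `OUTPUT_FOLDER` into the container, and the input description elsewhere in this commit names three expected files (matrix.mtx, features.tsv, barcodes.tsv). A pre-flight check on those locations could look roughly like the sketch below. The pipeline itself is not shown in this diff, so this Python version is only an illustration; the helper names `resolve_folders` and `missing_inputs` are hypothetical, and the fallback paths simply mirror the values set in the compose file.

```python
import os
import tempfile
from pathlib import Path

# File names taken from the Input Folder description in the service metadata.
REQUIRED_FILES = ("matrix.mtx", "features.tsv", "barcodes.tsv")


def resolve_folders(env=None):
    """Read the input/output folders from the environment.

    Falls back to /input and /output, the values set in the compose file.
    """
    env = os.environ if env is None else env
    input_folder = Path(env.get("INPUT_FOLDER", "/input"))
    output_folder = Path(env.get("OUTPUT_FOLDER", "/output"))
    return input_folder, output_folder


def missing_inputs(input_folder):
    """Return the required file names that are absent from input_folder."""
    folder = Path(input_folder)
    return [name for name in REQUIRED_FILES if not (folder / name).is_file()]


if __name__ == "__main__":
    # Demonstrate against a temporary directory instead of a real mount.
    with tempfile.TemporaryDirectory() as tmp:
        (Path(tmp) / "matrix.mtx").write_text("")
        input_folder, _ = resolve_folders({"INPUT_FOLDER": tmp})
        print("missing:", missing_inputs(input_folder))
```

Running a check like this before the analysis starts would surface an incomplete `./input` mount early rather than partway through the run.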