Support configurations in YAML format (#27)

* add support for YAML files
* ignore files for testing during development
* change yaml file structure; check for duplicate identifiers; add unit tests
* fix unit tests that depend on console width
* document data.frame and list formats supported for resource metadata
iSEE · Jan 12, 2023 · commit 6ee6b8b
1 parent 07837e3

Showing 11 changed files with 465 additions and 51 deletions.
diff --git a/R/iSEEindex.R b/R/iSEEindex.R
@@ -1,39 +1,117 @@
 #' iSEEindex App
-#' 
+#'
 #' @description
 #' Generate an \pkg{iSEE} app that includes a landing page enabling
 #' users to choose from a custom set of data sets and initial configuration
 #' states prepared by the app maintainer.
-#' 
+#'
 #' @section Data Sets:
-#' The function passed to the argument `FUN.datasets` must return a `data.frame` that contains the following columns:
-#' 
+#' The function passed to the argument `FUN.datasets` must return either a `data.frame` or a `list` that contains metadata about the available data sets.
+#'
+#' Required metadata are:
+#'
 #' \describe{
 #' \item{id}{A unique identifier for the data set.}
 #' \item{label}{A short human-readable title for the data set, displayed in the 'Info' panel when the data set is selected.}
 #' \item{uri}{A Uniform Resource Identifier (URI) that indicates the location of the data file that contains the data set.}
 #' \item{description}{A more detailed description of the data set, displayed in the 'Info' panel when the data set is selected.}
 #' }
-#' 
-#' The `id` is used to identify the data set file in the \pkg{BiocFileCache}.
-#' Thus, we recommend using a dedicated `BiocFileCache()` for the app, using the `BiocFileCache(cache)` to specify an on-disk location (directory path) for the dedicated cache.
-#' 
+#'
+#' **Important:** The `id` value is used to identify the data set file in the \pkg{BiocFileCache}.
+#' Thus, we recommend using a dedicated `BiocFileCache()` for the app, and setting its `cache` argument to an on-disk location (directory path) reserved for that cache.
+#'
+#' Example `data.frame`:
+#'
+#' ```
+#' data.frame(
+#'   id = c("ID1", "ID2"),
+#'   label = c("Dataset 01", "Dataset 02"),
+#'   uri = c("https://example.com/1.rds", "https://example.com/2.rds"),
+#'   description = c("My first data set.", "My second data set.")
+#' )
+#' ```
+#'
+#' The `data.frame` may also contain optional columns of metadata specific to individual [`iSEEindexResource-class`] classes (refer to the help page of those classes for details).
+#' The value in optional columns can be left empty (`""`) for resource classes that do not require that information.
+#'
+#' Example `list`:
+#'
+#' ```
+#' list(
+#'   list(
+#'      id = "ID1",
+#'      label = "Dataset 01",
+#'      uri = "https://example.com/1.rds",
+#'      description = "My first data set."
+#'   ),
+#'   list(
+#'      id = "ID2",
+#'      label = "Dataset 02",
+#'      uri = "https://example.com/2.rds",
+#'      description = "My second data set."
+#'   )
+#' )
+#' ```
+#'
+#' The individual sub-lists may also contain optional named metadata specific to individual [`iSEEindexResource-class`] classes (refer to the help page of those classes for details).
+#'
 #' @section Initial Configurations:
-#' The function passed to the argument `FUN.initial` must return a `data.frame` that contains the following columns:
-#' 
+#' The function passed to the argument `FUN.initial` must return either a `data.frame` or a `list` that contains metadata about the available initial configurations, or `NULL` in the absence of any custom initial configuration (in which case default settings are applied to all data sets).
+#'
+#' Required metadata are:
+#'
 #' \describe{
 #' \item{dataset_id}{The unique identifier of a data set.}
 #' \item{config_id}{A unique identifier for the initial configuration.}
 #' \item{label}{A short human-readable title for the initial configuration, representing the initial configuration in the 'Initial settings' dropdown menu.}
 #' \item{uri}{A Uniform Resource Identifier (URI) that indicates the location of the R script that contains the initial configuration.}
 #' \item{description}{A more detailed description of the initial configuration, displayed in the 'Configure and launch' panel when the initial configuration is selected.}
 #' }
-#' 
+#'
 #' The `dataset_id` must match one of the `id` values in the data set metadata.
 #' See section 'Data Sets'.
-#' 
+#'
 #' The same `config_id` may be re-used in combination with different `dataset_id`.
-#' The `dataset_id` and `initial_id` are combined to identify the initial configuration script and the associated data set in the \pkg{BiocFileCache}.
+#'
+#' **Important:** The `dataset_id` and `config_id` are combined to identify the initial configuration script and the associated data set in the \pkg{BiocFileCache}.
+#'
+#' Example `data.frame`:
+#'
+#' ```
+#' data.frame(
+#'   dataset_id = c("ID1", "ID1"),
+#'   config_id = c("config01", "config02"),
+#'   label = c("Configuration 01", "Configuration 02"),
+#'   uri = c("https://example.com/1.R", "https://example.com/2.R"),
+#'   description = c("My first configuration.", "My second configuration.")
+#' )
+#' ```
+#'
+#' The `data.frame` may also contain optional columns of metadata specific to individual [`iSEEindexResource-class`] classes (refer to the help page of those classes for details).
+#' The value in optional columns can be left empty (`""`) for resource classes that do not require that information.
+#'
+#' Example `list`:
+#'
+#' ```
+#' list(
+#'   list(
+#'      dataset_id = "ID1",
+#'      config_id = "config01",
+#'      label = "Configuration 01",
+#'      uri = "https://example.com/1.R",
+#'      description = "My first configuration."
+#'   ),
+#'   list(
+#'      dataset_id = "ID1",
+#'      config_id = "config02",
+#'      label = "Configuration 02",
+#'      uri = "https://example.com/2.R",
+#'      description = "My second configuration."
+#'   )
+#' )
+#' ```
+#'
+#' The individual sub-lists may also contain optional named metadata specific to individual [`iSEEindexResource-class`] classes (refer to the help page of those classes for details).
 #'
 #' @param bfc A [BiocFileCache()] object.
 #' @param FUN.datasets A function that returns a `data.frame` of metadata for
@@ -42,7 +120,7 @@
 #' available initial configuration states.
 #'
 #' @return An [iSEE()] app with a custom landing page using a [BiocFileCache()] to cache a selection of data sets.
-#' 
+#'
 #' @author Kevin Rue-Albrecht
 #'
 #' @export
@@ -53,20 +131,43 @@
 #' @examples
 #' library(BiocFileCache)
 #' bfc <- BiocFileCache(cache = tempdir())
-#' 
+#'
+#' # Using YAML ----
+#'
+#' dataset_fun <- function() {
+#'     x <- yaml::read_yaml(system.file(package = "iSEEindex", "example.yaml"))
+#'     x$datasets
+#' }
+#'
+#' initial_fun <- function() {
+#'     x <- yaml::read_yaml(system.file(package = "iSEEindex", "example.yaml"))
+#'     x$initial
+#' }
+#'
+#' app <- iSEEindex(bfc, dataset_fun, initial_fun)
+#'
+#' if (interactive()) {
+#'     shiny::runApp(app, port = 1234)
+#' }
+#'
+#' # Using CSV ----
+#'
 #' dataset_fun <- function() {
-#'     read.csv(system.file(package="iSEEindex", "datasets.csv"))
+#'     x <- read.csv(system.file(package = "iSEEindex", "datasets.csv"))
+#'     x
 #' }
-#' 
+#'
 #' initial_fun <- function() {
-#'     read.csv(system.file(package = "iSEEindex", "initial.csv"))
+#'     x <- read.csv(system.file(package = "iSEEindex", "initial.csv"))
+#'     x
 #' }
-#' 
+#'
 #' app <- iSEEindex(bfc, dataset_fun, initial_fun)
-#' 
+#'
 #' if (interactive()) {
-#'   shiny::runApp(app, port = 1234)
+#'     shiny::runApp(app, port = 1234)
 #' }
+#'
 iSEEindex <- function(bfc, FUN.datasets, FUN.initial = NULL) {
     stopifnot(is(bfc, "BiocFileCache"))
     if (is.null(FUN.initial)) {
@@ -81,7 +182,7 @@ iSEEindex <- function(bfc, FUN.datasets, FUN.initial = NULL) {
 }
 
 #' Prepare and Launch the Main App.
-#' 
+#'
 #' Invokes a function that replaces the landing page by the \pkg{iSEE}
 #' interactive dashboard.
 #'
@@ -103,7 +204,7 @@ iSEEindex <- function(bfc, FUN.datasets, FUN.initial = NULL) {
 #' landing page.
 #'
 #' @return A `NULL` value is invisibly returned.
-#' 
+#'
 #' @author Kevin Rue-Albrecht
 #'
 #' @importFrom utils capture.output

diff --git a/R/landing_page.R b/R/landing_page.R
@@ -17,11 +17,19 @@
 #'
 #' @rdname INTERNAL_landing_page
 .landing_page <- function(bfc, FUN.datasets, FUN.initial) {
+    # datasets
     datasets_available_table <- FUN.datasets()
+    if (is(datasets_available_table, "list")) {
+        datasets_available_table <- .list_to_dataframe(datasets_available_table)
+    }
     .check_datasets_table(datasets_available_table)
+    # initial configurations
     initial_available_table <- FUN.initial()
+    if (is(initial_available_table, "list")) {
+        initial_available_table <- .list_to_dataframe(initial_available_table)
+    }
     .check_initial_table(initial_available_table)
-
+    # landing page function (return value)
     function (FUN, input, output, session) {
         # nocov start
         output$allPanels <- renderUI({

diff --git a/R/utils-datasets.R b/R/utils-datasets.R
@@ -136,6 +136,13 @@
         txt <- "Data set metadata must have at least one row."
         .stop(txt)
     }
+    # Check that data set identifiers are unique
+    which_dup <- duplicated(x[[.datasets_id]])
+    if (any(which_dup)) {
+        first_dup <- which(which_dup)[1]
+        txt <- sprintf("duplicate id: %s", x[[.datasets_id]][first_dup])
+        .stop(txt)
+    }
     # https://github.com/iSEE/iSEEindex/issues/23
     if (.dataset_region %in% colnames(x)) {
         txt <- paste(

diff --git a/R/utils-initial.R b/R/utils-initial.R
@@ -110,6 +110,13 @@
             .stop(txt)
         }
     }
+    # Check that config identifiers are unique
+    which_dup <- duplicated(x[[.initial_config_id]])
+    if (any(which_dup)) {
+        first_dup <- which(which_dup)[1]
+        txt <- sprintf("duplicate config_id: %s", x[[.initial_config_id]][first_dup])
+        .stop(txt)
+    }
     # https://github.com/iSEE/iSEEindex/issues/23
     if (.dataset_region %in% colnames(x)) {
         txt <- paste(
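
Review note: the documentation in `R/iSEEindex.R` states that the same `config_id` may be re-used in combination with different `dataset_id`, while the check added above rejects any repeated `config_id`. If pair-wise uniqueness is what is intended, the check could look roughly like the sketch below. This is only an illustration: the function name `first_duplicated_pair` is hypothetical, and the column names are taken from the roxygen documentation rather than from the package's internal constants.

```r
# Sketch only, not part of the commit: report the first repeated
# (dataset_id, config_id) pair instead of the first repeated config_id.
first_duplicated_pair <- function(x) {
    pairs <- x[, c("dataset_id", "config_id"), drop = FALSE]
    # duplicated() on a data.frame compares whole rows
    which_dup <- duplicated(pairs)
    if (!any(which_dup)) {
        return(NULL)
    }
    first_dup <- which(which_dup)[1]
    sprintf(
        "duplicate dataset_id/config_id: %s/%s",
        pairs$dataset_id[first_dup], pairs$config_id[first_dup]
    )
}

initial <- data.frame(
    dataset_id = c("ID1", "ID2", "ID1"),
    config_id = c("config01", "config01", "config01"),
    stringsAsFactors = FALSE
)

# Rows 1 and 2 share a config_id but belong to different data sets,
# so only row 3 (a repeat of row 1) is reported.
first_duplicated_pair(initial)
```

With this variant, the first two rows are accepted; the committed check would stop at row 2.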

diff --git a/R/utils-yaml.R b/R/utils-yaml.R
@@ -0,0 +1,28 @@
+#' Convert Resource Metadata from List to data.frame
+#'
+#' @param x A `list` of named lists, one per resource (e.g., as imported from a YAML file).
+#'
+#' @return A `data.frame` with one row per resource.
+#'
+#' @rdname INTERNAL_list_to_dataframe
+#'
+#' @examples
+#' library(yaml)
+#'
+#' yaml_file <- system.file(package = "iSEEindex", "example.yaml")
+#' yaml_data <- read_yaml(yaml_file)
+#'
+#' # Data sets ----
+#'
+#' iSEEindex:::.list_to_dataframe(yaml_data$datasets)
+#'
+#' # Initial configurations ----
+#'
+#' iSEEindex:::.list_to_dataframe(yaml_data$initial)
+.list_to_dataframe <- function(x) {
+    # Convert list to a list of data.frames
+    list_of_df <- lapply(x, as.data.frame)
+    # rbind into a single data.frame
+    df <- do.call("rbind", list_of_df)
+    df
+}
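
Review note: `rbind()` on data frames requires every sub-list to carry the same set of names, so `.list_to_dataframe()` as written fails when only some entries include optional metadata. The sketch below shows one possible way to tolerate that, filling absent fields with `""` as the documentation suggests for optional columns. The function name `list_to_dataframe_fill` and the optional field `note` are hypothetical and used only for illustration.

```r
# Sketch only, not part of the commit: a variant of .list_to_dataframe()
# that accepts sub-lists with different optional fields.
list_to_dataframe_fill <- function(x, fill = "") {
    # Union of all field names, in order of first appearance
    all_names <- unique(unlist(lapply(x, names)))
    rows <- lapply(x, function(item) {
        # Add each field this item lacks, using the fill value
        for (nm in setdiff(all_names, names(item))) {
            item[[nm]] <- fill
        }
        # Reorder fields so that all rows share the same column order
        as.data.frame(item[all_names], stringsAsFactors = FALSE)
    })
    do.call("rbind", rows)
}

datasets <- list(
    list(
        id = "ID1", label = "Dataset 01",
        uri = "https://example.com/1.rds",
        description = "My first data set."
    ),
    list(
        id = "ID2", label = "Dataset 02",
        uri = "https://example.com/2.rds",
        description = "My second data set.",
        note = "hypothetical optional field"
    )
)

df <- list_to_dataframe_fill(datasets)
df$note
```

The first row gets an empty `note`, while the second keeps its value.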
diff --git a/inst/.gitignore b/inst/.gitignore
@@ -0,0 +1 @@
+dev-*
diff --git a/inst/example.yaml b/inst/example.yaml
@@ -0,0 +1,53 @@
+datasets:
+  - id: ID1
+    label: ReprocessedAllenData.rds
+    uri: https://zenodo.org/record/7304331/files/ReprocessedAllenData.rds
+    description: |
+        Reprocessed Allen Data.
+  - id: ID2
+    label: ReprocessedAllenData.rds
+    uri: https://zenodo.org/record/7304331/files/ReprocessedAllenData.rds
+    description: |
+        Reprocessed Allen Data (copy).
+
+initial:
+  - config_id: config01rcall
+    dataset_id: ID1
+    label: Configuration 1 (R call)
+    uri: rcall://system.file(package='iSEEindex','ReprocessedAllenData_config_01.R')
+    description: |
+      One `ReducedDimensionPlot` panel, one `ColumnDataTable` panel.
+
+      File distributed with the `iSEEindex` package.
+
+      <Source: YAML>
+  - config_id: config02rcall
+    dataset_id: ID1
+    label: Configuration 2 (R call)
+    uri: rcall://system.file(package='iSEEindex','ReprocessedAllenData_config_02.R')
+    description: |
+      One `RowDataTable` panel, one `ColumnDataTable` panel.
+
+      File distributed with the `iSEEindex` package.
+
+      <Source: YAML>
+  - config_id: config01zenodo
+    dataset_id: ID1
+    label: Configuration 1 (zenodo.org)
+    uri: https://zenodo.org/record/7304331/files/ReprocessedAllenData_config_01.R
+    description: |
+      One `ReducedDimensionPlot` panel, one `ColumnDataTable` panel.
+
+      File downloaded from <https://zenodo.org/record/7304331>.
+
+      <Source: YAML>
+  - config_id: config02zenodo
+    dataset_id: ID1
+    label: Configuration 2 (zenodo.org)
+    uri: https://zenodo.org/record/7304331/files/ReprocessedAllenData_config_02.R
+    description: |
+      One `RowDataTable` panel, one `ColumnDataTable` panel.
+
+      File downloaded from <https://zenodo.org/record/7304331>.
+
+      <Source: YAML>
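
Review note: as a rough illustration of how a file with this structure ends up as the `list` format documented in `R/iSEEindex.R`, the sketch below parses a small inline YAML string with `yaml::yaml.load()` and converts the `datasets` entry the same way `.list_to_dataframe()` does. The identifiers and `example.com` URLs are the placeholder values from the roxygen examples, not real resources.

```r
library(yaml)

# Inline YAML mirroring the layout of inst/example.yaml (placeholder values)
yaml_text <- "
datasets:
  - id: ID1
    label: Dataset 01
    uri: https://example.com/1.rds
    description: |
      My first data set.
  - id: ID2
    label: Dataset 02
    uri: https://example.com/2.rds
    description: |
      My second data set.
"

x <- yaml.load(yaml_text)

# x$datasets is an unnamed list with one named sub-list per data set
str(x$datasets, max.level = 1)

# Same conversion as .list_to_dataframe(): one data.frame per entry, then rbind
df <- do.call(
    "rbind",
    lapply(x$datasets, as.data.frame, stringsAsFactors = FALSE)
)
df$id
```

Note that the `|` block style keeps a trailing newline in each `description` value.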
diff --git a/inst/initial.csv b/inst/initial.csv
@@ -1,5 +1,5 @@
 dataset_id,config_id,label,uri,description
-ID1,config01rcall,"Configuration 1 (R call)","rcall://system.file(package='iSEEindex','ReprocessedAllenData_config_01.R')","One `ReducedDimensionPlot` panel, one `ColumnDataTable` panel.<br/><br/>File distributed with the `iSEEindex` package."
-ID1,config02rcall,"Configuration 2 (R call)","rcall://system.file(package='iSEEindex','ReprocessedAllenData_config_02.R')","One `RowDataTable` panel, one `ColumnDataTable` panel.<br/><br/>File distributed with the `iSEEindex` package."
-ID1,config01zenodo,"Configuration 1 (zenodo.org)",https://zenodo.org/record/7304331/files/ReprocessedAllenData_config_01.R?download=1,"One `ReducedDimensionPlot` panel, one `ColumnDataTable` panel.<br/><br/>File downloaded from <https://zenodo.org/record/7304331>."
-ID1,config02zenodo,"Configuration 2 (zenodo.org)",https://zenodo.org/record/7304331/files/ReprocessedAllenData_config_02.R?download=1,"One `RowDataTable` panel, one `ColumnDataTable` panel.<br/><br/>File downloaded from <https://zenodo.org/record/7304331>."
+ID1,config01rcall,"Configuration 1 (R call)","rcall://system.file(package='iSEEindex','ReprocessedAllenData_config_01.R')","One `ReducedDimensionPlot` panel, one `ColumnDataTable` panel.<br/><br/>File distributed with the `iSEEindex` package.<br/><br/><Source: CSV>"
+ID1,config02rcall,"Configuration 2 (R call)","rcall://system.file(package='iSEEindex','ReprocessedAllenData_config_02.R')","One `RowDataTable` panel, one `ColumnDataTable` panel.<br/><br/>File distributed with the `iSEEindex` package.<br/><br/><Source: CSV>"
+ID1,config01zenodo,"Configuration 1 (zenodo.org)",https://zenodo.org/record/7304331/files/ReprocessedAllenData_config_01.R?download=1,"One `ReducedDimensionPlot` panel, one `ColumnDataTable` panel.<br/><br/>File downloaded from <https://zenodo.org/record/7304331>.<br/><br/><Source: CSV>"
+ID1,config02zenodo,"Configuration 2 (zenodo.org)",https://zenodo.org/record/7304331/files/ReprocessedAllenData_config_02.R?download=1,"One `RowDataTable` panel, one `ColumnDataTable` panel.<br/><br/>File downloaded from <https://zenodo.org/record/7304331>.<br/><br/><Source: CSV>"