Skip to content

Commit

Permalink
Removed the deprecated combineCommonResults function.
Browse files Browse the repository at this point in the history
  • Loading branch information
LTLA committed Dec 20, 2024
1 parent 2224201 commit c2922b3
Show file tree
Hide file tree
Showing 14 changed files with 14 additions and 194 deletions.
2 changes: 1 addition & 1 deletion R/classifySingleR.R
Original file line number Diff line number Diff line change
Expand Up @@ -87,7 +87,7 @@
#'
#' \code{\link{pruneScores}}, to remove low-quality labels based on the scores.
#'
#' \code{\link{combineCommonResults}}, to combine results from multiple references.
#' \code{\link{combineRecomputedResults}}, to combine results from multiple references.
#'
#' @export
#' @importFrom BiocParallel bpnworkers
Expand Down
97 changes: 1 addition & 96 deletions R/combineCommonResults.R
Original file line number Diff line number Diff line change
@@ -1,99 +1,4 @@
#' Combine SingleR results with common genes
#'
#' Combine results from multiple runs of \code{\link{classifySingleR}} (usually against different references) into a single \linkS4class{DataFrame}.
#' This assumes that each run of \code{\link{classifySingleR}} was performed using a common set of marker genes.
#'
#' @param results A list of \linkS4class{DataFrame} prediction results as returned by \code{\link{classifySingleR}} when run on each reference separately.
#'
#' @return A \linkS4class{DataFrame} is returned containing the annotation statistics for each cell or cluster (row).
#' This mimics the output of \code{\link{classifySingleR}} and contains the following fields:
#' \itemize{
#' \item \code{scores}, a numeric matrix of correlations formed by combining the equivalent matrices from \code{results}.
#' \item \code{labels}, a character vector containing the per-cell combined label across references.
#' \item \code{references}, an integer vector specifying the reference from which the combined label was derived.
#' \item \code{orig.results}, a DataFrame containing \code{results}.
#' }
#' It may also contain \code{pruned.labels} if these were also present in \code{results}.
#'
#' The \code{\link{metadata}} contains \code{common.genes},
#' a character vector of the common genes that were used across all references in \code{results};
#' and \code{label.origin}, a DataFrame specifying the reference of origin for each label in \code{scores}.
#'
#' @details
#' Here, the strategy is to performed classification separately within each reference,
#' then collating the results to choose the label with the highest score across references.
#' For each cell, we identify the reference with the highest score across all of its labels.
#' The \dQuote{combined label} is then defined as the label assigned to that cell in the highest-scoring reference.
#' (The same logic is also applied to the first and pruned labels, if those are available.)
#'
#' Each result should be generated from training sets that use a common set of genes during classification,
#' i.e., \code{common.genes} should be the same in the \code{trained} argument to each \code{\link{classifySingleR}} call.
#' This is because the scores are not comparable across results if they were generated from different sets of genes.
#' It is also for this reason that we use the highest score prior to fine-tuning,
#' even if it does not correspond to the score of the fine-tuned label.
#'
#' It is highly unlikely that this function will be called directly by the end-user.
#' Users are advised to use the multi-reference mode of \code{\link{SingleR}} and related functions,
#' which will take care of the use of a common set of genes before calling this function to combine results across references.
#'
#' @author
#' Jared Andrews,
#' Aaron Lun
#'
#' @examples
#' # Making up data (using one reference to seed another).
#' ref <- .mockRefData(nreps=8)
#' ref1 <- ref[,1:2%%2==0]
#' ref2 <- ref[,1:2%%2==1]
#' ref2$label <- tolower(ref2$label)
#'
#' test <- .mockTestData(ref1)
#'
#' # Applying classification with SingleR's multi-reference mode.
#' ref1 <- scuttle::logNormCounts(ref1)
#' ref2 <- scuttle::logNormCounts(ref2)
#' test <- scuttle::logNormCounts(test)
#'
#' pred <- SingleR(test, list(ref1, ref2), labels=list(ref1$label, ref2$label))
#' pred[,1:5] # Only viewing the first 5 columns for visibility.
#'
#' @seealso
#' \code{\link{SingleR}} and \code{\link{classifySingleR}}, for generating predictions to use in \code{results}.
#'
#' \code{\link{combineRecomputedResults}}, for another approach to combining predictions.
#'
#' @export
#' @importFrom S4Vectors DataFrame metadata metadata<-
combineCommonResults <- function(results) {
.Deprecated(new = "combineRecomputedResults")

if (length(unique(lapply(results, rownames))) != 1) {
stop("cell/cluster names in 'results' are not identical")
}
if (length(unique(vapply(results, nrow, 0L)))!=1) {
stop("numbers of cells/clusters in 'results' are not identical")
}

all.common <- lapply(results, function (x) sort(metadata(x)$common.genes))
if (length(unique(all.common)) != 1) {
# This should be changed to 'stop' before release/after merge with PR #60.
warning("common genes are not identical")
}

ncells <- nrow(results[[1]])
collected.scores <- collected.best <- vector("list", length(results))
for (i in seq_along(results)) {
scores <- results[[i]]$scores
collected.best[[i]] <- scores[cbind(seq_len(ncells), max.col(scores))]
collected.scores[[i]] <- scores
}

all.scores <- do.call(cbind, collected.scores)
output <- DataFrame(scores = I(all.scores), row.names=rownames(results[[1]]))

metadata(output)$common.genes <- all.common[[1]]
metadata(output)$label.origin <- .create_label_origin(collected.scores)

chosen <- max.col(do.call(cbind, collected.best))
cbind(output, .combine_result_frames(chosen, results))
.Defunct(new = "combineRecomputedResults")
}
5 changes: 1 addition & 4 deletions R/combineRecomputedResults.R
Original file line number Diff line number Diff line change
@@ -1,8 +1,7 @@
#' Combine SingleR results with recomputation
#'
#' Combine results from multiple runs of \code{\link{classifySingleR}} (usually against different references) into a single \linkS4class{DataFrame}.
#' The label from the results with the highest score for each cell is retained.
#' Unlike \code{\link{combineCommonResults}}, this does not assume that each run of \code{\link{classifySingleR}} was performed using the same set of common genes, instead recomputing the scores for comparison across references.
#' This involves recomputing the scores so that they are comparable across references.
#'
#' @param results A list of \linkS4class{DataFrame} prediction results as returned by \code{\link{classifySingleR}} when run on each reference separately.
#' @inheritParams SingleR
Expand Down Expand Up @@ -67,8 +66,6 @@
#' @seealso
#' \code{\link{SingleR}} and \code{\link{classifySingleR}}, for generating predictions to use in \code{results}.
#'
#' \code{\link{combineCommonResults}}, for another approach to combining predictions.
#'
#' @references
#' Lun A, Bunis D, Andrews J (2020).
#' Thoughts on a more scalable algorithm for multiple references.
Expand Down
3 changes: 1 addition & 2 deletions R/plotMarkerHeatmap.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,7 @@
#'
#' Create a heatmap of the log-normalized expression for the most interesting markers of a particular label.
#'
#' @param results A \linkS4class{DataFrame} containing the output from \code{\link{SingleR}},
#' \code{\link{classifySingleR}}, \code{\link{combineCommonResults}}, or \code{\link{combineRecomputedResults}}.
#' @param results A \linkS4class{DataFrame} containing the output from \code{\link{SingleR}}, \code{\link{classifySingleR}}, or \code{\link{combineRecomputedResults}}.
#' @param test A numeric matrix of log-normalized expression values where rows are genes and columns are cells.
#' Each row should be named with the same gene name that was used to compute \code{results}.
#'
Expand Down
3 changes: 1 addition & 2 deletions R/plotScoreDistribution.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,7 @@
#'
#' Plot the distribution of assignment scores across all cells assigned to each reference label.
#'
#' @param results A \linkS4class{DataFrame} containing the output from \code{\link{SingleR}},
#' \code{\link{classifySingleR}}, \code{\link{combineCommonResults}}, or \code{\link{combineRecomputedResults}}.
#' @param results A \linkS4class{DataFrame} containing the output from \code{\link{SingleR}}, \code{\link{classifySingleR}}, or \code{\link{combineRecomputedResults}}.
#' @param show Deprecated, use \code{\link{plotDeltaDistribution}} instead for \code{show!="scores"}.
#' @param labels.use Character vector specifying the labels to show in the plot facets.
#' Defaults to all labels in \code{results}.
Expand Down
3 changes: 1 addition & 2 deletions R/plotScoreHeatmap.R
Original file line number Diff line number Diff line change
Expand Up @@ -2,8 +2,7 @@
#'
#' Create a heatmap of the \code{\link{SingleR}} assignment scores across all cell-label combinations.
#'
#' @param results A \linkS4class{DataFrame} containing the output from \code{\link{SingleR}},
#' \code{\link{classifySingleR}}, \code{\link{combineCommonResults}}, or \code{\link{combineRecomputedResults}}.
#' @param results A \linkS4class{DataFrame} containing the output from \code{\link{SingleR}}, \code{\link{classifySingleR}}, or \code{\link{combineRecomputedResults}}.
#' @param cells.use Integer or string vector specifying the single cells (i.e., rows of \code{results}) to show.
#' If \code{NULL}, all cells are shown.
#' @param labels.use Character vector specifying the labels to show in the heatmap rows.
Expand Down
2 changes: 2 additions & 0 deletions inst/NEWS.Rd
Original file line number Diff line number Diff line change
Expand Up @@ -23,6 +23,8 @@ This is simpler and more efficient than the previous "expanded with NA" format.

\item Separate the missingness check arguments in \code{SingleR()} with the new \code{check.missing.test=} and \code{check.missing.ref=} options.
The former is disabled by default, to avoid an unnecessary missingness check in the vast majority of test cases.

\item Removed the deprecated \code{combineCommonResults()} function.
}}

\section{Version 2.8.0}{\itemize{
Expand Down
2 changes: 1 addition & 1 deletion man/classifySingleR.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

74 changes: 0 additions & 74 deletions man/combineCommonResults.Rd

This file was deleted.

5 changes: 1 addition & 4 deletions man/combineRecomputedResults.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 1 addition & 2 deletions man/plotDeltaDistribution.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 1 addition & 2 deletions man/plotMarkerHeatmap.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 1 addition & 2 deletions man/plotScoreDistribution.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

3 changes: 1 addition & 2 deletions man/plotScoreHeatmap.Rd

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

0 comments on commit c2922b3

Please sign in to comment.