-
Notifications
You must be signed in to change notification settings - Fork 7
/
Copy pathMAPlot.R
346 lines (302 loc) · 15.5 KB
/
MAPlot.R
1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
97
98
99
100
101
102
103
104
105
106
107
108
109
110
111
112
113
114
115
116
117
118
119
120
121
122
123
124
125
126
127
128
129
130
131
132
133
134
135
136
137
138
139
140
141
142
143
144
145
146
147
148
149
150
151
152
153
154
155
156
157
158
159
160
161
162
163
164
165
166
167
168
169
170
171
172
173
174
175
176
177
178
179
180
181
182
183
184
185
186
187
188
189
190
191
192
193
194
195
196
197
198
199
200
201
202
203
204
205
206
207
208
209
210
211
212
213
214
215
216
217
218
219
220
221
222
223
224
225
226
227
228
229
230
231
232
233
234
235
236
237
238
239
240
241
242
243
244
245
246
247
248
249
250
251
252
253
254
255
256
257
258
259
260
261
262
263
264
265
266
267
268
269
270
271
272
273
274
275
276
277
278
279
280
281
282
283
284
285
286
287
288
289
290
291
292
293
294
295
296
297
298
299
300
301
302
303
304
305
306
307
308
309
310
311
312
313
314
315
316
317
318
319
320
321
322
323
324
325
326
327
328
329
330
331
332
333
334
335
336
337
338
339
340
341
342
343
344
345
346
#' The MAPlot class
#'
#' The MAPlot is a \linkS4class{RowDataPlot} subclass that is dedicated to creating a MA plot.
#' It retrieves the log-fold change and average abundance and creates a row-based plot where each point represents a feature.
#'
#' @section Slot overview:
#' The following slots control the thresholds used in the visualization:
#' \itemize{
#' \item \code{PValueField}, a string specifying the field of \code{\link{rowData}} containing the p-values.
#' \item \code{PValueThreshold}, a numeric scalar in (0, 1] specifying the threshold to use on the (adjusted) p-value.
#' Defaults to 0.05.
#' \item \code{LogFCThreshold}, a non-negative numeric scalar specifying the threshold to use on the log-fold change.
#' Defaults to 0.
#' \item \code{PValueCorrection}, a string specifying the multiple testing correction to apply.
#' Defaults to \code{"BH"}, but can take any value from \code{\link{p.adjust.methods}}.
#' }
#'
#' In addition, this class inherits all slots from its parent \linkS4class{RowDataPlot},
#' \linkS4class{RowDotPlot}, \linkS4class{DotPlot} and \linkS4class{Panel} classes.
#'
#' @section Constructor:
#' \code{MAPlot(...)} creates an instance of a MAPlot class,
#' where any slot and its value can be passed to \code{...} as a named argument.
#'
#' Users are expected to load relevant statistics into the \code{\link{rowData}} of a \linkS4class{SummarizedExperiment}.
#' This panel expects one or more columns containing the p-values, log-fold changes and average abundances for each gene/row - see Examples.
#' The expected column names (and how to tune them) are listed at \code{?"\link{registerPValueFields}"}.
#'
#' @section Supported methods:
#' In the following code snippets, \code{x} is an instance of a \linkS4class{RowDataPlot} class.
#' Refer to the documentation for each method for more details on the remaining arguments.
#'
#' For setting up data values:
#' \itemize{
#' \item \code{\link{.cacheCommonInfo}(x, se)} returns \code{se} after being loaded with class-specific constants.
#' This includes \code{"valid.p.fields"}, \code{"valid.ab.fields"} and \code{"valid.lfc.fields"}, which are character vectors containing the names of valid \code{\link{rowData}} columns for the p-values, average abundances and log-fold changes, respectively.
#' \item \code{\link{.refineParameters}(x, se)} returns \code{x} after setting \code{XAxis="Row data"} and the various \code{*Pattern} fields to their cached values.
#' This will also call the equivalent \linkS4class{RowDataPlot} method for further refinements to \code{x}.
#' If valid p-value, abundance and log-fold change fields are not available, \code{NULL} is returned instead.
#' }
#'
#' For defining the interface:
#' \itemize{
#' \item \code{\link{.defineDataInterface}(x, se, select_info)} returns a list of interface elements for manipulating all slots described above.
#' \item \code{\link{.panelColor}(x)} will return the specified default color for this panel class.
#' \item \code{\link{.allowableXAxisChoices}(x, se)} returns a character vector specifying the acceptable average abundance-related variables in \code{\link{rowData}(se)} that can be used as choices for the x-axis.
#' \item \code{\link{.allowableYAxisChoices}(x, se)} returns a character vector specifying the acceptable log-fold change-related variables in \code{\link{rowData}(se)} that can be used as choices for the y-axis.
#' \item \code{\link{.hideInterface}(x, field)} will return \code{TRUE} for \code{field="XAxis"},
#' otherwise it will call the \linkS4class{RowDataPlot} method.
#' \item \code{\link{.fullName}(x)} will return \code{"MA plot"}.
#' }
#'
#' For monitoring reactive expressions:
#' \itemize{
#' \item \code{\link{.createObservers}(x, se, input, session, pObjects, rObjects)} sets up observers for all new slots described above, as well as in the parent classes via the \linkS4class{RowDataPlot} method.
#' }
#'
#' For creating the plot:
#' \itemize{
#' \item \code{\link{.generateDotPlotData}(x, envir)} will create a data.frame of row metadata variables in \code{envir}.
#' This should contain average abundances on the x-axis and log-fold changes on the y-axis,
#' in addition to an extra field specifying whether or not the feature was considered to be significantly up or down.
#' The method will return the commands required to do so as well as a list of labels.
#' \item \code{\link{.prioritizeDotPlotData}(x, envir)} will create variables in \code{envir} marking the priority of points.
#' Significant features receive higher priority (i.e., are plotted over their non-significant counterparts) and are less aggressively downsampled when \code{Downsample=TRUE}.
#' The method will return the commands required to do this as well as a logical scalar indicating that rescaling of downsampling resolution is performed.
#' \item \code{\link{.colorByNoneDotPlotField}(x)} will return a string specifying the field of the data.frame (generated by \code{\link{.generateDotPlotData}}) containing the significance information.
#' This is to be used for coloring when \code{ColorBy="None"}.
#' \item \code{\link{.colorByNoneDotPlotScale}(x)} will return a string containing a \pkg{ggplot2} command to add a default color scale when \code{ColorBy="None"}.
#' \item \code{\link{.generateDotPlot}(x, labels, envir)} returns a list containing \code{plot} and \code{commands}, using the inital \linkS4class{ColumnDataPlot} \link{ggplot} and adding horizontal lines demarcating the log-fold change threshold.
#' }
#'
#' For documentation:
#' \itemize{
#' \item \code{\link{.definePanelTour}(x)} returns an data.frame containing the steps of a panel-specific tour.
#' \item \code{\link{.getDotPlotColorHelp}(x, color_choices)} returns a function that generates an \pkg{rintrojs} tour for the color choice UI.
#' }
#'
#' @docType methods
#' @aliases MAPlot MAPlot-class
#' initialize,MAPlot-method
#' .cacheCommonInfo,MAPlot-method
#' .refineParameters,MAPlot-method
#' .defineDataInterface,MAPlot-method
#' .createObservers,MAPlot-method
#' .hideInterface,MAPlot-method
#' .fullName,MAPlot-method
#' .panelColor,MAPlot-method
#' .generateDotPlotData,MAPlot-method
#' .allowableXAxisChoices,MAPlot-method
#' .allowableYAxisChoices,MAPlot-method
#' .prioritizeDotPlotData,MAPlot-method
#' .colorByNoneDotPlotField,MAPlot-method
#' .colorByNoneDotPlotScale,MAPlot-method
#' .generateDotPlot,MAPlot-method
#' .definePanelTour,MAPlot-method
#' .getDotPlotColorHelp,MAPlot-method
#'
#' @examples
#' # Making up some results:
#' se <- SummarizedExperiment(matrix(rnorm(10000), 1000, 10))
#' rownames(se) <- paste0("GENE_", seq_len(nrow(se)))
#' rowData(se)$PValue <- runif(nrow(se))
#' rowData(se)$LogFC <- rnorm(nrow(se))
#' rowData(se)$AveExpr <- rnorm(nrow(se))
#'
#' if (interactive()) {
#' iSEE(se, initial=list(MAPlot()))
#' }
#'
#' @author Aaron Lun
#'
#' @seealso
#' \link{RowDataPlot}, for the base class.
#'
#' @name MAPlot-class
NULL
#' @export
setClass("MAPlot", contains="RowDataPlot",
slots=c(PValueField="character", PValueThreshold="numeric", LogFCThreshold="numeric", PValueCorrection="character"))
#' @export
setMethod(".fullName", "MAPlot", function(x) "MA plot")
#' @export
setMethod(".panelColor", "MAPlot", function(x) "#666600")
#' @export
setMethod("initialize", "MAPlot", function(.Object, PValueField=NA_character_,
PValueThreshold=0.05, LogFCThreshold=0, PValueCorrection="BH", ...)
{
args <- list(PValueField=PValueField, PValueThreshold=PValueThreshold,
LogFCThreshold=LogFCThreshold, PValueCorrection=PValueCorrection, ...)
do.call(callNextMethod, c(list(.Object), args))
})
#' @export
#' @importFrom methods new
MAPlot <- function(...) {
new("MAPlot", ...)
}
#' @importFrom stats p.adjust.methods
setValidity2("MAPlot", function(object) {
msg <- character(0)
field <- object[["PValueField"]]
if (length(field)!=1) {
msg <- c(msg, "'PValueField' must be a single string")
}
msg <- c(msg, .define_de_validity(object, patterns=c("PValuePattern", "LogFCPattern", "AveAbPattern")))
if (length(msg)) msg else TRUE
})
#' @export
setMethod(".cacheCommonInfo", "MAPlot", function(x, se) {
if (!is.null(.getCachedCommonInfo(se, "MAPlot"))) {
return(se)
}
se <- callNextMethod()
all.cont <- .getCachedCommonInfo(se, "RowDotPlot")$continuous.rowData.names
p.okay <- .matchPValueFields(se, all.cont)
lfc.okay <- .matchLogFCFields(se, all.cont)
ab.okay <- .matchAveAbFields(se, all.cont)
.setCachedCommonInfo(se, "MAPlot",
valid.lfc.fields=lfc.okay,
valid.p.fields=p.okay,
valid.ab.fields=ab.okay)
})
#' @export
#' @importFrom methods callNextMethod
setMethod(".refineParameters", "MAPlot", function(x, se) {
x <- callNextMethod() # Do this first to trigger warnings from base classes.
if (is.null(x)) {
return(NULL)
}
p.fields <- .getCachedCommonInfo(se, "MAPlot")$valid.p.fields
if (length(p.fields)==0L) {
warning("no valid p-value fields for '", class(x)[1], "'")
return(NULL)
}
x <- .replaceMissingWithFirst(x, "PValueField", p.fields)
x[["XAxis"]] <- "Row data"
x
})
#' @export
setMethod(".allowableXAxisChoices", "MAPlot", function(x, se) .getCachedCommonInfo(se, "MAPlot")$valid.ab.fields)
#' @export
setMethod(".allowableYAxisChoices", "MAPlot", function(x, se) .getCachedCommonInfo(se, "MAPlot")$valid.lfc.fields)
#' @export
#' @importFrom shiny numericInput selectInput hr
#' @importFrom stats p.adjust.methods
setMethod(".defineDataInterface", "MAPlot", function(x, se, select_info) {
plot_name <- .getEncodedName(x)
input_FUN <- function(field) paste0(plot_name, "_", field)
.addSpecificTour(class(x), "YAxis", function(plot_name) {
data.frame(
rbind(
c(
element=paste0("#", plot_name, "_", "YAxis + .selectize-control"),
intro="Here, we select the <code>rowData</code> field containing the log-fold changes for all features.
This is presumably generated from some comparison between conditions, e.g., for differential gene expression."
),
c(
element=paste0("#", plot_name, "_", "XAxisRowData + .selectize-control"),
intro="Similarly, we can select the <code>rowData</code> field containing the average abundances for all features.
This should have been generated from the same analysis that was used to obtain the log-fold changes."
)
)
)
})
.addSpecificTour(class(x), "PValueField", function(plot_name) {
data.frame(
element=paste0("#", plot_name, "_", "PValueField + .selectize-control"),
intro="Here, we select the <code>rowData</code> field containing the p-values.
This will be used to identify significant features after adjusting for multiple testing and applying log-fold change thresholds.
All significant features will then be highlighted by color on the plot.
<br/><br/>
Note that these p-values should be on the raw scale, i.e., not log-transformed, and not corrected for multiple testing."
)
})
.define_gene_sig_tours(x)
c(callNextMethod(),
list(
.selectInput.iSEE(x, "PValueField",
label="P-value field:",
selected=slot(x, "PValueField"),
choices=.getCachedCommonInfo(se, "MAPlot")$valid.p.fields),
hr()
),
.define_gene_sig_ui(x)
)
})
#' @export
setMethod(".hideInterface", "MAPlot", function(x, field) {
if (field == "XAxis") TRUE else callNextMethod()
})
#' @export
setMethod(".createObservers", "MAPlot", function(x, se, input, session, pObjects, rObjects) {
callNextMethod()
plot_name <- .getEncodedName(x)
.createUnprotectedParameterObservers(plot_name,
fields=c("PValueField", "PValueThreshold", "LogFCThreshold", "PValueCorrection"),
input=input, pObjects=pObjects, rObjects=rObjects)
})
#' @export
setMethod(".generateDotPlotData", "MAPlot", function(x, envir) {
output <- callNextMethod()
pval.field <- sprintf("rowData(se)[[%s]]", deparse(x[["PValueField"]]))
extra_cmds <- .define_de_status(x, lfc="plot.data$Y", pval=pval.field)
extra_cmds <- c(extra_cmds, "plot.data$IsSig <- c('down', 'none', 'up')[.de_status];")
eval(parse(text=extra_cmds), envir)
output$commands <- c(output$commands, extra_cmds)
output
})
#' @export
#' @importFrom shiny tagList
setMethod(".getDotPlotColorHelp", "MAPlot", function(x, color_choices) {
FUN <- callNextMethod()
function(plot_name) {
df <- FUN(plot_name)
df[1,2] <- "Here, we choose whether to color points by per-row attributes.
When <em>None</em> is selected, the plot defaults to a constant color for all non-significant features,
pink for the significant features with positive log-fold changes,
and blue for the significant features with negative log-fold changes.
The number of features in each category is also shown in the legend.
<br/><br/>
Alternatively, try out some of the different choices here, and note how further options become available when each choice is selected."
df
}
})
#' @export
setMethod(".prioritizeDotPlotData", "MAPlot", function(x, envir) .define_de_priority(envir))
#' @export
setMethod(".colorByNoneDotPlotField", "MAPlot", function(x) "IsSig")
#' @export
setMethod(".colorByNoneDotPlotScale", "MAPlot", function(x) .de_color_scale(x[["Downsample"]]))
#' @export
#' @importFrom ggplot2 geom_hline
setMethod(".generateDotPlot", "MAPlot", function(x, labels, envir) {
output <- callNextMethod()
# Adding the horizontal lines.
extras <- "dot.plot <- dot.plot +"
lfc <- x[["LogFCThreshold"]]
if (lfc > 0) {
# No idea why I need ggplot2:: here, but it just can't find it otherwise.
extras <- c(extras, sprintf("ggplot2::geom_hline(yintercept=c(-1, 1)*%s, color=\"darkgreen\", linetype=\"dashed\")", lfc))
}
if (length(extras) > 1) {
extras <- paste(extras, collapse="\n ")
output$commands <- c(output$commands, list(ma=extras))
output$plot <- eval(parse(text=extras), envir=envir)
}
output
})
#' @export
setMethod(".definePanelTour", "MAPlot", function(x) {
prev <- callNextMethod()
skip <- grep("VisualBoxOpen$", prev$element)
prev <- prev[-seq_len(skip-1),]
rbind(
c(paste0("#", .getEncodedName(x)), sprintf("The <font color=\"%s\">MA plot</font> panel shows the log-fold change from a differential comparison against the average abundance. Each point here corresponds to a feature in our <code>SummarizedExperiment</code>, and the number of significantly different features in the comparisons is shown in the legend.", .getPanelColor(x))),
c(paste0("#", .getEncodedName(x), "_DataBoxOpen"), "The <i>Data parameters</i> box shows the available parameters that can be tweaked in this plot.<br/><br/><strong>Action:</strong> click on this box to open up available options."),
c(paste0("#", .getEncodedName(x), "_YAxis + .selectize-control"), "We can control the columns containing the log-fold changes, based on the available fields in the <code>rowData</code> of the <code>SummarizedExperiment</code>."),
c(paste0("#", .getEncodedName(x), "_XAxisRowData + .selectize-control"), "Similarly, we can control the columns containing the average abundance of each feature, again based on the <code>rowData</code> fields. This is generally expected to be some sort of metric on the log-scale, e.g., an average log-CPM."),
c(paste0("#", .getEncodedName(x), "_PValueThreshold"), "A variety of thresholds can also be tuned to define significant differences; the most relevant of these is the threshold on the false discovery rate."),
prev
)
})