Improve Survival stuff #1833
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -150,6 +150,7 @@ Suggests: | |
smoof, | ||
sparseLDA, | ||
stepPlr, | ||
survAUC, | ||
SwarmSVM, | ||
svglite, | ||
testthat, | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,43 @@ | ||
#' @title Set parameters of performance measures | ||
#' | ||
#' @description | ||
#' Sets hyperparameters of measures. | ||
#' | ||
#' @param measure [\code{\link{Measure}}]\cr | ||
#' Performance measure. | ||
#' @param ... [any]\cr | ||
#' Named (hyper)parameters with new settings. Alternatively these can be passed | ||
#' using the \code{par.vals} argument. | ||
#' @param par.vals [\code{list}]\cr | ||
#' Optional list of named (hyper)parameter settings. The arguments in | ||
#' \code{...} take precedence over values in this list. | ||
#' @template ret_measure | ||
#' @family performance | ||
#' @export | ||
setMeasurePars = function(measure, ..., par.vals = list()) { | ||
args = list(...) | ||
assertClass(measure, classes = "Measure") | ||
assertList(args, names = "unique", .var.name = "parameter settings") | ||
assertList(par.vals, names = "unique", .var.name = "parameter settings") | ||
measure$extra.args = insert(measure$extra.args, insert(par.vals, args)) | ||
measure | ||
} | ||
|
||
#' @title Set aggregation function of measure. | ||
#' | ||
#' @description | ||
#' Set how this measure will be aggregated after resampling. | ||
#' To see possible aggregation functions: \code{\link{aggregations}}. | ||
#' | ||
#' @param measure [\code{\link{Measure}}]\cr | ||
#' Performance measure. | ||
#' @template arg_aggr | ||
#' @return [\code{\link{Measure}}] with changed aggregation behaviour. | ||
#' @family performance | ||
#' @export | ||
setAggregation = function(measure, aggr) { | ||
assertClass(measure, classes = "Measure") | ||
assertClass(aggr, classes = "Aggregation") | ||
measure$aggr = aggr | ||
return(measure) | ||
} |
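The precedence rule documented for `setMeasurePars` above (values passed via `...` override `par.vals`, which in turn override the measure's stored `extra.args`) can be illustrated with a minimal base-R sketch. This is not mlr code: `merge_pars` is a hypothetical name, and `modifyList` is used here as an assumed stand-in for the `insert` helper that appears in the diff.

```r
# Hypothetical stand-in for the merging step in setMeasurePars():
# later sources override earlier ones by name.
merge_pars = function(defaults, ..., par.vals = list()) {
  args = list(...)
  # args (from ...) win over par.vals, which win over the defaults
  modifyList(defaults, modifyList(par.vals, args))
}

defaults = list(max.time = 10, resolution = 1000)
merged = merge_pars(defaults, max.time = 5,
  par.vals = list(max.time = 7, resolution = 500))
str(merged)  # max.time is 5 (from ...), resolution is 500 (from par.vals)
```

With the PR applied, the analogous call would presumably look like `setMeasurePars(iauc.uno, max.time = 5)`.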
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -52,8 +52,5 @@ trainLearner.surv.gamboost = function(.learner, .task, .subset, .weights = NULL, | |
|
||
#' @export | ||
predictLearner.surv.gamboost = function(.learner, .model, .newdata, ...) { | ||
if (.learner$predict.type == "response") | ||
predict(.model$learner.model, newdata = .newdata, type = "link") | ||
else | ||
stop("Unknown predict type") | ||
predict(.model$learner.model, newdata = .newdata, type = "link") | ||
Why is the `if` no longer necessary here (and below)?
Survival learners do not support multiple predict types currently. There was an attempt to support survival probabilities, but this is not implemented. The calling function checks for predict type and matches against properties, so this is dead code.
||
} |
This file was deleted.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -17,6 +17,9 @@ | |
#' For clustering measures, we compact the predicted cluster IDs such that they form a continuous series | ||
#' starting with 1. If this is not the case, some of the measures will generate warnings. | ||
#' | ||
#' Some measures have parameters. Their defaults are set in the constructor \code{\link{makeMeasure}} and can be | ||
#' overwritten using \code{\link{setMeasurePars}}. | ||
#' | ||
#' @param truth [\code{factor}]\cr | ||
#' Vector of the true class. | ||
#' @param response [\code{factor}]\cr | ||
|
@@ -1337,19 +1340,65 @@ measureMultilabelTPR = function(truth, response) { | |
#' @format none | ||
cindex = makeMeasure(id = "cindex", minimize = FALSE, best = 1, worst = 0, | ||
properties = c("surv", "req.pred", "req.truth"), | ||
name = "Concordance index", | ||
name = "Harrell's Concordance index", | ||
note = "Fraction of all pairs of subjects whose predicted survival times are correctly ordered among all subjects that can actually be ordered. In other words, it is the probability of concordance between the predicted and the observed survival.", | ||
fun = function(task, model, pred, feats, extra.args) { | ||
requirePackages("Hmisc", default.method = "load") | ||
resp = pred$data$response | ||
if (anyMissing(resp)) | ||
requirePackages("_Hmisc") | ||
y = getPredictionResponse(pred) | ||
if (anyMissing(y)) | ||
return(NA_real_) | ||
# FIXME: we need to convert to the correct survival type | ||
s = Surv(pred$data$truth.time, pred$data$truth.event) | ||
Hmisc::rcorr.cens(-1 * resp, s)[["C Index"]] | ||
s = getPredictionTruth(pred) | ||
Hmisc::rcorr.cens(-1 * y, s)[["C Index"]] | ||
} | ||
) | ||
|
||
#' @export cindex.uno | ||
#' @rdname measures | ||
#' @format none | ||
#' @references | ||
#' H. Uno et al. | ||
#' \emph{On the C-statistics for Evaluating Overall Adequacy of Risk Prediction Procedures with Censored Survival Data} | ||
#' Statistics in medicine. 2011;30(10):1105-1117. \url{http://dx.doi.org/10.1002/sim.4154}. | ||
cindex.uno = makeMeasure(id = "cindex.uno", minimize = FALSE, best = 1, worst = 0, | ||
properties = c("surv", "req.pred", "req.truth", "req.model"), | ||
name = "Uno's Concordance index", | ||
note = "Fraction of all pairs of subjects whose predicted survival times are correctly ordered among all subjects that can actually be ordered. In other words, it is the probability of concordance between the predicted and the observed survival. Corrected by weighting with IPCW as suggested by Uno. Implemented in survAUC::UnoC.", | ||
fun = function(task, model, pred, feats, extra.args) { | ||
requirePackages("_survAUC") | ||
y = getPredictionResponse(pred) | ||
if (anyMissing(y)) | ||
return(NA_real_) | ||
surv.train = getTaskTargets(task, recode.target = "rcens")[model$subset] | ||
max.time = assertNumber(extra.args$max.time, null.ok = TRUE) %??% max(getTaskTargets(task)[, 1L]) | ||
survAUC::UnoC(Surv.rsp = surv.train, Surv.rsp.new = getPredictionTruth(pred), time = max.time, lpnew = y) | ||
}, | ||
extra.args = list(max.time = NULL) | ||
) | ||
|
||
#' @export iauc.uno | ||
#' @rdname measures | ||
#' @format none | ||
#' @references | ||
#' H. Uno et al. | ||
#' \emph{Evaluating Prediction Rules for T-Year Survivors with Censored Regression Models} | ||
#' Journal of the American Statistical Association 102, no. 478 (2007): 527-37. \url{http://www.jstor.org/stable/27639883}. | ||
iauc.uno = makeMeasure(id = "iauc.uno", minimize = FALSE, best = 1, worst = 0, | ||
properties = c("surv", "req.pred", "req.truth", "req.model", "req.task"), | ||
name = "Uno's estimator of cumulative AUC for right censored time-to-event data", | ||
note = "To set an upper time limit, set argument max.time (defaults to max time in complete task). Implemented in survAUC::AUC.uno.", | ||
fun = function(task, model, pred, feats, extra.args) { | ||
requirePackages("_survAUC") | ||
max.time = assertNumber(extra.args$max.time, null.ok = TRUE) %??% max(getTaskTargets(task)[, 1L]) | ||
times = seq(from = 0, to = max.time, length.out = extra.args$resolution) | ||
surv.train = getTaskTargets(task, recode.target = "rcens")[model$subset] | ||
y = getPredictionResponse(pred) | ||
if (anyMissing(y)) | ||
return(NA_real_) | ||
survAUC::AUC.uno(Surv.rsp = surv.train, Surv.rsp.new = getPredictionTruth(pred), times = times, lpnew = y)$iauc | ||
}, | ||
extra.args = list(max.time = NULL, resolution = 1000) | ||
) | ||
|
||
Could you add hand-constructed tests for the new measures please?
What do you mean by hand-constructed? Calculating these measures without a package would require a few hundred LOC.
Most of the other measure tests are along the lines of: incorrect predictions 5, correct predictions 10, therefore error rate 33%. Check that the implemented measure gets that number. The point is to check that the number is correct for specific cases (and these can be constructed, i.e. you know what the answer should be).
It's complicated. I've added a small test to check whether perfect predictions lead to (nearly) perfect performance if there is no censoring. For all other cases, I'd need an external package because you cannot compute this by hand (in a reasonable time frame). I guess we have to rely on the package authors of
@PhilippPro Do you have any ideas how to test those measures?
No, not really. Of course one can construct simple cases without censoring that can be calculated by hand, but with censoring we have to use the complicated formulas from Uno's paper (https://www.ncbi.nlm.nih.gov/pmc/articles/PMC3079915/), which do not look very simple at first glance.
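For the uncensored case mentioned in this thread, a concordance value can be checked by counting pairs directly. The following is a minimal sketch under that assumption (no censoring at all); `concordance_uncensored` is a hypothetical helper, not part of mlr, Hmisc, or survAUC, and it takes a risk score where higher means shorter expected survival.

```r
# Concordance for uncensored data: among all pairs with different
# survival times, the fraction in which the subject with the higher
# predicted risk has the shorter time. Ties in risk count as 1/2.
# `concordance_uncensored` is a hypothetical helper for illustration.
concordance_uncensored = function(time, risk) {
  stopifnot(length(time) == length(risk), length(time) >= 2L)
  pairs = combn(length(time), 2L)
  i = pairs[1L, ]
  j = pairs[2L, ]
  comparable = time[i] != time[j]
  # +1 if the pair is ordered correctly, -1 if not, 0 for tied risks
  agreement = sign(time[j] - time[i]) * sign(risk[i] - risk[j])
  mean((agreement[comparable] + 1) / 2)
}

time = c(2, 4, 6, 8)
c.perfect = concordance_uncensored(time, risk = c(4, 3, 2, 1))   # 1
c.reversed = concordance_uncensored(time, risk = c(1, 2, 3, 4))  # 0
c.one.swap = concordance_uncensored(time, risk = c(4, 2, 3, 1))  # 5/6
print(c(c.perfect, c.reversed, c.one.swap))
```

Cases like these have answers that can be verified on paper (six pairs, one of them discordant in the last call), which is the kind of constructed check the reviewer asks for. They say nothing about the IPCW weighting that applies once censoring is present.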
||
############################################################################### | ||
### cost-sensitive ### | ||
############################################################################### | ||
|
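The `max.time` handling in `cindex.uno` and `iauc.uno` above falls back to the largest observed time when no limit is given, and `iauc.uno` then evaluates the AUC on an evenly spaced grid of `resolution` time points. A minimal base-R sketch of that logic follows. `%or%` and `make_time_grid` are hypothetical names introduced for illustration; `%or%` only mimics the NULL-coalescing behaviour of the `%??%` operator used in the diff.

```r
# Hypothetical NULL-coalescing helper mimicking `%??%` from the diff:
# return the left-hand side unless it is NULL.
`%or%` = function(lhs, rhs) if (is.null(lhs)) rhs else lhs

# Hypothetical helper: build the evaluation grid the way the
# iauc.uno measure function in the diff does.
make_time_grid = function(observed.times, max.time = NULL, resolution = 1000) {
  upper = max.time %or% max(observed.times)
  seq(from = 0, to = upper, length.out = resolution)
}

observed = c(3, 12, 7, 20)
grid.default = make_time_grid(observed, resolution = 5)
grid.capped = make_time_grid(observed, max.time = 10, resolution = 5)
print(grid.default)  # 0 5 10 15 20
print(grid.capped)   # 0.0 2.5 5.0 7.5 10.0
```

Note that, as in the diff, the default upper limit is taken from the observed times that are passed in; the measure itself uses the times of the complete task.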
Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.
Could you add a test for this please?
There is a test in this PR which detects if the predictions are reversed/inverted.