Add criterion to sksurv.ensemble.RandomSurvivalForest #108
Comments
Since this was posted, a growing literature suggests that the time-varying nature of some features calls for alternative splitting strategies in RSFs. Having only a single strategy (log-rank), which is subject to some of the same proportionality assumptions as Cox regression, might defeat the purpose of a model designed for non-linear problems. Having at least one alternative option, such as a Poisson regression log-likelihood, could offer an intermediate solution before open-ended splitting strategies become available. See the following examples of varying splitting strategies:
@james-sexton96 The range of splitting rules in the literature is quite large. I haven't followed it closely over the last couple of years, so I'm not sure whether a consensus has emerged by now. Conditional Inference Forests would definitely be interesting (see #341). Do you have a reference for the Poisson regression log-likelihood you mentioned?
@sebp A Poisson regression log-likelihood is well suited to real-world data, as opposed to data with structured follow-up. It would be nice to mirror the parameters of sklearn's random forest regressor by including a kwarg for criterion, and if I have time, I can draft an implementation of a Poisson split criterion! Crowther et al. 2012
See also the Poisson criterion added to scikit-learn.
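To make the Poisson idea concrete, here is a rough sketch of one way such a split criterion could be scored. It assumes each node has a constant event rate (events divided by total follow-up time), which is one common reading of a Poisson log-likelihood for survival data and not necessarily the formulation in the reference above. The function names `node_loglik` and `poisson_split_gain` are hypothetical and are not part of sksurv or scikit-learn.

```python
import numpy as np


def node_loglik(event, time):
    """Maximized constant-rate (Poisson/exponential) log-likelihood of a node.

    Assumes the node's rate is estimated as D / T, where D is the number of
    events and T the total follow-up time (T must be positive). Terms that do
    not depend on the rate are dropped.
    """
    d = float(np.sum(event))
    t = float(np.sum(time))
    if d == 0.0:
        # No events: the estimated rate is 0 and the log-likelihood is 0.
        return 0.0
    return d * np.log(d / t) - d


def poisson_split_gain(x, event, time, threshold):
    """Log-likelihood gain from splitting one feature at `threshold`."""
    left = x <= threshold
    right = ~left
    if not left.any() or not right.any():
        # Degenerate split: all samples fall on one side.
        return 0.0
    children = node_loglik(event[left], time[left]) + node_loglik(event[right], time[right])
    return children - node_loglik(event, time)


# Toy data: the left group has events after short follow-up,
# the right group is censored after long follow-up.
x = np.array([0.0, 0.0, 1.0, 1.0])
event = np.array([1, 1, 0, 0])
time = np.array([1.0, 1.0, 5.0, 5.0])
gain = poisson_split_gain(x, event, time, threshold=0.5)
```

A tree builder would evaluate this gain for every candidate threshold and keep the largest one. The gain is zero when both children have the same estimated rate as the parent.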
It would be fantastic to have `criterion` (i.e., the function to measure the quality of a split) as a parameter of `RandomSurvivalForest`. I know that currently only the log-rank splitting rule is supported. For now, this could be set as the default (and only option). In the future, this could be expanded to cover other options (for example, from the original paper: `conservation`, `log_rank_score_rule`, `log_rank_random`), changing the corresponding splitting code as well. This would also make `RandomSurvivalForest` more similar to its `scikit` counterparts (e.g., `RandomForestRegressor`), making it (even) more compatible with other packages that build on `scikit`'s standard structure.
I think this could be done easily in `forest.py`:
If this is something you think might be interesting, I would be more than happy to help with a proper PR.
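For illustration only, a minimal sketch of how a `criterion` parameter with a single accepted value could be exposed and validated. The class `ForestWithCriterion`, the constant `SUPPORTED_CRITERIA`, and the string `"logrank"` are hypothetical stand-ins, not the actual code in `forest.py`. The sketch follows the scikit-learn convention of storing constructor arguments unchanged and validating them later, at fit time.

```python
# Hypothetical: the only accepted value for now; more rules could be added later.
SUPPORTED_CRITERIA = ("logrank",)


class ForestWithCriterion:
    """Stand-in estimator showing where a `criterion` parameter could live."""

    def __init__(self, n_estimators=100, criterion="logrank"):
        # Constructor only stores the arguments, as scikit-learn estimators do.
        self.n_estimators = n_estimators
        self.criterion = criterion

    def _validate_criterion(self):
        # Intended to be called at the start of fit(), before any tree is built.
        if self.criterion not in SUPPORTED_CRITERIA:
            raise ValueError(
                f"criterion must be one of {SUPPORTED_CRITERIA}, got {self.criterion!r}"
            )
        return self.criterion


default_criterion = ForestWithCriterion()._validate_criterion()
```

With this shape, adding a new splitting rule later would mean extending the tuple and the splitting code, without changing the constructor signature.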