`CoxnetSurvivalAnalysis` and `concordance_index_censored` (or the model's `score`) currently do not support sample weights. This would be very useful to have when calling `fit` and `score`.
GLMNET already supports this. For Coxnet, the extension of the partial log-likelihood calculation to support sample weights is described in Section 2.5 of the paper: https://web.stanford.edu/~hastie/Papers/v39i05.pdf
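To make the request concrete, here is a minimal sketch of what a weighted partial log-likelihood could look like. It assumes the common formulation in which each event contributes its weight times the log of its relative risk, and each sample in the risk set is weighted in the denominator (Breslow-style handling of tied event times). The function name and arguments are hypothetical, and I have not checked this term by term against Section 2.5 or the GLMNET source.

```python
import numpy as np


def weighted_cox_partial_log_likelihood(eta, time, event, weights):
    """Weighted Cox partial log-likelihood (hypothetical helper, Breslow-style ties).

    eta     : linear predictor x_i^T beta for each sample
    time    : observed time for each sample
    event   : True if the event was observed, False if censored
    weights : non-negative sample weights
    """
    eta = np.asarray(eta, dtype=float)
    time = np.asarray(time, dtype=float)
    event = np.asarray(event, dtype=bool)
    weights = np.asarray(weights, dtype=float)

    total = 0.0
    for i in np.flatnonzero(event):
        # Risk set: every sample still under observation at the i-th event time.
        at_risk = time >= time[i]
        log_denominator = np.log(np.sum(weights[at_risk] * np.exp(eta[at_risk])))
        # Each event's contribution is scaled by its own weight.
        total += weights[i] * (eta[i] - log_denominator)
    return total
```

Under this formulation, giving a sample an integer weight of 2 yields the same value as listing that sample twice with unit weights, which is the behaviour I would expect from sample weights.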
Support for sample weights is very useful because repeated measures are quite common, for example in biological datasets. As a workaround, I wrote a custom scikit-learn compatible CV iterator that takes a random sample from each group of repeated measures before producing the folds or splits.
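For illustration, the workaround could look roughly like the sketch below. This is not my exact implementation: the class name `OnePerGroupKFold` and its parameters are hypothetical, and it assumes a plain k-fold split over the sampled rows.

```python
import numpy as np


class OnePerGroupKFold:
    """Hypothetical splitter: draw one random row per group, then k-fold the drawn rows."""

    def __init__(self, n_splits=3, random_state=None):
        self.n_splits = n_splits
        self.random_state = random_state

    def get_n_splits(self, X=None, y=None, groups=None):
        return self.n_splits

    def split(self, X, y=None, groups=None):
        if groups is None:
            raise ValueError("groups must identify the repeated measures")
        groups = np.asarray(groups)
        rng = np.random.default_rng(self.random_state)

        # Draw one row index at random from each group of repeated measures.
        sampled = np.array(
            [rng.choice(np.flatnonzero(groups == g)) for g in np.unique(groups)]
        )
        if len(sampled) < self.n_splits:
            raise ValueError("fewer groups than requested splits")

        # Shuffle the drawn rows and cut them into n_splits folds.
        rng.shuffle(sampled)
        folds = np.array_split(sampled, self.n_splits)
        for k in range(self.n_splits):
            test_idx = folds[k]
            train_idx = np.concatenate(
                [folds[j] for j in range(self.n_splits) if j != k]
            )
            yield train_idx, test_idx
```

Because the object exposes `split` and `get_n_splits`, it should be usable as the `cv` argument of scikit-learn's model selection tools, with `groups` passed alongside the data. The drawback is that every split discards all but one row per group, which is what sample weights would avoid.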
hermidalc changed the title from "Add sample weights support to CoxnetSurvivalAnalysis and concordance_index_censored" to "Support sample weights in CoxnetSurvivalAnalysis and concordance_index_censored" on May 26, 2020
hermidalc changed the title from "Support sample weights in CoxnetSurvivalAnalysis and concordance_index_censored" to "Sample weights support in CoxnetSurvivalAnalysis and concordance_index_censored" on May 28, 2020
I don't know if the current version of GLMNET does more with sample weights in the Coxnet algorithm than what the paper describes. I have a hard time reading FORTRAN, but I believe the sample weights are the `w` variable here: https://github.com/cran/glmnet/blob/8d764f2a609e8bfe0c32c9eabbd7e8c04c9394f6/src/glmnet5dpclean.f#L3543

You can also see how sample weights are applied to the c-index calculation here: https://github.com/cran/glmnet/blob/8d764f2a609e8bfe0c32c9eabbd7e8c04c9394f6/R/Cindex.R#L37
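As a rough idea of what a weighted c-index could compute, here is a minimal sketch. It assumes each comparable pair is weighted by the product of the two sample weights, which is my assumption rather than something I have confirmed in the linked R code. The function name is hypothetical, the argument order mirrors `concordance_index_censored`, and pairs with tied times are simply skipped to keep it short.

```python
import numpy as np


def weighted_concordance_index(event, time, risk, weights):
    """Hypothetical weighted c-index; each comparable pair counts w_i * w_j."""
    event = np.asarray(event, dtype=bool)
    time = np.asarray(time, dtype=float)
    risk = np.asarray(risk, dtype=float)
    weights = np.asarray(weights, dtype=float)

    concordant = tied_risk = total = 0.0
    n = len(time)
    for i in range(n):
        if not event[i]:
            continue  # a pair is only comparable if the earlier sample had an event
        for j in range(n):
            if time[i] < time[j]:
                pair_weight = weights[i] * weights[j]
                total += pair_weight
                if risk[i] > risk[j]:
                    concordant += pair_weight  # earlier event has the higher risk score
                elif risk[i] == risk[j]:
                    tied_risk += pair_weight
    if total == 0:
        raise ValueError("no comparable pairs")
    # Pairs with tied risk scores count as half concordant.
    return (concordant + 0.5 * tied_risk) / total
```

With all weights equal to 1 this reduces to the usual unweighted ratio of concordant to comparable pairs.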