This concerns Chapter 11 of the tlverse handbook, on survival. It is not clear which time points are being targeted. In essence, the grid used for estimating the hazard needs to be distinguished from the time points of interest. I also think this leads to incorrect hazard loss minimization, because you don't want to penalize the hazard at time points later than the last time point of interest. For example, if I want survival at times 1:5, I will only use data from times 1:5, and will not fit the hazard on times after time 5. Maybe you'd disagree, but survival at time t depends only on the conditional hazard at times up to t.
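To make that last point concrete, here is the standard discrete-time identity it relies on. The notation is my own assumption and not taken from the handbook: $A$ is treatment, $W$ are baseline covariates, and $\lambda(s \mid A, W) = P(T = s \mid T \ge s, A, W)$ is the conditional hazard.

$$
S(t \mid A, W) = P(T > t \mid A, W) = \prod_{s=1}^{t} \bigl(1 - \lambda(s \mid A, W)\bigr)
$$

Under this identity, $S(5 \mid A, W)$ involves only $\lambda(1 \mid A, W), \dots, \lambda(5 \mid A, W)$, so hazard values at times 6:50 should not affect survival estimates at times 1:5.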
library(tmle3)
library(sl3)
vet_data <- read.csv(
  paste0(
    "https://raw.githubusercontent.com/tlverse/deming2019-workshop/",
    "master/data/veteran.csv"
  )
)
vet_data$trt <- vet_data$trt - 1
# make fewer times for illustration
vet_data$time <- ceiling(vet_data$time / 20)
head(vet_data)
var_types <- list(
  T_tilde = Variable_Type$new("continuous"),
  t = Variable_Type$new("continuous"),
  Delta = Variable_Type$new("binomial")
)
Note: the testsurv function runs the chapter's code with a given set of target_times, and either cuts the data down to the target_times or leaves it as is.
First, try to estimate target times 1:5, but pass in the entire df_long, which contains information for all patients at times 1:50. This appears to give targeted estimates for all times 1:50, as tmle_params shows (apparently that method ignores the self$options$target_times argument).
ex1 <- testsurv(1:5, cut = FALSE)
# this should just be 5 params according to target_times
ex1$tmle_params
This gives parameter estimates for times 1:50, which is not what was requested.
# it is giving parameters no one asked for
ex1[[1]]$initial_psi
ex1$tmle_fit_manual$estimates[[1]]$psi[1:25]
Next, do the same as before, but cut the data so that it contains information for times 1:5 only. Unless we actually want to fit the hazard on times after the last time point of interest, this should yield answers identical to ex1, but it doesn't.
If ex1 is actually ignoring target_times and targeting all time points, this should yield the same answer as ex1, but it doesn't. The initial estimates match, but the targeted estimates do not. So maybe ex1 is only targeting times 1:5, even though the tmle_params are the same? What is this code doing?
Another observation: what exactly is ex1$tmle_fit_manual$estimates[[1]]$IC? The IC should have n rows (for n = 137 independent patients), but instead it has n * (number of time points) = 137 * 50 = 6850 rows. The IC matrix also has 50 columns, one for each of the 50 time points, rather than just 5 columns for times 1:5 as target_times specifies. I'm not sure what this IC is: for time point 1 (assuming its IC is the first column of the matrix), there ought to be only 137 nonzero values, one per patient, for the residuals of the hazard from time 0 to 1. How could it have 6850 nonzero entries?
Update: after checking further, it appears this IC matrix is 50 identical ICs stacked on top of each other, one copy for each time point.
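For anyone who wants to reproduce that check, here is a minimal sketch on a toy matrix. All names in it (ic_block, ic_stacked, is_stacked_copies) are hypothetical, and it assumes the copies are stacked as contiguous blocks of n rows in the same subject order, which I have not confirmed from the tmle3 source.

```r
# Toy check: is a stacked matrix just identical copies of one n-row block?
# All names below are hypothetical and not part of tmle3.
set.seed(1)
n <- 4  # number of subjects (137 in this issue)
K <- 3  # number of time points (50 in this issue)

# One n x K block, then K copies of it stacked row-wise
ic_block <- matrix(rnorm(n * K), nrow = n, ncol = K)
ic_stacked <- do.call(rbind, replicate(K, ic_block, simplify = FALSE))

# TRUE if every consecutive block of n rows equals the first block
is_stacked_copies <- function(ic, n) {
  stopifnot(nrow(ic) %% n == 0)
  n_blocks <- nrow(ic) %/% n
  first <- ic[seq_len(n), , drop = FALSE]
  all(vapply(seq_len(n_blocks), function(b) {
    rows <- ((b - 1) * n + 1):(b * n)
    isTRUE(all.equal(ic[rows, , drop = FALSE], first,
                     check.attributes = FALSE))
  }, logical(1)))
}

is_stacked_copies(ic_stacked, n)

# If the check passes, the first n rows carry all the information
ic_dedup <- ic_stacked[seq_len(n), , drop = FALSE]
dim(ic_dedup)
```

With the objects from this issue, the analogous call would be is_stacked_copies(ex1$tmle_fit_manual$estimates[[1]]$IC, n = 137), assuming testsurv returns the fit in that structure.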
One more comment on this: how do we specify whether to use a single-epsilon iterative update (as in clfm), the multi-epsilon ridge update specified here, or the recursive one-step? The ridge idea is very cool, but it isn't needed for only a few parameters. Thanks!
Thanks in advance for a swift reply!