Pooled logistic for survival analyses #42

Open
pzivich opened this issue Mar 29, 2024 · 0 comments
Labels: enhancement (New feature or request), Estimating-Equation (Request for new estimating equation)

pzivich commented Mar 29, 2024

Is your feature request related to a problem? Please describe.

Add an estimating equation for pooled (logistic) regression to support survival analysis. Because this is a finite-dimensional M-estimator, standard M-estimation theory applies. It would also open up several survival analysis options, such as computing inverse probability of censoring weights (IPCW), g-computation, and others.

Describe the solution you'd like

Build an estimating equation for pooled logistic regression. Note that it would not require a long data set. Specifically, we should evaluate something like the following:
$$\sum_{i=1}^n \left( \sum_{k \in R} (\Delta_i t_k - m(W_i, S_i; \beta)) \left[ W_i, S_i \right]^T \right) = 0$$
This yields a compact estimating equation that avoids expanding the data into a long data set, which in turn avoids mistakes that users could introduce in data processing steps. This is the advantage of working with the score! However, specifying the estimating equation programmatically requires some finesse, particularly for the design matrix for time (i.e., $S$), which depends on $k$.
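A minimal NumPy sketch of how the inner sum over $k$ could be evaluated on wide data is below. Everything in it is an assumption made for illustration rather than a proposed API: the function name `ee_pooled_logit`, reading the outcome term as an indicator that individual $i$ has the event at $t_k$, restricting contributions to intervals where the individual is still under follow-up, and coding $S$ as one indicator column per event time (so `W` should not carry its own intercept).

```python
import numpy as np


def ee_pooled_logit(theta, t, delta, W, event_times):
    """Hypothetical pooled logistic score, summed over intervals per person.

    theta       : parameters, ordered as [coefficients for W, one per event time]
    t, delta    : observed time and event indicator, one entry per individual
    W           : n-by-q baseline covariates (no intercept column)
    event_times : the discrete times t_k that make up the set R
    Returns a (q + K)-by-n array with one column of scores per individual.
    """
    theta = np.asarray(theta, dtype=float)
    t = np.asarray(t)
    delta = np.asarray(delta)
    W = np.asarray(W, dtype=float)
    n, q = W.shape
    n_times = len(event_times)

    score = np.zeros((q + n_times, n))
    for k, t_k in enumerate(event_times):
        # Time design S at interval k: an indicator for t_k (assumed form)
        S_k = np.zeros((n, n_times))
        S_k[:, k] = 1.0
        X_k = np.hstack([W, S_k])

        # Assumed risk-set rule: contribute while still under follow-up
        at_risk = (t >= t_k).astype(float)
        # Outcome at interval k: the event occurs exactly at t_k
        y_k = ((t == t_k) & (delta == 1)).astype(float)
        # Predicted discrete-time hazard from the logistic model
        m_k = 1.0 / (1.0 + np.exp(-X_k @ theta))

        # Residual times design row, accumulated over intervals
        score += (at_risk * (y_k - m_k)) * X_k.T
    return score
```

Summing the returned columns over individuals should match the score from fitting the same model on the expanded long data set, which is one way to check a real implementation. The loop builds $S$ one interval at a time, so the long data set is never materialized.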

Challenges here:

  • The time design matrix needs to be processed internally if we don't convert to a long data structure
  • Weights can be time-dependent, which complicates an implementation that doesn't require a long data structure (the weights are a matrix instead of a vector in that case)
  • Users can't directly control who contributes, because the compact structure sums over time internally. Contributions can be modified through the weights argument, but this is more opaque.
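The weight-related challenges can be made concrete with a short sketch. It is only an illustration under stated assumptions: the name `weighted_pooled_score`, the `weights` argument accepting either a length-$n$ vector or an $n \times K$ matrix, the indicator coding of time, and the risk-set rule are all hypothetical choices, not an existing interface.

```python
import numpy as np


def weighted_pooled_score(theta, t, delta, W, event_times, weights=None):
    """Hypothetical weighted pooled logistic score on wide data.

    weights may be None, a length-n vector (time-fixed weights), or an
    n-by-K matrix (time-dependent weights, one column per event time).
    Returns a (q + K)-by-n array with one column of scores per individual.
    """
    theta = np.asarray(theta, dtype=float)
    t = np.asarray(t)
    delta = np.asarray(delta)
    W = np.asarray(W, dtype=float)
    n, q = W.shape
    n_times = len(event_times)

    if weights is None:
        weights = np.ones((n, n_times))
    weights = np.asarray(weights, dtype=float)
    if weights.ndim == 1:
        # Time-fixed weights: repeat the vector across all intervals
        weights = np.repeat(weights[:, None], n_times, axis=1)
    if weights.shape != (n, n_times):
        raise ValueError("weights must have shape (n,) or (n, K)")

    score = np.zeros((q + n_times, n))
    for k, t_k in enumerate(event_times):
        # Indicator coding of time at interval k (assumed form of S)
        S_k = np.zeros((n, n_times))
        S_k[:, k] = 1.0
        X_k = np.hstack([W, S_k])
        at_risk = (t >= t_k).astype(float)
        y_k = ((t == t_k) & (delta == 1)).astype(float)
        m_k = 1.0 / (1.0 + np.exp(-X_k @ theta))
        # Column k of the weight matrix scales the interval-k contribution
        score += (weights[:, k] * at_risk * (y_k - m_k)) * X_k.T
    return score
```

Setting an entry of the weight matrix to zero removes that individual's contribution at that interval. This is how a user could control who contributes, and it also shows why that control is less transparent than subsetting rows of a long data set.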

Describe alternatives you've considered

Coding this from scratch each time. I would rather not, and a built-in estimating equation would be good support for users.

Additional context

Abbott, R. D. (1985). Logistic regression in survival analysis. American Journal of Epidemiology, 121(3), 465-471.

D'Agostino, R. B., Lee, M. L., Belanger, A. J., Cupples, L. A., Anderson, K., & Kannel, W. B. (1990). Relation of pooled logistic regression to time dependent Cox regression analysis: the Framingham Heart Study. Statistics in Medicine, 9(12), 1501-1515.

Hernán, M. A. (2010). The hazards of hazard ratios. Epidemiology, 21(1), 13-15.

Ngwa, J. S., Cabral, H. J., Cheng, D. M., Pencina, M. J., Gagnon, D. R., LaValley, M. P., & Cupples, L. A. (2016). A comparison of time dependent Cox regression, pooled logistic regression and cross sectional pooling with simulations and an application to the Framingham Heart Study. BMC Medical Research Methodology, 16, 1-12.

@pzivich pzivich added the enhancement New feature or request label Mar 29, 2024
@pzivich pzivich self-assigned this Mar 29, 2024
@pzivich pzivich added the Estimating-Equation Request for new estimating equation label Apr 23, 2024