How to reverse reference level in Cox regression output? #458

plpxsk · 2018-05-11T17:28:49Z

With a categorical covariate with 2 levels: "A" and "B", it looks like "A" is kept as reference level, so that cox regression output shows hazard ratio of B to A. What if I'd prefer the reversed hazard ratio of A to B (so that "B" is reference level")

To clarify, in R, this is done via "factor" and then you can set a reference factor of "B", for example.

Is this possible in lifelines?

The text was updated successfully, but these errors were encountered:

CamDavidsonPilon · 2018-05-11T18:08:19Z

Hey @pavopax - just so I'm clear, can you show an example of what this might look like in R code?

plpxsk · 2018-05-11T19:47:27Z

sure:

library(dplyr)
library(forcats)
library(survival)

N <- 20
df <- data_frame(
  time=rexp(N),
  event=as.integer(rbinom(N, 1, .5)),
  treatment_=c(rep("control", N/2), rep("active", N/2)))

cox <- coxph(Surv(time, event) ~ treatment_, data=df)

In the model summary below, exp(coef) for "treatment_control" is the hazard ratio of control:active , where active is the "reference" since it is alphabetically before "control"

> cox
Call:
coxph(formula = Surv(time, event) ~ treatment_, data = df)

                   coef exp(coef) se(coef)     z    p
treatment_control -0.373     0.689    0.917 -0.41 0.68

And I'd like to see the reverse HR: active to control, where control is the reference ((common in pharma/biotech).

So I reverse it like so:

df$treatment_  <- df$treatment_ %>% as.factor() %>% forcats::fct_rev()

And the result:

cox <- coxph(Surv(time, event) ~ treatment_, data=df)

> cox
Call:
coxph(formula = Surv(time, event) ~ treatment_, data = df)

                 coef exp(coef) se(coef)    z    p
treatment_active 0.373     1.452    0.917 0.41 0.68

Now, this shows me the desired hazard ratio, which I obtained by reversing the "treatment_" factor variable via fct_rev()

(It is actually easy to get this hazard ratio: exp(coef) is just 1/exp(coef) from the first cox model )

CamDavidsonPilon · 2018-05-11T20:02:02Z

Let me know if this solves your question: in lifelines, all the processing & encoding is done outside of lifelines. That is, before the dataframe even hits lifelines, it should be encoded as you wish. So it's up to you whether the baseline is control or treatment in the dataframe column. The inference should proceed as normal, and the resulting cph.print_summary should represent the choice chosen.

Popular encoding functions are available in the patsy library (which has a R-style formula system) and Pandas get_dummies function (but this doesn't allow you wish to chose as the baseline.

df = pd.DataFrame.from_records([
  {'t': 1, 'e': 0, 'cata': 'control'},
  {'t': 2, 'e': 0, 'cata': 'treatment'},
  {'t': 3, 'e': 0, 'cata': 'treatment'},
  {'t': 4, 'e': 0, 'cata': 'control'},
  {'t': 5, 'e': 0, 'cata': 'treatment'},
  {'t': 6, 'e': 0, 'cata': 'control'},
])

df = pd.get_dummies(df, columns='cata')
# choice is now yours to make either baseline, so either:
del df['cata_control']
# or 
del df['cata_treatment']

cph = CoxPHFitter()
cpf.fit(df, 't', 'e')
...

plpxsk · 2018-05-11T20:52:21Z

Yes, this makes complete sense 💯

As suggested, I will make dummies with pd.get_dummies(drop_first=False) and then drop my "reference" level column from the data frame.

And I fully agree with your philosophy of "do all processing/encoding outside of lifelines."

So the modification could instead be made in pd.get_dummies() as discussed at pandas-dev/pandas#12042 (comment). For future viewers, a more general relevant thread is at pandas-dev/pandas#8918.

Issue can be closed.

CamDavidsonPilon closed this as completed May 11, 2018

t-silvers mentioned this issue Dec 2, 2021

Set reference level in CPH using formula with Formulaic? #1368

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

How to reverse reference level in Cox regression output? #458

How to reverse reference level in Cox regression output? #458

plpxsk commented May 11, 2018

CamDavidsonPilon commented May 11, 2018

plpxsk commented May 11, 2018 •

edited

Loading

CamDavidsonPilon commented May 11, 2018

plpxsk commented May 11, 2018 •

edited

Loading

How to reverse reference level in Cox regression output? #458

How to reverse reference level in Cox regression output? #458

Comments

plpxsk commented May 11, 2018

CamDavidsonPilon commented May 11, 2018

plpxsk commented May 11, 2018 • edited Loading

CamDavidsonPilon commented May 11, 2018

plpxsk commented May 11, 2018 • edited Loading

plpxsk commented May 11, 2018 •

edited

Loading

plpxsk commented May 11, 2018 •

edited

Loading