Topic 7 IP weighting for censoring

Learning Goals

  1. Understand how IP weighting can be used to address the lack of generalizability caused by censoring





Discussion

Language for interpreting coefficients in MSMs

$$E[Y^a] = \beta_0 + \beta_1 a$$

  • Context: Y is cholesterol, A is medication (a = 1: on medication)
  • β0: The average potential cholesterol level if no one were on medication
  • β1: The average causal effect: the difference in average potential cholesterol level if everyone were on medication versus if no one were on medication (see the worked contrast below)
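
Plugging a = 1 and a = 0 into the model makes this interpretation explicit:

$$E[Y^{a=1}] - E[Y^{a=0}] = (\beta_0 + \beta_1 \cdot 1) - (\beta_0 + \beta_1 \cdot 0) = \beta_1$$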





Do we really care about the ACE?

  • Depends on research goals, but yes, sometimes ACE is of interest.
    • Is it a nationwide policy? If so, then enacting the policy means “treating” everyone and not enacting means treating no one.
  • May care about the ACE within subgroups. We’ve seen this with effect modification MSMs:

$$E[Y^a \mid M] = \beta_0 + \beta_1 a + \beta_2 M + \beta_3 a \times M$$

  • Say M indicates history of heart disease (1: history, 0: no history)
  • Question: What are the interpretations of β1 and β3 in this model? (See the worked contrast below.)
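
As a hint, the same plug-in approach as above gives the contrast within a level of M:

$$E[Y^{a=1} \mid M = m] - E[Y^{a=0} \mid M = m] = \beta_1 + \beta_3 m$$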





What is the difference between regular and IP-weighted GLMs?

  • Our tree diagrams showed that weighting allows us to create “populations” of all treated and all untreated.
    • Our WHATIF book calls the combined population of the all treated and the all untreated a pseudopopulation
  • We can model the outcomes in this pseudopopulation with standard regression models (marginal structural models).
    • Allows us to estimate causal quantities: $P(Y \mid do(A=a))$, $E[Y \mid do(A=a)]$ (in potential outcome notation: $P(Y^a)$, $E[Y^a]$)
    • Standard regression models without IP weights do not estimate these causal quantities (see the sketch below).
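
A rough sketch of this difference in R, assuming a data frame dat with a continuous outcome y, a binary treatment a, an id column, and already-computed IP weights ipw (all placeholder names, not the NHEFS variables):

library(geepack)

# Ordinary (unweighted) regression: estimates an association, not the ACE
assoc_mod <- glm(y ~ a, data = dat)

# IP-weighted regression, i.e., an MSM fit in the pseudopopulation:
# the coefficient on a estimates E[Y^{a=1}] - E[Y^{a=0}]
msm_mod <- geeglm(y ~ a, data = dat, weights = ipw,
                  id = id, corstr = "independence")
summary(msm_mod)

geeglm() from the geepack package is used instead of glm() so that the reported standard errors are robust to the weighting.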





Censoring and selection bias

Reminder: NHEFS investigation from last time

Research goal: What is the average causal effect of smoking cessation on weight gain at a follow-up visit about 10 years later?

  • There were some people who did not have their weight measured at Visit 2. They must be excluded from our analysis.
    • Define a censoring indicator C where C=1 means censored/excluded and C=0 means uncensored/included.
    • Our analysis only generalizes to uncensored individuals: the types of people who would stay in the study!

  • Before: just wanted to find Z such that conditional on Z (within subsets defined by Z), the outcomes of the treated and untreated were comparable.
    • Could appropriately upweight the treated and untreated to get “populations” of all treated and all untreated.


  • Now: also want to ensure that the outcomes in the censored and uncensored are comparable.
    • Can appropriately upweight the uncensored to represent everyone had they not been censored.
    • The censored get zero weight. (We don’t observe them!)


  • When are the outcomes in the censored and uncensored comparable?
    • When A and the variables in Z2 together d-separate C and the outcome Y.
    • e.g., The DAG below. View C as the new “treatment” variable.
    • Another way to put it: when we put a box around C in the DAG, Z must d-separate A and Y under the null.
      This is the same condition as before, but it now makes explicit the non-backdoor paths that are opened by conditioning on a collider. (A small sketch after the questions below shows how to check d-separation in R.)

Questions: C is inherently conditioned on in our analysis (C=0).

  • What set Z d-separates the treatment and outcome under the null?
  • What is a set Z2 such that Z2 ∪ A (read “Z2 union A”, i.e., Z2 and A together) d-separates C and the outcome?
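
For reference, d-separation claims like these can be checked in R with the dagitty package. Here is a minimal sketch using a small made-up DAG (not the course DAG; the variable names are placeholders), just to show the function calls:

library(dagitty)

toy_dag <- dagitty("dag {
  Z -> A
  Z -> Y
  A -> Y
  A -> C
  Z2 -> C
  Z2 -> Y
}")

# Is C d-separated from Y given A and Z2 in this toy DAG?
dseparated(toy_dag, "C", "Y", c("A", "Z2"))  # TRUE here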





Conclusion:

  • Z should include:
    sex, age, race, education, smokeintensity, smokeyrs, active, exercise, wt71, and VidGames
    (Same as before with the addition of VidGames)
  • Z blocks spurious, non-causal paths between A (qsmk) and Y (wt82_71)


  • Z2 should include VidGames
  • Together with A, Z2 blocks spurious, non-causal paths between C and Y
  • Note that Z2 is contained within Z.
    • This will be the case when A is a cause of C (often true).





Ultimate goal: Want to upweight individuals using $P(A, C=0 \mid Z)$

  • Why? Same rationale as our tree diagram, with additional C = 1 and C = 0 branches.
  • Allows us to estimate $E[Y^{a, c=0}]$: the expected potential outcome under treatment a when no one has been censored (c = 0). (See the identity below.)
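
Under the exchangeability and positivity conditions encoded in the DAG above, this quantity can be written as a weighted average of observed outcomes (a standard IP weighting identity, where I(·) is an indicator function):

$$E[Y^{a, c=0}] = E\left[\frac{I(A = a,\ C = 0)\, Y}{P(A = a,\ C = 0 \mid Z)}\right]$$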





We can express $P(A, C=0 \mid Z)$ as the following:

$$P(A, C=0 \mid Z) = \frac{P(A, C=0, Z)}{P(Z)} = P(A \mid Z)\, P(C=0 \mid A, Z)$$

  • Based on the final expression, we see (1) the propensity score $P(A \mid Z)$ and (2) another “propensity”: the probability of remaining uncensored, $P(C=0 \mid A, Z)$.
  • Will want models for both.
  • (The censoring probability is not typically called a propensity score.)





A little extra theory (don’t worry about the math part below if you haven’t taken Probability):

Z may contain extra variables not in Z2 if there are, say, many common causes of A and Y, but only a few of these are causes of C. Let’s say that these extra variables are in the set Z1 (so $Z = Z_1 \cup Z_2$).

If the variables in Z1 are not causes of C, then C and Z1 are conditionally independent given A and Z2.

Going through the work below…

$$
\begin{aligned}
P(A, C=0 \mid Z) &= P(A, C=0 \mid Z_1, Z_2) \\
&= \frac{P(A, C=0, Z_1, Z_2)}{P(Z_1, Z_2)} \\
&= P(A \mid Z_1, Z_2)\, P(C=0 \mid A, Z_1, Z_2) \\
&= P(A \mid Z_1, Z_2)\, P(C=0 \mid A, Z_2)
\end{aligned}
$$

…this implies that the propensity score model (for A) should depend on all variables in Z (the sets Z1 and Z2 together) and that the censoring probability model only needs to depend on A and Z2.





Analysis plan:

Notation:

  • Z is the set that d-separates A and Y when C is conditioned on.
  • Z2 is a subset of Z that, together with A, d-separates C and Y
  1. Model the propensity scores $P(A \mid Z)$ using all variables Z in the d-separating set. Create inverse probability weights $W^A = 1/P(A \mid Z)$ (treatment weights).

  2. Model the probability of censoring $P(C \mid A, Z_2)$. Create inverse probability weights $W^C = 1/P(C=0 \mid A, Z_2)$ (censoring weights).

  3. Create final weights (for treatment and censoring) $W^{A,C} = W^A \times W^C$.

  4. Fit the desired marginal structural model with these final weights. (A compact sketch of all four steps follows.)
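
A rough sketch of these four steps in R, using placeholder names (a data frame dat with treatment a, censoring indicator cens, covariates z1 and z2, and outcome y); the exercises below use the actual NHEFS variables:

library(geepack)

# 1. Treatment weights: W^A = 1 / P(A | Z)
ps_mod <- glm(a ~ z1 + z2, family = binomial, data = dat)
p_a <- ifelse(dat$a == 1,
              predict(ps_mod, type = "response"),
              1 - predict(ps_mod, type = "response"))
dat$w_a <- 1 / p_a

# 2. Censoring weights: W^C = 1 / P(C = 0 | A, Z2)
cens_mod <- glm(cens ~ a + z2, family = binomial, data = dat)
dat$w_c <- 1 / (1 - predict(cens_mod, type = "response"))

# 3. Combined weights
dat$w_ac <- dat$w_a * dat$w_c

# 4. Weighted MSM among the uncensored (censored rows have missing y)
uncens <- dat[!dat$cens, ]
uncens$id <- seq_len(nrow(uncens))
msm <- geeglm(y ~ a, data = uncens, weights = w_ac,
              id = id, corstr = "independence")
summary(msm)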





Exercises

A template Rmd is available here.

Setup

library(readr)
library(dplyr)
library(ggplot2)
library(geepack)

nhefs <- read_csv("https://cdn1.sph.harvard.edu/wp-content/uploads/sites/1268/1268/20/nhefs.csv")

Create a censoring indicator variable that will be TRUE if the individual was censored and FALSE otherwise.

nhefs$cens <- is.na(nhefs$wt82_71)

Throughout we’ll use the full nhefs dataset, and we’ll work from the DAG below (C has been conditioned on):



Part 1: Modeling to obtain weights

  1. Based on the DAG above, identify the set of variables Z that d-separates qsmk and weight gain.

  2. Fit an appropriate treatment propensity score model, and call this ps_treat_mod. For simplicity, assume that a quadratic relationship for the quantitative variables gives a good fit. Make sure to wrap categorical variables inside factor() in your model formula.

  3. Add a new variable to your dataset called weights_treat that contains the IP weights for treatment.

  4. Is it believable that quitting smoking (treatment) is a cause of censoring? How do the tabulations/calculations below help answer this?

    # P(censored | quitters) and P(censored | nonquitters)
    table(qsmk = nhefs$qsmk, cens = nhefs$cens)
    table(qsmk = nhefs$qsmk, cens = nhefs$cens) %>% prop.table(margin = 1)
  5. How would you see if the data support the age --> C and education --> C arrows? Describe, but don’t actually perform this analysis.

  6. Fit an appropriate model for the probability of censoring, and call this prob_cens_mod. For simplicity, assume that a quadratic relationship for the quantitative variables gives a good fit. Make sure to wrap categorical variables inside factor() in your model formula.

  7. Add a new variable to your dataset called weights_cens that contains the IP weights for censoring. (Note that in R, the default for logistic regression is to model the probability of the outcome variable equaling 1; see the short illustration after this list.)

  8. Add a final weight variable to your dataset called weight_TC.
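
To illustrate the note in question 7 about R’s default (a toy model with a single predictor, not the model you are asked to fit): glm() with family = binomial models the probability that the outcome equals 1 (or TRUE), so for the censoring indicator the fitted values are probabilities of being censored, and the probability of being uncensored is one minus that.

toy_mod <- glm(cens ~ qsmk, family = binomial, data = nhefs)
p_censored <- predict(toy_mod, type = "response")   # P(C = 1 | qsmk)
p_uncensored <- 1 - p_censored                      # P(C = 0 | qsmk)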




Part 2: Fitting MSMs

Fit the following two MSMs (A is qsmk, and for sex, 0 indicates males and 1 indicates females):

$$E[Y^{a, c=0}] = \beta_0 + \beta_1 a$$

$$E[Y^{a, c=0} \mid \text{sex}] = \beta_0 + \beta_1 a + \beta_2\, \text{sex} + \beta_3\, a \times \text{sex}$$

Note: In our previous analysis where we worked with nhefs_subs, we had actually fit the following:

$$E[Y^a \mid C=0] = \beta_0 + \beta_1 a$$

$$E[Y^a \mid C=0, \text{sex}] = \beta_0 + \beta_1 a + \beta_2\, \text{sex} + \beta_3\, a \times \text{sex}$$

Refit these models here using the relevant weights. How do your results compare?