Topic 2 Exchangeability
Learning Goals
- Define an average causal effect in terms of potential outcomes
- Apply the concepts of marginal and conditional exchangeability to answer questions about (hypothetical) data on potential outcomes
- Give examples of when marginal and conditional exchangeability would and would not hold in various data contexts
- Explain why a direct comparison of the outcomes in the treated and untreated is misleading as an estimate of a causal effect
Slides from today are available here.
Exercises
Exercise 1
Suppose that we are trying to understand the causal effect of a personal finance course on the percent of earnings left in savings each month (abbreviated as “percent savings”). For the 500 people who took the course, we are able to collect data on percent savings and various other factors. We are also able to collect the same information from 500 people who did not take the course.
Do you think that a comparison of percent savings in the course takers and non-takers would be a valid estimate of the average causal effect? Explain your viewpoint using the concepts of potential outcomes and exchangeability.
One important factor to consider is the number of children that an individual has. Explain how this factor could contribute to a lack of exchangeability in the outcomes of the course takers and non-takers. As part of your explanation, discuss how observed outcomes compare to the missing potential outcomes. State any assumptions you make about the relationships between different factors.
Do you think conditional exchangeability holds conditional on number of children?
- If yes, why?
- If no, what other factors might be needed to achieve conditional exchangeability? What is your thought process in thinking about other factors?
Exercise 2
While \(Y^a\) denotes the potential outcome under treatment \(A = a\), \(Y\) denotes the observed outcome. (For the treated, \(Y = Y^{a=1}\). For the untreated, \(Y = Y^{a=0}\).)
Is it possible for marginal exchangeability to hold but for \(Y\) and \(A\) to be dependent? Explain using a numerical or graphical example.
Exercise 3
We have the data below on number of children (\(Z\)), treatment (course takers: \(A = 1\). course non-takers: \(A = 0\)), and the percent savings outcome (\(Y^a\)) categorized as either high (H) or low (L).
Assuming that the course takers and non-takers are exchangeable conditional on number of children, estimate the average causal effect \(P(Y^{a=1} = \mathrm{high}) - P(Y^{a=0} = \mathrm{high})\).
\(n\) | \(Z\) | \(A\) | \(Y\) |
---|---|---|---|
10 | 2 | 1 | 7 H, 3 L |
10 | 1 | 1 | 6 H, 4 L |
10 | 0 | 1 | 5 H, 5 L |
30 | 2 | 0 | 18 H, 12 L |
40 | 1 | 0 | 20 H, 20 L |
50 | 0 | 0 | 20 H, 30 L |
Exercise 4
Assuming that we have exchangeability of the course takers and non-takers conditional on number of children (\(Z\)) and use of public transportation (\(W\)), how might a regression model be used to estimate the average causal effect of the personal finance course?
Exercise 5
Exchangeability is a core assumption in causal inference. Do you think it is possible to test this assumption with data? If yes, how? If not, why not?
Exercise 6 (Extra)
Consider the 4 scenarios below. For all scenarios, \(A\) represents a binary treatment, and \(Y\) represents the observed outcome (quantitative).
- Marginal exchangeability, independence of \(Y\) and \(A\)
- Marginal exchangeability, dependence of \(Y\) and \(A\)
- Lack of marginal exchangeability, independence of \(Y\) and \(A\)
- Lack of marginal exchangeability, dependence of \(Y\) and \(A\)
Draw plots of potential outcome distributions or create numerical examples that illustrate the following 4 scenarios.