Topic 14 Midterm Review

Core Themes

Misleading

  • Visualizations: How do different visualizations hide / fail to show certain aspects of the data?
  • Summary statistics: How do summary statistics hide / fail to show certain aspects of the data?
  • Coefficients in simple linear regression models
  • Multiple R-squared
  • Multicollinearity
  • Confounders and mediators



Prediction vs. Explanation

  • Metrics for predictive quality
  • Goal: prediction
    • Include all variables that actually predict the response
    • Does multicollinearity / redundancy matter?
    • Do confounders / mediators matter?
  • Goal: explanation
    • Does multicollinearity / redundancy matter?
    • Do confounders / mediators matter?





Additional Exercises

Exercise 1

Suppose that we want to know the causal effect of \(X\) on \(Y\), and there is a single confounder \(C\).

  • Can the model Y ~ X+C answer this research question?
  • Can the model Y ~ X*C answer this research question?

Make sure that you know how to interpret the coefficients in both of the above models regardless of the type of variable \(X\) and \(C\) are.



Exercise 2

Consider a categorical predictor of interest \(X\), response \(Y\), and a quantitative confounder \(C\). Consider the following 3 situations:

  • Situation 1: Confounding: no. Interaction: yes.
  • Situation 2: Confounding: yes. Interaction: no.
  • Situation 3: Confounding: yes. Interaction: yes.
  • Situation 4: Confounding: no. Interaction: no.

For each situation draw a single plot that clearly shows the situation. After you draw these single plots, think about what the plots would look like if the situation weren’t so clear and about what additional plots would be helpful in that circumstance.



Exercise 3

  1. Consider the process of performing bootstrapping. Identify all points in the process that could be performed incorrectly and describe the implications of such errors.

  2. We view our bootstrap confidence intervals as ranges of plausible values for the true population value. What is the rationale for this?