Schedule

Course Calendar

Readings in the schedule below refer to the following textbooks (freely available online):

Guiding questions for the readings are available at the bottom of this page.

Week Tuesday Thursday Announcements
1 9/5: Welcome! Meeting each other and designing our learning community
Before class: Review the syllabus and think about the questions posed in the green "Reflect" blocks.
9/7: Advanced visualization in ggplot
Before class: Review the construction of plots from STAT 112 and STAT 155. Answer the Guiding Questions at the bottom of this page.
Work on HW0 (your 10-year vision; it doesn't need to be turned in).
Look ahead to HW1.
2 9/12: Advanced map visualization
Before class: Watch this video on Coordinate Reference Systems, and answer the Guiding Questions at the bottom of this page.
9/14: Advanced map visualization (continued)
Turn in HW1 by midnight on Wed 9/13. Look ahead to HW2 due Wednesday 9/20 at midnight.
3 9/19: Interactive visualization
Before class: Listen to this podcast from Chapter 7 (timestamp 18:09) through Chapter 8 (ending at timestamp 25:27). Answer the Guiding Question at the bottom of this page. Install the "shiny" and "plotly" R packages.
9/21: Classroom Community and Connectedness (CC&C) Survey
For the first 30 minutes, we will move our course projects forward. In the last hour of class, CC&C facilitators will come in to run an activity on how community-building is going in our course.
Turn in HW2 by midnight on Wednesday 9/20. Look ahead to HW3.
4 9/26: Data wrangling: numbers, logicals, and dates
Helpful readings (read before or after class): (All from R4DS) Chapter 13 (Logicals), Chapter 14 (Numbers), and Chapter 18 (Dates/Times)
9/28: Data wrangling: strings
Helpful readings (read before or after class): (All from R4DS) Chapter 15 (Strings) and Chapter 16 (Regular Expressions)
Turn in HW3. Look ahead to HW4 (Project Milestone 2).
5 10/3: Data wrangling: factors
Helpful readings (read before or after class): Chapter 17 (Factors) (R4DS).
10/5: Writing functions
Helpful readings (read before or after class): R4DS Chapter 26 (Functions) and RPDS Section 13.1 (if-else).
Turn in Project Milestone 2 by either Wed 10/4 or Wed 10/11 at midnight. Start looking at Reflection 1.
6 10/10: Loops and iteration
Helpful readings (read before or after class): R4DS Chapter 27 (Iteration) and this tutorial.
10/12: Loops and iteration
Turn in HW4 (Project Milestone 2) by Wed 10/11 if you haven't already. Turn in Reflection 1 by Wed 10/11.
7 10/17: Data acquisition: APIs
Helpful readings (read before or after class):
10/19: Data acquisition: Scraping
Helpful readings (read before or after class): rvest vignette
Turn in HW5.
8 10/24: Project feedback
Project progress presentation #1 (Milestone 3): Your team will present a 5-7 minute progress report and plan for next steps.
10/26: No class - Fall Break 🍁
9 10/31: Review and practice (Day 1)
11/2: Review and practice (Day 2)
Work on review activities as part of HW6.
10 11/7: Data acquisition: databases
Required reading before class: R4DS Chapter 22 (Databases).
11/9: Data acquisition: databases (continued)
Work on HW7.
11 11/14: Missing data: wrangling and missingness mechanisms
Required reading before class: Chapters 1, 2, and 3 in The Missing Book
11/16: Missing data: imputation
Required reading before class: Chapters 11, 13, and 14 in The Missing Book
Turn in HW7 on Wednesday 11/15.
12 11/21: Project presentations
Project progress presentation #2 (Milestone 4): Your team will give a 10-minute presentation with intermediate results for 2 research questions.
11/23: No class - Thanksgiving Break 🦃
13 11/28: Project work time
11/30: Project work time
14 12/5: Project work time
12/7: Project work time
15 12/12: Project presentations (last day of class)

Guiding Questions

Do your best to answer guiding questions before the indicated class period. Responses don’t need to be turned in, but answering helps you prepare effectively for class.

11/7: Databases

Before class on Tuesday, install the DBI and duckdb packages. Post in the #questions channel on Slack if you run into issues.

As you read the Databases chapter of R4DS, answer the following questions:

  • What are the differences between client-server, cloud, and in-process database management systems (DBMSs)?
  • Make note of how the following functions are used in a database workflow. What does each function do? What inputs/arguments does each function require? Create a set of notes that shows the sequence of these functions in a database workflow.
    • DBI::dbConnect()
    • DBI::dbReadTable()
    • DBI::dbGetQuery()
    • tbl()
    • collect()
  • Make note of how the show_query() function can help you learn SQL (structured query language) by translating dplyr code into SQL.
    • As you read Section 22.5 (SQL), create notes that relate parts of SQL queries to dplyr functions.
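
To check your notes against a running example, here is a minimal sketch of one possible workflow using the functions listed above. It assumes an in-memory DuckDB database and uses the built-in mtcars data frame as a stand-in table; the table name, the query, and the object names are illustrative choices, not taken from the reading. It also assumes the dplyr and dbplyr packages are installed alongside DBI and duckdb.

```r
library(DBI)
library(dplyr)  # tbl(), collect(), show_query(); dbplyr must also be installed

# 1. Connect: duckdb::duckdb() with no arguments gives an in-process,
#    in-memory database (an assumption made for this sketch).
con <- DBI::dbConnect(duckdb::duckdb())

# Copy a built-in data frame into the database so there is a table to query.
DBI::dbWriteTable(con, "mtcars", mtcars)

# 2a. Read an entire table into R as a data frame.
cars_df <- DBI::dbReadTable(con, "mtcars")

# 2b. Or send a SQL query and get the result back as a data frame.
heavy_cars <- DBI::dbGetQuery(con, "SELECT mpg, wt FROM mtcars WHERE wt > 4")

# 3. Or create a lazy reference to the table and write dplyr code against it.
cars_tbl <- dplyr::tbl(con, "mtcars")
by_cyl <- cars_tbl |>
  group_by(cyl) |>
  summarize(mean_mpg = mean(mpg, na.rm = TRUE))

# 4. Print the SQL that the dplyr pipeline translates to.
show_query(by_cyl)

# 5. Run the query in the database and bring the result into R.
result <- collect(by_cyl)

DBI::dbDisconnect(con)
```

Comparing the output of show_query() with the pipeline in step 3 is one way to start relating SQL clauses to dplyr verbs.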

10/19: Web scraping

As you read the rvest package vignette, answer the following questions:

  • The first step of getting (scraping) data from an arbitrary web page is to read in the webpage. What rvest function(s) are relevant for this step?
  • Next we need to select the HTML elements that contain the information that we want. What function(s) are relevant here?
    • How would we select all elements that have the class “author”?
    • How would we select all level 1 headers (tag <h1>)?
  • Next we extract information from the selected HTML elements.
    • What is the difference between selecting the text contents of an HTML element and selecting an attribute of the HTML element? What functions are used for these two tasks?
    • How would we get all URLs for links that appear on a webpage?
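
After working through the questions above, you can compare your answers with this minimal sketch. It builds a small made-up page with rvest::minimal_html() so that it runs without an internet connection; the page content, names, and URLs are invented for illustration. For a real site, you would typically start from read_html() with the page's URL instead.

```r
library(rvest)

# A tiny made-up page (hypothetical content, for illustration only)
page <- minimal_html('
  <h1>Reading list</h1>
  <p class="author">Ada Lovelace</p>
  <p class="author">Grace Hopper</p>
  <a href="https://example.com/a">First link</a>
  <a href="https://example.com/b">Second link</a>
')

# Select elements by CSS class, then extract their text contents
authors <- page |>
  html_elements(".author") |>
  html_text2()

# Select elements by tag name
headers <- page |>
  html_elements("h1") |>
  html_text2()

# Select all links, then extract an attribute (href) rather than the text
urls <- page |>
  html_elements("a") |>
  html_attr("href")
```

Note the contrast in the last two steps: html_text2() returns what appears between an element's opening and closing tags, while html_attr() returns the value of a named attribute inside the opening tag.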

9/19: Interactive visualization

After listening to this podcast from Chapter 7 (timestamp 18:09) through Chapter 8 (ending at timestamp 25:27), reflect on the following question:

  • What was new, unexpected, or interesting in the discussion about animations, interactivity, and dashboards?

9/12: Advanced map visualization

After/while watching this video on Coordinate Reference Systems (CRS), answer the following questions:

  • What is the shape of the Earth?
  • Why is GDA94 a great datum name?
  • What are the two components of a CRS/GCS?
  • Why do we use many different local CRSs rather than just one CRS for the whole earth?
  • Why is it insufficient to identify a location by its latitude and longitude?
  • Why do we need to be mindful about CRSs when working with different spatial datasets?

9/7: Advanced visualization in ggplot

To review plot creation skills from STAT/COMP 112 and STAT 155, use the diamonds dataset in the ggplot2 package to recreate the following visualizations:

library(ggplot2)
data(diamonds)