Project work
Peer review of code
Class date: April 9, 2024
Plan: You’ll walk through your code so far step by step and demonstrate to another team that all steps are accurate.
Process:
- One team will talk through each section of code in their project so far and explain how they checked the accuracy of those steps.
- The other team will listen, ask questions, and provide suggestions. The listening team members should keep in mind the following characteristics from the Code quality and documentation section of the project rubric:
- Are functions used effectively to reduce code duplication?
- Do functions do a single small task?
- Are loops/purrr used appropriately to handle repeated tasks? (See the short sketch below.)
- Is there clear documentation in the text before and after code chunks of what is happening in each code chunk?
- Are code comments used to add clarity inside code chunks?
- When using a clean dataset for modeling and/or visualization, is it clear what the cases represent and what portion (possibly a subset) of the data is being used?
- Take notes on what you learn from peer feedback on your code and what you learn from observing your peer’s code. This will be part of the reflection for Skills Session 2.
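To make the "functions + purrr" criteria concrete, here is a minimal sketch of replacing copy-pasted blocks with one small function mapped over inputs. The summarize_year() helper, the my_data data frame, and its year and value columns are hypothetical stand-ins for your own project code.

```r
library(dplyr)
library(purrr)

# Hypothetical helper: does one small task (summarize a single year)
summarize_year <- function(data, which_year) {
  data |>
    filter(year == which_year) |>
    summarize(mean_value = mean(value, na.rm = TRUE), n = n())
}

# Instead of copy-pasting the same summary code for each year,
# map the helper over the years and row-bind the results
years <- 2020:2023
yearly_summaries <- map(years, \(y) summarize_year(my_data, y)) |>
  list_rbind()
```

Functions that each do one small task also make the walkthrough easier: the presenting team can check, and explain, one piece at a time.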
Data storytelling
Class date: April 11, 2024
Plan: We’ll look at some materials to provide inspiration and guidance for your projects.
Improving your data visualizations
Section 4.2.1 (Guidelines for good plots) presents 6 guidelines for creating great plots:
- Aim for high data density.
- Use clear, meaningful labels.
- Provide useful references.
- Highlight interesting aspects of the data.
- Consider using small multiples.
- Make order meaningful.
Although it’s not explicitly stated, an overarching theme is to facilitate comparisons.
- When you present your visualizations, what aspects is the viewer drawn to, and what do they want to compare?
- Make it as easy as possible to compare those things.
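As a sketch of how several of these guidelines can work together, here is an illustrative (not prescriptive) plot using the mpg data that ships with ggplot2:

```r
library(ggplot2)
library(forcats)

ggplot(mpg, aes(x = displ, y = hwy)) +
  geom_point(alpha = 0.6) +                     # high data density, light on decoration
  geom_hline(yintercept = mean(mpg$hwy),
             linetype = "dashed") +             # useful reference: overall mean
  facet_wrap(~ fct_reorder(class, hwy, .fun = median)) +  # small multiples, ordered meaningfully
  labs(
    x = "Engine displacement (liters)",         # clear, meaningful labels
    y = "Highway mileage (mpg)",
    title = "Highway mileage decreases with engine size across vehicle classes"
  )
```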
Example: grid of scatterplots
- Scatterplot grids are considered in Guideline 5: Consider using small multiples.
- What if the viewer ultimately wants to focus on the slope and the strength of the correlation?
- Then it might be more effective to summarize each scatterplot with the correlation coefficient, slope estimate, and slope confidence interval.
- geom_point() and geom_linerange()/geom_errorbar() would be effective here (see the sketch below).
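A minimal sketch of that summary plot, using mpg as stand-in data with class as the grouping variable (swap in your own data, grouping, and model):

```r
library(dplyr)
library(broom)
library(ggplot2)

# One slope estimate and confidence interval per group,
# instead of one full scatterplot per group
slopes <- mpg |>
  group_by(class) |>
  group_modify(~ tidy(lm(hwy ~ displ, data = .x), conf.int = TRUE)) |>
  ungroup() |>
  filter(term == "displ")

ggplot(slopes, aes(x = estimate, y = reorder(class, estimate))) +
  geom_point() +
  geom_errorbar(aes(xmin = conf.low, xmax = conf.high), width = 0.2) +
  geom_vline(xintercept = 0, linetype = "dashed") +  # reference: slope of 0
  labs(x = "Estimated slope of hwy vs. displ (95% CI)", y = NULL)
```

The per-group correlation coefficient could be computed similarly (e.g., summarize(r = cor(hwy, displ))) and shown as a text label or a second panel.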
Example: Majority of Biden’s 2020 Voters Now Say He’s Too Old to Be Effective
- What aspects of the plot reinforce the intended message?
Resources for sparking creativity and imagination in your plots
- Blog post: The 30 Best Data Visualizations of 2023
- The Pudding is a great data journalism site. Examples of articles with unique visualizations:
- Pockets: On the sizes of men’s and women’s pockets
- Making it Big: Exploring the trajectory of bands
- The Differences in How CNN, MSNBC, & FOX Cover the News
- The Physical Traits that Define Men and Women in Literature
- How News Media Covers Trump and Clinton: An analysis of images in news media
- Where Slang Comes From: Exploring the emergence of slang over time
- Visualizations from the New York Times
Your project in 3 visuals
Exercise: If your digital deliverable (whether blog post or Shiny app) could only show 3 visuals, what would they be? Why?
- What ideas do you have about the order of your visuals?
- What might you do to combine multiple visuals into one?
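One option for combining multiple visuals into a single figure is the patchwork package. A minimal sketch, where p1 and p2 stand in for plots you have already built:

```r
library(ggplot2)
library(patchwork)

# Stand-ins for two of your own plots
p1 <- ggplot(mpg, aes(displ, hwy)) + geom_point()
p2 <- ggplot(mpg, aes(class, hwy)) + geom_boxplot()

# patchwork composes ggplots: + places them side by side, / stacks them
(p1 + p2) +
  plot_annotation(
    title = "Two related views in one figure",
    tag_levels = "A"   # label panels A and B so your text can refer to them
  )
```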
Human-centered data science
Let’s take a moment to explore The Pudding’s 30 Years of American Anxieties.
- In what ways do these letters reveal essential context that would never be found in a dataset?
- What hidden context can you imagine for your dataset?
- What additional information could accompany your dataset to provide a fuller picture of the lived experiences of all those who may have been connected to the data?
- Who collected this data? Why? What might have been their agenda?
- How might the agendas of the data collectors have affected what data are available? In terms of:
- What cases are present in and absent from the data?
- What variables are available and in what format (e.g., categories)?
- Think about the labor involved in collecting your data. Whose labor is most visible and applauded? Whose labor is invisible?