AE 11: Opinion articles in The Chronicle

Part 1 - Data scraping

See chronicle-scrape.R for suggested scraping code.

Part 2 - Data analysis

Let’s start by loading the packages we will need:

library(tidyverse)

Load the data you saved in the data folder and name the resulting data frame chronicle.

chronicle <- read_csv("data/chronicle.csv")

Who are the most prolific authors of the 500 most recent opinion articles in The Chronicle?

chronicle |>
  count(author, sort = TRUE)

# A tibble: 193 × 2
   author             n
   <chr>          <int>
 1 Luke A. Powery    30
 2 Anna Garziera     25
 3 Aaron Siegle      23
 4 Monday Monday     23
 5 Advikaa Anand     15
 6 Nik Narain        14
 7 Leo Goldberg      13
 8 Remembrances      13
 9 Alex Berkman      11
10 Heidi Smith       10
# ℹ 183 more rows
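Some bylines in the output above, such as Remembrances, may be column or section names rather than individual people (that is an assumption, not something the data confirms). If so, a minimal sketch of excluding them before counting could look like the following. The toy data and the non_person_bylines vector are hypothetical; adjust the vector after inspecting your own results.

```r
library(tidyverse)

# Toy data standing in for chronicle (hypothetical values)
toy <- tibble(
  author = c(
    "Author A", "Author A", "Author B",
    "Remembrances", "Remembrances", "Remembrances"
  )
)

# Hypothetical list of bylines assumed not to be individual people
non_person_bylines <- c("Remembrances")

# Drop those bylines, then count articles per remaining author
top_people <- toy |>
  filter(!author %in% non_person_bylines) |>
  count(author, sort = TRUE)

top_people
```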

Draw a line plot of the number of opinion articles published per day in The Chronicle.

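chronicle |>
  # one row per publication date, with n = number of articles that day
  count(date) |>
  # group = 1 draws a single line through all dates
  ggplot(aes(x = date, y = n, group = 1)) +
  geom_line()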
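Note that count(date) only returns dates on which at least one article was published, so the line connects straight across days with no articles. As a sketch of one way to show those days as zeros, assuming date is stored as a Date, the toy example below fills the gaps with tidyr's complete() before plotting. The toy data is made up for illustration.

```r
library(tidyverse)

# Toy data: two articles on Jan 1, none on Jan 2-3, one on Jan 4
toy <- tibble(
  date = as.Date(c("2024-01-01", "2024-01-01", "2024-01-04"))
)

daily <- toy |>
  count(date) |>
  # add a row for every calendar day in the range; missing days get n = 0
  complete(
    date = seq(min(date), max(date), by = "day"),
    fill = list(n = 0L)
  )

daily |>
  ggplot(aes(x = date, y = n)) +
  geom_line()
```

Whether the zero-filled version is preferable depends on the question: it shows publishing gaps explicitly, while the original plot emphasizes only the days with articles.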

What percentage of the 500 most recent opinion articles in The Chronicle mention “climate” in their title?

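chronicle |>
  mutate(
    # lowercase the title so "Climate" and "climate" both match
    title = str_to_lower(title),
    # label each article by whether its title contains "climate"
    climate = if_else(
      str_detect(title, "climate"),
      "mentioned",
      "not mentioned"
    )
  ) |>
  count(climate) |>
  # proportion of all articles in each group; multiply by 100 for a percentage
  mutate(prop = n / sum(n))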

# A tibble: 2 × 3
  climate           n  prop
  <chr>         <int> <dbl>
1 mentioned         8 0.016
2 not mentioned   492 0.984
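So 8 of the 500 titles, or 1.6%, mention “climate”. If only the percentage is needed, a more compact sketch takes the mean of the logical vector returned by str_detect(), since TRUE counts as 1 and FALSE as 0. The toy titles below are made up; with the real data you would replace toy with chronicle.

```r
library(tidyverse)

# Toy titles standing in for chronicle$title (hypothetical values)
toy <- tibble(
  title = c(
    "Climate action now",
    "Campus dining review",
    "On climate anxiety",
    "Finals week"
  )
)

toy_summary <- toy |>
  summarize(
    # number of titles containing "climate", ignoring case
    n_mentioned = sum(str_detect(str_to_lower(title), "climate")),
    # mean of TRUE/FALSE is the proportion; times 100 gives a percentage
    pct_mentioned = 100 * mean(str_detect(str_to_lower(title), "climate"))
  )

toy_summary
```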

Come up with another question and try to answer it using the data.

# add code here