STA 199 Introduction to Data Science and Statistical Thinking

Below is a prospective outline for the course. Due dates are firm, but topics may change with advance notice.

WEEK DATE PREPARE TOPIC MATERIALS DUE
0 Wed, Jan 7
🧑‍🏫 Welcome! 🖥️ slides 00

Thu, Jan 8
💻 Lab 0 lab 0
1 Mon, Jan 12 📗 r4ds - intro
📘 ims - chp 1
🎥 Meet the toolkit :: R and RStudio
🎥 Meet the toolkit :: Quarto
🎥 Code along :: First data viz with UN Votes
🧑‍🏫 Meet the toolkit 🖥️ slides 01
🗒️ notes 01


Wed, Jan 14 📗 r4ds - chp 1
📘 ims - chp 4
🎥 Visualizing data
🎥 Building a plot step-by-step with ggplot2
🎥 Grammar of graphics
🎥 Code along :: First look at Palmer Penguins
🧑‍🏫 Grammar of data visualization 🖥️ slides 02
🗒️ notes 02


Thu, Jan 15
💻 Lab 1 lab 1 Lab 1 @ end of section
2 Mon, Jan 19
❌ MLK Day - No Lecture


Wed, Jan 21 📗 r4ds - chp 2
📗 r4ds - chp 3.1-3.5
🎥 Grammar of data transformation
🎥 Code along :: Flights and pipes
🧑‍🏫 Grammar of data transformation 🖥️ slides 03
🗒️ notes 03
💃 ae 03


Thu, Jan 22
💻 Lab 2 lab 2
hw 1
Lab 2 @ end of section
3 Mon, Jan 26 📗 r4ds - chp 3.6-3.7
🎥 Visualizing and summarizing categorical data
🎥 Visualizing and summarizing numerical data
🎥 Visualizing and summarizing relationships
🎥 Code along :: Star Wars characters
🧑‍🏫 Exploratory Data Analysis I 🖥️ slides 04
🗒️ notes 04


Wed, Jan 28 📘 ims - chp 5
📘 ims - chp 6
🎥 Code along :: Diving deeper with Palmer Penguins
🧑‍🏫 Exploratory Data Analysis II 🖥️ slides 05
🗒️ notes 05
🕺 ae 05
HW 1 @ 11:59 pm

Thu, Jan 29
💻 Lab 3 lab 3
hw 2
Lab 3 @ end of section
4 Mon, Feb 2 🎥 Tidy data
🎥 Tidying data
🎥 Code along :: Country populations over time
📗 r4ds - chp 5
🧑‍🏫 Tidying data 🖥️ slides 06
🗒️ notes 06
🧞 ae 06


Wed, Feb 4 🎥 Joining data
🎥 Code along :: Continent populations
📗 r4ds - chp 19.1-19.3
🧑‍🏫 Joining data 🖥️ slides 07
🗒️ notes 07
🦧 ae 07
HW 2 @ 11:59 pm

Thu, Feb 5
💻 Lab 4 lab 4
hw 3
Lab 4 @ end of section
5 Mon, Feb 9 🎥 Data types
🎥 Data classes
🎥 Code along :: That’s my type
📗 r4ds - chp 16
🧑‍🏫 Data types and classes 🖥️ slides 08
🗒️ notes 08
😭 follow-up


Wed, Feb 11
🧑‍🏫 More data types and classes 🖥️ slides 09
🗒️ notes 09
AE 08
HW 3 @ 11:59 pm

Thu, Feb 12
💻 Lab 5 lab 5
hw 4
Lab 5 @ end of section
6 Mon, Feb 16 🎥 Importing data
🎥 Code along :: Halving CO2 emissions
🎥 Code along :: Student survey
📗 r4ds - chp 7
📗 r4ds - chp 17.1 - 17.3
🧑‍🏫 Importing and recoding data 🖥️ slides 10
🗒️ notes 10
🐒 AE 09
AE 10


Wed, Feb 18 🎥 Web scraping basics
🎥 Code along :: Scraping an eCommerce page
📗 r4ds - chp 24.1 - 24.6
🎥 Code along :: Scraping many eCommerce pages
🎥 Web scraping considerations
📗 r4ds - chp 25.1 - 25.2
🧑‍🏫 Web scraping 🖥️ slides 11
🗒️ notes 11
🌐 AE 11
HW 4 @ 11:59 pm

Thu, Feb 19
💻 Project Milestone 1 + 2 🎉 Description
🪨 Milestone 1
🧗 Milestone 2

7 Mon, Feb 23
🧑‍🏫 Data science wrap-up 🖥️ slides 12
🗒️ notes 12


Tue, Feb 24


Project Proposal @ 11:59 pm

Wed, Feb 25 💫 study guide 📝 Midterm 1


Thu, Feb 26
❌ No lab!


Sun, Mar 1


Take-home 1 @ 11:59 pm
8 Mon, Mar 2 📘 ims - chp 6
📗 r4ds - chp 10
🧑‍🏫 Communicating results 🖥️ slides 13
🗒️ notes 13
🍊 AE 13
🀒 AE 14
Peer evaluation 1 @ 11:59 pm

Wed, Mar 4 🎥 Misrepresentation
🎥 Data privacy
🎥 Algorithmic bias
🎥 Code along :: Sectors and services
🧑‍🏫 Data science ethics 🖥️ slides 14 Milestone 3 @ 11:59 pm

Thu, Mar 5
💻 Project Milestone 4 🪨 Milestone 4 Milestone 4 @ end-of-lab
9 Mon, Mar 9
❌ Spring Break - No Lecture


Wed, Mar 11
❌ Spring Break - No Lecture


Thu, Mar 12
❌ Spring Break - No Lab

10 Mon, Mar 16 🎥 The language of models
🎥 Linear regression with a numerical predictor
📘 ims - chp 7.1
🧑‍🏫 The language of models
Peer evaluation 2 @ 11:59 pm

Wed, Mar 18 🎥 Linear regression with a categorical predictor
🎥 Outliers in linear regression
🎥 Code along :: Modeling fish
📘 ims - chp 7.2
🧑‍🏫 Simple linear regression


Thu, Mar 19
💻 Lab 6
Lab 6 @ end of section
11 Mon, Mar 23 🎥 Linear regression with multiple predictors
🎥 Main and interaction effects
📘 ims - chp 8.1-8.2
📘 ims - chp 8.3-8.5
🧑‍🏫 Multiple linear regression I


Wed, Mar 25 🎥 Code along :: Modeling interest rates 🧑‍🏫 Multiple linear regression II
HW 5 @ 11:59 pm

Thu, Mar 26
💻 Final Project Presentation


Sun, Mar 29


Final Project @ 11:59 pm
12 Mon, Mar 30 🎥 Logistic regression
🎥 Code along :: Building a spam filter
📘 ims - chp 9
🧑‍🏫 Logistic regression I
Peer evaluation 3 @ 11:59 pm

Wed, Apr 1 🎥 Classification and decision errors
🎥 Overfitting and spending your data
🧑‍🏫 Logistic regression II
HW 6 @ 11:59 pm

Thu, Apr 2
💻 Lab 7
Lab 7 @ end of section
13 Mon, Apr 6 🎥 Code along :: Forest classification 🧑‍🏫 Modeling wrap-up


Wed, Apr 8
📝 Midterm 2


Thu, Apr 9
❌ We’re all tired - No lab!


Sun, Apr 12


Take-home 2 @ 11:59 pm
14 Mon, Apr 13 🎥 Quantifying uncertainty
🎥 Bootstrapping
🎥 Code along :: Bootstrapping Duke Forest houses
📘 ims - chp 11
📘 ims - chp 12
🧑‍🏫 Interval estimation


Wed, Apr 15 🎥 Hypothesis testing
📘 ims - chp 11
🧑‍🏫 Hypothesis testing


Thu, Apr 16
💻 Lab 8
Lab 8 @ end of section
15 Mon, Apr 20
🧑‍🏫 More inference


Wed, Apr 22
🧑‍🏫 Farewell!
HW 7 @ 11:59 pm

Thu, Apr 23
❌ Reading Days - No Lab

16 Fri, May 1
📝 Final (2pm - 5pm)