STA 199 Introduction to Data Science and Statistical Thinking
Below is a prospective outline for the course. Due dates are firm, but topics may change with advance notice.
| WEEK | DATE | PREPARE | TOPIC | MATERIALS | DUE |
|---|---|---|---|---|---|
| 0 | Wed, Jan 7 | | Welcome! | slides 00 | |
| | Thu, Jan 8 | | Lab 0 | lab 0 | |
| 1 | Mon, Jan 12 | r4ds - intro; ims - chp 1; Meet the toolkit :: R and RStudio; Meet the toolkit :: Quarto; Code along :: First data viz with UN Votes | Meet the toolkit | slides 01; notes 01 | |
| | Wed, Jan 14 | r4ds - chp 1; ims - chp 4; Visualizing data; Building a plot step-by-step with ggplot2; Grammar of graphics; Code along :: First look at Palmer Penguins | Grammar of data visualization | slides 02; notes 02 | |
| | Thu, Jan 15 | | Lab 1 | lab 1 | Lab 1 @ end of section |
| 2 | Mon, Jan 19 | | MLK Day - No Lecture | | |
| | Wed, Jan 21 | r4ds - chp 2; r4ds - chp 3.1-3.5; Grammar of data transformation; Code along :: Flights and pipes | Grammar of data transformation | slides 03; notes 03; ae 03 | |
| | Thu, Jan 22 | | Lab 2 | lab 2; hw 1 | Lab 2 @ end of section |
| 3 | Mon, Jan 26 | r4ds - chp 3.6-3.7; Visualizing and summarizing categorical data; Visualizing and summarizing numerical data; Visualizing and summarizing relationships; Code along :: Star Wars characters | Exploratory Data Analysis I | slides 04; notes 04 | |
| | Wed, Jan 28 | ims - chp 5; ims - chp 6; Code along :: Diving deeper with Palmer Penguins | Exploratory Data Analysis II | slides 05; notes 05; ae 05 | HW 1 @ 11:59 pm |
| | Thu, Jan 29 | | Lab 3 | lab 3; hw 2 | Lab 3 @ end of section |
| 4 | Mon, Feb 2 | Tidy data; Tidying data; Code along :: Country populations over time; r4ds - chp 5 | Tidying data | slides 06; notes 06; ae 06 | |
| | Wed, Feb 4 | Joining data; Code along :: Continent populations; r4ds - chp 19.1-19.3 | Joining data | slides 07; notes 07; ae 07 | HW 2 @ 11:59 pm |
| | Thu, Feb 5 | | Lab 4 | lab 4; hw 3 | Lab 4 @ end of section |
| 5 | Mon, Feb 9 | Data types; Data classes; Code along :: That's my type; r4ds - chp 16 | Data types and classes | slides 08; notes 08; follow-up | |
| | Wed, Feb 11 | | More data types and classes | slides 09; notes 09; AE 08 | HW 3 @ 11:59 pm |
| | Thu, Feb 12 | | Lab 5 | lab 5; hw 4 | Lab 5 @ end of section |
| 6 | Mon, Feb 16 | Importing data; Code along :: Halving CO2 emissions; Code along :: Student survey; r4ds - chp 7; r4ds - chp 17.1 - 17.3 | Importing and recoding data | slides 10; notes 10; AE 09; AE 10 | |
| | Wed, Feb 18 | Web scraping basics; Code along :: Scraping an eCommerce page; r4ds - chp 24.1 - 24.6; Code along :: Scraping many eCommerce pages; Web scraping considerations; r4ds - chp 25.1 - 25.2 | Web scraping | slides 11; notes 11; AE 11 | HW 4 @ 11:59 pm |
| | Thu, Feb 19 | | Project Milestone 1 + 2 | Description; Milestone 1; Milestone 2 | |
| 7 | Mon, Feb 23 | | Data science wrap-up | slides 12; notes 12 | |
| | Tue, Feb 24 | | | | Project Proposal @ 11:59 pm |
| | Wed, Feb 25 | study guide | Midterm 1 | | |
| | Thu, Feb 26 | | No lab! | | |
| | Sun, Mar 1 | | | | Take-home 1 @ 11:59 pm |
| 8 | Mon, Mar 2 | ims - chp 6; r4ds - chp 10 | Communicating results | slides 13; notes 13; AE 13; AE 14 | Peer evaluation 1 @ 11:59 pm |
| | Wed, Mar 4 | Misrepresentation; Data privacy; Algorithmic bias; Code along :: Sectors and services | Data science ethics | slides 14 | Milestone 3 @ 11:59 pm |
| | Thu, Mar 5 | | Project Milestone 4 | Milestone 4 | Milestone 4 @ end-of-lab |
| 9 | Mon, Mar 9 | | Spring Break - No Lecture | | |
| | Wed, Mar 11 | | Spring Break - No Lecture | | |
| | Thu, Mar 12 | | Spring Break - No Lab | | |
| 10 | Mon, Mar 16 | The language of models; Linear regression with a numerical predictor; ims - chp 7.1 | The language of models | | Peer evaluation 2 @ 11:59 pm |
| | Wed, Mar 18 | Linear regression with a categorical predictor; Outliers in linear regression; Code along :: Modeling fish; ims - chp 7.2 | Simple linear regression | | |
| | Thu, Mar 19 | | Lab 6 | | Lab 6 @ end of section |
| 11 | Mon, Mar 23 | Linear regression with multiple predictors; Main and interaction effects; ims - chp 8.1-8.2; ims - chp 8.3-8.5 | Multiple linear regression I | | |
| | Wed, Mar 25 | Code along :: Modeling interest rates | Multiple linear regression II | | HW 5 @ 11:59 pm |
| | Thu, Mar 26 | | Final Project Presentation | | |
| | Sun, Mar 29 | | | | Final Project @ 11:59 pm |
| 12 | Mon, Mar 30 | Logistic regression; Code along :: Building a spam filter; ims - chp 9 | Logistic regression I | | Peer evaluation 3 @ 11:59 pm |
| | Wed, Apr 1 | Classification and decision errors; Overfitting and spending your data | Logistic regression II | | HW 6 @ 11:59 pm |
| | Thu, Apr 2 | | Lab 7 | | Lab 7 @ end of section |
| 13 | Mon, Apr 6 | Code along :: Forest classification | Modeling wrap-up | | |
| | Wed, Apr 8 | | Midterm 2 | | |
| | Thu, Apr 9 | | We're all tired - No lab! | | |
| | Sun, Apr 12 | | | | Take-home 2 @ 11:59 pm |
| 14 | Mon, Apr 13 | Quantifying uncertainty; Bootstrapping; Code along :: Bootstrapping Duke Forest houses; ims - chp 11; ims - chp 12 | Interval estimation | | |
| | Wed, Apr 15 | Hypothesis testing; ims - chp 11 | Hypothesis testing | | |
| | Thu, Apr 16 | | Lab 8 | | Lab 8 @ end of section |
| 15 | Mon, Apr 20 | | More inference | | |
| | Wed, Apr 22 | | Farewell! | | HW 7 @ 11:59 pm |
| | Thu, Apr 23 | | Reading Days - No Lab | | |
| 16 | Fri, May 1 | | Final (2pm - 5pm) | | |
