Welcome to statistical learning!

Introductions

About me

  • PhD in Statistical Science from Duke University, BA in Mathematics and Computer Science from Swarthmore College

  • Research interests: Bayesian hierarchical models for ecological applications

    • Developing models for single species or community abundances
  • Office: Warner 214

    • If my door is open, come on in! Also feel free to e-mail me.

    • Office hours: Tuesdays 3-4pm and by appointment via Calendly

  • Current hobbies: running, mushroom foraging

  • Aspirational hobbies/skills: fly fishing, driving stick shift

About you

  • Introduce yourself using any or all of the following (the first is mandatory):

    • Name
    • Year
    • Major/minor
    • Hobbies
    • How do you take your coffee?

Course details

About the course

  • Course website: https://math218-spring2023.github.io/ (please bookmark!)

  • Learn various models for regression and classification tasks (more on this next lecture)

    • Linear and logistic regression, KNN, decision trees + variants, K-means and hierarchical clustering

Necessary background

  • I assume you have taken Math 118 prior to this course, and are comfortable with tidyverse and RStudio. There is a large emphasis on computing.

    • I also assume you are comfortable with knitting. In this course, I ask that you knit to PDF.
  • We will learn how to code in base R, and by the end of the course you should feel comfortable switching between base R and tidyverse.

  • We will focus more on applications and developing intuition. The goal is to begin developing a toolbox of methods that you may use in future analyses.

Class Meetings

  • Lecture
    • Focus on concepts behind statistical learning techniques
    • Interactive lecture that includes examples and hands-on exercises
    • Bring fully-charged laptop to every lecture
      • Please let me know if you do not have access to a laptop
  • Lab
    • Typically occurs on Fridays
    • Focus on computing using functions provided in R packages
    • Apply concepts from lecture to case study scenarios
  • Implementation
    • Some days will be focused on implementing (i.e. coding by hand) methods discussed in lecture
    • Complete in small groups

Major assessments

  • One midterm with two components:
    • Computational component (take-home)
    • Oral component
  • Final project
    • Groups of 3-4 students (tentatively)
    • Presentations during last two days of class*
    • NO sit-down final

Grading

Assignments

  1. Labs (30%)
  2. Implementation deliverables (20%)
  3. Midterm (20%)
  4. Final project (25%)
  5. Participation (5%)

Important dates

  • Friday, 3/31: take-home midterm

  • Monday, 4/3: oral midterm

  • Friday, 4/14: Spring symposium (no class)

  • Monday, 4/17: final day to drop classes :(

  • Friday, 5/13 and Monday, 5/15*: project presentations

Excused Absences

  • Students who miss a class due to a scheduled varsity trip, religious holiday, or short-term illness should fill out the respective form.

    • These excused absences do not excuse you from assigned work.
  • If you have a personal or family emergency or chronic health condition that affects your ability to participate in class, please contact your academic dean’s office.

  • Exam dates cannot be changed and no make-up exams will be given.

Late Work and Regrade Requests

  • Homework assignments:

    • After the assigned deadline, there is a 10% penalty for each day the assignment is late
    • Please communicate with me early if you will need a homework extension!
  • Late work will not be accepted for the midterm or final project.

  • Regrade requests must be submitted within one week of when the assignment is returned

Academic Honesty and Reusing Code

  • All work for this class should be done in accordance with the Middlebury Honor code. Any violations will automatically result in a grade of 0 on the assignment and will be reported.

  • Unless explicitly stated otherwise, you may make use of online resources (e.g. StackOverflow) for coding examples on assignments. If you directly use code from an outside source (or use it as inspiration), you must or explicitly cite where you obtained the code. Any recycled code that is discovered and is not explicitly cited will be treated as plagiarism.

  • On individual assignments, you may discuss the assignment with one another; however, you may not directly share code or write up with other students. This includes copy-and-paste sharing, as well as showing your screen with the code displayed to another student.

  • On team assignments, you may not directly share code or write up with another team. Unauthorized sharing of the code or write up will be considered a violation for all students involved.

  • ChatGPT most likely will not be useful in this class. However, if you use it on an assignment, please let me know in what capacity you used it by including a comment in your assignment.

Inclusion

  • In this course, we will strive to create a learning environment that is welcoming to all students. If there is any aspect of the class that is not welcoming or accessible to you, please let me know immediately.

  • Additionally, if you are experiencing something outside of class that is affecting your performance in the course, please feel free to talk with me and/or your academic dean.

Your Turn!

Create a GitHub account

Go to https://github.com, and create an account (unless you already have one). After you create your account, click here and enter your GitHub username.

Tips for creating a username from Happy Git with R.

  • Incorporate your actual name!
  • Reuse your username from other contexts if you can.
  • Shorter is better than longer; be as unique as possible in as few characters as possible.
  • Avoid words laden with special meaning in programming, like NA.

“Coding” exercise

Let’s create the following plot together:

Data

cat_lovers %>%
  datatable(options = list(pageLength = 5))

Playing with base R

  • Create a folder on your desktop called Math218
  • Open RStudio and create a new Rmarkdown document.
    • We will work through some coding exercises. The associated code can be found in “Live Code 01”

TO DO

  • Enter your GitHub username into the Google form

    • You will soon receive and e-mail asking you to join our Math 218 GitHub organization. Please accept it!
  • Try the practice exercises involved with “Live code 01”