A course on how to leverage randomness to build fast algorithms for data science problems.
Instructor: R. Teal Witter. Please call me Teal.
Class Times: Tuesdays and Thursdays from 2:45 to 4:00pm in Kravis 164.
Office Hours: Mondays and Thursdays from 12:30 to 2:00pm in Adams 213.
Problem Sets: Problem sets are your primary opportunity to learn the material. You may work with others to solve the problems, but you must write up your solutions by yourself and explicitly acknowledge any outside help (e.g., websites, people, LLMs).
Quizzes: There will be short quizzes at the beginning of (randomly) selected classes. These quizzes will test your understanding of the problem sets and the concepts from the prior week.
Exams: The two midterm exams are the primary method of assessing your understanding of the material.
Project: The project offers a chance to explore an area that interests you, practice writing high quality code, and develop your ability to communicate technical ideas to an audience.
Resources: This class is based on Chris Musco’s phenomenal algorithmic machine learning and data science course at NYU. While we do not have a textbook, I have prepared written notes for every lecture.
| Week | Tuesday | Thursday | Slides | Assignments |
| Week 1 (1/20 and 1/22) | Math Review | Set Size Estimation | Slides | Problem 1 |
| Week 2 (1/27 and 1/29) | Frequent Items | Frequent Items | Slides | Problem 2, Problem 3 |
| Week 3 (2/3 and 2/5) | Distinct Elements | Distinct Elements | Slides | Problem 4, Problem 5 |
| Week 4 (2/10 and 2/12) | Load Balancing | Concentration Inequalities | Slides | Problem 6, Problem 7 |
| Week 5 (2/17 and 2/19) | High-Dimensional Geometry | High-Dimensional Geometry | Slides | Problem 8, Problem 9 |
| Week 6 (2/24 and 2/26) | Dimensionality Reduction | Dimensionality Reduction | Slides | Problem 10, Problem 11 |
| Week 7 (3/3 and 3/5) | Similarity Estimation | Midterm Exam | Slides | |
| Week 8 (3/10 and 3/12) | Similarity Estimation | Singular Value Decomposition | Slides | Problem 12, Problem 13 |
| Week 9 (3/17 and 3/19) | Spring Break (No Class) | Spring Break (No Class) | | |
| Week 10 (3/24 and 3/26) | Singular Value Decomposition | Power Method | Slides | Problem 14, Problem 15 |
| Week 11 (3/31 and 4/2) | Load Balancing at Databricks (Suyog Soti) | Power Method | Slides | Problem 16 |
| Week 12 (4/7 and 4/9) | Spectral Graph Theory | Spectral Graph Theory | Slides | Problem 17, Problem 18 |
| Week 13 (4/14 and 4/16) | Sketched Regression | Sketched Regression | Slides | Problem 19, Problem 20 |
| Week 14 (4/21 and 4/23) | Project/Exam Preparation | Midterm Exam | | Project Proposal (TeX) |
| Week 15 (4/28 and 4/30) | Fast JL Transform | Explainable AI | Slides | Problem 21, Problem 22 |
| Week 16 (5/5 and 5/7) | Explainable AI | Reading Day (No Class) | | Final Project (TeX) |
| Week 17 (5/12 and 5/14) | Finals (No Class) | Project Presentation 2-5pm on Friday 5/15 | | |