Spring 2026

Randomized Algorithms for Data Science

A course on how to leverage randomness to build fast algorithms for data science problems.


Instructor: R. Teal Witter. Please call me Teal.

Class Times: Tuesdays and Thursdays from 2:45 to 4:00pm in Kravis 164.

Office Hours: Mondays and Thursdays from 12:30 to 2:00pm in Adams 213.

Problem Sets: Problem sets are your primary opportunity to learn the material. You may work with others to solve the problems, but you must write up your solutions by yourself and explicitly acknowledge any outside help (e.g., websites, people, LLMs).

Quizzes: There will be short quizzes at the beginning of (randomly) selected classes. These quizzes will test your understanding of the problem sets and the concepts from the prior week.

Exams: The two midterm exams are the primary method of assessing your understanding of the material.

Project: The project offers a chance to explore an area that interests you, practice writing high-quality code, and develop your ability to communicate technical ideas to an audience.

Resources: This class is based on Chris Musco’s phenomenal Algorithmic Machine Learning and Data Science course at NYU. While we do not have a textbook, I have prepared written notes for every lecture.

Week | Tuesday | Thursday | Slides | Assignments
Streaming & Sketching
Week 1 (1/20 and 1/22) | Math Review | Set Size Estimation | Slides | Problem 1
Week 2 (1/27 and 1/29) | Frequent Items | Frequent Items | Slides | Problem 2, Problem 3
Week 3 (2/3 and 2/5) | Distinct Elements | Distinct Elements | Slides | Problem 4, Problem 5
Week 4 (2/10 and 2/12) | Load Balancing | Concentration Inequalities | Slides | Problem 6, Problem 7
Week 5 (2/17 and 2/19) | High-Dimensional Geometry | High-Dimensional Geometry | Slides | Problem 8, Problem 9
Week 6 (2/24 and 2/26) | Dimensionality Reduction | Dimensionality Reduction | Slides | Problem 10, Problem 11
Week 7 (3/3 and 3/5) | Similarity Estimation | Midterm Exam | Slides |
Week 8 (3/10 and 3/12) | Similarity Estimation | Singular Value Decomposition | Slides | Problem 12, Problem 13
Week 9 (3/17 and 3/19) | Spring Break (No Class) | Spring Break (No Class) | |
Linear Algebra & Spectral Methods
Week 10 (3/24 and 3/26) | Singular Value Decomposition | Power Method | Slides | Problem 14, Problem 15
Week 11 (3/31 and 4/2) | Load Balancing at Databricks (Suyog Soti) | Power Method | Slides | Problem 16
Week 12 (4/7 and 4/9) | Spectral Graph Theory | Spectral Graph Theory | Slides | Problem 17, Problem 18
Week 13 (4/14 and 4/16) | Sketched Regression | Sketched Regression | |
Week 14 (4/21 and 4/23) | Project/Exam Preparation | Midterm Exam | |
Week 15 (4/28 and 4/30) | Explainable AI | Explainable AI | |
Week 16 (5/5 and 5/7) | TurboQuant | Reading Day (No Class) | |
Week 17 (5/12 and 5/14) | Finals (No Class) | Project Presentation 2-5pm | |