Instructors
Adam Klivans and Pradeep Ravikumar

Office Hours
ACES 2.434, Mondays 3:30-5:00 pm (by appointment)

Overview
A central problem in machine learning is to develop algorithms that have provable guarantees in terms of both running time and number of "training" observations required. Computational Learning Theory has traditionally focused on the first issue (the computational complexity of learning algorithms) while Statistical Learning Theory has focused on the second (their statistical efficiency). In this course we will cover both these aspects, and try to understand how learning is constrained given limited computation and limited data.
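To make the two kinds of guarantees concrete, here is a standard textbook illustration (not tied to any particular lecture): in the realizable PAC model with a finite hypothesis class H, any hypothesis consistent with

    m \ge \frac{1}{\varepsilon}\left(\ln|H| + \ln\frac{1}{\delta}\right)

i.i.d. training examples has true error at most \varepsilon with probability at least 1 - \delta. The statistical question is how small m can be; the computational question is whether a consistent hypothesis can actually be found in time polynomial in the relevant parameters.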

Grading
Four problem sets (3/4 of the final grade) and a final paper presentation (1/4 of the final grade).

(Optional) Textbooks
An Introduction to Computational Learning Theory. Michael Kearns and Umesh Vazirani.
A Probabilistic Theory of Pattern Recognition. Luc Devroye, László Györfi, and Gábor Lugosi.
The Elements of Statistical Learning: Data Mining, Inference, and Prediction. Trevor Hastie, Robert Tibshirani, and Jerome Friedman.

Homeworks

(Tentative) Schedule

Module | Date | Topic | Notes | Faculty
1,2 | | ... | ... | AK
| | ... | ... | AK
| | Fourier Learning | Papers: [1] [2] [3] | Guest Lect: Homin Lee
3: Statistical Analysis Foundations (2 weeks). Consistency, Convergence Rates, Generalization Error. | | Useful Inequalities | DGL; Chapter 8 | PR
| | Glivenko-Cantelli Theorem | DGL; Chapter 12 | PR
| | V-C Theorem | DGL; Chapter 12 (a sketch of the statement follows the schedule) | PR
4: Complexity of Learning (2 weeks). VC Theory, Metric Entropy, Rademacher Complexity, Margin Bounds. | | Shatter Coefficients, V-C Dimension | DGL; Chapter 13 | PR
| | Metric Entropy | DGL; Chapter 28 | PR
| | Uniform Deviations of Averages from Expectations | DGL; Chapter 29 | PR
| | Rademacher Complexity | | Guest Lect: Ambuj Tewari
5: Misc | | Surrogate Losses for Classification | Paper: [1] | PR
| | Reproducing Kernel Hilbert Spaces | Book: [1] | PR
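
To give a flavor of the uniform-deviation results in modules 3 and 4, the Vapnik-Chervonenkis theorem (DGL, Chapter 12) can be stated, up to universal constants c_1, c_2 > 0, as

    \Pr\left\{ \sup_{A \in \mathcal{A}} \left| \nu_n(A) - \nu(A) \right| > \varepsilon \right\} \le c_1 \, s(\mathcal{A}, n) \, e^{-c_2 n \varepsilon^2},

where \nu_n is the empirical measure of n i.i.d. samples, \nu is the underlying measure, and s(\mathcal{A}, n) is the shatter coefficient of the class \mathcal{A}; the exact constants appear in DGL, Chapter 12.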