Logistics: |
Tue/Thu 3:30 - 5:00
GDC 5.304
Unique Number: 51810
Course web page:
http://www.cs.utexas.edu/~ecprice/courses/sublinear/
|
Professor: |
Eric Price
Email: ecprice@cs.utexas.edu
Office: GDC 4.510
Office Hours: Wednesday 3-4pm
|
TA: |
Zhao Song
Email: zhaos@utexas.edu
Office Hours: TBA
|
Content: |
This graduate course will study algorithms that can process
very large data sets. In particular, we will consider
algorithms for:
- Data streams, where you don't have enough space to
store all the data being generated.
- Property testing, where you don't have enough time to
look at all the data.
- Compressed sensing, where you don't have enough
measurement capacity to observe all the data.
|
Useful References: |
The last instantiation of this class was similar to this one. Other similar
courses include Sublinear Algorithms (at MIT), Algorithms for Big Data (at
Harvard), and Sublinear Algorithms for Big Datasets (at the University of
Buenos Aires).
|
Problem Sets: |
Problem sets are due every other week at the beginning of class. Typewritten solutions are preferred.
- Problem Set 1. Due September 15.
- Problem Set 2. Due October 4.
- Problem Set 3. Due October 20.
- Problem Set 4. Due November 8.
- Problem Set 5. Due November 22.
|
Lectures: |
- Thursday, August 25. Course overview; basic uniformity testing. [Lecture notes (pdf) (tex)]
- Tuesday, August 30. Concentration inequalities; distinct elements. [Lecture notes (pdf) (tex)]
- Thursday, September 1. More distinct elements algorithms and lower bounds. [Lecture notes (pdf) (tex)]
- Tuesday, September 6. Concentration of measure. [Lecture notes (pdf) (tex)] [scratch]
- Thursday, September 8. Subgamma variables; Johnson-Lindenstrauss. [Lecture notes (pdf) (tex)]
- Tuesday, September 13. Count-Min sketch. [Lecture notes (pdf) (source)] [scratch]
- Thursday, September 15. Count-Sketch. [Lecture notes (pdf) (tex)]
- Tuesday, September 20. L0 sampling; exact sparse recovery. [Lecture notes (pdf) (tex)] [scratch]
- Thursday, September 22. Graph sketching. [Lecture notes (pdf) (source)] [scratch]
- Tuesday, September 27. Coresets. [Lecture notes (pdf) (tex)] [scratch]
- Thursday, September 29. Cauchy distribution; Fp moment estimation. [Lecture notes (pdf) (tex)]
- Tuesday, October 4. Fp moment estimation lower bounds; packing/covering numbers. [Lecture notes (pdf) (tex)]
- Thursday, October 6. Maurey's empirical method; Restricted Isometry Property. [Lecture notes (pdf) (tex)]
- Thursday, October 13. Proving the RIP; iterative hard thresholding. [Lecture notes (pdf) (tex)]
- Tuesday, October 18. Model-based compressive sensing. [Lecture notes (pdf) (tex)]
- Thursday, October 20. L1 minimization. [Lecture notes (pdf) (tex)]
- Tuesday, October 25. Lower bounds for sparse recovery. [Lecture notes (pdf) (tex)]
- Thursday, October 27. Adaptive sparse recovery. [Lecture notes (pdf) (tex)]
- Tuesday, November 1. RIP-1; SSMP. [Lecture notes (pdf) (tex)]
- Thursday, November 3. Fourier uncertainty principle; symmetrization; Dudley's entropy integral; start Fourier RIP. [Lecture notes (pdf) (tex)]
- Tuesday, November 8. Finish Fourier RIP. [Lecture notes (pdf) (tex)]
- Thursday, November 10. Property testing: monotonicity, grids.
- Tuesday, November 15. Property testing on graphs.
- Thursday, November 17. Distribution testing: uniformity, identity.
- Tuesday, November 22. Distribution testing: identity of pairs of distributions; independence.
The tentative outline for the course is as follows:
- Uniformity testing
- Concentration inequalities and Johnson-Lindenstrauss
- Distinct elements counting
- Heavy hitters
- Graph sketching
- Compressed sensing
- Model-based compressed sensing
- Sparse Fourier transforms
- Property testing
- Other streaming models: random order, distributional
|
Prerequisites: |
Mathematical maturity and comfort with undergraduate algorithms and
basic probability. Ideally also familiarity with linear algebra.
|
Grading: |
Grades will be based on the following weighting of class components:
40%: Homework
30%: Final project
20%: Scribing lectures
10%: Participation
Plus and minus modifiers will not appear in the final grade.
|
Scribing: |
In each class, two students will be assigned to take notes.
These notes should be written up in a standard LaTeX format
before the next class.
|
Homework policy: |
There will be a homework assignment roughly every two weeks.
Collaboration policy: You are encouraged to
collaborate on homework. However, you must write up your own
solutions. You should also state the names of those you
collaborated with on the first page of your submission.
|
Final project: |
In lieu of a final exam, students will complete final
projects. These may be done individually or in groups of
2-3. An ideal final project would present a piece of
original research on a topic related to the course. Failing
that, one may write a literature survey covering several
research papers in the field.
Students will present their results to the class during
the last week of classes. The final paper will be due on
the scheduled final exam day.
|
Students with Disabilities: |
Any student with a documented disability (physical or
cognitive) who requires academic accommodations should contact the
Services for Students with Disabilities area of the Office of the
Dean of Students at 471-6259 (voice) or 471-4641 (TTY for users
who are deaf or hard of hearing) as soon as possible to request an
official letter outlining authorized accommodations.
|