This semester, the course will focus on (i) how to exploit parallelism for machine learning and big-data applications, and (ii) how to exploit approximation to reduce power and energy consumption. There is a lot of current research in both the systems and machine learning communities on these topics, and a variety of domain-specific languages (DSLs) and implementations for these domains have been proposed recently for both shared-memory and distributed-memory architectures.
Topics include the following:
- Structure of parallelism and locality in important algorithms in computational science and machine learning
- Algorithm abstractions: operator formulation of algorithms, dependence graphs
- Multicore architectures: interconnection networks, cache coherence, memory consistency models, synchronization
- Scheduling and load-balancing
- Parallel data structures: lock-free data structures, array/graph partitioning
- Memory hierarchies and locality, cache-oblivious algorithms
- Compiler analysis and transformations
- Performance models: PRAM, BPRAM, LogP
- Self-optimizing software, auto-tuning
- GPUs and GPU programming
- Case studies: Cilk, MPI, OpenMP, MapReduce, Galois, GraphLab (a minimal OpenMP sketch appears after this list)
- Approximate computing for power and energy optimization
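
To give a concrete flavor of the shared-memory programming models that appear in the case studies, the sketch below shows a data-parallel loop written with OpenMP in C++ (a language assumed by the prerequisites). The saxpy-style kernel, problem size, and build command are illustrative choices, not part of the course material.

```cpp
// A minimal sketch of shared-memory loop parallelism with OpenMP; the kernel
// and problem size are illustrative only.
// Build with: g++ -fopenmp saxpy.cpp -o saxpy
#include <cstdio>
#include <vector>
#include <omp.h>

int main() {
    const long n = 1 << 20;                    // illustrative problem size
    std::vector<double> x(n, 1.0), y(n, 2.0);
    const double a = 3.0;

    // Each iteration is independent, so OpenMP can split the loop across threads.
    #pragma omp parallel for
    for (long i = 0; i < n; ++i) {
        y[i] = a * x[i] + y[i];
    }

    std::printf("y[0] = %.1f (using up to %d threads)\n", y[0], omp_get_max_threads());
    return 0;
}
```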
Students will present papers, participate in discussions, and do a substantial final project. The readings will include some of the classic papers in the field of parallel programming. In addition, there will be a small number of programming assignments and homework exercises at the beginning of the semester. Some of the lectures will be given by Inderjit Dhillon and Pradeep Ravikumar, who are experts in machine learning.
Prerequisites:
Programming maturity, knowledge of C/C++, and basic courses on modern computer architecture and compilers. For background on computer architecture, see the text by Hennessy & Patterson (Morgan Kaufmann Publishers); for background on compilers, read "Optimizing Compilers for Modern Architectures" by Allen and Kennedy.
Lecture schedule and notes