Lecture Schedule
January
22
Course overview
24
Basics of computer architecture (I): pipelined processors
29
Basics of computer architecture (II): OOO execution processors
Another useful set of slides on OOO processors
Lectures from the ECE architecture course
31
Measurements: timing and PAPI counters
February
5
Basics of modern compilers (Areg Melik-Adamyan)
7
Sources of parallelism and locality in important algorithms (4 lectures):
Graph algorithms
Additional reading: The Tao of Parallelism in Algorithms, Pingali et al., PLDI 2011.
12/14
Computational science algorithms
Video of Miss Marple solving differential equations
19
Dependences, dependence graphs, work/span, scheduling
21
Cache architecture and memory hierarchy (Areg Melik-Adamyan)
26
Locality, loop and data transformations
28
Case study of locality enhancement: GEMM and ATLAS
March
5
Intel VTune (I) for cache performance analysis (Jackson Marusarz)
7/12/14
Vectorization (3 lectures) (Pablo Reble)
19/21
Spring break
26
Shared-memory architectures: cache coherence
28
Pthreads programs (2 lectures)
April
2
4
Memory consistency
9/11
OpenMP (2 lectures) (Mike Voss)
16
Intel Threading Building Blocks (Mike Voss)
18
Intel Advisor and VTune (II) (Jackson Marusarz)
23
Case study of shared-memory parallelization (Jackson Marusarz)
25
MPI - 1 (Hajime Fujita)
30
MPI - 2 (Hajime Fujita)
May
2/7
Intro to GPU programming (Martin Burtscher)
Connected component implementation for GPUs (Martin Burtscher)
Maximal independent set algorithm for GPUs (Martin Burtscher)
Parallel prefix sums and scans (Martin Burtscher)
9
Final exam review