Lecture Schedule

January
 
  22            Course overview

  24            Basics of computer architecture (I): pipelined processors
 
  29            Basics of computer architecture (II): OOO execution processors
                   Another useful set of slides on OOO processors
                   Lectures from the ECE architecture course
 
  31            Measurements: timing and PAPI counters

February
  
  5             Basics of modern compilers (Areg Melik-Adamyan)

  7             Sources of parallelism and locality in important algorithms (4 lectures):
                          Graph algorithms
                          Additional reading: The Tao of Parallelism in Algorithms, Pingali et al., PLDI 2011.
                    
  12/14             Computational science algorithms
                          Video of Miss Marple solving differential equations

  19            Dependences, dependence graphs, work/span, scheduling

  21            Cache architecture and memory hierarchy (Areg Melik-Adamyan)

  26            Locality, loop and data transformations

  28            Case study of locality enhancement: GEMM and ATLAS

March
  5             Intel VTune (I) for cache performance analysis (Jackson Marusarz)

 
  7/12/14       Vectorization (3 lectures) (Pablo Reble)
 

 19/21       Spring break

  26            Shared-memory architectures: cache coherence

  28            Pthreads programs (2 lectures)

April
         
  2            

 

  4             Memory consistency

  9/11          OpenMP (2 lectures) (Mike Voss)
 
  16           Intel Threading Building Blocks (Mike Voss)

  18            Intel Advisor and VTune (II) (Jackson Marusarz)

  23          Case study of shared-memory parallelization (Jackson Marusarz)

  25           MPI - 1 (Hajime Fujita)

  30           MPI - 2 (Hajime Fujita)

May

 
  2/7           Intro to GPU programming (Martin Burtscher)
                 Connected component implementation for GPUs (Martin Burtscher)
                 Maximal independent set algorithm for GPUs (Martin Burtscher)
                 Parallel prefix sums and scans (Martin Burtscher)

  9             Final exam review