Course objectives: To obtain the high level of end-to-end performance needed in problem domains like graphics, computer games, and machine learning, it is necessary for programs to exploit many of the features of modern computer architectures. In this course, we will study the performance-critical features of modern computer architectures, and discuss how applications can take advantage of them to obtain high performance. This is not a course on software tricks; rather, the emphasis is on abstractions of computer architecture, understanding performance, and obtaining performance when you need it.
Topics covered in lecture include the following:
- Analysis of applications that need high end-to-end performance
- Understanding performance: performance models, Amdahl's law
- Measurement and design of computer experiments
- Micro-benchmarks for abstracting performance-critical aspects of computer systems
- Memory hierarchy: caches, virtual memory, exploiting spatial and temporal locality
- Vectors and vectorization
- GPUs and GPU programming
- Multi-core processors and shared-memory programming, OpenMP
- Distributed-memory machines and message-passing programming, MPI
- Self-optimizing software
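
As a small illustration of the performance-model topic above, Amdahl's law can be sketched in a few lines of C++. This sketch is not taken from the course materials. The function name `amdahl_speedup` and its parameter names are hypothetical, chosen only for this example. It assumes the standard form of the law: if a fraction p of the run time can be sped up by a factor n, the overall speedup is 1 / ((1 - p) + p / n).

```cpp
#include <cassert>

// Amdahl's law (illustrative sketch; names are hypothetical).
// parallel_fraction: share of the original run time that benefits
//                    from the speedup, in the range [0, 1].
// factor:            how much faster that share runs (for example,
//                    the number of cores), at least 1.
// Returns the overall speedup of the whole program.
double amdahl_speedup(double parallel_fraction, double factor) {
    assert(parallel_fraction >= 0.0 && parallel_fraction <= 1.0);
    assert(factor >= 1.0);
    const double serial_fraction = 1.0 - parallel_fraction;
    return 1.0 / (serial_fraction + parallel_fraction / factor);
}
```

For example, with 90% of the run time parallelizable, `amdahl_speedup(0.9, 8.0)` is about 4.7, and no factor, however large, pushes the speedup past 10, because the remaining 10% of serial work dominates.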
Prerequisites: Programming maturity, knowledge of C/C++, and a basic course on modern computer architecture.
Course work: There will be six substantial programming assignments (60% of grade), a mid-semester exam (15% of grade), and a final exam (25% of grade).