CS 377P: Programming for Performance
Assignment 1: Performance counters
Due date: September 6, 2023, 10:00PM
Late submission policy: Submissions can be at most 1 day
late. There will be a 10% penalty for late submissions.
Description
1) Write C code for the 6 variants of matrix-matrix multiply
(MMM) you can generate by permuting loops in the standard
three-nested loop version of MMM. The matrix elements should be
of type double.
Hint: To check cache sizes on the
machine, run: lscpu
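As a starting point, here is a minimal sketch of one of the six loop orders (i-j-k). It assumes square n x n matrices stored in row-major order in flat arrays and computes C += A * B; the function name mmm_ijk and its parameters are illustrative choices, not something the assignment prescribes.

```c
#include <assert.h>
#include <stddef.h>

/* C += A * B for n x n row-major matrices of doubles, i-j-k loop order.
 * Assumption: the caller has initialized c (for example, to all zeros).
 * The names mmm_ijk, a, b, c and n are illustrative only. */
void mmm_ijk(size_t n, const double *a, const double *b, double *c)
{
    assert(a != NULL && b != NULL && c != NULL);
    for (size_t i = 0; i < n; i++)
        for (size_t j = 0; j < n; j++)
            for (size_t k = 0; k < n; k++)
                c[i * n + j] += a[i * n + k] * b[k * n + j];
}
```

The other five variants keep the loop body unchanged and only reorder the three for statements. Each order walks a, b and c with different strides, which is what the cache-related counters should make visible.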
2) Answer the following questions, using a few sentences for each
one.
- What are data and control dependences? Give simple
examples to illustrate these concepts.
- Explain out-of-order execution and in-order
retirement/commit. Why do high-performance processors
execute instructions out of order but retire them in order?
What hardware structure(s) are used to implement in-order
retirement?
- Consider the out-of-order execution model described in
lecture (ROB+register renaming). Since there are a limited
number of physical registers, the processor must determine
when it is safe to reallocate a physical register to hold
another value. Explain briefly how a processor might do this.
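To make the terms in the first question concrete, the following small C fragment labels one data dependence and one control dependence. The function dependence_demo and the statement labels S1 to S4 are hypothetical and exist only for illustration; your own answer should use your own examples.

```c
/* Hypothetical function written only to label dependences. */
int dependence_demo(int x, int y)
{
    int a = x + y;   /* S1: writes a                                      */
    int b = a * 2;   /* S2: reads a, so S2 has a flow (read-after-write)  */
                     /*     data dependence on S1                         */
    if (b > 10)      /* S3: branch whose outcome depends on b             */
        b = b - 10;  /* S4: control dependent on S3, because it executes  */
                     /*     only when the condition in S3 is true         */
    return b;
}
```

S2 cannot start before S1 has produced a, regardless of how many functional units are free. S4 can only be known to be needed once S3 resolves, which is why processors that want to run ahead have to predict the branch.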
Deliverables
Submit (on Canvas) the following two files:
- A .tar.gz file with your code, a README.txt and a
Makefile.
- The README.txt describes how to run your program and what
the output will be. A reasonable output will be pairs of
"name of measured event, value".
- With the Makefile, your code should compile on the 5
CS machines by running only "make".
- A report (in .pdf) containing the tables, and the answers to
the questions in both parts.
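One possible way to produce the "name of measured event, value" output described above is sketched below. The helper format_event is hypothetical, and the value in the usage note is a placeholder rather than a measurement. The value is typed long long because PAPI reports counter values as long long.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: writes one "event name, value" line into buf.
 * Returns what snprintf returns, i.e. the number of characters the
 * full line needs, not counting the terminating null byte. */
int format_event(char *buf, size_t len, const char *event_name,
                 long long value)
{
    return snprintf(buf, len, "%s, %lld\n", event_name, value);
}
```

For example, calling format_event with the preset name "PAPI_TOT_CYC" and the placeholder value 12345 fills buf with the line "PAPI_TOT_CYC, 12345". Printing one such line per event keeps the output easy to paste into the tables in the report.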
Grading
Code: 40 points
Measurements (plots): 30 points
Explanation: 10 points
Answers to short questions in (2): 20 points
PAPI:
To see which PAPI counters are available on a host, run:
papi_avail
To see which PAPI counters can be collected at the same time,
run:
papi_event_chooser
The PAPI website has a lot of information: https://icl.utk.edu/papi/.
Here's the PAPI users guide:
https://icl.utk.edu/projects/papi/files/documentation/PAPI_USER_GUIDE_23.htm
"Warning! num_cntrs is more than num_mpx_cntrs" can be ignored.
ICC:
To run ICC on the indicated CS machines, run:
. /opt/intel/oneapi/setvars.sh
icc [compiler commands]
To check the availability of icc, run:
icc -v