CS 394P: Data Flow Analysis
Due: February 21, 2005.
This assignment is to write some basic programs for data flow analysis.
Examine the code in the file dataflow.lsp . This file
contains some helper functions and some test data.
We will represent a basic block as a node containing code as a
progn of statements. Each node also has a node number, the
numbers of its successor nodes, and places to hold the data flow
information. Data flow information will be represented as sets of
symbols, where each symbol is either a variable or a compiler variable
that represents a subexpression. You can use the Lisp functions
union, intersection, and set-difference
to manipulate these sets.
- Call the function (initexprs) once to initialize the
expression table, *exprs* . (initexprs) will
associate each subexpression with a compiler variable, which will
be a generated symbol. After calling (initexprs), you
can look at *exprs* to see what it produced.
- Write a function (comp-kill node) to create
the computed and killed sets due to
the code in a node, and save the results in the node data structure.
You can use (setf (computed node) ...) and
(setf (killed node) ...) to store the results.
Each of these sets will be represented as a list of symbols.
- Write a loop to call comp-kill on all nodes.
- Write a function (avail-entry node) to compute
what is available on entry to a node from what is available on exit
from its predecessors. You can assume that (avail-exit node)
has been stored; initially, of course, it will be empty.
- Write a function to iteratively compute (avail-exit node)
based on (avail-entry node) and what is computed and killed
within the node. Continue updating these values until there is no
further change.
- Write a function to iterate through all nodes and identify
redundant parts of the code. For each node, print out the node number
and iterate through each statement of the code. Keep an available
set that is available-on-entry initially, and update it for each
statement. If a statement involves some redundant code, print out
the statement and the redundant expression.
You will probably find that not all of the subexpressions that are
actually redundant are identified by your algorithm. The remaining ones
can be found using more sophisticated methods.
The following functions can be used in writing your code:
- (predecessors node) returns a list of node numbers of the
predecessors of node.
- (expcomputed exp) returns the set of subexpressions
that are computed in the process of computing exp.
- (killedby var) returns the set of subexpressions
that are killed by redefinition of var.