I am working on several research topics. These are not presented in any particular order -- I turn my attention to each topic as time permits and as the interests of my students and colleagues proceed.
Much of my work is concerned with using or developing the ACL2 theorem proving system.
For the last two years, I have been working with Ivan Sutherland on rapid, single-flux, quantum (RSFQ) logic. This effort concerns developing a methodology to specify, design, model, validate, verify, build, test, and analyze RSFQ logic circuits. RSFQ logic is based on Josephson junctions, and it operates at very cold (4K) temperatures. Ivan and I work closely with scientists at MIT Lincoln Laboratory (MTI/LL), where there are other scientists investigating the properties of such circuits. MIT/LL also provides fabrication services for our experimental designs; we hope to soon receive our latest ``tape out''. As a part of this effort, we are using the ACL2 and the Electric tools. We want to develop an ability to design RSFQ logic circuits using a self-timed logic paradigm that we can model and analyze with ACL2. We would appreciate hearing from interested people.
We are working on a formal specification for a small subset of the X86 instruction set architecture (ISA); this effort started in 2010. Attempting to specify the entire X86 ISA would require a very substantial effort -- well beyond our available resources. Even so, we believe our subset may have utility to some. Microsoft has been generous enough to provide us with some support to further this effort.
As a starting point, we have develped an ACL2-based specification of the Y86 microprocessor which is described in "Computer Systems, A Programmer's Perspective" by Randy Bryant and David O’Hallaron. The Y86 is a very simple microprocessor, with a X86-like ISA; we are using this specification as a starting place for our effort. The source files for the model, when loaded into ACL2, provide a running simulator of the Y86 microprocessor design.
We believe that developing a formal instruction-set architecture (ISA) specification for the X86 architecture could serve a number of purposes.
We believe that such a model should provide a means to specify a multiprocessor configuration, so it will be necessary to evolve a "memory centric" specification that permits interleaved execution. Each copy of an ISA processor specification, whether virtual (as in the case of hyper-threading) or physical (such as with multi-core implementations), must include its own register file, user/supervisor state, interrupt controller, and memory management unit.
In light of the complexity of the X86 ISA architecture and the sheer amount of work necessary to create such a formal model, it will have to be done in pieces. Here are some obvious elements.
We have nearly completed elements (1), (2), (3), and (4), including all of the address modes. The allows a skilled ACL2 user to prove the correctness of single-threaded, user-level programs. Our X86 specification can simulate integer-only, user-level programs generated by the GNU and LLVM compilers. We validate our evolving model by co-simulation against real X86 processors. Our X86 specification project has its own Website, where there is information about our progress.
Bob Boyer and I have been working on an implementation of Hash CONS for use within ACL2. This effort extends the core capabilities of the ACL2 system.
We have developed an extension of ACL2 that includes the implementation of hash-based association lists and function memoization; this makes some algorithms execute more quickly. This extension, enabled partially by the implementation of Hash-CONS, represents ACL2 objects in a canonical way, thus the comparison of any two such objects can be determined without the cost of descending through their CONS structures. A restricted set of ACL2 user-defined functions may be memoized; the underlying implementation may conditionally retain the values of such function calls so that if a repeated function application is requested, a previously computed value may instead be returned. We have defined a fast association list access and update functions using hash tables. We provide a file reader that identifies and eliminates duplicate representations of repeated objects, and a file printer that produces output with no duplicate subexpressions.
We have a small collection of other papers related to Hash CONS. If you believe a good reference is missing, please send us a pointer.
We formalized the DE2 hierarchical, occurrence-oriented finite state machine (FSM) language, and have developed a proof theory allowing the mechanical verification of FSM descriptions. Using the ACL2 functional logic, we have defined a syntactic well-formedness predicate and a symbolic simulator that defines the DE2 cycle-based simulation semantics. DE2 is deeply embedded within ACL2, and the DE2 language includes an annotation facility that can be used by programs that manipulate DE2 descriptions; this facility may restrict the use of defined modules or it may provide other module information. The DE2 user may write and prove the correctness of programs that generate DE2 descriptions.
Erik Reeber (mostly) and I have been developing this language as a vehicle to explore how we should extend current hardware description languages to make them more useful to the design process. For instance, the DE2 language permits a number of embedded annotations -- these annotations are first class parts of a design description and they can be analyzed just like the functional specification part.
DE2 is designed to permit the rigorous hierarchical description and hierarchical verification of finite-state machines (FSMs). We call our language DE2 (Dual-Eval 2) because of the two-pass approach that we employ for the language recognizers and evaluators. DE2 is actually a general-purpose language for specifying FSMs; users may define their own language primitives. We recognize valid DE2 descriptions with an ACL2 predicate that defines the permissible syntax, names, and hierarchy, of valid descriptions. The DE2 language also provides a rich annotation language that can be used to enforce syntactic and semantic design restrictions.
Erik Reeber and I describe the DE2 language in our paper the ``Formalization of the DE2 Language'', which appears in Correct Hardware Design and Verification Methods (CHARME 2005), Lecture Notes in Computer Science}, No. 3725, pages 20-34, © Springer-Verlag, 2005.
We have also have built a verifying compiler that translated DE2 circuits into cycle-accurate ACL2 models. It is a verifying compiler in that it also produces an ACL2 proof that the ACL2 model is equivalent to the DE2 circuit, with respect to the DE2 semantics, formalized in ACL2. Properties regarding a finite number of steps of the ACL2 model can be written in the Subclass of Unrollable List Formulas in ACL2 (SULFA), for which we have developed a SAT-based decision procedure. We describe SULFA and this decision procedure in our paper "A SAT-Based Decision Procedure for the Subclass of Unrollable List Formulas in ACL2 (SULFA)" (abstract), which will appear in the 3rd International Joint Conference on Automated Reasoning (IJCAR 2006).
We have used our methodology to mechanically verify components of the TRIPS microprocessor implementation. We compiled Verilog components into DE2, then used our verifying compiler to produce an equivalent ACL2 model. We then used a mixture of theorem proving and our SAT based decision procedure to verify that the ACL2 model satisfies its ACL2 specification.
Serita Nelesen and I are developing the TASPI experimental system for analysis, retrival, and storage of phylogenetic information. TASPI is implemented as an ACL2 program.
Phylogenetic tree searching algorithms often produce thousands of trees which biologists save in Newick format in order to perform further analysis. Unfortunately, Newick is neither space efficient, nor conducive to post-tree analysis such as consensus. We propose a new format for storing phylogenetic trees that significantly reduces storage requirements while continuing to allow the trees to be used as input to post-tree analysis. We implemented mechanisms to read and write such data from and to files, and also implemented a consensus algorithm that is faster by an order of magnitude than standard phylogenetic analysis tools. We demonstrate our results on a collection of data files produced from both maximum parsimony tree searches and Bayesian methods.
In 2005, we (Bob Boyer, Serita Nelesen, and I) published the paper "A Compressed Format for Collections of Phylogenetic Trees and Improved Consensus Performance"; this paper appeared in Algorithms in Bioinformatics: 5th International Workshop, WABI 2005. Lecture Notes in Computer Science 3692, pages 353-364, Springer-Verlag, 2005.
Mike Gordon and I started an experimental ACL2-HOL4 integration project in the summer of 2005, when we constructed a means of embedding the ACL2 logic in HOL4. This was done by defining a model of ACL2 datatypes (meaning a single type that can represent ACL2 symbols, characters, numbers, strings, and CONS (pairing) operations inside of HOL; we then defined a translator from this level of representation from HOL to ACL2 and back again.
Mike Gordon has a Webpage where
interested parties may investigate this effort.
Sol Swords and
I are working on analysis of transistor-level circuits. Currently, we
have created an analysis program similar in character to Bryant's
COSMOS
system . Our first goal is to make a smoothly functioning
analysis system that includes all of the analysis features found in
COSMOS. Our long-term goal is to improve the analysis capabilities of
our system beyond those of COSMOS.
We express the circuit as graph of wires connected by transistors. We
break th graph into channel-connected subcomponents for piecewise
analysis. We use Gaussian elimination to find three-valued formulas
represented as BDDs expressing the steady state behavior of each wire.
This model views the wires between transistors as having one of three
values (TRUE, FALSE, UNKNOWN) instead of a voltage. The model
includes provisions for transistors of different strengths and wire
capacitances; it analyzes off-sets and on-sets separately in order to
conservatively determine when a wire will have a good Boolean state as
opposed to an indeterminate value. Once a formula is found for each
wire, we symbolically simulate the resultant system using a
unit-transistor-delay model.
For a Hybrid system, we assume the system is described by one or more
systems of differential equations and some criteria for switching from
one system of equations to the other or instantaneously modifying the
system variables. Of particular interest is when the switching of the
system of equations or the change of variable value is dictated by the
output of some computer which is said to be the controller of the
system. Our interest is to research methods for reasoning about the
semantics of the computer program as well as the behavior of the
dynamic system. With the use of nonstandard analysis, we are using
proof methods often used for reasoning about computer programs to
reason about continuous dynamic systems interacting with computer
systems. The formal verification of control sytems using ACL2(R) is
the subject of Shant Harutunian's PhD research; he is an ECE PhD
Candidate.
A simple switching system is a bouncing ball. The position y of the
ball perpendicular to the impact surface may be described by a
differential equation: dy/dt = v0-(a*t), where the impact surface is
at y=0, the initial ball velocity is v0, acceleration due to gravity
is a, and time is t. The solution to this equation is
y(t)=(v0*t)-(0.5*a*(t^2)), where y(0)=0. The impact of the ball with
the impact surface causes the vertical velocity of the ball to
instantaneously change its sign. While the solution to the
differential equation describes the behavior of the ball prior to
impact, the concatenation of the system solutions initialized with the
velocity immediately after impact describes the behavior of the ball
over time. Hence, given a system solution y(v0,t), and given a
sequence S, not necessarily finite, of tuples where each tuple is a
velocity and time immediately after impact, one may then determine the
position of the ball over time. However, one must derive, based on
the dynamics of the system, this sequence S. This sequence may be
generated recursively if we define a function such that given a tuple
of velocity and time, returns the tuple of the velocity and time of
the next element in the sequence, where the time t is the duration
between successive impacts. For this case, the function may be found
by solving for the time t in the next tuple based on the roots of the
equation y(t)=(v0*t)- (0.5*a*(t^2)), and the next velocity using this
next time. Assuming positive time and initial velocity v0, given a
tuple (v0,t), we let the next tuple (vn,tn) = (e*v0, 2*v0/a), where
0<=e<=1 is the coefficient of restitution. To determine the
reachability of some state satisfying a predicate, the sequence S and
the system dynamics must be studied jointly. We have found formal
logic as being a useful tool in precisely and formally describing and
reasoning about such system behavior. If one wishes to reason that
the ball, with 0 < e < 1 and an initial non-zero velocity, then the
ball eventually reaches a state with zero velocity and position, then
one would have to reason about an infinite sequence S, even though the
time required to reach this state is finite. While the use of real
analysis and limits are useful for reasoning about this behavior,
nonstandard analysis provides a useful, formal framework within which
to reason that the ball reaches zero velocity and position, after
infinite in-elastic impacts, but finite time.
In our research, we use nonstandard analysis to model the physical
dynamics of the system as well as the switching agent or computer
controller which causes a change in the system dynamics. By the use
of nonstandard analysis, the real time physical system is discretized
in accordance to an infinitesimal time step epsilon. However,
predicates over the state space are also discretized according to this
epsilon. To reason about properties of the dynamic system, such as
stability, we turn to the use of invariants and measure functions,
which are commonly employed in program verification techniques. Once
we prove these necessary properties, by use of the principals of
nonstandard analysis, we may then conclude the property about the
overall hybrid system over real time.
We have applied this method for a simple system consisting of a
computer controller, and a widget, whereby the state variable x of the
widget is modified, with the constraint that the state variable can
only be modified by a limited, finite amount in a time duration, and
the computer is allowed to sample this value only at discrete time
points. Furthermore, while the variable x is real valued, the
computer is assumed to be integer valued, taking the floor of the
variable x. Using our method, we are able to prove that an algorithm
for adjusting the state variable x is such that 1) it causes the
widget to eventually reach a sate variable value x within 2.5 of the
requested value and 2) it remains within this margin of error. This
approach is instructive in demonstrating the use of formal logic and
nonstandard analysis in reasoning about a simple real time dynamic
system and computer program.
Transistor-level Circuit Analysis
Verification of Control Systems
Improved Arithmetic Procedures for ACL2
Robert Krug (mostly), J Moore, and I have worked on improving the
arthmetic proving procedures in ACL2 in several ways: new ACL2-system
code to better support processing of arithmetic expression, new
ACL2-system code to support automatic reasoning about arithmetic
expressions involving non-linear terms, and new ACL2 libraries (books)
that, when included, often decrease the effort required to prove
arithmetic facts.
Some Older Research Topics
The FM9001 Microprocessor Project
The FM9001
microprocessor was the first fabricated microprocess with a
mechanically-checked proof of its gate-level design. Bishop Brock and
I developed this microprocessor and the related technology. To verify
the FM9001 design, we formally specified the ISA, we formalized the
DUAL-EVAL hardware description langauage (HDL), we specified the
FM9001 design in the DUAL-EVAL HDL, and finally, we mechanically
verified that our design did indeed implement its ISA-level
specification. We have written
summary of this effort, which both describes the FM9001 processor
as well as the HDL used to implement the FM9001 design.