

Introduction

C-Breeze is a compiler infrastructure, written in C++, that accepts ANSI C (the ISO 9899 standard) as input and can produce several forms of output, including C and PowerPC assembly. C-Breeze comes with a set of built-in phases that perform basic operations. For example, there are phases to parse the input, dismantle the input into a canonical form, and produce a control flow graph. C-Breeze is also intended to be extended through the addition of new phases. This document explains how to use and extend C-Breeze.

Figure 1: Overview of the basic C-Breeze compilation process.
[Figure image: overview.eps]

The high-level structure of C-Breeze is defined by a series of built-in phases, each of which produces a different form of internal representation. As shown in Figure 1, the parser accepts C source code and produces an AST (Abstract Syntax Tree) representation. This AST is fairly high-level and bears a strong resemblance to C. The next phase in the figure, the dismantler, converts the AST into a canonical format known as MIR (Medium-level Internal Representation), which, while still machine-independent, is a much simpler form with far fewer kinds of constructs. For example, in MIR all control flow is represented by labels and gotos, and every statement contains at most one assignment operator. The -c-code phase can be invoked to emit C as output. Eventually, another lowering phase will be provided to convert the MIR to a machine-specific LIR format, producing three-address code from which it is easier to generate assembly code. For most purposes, users will want to perform their analyses and transformations at the MIR level, because it is the simplest form to work with. Except for the parser, the invocation of the phases is under user control, so the structure shown in Figure 1 can be changed to suit other needs.
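To make the two MIR properties mentioned above concrete, the following sketch shows a small C function next to a hand-written equivalent that uses only labels and gotos for control flow and at most one assignment per statement. This is an illustration of the idea, not actual dismantler output: the function names, the temporaries t1 and t2, the label names, and the helper check_equivalent are all hypothetical, and the real dismantler may name and order things differently.

```c
#include <assert.h>

/* High-level form: structured loop and a compound expression. */
int sum_squares(int n) {
    int total = 0;
    int i = 1;
    while (i <= n) {
        total += i * i + 1;
        i++;
    }
    return total;
}

/* Hand-dismantled form (illustrative only, not C-Breeze output):
   control flow uses labels and gotos, and each statement performs
   at most one assignment. t1, t2 and the labels are made-up names. */
int sum_squares_dismantled(int n) {
    int total;
    int i;
    int t1;
    int t2;

    total = 0;
    i = 1;
loop_test:
    if (i > n) goto loop_exit;
    t1 = i * i;          /* one operator, one assignment */
    t2 = t1 + 1;
    total = total + t2;
    i = i + 1;
    goto loop_test;
loop_exit:
    return total;
}

/* Hypothetical helper: aborts if the two forms disagree for n. */
void check_equivalent(int n) {
    assert(sum_squares(n) == sum_squares_dismantled(n));
}
```

Both functions compute the same value for any n. The second is longer, but each statement is simple enough that an analysis has very few cases to handle, which is the reason for working at the MIR level.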

The remainder of this document is organized according to the various levels of intermediate representation. We start by describing aspects of the system that cut across all levels of representation, namely how to invoke C-Breeze and how to use the phase structure. We then describe two important IRs: the AST that is produced by the parser, and the MIR that is generated by the dismantler, together with its associated control flow graph. We conclude by describing the C-Breeze class hierarchy.


Adam C. Brown 2006-01-26