PJ is a new tool for JTS specifically written to support the GenBorg model. PJ compacts extension chains for classes, interfaces, and state diagrams. To call PJ:
> java PJ.Main [-tk] <base-file-name> <extension-file-name>
where:
The Bali command to build PJ is:
> java Bali.Main PJ
The layers that are included in PJ, along with their semantics, are briefly described below:
The following topics are covered in this document:
The GenBorg model produces a set of equations that define the extension chains or extension hierarchies of generated files. An equation r = a(b(c)) means compose base class or interface c with extension b, and that with extension a to produce r. PJ flattens this extension chain to produce a single file. If this composition were to be performed via a command-line, you would write:
> java PJ.Main -tk c.java b.java > tmp.java
> java PJ.Main -tk tmp.java a.java > result.java
This isn't the preferred way to do this composition, for several reasons. First, PJ annotates parse trees with information that is lost when outputting intermediate results to files. This information isn't critical for running the current stable of regression tests, but will be very important later on. Second, the above runs slowly.
The preferred way is via a programmatic means:
import PJ.*;

// Step 1: set the -t and -k flags in PJ
Lang.JTSParseTree.setFlags( true, true );

// Step 2: create parse trees for files a, b, and c
Lang.JTSParseTree a = new Lang.JTSParseTree( "a.java" );
Lang.JTSParseTree b = new Lang.JTSParseTree( "b.java" );
Lang.JTSParseTree c = new Lang.JTSParseTree( "c.java" );

// Step 3: compose the trees, a.compose(b) merges b into a
// so a is modified
a.compose(b);
a.compose(c);

// Step 4: set package name. The produced file will be
// in package "foo", and output the file.
a.setPackageName("foo");
a.print2file( "abc.java");
The above calling sequence will be used in GUI tools that take GenBorg equations, expand them, and invoke the corresponding extension-hierarchy compaction tools (of which PJ will be the tool for compacting code chains).
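As a minimal sketch of the "expand the equation" step, the following unrolls a linear equation such as a(b(c)) into base-first composition order. The class and method names (EquationOrder, compositionOrder, commandLine) are hypothetical and not part of PJ. The ".java" suffix and the multi-file command form are assumptions taken from the command-line examples in this document.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

class EquationOrder {
    /** Unrolls a linear equation such as "a(b(c))" into base-first order: [c, b, a]. */
    static List<String> compositionOrder(String equation) {
        String s = equation.replaceAll("\\s+", "");
        List<String> names = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        int closed = 0;
        for (char ch : s.toCharArray()) {
            if (ch == '(') {
                // everything read so far names an extension wrapping what follows
                names.add(current.toString());
                current.setLength(0);
            } else if (ch == ')') {
                closed++;
            } else if (closed > 0) {
                throw new IllegalArgumentException("text after ')' in: " + equation);
            } else {
                current.append(ch);
            }
        }
        names.add(current.toString()); // innermost name is the base
        if (closed != names.size() - 1 || names.contains("")) {
            throw new IllegalArgumentException("not a linear chain: " + equation);
        }
        Collections.reverse(names);
        return names;
    }

    /** Assumed command form: base file first, then extensions in order. */
    static String commandLine(List<String> order) {
        StringBuilder sb = new StringBuilder("java PJ.Main -tk");
        for (String name : order) {
            sb.append(' ').append(name).append(".java");
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        List<String> order = compositionOrder("a(b(c))");
        System.out.println(order);              // prints [c, b, a]
        System.out.println(commandLine(order)); // prints java PJ.Main -tk c.java b.java a.java
    }
}
```

The sketch handles only linear chains; equations with several arguments per extension are out of its scope.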
PJ expects a file containing a base interface to have the following format:
package <package-name>;
[<import-statements>]
<java-interface-declaration>
Note:
An interface extension file has the following format:
package <package-name>;
[<import-statements>]
[<modifiers>] extends interface <name> [ extends <interface-list> ] { [body] }
Note:
Composition of a base interface with an extension (or an extension interface with another extension interface) follows simple rules:
Examples of a base interface file and an extension interface file are:
package IBase1;
import x.y.z.*;
import java.io.Serializable;

interface MyInt extends FooInterface {
    int Silent = 0;
    void foo() throws AFit;
    SomeType bar( int x );
}

package Iext1;
import java.io.Serializable;

public transient extends interface MyInt extends yyy, java.io.Serializable {
    int Terse = 2;
    void foo() throws SomeException;
    int increment( int i );
}
The result of their composition is shown below. Highlighted in yellow are additions made by the MyInt extension.
package IBase1;
import x.y.z.*;
import java.io.Serializable;

public transient interface MyInt extends FooInterface, yyy, java.io.Serializable {
    int Silent = 0;
    int Terse = 2;
    void foo() throws AFit, SomeException;
    SomeType bar( int x );
    int increment( int i );
}
Type sorting is a linear algorithm (which is invoked by the -t option to PJ) that sorts the declarations within an interface. To understand the motivation for type sorting, consider the result of the above composition if type sorting is not used (see below): it is a jungle of definitions that are hard to understand. Type sorting collects all variable definitions together in one spot and method definitions in another, and makes a small contribution to generated code beautification.
public transient interface MyInt extends FooInterface, yyy, java.io.Serializable {
    int Silent = 0;
    void foo() throws AFit, SomeException;
    SomeType bar( int x );
    int Terse = 2;
    int increment( int i );
}
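The idea behind type sorting can be sketched as a single-pass stable partition. This is only an illustration under stated assumptions: declarations are modelled as plain strings, and the classifier isMethod (anything containing a parameter list) is a hypothetical stand-in for the parse-tree information PJ actually has. TypeSortSketch is not a PJ class.

```java
import java.util.ArrayList;
import java.util.List;

class TypeSortSketch {
    /** Hypothetical classifier: a declaration with a parameter list is a method. */
    static boolean isMethod(String decl) {
        return decl.contains("(");
    }

    /** One linear pass: variables first, then methods, original order kept within each group. */
    static List<String> typeSort(List<String> decls) {
        List<String> variables = new ArrayList<>();
        List<String> methods = new ArrayList<>();
        for (String d : decls) {
            if (isMethod(d)) {
                methods.add(d);
            } else {
                variables.add(d);
            }
        }
        variables.addAll(methods);
        return variables;
    }

    public static void main(String[] args) {
        List<String> unsorted = List.of(
            "int Silent = 0;",
            "void foo() throws AFit, SomeException;",
            "SomeType bar( int x );",
            "int Terse = 2;",
            "int increment( int i );");
        for (String d : typeSort(unsorted)) {
            System.out.println(d);
        }
    }
}
```

Applied to the unsorted MyInt body, the sketch yields the same ordering as the type-sorted composition: Silent, Terse, foo, bar, increment.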
Again, it is possible to compose two interface extension files to produce a composite extension file. Remember: a base file is a "constant" and an extension file is a "function" in GenBorg. So composing a constant with a function (e.g., f(a)) produces a constant -- i.e., composing a base interface with an extension produces a composite base interface. Similarly, composing a function in GenBorg with another function yields a function. That is, composing an interface extension with another interface extension yields a composite interface extension.
A convenient and optional feature is to designate methods of an extension by the modifiers "new" or "overrides". "new" means that the extension is defining a new method whose name should not be used by a method in the base interface. "overrides" means that the extension is defining an extension to a previously defined method. PJ generates an error if it detects that the conditions for "new" or "overrides" are not satisfied. As an example:
package Iext0;

extends interface MyInt extends java.io.Serializable {
    overrides void foo() throws SomeException;
    new void barrr();
}
The above interface extension defines a new method (barrr) and overrides a previously defined method (foo).
The use of "new" and "overrides" makes a useful contribution to avoiding inadvertent capture; these modifiers should be used to guarantee that a method being defined by an extension is in fact a new method or does override an existing method.
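The conditions behind "new" and "overrides" can be sketched as a simple membership test. This is a hedged illustration, not PJ's implementation: ModifierCheckSketch and check are hypothetical names, and matching methods by name only is an assumption (PJ may compare full signatures).

```java
import java.util.Optional;
import java.util.Set;

class ModifierCheckSketch {
    /** Returns an error message if the modifier's condition is violated, otherwise empty. */
    static Optional<String> check(String modifier, String method, Set<String> baseMethods) {
        boolean definedInBase = baseMethods.contains(method);
        if (modifier.equals("new") && definedInBase) {
            return Optional.of(method + " is marked new but is already defined in the base");
        }
        if (modifier.equals("overrides") && !definedInBase) {
            return Optional.of(method + " is marked overrides but has no earlier definition");
        }
        return Optional.empty();
    }

    public static void main(String[] args) {
        Set<String> base = Set.of("foo", "bar");
        System.out.println(check("overrides", "foo", base).isPresent()); // prints false
        System.out.println(check("new", "barrr", base).isPresent());     // prints false
        System.out.println(check("new", "foo", base).isPresent());       // prints true
    }
}
```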
PJ expects a file containing a base class to have the following format:
package <package-name>;
[<import-statements>]
<java-class-declaration>
Note:
A class extension file has the following format:
package <package-name>;
[<import-statements>]
[<modifiers>] extends class <name> [ implements <interface-list> ] { [body] }
Note:
Again, rules of composition are simple:
Examples of a base class file and an extension class file are:
package Ctop;
import jakarta.util.*;

class top {
    static int i,j;
    int ii,jj;
    static { i = 4; }
    top() { ii = 5; }
    top(int rj) { jj = rj; }
    void foo(float x, float y) { /* do something */ }
    float bar( float x ) { /* do something more */ }
}

package Cmid;
import AnotherPackage;

extends class top implements java.io.Serializable, xxx {
    static int k;
    static { j = 5; }
    top(float x) { /* do something */ }
    float foobar() {
        Base(float).bar(4.0);
        Base(float,float).foo(0, 0);
    }
    public void foo( float x, float y ) { /* something more */ }
}
The result of their composition is shown below. Highlighted in yellow are additions made by the top extension and highlighted in blue are additions triggered by the extension:
package Ctop;
import jakarta.util.*;
import AnotherPackage;

class top implements java.io.Serializable, xxx {
    static { i = 4; }
    static { j = 5; }
    static int i,j;
    int ii,jj;
    static int k;
    top() { ii = 5; }
    top(int rj) { jj = rj; }
    top(float x) { /* do something */ }
    float bar( float x ){ return bar$$Ctop( x ); }
    final float bar$$Ctop( float x ) { /* do something more */ }
    final void foo$$Ctop(float x, float y) { /* do something */ }
    public void foo( float x, float y ) { /* something more */ }
    float foobar() {
        bar$$Ctop(4.0);
        foo$$Ctop(0, 0);
    }
}
Key sorting is a technique much like type sorting. The body of a class has different kinds of entities -- variable declarations, method declarations, etc. Key sorting is similar to type sorting in that all entities of a single type (initialization blocks, variable declarations, methods) are grouped together. Key sorting goes beyond this to sort methods. In particular, we expect that there will be many "variations" of a single method, such as bar, bar$$Ctop, etc. Key sorting groups all of these related methods together, as shown above. (It does this by assigning a key to each method and sorting the methods in key order. Hence the name "key sort".) If key sorting is not used, an unintelligible jungle of declarations is produced (see below):
class top implements java.io.Serializable, xxx {
    static int i,j;
    int ii,jj;
    static { i = 4; }
    top() { ii = 5; }
    top(int rj) { jj = rj; }
    final void foo$$Ctop(float x, float y) { /* do something */ }
    float bar( float x ){ return bar$$Ctop( x ); }
    final float bar$$Ctop( float x ) { /* do something more */ }
    static int k;
    static { j = 5; }
    top(float x) { /* do something */ }
    float foobar() {
        bar$$Ctop(4.0);
        foo$$Ctop(0, 0);
    }
    public void foo( float x, float y ) { /* something more */ }
}
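The grouping effect of key sorting can be sketched with a stable sort on a derived key. The key function here is an assumption: the document does not say how PJ assigns keys, so this sketch uses the method name with any "$$<layer>" suffix removed. KeySortSketch and keyOf are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class KeySortSketch {
    /** Assumed key: the method name with any "$$<layer>" suffix removed. */
    static String keyOf(String methodName) {
        int cut = methodName.indexOf("$$");
        return cut < 0 ? methodName : methodName.substring(0, cut);
    }

    /** Stable sort by key, so variations of one method end up adjacent. */
    static List<String> keySort(List<String> methodNames) {
        List<String> sorted = new ArrayList<>(methodNames);
        sorted.sort(Comparator.comparing(KeySortSketch::keyOf));
        return sorted;
    }

    public static void main(String[] args) {
        // method names in the order they appear in the unsorted class above
        List<String> unsorted = List.of("foo$$Ctop", "bar", "bar$$Ctop", "foobar", "foo");
        System.out.println(keySort(unsorted)); // prints [bar, bar$$Ctop, foo$$Ctop, foo, foobar]
    }
}
```

With this key, the method order matches the key-sorted composite class shown earlier, which suggests the assumption is at least plausible for this example.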
We explain later how methods are composed and where all the additional methods come from. For now, remember it is possible to compose two class extension files to produce a composite extension file. As in the case of interfaces, a base file is a "constant" and an extension file is a "function" in GenBorg. So composing a constant with a function (e.g., f(a)) produces a constant, and composing a pair of functions yields a composite function.
In the following sections, rules for extending variables, methods, and constructors are reviewed.
There is no notion of variable extension in GenBorg. It is one thing to override a method; it makes no sense to override a variable. Whenever the base class defines a variable and an extension class attempts to define the same variable, an error is reported by PJ.
This raises the question of inadvertent capture. What if a temporary variable x is defined in the base class and a different variable, also named x, is defined in the extension class? As mentioned above, PJ will complain. The clashing of names for temporary variables should be handled automatically. In a future version of PJ, we plan to add the capability of defining local variables whose names are mangled so that inadvertent capture is not possible.
You may have noticed that composing classes is a lot more complicated than composing interfaces. The difficulty rests on the ability of methods of an extension class to call arbitrary methods of their "superclass" or rather, "super extension". And part of the difficulty arises because PJ is a preprocessor which doesn't really understand type information.
You know that a subclass can call method foo() of its superclass by invoking super.foo(). By analogy, a class extension can call method foo() of its super extension by invoking Base().foo(). Because PJ is a preprocessor that does not type check programs, an extension that calls a method of its super extension with signature foo(int, float, String) does so by invoking Base(int,float,String).foo(5, 3.4, "me"). The token Base is a reserved keyword of PJ and is always followed by the type signature of the method that is to be invoked. This is how the type signature of a "base-class" method is conveyed to PJ.
There are 4 rules that are specific to extending base-class methods. The actions taken depend on the following conditions:
If a base method is:
Before composition, base class:
    void foo( ) { /* do something */ }
Before composition, extension class:
    void foo( ) {
        Base( ).foo();
        // something more
    }
After composition, composite class:
    final void foo$$baseLayer() { /* do something */ }
    void foo( ) {
        foo$$baseLayer();
        // something more
    }
Before composition, base class:
    void foo( ) { /* do something */ }
Before composition, extension class:
    void foo( ) {
        // something else
    }
After composition, composite class:
    void foo( ) {
        // something more
    }
Before composition, base class:
    void foo( ) { /* do something */ }
Before composition, extension class:
    void bar( ) {
        Base().foo();
        // something more
    }
After composition, composite class:
    final void foo$$baseLayer( ) { /* do something */ }
    void foo( ) { foo$$baseLayer(); }
    void bar( ) {
        foo$$baseLayer();
        // something more
    }
There is an additional rule for extension methods: any extension method that does not override a base method is added to the composite class.
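The rewriting of Base(...) calls seen in the examples above can be approximated by a textual substitution. This is a rough sketch only: PJ works on parse trees, whereas this uses a regular expression, and BaseCallSketch and rewriteBaseCalls are hypothetical names. It assumes the signature inside Base(...) contains no nested parentheses.

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class BaseCallSketch {
    // Matches "Base(<signature>).<method>" and captures the method name.
    private static final Pattern BASE_CALL =
        Pattern.compile("\\bBase\\s*\\([^)]*\\)\\s*\\.\\s*(\\w+)");

    /** Replaces each Base(sig).m with m$$<layer>, leaving the argument list untouched. */
    static String rewriteBaseCalls(String body, String layer) {
        Matcher m = BASE_CALL.matcher(body);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String renamed = m.group(1) + "$$" + layer;
            // quoteReplacement keeps '$' from being read as a group reference
            m.appendReplacement(out, Matcher.quoteReplacement(renamed));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String body = "Base(float).bar(4.0); Base(float,float).foo(0, 0);";
        System.out.println(rewriteBaseCalls(body, "Ctop"));
        // prints bar$$Ctop(4.0); foo$$Ctop(0, 0);
    }
}
```

A text-level rewrite like this would also touch matches inside comments or string literals, which is one reason a parse-tree approach is preferable in a real tool.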
There are all sorts of problems with constructors. There are so many that it is much safer NOT to allow constructors to be extended. This is the position taken in PJ. New constructors can be added by an extension, but no existing constructor can be refined.
The effect of extending a constructor can still be achieved using the concepts already presented. The body of a constructor is moved into a method, which can be extended. Thus, instead of creating a superclass with a constructor and later extending the constructor via inheritance (the inheritance hierarchy below), we rely on a design and coding technique to accomplish the same effect (the extension hierarchy equivalent below):
Inheritance hierarchy:
    class foo {
        int a;
        foo(int a) { this.a = a; }
    }
    class subfoo extends foo {
        int b;
        subfoo(int a) { super(a); b = 0; }
    }
Extension hierarchy equivalent:
    class foo {
        int a;
        void fooConstructor(int a) { this.a = a; }
        foo(int a) { fooConstructor(a); }
    }
    extends class foo {
        int b;
        void fooConstructor(int a) { Base(int).fooConstructor(a); b = 0; }
    }
A convenient and optional feature is to designate methods of an extension by the modifiers "new" or "overrides". "new" means that the extension is defining a new method whose name should not be used by a method in the base class. "overrides" means that the extension is defining an extension to a previously defined method. PJ generates an error if it detects that the conditions for "new" or "overrides" are not satisfied. As an example,
package mid;

extends class c {
    new void d() { }
    overrides void b() { /* new code */ }
}
The above class extension defines a new method (d) and a method (b) that overrides a method of the base class c.
The use of "new" and "overrides" makes a useful contribution to avoiding inadvertent capture; these modifiers should be used to guarantee that a method being defined by an extension is in fact a new method or does override an existing method.
PJ expects a file containing a base state diagram to have the following format:
package <package-name>;
[<import-statements>]
<state-diagram-declaration>
Note:
The syntax of a state diagram specification (which incidentally is slightly different than the version used to develop the FSATS prototype) is discussed later.
A state diagram extension file has the following format:
package <package-name>;
[<import-statements>]
<state-diagram-extension-declaration>
Note:
The syntax of a state diagram extension specification is discussed later.
A state diagram can be:
Each is specified in the following way, where bold phrases are parameters and [bracketed] phrases are optional:
[modifiers] state_diagram name [implementsClause] SDClassBody
[modifiers] state_diagram name extends supername [implementsClause] SDClassBody
[modifiers] state_diagram name extends class supername [implementsClause] SDClassBody
The body of a state diagram (SDClassBody above) has the following syntax:
{
    event_delivery methodname ( <parameter-list> );   (1)
    no_transition <block>                             (2)
    [ otherwise_default <block> ]                     (3)
    [ <states-list> ]                                 (4)
    [ <ES-list> ]                                     (5)
    [ <field-and-method-declarations> ]               (6)
}
The meaning of these concepts is specified in a separate document; here we talk only about the syntax of state diagram specifications. Observe the following notes:
states a, b, c;
states e, f, g, h;
// and so on
An example of a state diagram specification is:
package root1;
import firstPackage.*;

state_diagram root extends class common {
    event_delivery mymessage( M m );
    no_transition { ignore(m); }

    // in root
    states a, b;
    states c;

    // in root
    edge e1 : a -> b conditions m!=null do { /* action 1 */ }
    Exit a { /* action a */ }
}
An extension to a state diagram is expressed in the following syntax:
[modifiers] extends state_diagram name [implementsClause] SDExtBody
The body of a state diagram extension (SDExtBody above) has the following syntax:
{
    [ otherwise_default <block> ]                     (3)
    [ <states-list> ]                                 (4)
    [ <ES-list> ]                                     (5)
    [ <field-and-method-declarations> ]               (6)
}
Again, the meaning of these concepts is specified in a separate document and is the same as above; here we talk only about the syntax. Keep the following in mind:
An issue that can arise in the design of state diagrams is that an edge predicate or block of code associated with the actions of a state or edge can be modified by an extension. Since there are no capabilities to extend predicates or code blocks, the way this can be done is to define a boolean method for a predicate and a void method to encapsulate a code block. Methods can be extended using the normal mechanisms. So the way to understand ES-list declarations is that they are one-time-specifications that cannot be modified. These specifications can refer to methods which can be modified by extensions. So the scheme described here is general and simple.
Admittedly, this differs from the original state diagram specification used in the FSATS prototype, but only marginally. Further, there is no capability now for "refines edge" (i.e., the refinement/extension of edges). This too can be handled by defining predicates and edge transitions as methods which can be modified by extensions.
An example of a state diagram extension is:
package ext1;
import firstPackage.*;

extends state_diagram root {
    // from ext1
    states d;
    states e;

    // from ext1
    edge e2 : d -> e conditions m!=null do { /* action 2 */ }
    edge e3 : c -> d conditions true do { /* action 3 */ }
    Exit b { /* action a */ }
}
The result of composing the base state diagram defined earlier with the above extension is defined to be the union of these specifications, with the proviso that no state or edge is defined twice. The statements marked "from ext1" were added by the above extension.
package root1;
import firstPackage.*;

state_diagram root extends class common {
    event_delivery mymessage( M m );
    no_transition { ignore(m); }

    // in root
    states a, b;
    states c;

    // from ext1
    states d;
    states e;

    // in root
    edge e1 : a -> b conditions m!=null do { /* action 1 */ }
    Exit a { /* action a */ }

    // from ext1
    edge e2 : d -> e conditions m!=null do { /* action 2 */ }
    edge e3 : c -> d conditions true do { /* action 3 */ }
    Exit b { /* action a */ }
}
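The "union with no duplicate definitions" rule can be sketched as follows. This is an illustration under assumptions: states and edges are reduced to their names, and StateUnionSketch and composeNames are hypothetical, not part of PJ. How PJ itself reports a duplicate is not specified here; the sketch simply throws.

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;

class StateUnionSketch {
    /** Union of base and extension names; a name defined twice is an error. */
    static List<String> composeNames(List<String> base, List<String> ext) {
        Set<String> result = new LinkedHashSet<>(base); // keeps declaration order
        for (String name : ext) {
            if (!result.add(name)) {
                throw new IllegalStateException("defined twice: " + name);
            }
        }
        return new ArrayList<>(result);
    }

    public static void main(String[] args) {
        // states of root followed by states of the extension
        System.out.println(composeNames(List.of("a", "b", "c"), List.of("d", "e")));
        // edges of root followed by edges of the extension
        System.out.println(composeNames(List.of("e1"), List.of("e2", "e3")));
    }
}
```

The same helper applies to states and to edges, since both are identified by name in the examples.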
The complete grammar for state diagram specifications is listed below:
// "state diagram" adds constructs to define state machines

"state_diagram" STATE_DIAGRAM
"states" STATES
"event_delivery" EVENT_DELIVERY
"edge" EDGE
"conditions" CONDITIONS
"->" ARROW
"Exit" EXIT
"Enter" ENTER
"otherwise" OTHERWISE
"otherwise_default" OTHERWISE_DEFAULT
"no_transition" NO_TRANSITION

// the following hooks SDDeclaration with class, interface declarations

UnmodifiedTypeDeclaration
    : SDDeclaration :: SDDecl
    ;

SDDeclaration
    : STATE_DIAGRAM QName [ SMExtendsClause ] [ ImplementsClause ] SDClassBody :: UmodSdDecl
    ;

SMExtendsClause
    : LOOKAHEAD(2) ExtendsClause :: SMExtends
    | LOOKAHEAD(2) "extends" "class" AST_QualifiedName :: SMClsExtends
    ;

SDClassBody
    : "{" [ RootClause ] [ OtherwiseClause ] [ StatesList ] [ ESList ] [ AST_FieldDecl ] "}" :: SdClassDecl
    ;

StatesList
    : (StatesClause)+
    ;

StatesClause
    : STATES AST_TypeNameList ";" :: StatesDecl
    ;

RootClause
    : DelivClause NoTransitionClause :: RootDecl
    ;

NoTransitionClause
    : NO_TRANSITION Block :: NoTransDecl
    ;

DelivClause
    : EVENT_DELIVERY QName "(" AST_ParList ")" ";" :: DelivDecl
    ;

OtherwiseClause
    : OTHERWISE_DEFAULT Block :: ODefaultDecl
    ;

ESList
    : ( Es )+
    ;

Es
    : EXIT QName Block :: ExitDecl
    | ENTER QName Block :: EnterDecl
    | EDGE QName ":" StartName ARROW QName CONDITIONS AST_Exp DO Block :: EdgeDecl
    | OTHERWISE QName Block :: OtherDecl
    ;

StartName
    : QName :: SdSName
    | "*" :: StarName
    ;

// the following is used to allow SD declarations to be nested within
// class declarations; this is useful if we want to include SD specifications
// within layers

NestedSDDeclaration
    : [ AST_Modifiers ] SDDeclaration :: NSDDecl
    ;

ClassBodyDeclaration
    : LOOKAHEAD( [ AST_Modifiers() ] "state_diagram" ) NestedSDDeclaration
    ;

// the following is used to define extensions to state machines

UnmodifiedTypeExtension
    : STATE_DIAGRAM QName [ ImplementsClause ] SDClassBody :: UmodSmExt
    ;
PJ is still undergoing changes. Known problems include:
PJ is a tool that is designed to compose different specifications (i.e., hierarchies of classes, interfaces, etc.) by macro expansion. A standard problem of macro-expansion is inadvertent capture. It arises when variables in different classes are given the same names, and ambiguity arises when the classes are composed. Consider the following class hierarchy:
class top {
    int i;
    ... System.out.println(i); ...
}

class bottom extends top {
    double i;
    ... System.out.println(i); ...
}
The above is a legal Java program; the reason is that variable scoping differentiates the variable named "i" in class bottom from the variable named "i" in class top. Now, look what happens when top and bottom are blindly merged into a single class:
class top {
    int i;
    double i;                         // illegal
    ... System.out.println(i); ...    // ambiguous
    ... System.out.println(i); ...
}
That is, there are two variables named "i" and now references to variable "i" are ambiguous -- it is not obvious which variable is being referenced. This is typical of inadvertent capture. We will encounter such problems in PJ because of the way inheritance hierarchies are collapsed. The ambiguous (and incorrect) code above will be produced by PJ if class top were composed with the following extension:
extends class top {
    double i;
    ... System.out.println(i); ...
}
A way to avoid inadvertent capture is to declare identifiers "local" to a particular class, and have them "mangled" so that they are forever unique. In GenBorg, inadvertent capture is a bit more difficult because identifiers are local to a layer, and a layer encapsulates many classes (or class extensions). Each class can reference variables and methods that are local to other classes of that layer. So assigning a mangled name to an identifier must be done consistently across all classes of a layer.
The solution used in PJ is to introduce a LocalId declaration which simply lists the identifiers that are local to a layer and that are to be mangled. In effect, the translation of a LocalId declaration is a textual search and replace performed on the body of the PJ declaration (class, state machine, etc.) thereby yielding the correct specification. As an example, a way to declare class top above is:
LocalId i; // list of names that are local to a layer, and are to be mangled
class top {
int i;
... System.out.println(i); ...
}
When top is instantiated, the name "i" is mangled -- i.e., changed into a unique identifier. Assuming the name mangling changes "i" to "i$1", an instantiation of top yields:
class top {
    int i$1;
    ... System.out.println(i$1); ...
}
The actual mangling used is <variableName>$$<layerName>.
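The "textual search and replace" reading of LocalId can be sketched as a single pass over identifier tokens. This is a hedged illustration, not PJ's code: LocalIdSketch and mangle are hypothetical names, and the identifier pattern is a simplification that also matches names inside comments and string literals.

```java
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

class LocalIdSketch {
    // Simplified Java identifier: a letter, '_' or '$', then letters, digits, '_' or '$'.
    private static final Pattern IDENT = Pattern.compile("[A-Za-z_$][A-Za-z0-9_$]*");

    /** Renames every whole identifier listed in localIds to <name>$$<layer>. */
    static String mangle(String body, Set<String> localIds, String layer) {
        Matcher m = IDENT.matcher(body);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String id = m.group();
            String replacement = localIds.contains(id) ? id + "$$" + layer : id;
            // quoteReplacement keeps '$' from being read as a group reference
            m.appendReplacement(out, Matcher.quoteReplacement(replacement));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) {
        String body = "int i; System.out.println(i);";
        System.out.println(mangle(body, Set.of("i"), "Ctop"));
        // prints int i$$Ctop; System.out.println(i$$Ctop);
    }
}
```

Matching whole tokens in one pass means that "int" and "println" are left alone even though they contain the letter i, and an already mangled name is never mangled a second time.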
Consider the following base class specification with local identifiers i, j, ii, jj, and foo.
package Ctop;
import Jakarta.util.*;

LocalId i, j;
LocalId ii, jj;
LocalId foo;

class top {
    static int i,j;
    int ii,jj;
    static { i = 4; }
    top() { ii = 5; }
    top(int rj) { jj = rj; }
    void foo(float x, float y) { i = j = x+y; }
}
Now consider an extension to top that uses exactly the same identifiers locally:
package Cmid;
import java.util.*;

LocalId i, j, ii, jj, foo;

extends class top implements java.io.Serializable {
    static int i,j;
    int ii,jj;
    static { i = 4; }
    void foo(float x, float y) { i = j = x*y; }
}
PJ composes these two specifications to yield:
package Ctop;
import Jakarta.util.*;
import java.util.*;

class top implements java.io.Serializable {
    static int i$$Ctop,j$$Ctop;
    static int i$$Cmid,j$$Cmid;
    static { i$$Ctop = 4; }
    static { i$$Cmid = 4; }
    int ii$$Ctop,jj$$Ctop;
    int ii$$Cmid,jj$$Cmid;
    top() { ii$$Ctop = 5; }
    top(int rj) { jj$$Ctop = rj; }
    void foo$$Ctop(float x, float y) { i$$Ctop = j$$Ctop = x+y; }
    void foo$$Cmid(float x, float y) { i$$Cmid = j$$Cmid = x*y; }
}
Note that PJ first assigns unique (mangled) names to local identifiers before composing specifications. It doesn't matter if PJ is composing classes, interfaces, state machines or whatever. The LocalId feature works for them all.
Finally, note that the LocalId declaration(s) list all of the identifiers that are local to a layer. Not all identifiers listed need be present in the body of a base or extension specification.
Use the following to invoke PJ from the command line:
> java PJ.Main -tk Basefile Extension1file Extension2file ... > output
The option -t selects type sorting; -k selects key sorting. The resulting output (error messages and all) is sent to standard output.
PJ is a package that can be called from another program (i.e., tool) in the following way.
(a) set PJ flags, clear error and warning counters, and specify the destination of error and warning messages:
PJ.JTSParseTree.setFlags( true, true );
PJ.JTSParseTree.resetCounters();
JTSParseTree.setReportStream( new PrintWriter( System.out ) );
(b) create a JTSParseTree for each file, such as below:
// declared outside the try block so that later steps can use them
PJ.JTSParseTree base = null;
PJ.JTSParseTree ext = null;
try {
    base = new PJ.JTSParseTree( "base-file-name" );
    ext = new PJ.JTSParseTree( "extn-file-name" );
}
catch (Exception e) {
    System.err.println( "can't parse or open base or extension files: " + e.getMessage() );
}
(c) compose the trees, as below:
try {
    base.compose( ext );
    if (PJ.JTSParseTree.errorCount() == 0) {
        // composition succeeded
    }
    else {
        // composition failed -- error string appears in reportStream PrintWriter
    }
}
catch (Exception e) {
    // fatal error occurred in composition; something is likely wrong with PJ
    System.err.println( e.getMessage() );
}
Note: compose can throw a RuntimeException if a fatal error occurs; do not attempt further compositions after that point. Ordinary composition errors are reported through the error counter and the report stream, and do not throw exceptions.
(d) after all required files have been composed, set the name of the package for the generated class.
base.setPackageName( "name-of-package" );
(e) output the resulting file. This can be done by any one of several methods:
- print() -- to standard out
- print2file( Writer w ) -- to the given writer
- print2file( File f ) -- to the given file
- print2file( String name ) -- to a file with the given name
As an example:
base.print();
In regression/PJ you will find regression tests for interfaces and classes. To run the entire suite of tests type:
> bash regress
PJ currently understands extensions to Java classes, interfaces, and state diagrams. It does not yet understand other extensions that have been added to Java, such as AST constructors. Extending PJ to support most of these features should be very simple.
To extend PJ, every AST_FieldDecl and TypeDeclaration production of the Java grammar must support a compose method. See the preprocess, CompInt, CompClass, and CompSm layers for details.
Another limitation is variable capture. For now, this means that programmers must be very careful in selecting the names of variables and methods so that there is no name collision. More on this later.