CU Arcadia Project Software System

jb - Java Bison Parser Runtime

Last Updated: 15 March 1996

Latest Version: jb1.0a

Location:

The following references are symbolic links to the latest versions.

Description:

The jb system provides a means to execute, in Java (tm), parsers generated using the GNU Bison parser generator system.

Jb takes the C file output by Bison and scans it to extract the parse tables and constants. Jb then passes over various templates specified by the user and inserts the extracted information at specified points in the templates.
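As a rough illustration of the extraction step, the sketch below pulls one integer table out of C source text. It is not jb's implementation (jb is a tcl script). It assumes that each table appears in the Bison output as a C array initializer of the form "name[] = { ... }". The class name TableExtractor and the method extractTable are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, for illustration only: not part of jb.
class TableExtractor {
    // Returns the integers of the named C array, or null if it is not found.
    static int[] extractTable(String cSource, String name) {
        // Assumed declaration shape: name[] = { v0, v1, ... }
        Pattern decl = Pattern.compile(
            "\\b" + Pattern.quote(name) + "\\s*\\[\\s*\\]\\s*=\\s*\\{([^}]*)\\}");
        Matcher m = decl.matcher(cSource);
        if (!m.find()) {
            return null;
        }
        // Collect every (possibly negative) integer between the braces.
        List<Integer> values = new ArrayList<>();
        Matcher num = Pattern.compile("-?\\d+").matcher(m.group(1));
        while (num.find()) {
            values.add(Integer.parseInt(num.group()));
        }
        int[] result = new int[values.size()];
        for (int i = 0; i < result.length; i++) {
            result[i] = values.get(i);
        }
        return result;
    }
}
```

A template pass could then write such an array out as a Java int[] initializer at the corresponding marker.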

The following discussion assumes substantial familiarity with parsing using bison or yacc.

A jb parser consists of the following parts.

yyparse.java
The file yyparse.java is generated from yyparse.template. It is the primary runtime parse engine and contains one class, yyparse, which must be instantiated to create a parser instance. Note that the class name (``yyparse'') is the default. The discussion of flags below indicates how to provide an alternate class name.

The yyparse class methods are as follows.

tokentypes.java
The file tokentypes.java is generated from tokentypes.template. It contains the terminal type integer constants shared by the parser and the lexical scanner. It contains one abstract class: tokentypes.

token.java
The file token.java defines the objects that are pushed onto the semantic stack during parsing. In other words, a reference to, e.g., $1 in a parse action will be a reference to an object of type token. The token class contains an integer tokentype value, typically taken from the values in class tokentypes.

When the token has a semantic value (e.g., integer, string), the lexical analyzer will want to use an appropriate subtype of token to hold that semantic value. Currently, the following subclasses are defined: int_token, long_token, double_token, boolean_token, string_token, and object_token. You are, of course, free to define other subclasses of token for your own purposes. Note that the tokentype stored in a token can be used to do appropriate downcasting of the tokens in the parse ($1, $2, etc).
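The downcasting idea can be sketched as follows. The classes here are simplified stand-ins written for this example, not the actual classes in the bison package; the field names (tokentype, value) and the constants NUM and ID are assumptions.

```java
// Stand-in for the token class; the field name is assumed.
class token {
    int tokentype;               // normally a value from class tokentypes
    token(int tokentype) { this.tokentype = tokentype; }
}

// Stand-in for int_token: a token carrying an integer semantic value.
class int_token extends token {
    int value;
    int_token(int tokentype, int value) { super(tokentype); this.value = value; }
}

// Stand-in for string_token: a token carrying a string semantic value.
class string_token extends token {
    String value;
    string_token(int tokentype, String value) { super(tokentype); this.value = value; }
}

class TokenDemo {
    // Hypothetical terminal codes; real ones come from tokentypes.java.
    static final int NUM = 258;
    static final int ID = 259;

    // Uses the tokentype to pick the downcast, as a parse action might for $1.
    static String describe(token t) {
        switch (t.tokentype) {
            case NUM:
                return "number " + ((int_token) t).value;
            case ID:
                return "identifier " + ((string_token) t).value;
            default:
                return "token " + t.tokentype;
        }
    }
}
```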

yylex.java
The file yylex.java is the lexical scanner used by the parser. The parser is passed an instance of this class at construction time. As a rule, this class is subclassed to define the next_token method, which does the actual lexical analysis. Your subclass of yylex will normally import the tokentypes class to get access to the tokentype values. Alternatively, the subclass source could itself be passed through jb as a template to insert the tokentype constants directly.

The yylex class has one primary method: next_token. This method is repeatedly invoked by yyparse.parse to obtain lexical items. The method yylex.next_token returns an integer tokentype value. Additionally, it places the associated semantic value for the token into a public attribute: yylval. If there is no associated value, then yylval should be set to null.
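A minimal sketch of such a scanner is shown below. The token, int_token, and yylex classes are simplified stand-ins for the ones in the bison package, and their exact declarations are assumptions. The CalcLexer class, its token codes, and its error handling are hypothetical.

```java
// Simplified stand-ins for the bison package classes; the real ones may differ.
class token {
    int tokentype;
    token(int tokentype) { this.tokentype = tokentype; }
}

class int_token extends token {
    int value;
    int_token(int tokentype, int value) { super(tokentype); this.value = value; }
}

abstract class yylex {
    public token yylval;               // semantic value of the latest token
    public abstract int next_token();  // returns a tokentype value
}

// Hypothetical scanner for unsigned integers and '+'; token codes are made up.
class CalcLexer extends yylex {
    static final int EOF = 0, NUM = 258, PLUS = 259;
    private final String input;
    private int pos = 0;

    CalcLexer(String input) { this.input = input; }

    @Override
    public int next_token() {
        // Skip blanks between lexical items.
        while (pos < input.length() && input.charAt(pos) == ' ') {
            pos++;
        }
        if (pos >= input.length()) {
            yylval = null;             // end of input has no semantic value
            return EOF;
        }
        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            int n = 0;
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
                n = n * 10 + (input.charAt(pos) - '0');
                pos++;
            }
            yylval = new int_token(NUM, n);
            return NUM;
        }
        pos++;
        yylval = null;                 // operators carry no semantic value
        if (c == '+') {
            return PLUS;
        }
        throw new IllegalStateException("unexpected character: " + c);
    }
}
```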

ParseException.java, int_vector.java, int_stack.java, token_stack.java
These files contain various support classes used by the parser. Since they are not templates, these files, along with yylex.java and the token classes, are placed in a separate package named bison that can be shared by all parsers and lexers.

Example:

The usual calculator example has been provided to show how the system is used. The calculator files are as follows.

The sequence for building the calculator is shown in the file Makefile.calc. It essentially consists of the following sequence of actions.

  1. Compile the parser-independent files (ParseException, etc).
  2. Run bison on calc.y to produce calc.tab.c.
  3. Run the jb program with input file calc.tab.c and with a list of file pairs, each naming a template file and the output file into which the table and constant information is to be inserted.
  4. Compile the parser java files.
  5. Execute the parser with the command ``java Main''.

Invoking the jb Command:

The jb command is actually a tcl script. It interprets its first argument as the input file, which is presumably the C file output by bison. The rest of the arguments to jb are taken in pairs. The first file of each pair is the template file and the second is the name of the file in which to place the result. Each template is read a line at a time and each line is written to the output file. As each line is read, it is scanned for selected markers, and when one is encountered, the corresponding information from the input file (suitably modified for java) is inserted into the output file.

The list of markers is as follows.

Additionally, jb will accept flag arguments of the following form.

 -flagname flagvalue 
As it scans the template files, it looks for instances of @flagname@. If it encounters an instance of that flag, then it will substitute the corresponding flagvalue string. Currently, one such flag is predefined.
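The @flagname@ substitution can be pictured with the short Java sketch below. It is not jb's code (jb performs this step in tcl). The class FlagSubstitution, the method substitute, and the flag name myflag used in the usage note are all made up for this example.

```java
import java.util.Map;

// Hypothetical illustration of @flagname@ substitution; not part of jb.
class FlagSubstitution {
    // Replaces each @name@ in a template line with the value given for -name.
    static String substitute(String line, Map<String, String> flags) {
        String result = line;
        for (Map.Entry<String, String> e : flags.entrySet()) {
            // String.replace substitutes every literal occurrence.
            result = result.replace("@" + e.getKey() + "@", e.getValue());
        }
        return result;
    }
}
```

With a flag map holding myflag = CalcParser, the template line "public class @myflag@ {" would be written out as "public class CalcParser {", while lines containing no flag are copied unchanged.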

Dependencies:

Jb uses tcl to do the extraction of information and the insertion in the templates. It has been tested on tcl version 7.4, but should work with other versions since it does not do any sophisticated processing.

Jb has been tested with bison version 1.24. Other versions will probably work too, provided the format of the parser output by bison has not changed substantially.

Supported Platforms:

It should be possible to install jb on any platform that has tcl and Bison.

Acknowledgements:

This work is sponsored by the Air Force Materiel Command, Rome Laboratory, and the Defense Advanced Research Projects Agency under Contract Number F30602-94-C-0253.

Contact and Bug Reports:

Dennis Heimbigner (dennis@cs.colorado.edu)