CU Arcadia Project Software System
jb - Java Bison Parser Runtime
Last Updated: 15 March 1996
Latest Version: jb1.0a
Location:
The following references are symbolic links to the latest versions.
- README: ftp://ftp.cs.colorado.edu/pub/cs/distribs/arcadia/jb.txt
- SOURCE: ftp://ftp.cs.colorado.edu/pub/cs/distribs/arcadia/jb.tar
Description:
The jb system provides a means to execute, in Java (tm), parsers
generated using the GNU Bison parser generator system.
Jb takes the C file output by Bison and scans it to extract
the parse tables and constants. Jb then passes over
various templates specified by the user and inserts the
extracted information at specified points in the templates.
The following discussion assumes substantial familiarity with
parsing using bison or yacc.
A jb parser consists of the following parts.
- yyparse.java
- The file yyparse.java is generated from yyparse.template.
It is the primary runtime parse engine.
It contains one class: yyparse.
The yyparse class is the primary parse class, and
it must be instantiated to create a parser instance.
Note that the class name (``yyparse'') is the default.
The flags note below indicates how to provide an alternate
class name.
The yyparse class methods are as follows.
- yyparse --
the constructor takes an instance of a lexical scanner
(of class yylex) as an argument.
- setdebug -- turns debug output on or off.
- parse -- executes the parse engine.
- yyerror -- invoked when a parse error occurs.
- tokentypes.java
- The file tokentypes.java is generated from tokentypes.template.
It contains the terminal type integer constants shared by the parser
and the lexical scanner.
It contains one abstract class: tokentypes.
- token.java
- The file token.java defines the objects that are pushed
onto the semantic stack during parsing. In other words,
a reference to e.g. $1 in a parse action will be a reference
to an object of type token.
The token class contains an integer value typically
taken from the values in class tokentypes.
When the token has a semantic value (e.g., integer, string),
then the lexical analyzer will want to use an appropriate
subtype of token to hold that semantic value.
Currently, the following subclasses are defined: int_token,
long_token, double_token, boolean_token, string_token,
and object_token. You are, of course, free to define other
subclasses of token for your own purposes. Note that the
tokentype value stored in a token can be used to choose the
appropriate downcast for the tokens in a parse action ($1, $2, etc.).
- yylex.java
- The file yylex.java is the lexical scanner used by the parser.
The parser is passed an instance of this class at construction time.
As a rule, this class is subclassed to define the next_token method,
which does the actual lexical analysis.
Your subclass of yylex will normally import
the tokentypes class to get access
to the tokentype values. Alternatively, the subclass source could
itself be passed through jb as a template to insert the tokentype
constants directly.
The yylex class has one primary method: next_token.
This method is repeatedly invoked by yyparse.parse
to obtain lexical items. The method yylex.next_token
returns an integer tokentype value. Additionally,
it places the associated semantic value for the token
into a public attribute: yylval. If there is no associated
value, then yylval should be set to null.
- ParseException.java, int_vector.java, int_stack.java, token_stack.java
- These files contain various support classes used by the parser.
Since they are not templates, these files, along with yylex.java
and the token classes, are placed in a separate package named
bison that can be shared by all parsers and lexers.
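The relationship between yylex, next_token, yylval, and the token subclasses can be sketched as follows. This is only an illustration: the token, int_token, and yylex classes below are local stand-ins for the ones in the bison package, and their field names, constructors, and the next_token signature are assumptions. The calc_tokentypes constants, calc_lexer, and LexerSketch are hypothetical names invented for the example.

```java
// Stand-ins for the shared bison package classes. Field names,
// constructors, and signatures here are assumptions made for this sketch.
class token {
    int tokentype;                        // assumed field name
    token(int tokentype) { this.tokentype = tokentype; }
}

class int_token extends token {
    int value;                            // assumed field name
    int_token(int tokentype, int value) { super(tokentype); this.value = value; }
}

abstract class yylex {
    public token yylval;                  // semantic value of the last token, or null
    abstract int next_token();            // assumed signature
}

// Hypothetical constants; real values would come from tokentypes.java.
abstract class calc_tokentypes {
    static final int EOF = 0;
    static final int NUMBER = 258;
}

// Hypothetical scanner: digit runs become NUMBER tokens, and any other
// non-blank character is returned as its own character code.
class calc_lexer extends yylex {
    private final String input;
    private int pos = 0;

    calc_lexer(String input) { this.input = input; }

    int next_token() {
        while (pos < input.length() && input.charAt(pos) == ' ') pos++;
        if (pos >= input.length()) { yylval = null; return calc_tokentypes.EOF; }
        char c = input.charAt(pos);
        if (Character.isDigit(c)) {
            int n = 0;
            while (pos < input.length() && Character.isDigit(input.charAt(pos))) {
                n = n * 10 + (input.charAt(pos) - '0');
                pos++;
            }
            yylval = new int_token(calc_tokentypes.NUMBER, n);
            return calc_tokentypes.NUMBER;
        }
        pos++;
        yylval = null;                    // operators carry no semantic value
        return c;
    }
}

class LexerSketch {
    // Drains the scanner the way a parse loop would, downcasting yylval
    // according to the token type that next_token returned.
    static String scan(String text) {
        calc_lexer lexer = new calc_lexer(text);
        StringBuilder out = new StringBuilder();
        int type;
        while ((type = lexer.next_token()) != calc_tokentypes.EOF) {
            if (type == calc_tokentypes.NUMBER) {
                out.append("NUMBER(").append(((int_token) lexer.yylval).value).append(") ");
            } else {
                out.append("'").append((char) type).append("' ");
            }
        }
        return out.toString().trim();
    }

    public static void main(String[] args) {
        System.out.println(scan("12 + 3"));
    }
}
```

In a real jb parser the loop in scan would be replaced by yyparse.parse, and the downcast would appear inside a parse action.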
Example:
The usual calculator example has been provided to show how the system
is used. The calculator files are as follows.
- Main.java --
The file Main.java instantiates and invokes the parser.
It contains a subclass of yylex to do the actual lexical analysis
appropriate to the calculator.
The main program is the main method of class Main. It
instantiates the lexer and the parser and calls the parser's parse
method to parse input strings.
- yyparse.java -- the instantiation of yyparse.template.
- tokentypes.java -- the instantiation of tokentypes.template.
- calc.y -- standard bison parser definition.
The sequence for building the calculator is shown
in the file Makefile.calc.
It essentially consists of the following sequence of actions.
- Compile the parser-independent files (ParseException, etc).
- Run bison on calc.y to produce calc.tab.c.
- Run the jb program with input file calc.tab.c
and with a list of template/output file pairs; the table and constant
information is inserted into each template to produce its output file.
- Compile the parser java files.
- Execute the parser with the command ``java Main''.
Invoking the jb Command:
The jb command is actually a tcl script. It interprets its first
argument as the input file. Presumably this file was in turn
produced as the output from bison. The rest of the arguments to
jb are paired together. The first file of each pair is the
template file and the second is the name of the file in which to
place the result. Each template is read a line at a time and each line is
written to the output file. As each line is read, it is scanned for
selected markers, and when one is encountered, the corresponding
information from the input file (suitably modified for java) is
inserted into the output file.
The list of markers is as follows.
- @PREFIX@ -- this is the text at the beginning of the
parser specification that is enclosed in %{ ... %}.
- @SUFFIX@ -- this is the text from the end of the
parser specification after the last %%.
- @ACTIONS@ -- these are the actions for the parse rules, encoded
as a switch statement.
- @TOKENTYPES@ -- this is the list of token types taken from %token
specifications.
- @TABLES@ -- this is the parse tables and associated constants
used by the parse engine.
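The marker substitution described above can be sketched as follows. Note that jb itself is a tcl script; this Java version is only an illustration of the line-by-line replacement idea, not a port of the actual code. The class name TemplateSketch, the method expand, and the sample extracted text are all hypothetical.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class TemplateSketch {
    // Copies the template line by line, replacing each known @NAME@
    // marker with the text extracted for that name.
    static String expand(String template, Map<String, String> extracted) {
        StringBuilder out = new StringBuilder();
        for (String line : template.split("\n")) {
            for (Map.Entry<String, String> e : extracted.entrySet()) {
                line = line.replace("@" + e.getKey() + "@", e.getValue());
            }
            out.append(line).append('\n');
        }
        return out.toString();
    }

    public static void main(String[] args) {
        // Hypothetical extracted text; real text would come from the
        // C file produced by bison.
        Map<String, String> extracted = new LinkedHashMap<>();
        extracted.put("TOKENTYPES", "    static final int NUMBER = 258;");

        String template = "abstract class tokentypes {\n@TOKENTYPES@\n}\n";
        System.out.print(expand(template, extracted));
    }
}
```

Lines without a marker pass through unchanged, which matches the description of each template line being written to the output file.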
Additionally, jb will accept flag arguments of the following form.
-flagname flagvalue
As it scans the template files, it looks for instances of
@flagname@. If it encounters an instance of that flag, then
it will substitute the corresponding flagvalue string.
Currently, one such flag is predefined.
- @yyparse@ -- this is the class name for the parser class.
It defaults to the string ``yyparse''.
By specifying this flag, one can change the parser class
name, and so one can have multiple parsers in the same
java program.
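The argument conventions of this section (an input file, template/output pairs, and -flagname flagvalue flags) can be sketched as follows. This is an illustration in Java rather than the tcl used by jb, and it rests on assumptions: that flags may appear anywhere among the arguments, and that any argument beginning with "-" is a flag name. The class JbArgsSketch and its fields are hypothetical.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class JbArgsSketch {
    String input;                                         // file produced by bison
    final Map<String, String> flags = new LinkedHashMap<>();
    final List<String[]> pairs = new ArrayList<>();       // {template, output}

    static JbArgsSketch parse(String[] args) {
        JbArgsSketch r = new JbArgsSketch();
        r.flags.put("yyparse", "yyparse");                // documented default
        List<String> positional = new ArrayList<>();
        for (int i = 0; i < args.length; i++) {
            if (args[i].startsWith("-") && args[i].length() > 1) {
                if (i + 1 >= args.length) {
                    throw new IllegalArgumentException("flag " + args[i] + " has no value");
                }
                r.flags.put(args[i].substring(1), args[++i]);
            } else {
                positional.add(args[i]);
            }
        }
        // One input file plus complete pairs means an odd count.
        if (positional.size() % 2 == 0) {
            throw new IllegalArgumentException("expected: input (template output)...");
        }
        r.input = positional.get(0);
        for (int i = 1; i < positional.size(); i += 2) {
            r.pairs.add(new String[] { positional.get(i), positional.get(i + 1) });
        }
        return r;
    }

    public static void main(String[] args) {
        JbArgsSketch r = parse(new String[] {
            "calc.tab.c", "-yyparse", "calcparser",
            "yyparse.template", "yyparse.java",
            "tokentypes.template", "tokentypes.java" });
        System.out.println(r.input + " " + r.flags + " pairs=" + r.pairs.size());
    }
}
```

Each entry in flags would then be substituted for the matching @flagname@ in the templates, in the same way as the predefined markers.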
Dependencies:
Jb uses tcl to do the extraction of information
and the insertion in the templates.
It has been tested on tcl version 7.4, but should
work with other versions since it does not do
any sophisticated processing.
Jb has been tested with bison version 1.24.
Other versions will probably work as well, as long as the
format of the bison output is not substantially changed.
Supported Platforms:
It should be possible to install jb on any platform
that has tcl and Bison.
Acknowledgements:
This work is sponsored by
the Air Force Materiel Command, Rome Laboratory, and
the Defense Advanced Research Projects Agency under
Contract Number F30602-94-C-0253.
Contact and Bug Reports:
Dennis Heimbigner
(dennis@cs.colorado.edu)