Syntactic analysis

Syntax analysis generally

The syntactic analysis maps the concrete syntax onto a more abstract syntax tree, which in SITE we call the Common Representation (CR). The tree of a particular SDL specification can be stored as an ASCII file, which is produced with the help of Kimwitu functionality. Such an ASCII file (with the extension .cr) is the basis for exchanging specifications between the individual SITE components. The syntax analysis accepts both SDL-96 and ASN.1, and the two languages can be combined. Numerous command-line parameters let the user influence characteristics of the accepted syntax.


Error messages

Syntax errors are reported together with the error position. To help locate the error, the current input symbol and a syntactic context are printed. The context consists of the current grammar rule(s), in which a dot marks how far processing has advanced.

Example: The definition

B::= SEQUENCE { 
       i INTEGER
       b BOOLEAN
     }
produces the message
work.sdl:3:  parse error before `b`:
  ExtendedElementeTypeList -> ElementeTypeList .
  ExtendedElementeTypeList -> ElementeTypeList . ',' DotDotDot
  ElementeTypeList -> ElementeTypeList . ',' ElementeType
The syntactic analysis detects an error in line 3 of the file work.sdl: the symbol 'b' cannot be processed in the current grammatical context. The dot in the rules indicates that either the list of components is finished or a ',' is expected next, followed by another component or '...'. Obviously the comma is missing here. If the user does not know how the component list is terminated, the SDL grammar description of the analysis, which is organized as hypertext, can be consulted. This document is generated directly from the grammar definition file of the analysis component with the SITE component gconv.
This method of error reporting, which we developed, has among other things been incorporated into newer versions of Kimwitu.
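
The dotted rules in the message can be read mechanically. The following is a minimal Python sketch, not the SITE implementation: it assumes a rule is given as a (left-hand side, right-hand side, dot position) triple, and the names format_item and expected_symbols are purely illustrative.

```python
def format_item(lhs, rhs, dot):
    """Render a grammar rule with a dot after the symbols already processed."""
    if not 0 <= dot <= len(rhs):
        raise ValueError("dot position outside the rule")
    symbols = list(rhs[:dot]) + ["."] + list(rhs[dot:])
    return f"{lhs} -> {' '.join(symbols)}"


def expected_symbols(items):
    """Collect the symbols that directly follow the dot, without duplicates."""
    expected = []
    for _lhs, rhs, dot in items:
        if dot < len(rhs) and rhs[dot] not in expected:
            expected.append(rhs[dot])
    return expected


# The three rules from the message above, written as illustrative triples.
ITEMS = [
    ("ExtendedElementeTypeList", ["ElementeTypeList"], 1),
    ("ExtendedElementeTypeList", ["ElementeTypeList", "','", "DotDotDot"], 1),
    ("ElementeTypeList", ["ElementeTypeList", "','", "ElementeType"], 1),
]

if __name__ == "__main__":
    for item in ITEMS:
        print("  " + format_item(*item))
    print("expected next:", expected_symbols(ITEMS))
```

A rule whose dot stands at the very end (the first one here) means that the construct may be complete at this point; the other rules name the symbol that would allow the parser to continue.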

Parser technology

The implementation of the syntax analysis is based on the UNIX compiler tools yacc and lex, although by now only the GNU versions bison and flex can be used. This is due on the one hand to the size of the grammar and on the other hand to the way in which the SDL/PR text is transferred into the CR for reconstruction purposes (only from CR version 3.4 onwards). If the analysis was successful, one (and only one!) CR file is written.

Lexical analysis

The scanner assembles the lexical units (tokens) of the languages from characters. In our case the scanner has several main states:

Switching the language also replaces the keyword hash tables, which is why SITE makes no syntactic compromises between the languages. Any amount of white space may appear between the lexical units: comments, blanks and other control characters. This white space is stored together with the original spelling of the following lexical unit and is likewise passed on to the parser. White space and tokens are separated by control characters so that they can be told apart later. Comments are additionally examined for directives that are relevant to the scanner.
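
The idea of one keyword table per language, with white space attached to the following token, can be sketched as follows. This is a simplified Python illustration under stated assumptions, not the flex scanner of SITE: the keyword tables are tiny and hypothetical, the comment syntax is reduced to /* ... */ for SDL and "-- ..." up to the end of the line for ASN.1, and SDL keywords are assumed to be matched case-insensitively.

```python
import re

# Hypothetical, heavily abbreviated keyword tables (one per language).
KEYWORDS = {
    "sdl": {"block", "system", "referenced", "endsystem"},
    "asn1": {"SEQUENCE", "INTEGER", "BOOLEAN"},
}

# White space in the sense of the text: blanks, control characters, comments.
# The comment syntax is simplified and switched together with the language.
SPACE = {
    "sdl": re.compile(r"(?:\s+|/\*.*?\*/)*", re.DOTALL),
    "asn1": re.compile(r"(?:\s+|--[^\n]*)*"),
}
TOKEN = re.compile(r"::=|[A-Za-z_][A-Za-z0-9_]*|\d+|\S")


def scan(text, language):
    """Yield (kind, spelling, leading_space) triples for the given language."""
    keywords = KEYWORDS[language]
    space_re = SPACE[language]
    # Assumption: SDL keywords ignore case, ASN.1 keywords do not.
    fold = str.lower if language == "sdl" else (lambda s: s)
    pos = 0
    while True:
        space = space_re.match(text, pos).group()
        pos += len(space)
        if pos >= len(text):
            # Trailing white space is kept on an artificial end token.
            yield "eof", "", space
            return
        spelling = TOKEN.match(text, pos).group()
        pos += len(spelling)
        if fold(spelling) in keywords:
            kind = "keyword"
        elif spelling[0].isalpha() or spelling[0] == "_":
            kind = "name"
        else:
            kind = "symbol"
        yield kind, spelling, space


if __name__ == "__main__":
    for token in scan("B ::= SEQUENCE { -- fields\n i INTEGER }", "asn1"):
        print(token)
```

Because every token carries the white space that precedes it, concatenating leading_space and spelling over all tokens reproduces the input text exactly, which is what makes a later reconstruction of the source possible.
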
A second substantial task of the scanner is switching the input source between different files. All SDL tools know include directives, and so does SITE. Unfortunately, inclusion is complicated in SDL. An include directive that refers to a remote definition (that is, the complete definition) can be specified at the place where the definition is referenced (block B referenced; do not confuse the two!). Such include directives may, however, be executed only at the end of the package or the system. In the end one needs a stack of file lists, which also has to be maintained properly in order to produce dependencies for makefiles. In addition, a warning is issued for multiple inclusions. Note that such include directives are executed by the parser, i.e. the CR contains the contents of these files. Reloading missing packages in other tools has nothing to do with this kind of include; in particular, there are no checks!
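
The bookkeeping described above (deferring includes to the end of a package or system, warning about multiple inclusions, collecting makefile dependencies) might look roughly like this in Python. The class and method names are hypothetical, and skipping a file that was already included is an assumption of this sketch; the text only states that a warning is issued.

```python
class IncludeStack:
    """Deferred include handling with one file list per open package or system."""

    def __init__(self):
        self.scopes = []        # stack of file lists, innermost scope last
        self.dependencies = []  # files already scheduled for reading, in order
        self.warnings = []

    def open_scope(self):
        """Called when a package or system starts."""
        self.scopes.append([])

    def defer(self, filename):
        """Remember an include seen at a reference such as 'block B referenced;'."""
        pending = any(filename in files for files in self.scopes)
        if pending or filename in self.dependencies:
            self.warnings.append(f"warning: multiple inclusion of {filename}")
            return  # assumption: the file is not read a second time
        self.scopes[-1].append(filename)

    def close_scope(self):
        """Called at the end of a package or system; returns the files to read now."""
        files = self.scopes.pop()
        self.dependencies.extend(files)
        return files

    def makefile_rule(self, target):
        """Format the collected dependencies as a makefile rule."""
        return f"{target}: {' '.join(self.dependencies)}"


if __name__ == "__main__":
    includes = IncludeStack()
    includes.open_scope()
    includes.defer("b.sdl")
    includes.defer("c.sdl")
    includes.defer("b.sdl")
    print(includes.close_scope())
    print(includes.warnings)
    print(includes.makefile_rule("work.cr"))
```

Files read at the end of a scope may contain include directives of their own; handling those would mean opening a further scope while reading them, which this sketch leaves out.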

Grammar analysis

The parser builds a syntax tree according to its grammar rules, i.e. it calls constructors generated by Kimwitu. In case of errors it attempts to recover, which does not always succeed satisfactorily, because the recovery rules were inserted by hand into a grammar of several hundred rules. Any ideas? In addition, the textual information of the lexical units stored by the scanner and, on a change of file, the new file name are attached as attributes to the nodes of the rule. For this reason the complete file structure can be restored from a CR file.
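
How the file structure can be restored from such attributes is sketched below in Python. The layout is assumed for illustration only and is not the CR format: each leaf is a dictionary with the keys 'space' and 'spelling', and a leaf at which the input file changes additionally carries 'file'.

```python
def restore_files(leaves, main_file):
    """Rebuild the text of every source file from attributed leaf nodes.

    leaves    -- leaf nodes in source order (hypothetical dictionary layout)
    main_file -- name of the file in which the analysis started
    """
    files = {}
    current = main_file
    for leaf in leaves:
        # A 'file' attribute marks the point where the input source changed.
        current = leaf.get("file", current)
        files[current] = files.get(current, "") + leaf["space"] + leaf["spelling"]
    return files


if __name__ == "__main__":
    leaves = [
        {"space": "", "spelling": "system"},
        {"space": " ", "spelling": "S"},
        {"space": "", "spelling": ";"},
        {"file": "b.sdl", "space": "", "spelling": "block"},
        {"space": " ", "spelling": "B"},
        {"file": "work.sdl", "space": "\n", "spelling": "endsystem"},
    ]
    for name, text in restore_files(leaves, "work.sdl").items():
        print(name, repr(text))
```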


Further documentation

There is a UNIX manual page that explains the command-line parameters in detail. Various student works, e.g.
Andreas Lutsch, Ein Parser für SDL (German) with syntax diagrams, Jahresarbeit, Humboldt-Universität zu Berlin, November 1993
Ralf Schröder, SDL-92 data handling in combination with ASN.1, master's thesis, Humboldt-Universität zu Berlin, February 1994
also cover partial aspects of the analysis. Not all of this material is up to date, however. A look into the grammar sources or into the generated grammar rules of the parser will answer most questions. A small EBNF pretty printer serves as a general introduction to the SITE technology (parser, semantics, code generation). This example was used in a lecture about Kimwitu.


Sources and versions

The sources are kept in a CVS repository. This repository, however, contains all project-specific adaptations, which are not freely available. Public versions are therefore extracted at irregular intervals; at the time of writing these pages we use version 3.4 internally. You can ask whether there are relevant changes relative to the public version. The only substantial reason for holding a version back is actually a critical CR modification. At the moment we use CR version 3.4 internally. The sources should be on our FTP server. Binary files for various architectures can be used, provided there are no problems with dynamic libraries. The IDL version is not public yet, because its subsequent processing is not finished.

One more request: the grammar supplied by us is presumably the only freely available, complete and probably very frequently copied (that is fine!) SDL-96 grammar. We would appreciate a message about errors and improvements. We are also working on SDL-2000!


Installation

The syntax analysis must be found as an executable program, i.e. PATH may have to be set accordingly. There are no further dependencies. The manual page should be copied to a place accessible to the man command (mind the file extension). It is also worth reading!


Compilation

Compilation is not quite simple, since various tools are required. If necessary, you can download the generated C files and compile only these. That would also be the simplest method for Windows, since not everyone installs a GNU environment. Here is a description of the actions that take place during compilation:

Now to the problems:


Known bugs

Some argue that errors should simply be handled with exit(1). Given the speed of today's computers, that is a better solution than the implemented one. The management of the options and the files is quite chaotic and therefore certainly not free of errors. Once we implement SDL-2000 there will be some improvements; until then...
The parser accepts more syntax than the standard permits; the semantic analysis is expected to handle that. Altogether, there is no proof that the parser conforms to the standard, and an error still turns up about every six months.


created 29-Nov-1994 by SITE maintenance crew
last change at Fri Jun 27 11:26:44 CEST 2003