Parsing and Printing tools

Camlp5 provides two parsing tools:

The first parsing tool, the stream parsers, is the elementary system. It is pure syntactic sugar, i.e. the code is directly converted into basic OCaml statements: essentially functions, pattern matchings, try. A stream parser is just a function. But the system does not manage associativity, nor parsing level. Left recursion results on infinite loops, just like functions whose first action would be a call to itself.

The second parsing tool, the extensible grammars, are more sophisticated. A grammar written with them is more readable, and look like grammars written with tools like "yacc". They take care of associativity, left recursion, and level of parsing. They are dynamically extensible, what allows the syntax extensions what Camlp5 provides for OCaml syntax.

In both cases, the input data are streams.

Camlp5 also provides:

The next sections give an overview of the parsing and printing tools.

  1. Stream parsers
  2. Extensible grammars
  3. Pretty module
  4. Extensible printers

Stream parsers

The stream parsers is a system of recursive descendant parsing. Streams are actually lazy lists. At each step, the head of the list is compared against a stream pattern. There are three kinds of streams parsers:

The differences are about:

The imperative parsers implement what is called "predictive parsing", i.e. recursive descendant parsing without backtrack.

In the imperative version, there also exist lexers, a shorter syntax when the stream elements are of the specific type 'char'.

Extensible grammars

Extensible grammars manipulate grammar entries. Grammar entries are abstract values internally containing mutable stream parsers. When a grammar entry is created, its internal parser is empty, i.e. it always fails when used. A specific syntactic construction, with the keyword "EXTEND" allows one to extend grammar entries with new grammar rules.

In opposition to stream parsers, grammar entries manage associativity, left factorization, and levels. Moreover, these grammars allow optional calls, lists and lists with separators. They are not however functions and hence cannot have parameters.

Since the internal system is stream parsers, extensible grammars use recursive descendant parsing.

The parsers of the OCaml language in Camlp5 are written with extensible grammars.

Pretty module

The "Pretty" module is an original tool allowing control over the displaying of lines. The user must specify two functions where:

The system first tries to print on a single line. At any time, if the line overflows, i.e. if its size is greater than some "line length" specified in the module interface, or if it contains newlines, the function is aborted and control is given to the second function, to print on several lines.

This is a basic, but powerful, system. It supposes that the programmer takes care of the current indentation, and the beginning and the end of its lines.

The module will be extended in the future to hide the management of indendations and line continuations, and by the supply of functions combinating the two cases above, in which the programmer can specify the possible places where newlines can be inserted.

Extensible printers

The extensible printers are symmetric to the extensible grammars. The extensible grammars take syntax rules and return syntax trees. The extensible printers are actually extensible functions taking syntax trees as parameters and returning the pretty printed statements in strings.

The extensible printers can have printing levels, just like grammars have parsing levels, and it is possible to take the associativity into account by provided functions to call either the current level or the next level.

The printers of the OCaml language are written with extensible printers.


Copyright 2007-2012 Daniel de Rauglaudre (INRIA)

Valid XHTML 1.1