On the trees underlying natural language sentences

Christian Retoré

Colloquium in Honor of Gérard Huet, Paris 22-23 June 2007

We address the old question that led to formal language theory: which formal grammars generate natural language sentences? We will first survey the linguistic hypotheses and formal studies that led to the description of sentences as the strings produced by mildly context-sensitive grammars. In order to interpret sentences, one needs to assign them some structure, usually trees and sometimes graphs. We will then survey the current knowledge on these classes of trees before presenting a result by Kobele, Salvati and the present author, inspired by the two-step approach to natural language formalisms of Michaelis, Mönnich and Morawietz: the parse trees produced by minimalist grammars, a mildly context-sensitive formalism introduced by Stabler, can be described as the image of a regular tree language under some tree transducer.