GAG Genetic algorithms for Grammars
(The sex life of grammars)

Marc Blasband (Compuleer)

Spoken dialog systems are being deployed commercially on a large scale. They
are computer systems that answer queries from human callers through the
telephone. They use grammars to restrict the large number of ways the
utterances from the callers can be interpreted and to map this input onto
the concepts needed to answer the questions of the clients.

The labour required to build these grammars is extensive. Genetic algorithms can generate these grammars automatically. As it is the case for hand crafted grammars, word categories must be defined, the needed concepts determined and a dictionary with all the words used made available. Many solved examples are also required. A solved example contains the original input utterance and the meaning of every sentence. This meaning is represented here by the list of concepts expressed in the utterance with their values.

The purpose of the project GAG is to generate the grammar automatically using the examples without any modification to the text. A real life corpus was the starting point. The concepts expressed in every utterance and their values were determined manually.

The paper describes the project GAG. It begins by introducing the type of grammars that are considered and the way the genetic algorithm works. Then the paper lists the lessons learnt for the structure of the grammars and for the parsers that should use these grammars. The major point is the separate handling of the literal meaning of a text (e.g. tomorrow) and its canonical meaning (e.g. 1 October 1999).

The results of experiments are then shown in the paper with the performance reached. The 87% accuracy compares favourably with the performance of hand crafted grammars.