Typing as a means for validating feature structures

Anoop Sarkar (University of Pennsylvania)
Shuly Wintner (University of Pennsylvania)

The XTAG grammar development system makes limited use of feature
structures, which can be attached to nodes in the trees that make up a
grammar. The system allows the user to define path equations that are
interpreted as specifying feature structures. Feature structure
specifications can refer to lexical items, tree families or specific
trees, and are declared in three different formats and in three different
files. This organization leaves room for several kinds of errors,
inconsistencies and typos in feature structure manipulation: undefined
features can be referenced, paths can be assigned undefined values,
incompatible features can be equated, etc.

We present a method for validating the consistency of feature structure specifications by imposing a type discipline. A typed system facilitates a great number of compile-time checks: many possible errors can be detected before the grammar is used for parsing. We have constructed a type signature for an existing broad-coverage grammar of English, and implemented a type inference algorithm that operates on the feature structure specifications in the grammar. The algorithm reports every specification that is incompatible with the type signature. We have detected a large number of errors in the grammar; we discuss four types of errors.
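
To make the idea of checking specifications against a signature concrete, the following is a minimal sketch in Python. It is not the type inference algorithm described above and does not use XTAG's actual file formats: the signature contents, the dotted feature names, the triple encoding of equations and the function name check_equations are all assumptions made for illustration. It flags the three kinds of errors listed earlier: undefined features, undefined values, and incompatible features being equated.

```python
from typing import Dict, FrozenSet, List, Tuple

# Hypothetical signature: each feature name maps to the atomic values it allows.
# The feature names and values are illustrative, not taken from the XTAG grammar.
SIGNATURE: Dict[str, FrozenSet[str]] = {
    "agr.num": frozenset({"sing", "plur"}),
    "agr.pers": frozenset({"1", "2", "3"}),
    "mode": frozenset({"ind", "inf", "ger"}),
}

# Assumed encoding of a path equation: (left feature, kind, right side),
# where kind is "value" (right side is an atom) or "path" (right side is a feature).
Equation = Tuple[str, str, str]


def check_equations(equations: List[Equation]) -> List[str]:
    """Return one message per equation that conflicts with SIGNATURE."""
    errors: List[str] = []
    for lhs, kind, rhs in equations:
        if lhs not in SIGNATURE:
            errors.append(f"undefined feature: {lhs}")
            continue
        if kind == "value":
            # The assigned atom must be one of the values declared for the feature.
            if rhs not in SIGNATURE[lhs]:
                errors.append(f"undefined value {rhs!r} for {lhs}")
        else:
            if rhs not in SIGNATURE:
                errors.append(f"undefined feature: {rhs}")
            elif not (SIGNATURE[lhs] & SIGNATURE[rhs]):
                # Two equated features must share at least one possible value.
                errors.append(f"incompatible features equated: {lhs} = {rhs}")
    return errors


if __name__ == "__main__":
    sample = [
        ("agr.num", "value", "sing"),   # consistent with the signature
        ("agr.nmu", "value", "sing"),   # typo in the feature name
        ("mode", "value", "plur"),      # value not declared for this feature
        ("agr.num", "path", "mode"),    # features with disjoint value sets
    ]
    for message in check_equations(sample):
        print(message)
```

A flat table of atomic values keeps the sketch short; a full type signature would also have to relate types to one another and declare which features are appropriate for each type, which this example leaves out.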

While the method we suggest was tested on an XTAG grammar, it is in principle applicable to any linguistic formalism that uses untyped feature structures (in particular, LFG). Further advantages can be expected in the future: typed feature structures make it easier to express linguistic generalizations, and usually lead to more efficient processing.