NP chunking using Inductive Logic Programming

Stasinos Konstantopoulos (Rijksuniversiteit Groningen)


This paper reports the results of applying Inductive Logic Programming
(ILP) techniques to the problem of NP chunking. The problem, as
defined in Ramshaw et al. (1995), is the machine learning of rules
that identify base NPs in text annotated with part-of-speech tags, by
tagging each word as being "inside" or "outside" an NP. (Consecutive
NPs are treated appropriately.) The machine learning technique used
in Ramshaw et al. (1995) is Brill's Transformation-Based Learning.
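
To make the tagging scheme concrete, the following is a minimal sketch in Python of how base-NP spans might be turned into per-word tags. It assumes the convention where "I" marks a word inside an NP, "O" a word outside any NP, and "B" the first word of an NP that immediately follows another NP; this is one way of treating consecutive NPs, stated here as an assumption. The function name, the span representation and the example sentence are hypothetical and serve illustration only.

```python
from typing import List, Tuple


def chunk_tags(n_words: int, np_spans: List[Tuple[int, int]]) -> List[str]:
    """Tag each word position as "I", "O" or "B".

    np_spans holds half-open (start, end) word-index ranges, one per base NP,
    sorted by position and non-overlapping (an assumption of this sketch).
    """
    tags = ["O"] * n_words          # every word starts outside any NP
    prev_end = None                 # end index of the previous NP, if any
    for start, end in np_spans:
        for i in range(start, end):
            tags[i] = "I"           # words covered by a span are inside an NP
        if prev_end is not None and start == prev_end:
            tags[start] = "B"       # NP directly follows another NP
        prev_end = end
    return tags


# Hypothetical example: [He] gave [the boy] [a book]
words = ["He", "gave", "the", "boy", "a", "book"]
spans = [(0, 1), (2, 4), (4, 6)]
for word, tag in zip(words, chunk_tags(len(words), spans)):
    print(word, tag)
```

In this example "a" receives the tag "B" because its NP starts exactly where the preceding NP ends; without such a marker the two adjacent NPs could not be told apart from a single longer one.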

The same input data as in Ramshaw et al. (1995) is used here, but the machine learning technique is Inductive Logic Programming, specifically the Progol algorithm as implemented in P-Progol/Sicstus. The problem is formulated as the machine learning of a Prolog predicate that accepts a part-of-speech tagged word and its context as input and either fails (ungrammatical input) or tags the word as being inside or outside a base NP.
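
The shape of such a predicate can be sketched as follows. The sketch is written in Python rather than Prolog, and renders predicate failure as a `None` result. The function name, the argument layout and both rules are hypothetical toy examples chosen only to show the three possible outcomes; they are not rules learned by Progol and not the representation used in the experiments.

```python
from typing import Optional, Sequence

# Toy set of POS tags that may continue an NP after a determiner (assumption).
NP_CONTINUATION = {"JJ", "NN", "NNS", "NNP"}
# Toy set of POS tags treated as NP material (assumption).
NOMINAL = NP_CONTINUATION | {"DT", "PRP"}


def np_tag(pos: str, left: Sequence[str], right: Sequence[str]) -> Optional[str]:
    """Return "I" or "O" for a word, or None where a predicate would fail.

    pos   -- POS tag of the word being classified
    left  -- POS tags of the preceding words, nearest last (unused by these toy rules)
    right -- POS tags of the following words, nearest first
    """
    if pos == "DT":
        # Toy rule: a determiner must be followed by NP material,
        # otherwise the input is rejected as ungrammatical.
        if right and right[0] in NP_CONTINUATION:
            return "I"
        return None
    if pos in NOMINAL:
        return "I"
    return "O"


print(np_tag("DT", ["VBD"], ["NN"]))    # determiner before a noun
print(np_tag("VBD", ["PRP"], ["DT"]))   # verb between two NPs
print(np_tag("DT", ["VBD"], ["VBD"]))   # determiner before a verb
```

The point of the three-way outcome is that a learned predicate of this kind acts both as a tagger and as a filter on its input, which a pure tagging rule does not.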

This paper compares the two approaches and attempts to draw conclusions on the applicability of ILP to the task of NP chunking.