Predicting Phrase Breaks: a Comparison between HMMs and Memory-Based Learning on the MARSEC corpus

Bertjan Busser (Tilburg University)


This paper compares HMMs and Memory-Based Learning, two machine
learning algorithms, on a specific task: assigning phrase breaks from
part-of-speech (POS) sequences. In effect, it compares the two
algorithms on the experiments described in (Taylor and Black 1998).
The HMM by design consists of
two models: the POS model, which models the probability of POS
sequences, and the language model, which models the probability of
phrase breaks given a POS sequence. The HMM thus splits the task into
two subtasks. Memory-Based Learning produces only one model, or
classifier. Nevertheless, its performance almost equals that of the HMM.
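
To make the single-classifier idea concrete, the following is a minimal
sketch of memory-based classification of junctures from POS context. It
is an illustration only, not the setup used in the experiments: the
window size, the overlap distance, the labels "B" (break) and "N" (no
break), and all function and class names are assumptions introduced
here.

```python
from collections import Counter


def juncture_windows(tags, size=1):
    """Return one feature tuple per juncture (the gap after each token).

    Each tuple holds `size` POS tags to the left and `size` to the right
    of the juncture. The padding symbols are hypothetical placeholders.
    """
    padded = ["<s>"] * size + list(tags) + ["</s>"] * size
    features = []
    for i in range(len(tags)):
        left = padded[i + 1 : i + size + 1]              # tokens up to and including i
        right = padded[i + size + 1 : i + 2 * size + 1]  # tokens after i
        features.append(tuple(left + right))
    return features


def overlap_distance(a, b):
    """Count the feature positions at which two instances differ."""
    return sum(x != y for x, y in zip(a, b))


class MemoryBasedClassifier:
    """Store all training instances; classify by the k nearest ones."""

    def __init__(self, k=1):
        self.k = k
        self.memory = []

    def fit(self, instances, labels):
        # "Learning" is just storing the labelled instances in memory.
        self.memory = list(zip(instances, labels))

    def predict(self, instance):
        # Rank stored instances by distance and take a majority vote
        # among the k closest (assumed tie handling: first seen wins).
        ranked = sorted(self.memory,
                        key=lambda m: overlap_distance(m[0], instance))
        votes = Counter(label for _, label in ranked[: self.k])
        return votes.most_common(1)[0][0]
```

In this sketch there is no separate POS model or language model: the
stored instances are the model, and a new juncture is labelled directly
from the most similar stored POS contexts.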