An HPSG-Annotated Test Suite for Polish

Malgorzata Marciniak (Polish Academy of Sciences)
Agnieszka Mykowiecka (Polish Academy of Sciences)
Adam Przepiorkowski (Polish Academy of Sciences)
Anna Kupsc (Polish Academy of Sciences)

This paper presents both conceptual and technical issues concerning the
construction of a test-suite of Polish sentences annotated with HPSG-like
attribute-value matrices (AVMs), part of a larger EU project (CRIT-2).
Unlike tree-banks, test-suites are constructed and annotated manually, in
order to exhaustively cover the syntactic phenomena of a natural language,
including those rarely occurring in its everyday use.

The long-term aim of this subproject is to evaluate computational grammars of Polish, both quantitatively (what percentage of the constructions covered by the test-suite are also covered by the grammar?) and qualitatively (does the resulting parse contain all the information available in the corresponding test-suite annotation?).
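The two evaluation questions can be illustrated with a minimal sketch. It assumes, purely for illustration, that AVMs are represented as nested Python dictionaries and that a grammar is wrapped in a function returning either an AVM or None; the names subsumes, evaluate, toy_suite and toy_parser, as well as the attribute names HEAD, SUBJ and CASE, are hypothetical and are not taken from the test-suite itself. The sketch also ignores types and structure sharing (reentrancy), which a real HPSG comparison would have to handle.

```python
from typing import Any, Callable, Mapping, Optional, Sequence, Tuple

AVM = Mapping[str, Any]


def subsumes(gold: Any, parse: Any) -> bool:
    """Return True if every attribute-value pair in `gold` also appears in `parse`.

    Nested AVMs are compared recursively; atomic values must be equal.
    The parse may contain extra attributes that the annotation lacks.
    """
    if isinstance(gold, Mapping):
        if not isinstance(parse, Mapping):
            return False
        return all(k in parse and subsumes(v, parse[k]) for k, v in gold.items())
    return gold == parse


def evaluate(
    suite: Sequence[Tuple[str, AVM]],
    parser: Callable[[str], Optional[AVM]],
) -> dict:
    """Compute quantitative coverage and qualitative completeness, in percent."""
    covered = 0   # sentences for which the grammar yields any parse
    complete = 0  # parses containing all information in the annotation
    for sentence, gold in suite:
        parse = parser(sentence)
        if parse is None:
            continue
        covered += 1
        if subsumes(gold, parse):
            complete += 1
    total = len(suite)
    if total == 0:
        return {"coverage_pct": 0.0, "complete_pct": 0.0}
    return {
        "coverage_pct": 100 * covered / total,
        "complete_pct": 100 * complete / total,
    }


# Toy data (hypothetical annotations, not taken from the actual test-suite).
toy_suite = [
    ("Jan śpi.", {"HEAD": "verb", "SUBJ": {"CASE": "nom"}}),
    ("Jan czyta książkę.", {"HEAD": "verb", "SUBJ": {"CASE": "nom"}}),
    ("Nie ma Jana.", {"HEAD": "verb", "SUBJ": {"CASE": "gen"}}),
    ("Pada.", {"HEAD": "verb"}),
]

# Toy "grammar": a lookup table standing in for a real parser.
_toy_parses = {
    "Jan śpi.": {"HEAD": "verb", "SUBJ": {"CASE": "nom", "NUM": "sg"}},
    "Jan czyta książkę.": {"HEAD": "verb", "SUBJ": {"CASE": "nom"}},
    "Nie ma Jana.": {"HEAD": "verb", "SUBJ": {"CASE": "nom"}},  # wrong case
}


def toy_parser(sentence: str) -> Optional[AVM]:
    return _toy_parses.get(sentence)


if __name__ == "__main__":
    print(evaluate(toy_suite, toy_parser))
```

In this sketch a parse counts as qualitatively adequate when the annotation subsumes it, so a grammar is not penalised for producing more information than the test-suite records.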

We describe the design of this test-suite and discuss the conceptual decisions we made, including the use of HPSG-like AVM representations as the annotation formalism. We also present the classification of grammatical phenomena of Polish used for indexing the sentences in the test-suite. Finally, we present the graphical user interface built to facilitate entering and viewing test sentences, their indices and their AVM annotations. (We are ready to provide a demo.)