Computational Linguistics
Colloquia

at UiL OTS

Trans 10, Fridays 11.00-12.30




The Computational Linguistics group organizes a series of meetings in which members present their current research. The initiative aims to bring together people at the institute who share an interest in formal frameworks for natural language. Besides these biweekly meetings, several workshops are organized during the year. More information about these events can be found at Workshops on Computational Linguistics and Logic.

A preliminary programme for the first two months follows below. We will keep this page up to date, adding abstracts and further information about each meeting week by week.

Papers and abstracts of last year's presentations can be found here.

Please contact Willemijn Vermaat if you want to give a talk or if you have any questions, suggestions or comments.




Programme

Date Time Speaker Title
February 22, 2002 11.00 - 12.30 Heleen Hoekstra & Ton van der Wouden Spoken Dutch Syntax [Abstract]
March 8, 2002 11.00 - 12.30 Organizer: Paola Monachesi Special event: Computational Tools for Linguistics
March 22, 2002 11.00 - 12.30
Trans 10, room 0.17
Jan van Eijck Updates in Context Semantics [Abstract] [Slides]
April 5, 2002 15.00 - 16.30
Trans 10, room 2.04
Igor Boguslavsky The ETAP linguistic processor: functions, linguistic knowledge and formalism [Abstract]
April 19, 2002 15.00 - 16.30
Trans 10, room 2.04
Jan Odijk Reusing Lexicons and Grammars [Abstract]
May 3, 2002 11.00 - 12.30
Trans 10, room 2.04
Crit Cremers Automatic meaningful category based generation [Abstract]
May 31, 2002 11.00 - 12.30
Trans 10, room 1.01 (CHANGE of ROOM!)
Reinhard Muskens Lambda grammars [Abstract]
July 16, 2002 11.00 - 12.30
Trans 10, room 0.19
Rajeev Gore Formalised (Weak and Strong) Cut Elimination for Display Logic [Abstract] [Slides]




Abstracts:

Speaker:
Heleen Hoekstra and Ton van der Wouden (Utrecht University)
Title:
Spoken Dutch Syntax
Abstract:
In this talk we will present the CGN project (Corpus Gesproken Nederlands, 'Spoken Dutch Corpus') in general and its syntactic annotation in particular. We will also discuss some phenomena that are typical of spoken Dutch.

Speaker:
Jan van Eijck (CWI Amsterdam/Utrecht University)
Title:
Updates in Context Semantics
Abstract:
Context semantics is a framework for natural language semantics based on type theory, in which contexts are represented as lists of (indexed) objects. We first explain the general picture. Next, we demonstrate meaning representation in context semantics. Finally, we discuss how (indexed) contexts can be used for an account of epistemic updates. In short, we picture how the information of someone who receives a message represented in context logic gets updated.

Speaker:
Igor Boguslavsky
Institute for Information Transmission Problems of the Russian Academy of Sciences, Moscow

Prof. Dr., Head of the Computational Linguistics lab at the Institute for Information Transmission Problems of the Russian Academy of Sciences. He works in computational linguistics (machine translation, syntax, semantics, lexicon) and theoretical linguistics (scope theory, syntax-semantics interface).

Title:
The ETAP linguistic processor: functions, linguistic knowledge, formalism
Abstract:

I will present a multi-functional linguistic processor (ETAP) developed at the Institute for Information Transmission Problems of the Russian Academy of Sciences. It can be considered an implementation of the "Meaning-Text" theory of Mel'čuk and Apresjan. Its major functions include machine translation, multilingual communication via an interlingua (UNL), a natural language interface to databases, meaning-preserving paraphrasing, and the checking and correction of syntactic errors. It is also used for the construction of a dependency-based treebank for Russian. The following are the most important features of the ETAP environment and its modules:

Time permitting, I will also present the ETAP formalism, which is based on three-valued first-order predicate logic and intended for the description of all sorts of dictionary, syntactic, semantic, transformational and combinatorial information needed for natural text analysis and generation.

Speaker:
Jan Odijk
Last year Jan Odijk was promoted to professor of language and speech technology at UiL OTS. He began his studies at Utrecht University in Russian language and graduated in General Linguistics with a specialization in syntax and computational linguistics. In 1993 he completed his PhD in Tilburg with the thesis "Compositionality and syntactic generalizations". Since then he has been involved in many computational linguistics projects, such as his work at the company Lernout & Hauspie.
Title:
Reusing Lexicons and Grammars
Abstract:

The presentation will consist of two different but related parts. Part I is backward-looking and about reusing lexicons. Part II is forward-looking and about reusing grammars.

In Part I, I will give a global description of work I have been involved in over the past five years on consolidating, standardizing and integrating lexical resources. I will describe the problems posed by the huge number of lexicons we had available from different sources and for multiple languages and language pairs, each with its own structure and format, and I will outline the approach taken to consolidating these resources in order to optimize their reuse in a wide range of language and speech technologies.

In Part II, the focus will be on research I hope to carry out in the coming years on the use and reuse of grammars. I will describe my view on the current status of grammars, especially in relation to their use in language and speech technology. I will sketch a very tentative research programme in which grammars are used as a basis to automatically derive linguistic components such as domain-tuned grammars, chunkers, PoS-taggers, etc. of varying complexity that can actually be used as linguistic components in language and speech technologies, and that are easily scalable depending on the requirements of the technology that they are a part of and the platforms on which the technology has to run.



Speaker:
Crit Cremers
Title:
Automatic meaningful category based generation
Abstract:

Generating meaningful, well-formed Dutch imposes partly the same and partly different conditions on grammar and algorithms as meaningful parsing does. The Leiden parser and generator Delilah operates at sentence level and relies on a single grammatical core for both processes. This core can be located in the plane between the Montagovian and combinatory categorial 'traditions'. The construction is driven mainly by unification. In this talk-with-image-and-sound I will try to present the generator and to share with the audience underlying struggles on

In addition, I would like to introduce two ongoing research projects and one public construal at Leiden University related to meaning-driven generation.



Speaker:
Reinhard Muskens
Title:
Lambda grammars
Abstract:

The grammars that I will present in this talk are multidimensional categorial grammars, but they are undirected and stand in a tradition that begins with Curry's (1961) idea to represent syntactic information with the help of what are essentially typed lambda terms (much of Dick Oehrle's work is also within this tradition and I will build upon his work). Representing syntactic information with the help of lambda terms is what makes undirectionality possible, as ordering information no longer needs to be present at the level of types.

Grammars will be set up in a multidimensional way. A *sign* will be defined as a sequence of lambda terms (for example, one term for word order and dominance information, one for feature information, one for the semantics, etc.). Signs can combine with the help of linear combinators. It has recently turned out that on the technical side Lambda Grammars are equivalent to collections of 'Abstract Categorial Grammars' (ACGs), developed independently by Philippe de Groote. ACGs are based on linear logic.

One desirable consequence of moving to a nondirectional system, with lambda terms representing ordering information, is that medial gaps and peripheral gaps are now treated on a par. This means that the problems the original Lambek calculus has with extraction from medial positions (e.g. in treatments of scope) do not arise and that the theory can do without the kind of additional machinery that has been proposed for dealing with medial gaps in directed systems (Morrill, Moortgat).

The idea can be worked out in several directions. One obvious connection (the multicomponent character, linear logic) is with Lexical-Functional Grammar, and in previous work I have shown that many of the central ideas of this theory can be implemented without any difficulty. In this talk I will emphasize connections with the multimodal approach to categorial grammar. I will argue that, while it is best to keep the types of the system 'clean', the multimodal game can in fact be played on the level of terms, using a transcription of the Van Benthem-Kurtonina semantics for the modal operators.



Speaker:
Rajeev Gore
Title:
Formalised (Weak and Strong) Cut Elimination for Display Logic
Abstract:
We use a deep embedding of the display calculus for relation algebras dRA in the logical framework Isabelle/HOL to formalise the weak and strong normalisation theorems for cut-elimination in dRA. Unlike other "implementations", we explicitly formalise the structural induction in Isabelle/HOL and believe this to be the first full formalisation of cut-admissibility in the presence of explicit structural rules. We also present a new, machine-checked proof of strong normalisation of cut-elimination for dRA which does not use measures on the size of derivations. We believe this is the first full formalisation of a strong normalisation result for a sequent system using a logical framework. Our formalisations generalise easily to other display calculi and can serve as a basis for formalised proofs of weak and strong normalisation for the classical and intuitionistic versions of a vast range of substructural logics like the Lambek calculus, linear logic, relevant logic, BCK-logic, and their modal extensions.

Willemijn Vermaat
Last modified: Wed July 17 11:19 MET DST 2002