Non-compositional lexical semantics: how can
idioms be represented in a lexical resource?
(Christiane Fellbaum)
Abstract:
Idioms constitute a subclass of multi-word units that exhibit strong
collocational preferences and whose meanings are at least partially
non-compositional. The classic view of idioms as "long words" admits
of little or no variation of a canonical form. Fixedness is thought to
reflect semantic non-compositionality: the unavailability of a
semantic interpretation for some or all idiom constituents and the
impossibility of parsing syntactically ill-formed idioms block regular
grammatical operations. We argue that corpus data showing a wide range
of discourse-sensitive morphosyntactic flexibility and lexical
variation, even in cases where the constituents cannot be semantically
interpreted, refute this simplistic view of idioms. Such data weaken
the categorical distinction between idioms and freely composed phrases
and pose a challenge to the representation of idioms and their
constituents in lexical resources designed for Natural
Language Processing. We discuss one possible solution, illustrated by
the treatment of idioms in the large lexical database WordNet.
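To make the "long words" treatment concrete, here is a minimal sketch
(not part of the talk) that queries WordNet for an idiom through
NLTK's interface, assuming NLTK and its wordnet corpus are installed:

    # Minimal sketch: looking up an idiom in WordNet via NLTK.
    # Assumes the wordnet data has been downloaded
    # (nltk.download('wordnet')).
    from nltk.corpus import wordnet as wn

    # Multi-word entries are written with underscores for spaces.
    for synset in wn.synsets('kick_the_bucket'):
        print(synset.name(), '-', synset.definition())
        # The idiom is a single lemma of the whole synset; "kick" and
        # "bucket" receive no semantic interpretation of their own here.
        print([lemma.name() for lemma in synset.lemmas()])

Against WordNet 3.0 this should print a synset such as die.v.01, with
the idiom listed as one lemma alongside single-word synonyms like
"die": the whole phrase, not its parts, carries the meaning.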
Biography of the speaker:
Christiane Fellbaum is a Senior Research Scholar in the Computer
Science Department at Princeton University. Her Ph.D. is in
Linguistics and her research
focuses on computational and corpus linguistics and lexical semantics.
She teaches a course on Bilingualism and enjoys exploring new
languages and faraway places.
She is Co-Founder and Co-President of the Global WordNet Association.
She was awarded the Wolfgang Paul Prize and the Antonio Zampolli
Prize. She is a partner in the European projects KYOTO and SIERA, a
Permanent Fellow and Member of the Center for Language at the
Berlin-Brandenburg Academy of Sciences, and a member of the Board of
Directors of the American Friends of the Humboldt Foundation. Her
current work is supported by the U.S. National Science Foundation,
the European Union (Seventh Framework Programme), the Frank Moss
Foundation and the Tim Gill Foundation.
Robust Parsing: Bridging the Coverage Chasm
(Dan Flickinger)
Abstract:
Grammar implementations that are guided by linguistic theory will normally
lack coverage of even some well-formed utterances, since no current theory
exhaustively characterizes all of the phenomena in any language. For many
uses of a grammar, approximate or robust analyses of the out-of-grammar
utterances would be better than nothing, and a variety of approaches have
been developed for such robust parsing. In this paper I present an
implemented method which adds two simple "bridging" rules to an existing
broad-coverage grammar, the English Resource Grammar, allowing any two
constituents to combine. This method relies on a parser which can
efficiently pack the full parse forest for an utterance, and then
selectively unpack the most likely N analyses guided by a statistical model
trained on a manually constructed treebank....
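The bridging idea can be sketched in a few lines of Python. The toy
recognizer below is a hypothetical illustration, not the ERG, its
HPSG rules, or its parser, and it omits the packed forest and the
statistical ranking: a CKY-style chart is filled from a tiny grammar,
and a generic bridge rule combines any two adjacent constituents
whenever no ordinary rule applies, so even out-of-grammar input
receives some analysis.

    # Toy sketch of "bridging" in a CKY-style recognizer (not the ERG).
    GRAMMAR = {('Det', 'N'): 'NP', ('NP', 'VP'): 'S', ('V', 'NP'): 'VP'}
    LEXICON = {'the': 'Det', 'dog': 'N', 'cat': 'N', 'chased': 'V'}

    def parse(words):
        n = len(words)
        # chart[i][j] holds the labels spanning words[i:j]
        chart = [[set() for _ in range(n + 1)] for _ in range(n)]
        for i, w in enumerate(words):
            chart[i][i + 1].add(LEXICON.get(w, 'X'))  # unknown words get 'X'
        for width in range(2, n + 1):
            for i in range(n - width + 1):
                j = i + width
                for k in range(i + 1, j):
                    for left in chart[i][k]:
                        for right in chart[k][j]:
                            label = GRAMMAR.get((left, right))
                            # bridging rule: if no grammar rule applies,
                            # combine the two constituents anyway
                            chart[i][j].add(label if label else 'BRIDGE')
        return chart[0][n]

    print(parse('the dog chased the cat'.split()))  # includes 'S'
    print(parse('ouch the dog chased'.split()))     # only 'BRIDGE'

In the method described in the abstract, bridged analyses are produced
alongside regular ones and ranked by the treebank-trained model, so the
grammar's own analyses are preferred whenever they exist.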
Biography of the speaker:
Dan Flickinger is a Senior Research Associate at the Center for the Study of
Language and Information (CSLI) and Project Manager of the
LinGO Laboratory at CSLI, Stanford University.
Flickinger is the principal developer of the English Resource
Grammar (ERG), a precise broad-coverage implementation of
Head-driven Phrase Structure Grammar (HPSG). His current
research is focused on two broad areas: parsing text for
improved information retrieval, and applying the ERG to
improve educational software. Flickinger’s central research
interests are in wide-coverage grammar engineering for
both parsing and generation, lexical representation, and
the syntax-semantics interface.
This lecture is supported by
the NTU Centre
for Liberal Arts and Social Sciences (CLASS) and the Singapore MOE
Tier 2 grant "That's what you meant: A Rich Representation for
Manipulating Meaning" (MOE ARC41/13).
Francis Bond
<bond@ieee.org>
Computational Linguistics Laboratory
Division of Linguistics and Multilingual Studies
Nanyang Technological University