Fred Jelinek Seminar Series

In the academic year 2011/2012, we were honored to host the first six speakers of the Fred Jelinek Seminar Series (see the Archive).

Current Lecture

Monday, May 20, 2013

LEO WANNER (Universitat Pompeu Fabra, Barcelona)
TOWARDS A MULTILAYER AND MULTIDIMENSIONAL CORPUS ANNOTATION: FOLLOWING THE FOOTPRINTS OF THE MEANING-TEXT THEORY

The talk takes place on May 20, 2013, at 1:30 p.m. at the Faculty of Mathematics and Physics, Malostranske nam. 25, 4th floor, room S1 (428). PDF invitation

Abstract:

An increasing number of treebanks are available for training statistical Natural Language Processing applications. Nearly all of them capture linguistic phenomena of different kinds (at least word order, morphological features and syntactic dependencies), but only a few (among them, the Prague Dependency Treebank, PDT) actually separate these phenomena into different levels of annotation; the majority use a single agglomerated annotation structure. Such a structure can be considered deficient from the theoretical (linguistic) point of view. It also reduces the quality of the annotated resources, which in turn hampers the quality of the applications trained on them. As already pointed out by numerous scholars, the annotation of corpora is of higher quality when a well-defined linguistic model which supports multi-level annotation is followed. In my talk, I will present the annotation of Spanish and English corpora rooted in the linguistic model of the Meaning-Text Theory. I will introduce the annotation schema we have developed for the surface-syntactic layer of Spanish and discuss how we (semi-)automatically derive from the surface-syntactic annotation the more abstract deep-syntactic and semantic annotations. In the second half of my talk, I will report on our work in progress on the annotation of the Penn Treebank with the Theme/Rheme structure. To conclude, I will draw some parallels between the annotation philosophy underlying PDT 2.0 and ours.

About the seminar

The seminar series was founded in recognition of the late Professor Frederick Jelinek, an honorary doctor of Charles University and for almost twenty years a guest professor at our Faculty.

Professor Dr. Frederick Jelinek (1932-2010), dr.h.c. Charles University in Prague, Julian Smith Professor at JHU, Baltimore, MD, USA, of Czech origin, was an outstanding researcher in Electrical Engineering and Computational Linguistics. His breakthrough ideas have led to a whole new research paradigm - application of stochastic methods - in the field of automatic speech recognition as well as in natural language processing in general. He held leading positions at Cornell University, IBM T. J. Watson Research Center and Johns Hopkins University, and was a guest professor of Charles University in Prague.

Seminars usually start at 1:30 p.m. every other Monday (taking turns with the regular Seminar on Formal Linguistics), in room S1 (4th floor) of the MFF UK building at Malostranske nam. 25, 11800 Prague 1, Czech Republic.

Archive and links to recordings

Date | Invited Lecturer | Topic (links to recording if available)
Apr 29, 2012 | Barbara Moser-Mercer, University of Geneva | EXPERT PERFORMANCE AND THE MULTILINGUAL BRAIN
Apr 26, 2012 | Martin Kay, Stanford University | PUTTING LINGUISTICS BACK INTO COMPUTATIONAL LINGUISTICS
Nov 26, 2012 | Martha Palmer, University of Colorado at Boulder | BEYOND SHALLOW SEMANTICS
Oct 22, 2012 | Martin Kay, Stanford University | THE NEW MACHINE TRANSLATION: GETTING BLOOD FROM A STONE
Oct 8, 2012 | James Pustejovsky, Brandeis University | REPRESENTING SPATIAL INFORMATION IN LANGUAGE
Sep 25, 2012 | Anders Søgaard, University of Copenhagen | LEARNING UNDER BIAS IN NLP (slides)
May 14, 2012 | Geoffrey Leech, University of Lancaster | DECLINE AND DISAPPEARANCE: ON THE NEGATIVE SIDE OF RECENT CHANGE IN ENGLISH
Mar 26, 2012 | Mark Steedman, University of Edinburgh | THE STATISTICAL PROBLEM OF LANGUAGE ACQUISITION
Dec 12, 2011 | Manfred Stede, University of Potsdam | FROM OPINION MINING TO TEXT PARSING: TOWARD THE AUTOMATIC ANALYSIS OF EDITORIALS
Nov 14, 2011 | Dan Flickinger, Stanford University | COMBINING SYMBOLIC AND STATISTICAL METHODS IN CORPUS-BASED NLP
Nov 7, 2011 | Mirjam Fried, Charles University in Prague | HUMAN LANGUAGE AS AN EXERCISE IN CREATIVE RECYCLING: WELCOME TO THE WORLD OF GRAMMATICAL CONSTRUCTIONS
Oct 10, 2011 | Joakim Nivre, Uppsala University | LOST IN THE WOODS? TRANSITION-BASED DEPENDENCY PARSING WITH NON-PROJECTIVE TREES

Content: Magda Ševčíková. Webmasters: Juraj Šimlovič.
2007 © Institute of Formal and Applied Linguistics. All Rights Reserved.