Fred Jelinek Seminar Series
In the academic year 2011/2012, we were honored to host the first six speakers of the Fred Jelinek Seminar Series (see the Archive).
Current Lecture
Monday, May 20, 2013
LEO WANNER (Universitat Pompeu Fabra, Barcelona)
TOWARDS A MULTILAYER AND MULTIDIMENSIONAL CORPUS ANNOTATION: FOLLOWING THE FOOTPRINTS OF THE MEANING-TEXT THEORY
The talk takes place on May 20, 2013, at 1:30 p.m. at the Faculty of Mathematics and Physics, Malostranske nam. 25, 4th floor, room S1 (428).
Abstract:
An increasing number of treebanks is available for training statistical Natural Language Processing applications. Nearly all of them capture linguistic phenomena of different nature (at least word order, morphological features and syntactic dependencies), but only a few (among them, the Prague Dependency Treebank, PDT) actually separate these phenomena in terms of different levels of annotation; the majority uses one single agglomerated annotation structure. Such a structure can be considered deficient from the theoretical (linguistic) point of view. It also reduces the quality of the annotated resources, which in turn hampers the quality of the applications trained on them. As already pointed out by numerous scholars, the annotation of corpora is of higher quality when a well-defined linguistic model which supports multi-level annotation is followed. In my talk, I will present the annotation of Spanish and English corpora rooted in the linguistic model of the Meaning-Text Theory. I will introduce the annotation schema we have developed for the surface-syntactic layer of Spanish and discuss how we (semi-)automatically derive from the surface-syntactic annotation the more abstract deep-syntactic and semantic annotations. In the second half of my talk, I will report on our work in progress on the annotation of the Penn Treebank with the Theme/Rheme structure. To conclude, I will draw some parallels between the annotation philosophy underlying PDT 2.0 and ours.
About the seminar
The seminar series was founded in recognition of the late Professor Frederick Jelinek, an honorary doctor of Charles University and, for almost twenty years, a guest professor at our Faculty.
Professor Dr. Frederick Jelinek (1932-2010), dr.h.c. Charles University in Prague, Julian Smith Professor at JHU, Baltimore, MD, USA, of Czech origin, was an outstanding researcher in Electrical Engineering and Computational Linguistics. His breakthrough ideas have led to a whole new research paradigm - application of stochastic methods - in the field of automatic speech recognition as well as in natural language processing in general. He held leading positions at Cornell University, IBM T. J. Watson Research Center and Johns Hopkins University, and was a guest professor of Charles University in Prague.
Seminars usually start at 1:30 p.m. every other Monday (taking turns with the regular Seminar on Formal Linguistics), in room S1 (4th floor) of the MFF UK building at Malostranske nam. 25, 11800 Prague 1, Czech Republic.


