will be held in
Prague, Czech Republic
January 23-24, 2018
at the Faculty of Mathematics and Physics, Charles University,
Malostranské náměstí 25, 11800 Prague 1, Czech Republic
TLT serves as a venue for new and ongoing research on treebanks and linguistic theories. The 16th edition of TLT returns to the heart of Europe, at Charles University in Prague, in January 2018.
For 16 years now, TLT has served as a venue for new and ongoing high-quality work related to syntactically annotated corpora, i.e., treebanks, with a focus on all aspects of treebanking – descriptive, theoretical, formal and computational – while also going beyond treebanks to other levels of annotation, such as frame semantics, coreference or events, to name only a few.
Building systems able to provide a semantic representation of texts has long been an objective, both in linguistics and in applied NLP. Although advances in machine learning sometimes seem to diminish the need for sophisticated structured representations of sentences as input, the growing interest in interpreting trained neural networks seems to reaffirm that need. Because they represent schematic situations, semantic frames (Fillmore, 1982), as instantiated in FrameNet (Baker, Fillmore and Petruck), are an appealing level of generalization over the eventualities described in texts. In this talk, I will present some feedback from the development of a French FrameNet, including an analysis of the main difficulties we faced during annotation. I will describe how linking generalizations can be extracted from the frame-annotated data using deep syntactic annotations. I will then investigate what kind of input is most effective for FrameNet parsing, from no syntax at all to deep syntactic representations.
The work I'll present is joint work with Marianne Djemaa, Philippe Muller, Laure Vieu, G. de Chalendar, B. Sagot and P. Amsili (for the French FrameNet), C. Ribeyre, D. Seddah, G. Perrier and B. Guillaume (deep syntax), and Olivier Michalon and Alexis Nasr (semantic parsing).
Research in syntactic parsing is largely driven by progress in intrinsic evaluation, and there have been impressive developments in recent years as measured by evaluation metrics such as F-score or labeled attachment accuracy. At the same time, a range of different syntactic representations have been put to use in treebank annotation projects, and there have been studies measuring various aspects of the "learnability" of these representations and their suitability for automatic parsing, mostly also evaluated in terms of intrinsic measures. In this talk I will provide a different perspective on these developments and give an overview of research that examines the usefulness of syntactic analysis in downstream applications. The talk will discuss both constituency-based and dependency-based representations, with a focus on various flavours of dependency-based representations, ranging from purely syntactic to more semantically oriented representations. The recently completed shared task on Extrinsic Parser Evaluation was aimed at assessing the utility of different types of dependency representations for downstream applications, and I will discuss some of our findings based on the results from this task as well as follow-up experiments and analysis.
This year, TLT is co-located with the Workshop on Provenance and Annotation in Computational Linguistics 2018, also taking place in Prague (same venue as TLT) on January 22nd, 2018, and organized by Miriam Butt of the University of Konstanz, Germany. The workshop is free of charge (please use the TLT registration form to register for the workshop).
Since TLT in Warsaw, the CRH ("Corpus-based Research in the Humanities") workshop has been co-located with TLT. This year, it takes place just a couple of hours away, in Vienna, Austria, immediately following TLT.