Monday, December 12, 2011 - 13:30

From opinion mining to text parsing: Toward the automatic analysis of editorials

Abstract: "Opinion mining" has been a flourishing application of Computational Linguistics for several years now, one reason being its commercial relevance in tracking opinions about products on web sites. For these purposes, current systems sometimes yield respectable results. But when moving to opinions whose target is less clear-cut (as for instance in political debate), the task becomes much more difficult. In my talk, I will present work on the SO-CAL opinion recognition system, on which I have collaborated with colleagues at Simon Fraser University (Taboada et al. 2011), and discuss the various problems that surface when applying this lexicon-based approach to different kinds of editorial text. One central problem is the role of discourse structure, which needs to be taken into account in order to recognize authors' stances appropriately. In our current work at Potsdam, we combine layers of discourse structure (most importantly coreference and connectives with their scopes) with sentence-based opinion recognition as in SO-CAL; I will present some first results of these efforts.

Manfred Stede studied Computer Science and Linguistics at TU Berlin, the University of Edinburgh, and Purdue University, where he obtained his M.Sc. in 1989. In his Ph.D. thesis (University of Toronto, 1996), he developed a model for handling lexical semantics and its connection to knowledge representation in multilingual text generation. From 1995 to 2000 he worked on the German Verbmobil machine translation project at TU Berlin, focusing on knowledge-based disambiguation and the identification of dialogue acts. After a short stay at a company in Berlin, he became professor of Applied Computational Linguistics at Potsdam University in 2001. He has directed application-oriented research projects on information extraction, text summarization, text generation, and linguistic databases; on the theoretical side, he has been primarily interested in models of discourse structure that combine multiple layers of representation. He has published three books (the latest, "Discourse Processing", is just now coming out) and has authored or co-authored a dozen journal papers plus numerous book chapters and conference papers.