Principal investigator (ÚFAL): 
Provider: 
Grant id: 
ME10018
ÚFAL budget: 
1.7 million CZK
Duration: 
2010 - 2012


Towards a Computational Analysis of Text Structure

Cooperation Czech Republic - USA

The project aims to create a computational model of text structure in Czech and English for use in NLP. It is based on data from two annotated corpora: the Prague Dependency Treebank (Charles University in Prague, Czech Republic) and the Penn Discourse Treebank (University of Pennsylvania, USA). In the course of the project, the Czech Prague Discourse Treebank 1.0 was completed and released (http://ufal.mff.cuni.cz/discourse/), and the multilingual data were used for testing and subsequent improvement of annotation systems. The results of the analyses can be used in automatic information retrieval, text summarization or machine translation.
 

Partner:

University of Pennsylvania

Institute for Research in Cognitive Science

Philadelphia, PA, USA

Publications

Jínová Pavlína, Mírovský Jiří, Poláková Lucie: Analyzing the Most Common Errors in the Discourse Annotation of the Prague Dependency Treebank. In: Proceedings of the 11th International Workshop on Treebanks and Linguistic Theories, Copyright © Edicoes Colibri, Lisboa, Portugal, ISBN 978-989-689-274-6, pp. 127-132, 2012
Poláková Lucie: Annotating Discourse in Prague Dependency Treebank. Contributed talk, Penn Discourse Treebank Workshop 2012, University of Pennsylvania, Prague, Czech Republic, Apr 2012
Hajičová Eva, Rysová Kateřina: Some aspects of the information structure of the sentence. Contributed talk, Prague Workshop on Discourse Annotation, UFAL MFF UK, Prague, Czech Republic, May 2011
Rysová Magdaléna: Discourse Connectives and Their Alternative Lexicalizations in Czech. Contributed talk, MULDICO workshop, Friedrich Schiller University of Jena, Prague, Czech Republic, Oct 2012
Poláková Lucie: Learning about Connectives from Syntactic Annotation. Contributed talk, MULDICO workshop, Friedrich Schiller University of Jena, Prague, Czech Republic, Oct 2012
Jínová Pavlína, Mírovský Jiří, Poláková Lucie: Semi-Automatic Annotation of Intra-Sentential Discourse Relations in PDT. In: Proceedings of the Workshop on Advances in Discourse Analysis and its Computational Aspects (ADACA) at Coling 2012, Copyright © Coling 2012 Organizing Committee, Mumbai, India, pp. 43-58, 2012
Zikánová Šárka: Some notes on the comparison of the PDTB and PDT discourse annotation. Contributed talk, Prague Workshop on Discourse Annotation, Prague, Czech Republic, May 2011
Rysová Magdaléna: Problematic relations in the PDT annotation: Explication and Cause. Contributed talk, Prague Workshop on Discourse Annotation, UFAL MFF UK, Prague, Czech Republic, Jun 2011
Jínová Pavlína: Vybrané problematické aspekty konektivních prostředků v rámci anotace mezivýpovědních významových vztahů v PDT [Selected problematic aspects of connective devices in the annotation of inter-sentential semantic relations in PDT]. In: Bohemica Olomucensia, Vol. 2, Copyright © Vydavatelství University Palackého, ISSN 1803-876X, pp. 138-147, 2011
Rysová Magdaléna: On Discourse Annotation in PDT. Contributed talk, Seminar of Formal Linguistics, UFAL MFF UK, Prague, Czech Republic, Oct 2012
Poláková Lucie, Jínová Pavlína, Zikánová Šárka, Hajičová Eva, Mírovský Jiří, Nedoluzhko Anna, Rysová Magdaléna, Pavlíková Veronika, Zdeňková Jana, Pergler Jiří, Ocelák Radek: Prague Discourse Treebank 1.0. Data/software, ÚFAL MFF UK, Prague, Czech Republic, http://ufal.mff.cuni.cz/discourse/, Nov 2012
Zikánová Šárka: Interplay of Discourse, Information Structure and Coreference Relations. Contributed talk, Prague Workshop on Discourse Annotation, Prague, Czech Republic, May 2011
Poláková Lucie, Jínová Pavlína, Zikánová Šárka, Bedřichová Zuzanna, Mírovský Jiří, Rysová Magdaléna, Zdeňková Jana, Pavlíková Veronika, Hajičová Eva: Manual for Annotation of Discourse Relations in Prague Dependency Treebank. Technical report no. 2012/47, Copyright © Institute of Formal and Applied Linguistics, Charles University in Prague, Prague, Czech Republic, 83 pp., Dec 2012
Mírovský Jiří, Jínová Pavlína, Poláková Lucie: Does Tectogrammatics Help the Annotation of Discourse? In: Proceedings of the 24th International Conference on Computational Linguistics (Coling 2012), Copyright © Coling 2012 Organizing Committee, Mumbai, India, pp. 853-862, 2012