PDiT: Prague Discourse Treebank - Publications

Publications related to the project

Discourse relations

Theoretical background and the beginning of the project

Mladová, Lucie, Zikánová, Šárka, Bedřichová, Zuzana, Hajičová, Eva. 2009. Towards a Discourse Corpus of Czech. In Proceedings of the fifth Corpus Linguistics Conference (CL 2009), Liverpool, Great Britain, in press (doc)
Mladová, Lucie, Zikánová, Šárka, Hajičová, Eva. 2008. From Sentence to Discourse: Building an Annotation Scheme for Discourse Based on Prague Dependency Treebank. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco (pdf)

Complex description of the topic, later phases of the project

Zikánová, Šárka. 2012. Text annotations in the Prague Dependency Treebank. Accepted for publication in: Linguistica Pragensia, Ústav pro jazyk český AV ČR, ISSN 0862-8432, 9 pp. (pdf)
Mladová, Lucie. 2011. Annotating Discourse in the Prague Dependency Treebank (An Overview). A presentation at Prague Workshop in Discourse Annotation, Faculty of Mathematics and Physics, Charles University in Prague, May 2011
Mladová, Lucie. 2011. Annotating Discourse in Prague Dependency Treebank. A presentation at the workshop Annotation of Discourse Relations in Large Corpora at the conference Corpus Linguistics 2011 (CL 2011), Birmingham, Great Britain, July 2011

Annotation of discourse in PDT

Poláková, Lucie, Jínová, Pavlína, Zikánová, Šárka, Hajičová, Eva, Mírovský, Jiří, Nedoluzhko, Anna, Rysová, Magdaléna, Pavlíková, Veronika, Zdeňková, Jana, Pergler, Jiří, Ocelák, Radek. 2012. Prague Discourse Treebank 1.0. Data/software, ÚFAL MFF UK, Prague, Czech Republic, Nov 2012 (downloadable CD distribution)
Poláková, Lucie, Jínová, Pavlína, Zikánová, Šárka, Bedřichová, Zuzana, Mírovský, Jiří, Rysová, Magdaléna, Zdeňková, Jana, Pavlíková, Veronika, Hajičová, Eva. 2012. Manual for Annotation of Discourse Relations in the Prague Dependency Treebank. Technical Report No. 47, ÚFAL, Charles University in Prague (pdf)
Mírovský, Jiří, Mladová, Lucie, Žabokrtský, Zdeněk. 2010. Annotation Tool for Discourse in PDT. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), Beijing, China, pp. 9–12 (pdf)

Inter-annotator agreement

Jínová, Pavlína, Mírovský, Jiří, Poláková, Lucie. 2012. Analyzing the Most Common Errors in the Discourse Annotation of the Prague Dependency Treebank. In Proceedings of the 11th International Workshop on Treebanks and Linguistic Theories (TLT 11), Lisbon, Portugal
Mírovský, Jiří, Mladová, Lucie, Zikánová, Šárka. 2010. Connective-Based Measuring of the Inter Annotator Agreement in the Annotation of Discourse in PDT. In Proceedings of the 23rd International Conference on Computational Linguistics (Coling 2010), Beijing, China, pp. 775–781 (pdf)
Zikánová, Šárka, Mladová, Lucie, Mírovský, Jiří, Jínová, Pavlína. 2010. Typical Cases of Annotators’ Disagreement in Discourse Annotations in Prague Dependency Treebank. In Proceedings of the 7th International Conference on Language Resources and Evaluation (LREC 2010), Valletta, Malta, pp. 2002–2006 (pdf)

Linguistic research based on the annotated data

Mírovský, Jiří, Jínová, Pavlína, Poláková, Lucie. 2012. Does Tectogrammatics help the Annotation of Discourse? In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), Mumbai, India, December 2012
Jínová, Pavlína, Mírovský, Jiří, Poláková, Lucie. 2012. Semi-Automatic Annotation of Intra-sentential Discourse Relations in PDT. In Proceedings of the 24th International Conference on Computational Linguistics (COLING 2012), ADACA Discourse Workshop, Mumbai, India, December 2012
Poláková, Lucie, Jínová, Pavlína, Mírovský, Jiří. 2012. Interplay of Coreference and Discourse Relations: Discourse Connectives with a Referential Component. In Proceedings of the 8th International Conference on Language Resources and Evaluation (LREC 2012), İstanbul, Turkey, pp. 146–153 (pdf)
Jínová, Pavlína. 2012. Nejčastější konektivní prostředky kauzálního vztahu v Pražském závislostním korpusu. Accepted for publication in Studie z aplikované lingvistiky 2012
Jínová, Pavlína. 2012. Diverzita významů konektivních prostředků v rámci anotace mezivýpovědních významových vztahů v PDT: výraz přitom. Submitted for publication in Bohemica Iuvenilia 2012
Jínová, Pavlína, Mladová, Lucie, Mírovský, Jiří. 2011. Sentence Structure and Discourse Structure: Possible Parallels. In Proceedings of the International Conference on Dependency Linguistics (Depling 2011), Barcelona, Spain, pp. 233–240 (pdf)
Rysová, Magdaléna. 2011. Problematic relations in the PDT annotation: Explication and Cause. A presentation at the Prague Workshop in Discourse Annotation, Faculty of Mathematics and Physics, Charles University in Prague, May 2011
Jínová, Pavlína. 2011. Vybrané problematické aspekty konektivních prostředků v rámci anotace mezivýpovědních významových vztahů v PDT. Bohemica Iuvenilia, 2, pp. 138–147
Jínová, Pavlína. 2011. Connective means of causal relation in the Prague Dependency Treebank. A presentation at the workshop Conjunctions and Contextualizers, Faculty of Philosophy, Charles University in Prague, November 2011
Zikánová, Šárka. 2011. Contrast as a phenomenon of discourse, information structure and coreference. Oral presentation at The Sixth Annual Meeting of the Slavic Linguistics Society, Aix-en-Provence, France, September 2011

Coreference Relations

Complex description of the annotation scheme, inter-annotator agreement, tools

Nedoluzhko, Anna. 2011. Rozšířená textová koreference a asociační anafora. Koncepce anotace českých dat v Pražském závislostním korpusu. Prague, ÚFAL. (in Czech), 2011 (pdf)
Nedoluzhko, Anna, Mírovský, Jiří. 2011a. Annotating Extended Textual Coreference and Bridging Relations in the Prague Dependency Treebank. Annotation manual. Technical report No. 44, ÚFAL, Charles University in Prague, 2011, 63 pp. (pdf)
Nedoluzhko, Anna, Mírovský, Jiří, Hajičová, Eva, Pergler, Jiří, Ocelák, Radek. 2011b. Extended Textual Coreference and Bridging Relations in PDT 2.0. CD-ROM ÚFAL, Prague, 2011
Nedoluzhko, Anna. 2010. Coreferential relationships in text - comparative analysis of annotated data. In Kibrik, Alexandr E. et al. (eds.). Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference Dialogue 2010, Issue 9 (16), Moscow, RGGU, 2010 (html)
Nedoluzhko, Anna, Mírovský, Jiří, Pajas, Petr. 2010. Annotation Tool for Extended Textual Coreference and Bridging Anaphora. In Proceedings of LREC 2010. Malta, pp. 168-171 (pdf)
Nedoluzhko, Anna, Mírovský, Jiří, Pajas, Petr. 2009. The Coding Scheme for Annotating Extended Nominal Coreference and Bridging Anaphora in the Prague Dependency Treebank. In Proceedings of ACL-IJCNLP 2009, Linguistic Annotation Workshop (LAW III). Suntec, Singapore, 2009 (pdf)
Nedoluzhko, Anna, Mírovský, Jiří, Ocelák, Radek, Pergler, Jiří. 2009. Extended Coreferential Relations and Bridging Anaphora in the Prague Dependency Treebank. In: Proceedings of the 7th Discourse Anaphora and Anaphor Resolution Colloquium (DAARC 2009), Goa, India, ISBN 978-3-642-04974-3, 2009, pp. 1–16 (pdf)
Nedoluzhko, Anna. 2009. Razmetka koreferencii na sintaksičeski annotorovannom korpuse češskich tekstov. In Kibrik, Alexandr E. et al. (eds.), Computational Linguistics and Intellectual Technologies. Papers from the Annual International Conference DIALOGUE 2009. Issue 8 (15), Moscow: RGGU, 2009, pp. 332-339 (html)
Kučová, Lucie, Hajičová, Eva. 2004. Coreferential Relations in the Prague Dependency Treebank. In Proceedings of 5th Discourse Anaphora and Anaphor Resolution Colloquium, Edicoes Colibri
Kučová, Lucie, Kolářová, Veronika, Žabokrtský, Zdeněk, Pajas, Petr, Čulo, Oliver. 2003. Anotování koreference v Pražském závislostním korpusu. Prague: ÚFAL/CKL MFF UK, 51, Technical report 2003-19

Coreference resolution

Novák, Michal, Žabokrtský, Zdeněk. 2011. Resolving Noun Phrase Coreference in Czech. In Proceedings of the DAARC 2011 Conference, 2011, Faro, Portugal
Nguy, Giang Linh, Novák, Michal, Nedoluzhko, Anna. 2011. Coreference Resolution in the Prague Dependency Treebank. Technical Report No. 43, ÚFAL, Charles University in Prague, 2011, 71 pp. (pdf)
Novák, Michal. 2010. Machine learning approach to anaphora resolution. Master’s thesis, 2010, MFF UK, Prague
Nguy, G. Linh, Novák, Václav, Žabokrtský, Zdeněk. 2009. Comparison of Classification and Ranking Approaches to Pronominal Anaphora Resolution in Czech. In Proceedings of the SIGDIAL 2009 Conference, 2009, London, UK

Other related literature

Bejček, Eduard, Panevová, Jarmila, Popelka, Jan, Smejkalová, Lenka, Straňák, Pavel, Ševčíková, Magda, Štěpánek, Jan, Toman, Josef, Žabokrtský, Zdeněk, Hajič, Jan. 2011. Prague Dependency Treebank 2.5. Data/software, Charles University in Prague, MFF, ÚFAL, Prague, Czech Republic, Dec 2011 (http://ufal.mff.cuni.cz/pdt2.5/)
Mikulová, Marie et al. 2005. Anotace na tektogramatické rovině Pražského závislostního korpusu. Anotátorská příručka. Technical report no. 2005/TR-2005-28, ÚFAL MFF UK, Prague, ISSN 1214-5521, 2005, 1185 pp. (pdf)
Pajas, Petr, Štěpánek, Jan. 2008. Recent Advances in a Feature-Rich Framework for Treebank Annotation. In Proceedings of the 22nd International Conference on Computational Linguistics - Coling 2008, Manchester, UK, ISBN 978-1-905593-45-3, pp. 673-680
Prasad, Rashmi, Dinesh, Nikhil, Lee, Alan, Miltsakaki, Eleni, Robaldo, Livio, Joshi, Aravind, Webber, Bonnie. 2008. The Penn Discourse Treebank 2.0. In Proceedings of the 6th International Conference on Language Resources and Evaluation (LREC 2008), Marrakech, Morocco (pdf)
Miltsakaki, Eleni, Robaldo, Livio, Lee, Alan, Joshi, Aravind. 2008. Sense Annotation in the Penn Discourse Treebank. Proceedings of the 9th International Conference on Intelligent Text Processing and Computational Linguistics, Haifa, Israel, 2008
Hajič, Jan, Panevová, Jarmila, Hajičová, Eva, Sgall, Petr, Pajas, Petr, Štěpánek, Jan, Havelka, Jiří, Mikulová, Marie, Žabokrtský, Zdeněk, Ševčíková-Razímová, Magda. 2006. Prague Dependency Treebank 2.0. Software prototype, Linguistic Data Consortium, Philadelphia, PA, USA, ISBN 1-58563-370-4, www.ldc.upenn.edu, Jul 2006 (http://ufal.mff.cuni.cz/pdt2.0/)
Cohen, Jacob. 1960. A coefficient of agreement for nominal scales. Educational and Psychological Measurement 20 (1), 1960, pp. 37–46