Authors
Jan Hajič, Eva Hajičová, Jarmila Panevová, Petr Sgall, Silvie Cinková, Eva Fučíková, Marie Mikulová, Petr Pajas, Jan Popelka, Jiří Semecký, Jana Šindlerová, Jan Štěpánek, Josef Toman, Zdeňka Urešová, Zdeněk Žabokrtský
Credits
The Prague Czech English Dependency Treebank 2.0 is the result of a joint effort by many people. Names are listed alphabetically by last name throughout, except for publications (such as the Annotator's Guidelines) and tools, where the published order of the authors is respected.
Coordinator (EN, CZ): Jan Hajič
Linguistic support (EN, CZ): Eva Hajičová, Jarmila Panevová, Petr Sgall
Coordination, training, manuals:
- EN: Silvie Cinková
- CZ: Marie Mikulová
Valency lexicons:
- EN (EngvalLex): Jana Šindlerová
- CZ (PDT-ValLex): Zdeňka Urešová
Data pre-processing, annotation support and post-annotation checking:
- EN: Silvie Cinková, Eva Fučíková, Josef Toman, Jiří Semecký
- CZ: Marie Mikulová, Jan Popelka, Jan Štěpánek
Major software and data processing modules: Petr Pajas, Zdeněk Žabokrtský
Additional annotator training (EN): Jana Šindlerová
Annotators:
- English deep-syntax (tectogrammatical) annotation: Kristýna Čermáková, Vojtěch Diatka, Matěj Korvas, Ema Krejčová, Jan Mašek, Anja Nedolužko, Lucie Poláková, Magdalena Rysová, Lenka Šíková, Jana Šindlerová, Kristýna Tomšů, Kateřina Veselá, Kateřina Veselovská
- Czech deep-syntax (tectogrammatical) annotation: Zuzanna Bedřichová, Kristýna Čermáková, Jitka Faktorová, Ivana Klímová, Martina Koppová, Alena Kropíková, Michala Lvová, Aneta Pečenková, Lenka Šíková, Katka Voleková, Olga Zitová
- Czech surface-syntax (analytical) annotation of 2,000 sentences: Ivana Klímová
- Czech coreference annotation: Eliška Černá, Veronika Čurdová, Eliška Davidová, Vojtěch Diatka, Ivan Kafka, Radka Mačugová, Hana Vildová, Klára Zindulková, Zdeněk Zůcha
Czech translation supervision and revisions: Marie Mikulová, Jan Štěpánek
Tools:
- TrEd: Petr Pajas, Peter Fabian
- btred: Petr Pajas
- PML Tree Query: Petr Pajas, Jan Štěpánek
- Treex: Zdeněk Žabokrtský, Martin Popel, David Mareček, Ondřej Bojar, Václav Klimeš, Tomáš Kraut, Václav Novák, Jan Ptáček, Rudolf Rosa, Daniel Zeman
- Segmentation and tokenization of Czech texts: Jan Hajič, Michal Křen
- Morphological Analyzer of Czech: Jan Hajič, Jaroslava Hlaváčová
- English lemmatization: Jiří Semecký
- Czech Tagger: Jan Hajič
- A-layer parser for Czech: Jason Baldridge, Ryan McDonald (MST parser)
- T-layer parser for annotation of Czech: Václav Klimeš
- Wrappers for the parsers: Jan Hajič
- Aligner: David Mareček, Václav Novák, Zdeněk Žabokrtský
- Web-based interface for annotation progress monitoring: Eva Fučíková, Jiří Semecký, Jan Štěpánek, Josef Toman
- XSH: Petr Pajas
Publications:
- Collection: Silvie Cinková
- Formatting: Josef Toman, Silvie Cinková
DVD-ROM, web design: Josef Toman
Data validation: Eva Fučíková, Josef Toman
Accompanying documentation: Silvie Cinková, Josef Toman, Jan Hajič
The English part of PCEDT 2.0 draws on other annotations performed worldwide. Although our linguistic approach differs in many respects, we made substantial use of these annotation efforts when automatically pre-processing the data for our annotators. We are very grateful to the teams behind the flat noun phrase annotation, the Penn Treebank, PropBank, VerbNet, NomBank and the BBN Pronoun Coreference and Entity Type Corpus, whose work saved our annotators a great deal of time. They are (at least):
- James R. Curran and David Vadas (flat noun phrase annotation)
- Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz and Ann Taylor (Penn Treebank)
- Martha Palmer, Paul Kingsbury, Olga Babko-Malaya, Scott Cotton, and Benjamin Snyder (PropBank)
- Martha Palmer, Karin Kipper, Edward Loper, Szuting Yi, Susan Brown, Arrick Lafranchi, Russell-Lee Goldman, Derek Trumbo, Andy Dolbey, Hoa Trang Dang, Neville Ryan, Benjamin Snyder (VerbNet)
- Adam Meyers, Ruth Reeves, Catherine Macleod (NomBank)
- Ralph Weischedel and Ada Brunstein (BBN Pronoun Coreference and Entity Type Corpus)
The Czech part of the corpus was parsed with the MST Parser developed by Jason Baldridge and Ryan McDonald.
This web page uses Oxygen icons (among others). These icons can be freely copied under the LGPLv3.