David Mareček

office
409
email
david.marecek@mff.cuni.cz
phone
+420 951 554 273
fax
+420 257 223 293
address
Malostranské náměstí 25
118 00 Praha 1
Czech Republic

Main Research Interests

dependency syntax, unsupervised parsing, machine translation

Projects

Teaching

NPFL097 - Selected Problems in Machine Learning

Selected Bibliography

Theses

  • Doctoral thesis (2012): Unsupervised Dependency Parsing [pdf] [slides]
  • Master thesis (2008): Automatic Alignment of Tectogrammatical Trees from Czech-English Parallel Corpus [pdf] [slides]
  • Bachelor thesis (2006): Novelizátor zákonů (in Czech) [pdf]

Selected Papers

list of all publications

2015

  • David Mareček: Multilingual Unsupervised Dependency Parsing with Unsupervised POS tags. In: MICAI 2015: Advances in Artificial Intelligence and Soft Computing, Part I, pp. 72-82, Springer, Berlin/Heidelberg, ISBN 978-3-319-27059-3

2014

  • Daniel Zeman, Ondřej Dušek, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský, and Jan Hajič: HamleDT: Harmonized Multi-Language Dependency Treebank. In Language Resources and Evaluation, ISSN 1574-020X, vol. 48, no. 4, pp. 601-637, 2014
  • Pavel Pecina, Ondřej Dušek, Lorraine Goeuriot, Jan Hajič, Jaroslava Hlaváčová, Gareth J.F. Jones, Liadh Kelly, Johannes Leveling, David Mareček, Michal Novák, Martin Popel, Rudolf Rosa, Aleš Tamchyna, Zdeňka Urešová: Adaptation of machine translation for multilingual information retrieval in medical domain. In: Artificial Intelligence in Medicine, ISSN 0933-3657, vol. 61, no. 3, pp. 165-185, 2014

2013

  • David Mareček and Milan Straka: Stop-probability estimates computed on a large corpus improve Unsupervised Dependency Parsing. In Annual Meeting of the Association for Computational Linguistics (ACL'13), Sofia, Bulgaria, August 2013 [pdf]
  • Martin Popel, David Mareček, Jan Štěpánek, Daniel Zeman, and Zdeněk Žabokrtský: Coordination Structures in Dependency Treebanks. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, pp. 517-527, Association for Computational Linguistics, Sofia, Bulgaria, August 2013
  • Rudolf Rosa, David Mareček, and Aleš Tamchyna: Deepfix: Statistical Post-editing of Statistical Machine Translation Using Deep Syntactic Analysis. In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics, Student Research Workshop, pp. 172-179, Association for Computational Linguistics, Sofia, Bulgaria, August 2013

2012

  • Rudolf Rosa and David Mareček: Dependency Relations Labeller for Czech. In Text, Speech and Dialogue, Lecture Notes in Computer Science, Volume 7499, pp. 256-263, Springer-Verlag Berlin/Heidelberg, 2012 [pdf]
  • David Mareček and Zdeněk Žabokrtský: Exploiting Reducibility in Unsupervised Dependency Parsing. In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning, pages 297-307, Jeju Island, Korea, July, 2012 [pdf] [bib] [slides]
  • Rudolf Rosa, Ondřej Dušek, David Mareček, and Martin Popel: Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors. In Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation, pages 39-48, Jeju Island, Korea, 2012 [pdf] [bib]
  • Rudolf Rosa, David Mareček, and Ondřej Dušek: DEPFIX: A System for Automatic Correction of Czech MT Outputs. In Proceedings of the Seventh Workshop on Statistical Machine Translation, Association for Computational Linguistics, pages 362-368, Montreal, Canada, June 7-8, 2012 [pdf] [bib]
  • Ondřej Dušek, Zdeněk Žabokrtský, Martin Popel, Martin Majliš, Michal Novák, and David Mareček: Formemes in English-Czech Deep Syntactic MT. In Proceedings of the Seventh Workshop on Statistical Machine Translation, Association for Computational Linguistics, pages 267-274, Montreal, Canada, June 7-8, 2012 [pdf]
  • David Mareček, Zdeněk Žabokrtský: Unsupervised Dependency Parsing using Reducibility and Fertility features. In Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure, pages 84–89, Montréal, Canada, June 7, 2012 [pdf]
  • Daniel Zeman, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský and Jan Hajič: HamleDT: To Parse or Not to Parse? In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pp. 2735-2741, Istanbul, Turkey, 2012 [pdf]
  • Ondřej Bojar, Zdeněk Žabokrtský, Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček, Jiří Maršík, Michal Novák, Martin Popel and Aleš Tamchyna: The Joy of Parallelism with CzEng 1.0. In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12), pp. 3921-3928, Istanbul, Turkey, 2012 [pdf]

2011

  • David Mareček and Zdeněk Žabokrtský: Gibbs Sampling with Treeness constraint in Unsupervised Dependency Parsing. In Proceedings of RANLP Workshop on Robust Unsupervised and Semisupervised Methods in Natural Language Processing, pp. 1–8, Hissar, Bulgaria, 2011 [pdf] [slides PPT]
  • Martin Popel, David Mareček, Nathan Green and Zdeněk Žabokrtský: Influence of Parser Choice on Dependency-Based MT. In Proceedings of WMT 2011, EMNLP 6th Workshop on Statistical Machine Translation, Edinburgh, UK, pp. 433–439, 2011 [pdf] [poster JPG]
  • David Mareček, Rudolf Rosa, Petra Galuščáková and Ondřej Bojar: Two-step translation with grammatical post-processing. In Proceedings of WMT 2011, EMNLP 6th Workshop on Statistical Machine Translation, Edinburgh, UK, pp. 426–432, 2011 [pdf] [poster PDF]
  • David Mareček: Combining Diverse Word-Alignment Symmetrizations Improves Dependency Tree Projection. In Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science Volume 6608/2011, pages 144-154, DOI: 10.1007/978-3-642-19400-9_12, Springer Berlin/Heidelberg, 2011 [pdf] [slides PPT]

2010

  • Natalia Klyueva, David Mareček: Towards Parallel Czech-Russian Dependency Treebank. In Proceedings of AEPC 2010: Workshop on Annotation and Exploitation of Parallel Corpora, Tartu, Estonia, 2010 [pdf]
  • Martin Popel, David Mareček: Perplexity of n-gram and Dependency Language Models. In Proceedings of TSD 2010, 13th International Conference on Text, Speech and Dialogue, Brno, Czech Republic, 2010 [pdf]
  • Ondřej Bojar, Kamil Kos, David Mareček: Tackling Sparse Data Issue in Machine Translation Evaluation. In Proceedings of 48th Annual Meeting of the Association for Computational Linguistics, Uppsala, Sweden, 2010 [pdf]
  • David Mareček, Martin Popel, Zdeněk Žabokrtský: Maximum Entropy Translation Model in Dependency-Based MT Framework. In Proceedings of the Fifth Workshop on Statistical Machine Translation, Uppsala, Sweden, 2010 [pdf]

2009

  • David Mareček, Natalia Kljueva: Converting Russian Treebank SynTagRus into Praguian PDT Style. In Proceedings of Multilingual resources, technologies and evaluation for Central and Eastern European languages, Borovets, Bulgaria, 2009 [pdf] [slides] [poster]
  • David Mareček: Improving Word Alignment Using Alignment of Deep Structures. In Proceedings of The 12th International Conference TSD 2009, Plzeň, Czech Republic, 2009 [pdf] [slides]
  • David Mareček: Using Tectogrammatical Alignment in Phrase-Based Machine Translation. WDS'09 Proceedings of Contributed Papers, MFF UK, Prague, 2009 [pdf] [slides]
  • Ondřej Bojar, David Mareček, Václav Novák, Martin Popel, Jan Ptáček, Jan Rouš, Zdeněk Žabokrtský: English-Czech MT in 2008. In Proceedings of the Fourth Workshop on Statistical Machine Translation, Athens, Greece. Association for Computational Linguistics, 2009 [pdf]

2008

  • David Mareček, Zdeněk Žabokrtský, Václav Novák: Automatic Alignment of Czech and English Deep Syntactic Dependency Trees. In Proceedings of EAMT08, Hamburg, Germany, 2008 [pdf] [poster]
     
  • Petr Pajas, David Mareček: MEd - an editor of interlinked multi-layered linearly-structured linguistic annotations, UK MFF UFAL, 2007

Other talks

  • Unsupervised Dependency Parsing, Monday seminar at ÚFAL, Praha, April 2, 2012 [slides PPT]
  • Neřízený závislostní parsing (Unsupervised Dependency Parsing), Res Informatica project meeting, Praha, November 15, 2011 [slides PPT]
  • Dependency tree projection across parallel texts, FEAST meeting, Saarland University, Saarbrücken, October 18, 2010 [slides PPT] [slides PDF]
  • Maximum Entropy Translation Model in Dependency-Based MT Framework, PIRE meeting, Uppsala, July 15, 2010 [slides]
  • Using alignment of tectogrammatical trees in phrase-based machine translation, SMT seminar, Praha, May 11, 2009 [slides]
  • Analysis and alignment of parallel data in TectoMT, MT Marathon, Prague, January 29, 2009 [slides]
  • Automatické párování uzlů českých a anglických tektogramatických stromů (Automatic Alignment of Nodes in Czech and English Tectogrammatical Trees), Monday seminar, Praha, October 13, 2008 [slides]

 

Students

Master students:

Bachelor students:

  • Filip Hlásek - Automatický dešifrátor pro šifrovací hry (Automatic Decipherer for Cipher Games), defended in 2014