David Mareček
Institute of Formal and Applied Linguistics
Faculty of Mathematics and Physics
Charles University in Prague
Address: UFAL, MFF UK, Malostranské náměstí 25, CZ-118 00 Praha
tel: +420-2-2191-4273
e-mail: marecek@ufal.mff.cuni.cz
Research interests
dependency syntax, unsupervised parsing, machine translation
Projects
- TectoMT - Modular NLP Framework, Tree-based machine translation system with transfer on deep syntax
- FAUST - Feedback Analysis for User Adaptive Statistical Translation
Thesis topics (in Czech)
year projects, bachelor's theses, master's theses
Publications
Theses
- Doctoral thesis: Unsupervised Dependency Parsing [pdf] [slides]
- Master thesis: Automatic Alignment of Tectogrammatical Trees from Czech-English Parallel Corpus [pdf] [slides]
- Bachelor thesis: Novelizátor zákonů (in Czech) [pdf]
Papers
2012
-
Rudolf Rosa and David Mareček:
Dependency Relations Labeller for Czech.
In Text, Speech and Dialogue, Lecture Notes in Computer Science, Volume 7499, pp. 256-263, Springer-Verlag Berlin/Heidelberg, 2012
[pdf]
-
David Mareček and Zdeněk Žabokrtský:
Exploiting Reducibility in Unsupervised Dependency Parsing.
In Proceedings of the 2012 Joint Conference on Empirical Methods in Natural Language Processing and Computational Natural Language Learning,
pages 297-307, Jeju Island, Korea, July, 2012
[pdf]
[bib]
[slides]
-
Rudolf Rosa, Ondřej Dušek, David Mareček, and Martin Popel:
Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors.
In Proceedings of the Sixth Workshop on Syntax, Semantics and Structure in Statistical Translation, pages 39-48, Jeju Island, Korea, 2012
[pdf]
[bib]
-
Rudolf Rosa, David Mareček, and Ondřej Dušek:
DEPFIX: A System for Automatic Correction of Czech MT Outputs.
In Proceedings of the Seventh Workshop on Statistical Machine Translation, Association for Computational Linguistics,
pages 362-368, Montreal, Canada, June 7-8, 2012
[pdf]
[bib]
-
Ondřej Dušek, Zdeněk Žabokrtský, Martin Popel, Martin Majliš, Michal Novák, and David Mareček:
Formemes in English-Czech Deep Syntactic MT.
In Proceedings of the Seventh Workshop on Statistical Machine Translation, Association for Computational Linguistics,
pages 267-274, Montreal, Canada, June 7-8, 2012
[pdf]
[bib]
-
David Mareček, Zdeněk Žabokrtský:
Unsupervised Dependency Parsing using Reducibility and Fertility features.
In Proceedings of the NAACL-HLT Workshop on the Induction of Linguistic Structure,
pages 84–89, Montréal, Canada, June 7, 2012
[pdf]
-
Daniel Zeman, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský and Jan Hajič:
HamleDT: To Parse or Not to Parse?
In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12),
pp. 2735-2741, Istanbul, Turkey, 2012
[pdf]
-
Ondřej Bojar, Zdeněk Žabokrtský, Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček, Jiří Maršík, Michal Novák, Martin Popel and Aleš Tamchyna:
The Joy of Parallelism with CzEng 1.0.
In Proceedings of the Eighth International Conference on Language Resources and Evaluation (LREC'12),
pp. 3921-3928, Istanbul, Turkey, 2012
[pdf]
2011
-
David Mareček and Zdeněk Žabokrtský:
Gibbs Sampling with Treeness constraint in Unsupervised Dependency Parsing.
In Proceedings of RANLP Workshop on Robust Unsupervised and Semisupervised Methods in Natural Language Processing,
pp. 1–8, Hissar, Bulgaria, 2011
[pdf]
[slides PPT]
-
Martin Popel, David Mareček, Nathan Green and Zdeněk Žabokrtský:
Influence of Parser Choice on Dependency-Based MT.
In Proceedings of WMT 2011, EMNLP 6th Workshop on Statistical Machine Translation,
Edinburgh, UK, pp. 433–439, 2011
[pdf]
[poster JPG]
-
David Mareček, Rudolf Rosa, Petra Galuščáková and Ondřej Bojar:
Two-step translation with grammatical post-processing.
In Proceedings of WMT 2011, EMNLP 6th Workshop on Statistical Machine Translation,
Edinburgh, UK, pp. 426–432, 2011
[pdf]
[poster PDF]
-
David Mareček:
Combining Diverse Word-Alignment Symmetrizations Improves Dependency Tree Projection.
In Computational Linguistics and Intelligent Text Processing, Lecture Notes in Computer Science
Volume 6608/2011, pages 144-154, DOI: 10.1007/978-3-642-19400-9_12, Springer Berlin/Heidelberg, 2011
[pdf]
[slides PPT]
2010
-
Natalia Klyueva, David Mareček:
Towards Parallel Czech-Russian Dependency Treebank.
In Proceedings of AEPC 2010: Workshop on Annotation and Exploitation of Parallel Corpora,
Tartu, Estonia, 2010
[pdf]
-
Martin Popel, David Mareček:
Perplexity of n-gram and Dependency Language Models.
In Proceedings of TSD 2010, 13th International Conference on Text, Speech and Dialogue,
Brno, Czech Republic, 2010
[pdf]
-
Ondřej Bojar, Kamil Kos, David Mareček:
Tackling Sparse Data Issue in Machine Translation Evaluation.
In Proceedings of 48th Annual Meeting of the Association for Computational Linguistics,
Uppsala, Sweden, 2010
[pdf]
-
David Mareček, Martin Popel, Zdeněk Žabokrtský:
Maximum Entropy Translation Model in Dependency-Based MT Framework.
In Proceedings of the Fifth Workshop on Statistical Machine Translation,
Uppsala, Sweden, 2010
[pdf]
2009
-
David Mareček, Natalia Kljueva:
Converting Russian Treebank SynTagRus into Praguian PDT Style.
In Proceedings of Multilingual resources, technologies and evaluation for Central and Eastern European languages,
Borovets, Bulgaria, 2009
[pdf]
[slides]
[poster]
-
David Mareček:
Improving Word Alignment Using Alignment of Deep Structures.
In Proceedings of The 12th International Conference TSD 2009,
Plzeň, Czech Republic, 2009
[pdf]
[slides]
-
David Mareček:
Using Tectogrammatical Alignment in Phrase-Based Machine Translation.
WDS'09 Proceedings of Contributed Papers, MFF UK, Prague, 2009
[pdf]
[slides]
-
Ondřej Bojar, David Mareček, Václav Novák, Martin Popel, Jan Ptáček, Jan Rouš, Zdeněk Žabokrtský:
English-Czech MT in 2008.
In Proceedings of the Fourth Workshop on Statistical Machine Translation,
Athens, Greece. Association for Computational Linguistics, 2009
[pdf]
2008
-
David Mareček, Zdeněk Žabokrtský, Václav Novák:
Automatic Alignment of Czech and English Deep Syntactic Dependency Trees.
In Proceedings of EAMT08, Hamburg, Germany, 2008
[pdf]
[poster]
- Petr Pajas, David Mareček: MEd - an editor of interlinked multi-layered linearly-structured linguistic annotations, UK MFF UFAL, 2007
Other talks
- Unsupervised Dependency Parsing, ÚFAL Monday seminar, Prague, April 2, 2012 [slides PPT]
- Unsupervised Dependency Parsing (talk in Czech), Res Informatica project meeting, Prague, November 15, 2011 [slides PPT]
- Dependency tree projection across parallel texts, FEAST meeting, Saarland University, Saarbrücken, October 18, 2010 [slides PPT] [slides PDF]
- Maximum Entropy Translation Model in Dependency-Based MT Framework, PIRE meeting, Uppsala, July 15, 2010 [slides]
- Using alignment of tectogrammatical trees in phrase-based machine translation, SMT seminar, Prague, May 11, 2009 [slides]
- Analysis and alignment of parallel data in TectoMT, MT Marathon, Prague, January 29, 2009 [slides]
- Automatic alignment of nodes in Czech and English tectogrammatical trees (talk in Czech), Monday seminar, Prague, October 13, 2008 [slides]