Ondřej Dušek - Bibliography

Papers

2021

  • Jonáš Kulhánek, Vojtěch Hudeček, Tomáš Nekvinda, Ondřej Dušek. AuGPT: Auxiliary Tasks and Data Augmentation for End-To-End Dialogue with Pre-Trained Language Models, In: NLP4ConvAI Workshop. [arXiv]
  • Xinnuo Xu, Ondřej Dušek, Shashi Narayan, Verena Rieser, Ioannis Konstas. MiRANews: Dataset and Benchmarks for Multi-Resource-Assisted News Summarization, In: EMNLP Findings. [Anthology]
  • Emiel van Miltenburg, Miruna Clinciu, Ondřej Dušek, Dimitra Gkatzia, Stephanie Inglis, Leo Leppänen, Saad Mahamood, Emma Manning, Stephanie Schoch, Craig Thomson, Luou Wen. Underreporting of errors in NLG output, and what to do about it, In: INLG (Commendation for an outstanding position paper). [Anthology]
  • Zdeněk Kasner, Simon Mille and Ondřej Dušek. Text-in-Context: Token-Level Error Detection for Table-to-Text Generation, In: INLG. [Anthology / Poster]
  • Vojtěch Hudeček, Ondřej Dušek and Zhou Yu. Discovering Dialogue Slots with Weak Supervision, In: ACL. [Anthology]
  • Xinnuo Xu, Ondřej Dušek, Verena Rieser and Ioannis Konstas. AggGen: Ordering and Aggregating while Generating, In: ACL. [Anthology]
  • Tomáš Nekvinda and Ondřej Dušek. Shades of BLEU, Flavours of Success: The Case of MultiWOZ, In: GEM Workshop. [Anthology]
  • Sebastian Gehrmann et al. (50+ authors). The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics, In: GEM Workshop. [Anthology]
  • Léon-Paul Schaub, Vojtěch Hudeček, Daniel Štancl, Ondřej Dušek and Patrick Paroubek. Defining And Detecting Inconsistent System Behavior in Task-oriented Dialogues, In: TALN-RECITAL. [Anthology]

2020

  • Ondřej Dušek and Zdeněk Kasner. Evaluating Semantic Accuracy of Data-to-Text Generation with Natural Language Inference, In: INLG (Best Paper Award). [ACL anthology / video / Github]
  • Zdeněk Kasner and Ondřej Dušek. Data-to-Text Generation with Iterative Text Editing, In: INLG. [ACL anthology]
  • Zdeněk Kasner and Ondřej Dušek. Train Hard, Finetune Easy: Multilingual Denoising for RDF-to-Text Generation, In: WebNLG+ Workshop. [PDF]
  • Jindřich Libovický, Zdeněk Kasner, Jindřich Helcl, and Ondřej Dušek. Expand and Filter: CUNI and LMU Systems for the WNGT 2020 Duolingo Shared Task, In: WNGT Workshop. [ACL anthology]
  • Tomáš Nekvinda and Ondřej Dušek. One Model, Many Languages: Meta-learning for Multilingual Text-to-Speech, In: Interspeech. [ISCA archive / Github]
  • Jan Vainer and Ondřej Dušek. SpeedySpeech: Efficient Neural Speech Synthesis, In: Interspeech. [ISCA archive / Github]
  • Xinnuo Xu, Ondřej Dušek, Jingyi Li, Verena Rieser, and Ioannis Konstas. Fact-based Content Weighting for Evaluating Abstractive Summarisation, In: ACL. [ACL anthology / video / Github]

2019

  • Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Evaluating the State-of-the-Art of End-to-End Natural Language Generation: The E2E NLG Challenge, In: Computer Speech and Language. [ScienceDirect / arXiv / web]
  • Ondřej Dušek, Karin Sevegnani, Ioannis Konstas, and Verena Rieser. Automatic Quality Estimation for Natural Language Generation: Ranting (Jointly Rating and Ranking), In: INLG, Tokyo. [arXiv / slides / Github]
  • Ondřej Dušek, David M. Howcroft, and Verena Rieser. Semantic Noise Matters for Neural Natural Language Generation, In: INLG, Tokyo. [PDF / poster / Github]
  • Ondřej Dušek and Filip Jurčíček. Neural Generation for Czech: Data and Baselines, In: INLG, Tokyo. [arXiv / slides / Github (code) / Github (data)]
  • Simon Keizer, Ondřej Dušek, Xingkun Liu, and Verena Rieser. User Evaluation of a Multi-dimensional Statistical Dialogue System, In: SIGDIAL, Stockholm. [ACL / arXiv / poster / code]

2018

  • Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Findings of the E2E NLG Challenge, In: INLG, Tilburg. [arXiv / web / slides]
  • Xinnuo Xu, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. Better Conversations by Modeling, Filtering, and Optimizing for Coherence and Diversity, In: EMNLP, Brussels. [arXiv / Github / poster]
  • Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. RankME: Reliable Human Ratings for Natural Language Generation, In: NAACL, New Orleans, 2018. [arXiv / Poster / Github]
  • Shubham Agarwal, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. Improving Context Modelling in Multimodal Dialogue Generation, In: INLG, Tilburg. [arXiv / Github / poster]
  • Shubham Agarwal, Ondřej Dušek, Ioannis Konstas, and Verena Rieser. A Knowledge-Grounded Multimodal Search-Based Conversational Agent, In: SCAI EMNLP workshop, Brussels. [arXiv / Github / poster]
  • Igor Shalyminov, Ondřej Dušek, and Oliver Lemon. Neural Response Ranking for Social Conversation: A Data-Efficient Approach, In: SCAI EMNLP workshop, Brussels. [arXiv / Github / slides]

2017

  • Ondřej Dušek, Jekaterina Novikova, and Verena Rieser. Referenceless Quality Estimation for Natural Language Generation, In: LGNL, Sydney, 2017. [arXiv / Poster / Slides / Github]
  • Jekaterina Novikova, Ondřej Dušek, Amanda Cercas Curry, and Verena Rieser. Why We Need New Evaluation Metrics for NLG, In: EMNLP, Copenhagen, 2017. [arXiv / Github]
  • Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. The E2E Dataset: New Challenges For End-to-End Generation, In: SIGDIAL, Saarbrücken, 2017. [arXiv / Web / Poster / Slides / Video]
  • Jekaterina Novikova, Ondřej Dušek, and Verena Rieser. Data-driven Natural Language Generation: Paving the Road to Success, In: WiNLP, Vancouver, 2017. [arXiv]

2016

  • Ondřej Dušek and Filip Jurčíček. A Context-aware Natural Language Generator for Dialogue Systems, In: SIGDIAL, Los Angeles, 2016. [PDF / arXiv / Software]
  • Ondřej Dušek and Filip Jurčíček. Sequence-to-Sequence Generation for Spoken Dialogue via Deep Syntax Trees and Strings, In: ACL, Berlin, 2016. [PDF / arXiv / Software]
  • Ondřej Bojar, Ondřej Dušek, Tom Kocmi, Jindřich Libovický, Michal Novák, Martin Popel, Roman Sudarikov, and Dušan Variš. CzEng 1.6: Enlarged Czech-English Parallel Corpus with Processing Tools Dockered, In: TSD, Brno, 2016. [View on SpringerLink]
  • Rudolf Rosa, Martin Popel, Ondřej Bojar, David Mareček, and Ondřej Dušek. Moses & Treex Hybrid MT Systems Bestiary, In: DMTW, Lisbon, 2016. [PDF]
  • Roman Sudarikov, Ondřej Bojar, Ondřej Dušek, Martin Holub, and Vincent Kríž. Verb Sense Disambiguation in Machine Translation, In: HyTra-6, Osaka, 2016. [PDF]
  • Ondřej Dušek and Filip Jurčíček. A Context-aware Natural Language Generation Dataset for Dialogue Systems, In: RE-WOCHAT, Portorož, 2016. [PDF / PDF slides]

2015

  • Rudolf Rosa, Ondřej Dušek, Michal Novák, and Martin Popel. Translation Model Interpolation for Domain Adaptation in TectoMT, In: Deep MT Workshop, Prague, 2015. [PDF / PDF slides]
  • Ondřej Dušek, Luís Gomes, Michal Novák, Martin Popel, and Rudolf Rosa. New Language Pairs in TectoMT, In: WMT, Lisbon, 2015. [PDF / PDF poster]
  • Ondřej Dušek and Filip Jurčíček. Training a Natural Language Generator from Unaligned Data, In: ACL-IJCNLP, Beijing, 2015. [PDF / PDF slides / PDF poster (for YRRSDS) / Presentation video / Software]
  • Ondřej Dušek, Eva Fučíková, Jan Hajič, Martin Popel, Jana Šindlerová, and Zdeňka Urešová. Using Parallel Texts and Lexicons for Verbal Word Sense Disambiguation, In: Depling, Uppsala, 2015. [PDF / PDF slides]
  • Zdeňka Urešová, Ondřej Dušek, Eva Fučíková, Jan Hajič, and Jana Šindlerová. Bilingual English-Czech Valency Lexicon Linked to a Parallel Corpus, In: LAW IX - The 9th Linguistic Annotation Workshop, Denver, 2015. [PDF]

2014

  • Daniela Majchráková, Ondřej Dušek, Jan Hajič, Agáta Karčová, Radovan Garabík. Semi-automatic Detection of Multiword Expressions in the Slovak Dependency Treebank, In: Computational Linguistics in Bulgaria, Sofia, 2014. [PDF]
  • Daniel Zeman, Ondřej Dušek, David Mareček, Martin Popel, Loganathan Ramasamy, Jan Štěpánek, Zdeněk Žabokrtský and Jan Hajič. HamleDT: Harmonized multi-language dependency treebank, in: Language Resources and Evaluation (48) 4, December 2014. [View on SpringerLink]
  • Ondřej Dušek, Ondřej Plátek, Lukáš Žilka, and Filip Jurčíček. Alex: Bootstrapping a Spoken Dialogue System for a New Domain by Real Users, in: Proceedings of Sigdial, Philadelphia, 2014. [PDF / PDF poster]
  • Ondřej Dušek, Jan Hajič, Jaroslava Hlaváčová, Michal Novák, Pavel Pecina, Rudolf Rosa, Aleš Tamchyna, Zdeňka Urešová and Daniel Zeman. Machine Translation of Medical Texts in the Khresmoi Project, in: Ninth Workshop on Statistical Machine Translation, Baltimore, 2014. [PDF]
  • Ondřej Dušek, Jan Hajič, and Zdeňka Urešová: Verbal Valency Frame Detection and Selection in Czech and English, in: The 2nd Workshop on EVENTS, Baltimore, 2014. [PDF / PDF poster]
  • Pavel Pecina, Ondřej Dušek, Lorraine Goeuriot, Jan Hajič, Jaroslava Hlaváčová, Gareth Jones, Liadh Kelly, Johannes Leveling, David Mareček, Michal Novák, Martin Popel, Rudolf Rosa, Aleš Tamchyna, and Zdeňka Urešová: Adaptation of Machine Translation for Multilingual Information Retrieval in the Medical Domain, in: Artificial Intelligence in Medicine (61) 3, 2014. [View on ScienceDirect]
  • Matěj Korvas, Ondřej Plátek, Ondřej Dušek, Lukáš Žilka, and Filip Jurčíček: Free English and Czech Telephone Speech Corpus Shared Under the CC-BY-SA 3.0 License, in: Proceedings of LREC, Reykjavík, 2014. [PDF / PDF slides]
  • Zdeňka Urešová, Ondřej Dušek, Jan Hajič, and Pavel Pecina: Multilingual Test Sets for Machine Translation of Search Queries for Cross-lingual Information Retrieval in the Medical Domain, in: Proceedings of LREC, Reykjavík, 2014. [PDF / PDF poster]

2013

  • Ondřej Dušek, Filip Jurčíček: Robust Multilingual Statistical Morphological Generation Models, in: ACL Student Research Workshop, Sofia, 2013. [PDF / PDF slides / Presentation video / Software used for the experiments]
  • Ondřej Dušek: Towards a Truly Statistical Natural Language Generator for Spoken Dialogues, in: Week of Doctoral Students. Prague, 2013. [PDF / PDF slides]
  • Aleš Tamchyna, Ondřej Dušek, Rudolf Rosa, Pavel Pecina: MTMonkey: A Scalable Infrastructure for a Machine Translation Web Service, in: The Prague Bulletin of Mathematical Linguistics 100, 31-40. [PDF / PDF poster / Software]

2012

  • Ondřej Dušek, Zdeněk Žabokrtský, Martin Popel, Martin Majliš, Michal Novák, David Mareček: Formemes in English-Czech Deep Syntactic MT, in: Proceedings of the Seventh Workshop on Statistical Machine Translation, Montréal, 2012. [PDF]
  • Rudolf Rosa, David Mareček, Ondřej Dušek: DEPFIX: A System for Automatic Correction of Czech MT Outputs, in: Proceedings of the Seventh Workshop on Statistical Machine Translation, Montréal, 2012. [PDF]
  • Rudolf Rosa, Ondřej Dušek, David Mareček, Martin Popel: Using Parallel Features in Parsing of Machine-Translated Sentences for Correction of Grammatical Errors, in: Proceedings of SSST-6, Jeju, 2012. [PDF]
  • Ondřej Bojar, Zdeněk Žabokrtský, Ondřej Dušek, Petra Galuščáková, Martin Majliš, David Mareček, Jiří Maršík, Michal Novák, Martin Popel, Aleš Tamchyna: The Joy of Parallelism with CzEng 1.0, in: Proceedings of LREC, Istanbul, 2012. [PDF]

Theses

  • Novel Methods for Natural Language Generation in Spoken Dialogue Systems. Ph.D. Thesis, Faculty of Mathematics and Physics, Charles University, Prague, 2017. [PDF / PDF summary / PDF slides]
  • Confrontation of Czech and German valency lexicons. Master's thesis, Faculty of Arts, Charles University in Prague, 2013. [PDF (in German)]
  • Deep automatic analysis of English. Master's thesis, Faculty of Mathematics and Physics, Charles University in Prague, 2010. [PDF]
  • BashCommander. Bachelor thesis, Faculty of Mathematics and Physics, Charles University in Prague, 2007. [PDF]

Talks

  • Large Neural Language Models for Data-to-text Generation. AICZECHIA Seminar, Online. Mar 22, 2022. [PDF slides]
  • Better Supervision for End-to-end Neural Dialogue Systems. VSG Invited Talks @ FIT, Brno University of Technology. Dec 1, 2021. [Web / PDF slides / Video]
  • Accuracy in Neural Text Generation. Heinrich-Heine University of Düsseldorf seminar on Selected Topics in Machine Learning and Natural Language Processing. Jul 23, 2021. [PDF slides]
  • Dialogue Systems at Charles University. Czechbots conference. Mar 3, 2020. [PDF slides]
  • Challenges in Neural NLG. ÚFAL Monday seminar. Dec 2, 2019. [PDF slides]
  • Challenges in Neural NLG. Apple Cambridge. Oct 16, 2019. [PDF slides]
  • Challenges in Response Generation and Conversational AI. ILCC/HCRC Seminar, University of Edinburgh. Sep 14, 2018. [PPTX slides (24MB)]
  • Can You Be Friends with a Smart Speaker Device? Pint of Science Festival, Edinburgh. May 15, 2018. [PPTX slides (63MB)]
  • Sequence-to-sequence Natural Language Generation. University of Sheffield. Jun 1, 2017. [PDF slides]
  • Home Intelligent? Assistants. Edinburgh Science Festival. Apr 8, 2017. [PPTX slides (63MB)]
  • Sequence-to-sequence Natural Language Generation for Spoken Dialogue Systems. ÚFAL Monday seminar. Mar 28, 2017. [PDF slides / Presentation video]
  • Sequence-to-sequence Natural Language Generation. HWU Interaction Lab meeting. Nov 16, 2016. [PDF slides]
  • Sequence-to-sequence Natural Language Generation. Diligent project meeting. Nov 10, 2016. [PDF slides]
  • Natural Language Generation (Mostly) for Spoken Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 11, 2016. [PDF slides]
  • Natural Language Generation for Spoken Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 14, 2015. [PDF slides]
  • A Two-stage Syntax-based Natural Language Generator. ÚFAL Monday seminar. Mar 9, 2015. [PDF slides / Presentation video]
  • Tecto to AMR and Translation (with Tim O'Gorman and others). JHU/CLSP Fred Jelinek Memorial PIRE Workshop, Aug 1, 2014. [PDF slides / Video]
  • Ein Vergleich der deutschen und tschechischen Valenzwörterbücher durch Korpusanalyse und Befragung unter Linguisten. The 4th PRAGESTT Students' German Philology Conference. Mar 21, 2014. [PDF slides / PDF handout (in German)]
  • Natural Language Generation (Not Only) in Dialogue Systems. Lecture in Filip Jurčíček's Statistical Dialogue Systems Course. May 22, 2013. [PDF slides]
  • Learning Morphology from the Corpus. ÚFAL Monday seminar. Nov 11, 2013. [PDF slides / Presentation video]