Josef Jon

Office: 425
Email: jon@ufal.mff.cuni.cz
Address: Malostranské náměstí 25, 118 00 Praha 1, Czech Republic

Main Research Interests

Machine translation

Curriculum Vitae

Education

  • 2013–2017 Bachelor, Faculty of Information Technology, Brno University of Technology, Computer Science.
  • 2014 Bachelor (Erasmus), Escuela Técnica Superior de Ingeniería Informática, Universidad de Sevilla, Computer Science.
  • 2017–2019 Master’s, Faculty of Information Technology, Brno University of Technology, Bioinformatics.
    • Master’s thesis: Exploring Contextual Information in Neural Machine Translation.
  • 2022–present PhD, Faculty of Mathematics and Physics, Charles University, Natural Language Processing.

Experience

  • 2016–present NLP developer, Lingea, Brno.
    • Development and deployment of machine translation systems and related tools.
    • Integration of terminology databases and dictionaries into neural machine translation.
    • Document-level NMT.
    • Domain adaptation and self-adapting NMT.
    • Efficient and multilingual NMT models suitable for deployment.
    • Low-resource NMT (mainly focused on Slavic languages, capitalizing on language similarity).
    • Integration of NMT into Lingea’s CAT system.
    • Management of EU projects.
  • 2020–present Machine translation researcher, Charles University, Prague.
    • Bergamot project – efficient and private client-side NMT for browsers.

Selected Bibliography

  1. Josef Jon, Ondřej Bojar (2023): Character-level NMT and language similarity. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 360-371, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1
  2. Josef Jon, Ondřej Bojar (2023): Breeding Machine Translations: Evolutionary approach to survive and thrive in the world of automated evaluation. In: Proceedings of 61st Annual Meeting of the Association for Computational Linguistics (Volume 1: Long Papers), pp. 2191-2212, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-959429-72-2
  3. Josef Jon, Martin Popel, Ondřej Bojar (2023): CUNI at WMT23 General Translation Task: MT and a Genetic Algorithm. In: Proceedings of the Eighth Conference on Machine Translation, pp. 119-127, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 979-8-89176-041-7
  4. Josef Jon, Dušan Variš, Michal Novák, João Paulo Aires, Ondřej Bojar (2023): Negative Lexical Constraints in Neural Machine Translation. In: Proceedings of Machine Translation Summit XIX vol. 1: Research Track, pp. 372-384, Asia-Pacific Association for Machine Translation (AAMT), Kyoto, Japan, ISBN 978-4-9913461-0-1
  5. Josef Jon, Martin Popel, Ondřej Bojar (2022): CUNI-Bergamot Submission at WMT22 General Task. In: Proceedings of the Seventh Conference on Machine Translation, pp. 280-289, Association for Computational Linguistics, Stroudsburg, PA, USA
  6. Josef Jon, João Paulo de Souza Aires, Dušan Variš, Ondřej Bojar (2021): End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages. In: Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, pp. 4019-4033, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-52-7
  7. Josef Jon, Michal Novák, João Paulo de Souza Aires, Dušan Variš, Ondřej Bojar (2021): CUNI systems for WMT21: Multilingual Low-Resource Translation for Indo-European Languages Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 354-361, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7
  8. Josef Jon, Michal Novák, João Paulo de Souza Aires, Dušan Variš, Ondřej Bojar (2021): CUNI systems for WMT21: Terminology translation Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 828-834, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7

Supervisor: Ondřej Bojar