Josef Jon
Main Research Interests
Machine translation
Curriculum Vitae
Education
- 2013–2017 Bachelor, Faculty of Information Technology, Brno University of Technology, Computer Science.
- 2014 Bachelor (Erasmus), Escuela Técnica Superior de Ingeniería Informática, Universidad de Sevilla, Computer Science.
- 2017–2019 Master's, Faculty of Information Technology, Brno University of Technology, Bioinformatics.
- Master’s thesis: Exploring Contextual Information in Neural Machine Translation.
- 2022–present PhD, Faculty of Mathematics and Physics, Charles University, Natural Language Processing.
Experience
- 2016–present NLP developer, Lingea, Brno.
- Development and deployment of machine translation systems and related tools.
- Integration of terminology databases and dictionaries into neural machine translation.
- Document-level NMT.
- Domain adaptation and self-adapting NMT.
- Efficient and multilingual NMT models suitable for deployment.
- Low-resource NMT (mainly focused on Slavic languages, capitalizing on language similarity).
- Integration of NMT into our CAT system.
- Management of EU projects.
- 2020–present Machine translation researcher, Charles University, Prague.
- Bergamot project – efficient and private client-side NMT for browsers.
Selected Bibliography
- Google Scholar
- ORCID: 0000-0002-6163-4889
- Scopus ID: 57219789644
- Researcher ID: ADU-3839-2022
- CUNI-Bergamot Submission at WMT22 General Task. In: Proceedings of the Seventh Conference on Machine Translation, pp. 280-289, Association for Computational Linguistics, Stroudsburg, PA, USA.
- End-to-End Lexically Constrained Machine Translation for Morphologically Rich Languages. In: Proceedings of the Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing, pp. 4019-4033, Association for Computational Linguistics, Stroudsburg, PA, USA, ISBN 978-1-954085-52-7.
- CUNI systems for WMT21: Terminology translation Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 828-834, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7.
- CUNI systems for WMT21: Multilingual Low-Resource Translation for Indo-European Languages Shared Task. In: Proceedings of the Sixth Conference on Machine Translation, pp. 354-361, Association for Computational Linguistics, Online, ISBN 978-1-954085-94-7.
Supervisor: Ondřej Bojar