Jindřich Libovický

office: N233
email: libovicky@ufal.mff.cuni.cz
phone: +420 951 552 954
address: IMPAKT – „N“, V Holešovičkách 747/2, 180 00 Praha 8, Czech Republic

Main Research Interests

multilingual language modeling, machine translation, multilingual tokenization, combining language and vision, cross-lingual fairness

I am a researcher at the institute, where my group works on multilingual language modeling and machine translation, with an emphasis on cross-lingual fairness. Our research covers how language models align across languages, tokenization methods that work well for many languages, and how model performance differs across languages, with the aim of improving fairness.

Projects

Current projects as principal investigator

Language Neutral and Culturally Aware Multilingual Neural Sentence Representations (2023 – 2026)
This project is funded by PRIMUS, a Charles University program that supports young PIs in starting their own groups. It investigates how multilingual neural language models represent and transfer knowledge across languages, with a particular focus on cross-lingual alignment and semantic similarity in encoder models and sentence embeddings.

Better Tokenization for Multilingual Language Models and Machine Translation (2025 – 2027)
This project is funded by the Czech Science Foundation. It aims to develop semantically grounded subword segmentation techniques that create more meaningful and cross-linguistically alignable units, thereby reducing vocabulary size and improving parameter efficiency in massively multilingual language models.

As a team member

Linguistics, Artificial Intelligence, and Language and Speech Technologies: From Research to Applications (2025 – 2028)
This project aims to strengthen collaboration between two academic institutions and three innovative companies in language and speech technologies for AI systems. It will bridge classical linguistics with modern data-driven approaches to enable the widespread deployment of AI applications across all economic and social sectors while respecting legal frameworks and societal priorities. I am a work package leader in this project.

Curriculum Vitae

Experience

  • Research Associate @Charles University (from 2022)
  • Researcher @Ludwig-Maximilians-Universität München (2019 – 2021)
  • Research Assistant @Charles University (2013 – 2019)
  • Software Engineering Intern @Google (2017)
  • Analytic Linguist Intern @Google (2016)
  • Research Development Support @IBM Czech Republic (2012 – 2015)

Education

  • Ph.D. in Computational Linguistics, Charles University, Faculty of Mathematics and Physics (2013 – 2019)
  • Master's degree in Media Studies, Charles University, Faculty of Social Sciences (2014 – 2017)
  • Master's degree in Computational Linguistics, Charles University, Faculty of Mathematics and Physics (2011 – 2013)
  • Bachelor's degree in Media Studies, Charles University, Faculty of Social Sciences (2011 – 2014)
  • Bachelor's degree in Computer Science, Charles University, Faculty of Mathematics and Physics (2007 – 2011)

Teaching

I am happy to supervise NLP-related bachelor's and master's theses. Have a look at some prospective topics.

Selected Bibliography

The full list of publications is on a separate page.

Jindřich Libovický, Helmut Schmid, Alexander Fraser.
Why don't people use character-level machine translation?
In: Findings of the Association for Computational Linguistics: ACL 2022. 2022
Jindřich Libovický, Alexander Fraser.
Neural String Edit Distance.
In: Proceedings of the Sixth Workshop on Structured Prediction for NLP. 2022
Katharina Hämmerl, Jindřich Libovický, Alexander Fraser.
Combining Static and Contextualised Multilingual Embeddings.
In: Findings of the Association for Computational Linguistics: ACL 2022. 2022
Jindřich Libovický, Rudolf Rosa, Alexander Fraser.
On the Language Neutrality of Pre-trained Multilingual Representations.
In: Findings of the Association for Computational Linguistics: EMNLP 2020. 2020
Shruti Palaskar, Jindřich Libovický, Spandana Gella, Florian Metze.
Multimodal Abstractive Summarization for How2 Videos.
In: Proceedings of the 57th Annual Meeting of the Association for Computational Linguistics. 2019
Jindřich Libovický, Jindřich Helcl.
End-to-End Non-Autoregressive Neural Machine Translation with Connectionist Temporal Classification.
In: Proceedings of the 2018 Conference on Empirical Methods in Natural Language Processing (EMNLP). 2018
Jindřich Libovický, Jindřich Helcl, David Mareček.
Input Combination Strategies for Multi-Source Transformer Decoder.
In: Proceedings of the Third Conference on Machine Translation. 2018
Jindřich Libovický, Jindřich Helcl.
Attention Strategies for Multi-Source Sequence-to-Sequence Learning.
In: Proceedings of the 55th Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers). 2017

See the full list on my Google Scholar profile or our institute's database.

Recent Blog Posts

Visit my blog at jlibovicky.github.io.

Students

Currently supervised PhD students

Andrei Manea (since 2023)

Gianluca Vico (since 2024)

Katharina Hämmerl (with Alexander Fraser at TUM, since 2021)