For a longer list of our text and speech translation projects and datasets, scroll down.

Machine Translation Research Group

  • ELITR - Simultaneous speech translation

  • elitr.eu: The project “ELITR, European Live Translator”, supported by the EU under the Horizon 2020 programme and coordinated by Ondřej Bojar, created and provides a system for automatic transcription and simultaneous translation of live events, e.g. conference presentations. The system is ready to complement, and sometimes even substitute for, simultaneous interpreting. ELITR also explored automatic minuting, i.e. speech summarization.
  • CHARLES TRANSLATOR (2019-)

  • Neural machine translation between Ukrainian and Czech. Project page: https://ufal.mff.cuni.cz/ufal-ukraine. Freely available at Charles Translator for Ukraine and as an app for Android.
  • CUBBITT (2020-)

  • Our neural machine translation system matched humans in translation quality as early as 2020, two years before ChatGPT made the Transformer architecture popular. The results of Martin Popel's team were published in Nature Communications. The system can be tested for free at https://lindat.cz/cubbitt.
  • Chimera (2013-2015)

  • Chimera, a hybrid machine translation system developed at the Institute of Formal and Applied Linguistics, combines three components: the deep-syntactic transfer-based system TectoMT; very large parallel and monolingual data in a Moses factored setup, which ensures morphological coherence; and Depfix, a rule-based automatic post-editing system that corrects the grammaticality (agreement and valency) of the output as well as some features vital for adequacy, namely lost negation. The system was able to successfully compete with the best machine translation systems (see tech.ihned.cz/hnfuture/...).

 

Other MT Projects

Title | Tags
Abstract Meaning Representation | Annotations, Machine Translation, Semantics
Alex Dialogue Systems Framework | Dialog, Machine Translation, Morphology, Parsers, Speech Recognition, Tools
Batch Translation for IBM | Machine Translation
Bengali Visual Genome | Annotations, Corpora, Data, Machine Translation, Multi-modality, Multilingual
Centrum vizuální historie Malach | Data, Discourse, Machine Translation, Multi-modality, Multilingual, Speech Recognition
Chimera | Machine Translation
CUBBITT translation | Machine Translation
Czech in the Machine Translation Era | Machine Translation
Czechizator | Machine Translation, Morphology
CzEng | Corpora, Data, Machine Translation, Multilingual
CzEngVallex - Czech and English verbal valency | Annotations, Corpora, Data, Lexicons, Machine Translation, Multilingual, Semantics, Taggers
Depfix | Machine Translation, Morphology, Parsers, Taggers, Tools
European Language Grid | Corpora, Data, Lexicons, Machine Translation, Multilingual, Parsers, Tools
Eyetracked Multi-Modal Translation | Data, Machine Translation, Multi-modality, Psycholinguistics
Hausa Visual Genome | Machine Translation, Multi-modality
Hausa Visual Question Answering Dataset | Corpora, Data, Machine Translation, Multi-modality, Multilingual
HindEnCorp | Corpora, Data, Machine Translation, Monolingual, Multilingual
Hindi Visual Genome | Corpora, Data, Machine Translation, Multi-modality, Multilingual
Malayalam Visual Genome | Corpora, Data, Machine Translation, Multi-modality, Multilingual
MTMonkey | Machine Translation, Tools
Neural Monkey | Machine Learning, Machine Translation, Multi-modality, Multilingual, Tools
NLP_HEALTHCARE2020 | Machine Translation, Morphology, Multi-modality, Multilingual, Publications, Semantics, Speech Recognition
Odia Visual Genome | Corpora, Data, Machine Translation, Multi-modality, Multilingual
OdiEnCorp | Corpora, Machine Translation, Monolingual
Prague Czech-English Dependency Treebank 3.0 | Annotations, Coreference, Corpora, Data, Lexicons, Machine Translation, Morphology, Valency
QT21 | Corpora, Data, Lexicons, Linked data, Machine Learning, Machine Translation, Multilingual, Semantics, Tools
Slovakoczech NLP workshop | Annotations, Coreference, Corpora, Data, Dialog, Discourse, Information Retrieval, Information Structure, Lexicons, Linked data, Machine Learning, Machine Translation, Monolingual, Morphology, Multi-modality, Multilingual, Multiword Expressions, Parsers, Publications, Semantics, Speech Recognition, Speech Retrieval, Spellcheckers, Taggers, Tools, Valency
Strojový překlad se sémantickou informací | Annotations, Lexicons, Machine Translation, Semantics, Valency
TectoMT | Machine Translation, Tools
Tweeslate | Machine Translation, Multilingual
ÚFAL for Ukraine | Machine Translation, Multilingual
UFAL Medical Corpus | Corpora, Data, Machine Translation, Multilingual
UFAL Parallel Corpus of North Levantine | Corpora, Data, Machine Translation
WAT2025_English-to-Indic_Multimodal_Translation | Data, Machine Translation, Multi-modality, Multilingual