Principal investigator (ÚFAL):
This project investigates the use of machine translation to subtitle speech that is already being interpreted into one or more other languages, targeting languages for which human interpreting is not available. The technology increases the number of languages in which the speech can be followed without employing additional human interpreters. Unlike a human, a machine can process audio signals from multiple speakers in parallel. This can be leveraged to disambiguate the source and thereby improve translation quality and, in theory, even to correct or complete the interpreters' output based on the source speech.
Publications
- Dominik Macháček, Matúš Žilinec, Ondřej Bojar (2021): Lost in Interpreting: Speech Translation from Source or Interpreter? In: Proceedings of INTERSPEECH 2021, pp. 2376-2380, ISCA, Baixas, France (pdf, bibtex)
- Dominik Macháček, Ondřej Bojar (2020): Presenting Simultaneous Translation in Limited Space. In: Proceedings of the 20th Conference Information Technologies - Applications and Theory (ITAT 2020), pp. 32-37, Tomáš Horváth, Košice, Slovakia (pdf, obd, bibtex)
- Dominik Macháček, Jonáš Kratochvíl, Sangeet Sagar, Matúš Žilinec, Ondřej Bojar, Thai-Son Nguyen, Felix Schneider, Philip Williams, Yuekun Yao (2020): ELITR Non-Native Speech Translation at IWSLT 2020. In: Proceedings of the 17th International Conference on Spoken Language Translation, pp. 200-208, Association for Computational Linguistics, Online, ISBN 978-1-952148-07-1 (pdf, local PDF, obd, bibtex)
- Peter Polák, Sangeet Sagar, Dominik Macháček, Ondřej Bojar (2020): CUNI Neural ASR with Phoneme-Level Intermediate Step for Non-Native SLT at IWSLT 2020. In: Proceedings of the 17th International Conference on Spoken Language Translation, pp. 191-199, Association for Computational Linguistics, Online, ISBN 978-1-952148-07-1 (url, local PDF, obd, bibtex)