Live Credible Translation

TL;DR: I aim to make live speech translation more credible by adding a quality estimation module.

I am Dominik Macháček. Live Credible Translation is my individual grant project, and STEL (Speech Translation Error Labelling) is one of its subprojects.

What is "Live" Translation? "Live", or Simultaneous Speech Translation (SST), is a task that is important for enabling direct interaction between people who speak different languages. It combines speech processing, MT, and simultaneous policies to deliver speech-to-text or speech-to-speech translations with a short additional latency, typically 2-4 seconds. The translation must be produced while the source is still being spoken. This brings challenges beyond those of offline speech translation: fast computation, and translating partial, gradually incoming sentences without full future context.
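To make the idea of a simultaneous policy concrete, here is a minimal Python sketch of one well-known example, local agreement: a token is committed only once two consecutive hypotheses for the growing input agree on it. This is an illustration under stated assumptions, not necessarily the policy used in this project; the class name, method names, and word-level tokens are all hypothetical.

```python
from typing import List


def common_prefix(a: List[str], b: List[str]) -> List[str]:
    """Return the longest common prefix of two token lists."""
    prefix = []
    for x, y in zip(a, b):
        if x != y:
            break
        prefix.append(x)
    return prefix


class LocalAgreementPolicy:
    """Hypothetical sketch: commit tokens that two consecutive hypotheses share."""

    def __init__(self) -> None:
        self.committed: List[str] = []  # tokens already shown to the user
        self.previous: List[str] = []   # hypothesis from the previous update

    def update(self, hypothesis: List[str]) -> List[str]:
        """Take the hypothesis for the longer input; return newly committed tokens."""
        agreed = common_prefix(self.previous, hypothesis)
        self.previous = hypothesis
        # Committed output is never retracted, so only emit tokens that extend it.
        extends = agreed[:len(self.committed)] == self.committed
        if extends and len(agreed) > len(self.committed):
            new_tokens = agreed[len(self.committed):]
            self.committed = agreed
            return new_tokens
        return []


policy = LocalAgreementPolicy()
print(policy.update(["I", "am"]))                          # []
print(policy.update(["I", "am", "going"]))                 # ['I', 'am']
print(policy.update(["I", "am", "going", "home"]))         # ['going']
print(policy.update(["I", "am", "going", "to", "school"])) # []
```

The last call shows the cost of missing future context: the two hypotheses disagree after "going", so nothing new is committed until a later update confirms the continuation.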

Why "Credible" Translation? A Quality Estimation (QE) score indicates how likely the translation outputs are to be correct or wrong. As with MT QE, efficient and reliable SST QE could enable new practical applications that have the potential to enhance the credibility of automatic simultaneous translation. It can also be used in practical applications beyond the state of the art: real-time SST post-editing, such as intelligent support for humans or LLMs correcting SST outputs in real time; multi-sourcing, which uses the speech of the original speaker and one or more simultaneous interpreters as multiple sources; and others.
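As a rough illustration of how a QE score could feed such applications, the Python sketch below flags translated segments whose score falls under a threshold, so that a user or a post-editor could be alerted. Everything here is an assumption made for the example: the `Segment` type, the score range of 0.0 (likely wrong) to 1.0 (likely correct), and the threshold of 0.5 are hypothetical and not taken from the project.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class Segment:
    text: str        # translated output segment
    qe_score: float  # assumed range: 0.0 (likely wrong) to 1.0 (likely correct)


def flag_low_confidence(
    segments: List[Segment], threshold: float = 0.5
) -> List[Tuple[str, bool]]:
    """Pair each segment text with True if its QE score is below the threshold."""
    return [(s.text, s.qe_score < threshold) for s in segments]


segments = [
    Segment("Hello everyone.", 0.92),
    Segment("The cat is minister.", 0.31),
]
for text, flagged in flag_low_confidence(segments):
    marker = "[check]" if flagged else "[ok]"
    print(marker, text)
```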

Speech Translation Error Labelling (STEL)

Highlighting error spans in speech translation would be very useful, but no methods or resources exist for it yet. We therefore propose an annotation protocol, collect an authentic end-to-end evaluation dataset, and investigate existing baseline systems.

Automatic Labelling of Speech Translation Errors, a pre-print created with Maike Züfle and Ondrej Klejch.
Data and code: https://github.com/CSTR-Edinburgh/STEL
Data on Huggingface: https://huggingface.co/datasets/maikezu/STEL-0.1
Poster

Simultaneous Speech Translation in 2026

An important step toward more credible live speech translation is to apply the improvement methods to up-to-date simultaneous speech translation systems. The best way to do that is to stay at the frontier that pushes the state of the art, as we did in 2025 with SimulStreaming. This year, we remove the most critical bottlenecks from last year:

Canary-1B-v2 is a newer model that performs better than the previously used Whisper-v3, but it lacks support for simultaneous mode. We add it.
A Pocket Offline Model for Simultaneous Speech Translation as CUNI Submission to IWSLT 2026, IWSLT 2026. Aziz Sharipov Ortega and Dominik Macháček.
Cascades of ASR and offline LLMs deployed in simultaneous mode will very likely stay in production because of their flexibility and quality. However, an implementation of the top-performing simultaneous method has been missing. We add it.
AlignAtt4LLM: Fast AlignAtt for Decoder-Only LLMs at IWSLT 2026 Simultaneous Speech Translation Task. IWSLT 2026. Quentin Fuxa and Dominik Macháček.

Other and future work

My other projects in progress:

Simultaneous speech translation for Scottish Gaelic.
IWSLT 2026 Speech Translation Metrics Shared Task
I'm involved in SMURF4EU and in the JSALT2026 project on full duplex conversational systems.

Next LCT objectives:

Efficient, explainable, and real-time SST QE.
End-to-end applications, such as ELITR AI Interpreting enhanced for more credibility, real-time post-editing, question answering, etc. Collaboration on a user study is welcome!

Collaboration welcome!

I'm open to student projects, especially at the University of Edinburgh. The topics may include: speech translation post-editing, multimodal LLMs, multilinguality, deploying and piloting end-to-end applications, studying user feedback, etc.
I'm open to bringing simultaneous speech translation to new environments where it can help.
I'm open to serving events with an AI interpreting service.
I'm open to tech transfer. Do you have a startup project but need a consultant with my expertise? Contact me.
I'm open to interdisciplinarity, especially with linguistics, translation studies, and simultaneous interpretation.

Selected achievements:

Project presented: UNCE meeting (slides), CSTR+StatMT (slides), UKIS (abstract, poster).
1/2026-12/2027: individual post-doc fellowship at the University of Edinburgh, as a visiting researcher at the CSTR and StatMT groups, supervised by Lexi Birch.
I presented a baseline SST system at IWSLT 2025, including an interactive demo.

Acknowledgements:
This work has been supported by Czech Operational Program OP JAK, the MSCA CZ project MSCA Fellowships -- UK 4, CZ.02.01.01/00/22_010/0013392, “LCT.”

Institute of Formal and Applied Linguistics

Charles University, Czech Republic
Faculty of Mathematics and Physics
