Live Credible Translation

TL;DR: I aim to make live speech translation more credible by adding a quality estimation module.

I am Dominik Macháček. Live Credible Translation is my individual grant project, and STEL (Speech Translation Error Labelling) is one of its subprojects.

What is "Live" Translation? "Live", or Simultaneous Speech Translation (SST), is a task that is important for enabling direct interaction between people who speak different languages. It combines speech processing, machine translation (MT), and simultaneous policies to deliver speech-to-text or speech-to-speech translations with a short additional latency, typically 2-4 seconds. The translation must be produced while the source speech is still being spoken. This adds challenges beyond those of offline speech translation: computation must be fast, and the system must translate partial, gradually incoming sentences without the full future context.

Why "Credible" Translation? A Quality Estimation (QE) score indicates how likely the translation outputs are to be correct. As with MT QE, efficient and reliable SST QE could enable new practical applications that have the potential to enhance the credibility of automatic simultaneous translation. It could also be used in applications beyond the current state of the art, such as real-time SST post-editing (for example, intelligent support for humans or LLMs correcting SST outputs in real time) and multi-sourcing, which uses the speech of the original speaker and one or more simultaneous interpreters as multiple sources.

Speech Translation Error Labelling (STEL)

Highlighting speech translation error spans would be very useful, but no methods or resources exist for it yet. We therefore propose an annotation protocol, collect an authentic end-to-end evaluation dataset, and investigate existing baseline systems.

Simultaneous Speech Translation in 2026

An important step towards more credible live speech translation is to apply the improvement methods to up-to-date simultaneous speech translation systems. The best way to do that is to stay at the frontier that pushes the state of the art, as we did in 2025 with SimulStreaming. This year, we remove the most critical bottlenecks from last year:

Other and future work

My other projects in progress:

  • Simultaneous speech translation for Scottish Gaelic.
  • IWSLT 2026 Speech Translation Metrics Shared Task.
  • I'm involved in SMURF4EU and in the JSALT2026 project on full duplex conversational systems.

Next LCT objectives:

  • Efficient, explainable, and real-time SST QE.
  • End-to-end applications, such as ELITR AI Interpreting enhanced for more credibility, real-time post-editing, question answering, etc. Collaboration on a user study is welcome!

Collaboration welcome!

  • I'm open to student projects, especially at the University of Edinburgh. The topics may include: speech translation post-editing, multimodal LLMs, multilinguality, deploying and piloting end-to-end applications, studying user feedback, etc.
  • I'm open to bringing simultaneous speech translation to new environments where it can help.
  • I'm open to serving events with an AI interpreting service.
  • I'm open to tech transfer. Do you have a startup project but need a consultant with my expertise? Contact me.
  • I'm open to interdisciplinarity, especially with linguistics, translation studies, and simultaneous interpretation.

Selected achievements:

Acknowledgements:
This work has been supported by the Czech Operational Program OP JAK, the MSCA CZ project MSCA Fellowships -- UK 4, CZ.02.01.01/00/22_010/0013392, “LCT.”