Chimera

We have achieved our chimera: beating Google Translate in English-to-Czech machine translation. We combined everything we have: the deep-syntactic transfer-based system TectoMT; a Moses factored setup trained on very large parallel and monolingual data to ensure morphological coherence; and Depfix, a rule-based automatic post-editing system that corrects the grammaticality of the output (agreement and valency) as well as some errors vital for adequacy, namely lost negation.
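The way the three components fit together can be pictured roughly as below. This is only a minimal sketch, not the actual Chimera code: the function name chimera_translate, the three callable types, and the toy stand-ins are all hypothetical. It also assumes a particular wiring (the transfer-based output is offered to the statistical decoder as a hint, and the post-editor sees both the source and the draft), which the description above does not spell out; see the paper listed below for the real setup.

```python
from typing import Callable

# Hypothetical interfaces for the three "heads"; none of these are real APIs.
Transfer = Callable[[str], str]          # source -> transfer-based hypothesis
Decoder = Callable[[str, str], str]      # (source, hint) -> statistical draft
PostEditor = Callable[[str, str], str]   # (source, draft) -> corrected output


def chimera_translate(
    source: str,
    transfer: Transfer,
    decoder: Decoder,
    post_editor: PostEditor,
) -> str:
    """Run the three components in sequence (assumed wiring, see lead-in)."""
    hint = transfer(source)            # stands in for TectoMT
    draft = decoder(source, hint)      # stands in for the Moses factored setup
    return post_editor(source, draft)  # stands in for Depfix


# Toy stand-ins that only record the order of calls; they translate nothing.
def toy_transfer(source: str) -> str:
    return f"transfer({source})"


def toy_decoder(source: str, hint: str) -> str:
    return f"decode({source}|{hint})"


def toy_post_editor(source: str, draft: str) -> str:
    return f"postedit({draft})"


result = chimera_translate(
    "I do not know", toy_transfer, toy_decoder, toy_post_editor
)
print(result)
```

Passing the components in as plain callables keeps the sketch independent of any real toolkit, so each stand-in can be replaced by a wrapper around an actual system.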

The full technical description:

  • Bojar Ondřej, Rosa Rudolf, Tamchyna Aleš: Chimera – Three Heads for English-to-Czech Translation. In: Proceedings of the Eighth Workshop on Statistical Machine Translation, Copyright © Association for Computational Linguistics, Sofia, Bulgaria, ISBN 978-1-937284-57-2, pp. 92-98, 2013

Other papers:

 

The official WMT results:

Demo

A demo of the statistical component of Chimera (i.e., only the Moses system) is available online. The models were pruned to allow for real-time translation.

Acknowledgement

The development of Chimera for WMT13 was supported by the following grants:

  • GAP406/11/1499 (Čeština ve věku strojového překladu; in English: Czech in the Age of Machine Translation)
  • FP7-ICT-2011-7-288487 (MosesCore)
  • FP7-ICT-2010-6-257528 (Khresmoi)
  • SVV 267 314 (Teoretické základy informatiky a výpočetní lingvistiky; in English: Theoretical Foundations of Computer Science and Computational Linguistics)