The Chimera-TectoMT architecture of Machine Translation with Deep Linguistic Analysis
Chimera-TectoMT architektura strojového překladu s hlubokou jazykovou analýzou
Publisher's city and country
Stroudsburg, PA, USA
San Diego Sheraton hotel
Presentation of the winning system from the WMT 2013-2015 Shared Task competition. The talk covered the architecture, the key modules and principles, and the statistical components.
The TectoMT system is the result of long-term development that began in the pre-statistical era at Charles University in Prague and has since grown to include state-of-the-art tools for POS tagging, morphological feature disambiguation, lemmatization, parsing, and some aspects of semantic analysis. It follows the usual Analysis – Transfer – Generation workflow, with transfer trained on a large parallel corpus using a Hidden Markov Tree Model. Generation is partly rule-based (at the syntax level) and partly statistical (at the inflection/morphology level). Chimera is a hybrid system that combines TectoMT with a standard phrase-based SMT system (Moses) and complements them with “Depfix”, an automatic post-editing system. The combination as a whole outperforms the individual systems, as documented in the results of recent WMT shared tasks. The system was originally developed for English-Czech and was recently ported to several other languages within the EU QTLeap project (qtleap.eu), where it has been used successfully in the IT domain for both question and answer translation in a Q&A context. Both the TectoMT and Chimera systems will be presented, together with a discussion of the language (in)dependence of such a hybrid solution.
NAACL SedMT 2016 Workshop
invited talk at conference/workshop