Institute of Formal and Applied Linguistics

at Faculty of Mathematics and Physics, Charles University, Prague, Czech Republic


Year 2016
Type in proceedings
Status published
Language English
Author(s) Bojar, Ondřej; Graham, Yvette; Kamran, Amir; Stanojević, Miloš
Title Results of the WMT16 Metrics Shared Task
Czech title Výsledky soutěže ve vyhodnocování strojového překladu
Proceedings 2016: Stroudsburg, PA, USA: WMT 2016 (ACL): Proceedings of the First Conference on Machine Translation (WMT). Volume 2: Shared Task Papers
Pages range 199-231
How published online
URL http://www.statmt.org/wmt16/pdf/W16-2302.pdf
Supported by 2015-2018 H2020-ICT-2014-1-645452 (QT21: Quality Translation 21); 2015-2018 H2020-ICT-2014-1-644402 (Himl (Health in my Language)); 2012-2016 PRVOUK P46 (Informatika)
Czech abstract (translated) This paper summarizes the results of the shared task on evaluating machine translation quality. This year the task is extended with several additions: a larger number of language pairs, data from multiple domains, and three kinds of manual evaluation that the automatic metrics in the task are meant to approximate.
English abstract This paper presents the results of the WMT16 Metrics Shared Task. We asked participants of this task to score the outputs of the MT systems involved in the WMT16 Shared Translation Task. We collected scores of 16 metrics from 9 research groups. In addition, we computed scores of 9 standard metrics (BLEU, SentBLEU, NIST, WER, PER, TER and CDER) as baselines. The collected scores were evaluated in terms of system-level correlation (how well each metric’s scores correlate with the WMT16 official manual ranking of systems) and in terms of segment-level correlation (how often a metric agrees with humans in comparing two translations of a particular sentence). This year there are several additions to the setup: a large number of language pairs (18 in total), datasets from different domains (news, IT and medical), and different kinds of judgments: relative ranking (RR), direct assessment (DA) and HUME manual semantic judgments. Finally, the generation of a large number of hybrid systems was trialed to provide more conclusive system-level metric rankings.
Specialization linguistics ("jazykověda")
Confidentiality default – not confidential
Open access no
DOI http://dx.doi.org/10.18653/v1/w16-2302
Editor(s)* Ondřej Bojar
ISBN* 978-1-945626-10-4
Address* Stroudsburg, PA, USA
Month* August
Venue* Humboldt University
Publisher* Association for Computational Linguistics
Institution* Association for Computational Linguistics
Creator: Common Account
Created: 9/6/16 2:57 PM
Modifier: Almighty Admin
Modified: 2/25/17 10:06 PM
