Project Description

The FAUST project will develop machine translation (MT) systems which respond rapidly and intelligently to user feedback. Current web-based MT systems provide high-volume translation without adapting to user feedback in real time. Most systems provide no opportunity for users to offer opinions or corrections for translation results. Other systems ask users for feedback on translation; however, the user does not see any benefit to providing feedback, because the translation does not change in response to it. Our goal is to develop high-volume translation systems capable of adapting to user feedback in real time.

We will build on the current leading commercial statistical MT systems developed by Language Weaver and deployed by Softissimo Inc. on its translation website. Our research will be based on translation in five bidirectional language pairs in these EU official languages:

  • Czech-English,
  • French-English,
  • Romanian-English,
  • Spanish-English,
  • Spanish-Catalan.

We will also study translation from Arabic and Chinese into English. The systems we develop will be used directly by users on the website. We will develop novel feedback collection mechanisms which encourage users to interact with the systems and to provide feedback. In this way we will collect the data, in the form of user judgements of translation quality and feedback for translation correction, which will guide our research.

The objectives of the FAUST Project are to:

  • Enhance the high-volume translation website with an experimental and evaluation infrastructure that will enable the study of instantaneous user feedback in MT.
  • Deploy novel, web-oriented feedback collection mechanisms that reduce noise in feedback provided by users and increase the utility of the web contributions.
  • Automatically acquire novel data collections to study translation as informed by user feedback.
  • Develop mechanisms for instantaneously incorporating user feedback into the machine translation engines that are used in production environments, such as those that power the website.
  • Create novel automatic metrics of translation quality which reflect preferences learned from user feedback.
  • Develop new translation models driven by user feedback data and integrate natural language generation directly into MT to improve translation fluency and reduce negative feedback from users.

Project Partners

  • University of Cambridge, Department of Engineering, UK,
  • University of Cambridge, Computer Laboratory, UK,
  • Universitat Politecnica de Catalunya, Spain,
  • Charles University in Prague, Institute of Formal and Applied Linguistics, Czech Republic,
  • Language Weaver Inc., USA,
  • Language Weaver SRL, Romania,
  • Softissimo Inc., France.

Content: David Marecek.
2010 © Institute of Formal and Applied Linguistics. All Rights Reserved.
