ÚFAL Participation in EuroMatrix

We aim at machine translation based on a formal deep-syntactic representation of sentence structure, the so-called tectogrammatical layer.

Reports

A short description of the linguistic representation is available as:

Marie Mikulová, Allevtina Bémová, Jan Hajič, Eva Hajičová, Jiří Havelka, Veronika Kolářová, Lucie Kučová, Markéta Lopatková, Petr Pajas, Jarmila Panevová, Magda Ševčíková, Petr Sgall, Jan Štěpánek, Zdeňka Urešová, Kateřina Veselá, and Zdeněk Žabokrtský: Annotation on the tectogrammatical level in the Prague Dependency Treebank. EuroMatrix Deliverable 3.1, May, 2007.
PDF

The theoretical background of our model of tree transformations and some preliminary experimental results were published as:

Ondřej Bojar and Martin Čmejrek: Mathematical Model of Tree Transformations. EuroMatrix Deliverable 3.2, December, 2007.
PDF

The user's guide and a brief overview of the software are available in:

Ondřej Bojar, Miroslav Janíček, Miroslav Týnovský: Implementation of Tree Transfer System. EuroMatrix Deliverable 3.3, September, 2008.
PDF

Further improvements, an evaluation of the system, and the application of the tectogrammatical layer to automatic MT quality evaluation are described in:

Ondřej Bojar, Miroslav Týnovský: Evaluation of Tree Transfer System. EuroMatrix Deliverable 3.4, March, 2009.
PDF

Software

Tree Aligner
Tree Aligner is an experimental tool that extracts a treelet-to-treelet translation dictionary from a parallel treebank.
Download Tree Aligner, version 0.8, Perl implementation
Download Tree Aligner, version 0.8, Java implementation
TreeDecode
TreeDecode is a configurable tree-to-tree transfer system.
Download TreeDecode, version 0.8, source code
Download TreeDecode, version 0.8, binary for Linux i686
Download TreeDecode, version 0.8, binary for Linux x64 (Intel 64-bit architecture)
QuickJudge
QuickJudge is a simple Perl script to facilitate manual judgement of string segments, e.g. sentences in MT output.
Download and read the documentation of QuickJudge.

Demo

A Czech<->English machine translation demo based on Moses is available. You may also use a simplified interface to the demo.


Content: Ondřej Bojar.
Authors: Ondřej Bojar.
This project has been supported by the grant FP6-IST-5-034291-STP (EuroMatrix).
2008 © Institute of Formal and Applied Linguistics. All Rights Reserved.