ÚFAL Participation in EuroMatrix
We aim at machine translation based on a formal deep-syntactic representation of sentence structure, the so-called tectogrammatical layer.
Reports
A short description of the linguistic representation is available as:
Marie Mikulová, Allevtina Bémová, Jan Hajič, Eva Hajičová, Jiří Havelka, Veronika Kolářová, Lucie Kučová, Markéta Lopatková, Petr Pajas, Jarmila Panevová, Magda Ševčíková, Petr Sgall, Jan Štěpánek, Zdeňka Urešová, Kateřina Veselá, and Zdeněk Žabokrtský: Annotation on the tectogrammatical level in the Prague Dependency Treebank. EuroMatrix Deliverable 3.1, May, 2007.
The theoretical background of our model of tree transformations and some preliminary experimental results were published as:
Ondřej Bojar and Martin Čmejrek: Mathematical Model of Tree Transformations. EuroMatrix Deliverable 3.2, December, 2007.
The user's guide and a brief overview of the software are available in:
Ondřej Bojar, Miroslav Janíček, Miroslav Týnovský: Implementation of Tree Transfer System. EuroMatrix Deliverable 3.3, September, 2008.
Further improvements, an evaluation of the system, and the application of the tectogrammatical layer to automatic MT quality evaluation are described in:
Ondřej Bojar, Miroslav Týnovský: Evaluation of Tree Transfer System. EuroMatrix Deliverable 3.4, March, 2009.
Software
- Tree Aligner
Tree Aligner is an experimental implementation that extracts a treelet-to-treelet translation dictionary from a parallel treebank.
Download Tree Aligner, version 0.8, Perl implementation
Download Tree Aligner, version 0.8, Java implementation
- TreeDecode
TreeDecode is a configurable tree-to-tree transfer system.
Download TreeDecode, version 0.8, source code
Download TreeDecode, version 0.8, binary for Linux i686
Download TreeDecode, version 0.8, binary for Linux x64 (Intel 64-bit architecture)
- QuickJudge
QuickJudge is a simple Perl script to facilitate manual judgement of string segments, e.g. sentences in MT output.
Download and read the documentation of QuickJudge.
Demo
A Czech<->English machine translation demo based on Moses is available. You may also use a simplified interface to the demo.
Related Links
- ÚFAL Website
- EuroMatrix Project Website
- CzEng, Czech-English parallel corpus.
- PCEDT 1.0, Prague Czech-English Dependency Treebank, a parallel treebank version 1.0.