Introduction

What is Treex?

Treex (formerly TectoMT) is a highly modular NLP software system implemented in the Perl programming language under Linux.

It is primarily aimed at Machine Translation, making use of the ideas and technology created during the Prague Dependency Treebank project. At the same time, it is hoped that Treex will significantly facilitate and accelerate the development of software solutions for many other NLP tasks, especially thanks to the re-usability of its numerous integrated processing modules (called blocks), which are equipped with uniform object-oriented interfaces.

Online web interface

If you want to try Treex, we recommend starting with Treex::Web.

CPAN releases

  • Treex-Core
  • Treex-Unilang
  • Treex-EN
  • Treex-Doc

Applications

NLP Applications developed within the Treex NLP framework:

  • TectoMT – English-to-Czech machine translation based on tectogrammatics (with an on-line demo)
  • Depfix – a system for rule-based correction of English-Czech statistical MT outputs
  • HamleDT – harmonized dependency treebanks of 28 languages

Tutorial

  1. Installation Guide
  2. First Steps

There is a special version of the tutorial for computer labs at Malá Strana and MT-Marathon 2013.  

Acknowledgement

Work on this framework was supported by the grants FP7-ICT-2007-3-231720 (EuroMatrix Plus), MSM 0021620838 (Moderní metody, struktury a systémy informatiky), LC536 (Centrum komputační lingvistiky), and GAUK 116310.