The data of the Prague Czech-English Dependency Treebank 2.0 Coref can be found in the data directory and follow the structure of the original PCEDT 2.0 release: sections 00-24, each containing one gzipped Treex file (*.treex.gz) per document.
The data are stored in the Treex format, which is an application of the Prague Markup Language (PML; Pajas and Štěpánek, 2008), an XML-based format designed for linguistic treebank annotations. For the sake of completeness, the PML schemata describing the structure of the Treex files are included in the resources directory.
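Because the Treex files are gzipped XML, they can also be inspected without TrEd. The following is a minimal sketch, not part of the release, that uses only the Python standard library to list the *.treex.gz files under a directory and count the XML elements in each one. The function names iter_treex_files and count_elements are hypothetical, and the path "data" is an assumption about where the script is run from. The sketch does not interpret the PML schema; it only reads the raw XML.

```python
import gzip
import xml.etree.ElementTree as ET
from pathlib import Path


def iter_treex_files(data_dir):
    """Yield the paths of all *.treex.gz files below data_dir, sorted."""
    # Hypothetical helper: searches recursively, so it does not depend
    # on how the section directories are named.
    yield from sorted(Path(data_dir).rglob("*.treex.gz"))


def count_elements(treex_path):
    """Decompress one Treex file and count XML elements by local tag name."""
    counts = {}
    with gzip.open(treex_path, "rb") as handle:
        # iterparse streams the document instead of loading it at once.
        for _event, elem in ET.iterparse(handle, events=("end",)):
            # Strip the "{namespace}" prefix that ElementTree adds to tags.
            local_name = elem.tag.rsplit("}", 1)[-1]
            counts[local_name] = counts.get(local_name, 0) + 1
            elem.clear()  # release content that has already been counted
    return counts


if __name__ == "__main__":
    # Assumption: the script is run from the directory that contains "data".
    for path in iter_treex_files("data"):
        print(path, sum(count_elements(path).values()))
```

Run from the root of the release, this prints one line per document with the total number of XML elements it contains, which serves as a quick check that every file decompresses and parses.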
The tree editor TrEd (Pajas and Štěpánek, 2008) can be used to open and browse the data. The editor can be downloaded for various platforms from its home page. Please follow the installation instructions given on that page for your operating system.
After the installation, an extension needs to be installed:
TrEd is then able to open the data of PCEDT 2.0 Coref, displaying the analytical and tectogrammatical trees of one English sentence and its Czech translation (4 trees) at once.
In case of trouble with installing TrEd or browsing the data, please contact the authors at tred at ufal.mff.cuni.cz.