Reads a file in CSTS format (the default format of PDT) and writes a file in FS format (the secondary file format, readable by the tree viewers). The output file contains phrase trees instead of dependency trees, so it includes nonterminal nodes that do not correspond to any single word in the sentence. The program operates with lemmas, not word forms.
Usage: the input is read from stdin or from a file whose name is supplied as a command-line argument. The output is written to stdout.
Platform: a Perl script.
Reads a file in CSTS format (the default format of PDT) and writes a file with phrase structures in a self-explanatory bracketed format (example: (TOP (VP Přišel/>Vp (NP Pavel/>N1 ) ) ./Z )). The word following a left bracket is a nonterminal (phrase name). Words are presented as form/tag pairs, with tags shortened to two characters. Phrase heads are marked by a > character preceding their tag. Lemmas are lost during the conversion.
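The bracketed format described above can be read back into a tree with a few lines of code. The following Python sketch is not part of the distribution; the names `Word`, `Phrase`, `parse_word`, `parse_bracketed` and `leaves` are hypothetical. It assumes that tokens are separated by whitespace, that a phrase label is attached directly to its left bracket, that a closing bracket stands alone, and that any token containing a slash is a word rather than a phrase label.

```python
from dataclasses import dataclass, field
from typing import List, Union


@dataclass
class Word:
    form: str
    tag: str        # two-character tag, without the ">" head marker
    is_head: bool   # True if the tag was preceded by ">"


@dataclass
class Phrase:
    label: str
    children: List[Union["Phrase", Word]] = field(default_factory=list)


def parse_word(token: str) -> Word:
    # Split on the last slash so that a form containing "/" stays intact.
    form, sep, tag = token.rpartition("/")
    if not sep:  # assumption: a token without a slash has no tag
        form, tag = token, ""
    is_head = tag.startswith(">")
    return Word(form, tag[1:] if is_head else tag, is_head)


def parse_bracketed(text: str) -> Phrase:
    stack: List[Phrase] = []
    root = None
    for token in text.split():
        if token.startswith("(") and "/" not in token:
            # "(" immediately followed by the nonterminal name opens a phrase.
            node = Phrase(token[1:])
            if stack:
                stack[-1].children.append(node)
            stack.append(node)
        elif token == ")":
            # A lone ")" closes the most recently opened phrase.
            root = stack.pop()
        else:
            stack[-1].children.append(parse_word(token))
    if stack or root is None:
        raise ValueError("unbalanced brackets")
    return root


def leaves(node: Union[Phrase, Word]) -> List[Word]:
    # Collect the words of a tree in left-to-right order.
    if isinstance(node, Word):
        return [node]
    return [w for child in node.children for w in leaves(child)]


tree = parse_bracketed("(TOP (VP Přišel/>Vp (NP Pavel/>N1 ) ) ./Z )")
print(tree.label, [w.form for w in leaves(tree)])
```

A stack is used instead of recursion so that deeply nested sentences do not hit Python's recursion limit while parsing.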
Please be aware that phrase structure cannot capture the nonprojective constructions that occur in Czech. For such sentences, the resulting structures may violate the original word order.
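To see in advance which sentences are affected, one can test a dependency tree for projectivity before converting it. The sketch below is only an illustration and is not taken from the scripts themselves. It assumes a hypothetical representation in which `heads[i]` holds the 1-based position of the governor of word `i + 1`, with 0 standing for the root, and it assumes the input is a well-formed tree without cycles.

```python
from typing import List


def is_projective(heads: List[int]) -> bool:
    """Return True if every word lying between a dependent and its
    governor is itself dominated by that governor."""

    def dominated_by(node: int, ancestor: int) -> bool:
        # Climb from node towards the root, looking for ancestor.
        while node != 0:
            if node == ancestor:
                return True
            node = heads[node - 1]
        return ancestor == 0  # the artificial root dominates everything

    for dep, head in enumerate(heads, start=1):
        lo, hi = sorted((dep, head))
        for between in range(lo + 1, hi):
            if not dominated_by(between, head):
                return False
    return True


# Word 2 depends on word 4, but word 3 (the root) lies between them
# and is not dominated by word 4, so the tree is nonprojective.
print(is_projective([0, 1, 1]))
print(is_projective([3, 4, 0, 3]))
```

Sentences for which this check fails are the ones whose phrase trees cannot keep the original word order.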
Usage: the input is read from stdin. The output is written to stdout.
Platform: dep2tree is a Unix shell script. It is only a front end that calls several Perl scripts (check the Perl path on the first line of each of them!) and a binary file. The front end assumes it is running under Linux and calls the appropriate binary; binaries for Suns are also available, and the appended source code can be recompiled for other platforms.
Acknowledgement: this code was written by Michael Collins for the JHU Workshop '98 project.
Reads a file generated by dep2tree and converts it back to CSTS. Because dep2tree loses information, the resulting file will by no means be identical to the original!
Usage: the input is read from stdin or from a file whose name is supplied as a command-line argument. The output is written to stdout.
Platform: a Perl script.