PML-TQ is a powerful open-source search tool for all kinds of linguistically annotated treebanks, with several client interfaces and two search backends (one based on an SQL database and one based on Perl and the TrEd toolkit). The tool works natively with treebanks encoded in the PML data format; conversion scripts are available for many established treebank formats.
- Search your local files:
- Register to search various treebanks using our server:
We host a PML-TQ search service for PDT 2.0 and various other treebanks, including Penn Treebank 3, Penn Chinese Treebank, Penn Arabic Treebank, and Tiger Corpus 1.0. To register, send an email to Jan Štěpánek; for some treebanks, you will need to obtain a license from the treebank distributor. The server is accessible from several clients, including modern web browsers and TrEd (see clients).
- Search any treebank on your own PML-TQ server:
Download and install the PML-TQ server (Linux, UNIX, Mac OS X) on your computer/server.
- PML-TQ Manual and Query Language Reference (incomplete)
- Web Browser
A fully graphical client for PML-TQ with client-side searching capability is part of the tree editor TrEd (GPL-licensed software, available separately) as an extension called pmltq. Several other extensions provide PML schemas and visualization stylesheets for various treebanks.
To install this extension, start TrEd and select 'pmltq' from the list of available extensions. When done, press Shift+F3 to start the search. Then choose whether to search using a PML-TQ server or, with 'Files (local)', to search local files using the client-side search engine built into the client.
A simple text-based client called pmltq is included in the server package.
This distribution contains a fast and efficient implementation of PML-TQ powered by an SQL database with a client-server architecture (HTTP client -> custom HTTP server -> CGI -> SQL database backend).
The server is intended for searching large static data sets (complete treebanks). For individual files or small treebanks, up to say 10K trees (your mileage may vary), the client-side PML-TQ implementation in TrEd is usually sufficient.
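The rule of thumb above can be written down as a small helper. This is only an illustrative Python sketch, not part of PML-TQ: the function name `suggest_backend` and the constant `CLIENT_SIDE_LIMIT` are hypothetical, and the 10,000-tree threshold is the rough figure from the text, not a hard limit.

```python
# Rough threshold taken from the text above ("up to say 10K trees");
# it is an approximation, so treat the result as a hint only.
CLIENT_SIDE_LIMIT = 10_000


def suggest_backend(tree_count, limit=CLIENT_SIDE_LIMIT):
    """Suggest a search backend for a treebank with `tree_count` trees."""
    if tree_count < 0:
        raise ValueError("tree_count must be non-negative")
    if tree_count <= limit:
        # Small data sets: the client-side engine in TrEd is usually enough.
        return "client-side (TrEd)"
    # Large static data sets: the SQL-backed server is the intended choice.
    return "SQL server"
```

For example, `suggest_backend(500)` suggests the client-side engine, while `suggest_backend(50_000)` suggests the SQL server.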
Running a PML-TQ server requires either an Oracle or a PostgreSQL database, Perl >= 5.8.8, and several Perl modules installable from CPAN. The treebank must be encoded in or converted to the PML format.
The current version is 0.7.10 (beta). This release is ready for testing, but some important parts of the documentation are still missing.
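Before installing, it may help to confirm that the local Perl meets the stated minimum. The following is a minimal Python sketch and is not part of the PML-TQ distribution; the helper names `parse_version`, `meets_minimum`, and `installed_perl_version` are hypothetical, and it assumes that `perl` is available on the PATH and that versions are plain dotted numbers.

```python
import subprocess

MIN_PERL = "5.8.8"  # minimum Perl version stated in the requirements above


def parse_version(text):
    """Turn a dotted version string such as '5.8.8' into a tuple of ints."""
    return tuple(int(part) for part in text.strip().lstrip("v").split("."))


def meets_minimum(found, required):
    """Return True if version `found` is at least version `required`."""
    return parse_version(found) >= parse_version(required)


def installed_perl_version():
    """Ask the local perl for its version; return None if perl cannot be run."""
    try:
        result = subprocess.run(
            ["perl", "-e", 'printf "%vd", $^V'],  # prints e.g. 5.30.0
            capture_output=True, text=True, check=True,
        )
    except (OSError, subprocess.CalledProcessError):
        return None
    return result.stdout


if __name__ == "__main__":
    version = installed_perl_version()
    if version is None:
        print("perl could not be run; is it on the PATH?")
    elif meets_minimum(version, MIN_PERL):
        print(f"perl {version} satisfies >= {MIN_PERL}")
    else:
        print(f"perl {version} is older than {MIN_PERL}")
```

The same `meets_minimum` helper can be reused for other dotted version numbers, for example to compare a PostgreSQL version string against a required minimum.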
pmltq.tar.gz - PML-TQ distribution package
Subdirectories:
- config: sample configuration files (must be edited first!)
- contrib: sample conversion scripts (e.g. for PDT 2.0)
- doc: documentation
- libs: Perl modules used by pmltq
- resources: PML schemas used by pmltq
- sql: SQL scripts to initialize the database
- run: unified server startup/shutdown script and configuration
Scripts:
- install_deps.sh: installs the modules required by the search server
- pmltq_http: small HTTP server providing PML-TQ services
- pml2base.pl: PML to SQL database conversion script
- pmltq: command-line client for both the database and the Perl-driven query engines
Installation of PML-TQ Server
To run a PML-TQ server, you will first need to install an SQL database server. Oracle 10g and 11g and PostgreSQL 8.4.1 or later are fully supported.
Then carefully follow the instructions in the README file provided in the distribution and in the configuration scripts you will be asked to edit during the installation process. Because the individual steps of the server installation are still poorly documented, do not hesitate to ask the authors for guidance by e-mail.