
PML Toolkit - pmltk
===================


AUTHORS

  Petr Pajas   (pajas at ufal.mff.cuni.cz)
  Jan Štěpánek (stepanek at ufal.mff.cuni.cz)

ABSTRACT

  PML, which stands for "Prague Markup Language", is an XML-based,
  universally applicable data format based on abstract data types
  intended primarily for interchange of linguistic annotations.

  This package contains the specification and basic toolkit for the PML
  format.

CONTENT

  Documentation:

	VERSION - version of the PML specification and toolkit
	README  - this document
	doc/index.html - brief description of PML

	doc/pml_doc.html - PML format specification (HTML)
	doc/pml_doc.pdf  - PML format specification (PDF)

	examples/ - PML examples used in the specification

  Tools:

	bin/pml_simplify - tool to simplify modular schemata
	bin/pml_validate - PML schema and instance validator
	bin/pml_copy     - copy/move/gzip/gunzip related PML instances
	                   without breaking internal references

  Conversion tools and support for other data formats:

	formats/alpino          - Alpino Treebank
	formats/arabic_treebank - Penn Arabic Treebank
	formats/conll2009       - CoNLL (up to CoNLL 2009)
	formats/hydt            - Hyderabad Treebank
	formats/perseus         - Latin Treebank (Perseus project)
	formats/ptb             - Penn Treebank
	formats/sinica          - Sinica Chinese Treebank
	formats/tiger           - Tiger Treebank XML format

  Schemas and support for PML-based treebanks:

	formats/pdt20       - Prague Dependency Treebank (PDT) 2.0 schemas
	                      and conversion from legacy PDT formats

	formats/pdt_vallex  - convert PDT 2.0 valency lexicon to PML
	formats/cac         - Czech Academic Corpus

  Other PML formats:

	formats/pml_schema_tree - conversion of PML schema to a PML instance
	formats/tree_generic    - basic schema from which specific
	                          treebank schemata can be derived
	formats/parallel        - schema and support for parallel treebanks
	formats/any_xml         - convert arbitrary XML to PML and back

  Perl API for PML:

	libs/pml-base/Treex/PML.pm  - base module which loads other parts of the API
                                      (run perldoc on the file to read the documentation)
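  As a minimal sketch of the perldoc hint above, assuming the command is
  run from the top-level directory of the unpacked package (the variable
  name PML_MODULE is only illustrative):

  ```shell
  # Path of the base module, copied from the listing above.
  PML_MODULE=libs/pml-base/Treex/PML.pm

  if [ -f "$PML_MODULE" ]; then
      # perldoc accepts a file path and renders the POD embedded in it.
      perldoc "$PML_MODULE"
  else
      # Most likely cause: the command was not started from the package root.
      echo "Cannot find $PML_MODULE; run this from the package root." >&2
  fi
  ```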

  RelaxNG grammars for PML:

	rng/pml_schema.rng        - RelaxNG grammar for PML schema (all versions)
	rng/pml_schema_inline.rng - RelaxNG grammar for PML schema embedded in a PML instance

	rng/pml_common.rng        - auxiliary grammar included by
	                            grammars generated by pml2rng.xsl
	rng/pml_internal.rng      - ditto

  Auxiliary tools:

	tools/dtd2pml.pl       - convert any DTD to a PML schema
	tools/pml2rng.xsl      - convert PML schema to a RelaxNG grammar
	tools/pml2pls.btred    - create a binary dump of a PML document
	                         for faster loading by Treex::PML::Document
	tools/pml_simplify.xsl - XSLT 2.0 implementation of pml_simplify

	tools/foreach_match.pl - search PML instances using attribute paths
	tools/pml_rw.pl        - read/write PML instances (perl API testing tool)
	tools/knit.pl          - read a PML instance and save it, retaining all
	                         material embedded via the #KNIT role

	tools/msv              - shell wrapper around MSV (Sun's Multi-Schema Validator)
	tools/relames          - shell wrapper around the Relames RNG validator
	tools/rng_validate     - an RNG validator combining xmllint and jing
