
ANALYTICAL LAYER




Characterization of Analytic Tree Structure (ATS)
The layer of analytic syntax does not correspond directly to any level substantiated by linguistic theory, although it comes close to the level of 'surface syntax' used in the earlier stages of Functional Generative Description (FGD); see Sgall (1992) for the reasons for abandoning this level and, with it, a multistratal approach. The main difference between the 'surface syntax' level and the analytic layer is that, at the analytic layer, every function word and punctuation mark gets a node of its own in the syntactic network.
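
To make the "one node per token" property concrete, the following is a minimal sketch in Python of how an analytic tree might be held in memory. It is an illustration only, not the PDT data format: the class name `AtsNode`, its fields, and the helper `count_nodes` are hypothetical, the example sentence ("Petr spal v zahrade .", roughly "Petr slept in the garden.") is made up, and the analytical function labels on it are assigned by hand for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtsNode:
    """Hypothetical in-memory node of an analytic tree (not the PDT file format)."""
    form: str   # word form or punctuation mark
    afun: str   # analytical function label
    children: List["AtsNode"] = field(default_factory=list)


def count_nodes(node: AtsNode) -> int:
    """Count the node itself plus all of its descendants."""
    return 1 + sum(count_nodes(child) for child in node.children)


# Illustrative sentence; labels are assigned by hand for this sketch.
tokens = ["Petr", "spal", "v", "zahrade", "."]

tree = AtsNode("#", "AuxS", [                      # technical root of the sentence
    AtsNode("spal", "Pred", [
        AtsNode("Petr", "Sb"),
        AtsNode("v", "AuxP", [                     # the preposition has its own node
            AtsNode("zahrade", "Adv"),
        ]),
    ]),
    AtsNode(".", "AuxK"),                          # so does the final punctuation mark
])

# Every token, including the function word and the punctuation mark,
# is a node of its own; the only extra node is the technical root.
assert count_nodes(tree) == len(tokens) + 1
```

The point of the sketch is only that the preposition and the full stop are ordinary nodes with their own labels, rather than features attached to a content word.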

 Values of the analytical function attribute (available in: pdffile, psfile)


Transducing the ATSs to TGTSs
An automatic procedure has been formulated and implemented that does three things: it prunes the tree (i.e. it transforms most of the nodes that represent function words and punctuation marks into indices added to the labels of the nodes for autosemantic words), it changes some of the morphological symbols (for number, tense, modality, etc.) into grammateme values, and it establishes new nodes in some prototypical cases of surface deletions; see PDT References for more details.
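
The tree-pruning step can be sketched as follows. This is a toy illustration under stated assumptions, not the procedure implemented for the PDT: the `Node` class and the functions `is_function_node` and `prune` are hypothetical, a node is treated as a function word or punctuation mark simply when its label starts with "Aux" (the technical root excepted), and a function node without dependents is dropped rather than turned into an index. The conversion of morphological symbols into grammatemes and the restoration of deleted nodes are not covered.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical tree node; `indices` holds forms of pruned function words."""
    form: str
    afun: str
    children: List["Node"] = field(default_factory=list)
    indices: List[str] = field(default_factory=list)


def is_function_node(node: Node) -> bool:
    # Assumption of this sketch: "Aux*" labels mark function words and
    # punctuation; the technical root ("AuxS") is always kept.
    return node.afun.startswith("Aux") and node.afun != "AuxS"


def prune(node: Node) -> List[Node]:
    """Return the pruned node(s) that replace `node` in the output tree."""
    kept: List[Node] = []
    for child in node.children:
        kept.extend(prune(child))
    if not is_function_node(node):
        # Autosemantic node (or root): keep it, with its pruned dependents.
        return [Node(node.form, node.afun, kept, list(node.indices))]
    # Function node: remove it and record its form as an index on each
    # dependent that moves up into its place. A function node with no
    # dependents (e.g. final punctuation) is simply dropped in this sketch.
    for promoted in kept:
        promoted.indices.insert(0, node.form)
    return kept


# Illustrative input: "Petr spal v zahrade ." with hand-assigned labels.
ats = Node("#", "AuxS", [
    Node("spal", "Pred", [
        Node("Petr", "Sb"),
        Node("v", "AuxP", [Node("zahrade", "Adv")]),
    ]),
    Node(".", "AuxK"),
])

tgts = prune(ats)[0]
verb = tgts.children[0]
print([(n.form, n.indices) for n in verb.children])
```

In the output tree the preposition no longer has a node; its form survives only as an index on the noun it governed, which is the kind of transformation the paragraph above describes.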



Documentation of the annotation style on the analytical layer is available in PDT References. A description of the post-annotation checking steps carried out at the syntactic-analytic layer is also available (pdffile, psfile). All papers related to the theoretical background and to various tasks of automatic processing of the PDT analytical layer are listed below (alphabetically by the last name of the first author).

References
The file pdt.bib contains a collection of books, papers, and technical reports related to the PDT.
  1. Michael Collins, Jan Hajic, Eric Brill, Lance Ramshaw, Christoph Tillmann. (1999). A Statistical Parser of Czech. In Proceedings of 37th ACL'99, pp. 505--512, University of Maryland, College Park, June 22-25.
    Available in: BibTex item
  2. Jan Hajic, Eric Brill, Michael Collins, Barbora Hladka, Douglas Jones, Cynthia Kuo, Lance Ramshaw, Oren Schwartz, Christopher Tillmann, Daniel Zeman. (1998). Core Natural Language Processing Technology Applicable to Multiple Languages: Workshop98 Final Report for the 1998 Language Engineering Workshop for Students and Professionals: Integrating Research and Education. Center for Language and Speech Processing, Johns Hopkins University, Baltimore, MD, Research Note 37.
    Available in: BibTex item
  3. Jan Hajic, Kiril Ribarov. (1997). Rule-Based Dependencies. In Proceedings of the Workshop on the Empirical Learning of Natural Language Processing Tasks, pp. 125-136, Prague, Czech Republic.
    Available in: BibTex item
  4. Jarmila Panevova. (1980). Formy a funkce ve stavbe ceske vety [Forms and functions in the structure of the Czech sentence], Prague, Academia.
    Available in: BibTex item
  5. Kiril Ribarov. (1996). Automatic Natural Language Grammar. MSc. Thesis (in Czech), Institute of Formal and Applied Linguistics, Charles University, Prague, Czech Republic.
    Available in: BibTex item
  6. Kiril Ribarov. (2000). Rule-Based Tagging: Morphological Tagset versus Tagset of Analytical Functions. In Proceedings of LREC'2000, pp. 1123-1125, Athens, Greece.
    Available in: psfile, BibTex item
  7. Anoop Sarkar, Daniel Zeman. (2000). Automatic Extraction of Subcategorization Frames for Czech. In Proceedings of the 18th International Conference on Computational Linguistics, Coling 2000, Universität des Saarlandes, Saarbrücken, Germany.
    Available in: pdffile, psfile, BibTex item
  8. Petr Sgall. (1967). Generativni popis jazyka a ceska deklinace [Generative description of language and Czech declension]. Academia, Prague, Czech Republic.
    Available in: BibTex item
  9. Petr Sgall. (1992). Underlying Structure of Sentence and Its Relation to Semantics. In Wiener Slawistischer Almanach, Sonderband 33, ed. by T. Reuther, pp. 273-282, Wien: Gesellschaft zur Forderung slawistischer Studien.
    Available in: BibTex item
  10. Petr Sgall, Eva Hajicova, Jarmila Panevova. (1986). The Meaning of the Sentence and Its Semantic and Pragmatic Aspects. Reidel Publishing Company, Dordrecht, Netherlands, Academia, Prague, Czech Republic.
    Available in: BibTex item
  11. Vladimir Smilauer. (1969). Novoceska skladba [Syntax of Contemporary Czech], 3rd ed., SPN, Prague, Czech Republic.
    Available in: BibTex item
  12. Daniel Zeman, Anoop Sarkar. (2000). Learning Verb Subcategorization from Corpora: Counting Frame Subsets. In Proceedings of the Second International Conference on Language Resources and Evaluation, LREC 2000, ELRA, Athens, Greece.
    Available in: pdffile, psfile, rtffile, BibTex item
  13. Daniel Zeman. (1998). A Statistical Approach to Parsing of Czech. In Prague Bulletin of Mathematical Linguistics, volume 69, pp. 29-37, Charles University, Prague.
    Available in: html, pdffile, psfile, rtffile, BibTex item
  14. Daniel Zeman. (1997). Pravděpodobnostní model významových zápisů vět [A probabilistic model of semantic representations of sentences]. MSc. Thesis, Faculty of Mathematics and Physics, Charles University, Prague.
    Available in: html, pdffile, psfile, rtffile, BibTex item