General Characterization of the Tectogrammatical Layer
The tagging scheme for the deep-structure level builds on the dependency-based
theoretical framework of the Functional Generative Description (FGD), specifically
on its level of tectogrammatical representations (for the motivation
and for more details, see e.g. Sgall 1967 and 1992; Sgall et al. 1986;
a formalization can be found in Petkevic 1987 and 1995, see References).
The tectogrammatical layer can be characterized as the level of linguistic
(literal) meaning, i.e. as the structuring of the cognitive content proper
to a particular language. On this level, the irregularities of the outer
shape of sentences are absent (including synonymy and at least the prototypical
cases of ambiguity). It can thus serve as a useful interface between
linguistics in the narrow sense (the theory of language systems) on
one side and, on the other, interdisciplinary domains such as semantic interpretation
(logical analysis of language, reference assignment based on inferencing
from contextual and other knowledge, metaphorical and other figurative
meanings), discourse analysis, or text linguistics.
Characterization of the Tectogrammatical Tree Structure (TGTS)
(for detailed information see E. Hajicova, Dependency-Based Underlying-Structure
Tagging of a Very Large Czech Corpus; see References)
-
A node of the TGTS represents an occurrence of an autosemantic (lexical,
meaningful) word; the correlates of function words (i.e. synsemantic, auxiliary
word forms) are attached as indices to the autosemantic words to which
they belong (i.e. auxiliary verbs and subordinating conjunctions to the
verbs, prepositions to nouns, etc.); coordinating conjunctions remain as
nodes of their own (as in the ATSs).
-
Where autosemantic words are deleted in the surface shape of a sentence,
nodes for them are added to the tree structure.
-
Non-projective structures are not allowed on the tectogrammatical layer
of tagging; the relevant asymmetries are accounted for as differences between
underlying and morphemic word order.
-
Not only the direction of the dependency relation (dependent from the right
- dependent from the left), but also the ordering of the sister nodes is
specified in the TGTSs.
-
Each TGTS has the form of a dependency tree with the verb of the
main clause as its root (to be more precise, the root of the TGTS is a
special node identifying the sentence of which the given structure is the
TGTS, and the node of the main verb is the only node immediately depending
on this identifier). In case of nominal 'sentences' (i.e. of constructions
without a finite verb), three possibilities obtain: (i) the governing
verb is added (in case of surface deletions, which is relatively rare),
or (ii) a symbol for 'empty verb' ('EV') is added as the governor
(e.g. Od našeho washingtonského zpravodaje 'From our correspondent
from Washington', with the node for 'correspondent' depending on 'EV'),
or (iii) the governing nominal node acts as the governor (e.g. with
author names).
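The structural constraints listed above can be made concrete with a small sketch. It is only an illustration, not the PDT data format: the class `Node`, the helpers `is_projective` and `is_well_rooted`, the sentence identifiers, and the numeric positions given to each node are all assumptions introduced here. It assumes every node has a distinct position in the underlying word order, and it checks projectivity by testing that each subtree covers an unbroken span of positions.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One TGTS node; `order` is its position in the underlying word order."""
    label: str
    order: int
    children: List["Node"] = field(default_factory=list)

    def add(self, child: "Node") -> "Node":
        """Attach a dependent and keep sister nodes sorted by position."""
        self.children.append(child)
        self.children.sort(key=lambda n: n.order)
        return child

    def positions(self) -> List[int]:
        """Positions of this node and of everything depending on it."""
        found = [self.order]
        for child in self.children:
            found.extend(child.positions())
        return found


def is_projective(node: Node) -> bool:
    """True if every subtree covers an unbroken span of positions."""
    span = sorted(node.positions())
    unbroken = span[-1] - span[0] + 1 == len(span)
    return unbroken and all(is_projective(c) for c in node.children)


def is_well_rooted(root: Node) -> bool:
    """The sentence identifier must have exactly one immediate dependent."""
    return len(root.children) == 1


# 'From our correspondent from Washington': no finite verb, so 'EV' governs.
# Labels are English glosses and positions are illustrative assumptions.
root = Node("#sentence-1", 0)          # hypothetical sentence identifier
ev = root.add(Node("EV", 1))
correspondent = ev.add(Node("correspondent", 4))
correspondent.add(Node("Washington", 3))
correspondent.add(Node("our", 2))

# A deliberately ill-formed tree: 'c' (position 3) depends on 'b'
# (position 1), so the subtree of 'b' skips over 'a' (position 2).
crossing = Node("#sentence-2", 0)
a = crossing.add(Node("a", 2))
b = a.add(Node("b", 1))
b.add(Node("c", 3))

print(is_well_rooted(root), is_projective(root), is_projective(crossing))
```

Keeping the children sorted on insertion is one simple way to make the ordering of sister nodes explicit; a real annotation tool might store the order differently.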
Each node label consists of the following parts:
-
the lexical value proper of the word (represented in a preliminary way
just with the usual graphemic form of the word, the 'lemma'),
-
the values of the morphological grammatemes (corresponding primarily to
the values of morphological categories such as modality, tense, aspect
with verbs, gender and number with nouns, degree of comparison with adjectives),
-
the values of the attribute 'functor', corresponding to (underlying) syntactic
functions (Actor, Objective, Means, Locative, etc.; functor values are written
in upper case letters); in cases of doubt the annotators may indicate two
different values for a functor, since the criteria can only be formulated
precisely later, on the basis of analyses that take a large tagged corpus
as their starting point,
-
the values of the attribute 'syntactic grammateme', corresponding to secondary
syntactic functions; these values combine with some of the functors to give
a more subtle (semantic) differentiation of the syntactic relations, which
is rendered on the surface primarily by prepositions and noun cases.
This concerns the locational functors LOC, DIR-1, DIR-2
and DIR-3 (corresponding to the questions 'where?', 'from where?', 'through
which place?' and 'where to?', respectively). Thus e.g. LOC (expressed
in Czech by several prepositions which combine either with the locative
(Loc) or with the instrumental (Instr) case of the noun) is subcategorized
into na+Loc ('on': na stole 'on the table'), v+Loc ('in'), u+Loc ('by'),
nad+Instr ('above'), pod+Instr ('under'), za+Instr ('behind'), mezi.1+Instr
('among'), mezi.2+Instr ('between'), etc. Among the functors with a temporal
meaning, a similar subcategorization is established for TWHEN
(with the grammatemes AFT 'after', BEF 'before', and NIL as in 'on Monday',
'next year'). A positive or negative grammateme is attached to ACMP ('with' vs.
'without'), REG ('with regard' vs. 'without regard') and BEN ('for' vs.
'against'),
-
the values of a special grammateme capturing the basic information about
the topic-focus articulation (TFA) of the sentence.
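The parts of a node label listed above can be pictured as a simple record. The following is a minimal sketch under stated assumptions, not the actual PDT attribute set: the class `NodeLabel`, its field names, the `render` output format, the set `SUBCATEGORIZED`, and the sample values `"SG"` and `"F"` are all hypothetical. The set contains only those functors that this text names as taking a syntactic grammateme; the real inventory may differ.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional

# Functors this text names as taking a syntactic grammateme (an assumption
# made for illustration; the complete attribute list may contain others).
SUBCATEGORIZED = {"LOC", "DIR-1", "DIR-2", "DIR-3", "TWHEN", "ACMP", "REG", "BEN"}


@dataclass
class NodeLabel:
    lemma: str                                  # lexical value (graphemic form)
    functor: str                                # e.g. "LOC", "TWHEN"
    alt_functor: Optional[str] = None           # second value in case of doubt
    grammatemes: Dict[str, str] = field(default_factory=dict)
    syntactic_grammateme: Optional[str] = None  # e.g. "na+Loc", "AFT"
    tfa: Optional[str] = None                   # topic-focus value (hypothetical coding)

    def __post_init__(self) -> None:
        # Reject a syntactic grammateme on a functor outside the sketch's set.
        if self.syntactic_grammateme and self.functor not in SUBCATEGORIZED:
            raise ValueError(f"{self.functor} takes no syntactic grammateme here")

    def render(self) -> str:
        """Compact, made-up notation: lemma.FUNCTOR|ALT[grammateme]."""
        functor = self.functor
        if self.alt_functor:
            functor += "|" + self.alt_functor
        if self.syntactic_grammateme:
            functor += "[" + self.syntactic_grammateme + "]"
        return f"{self.lemma}.{functor}"


# 'na stole' ('on the table'): the preposition becomes part of the label.
on_table = NodeLabel("stůl", "LOC", grammatemes={"number": "SG"},
                     syntactic_grammateme="na+Loc", tfa="F")
# An annotator unsure between two functors records both values.
unsure = NodeLabel("ohled", "REG", alt_functor="BEN")

print(on_table.render())
print(unsure.render())
```

Storing the second functor as a separate optional field keeps the primary choice unambiguous while still recording the annotator's doubt.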
See the complete list of TGTS attributes and their values
(available as PDF and PostScript files).
Documentation of the annotation style on the tectogrammatical
layer is available in the PDT references.
All papers related to the theoretical background are listed below
(alphabetically by the last name of the first author).
References
The file pdt.bib contains a collection of books, papers, and technical reports
related to the PDT.
-
Eva Hajicova. (1993). Issues of Sentence Structure and Discourse
Patterns. Prague: Charles University.
-
Eva Hajicova. (1998a). Movement Rules Revisited. In:
Processing of Dependency-Based Grammars, Proceedings from the Workshop
COLING/ACL, Montreal, ed. S. Kahane and A. Polguere, 49-57.
-
Eva Hajicova, Marketa Ceplova. (2000). Deletions and Their Reconstruction
in Tectogrammatical Syntactic Tagging of Very Large Corpora. In: Proceedings
of COLING'2000, Saarbruecken, Germany, 228-284.
-
Eva Hajicova, Jarmila Panevova. (1984). Valency (case) frames of
verbs. In: Sgall (1984:147-188).
-
Eva Hajicova, Barbara Partee, Petr Sgall. (1998). Topic-focus
articulation, tripartite structures, and semantic content. Amsterdam: Kluwer.
-
Marcus M. P., Kim G., Marcinkiewicz M. A. et al. (1994). The Penn Treebank:
Annotating Predicate Argument Structure. Proceedings of the ARPA Human
Language Technology Workshop. San Francisco: Morgan Kaufmann.
-
Marcus M. P., Santorini B. and Marcinkiewicz M. A. (1993). Building a Large
Annotated Corpus of English: the Penn Treebank. Computational Linguistics,
19(2), 313-330.
-
Jarmila Panevova. (1974). On verbal frames in Functional Generative
Description. Prague Bulletin of Mathematical Linguistics 22:3-40;
23(1975):17-52.
-
Jarmila Panevova. (1980). Formy a funkce ve stavbe ceske vety.
[Forms and Functions in the Structure of the Czech Sentence.] Prague: Academia.
-
Vladimir Petkevic. (1987). A New Dependency Based Specification of
Underlying Representations of Sentences. Theoretical Linguistics
14:143-172.
-
Vladimir Petkevic. (1995). A New Formal Specification of Underlying
Representations. Theoretical Linguistics 21:7-61.
-
Petr Sgall. (1967). Generativni popis jazyka a ceska deklinace.
[Generative Description of Language and Czech Declension.] Prague: Academia.
-
Petr Sgall (ed.). (1984). Contributions to Functional Syntax, Semantics
and Language Comprehension. Amsterdam: Benjamins - Prague: Academia.
-
Petr Sgall. (1992). Underlying Structure of Sentences and Its Relations
to Semantics. Wiener Slawistischer Almanach. Sonderband 33. Ed.
by T. Reuther. Wien: Gesellschaft zur Förderung slawistischer Studien,
273-282.
-
Petr Sgall. (1997a). Valency and Underlying Structure. An Alternative
View on Dependency. In: L. Wanner (ed.): Recent Trends in Meaning-Text
Theory. Amsterdam/Philadelphia: Benjamins, 149-166.
-
Petr Sgall. (1997b). On the Usefulness of Movement Rules. In: Caron
B. (ed.), Actes du 16e Congres International des Linguistes (Paris
20-25 juillet 1997), Oxford: Elsevier Sciences.
-
Petr Sgall. (in press). The Freedom of Language. To appear in Prague
Linguistic Circle Papers 4.
-
Petr Sgall, Eva Hajicova, Jarmila Panevova. (1986). The Meaning
of the Sentence in Its Semantic and Pragmatic Aspects, ed. by J. L.
Mey. Dordrecht: Reidel - Prague: Academia.