## 5. T-lemmas and node types

For every node type we can specify the following:

• `nodetype` = `root` (see Section 1, "The technical root node").

The technical root node of a tectogrammatical tree has no `t_lemma` attribute.

• `nodetype` = `atom` (see Section 2, "Atomic nodes").

The t-lemmas assigned to nodes of this kind usually correspond to their m-lemmas. The exception is nodes representing syntactic negation, which are assigned the t-lemma substitute `#Neg`.

• `nodetype` = `coap` (see Section 3, "Paratactic structure root nodes").

Paratactic structure root nodes have so-called representative t-lemmas, which usually correspond to their m-lemmas (e.g.: a (=and), nebo (=or), krát (=times)); sometimes non-alphabetical/non-numerical symbols are also used (e.g.: +). Nodes representing complex conjunctions and conjunction pairs have multi-word t-lemmas (e.g.: buď_nebo (=either_or); see Section 3.1, "Multi-word t-lemma"); this is also the case for some operators (e.g.: od_do (=from_to)).

Punctuation marks are represented by nodes with t-lemma substitutes (e.g.: `#Comma`, `#Dash` etc.; see Section 4, "T-lemma substitutes").

For the analysis of coordinating connectives and operators, see Section 16, "Co-ordinating connectives and operators".

• `nodetype` = `list` (see Section 4, "List structure root nodes").

List structure root nodes have the following t-lemma substitutes: `#Idph` and `#Forn`.

• `nodetype` = `fphr` (see Section 5, "Nodes representing foreign-language expressions").

The t-lemmas assigned to nodes representing foreign-language expressions correspond to their surface forms.

• `nodetype` = `dphr` (see Section 6, "Nodes representing the dependent parts of idiomatic expressions").

The t-lemmas assigned to nodes representing the dependent parts of idiomatic expressions are the actual word forms present at the surface level. If the dependent part of an idiomatic expression consists of more than one component, its t-lemma is complex: the node with the `DPHR` functor has a t-lemma containing all the components of the expression in question, in their surface form and order, connected by the underscore mark.
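
The joining rule just described can be sketched in a few lines of Python. This is only an illustration: the function name `dphr_t_lemma` is hypothetical, and the idiom used in the example (the dependent part of vzít nohy na ramena) is an assumed illustration, not an example taken from this section.

```python
def dphr_t_lemma(surface_forms):
    """Build the t-lemma of a DPHR node from the surface forms of its components.

    The forms are kept exactly as they appear at the surface level,
    in surface order, and connected by the underscore mark.
    """
    if not surface_forms:
        raise ValueError("a DPHR node needs at least one surface form")
    return "_".join(surface_forms)


# Illustrative (assumed) idiom: dependent part with three components.
print(dphr_t_lemma(["nohy", "na", "ramena"]))
```

A dependent part with a single component simply keeps its surface form, since there is nothing to join.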

• `nodetype` = `complex` (see Section 7, "Complex nodes").

The t-lemmas assigned to complex nodes are nouns, adjectives, numerals, verbs and adverbs (occasionally also words of other parts of speech). Often, the t-lemma and the m-lemma are the same, as in the following sentence: Otec čte noviny. (=Father is reading a newspaper) - the m-lemmas / t-lemmas are: otec, číst, noviny (=father, read, newspaper).

The t-lemmas are different from their respective m-lemmas in the following cases (cf. also Section 2, "The relation between a node's t-lemma and m-lemma and between its t-lemma and word form"):

• personal and possessive pronouns are represented by nodes with the `#PersPron` t-lemma,

• short forms of adjectives are represented by their respective long forms (e.g. zklamán (= disappointed) is represented by a node with the t-lemma zklamaný),

• the t-lemma assigned to a reflexive verb is formed by the infinitive of the relevant verb plus the reflexive se, which is connected to the verb by the underscore mark (e.g. smát_se),

• the t-lemma assigned to an expression of the form number+adjective contains both its parts, connected by the underscore mark (e.g.: 45_letý (=45 years old)),

• foreign surnames containing van, von, de etc. have multi-word t-lemmas (e.g.: van_Gogh, de_Vito),

• numbers with the function of a "label", like telephone numbers, post codes etc. have multi-word t-lemmas, too (see Section 10.1.3, "Numerals with the function of a "label""; e.g.: 420_987_596_281; 278_11).

• differences between t-lemmas and their respective m-lemmas also result from the attempt to capture derivational processes; the derived forms are represented by the t-lemmas of the base forms. For the analysis of the individual types, see Section 2, "The relation between a node's t-lemma and m-lemma and between its t-lemma and word form", in more detail also Section 1, "Syntactic and lexical derivation".
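
Several of the cases listed above share one mechanism: the parts of the expression are connected by the underscore mark. The following Python sketch illustrates that mechanism for the examples given in the list. All function names are hypothetical, and the assumption that "label" numbers are split on whitespace at the surface is ours, not a statement from the guidelines.

```python
def multiword_t_lemma(parts):
    """Connect the parts of a multi-word t-lemma by the underscore mark."""
    parts = [p for p in parts if p]
    if not parts:
        raise ValueError("at least one non-empty part is required")
    return "_".join(parts)


def reflexive_verb_t_lemma(infinitive):
    """Infinitive of the verb plus the reflexive 'se'."""
    return multiword_t_lemma([infinitive, "se"])


def label_number_t_lemma(surface):
    """Number with the function of a 'label' (telephone number, post code).

    Assumption: the digit groups are separated by whitespace at the surface.
    """
    return multiword_t_lemma(surface.split())


print(reflexive_verb_t_lemma("smát"))            # smát_se
print(multiword_t_lemma(["45", "letý"]))         # 45_letý
print(multiword_t_lemma(["van", "Gogh"]))        # van_Gogh
print(label_number_t_lemma("420 987 596 281"))   # 420_987_596_281
print(label_number_t_lemma("278 11"))            # 278_11
```

The remaining cases (the `#PersPron` substitute, long forms of adjectives, derivational base forms) require lexical knowledge and cannot be reduced to string joining, so they are left out of the sketch.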

• `nodetype` = `qcomplex` (see Section 8, "Quasi-complex nodes").

Quasi-complex nodes are newly established nodes which are assigned t-lemma substitutes (see Section 4, "T-lemma substitutes").
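
The constraints stated in this section can be summarised as a rough consistency check. The sketch below is a minimal illustration, not part of any annotation tool: the function name `t_lemma_is_plausible` is hypothetical, and the test that a t-lemma substitute starts with `#` is an assumption generalised from the substitutes mentioned here (`#Neg`, `#Comma`, `#Dash`, `#Idph`, `#Forn`, `#PersPron`).

```python
NODETYPES = ("root", "atom", "coap", "list", "fphr", "dphr", "complex", "qcomplex")

# The two substitutes named in this section for list structure root nodes.
LIST_SUBSTITUTES = frozenset({"#Idph", "#Forn"})


def t_lemma_is_plausible(nodetype, t_lemma):
    """Check a t-lemma against the constraints this section states per node type.

    `t_lemma` is None when the node has no t_lemma attribute.
    """
    if nodetype not in NODETYPES:
        raise ValueError(f"unknown nodetype: {nodetype!r}")
    if nodetype == "root":
        # The technical root node has no t_lemma attribute.
        return t_lemma is None
    if not t_lemma:
        return False
    if nodetype == "list":
        return t_lemma in LIST_SUBSTITUTES
    if nodetype == "qcomplex":
        # Assumption: every t-lemma substitute starts with '#'.
        return t_lemma.startswith("#")
    # atom, coap, fphr, dphr, complex: ordinary t-lemmas and substitutes
    # (e.g. #Neg, #Comma, #PersPron) are both possible.
    return True


print(t_lemma_is_plausible("list", "#Forn"))
print(t_lemma_is_plausible("qcomplex", "otec"))
```

The check is deliberately permissive for the remaining node types, because the section allows both ordinary t-lemmas and substitutes there.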