1. Representing coreference in the tectogrammatical trees

The current way of representing coreference makes use of the fact that every node of every tree has an identifier (the value of the id attribute) that is unique within PDT. Since coreference is a link between two nodes (one node referring to another), it is enough to specify the identifier of the coreferred node in the appropriate attribute of the coreferring node. Individual coreference subtypes are distinguished by the value of another attribute.
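The identifier-based linking described above can be illustrated with a minimal Python sketch. It is not the PDT data format: the class `Node`, the attribute names `coref_target` and `coref_type`, the node identifiers, and the subtype value are all hypothetical placeholders chosen for illustration.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class Node:
    id: str                              # identifier, assumed unique across the whole corpus
    t_lemma: str
    coref_target: Optional[str] = None   # hypothetical attribute: id of the coreferred node
    coref_type: Optional[str] = None     # hypothetical attribute: coreference subtype


def build_index(nodes: List[Node]) -> Dict[str, Node]:
    """Map each identifier to its node so that links can be followed."""
    return {node.id: node for node in nodes}


def resolve(node: Node, index: Dict[str, Node]) -> Optional[Node]:
    """Return the coreferred node, or None if the node does not corefer."""
    if node.coref_target is None:
        return None
    return index.get(node.coref_target)


# Invented identifiers and subtype value, for illustration only.
nodes = [
    Node("t-s1-n1", "Peter"),
    Node("t-s2-n4", "#PersPron", coref_target="t-s1-n1", coref_type="textual"),
]
index = build_index(nodes)
antecedent = resolve(nodes[1], index)
```

Because the link is stored as an identifier rather than as a position in a tree, the coreferred node may in principle be looked up in any tree covered by the index.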

Three attributes have been introduced for representing coreference:

Every coreferring node is assigned a value in only one of these attributes.

Depending on which part of the tree is referred to, the following cases of coreference are distinguished:

Coreference relations can also be established between nodes that are not present at the surface level, i.e. between newly established nodes with various t-lemma substitutes (see also Section 4, "Survey of types of coreference with respect to the t-lemmas of the coreferring nodes"). Coreference relations often form long coreference chains, at the end of which stand expressions that do not refer to any other node (see Section 5.1, "Preserving the coreference chains").
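Following a coreference chain to the expression that refers to no other node can be sketched as below. This is only an illustration under simplifying assumptions: each node is assumed to link to at most one other node, the links are given as a plain mapping from coreferring identifier to coreferred identifier, and the function name `coreference_chain` and the identifiers are hypothetical.

```python
from typing import Dict, List


def coreference_chain(start_id: str, links: Dict[str, str]) -> List[str]:
    """Follow links from start_id until reaching a node that refers to no other node."""
    chain = [start_id]
    seen = {start_id}
    current = start_id
    while current in links:
        current = links[current]
        if current in seen:
            # Guard against accidental cycles in the annotation.
            break
        chain.append(current)
        seen.add(current)
    return chain


# Invented identifiers: n3 refers to n2, which refers to n1.
links = {"n3": "n2", "n2": "n1"}
chain = coreference_chain("n3", links)
```

The last element of the returned list is the end of the chain, that is, the node with no outgoing coreference link.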