t

The manually disambiguated morphological tag of the current token (the token itself is the text content of <f> or <d>). The tagset is defined by the morphological dictionary used for preprocessing the data.

In the Prague Dependency Treebank (PDT), the following tagset system is currently in use. For more information, please refer to the PDT documentation.

Each tag is a string of 15 symbols (mostly uppercase letters and digits, though lowercase letters and special symbols also occur). Each single-character position holds the value of one morphological category. Only 13 of the 15 positions are actually in use; positions 13 and 14 are reserved:

Position  Category name  Description
1         POS            Part of Speech
2         SUBPOS         Detailed Part of Speech
3         GENDER         Grammatical Gender (for agreement)
4         NUMBER         Grammatical Number (for agreement)
5         CASE           Morphological Case
6         POSSGENDER     Gender of Possessor
7         POSSNUMBER     Number of Possessor
8         PERSON         Person
9         TENSE          Tense
10        GRADE          Degree of Comparison
11        NEGATION       Negation
12        VOICE          Voice
13        RESERVE1       Reserved
14        RESERVE2       Reserved
15        VAR            Variant, Style, Register
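
Because each position maps to exactly one category, a tag can be split mechanically into named fields. The following is a minimal sketch in Python, not part of the PDT tooling: the function name split_tag and the sample tag "NNMS1-----A----" are illustrative assumptions, as is the reading of "-" as "category not applicable" (consult the tagset documentation for the authoritative values).

```python
# Category names in positional order, taken from the table above.
CATEGORIES = [
    "POS", "SUBPOS", "GENDER", "NUMBER", "CASE",
    "POSSGENDER", "POSSNUMBER", "PERSON", "TENSE", "GRADE",
    "NEGATION", "VOICE", "RESERVE1", "RESERVE2", "VAR",
]


def split_tag(tag):
    """Map a 15-character positional tag to {category name: symbol}.

    split_tag is a hypothetical helper written for this illustration.
    """
    if len(tag) != len(CATEGORIES):
        raise ValueError(
            "expected %d positions, got %d" % (len(CATEGORIES), len(tag))
        )
    # zip pairs position i of the tag with the i-th category name.
    return dict(zip(CATEGORIES, tag))


# Sample tag (an assumption for illustration only).
fields = split_tag("NNMS1-----A----")
print(fields["POS"], fields["CASE"], fields["NEGATION"])  # prints: N 1 A
```

The length check rejects truncated or padded tags early, which is usually preferable to silently misaligning categories.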

For more information on the individual categories, especially the sets of possible values, please see the full Tagset documentation (psfile, pdffile) or the quick tagset reference (htmlfile, pdffile).


Content


ATTRIBUTES
CONTENT DECLARATION

Tag Minimization
Open Tag: REQUIRED
Close Tag: OPTIONAL

Parent Elements



csts DTD