t

The morphological tag of the current token (which can be found in the text part of <f> or <d>), manually disambiguated. The tagset is defined by the morphological dictionary used for preprocessing the data.

In the Prague Dependency Treebank (PDT), the following tagset system is currently in use. For more information, please refer to the PDT documentation.

Each tag is a 15-tuple of symbols (mostly uppercase letters and digits, though lowercase letters and special symbols also occur). Each single-character position holds the value of one morphological category. Thirteen of the 15 positions are actually in use; positions 13 and 14 are reserved:

Position	Category name	Description
1	POS	Part of Speech
2	SUBPOS	Detailed Part of Speech
3	GENDER	Grammatical Gender (for agreement)
4	NUMBER	Grammatical Number (for agreement)
5	CASE	Morphological Case
6	POSSGENDER	Gender of Possessor
7	POSSNUMBER	Number of Possessor
8	PERSON	Person
9	TENSE	Tense
10	GRADE	Degree of Comparison
11	NEGATION	Negation
12	VOICE	Voice
13	RESERVE1	Reserved
14	RESERVE2	Reserved
15	VAR	Variant, Style, Register

For more information on the individual categories, especially the sets of possible values, please see the full Tagset documentation (psfile, pdffile) or the quick tagset reference (htmlfile, pdffile).
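
Because every category occupies a fixed single-character position, a tag can be split into named fields by position alone. The following is a minimal sketch of that idea, not part of the PDT tools: the function name `parse_tag` is hypothetical, and the sample tag string is only illustrative (the meaning of each value is defined by the tagset documentation, not by this example).

```python
# Category names for the 15 tag positions, in the order given in the table.
CATEGORIES = (
    "POS", "SUBPOS", "GENDER", "NUMBER", "CASE",
    "POSSGENDER", "POSSNUMBER", "PERSON", "TENSE", "GRADE",
    "NEGATION", "VOICE", "RESERVE1", "RESERVE2", "VAR",
)


def parse_tag(tag: str) -> dict[str, str]:
    """Map each character of a 15-position tag to its category name.

    Hypothetical helper: it only checks the length of the tag and does
    not validate the individual values against the tagset.
    """
    if len(tag) != len(CATEGORIES):
        raise ValueError(
            f"expected {len(CATEGORIES)} positions, got {len(tag)}: {tag!r}"
        )
    return dict(zip(CATEGORIES, tag))


if __name__ == "__main__":
    # Illustrative tag string; see the tagset documentation for value sets.
    fields = parse_tag("NNMS1-----A----")
    for position, (name, value) in enumerate(fields.items(), start=1):
        print(f"{position:2d} {name:<10} {value}")
```

Keeping the category names in one ordered tuple means the position numbers in the table and the keys of the resulting dictionary cannot drift apart.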

Content

(#PCDATA)

Tag Minimization

Open Tag: REQUIRED
Close Tag: OPTIONAL
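
Since the close tag is optional, code that reads the element's text cannot rely on finding `</t>`; the content simply ends where the next markup begins. The sketch below illustrates one way to handle this with a regular expression. It is an assumption-laden example rather than a real SGML parser: the function name `extract_tags` and the sample line are hypothetical, and it assumes a tag value contains no whitespace or `<` characters.

```python
import re

# Matches an opening <t> (with or without attributes) and captures the text
# that follows, up to the next "<" or whitespace. No closing </t> is needed.
T_ELEMENT = re.compile(r"<t(?:\s[^>]*)?>([^<\s]+)")


def extract_tags(line: str) -> list[str]:
    """Return the text content of every <t> element found in the line.

    Hypothetical helper for illustration; a full SGML parser driven by
    the DTD is the robust way to read the data.
    """
    return T_ELEMENT.findall(line)


if __name__ == "__main__":
    # Hypothetical sample line, written without closing tags.
    sample = "<f>pes<l>pes<t>NNMS1-----A----"
    print(extract_tags(sample))
```

The same pattern works whether or not the close tag is present, because matching stops at the first `<` after the tag text.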

Parent Elements


csts DTD