l

Lemma as defined by a morphological dictionary, manually disambiguated. The <MMl> and <MDl> text fields use the same format.

For the Czech data provided in the Prague Dependency Treebank, and partially also in the Czech National Corpus (CNC), the remainder of this section describes the formal structure of the lemma.

The lemma string includes an optional sense ID (a decimal number separated from the lemma by a single dash, such as -3), optionally followed by syntactic, semantic and style tags, derivational information and a comment (or any combination of those), all marked in a non-SGML way. Each tag is exactly one letter long and is attached to the lemma by an underscore and a single markup symbol:

Syntactic tags

Syntactic tags were formerly used to mark an alternate part of speech for some words. Today they are used only for the verb aspect distinction of regular verbs (the T and W symbols). The part of speech can always be found in the associated morphological tag (<t>, <MMt>, <MDt>), and the abbreviation information from the tag (8 in its VAR (last) column) takes precedence over the B designation here.
N    Noun
J    Adjective
A    Adjective
Z    Pronoun
T    Imperfective verb
W    Perfective verb
V    Verb (aspect not specified)
M    Numeral
C    Conjunction
D    Adverb
P    Preposition
F    Interjection
I    Particle
B    Abbreviation
Q    Unused
X    Unused
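The precedence rule for abbreviations can be sketched as follows. This is only one possible reading of "has precedence": when a morphological tag is available, its VAR (last) column decides, and the B syntactic tag on the lemma is consulted only when no tag is available. The function name and its parameters are hypothetical and are not part of the csts DTD.

```python
from typing import Optional, Sequence


def is_abbreviation(syntactic_tags: Sequence[str], morph_tag: Optional[str]) -> bool:
    """Decide whether a word is an abbreviation.

    syntactic_tags: one-letter syntactic tags found on the lemma (e.g. ["B"]).
    morph_tag: content of <t>, <MMt> or <MDt>, or None if unavailable.
    """
    if morph_tag:
        # The morphological tag wins: 8 in the VAR (last) column marks
        # an abbreviation, whatever the lemma says.
        return morph_tag[-1] == "8"
    # ASSUMPTION: fall back to the B designation only without a tag.
    return "B" in syntactic_tags
```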

Semantic tags

G    Geographical name
Y    Person's first (given) name
S    Person's family name
E    Names of members of nations, cities, ethnic groups etc.
R    Product name
K    Organization name
m    Other proper name
H    Chemistry
U    Medicine
L    Natural Sciences
j    Law, Legal
g    General Technical term
c    Electronics, Computers
y    DIY, travel, free time
b    Economy and Finances
u    Culture, Education, Arts, other science
w    Sports
p    Politics, Government, Military
z    Environment
o    Colors

Style tags

s    Bookish
a    Archaic
n    Dialect
h    Colloquial (not tolerated in the standard)
e    Expressive
l    Slang
v    Vulgar (extremely expressive)
t    Foreign-language word
x    Parallel spelling/form, do not use for morph. generation
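The lemma structure described so far (base lemma, optional sense ID, one-letter tags, parenthesized comment) can be split apart with a short routine such as the sketch below. Only the caret (^) is named as a markup symbol in this section; the symbols `:`, `;` and `,` used here for syntactic, semantic and style tags are an assumption, as are all function and class names. The sketch also assumes that the base lemma contains no underscore followed by a markup symbol and that comments contain no nested parentheses.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional

# ASSUMPTION: markup symbols for the three tag classes; this section
# itself names only the caret (^) for derivation info and comments.
TAG_CLASSES = {":": "syntactic", ";": "semantic", ",": "style"}

TAIL_START = re.compile(r"_[:;,^]")                   # first tag or comment
TAIL_ITEM = re.compile(r"_([:;,])(.)|_\^\(([^)]*)\)")  # one tag or one comment


@dataclass
class ParsedLemma:
    base: str
    sense_id: Optional[int] = None
    syntactic: List[str] = field(default_factory=list)
    semantic: List[str] = field(default_factory=list)
    style: List[str] = field(default_factory=list)
    comments: List[str] = field(default_factory=list)


def parse_lemma(lemma: str) -> ParsedLemma:
    # Split the string into the lemma proper and the tag/comment tail.
    start = TAIL_START.search(lemma)
    head, tail = (lemma[:start.start()], lemma[start.start():]) if start else (lemma, "")

    # The sense ID is a decimal number after the last dash of the head.
    sense = re.fullmatch(r"(.+)-(\d+)", head)
    if sense:
        parsed = ParsedLemma(base=sense.group(1), sense_id=int(sense.group(2)))
    else:
        parsed = ParsedLemma(base=head)

    for item in TAIL_ITEM.finditer(tail):
        symbol, letter, comment = item.groups()
        if symbol is not None:
            getattr(parsed, TAG_CLASSES[symbol]).append(letter)
        else:
            parsed.comments.append(comment)
    return parsed
```

For a hypothetical input such as `stát-3_:T_,h`, this yields the base `stát`, sense ID 3, syntactic tag T and style tag h.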

Derivation information, general comment

Derivation information and general comments (both introduced by the caret symbol, ^) are additionally always enclosed in a pair of parentheses.

Within the parentheses, derivation information always starts with a star (*), which distinguishes it from a general comment. The star may be preceded by a derivation type, formed by the symbol ^ (caret) and a two-letter code. The star is followed by a "rule" that describes how to obtain the (underlying) lemma from which the current lemma has been derived. The rule has two parts, a deletion part followed by a to-be-appended part:

If a star (*) is used in the deletion part, the to-be-appended part that follows contains the complete original lemma. Otherwise, the number of symbols designated by the deletion part (counting the sense ID, if any) should be stripped from the end of the current lemma before the to-be-appended part is attached to form the original lemma.
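The rule can be applied mechanically, as in the following sketch. It assumes that the deletion part is either a star or a decimal number written directly after the leading star, and that the derivation type consists of two ASCII letters; neither detail is spelled out in this section. The names `Derivation` and `apply_rule` are hypothetical.

```python
import re
from typing import NamedTuple, Optional


class Derivation(NamedTuple):
    type_code: Optional[str]  # two-letter derivation type, if present
    original: str             # the reconstructed original lemma


# ASSUMPTION: optional "^XX", then "*", then "*" or digits, then the
# to-be-appended part.
RULE = re.compile(r"(?:\^(?P<type>[A-Za-z]{2}))?\*(?P<deletion>\*|\d+)(?P<append>.*)")


def apply_rule(lemma: str, comment: str) -> Optional[Derivation]:
    """Reconstruct the original lemma.

    lemma: current lemma including its sense ID, without tags.
    comment: the text found inside the parentheses.
    Returns None when the text is a general comment, not a rule.
    """
    match = RULE.fullmatch(comment)
    if match is None:
        return None
    if match.group("deletion") == "*":
        # Star in the deletion part: the rest is the complete original lemma.
        return Derivation(match.group("type"), match.group("append"))
    count = int(match.group("deletion"))
    if count > len(lemma):
        raise ValueError("deletion part is longer than the lemma")
    # Strip `count` symbols from the end, then attach the appended part.
    stem = lemma[:len(lemma) - count]
    return Derivation(match.group("type"), stem + match.group("append"))
```

With the illustrative (not source-provided) input `apply_rule("bytí", "*3ýt")`, three symbols are stripped from `bytí` and `ýt` is appended, giving `být`.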

Examples:

Eventually, all proper derivations should have the ^XX derivation type present; all remaining comments that start with a star (*) and contain the string transformation rule described above will be considered synonyms, not derivations.


Content


No ATTRIBUTES
CONTENT DECLARATION

Tag Minimization
Open Tag: REQUIRED
Close Tag: OPTIONAL

Parent Elements




csts DTD