VALLEX 2.5 – Logical Structure of the Lexicon
The primary goal of the following text is to briefly describe the
content of VALLEX 2.5 data from a structural point of view.
Linguistic issues requiring an extensive explanation or discussion
are mostly left aside. However, a more detailed description (and
also additional relevant references) can be found in
Žabokrtský, 2005. Some theoretical issues concerning valency are
summarized in Lopatková, 2003.
As for terminology, the terms used here either belong to the
broadly accepted linguistic terminology, or come from FGD
(which we have used as the background theory), or are defined
somewhere else in this text.
1 Lexemes
On the highest level, VALLEX 2.5 is composed of lexemes. A lexeme is understood as a
two-fold abstract entity (see Cruse, 1986): it associates a set of possible lexical forms (by which the
presence of the lexeme is manifested in an utterance, Section 2) with a set of
lexical units (complexes of syntactic and semantic features, LUs for short, Section 3). Put more simply, lexical forms can be viewed as the conjugated forms of a given
verbal lexeme, whereas each LU corresponds roughly to the lexeme used in a specific sense and
with a specific syntactic combinatorial potential.
It is usual in dictionaries that the set of all possible lexical
forms of a given lexeme is represented only by the infinitive form,
called the lemma.
A lemma in VALLEX 2.5 should be considered a complex
structure:
-
it always contains the ‘base’ infinitive form;
-
it is always labeled in superscript with its morphological
aspect (Section 2.2);
-
it may also be labeled with a Roman numeral in
subscript if it is necessary to distinguish it from its homograph
(e.g. nakupovatI – to buy vs.
nakupovatII – to heap,
see Section 2.4);
-
it may also contain a reflexive particle (e.g. bát se – to fear,
see Section 2.1).
In VALLEX 2.5, there are typically two or more lemmas listed at
the beginning of the lexeme entry. This follows the FGD principle of
treating aspectual counterparts (perfective and imperfective verbs
expressing the same lexical meaning, Section 2.2) as
manifestations of the same lexeme. Another reason for several lemmas
being present in the same lexeme might be the existence of
orthographic variants (Section 2.3).
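The structure of a lemma and its grouping into a lexeme can be sketched as a small data model. This is only an illustration: the class and field names (`Lemma`, `Lexeme`, `base`, `homograph`, `reflexive`) are assumptions made for this sketch and are not taken from the VALLEX 2.5 data format.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class Lemma:
    """One lemma of a lexeme (hypothetical field names)."""
    base: str                         # 'base' infinitive form, e.g. "nasedat"
    aspect: str                       # "impf", "pf", "iter" or "biasp"
    homograph: Optional[str] = None   # Roman numeral such as "I", if needed
    reflexive: Optional[str] = None   # "se" or "si" for reflexive lemmas

    def label(self) -> str:
        """Render the lemma in a flat plain-text form."""
        text = self.base
        if self.reflexive:
            text += " " + self.reflexive
        if self.homograph:
            text += " " + self.homograph
        return f"{text} ({self.aspect})"


@dataclass
class Lexeme:
    """A lexeme lists the lemmas of its aspectual counterparts and variants."""
    lemmas: List[Lemma] = field(default_factory=list)


# Example lexeme taken from Section 2.2: nasedat / nasednout / nasedávat.
nasedat = Lexeme([
    Lemma("nasedat", "impf"),
    Lemma("nasednout", "pf"),
    Lemma("nasedávat", "iter"),
])
print([lemma.label() for lemma in nasedat.lemmas])
```

Keeping the Roman numeral as a separate field, rather than as part of the infinitive string, makes it easy to compare homographs while still treating the numeral as part of the lemma's identity.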
2.1 Reflexive Lemmas
In VALLEX 2.5, two types of reflexive constructions are distinguished:
-
Reflexive lexemes – they are represented as separate lexemes, and the reflexive particles
se or si are considered as parts of their lemmas. Two types are considered:
- ‘reflexiva tantum’ (e.g. bát se – to fear,
smát se – to laugh)
- derived reflexives (e.g. odpovídat se – to account,
šířit se – to spread, vrátit
se – to return)
-
Reflexive usage of irreflexive lexemes – if the reflexive
particles/pronouns se or si have
specific syntactic function(s), reflexive forms of particular verbs
are treated within irreflexive lexemes and their possible functions
are specified (see Section 5.2 and
Section 5.3):
- se
can be a part of the reflexive passive form (e.g. in
pátrá se po zloději – a
thief is being looked for);
- se or si can be a complementation fulfilling some valency slot of
the governing verb (e.g. mýt se – to wash oneself,
where se is PAT (patient) coreferential
with ACT (actor));
- se or si can mark reciprocity (e.g. kopat
se in kopou se vzájemně do nohou – they kick each other’s legs).
2.2 Aspectual Counterparts
Imperfective and perfective verb forms are distinguished in Czech
(as well as a specific subclass of iterative verbs and so-called
biaspectual verbs); this characteristic is called aspect.
In VALLEX 2.5, the value of aspect is attached to each lemma as a
superscript label:
-
impf for imperfective;
-
pf for perfective;
-
iter for iterative verbs;
-
biasp for biaspectual verbs.
There are three ways in which aspectual counterparts (verbs with the
same or very similar lexical meaning differing in aspect) are
formed in Czech (sorted according to productivity):
-
affixation: an imperfective verb is derived from the perfective
one, e.g. by infix -ova-: vypsat /
vypisovat – to excerpt, to write off;
-
prefixation: a perfective verb is derived from the imperfective
one by adding a prefix: psát / napsat – to
write;
-
suppletive (phonemically unrelated) couples: vzít /
brát – to take.
Aspectual counterparts of the first and third type constitute a
single lexeme in VALLEX 2.5, as e.g. in the case of
nasedat impf, nasednout pf,
nasedávat iter – to get on.
As a rule, an LU shares all its lemmas with
the other LUs in the lexeme in which it is embedded (see Section 3). However,
there are exceptions: the aspectual counterpart(s) need not be the
same for all LUs of a particular lexeme. For example,
odpovědět pf is a counterpart of
odpovídat impf in the sense ‘to answer’, but not
in the sense ‘to correspond’. In such cases, the set of applicable
lemmas is specified directly for the LU, introduced by the abbreviation jen (Czech for ‘only’), and overrides the set of
lemmas specified for the whole lexeme.
A lexeme might contain more than one lemma with the same aspect without these lemmas being lemma
variants. The aspect flags are then distinguished by Arabic numerals, as e.g. in the lexeme
osušovat impf1, osoušet impf2, osušit pf – to dry up,
to wipe, or odřezávat impf, odříznout pf1, odřezat pf2
– to cut off (unique aspect flags are necessary because they also serve for co-indexing the
lemmas with example sentences illustrating the usage of the lexeme).
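The numbered aspect flags described above can be split into the aspect value and an optional index, and their uniqueness within a lexeme can be checked mechanically. The following is a minimal sketch; the function names `parse_aspect_flag` and `flags_are_unique` are hypothetical and the flags are assumed to be available as plain strings such as `impf2`.

```python
import re
from typing import List, Optional, Tuple

# The four aspect values used in VALLEX 2.5, optionally followed by digits.
_FLAG = re.compile(r"^(impf|pf|iter|biasp)(\d*)$")


def parse_aspect_flag(flag: str) -> Tuple[str, Optional[int]]:
    """Split a flag such as 'impf2' into ('impf', 2); 'pf' gives ('pf', None)."""
    match = _FLAG.match(flag)
    if match is None:
        raise ValueError(f"not an aspect flag: {flag!r}")
    aspect, index = match.groups()
    return aspect, int(index) if index else None


def flags_are_unique(flags: List[str]) -> bool:
    """Flags co-index lemmas with examples, so they must not repeat."""
    return len(flags) == len(set(flags))


# Flags of the lexeme osušovat impf1, osoušet impf2, osušit pf.
flags = ["impf1", "impf2", "pf"]
print([parse_aspect_flag(f) for f in flags])
print(flags_are_unique(flags))
```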
Some verbs (e.g. informovat – to inform, charakterizovat – to characterize)
can be used in different contexts either as imperfective or as perfective. They are called
biaspectual verbs.
Within imperfective verbs, there is a subclass of iterative verbs (iter). Czech iterative
verbs are derived more or less regularly by affixes such as -va- or
-íva-, and express extended and repetitive actions (e.g. číst – to read
→ čítávat, chodit – to walk → chodívat). In
VALLEX 2.5, iterative verbs containing a double affix -va- (e.g. chodívávat)
are completely disregarded, whereas the remaining iterative verbs occur as headword lemmas of
the relevant lexeme.
2.3 Lemma Variants
Lemma variants (many of which are just spelling variants, i.e. orthographic variants)
are groups of two or more lemmas that are
interchangeable in any context without any change of the meaning
(e.g. dovědět se/dozvědět se – to learn).
Usually, the only difference is just a small alternation in the
morphological stem, which might be accompanied by a subtle
stylistic shift (e.g. myslet/myslit – to think, the
latter one being bookish). Moreover, although the infinitive forms
of the variants differ in spelling, some of their conjugated forms
might be identical (mysli (imper.sg.) both for
myslet and myslit ).
There are rare exceptions when only one of the variants can be used. For example, plavat and
plovat – to swim are usually considered to be variants (see, e.g., SSJČ, 1964),
although, in some contexts, only plavat, in the sense ‘to flounder’, can be used
(plavat při zkoušce, *plovat při zkoušce). The applicable lemmas must then be
listed for the specific LU, as in any other case where an LU imposes a further limitation on the
set of lexical forms.
2.4 Homographs
Homographs are lemmas ‘accidentally’ identical in the spelling but
considerably different in their meaning (there is no obvious
semantic relation between them). They might also differ in
their etymology (e.g. nakupovatI – to buy vs.
nakupovatII – to heap), aspect (Section 2.2) (e.g. stačitI pf – to be enough
vs. stačitII impf – to catch up with), or
conjugated forms (žilo (past.sg.neut) for
žítI – to live vs. žalo (past.sg.neut) for
žítII – to mow).
In VALLEX 2.5, such lemmas are distinguished by Roman numerals in the
subscript. These numerals should be understood as inseparable parts of VALLEX 2.5 lemmas.
3 Lexical Units
Each lexeme is formed by a set of lexical units that are assigned to respective lexical forms
(represented by their lemmas). Following Cruse, 1986, we understand lexical units (LUs) as
“form-meaning complexes with (relatively) stable and discrete semantic properties”. Roughly
speaking, an LU can be understood as ‘a given word in the given sense’. In the Czech tradition,
this concept of LU corresponds to Filipec’s ‘monosemic lexeme’, see Filipec and Čermák, 1985.
Within each lexeme in VALLEX 2.5, LUs are numbered with Arabic
numerals. In the printed and HTML versions of the lexicon, the LU entry
starts with its number.
The ordering of lexical units is not completely random, but it is
not perfectly systematic either. So far, it is based only on the
following weak intuition: the primary and/or the most frequent
meanings should go first, whereas rare and/or idiomatic meanings
should go last. (We do not guarantee that the ordering of LUs in
VALLEX 2.5 exactly matches their frequency in the contemporary
language.)
By default, an LU ‘inherits’ all lemmas specified for the
lexeme in which it is embedded. However, it might happen that
not all the forms specified for the whole lexeme are
applicable to a given LU. In such cases, the list of applicable lemmas is
specified for the given LU separately.
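The inheritance rule can be expressed as a simple lookup with an override. The sketch below is an assumption about how one might model it, not a description of the VALLEX 2.5 file format; the function name `applicable_lemmas` and the use of `None` for "no LU-specific list" are hypothetical.

```python
from typing import List, Optional


def applicable_lemmas(lexeme_lemmas: List[str],
                      lu_lemmas: Optional[List[str]] = None) -> List[str]:
    """Return the lemmas valid for one LU.

    An LU inherits every lemma of its lexeme unless it carries its own
    list, in which case that list overrides the lexeme-level one.
    """
    if lu_lemmas is None:
        return list(lexeme_lemmas)
    return list(lu_lemmas)


# Lexeme-level lemmas from Section 2.2.
lexeme = ["odpovídat impf", "odpovědět pf"]

# Sense 'to answer': both aspectual counterparts apply.
print(applicable_lemmas(lexeme))

# Sense 'to correspond': only the imperfective lemma applies.
print(applicable_lemmas(lexeme, ["odpovídat impf"]))
```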
The information available for each LU entry in VALLEX 2.5 is captured by obligatory and optional
attributes. The former have to be filled in for every LU. The latter might be empty,
either because they are not applicable (e.g. control is not applicable to verbs without
infinitive complementations), or because the annotation has not been finished yet (e.g. the attribute
class; Section 5.4).
Obligatory LU attributes:
-
valency frame (abbr. frame), Section 4;
-
gloss (introduced by ≈) – a verb or paraphrase roughly synonymous with the given sense/meaning;
this attribute is not meant to serve as a source of synonyms, let alone as a genuine lexicographic definition;
it should be used only as a clue for fast orientation within the word entry;
-
example – sentence(s) or sentence fragment(s) containing the given verb used with the given valency frame (abbr. example).
Optional LU attributes:
-
information on control (abbr. control), Section 5.1;
-
possible type(s) of reflexive constructions (abbr. rfl), Section 5.2;
-
possible type(s) of reciprocal constructions (abbr. rcp), Section 5.3;
-
affiliation to a syntactico-semantic class (abbr. class), Section 5.4;
-
flag for idiom (Section 5.5).
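The split into obligatory and optional attributes maps naturally onto a record type with required and defaulted fields. The sketch below is illustrative only: the class name `LexicalUnit`, the field names, and the placeholder frame and gloss are assumptions, not values copied from the lexicon.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class LexicalUnit:
    # Obligatory attributes: must be supplied for every LU.
    frame: List[str]                  # valency frame as a sequence of slots
    gloss: str                        # rough paraphrase of the sense
    example: str                      # sentence or fragment using the frame
    # Optional attributes: may stay empty.
    control: Optional[str] = None     # Section 5.1
    rfl: Optional[List[str]] = None   # Section 5.2
    rcp: Optional[List[str]] = None   # Section 5.3
    sem_class: Optional[str] = None   # Section 5.4 ('class' is reserved in Python)
    idiom: bool = False               # Section 5.5


# The frame and gloss here are placeholders made up for the illustration;
# the example sentence and the control value come from Section 5.1.
lu = LexicalUnit(
    frame=["ACT", "PAT"],
    gloss="to try",
    example="Jiří se pokusí přijít",
    control="ACT",
)
print(lu.control, lu.sem_class, lu.idiom)
```

Omitting any of the three obligatory fields raises a `TypeError` at construction time, which mirrors the requirement that they be filled in for every LU.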
4 Valency Frames
The core valency information is encoded in the valency
frame. Within the FGD framework, valency frames (in a narrow sense)
consist only of inner participants (both obligatory and optional)
and obligatory free modifications, Panevová, 1974; Panevová, 1994. In
VALLEX 2.5, valency frames are enriched with quasi-valency
complementations. Moreover, a few non-obligatory free
modifications occur in valency frames too, since they are
typically related to some verbs (or even to whole classes of
them) and not to others.
(The other free modifications can
occur with the given verb too, but they are not contained in the
valency frame as their presence in a sentence is not understood as
syntactically conditioned in FGD.)
In VALLEX 2.5, a valency frame is modeled as a sequence of frame slots. Each frame slot
corresponds to one (either required or specifically permitted) complementation of the given
verb.
Note on terminology: in this text, the term ‘complementation’
(dependent item) is used in its broad sense, not related to
the traditional argument/adjunct (complement/modifier) dichotomy.
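The following attributes are assigned to each slot: a functor (Section 4.1), a list of possible morphemic forms, and a type of complementation (Section 4.3).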
Some slots tend to occur systematically together. In order
to capture this type of regularity, we have introduced the mechanism of slot expansion,
Section 4.4 (full valency frame is obtained after performing these expansions).
4.1 Functors
In VALLEX 2.5, functors (labels for ‘deep roles’; similar to
theta-roles) are used for expressing
types of relations between verbs and their complementations. According to FGD, functors are
divided into inner participants (actants) and free modifications (this division roughly
corresponds to the argument/adjunct dichotomy), see Panevová, 1974; Panevová, 1994. In VALLEX 2.5, we
also distinguish an additional group of quasi-valency complementations,
see esp. Lopatková and Panevová, 2005.
Functors that occur in VALLEX 2.5 are listed in the following tables.
Inner participants:
-
ACT (actor): Peter read a letter.
-
ADDR (addressee): Peter gave Mary a book.
-
PAT (patient): I saw him.
-
EFF (effect): We made her the secretary.
-
ORIG (origin): She made a cake from apples.
Quasi-valency complementations:
-
DIFF (difference): The value of shares has risen by 100%.
-
OBST (obstacle): The boy stumbled over a stump.
-
INTT (intent): He came there to look for Jane.
Free modifications:
-
ACMP (accompaniment): Mother came with her children.
-
AIM (aim): John came to a bakery for a piece of bread.
-
BEN (benefactive): She made this for her children.
-
CAUS (cause): She did so since they wanted it.
-
COMPL (complement): They painted the wall blue.
-
CRIT (criterion): Peter has to do it exactly according to directions.
-
DIR1 (direction-from): He went from the forest to the village.
-
DIR2 (direction-through): He went through the forest to the village.
-
DIR3 (direction-to): He went from the forest to the village.
-
DPHR (dependent part of a phraseme): Peter talked horse again.
-
EXT (extent): The temperatures reached an all time high.
-
HER (heritage): He named the new villa after his wife.
-
LOC (locative): He was born in Italy.
-
MANN (manner): They did it quickly.
-
MEANS (means): He wrote it by hand.
-
RCMP (recompense): She bought a new shirt for 25 $.
-
REG (regard): With regard to George she asked his teacher for advice.
-
SUBS (substitution): He went to the theater instead of his ill sister.
-
TFHL (temporal-for-how-long): They interrupted their studies for a year.
-
TFRWH (temporal-from-when): His bad reminiscences came from this period.
-
THL (temporal-how-long): We were there for three weeks.
-
TOWH (temporal-to-when): He put it over to next Tuesday.
-
TSIN (temporal-since-when): I have not heard about him since that time.
-
TTILL (temporal-till-when): It will last till 5 o’clock.
-
TWHEN (temporal-when): He will come tomorrow.
Note 1: Besides the functors listed in the tables above, the
value DIR also occurs in the VALLEX 2.5 data. It is used only as a
special symbol for the slot expansion (Section 4.4).
Note 2: The set of functors as introduced in FGD and used in the
Prague Dependency Treebank is richer than that shown above, see
Mikulová et al., 2006. We do not use its full (current) set in VALLEX
2.5 for several reasons. Some functors do not occur with verbs
at all (e.g. MAT – material, partitive, as sklenice
piva.MAT – glass of beer), some other functors can occur there
but represent other than dependency relations (e.g. coordination,
Jim nebo.CONJ Jack – Jim or Jack). And still others can
occur with verbs as well but their behavior is absolutely
independent of the head verb; thus they have nothing to do with
valency frames (e.g. ATT – attitude, udělal to
dobrovolně.ATT – he did it willingly).
4.2 Morphemic Forms
In a sentence, each frame slot can be expressed by a limited set
of morphemic means which we call forms. In VALLEX 2.5, the set of
possible forms (supposing active verb form) is defined either
explicitly, or implicitly.
In the first case (explicitly declared forms), the forms are
enumerated in a list attached as a subscript to the given slot (in the case of
arguments and quasi-valency complementations, no other forms can be
used; in the case of free modifiers, the possible forms are not
necessarily limited to those given in the list).
In the second case (implicitly declared forms), no such list is
specified because the set of possible forms is implied by the
functor of the respective slot (in other words, all forms possibly
expressing the given functor may appear).
The list of forms attached to a frame slot may contain
values of the following types:
-
Pure (prepositionless) case. There are seven morphological cases in
Czech. In the VALLEX 2.5 notation, we use numbering traditional in
the Czech linguistics:
1 – nominative, 2 – genitive, 3 – dative, 4 – accusative, 5 –
vocative, 6 – locative, and 7 – instrumental.
-
Prepositional case. Lemma of the preposition
(i.e., preposition without vocalization) and the
number of the required morphological case are specified
(e.g. z+2, na+4, o+6 …). The prepositions occurring
in VALLEX 2.5 are the following: bez, do, jako, k, kolem,
mezi, místo, na, nad,
o, od, po, pod, podle, pro, proti, před, přes, při, s, u, v, z, za. (jako is traditionally
considered as a conjunction, but it is included in this
list as it requires a particular morphological case in some
valency frames).
-
Infinitive construction. The abbreviation inf stands
for infinitive verbal complementation; inf can appear together
with a conjunction (e.g. než+inf), but it happens very rarely in Czech.
-
Subordinated clauses. Subordinated content clauses
introduced by subordinating conjunctions are represented by
the conjunction lemmas;
the following values occur in VALLEX 2.5:
aby, ať, až, jak, zda,
že.
Subordinated content clauses not introduced by a conjunction
(e.g. those having the form of an indirect speech with an interrogative pronoun or pronominal adverb)
are represented by the abbreviation cont.
-
Construction with adjectives. Abbreviation adj-digit
stands for an adjective complementation in the given case,
e.g. adj-1 (e.g. cítím se slabý – I feel weak).
-
Constructions with být. Infinitive of verb být
(to be) may combine with some of the types above, e.g. být+adj-1
(e.g. zdá se to být dostatečné – it seems to be sufficient).
-
Part of phraseme. If the set of the possible lexical
values of the given complementation is very small (often
one-element), we list these values directly (e.g. napospas
for the phraseme ponechat (někoho) napospas (někomu) – to leave sb at the mercy (of sb)).
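The value types listed above can be told apart by their surface shape, which the following sketch uses to classify a single form value. It assumes the form is given as a plain string exactly as written in this section (e.g. `z+2`, `než+inf`, `adj-1`); the function name `classify_form` and the returned category labels are invented for this illustration.

```python
import re
from typing import Any, Tuple

# Conjunction lemmas that introduce subordinated content clauses.
CONJUNCTIONS = {"aby", "ať", "až", "jak", "zda", "že"}


def classify_form(form: str) -> Tuple[str, Any]:
    """Return (category, detail) for one explicitly listed form value."""
    if form.startswith("být+"):
        # Construction with 'být' wrapping one of the other types.
        category, detail = classify_form(form[len("být+"):])
        return "být+" + category, detail
    if re.fullmatch(r"[1-7]", form):
        return "case", int(form)                      # pure case, 1..7
    match = re.fullmatch(r"adj-([1-7])", form)
    if match:
        return "adjective", int(match.group(1))       # adjective in a case
    if form == "inf":
        return "infinitive", None
    if form.endswith("+inf"):
        return "infinitive", form[: -len("+inf")]     # conjunction + inf
    if form == "cont":
        return "content clause", None
    if form in CONJUNCTIONS:
        return "conjunction", form
    match = re.fullmatch(r"(.+)\+([1-7])", form)
    if match:
        # Preposition lemma plus the number of the required case.
        return "prepositional case", (match.group(1), int(match.group(2)))
    # Anything else is taken to be a lexical value of a phraseme.
    return "phraseme part", form


for value in ["4", "z+2", "než+inf", "že", "cont", "adj-1", "být+adj-1", "napospas"]:
    print(value, "->", classify_form(value))
```

The fallback to "phraseme part" is a simplification: it treats every unrecognized string as a listed lexical value, which is adequate for a sketch but would hide typos in real data.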
If no forms are listed explicitly for a frame slot, then
the list of possible forms implicitly results from the functor of the slot
according to the following (yet incomplete) lists:
-
ACMP: bez+2, s+7, společně s+7, spolu s+7, v čele s+7, v souvislosti s+7, ve spojení s+7, včetně+2, … ;
-
AIM: aby, ať, do+2, k+3, na+4, o+4, pro+4, pro případ+2, proti+3, v zájmu+2, za+4, za+7,
že, … ;
-
BEN: 3, na+4, na účet+2, na úkor+2, na vrub+2, pro+4, proti+3, v+4, ve prospěch+2, v rozporu s+7, s+7, v zájmu+2, … ;
-
CAUS: 7, aby, adverb, díky+3, jelikož, ježto, k+7, kvůli+3, na+4, na+6, na základě+2,
nad+7, následkem+2, od+2, pod+7, pod
náporem+2, pod tíhou+2, pod váhou+2, poněvadž, pro+4, proto, protože, v+6, v důsledku+2,
v souvislosti s+7, vinou+2,
vlivem+2, vzhledem k+3, z+2, z důvodu+2, za+4, za+7, zásluhou+2,
že, … ;
-
CRIT: 2, 7, dle+2, podle+2, na+6, na základě+2, po vzoru+2, přiměřeně+3, v+6, v duchu+2,
v rozporu s+7, v souladu s+7, v souhlase s+7, v závislosti na+6, ve shodě s+7, ve smyslu+2,
ve světle+2, z titulu+2, … ;
-
DIR1: adverb, od+2, s+2, z+2, ze strany+2, zpod+2, zpoza+2,
zpřed+2, … ;
-
DIR2: 7, adverb, kolem+2, cestou+2, mezi+7, napříč+7, po+6, podél+2, přes+4, skrz+4, v+6, … ;
-
DIR3: 7, adverb, do+2, do čela+2, k+3, kolem+2, mezi+4, mimo+4, na+4, na+6, nad+4, naproti+3,
okolo+2, po+4, po+6, pod+4, proti+3, před+4, přes+4, směrem do+2, směrem k+3, směrem na+4, v+4, vedle+2,
za+4, za+7, … ;
-
EXT: 2, 4, 7, adverb, do+2, kolem+2, k+3, na+4, na+6, nad+4,
okolo+2, po+6, pod+7, přes+4, v+4, z+2, za+4, … ;
-
LOC: adverb, blízko+2, blízko+3, daleko+2, do+2, kolem+2, mezi+7, mimo+4, na+4, na+6, na úroveň+2,
nad+7, naproti+3,
nedaleko+2, okolo+2, po+6, po boku+2, poblíž+2, pod+7, podél+2, proti+3, před+7,
přes+4, při+6, stranou+2, u+2, uprostřed+2, uvnitř+2, v+6, v čele+2, v oblasti+2, v rámci+2, v řadě+2,
vedle+2, za+4, za+7,
… ;
-
MANN: 7, adverb, do+2, formou+2, na+4, na+6, nad+4, o+4, po+6, pod+7, proti+3,
před+7, při+6, přes+4, s+7,
v+4, v+6,
v podobě+2, ve formě+2, vedle+2, z+2, za+4, za+7, jak, že, … ;
-
MEANS: 7, adverb, cestou+2, díky+3, do+2, na+4, na+6, o+6, po+6, pod+7,
pomocí+2, prostřednictvím+2, přes+4, s+7, s pomocí+2, v+6, z+2, za+4, skrz+2, za pomoci+2, že,
… ;
-
REG: 7, adverb, bez ohledu na+4, bez zřetele k+3, k+3, kolem+2,
na+4, na+6, na téma+2, nad+7,
nezávisle na+6, o+6, ohledně+2, po+6, pro+4, před+7, při+6,
s+7, se zřetelem k+3, se zřetelem na+4, s ohledem na+4, u+2,
v+6, v otázce+2, v případě+2, v rámci+2, v souvislosti s+7, ve věci+2,
ve vztahu k+3, vůči+3, vzhledem k+3, z+2, z hlediska+2, za+4,
… ;
-
SUBS: jménem+2, namísto+2, místo+2, výměnou za+4, za+4, … ;
-
TFHL: adverb, do+2, na+4, po+2, pro+4, … ;
-
TFRWH: z+2, od+2, … ;
-
THL: 2, 4, 7, adverb, až, dokud, do+2, na+4, po+4, po dobu+2, přes+4,
v+2, za+4, … ;
-
TOWH: adverb, do+2, k+3, na+4, pro+4, … ;
-
TSIN: adverb, od+2, počínaje+7, z+2, … ;
-
TTILL: adverb, do+2, dokud, k+3, než, po+4, … ;
-
TWHEN: 2, 4, 7, adverb, až, do+2, jakmile,
k+3, když, kolem+2, koncem+2, mezi+7,
na+4, na+6, na závěr+2, než, o+6, okolo+2, po+6, počátkem+2, postupem+2,
poté co, před+7, předtím než, při+6, s+7, u příležitosti+2,
v+4, v+6, v době+2, v období+2, v průběhu+2, v závěru+2, z+2, za+2, za+4, začátkem+2,
… ;
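The interplay of explicitly and implicitly declared forms amounts to a lookup with a fallback. The sketch below copies three of the shorter lists above into a table; the names `DEFAULT_FORMS` and `possible_forms` are hypothetical, and the table is deliberately partial, just as the lists in the text are marked as incomplete.

```python
from typing import Dict, List, Optional

# A few of the functor-implied form lists from this section.
# The lists in the text end with an ellipsis, so these are not exhaustive.
DEFAULT_FORMS: Dict[str, List[str]] = {
    "SUBS": ["jménem+2", "namísto+2", "místo+2", "výměnou za+4", "za+4"],
    "TFRWH": ["z+2", "od+2"],
    "TSIN": ["adverb", "od+2", "počínaje+7", "z+2"],
}


def possible_forms(functor: str,
                   explicit: Optional[List[str]] = None) -> List[str]:
    """Return the forms of a slot: the explicit list if there is one,
    otherwise the list implied by the functor."""
    if explicit:
        return list(explicit)
    if functor not in DEFAULT_FORMS:
        raise KeyError(f"no implicit form list recorded for {functor}")
    return list(DEFAULT_FORMS[functor])


print(possible_forms("PAT", ["4"]))   # explicitly declared forms
print(possible_forms("TFRWH"))        # implicitly declared forms
```

Note that for free modifications an explicit list is not necessarily exhaustive, so a fuller model might merge the explicit and implied lists for those functors instead of letting the explicit one win.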
4.3 Types of Complementations
Within the FGD framework, valency frames (in a narrow sense) consist only of inner participants
(both obligatory and optional) and
obligatory free modifications.
As a criterion for obligatoriness, the dialogue test was introduced by
Panevová in Panevová, 1974, see also Sgall, Hajičová, and Panevová, 1986. It should be emphasized that in this
context the term obligatoriness is related to the presence of the
given complementation in the deep (tectogrammatical) structure,
and not to its (surface) deletability in a sentence (moreover, the
relation between deep obligatoriness and surface deletability is
not at all straightforward in Czech).
In
VALLEX 2.5, valency frames are enriched with quasi-valency
complementations. Moreover, a few non-obligatory free
modifications occur in valency frames too, since they are
typically related to some verbs (or even to whole classes of them)
and not to others.
The attribute type is attached to each frame slot and can have one of the
following values: obl (obligatory) or opt (optional) for inner participants and
quasi-valency complementations, and obl (obligatory) or typ (typical) for free
modifications.
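The dependence of the admissible type values on the functor group can be sketched as follows. The labels `obl` and `typ` appear in the slot-expansion rules of Section 4.4; the label `opt` for optional inner participants and quasi-valency complementations is an assumption of this sketch, as are the names `allowed_types` and `is_valid_slot`.

```python
from typing import Set

# Functor groups as listed in Section 4.1.
INNER_PARTICIPANTS = {"ACT", "ADDR", "PAT", "EFF", "ORIG"}
QUASI_VALENCY = {"DIFF", "OBST", "INTT"}


def allowed_types(functor: str) -> Set[str]:
    """Type values a slot with this functor may carry (assumed labels)."""
    if functor in INNER_PARTICIPANTS or functor in QUASI_VALENCY:
        return {"obl", "opt"}
    # Every other functor is treated as a free modification.
    return {"obl", "typ"}


def is_valid_slot(functor: str, slot_type: str) -> bool:
    return slot_type in allowed_types(functor)


print(is_valid_slot("PAT", "opt"))
print(is_valid_slot("DIR3", "typ"))
print(is_valid_slot("ACT", "typ"))
```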
4.4 Slot Expansion
Some slots tend to occur systematically together.
For instance, verbs of motion can be often modified
with direction-to and/or direction-through and/or direction-from modifier.
We decided to capture this type of regularity by introducing
the abbreviation flag for a slot. If this flag is set (in the VALLEX 2.5
notation it is marked with an upward arrow ↑),
the full valency frame is obtained after slot expansion.
If one of the frame slots is marked with the upward arrow, then the full valency frame
will be obtained after substituting this slot with a sequence of slots as follows:
-
↑DIRtyp → DIR1typ DIR2typ DIR3typ
-
↑DIR1obl → DIR1obl DIR2typ DIR3typ
-
↑DIR2obl → DIR1typ DIR2obl DIR3typ
-
↑DIR3obl → DIR1typ DIR2typ DIR3obl
-
↑THLobl → THLobl TSINtyp TTILLtyp
-
↑THLtyp → TSINtyp THLtyp TTILLtyp
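The expansion rules above can be applied mechanically. In the sketch below a slot is modeled as a `(functor, type, flagged)` triple, where `flagged` stands for the upward arrow; this representation and the names `EXPANSIONS` and `expand_frame` are assumptions made for the illustration.

```python
from typing import Dict, List, Tuple

Slot = Tuple[str, str]            # (functor, type)
FlaggedSlot = Tuple[str, str, bool]   # (functor, type, marked with the arrow?)

# The six substitution rules, copied from the list above.
EXPANSIONS: Dict[Slot, List[Slot]] = {
    ("DIR", "typ"):  [("DIR1", "typ"), ("DIR2", "typ"), ("DIR3", "typ")],
    ("DIR1", "obl"): [("DIR1", "obl"), ("DIR2", "typ"), ("DIR3", "typ")],
    ("DIR2", "obl"): [("DIR1", "typ"), ("DIR2", "obl"), ("DIR3", "typ")],
    ("DIR3", "obl"): [("DIR1", "typ"), ("DIR2", "typ"), ("DIR3", "obl")],
    ("THL", "obl"):  [("THL", "obl"), ("TSIN", "typ"), ("TTILL", "typ")],
    ("THL", "typ"):  [("TSIN", "typ"), ("THL", "typ"), ("TTILL", "typ")],
}


def expand_frame(frame: List[FlaggedSlot]) -> List[Slot]:
    """Replace every flagged slot by its expansion; copy the others."""
    full: List[Slot] = []
    for functor, slot_type, flagged in frame:
        if flagged:
            # A KeyError here means the flag was set on a slot
            # for which no expansion rule is listed.
            full.extend(EXPANSIONS[(functor, slot_type)])
        else:
            full.append((functor, slot_type))
    return full


# A made-up frame: an obligatory actor plus a flagged obligatory DIR3 slot.
print(expand_frame([("ACT", "obl", False), ("DIR3", "obl", True)]))
```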
5 Optional LU Attributes
5.1 Control
The term control (abbr. control) relates in this context to a certain type
of predicates (verbs of control) and two coreferential expressions, a ‘controller’ and a
‘controllee’, see also Panevová, 1996. In VALLEX 2.5, control is captured in the data only in
the situation in which a verb has an infinitive modifier (regardless of its functor). Then the
controllee is an element that would be a ‘subject’ of the infinitive (which is structurally
excluded on the surface), and controller is the co-indexed expression. In VALLEX 2.5, the type
of control is stored in the frame attribute ‘control’ as follows:
-
if there is a coreferential relation between
the (unexpressed) subject (‘controllee’) of the infinitive verb and
one of the frame slots of the head verb, then the attribute
is filled with the functor of this slot (‘controller’);
-
otherwise (i.e., if there is no such coreference), a special value is used.
Examples:
-
pokusit se – to try, e.g. Jiří se pokusí přijít – Jiří will try to come, control: ACT;
-
slyšet – to hear, e.g. děti slyší
někoho přicházet – children hear somebody coming, control: PAT;
-
jít, in the sense jde to udělat – it is possible to do it, control:
.
5.2 Reflexivity
The optional attribute reflexivity (abbr. rfl) indicates
possible syntactic functions of the reflexive particles/pronouns
se or si.
The reflexive particles/pronouns se or si are
used in Czech as formal means expressing the following syntactic
constructions:
-
derived diatheses: the particle se is a part of the reflexive passive
verb form:
-
for transitive verbs (e.g. plány se připravují – plans are prepared);
marked with the label ;
-
for intransitive verbs (e.g. pátrá se po zloději – a
thief is being looked for;
v neděli se chodí do kostela – on Sundays one visits the church); marked with the label
.
-
grammatical coreference: the pronouns se or si stand for an inner
participant that is coreferential with the Actor (e.g. mýt se
– to wash oneself, coreference between ACT and PAT (in the
accusative); podřídit si zaměstnance – to subordinate the
employees to oneself, coreference between ACT and ADDR (in the dative)); marked
with the labels (in the case of si ) or (in
the case of se ).
Note that the attribute reflexivity does not cover reflexive verb
forms where the reflexive particles se or si are
parts of the infinitive forms, i.e. true reflexives (e.g.
bát se – to fear, smát se – to laugh) as well
as derived reflexives (e.g. odpovídat se – to account,
šířit se – to spread, vrátit se – to return)
(as already discussed in Section 2.1), nor the
reciprocal function of the pronouns se or si (see
Section 5.3).
5.3 Reciprocity
Reciprocity is understood as a possibility of (two or more) valency
complementations to be in relations with each other that may be viewed
symmetrically (and their roles are interchangeable).
In Czech, if Actor and some other complementation are reciprocal, then the reflexive verb form
is used and these two complementations are expressed either as a coordinated nominal group (as
in Petr a Marie se hádali – Peter and Mary argued (with one another)), or as a plural
noun (přátelé se navštěvují – friends visit each other), possibly with additional
adverbs spolu, navzájem, … .
If Actor is not affected, the reciprocity may follow from the plural form or coordination (with
no other formal sign), as in seznámil je – he introduced them (to each other).
The possibility of reciprocal usage is indicated in the attribute reciprocity (rcp for
short), the value of which is a pair (or triple) of functors involved, e.g. ACT-ADDR for
hádat se – to argue, neustále se spolu hádali – they argued with each other
all the time; or ACT-ADDR-PAT for mluvit – to talk, mluví spolu o sobě –
they talk with each other about themselves.
In the case of derived reflexive lexemes of inherently reciprocal
verbs (with the obligatory complementation in the form s+7),
the LUs of both the irreflexive and the reflexive lexeme are assigned
the attribute rcp.
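A value of the rcp attribute, written as hyphen-separated functors, can be split into its members with a few lines of code. This is a minimal sketch assuming the value is available as a string such as `ACT-ADDR`; the function name `parse_rcp` is hypothetical.

```python
from typing import Tuple


def parse_rcp(value: str) -> Tuple[str, ...]:
    """Split an rcp value into the pair or triple of functors involved."""
    functors = tuple(value.split("-"))
    if len(functors) not in (2, 3):
        raise ValueError(f"expected a pair or triple of functors: {value!r}")
    return functors


print(parse_rcp("ACT-ADDR"))       # hádat se – to argue
print(parse_rcp("ACT-ADDR-PAT"))   # mluvit – to talk
```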
5.4 Semantic Class
Semantic classes are assigned to a significant part of lexical units (2,903 LUs out of 6,460, i.e. 45% of all LUs). These classes were built strictly in a ‘bottom-up’ way, by grouping LUs
with similar syntactic properties and with respect to their semantics. The following 22 semantic
classes were established:
-
appoint verb (23 LUs), e.g.
nominovat – to nominate, určovat, určit – to assign (as in určila ho za svého zástupce
– she assigned him as her assistant), ustanovovat, ustanovit – to appoint, … ;
-
cause motion (43 LUs), e.g. hýbat, hnout, hýbnout – to move (as in
hnul pravou rukou – he moved his right hand), mávat, mávnout – to wave,
vrhat – to throw, … ;
-
combining (96 LUs), e.g.
míchat – to mix, přidat, přidávat – to
add, spojit, spojovat – to join/to combine,
… ;
-
communication (364 LUs), e.g. číst – to read, hovořit – to talk,
nařizovat, nařídit – to command, pochybovat – to hesitate/to question, … ;
-
contact (115 LUs), e.g.
dotýkat se, dotknout se – to contact, narážet, narazit – to hit (against sth),
tisknout – to press, … ;
-
emission (22 LUs), e.g.
pouštět, pustit – to run (as in tričko pustilo barvu – the shirt lost color),
vysílat, vyslat – to radiate/to emit, … ;
-
exchange (177 LUs), e.g.
dávat, dát – to give, dostávat, dostat – to get, platit – to pay,
pronajímat, pronajmout – to let, … ;
-
expansion (19 LUs), e.g.
pronikat, proniknout – to spread, šířit – to diffuse/to disseminate, … ;
-
extent (20 LUs), e.g.
činit – to amount, dosahovat, dosáhnout – to reach, vycházet, vyjít
– to cost/to come to (as in boty vyjdou na tisíc korun – shoes come to one thousand crowns),
… ;
-
change (318 LUs), e.g.
budovat – to build, klesat, klesnout – to fall
(as in teplota klesla pod bod mrazu – the temperature fell below freezing point),
proměňovat, proměnit – to change, růst – to grow, vytvářet, vytvořit – to create,
… ;
-
intervention (10 LUs), e.g.
zasahovat – to meddle, mluvit – to speak/to interfere (as in do toho nemůžu mluvit – I have no voice in this),
… ;
-
location (399 LUs), e.g.
doplňovat, doplnit – to add, nacházet, najít – to find,
shromažďovat – to gather, … ;
-
mental action (304 LUs), e.g.
cítit se – to feel (as in cítit se dobře – to feel fine),
jásat – to exult, mrzet – to be sorry, … ;
-
modal verb (15 LUs), e.g.
dovést – to be able, chtít – to want, … ;
-
motion (309 LUs), e.g.
běžet – to run, dorážet, dorazit – to arrive,
hýbat se – to move (as in Nehýbej se! – Don’t move!), … ;
-
perception (104 LUs), e.g.
hledět – to look, pamatovat – to remember,
všímat se, všimnout si – to notice, … ;
-
phase of action (80 LUs), e.g.
končit – to end (as in zde les končí – here the forest ends),
vrcholit – to culminate, vznikat, vzniknout – to arise, … ;
-
phase verb (76 LUs), e.g.
iniciovat – to initiate,
končit – to end (as in končit školu – to finish the school),
najet – to cover (as in najeli aspoň 500 mil – they covered at least 500 miles), … ;
-
providing (51 LUs), e.g.
naplňovat, naplnit – to fill/to replenish, oloupávat, olupovat, oloupnout, oloupat –
to peel (as in oloupat ovoce – to peel fruit),
vybavovat, vybavit – to equip, … ;
-
psych verb (83 LUs), e.g.
klamat – to deceive, těšit – to please, … ;
-
social interaction (86 LUs), e.g.
potkávat se, potkat se – to meet (as in potkává se s přáteli v baru –
he meets his friends in a bar),
spojovat se, spojit se – to interconnect/to get in touch
(as in spojím se s ním co nejdříve – I will get in touch with him as soon as possible),
souhlasit – to agree, … ;
-
transport (189 LUs), e.g.
donášet, donést – to bring/to carry,
přemisťovat/přemísťovat, přemístit – to move, shrnovat, shrnout –
to heap, … .
We admit that this classification is tentative and should be understood merely as an intuitive
gathering of frames, rather than a properly defined ontology. The motivation for introducing
such semantic classification in VALLEX 2.5 was the fact that it simplifies systematic checking
of consistency and allows for making more general observations about the data.
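The per-class counts given in this section can be cross-checked against the totals stated at its beginning. The short script below sums the 22 counts and recomputes the percentage; the variable names are arbitrary.

```python
# LU counts per semantic class, copied from the list in this section.
CLASS_COUNTS = {
    "appoint verb": 23, "cause motion": 43, "combining": 96,
    "communication": 364, "contact": 115, "emission": 22,
    "exchange": 177, "expansion": 19, "extent": 20, "change": 318,
    "intervention": 10, "location": 399, "mental action": 304,
    "modal verb": 15, "motion": 309, "perception": 104,
    "phase of action": 80, "phase verb": 76, "providing": 51,
    "psych verb": 83, "social interaction": 86, "transport": 189,
}

ALL_LUS = 6460   # total number of LUs stated in the text

classified = sum(CLASS_COUNTS.values())
share = round(100 * classified / ALL_LUS)

print(len(CLASS_COUNTS))   # 22 classes
print(classified)          # 2903 classified LUs
print(share)               # 45 (percent)
```

The sum of the class counts equals the stated 2,903, which is consistent with each classified LU being counted in exactly one class.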
5.5 Idioms
When building VALLEX, we have focused mainly on primary or usual meanings of verbs.
We also noted many LUs corresponding to peripheral usages of
verbs. However, their coverage in VALLEX might not be complete.
We call such LUs idiomatic and mark them with the label ‘idiom’.
An idiomatic frame is tentatively characterized either by a substantial shift
in meaning (with respect to the primary sense), or by a small and strictly limited set of
possible lexical values in one of its complementations, or by occurrence of another type of
irregularity or anomaly.
References

- Cruse (1986)
Cruse, D. A. 1986. Lexical Semantics. Cambridge University Press, Cambridge.
- Filipec and Čermák (1985)
Filipec, Josef and František Čermák. 1985. Česká lexikologie. Academia, Praha.
- Hajič (2005)
Hajič, Jan. 2005. Complex Corpus Annotation: The Prague Dependency Treebank. In Mária Šimková, editor, Insight into Slovak and Czech Corpus Linguistics. Veda Bratislava, Slovakia, pages 54–73.
- Hajičová, Partee, and Sgall (1998)
Hajičová, Eva, Barbara H. Partee, and Petr Sgall. 1998. Topic-Focus Articulation, Tripartite Structures, and Semantic Content, volume 71 of Studies in Linguistics and Philosophy. Kluwer, Dordrecht.
- Lopatková and Panevová (2005)
Lopatková, Markéta and Jarmila Panevová. 2005. Recent developments in the theory of valency in the light of the Prague Dependency Treebank. In Mária Šimková, editor, Insight into Slovak and Czech Corpus Linguistics. Veda Bratislava, Slovakia, pages 83–92.
- Lopatková et al. (2002)
Lopatková, Markéta, Zdeněk Žabokrtský, Karolína Skwarska, and Václava Benešová. 2002. Tektogramaticky anotovaný valenční slovník českých sloves. Technical Report TR-2002-15, ÚFAL/CKL MFF UK, Prague.
- Lopatková et al. (2003)
Lopatková, Markéta, Zdeněk Žabokrtský, Karolína Skwarska, and Václava Benešová. 2003. VALLEX 1.0 Valency Lexicon of Czech Verbs. Technical Report TR-2003-18, ÚFAL/CKL MFF UK, Prague.
- Lopatková (2003)
Lopatková, Markéta. 2003. Valency in the Prague Dependency Treebank: Building the Valency Lexicon. The Prague Bulletin of Mathematical Linguistics, (79–80):37–60.
- Mikulová et al. (2006)
Mikulová, Marie, Alevtina Bémová, Jan Hajič, Eva Hajičová, Jiří Havelka, Veronika Kolářová, Lucie Kučová, Markéta Lopatková, Petr Pajas, Jarmila Panevová, Magda Razímová, Petr Sgall, Jan Štěpánek, Zdeňka Urešová, Kateřina Veselá, and Zdeněk Žabokrtský. 2006. Annotation on the tectogrammatical level in the Prague Dependency Treebank. Annotation manual. Technical Report TR-2006-30, ÚFAL MFF UK, Prague.
- Pala and Ševeček (1997)
Pala, Karel and Pavel Ševeček. 1997. Valence českých sloves. In Sborník prací FFBU, pages 41–54, Brno.
- Panevová (1974)
Panevová, Jarmila. 1974. On Verbal Frames in Functional Generative Description. The Prague Bulletin of Mathematical Linguistics, (22):3–40.
- Panevová (1994)
Panevová, Jarmila. 1994. Valency Frames and the Meaning of the Sentence. In Philip A. Luelsdorff, editor, The Prague School of Structural and Functional Linguistics. John Benjamins Publishing Company, pages 223–243.
- Panevová (1996)
Panevová, Jarmila. 1996. More Remarks on Control. Prague Linguistic Circle Papers, John Benjamins, 2:101–120.
- SSČ (2003)
SSČ. 2003. Slovník spisovné češtiny pro školu a veřejnost. Academia, Praha. (3rd edition).
- SSJČ (1964)
SSJČ. 1964. Slovník spisovného jazyka českého. Academia, Praha.
- Sgall, Hajičová, and Panevová (1986)
Sgall, Petr, Eva Hajičová, and Jarmila Panevová. 1986. The Meaning of the Sentence in Its Semantic and Pragmatic Aspects. D. Reidel Publishing Company, Dordrecht.
- Svozilová, Prouzová, and Jirsová (1997)
Svozilová, Naďa, Hana Prouzová, and Anna Jirsová. 1997. Slovesa pro praxi. Academia, Praha.
- Svozilová, Prouzová, and Jirsová (2005)
Svozilová, Naďa, Hana Prouzová, and Anna Jirsová. 2005. Slovník slovesných, substantivních a adjektivních vazeb a spojení. Academia, Praha.
- Žabokrtský (2005)
Žabokrtský, Zdeněk. 2005. Valency Lexicon of Czech Verbs. Ph.D. thesis, Charles University, Prague, Czech Rep.