2.2. Valency frames and the way they are recorded in the valency lexicon

A detailed description of a valency frame record is to be found in Section 5, "The notation of valency frames and its semantics"; in the present section, only the basic information (as to the form of the record) is given.

A valency frame record is a sequence of records of individual valency modifications (types of dependents), separated by spaces.

Competing valency modifications (see Section 2.3.1.5.1, "Competing manner adjuncts") are separated by the | mark.

As for idiomatic expressions and complex predicates, the valency frames of their governing verbs contain, apart from their arguments and obligatory adjuncts, also the dependent parts of the idiomatic expressions or complex predicates in question (with the functors CPHR or DPHR; see Section 2.2.2, "Valency frames of idiomatic expressions (phrasemes) and complex predicates").

The lexical meaning linked to a given valency frame is illustrated by examples; often, synonyms and antonyms are provided, too, or aspectual counterparts, if possible.

In the example part of a valency frame record, one can also occasionally find so-called typical adjuncts, i.e. modifications that are not required (they are not semantically obligatory) but are characteristic of a given verb (noun, adjective) in the given meaning.

A valency frame lists the valency modifications in the following order: ACT, CPHR, DPHR, PAT, ADDR, ORIG, EFF, BEN, LOC, DIR1, DIR2, DIR3, TWHEN, TFRWH, TTILL, TOWH, TSIN, TFHL, MANN, MEANS, ACMP, EXT, INTT, MAT, APP, CRIT, REG.
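The ordering can be checked mechanically. The following is a minimal sketch, not part of the PDT tooling; the helper names `sort_functors` and `is_in_listed_order` are hypothetical, and the list simply repeats the functors given above.

```python
# Functor order as listed in the text above.
FUNCTOR_ORDER = [
    "ACT", "CPHR", "DPHR", "PAT", "ADDR", "ORIG", "EFF", "BEN", "LOC",
    "DIR1", "DIR2", "DIR3", "TWHEN", "TFRWH", "TTILL", "TOWH", "TSIN",
    "TFHL", "MANN", "MEANS", "ACMP", "EXT", "INTT", "MAT", "APP", "CRIT",
    "REG",
]
RANK = {functor: index for index, functor in enumerate(FUNCTOR_ORDER)}


def sort_functors(functors):
    """Return the functors sorted into the listed order (hypothetical helper)."""
    return sorted(functors, key=RANK.__getitem__)


def is_in_listed_order(functors):
    """True if the functors never step backwards in the listed order."""
    ranks = [RANK[functor] for functor in functors]
    return all(a <= b for a, b in zip(ranks, ranks[1:]))
```

A functor that is not in the list raises a `KeyError`, which is a deliberate choice in this sketch: an unknown label is more likely a typo than a new modification type.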

A valency modification record contains information regarding the functor and surface form of the given modification (see Section 2.2.1, "Specification of the surface form of valency modifications").

The question mark preceding the functor specification indicates optionality; if the question mark is not present, the modification is obligatory.
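As an illustration of the record format described so far, the sketch below splits a frame record into its modifications and reads off the functor, the optionality mark and the raw surface-form string. It is an assumption-laden sketch rather than the official parser: the function name `parse_frame` and the dictionary keys are hypothetical, competing modifications joined by | are not handled, and the surface form is kept as an unanalyzed string.

```python
import re

# One modification: optional "?", a functor, then the surface form in parentheses.
SLOT_RE = re.compile(r"^(\?)?([A-Z][A-Z0-9]*)\((.*)\)$")


def parse_frame(record):
    """Parse a frame record into a list of slot dicts (hypothetical structure)."""
    if record.strip() == "EMPTY":  # a frame with no valency positions
        return []
    slots = []
    for token in record.split():  # modifications are separated by spaces
        match = SLOT_RE.match(token)
        if match is None:
            raise ValueError(f"unrecognized modification record: {token!r}")
        optional_mark, functor, form = match.groups()
        slots.append({
            "functor": functor,
            "obligatory": optional_mark is None,  # no "?" means obligatory
            "form": form,  # surface form, left unparsed here
        })
    return slots
```

For instance, `parse_frame("ACT(.1) PAT(.4) ?ORIG(z+2)")` yields three slots, of which only the last is marked as optional.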

Examples of valency frame records:

Empty valency frames. Valency frames may also be empty, i.e. they may contain no valency positions (valency modifications in the narrow sense). Such a valency frame is specified as EMPTY.

EMPTY is used:

2.2.1. Specification of the surface form of valency modifications

The surface form of a valency modification is the form in which the given modification is represented on the analytical level, i.e. on the level below the tectogrammatical one, where all words contained in the surface form of a sentence are present. The surface form specification contains the following information:

  • the syntactic dependency of a given modification;

  • the requirements as to the part-of-speech characteristics and morphemics of the given modification.

Sometimes, it is necessary to specify the lemma of a preposition, for example.

In PDT, in contrast to the original system of valency frame representation (known from the literature on valency), an enhanced encoding system is used. It enables a uniform treatment of simple cases (like capturing the case requirements independently of the part-of-speech membership of the modification and other properties) as well as of more complex cases (like idiomatic expressions; see Section 2.2.2, "Valency frames of idiomatic expressions (phrasemes) and complex predicates").

Dependency specification. To indicate the dependency, square brackets ( [ ] ) are used; sister nodes are separated by a comma ( , ). The notation is, then:

  • governing-node[dependent-node1,dependent-node2].
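The bracket notation can be read as a small tree. The sketch below is one possible reading, assuming that a node label is everything up to the next bracket or comma; the function names `parse_node` and `parse_dependency` and the `(label, children)` tuple shape are hypothetical.

```python
def parse_node(spec, pos=0):
    """Parse one node starting at spec[pos]; return ((label, children), next_pos)."""
    start = pos
    while pos < len(spec) and spec[pos] not in "[],":
        pos += 1
    label = spec[start:pos]
    children = []
    if pos < len(spec) and spec[pos] == "[":
        pos += 1  # step over "["
        while True:
            child, pos = parse_node(spec, pos)
            children.append(child)
            if pos >= len(spec):
                raise ValueError("missing closing bracket")
            if spec[pos] == ",":  # another sister node follows
                pos += 1
            elif spec[pos] == "]":  # end of this node's dependents
                pos += 1
                break
            else:
                raise ValueError(f"unexpected character at position {pos}")
    return (label, children), pos


def parse_dependency(spec):
    """Parse a complete governing-node[dependent,...] specification."""
    node, pos = parse_node(spec)
    if pos != len(spec):
        raise ValueError(f"trailing text at position {pos}")
    return node
```

With this reading, `parse_dependency("že[.v]")` gives a node `že` with a single dependent `.v`.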

Specification of the part-of-speech membership and morphemic properties. The requirements as to the part of speech and morphemics of individual nodes are encoded in an abbreviated form (using one symbol for each class), introduced after a period or colon, in the following order: part of speech, gender, number, case, degree. (The difference between a period and a colon as a means of separating the lemma and the morphological information is discussed in Section 5, "The notation of valency frames and its semantics".) For example, 4 means that the governing verb requires a modification in the accusative case, and P6 refers to 'locative plural'. If a surface-level category is not specified, the given valency modification may take any value of that category.

Examples of surface-form encoding:

  • nominative: .1

  • accusative: .4

  • adjective in instrumental: .a7

  • possessive pronoun or adjective: .u

  • numeral: .m

  • pronoun: .p

  • infinitive: .f

  • adverb: .d

  • interjection: .i

  • subordinate clause, with any kind of conjunction: j[.v]

  • (asyndetic) content clause (a subordinate clause beginning with a relative pronoun/adverb): .c

  • direct speech: .s

  • feminine: .F

  • singular: .S
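The symbols listed above can be decoded with a simple lookup. The following sketch covers only the symbols attested in this section plus the standard Czech case numbering 1 to 7; the symbol v, other genders and the degree category are deliberately left out because their meaning is not stated here. The function name `decode` and the category labels are hypothetical.

```python
# Only symbols attested in this section; the tables are not exhaustive.
POS = {
    "a": "adjective", "u": "possessive pronoun or adjective", "m": "numeral",
    "p": "pronoun", "f": "infinitive", "d": "adverb", "i": "interjection",
    "c": "content clause", "s": "direct speech",
}
GENDER = {"F": "feminine"}
NUMBER = {"S": "singular", "P": "plural"}
# Standard numbering of the Czech cases.
CASE = {
    "1": "nominative", "2": "genitive", "3": "dative", "4": "accusative",
    "5": "vocative", "6": "locative", "7": "instrumental",
}
TABLES = [("part of speech", POS), ("gender", GENDER),
          ("number", NUMBER), ("case", CASE)]


def decode(spec):
    """Decode e.g. '.a7' or 'P6' into a dict of category values."""
    result = {}
    for symbol in spec.lstrip(".:"):  # drop the leading period or colon
        for category, table in TABLES:
            if symbol in table:
                result[category] = table[symbol]
                break
        else:
            raise ValueError(f"symbol not covered by this sketch: {symbol!r}")
    return result
```

A category that is absent from the result corresponds to an unspecified category, i.e. any value is allowed.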

In some cases, surface-form specifications also include information on the analytical forms (lemmas) of dependent (analytical) nodes that are part of the surface form of the modification in question; these are prepositions, subordinating conjunctions and also dependent parts of idiomatic expressions (see Section 2.2.2, "Valency frames of idiomatic expressions (phrasemes) and complex predicates"). For example, the requirement that a modification have the form of a subordinate clause with the conjunction že is encoded like this: že[.v].

For the sake of simplicity, when specifying what kind of prepositional phrase is required by a given verb, an abbreviated form is used (for the list, see Section 5.5, "Abbreviated forms of realization records"). For example, na+4 is short for: na-1[.4].

Cf. other cases of surface-form specifications:

  • preposition o plus a noun in locative: o+6

  • preposition bez plus a noun in genitive: bez+2

  • complex preposition na rozdíl od plus a noun in genitive: od[na,rozdíl,.2]

  • a subordinate clause with the conjunction aby: aby[.v]
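The abbreviated preposition+case records can be split into their two parts as follows. This is only a sketch: the function name `parse_prepositional_form` is hypothetical, and the expansion into the full bracket notation is not attempted, since the full forms (including lemma indices such as the -1 in na-1) are listed in Section 5.5 rather than here.

```python
# Standard numbering of the Czech cases.
CASE_NAMES = {
    "1": "nominative", "2": "genitive", "3": "dative", "4": "accusative",
    "5": "vocative", "6": "locative", "7": "instrumental",
}


def parse_prepositional_form(form):
    """Split an abbreviated record such as 'o+6' into (preposition, case name)."""
    preposition, plus, case = form.partition("+")
    if not preposition or not plus or case not in CASE_NAMES:
        raise ValueError(f"not an abbreviated preposition+case record: {form!r}")
    return preposition, CASE_NAMES[case]
```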

Surface-form specification contains all surface-form variants of a given modification found in the analyzed data, stylistic variants included. What is not included in the information on the surface form of a modification are the changes in form that result from productive processes (syntactic transformations, e.g. passivization, reciprocity); for a detailed discussion see Section 2.2.3, "Productive changes in the surface form (not specified in the valency frames)".

The surface form of an obligatory adjunct is usually not specified, which means that all usual forms can be used. This is indicated by the star symbol ( * ), which is used instead of an explicit specification of the surface form. With arguments, the surface forms are always specified.

2.2.2. Valency frames of idiomatic expressions (phrasemes) and complex predicates

Idiomatic expressions (see Section 8, "Idioms (phrasemes)") and complex predicates (see Section 9.3, "Complex predicates") represent more complex cases; their dependent parts are included in the valency frames of the relevant head verbs, among other valency modifications (their functor is CPHR or DPHR).

Valency frames of idiomatic expressions. Specifying the surface forms may get rather complicated with idiomatic expressions. When specifying the surface form of the dependent part of an idiomatic expression, it is necessary to capture the following facts: how many parts (words) the dependent part has, what their morphological categories are, and often also the precise lexical content of these parts. A convention has been adopted for representing these requirements.

Examples:

  • the valency frame entry for the idiom jít příkladem (=be an example to sb):

    ACT(.1) DPHR(příklad.S7)

    šla příkladem (=she was an example to sb)

    BEN šla jí příkladem (=she was an example to her)

    MEANS svým chováním (=with her behavior)

  • the valency frame for the idiom: lapat po dechu (=gasp for breath):

    ACT(.1) DPHR(po-1[dech.S6])

    lapat po dechu (=gasp for breath)

  • the valency frame for the idiom běhá mráz po zádech (=approx.: give sb the creeps, the experiencer is in the dative, the source is a PP):

    ACT(.3) DPHR(mráz.S1,po-1[záda:P6])

    mráz mi běhal po zádech (=it gave me the creeps)

Valency frames of complex predicates. All complex predicates (with the CPHR functor) that have the same verb in their verbal part and the nominal part of which may be formed by various synonyms and antonyms are assigned the same valency frame. The surface form of the nominal part of a complex predicate is specified as follows: a list of possible synonyms and antonyms in curly brackets ( { } ) is followed by the list of possible morphemic forms. The list of the synonyms and antonyms (their lemmas) ends with three dots, which indicates that the list is not exhaustive; it only contains the cases collected so far. The rule of argument shifting does not apply here (see Section 2.1.4, "Criteria for determining the type of argument (the principle of shifting)").

Valency frames of complex predicates are discussed in more detail in Section 9.3.3, "Valency frames of complex predicates".

2.2.3. Productive changes in the surface form (not specified in the valency frames)

Surface-form specifications contain all variants found in the analyzed data, with certain exceptions, though. These exceptions are cases when the change (shift) in form is caused by a productive process.

The cases when a surface-form variant is not recorded in the valency lexicon include:

  • passivization.

    A valency frame only specifies those surface forms that occur in active sentences. When a verb is used in its passive form, the surface forms of some of its modifications (these are usually the Actor and Patient) change in a predictable way. These surface forms are not included in the valency frames.

    Example:

    • Stavební firma.ACT postavila dům.PAT (=The building company built a house.)

      Passive: Dům.PAT byl postaven stavební firmou.ACT (=The house.NOM was built by a building company.INSTR)

      The nominative case the Patient gets as a result of passivization is not included in the surface form variants of the argument. Similarly, the instrumental case the Actor gets is not among the possible surface forms of the argument.

      The valency frame of the verb postavit (=build):

      ACT(.1) PAT(.4) ?ORIG(z+2)

    • Stavební firma.ACT staví dům.PAT (=The building company is building a house.)

      Passive: Dům.PAT se staví. (=The house is being built; lit. House REFL builds)

      The nominative case the Patient gets as a result of passivization is not included in the surface form variants of the argument. The presence of the reflexive se is not indicated (as a possibility) in the valency frame either.

      The valency frame of the verb stavět (=build.IMPF):

      ACT(.1) PAT(.4) ?ORIG(z+2)

  • resultative constructions.

    The surface form variants that are the result of a verb occurring in a resultative construction (resultative=res1; see Section 5.14, "The resultative grammateme (resultative aspect)") are not indicated in the valency frame of the verb.

    Example:

    • Otec.ACT pronajal auto sousedovi.ADDR (=Father rented out a car to a neighbour.)

      Resultative: Soused.ADDR má auto pronajato od otce/otcem.ACT (=lit. Neighbour.NOM has car rented from/by Father.)

      The nominative case the Addressee gets as a result of the verb being in the resultative aspect is not included in the surface form variants of the argument. Similarly, the instrumental case the Actor gets (or the PP form od+2) is not among the possible surface forms of the argument.

      The valency frame of the verb pronajmout (=rent out):

      ACT(.1) PAT(.4) ADDR(.3)

  • dispositional modality.

    The surface form variants that are the result of a verb occurring in a construction with the dispositional modality meaning (dispmod=disp1; see Section 5.11, "The dispmod grammateme (dispositional modality)") are not indicated in the valency frame of the verb.

    Example:

    • Žáci.ACT počítají příklady.PAT (=The pupils are doing exercises.)

      Dispositional modality construction: Příklady.PAT se žákům.ACT počítají dobře.MANN (=lit. Examples.NOM REFL pupils.DAT count/do well.)

      The nominative case the Patient gets as a result of being in a construction with the dispositional modality meaning is not included in the surface form variants of the argument. Similarly, the dative case the Actor gets is not among the possible surface forms of the argument. The presence of the reflexive se or the obligatory presence of a manner adverbial are not indicated in the valency frame either.

      The valency frame of the verb počítat (=count):

      ACT(.1) PAT(.4,že[.v],zda[.v],jestli[.v],.v[kolik])

  • forms used for expressing subtle shifts in the meaning of arguments.

    The basic form of an argument (e.g. the nominative for the Actor or accusative for the Patient) may be replaced by another form if a slightly different/more specific meaning (captured by a subfunctor) is to be expressed. These forms are used for a given meaning (subfunctor) regularly, therefore, they are not listed as possible forms of particular valency modifications (in individual valency frames).

    These are the following forms:

    • genitive (of negation, partitive g.).

      Examples:

      Ta vesnice má vodu.PAT (=The village has water.ACC)
      Ta vesnice nemá vody.PAT (=The village doesn't have (any) water.GEN)

      Ubývá voda.ACT (=The water.NOM is disappearing.)
      Ubývá vody.ACT (=The water.GEN is disappearing.)

      Dodal sůl.PAT (=He added salt.ACC)
      Dodal soli.PAT (=He added salt.GEN)

      On má knihy.PAT (=He has books.ACC)
      On má knih.PAT (=He has (lots of) books.GEN)

    • po+6.

      Examples:

      Na každé větvi viselo jablíčko.ACT (=lit. On each branch hung apple.NOM)
      Na každé větvi viselo po jablíčku.ACT (=lit. On each branch hung PO apple.LOC; the distributive meaning made more explicit)

      Dal každému dítěti jablíčko.PAT (=lit. (He) gave each child apple.ACC)
      Dal každému dítěti po jablíčku.PAT (=lit. (He) gave each child PO apple.LOC; the distributivity strengthened)

    • na+4

      Examples:

      Sto.ACT mušek rozžehlo si světla v trávě. (=lit. Hundred.NOM (fire)flies lit REFL lights in grass.)
      Na sta.ACT mušek rozžehlo si světla v trávě. (=lit. NA hundreds.ACC (fire)flies lit REFL lights in grass; quantity emphasized)

      Roznesl stovky.PAT letáků. (=lit. (He) distributed hundreds.ACC leaflets.)
      Roznesl na stovky.PAT letáků. (=lit. (He) distributed NA hundreds.ACC leaflets; quantity emphasized)

    • okolo+2

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Okolo deseti knih.ACT leží na stole. (=lit. About ten.GEN books lie on table; i.e. approximately)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má okolo deseti knih.PAT (=lit. (He) has about ten.GEN books; i.e. approximately)

    • kolem+2

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Kolem deseti knih.ACT leží na stole. (=lit. About ten.GEN books lie on table; i.e. approximately)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má kolem deseti knih.PAT (=lit. (He) has about ten.GEN books; i.e. approximately)

    • nad+4

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Nad deset knih.ACT leží na stole. (=lit. Above ten.ACC books lie on table; i.e. more than)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má nad deset knih.PAT (=lit. (He) has above ten.ACC books; i.e. more than)

    • pod+4

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Pod deset knih.ACT leží na stole. (=lit. Under ten.ACC books lie on table; i.e. less than)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má pod deset knih.PAT (=lit. (He) has under ten.ACC books; i.e. less than)

    • přes+4

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Přes deset knih.ACT leží na stole. (=lit. Over ten.ACC books lie on table; i.e. more than)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má přes deset knih.PAT (=lit. (He) has over ten.ACC books; i.e. more than)

    • k+3

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      K deseti knihám.ACT leží na stole. (=lit. Towards ten.DAT books lie on table; i.e. approximately)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má k deseti knihám.PAT (=lit. (He) has towards ten.DAT books; i.e. approximately)

    • do+2

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Do deseti knih.ACT leží na stole. (=lit. Up_to ten.GEN books lie on table; i.e. maximum)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má do deseti knih.PAT (=lit. (He) has up_to ten.GEN books; i.e. maximum)

    • od+2

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Od deseti knih.ACT leží na stole. (=lit. From ten.GEN books lie on table; i.e. minimum)

      Má deset knih.PAT (=lit. (He) has ten books.)
      Má od deseti knih.PAT (=lit. (He) has from ten.GEN books; i.e. minimum)

    • od+2; (přes+4); do+2 (and other forms used for referring to intervals; see Section 16.2, "Operators")

      Examples:

      Deset knih.ACT leží na stole. (=lit. Ten books lie on table.)
      Od pěti do deseti knih.ACT leží na stole. (=lit. From five.GEN to ten.GEN books lie on table; i.e. an interval is given)

    !!! The presented meanings (partitivity, distributivity, approximation) are going to be represented by subfunctors (assigned to arguments) in a future version of PDT.

  • reciprocity.

    The fact that the sentence has a reciprocal meaning is signalled by the presence of se (mezi sebou, k sobě (=lit. among themselves, to themselves; meaning: with/to/... each other)). These expressions are understood as a formal means of expressing reciprocity; they are not recorded in the valency frames (i.e. in their surface-form specification part). For more details see Section 2.4.2.1, "Valency frames and reciprocity".

    A typical form used for expressing reciprocity is the form mezi+7 (=between/among + instrumental). The form mezi+7 is not included in the list of possible surface forms of an argument; it is a regular way of expressing reciprocity (see also Section 2.4.2.1, "Valency frames and reciprocity").

  • numeral+noun constructions.

    Certain numeral+noun constructions (see Section 10.1.1, "Numerals with the role of an attribute (RSTR)") are analyzed in such a way that the formally dependent noun (in genitive) is understood as the governing node of the construction whereas the formally governing numeral is taken to be the dependent node (i.e. on the tectogrammatical level). If a numeral+noun expression is in a valency position, the surface form of the governing node (of the modification in question) is genitive; however, this genitive form is not listed as a possible surface form of the given valency modification. It is the dependent node that has the appropriate surface form (i.e. listed in the valency frame for the given argument) here.

    Example:

    • Dívky.ACT koupily dětem čokoládu.PAT (=The girls bought the children chocolate.ACC)

      Numeral+noun expressions: Dvě dívky.ACT koupily dětem hodně čokolády.PAT (=Two girls bought the children a lot of chocolate.GEN)

      The genitive form is not included in the list of possible surface forms of the Patient (or Actor etc.).

      The valency frame of the verb koupit (=buy):

      ACT(.1) PAT(.4) ?ADDR(.3,pro+4) ?ORIG(od+2)

  • coordination and apposition.

    If a valency position is occupied by a coordination or apposition (see Section 6, "Parataxis"), in some cases only the form of the first conjunct is recorded in the valency frame. This concerns the following cases in particular:

    • the second conjunct is a relative clause with the connective což (see Section 5.4.1.1, "Constructions with the connectives "což", "přičemž", "načež", "pročež", "začež", "aniž"").

      Example:

      • Obdržel sto.PAT korun, což není.PAT málo. (=He received one hundred crowns, which is not little.)

        The relative clause with the connective což is taken to be a Patient of the verb obdržet (=receive), which is in apposition with sto korun (=one hundred crowns). The list of possible surface-forms of the Patient only contains the form of the first conjunct.

        The valency frame of the verb obdržet (=receive):

        ACT(.1) PAT(.4) ?ORIG(od+2;z+2)

    • appositions with the conjunction jako (see Section 6.2.1.3, "Apposition with the conjunction "jako"").

      Example:

      • Rád hraje skladby.PAT , jako je.PAT ta, co jsme právě slyšeli. (=He likes to play pieces like the one we've just heard.)

        The clause with the conjunction jako is analyzed as a Patient of the verb hrát (=play), which is in apposition with skladby (=pieces). The list of possible surface-forms of the Patient only contains the form of the first conjunct.

        The valency frame of the verb hrát (=play):

        ACT(.1) PAT(.4)

    • coordinations with "atd.", "apod." (see Section 6.2.1.1, "Coordination with "atd.", "apod.", "aj."").

      Example:

      • Koupili jsme papíry.PAT , tužky.PAT atd..PAT (=We bought paper, pencils etc.)

        The abbreviation atd. (=etc.) is analyzed as a Patient of the verb koupit (=buy), which forms a coordination with papíry (=papers) and tužky (=pencils). The list of possible surface-forms of the Patient only contains the form of the first (and the second) conjunct.

        The valency frame of the verb koupit (=buy):

        ACT(.1) PAT(.4) ?ADDR(.3,pro+4) ?ORIG(od+2)

Transformational rules may be applied to the original valency frame - this is a guarantee (or rather, a way of testing) that the verb was assigned a correct valency frame.
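Such rules can be pictured as mappings from a functor to the surface forms it takes under a given productive process. The sketch below encodes only the shifts that can be read off the examples in this section (passive, resultative, dispositional modality); it is illustrative, not the actual set of PDT transformation rules, and the names `SHIFTS` and `apply_shift` are hypothetical.

```python
# Shifts read off the examples in this section; illustrative, not exhaustive.
SHIFTS = {
    "passive": {"PAT": [".1"], "ACT": [".7"]},
    "resultative": {"ADDR": [".1"], "ACT": [".7", "od+2"]},
    "dispositional": {"PAT": [".1"], "ACT": [".3"]},
}


def apply_shift(frame, process):
    """Return a copy of the frame with the surface forms shifted by the process.

    `frame` maps each functor to the list of surface forms recorded in the lexicon.
    """
    shifted = {functor: list(forms) for functor, forms in frame.items()}
    for functor, forms in SHIFTS[process].items():
        if functor in shifted:  # only shift modifications present in the frame
            shifted[functor] = list(forms)
    return shifted
```

Applied to the frame of postavit, `apply_shift({"ACT": [".1"], "PAT": [".4"], "ORIG": ["z+2"]}, "passive")` gives an instrumental Actor and a nominative Patient, while ORIG is left untouched. An annotated passive sentence can then be compared against the shifted frame instead of the recorded one.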

2.2.4. Valency lexicon

Valency frames (assigned to individual meanings of words) are recorded in the valency lexicon. The valency lexicon contains valency frames of semantic verbs, nouns, adjectives and adverbs. Individual valency frames are clustered on the basis of what t-lemma they are related to (for a discussion of t-lemmas see Chapter 4, Tectogrammatical lemma (t-lemma)).

The valency lexicon does not contain t-lemma substitutes (#Colon, #EmpVerb etc.) and t-lemmas of those nodes present at the surface level that are expressed by pronouns (it means that the valency lexicon does not contain t-lemmas like the following: který, jaký (=which, what) etc.). For a discussion of pronouns standing in place of lexical units with subcategorization requirements, see Section 2.4.3.4, "Pronouns in place of words with valency".

The valency lexicon was built up in the course of the annotation; therefore, it includes only those verbs, nouns, adjectives and adverbs (more precisely, only those of their meanings) that occurred in the analyzed data. For example, if a verb has two different valency frames in the lexicon, it means that these two meanings of the verb were found in the analyzed data; however, the given verb may have other meanings (i.e. other valency frames), too.

!!! The current version of the valency lexicon contains:

  • valency frames of all semantic verbs (and verbal idioms) found in the analyzed data.

  • valency frames of those semantic nouns which constitute the nominal part of complex predicates (i.e. those with the CPHR functor), found in the analyzed data.

  • valency frames of those semantic nouns, adjectives and adverbs that have at least one argument as their daughter node, i.e. a node with one of the following functors: ACT, PAT, ADDR, EFF or ORIG.

  • valency frames for non-verbal idioms if the governing node is either a semantic adverb or a semantic noun.

  • valency frames of non-verbal idioms if the governing node is a semantic verbal noun (a noun ending with -ní or -tí). For other nouns to be included in the valency lexicon, they need to meet certain conditions.

The valency lexicon only contains the t-lemmas of those nodes that have the value of the nodetype attribute specified as complex. The t-lemmas of traditional verbs, nouns, adjectives and adverbs the nodetype attribute of which has a value other than complex (according to the rules in Chapter 3, Node types) are not included in the valency lexicon (even if they have argument modifiers).
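The clustering of frames by t-lemma and the argument-based inclusion criterion can be sketched as follows. This is a toy model under stated assumptions, not the actual lexicon format: frames are kept as plain record strings, and the names `build_lexicon` and `has_argument_daughter` are hypothetical.

```python
from collections import defaultdict

# Functors counted as arguments in the inclusion criterion above.
ARGUMENT_FUNCTORS = {"ACT", "PAT", "ADDR", "EFF", "ORIG"}


def has_argument_daughter(daughter_functors):
    """True if at least one daughter node carries an argument functor."""
    return any(functor in ARGUMENT_FUNCTORS for functor in daughter_functors)


def build_lexicon(entries):
    """Cluster (t_lemma, frame_record) pairs by t-lemma, skipping duplicates."""
    lexicon = defaultdict(list)
    for t_lemma, frame in entries:
        if frame not in lexicon[t_lemma]:
            lexicon[t_lemma].append(frame)
    return dict(lexicon)
```

A t-lemma with several meanings attested in the data ends up with several frames in its list; a meaning that never occurred in the data is simply absent.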