1.     Agreement

 

 

Authors:           Jiří Hana
                        Elke Teich
                        Hana Skoumalová
                        Kamenka Staykova
                        Lena Sokolova
                        Michael V. Boldasov


Table of contents of the chapter:

1.1       Introduction

1.1.1    Morphological abbreviations

1.2       Subject-Predicate Agreement

1.2.1    Analysis

1.2.1.1     Compound verbal forms

1.2.1.2     Coordinated subject

1.2.2    Implementation

1.2.2.1     Inflection of the Thing

1.2.2.2     Subject side systems

1.2.2.3     Information passing systems

1.2.2.4     Coordination

1.2.2.5     Language differences

1.2.2.6     Example of generation

1.3       Agreement within nominal group

1.3.1    Analysis

1.3.2    Implementation

1.3.2.1     Status, provenance, …

1.3.2.2     Ordinal numerals

1.3.2.3     Cardinal numerals

1.3.2.4     Deictic

1.3.2.5     Language differences

1.3.2.6     Example of generation

1.4       Agreement of subject and predicative adjective

1.4.1    Analysis

1.4.2    Implementation

1.4.2.1     Preselections on the clause rank

1.4.2.2     Adjective group rank

1.4.2.3     Example of generation

1.4.2.4     Language differences

1.5       Conclusion

1.6       References

1.1        Introduction

Agreement or congruence can be described as two (or more) syntactical units sharing particular grammatical features, e.g., case, number, gender or person. In Czech, Bulgarian and Russian, we can distinguish three kinds of agreement:

1.      Subject – predicate. Present in Bulgarian, Czech and Russian

(1)                        E:      The line disappeared.

Bg:    Линията    е          изчезнала.

L:      line-FSg      is-Sg3 disappeared-FSg

Cz:    Úsečka        zmizela.

L:      Line-FSg     disappeared-FSg3.

2.      Subject – predicative adjective agreement. Present in Bulgarian, Czech and Russian

(2)                        E:      Command is accessible.

Bg:    Командата                  е          достъпна.

L:      command-FSg              is-Sg3 accessible-FSg

Cz:    Příkaz                            je          dostupný.

L:      Command-ISgNom      is-Sg3 accessible-ISgNom.

Ru:    Команда                 доступна.

L:      Command-FSg       is-accessible-FSg.

3.      Agreement within the nominal group. Present in Bulgarian, Czech and Russian

(3)                        E:      Enter the fifth external point.

Bg:    Задайте     петата          външна           точка.

L:      Enter-Pl2    fifth-FSg          external-FSg point-FSg

Cz:    Zadejte        pátý                 externí                   bod.

L:      Enter-Pl2    fifth-ISgAcc    external-ISgAcc  point-ISgAcc

Ru:    Введите     пятую             внешнюю              точку.

L:      Enter-Pl2    fifth-FSgAcc   external-FSgAcc point-FSgAcc

Agreement can also be classified from another perspective, as syntactical or semantic. Although semantic agreement also occurs in the described languages, it is quite rare and not important for our domain; in the following we therefore deal only with syntactical agreement.

1.1.1        Morphological abbreviations

Throughout this chapter, the word-by-word translations into English use the following abbreviations of morphological categories. The categories appear in an abbreviation in the following order: POS[1], Gender, Number, Case and Person. A category is omitted if it is not relevant (e.g. case for a finite verb) or not interesting in the given context. The possible values for each category are the following (not all are present in all languages):

·        POS: Adj (adjective), PastPart (past participle), etc.

·        Gender: M (masculine, in Czech masculine animate), I (masculine inanimate[2]), F (feminine), N (neuter)

·        Number: Sg (singular), Pl (plural)

·        Case[3]: Nom (nominative), Gen (genitive), Dat (dative), Acc (accusative), Voc (vocative), Loc (locative) and Ins (instrumental)

·        Person: 1, 2, 3

Therefore, for example, FSg means feminine singular, Sg3 means singular third person and NSgNom means neuter singular nominative.

 

            Czech                               Russian                        Bulgarian

Gender      M, I, F, N                          M, F, N                        M, F, N

Number      Sg, Pl                              Sg, Pl                         Sg, Pl

Case        Nom, Gen, Dat, Acc, Voc, Loc, Ins   Nom, Gen, Dat, Acc, Loc, Ins   (none)

Person      1, 2, 3                             1, 2, 3                        1, 2, 3

Table 1 – Comparison of morphological features relevant for agreement
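The abbreviation scheme can be made concrete with a small parser. The following is only a sketch under stated assumptions: the function name parse_gloss_tag is hypothetical, the optional POS prefix (Adj, PastPart, ...) is not handled, and only tags written in the order Gender, Number, Case, Person are accepted.

```python
import re

# Category order as stated in the text: Gender, Number, Case, Person.
# Assumption: the optional POS prefix (e.g. Adj, PastPart) is not handled here.
_TAG = re.compile(
    r"(?P<Gender>[MIFN])?"
    r"(?P<Number>Sg|Pl)?"
    r"(?P<Case>Nom|Gen|Dat|Acc|Voc|Loc|Ins)?"
    r"(?P<Person>[123])?"
)


def parse_gloss_tag(tag: str) -> dict:
    """Split a tag such as 'NSgNom' into its morphological categories."""
    match = _TAG.fullmatch(tag)
    if not tag or match is None:
        raise ValueError(f"not a well-formed tag: {tag!r}")
    # Keep only the categories that are actually present in the tag.
    return {name: value for name, value in match.groupdict().items()
            if value is not None}
```

A tag written in another order (for instance 3Sg instead of Sg3) is rejected by this sketch rather than silently reinterpreted.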

1.2        Subject-Predicate Agreement

1.2.1        Analysis

In Czech, Bulgarian and Russian, a predicate[4] usually agrees with its nominative subject in person, number and gender (if applicable). 

(4)                        E:      Command was accessible.

Bg:    Командата           беше         достъпна.

L:      Command-FSg       was-Sg3    accessible-FSg

Cz:    Příkaz                      byl[5]            dostupný.

L:      Command-ISg        was-ISg3  accessible-ISg.

(5)                        E:      Command is accessible.

Bg:    Командата           е                 достъпна.

L:      Command-FSg       is-Sg3        accessible-FSg

Cz:    Příkaz                      je                dostupný.

L:      Command-ISg        is-Sg3        accessible-ISg.

(6)                        E:      The system enables you to create a multiline style …

Bg:    Системата     позволява            да създадете...

L:      System-FSg             enable-Sg3     create-inf.

Cz:    Systém               umožňuje        vytvářet    styly    multičár          ...

L:      System-ISg        enables-Sg3   to-create  styles   of-multilines   ...

This holds even if the subject is realized by a zero pronoun (so-called pro-drop)[6].

(7)                        E:      Enter the distance between ...

Bg:    Задайте     разстоянието    между...

L:      enter-Pl2    distance                between…

Cz:    Zadejte        vzdálenost                   mezi           ...

L:      Enter-Pl2    distance                between          ....

Ru:    Введите     расстояние         между           

L:      Enter-Pl2    distance                between          ....

(8)                        E:      Enter the distance between ...

Bg:    Вие               задайте         разстоянието    между...

L:      you-Pl2       enter-Pl2        distance                between…

Cz:    Vy                 zadejte             vzdálenost                  mezi           ...

L:      You              enter-Pl2        distance                between          ....

If the subject is in a case different from nominative[7] (e.g., in genitive)

(9)                        E:      Five points disappeared.

Cz:    Pět         bodů                      zmizelo.

L:      Five       points-IPlGen             disappeared-NSg3.

or the category of case is inappropriate for the subject (infinitival or sentential subjects)[8].

(10)                     E:      To open a drawing is simple.

Cz:    Otevřít         kresbu                          je   jednoduché.

L:      To-open             drawing-FSgAcc  is   simple.

or if the verb has no subject at all (e.g. meteorological verbs or certain verbs of feeling)[9]

(11)                     E:      It rains.

Bg:    Вали.

L:      Rains-Sg3    

Cz:    Prší.

L:      Rains-Sg3

(12)                     E:      I am cold.

Bg:    Студено     ми              е.

L:      Cold                   I-Dat   is-Sg3

Cz:    Je                 mi        zima.

L:      Is-Sg3          I-Dat   cold

(13)                     E:      The button will be clicked[10].

Cz:    Klepne        se         na        tlačítko.

L:      Click-Sg3   refl             on        button.

then the verb is assigned the default category of gender, number and person, which is neuter, singular and 3rd person.

The number of the predicate is determined by the grammatical number of the subject, regardless of whether the subject denotes a single object or a set of objects.

(14)                     E:      The scissors disappeared.

Cz:    Nůžky                       zmizely.

L:      Scissors-FPlNom    disappeared-FPl3.

1.2.1.1     Compound verbal forms

Compound verbal forms consist of finite forms of an auxiliary verb and nonfinite forms (infinitive, participle) of the meaningful verb. For example, Czech has the following compound verbal forms:

future tense           aux + infinitive                             já budu volat

past tense             aux + past participle[11]                    já jsem volal

present conditional    aux + past participle                        já bych volal

past conditional       present cond. of aux[12] + past participle   já bych byl volal

passive                aux + passive participle                     já jsem volán

For a detailed description of compound verbal forms, see ## Chapter 4. Mood and modality. All these words (except the infinitive) have to agree with the subject in the same way as a finite verb does[13]. The only difference is the set of morphological categories that the word accepts:

Language                    Verbal form                    Gender   Number   Person

Bulgarian, Czech, Russian   finite                         –        +        +

                            participles                    +[14]    +

Bulgarian                   infinitive (da construction)   +[15]    +

 

(15)                     E:      You can save the line.

Bg:    Вие              можете         да запазите              линията.

L:      you-Pl2       can-Pl2           save-Pl2DaConstr    line

1.2.1.2     Coordinated subject

In Czech and Russian, agreement with a coordinated subject is rather complicated. For our domain, we can simplify the problem by assuming that the number of a predicate with a coordinated subject is always plural and that person has to be uniform across the nominal group. For a more detailed description of this problem for Czech see [Bémová 1995].

For Czech, the gender of the predicate is the minimal gender of the participants of the coordination, computed under the following order[16]: m < i < f < n. This also covers the trivial case in which all participants have the same gender. For Bulgarian and Russian, this is not important because gender is not distinguished in the plural.

(16)                     E:      The line and the box were deleted.

Cz:    Úsečka        a          políčko          byly                 smazány.

L:      Line-FSg     and      field-NSg        were-FPl3      deleted-FPl

Ru:    Линия         и          окно                 были               удалены.

L:      Line-FSg     and      field-NSg        were-Pl           deleted-Pl

However, in Czech there is an exception: if all participants are neuter and at least one is singular, the gender of the predicate is feminine[17]:

(17)                     E:      The button and the box were enabled.

Cz:    Tlačítko                   a         políčko     byly                  povolené.[18]

L:      Button-NSg             and      field-NSg  were-FPl3      enabled-FPl

For a more detailed description of agreement in Czech see [Kopečný 1962].
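The simplified rule for Czech coordinated subjects (plural number, minimal gender under m < i < f < n, and the all-neuter exception) can be sketched as follows. This is an illustration only, not part of the KPML resources; the function name and the representation of conjuncts as (gender, number) pairs are assumptions made for the example.

```python
# Order used for the minimal-gender rule in the text: m < i < f < n.
GENDER_RANK = {"M": 0, "I": 1, "F": 2, "N": 3}


def coordinated_predicate_agreement(conjuncts):
    """Return (gender, number) of a Czech predicate with a coordinated subject.

    conjuncts: iterable of (gender, number) pairs, e.g. ("F", "Sg").
    """
    items = list(conjuncts)
    if len(items) < 2:
        raise ValueError("a coordinated subject needs at least two conjuncts")
    genders = [gender for gender, _ in items]
    numbers = [number for _, number in items]
    # Exception: all conjuncts neuter and at least one singular -> feminine.
    if all(g == "N" for g in genders) and "Sg" in numbers:
        gender = "F"
    else:
        # Otherwise take the minimal gender under m < i < f < n.
        gender = min(genders, key=GENDER_RANK.__getitem__)
    # Simplifying assumption from the text: the number is always plural.
    return gender, "Pl"
```

For examples (16) and (17), the calls coordinated_predicate_agreement([("F", "Sg"), ("N", "Sg")]) and coordinated_predicate_agreement([("N", "Sg"), ("N", "Sg")]) both yield feminine plural.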

1.2.2        Implementation

The main problem with the implementation is that the number and especially the gender of the subject are not yet known at the point where Finite can be inflectified, and Finite can no longer be inflectified once they are known. Therefore, it is necessary to use the agreement operator (in simplified notation, =). To make the treatment consistent, we use the same mechanism for person as well.[19]

The second problem is that we do not know how to implement a linguistically plausible notion of default values for Finite. Therefore, we handle this case in the same way as normal agreement: the values are determined in the subject side systems and passed to the predicate by agreement operators. This does not work for sentences without a subject; however, such sentences do not occur in our domain.

Each language uses only the systems that it needs (Bulgarian omits the systems dealing with case, Bulgarian and Russian omit inanimate gender, etc.).

1.2.2.1     Inflection of the Thing

The agreement systems all depend heavily on the inflectional properties of the noun or pronoun realizing the subject (even if the pronoun is not inserted). Therefore, we present the features for these properties first:

For case:

      Thing-Case-<C>
         where <C> ∈ {Nom, Gen, Dat, Acc, Voc, Loc, Ins}

For gender:

      Thing-Gender-<G>
         where <G> ∈ {M, I, F, N}

For number:

      Thing-Number-<N>
         where <N> ∈ {Sg, Pl}

Not all of these properties are present in all languages, and the properties for one category need not be in one system.

1.2.2.2     Subject side systems

These systems determine the categories of the predicate depending on the categories of the subject. We can distinguish two cases: the predicate does (SVAgreement) or does not (SVNoAgreement) agree with its subject. It does when the subject is in the nominative; it does not otherwise (genitive subject[20]):

SVAgreement(Thing-Case-Nom)
   [SVAgreement]

SVNoAgreement(Thing-Case-Gen)
   [SVNoAgreement]

Systems determining gender of the predicate (neuter is default):

Subj-Agr-Gender-<G> (Thing-Gender-<G> & SVAgreement)
   [Subj-Agr-Gender-<G>]
   where <G> ∈ {M, I, F}

Subj-Agr-Gender-N (Thing-Gender-N or SVNoAgreement)
   [Subj-Agr-Gender-N]

Systems determining number of the predicate (singular is default):

Subj-Agr-Number-Sg (Thing-Number-Sg or SVNoAgreement)
   [Subj-Agr-Number-Sg]

Subj-Agr-Number-Pl (Thing-Number-Pl & SVAgreement)
   [Subj-Agr-Number-Pl]

Systems determining person of the predicate (3rd person is default):

Subj-Agr-Person-<P> (Pronoun-Person-<P> & SVAgreement)
   [Subj-Agr-Person-<P>]
   where <P> ∈ {1, 2}

Subj-Agr-Person-3
   (Pronoun-Person-3 or nominal-term-resolution or SVNoAgreement)
   [Subj-Agr-Person-3]
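The combined effect of the subject side systems can be sketched outside KPML as a single function. This is a minimal illustration, assuming that the subject's features are given as plain strings; the function name is hypothetical, and cases other than nominative and genitive are rejected because the systems above do not cover them.

```python
def subject_side_features(case, gender, number, person="3"):
    """Return the set of features entered by the subject side systems.

    case, gender, number: inflectional properties of the Thing.
    person: "1" or "2" for pronouns; "3" for third-person pronouns and nouns.
    """
    if case == "Nom":
        # SVAgreement: the predicate copies gender, number and person.
        return {
            "SVAgreement",
            f"Subj-Agr-Gender-{gender}",
            f"Subj-Agr-Number-{number}",
            f"Subj-Agr-Person-{person}",
        }
    if case == "Gen":
        # SVNoAgreement: defaults are neuter, singular, 3rd person.
        return {
            "SVNoAgreement",
            "Subj-Agr-Gender-N",
            "Subj-Agr-Number-Sg",
            "Subj-Agr-Person-3",
        }
    raise ValueError(f"subject case not covered by the systems: {case!r}")
```

For the genitive subject of example (9), subject_side_features("Gen", "I", "Pl") returns the default neuter singular third-person features, matching the gloss NSg3 on the verb.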

1.2.2.3     Information passing systems

These systems are used to pass information determined by subject systems to appropriate words of predicate.

System passing information to finite:

SUBJECT-FINITE-AGREEMENT (Finite-Inserted & Subject-Inserted)
   [Subject-Finite-Agreement]
      (Subject = Finite
            (Subj-Agr-Number-Sg ~ :::Number-Sg-Form)
            (Subj-Agr-Number-Pl ~ :::Number-Pl-Form)

            (Subj-Agr-Person-1 ~ :::Person-1-Form)
            (Subj-Agr-Person-2 ~ :::Person-2-Form)
            (Subj-Agr-Person-3 ~ :::Person-3-Form))

This system ensures that when the subject side systems determine the number and person of the predicate (i.e. a feature Subj-Agr-*-* is entered), Finite is inflectified appropriately.

 

System passing information to past or passive participles:

Subject-AuxStem-Agreement
   ( (Past-Participle-Inserted | Participle-Passive) &
      Subject-Inserted)
   [Subject-AuxStem-Agreement]
      (Subject = AuxStem
            (Subj-Agr-Number-Sg ~ :::Number-Sg-Form)
            (Subj-Agr-Number-Pl ~ :::Number-Pl-Form)

            (Subj-Agr-Gender-M ~ :::Gender-M-Form)
            (Subj-Agr-Gender-I ~ :::Gender-I-Form)
            (Subj-Agr-Gender-F ~ :::Gender-F-Form)
            (Subj-Agr-Gender-N ~ :::Gender-N-Form))

This system ensures that when the subject side systems determine the number and gender of the predicate, the participle is inflectified appropriately.
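The two information passing systems can be read as a mapping from Subj-Agr-* features to inflection features on Finite and AuxStem. The sketch below illustrates that mapping only; the function name and the boolean flag has_participle are assumptions for the example, not KPML constructs.

```python
PREFIX = "Subj-Agr-"


def pass_to_predicate(subject_features, has_participle):
    """Map Subj-Agr-* features to inflection features of Finite and AuxStem."""
    finite, auxstem = set(), set()
    for feature in subject_features:
        if not feature.startswith(PREFIX):
            continue  # e.g. SVAgreement itself is not passed on
        category, value = feature[len(PREFIX):].split("-")
        form = f"{category}-{value}-Form"
        # Finite is inflectified for number and person.
        if category in ("Number", "Person"):
            finite.add(form)
        # A past or passive participle is inflectified for number and gender.
        if has_participle and category in ("Number", "Gender"):
            auxstem.add(form)
    return {"Finite": finite, "AuxStem": auxstem}
```

Gender never reaches Finite and person never reaches the participle, which mirrors the table of accepted categories in 1.2.1.1.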

The Bulgarian resources use a similar system to ensure agreement of the infinitive (da construction); it links the da construction to Finite by agreement:

AUXSTEM-INSERT (Modal | P-Future | Da-phase)
   [Auxstem-Inserted]
      + Auxstem
      + Da-particle
      ! Da-particle da
      ^ Da-particle ^ Auxstem
      (Finite = AuxStem
      (Person-First-Form  ~ :::Person-First-Form)
      (Person-Second-Form ~ :::Person-Second-Form)
      (Person-Third-Form  ~ :::Person-Third-Form)
      (Number-Sg-Form ~ :::Number-Sg-Form)
      (Number-Pl-Form ~ :::Number-Pl-Form))

1.2.2.4     Coordination

Even if we simplify the problem by assuming that the number of a predicate with a coordinated subject is always plural and that person has to be uniform across the nominal group, it still remains to determine the gender of the predicate and then to inflectify it appropriately[21]. The former is possible by determining the minimal value of gender, comparing two adjacent members of the coordinated subject at a time. However, the latter seems to be impossible in the current version of KPML: we need to pass the information up (as with gender and person for a simple subject), but we need to pass it across more than one rank, and therefore we cannot use the agreement operator.

As a (very inelegant) solution, we apply the feminine gender to the predicate by default, for the following reasons:

  1. In the plural, the feminine forms of participles etc. are identical to the masculine inanimate forms
  2. Feminine and masculine inanimate forms are the most probable in texts of the CAD/CAM domain
  3. In our corpus, there are no coordinated subjects

1.2.2.5     Language differences

1.2.2.5.1     Czech

The implementation described above (except the system AUXSTEM-INSERT for Bulgarian) describes Czech, because Czech has all the morphological features present in the other two languages.

1.2.2.5.2     Bulgarian

Most of the differences in Bulgarian follow from the fact that Bulgarian does not have cases. In Bulgarian, the predicate always agrees with the subject, so some of the subject side systems described above (in particular, the fork for applying the default no-agreement values, SVNoAgreement) are not needed. The predicate in Bulgarian can contain a "da-construction"; the system dealing with agreement between the subject and the "da-construction" in person and number is presented above.

1.2.2.5.3     Russian

The Russian resources model subject – predicate agreement as well as agreement within the nominal group. Russian is very similar to Czech: both keep their inflectional character. With respect to agreement, Russian is the same as Czech (except that it does not have the vocative and the masculine inanimate gender). The implementation is similar to the Czech implementation, so we do not consider the technical details here. The resulting structure graph, with the grammatical form for the 2nd person plural indicative, is shown in Figure 2. Figure 3 shows a passive construction with the zero auxiliary verb and its grammatical characteristics, together with the agreement in gender with the passive participle. The zero auxiliary is a very specific feature of Russian, which influences in particular the non-pro-drop character of the language, in contrast to Bulgarian and Czech.

1.2.2.6     Example of generation

Figure 1 depicts the structure graph of the following sentence:

(18)                     E:      You enter the command.

Cz:    Vy                 zadáte       příkaz.

L:      You-MPl2   enter-Pl2  command

From the inflection features (in boxes), you can see that Finite has the same number and person as the subject.

Figure 1 – Subject – predicate agreement (Cz)

Figure 2 depicts the structure graph of the following sentence:

(19)                     E:      You draw an arc.

Ru:    Вы                нарисуете     дугу

L:      You-MPl2   draw-Pl2        arc

Figure 2 – Subject – predicate agreement (Ru)

Figure 3 – Passive construction with zero auxiliary verb agreement (Ru)

1.3        Agreement within nominal group

1.3.1        Analysis

Within the nominal group, there is agreement between the head (pro)noun (Thing) and premodifiers, i.e., deictics and qualities, such as Status, Provenance, Age, Size and Colour.

1.3.2        Implementation

Each language uses only the systems that it needs (Bulgarian omits the systems dealing with case, Bulgarian and Russian omit inanimate gender, etc.).

For a discussion of the implementation of agreement with a coordinated subject see 1.2.2.4.

1.3.2.1     Status, provenance, …

1.3.2.1.1     Higher rank – Nominal group

In the same way as Nigel does, we distinguish five types of qualities: Status, Provenance, Age, Size and Colour. The systems accounting for the types of possible qualities take the following form:

<X>-MODIFICATION (Nominal)
    [<X>-Modified]
      ! <X>
      <X>:Adjectival-group
      <X>:Congruent
   [Not-<X>-Modified]
   Chooser   <X>-Modification-Chooser
   where <X> ∈ {Status, Provenance, Age, Size, Colour}

As an example, we show the system for Status (<X> = Status):

Status-MODIFICATION (Nominal)
    [Status-Modified]
      ! Status
      Status:Adjectival-group
      Status:Congruent
   [Not-Status-Modified]
   Chooser   Status-Modification-Chooser

Preselecting <X> as Congruent ensures that on the lower rank (the adjectival group rank) it is known whether the adjectival group should agree (be congruent) with its head.

Inflection of the adjectival group is driven by preselections in systems described by the following template:

<X>-<C>-<V>-PR (Thing-<C>-<V> & <X>-Modified)
   [<X>-<C>-<V>-Pr] <X>::Quality-<C>-<V>
   where
      <X> ∈ {Status, Age, Provenance, Size, Colour}
      <C> ∈ {Case, Gender, Number}
      <V> ∈ {Nom, Gen, Dat, Acc, Voc, Loc, Ins} for <C> = Case
      <V> ∈ {M, I, F, N} for <C> = Gender
      <V> ∈ {Sg, Pl} for <C> = Number

Therefore, there are 5*(7+4+2) = 65 systems. If we added more complicated cases of agreement (e.g. dual number), there would be many more systems. Unfortunately, it is not easily possible to generate all of these systems from a template similar to the one shown. An example of a system described by the template:

STATUS-NUMBER-PL-PR (Thing-Number-Pl & Status-Modified)
   [Status-Number-Pl-Pr] Status::Quality-Number-Pl
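The count of 65 can be checked by expanding the template mechanically. The following sketch does this outside KPML; it is only an assumption that such a script could be used to emit the system definitions as text, and the function name is hypothetical.

```python
QUALITIES = ["Status", "Provenance", "Age", "Size", "Colour"]
CATEGORIES = {
    "Case": ["Nom", "Gen", "Dat", "Acc", "Voc", "Loc", "Ins"],
    "Gender": ["M", "I", "F", "N"],
    "Number": ["Sg", "Pl"],
}


def expand_preselection_systems():
    """Instantiate the <X>-<C>-<V>-PR template for every combination."""
    systems = []
    for x in QUALITIES:
        for c, values in CATEGORIES.items():
            for v in values:
                name = f"{x}-{c}-{v}-PR".upper()
                systems.append(
                    f"{name} (Thing-{c}-{v} & {x}-Modified)\n"
                    f"   [{x}-{c}-{v}-Pr] {x}::Quality-{c}-{v}"
                )
    return systems
```

The length of the returned list is 5*(7+4+2) = 65, and the entry for Status, Number and Pl reproduces the system STATUS-NUMBER-PL-PR shown above.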

1.3.2.1.2     Lower rank – Adjectival group

CONGRUENT-FORK (Adjectival-Group)
   [Congruent]
   [Not-Congruent]
   :Chooser  Under-Status-Chooser

Inflection of Quality is realized by the following three systems:

QUALITY-CASE (Congruent)
   [Quality-Case-Nom] Quality:::Case-Nom-Form
   [Quality-Case-Gen] Quality:::Case-Gen-Form
   [Quality-Case-Dat] Quality:::Case-Dat-Form
   [Quality-Case-Acc] Quality:::Case-Acc-Form
   [Quality-Case-Voc] Quality:::Case-Voc-Form
   [Quality-Case-Loc] Quality:::Case-Loc-Form
   [Quality-Case-Ins] Quality:::Case-Ins-Form

QUALITY-GENDER (Congruent)
   [Quality-Gender-M] Quality:::Gender-M-Form
   [Quality-Gender-I] Quality:::Gender-I-Form
   [Quality-Gender-F] Quality:::Gender-F-Form
   [Quality-Gender-N] Quality:::Gender-N-Form

QUALITY-NUMBER (Congruent)
   [Quality-Number-Sg] Quality:::Number-Sg-Form
   [Quality-Number-Pl] Quality:::Number-Pl-Form
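The interplay of the two ranks (preselection on the nominal group rank, inflection on the adjectival group rank) can be sketched as two small functions. This is an illustration under the assumption that the features of the Thing are given as a dictionary; both function names are hypothetical.

```python
def preselect_quality(thing):
    """Higher rank: turn the Thing's properties into Quality preselections.

    thing: mapping such as {"Case": "Acc", "Gender": "I", "Number": "Sg"}.
    """
    return {f"Quality-{category}-{value}" for category, value in thing.items()}


def inflect_quality(preselections, congruent=True):
    """Lower rank: map preselected Quality features to inflection features."""
    if not congruent:
        # A non-congruent adjectival group does not enter the QUALITY-* systems.
        return set()
    prefix = "Quality-"
    return {f"{feature[len(prefix):]}-Form" for feature in preselections}
```

For a masculine inanimate singular accusative Thing, the two steps together yield Case-Acc-Form, Gender-I-Form and Number-Sg-Form on Quality.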

1.3.2.2     Ordinal numerals

1.3.2.2.1     Higher rank – nominal group

Preselections on this rank drive the inflections on the lower rank.

Numerative-<C>-<V>-PR (Thing-<C>-<V> & Numerified)
   [Numerative-<C>-<V>-Pr] Numerative:Temperer-<C>-<V>
   where
      <C> ∈ {Case, Gender, Number}
      <V> ∈ {Nom, Gen, Dat, Acc, Voc, Loc, Ins} for <C> = Case
      <V> ∈ {M, I, F, N} for <C> = Gender
      <V> ∈ {Sg, Pl} for <C> = Number

1.3.2.2.2     Lower rank

The following systems inflectify the ordinal numeral depending on the preselections from the higher rank.

ORDINAL-CASE (Congruent)
   [Ordinal-Case-Nom] Ordinal:::Case-Nom-Form
   [Ordinal-Case-Gen] Ordinal:::Case-Gen-Form
   [Ordinal-Case-Dat] Ordinal:::Case-Dat-Form
   [Ordinal-Case-Acc] Ordinal:::Case-Acc-Form
   [Ordinal-Case-Voc] Ordinal:::Case-Voc-Form
   [Ordinal-Case-Loc] Ordinal:::Case-Loc-Form
   [Ordinal-Case-Ins] Ordinal:::Case-Ins-Form

ORDINAL-GENDER (Congruent)
   [Ordinal-Gender-M] Ordinal:::Gender-M-Form
   [Ordinal-Gender-I] Ordinal:::Gender-I-Form
   [Ordinal-Gender-F] Ordinal:::Gender-F-Form
   [Ordinal-Gender-N] Ordinal:::Gender-N-Form

ORDINAL-NUMBER (Congruent)
   [Ordinal-Number-Sg] Ordinal:::Number-Sg-Form
   [Ordinal-Number-Pl] Ordinal:::Number-Pl-Form

1.3.2.3     Cardinal numerals

1.3.2.3.1     Higher rank – nominal group

Preselections on this rank drive the inflections on the lower rank.

Numerative-<C>-<V>-PR (Thing-<C>-<V> & Numerified)
   [Numerative-<C>-<V>-Pr] Numerative:Temperer-<C>-<V>
   where
      <C> ∈ {Case, Gender}
      <V> ∈ {Nom, Gen, Dat, Acc, Voc, Loc, Ins} for <C> = Case
      <V> ∈ {M, I, F, N} for <C> = Gender

1.3.2.3.2     Lower rank

The following systems inflectify the cardinal numeral depending on the preselections from the higher rank.

TEMPERER-CASE (Simplex-Cardinal)
   [Temperer-Case-Nom] Temperer:::Case-Nom-Form
   [Temperer-Case-Gen] Temperer:::Case-Gen-Form
   [Temperer-Case-Dat] Temperer:::Case-Dat-Form
   [Temperer-Case-Acc] Temperer:::Case-Acc-Form
   [Temperer-Case-Voc] Temperer:::Case-Voc-Form
   [Temperer-Case-Loc] Temperer:::Case-Loc-Form
   [Temperer-Case-Ins] Temperer:::Case-Ins-Form

TEMPERER-GENDER (Simplex-Cardinal)
   [Temperer-Gender-M] Temperer:::Gender-M-Form
   [Temperer-Gender-I] Temperer:::Gender-I-Form
   [Temperer-Gender-F] Temperer:::Gender-F-Form
   [Temperer-Gender-N] Temperer:::Gender-N-Form

TEMPERER-NUMBER (Simplex-Cardinal)
   [Temperer-Number-Sg] Temperer:::Number-Sg-Form

1.3.2.4     Deictic

The main difference between the deictic and the previous elements is that the deictic does not have its own rank; it is on the same level as the Thing.

Det-<C>-<V> (Thing-<C>-<V> & Explicit-Deictic)
   [Det-<C>-<V>-Pr] Deictic:::<C>-<V>-Form
   where
      <C> ∈ {Case, Gender, Number}
      <V> ∈ {Nom, Gen, Dat, Acc, Voc, Loc, Ins} for <C> = Case
      <V> ∈ {M, I, F, N} for <C> = Gender
      <V> ∈ {Sg, Pl} for <C> = Number

1.3.2.5     Language differences

1.3.2.5.1     Czech

The implementation described above describes Czech, because Czech has all the morphological features present in the other two languages.

1.3.2.5.2     Russian

In agreement within the nominal group, Russian is the same as Czech (except that it does not have the vocative and the masculine inanimate gender). The implementation is similar to the Czech implementation, so we do not consider the technical details here.

1.3.2.5.3     Bulgarian

The same is true for Bulgarian (except that it does not have cases and the masculine inanimate gender). Bulgarian also treats deictics differently.

In Bulgarian, the Deictic of the nominal group is realized as a function of the whole nominal group, so the scheme of preselections on the nominal group rank and inflections on the lower rank is kept here.

In Bulgarian, when the Deictic of the nominal group is SPECIFIC, DEMONSTRATIVE and NONSELECTIVE (in Nigel terms), which is analogous to the English Deictic "the", it is realized as a morphological marker by the morphological module. This marker (the Deictic) can be carried by different elements of the nominal group (Numerative, Quality, Thing). When the Thing is inflectified, the following system is used:

NOMINATIVE-NONSELECTIVE-NOUN
   (Nonselective & Nominative &
   Not-Status-Modified & Not-Colour-Modified & Not-Age-Modified &
   Not-Size-Modified & No-Post-Deictic )
   [Full-Article]  Thing:::Definite-Word-FA

When the Deictic is expressed by an element of the adjectival group, we use the system shown below to transform the Deictic function into a preselection of the adjectival group:

ADJECTIVAL-GR-DETERMINATION-FA
   (Nominative & Nonselective &
      (Status-Modified | Colour-Modified | Age-Modified |
      Size-Modified | Post-Deictic-Modified))
   [Full-Article-AG]  AG-Deictic:FA-Determination

Further, the feature FA-Determination (full-article determination) is associated with a particular element of the adjectival group by the realization statements of the next system:

ADJECTIVAL-GR-ARTICLE-REALIZATION (Adjectival-Group)
   [FA-Determination] 
      Quality:::Definite-Word-FA
      Numerative:::Definite-Word-FA
      Ordinal:::Definite-Word-FA

The same mechanism is used for the NONSPECIFIC, NONSELECTIVE, SINGULAR Deictic, which in Bulgarian is a morphological marker corresponding to the English Deictic "a(n)".

All other types of deictics in the nominal group (specific and non-specific) have the feature Explicit-Deictic; their agreement with the Thing in gender and number is handled by systems of the type DET-<C>-<V>.
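The choice between the two Bulgarian full-article systems (NOMINATIVE-NONSELECTIVE-NOUN and ADJECTIVAL-GR-DETERMINATION-FA) can be sketched as a decision on a set of features. This is a rough illustration only: the function name is hypothetical, and the absence of a modification feature is assumed to stand for the corresponding Not-...-Modified feature.

```python
# Modification features named in ADJECTIVAL-GR-DETERMINATION-FA.
MODIFIER_FEATURES = {
    "Status-Modified",
    "Colour-Modified",
    "Age-Modified",
    "Size-Modified",
    "Post-Deictic-Modified",
}


def full_article_carrier(features):
    """Decide which element receives Definite-Word-FA, or None if neither."""
    if not {"Nominative", "Nonselective"} <= features:
        return None  # neither full-article system is entered
    if features & MODIFIER_FEATURES:
        # Some premodifier is present: the adjectival group is preselected.
        return "Adjectival-Group"
    # No premodifier: the Thing itself carries the article.
    return "Thing"
```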

 

1.3.2.6     Example of generation

Figure 4 depicts the structure graph of the following sentence:

(20)                     E:      Enter the fifth external point.

Cz:    Zadejte        pátý                 externí                   bod.

L:      Enter-Pl2    fifth-ISgAcc    external-ISgAcc  point-ISgAcc

From the inflection features (in boxes), you can see that Ordinal under Ordinator and Quality under Status have the same gender (masculine inanimate, gender-i-form), number (singular, number-sg-form) and case (accusative, case-acc-form) as Thing. This is ensured by the preselections marked by ellipses.

 

Figure 4 - Agreement within nominal group

1.4        Agreement of subject and predicative adjective

1.4.1        Analysis

From a certain point of view, agreement with the predicative adjective is a mixture of subject-verb agreement and agreement within the nominal group. The predicative adjective agrees with the subject in gender, number and case[22] (only nominative or genitive are possible).

(21)                     E:      The command is accessible.

Bg:    Командата                  е                 достъпна.

L:      Command-FSg             is-Sg3        accessible-FSg

Cz:    Příkaz                            je                dostupný.

L:      Command-ISgNom      is-Sg3        accessible-ISgNom.

(22)                     E:      Lines are visible.

Bg:    Линиите          са               видими.

L:      Line-Pl              are-Pl3     visible-Pl

Cz:    Úsečky               jsou            viditelné.

L:      Line-FPlNom   are-Pl3     visible-FPlNom.

(23)                     E:      Five lines are visible.

Cz:    Pět               úseček                          je         viditelných.

L:      Five-Nom    line-FPlGen[23]             is-Sg3 visible-FPlGen.

1.4.2        Implementation

The predicative adjective in Nigel is realized as Quality under Attribute (see Figure 5).

1.4.2.1     Preselections on the clause rank

Preselection of Attribute is performed by the following system. The agreement operator ensures that Attribute is preselected[24] for gender, number and case if the appropriate feature is entered in Subject.

SUBJECT-PREDICATIVEADJ-AGREEMENT (Ascriptive & Subject-Inserted)
   [Subject-Predicativeadj-Agreement]
      (Subject ~ Attribute
         (Thing-Gender-M = Quality-Gender-M)
         (Thing-Gender-I = Quality-Gender-I)
         (Thing-Gender-F = Quality-Gender-F)
         (Thing-Gender-N = Quality-Gender-N)

         (Thing-Number-Sg = Quality-Number-Sg)
         (Thing-Number-Pl = Quality-Number-Pl)

         (Thing-Case-Nom = Quality-Case-Nom)
         (Thing-Case-Gen = Quality-Case-Gen))

The current version of KPML (3.0) does not show preselections done by the agreement operator in the structure graph (cf. Figure 5).

1.4.2.2     Adjective group rank

The inflection of Quality inserted under Attribute is done by the systems QUALITY-CASE, QUALITY-GENDER and QUALITY-NUMBER, described in section 1.3.2.1.2 above.

1.4.2.3     Example of generation

Figure 5 depicts the structure graph of the following sentence:

(24)                     E:      Commands are accessible.

Cz:    Příkazy              jsou            dostupné.

L:      Commands        are-Pl3     accessible-IPl.

You can see that Quality under Attribute has the same gender and number as the subject (masculine inanimate, gender-i-form, and plural, number-pl-form). Preselections done by the agreement operator are not displayed in the structure graph.

Figure 5 - Agreement of subject and predicative adjective

1.4.2.4     Language differences

1.4.2.4.1     Czech

The implementation described above describes Czech, because Czech has all the morphological features present in the other two languages.

1.4.2.4.2     Bulgarian

In the system SUBJECT-PREDICATIVEADJ-AGREEMENT, Bulgarian omits the lines responsible for agreement in inanimate gender and case:

SUBJECT-PREDICATIVEADJ-AGREEMENT (Ascriptive & Subject-Inserted)
   [Subject-Predicativeadj-Agreement]
      (Subject ~ Attribute
         (Thing-Gender-M = Quality-Gender-M)
         (Thing-Gender-F = Quality-Gender-F)
         (Thing-Gender-N = Quality-Gender-N)

         (Thing-Number-Sg = Quality-Number-Sg)
         (Thing-Number-Pl = Quality-Number-Pl))
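The Czech and Bulgarian variants of SUBJECT-PREDICATIVEADJ-AGREEMENT differ only in which feature pairs they list. The sketch below makes this explicit; it assumes that the last two pairs of the Czech system are case features (nominative and genitive), and the table name and function name are hypothetical.

```python
# Feature values covered by the agreement pairs in each language variant.
AGREEMENT_PAIRS = {
    "Czech": {
        "Gender": ["M", "I", "F", "N"],
        "Number": ["Sg", "Pl"],
        "Case": ["Nom", "Gen"],  # assumption: the last two pairs are case pairs
    },
    "Bulgarian": {
        "Gender": ["M", "F", "N"],
        "Number": ["Sg", "Pl"],
    },
}


def preselect_attribute(thing_features, language="Czech"):
    """Return the Quality-* features preselected on Attribute."""
    preselected = set()
    for category, values in AGREEMENT_PAIRS[language].items():
        for value in values:
            # A pair fires only if the matching Thing feature was entered.
            if f"Thing-{category}-{value}" in thing_features:
                preselected.add(f"Quality-{category}-{value}")
    return preselected
```

For the subject of example (23), a feminine plural genitive Thing, the Czech variant preselects Quality-Gender-F, Quality-Number-Pl and Quality-Case-Gen.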

1.5        Conclusion

Agreement in Bulgarian, Czech and Russian is more complicated than the similar phenomenon in English. It is driven mostly by syntactical properties of the agreeing units. The implementation described above covers all agreement necessary for the final corpora of Agile; moreover, it implements many cases not covered by the corpora. The modularity and overall design of the systems allow easy extension to more special cases in the future. The only exception is agreement with a coordinated subject, which cannot be fully covered in the current version of KPML.

1.6        References

Kopečný František (1962): Základy české skladby, SPN Praha

Bémová A. et al. (1995): Linguistic problems of Czech, Project Peco 2924, Charles University Prague



[1] Sometimes more detailed than the classical division into 9 or 10 POS categories, e.g. PastPart (past participle). This category is also omitted if it is the same as for the English word.

[2] Present only in Czech

[3] Present only in Czech and Russian

[4] By that we mean finite verb for simple verbal forms and all parts of compound verbal forms (See 1.2.1.1 for more details)

[5] It is in fact past participle. See 1.2.1.1 for more details

[6] In Czech and Bulgarian (and in Russian in the imperative), an unstressed subject is often realized as a zero pronoun (or, from a different perspective, the personal pronoun is omitted on the surface level). This is true in both the indicative and the imperative. If the pronominal subject is to be stressed, the personal pronoun must be expressed explicitly.

[7] This is present only in Czech and Russian

[8] Currently not present in our domain.

[9] Not present in our domain.

[10] “na tlačítko” is an adjunct in Czech and “klepnout” is an intransitive verb; therefore, when the clause is transformed into the reflexive passive, there is no subject.

[11] In Czech, the auxiliary verb is not present in the third person

[12] That means: be + past part. of be

[13] For Bulgarian, it seems more natural to say that only Finite agrees with the subject and the other parts (infinitive, participle) agree with Finite.

[14] Russian and Bulgarian do not distinguish gender of past participles in plural.

[15] The simple da construction does not distinguish gender.

[16] E.g.

Subject   Finite verb   Why

m+m       m             m is the only thing to select

m+f       m             m < f

m+f+f     m             m < f

m+f+n     m             m < f & m < n

m+n       m             m < n

f+n       f             f < n

f+i       i             i < f

Plural verbal and adjectival forms for feminine (f) and masculine inanimate (i) are the same, therefore it does not matter if we consider i to be smaller than f or vice versa.

[17] Just to make things look more complicated (obě in the second clause has to be neuter, therefore the second verb also has to be neuter):

E:      The button and the box were not enabled and both disappeared.

Cz:    Tlačítko                   a          políčko          nebyly                    povolené        a        obě                  zmizela.

L:      Button-NSg             and      field-NSg        not-were-FPl3     enabled-FPl  and      both-NPl        disappeared-NPl.

[18] This does not mean that the feminine and neuter plural forms of verbs are the same. The verb really is in the feminine form. The (incorrect) sentence with the verb in the neuter plural would look like:

Cz:*  Tlačítko                   a          políčko           nebyla                   povolená.

L:      Button-NSg             and      field-NSg        not-were-NPl3     enabled-NPl

[19] Even for person there are some cases when the person of predicate is different from semantically derived person of subject:

E:      Five of you came.

Cz:    Pět         vás                         přišlo.

L:      Five       you-PlGen2    came-NSg3

[20] There are no infinitives or clauses in subject position in our domain. However, in the future the appropriate feature can simply be added to SVNoAgreement after Thing-Case-Gen.

[21] This is necessary only for Czech; Bulgarian and Russian do not distinguish gender in the plural.

[22] In Czech and Russian, not in Bulgarian.

[23] Genitive instead of nominative is required by the numerals higher than four. See [Chapter ##10. Quantification]

[24] Keyword preselection is omitted in agreement operator: (A = B) in fact means (A = (:B))