Amongst proper nouns and titles we include:
names of persons.
For example: Anička (=Annie), Božena Němcová (=Božena Němcová), Sněhurka (=Snow White), Novákovi (=The Nováks).
identification of nationalities, groups and residents.
For example: Čech (=Czech), Pražan (=resident of Prague), Přemyslovec (=member of the Premyslid dynasty).
names of animals.
For example: Vořech (=Mongrel), Zrzečka (=Ginger (red squirrel in a children's story)), Pú (=Pooh Bear).
geographical names.
For example: Jupiter (=Jupiter), Evropa (=Europe), Balkánský poloostrov (=Balkan Peninsula), Máchovo jezero (=Lake Mácha), řeka Svatého Vavřince (=St. Lawrence River), Hradec Králové (=Hradec Králové (name of a town)), Sídliště Antala Staška (=Antal Stašek Housing Estate), Vodičkova ulice (=Vodičkova Street), ulice Na Příkopě (=Na Příkopě Street), Boubínský prales (=Boubín Forest).
official titles of institutions, organisations, companies and businesses.
For example: Česká republika (=The Czech Republic), Rada bezpečnosti Organizace spojených národů (=The Security Council of the United Nations Organisation), Poslanecká sněmovna (=The Chamber of Deputies (Lower House of Czech Parliament)), klub Za starou Prahu (=The For Old Prague Club), lékárna U Jednorožce (=The Unicorn Pharmacy).
titles of documents, creative works and works of art.
For example: Osudová (=Beethoven's Fifth Symphony), Naše řeč (=Our Language (periodical)), Kde domov můj? (=Where is my homeland? (Czech national anthem)).
titles of products.
For example: automobil Škoda Favorit (=The Škoda Favorit car), Palmex (=Palmex (washing powder)).
titles of notable events and chronological periods.
For example: Vánoce (=Christmas), Mistrovství světa v ledním hokeji 2004 (=The 2004 World Ice Hockey Championship).
titles of awards and prizes.
For example: medaile Za zásluhy (=The Medal of Merit).
a title identifying a category or type.
For example: Pozdravujte všechny výletníky typu "ven z auta, šup na hrad a šup do auta". (=Greet all visitors of the "Out of the car, quickly to the castle and quickly back in the car" type.)
and other identifications and titles in the broad sense of identification (frequently also written with a lower-case initial letter, but then usually within quotation marks).
For example: Staří čeští intelektuálové tehdy dostali nálepku "zrádné intelektuální reakce". (=Old Czech intellectuals were dubbed "treacherous intellectual reactionaries" in those days.); Do lázeňského města přijeli vyzváni motem turnaje "Kdo nebyl v Poděbradech, nemá rád tenis". (=They came to the spa town in response to the tournament slogan "If you haven't been to Poděbrady you don't like tennis".); Vytvořit určitý prostor, později nazvaný "transformační polštář" (=To create a certain space, later known as "the transformation cushion".); Říkali tomu "dialog". (=They called this "a dialogue".); ...dokud se nenaplní úsloví "Na každého jednou dojde". (=until the saying "Everybody will have their turn" comes true)
NB! The boundary of the title (identifying expression) is not clear-cut. It has been found that a title can probably follow any concrete or abstract noun. Numbers functioning as "labels" (for example: strana 25 (=page 25)) are annotated according to the rules given in Section 10.1.3, "Numerals with the function of a "label"". A number of annotation rules have also been adopted in Section 12, "Annotation of structured text". Those rules take precedence over the rules given here.
For the annotation of proper nouns the rules given in Section 8.1, "Basic rules for the annotation of identifying expressions" and further specific rules given in this section apply.
Proper names of people (attribute is_name_of_person). At all nodes representing expressions which are constituents of proper names of people (nodes representing forename or surname) the value 1 is entered in the attribute is_name_of_person. See Table 8.2, "Values of the attribute is_name_of_person".
Table 8.2. Values of the attribute is_name_of_person
0 | the node represents an expression which is not a constituent of a proper name of a person
1 | the node represents an expression which is a constituent of a proper name of a person
If the attribute is not filled in, the value is taken to be 0.
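The default described for Table 8.2 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that a node's attributes are held in a plain dictionary with string values; the dictionary layout and the helper name `is_name_of_person_value` are hypothetical and are not part of the annotation scheme or of any treebank tool.

```python
from typing import Mapping


def is_name_of_person_value(node_attrs: Mapping[str, str]) -> int:
    """Return the value of is_name_of_person for one node.

    Assumption: attributes are stored as strings in a mapping.
    A missing or empty attribute is read as 0, as stated for Table 8.2.
    """
    raw = node_attrs.get("is_name_of_person")
    if raw is None or raw == "":
        return 0
    return int(raw)


# Hypothetical nodes: two name constituents and one ordinary node
# whose attribute has not been filled in.
nodes = [
    {"t_lemma": "Božena", "is_name_of_person": "1"},
    {"t_lemma": "Němcová", "is_name_of_person": "1"},
    {"t_lemma": "psát"},
]
values = [is_name_of_person_value(n) for n in nodes]
```

Reading the attribute through one helper keeps the "unfilled means 0" convention in a single place.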
!!! In the future it is anticipated that an attribute will be introduced specifying for every node whether or not it is a constituent of an identifying expression. Meanwhile only the attribute is_name_of_person has been adopted, and information as to whether a given node is or is not a constituent of an identifying expression is also provided for identifying expressions written between quotation marks by the attribute quot/type (on this, see Section 19.1, "Text within quotation marks").
The question of the start of a title. During annotation it may sometimes be difficult to define the start (and frequently also the end) of a title, especially as conventions vary in respect of upper and lower case initial letters in certain types of titles (according to the rules of Czech orthography one writes: Sídliště Míru (=The Peace Estate), but náměstí Míru (=Peace Square)). When defining the start of a title the following simple rules are therefore adopted:
if a generic common noun (written with upper or lower case initial letters) is followed by a noun in the genitive or by a possessive adjective, this generic noun is treated as a constituent of the title.
Complex titles (conforming to the rules of Czech orthography) are therefore treated as a single title: Sídliště Antala Staška (=The Antal Stašek Housing Estate); Sídliště Míru (=The Peace Estate); náměstí Míru (=Peace Square); řeka Svatého Vavřince (=The St. Lawrence River); vodopády Viktoriiny (=Victoria Falls); ulice Boženy Němcové (=Božena Němcová Street); most Palackého (=Palacký Bridge); Země Františka Josefa (=Franz Josef Land); ostrov Svatého Tomáše (=São Tomé Island); Divadlo J.K. Tyla (=J.K. Tyl Theatre); Galerie bratří Čapků (=Čapek Brothers Gallery); Dům módy (=The House of Fashion).
These titles all belong to group A, identifying expressions with a declinable governing constituent (for the rules of annotation, see Section 8.1.1, "Rules for the annotation of identifying expressions with a declinable governing constituent").
Examples:
Jdi přes most.DIR1 Palackého.RSTR (=Cross Palacký Bridge) Fig. 8.148
Na náměstí.LOC Míru.RSTR je rušno. (=Peace Square is busy.) Fig. 8.149
Ellipsis of the governing constituent of the title (a declinable noun). If a generic common noun is not expressed at surface level (this is an exceptional case), the ellipsis of the governing noun is represented in the tectogrammatical tree according to the rules in Section 12.1.2, "Ellipsis of the governing noun". The newly established node is then treated as a constituent of the title. For example:
Vystoupíme na {#EmpNoun.LOC} Jiřího z Poděbrad. (=We are getting off at Jiřího z Poděbrad (George of Poděbrady) metro station.)
if a generic common noun (written with a lower case initial letter) is followed by an attributive adjective in grammatical agreement with it, this generic noun is also treated as a constituent of the title.
Complex titles (conforming to the rules of Czech orthography) are therefore treated as a single title: poloostrov Pyrenejský (=The peninsula of the Pyrenees) (Pyrenejský poloostrov (=The Pyrenees Peninsula)); moře Středozemní (=The Mediterranean Sea) (Středozemní moře (=The Mediterranean Sea)); kaple Betlémská (=The Bethlehem Chapel) (Betlémská kaple (=The Bethlehem Chapel)); ulice Spálená (=Spálená Street) (Spálená ulice (=Spálená Street)).
These titles all belong to group A, i.e. identifying expressions with a declinable governing constituent (for annotation rules, see Section 8.1.1, "Rules for the annotation of identifying expressions with a declinable governing constituent").
Examples:
Šli jsme ulicí.DIR2 Spálenou.RSTR (=We were walking in Spálená Street.) Fig. 8.150
Šli jsme Spálenou.RSTR ulicí.DIR2 (=We were walking in Spálená Street.)
Itálie leží na poloostrově.LOC Pyrenejském.RSTR (=Italy lies on the peninsula of the Pyrenees.)
Itálie leží na Pyrenejském.RSTR poloostrově.LOC (=Italy lies on the Pyrenees Peninsula.)
NB! If a generic common noun is followed by an adjective in the non-declinable nominative of identity, the common generic noun is not treated as a constituent of the title.
The generic noun is not a constituent of the title (conforming to the rules of Czech orthography): stanice Vltavská (=Vltavská Station (there is no Vltavská stanice)), symfonie Osudová (=The Fifth Symphony).
Examples:
Šli jsme ulicí.DIR2 Spálená.ID (=We were walking in Spálená Street.) Fig. 8.151
Vystoupíme na stanici.LOC Vltavská.ID (=We are getting off at Vltavská station.)
Tramvaje nejezdí v ulici.LOC Spálená.ID a v {ulice.LOC} 17. listopadu.RSTR (=The trams are not running in Spálená Street and in 17th November Street.)
NB! If a (declinable) adjectival title is a separate constituent of the sentence it is treated as nominalised, i.e. a node with the t-lemma substitute #EmpNoun for the governing noun is not added (thus this is not a case of ellipsis as described in Section 12.1.2, "Ellipsis of the governing noun").
Examples:
Šli jsme Spálenou.DIR2 (=We were walking in Spálená (Street).) Fig. 8.152
Vystoupíme na Vltavské.LOC (=We are getting off at Vltavská.)
Poslouchá Osudovou.PAT pořád dokola. (=He/She listens to the Fifth over and over again.)
if a generic common noun written with a lower case initial letter is followed by a nominative of identity (nominative, prepositional phrase, or other alternative form for a nominative of identity), the generic noun is not a constituent of the title.
The generic noun is not a constituent of the title (conforming to the rules of Czech orthography): sídliště Modřany (=The Modřany housing estate); stanice Náměstí míru (=Peace Square Station); restaurace U Medvídků (=The Little Bears Restaurant); ulice Mezi Zahrádkami (=Mezi Zahrádkami Street); kino Blaník (=Blaník Cinema); hrad Karlštejn (=Karlstein Castle); hotel U Modré hvězdy (=The Blue Star Hotel).
These titles belong to group B, identifying expressions without a declinable governing constituent (for annotation rules, see Section 8.1.3, "Identification structure").
Examples:
Bydlíme v ulici.LOC Mezi Zahrádkami.ID (=We live in Mezi Zahrádkami Street.) Fig. 8.153
Bydlíme {#Idph.LOC} Mezi Zahrádkami.ID 21. (=We live at 21 Mezi Zahrádkami.) Fig. 8.154
Sejdeme se {#Idph.LOC} U Medvídků.ID (=We'll meet at The Little Bears.)
{#Idph.LOC} U Modré hvězdy.ID už mají plno. (=The Blue Star is already full up.)
if a generic common noun written with an upper case initial letter is followed by a nominative of identity (nominative, prepositional phrase or other alternative form of the nominative of identity), the generic noun is a constituent of the title.
The generic noun is a constituent of the title:
Divadlo Loutka (=The Puppet Theatre)
Divadlo na Vinohradech (=Vinohrady Theatre)
Galerie Centrum (=Centrum Gallery)
Hudební divadlo v Karlíně (=Music Theatre in Karlín)
These titles belong to group A, identifying expressions with a declinable governing constituent (for annotation rules, see Section 8.1.1, "Rules for the annotation of identifying expressions with a declinable governing constituent").
!!! In the present state of the annotation rules, the definition of the start and end of a title is significant only for purposes of allocating identifying expressions to group A or B. In the future this question will be important for the introduction of an attribute defining at every node whether the expression it represents is or is not a constituent of the title.
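The rules above for deciding whether a generic common noun belongs to the title, and hence whether the expression falls into group A or group B, can be summarised in a small Python sketch. The enum `Follower`, the tuple `Decision` and the function `classify_title_start` are hypothetical names introduced here for illustration only. The sketch covers just the combinations the rules state explicitly; for a capitalised generic noun followed by an agreeing adjective it returns `None`, because this section gives no rule for that case.

```python
from enum import Enum
from typing import NamedTuple, Optional


class Follower(Enum):
    """What follows the generic common noun (hypothetical labels)."""
    GENITIVE_OR_POSSESSIVE = "noun in the genitive or possessive adjective"
    AGREEING_ADJECTIVE = "attributive adjective in grammatical agreement"
    NOMINATIVE_OF_IDENTITY = "nominative of identity or its alternative form"


class Decision(NamedTuple):
    generic_noun_in_title: bool
    group: str  # "A" (declinable governing constituent) or "B"


def classify_title_start(follower: Follower, capitalised: bool) -> Optional[Decision]:
    """Apply the rules for the start of a title; None if no rule is stated."""
    if follower is Follower.GENITIVE_OR_POSSESSIVE:
        # Holds for upper and lower case: Sídliště Míru, náměstí Míru.
        return Decision(True, "A")
    if follower is Follower.AGREEING_ADJECTIVE:
        # Stated only for a lower-case generic noun: ulicí Spálenou.
        return Decision(True, "A") if not capitalised else None
    if follower is Follower.NOMINATIVE_OF_IDENTITY:
        # Upper case: Divadlo Loutka (group A); lower case: kino Blaník (group B).
        return Decision(True, "A") if capitalised else Decision(False, "B")
    return None


namesti_miru = classify_title_start(Follower.GENITIVE_OR_POSSESSIVE, capitalised=False)
kino_blanik = classify_title_start(Follower.NOMINATIVE_OF_IDENTITY, capitalised=False)
divadlo_loutka = classify_title_start(Follower.NOMINATIVE_OF_IDENTITY, capitalised=True)
```

Deciding which `Follower` value applies (for instance, telling ulicí Spálenou from ulicí Spálená) is the annotator's judgement and is not modelled here.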
Figure 8.150. Proper noun
Šli jsme ulicí Spálenou. (=lit. (We) were_walking AUX (through) street Spálená)
Figure 8.151. Proper noun
Šli jsme ulicí Spálená. (=lit. (We) were_walking AUX (through) street Spálená)
Figure 8.153. Proper noun
Bydlíme v ulici Mezi Zahrádkami. (=lit. (We) live in street Mezi Zahrádkami)
In this section, specific rules are introduced for certain types of proper noun.
Official geographical names. The structure of official geographical names (titles of towns, villages, streets, squares, districts, mountains, rivers, states, islands, peninsulas, lowlands and seas) is not analysed. All dependent nodes have the functor RSTR.
Example:
Ústí nad Labem.RSTR (=Ústí nad Labem (Ústí on the Elbe)) Fig. 8.155
NB! For titles of public transport stops and stations, titles of spaces, buildings, castles, and institutions, and for regional and local geographical titles this rule has not been adopted. In most cases, however (according to the usual rules of annotation, adopted here), their dependent nodes will also have the functor RSTR.
Complex proper names of persons. In the case of complex proper nouns, the effective root of the title is the node representing the last part of the name. All other parts of the name are dependent on this node (as sister nodes) and they have the functor RSTR. A hyphen or a space within a complex proper noun is treated as a surface convention and such orthographical features and variations are not reflected in the tectogrammatical trees.
Examples:
Klára.RSTR Nováková.RSTR Malá (=Klára Nováková Malá) Fig. 8.156
likewise: Klára Nováková-Malá (=Klára Nováková-Malá)
Jan.RSTR Maria.RSTR Plojhar (=Jan Maria Plojhar) Fig. 8.157
likewise: Jan-Maria Plojhar (=Jan-Maria Plojhar) and Jan Maria-Plojhar (=Jan Maria-Plojhar)
Anna.RSTR Marie (=Anna Marie) Fig. 8.158
likewise: Anna-Marie (=Anna-Marie)
jméno Anna.RSTR Marie.ID (=the name Anna Marie)
rtěnka Margaret.RSTR Astor.ID (=Margaret Astor lipstick)
On the annotation of nominal groups (noun phrases) in which a common noun and a proper name of a person are combined, see also Section 11.4.1, "Combination of a common noun and a proper noun".
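The structure prescribed above for complex proper names of persons (last part as effective root, all other parts as sister nodes with the functor RSTR, hyphens and spaces ignored) can be sketched in Python as follows. The class `TNode` and the function `build_person_name` are hypothetical and only illustrate the shape of the resulting subtree. As a simplifying assumption the surface form of each part is used as its t-lemma, and the functor of the root is left empty because it depends on the sentence context. Compounds written with a spaced dash are rejected, since they call for a paratactic structure instead.

```python
import re
from dataclasses import dataclass, field
from typing import List


@dataclass
class TNode:
    """Hypothetical, minimal stand-in for a tectogrammatical node."""
    t_lemma: str
    functor: str = ""            # root functor depends on context, so left empty
    is_name_of_person: int = 0
    children: List["TNode"] = field(default_factory=list)


def build_person_name(surface: str) -> TNode:
    """Build the subtree for a complex proper name of a person."""
    # Split on whitespace and on a hyphen written directly between two letters;
    # both are surface conventions that the tree does not reflect.
    parts = [p for p in re.split(r"\s+|(?<=\w)-(?=\w)", surface.strip()) if p]
    if not parts:
        raise ValueError("empty name")
    if any(p in {"-", "–"} for p in parts):
        raise ValueError("spaced dash: paratactic structure, not handled here")
    # The last part of the name is the effective root.
    root = TNode(parts[-1], is_name_of_person=1)
    # All other parts depend on it as sister nodes with the functor RSTR.
    root.children = [
        TNode(p, functor="RSTR", is_name_of_person=1) for p in parts[:-1]
    ]
    return root


spaced = build_person_name("Klára Nováková Malá")
hyphenated = build_person_name("Klára Nováková-Malá")
```

Both spellings yield the same subtree, which mirrors the statement that orthographic variation is not reflected in the tree.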
NB! Compounds with a dash (not a hyphen) are represented as a paratactic structure. For example:
dvojice Máčala.ID - Lešický.ID (=The Máčala-Lešický couple.) Fig. 8.159
Figure 8.159. A dash as a constituent of a title
dvojice Máčala - Lešický (=lit. couple Máčala-Lešický.)
Foreign-language proper names of persons. Complex foreign surnames or complex foreign forenames in a European language are represented by a single node. In the t-lemma of these nodes the respective m-lemmas of all constituents of the complex foreign name or surname are joined by underscore characters in the sequence in which they occur at surface level (see Section 3.1, "Multi-word t-lemma").
For example:
Malíř.RSTR Leonardo.RSTR da Vinci.ACT je slavný. [t-lemma = da_Vinci] (=The painter Leonardo da Vinci is famous.)
Pan.RSTR da Cruz.ACT už je tady. [t-lemma = da_Cruz] (=Mr. da Cruz is here now.)
Foreign (European) proper names represented by a single node are treated according to the same rules as Czech names.
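The construction of the multi-word t-lemma described above amounts to joining the m-lemmas in surface order with underscore characters. A minimal Python sketch, in which the function name `multiword_t_lemma` is a hypothetical helper and the m-lemmas are assumed to be already available as a list of strings:

```python
from typing import Sequence


def multiword_t_lemma(m_lemmas: Sequence[str]) -> str:
    """Join the m-lemmas of a complex foreign name in surface order."""
    if not m_lemmas:
        raise ValueError("at least one m-lemma is required")
    return "_".join(m_lemmas)


da_vinci = multiword_t_lemma(["da", "Vinci"])
da_cruz = multiword_t_lemma(["da", "Cruz"])
```

The two calls reproduce the t-lemmas given in the examples, da_Vinci and da_Cruz.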
NB! Foreign proper names in a non-European language are represented according to the rules for foreign-language phrases (by a newly created node with the t-lemma #Forn; see Section 9, "Foreign-language expressions"); for example:
čínský císař {#Forn} Tung.FPHR chun.FPHR Chou.FPHR (=The Chinese Emperor Tung chun Chou.)
Two declinable nouns as constituents of a title. Certain titles of towns, their districts, railway stations, bus stops etc. are formed by two declinable nouns, frequently hyphenated. A title potentially has two governing constituents. These titles are analysed structurally. The node representing the governing constituent of the specifying, more clearly defining part (usually the second part of the title, the second declinable noun) has the functor RSTR. If it is not clear which is the governing part and which is the dependent part, the node representing the governing constituent of the first part (before the hyphen) is treated as the effective root node of the title. In such cases the hyphen is not represented by a node.
Examples:
Frýdek - Místek.RSTR (=Frýdek-Místek) Fig. 8.160
Sejdeme se v Praze.LOC - Nebušicích.RSTR (=We'll meet at Prague-Nebušice.) Fig. 8.161
stanice Praha.ID - Smíchov.RSTR (=Prague-Smíchov Station) Fig. 8.162
Praha - Hlavní nádraží.RSTR (=Prague Main Station)
See also annotation of nominal groups - Section 11.4, "Dependency relations in noun phrases (two nouns in the same form)".
Figure 8.161. Two declinable nouns as constituents of a title
Sejdeme se v Praze - Nebušicích. (=lit. (We) will_meet REFL at Prague-Nebušice.)
Figure 8.162. Two declinable nouns as constituents of a title
stanice Praha - Smíchov (=lit. station Prague-Smíchov)
A non-declinable noun in the nominative as a constituent of the title. Where a non-declinable noun in the nominative which is not the effective root of the title is a constituent of a title, the node representing this nominative has the functor RSTR.
Such non-declinable nouns occur in the names of towns, their districts, offices, references to locations in the titles of organisations, detailed specifications of trade marks and, more recently, especially in the titles of sports competitions which include the non-declinable name of their sponsor.
Examples:
Pracuje v Chemopetrolu.LOC Litvínov.RSTR (=He/She works at Chemopetrol Litvínov.) Fig. 8.163
Fotbalová Gambrinus.RSTR liga (=The Gambrinus Football League) Fig. 8.164
Hokejová Český Telecom.RSTR extraliga (=The Czech Telecom Special Hockey League)
Budou bydlet na Praze.LOC - východ.RSTR (=They are going to live in Prague-East.)
okres Praha.ID - východ.RSTR (=The Prague-East District)
u katastrálního úřadu Praha.ID město.RSTR (=at the City of Prague land registry)
s novou Škodou.ACMP Favorit.RSTR (=With the new Škoda Favorit.)
automobil Opel.ID Astra.RSTR (=The Opel Astra car)
prací prášek Palmex.ID modrá síla.RSTR (=Palmex Blue Force washing powder)
Válcovny plechu Frýdek.RSTR -Místek.RSTR (=Frýdek-Místek rolling mills)
fotbalový klub Bayern.ID Mnichov.RSTR (=Bayern Munich Football Club)
Figure 8.163. A non-declinable noun in the nominative as a constituent of a title
Pracuje v Chemopetrolu Litvínov. (=lit. (He/She) works at Chemopetrol Litvínov.)
Figure 8.164. A non-declinable noun in the nominative as a constituent of a title
Fotbalová Gambrinus liga. (=lit. Football Gambrinus League.)
Attributive adjectives and genitives signifying "in honour of, to the memory of". Nodes representing attributive adjectives formed from a proper name of a person, and which are constituents of the title, have the functor RSTR. Similarly, a node representing a proper noun in the genitive (an alternative to the attributive adjective) or certain common nouns in the genitive carrying the meaning "in honour of, to the memory of" and which are constituents of a title, have the functor RSTR.
Examples:
Karlova.RSTR univerzita (=Charles University)
Smetanova.RSTR Litomyšl (=Smetana's Litomyšl (international festival))
Parléřův.AUTH Karlův.RSTR most (=Parléř's Charles Bridge)
stanice Náměstí Míru.RSTR (=Peace Square Station)
socha Svobody.RSTR (=The Statue of Liberty)
Divadlo Járy Cimrmana.RSTR (=The Jára Cimrman Theatre)
most Barikádníků.RSTR (=The Barricade Bridge)
Sídliště Antala Staška.RSTR (=The Antal Stašek Housing Estate)
NB! If the genitive of a noun which is a constituent of a title does not carry the meaning "in honour of, to the memory of", it may have a different functor; for example:
Organizace spojených národů.APP (=The United Nations Organisation)
Pohár mistrů.APP evropských zemí.APP (=European Champions' Cup)
NB! Attributive adjectives carrying the meaning "in honour of, to the memory of" are to be distinguished from attributive adjectives carrying the meaning of the functor APP (owner of a named object), and from attributive adjectives carrying the meaning of the functor AUTH (creator of a named object), which are not constituents of a title. See also Section 10.2, "AUTH".
A common generic noun in apposition. In cases where a title from group B is not dependent on a common generic noun but is in apposition to it, the title is represented as an identifying structure whose root is a node with the t-lemma substitute #Idph. The terminal constituents of the apposition structure are therefore the node representing the expressed common generic noun and a newly established node with the t-lemma substitute #Idph. Cf.:
{#Idph.DENOM [is_member=1]} Proti všem.ID, román.DENOM [is_member=1] Jiráska. (=Against All, the novel by Jirásek)
The apposition between the newly established node with the t-lemma substitute #Idph and the node representing the common generic noun román (=novel) will be represented in the tectogrammatical tree. Cf. Fig. 8.165.
If a title from group A and a common generic noun are in apposition, the terminal constituents of the appositional structure are the effective root node of the title and the node representing the expressed common generic noun.
Example:
Skláři.DENOM [is_member=1] Vysočiny - stálá expozice.DENOM [is_member=1] (=Vysočina Glassmakers - permanent exhibition) Fig. 8.166
Figure 8.165. A common generic noun in apposition
Proti všem, román Jiráska. (=lit. Against All, novel (of) Jirásek.)
Figure 8.166. A common generic noun in apposition
Skláři Vysočiny - stálá expozice. (=lit. Glassmakers (of) Vysočina - permanent exhibition)
Foreign-language titles in the position of the nominative of identity. Non-declinable foreign-language titles are represented according to the rules for foreign-language phrases (see Section 9, "Foreign-language expressions"). A complex foreign-language title is represented as a list structure for a foreign-language expression also in the position of the nominative of identity. The root of this structure is the effective root of the identification structure and it has the functor ID.
Example:
časopis {#Forn.ID} Financial Times (=The Financial Times newspaper) Fig. 8.167
NB! However, if a simple non-declinable foreign-language title is in the position of a nominative of identity, it is represented only as an identifying structure and the node with the t-lemma #Forn is not added to the tectogrammatical tree.
Example:
město Uyuni.ID (=The city of Uyuni) Fig. 8.168
časopis Times.ID (=The Times newspaper)