Main noun. The main word (head) of a multi-word city name is always a noun; the same holds for a one-word city name. If it is homonymous with an adjective, a new noun lemma is created for the name. Thus Hluboká is lemmatized as Hluboká_;G / NNFS1-----A---- rather than hluboký / AAFS1----1A---- (lit. deep). Nouns that are frequently used in names (such as Újezd, Ústí) may have their own geographical lemmas even if they are homonymous with a normal word. For homonymous pairs where the non-geographical usage is much more common (such as voda (water), ves (village), město (city)), it is recommended to stick with the non-geographical lemma even in geographical usages.
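To make the contrast between the two tags for Hluboká easier to see, the following minimal sketch decodes a few positions of a positional tag and lists where the two tags differ. It assumes the usual 15-position layout, in which positions 1 to 5 hold part of speech, detailed part of speech, gender, number and case, position 10 holds degree of comparison, and position 11 holds negation. The function names and category labels are illustrative only and are not part of the annotation scheme.

```python
# Positions (1-based) decoded by this sketch; all other positions are ignored.
# Assumption: the 15-character positional tag layout described in the lead-in.
POSITIONS = {
    1: "pos",
    2: "subpos",
    3: "gender",
    4: "number",
    5: "case",
    10: "degree",
    11: "polarity",
}


def decode_tag(tag: str) -> dict:
    """Return the categories that are filled in (not '-') at the decoded positions."""
    if len(tag) != 15:
        raise ValueError(f"expected 15 positions, got {len(tag)}: {tag!r}")
    return {name: tag[i - 1] for i, name in POSITIONS.items() if tag[i - 1] != "-"}


def diff_tags(a: str, b: str) -> dict:
    """Map each category that differs to the pair (value in a, value in b)."""
    da, db = decode_tag(a), decode_tag(b)
    return {
        name: (da.get(name), db.get(name))
        for name in POSITIONS.values()
        if da.get(name) != db.get(name)
    }


noun_tag = "NNFS1-----A----"       # Hluboká_;G, the geographical noun
adjective_tag = "AAFS1----1A----"  # hluboký, the ordinary adjective

print(decode_tag(noun_tag))
print(diff_tags(noun_tag, adjective_tag))
```

Under these assumptions the two tags agree in gender, number, case and negation. They differ in the two part-of-speech positions and in degree of comparison, which is filled in only for the adjective.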
Modifiers in multi-word names. Attributive adjectives, prepositions, conjunctions etc. should be lemmatized as normal words. Other nouns may be lemmatized as geographical if they are nested geographical names (e.g. names of rivers or mountains in names of cities).
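The rules for heads and modifiers can be read as a small decision procedure. The sketch below is a hypothetical illustration, not the annotation tool's actual logic: the `NameToken` class, the `role` values and the `PREFER_NORMAL` set are assumptions introduced here. The set contains only the three examples named in the text, whereas in practice the lexicon decides which words keep their ordinary lemma. The sketch also treats "may be lemmatized as geographical" for nested names as "is".

```python
from dataclasses import dataclass
from typing import Optional

GEO_SUFFIX = "_;G"  # marker of a geographical lemma, as in Hluboká_;G

# Hypothetical list: common nouns whose ordinary lemma is kept even in names.
# Only the three examples given in the guideline text are included.
PREFER_NORMAL = {"voda", "ves", "město"}


@dataclass
class NameToken:
    citation_form: str           # nominative form as used in the name, e.g. "Hluboká"
    normal_lemma: Optional[str]  # lemma of a homonymous ordinary word, if any
    role: str                    # "head", "modifier" or "nested_name" (labels assumed here)


def choose_lemma(tok: NameToken) -> str:
    """Pick a lemma for one token of a geographical name."""
    if tok.role == "modifier":
        # Attributive adjectives, prepositions, conjunctions: ordinary lemma.
        if tok.normal_lemma is None:
            raise ValueError(f"modifier {tok.citation_form!r} needs an ordinary lemma")
        return tok.normal_lemma
    if tok.role in ("head", "nested_name"):
        # Very common nouns keep their non-geographical lemma.
        if tok.normal_lemma in PREFER_NORMAL:
            return tok.normal_lemma
        # Otherwise use (or create) a geographical noun lemma.
        return tok.citation_form + GEO_SUFFIX
    raise ValueError(f"unknown role: {tok.role!r}")


print(choose_lemma(NameToken("Hluboká", "hluboký", "head")))
print(choose_lemma(NameToken("Dobrá", "dobrý", "modifier")))
print(choose_lemma(NameToken("Voda", "voda", "head")))
```

With these inputs, the head Hluboká receives a new geographical lemma, while Dobrá and Voda in Dobrá Voda both keep their ordinary lemmas.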
Part of speech of foreign words. The original part of speech of the word in the source language is used unless there is a good reason not to do so. Besides not knowing the original part of speech, a very good reason is that the word behaves as a different part of speech in Czech texts. For instance, blanc is an adjective in the French name Mont Blanc, but it behaves as a noun in na Mont Blanku, where it takes the Czech locative ending -u. Mont can be annotated as an undeclined noun. See Chapter 6, Foreign words and phrases, for more information on foreign words.
Table 3.2. Examples of geographical names
Name | Type | Morphological annotation |
---|---|---|
Česká republika | country | |
Ústí nad Labem | city | |
Karlovy Vary | city | |
Dobrá Voda | city | |
Odolena Voda | city | |
Černá v Pošumaví | city | |
Ohrada u Hluboké | city | |
Hradec Králové | city | |
Kostelec nad Černými Lesy | city | |
New York | city | |
A Coruña | city | |
São Paulo | city | |
Rio de Janeiro | city | |
Le Havre | city | |
Krems an der Donau | city | |
San Juan de la Rambla | city | |
Kao-hsiung | city | |
Wu-lu-mu-čchi | city | |
Gerlachovský štít | mountain | |
Divoká Orlice | river | |