3.1. Personal names

Given names and surnames are distinguished by the term field in their lemmas (_;Y vs. _;S). Note that we do not use the terms first name and last name because in some cultures the surname (family name) comes first and, more importantly, the original order is sometimes respected in Czech texts. If a name can serve as both a given name and a family name, the preferable solution is to reserve two lemmas (for instance, Pavel Pavel would be lemmatized as Pavel-1_;Y Pavel-2_;S). However, in some cases there is currently one lemma covering both usages (such as Pavel_;Y_;S).
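The lemma notation above can be taken apart mechanically. The following is a minimal sketch, not part of the annotation tools: the names Lemma and parse_lemma are hypothetical, and the sketch assumes that the part before the first underscore is the base form with an optional -N sense number, and that each term field has the shape _;X with a single-letter value.

```python
import re
from typing import NamedTuple, Optional, Tuple


class Lemma(NamedTuple):
    """Hypothetical container for the parts of a lemma string."""
    base: str                 # the lemma proper, e.g. "Pavel"
    sense: Optional[int]      # the number after "-", if any
    terms: Tuple[str, ...]    # term field values, e.g. ("Y",) or ("Y", "S")


def parse_lemma(lemma: str) -> Lemma:
    """Split a lemma such as 'Pavel-1_;Y' into base, sense number and terms."""
    # Everything before the first underscore is the base (plus optional sense).
    head, sep, tail = lemma.partition("_")
    match = re.fullmatch(r"(.+?)-(\d+)", head)
    if match:
        base, sense = match.group(1), int(match.group(2))
    else:
        base, sense = head, None
    # Collect every "_;X" term field; other "_..." fields are ignored here.
    terms = tuple(re.findall(r"_;(\w)", sep + tail))
    return Lemma(base, sense, terms)


def is_given_name(lemma: str) -> bool:
    return "Y" in parse_lemma(lemma).terms


def is_surname(lemma: str) -> bool:
    return "S" in parse_lemma(lemma).terms
```

With this sketch, Pavel-1_;Y and Pavel-2_;S parse to the same base with different sense numbers and terms, while the single lemma Pavel_;Y_;S counts as both a given name and a surname.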

If a person has only one name, it is usually a given name: Aristoteles_;Y (Aristotle).

Personal names homonymous with a normal Czech word should always have a lemma of their own. Thus Zeman (surname) is lemmatized as Zeman-1_;S, not zeman (squire).

Personal names are always tagged as nouns, even if they have an adjectival form (true for many Slavic surnames): Palacký_;S / NNMS1-----A----.

Czech female surnames are usually derived from (but not equal to!) a male surname. Their form strongly resembles a possessive adjective: paní Nováková (Mrs. Novák) differs from Novákova žena (Novák's wife) only in the length of the final a/á. However, Nováková will be analyzed neither as Novákův_;S_^(*2) / AUFS1M--------- (a surname cannot be an adjective), nor as Novák_;S / NNMS1-----A---- (this lemma implies the masculine gender). The correct analysis would be Nováková_;S_^(*3) / NNFS1-----A---- (but it lacks the derivational information in the current data).
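To make the difference between the three candidate tags easier to see, a positional tag can be expanded into named fields. This is only an illustrative sketch: the function decode_tag is hypothetical, and the position names in POSITIONS are an assumption based on the usual 15-position layout of the positional tagset; they are not defined in this section.

```python
# Assumed names for the 15 tag positions (not taken from this section).
POSITIONS = (
    "pos", "subpos", "gender", "number", "case",
    "possgender", "possnumber", "person", "tense", "grade",
    "negation", "voice", "reserve1", "reserve2", "variant",
)


def decode_tag(tag: str) -> dict:
    """Map each filled position of a 15-character tag to its assumed name.

    Positions holding '-' (not applicable) are left out of the result.
    """
    if len(tag) != len(POSITIONS):
        raise ValueError(f"expected {len(POSITIONS)} characters, got {len(tag)}")
    return {name: value for name, value in zip(POSITIONS, tag) if value != "-"}


# The three analyses of "Nováková" discussed above.
rejected_adjective = decode_tag("AUFS1M---------")
rejected_masculine = decode_tag("NNMS1-----A----")
preferred_feminine = decode_tag("NNFS1-----A----")
```

Under these assumed names, the rejected adjectival analysis differs from the preferred one in the first two positions and in the possessor gender, while the rejected masculine analysis differs only in the gender position.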

Foreign surnames of women are usually "femalized" in Czech texts (Condoleeza Riceová). In such cases they are treated as normal Czech female surnames. If they are left intact (Condoleeza Rice), their lemma must indicate their foreign origin and their tag must indicate that their gender and case are unknown: Rice_;S_,t / NNXSX-----A----.
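The pairing of lemma and tag described above can be expressed as a simple consistency check. The sketch below is hypothetical (the function name is_undeclined_foreign_surname is not part of any existing tool) and assumes that lemma fields are separated by underscores and that gender and case occupy the third and fifth tag positions, as the examples in this section suggest.

```python
def is_undeclined_foreign_surname(lemma: str, tag: str) -> bool:
    """Return True if lemma and tag match the pattern Rice_;S_,t / NNXSX-----A----.

    Assumptions: fields after the base form are separated by '_';
    ';S' marks a surname, ',t' marks foreign origin; gender is the
    3rd and case the 5th character of a 15-character tag.
    """
    _, _, tail = lemma.partition("_")
    fields = tail.split("_") if tail else []
    marked_foreign_surname = ";S" in fields and ",t" in fields
    unknown_gender_and_case = len(tag) == 15 and tag[2] == "X" and tag[4] == "X"
    return marked_foreign_surname and unknown_gender_and_case
```

For example, the pair Rice_;S_,t / NNXSX-----A---- satisfies the check, whereas a femalized surname with a feminine nominative tag does not.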

Otherwise, foreign personal names are rarely marked as foreign words because in Czech texts they are usually declined according to Czech grammar: Bill Clinton, bez Billa Clintona, Billu Clintonovi, s Billem Clintonem... Thus Bill is lemmatized as Bill_;Y, not Bill_;Y_,t. (See also Chapter 6, Foreign words and phrases.) Even if a name allows for a frozen (undeclined) form, there is usually a context in which it can be declined: kniha o Willie Nelsonovi vs. kniha o Williem Nelsonovi; zvolili Teng Siao-pchinga vs. zvolili pana Tenga. Some foreign names, such as Steffi, are never declined.