6.1. Articles

Unlike many other languages, Czech has no articles. Articles that occur in foreign phrases are annotated as adjectives.

In some languages, articles distinguish gender, number and case. By analogy with Czech, their lemma should be the masculine singular nominative form, while the morphological tag should encode the actual word form in the text. However, this approach is sometimes impossible because the gender or number differs in Czech: La Manche is feminine in French but masculine inanimate in Czech; Los Angeles is plural in Spanish but singular in Czech (and in English). Each such frozen article needs a special lemma of its own. Thus, los would be annotated el-1_,t_^(šp._člen) / AAMPX----1A---- in "do Prahy přijeli Los Paraguayos" ("Los Paraguayos came to Prague") but los-3_,t_^(šp._člen) / AAXXX----1A---- in "pracuje v Los Angeles" ("he works in Los Angeles").
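The choice between the regular and the frozen analysis can be sketched as a small lookup. This is only an illustration under stated assumptions: the function name `annotate_article`, the two dictionaries and the idea of keying frozen articles by the following word are hypothetical, not part of the annotation guidelines. The regular values follow the Spanish los row of Table 6.2; the frozen values are the ones quoted for Los Angeles.

```python
# Regular analysis of the Spanish article form "los" (row from Table 6.2).
REGULAR = {
    "los": ("el-1_,t_^(šp._člen)", "AAMPX----1A----"),
}

# Frozen articles get a lemma of their own and an underspecified tag.
# Hypothetical layout: keyed by (article form, following word).
# Only the example mentioned in the text is listed.
FROZEN = {
    ("los", "angeles"): ("los-3_,t_^(šp._člen)", "AAXXX----1A----"),
}


def annotate_article(form, next_word):
    """Return (lemma, tag) for an article form followed by next_word."""
    key = (form.lower(), next_word.lower())
    if key in FROZEN:
        return FROZEN[key]
    return REGULAR[form.lower()]


print(annotate_article("Los", "Paraguayos"))
print(annotate_article("Los", "Angeles"))
```

A real annotation tool would need a curated list of frozen names; the guidelines do not say how such a list is maintained.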

Note

The separate lemma reflects the fact that the word form has been frozen since it was borrowed into other languages. However, a separate lemma might not be needed: articles are annotated as adjectives, and adjectives (unlike nouns) are not required to keep a single gender.

Articles merged with a preposition (e.g. French du, Italian della, German aufs, beim, vom, zur, im, am...) are treated as prepositions.

Table 6.2. Articles in common foreign languages

Language    Form                    Lemma                        Tag
English     the                     the-1_,t_^(angl._urč._člen)  AAXXX----1A----
English     a                       a-2_,t_^(angl._neurč._člen)  AAXXX----1A----
English     an                      a-2_,t_^(angl._neurč._člen)  AAXXX----1A---1
German      der                     der-1_,t_^(něm._člen)        AAMS1----1A---- AAFS2----1A---- AAFS3----1A---- AAXP2----1A----
German      die                     der-1_,t_^(něm._člen)        AAFS1----1A---- AAFS4----1A---- AAXP1----1A---- AAXP4----1A----
German      das                     der-1_,t_^(něm._člen)        AANS1----1A---- AANS4----1A----
German      des                     der-1_,t_^(něm._člen)        AAMS2----1A---- AANS2----1A----
German      dem                     der-1_,t_^(něm._člen)        AAMS3----1A---- AANS3----1A----
German      den                     der-1_,t_^(něm._člen)        AAMS4----1A---- AAXP3----1A----
Dutch       de                      de-2_,t_^(niz._člen)         AAMSX----1A---- AAFSX----1A---- AAXPX----1A----
Dutch       het                     de-2_,t_^(niz._člen)         AANSX----1A----
Dutch       den                     de-2_,t_^(niz._člen)         AAMS3----1A---5 AANS3----1A---5
French      le                      le-1_,t_^(fr._člen)          AAMSX----1A----
French      la                      le-1_,t_^(fr._člen)          AAFSX----1A----
French      l                       le-1_,t_^(fr._člen)          AAXSX----1A----
French      les                     le-1_,t_^(fr._člen)          AAXPX----1A----
Italian     il                      il-1_,t_^(it._člen)          AAMSX----1A----
Italian     la                      il-1_,t_^(it._člen)          AAFSX----1A----
Italian     gli                     il-1_,t_^(it._člen)          AAMPX----1A----
Italian     le                      il-1_,t_^(it._člen)          AAFPX----1A----
Spanish     el                      el-1_,t_^(šp._člen)          AAMSX----1A----
Spanish     la                      el-1_,t_^(šp._člen)          AAFSX----1A----
Spanish     los                     el-1_,t_^(šp._člen)          AAMPX----1A----
Spanish     las                     el-1_,t_^(šp._člen)          AAFPX----1A----
Portuguese  o                       o-10_,t_^(port._člen)        AAMSX----1A----
Portuguese  a                       o-10_,t_^(port._člen)        AAFSX----1A----
Portuguese  os                      o-10_,t_^(port._člen)        AAMPX----1A----
Portuguese  as                      o-10_,t_^(port._člen)        AAFPX----1A----
Arabic      al, ad, an, ar, as, az  al-5_,t_^(arab._člen)        AAXXX----1A----
Arabic      el, ed, en, er, es, ez  el-5_,t_^(arab._člen)        AAXXX----1A----
Hebrew      ha                      ha-2_,t_^(hebr._člen)        AAXXX----1A----
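
To read the table programmatically, a lemma can be split into its parts and a tag into its positions. The following is a minimal sketch, not a reference implementation. It assumes general conventions that this section does not spell out: tags are 15 characters long, with part of speech, subtype, gender, number and case in positions 1 to 5, degree in position 10, negation in position 11 and variant in position 15; in lemmas, a hyphen plus number distinguishes homonyms, `_,` introduces a one-letter flag and `_^(...)` a comment. The names `split_lemma`, `decode_tag`, `LEMMA_RE` and `TAG_POSITIONS` are hypothetical.

```python
import re

# Assumed lemma shape, inferred only from the lemmas listed in Table 6.2:
# base, optional "-N" index, optional "_,x" flag, optional "_^(comment)".
LEMMA_RE = re.compile(
    r"^(?P<base>.+?)"
    r"(?:-(?P<index>\d+))?"
    r"(?:_,(?P<flag>\w))?"
    r"(?:_\^\((?P<comment>.*)\))?$"
)

# Assumed zero-based offsets of the tag positions used in this section.
TAG_POSITIONS = {
    "pos": 0,
    "subpos": 1,
    "gender": 2,
    "number": 3,
    "case": 4,
    "degree": 9,
    "negation": 10,
    "variant": 14,
}


def split_lemma(lemma):
    """Split a lemma string into base, index, flag and comment."""
    match = LEMMA_RE.match(lemma)
    if match is None:
        raise ValueError(f"unrecognized lemma: {lemma!r}")
    return match.groupdict()


def decode_tag(tag):
    """Map a 15-character positional tag to named positions."""
    if len(tag) != 15:
        raise ValueError(f"expected 15 characters, got {len(tag)}: {tag!r}")
    return {name: tag[offset] for name, offset in TAG_POSITIONS.items()}


# A table cell may hold several alternative tags separated by spaces.
cell = "AAMS1----1A---- AAFS2----1A---- AAFS3----1A---- AAXP2----1A----"
for tag in cell.split():
    decoded = decode_tag(tag)
    print(decoded["gender"], decoded["number"], decoded["case"])

print(split_lemma("der-1_,t_^(něm._člen)"))
```

Splitting a multi-tag cell on whitespace, as in the loop above, yields one analysis per gender, number and case combination that the word form can express.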