3. Identifying expressions

Identifying expressions are expressions used for identification: proper names, titles, and expressions used metalinguistically.

Identifying expressions are divided into five basic groups according to the following criteria:

  1. do they only consist of a proper name? (Prague, the Kremlin)

  2. do they consist of a proper name and a common noun (a descriptor)? (Lake Michigan, the novel The Hours)

  3. when comprising a descriptor, is the descriptor an integral part of the name (Lake Michigan) or not (Kent cigarettes)?

  4. being a company's name, does the identifying expression seem to be governed by an abbreviation?

  5. when without a descriptor, does the name include a noun not governed by any preposition that can be the governing node of the entire name (Amadeus, Angela's Ashes, Two Towers, Hair) or does the name lack such an apparent governing element (On the Road, Of Mice and Men, Breaking the Waves, Guess Who's Coming to Dinner)? Coordinated structures are governed by the coordination node, and they are therefore regarded as lacking a governing noun (e.g. Harry Potter and the Goblet of Fire).

The following types of identifying expressions are thus distinguished in the annotation:

  1. Names comprising proper nouns without descriptors or names comprising proper nouns with descriptors as integral components. The effective root of an identifying expression has a functor according to its position in the structure. Nodes dependent on the effective root are analyzed according to the standard annotation rules. Articles that are obviously part of the identifying expression are governed by the nouns and get their own node with the functor INTF. The annotation considers the article to be part of the identifying expression when:

    • it immediately precedes a proper noun: the Kremlin, the Koran, The Hague.

    • it precedes a name in general: The Times, The Paris Peace Talks.

    On the other hand, the article is treated in the regular way when it occurs inside an identifying expression: Kerouac's.AUTH {#Idph.DENOM} <On> <the> Road.ID

    The article is always the left sister of adjectival and nominal modifiers and the right sister of possible predeterminers.

    The effective root of an identifying expression of this type gets the appropriate functor according to its function in the sentence. For example:

    I'm reading The.INTF Hours.PAT

    The.INTF United.RSTR Nations.APP Organization.DENOM for Education.BEN , Science.BEN and.CONJ Culture.BEN (Fig. 5.20)

    His.APP Hamlet.ACT is all torn. (Fig. 5.21)

  2. Names introduced or postmodified by a descriptor that is not an integral part of the name: these are represented as identification structures (see Section 3.1, “Identification structure”). For example:

    the book The.INTF Hours.ID

    the Kent.ID cigarettes (the is attached as auxrf to cigarettes)

  3. Names of companies including an abbreviation like Ltd., GmbH., etc.: names of companies are not analyzed at this annotation stage. All members of the subtree get the attribute value [is_name=1], and they will be annotated together later.

  4. Names lacking a governing noun in a prepositionless case (On the Road, Of Mice and Men, Breaking the Waves, Guess Who's Coming to Dinner): like names introduced or postmodified by a descriptor that is not an integral part of the name, this type of identifying expression is represented as an identification structure (see Section 3.1, “Identification structure”). An artificial governing node with the t-lemma substitute #Idph is inserted as the effective root of the subtree.
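
The mapping from the criteria to the four annotated types can be summarized as a small decision sketch. This is only an illustration: the names NameFeatures, NameType and classify are hypothetical, the feature values are assumed to be decided by the annotator, and the order in which the checks are applied is an assumption rather than something the guidelines prescribe.

```python
from dataclasses import dataclass
from enum import Enum


class NameType(Enum):
    PLAIN_OR_INTEGRAL = 1    # type 1: analyzed by the standard rules
    EXTERNAL_DESCRIPTOR = 2  # type 2: identification structure
    COMPANY = 3              # type 3: marked is_name=1, analyzed later
    NO_GOVERNING_NOUN = 4    # type 4: identification structure with #Idph


@dataclass
class NameFeatures:
    # Hypothetical feature bundle; each value is an annotator's judgment.
    has_descriptor: bool = False
    descriptor_is_integral: bool = False
    has_company_abbreviation: bool = False
    has_prepositionless_governing_noun: bool = True
    is_coordination: bool = False


def classify(f: NameFeatures) -> NameType:
    # Assumed order of checks: company names first, then descriptors,
    # then the presence of a governing noun.
    if f.has_company_abbreviation:
        return NameType.COMPANY
    if f.has_descriptor:
        if f.descriptor_is_integral:
            return NameType.PLAIN_OR_INTEGRAL
        return NameType.EXTERNAL_DESCRIPTOR
    # Coordinated names count as lacking a governing noun.
    if f.is_coordination or not f.has_prepositionless_governing_noun:
        return NameType.NO_GOVERNING_NOUN
    return NameType.PLAIN_OR_INTEGRAL
```

Under these assumptions, Lake Michigan (integral descriptor) falls under type 1, Kent cigarettes under type 2, and On the Road under type 4.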

Descriptor that is not an integral part of the name: such a descriptor is a (common) noun written in lowercase letters that introduces (or follows) a proper noun, a title, an expression used metalinguistically, or an expression quoted word for word. It can also occur with the preposition of:

the city <of> Prague.ID

the notice "Danger!".ID

Explicative of-attribute: an expression consisting of an of-phrase modifying a common or a proper noun, to which the following transformation is applicable: the concept of time ➝ time is a (kind of) concept; the person of Christ ➝ Christ is a (kind of) person. This specific type of identifying expression is analyzed with the help of an identification structure (see Section 3.1, “Identification structure”). For example:

the concept <of> time.ID

the issue <of> impeachment.ID

the person <of> Christ.ID
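
The examples above share one shape: a governing node (the descriptor, or the artificial #Idph node when there is no descriptor) with the name attached below it under the functor ID. A minimal sketch of that shape follows. The Node class and the identification_structure function are hypothetical, and the sketch deliberately ignores the internal structure of the name and auxiliary words such as of.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Node:
    t_lemma: str
    functor: str
    children: List["Node"] = field(default_factory=list)


def identification_structure(name_lemma: str, functor: str,
                             descriptor: Optional[str] = None) -> Node:
    """Return the governing node with the name attached as an ID child.

    functor is the functor the whole expression has in the sentence.
    """
    # Without a descriptor, the artificial #Idph node is the effective root.
    root_lemma = descriptor if descriptor is not None else "#Idph"
    root = Node(root_lemma, functor)
    root.children.append(Node(name_lemma, "ID"))
    return root


# the concept <of> time.ID (the functor PAT is chosen only for illustration)
concept = identification_structure("time", "PAT", descriptor="concept")

# {#Idph.DENOM} <On> <the> Road.ID
title = identification_structure("Road", "DENOM")
```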

Proper names of people (the is_name_of_person attribute). The attribute is_name_of_person is set to 1 at every node representing a constituent of a person's proper name (a first name or a surname).
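
As a small illustration of this rule, the sketch below sets the attribute on every node of a person's name. Representing nodes as plain dictionaries and the function name mark_person_name are assumptions made for the example only; the example name is likewise illustrative.

```python
from typing import Dict, Iterable, List


def mark_person_name(nodes: Iterable[Dict[str, object]]) -> List[Dict[str, object]]:
    """Return copies of the nodes with is_name_of_person set to 1."""
    marked = []
    for node in nodes:
        updated = dict(node)  # copy, so the input nodes stay unchanged
        updated["is_name_of_person"] = 1
        marked.append(updated)
    return marked


# Both the first name and the surname receive the attribute.
name_nodes = [{"t_lemma": "Jack"}, {"t_lemma": "Kerouac"}]
marked_nodes = mark_person_name(name_nodes)
```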

Figure 5.20. Identifying expression

The United Nations Organization for Education, Science and Culture.

Figure 5.21. Identifying expression

His Hamlet is all torn.