The English analytical layer has been obtained by an automatic conversion of the output of an automatic parser; in the PCEDT case the dependency MST parser and Pennconverter have been used.
The tree structure as produced by the MST parser differs in certain points from the main guidelines for the PDT-based analytical layer. Since we wanted this layer (albeit automatically generated) to be as close to the Czech annotation (of the same layer) as possible, additional conversion of the output of the parser has been done. For instance, complex verb forms (e.g. would have been playing) are always governed by the lexical verb node while the auxiliary verbs are its daughters. However, no "new" annotation has been introduced at this stage. The resulting annotation style differs from the specification of the analytical layer for Czech only in some minor points.
Figure 1
Figure 1 presents an example of a tree on the analytical (surface-dependency) layer (an a-layer tree, or a-tree). This section and the following section describe its structure and the inner structure of the complex labels of analytical nodes (a-nodes). The annotated sentence has a form of a dependency tree (with nodes and edges, as usual). Each tree has an extra technical root node, the attributes of which are only a subset of the attributes of the regular nodes and which differ also in their definition from all the other nodes (see below).
Figure 2
Unlike the tectogrammatical representation and the phrase-structure representations, the analytical representation reflects only tokens that are actually present in the text. It does not use any trace- or gap-like "artificial" nodes. The following attributes (complex labels) are attached to the analytical-layer nodes (see also Figure 2):
Figure 3
The topmost node in a tree is always the technical root (afun AuxS). The sentence punctuation (afun AuxK) is governed directly by the technical root. Linguistically, of course, the topmost node is (within the dependency framework) always the main predicate, or a coordination node (afun Coord) governing several main predicates as members of the coordination. The modifying word is always governed by the word it syntactically modifies. Hence, arguments and adverbials of a verb are governed by that verb (with a preposition or a subordinator in between, if the modifier is a prepositional phrase or a subordinate clause), adjectives modifying nouns are governed by the nouns and adverbs modifying adjectives are governed by the adjectives.
Figure 4
A coordination is governed by a node with the afun Coord (see Figure 4). This can be a punctuation (e.g. comma, colon or semicolon) or a coordinating conjunction (e.g. and, or). The members of the coordination get their afuns according to their position and function in the sentence, followed by the "suffix" _Co. The original Czech annotation scheme distinguishes among coordination, apposition and parenthesis. The English annotation uses this scheme only for coordination; sequences identified as apposition or parenthesis are not regarded as paratactic. Rather, their members are regarded as being governed by the first member. Only coordination members have thus is_member="1".
Complex verb forms consisting of a lexical verb and one or several auxiliary verbs are always governed by the lexical verb. This node gets its afun according to its position and function in the sentence. The auxiliary verbs are its children (see Figure 5). In the group will be used, the node used governs the nodes will and be). Verbs of control (e.g. plan to do something) govern the controlled verbs; so do modal verbs.
Figure 5
Prepositions and subordinating conjunctions always govern "their" content words. They get the "auxiliary" afuns AuxP, AuxC (see Figure 4); it is the content word (governed by a preposition/subordinating conjunction) that gets an afun according to its syntactic function in the sentence. In a clause, the predicate is the root node of the subtree representing the clause. E.g. in the sentence "I said that I like it", that is governed by said and gets the afun AuxC. In turn, that governs the node like, which has the afun Obj (object).
Modifiers of nouns, even the prepositional ones, are treated as Atr (attributes) and are governed by the nouns (Figure 4).
Nominal part of a copular predicate gets the afun Pnom. This can be a noun (Martin is a briliant speaker), an adjective (Joan is pretty) or a prepositional group (This was of great importance). Verb complements of other verbs are marked as Obj (e.g. They considered him smart).
The existential there is always the clause subject (see Figure 4). Determiners are governed by nouns and get the afun AuxA (see Figure 4; the afun AuxA does not exist in Czech). Verbless clauses are treated as cases of verbal ellipsis and the governing node gets the afun ExD.
The possible values of the afun (analytical function) attribute are listed here.
Please note: the English analytical annotation is only automatic and thus imperfect, with errors both in the structure and in the labeling. Therefore, there might be inconsistency (in terms of the linguistic information provided) along the node links between the manually annotated tectogrammatical layer and the analytical one. In such a case, the tectogrammatical layer (obviously) contains the correct interpretation of the sentence in question (barring annotation errors).