Documentation for the pdt20 extension

PML_A

PML_A.mak - Miscellaneous macros for the analytic layer of Prague Dependency Treebank (PDT) 2.0.

TectogrammaticalTree()

This function is only available in TrEd (i.e. in the GUI). After a previous call to AnalyticalTree, it switches the current view back to the tectogrammatical tree that refers to the current analytical tree.

GetCoordMembers($node,$no_recurse)

If the given node is a coordination or apposition (i.e. its afun is Coord or Apos), return a list of the corresponding coordination members (nodes with the is_member flag set). Otherwise return the node itself.

Unless $no_recurse is true, the function is recursively applied to those members that are themselves Coord or Apos.

This function differs from ExpandCoord in handling coordination members below AuxP and AuxC. While this function returns the nodes with the is_member flag (i.e. the nodes below AuxP and AuxC), ExpandCoord returns the AuxP and AuxC nodes above them.

GetMember($node)

This is a helper function used to identify and reach the actual coordination or apposition members (carrying is_member flag) from child nodes of a Coord or Apos.

Given a node, return it if its is_member attribute is 1. If its afun is AuxC or AuxP, recurse to the child nodes. In all other cases return an empty list.
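The traversal described for GetMember and GetCoordMembers can be modelled as follows. This is only a sketch in Python over a hypothetical Node class, not the Perl macros themselves and not the Treex::PML::Node API; the attribute names afun and is_member come from the text above, everything else (class, function and example node names) is an assumption made for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for an analytical node."""
    name: str
    afun: str = ""
    is_member: bool = False
    children: List["Node"] = field(default_factory=list)


def get_member(node):
    """Return the node if it is a member; look below AuxC/AuxP; else nothing."""
    if node.is_member:
        return [node]
    if node.afun in ("AuxC", "AuxP"):
        return [m for child in node.children for m in get_member(child)]
    return []


def get_coord_members(node, no_recurse=False):
    """Return the members of a Coord/Apos node, or the node itself otherwise."""
    if node.afun not in ("Coord", "Apos"):
        return [node]
    members = [m for child in node.children for m in get_member(child)]
    if no_recurse:
        return members
    # Members that are themselves Coord/Apos are expanded in turn.
    return [x for m in members for x in get_coord_members(m)]


# Example: "dogs and (with) cats", plus a shared modifier that is not a member.
tree = Node("and", afun="Coord", children=[
    Node("dogs", afun="Sb", is_member=True),
    Node("with", afun="AuxP", children=[
        Node("cats", afun="Adv", is_member=True),
    ]),
    Node("big", afun="Atr"),
])
member_names = [n.name for n in get_coord_members(tree)]
print(member_names)  # ['dogs', 'cats']
```

Note that the member found below the AuxP node is returned, not the AuxP node itself.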

ExpandCoord($node,$keep?)

If the given node is coordination or apposition (according to its Analytical function - attribute afun) expand it to a list of coordinated nodes. Otherwise return the node itself. If the argument keep is true, include the coordination/apposition node in the list as well.

This function differs from GetCoordMembers in handling coordination members below AuxP and AuxC. Unlike the latter, it returns the AuxP and AuxC nodes above the actual coordination members.

IsMember($node)

This is a helper function used to identify coordination or apposition members among child nodes of a Coord or Apos. It returns 1 if the given node has is_member=1 or if it has afun=AuxC or AuxP and a child node for which IsMember (recursively called) returns 1. If neither of these two conditions is met, the function returns 0.
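The IsMember test and its use in ExpandCoord can be sketched like this. Again this is a Python model over a hypothetical Node class, not the actual macros. The recursion of expand_coord into nested coordinations and the position of the kept coordination node at the front of the list are assumptions; the text above does not specify them.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """Hypothetical stand-in for an analytical node."""
    name: str
    afun: str = ""
    is_member: bool = False
    children: List["Node"] = field(default_factory=list)


def is_member_node(node):
    """1 if the node is a member, or an AuxC/AuxP above a member; else 0."""
    if node.is_member:
        return 1
    if node.afun in ("AuxC", "AuxP") and any(
            is_member_node(child) for child in node.children):
        return 1
    return 0


def expand_coord(node, keep=False):
    """Expand a Coord/Apos node to its coordinated children."""
    if node.afun not in ("Coord", "Apos"):
        return [node]
    result = [node] if keep else []
    for child in node.children:
        if is_member_node(child):
            # Assumption: nested coordinations are expanded recursively.
            result.extend(expand_coord(child, keep))
    return result


tree = Node("and", afun="Coord", children=[
    Node("dogs", afun="Sb", is_member=True),
    Node("with", afun="AuxP", children=[
        Node("cats", afun="Adv", is_member=True),
    ]),
    Node("big", afun="Atr"),
])
expanded = [n.name for n in expand_coord(tree)]
expanded_keep = [n.name for n in expand_coord(tree, keep=True)]
print(expanded)       # ['dogs', 'with']
print(expanded_keep)  # ['and', 'dogs', 'with']
```

Here the AuxP node "with" is returned in place of the member below it, which is the difference from GetCoordMembers noted above.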

GetSentenceString($tree?)

Return string representation of the given tree (suitable for Analytical trees).

DiveAuxCP($node)

You can use this function as the $through argument to GetEParents and as the $dive argument to GetEChildren. It skips all prepositions and conjunctions when looking for nodes, which is what you usually want.

GetEParents($node,$through)

Return the linguistic parents of a given node as they appear in an analytic tree. The argument $through should supply a function that accepts one node as an argument and returns true if the node should be skipped on the way to the parent, or 0 otherwise. The most common choice, DiveAuxCP, is provided in this package.

GetEChildren($node,$dive)

Return a list of nodes linguistically dependent on a given node. $dive is a function which is called to test whether a given node should be used as a terminal node (in which case it should return false) or whether it should be skipped and its children processed instead (in which case it should return true). The most usual treatment is provided in DiveAuxCP. If $dive is omitted, a function returning 0 for all arguments is used.
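The role of the $dive callback can be illustrated with a deliberately simplified model. The Python sketch below only shows how a predicate decides between "use this child" and "skip it and look at its children"; it ignores the coordination handling that the real GetEChildren performs, and the Node class and function names are hypothetical.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Node:
    """Hypothetical stand-in for an analytical node."""
    name: str
    afun: str = ""
    children: List["Node"] = field(default_factory=list)


def dive_aux_cp(node):
    """Model of DiveAuxCP: skip prepositions (AuxP) and conjunctions (AuxC)."""
    return node.afun in ("AuxC", "AuxP")


def never_dive(node):
    """Default used when no dive function is supplied: never skip a node."""
    return False


def effective_children(node, dive: Callable[[Node], bool] = never_dive):
    """Collect children, descending through every child for which dive is true.

    Simplification: coordinations are NOT resolved here.
    """
    result = []
    for child in node.children:
        if dive(child):
            result.extend(effective_children(child, dive))
        else:
            result.append(child)
    return result


verb = Node("went", afun="Pred", children=[
    Node("he", afun="Sb"),
    Node("to", afun="AuxP", children=[Node("school", afun="Adv")]),
])
plain = [n.name for n in effective_children(verb)]
diving = [n.name for n in effective_children(verb, dive_aux_cp)]
print(plain)   # ['he', 'to']
print(diving)  # ['he', 'school']
```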

ANodeToALexRf(a_node,t_node,t_file)

Adds given a-node's id to a/lex.rf of the given t-node and adjusts t_lemma of the t-node accordingly. The third argument t_file specifies the Treex::PML::Document object to which the given t-node belongs.

ANodeToAAuxRf(a_node,t_node,t_file)

Appends given a-node's id to a/aux.rf of the given t-node. The third argument t_file specifies the Treex::PML::Document object to which the given t-node belongs.

CreateStylesheets()

Creates default stylesheet for PML analytical files unless already defined. Most of the colors it uses can be redefined in the tred config file .tredrc by adding a line of the form

  CustomColorsomething = ...

The stylesheet is named PML_A and it has the following display features:

1.

sentence is displayed in CustomColorsentence. If the form was changed (e.g. because of a typo), the original form is displayed in CustomColorspell with overstrike.

2.

analytical function is displayed in CustomColorafun. If the node's is_member is set to 1, the type of the structure is indicated by Co (coordination) or Ap (apposition) in CustomColorcoappa. For is_parenthesis_root, Pa is displayed in the same color.

PML_A2T

PML_A2T.mak - Helper macros for writing a- to t-layer transformations over the Prague Dependency Treebank (PDT) 2.0 data

CreateTFile($a_file?)

Creates and returns an empty t-file (Treex::PML::Document object) linked with a given a-file. It associates the newly created file with the tdata_schema.xml PML schema. If no a_file is given, the current file is used. Initially, the newly created t-file contains no trees. Trees can be added using AddNewTTree.

AddNewTTree($t_file?)

Creates a new t-tree linked with the current a-tree and appends the newly created t-tree to a given t-file. Initially the t-tree consists of the root node alone. More nodes can be added e.g. using PML_T::NewNode($parent) and linked to a-nodes using PML_A::ANodeToALexRf and PML_A::ANodeToAAuxRf.

If no t-file is given, the t-file currently associated with the current a-file is used (if any). See also the more generic macro InitTTree.

InitTTree($t_file,$t_root,$a_root)

Initialize a given t-root node based on a given a-root node. An empty t-root node to be used with this function can be created using either the NewTree or NewTreeAfter macros, or by a direct call to $t_file->new_tree($file_position).

PML_A_Edit

PML_A_Edit.mak - Miscellaneous macros for editing the analytic layer of Prague Dependency Treebank (PDT) 2.0.

AddThisToALexRf()

If called from analytical tree entered through PML_T_Edit::MarkForARf, adds this node's id to a/lex.rf list of the marked tectogrammatical node.

AddThisToAAuxRf()

If called from analytical tree entered through PML_T_Edit::MarkForARf, adds this node's id to a/aux.rf list of the marked tectogrammatical node.

RemoveThisFromARf()

If called from analytical tree entered through PML_T_Edit::MarkForARf, remove this node's id from a/lex.rf and a/aux.rf of the marked tectogrammatical node.

PML_A_View

PML_A_View.mak - Miscellaneous macros for viewing the analytic layer of Prague Dependency Treebank (PDT) 2.0.

PML_M

PML_M.mak - Miscellaneous macros for the morphological layer of the Prague Dependency Treebank (PDT) 2.0.

GetSentenceString($tree?)

Return the original sentence string.

CreateStylesheets()

Creates default stylesheet for PML analytical files unless already defined. Most of the colors it uses can be redefined in the tred config file .tredrc by adding a line of the form

  CustomColorsomething = ...

The stylesheet is named PML_A and it has the following display features:

1.

sentence is displayed in CustomColorsentence. If the form was changed (e.g. because of a typo), the original form is displayed in CustomColorspell with overstrike.

2.

analytical function is displayed in CustomColorafun. If the node's is_member is set to 1, the type of the structure is indicated by Co (coordination) or Ap (apposition) in CustomColorcoappa. For is_parenthesis_root, Pa is displayed in the same color.

PML_T

PML_T.mak - Miscellaneous macros for the tectogrammatic layer of Prague Dependency Treebank (PDT) 2.0.

AFile($fsfile?)

Return analytical file associated with a given (tectogrammatical) file. If no file is given, the current file is assumed.

GetANodeIDs($node?)

Returns a list of IDs of analytical nodes referenced from a given tectogrammatical node. If no node is given, the function applies to $this.

GetANodeREFs($node?)

Returns a list of PMLREFs of analytical nodes referenced from a given tectogrammatical node. If no node is given, the function applies to $this. This function is similar to GetANodeIDs() but it doesn't strip the file-ref part of the reference.

GetANodes($node?,$fsfile?)

Returns a list of analytical nodes referenced from a given tectogrammatical node. This combines references from a/tree.rf (root node), a/aux.rf and a/lex.rf (non-root nodes). If no node is given, the function applies to $this. If the node belongs to a file other than the current file, the optional second argument must specify the corresponding Treex::PML::Document object.

GetALexNode($node?)

Returns an analytical node referenced from a/lex.rf attribute of a given tectogrammatical node. If no node is given, the function applies to $this.

GetAAuxNodes($node?)

Returns a list of analytical nodes referenced from a/aux.rf attribute of a given tectogrammatical node. If no node is given, the function applies to $this.

GetANodeByID($id_or_ref,$fsfile?)

Looks up an analytical node by its ID (or PMLREF - i.e. the ID preceded by a file prefix of the form a#). This function only works if the current file is a tectogrammatical file and the requested node belongs to an analytical file associated with it.

GetANodesHash()

Return a reference to a hash indexing analytical nodes of the analytical file associated with the current tectogrammatical file. If such a hash was not yet created, it is created upon the first call to this function (or to other functions calling it, such as GetANodes or GetANodeByID).

ClearANodesHash()

Clear the internal hash indexing analytical nodes of the analytical file associated with the current tectogrammatical file.
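The interplay of the three lookup functions above (resolving an ID or PMLREF, building the index on first use, and clearing it) can be sketched as follows. This is a Python model, not the Perl implementation; the class name ANodeIndex, the dict-based nodes and the example IDs are all hypothetical.

```python
def strip_file_prefix(id_or_ref):
    """Turn a PMLREF such as 'a#some-id' into the bare ID 'some-id'."""
    return id_or_ref.split("#", 1)[-1]


class ANodeIndex:
    """Hypothetical model of a lazily built ID -> a-node index."""

    def __init__(self, a_nodes):
        self._a_nodes = a_nodes  # list of dicts, each with an "id" key
        self._hash = None        # built on first use
        self.builds = 0          # counts how often the index was (re)built

    def nodes_hash(self):
        """Model of GetANodesHash: build the index only when it is missing."""
        if self._hash is None:
            self._hash = {node["id"]: node for node in self._a_nodes}
            self.builds += 1
        return self._hash

    def clear(self):
        """Model of ClearANodesHash: drop the index; it is rebuilt on demand."""
        self._hash = None

    def node_by_id(self, id_or_ref):
        """Model of GetANodeByID: accepts a bare ID or a PMLREF."""
        return self.nodes_hash().get(strip_file_prefix(id_or_ref))


index = ANodeIndex([{"id": "a-node-1", "form": "Hello"},
                    {"id": "a-node-2", "form": "world"}])
by_ref = index.node_by_id("a#a-node-2")
by_id = index.node_by_id("a-node-2")
builds_before_clear = index.builds
index.clear()
index.node_by_id("a-node-1")
builds_after_clear = index.builds
```

Clearing the index is only needed when the set of analytical nodes changes; the next lookup then rebuilds it.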

AnalyticalTree()

This function is only available in TrEd (i.e. in GUI). It switches current view to an analytical tree associated with a currently displayed tectogrammatical tree.

DrawCorefArrows()

Called from node_style_hook. Draws coreference arrows using the following properties: textual arrows in CustomColor arrow_textual, grammatical arrows in arrow_grammatical (dashed in the Full stylesheet), complement arrows in arrow_compl (dot-dashed in the Full stylesheet), segment arrows in arrow_segm and exophora arrows in arrow_exoph.

IsCoord($node?)

Check if the given node is a coordination according to its TGTS functor (attribute functor).

ExpandCoord($node,$keep?)

If the given node is a coordination or apposition (according to its TGTS functor - attribute functor), expand it to a list of coordinated nodes. Otherwise return the node itself. If the argument $keep is true, include the coordination/apposition node in the list as well.

GetSentenceString($tree?)

Return string representation of the given tree (suitable for Tectogrammatical trees).

GetEParents($node)

Return the linguistic parents of a given node as they appear in a tectogrammatical tree.

GetEChildren($node?)

Return a list of nodes linguistically dependent on a given node.

GetEDescendants($node?)

Return a list of all nodes linguistically subordinated to a given node (not including the node itself).

GetEAncestors($node?)

Return a list of all nodes linguistically superordinated to (i.e. governing) a given node (not including the node itself).

GetESiblings($node?)

Return the linguistic siblings of a given node as they appear in a tectogrammatic tree. This includes neither the node itself nor those children of the node's linguistic parent that are in coordination with the node.

GetNearestNonMember($node?)

If the node is not a member of a coordination, return the node. If it is a member of a coordination, return the node representing the highest coordination $node is a member of.

IsFiniteVerb($node?)

If the node is the head of a finite complex verb form (based on m/tag of the referenced analytical nodes), return 1, else return 0.

IsPassive($node?)

If the node is the head of a passive-only verb form (based on m/tag of the referenced analytical nodes), return 1, else return 0.

IsInfinitive($node?)

If the node is the head of an infinitive complex verb form (based on m/tag of the referenced analytical nodes), return 1, else return 0.

IsModalVerbLemma($lemma)

Return 1 if the given lemma is a member of the list of all possible modal verb lemmas (morphological lemma suffixes matching /[-`_].*/ are ignored).
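The suffix handling can be illustrated with a small Python model. The regular expression is the one quoted above; the set of modal lemmas is only an illustrative two-item subset chosen for this example, not the list actually used by PML_T, and the suffixed lemma strings are made up.

```python
import re

# Illustrative subset only; the real list in PML_T is not reproduced here.
MODAL_LEMMAS = {"muset", "moci"}


def strip_lemma_suffix(lemma):
    """Drop everything from the first '-', '`' or '_' onwards."""
    return re.sub(r"[-`_].*", "", lemma)


def is_modal_verb_lemma(lemma):
    """Return 1 if the suffix-free lemma is in the modal list, else 0."""
    return 1 if strip_lemma_suffix(lemma) in MODAL_LEMMAS else 0


stripped = strip_lemma_suffix("muset-1")           # hypothetical suffixed lemma
modal = is_modal_verb_lemma("moci_^(some_gloss)")  # hypothetical suffixed lemma
not_modal = is_modal_verb_lemma("jít")
```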

ModalVerbLemma($lemma)

Deprecated alias for IsModalVerbLemma.

CreateStylesheets()

Creates default stylesheets for PML tectogrammatic files unless already defined. Most of the colors they use can be redefined in the tred config file .tredrc by adding a line of the form

  CustomColorsomething = ...

Default values can be found in PML.mak. The stylesheets are named PML_T_Compact and PML_T_Full. Compact stylesheet is suitable to be used on screen because it pictures many features by means of colours whilst the Full stylesheet is better for printing because it lists the values of almost all the attributes.

The stylesheets have the following features (if the stylesheet is not mentioned, the description talks about the Compact one):

1.

t_lemma is displayed on the first line. If the node's is_parenthesis is set to 1, the t_lemma is displayed in CustomColor parenthesis in the Compact stylesheet. If the node's sentmod is non-empty, its value is displayed in CustomColor detail after a dot. If there is a coreference leading to a different sentence, the t_lemma of the referred node is displayed in CustomColor coref, too.

2.

Node's functor is displayed in CustomColor func. If the node's subfunctor or is_state are defined, they are indicated in CustomColor subfunc. In the Full stylesheet, is_member is also displayed as "M" in CustomColor coappa and is_parenthesis as "P" in CustomColor parenthesis.

3.

For nodes of all types other than complex, nodetype is displayed in CustomColor nodetype. For complex nodes, their gram/sempos is displayed in CustomColor complex. In the Full stylesheet, all the non-empty values of grammatemes are listed in CustomColor detail, and for ambiguous values the names of the attributes are displayed in CustomColor detailheader.

4.

Generated nodes are displayed as squares, non-generated ones as ovals.

5.

The current node is displayed larger and with an outline in CustomColor current.

6.

Edges from nodes to roots or from nodes with functor PAR, PARTL, VOCAT, RHEM, CM, FPHR, and PREC to their parents are thin, dashed and have the CustomColor line_normal. Edges from coordination heads with is_member are thin and displayed in CustomColor line_member. Edges from other nodes with is_member to their coordination parents are displayed with the lower half thick in CustomColor line_normal and upper half thin in CustomColor line_member. Edges from nodes without is_member to their coordination parents are displayed thin in CustomColor line_comm. Edges from coordination nodes without is_member to their parents are displayed with the lower half thin in CustomColor line_member and upper half thick in CustomColor line_normal. All other edges are displayed half-thick in CustomColor line_normal.

7.

The attribute tfa is reflected by the colour of the node. CustomColors tfa_t, tfa_f, tfa_c, and tfa_no are used. In the Full stylesheet, the value is also displayed before the functor in tfa_text.

8.

Attributes gram, is_dsp_root, is_name_of_person, and quot are listed in the hint box when the mouse cursor is over the node. In the Full stylesheet, they are displayed on the last line in CustomColor detail (see 3).

DeleteNode(node?)

Deletes $node or $this, attaches all its children to its parent and recounts deepord. Cannot be used for the root.
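What DeleteNode does to the tree can be modelled as follows. This is a Python sketch over a hypothetical TNode class, not the Perl macro. In particular, "recounts deepord" is interpreted here as renumbering the remaining nodes consecutively from 0 in their existing deepord order; that interpretation is an assumption.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(eq=False)  # compare by identity; avoids recursing through parent links
class TNode:
    """Hypothetical stand-in for a tectogrammatical node."""
    name: str
    deepord: int
    parent: Optional["TNode"] = None
    children: List["TNode"] = field(default_factory=list)


def add_child(parent, name, deepord):
    child = TNode(name, deepord, parent=parent)
    parent.children.append(child)
    return child


def subtree(node):
    """Return the node and all its descendants."""
    nodes = [node]
    for child in node.children:
        nodes.extend(subtree(child))
    return nodes


def delete_node(root, node):
    """Remove node, re-attach its children to its parent, renumber deepord."""
    if node.parent is None:
        raise ValueError("the root cannot be deleted")
    parent = node.parent
    position = parent.children.index(node)
    for child in node.children:
        child.parent = parent
    # The deleted node's children take its place among the parent's children.
    parent.children[position:position + 1] = node.children
    node.parent, node.children = None, []
    # Assumption: deepord is renumbered consecutively, preserving order.
    ordered = sorted(subtree(root), key=lambda n: n.deepord)
    for new_ord, n in enumerate(ordered):
        n.deepord = new_ord


root = TNode("root", 0)
a = add_child(root, "a", 1)
b = add_child(a, "b", 2)
c = add_child(a, "c", 3)
d = add_child(root, "d", 4)
delete_node(root, a)
```

After the call, b and c hang directly below the root and the gap left in deepord by the deleted node is closed.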

DeleteSubtree(node?)

Deletes $node or $this and its whole subtree and recounts deepord. Cannot be used for the root.

NewNode(node?,id?)

Adds a new node as a child of the given node (or of the current node if none is given) and initializes the new node using InitNode. If id is specified, it is assigned to the new node. Otherwise, a unique ID is computed using NewID and assigned to the node. Note that not all the required attributes are set!

InitNode(node,obj?)

Initialize already existing Treex::PML::Node object as a t-node by associating it with t-node PML schema type. If the node belongs to a different file than the current one, the Treex::PML::Document or some already initialized node of that file must be specified as the second argument. Returns the initialized node.

NewID(node?)

Tries to compute a new unique ID based on the IDs in the tree to which the given node belongs. If no node is specified, the global variable $root is used. Returns the computed ID.

OpenValFrameList(node?,options...)

Open a window with a list of possible valency frames for a given node, highlighting frames currently assigned to the node. All given options are passed to the appropriate ValLex::GUI method. The most commonly used are -no_assign => 1 to suppress the Assign button, -assign_func => sub { my ($node,$frame_ids,$frame_text)=@_; ... } to specify custom code for assigning the selected frame_ids to a node, -lemma and -pos to override t_lemma and sempos of the node, -frameid to override the frames currently assigned to the node, and -noadd => 1 to forbid adding new words to the lexicon (also implied by -no_assign).

OpenValLexicon(options...)

Open the valency lexicon editor/browser GUI. All given options are passed to the appropriate ValLex::GUI method. The most commonly used are -lemma and -pos to override t_lemma and sempos of the node and -frameid to override the frames currently assigned to the node.

PML_T_Edit

PML_T_Edit.mak - Miscellaneous macros for editing the tectogrammatic layer of Prague Dependency Treebank (PDT) 2.0.

AddCoref(node,target,coref)

If the node does not refer to target by the coref of type $coref, make the reference, else delete the reference.
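The toggling behaviour of AddCoref can be sketched like this. It is a Python model using plain dicts and lists in place of t-nodes; the function name toggle_coref, the example IDs and the choice of coref_text.rf as the reference attribute in the example are assumptions made for illustration.

```python
def toggle_coref(node, target_id, coref_attr):
    """Add the reference if it is missing, remove it if it is present.

    Returns True if the reference exists after the call, False otherwise.
    """
    refs = node.setdefault(coref_attr, [])
    if target_id in refs:
        refs.remove(target_id)
        return False
    refs.append(target_id)
    return True


node = {"id": "t-node-1"}
first = toggle_coref(node, "t-node-2", "coref_text.rf")
after_first = list(node["coref_text.rf"])
second = toggle_coref(node, "t-node-2", "coref_text.rf")
after_second = list(node["coref_text.rf"])
```

Calling the function twice with the same arguments therefore leaves the node as it was before the first call.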

MouseEditLinks(ask,@_ of value_line_click_hook)

Enables changes of a/aux.rf and a/lex.rf by clicking on the words of the sentence (value line). 1. Alt + left click: makes the word the a/lex.rf of the current node. 2. Ctrl + left click: creates a new child of the current node from the word. 3. Shift + left click: adds the word to a/aux.rf or removes it from the linked a-nodes.

If ask is set to 1, asks for attributes for a new node.

RememberNode()

Remembers current node to be used later, e.g. with text_arrow_to_remembered.

MarkForARf()

Enter the analytical layer with the current node remembered. By calling PML_A_Edit::AddThisToA... you can make links between the layers.

PML_T_View

PML_T_View.mak - Miscellaneous macros for viewing the tectogrammatic layer of Prague Dependency Treebank (PDT) 2.0.

ShowValFrames

Displays valency frames for the lemma of the current node and highlights those assigned to that node.