FSNode

FSNode - Simple OO interface to tree structures of Fslib.pm

new
Create a new FSNode object. FSNode is basically a hash reference, which means that you may simply access a node's attributes as $node->{attribute}.
initialize
This function initializes FSNode. It is called by the constructor new.
destroy
This function destroys a FSNode (and all its descendants). The node should not be attached to a tree.
parent
Return node's parent node (undef if none).
type
Return node's type node (undef if none).
root
Find and return the root of the node's tree.
level
Calculate node's level (root-level is 0).
lbrother
Return node's left brother node (undef if none).
rbrother
Return node's right brother node (undef if none).
firstson
Return node's first dependent node (undef if none).
set_type(type)
Associate FSNode object with a given Fslib::Type.
set_type_by_name (schema,type-name)
Lookup a structure or container declaration in the given Fslib::Schema by its type name and associate the corresponding Fslib::Type with the FSNode.
validate (attr-path?,log?)
Validates the content of the node according to the associated type and schema. If attr-path is non-empty, validate only attribute selected by the attribute path. An array reference may be passed as the 2nd argument log to obtain a detailed report of all validation errors.

Returns: 1 if the content conforms, 0 otherwise.

Note: this method requires PMLBackend (use ImportBackends to load it).

following (top?)
Return the next node of the subtree in the order given by structure (undef if none). If any descendant exists, the first one is returned. Otherwise, right brother is returned, if any. If the given node has neither a descendant nor a right brother, the right brother of the first (lowest) ancestor for which right brother exists, is returned.
following_visible (FSFormat_object,top?)
Return the next visible node of the subtree in the order given by structure (undef if none). A node is considered visible if it has no hidden ancestor. Requires FSFormat object as the first parameter.
following_right_or_up (top?)
Return the next visible node of the subtree in the order given by structure (undef if none), but not descending.
previous (top?)
Return the previous node of the subtree in the order given by structure (undef if none). The way of searching described in following is used here in reversed order.
previous_visible (FSFormat_object,top?)
Return the previous visible node of the subtree in the order given by structure (undef if none). A node is considered visible if it has no hidden ancestor. Requires FSFormat object as the first parameter.
rightmost_descendant (node)
Return the rightmost lowest descendant of the node (or the node itself if the node is a leaf).
leftmost_descendant (node)
Return the leftmost lowest descendant of the node (or the node itself if the node is a leaf).
getAttribute (attr_name)
Return value of the given attribute.
attr (path)
Return value of an attribute specified as a path of the form attr/subattr/[n]/subsubattr/[m], where [n] can be used to pick n-th element of a list or alternative. If alternative or list is encountered but no index is given, then 1st element of the list or alternative is used (except for the case when list or alternative is found in the last path step, in which case the corresponding object - list or alternative - is returned as is).
set_attr (path,value,strict?)
Set value of an attribute specified by a path of the form attr/subattr/[n]/subsubattr/[m], where [n] can be used to pick n-th element of a list or alternative. If strict==0 and an alternative or list is encountered but no index is given, then 1st element of the list or alternative is used (except for the case when list or alternative is found in the last path step, in which case the entire list or alternative is overwritten by the given value). If strict==1 and a list or an alternative is encountered in the value tree but no step of the form [n] is given, a warning is issued and undef is returned. If strict==2, the same approach as with strict==1 is taken, only errors are reported via a croak.
setAttribute (name,value)
Set value of the given attribute.
children
Return a list of dependent nodes.
visible_children(fsformat)
Return a list of visible dependent nodes.
descendants
Return a list of recursively dependent nodes.
visible_descendants(fsformat)
Return a list of recursively dependent visible nodes.
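
The path lookup described for attr can be sketched as follows. This is only an illustration on plain Perl data, not the Fslib implementation: resolve_path is a hypothetical helper, lists and alternatives are both modelled as plain array references, and the [n] step is assumed to be 1-based, which the description above does not state explicitly.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of Fslib: resolve a path such as
# "a/b/[2]/c" against nested hashes. Lists and alternatives are both
# modelled as plain array references (an assumption of this sketch).
sub resolve_path {
    my ($value, $path) = @_;
    for my $step (split m{/}, $path) {
        return undef unless defined $value;
        if ($step =~ /^\[(\d+)\]$/) {
            # Explicit index step; assumed to be 1-based here.
            my $n = $1;
            return undef unless ref($value) eq 'ARRAY' && $n >= 1;
            $value = $value->[$n - 1];
        }
        else {
            # No index given: descend into the 1st element of any
            # list or alternative met before a named step.
            $value = $value->[0] while ref($value) eq 'ARRAY';
            return undef unless ref($value) eq 'HASH';
            $value = $value->{$step};
        }
    }
    # A list or alternative found in the last step is returned as is.
    return $value;
}

# Made-up node data for the example.
my $node = {
    form => 'dog',
    gram => [ { tag => 'NN' }, { tag => 'VB' } ],
};

print resolve_path($node, 'form'), "\n";           # dog
print resolve_path($node, 'gram/tag'), "\n";       # NN
print resolve_path($node, 'gram/[2]/tag'), "\n";   # VB
```

With the real API the same lookups would be written as $node->attr('gram/[2]/tag').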

FSFormat

FSFormat - Simple OO interface for FS instance of Fslib.pm

create (@header)
Create a new FS format instance object by parsing each of the parameters passed as one FS header line.
new (attributes_hash_ref?, ordered_names_list_ref?, unparsed_header?)
Create a new FS format instance object and initialize it with the optional values.
clone
Duplicate FS format instance object.
initialize (attributes_hash_ref?, ordered_names_list_ref?, unparsed_header?)
Initialize a new FS format instance with given values. See Fslib for more information about attribute hash, ordered names list and unparsed headers.
addNewAttribute (type, colour, name, list)
Adds a new attribute definition to the FSFormat. Type must be one of the letters [KPOVNWLH], colour one of characters [A-Z0-9]. If the type is L, the fourth parameter is a string containing a list of possible values separated by |.
readFrom (source,output?)
Reads FS format instance definition from given source, optionally echoing the unparsed input on the given output. The obligatory argument source must be either a GLOB or list reference. Argument output is optional and if given, it must be a GLOB reference.
toArray
Return FS declaration as an array of FS header declarations.
writeTo (glob_ref)
Write FS declaration to a given file (file handle open for writing must be passed as a GLOB reference).
sentord(), order(), value(), hide()
Return names of special attributes declared in FS format as @W, @N, @V, @H respectively.
isHidden (node)
Return the lowest ancestor-or-self of the given node whose value of the FS attribute declared as @H is either 'hide' or 1. Return undef, if no such node exists.
defs
Return a reference to the internally stored attribute hash.
list
Return a reference to the internally stored attribute names list.
unparsed
Return a reference to the internally stored unparsed FS header. Note that this header need not correspond to the defs and attributes if any changes are made to the definitions or names at run-time by hand.
renew_specials
Refresh special attribute hash.
specials
Return a reference to a hash of attributes of special types. Keys of the hash are special attribute types and values are their names.
attributes
Return a list of all attribute names (in the order given by FS instance declaration).
atno (n)
Return the n'th attribute name (in the order given by FS instance declaration).
atdef (attribute_name)
Return the definition string for the given attribute.
count
Return the number of declared attributes.
isList (attribute_name)
Return true if given attribute is assigned a list of all possible values.
listValues (attribute_name)
Return the list of all possible values for the given attribute.
color (attribute_name)
Return one of Shadow, Hilite and XHilite depending on the color assigned to the given attribute in the FS format instance.
special (letter)
Return name of a special attribute declared in FS definition with a given letter. See also sentord() and similar.
indexOf (attribute_name)
Return index of the given attribute (in the order given by FS instance declaration).
exists (attribute_name)
Return true if an attribute of the given name exists.
make_sentence (root_node,separator)
Return a string containing the content of value (special) attributes of the nodes of the given tree, separated by separator string, sorted by value of the (special) attribute sentord or (if sentord does not exist) by (special) attribute order.
clone_node
Create a copy of the given node.
clone_subtree
Create a deep copy of the given subtree.
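
The ordering rule described for make_sentence can be sketched on a flat list of nodes. This is a simplified illustration, not the FSFormat method: sentence_of is a hypothetical helper, the nodes are plain hashes rather than a tree, and the attribute names form, ord and sentord are made up for the example.

```perl
use strict;
use warnings;

# Hypothetical helper, not the FSFormat method: collect the value
# attribute of every node, ordered numerically by the sentord
# attribute or, when no sentord attribute name is given, by order.
sub sentence_of {
    my ($nodes, $names, $separator) = @_;
    my $key = defined $names->{sentord} ? $names->{sentord} : $names->{order};
    my @sorted = sort { $a->{$key} <=> $b->{$key} } @$nodes;
    return join $separator, map { $_->{ $names->{value} } } @sorted;
}

# The names hash plays the role of the @V, @N and @W declarations.
my @nodes = (
    { form => 'barks', ord => 1, sentord => 2 },
    { form => 'dog',   ord => 3, sentord => 1 },
    { form => 'the',   ord => 2, sentord => 0 },
);

my $by_sentord = sentence_of(\@nodes,
    { value => 'form', order => 'ord', sentord => 'sentord' }, ' ');
my $by_order = sentence_of(\@nodes,
    { value => 'form', order => 'ord' }, ' ');

print "$by_sentord\n";   # the dog barks
print "$by_order\n";     # barks the dog
```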

FSFile

FSFile - Simple OO interface for FS files.

SYNOPSIS

  use Fslib;
  my $file="trees.fs";
  my $fs = FSFile->newFSFile($file);
  if ($fs->lastTreeNo<0) { die "File is empty or corrupted!\n" }
  foreach my $tree ($fs->trees) {
    ...    # do something on the trees
  }
  $fs->writeFile("$file.out");

REFERENCE

new (name?,file_format?,FS?,hint_pattern?,attribs_patterns?,unparsed_tail?,trees?,save_status?,backend?,encoding?,user_data?,meta_data?,app_data?)
Create a new FS file object and initialize it with the optional values.
create
Same as new but accepts name => value pairs as arguments. The following argument names are available:

filename, format, FS, hint, patterns, tail, trees, save_status, backend

See initialize for more detail.

clone ($clone_trees)
Create a new FSFile object with the same file name, file format, FSFormat, backend, encoding, patterns, hint and tail as the current FSFile. If $clone_trees is true, populate the new FSFile object with clones of all trees from the current FSFile.
initialize (name?,file_format?,FS?,hint_pattern?,attribs_patterns?,unparsed_tail?,trees?,save_status?,backend?,encoding?,user_data?,meta_data?,app_data?)
Initialize a FS file object. Argument description:
name (scalar)
File name
file_format (scalar)
File format identifier (user-defined string). TrEd, for example, uses FS format, gzipped FS format and any non-specific format strings as identifiers.
FS (FSFormat)
FSFormat object associated with the file
hint_pattern (scalar)
TrEd's hint pattern definition
attribs_patterns (list reference)
TrEd's display attributes pattern definition
unparsed_tail (list reference)
The rest of the file, which is not parsed by Fslib, i.e. Graph's embedded macros
trees (list reference)
List of FSNode objects representing root nodes of all trees in the FSFile.
save_status (scalar)
File save status indicator, 0=file is saved, 1=file is not saved (TrEd uses this field).
backend (scalar)
IO Backend used to open/save the file.
encoding (scalar)
IO character encoding for perl 5.8 I/O filters
user_data (arbitrary scalar type)
Reserved for the user. Content of this slot is not persistent.
meta_data (hashref)
Meta data (usually used by IO Backends to store additional information about the file - i.e. other than encoding, trees, patterns, etc).
app_data (hashref)
Non-persistent application specific data associated with the file (by default this is an empty hash reference). Applications may store temporary data associated with the file into this hash.
readFile (filename, [backends...])
Read FS declaration and trees from a given file. The first argument must be a file-name. If a list of backend modules is specified, test methods of the modules are invoked until one of them succeeds. That module is then used as a backend for opening and parsing the file.

Note: this function sets notSaved to zero.

readFrom (glob_ref, [backends...])
Read FS declaration and trees from a given file (file handle open for reading must be passed as a GLOB reference). This function is limited to use FSBackend only. Sets notSaved to zero.
writeFile (filename)
Write FS declaration, trees and unparsed tail to a given file. Sets notSaved to zero.
writeTo (glob_ref)
Write FS declaration, trees and unparsed tail to a given file (file handle open for writing must be passed as a GLOB reference). Sets notSaved to zero.
newFSFile (filename,encoding?,[backends...])
Create a new FSFile object based on the content of a given file. If a list of backend modules is specified, read methods of the modules are invoked until one of them succeeds in opening and parsing the file.

Optionally, in perl ver. >= 5.8, you may also specify file character encoding.

filename
Return the FS file's file name.
changeFilename (new_filename)
Change the FS file's file name.
fileFormat
Return file format identifier (user-defined string). TrEd, for example, uses FS format, gzipped FS format and any non-specific format strings as identifiers.
changeFileFormat(string)
Change file format identifier.
backend
Return IO backend module name. The default backend is FSBackend, used to save files in the FS format.
changeBackend(string)
Change file backend.
encoding
Return file character encoding (used by Perl 5.8 input/output filters).
changeEncoding(string)
Change file character encoding (used by Perl 5.8 input/output filters).
userData
Return user data associated with the file (by default this is an empty hash reference). User data are not supposed to be persistent and IO backends should ignore it.
changeUserData(value)
Change user data associated with the file. User data are not supposed to be persistent and IO backends should ignore it.
metaData(name)
Return meta data stored into the object usually by IO backends. Meta data are supposed to be persistent, i.e. they are saved together with the file (at least by some IO backends).
changeMetaData(name,value)
Change meta information (usually used by IO backends). Meta data are supposed to be persistent, i.e. they are saved together with the file (at least by some IO backends).
listMetaData(name)
In array context, return the list of metaData keys. In scalar context return the hash reference where metaData are stored.
appData(name)
Return application specific information associated with the file. Application data are not persistent, i.e. they are not saved together with the file by IO backends.
changeAppData(name,value)
Change application specific information associated with the file. Application data are not persistent, i.e. they are not saved together with the file by IO backends.
listAppData(name)
In array context, return the list of appData keys. In scalar context return the hash reference where appData are stored.
FS
Return a reference to the associated FSFormat object.
changeFS(FSFormat_object)
Associate FS file with a new FSFormat object.
hint
Return the TrEd's hint pattern declared in the FSFile.
changeHint(string)
Change the TrEd's hint pattern associated with this FSFile.
pattern_count
Return the number of display attribute patterns associated with this FSFile.
pattern (n)
Return the n'th display pattern associated with this FSFile.
patterns
Return a list of display attribute patterns associated with this FSFile.
changePatterns(list)
Change the list of display attribute patterns associated with this FSFile.
tail
Return the unparsed tail of the FS file (i.e. Graph's embedded macros).
changeTail(list)
Modify the unparsed tail of the FS file (i.e. Graph's embedded macros).
trees
Return a list of all trees (i.e. their roots represented by FSNode objects).
changeTrees (list)
Assign a new list of trees.
treeList
Return a reference to the internal array of all trees (i.e. their roots represented by FSNode objects).
tree (n)
Return a reference to the tree number n.
lastTreeNo
Return number of associated trees minus one.
notSaved (value?)
Return/assign file saving status (this is completely user-driven).
currentTreeNo (value?)
Return/assign index of current tree (this is completely user-driven).
currentNode (value?)
Return/assign current node (this is completely user-driven).
nodes (tree_no, prev_current, include_hidden)
Get list of nodes for given tree. Returns two value list ($nodes,$current), where $nodes is a reference to a list of nodes for the tree and current is either root of the tree or the same node as prev_current if prev_current belongs to the tree. The list is sorted according to the FS->order attribute and inclusion of hidden nodes depends on the boolean value of include_hidden.
value_line (tree_no, no_tree_numbers?)
Return a sentence string for the given tree. Sentence string is a string of chained value attributes (FS->value) ordered according to the FS->sentord or FS->order if FS->sentord attribute is not defined.

Unless no_tree_numbers is non-zero, prepend the resulting string with a "tree number/tree count: " prefix.

value_line_list (tree_no)
Return a list of value (FS->value) attributes for the given tree ordered according to the FS->sentord or FS->order if FS->sentord attribute is not defined.
insert_tree (root,position)
Insert new tree at given position.
set_tree (root,pos)
Set tree at given position.
new_tree (position)
Create a new tree at given position and return pointer to its root.
delete_tree (position)
Delete the tree at given position and return pointer to its root.
destroy_tree (position)
Delete the tree at given position and return pointer to its root.

FSBackend

FSBackend - IO backend for reading/writing FS files using FSFile class.

$emulatePML
This variable controls whether a simple PML schema should be created for FS files (default is 1 - yes). An attribute whose name contains one or more slashes is represented as a (possibly nested) structure where each slash represents one level of nesting. Attributes sharing a common name-part followed by a slash are represented as members of the same structure. For example, attributes a, b/u/x, b/v/x and b/v/y result in the following structure:

{ a => value_of_a, b => { u => { x => value_of_b/u/x }, v => { x => value_of_b/v/x, y => value_of_b/v/y } } }

In the PML schema emulation mode, it is forbidden to have both a and a/b attributes. In such a case the parser reverts to non-emulation mode.

test (filehandle | filename, encoding?)
Test if given filehandle or filename is in FSFormat. If the argument is a file-handle, the filehandle is supposed to have been opened by a previous call to open_backend. In this case, the calling application may need to close the handle and reopen it in order to seek the beginning of the file after the test has read a few characters or lines from it.

Optionally, in perl ver. >= 5.8, you may also specify file character encoding.

read (handle_ref,fsfile)
Read FS declaration and trees from a given file in FS format (file handle open for reading must be passed as a GLOB reference). Return 1 on success, 0 on failure.
write (handle_ref,$fsfile)
Write FS declaration, trees and unparsed tail to a given file in FS format (file handle open for writing must be passed as a GLOB reference).
ParseFSTree ($fsformat,$line,$ordhash)
Parse a given string (line) in FS format and return the root of the resulting FS tree as an FSNode object.
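
The nesting rule described for $emulatePML can be illustrated with a small sketch. It is not taken from FSBackend: nest_attributes is a hypothetical helper that turns slash-separated attribute names into nested hashes and reports the forbidden case (both a and a/b present) by returning undef.

```perl
use strict;
use warnings;

# Hypothetical helper, not part of FSBackend: turn flat attribute
# names containing slashes into a nested hash. Returns undef when a
# name is both a plain attribute and a structure prefix (a and a/b).
sub nest_attributes {
    my (%flat) = @_;
    my %nested;
    for my $name (keys %flat) {
        my @steps = split m{/}, $name;
        my $last  = pop @steps;
        my $slot  = \%nested;
        for my $step (@steps) {
            $slot->{$step} = {} unless exists $slot->{$step};
            # A plain attribute already occupies this prefix.
            return undef unless ref($slot->{$step}) eq 'HASH';
            $slot = $slot->{$step};
        }
        # A structure (or attribute) already occupies this name.
        return undef if exists $slot->{$last};
        $slot->{$last} = $flat{$name};
    }
    return \%nested;
}

my $tree = nest_attributes(
    'a'     => 1,
    'b/u/x' => 2,
    'b/v/x' => 3,
    'b/v/y' => 4,
);
print $tree->{b}{v}{y}, "\n";                       # 4

my $clash = nest_attributes('a' => 1, 'a/b' => 2);
print defined($clash) ? "nested\n" : "conflict\n";  # conflict
```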

Fslib::List

This class implements the attribute value type 'list'.

new(value?,...)
Create a new list (optionally populated with given values).
new_from_ref(array_ref, reuse)
Create a new list consisting of values in a given array reference. Use this constructor instead of new() to pass large lists by reference. If reuse is true, then the same array_ref scalar is reused within the Fslib::List object (i.e. blessed). Otherwise, a copy is created within the constructor.
values()
Returns its values (i.e. the list members).

Fslib::Alt

This class implements the attribute value type 'alternative'.

new(value?,...)
Create a new alternative (optionally populated with given values).
values()
Returns its values (i.e. the alternatives).

Fslib::Seq

This class implements the attribute value type 'sequence'. A sequence consists of items called elements. Each element is a name-value pair. Unlike hashes, sequences are ordered and may contain more than one element with a given name.

elements()
Return a list of [ name, value ] pairs representing the sequence elements.
elements_list()
Like elements, only this method returns a Fslib::List object.
content_pattern()
Return the regular expression constraint stored in the sequence object (if any).
set_content_pattern()
Store a regular expression constraint in the sequence object. This expression can be used later to validate sequence content (see validate() method).
values()
Return a list of values of all elements of the sequence. In array context, the returned value is a list, in scalar context the result is a Fslib::List object.
names()
Return a list of names of all elements of the sequence. In array context, the returned value is a list, in scalar context the result is a Fslib::List object.
element_at(index)
Return the element of the sequence on the position specified by a given index. Elements in the sequence are indexed as elements in Perl arrays, i.e. starting from $[, which defaults to 0 and nobody sane should ever want to change it.
name_at(index)
Return the name of the element on a given position.
value_at(index)
Return the value of the element on a given position.
delegate_names(key?)
If all element values are HASH-references, then it is possible to store each element's name in its value under a given key (that is, to delegate the name to the HASH value). The default value for key is #name. It is a fatal error to try to delegate names if any of the values is not a HASH reference.
validate(content_pattern?)
Check that content of the sequence satisfies a constraint specified by means of a regular expression content_pattern. If no content_pattern is given, the one stored with the object is used (if any; otherwise undef is returned).

Returns: 1 if the content satisfies the constraint, 0 otherwise.

push_element(name, value)
Append a given name-value pair to the sequence.
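
The idea of an ordered sequence of name-value pairs, and of delegating names into the values, can be sketched with plain Perl arrays. The representation below is an assumption made for illustration and says nothing about the Fslib::Seq internals; the element names w and pc and the form key are made up.

```perl
use strict;
use warnings;

# A sequence modelled as an array of [ name, value ] pairs
# (an assumption about representation, not the Fslib::Seq internals).
my @seq;
push @seq, [ w  => { form => 'Hello' } ];
push @seq, [ pc => { form => ','     } ];
push @seq, [ w  => { form => 'world' } ];   # a repeated name is allowed

my @names  = map { $_->[0] } @seq;          # order is preserved
my @values = map { $_->[1] } @seq;

# Name delegation: store each element's name inside its HASH value
# under a key; "#name" is the default key mentioned above.
for my $element (@seq) {
    my ($name, $value) = @$element;
    die "cannot delegate name to a non-HASH value\n"
        unless ref($value) eq 'HASH';
    $value->{'#name'} = $name;
}

print join(' ', @names), "\n";                           # w pc w
print join(' ', map { $_->{'#name'} } @values), "\n";    # w pc w
```

A hash could not hold this content, since the name w occurs twice and the order of the elements matters.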

Fslib::Schema

This class implements elementary support for PML schemas. Although neither its API nor its implementation is stable, it is intended to fully replace the FSFormat class in the future. Currently it is only an XML::Simple representation of a PML schema file. Whether this is favourable or not is yet to be discovered.

new(string)
Parses a given XML representation of the schema and returns a new Fslib::Schema instance.
readFrom(filename,opts)
Reads schema from a given XML file and returns a new Fslib::Schema object.

The 2nd argument, opts, is an optional hash reference with parsing options. The following options are recognized:

base_url - base URL for referred schemas

use_resources - if true, referred schemas are also looked for in the $ResourcePath

revision, minimal_revision, maximal_revision - constrain the revision number of the schema

find_role(type,role)
Starting from a given schema type, locate and return a (possibly deeply nested) subtype of a given role.
node_type(type,role)
Find all types with role #NODE.
get_type_by_name(name)
Returns a HASH structure representing the declaration of the given named type.
resolve_type(type)
Returns type, unless it is only a type-reference in which case it follows the reference and returns the resulting type.
type(type)
Wrap given schema type into a Fslib::Type object and return the object. Both the current schema and the type can be retrieved from the Fslib::Type object.
attributes([type...])
Return attribute-paths to all atomic subtypes of given types. If no types are given, then types with role #NODE are assumed. In a way, this function tries to emulate the behavior of FSFormat->attributes.
validate_object (object, type, log)
Validates the data content of the given object against a specified type. The type may be either the name of a named type, or a Fslib::Type, or a HASH with a parsed declaration.

An array reference may be passed as the optional 3rd argument log to obtain a detailed report of all validation errors.

Returns: 1 if the content conforms, 0 otherwise.

Note: this method requires PMLBackend (use ImportBackends to load it).

validate_field (object, attr-path, type, log)
This method is similar to validate_object, but in this case the validation is restricted to the data substructure of object specified by the attr-path argument.

type is the type of object specified either by the name of a named type, or as a Fslib::Type, or as a HASH with a parsed declaration.

An array reference may be passed as the optional 4th argument log to obtain a detailed report of all validation errors.

Returns: 1 if the content conforms, 0 otherwise.

Note: this method requires PMLBackend (use ImportBackends to load it).


Fslib::Type

This is a wrapper class for a schema type.

new(schema,type)
Return a new Fslib::Type object containing a given type of a given Fslib::Schema.
schema()
Retrieve the Fslib::Schema.
type_decl()
Return the raw Perl structure which resulted from parsing the PML schema declaration by XML::Simple.
members()
If the wrapped schema type is an AVS structure type, return names of its members (attributes), except for a possible member with role #CHILDNODES.
attributes()
Return attribute-paths to all atomic subtypes of the given type.
find(attribute-path)
Locate a subtype specified by a given attribute-path. Attribute path is a /-separated sequence of member and/or element names which identifies a path to a certain nested sub-type in the nesting of structures and element sequences.

Fslib

Fslib.pm - Simple low-level API for treebank files in .fs format. See FSFile, FSFormat and FSNode for an object-oriented abstraction over this module.

DESCRIPTION

This package has the ambition to be a simple and usable perl API for manipulating the treebank files in the .fs format (which was designed by Michal Kren and is the only format supported by his Windows application GRAPH.EXE used to interactively edit treebank analytical or tectogrammatical trees). See also a description of this format at

http://ufal.mff.cuni.cz/pdt/Corpora/PDT_1.0/Doc/fs.html

The Fslib package defines functions for parsing .fs files, extracting headers, reading trees and representing them in memory using simple hash structures blessed to the FSNode class, manipulating the values of node attributes and modifying the structure of the trees.

DATA STRUCTURES REPRESENTING NODES AND TREES

A tree is represented by its root-node. A node is an FSNode object, which in turn is a usual Perl hash reference where hash keys are names of attributes and hash values are the corresponding attribute values. Four special keys (defined as global variables) are reserved for representing the tree structure. These are namely $Fslib::parent, $Fslib::firstson, $Fslib::rbrother, and $Fslib::lbrother. Another special key $Fslib::type is sometimes used to store Fslib::Type information. It is highly recommended to use FSNode API instead of accessing these hash keys and the corresponding $Fslib::... variables directly.

ReadEscapedLine (FH)
 Params:
   FH - a file handle, e.g. STDIN
 Returns:
   This auxiliary function reads lines from FH until
   one without a trailing backslash is encountered. Returns
   concatenation of all lines read with all trailing backslash
   characters removed.
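
The line-joining behaviour described for ReadEscapedLine can be sketched as below. This is a hypothetical re-implementation for illustration, not the code from Fslib: read_escaped_line is a made-up name, and the sketch assumes that the newline following a trailing backslash is dropped together with the backslash.

```perl
use strict;
use warnings;

# Hypothetical re-implementation for illustration only; the real
# function lives in Fslib.
sub read_escaped_line {
    my ($fh) = @_;
    my $result = '';
    while (defined(my $line = <$fh>)) {
        if ($line =~ s/\\\r?\n\z//) {
            $result .= $line;       # continued: backslash and newline dropped
            next;
        }
        return $result . $line;     # first line without a trailing backslash
    }
    return length($result) ? $result : undef;   # end of input
}

# Read from an in-memory file handle instead of a real file.
my $data = "[form=a,\\\nord=1]\nnext line\n";
open my $fh, '<', \$data or die "cannot open in-memory handle: $!";
print read_escaped_line($fh);    # [form=a,ord=1]
print read_escaped_line($fh);    # next line
close $fh;
```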
Next($node,[$top]), Prev($node,[$top])
 Params:
   $node - a reference to a tree hash-structure
   $top  - a reference to a tree hash-structure, containing
           the node referenced by $node
 Return:
   Reference to the next or previous node of $node on
   the backtracking way along the tree having its root in $top.
   The $top parameter is NOT obligatory and may be omitted.
   Return zero if $top or the root of the tree is reached.
   There is no need to use this function directly. You should
   use the FSNode->following method instead.
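
The traversal order used by Next (and by FSNode->following) can be sketched on plain hashes. The keys parent, firstson, rbrother and lbrother below are stand-ins chosen for this example, not the actual values of the $Fslib::... variables, and add_child and next_node are hypothetical helpers. Unlike Next, this sketch returns undef rather than zero at the end of the traversal.

```perl
use strict;
use warnings;

# Plain keys stand in for the structural hash keys; the actual key
# names used by Fslib are not shown here.
sub add_child {
    my ($parent, $child) = @_;
    $child->{parent} = $parent;
    if (my $last = $parent->{firstson}) {
        $last = $last->{rbrother} while $last->{rbrother};
        $last->{rbrother}  = $child;
        $child->{lbrother} = $last;
    }
    else {
        $parent->{firstson} = $child;
    }
    return $child;
}

# Hypothetical equivalent of the "next node" step: first descendant,
# else right brother, else the right brother of the nearest ancestor
# that has one; never leaves the subtree rooted in $top.
sub next_node {
    my ($node, $top) = @_;
    return $node->{firstson} if $node->{firstson};
    while ($node) {
        return undef if $top && $node == $top;
        return $node->{rbrother} if $node->{rbrother};
        $node = $node->{parent};
    }
    return undef;
}

my $root = { name => 'root' };
my $x    = add_child($root, { name => 'x' });
add_child($x,    { name => 'x1' });
add_child($x,    { name => 'x2' });
add_child($root, { name => 'y' });

my @order;
for (my $n = $root; $n; $n = next_node($n, $root)) {
    push @order, $n->{name};
}
print "@order\n";    # root x x1 x2 y
```

Passing $x as the second argument restricts the walk to the subtree of x, so the step after x2 yields undef instead of y.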
Cut($node)
 Params:
   $node - a reference to a node
  Description:
   Cuts (disconnects) $node from its parent and brothers
  Returns:
   $node
Paste($node,$newparent,$fsformat)
 Params:
   $node      - a reference to a (cut or new) node
   $newparent - a reference to the new parent node
   $fsformat  - FSFormat object
 Description:
   Connects $node to $newparent and links it
   with its new brothers, placing it to position
   corresponding to its numerical-argument value
   obtained via $fsformat->order.
 Returns $node
CloneValue($scalar)
 Params:
   $scalar - arbitrary Perl scalar
  Description:
   Returns a deep copy of the Perl structures contained
   in a given scalar.
  Returns:
   a deep copy of $scalar
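
What a deep copy means for attribute values can be shown with a small sketch. deep_copy is a hypothetical helper written for this example, not the CloneValue implementation, and it only handles unblessed hashes and arrays.

```perl
use strict;
use warnings;

# Hypothetical deep-copy helper for nested hashes and arrays; other
# reference types and blessed objects are copied by reference here.
sub deep_copy {
    my ($value) = @_;
    if (ref($value) eq 'HASH') {
        my %copy;
        $copy{$_} = deep_copy($value->{$_}) for keys %$value;
        return \%copy;
    }
    if (ref($value) eq 'ARRAY') {
        return [ map { deep_copy($_) } @$value ];
    }
    return $value;
}

# Made-up attribute data for the example.
my $original = { form => 'dog', gram => [ { tag => 'NN' } ] };
my $copy     = deep_copy($original);

$copy->{gram}[0]{tag} = 'VB';
print $original->{gram}[0]{tag}, "\n";   # NN: the original is untouched
```

For general Perl data outside Fslib, the core Storable module provides dclone for the same purpose.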
FindInResources($filename)
 Params:
   $filename - a relative path to a file
 Description:
    If a given filename is a relative path of a file found in TrEd's
    resource directory, return an absolute path for the
    resource. Otherwise return filename.
ResolvePath($ref_filename,$filename,$use_resources?)
 Params:
   $ref_filename - a reference filename
   $filename     - a relative path to a file
   $use_resources - 0 or 1
  Description:
   If a given filename is a relative path, try to find the file in the
   same directory as ref-filename. In case of success, return a path
   based on the directory part of ref-filename and filename.  If the
   file can't be located in this way and use_resources is true, return
   the value of FindInResources(filename).
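
The lookup order described for ResolvePath can be sketched with core modules only. resolve_relative is a hypothetical helper, the callback stands in for FindInResources, and returning the filename unchanged when nothing is found is an assumption of this sketch, since the description above does not cover that case.

```perl
use strict;
use warnings;
use File::Basename qw(dirname);
use File::Spec;

# Hypothetical sketch of the lookup order described above. The
# $resource_lookup callback stands in for FindInResources.
sub resolve_relative {
    my ($ref_filename, $filename, $resource_lookup) = @_;

    # Absolute paths need no resolution.
    return $filename if File::Spec->file_name_is_absolute($filename);

    # Try the directory of the reference file first.
    my $candidate = File::Spec->catfile(dirname($ref_filename), $filename);
    return $candidate if -f $candidate;

    # Fall back to the resource lookup, if one was supplied.
    return $resource_lookup->($filename) if $resource_lookup;

    # Assumption: hand the name back unchanged when nothing is found.
    return $filename;
}

# Made-up paths; the result depends on what exists on disk.
my $found = resolve_relative('/data/corpus/trees.fs', 'schema.xml',
                             sub { "resources/$_[0]" });
print "$found\n";
```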
ImportBackends(@backends)
 Params:
   @backends  - a list of backend names
 Description:
   Demand to load the given backends and return a list of
   backends for which the demand was fulfilled. These
   backends may then be freely used in FSFile IO calls.
OBSOLETED functions
 FirstSon($node), Parent($node), LBrother($node), RBrother($node)
 Params:

   $node - a FSNode object

 Returns:

   Parent, first son, left brother or right brother resp. of the node
   referenced by $node

   There is no need to use these functions directly. You should
   use FSNode methods instead.