PMLInstance - Perl extension for loading/saving PML data
use PMLInstance;
Fslib::AddResourcePath( "$ENV{HOME}/my_pml_schemas" );
my $pml = PMLInstance->load({ filename => 'foo.xml' });
my $schema = $pml->get_schema;
my $data = $pml->get_root;
$pml->save();
This class provides a simple implementation of a PML instance.
None by default.
The following export tags are available:
Imports the following constants:
name of the "<LM>" (list-member) tag
name of the "<AM>" (alt-member) tag
XML namespace URI for PML instances
XML namespace URI for PML schemas
space-separated list of supported PML-schema version numbers
Imports internal _die, _warn, and _debug diagnostics commands.
The 'config' option of the load() and save() methods can supply a configuration file: a PML file whose PML schema is pmlbackend_conf_schema.xml, distributed with TrEd. This file can set up defaults for some options of load() and save(), and it can define rules for pre-processing input documents before they are parsed as PML and for post-processing output documents after they are serialized as PML. Currently, only XSLT 1.0 based pre- and post-processing is supported.
Create a new empty PML instance object.
Read a PML instance from a file, filehandle, string, or DOM. This method may be called either on an existing object (in which case it operates on and returns that object) or as a constructor (in which case it creates and returns a new PMLInstance object). Possible options are:
{
filename => $filename, # and/or
fh => \*FH, # or
string => $xml_string, # or
dom => $document, # (XML::LibXML::Document)
parser => $xml_parser, # (XML::LibXML)
config => $cfg_pml, # (PMLInstance)
no_trees => $bool,
no_references => $bool,
no_knit => $bool,
selected_references => { name => $bool, ... },
selected_knits => { name => $bool, ... }
}
where filename may be used either by itself or in combination with
any of fh, string, or dom, which are otherwise mutually
exclusive. The parser option may be used to substitute a customized
XML::LibXML parser object. The config option may be used to pass a
PMLInstance representing a PMLBackend configuration file (see CONFIGURATION). If
no_trees is true, then the roles #TREES, #NODE and #CHILDNODES are ignored.
The option selected_references determines which reffiles (with
non-empty readas attribute) to read; if true, the reffile with a given
name is read, if false, it is never read; if a value is not given for
some reffile, the reffile is read unless the no_references flag is on.
The options selected_knits and no_knit determine
data from which reffiles can be copied into this document
following the rules for the role #KNIT. Their meaning is
just like that for selected_references and no_references.
Moreover, no_references implies no_knit, unless no_knit
is explicitly specified.
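The selection rules above can be read as a small decision procedure. The following is a minimal sketch in plain Perl, not PMLInstance's actual code: the helper names should_read_reference and should_knit are hypothetical, and treating "a value is not given" as "the value is undefined" is an assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: decide whether a reffile with the given name is read.
# An explicit entry in selected_references wins; otherwise no_references decides.
sub should_read_reference {
    my ($opts, $name) = @_;
    my $selected = $opts->{selected_references} || {};
    return $selected->{$name} ? 1 : 0 if defined $selected->{$name};
    return $opts->{no_references} ? 0 : 1;
}

# Hypothetical helper: the same rule for knitting. When no_knit is not
# specified at all, no_references supplies its value.
sub should_knit {
    my ($opts, $name) = @_;
    my $selected = $opts->{selected_knits} || {};
    return $selected->{$name} ? 1 : 0 if defined $selected->{$name};
    my $no_knit = exists $opts->{no_knit}
        ? $opts->{no_knit}
        : $opts->{no_references};
    return $no_knit ? 0 : 1;
}
```

Under these assumptions, should_knit({ no_references => 1, no_knit => 0 }, 'a') is true, because an explicitly specified no_knit overrides the value implied by no_references.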
Returns 1 if the last load() was successful.
Save PML instance to a file or file-handle. Possible options are:
filename, fh, config, refs_save, write_single_LM. If both
filename and fh are specified, fh is used, but the filename
associated with the PMLInstance object is changed to filename. If
neither is given, the filename currently associated with the
PMLInstance object is used. The config option may be used to pass a
PMLInstance representing a PMLBackend configuration file (see CONFIGURATION). The
refs_save option may be used to specify which reference files
should be saved along with the PMLInstance and where to. The value of
refs_save, if given, should be a HASH reference mapping reference
IDs to the target URLs (filenames). If refs_save is given, only
those references listed in the HASH are saved along with the
PMLInstance. If refs_save is undefined or not given, all references
are saved (to their original locations). In both cases, only files
declared as readas='dom' or readas='pml' can be saved.
Translates the current PMLInstance object to a FSFile object
(using FSFile MetaData and AppData fields for storage of non-tree
data). If fsfile argument is not provided, creates a new FSFile object,
otherwise operates on a given fsfile. Returns the resulting FSFile object.
Translates a FSFile object to a PMLInstance object. Non-tree
data are fetched from FSFile MetaData and AppData fields. If called
on an instance, modifies and returns the instance, otherwise creates
and returns a new instance.
Retrieve a possibly nested value from the attribute data structure of $obj. The path argument uses an XPath-like expression of the form
step1/step2/...
where each step (depending on the value retrieved by the preceding part of the expression) can be one of:
a member name: to retrieve that member of a structure
an attribute name: to retrieve that attribute
an element name: to retrieve the first element of that name from a sequence
[n]: to retrieve the n-th element (counting from 1) of a list, sequence, or alternative
name[n]: to retrieve the n-th element named 'name' from a sequence
[n]name: to retrieve the n-th element of a sequence, provided that element's name is 'name'
In the preceding cases, [n] can be negative, in which case the retrieved value is the n-th element from the end of the list or sequence.
If a step of the form [n] is not given for a list or alternative value then [1] is assumed and the next step is processed.
If the value retrieved by some step is undefined or the step does not match the data type of the value retrieved by the preceding steps, the evaluation is stopped and undef is returned.
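To make the step syntax concrete, here is a minimal sketch that only splits a path into classified steps and converts the 1-based path indices into Perl array indices. It is not PMLInstance's parser: parse_path, to_perl_index, and the kind labels are hypothetical names chosen for illustration.

```perl
use strict;
use warnings;

# Hypothetical helper: split an attribute path and classify each step.
sub parse_path {
    my ($path) = @_;
    my @steps;
    for my $step (split m{/}, $path) {
        if ($step =~ /\A\[(-?\d+)\](.*)\z/) {
            # [n] or [n]name
            my ($n, $name) = ($1, $2);
            push @steps, length($name)
                ? { kind => 'index_then_name', n => $n, name => $name }
                : { kind => 'index', n => $n };
        }
        elsif ($step =~ /\A([^\[\]]+)\[(-?\d+)\]\z/) {
            # name[n]
            push @steps, { kind => 'nth_named', name => $1, n => $2 };
        }
        else {
            # plain member, attribute, or element name
            push @steps, { kind => 'name', name => $step };
        }
    }
    return \@steps;
}

# Path indices count from 1; negative indices count from the end,
# which is already how Perl treats negative array indices.
sub to_perl_index {
    my ($n) = @_;
    return $n > 0 ? $n - 1 : $n;
}

my $steps = parse_path('foo/bar[2]/[-4]/baz/[5]bam');
my @kinds = map { $_->{kind} } @$steps;
```

Here @kinds holds name, nth_named, index, name, index_then_name, and to_perl_index maps 2, -4, and 5 to 1, -4, and 4.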
For example,
my $value = PMLInstance::get_data($obj,'foo/bar[2]/[-4]/baz/[5]bam');
is roughly equivalent to
my $el = $obj->{foo}->values('bar')->[1]->[-4]->{baz}->[4];
my $value = $el->name eq 'bam' ? $el->value : undef;
but without the side effect of creating array or hash structures where there is none. To be more specific, if, say $obj->{x} is not defined, then the Perl expression
if ($obj->{x}[3]{y}) {...}
automatically causes a side-effect of creating an ARRAY reference in $obj->{x} and a HASH reference in the fourth element of this ARRAY. An analogous construct
PMLInstance::get_data($obj,'x/[4]/y');
simply returns undef without either of these side-effects.
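The autovivification side effect can be observed in plain Perl, without PMLInstance. The sketch below first triggers it and then shows a guarded lookup that avoids it. The helper safe_get is hypothetical and only illustrates the idea: it takes raw hash keys and 0-based array indices rather than the path syntax, and it is not how get_data is implemented.

```perl
use strict;
use warnings;

# Part 1: plain dereferencing autovivifies, even inside a read-only test.
my $obj = {};
if ($obj->{x}[3]{y}) { }
my $x_type    = ref $obj->{x};          # 'ARRAY', created as a side effect
my $x_length  = scalar @{ $obj->{x} };  # 4
my $elem_type = ref $obj->{x}[3];       # 'HASH'

# Part 2: hypothetical helper that checks the container type before each
# step, so it never dereferences an undefined value.
sub safe_get {
    my ($data, @steps) = @_;
    for my $step (@steps) {
        if (ref $data eq 'HASH') {
            $data = $data->{$step};
        }
        elsif (ref $data eq 'ARRAY' && $step =~ /\A-?\d+\z/) {
            $data = $data->[$step];
        }
        else {
            return;    # undefined value or type mismatch: stop here
        }
    }
    return $data;
}

my $fresh = {};
my $value = safe_get($fresh, 'x', 3, 'y');   # undef; $fresh stays empty
```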
The following behave the same (provided that the path /foo/bar[2] retrieves a list, sequence or an alternative and /foo/bar[2]/[1]/baz retrieves a sequence):
my $value = PMLInstance::get_data($obj,'foo/bar[2]/[1]/baz/[1]bam');
my $value = PMLInstance::get_data($obj,'foo/bar[2]/baz/bam');
This function returns all matches of a given attribute path on the
object. It works just as PMLInstance::get_data except that it recurses
into all values of a list, alt or sequence instead of just the first
one on attribute-path steps that do not give an exact
index. Furthermore, unlike PMLInstance::get_data, this function
expands trailing Lists and Alts:
if the path leads to a List or Alt value, the member values
are returned instead; this replacement is applied recursively.
The expansion of trailing Lists and Alts can be prevented by appending a slash followed by a dot to the attribute path ("$path/.").
Store a given value to a possibly nested attribute of $obj specified
by path. The path argument uses the XPath-like syntax described above
for the method PMLInstance::get_data. If $strict==0 and a non-index step is to be
processed on an alternative or list, then step [1] is assumed and the
1st element of the list or alternative is used for further processing
of the path expression (except when this occurs in the last step, in
which case the entire list or alternative is overwritten by the given
value). If $strict==1 and a non-index step is to be processed on an
alternative or list, a warning is issued and undef is returned. If
$strict==2, the same approach as with $strict==1 is taken, but croak is
used instead of warn.
This function traverses a given PML data structure and dispatches callbacks at all occurrences of given attribute paths.
If called on an object other than a PMLInstance (e.g. Fslib::Struct, Fslib::List, etc.), the corresponding data type (a PMLSchema::* object) can be provided in the \%opts argument as
{ type => $type_decl }
The callback gets one argument: a hash reference of the form
{ value => $matched_obj, path => $matched_obj_path, type => $obj_type_decl }
where $matched_obj_path is the full canonical path to the matching
object. The type key is present in hash only if for_each_match was
called on a PMLInstance or if PMLSchema type of the initial object was
given in \%opts.
The path syntax is as described in PMLInstance::get_data, with the
following differences:
1. Path steps of the form [n] or name[n], where n is a number, are not supported (but steps of the form [n]name work).
2. Additionally, steps can be separated with //. As in XPath, this indicates a descendant axis, which allows arbitrary structures between the steps. For example, a//z matches any data matched by a/z, a/b/z, a/b/c/z, etc. One can also use // at the very beginning of an expression (//a/b) to match an arbitrarily nested occurrence of a/b (e.g. one matching x/y/z/a/b).
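One way to picture the // separator is as a pattern over slash-separated canonical paths. The sketch below compiles such an expression into a regular expression. It is an illustration only: path_to_regex is a hypothetical helper, it handles plain name steps but not [n]name steps, and for_each_match itself traverses the data rather than matching path strings.

```perl
use strict;
use warnings;

# Hypothetical helper: turn a path with // separators into a regex
# that matches slash-separated canonical paths.
sub path_to_regex {
    my ($path) = @_;

    # A leading // means the expression may start at any depth.
    my $anywhere = ($path =~ s{\A//}{});

    # Between two parts joined by //, allow zero or more extra steps.
    my $body = join '(?:/[^/]+)*/',
               map { quotemeta($_) } split m{//}, $path;

    my $prefix = $anywhere ? '(?:[^/]+/)*' : '';
    return qr{\A$prefix$body\z};
}

my $descendant = path_to_regex('a//z');
my $nested     = path_to_regex('//a/b');
```

With this sketch, $descendant matches the strings a/z, a/b/z, and a/b/c/z, while $nested matches both a/b and x/y/z/a/b.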
This function returns all data matching given path or, if the second
argument is an array reference, any of given paths. The path(s), as
well as $obj and \%opts argument are as in
PMLInstance::for_each_match. The function returns an array in
array context and an array reference in scalar context.
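The context-dependent return value is the usual Perl wantarray idiom. A minimal sketch, using a hypothetical stand-in function collect_matches rather than the real implementation:

```perl
use strict;
use warnings;

# Hypothetical stand-in: return the results as a list in list context
# and as an array reference in scalar context.
sub collect_matches {
    my @found = @_;
    return wantarray ? @found : \@found;
}

my @as_list = collect_matches('a', 'b');   # list context
my $as_ref  = collect_matches('a', 'b');   # scalar context
```

Callers that assign the result to a scalar therefore need to dereference it, e.g. with @$as_ref.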
Like PMLInstance::get_all_matches, but returns only the number of
matching objects (without creating any intermediate list).
Hash a given object under a given ID. If warn is true, a warning is issued if the ID was already hashed with a different object.
Lookup an object by ID.
Return the filename (string) or URL (URI object) of the PML instance.
Return URL of the PML instance as URI object.
Change filename of the PML instance.
Return ID of the XSL-based transformation specification which was used to convert between an original non-PML format and PML (and back).
Set ID of an XSL-transformation specification which is to be used for conversion from PML to an external non-PML format (and back).
Return PMLSchema object associated with the PML instance.
Associate a PMLSchema with the PML instance (this method should
not be used for an instance containing data).
Return URL of the PML schema file associated with the PML instance.
Change URL of the PML schema file associated with the PML instance.
Return the root data structure.
Set the root data structure.
Return a Fslib::List object containing data structures with role
'#NODE' belonging in the first block (list or sequence) with role
'#TREES' occurring in the PML instance.
If the PML instance consists of a sequence with role '#TREES', return a
Fslib::Seq object containing the maximal (but possibly empty)
initial segment of this sequence consisting of elements with role
other than '#NODE'.
If the PML instance consists of a sequence with role '#TREES', return
a Fslib::Seq object containing all elements of the sequence
following the first maximal contiguous subsequence of elements with
role '#NODE'.
Return the type declaration associated with the list of trees.
Returns a HASHref mapping file reference IDs to URLs.
Set a given HASHref as a map between reference IDs and URLs.
Returns a HASHref mapping file reference names to reference IDs. Each value of the hash is either an ID string (if there is just one reference with a given name) or a Fslib::Alt containing all IDs associated with a given name.
Set a given HASHref as a map between reference names and reference IDs.
Return a DOM or PMLInstance object representing the referenced resource with a given ID (applies only to resources declared as readas='dom' or readas='pml').
Use a given DOM or PMLInstance object as a resource of the current PMLInstance with a given ID (note that this may break knitting).
Fslib, PMLSchema, PML spec - http://ufal.mff.cuni.cz/jazz/PML/doc, TrEd - http://ufal.mff.cuni.cz/~pajas/tred
Copyright (C) 2006 by Petr Pajas
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.