Treex::PML::Schema - Perl class implementing a PML schema.
This class implements PML schemas. A PML schema consists of a set of type declarations of several kinds, represented by objects inheriting from a common base class, Treex::PML::Schema::Decl.
This class inherits from Treex::PML::Schema::Template.
Some methods use so-called 'attribute paths' to navigate through nested and referenced type declarations. An attribute path is a '/'-separated sequence of steps, where each step can be one of the following:
!type-name
'!' followed by the name of a named type (this step can only occur as the very first step)
name (of a member of a structure, element of a sequence or attribute of a container)
specifying the type declaration of the named component
#content
the string '#content', specifying the content type declaration of a container
LM
specifying the type declaration of a list
AM
specifying the type declaration of an alt
[NNN]
where NNN is a decimal number (ignored); equivalent to LM or AM
Steps of the form LM, AM, and [NNN] (except when occurring at the end of an attribute path) may be omitted.
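The step kinds above can be modelled in a few lines of plain Perl. The following is a minimal sketch only: the helper names classify_step, split_path and strip_optional_steps are hypothetical and are not part of Treex::PML, and the real parser may treat corner cases differently.

```perl
use strict;
use warnings;

# Classify one step of an attribute path (illustrative helper, not library code).
sub classify_step {
    my ($step, $is_first) = @_;
    return 'named-type' if $is_first && $step =~ /^!./;
    return 'content'    if $step eq '#content';
    return 'list'       if $step eq 'LM';
    return 'alt'        if $step eq 'AM';
    return 'index'      if $step =~ /^\[\d*\]$/;
    return 'name';
}

# Split a path into steps; a leading '/' (root-anchored path) is dropped here.
sub split_path {
    my ($path) = @_;
    $path =~ s{^/}{};
    return split m{/}, $path;
}

# Drop LM, AM and [NNN] steps everywhere except at the end of the path,
# keeping a leading '/' if the path had one.
sub strip_optional_steps {
    my ($path) = @_;
    my $prefix = $path =~ m{^/} ? '/' : '';
    my @steps  = split_path($path);
    my $last   = $#steps;
    my @kept   = grep {
        my $kind = classify_step($steps[$_], $_ == 0);
        $_ == $last || $kind !~ /^(?:list|alt|index)$/;
    } 0 .. $last;
    return $prefix . join('/', @steps[@kept]);
}

print join(', ', map { classify_step($_, 0) } split_path('a/LM/b/[1]/#content')), "\n";
print strip_optional_steps('!w.type/children/LM/form/AM'), "\n";
```

The final LM, AM or [NNN] step is kept because, per the rule above, only non-final steps of these kinds are optional.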
This module exports constants for declaration types.
Export constant symbols (exported by default, too).
See Treex::PML::Schema::Constants.
NOTE: Don't call this constructor directly, use Treex::PML::Factory->createPMLSchema() instead!
Parses an XML representation of a PML Schema from a string, filehandle, local file, or URL, processing the modular instructions as described in http://ufal.mff.cuni.cz/jazz/PML/doc/pml_doc.html#processing, and returns the corresponding Treex::PML::Schema object.
One of the following options must be given:
string
an XML string to parse
filename
a file name or URL
fh
a file-handle (IO::File, IO::Pipe, etc.) open for reading
The following options are optional:
base_url
base URL for referred schemas (useful when parsing from a file-handle or a string)
use_resources
if this option is used with a true value, the parser will attempt to locate referred schemas also in Treex::PML resource paths.
revision, minimal_revision, maximal_revision
constraints on the revision number of the schema.
validate
if this option is used with a true value, the parser will validate the schema on the fly using a RelaxNG grammar given in the relaxng_schema parameter; if relaxng_schema is not given, the file 'pml_schema_inline.rng', searched for in Treex::PML resource paths, is assumed.
relaxng_schema
a particular RelaxNG grammar to validate against. The value may be a URL or filename for the grammar in the RelaxNG XML format, or an XML::LibXML::RelaxNG object. The compact format is not supported.
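As a hedged illustration of the options above, the sketch below builds an options hash with exactly one source option and passes it to Treex::PML::Factory->createPMLSchema(). The inline schema is a made-up minimal example, the check for exactly one source is a hypothetical convenience (not something the library is known to require of the caller), and the call is skipped when Treex::PML is not installed.

```perl
use strict;
use warnings;

# A tiny schema used only for illustration (hypothetical content).
my $xml = <<'XML';
<?xml version="1.0" encoding="utf-8"?>
<pml_schema xmlns="http://ufal.mff.cuni.cz/pdt/pml/schema/" version="1.1">
  <root name="doc" type="doc.type"/>
  <type name="doc.type">
    <structure>
      <member name="title"><cdata format="any"/></member>
    </structure>
  </type>
</pml_schema>
XML

# One source option (string, filename or fh) plus any optional ones.
my %opts = (
    string        => $xml,
    use_resources => 1,   # also look for referred schemas in resource paths
);

# Hypothetical sanity check: exactly one source must be present.
my @sources = grep { exists $opts{$_} } qw(string filename fh);
die "give exactly one of: string, filename, fh\n" unless @sources == 1;

# Only attempt the real call when the library is available.
my $schema;
if (eval { require Treex::PML; 1 }) {
    $schema = eval { Treex::PML::Factory->createPMLSchema(\%opts) };
    warn "schema could not be parsed: $@" unless $schema;
}
```

Passing a hash reference follows the form shown for the obsolete alias, Treex::PML::Schema->new({%$opts, filename=>$filename}).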
An obsolete alias for Treex::PML::Schema->new({%$opts, filename=>$filename}).
This method serializes the Treex::PML::Schema object to XML. See Treex::PML::Schema::XMLNode->write for implementation.
IMPORTANT: The resulting schema is simplified; that is, all modular instructions are processed and removed from it. See http://ufal.mff.cuni.cz/jazz/PML/doc/pml_doc.html#processing
One of the following options must be given:
string
a scalar reference to which the XML is to be stored as a string
filename
a file name
fh
a file-handle (IO::File, IO::Pipe, etc.) open for writing
The following options are optional:
no_backups
if this option is used with a true value, the writer will not attempt to create backup (tilde) files when overwriting an existing file.
no_indent
if this option is used with a true value, the writer will not add additional newlines and indentation white-space to the resulting XML.
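A minimal sketch of serializing to a string, under stated assumptions: the wrapper schema_to_string is hypothetical, and the method name write is assumed from the reference to Treex::PML::Schema::XMLNode->write above rather than confirmed here. The argument is expected to be an already parsed schema object.

```perl
use strict;
use warnings;

# Hypothetical wrapper: serialize a schema object into a Perl string.
# Assumes the object has a write() method accepting the options listed above.
sub schema_to_string {
    my ($schema) = @_;
    my $xml = '';
    $schema->write({
        string    => \$xml,   # scalar reference that receives the XML
        no_indent => 1,       # no extra newlines or indentation
    });
    return $xml;
}
```

Note that the string option takes a scalar reference, not the scalar itself, since the writer stores its output into it.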
Return location of the PML schema file.
Return PML version the schema conforms to.
Return PML schema revision.
Return PML schema description.
Return the root type declaration (see Treex::PML::Schema::Root).
Like $schema->get_root_decl->get_content_decl.
Return the constant PML_SCHEMA_DECL (for compatibility with the Treex::PML::Schema::Decl interface).
Return the string 'schema' (for compatibility with the Treex::PML::Schema::Decl interface).
Return name of the root element for PML instance.
Return names of all named type declarations.
This method returns a list of HASHrefs containing information about named references to PML instances (each hash will currently have the keys 'name' and 'readas').
This method retrieves information about a specific named instance reference as a hash (currently with keys 'name' and 'readas').
This method traverses all nested declarations and sub-declarations and calls a given subroutine passing the sub-declaration object as a parameter.
Locate a declaration specified by attribute-path starting from declaration decl. If decl is undefined, the root type declaration is used. (Note that attribute paths starting with '/' are always evaluated starting from the root declaration, and paths starting with '!' followed by the name of a named type are evaluated starting from that type.) All references to named types are transparently resolved in each step.
The caller should pass a true value in noresolve to enforce Member, Attribute, Element, Type, or Root declaration objects to be returned rather than declarations of their content.
An attribute path is a '/'-separated sequence of steps (member, attribute, or element names, or strings matching [\d*]) which identifies a certain nested type declaration. A step of the form [\d*] matches the content declaration of a List or Alt. Note, however, that named steps dive into List or Alt declarations automatically, too.
Return a list of declarations (objects derived from Treex::PML::Schema::Decl) that have role equal to role.
If start_decls is specified, it must be an ARRAY reference of declarations; in that case, only declarations nested below the listed ones are considered.
WARNING: this function can be very slow, especially if the type declarations are recursive.
Return a list of attribute paths leading to nested type declarations of decl with role equal to role.
This is equivalent to
$schema->find_decl($decl,sub{ $_[0]->{role} eq $role },$opts);
Please see the documentation for find_decl for more information.
WARNING: this function can be very slow, especially if the type declarations are recursive.
Return a list of attribute paths leading to nested type declarations of decl for which a given callback returns a true value. The tested type declaration is passed to the callback as the first (and only) argument.
If start_decls is specified, it must be an ARRAY reference of declarations; in that case, only declarations nested or referred to from the listed ones are considered.
In array context, all matching nested declarations are returned. In scalar context, only the first one is returned (with early stopping).
The last argument opts can be used to pass some flags to the algorithm. Currently only the flag no_childnodes is available. If true, then the function never recurses into the content declaration of declarations with the role #CHILDNODES.
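The callback, context and no_childnodes behaviour described above can be modelled on plain hashes. This is a sketch of the described semantics only, not the library's implementation: the function find_paths, the hash layout with role and children keys, and the traversal order are all hypothetical, and the model returns paths as stated in the first sentence of the description.

```perl
use strict;
use warnings;

# Mock declarations: plain hashes with an optional role and named children.
# Real declarations are Treex::PML::Schema::Decl objects; this is only a model.
my $tree = {
    role     => '#NODE',
    children => {
        form => {},
        kids => {
            role     => '#CHILDNODES',
            children => { deep => { role => '#KNIT' } },
        },
        ref  => { role => '#KNIT' },
    },
};

# Collect paths to nested declarations accepted by $callback.
# List context: all matches. Scalar context: first match, stopping early.
sub find_paths {
    my ($decl, $callback, $opts) = @_;
    my $want_all = wantarray;
    my @found;
    my @queue = map { [ $_, $decl->{children}{$_} ] }
                sort keys %{ $decl->{children} || {} };
    while (my $item = shift @queue) {
        my ($path, $d) = @$item;
        if ($callback->($d)) {
            return $path unless $want_all;   # early stop in scalar context
            push @found, $path;
        }
        # With no_childnodes set, do not descend below #CHILDNODES.
        next if $opts->{no_childnodes} && ($d->{role} // '') eq '#CHILDNODES';
        push @queue, map { [ "$path/$_", $d->{children}{$_} ] }
                     sort keys %{ $d->{children} || {} };
    }
    return $want_all ? @found : undef;
}

# The callback receives the tested declaration as its only argument.
my $is_knit = sub { ($_[0]{role} // '') eq '#KNIT' };

my @all   = find_paths($tree, $is_knit, {});
my $first = find_paths($tree, $is_knit, {});
my @flat  = find_paths($tree, $is_knit, { no_childnodes => 1 });
```

The predicate guards against declarations without a role, which a bare string comparison on an undefined value would otherwise warn about.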
Return a list of all type declarations with the role #NODE.
Return the declaration of the named type with a given name (see Treex::PML::Schema::Type).
Validates the data content of the given object against a specified type declaration. The type_decl argument must either be an object derived from the Treex::PML::Schema::Decl class or the name of a named type.
An array reference may be passed as the optional 3rd argument log to obtain a detailed report of all validation errors.
The flags argument can specify flags that influence the validation. The following constants can be binary-OR'ed to obtain the flags:
PML_VALIDATE_NO_TREES - do not validate nested data with roles #CHILDNODES or #TREES and do not require that objects with the role #NODE implement the Treex::PML::Node role.
PML_VALIDATE_NO_CHILDNODES - do not validate nested data with the role #CHILDNODES.
Returns: 1 if the content conforms, 0 otherwise.
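Combining the flags is ordinary bitwise arithmetic. In the sketch below the constants are defined locally with placeholder values purely so that the snippet runs on its own; in real code they come from the library, and their actual numeric values are not assumed here. The commented call shape follows the argument order described above (log as the 3rd argument, flags after it), which is an assumption.

```perl
use strict;
use warnings;

# Placeholder definitions for illustration only: distinct bits are assumed,
# the real values are whatever the library exports.
use constant {
    PML_VALIDATE_NO_TREES      => 1,
    PML_VALIDATE_NO_CHILDNODES => 2,
};

# Binary-OR the constants to obtain the flags value.
my $flags = PML_VALIDATE_NO_TREES | PML_VALIDATE_NO_CHILDNODES;

# Testing a single bit in the combined value.
my $skip_trees = ($flags & PML_VALIDATE_NO_TREES) ? 1 : 0;

# Assumed call shape, with $schema, $object and $type_decl supplied by the caller:
#   my @log;
#   my $ok = $schema->validate_object($object, $type_decl, \@log, $flags);
#   print "$_\n" for @log unless $ok;
```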
This method is similar to validate_object, but in this case the validation is restricted to the data substructure of object specified by the attr-path argument.
type is the type of object, specified either by the name of a named type, or as a Treex::PML::Type, or a type declaration.
An array reference may be passed as the optional 3rd argument log to obtain a detailed report of all validation errors.
Returns: 1 if the content conforms, 0 otherwise.
This method returns a list of all non-periodic canonical paths leading from given types to atomic values. Currently only the following options are supported:
no_childnodes => $bool
If true, the method does not descend to member types with the role #CHILDNODES.
no_nodes => $bool
If true, the method does not descend to member types with the role #NODE (except for the starting types).
with_LM => $bool
If true, the paths will include an LM step for each List type on the path.
with_AM => $bool
If true, the paths will include an AM step for each Alt type on the path.
with_Seq_brackets => $bool
If true, the paths will append a [0] after each step representing a sequence element.
This function tries to emulate the behavior of Treex::PML::FSFormat->attributes to some extent.
Return attribute paths to all atomic subtypes of given type declarations. If no type declaration objects are given, then types with role #NODE are assumed. This function never descends to subtypes with role #CHILDNODES.
Prague Markup Language (PML) format: http://ufal.mff.cuni.cz/jazz/PML/
Tree editor TrEd: http://ufal.mff.cuni.cz/~pajas/tred
Related packages: Treex::PML, Treex::PML::Schema::Template, Treex::PML::Schema::Decl, Treex::PML::Instance
Copyright (C) 2006-2010 by Petr Pajas
This library is free software; you can redistribute it and/or modify it under the same terms as Perl itself, either Perl version 5.8.2 or, at your option, any later version of Perl 5 you may have available.