Converting between SGML and FS

Petr Pajas


Table of Contents

1. Introduction
2. Installation
3. Usage
4. Examples

1. Introduction

any2any is a command-line utility for conversion between different file formats representing a tree structure, including FS (Graph's and TrEd's native format), CSTS (PDT1.0 specific SGML format) and TrTree XML (which is a simple XML variant of the general FS format).

2. Installation

any2any is part of the TrEd distribution, so if Perl and TrEd are installed properly, any2any should work without further setup. There is also a stand-alone distribution of any2any for those who do not wish to install TrEd. To install the stand-alone distribution, copy the whole any2any directory to any location you want and run any2any from that location. Make sure that Perl 5 is installed on your system (a Windows(TM) version is available from ActiveState). James Clark's SGML parser nsgmls is required for conversion from the CSTS SGML format. Make sure that this tool is installed on your system and that the directory containing the nsgmls executable is in your PATH.
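The prerequisites above can be checked before running any conversion. The following is a minimal sketch, not part of any2any itself; the helper name missing_tools is hypothetical, and it assumes the executables are named perl and nsgmls on your system.

```python
import shutil


def missing_tools(tools=("perl", "nsgmls")):
    """Return the names from `tools` that cannot be found on PATH.

    The default tool names are assumptions; adjust them if your
    installation uses different executable names.
    """
    return [tool for tool in tools if shutil.which(tool) is None]


if __name__ == "__main__":
    missing = missing_tools()
    if missing:
        print("Not found on PATH: " + ", ".join(missing))
    else:
        print("perl and nsgmls were both found on PATH.")
```

Note that nsgmls is only needed when converting from CSTS, so a missing nsgmls does not matter for conversions that start from FS files.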

3. Usage

any2any [-s suffix-pattern] [-a suffix] [-p prefix-pattern] [-r prefix] [-f output-format] [-T] [-N] [-G] [-C] input-file...

or

any2any [-u | -h]

Each input file is opened and converted to the output format specified by the -f parameter. The name of each output file is constructed from the input file name as follows:

By default, the output file name is the same as the input file name, and a backup file (with a tilde ~ appended) is created to keep a copy of the original file. If an -s pattern is given, it is interpreted as a regular expression, and if a trailing part of the file name matches it, that part is deleted. The value of the -a parameter (if any) is then appended to the end of the file name. If a -p pattern is given, it is interpreted as a regular expression, and the matching part at the beginning of the file name (the full name, including path and extension) is deleted. Finally, the prefix given after -r (if any) is prepended to the file name.
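The naming steps above can be sketched in a few lines of Python. This illustrates the described procedure and is not the tool's actual code: the function output_name is hypothetical, the patterns are assumed to be anchored at the end (-s) and at the start (-p) of the name, and Python's regular expression syntax is used in place of Perl's.

```python
import re


def output_name(input_name, s_pattern=None, a_suffix=None,
                p_pattern=None, r_prefix=None):
    """Derive an output file name from an input file name."""
    name = input_name
    if s_pattern is not None:
        # -s: delete the trailing part matched by the pattern
        # (anchoring at the end of the name is an assumption)
        name = re.sub(r"(?:%s)\Z" % s_pattern, "", name, count=1)
    if a_suffix is not None:
        # -a: append the suffix
        name = name + a_suffix
    if p_pattern is not None:
        # -p: delete the leading part matched by the pattern
        # (anchoring at the start of the name is an assumption)
        name = re.sub(r"\A(?:%s)" % p_pattern, "", name, count=1)
    if r_prefix is not None:
        # -r: prepend the prefix
        name = r_prefix + name
    return name


# Hypothetical input file name, with the option values -s fs -a csts -p fs/ -r sgml/
print(output_name("fs/sample.fs", "fs", "csts", "fs/", "sgml/"))
# prints sgml/sample.csts
```

With no options given, the name is returned unchanged, which corresponds to the default case where the output replaces the input and a backup copy is kept.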

Parameter options

-f

Output format. The following output formats are available: fs (TrEd's native file format), csts (PDT 1.0 specific SGML format), TrXML (XML variant of the general FS format).

-s

Regular expression that should match the trailing part of the file name which is to be deleted.

-a

Suffix to be appended to the output file name (after the trailing part matched by -s is deleted from it).

-p

Regular expression that should match the beginning of the file name (including path) which is to be deleted.

-r

Prefix to be inserted at the beginning of the output file name (after the part matched by -p is deleted from it).

Conversion flags

-T

Build the tectogrammatical tree structure and use the tectogrammatical FS-File header when converting from CSTS.

-N

Fill empty ordering attributes with sentence ordering number when converting from CSTS.

-G

If set, the element <g> is not written to the output for nodes dependent on the node with <g>0 (usually the root node) when converting to CSTS.

-C

This flag is provided for backward compatibility with Dan Zeman's older conversion tool, cstsfs, which treats some CSTS elements differently.

4. Examples

To convert all FS files *.fs in the fs/ subdirectory to the CSTS SGML format, storing the output files in the sgml/ subdirectory with the .csts extension, the following command may be used:

any2any -s fs -a csts -p fs/ -r sgml/ -f csts fs/*.fs

To do the reverse, converting CSTS files from sgml/ to FS files in fs/, the following command may be used:

any2any -s csts -a fs -p sgml/ -r fs/ -f fs sgml/*.csts

Note

On some systems you may have to call any2any as perl any2any.
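When conversions are scripted, the command lines from the examples can be assembled programmatically. The sketch below only builds the argument list from the options documented above; the function build_command and its parameter names are hypothetical, and the via_perl variant assumes that the any2any script is in the current directory.

```python
def build_command(files, output_format, s_pattern=None, a_suffix=None,
                  p_pattern=None, r_prefix=None, via_perl=False):
    """Build an any2any argument list from the documented options."""
    # On some systems the script has to be called through perl
    cmd = ["perl", "any2any"] if via_perl else ["any2any"]
    options = (("-s", s_pattern), ("-a", a_suffix),
               ("-p", p_pattern), ("-r", r_prefix))
    for flag, value in options:
        if value is not None:
            cmd += [flag, value]
    cmd += ["-f", output_format]
    cmd += list(files)
    return cmd


# Equivalent of the first example, for a single hypothetical input file
print(build_command(["fs/sample.fs"], "csts", "fs", "csts", "fs/", "sgml/"))
```

The resulting list can be passed to subprocess.run. Since no shell is involved in that case, wildcards such as fs/*.fs have to be expanded first, for example with glob.glob.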