ntred

ntred - controller/hub/client interface to a cluster of btred servers


SYNOPSIS

To query the servers:

  ntred [-m|-I macro-file] [-e code] [--hub hub] [--port port]
        [-N|-H] [-T] [--key-file file] [--filelist file-list] [-L files]
        -- script-arguments

To start remote servers and a hub:

  ntred -i [--servers server[,server,...]] [--serverlist server-list]
           [--filelist file-list] [--max-servers num] [--no-secondary-files]
           [--no-server-check] [--no-server-start] [--no-hub-start]
           [--old-dist-method] [--safe-mode] [--server-debug]
           [-m|-I macro-file] [-M module] [--btred path-to-btred]
           [--ssh ssh-command] [--local]
           [--key-file file] [--hub hub] [--port port] [file [...]]

To close all remote servers and a hub:

  ntred --quit|--ps-quit

To kill all remote servers and a hub:

  ntred --kill|--ps-kill

To manage files on the servers:

  ntred --list-files|--list-changed-files
  ntred --reload-files
  ntred --reload-macros [-m macro-file]
  ntred --load-files [--filelist file-list] [file [...]]
  ntred --close-files
  ntred --save-files|--save-changed-files [-s strip-sfx]
       [-a append-sfx] [-p strip-prefix] [-r add-prefix] [-f out-fmt]
  ntred --dump-files [--filelist file-list] [file [...]]
  ntred --upload-file filename < fs-file

Get help:

  ntred -u          for usage (synopsis)
  ntred -h          for help
  ntred --man       for the manual page

DESCRIPTION

This program can start one or more btred servers on a set of host machines over SSH, create a proxy hub that relays communication between the servers and a client, distribute given files over the servers (provided the servers are able to load the files from the given filenames, e.g. because they share the files over NFS), query the servers using a btred macro, and collect the answers. It is highly recommended to use a password-free authentication method for SSH (e.g. Kerberos or ssh-agent), so that a password need not be entered each time an SSH connection is made (at least two connections per host).

In client mode, the standard output of the macro is printed to the client's STDOUT. STDERR is reserved for debugging and informational messages as well as error messages caused by the macros on the servers. The rest of a server's error output is stored in the file <logdir>/ntred-server-<host>.log, where <logdir> can be specified using --logdir or the NTRED_LOGDIR, TMP, or TEMP environment variables, and defaults to /tmp if none of these is set.
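The lookup of the server log location described above can be sketched as a small shell helper. This is only an illustration, not ntred's own code: the function names ntred_logdir and ntred_server_log are hypothetical, and the precedence among NTRED_LOGDIR, TMP and TEMP is assumed to follow the order in which they are listed.

```shell
# Hypothetical helper (not part of ntred): resolve <logdir>.
# $1 optionally stands in for the value given to --logdir.
# Assumption: NTRED_LOGDIR, TMP and TEMP are consulted in this order.
ntred_logdir() {
  if [ -n "${1:-}" ]; then
    printf '%s\n' "$1"
  elif [ -n "${NTRED_LOGDIR:-}" ]; then
    printf '%s\n' "$NTRED_LOGDIR"
  elif [ -n "${TMP:-}" ]; then
    printf '%s\n' "$TMP"
  elif [ -n "${TEMP:-}" ]; then
    printf '%s\n' "$TEMP"
  else
    printf '%s\n' /tmp
  fi
}

# Hypothetical helper: path of the log file for one server host.
# $1 is the host name, $2 optionally the --logdir value.
ntred_server_log() {
  printf '%s/ntred-server-%s.log\n' "$(ntred_logdir "${2:-}")" "$1"
}
```

For example, ntred_server_log node1 prints the path where the remaining error output of the server on the (hypothetical) host node1 would be expected.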


OPTIONS

QUERY MODE OPTIONS

  ntred [-m macro-file] [-e code] [--hub hub] [--port port]
        [-N|-H] [-T] [--key-file file] -- script-arguments
--execute|-e code
This is the query to be executed on the remote servers. It is usually either the name of a macro defined in the file given with --macro-file or a Perl one-liner. If omitted, it defaults to 'autostart()', and a macro with this name must be defined in the macro file provided.
--macro-file|-m filename
A file containing macros that are to be sent to the servers and preloaded on the servers before the particular query code is evaluated. The query code to be evaluated must be specified in --execute.

Note that the servers already have the macros they obtained on startup (i.e. with --init): either the default set of macros (tred.mac) or the set specified via --macro-file, plus any set specified via --include-macro-file.

--include-macro-file|-I filename
A file containing an additional set of macros to be sent to the servers. In query mode, this option behaves just like --macro-file and is only provided for compatibility with btred. If both options are used, both sets of macros are loaded.
--hub hostname
The hostname of the hub to connect to (defaults to localhost).
--port port
The port on which the hub is listening (defaults to 1500).
--all-trees|-T
Run the query code on all trees (the code is wrapped in an if ($root) { do {{ CODE }} while TredMacro::NextTree() } loop).
--all-nodes|-N
Run the query code on all nodes (you still must use --all-trees or -T to process all trees). This wraps the code into a while ($this) { CODE ; $this=$this->following } loop.
--all-nonhidden-nodes|-H
Run the query code on all nodes that are not hidden (you still must use --all-trees or -T to process all trees). This wraps the code into a while ($this) { CODE ; $this=$this->following_visible(FS()) } loop.
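To see how the -T and -N wrappers combine, the following sketch prints the Perl that would roughly result for a given query. It is an illustration only: wrap_query is a hypothetical helper, not part of ntred, and it assumes that the node loop is placed inside the tree loop when both options are given.

```shell
# Hypothetical helper (not part of ntred): print roughly the Perl that the
# -T and -N wrappers described above would produce around CODE.
# Assumption: the node loop sits inside the tree loop when both are given.
wrap_query() {
  mode=$1   # any combination of the letters T and N, or empty
  code=$2   # the query code, as it would be passed to -e
  case $mode in
    *N*) code="while (\$this) { $code ; \$this=\$this->following }" ;;
  esac
  case $mode in
    *T*) code="if (\$root) { do {{ $code }} while TredMacro::NextTree() }" ;;
  esac
  printf '%s\n' "$code"
}

# Show the combined wrapper for a query that merely counts nodes.
wrap_query TN '$count++'
```

With an empty mode the code is printed unchanged, which corresponds to running the query once per file without any wrapping.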
--listed-files|-L
Run the query code only on files specified on the command line (provided they are already present on some server, i.e. this option does not make servers load files they don't already have). Filenames may contain ordinary TrEd suffixes of the form ##tree or ##tree.node to indicate that the processing should apply only to a single tree (use in combination with -N) or a single node (use without -T and -N). NTrEd URIs of the form ntred:// are also allowed. Relative filenames are expanded according to the current working directory before they are sent to ntred servers.
--regexp-files|-R regular-expression
Run the query code only on files whose filenames match a given regular expression.
--filelist|-l filename
Like --listed-files|-L but this time the files to be processed are listed in the given file rather than on the command line. Both options may be used together in which case the file-lists are joined.
--key-file
This option may be used to provide a file with a session-key which is necessary for the authentication to the running hub. This defaults to `~/.ntred_session_key'.

HUB and SERVER MODE (-i or --init) OPTIONS

  ntred -i [--servers server[,server,...]] [--serverlist server-list]
           [--filelist file-list] [--max-servers num]
           [--no-secondary-files] [--no-server-check]
           [--no-server-start] [--no-hub-start] [--old-dist-method]
           [--safe-mode] [--server-debug] [--max-retries num]
           [-m macro-file] [-M module] [--btred path-to-btred]
           [--ssh ssh-command] [--local] [--key-file file]
           [--hub hub] [--port port] [file [...]]
--servers list of hosts
A comma-separated list of hosts on which to run btred servers. The hostname may be optionally followed by a comma and a port number thus making it possible to run several btred servers on one host. If the port number is omitted, it defaults to 1600. See also --serverlist.
--serverlist filename
This provides a more convenient way to specify servers: a file containing a list of servers, one per line. If neither --servers nor --serverlist is provided, the list of servers is read from ~/.ntred_serverlist.
--filelist filename
A list of files to distribute between servers (one filename per line). Additional files may be given as command-line arguments.
--no-secondary-files
Don't load ``secondary'' files along with normal files (a file may require another, secondary, file to be loaded along with it; this is typical for stand-off annotation, where one tree is built upon another).
--max-servers number
Limit the number of servers to start, even if the list of servers contains more of them.
--max-retries number
Specifies how many times the hub tries to connect to a btred server before it gives up.
--no-server-check
Skip the initial check that the server hosts are available (a dummy SSH connection attempt).
--no-server-start
Don't start new btred servers on the remote hosts. Instead, start a hub and try to connect to the btred servers already running. This requires a server-session key to be given on the standard input.
--no-hub-start
Start servers on the remote hosts but don't start a hub. The server session key required for authentication to the servers will be printed on the standard output.
--safe-mode|-F
Run btred servers in safe mode, in which all macros are processed in a Safe compartment with some security restrictions. This mode appears to make btred servers prone to memory leaks.

In the safe mode, only the following opcodes and opcode-sets are allowed (see Opcode):

  :base_core :base_mem :base_loop :base_math
  entereval caller dofile
  print entertry leavetry tie untie bless
  sprintf localtime gmtime sort require

plus :base_orig, except for the following opcodes, which are forbidden:

  getppid getpgrp setpgrp getpriority setpriority pipe_op sselect
  select dbmopen dbmclose tie untie
--server-debug
Run the btred servers with the -D flag for additional debugging information.
--macro-file|-m filename
A file containing the default set of macros to be prepared on btred servers. The file (with exactly the same path) must be visible from all server hosts.
--include-macro-file|-I filename
A file containing an additional set of macros to be prepared on btred servers. This option is typically used instead of --macro-file to load macros from both filename and the default macro set (tred.mac). --macro-file can still be used in combination with --include-macro-file to supply a replacement for tred.mac.
--key-file
Specifies a file in which the session key for a client's authentication will be stored. It defaults to ~/.ntred_session_key.
--terminal-encoding|-d encoding
Automatically applies a given character encoding to all stdout output operations on the servers and command-line arguments. Only works with Perl >= 5.8.
--hub hostname
The hostname of the local machine the hub will listen on (defaults to localhost but a machine's hostname may be given to allow remote access to the hub).
--port port
The port number the hub will be listening on (defaults to 1500).
--preload-module|-M module-name
This option is passed to the btred command line when starting a btred server. It makes btred preload a given Perl module at btred startup so that it is available to all macros (DOES NOT WORK WITH RESTRICTED MODE). This option may be specified more than once with different modules.
--old-dist-method
Use the old benchmark-based distribution method.

HUB CONTROL OPTIONS

  ntred --list-files|--list-changed-files
  ntred --reload-files [--filelist file-list] [--listed-files file [...]]
  ntred --reload-changed-files
  ntred --reload-macros [-m macro-file]
  ntred --load-files [--filelist file-list] [file [...]]
  ntred --close-files
  ntred --save-files|--save-changed-files [-s strip-sfx]
       [-a append-sfx] [-p strip-prefix] [-r add-prefix] [-f out-fmt] [--knit]
  ntred --quit
  ntred --kill|--ps-kill [--servers server[,server,...]] [--serverlist server-list]
  ntred --break|--ps-break
  ntred --dump-files [--filelist file-list] [file [...]]
  ntred --upload-file filename < fs-file
--init|-i
Start remote btred servers and a hub. See ``HUB and SERVER MODE''.
--quit
Sends the hub and all the servers a command to quit.
--break
Tells the ntred hub to send a USR1 signal to all running btred servers. Upon receiving that signal, the servers should stop current processing and return to the state in which they await requests.
--ps-break
Similar to --break but tries to identify btred server processes by looking at the output of the system command ps x -o pid,command. This may help if killall -USR1 btred doesn't work.
--kill
Runs killall -9 ntred on the local machine and killall -9 btred on the server hosts listed with --servers, --serverlist, or in ~/.ntred_serverlist.
--ps-kill
Similar to --kill but tries to identify btred server processes by looking at the output of the system command ps x -o pid,command. This may help if killall -TERM btred doesn't work.
--list-files
List all files currently open on servers.
--list-changed-files
List all files that have been changed by some macro. Note that a macro has to claim that the file was changed by setting the $TredMacro::FileChanged variable to 1; otherwise the btred server would never notice.
--listed-files|-L
Apply the request only to the listed files (currently works only for queries and the --reload-files request).
--count-files
Query the number of files open on each server.
--reload-files
Send a command to the btred servers to reload all open files. If the --filelist or --listed-files options are given, reload only files occurring in the given lists (all other files remain intact in the servers' memory).
--reload-changed-files
Send a command to the btred servers to reload files that have been modified since (re)loaded.
--reload-macros
Send a command to the btred servers to reload the initial macro file. If -m (--macro-file) is specified, the servers use the given macro file instead of the original one (specified when initializing btred servers). Note that the file (with exactly the same path) must be visible from all server hosts.
--load-files
Send a command to the hub to distribute files given on the command line, or those specified using --filelist or in ~/.ntred_filelist, to the servers. Note that a file distributed to a server is not reloaded by the server if the server already has a file with the same path in memory.
--close-files
Send a command to the btred servers to close all open files.
--save-files
Send a command to the servers to save all open files. The filenames of the saved files may be modified using --add-prefix, --strip-prefix, --strip-suffix, and --add-suffix (see SAVING OPTIONS).
--save-changed-files
Same as --save-files except that only files that have been changed by some macro will be saved. Note that a macro has to claim that the file was changed by setting the $TredMacro::FileChanged variable to 1; otherwise the btred server would never notice. See also --list-changed-files.
--knit|-K ALL|NONE|name1,name2,...
If a file is saved, also save/update the listed types of reffiles the file pulled data from. For the moment, this only makes sense with the PMLBackend, which supports so-called knitting, i.e. a method to pull certain data from external resources and push it back (with all changes) to the original position in the resource when saving the file. This option lists the types of resources (in PML the types are the reference names listed in the PML schema) which should be saved. Default is NONE. This type of resource doesn't include so-called secondary files.
--dump-files
Dumps given files or trees to standard output in FS format as they are in the memory of the btred servers. Individual dumps are separated with `//FSEND' preceded and followed by two newline characters. To output a single tree, follow the filename with a ##n suffix, where n is the absolute position of the tree in the file (starting from one). The following example shows how the csplit command can be used to save individual dumps into separate files:

  ntred --dump-files <files> | csplit -z -f out -b '%d.fs' - '/\/\/FSEND/+2' '{*}'

To merge these separate files into one huge FS file, use

  any2any -m hugeout.fs out*.fs
--upload-file
This command is roughly the reverse of --dump-files. It takes a filename from the first command-line argument and an FS file from the standard input and sends them to the btred servers. The server possessing the file with the given filename replaces its own in-memory copy of the file with the one provided on the standard input. To update a single tree, follow the filename with a ##n suffix, where n is the absolute position of the tree in the file (starting from one).
--no-secondary-files
Ignore ``secondary'' files even if loaded. This only affects some commands, such as --list-changed-files, in which case it means that a file whose secondary file was changed is not reported as changed unless the (primary) file itself was marked as changed.

SAVING OPTIONS

--strip-prefix|-p regexp
Remove strings matching given regexp from the beginning of filenames before saving.
--add-prefix|-r prefix
Prepend output filenames with the given prefix when saving.
--strip-suffix|-s regexp
Strip strings matching given regexp from the end of filenames when saving.
--add-suffix|-a suffix
Append given suffix to the filenames when saving.
--output-format|-f [fs|csts|trxml|tei|storable]
Format to use for files being saved.
--no-secondary-files
Don't save ``secondary'' files (not even if changed). Normally, secondary files (if loaded) are saved along with their primary files (the exactly same file-name prefix/suffix processing and format apply to both the primary and secondary files).
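The effect of the four filename options can be sketched with a small shell function. This is an illustration under stated assumptions, not ntred's implementation: saved_name is a hypothetical helper, the order of operations (strip first, then add) is assumed, and the patterns are interpreted here as awk extended regular expressions, whereas ntred itself presumably uses Perl regular expressions.

```shell
# Hypothetical helper (not part of ntred): compute the name a file would be
# saved under.
# usage: saved_name FILENAME STRIP_PREFIX_RE ADD_PREFIX STRIP_SUFFIX_RE ADD_SUFFIX
# Assumptions: stripping happens before adding; patterns are awk EREs.
saved_name() {
  printf '%s\n' "$1" |
    SP="$2" AP="$3" SS="$4" AS="$5" awk '{
      name = $0
      # Remove a match of the strip-prefix pattern at the start of the name.
      if (ENVIRON["SP"] != "") sub("^(" ENVIRON["SP"] ")", "", name)
      # Remove a match of the strip-suffix pattern at the end of the name.
      if (ENVIRON["SS"] != "") sub("(" ENVIRON["SS"] ")$", "", name)
      # Prepend and append the literal prefix and suffix.
      print ENVIRON["AP"] name ENVIRON["AS"]
    }'
}

# Roughly what -p '/data/in/' -r '/data/out/' -s '\.fs' -a '.new.fs' would do
# to the (hypothetical) file /data/in/a.fs:
saved_name /data/in/a.fs '/data/in/' '/data/out/' '\.fs' '.new.fs'
```

The patterns are passed through the environment rather than with awk -v so that backslashes in them are not reinterpreted as escape sequences.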

GENERAL OPTIONS

--glob|-g
Apply the Perl glob function to the filename patterns given on the command line. This expands possible wildcard patterns in each of the filename command-line arguments as the standard Unix shell /bin/csh would do. This can not only help in a situation where the shell used doesn't support wildcard expansion, but can also be used to reduce the number of command-line arguments passed to the process in cases where the argument list would, after shell expansion, exceed a system limit. Note that expansion is currently performed on the client regardless of the type of request. This may change in future versions.
--usage|-u
Prints a brief help message on usage and exits.
--help|-h
Prints the help page and exits.
--man
Displays the help as a manual page.
--quiet|-q
Suppress all NTRED-CLIENT/NTRED-HUB messages on error output.
--really-quiet|-Q
Redirect all standard error output to /dev/null.
--ssh command
This option may be used to specify the ssh/rsh command to use to connect to remote servers. Default value is `ssh -o ConnectTimeout=10'.
--local
Run btred servers locally. It ignores all non-local entries in the .ntred_serverlist (i.e. entries not matching local machine's hostname). There still may be more BTrEd instances, provided they use different ports. In this case, SSH is not used.
--btred command
This option may be used to specify the command to use to start a btred server on a remote host. The command must accept any btred parameters.

SECURITY ISSUES

USE AT YOUR OWN RISK. IF SECURITY IS A CRITICAL ISSUE OR IF IN DOUBT, DON'T USE IT AT ALL.

Why is security an issue here? Because btred servers execute almost arbitrary Perl code provided by the client. In unrestricted mode (i.e. when --safe-mode is not used), such code may contain arbitrary commands such as system() or open(). It is therefore desirable that the servers are not open to all parties.

The following precautions have been taken to lower the potential security risks:

1) Both the btred servers and the hub require authorization based on verifying an MD5 signature of a random data block (generated by the server for hub-to-btred-server communication and by the hub for client-to-hub communication) XORed with an authorization key known to both parties. Although the communication is unencrypted, the client must send with each request an MD5 checksum of the request XORed with the secret authorization key. Only requests whose signature is verified by the server are answered.

2) There may be only one connection from a hub to a server. As soon as it is closed, the server terminates.

3) If the servers are started by the hub itself (using --init) the authorization key is created by the hub and is passed to the btred server via a ssh encrypted pipe.

4) For the client's use, the authorization key is stored in the user's home directory as ~/.ntred_session_key with permissions set to 600 (only the user can read or write). This theoretically (depending on the general security of the system) limits access to the hub (and thus to the servers) to the user running the hub. It may, however, be abused from the local root account to execute arbitrary Perl code on all btred server hosts. This might be especially undesirable if the hub runs on a machine whose administrator would normally have no user access to the machines running btred servers. Another possible security issue might arise if the user's home directory is on a remote NFS server, so that the key file is accessed via NFS. Since NFS uses an unencrypted protocol, network sniffing techniques may be used to obtain the authorization key and hence run arbitrary code on btred hosts. If such situations are likely (e.g. in a large network), it is advisable to use a different location for the authorization key (see --key-file), e.g. /tmp.

5) It is possible to restrict Perl code evaluated on the servers to a safer compartment, where some critical Perl commands are disabled. In some cases these restrictions may not be sufficient; in others they may be too strict. Some memory leaks can appear when a Safe compartment is used. See --safe-mode above for more discussion.

6) Unless --hub option is used, the hub runs on localhost and as such is not (under normal circumstances) open for connections from the outside world. If you are considering making the hub listen on a non-local interface, note that it is a much better option to configure a secure SSH tunnel.
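The SSH tunnel mentioned in point 6 can be sketched as follows. The host name and the local port are hypothetical placeholders; -N (no remote command) and -L (local port forwarding) are standard OpenSSH options. The sketch only prints the command so that it can be reviewed before being run.

```shell
# Hypothetical values; replace them with your own.
HUB_HOST=hub.example.org   # machine on which the hub listens on localhost
HUB_PORT=1500              # the hub's default port according to this manual
LOCAL_PORT=1500            # local end of the tunnel on the client machine

# Print the tunnel command instead of running it.
tunnel_cmd() {
  printf 'ssh -N -L %s:localhost:%s %s\n' "$LOCAL_PORT" "$HUB_PORT" "$HUB_HOST"
}

tunnel_cmd
# While the tunnel is open, a client on this machine would connect with
#   --hub localhost --port "$LOCAL_PORT"
# and would still need the session key file (see --key-file).
```

This keeps the hub bound to localhost on its own machine while letting an authenticated SSH user reach it, so the unencrypted hub protocol never crosses the network in the clear.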


FILES

~/.ntred_serverlist - default list of servers to use

~/.ntred_filelist - default list of files to load on servers

~/.ntred_session_key - client/hub session key


AUTHORS

Petr Pajas <pajas@matfyz.cz>

Zdenek Zabokrtsky <zabokrtsky@ufal.mff.cuni.cz>

Copyright 2003-2005 Petr Pajas and Zdenek Zabokrtsky, All rights reserved.