[comp.sys.isis] Man page: man3/spool.3

ken@gvax.cs.cornell.edu (Ken Birman) (02/13/90)

.TH SPOOL 3  "1 February 1986" ISIS "ISIS LIBRARY FUNCTIONS"
.SH NAME
spool, spool_replay, spool_and_discard \-- ISIS spooling and long-haul functions.
.SH SYNOPSIS
.B 
#include ``isis.h''
.PP
.B id = spool(sname, entry, fmt, args, SP_KEY, args, ... , 0);
.br
.B char *sname, *fmt;
.br
.B int entry;
.PP
.B id = spool_m(sname, entry, msg, SP_KEY, args, ... , 0);
.br
.B char *sname;
.br
.B message *msg;
.br
.B int entry;
.PP
.B spseqn = spool_getseqn(msg)
.br
.B int spseqn;
.br
.B message *msg;
.PP
.B spool_replay(sname, SP_PAT, args, ..., 0);
.br
.B char *sname;
.br
.PP
.B int spool_in_replay;
.PP
.B spool_discard(sname, discard-pattern, 0);
.PP
.B spool_and_discard(sname, spool-request... 0, discard-pattern, 0);
.PP
.B spool_m_and_discard(sname, spool_m-request... 0, discard-pattern, 0);
.PP
.B prepend_and_discard(sname, spool-request... 0, discard-pattern, 0);
.br
.B char *sname;
.PP
.B spool_set_replay_pointer(sname, spseqn);
.PP
.B spool_set_ckpt_pointer(sname, spseqn);
.PP
.B spool_play_through(sname, SP_OFF/SP_ON);
.PP
.B spool_cancel(id)
.PP
.B spool_inquire(id)
.PP
.B spool_wait(id)
.PP
.B spool_advise(sname, options, 0);
.PP
.B bin/spooler [-v] [-l long_haul_conf] [portno]
.PP
.SH DESCRIPTION
.PP
An ISIS spool is used for \fIasynchronous\fR communication with
a process group that is either known to be down, or where
the group may need to spool input for fault-tolerance reasons.
A typical user of the spool is the ISIS long-haul facility,
which is only active when a line to a given remote destination
is up.  All communication with this facility is via the spooler,
reducing what would otherwise require a case-by-case interaction
to a single, asynchronous case.
.PP
The interface is somewhat restricted by comparison to the remainder of
ISIS, and is intended to be used in a manner
that explicitly recognizes the long delays that may occur
between when these types of messages are sent and when they are received.
These delays make it impractical to support, for example, a ``spooled broadcast''
that would spool a request until the destination service becomes available
and then perform the broadcast and return whatever replies are received.
(The user who wishes to implement the equivalent functionality can do so using
a ``call-back'' approach.)
.PP
When using spools for communication with a remote network, the destination
network name is specified using the spooler option SP_NETWORK followed
by the name (a null-terminated string).
For debugging purposes, the network can be specified as ``local'',
in which case the messages sent will be treated exactly as for long-haul 
communication, but
delivered on the same network as that of the sender.
Eventually, our intent is to change ISIS to infer
that remote communication is desired by examination of the group name,
a change that will conceal the SP_NETWORK argument.
.PP
The spooler can be contrasted with the ISIS logging facility, which is
concerned with the recovery of individual processes (associated with
specific nodes in the network) into the state that they held at the time of
a failure.  Spooling is used when the destination is an entire process group,
and when the group may be offline at the time a message is sent.
By communicating through the intermediary of the spool, the sender
need not be concerned with whether or not the destination group is
operational.
The spooler is thus visible directly to the sender of a message.
Logging is used in a manner transparent to the caller, which would 
be coded to deal only with ``operational'' process groups.
.PP
The standard use for a spool in ISIS involves a collection of processes
that send messages to a destination process group via the spooler, without
waiting for replies.  During periods when the destination group is operational,
these messages are spooled and promptly forwarded
in the order that they were spooled.
During periods when the destination group is down, messages are spooled but
not forwarded.
Upon recovery, a process group initiates replay of spooled messages.
When the replay terminates, new arriving messages and any messages that had not
previously been fully executed are delivered in spool order.
.PP
The spooler has no way to know when execution for a given spooled
message is completed, and this raises the issue of how it can distinguish
between \fIreplay\fR of a message that has already been executed and
\fIfirst time delivery\fR of a message that may, in fact, have been
delivered previously but which has not yet been ``executed''.
In practical terms, the spooler provides a variable that
can be tested at runtime, \fIspool_in_replay\fR; it will be 
true while replay is occurring.
The method used to distinguish these types of messages in the spool
itself, however, is to
associate a \fIspool pointer\fR with each spool.
The pointer is under control of
the application program, which must
call spool_set_replay_pointer to assign it a value.
The value supplied in a call to the 
.I spool_set_replay_pointer()
routine should be a spooler sequence number, obtained by calling
.I spool_getseqn().
It is illegal to set the spool pointer back; it can only be advanced.
.PP
Many applications will store checkpoints in spools, but may wish
to retain information that was in the spool prior to making
the checkpoint, as a sort of audit trail.
Accordingly, spools also have a \fIcheckpoint pointer\fR
associated with them.
When asked to replay the contents of the spool, the spooler actually
examines only those messages between the
checkpoint pointer and the spool pointer, inclusive.
In contrast to the spool pointer, which must always be
set explicitly, the spooler will automatically
set the checkpoint pointer when the spool_and_discard
or prepend_and_discard operation is invoked, leaving it
pointing to the checkpoint message.
The checkpoint pointer can be explicitly modified
by calling spool_set_ckpt_pointer giving the sequence
number as an argument.
.PP
A spool thus has the following structure:
.in +1in
.nf
(start; small seqn numbers)
old messages
checkpoint
portion that can be replayed
new messages
(end; large seqn numbers)
.fi
.in -1in
The checkpoint pointer has as its value the sequence number of the current
checkpoint.  The replay pointer (the spool pointer described above) can have
any value; it points to the last
message in the portion of the spool that has already been processed.
The ``old'' messages will only be replayed if the
checkpoint pointer is moved first.
The replayed portion of the spool consists of the checkpoint itself
plus a series of
``incremental updates'' to it.
Any spooled messages subsequent to the spool pointer
may be new -- or may be
ones that were seen before, but where there was a crash
before the spool pointer was moved forward.
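.PP
The pointer rules above can be sketched as a small stand-alone model.
This is only an illustration, not ISIS code: the structure and the
model_ functions are hypothetical names, and the sketch assumes that
setting the replay pointer to its current value is harmless.
.nf
```c
/* Toy model of one spool's two pointers; not an ISIS data structure. */
struct spool_model {
    int ckpt_ptr;    /* sequence number of the current checkpoint */
    int replay_ptr;  /* last sequence number already processed    */
};

/* A message is examined for replay only if
 * ckpt_ptr <= seqn <= replay_ptr (both ends inclusive). */
int model_in_replay_window(const struct spool_model *sp, int seqn)
{
    return seqn >= sp->ckpt_ptr && seqn <= sp->replay_ptr;
}

/* The replay pointer may only move forward.  Returns 0 on success,
 * -1 if the caller tried to move it back (the pointer is unchanged). */
int model_set_replay_pointer(struct spool_model *sp, int seqn)
{
    if (seqn < sp->replay_ptr)
        return -1;
    sp->replay_ptr = seqn;
    return 0;
}
```
.fi
.PP
With the checkpoint pointer at 10 and the replay pointer at 20, the model
treats messages 10 through 20 as replayable and everything after 20 as
``new''; an attempt to move the replay pointer back to 15 is refused.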
.PP
The spooler interface is as follows:
.PP
.I spool
puts a message in the \fIspool\fR for a named process group.  Normally, this
group would be one that is believed to not be operational.
The 
.I spool_m
variants allow the message to be spooled to be precomputed and are analogous to
calling bcast_l and specifying the `m' option.
.PP
On recovery, a group triggers spool replay either by invoking
.I spool_replay
or by specifying the 
.I PG_REPLAYSPOOL
argument to pg_join.
Notice that spool replay is not automatic in ISIS; it must
be activated explicitly.
During replay, the flag \fIspool_in_replay\fR will be non-zero.
Only messages with spooler sequence numbers greater than or
equal to the checkpoint pointer and smaller than or equal to the current
spool pointer will be replayed.
Moreover, replay allows messages to be replayed selectively, using a replay pattern.
For example, say that an application spools all types of messages, but that
only some messages are needed to recover after a failure.
A replay pattern can be specified that will suppress replay of the ``irrelevant''
messages.  On the other hand, their presence in the spool may be useful in other
ways, for example to exactly recreate a scenario that has been causing a process to crash.
After replay has finished, any additional spooled messages in the spool
or any new messages that are received by the spooler are ``played through''
immediately upon
reception, and this continues
so long as the process group remains operational.
Play through can be disabled by calling 
.I spool_play_through(),
but is activated by default.
Unlike messages being replayed, play-through
messages are NOT subject to any sort of pattern-matching process.
.PP
When 
.I spool_play_through()
is used to disable play-through, the procedure must
be called \fIbefore\fP calling
.I spool_replay()
(or \fIpg_join\fR).
Otherwise, some play-through may occur during the interval after the replay
completes and before your program is informed of it.
Play-through messages are not delivered until after 
.I isis_start_done()
is
called in cases where replay is initiated during startup.
.PP
Programs must explicitly discard the contents of a spool.
This is done using 
.I
spool_discard.
.PP
Finally, the procedure 
.I
spool_and_discard
atomically discards some of the messages in a spool and appends a new
message (normally a checkpoint) to the end of the spool.
(A variant form, prepend_and_discard, is also available but is rarely used;
this places the checkpoint at the start of the spool).
The checkpoint pointer is simultaneously updated to
point to this new checkpoint.
.PP
The following additional spooler functions are not yet implemented.
.I spool_cancel(id)
provides a way to cancel a pending request.
.I spool_wait(id)
blocks until a specified request has been replayed.
.I spool_inquire(id)
returns 0 if the request is still spooled and 1 if it has been replayed.
.I spool_advise(sname, options, 0)
provides an interface with which the caller can create spools
having special characteristics (non-standard resilience, size limits, etc).
Currently, all spools have the same degree of resiliency to failures and
no size limit is enforced.
.SH INSTALLATION HINTS
.PP
The spooler program itself is normally run as part of the isis.rc startup
sequence.
It is usual to run the spooler only on machines with local disks, and to
run 3 to 5 copies of the spooler for an entire LAN.
(In future releases of the spooler, it may make sense to run
more than 5 copies of the spooler if possible.)
The spooler program has three optional arguments, which can be given in
any order.
.B -v
places the program in \fIverbose\fR mode; it prints a description of essentially
every operation it performs on the console.
This is intended for debugging only.
.B -l long_haul_conf
is used to specify a long-haul configuration file, as discussed below.
.B portno
is a port-number for connecting to ISIS, and need only be specified if you wish to
override the default value found in /etc/services or /yp/etc/services.
.SH DESCRIPTIONS
.I spool
puts a message in the \fIspool\fR for a named process group and delivers it
promptly (``plays it through'') if the process group is operational.
The 
.I sname
argument is the name under which the group will run when it restarts.
The 
.I entry 
argument tells what entry point this message should be delivered to upon replay.
The
.I fmt
is a format from which the message should be created; the arguments are
as for \fImsg_put\fR.
.PP
A zero-terminated series of optional keywords describing this message follows.
Each keyword in the series consists of a name \-- we define a basic set,
but you can extend it \-- and perhaps arguments associated with that name.
There are currently three sorts of keywords: numeric keywords, which have an
integer value; timer keywords, which take a long integer argument of the
sort returned in the \fIseconds\fR (tv_sec) field of the timeval structure by
gettimeofday(2); and SP_KEYWORDS,
which takes a null-terminated list of strings as its argument.
.PP
The type of broadcast used for actually transmitting to the group will
normally be \fIcbcast\fR.  This is certain to work correctly if all messages to the group
are sent via the spooler.  However, if a group receives some of its messages directly,
you may need to specify the broadcast
type.  This is done by including the key SP_FBCAST, SP_CBCAST, SP_ABCAST or SP_GBCAST,
with no argument.
.PP
The spooler currently does not predefine any numeric message keys.
Instead, the user is permitted to define up to 9 such keys.
This should be done using \fI#define\fR and specifying values in the range 1-9
inclusive.
A numeric key should be immediately followed by its value in the call to spool.
.PP
There is currently only one timer key that the user would explicitly specify
in a call to spool: SP_EXPIRES.  The argument to SP_EXPIRES is an absolute time
at which this message ``expires''.
The argument should be computed by calling gettimeofday(&now, 0) and then
computing now.tv_sec+delay, where delay is the number of seconds
between the time of the call and the time when the message expires.
An expired message will never be delivered to a client, but neither will it
actually be deleted from the spool
until the next time spool_discard is called.
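.PP
A minimal sketch of the expiry computation, using only the standard
gettimeofday(2) call.
The helper name expiry_after is hypothetical and is not part of ISIS;
its result is the kind of value one would pass after SP_EXPIRES.
.nf
```c
#include <stddef.h>
#include <sys/time.h>

/* Returns the absolute expiry time (seconds since the epoch) for a
 * message that should expire delay_secs from now, or -1 on failure. */
long expiry_after(long delay_secs)
{
    struct timeval now;

    if (gettimeofday(&now, NULL) != 0)
        return -1;
    return (long)now.tv_sec + delay_secs;
}
```
.fi
.PP
For a message that should expire twelve hours after it is spooled, one
would pass expiry_after(60*60*12) as the argument that follows SP_EXPIRES.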
.PP
A spooled message can also have a list of ascii strings associated with it.
Such a list, null-terminated, should follow the keyword SP_KEYWORDS.
.PP
The following illustrates a very complex call to the spool routine as it might
be done from C; the corresponding interface is also supported from FORTRAN and LISP.
.PP
.nf
#define SP_ID            1
#define SP_EPOCH      2
.PP
       ....
       sid = spool(``dbserver'', ADD_RECORD, ``%s,%d'', ``Richard Nixon'', 68,
         SP_ID, db_idno++,
         SP_EPOCH, current_epoch,
         SP_EXPIRES, now.tv_sec+60*60*12, 
         SP_KEYWORDS, ``add'', 0,
         0);
.fi
.PP
The above example uses an ``id'' number and
an ``epoch'' number, but the reader should be aware that
these have no special meaning to the spooler.
On the other hand,
the spooler \fIdoes\fR assign all
spooled messages a sequence number on a per-spool basis, which is incremented
for each received message.
The spooler 
delivers messages sequentially in order of increasing sequence number,
except during replay when messages from the start of the spool up to and
including the current spool pointer are subject to a pattern and will not be
replayed unless the pattern matches.
.PP
The spool request returns the spooler sequence number that was assigned to the
message.
Given a message that was sent via the spooler, its
sequence number can be obtained directly by calling
.I spool_getseqn(mp).
This function returns 0 when applied to a message that was never spooled.
The special keyword SP_SPSEQN can be used to specify limits on
the sequence number as part of a replay or discard pattern.
For any given spool, sequence numbers are strictly increasing.
.PP
.I spool_replay
triggers replay of a spool.
Replay can be selective; for example, one can replay just the
messages from a particular sender or just the messages with spooler sequence
numbers larger than a specified value.
A pattern is specified very much as the set of keys for a message,
but where a key typically specifies a value, a replay pattern
typically specifies a rule that the value must satisfy for the message
to be replayed.  If several replay constraints (patterns) are given,
all must be satisfied for a given message to be replayed.
.PP
In the case of a numeric key, a low and high bound are given (either can be
SP_INFINITY, however).
Only messages that include the designated key with a value greater
than or equal to the low bound and less than or equal to the high bound
are replayed.
For example, spool_replay(sname, SP_ID, 55, SP_INFINITY, 0); replays
all messages in the spool \fIsname\fR that carry the user-defined numeric
key SP_ID with a value of 55 or greater.
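.PP
The numeric rule can be modelled in a few lines.
This is a sketch under stated assumptions, not the spooler's code:
MODEL_INFINITY is a stand-in for SP_INFINITY (whose real value is not
given here), the function name is hypothetical, and an infinite bound is
assumed to mean ``no limit on that side''.
.nf
```c
#include <limits.h>

/* Stand-in for SP_INFINITY; the real constant lives in isis.h. */
#define MODEL_INFINITY INT_MAX

/* has_key is nonzero if the message carried the key at all.
 * Returns 1 if the message would satisfy the numeric pattern
 * low <= value <= high, and 0 otherwise. */
int model_numeric_match(int has_key, int value, int low, int high)
{
    if (!has_key)
        return 0;               /* messages without the key never match */
    if (low != MODEL_INFINITY && value < low)
        return 0;
    if (high != MODEL_INFINITY && value > high)
        return 0;
    return 1;
}
```
.fi
.PP
Under this model, the pattern (55, MODEL_INFINITY) accepts a message whose
key value is 55 or 1000, and rejects one whose value is 54 or one that
does not carry the key.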
.PP
As noted above, the 
spooler's internal sequence number can be treated as a numeric pattern using the predefined
keyword SP_SPSEQN.  Note, however, that
replay will only be applied to messages between the checkpoint pointer
of the spool and the
current spool pointer.
.PP
The time at which a message was spooled can be used as part of a pattern.
SP_ATIME places bounds on this time in absolute time units.
SP_RTIME places bounds on this time relative to the time at which
spool_replay was called.
.PP
The process that sent or spooled a message can also be part of the pattern.
SP_SENDER takes a single address which is the address of the sender whose
messages are to be replayed.
SP_SPOOLER works the same way, but takes the address of the process that invoked
spool.  Note that unless spool_m is being used, the sender is by definition the
same process as the spooler.  In the case of spool_m, however, the
message could be one that was received from some other source.
.PP
If string keywords were specified, the pattern SP_KEYWORDS can be used to
enforce a 1-1 exact match.  The number of strings and their values must
match for the message to be replayed.
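.PP
The exact-match rule for string keywords might be modelled as follows.
The function name is hypothetical, and the sketch assumes that the
strings are compared position by position, so that order matters; the
text above does not say whether the spooler is order-sensitive.
.nf
```c
#include <string.h>

/* Both arguments are null-terminated arrays of C strings.
 * Returns 1 only if the two lists have the same number of strings
 * and the strings are equal at every position; otherwise 0. */
int model_keywords_match(const char *const *msg_kw,
                         const char *const *pat_kw)
{
    while (*msg_kw != NULL && *pat_kw != NULL) {
        if (strcmp(*msg_kw, *pat_kw) != 0)
            return 0;
        msg_kw++;
        pat_kw++;
    }
    /* Match only if both lists ended at the same time. */
    return *msg_kw == NULL && *pat_kw == NULL;
}
```
.fi
.PP
A message spooled with the single keyword ``add'' matches a pattern
listing only ``add'', but not a pattern listing ``add'' and a second
string.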
.PP
To replay all the messages in a spool, one would call spool_replay(sname, 0).
.PP
After a spool_replay is done, the spooler plays through any
messages that are received and that match the ``current'' replay
pattern, with the single exception of any message received from a
spool_and_discard request (in this case, the spooled message normally is a checkpoint,
and hence playing it through would cause confusion).
It will also spool these messages upon reception.
This play-through behavior continues 
so long as the destination process group remains accessible, or
until spool_play_through is called to inhibit further playthrough.
.PP
.I spool_discard
is called just like 
.I spool_replay.
It deletes any spooled messages between the start of
the spool and the current spool replay pointer
that \fImatch\fR the specified pattern,
retaining in the spool any messages that \fIdo not\fR match the pattern.
If an empty pattern is specified, all messages will ``match''
and be discarded.
.PP
\fINOTE:\fR if the spool pointer has not been
set and a call to
.I spool_discard
is issued,
the discard
operation will retain \fIall\fR messages that were
in the spool.
This is because the discard pattern is applied only to
messages in that portion of the spool up through (and including)
the message with sequence number equal to the spool replay pointer value.
.PP
.I spool_and_discard
combines a call to
.I spool
and a call to
.I spool_discard
into one atomic operation.
In the arguments associated with the message to be spooled one may specify
SP_PREPEND, in which case the new message will be stored at the front of the spool.
Otherwise, the new message is appended to the end of the spool, which is normally the appropriate
place to store a checkpoint.
The checkpoint pointer is changed to point to the new message.
.PP
For example, say that one wishes to make a checkpoint, and that the
application has just received a spooled message
for which spool_getseqn(mp) returned 66.
A good way to do this would be to issue the sequence of calls: 
.ti 1i
spool_set_replay_pointer(spname, 66);
.ti 1i
spool_m_and_discard(spname, chpt-msg, 0);
.br
The first of these tells the spooler that messages
up through 66 have already been ``consumed''.
The second call 
atomically modifies the spool by appending the checkpoint message to it and
deleting all messages up to this point -- \fIall\fR because no pattern was
specified, and an empty pattern matches all messages.
Any other messages in the spool with sequence numbers greater than
66 are retained.
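.PP
The effect of this two-call sequence on the spool can be pictured with a
toy array model.
Everything here is hypothetical (the structure, the MODEL_MAX limit and
the function name); in particular, the sketch assumes that the checkpoint
simply receives the next unused sequence number.
.nf
```c
#define MODEL_MAX 64

struct model_spool {
    int seqn[MODEL_MAX]; /* sequence numbers of stored messages, ascending */
    int count;           /* number of stored messages                      */
    int next_seqn;       /* number the next spooled message will receive   */
    int replay_ptr;      /* messages up through this one are consumed      */
    int ckpt_ptr;        /* sequence number of the current checkpoint      */
};

/* Models an append-and-discard with an empty discard pattern:
 * drop every message with seqn <= replay_ptr, append a checkpoint,
 * and point ckpt_ptr at it.  Returns the checkpoint's sequence
 * number, or -1 if the model has no room for it. */
int model_spool_and_discard(struct model_spool *sp)
{
    int i, kept = 0;

    for (i = 0; i < sp->count; i++)
        if (sp->seqn[i] > sp->replay_ptr)
            sp->seqn[kept++] = sp->seqn[i];
    if (kept >= MODEL_MAX)
        return -1;
    sp->seqn[kept++] = sp->next_seqn;
    sp->count = kept;
    sp->ckpt_ptr = sp->next_seqn;
    return sp->next_seqn++;
}
```
.fi
.PP
With messages 64 through 68 stored and the replay pointer set to 66, the
model discards 64, 65 and 66, keeps 67 and 68, and appends the checkpoint
as message 69, leaving the checkpoint pointer at 69.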
.SH LONG-HAUL COMMUNICATION
.PP
ISIS spools are also used for communication with remote networks.
A network is a set of ISIS sites isolated from the rest of the ISIS world.
The only way for remote networks to communicate is through the 
long-haul communication package. Each network has a name.
A network name looks like a group name and is subject to the same size limit
as a group name.
.PP
The static description of networks is in a file.
This file contains the default tcp port number (first information
available in the file), and
a set of entries, each describing one specific network.
Each entry is composed of
the described network's name, and a zero-terminated
list of its host descriptors.
Each host is specified either by its internet host name (in which case
this name is prefixed by ``N:''), or by its internet address (in which case
this address is prefixed by ``A:'').
A host descriptor also contains the tcp port number. A host's name (or address)
is separated from the reserved port number by the slash (`/') character.
If the port number is zero, the long-haul package uses the
default value.
.PP
The following illustrates a long-haul network configuration file:
.nf
    2200
    norway N:thor.cs.cornell.edu/0  N:hymir.cs.cornell.edu/0 0
    sweden N:sif.cs.cornell.edu/1800 N:sigyn.cs.cornell.edu/1800 0
    usa N:utgard.cs.cornell.edu/0 A:128.64.27.3/0 0
.fi
This defines three ISIS networks, one in Norway, one in Sweden, and one in
the United States.
Each network has a set of designated ``contact points'' -- normally, a few of the
file servers, because one normally runs copies of the spooler only on file
servers (anything else wouldn't make sense).
For example, regardless of how many machines reside in the Norway LAN, only thor and
hymir are declared as running the spooler and will be used for connections
from Sweden and the USA.
The port number used for connections will be 2200 except when communicating with
Sweden.
One host in the United States is declared using its
internet host address.
Each list of contact machines is zero terminated.
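.PP
As an illustration of the descriptor syntax, the following sketch parses
one host descriptor and substitutes the default port when the file gives
zero.
It is not the parser used by the spooler; the structure, the function
name and the 255-character host limit are assumptions made for this
example.
.nf
```c
#include <stdlib.h>
#include <string.h>

struct host_desc {
    char kind;       /* 'N' = internet host name, 'A' = internet address */
    char host[256];  /* name or address text, without prefix or port     */
    int  port;       /* default_port substituted when the file says 0    */
};

/* Parses one descriptor such as "N:thor.cs.cornell.edu/0".
 * Returns 0 on success, -1 if the text is malformed. */
int parse_host_desc(const char *text, int default_port,
                    struct host_desc *out)
{
    const char *slash;
    char *end;
    size_t len;
    long port;

    if ((text[0] != 'N' && text[0] != 'A') || text[1] != ':')
        return -1;
    slash = strrchr(text, '/');
    if (slash == NULL || slash == text + 2)
        return -1;                      /* no port, or empty host part */
    len = (size_t)(slash - (text + 2));
    if (len >= sizeof out->host)
        return -1;
    port = strtol(slash + 1, &end, 10);
    if (end == slash + 1 || *end != 0 || port < 0 || port > 65535)
        return -1;
    out->kind = text[0];
    memcpy(out->host, text + 2, len);
    out->host[len] = 0;
    out->port = (port == 0) ? default_port : (int)port;
    return 0;
}
```
.fi
.PP
Applied to the file above with a default of 2200, the descriptor for thor
resolves to port 2200 and the descriptor for sif resolves to port 1800.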
.SH DESCRIPTION
.PP
The long-haul package
establishes connections between the local and remote networks.
For each remote network described in the networks configuration file,
one of the running hosts is designated as the manager of the connection
with this partner.
Each designated manager tries to connect to one of the remote
network's hosts. The designated manager tries different hosts in succession
until one accepts the connection request.
.PP
Each long-haul process may be in charge of more than one long-haul connection.
The long-haul package ensures automatic reconnection in the case
of failure of an existing connection.
It also preserves the state of a failed connection and makes it available
to the new manager.
This allows
.I at-most-once delivery semantics
to be maintained in the presence of connection failures and host crashes.
.PP
To enable long-haul communication, you must first inform the spooler
by starting the spooler program with the option
.I -l yourNetFileName.
For example, if the file ``/usr/spool/isis/long_haul.rc'' contains the configuration file
shown above, your isis.rc file might contain a line:
.ti 1i
.B bin/spooler <isis-spooler> -l /usr/spool/isis/long_haul.rc
.PP
You trigger long-haul communication by specifying the remote network name using
the SP_NETWORK option when you issue a call to the
.I spool
or
.I spool_m
procedures.
For example:
.ti 1i
.I spool_m(``rule-checker'', QUICKCHECK, msg-to-spool, SP_NETWORK, ``sweden'', 0);
.ti 1i
.I spool_m(``rule-checker'', QUICKCHECK, msg-to-spool, SP_NETWORK, ``local'', 0);
.SH CONSTANTS
.PP
The number of networks you can define is limited to MAXNETS,
and the number of sites in a network is limited to MAXSITES.  Both of
these constants are defined in util/long_haul.c.
.SH WARNINGS
.PP
The long-haul package uses tcp ports to realize inter-LAN communication.
One consequence of this is that, when a long-haul process fails, restarting the
process before a certain delay has elapsed (about 45 seconds) leaves it
unreachable by the rest of the system.
.PP
This version does not yet implement recovery from a total network failure (if all
spoolers on one network fail, long-haul communication state and spooled messages
are lost). This limitation will be removed in the next version.
.SH BUGS
.SH "SEE ALSO"
isis_logentry(3), gettimeofday(2),
ISIS(3)