ken@gvax.cs.cornell.edu (Ken Birman) (07/27/89)
Below is a "man" page for a new "spooler" facility that we are adding to ISIS. The facility doubles as a long-haul interface for communication between LAN's on which ISIS is running independently. We would be very interested in suggestions/comments/questions on this. Direct your remarks to me or to Messac Makpangou: mak@cs.cornell.edu, or post them to comp.sys.isis if you think they might be generally interesting. Thanks...

.TH SPOOL 3 "1 February 1986" ISIS "ISIS LIBRARY FUNCTIONS"
.SH NAME
spool, spool_replay, spool_and_discard \-- ISIS spooling and long-haul functions.
.SH SYNOPSIS
.B #include "isis.h"
.PP
.B id = spool(sname, entry, fmt, args, SP_KEY, args, ... , 0);
.br
.B char *sname, *fmt;
.br
.B int entry;
.PP
.B id = spool_m(sname, entry, msg, SP_KEY, args, ... , 0);
.br
.B char *sname;
.br
.B message *msg;
.br
.B int entry;
.PP
.B spseqn = spool_getseqn(msg)
.br
.B int spseqn;
.br
.B message *msg;
.PP
spool_replay(sname, SP_PAT, args, ..., 0);
.PP
int spool_in_replay;
.br
.B char *sname;
.PP
spool_and_discard(sname, \fIspool-request\fR... 0, \fIdiscard-pattern\fR, 0);
.PP
spool_m_and_discard(sname, \fIspool_m-request\fR... 0, \fIdiscard-pattern\fR, 0);
.br
.B char *sname;
.PP
spool_set_replay_pointer(sname, spseqn);
.PP
spool_play_through(sname, SP_OFF/SP_ON);
.PP
spool_cancel(id)
.PP
spool_inquire(id)
.PP
spool_wait(id)
.PP
spool_advise(sname, options, 0);
.NP
.SH DESCRIPTION
.NP
An ISIS spool is used for \fIasynchronous\fR communication with a process group that is either known to be down or may need to spool its input for fault-tolerance reasons. The interface is somewhat restricted compared with the rest of ISIS, and is intended to be used in a stylized manner that explicitly recognizes the long delays that will typically occur between the time these messages are sent and the time they are received.
These delays make it impractical to support, for example, a ``spooled broadcast'' that would spool a request until the destination service becomes available and then perform the broadcast and return whatever replies are received. (The user who wishes to implement the equivalent functionality can do so using a ``call-back'' approach.)
.NP
ISIS spools are also used for communication with remote networks. In this mode, the network name (see below) is specified using the spooler option SP_NETWORK. The network name is taken from the network ``names'' file.
.NP
The spooler can be contrasted with the ISIS logging facility, which is concerned with the recovery of individual processes (associated with specific nodes in the network) into the state that they held at the time of a failure. Spooling is used when the destination is an entire process group, and when the group may be offline at the time a message is sent. By communicating through the intermediary of the spool, the sender need not be concerned with whether or not the destination group is operational. The spooler is thus visible directly to the sender of a message. Logging is used in a manner transparent to the caller, which would be coded to deal only with ``operational'' process groups.
.NP
The standard use for a spool in ISIS involves a collection of processes that send messages to a destination process group via the spooler, without waiting for replies. During periods when the destination group is operational, these messages are spooled and promptly forwarded, in the order that they were spooled. During periods when the destination group is down, messages are spooled but not forwarded. Upon recovery, the process group initiates replay of spooled messages. When the replay terminates, newly arriving messages and any messages that had not previously been fully executed are delivered in spool order.
.NP
The spooler has no way to know when execution for a given spooled message is completed, and this raises the issue of how it can distinguish between \fIreplay\fR of a message that has already been executed and \fIfirst time delivery\fR of a message that may, in fact, have been delivered previously but which has not yet been ``executed''. This is done by associating a \fIspool pointer\fR with each spool, which is controlled by the application in a call to spool_set_replay_pointer. The value supplied in a call to the spool_set_replay_pointer routine should be a spooler sequence number, obtained by calling spool_getseqn(). It is illegal to set the spool pointer back; it can only be advanced.
.NP
The spooler interface is as follows:
.NP
.I spool
puts a message in the \fIspool\fR for a named process group. Normally, this group would be one that is believed not to be operational. The
.I spool_m
interface is used when the message to be spooled has been precomputed and is analogous to calling bcast_l and specifying the `m' option.
.NP
On recovery, a group triggers spool replay either by invoking
.I spool_replay
or by specifying the
.B PG_REPLAYSPOOL
argument to pg_join. Notice that spool replay is not automatic in ISIS; it must always be activated explicitly. During replay, the flag \fIspool_in_replay\fR will be non-zero. Only messages with spooler sequence numbers smaller than or equal to the current spooler replay pointer will be replayed. Moreover, replay allows messages to be replayed selectively, using a replay pattern. For example, say that an application spools all types of messages, but that only some messages are needed to recover after a failure. A replay pattern can be specified that will inhibit replay of the ``irrelevant'' messages. On the other hand, their presence in the spool may be useful in other ways, for example to exactly recreate a scenario that has been causing a process to crash.
After replay has finished, any additional spooled messages in the spool or any new messages that are received by the spooler are ``played through'' immediately upon reception, and this continues so long as the process group remains operational. Play-through can be disabled by calling spool_play_through(), but is activated by default. Unlike messages being replayed, play-through messages are NOT subject to any sort of pattern-matching process.
.NP
When spool_play_through() is used to disable play-through, the procedure must be called \fIbefore\fP calling spool_replay() (or pg_join). Otherwise, some play-through may occur during the interval after the replay completes and before your program is informed of it. Play-through messages are not delivered until after isis_start_done() is called in cases where replay is initiated during startup.
.NP
Programs must explicitly discard the contents of a spool. This is done using
.I spool_discard.
.NP
Finally, the procedure
.I spool_and_discard
atomically discards some of the messages in a spool and prepends a new message (e.g. a checkpoint) to the start of the spool. (The caller can specify that the new message should be appended at the tail of the spool if desired, but this is not the default).
.NP
The following additional spooler functions are not yet implemented.
.I spool_cancel(id)
provides a way to cancel a pending request.
.I spool_wait(id)
blocks until a specified request has been replayed.
.I spool_inquire(id)
returns 0 if the request is still spooled and 1 if it has been replayed.
.I spool_advise(sname, options, 0)
provides an interface with which the caller can create spools having special characteristics (non-standard resilience, size limits, etc). Currently, all spools have the same degree of resiliency to failures and no size limit is enforced.
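
As a rough illustration of the spool pointer rule described above, here is a toy C model. It is not ISIS code and does not call the ISIS library: the names toy_spool, toy_set_replay_pointer, toy_should_replay and toy_pattern are hypothetical, and the replay pattern is reduced to a caller-supplied predicate. It only sketches two statements from the text: the pointer can be advanced but never set back, and only messages whose spooler sequence number is at or below the pointer are candidates for replay.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for one spool; only the replay pointer is modelled. */
struct toy_spool {
    long replay_ptr;   /* highest spooler sequence number eligible for replay */
};

/* Hypothetical pattern callback: returns non-zero if the message matches. */
typedef int (*toy_pattern)(long spseqn, void *arg);

/* Advance the replay pointer. Moving it backwards is rejected (-1),
 * mirroring the rule that the spool pointer can only be advanced. */
int toy_set_replay_pointer(struct toy_spool *sp, long spseqn)
{
    assert(sp != NULL);
    if (spseqn < sp->replay_ptr)
        return -1;
    sp->replay_ptr = spseqn;
    return 0;
}

/* Decide whether a message is replayed: it must lie at or below the
 * replay pointer and, if a pattern is given, must match that pattern.
 * Messages above the pointer are not replayed by this rule. */
int toy_should_replay(const struct toy_spool *sp, long spseqn,
                      toy_pattern pat, void *arg)
{
    assert(sp != NULL);
    if (spseqn > sp->replay_ptr)
        return 0;
    return pat == NULL || pat(spseqn, arg) != 0;
}
```

In this model, a group that has checkpointed through message 66 would advance the pointer to 66; a later attempt to move it back to 10 fails and leaves the pointer unchanged.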
.NP
.SH DESCRIPTIONS
.I spool
puts a message in the \fIspool\fR for a named process group and delivers it promptly (``plays it through'') if the process group is operational. The
.I sname
argument is the name under which the group will run when it restarts. The
.I entry
argument tells what entry point this message should be delivered to upon replay. The
.I fmt
is a format from which the message should be created; the arguments are as for \fBmsg_put\fR.
.NP
A zero-terminated series of optional keywords describing this message follows. Each keyword in the series consists of a name \-- we define a basic set, but you can extend it \-- and perhaps arguments associated with that name. There are currently three sorts of keywords: numeric ones, which have an integer value; timer keywords, which take a long integer argument of the sort returned in the \fIseconds\fR (tv_sec) field of the timeval structure by gettimeofday(2); and SP_KEYWORDS, which takes a null-terminated list of strings as its argument.
.NP
The type of broadcast used for actually transmitting to the group will normally be \fIcbcast\fR. This is certain to work correctly if all messages to the group are sent via the spooler. However, if a group receives some of its messages directly, you may need to specify the broadcast type. This is done by including the key SP_FBCAST, SP_CBCAST, SP_ABCAST or SP_GBCAST, with no argument.
.NP
The spooler currently does not predefine any numeric message keys. Instead, the user is permitted to define up to 9 such keys. This should be done using \fI#define\fR and specifying values in the range 1-9 inclusive. A numeric key should be immediately followed by its value in the call to spool.
.NP
There is currently only one timer key that the user would explicitly specify in a call to spool: SP_EXPIRES. The argument to SP_EXPIRES is an absolute time at which this message ``expires''.
The argument should be computed by calling gettimeofday(&now, 0) and then computing now.tv_sec+delay, where delay is a delay in seconds between the time of the call and the time when the message expires. An expired message will never be delivered to a client, but neither will it actually be deleted from the spool until the next time that spool_discard is called.
.NP
A spooled message can also have a list of ascii strings associated with it. Such a list, null-terminated, should follow the keyword SP_KEYWORDS.
.NP
The following illustrates a very complex call to the spool routine as it might be done from C; the corresponding interface is also supported from FORTRAN and LISP.
.NP
.nf
#define SP_SEQN 1
#define SP_EPOCH 2
.NP
    ....
    id = spool("dbserver", ADD_RECORD, "%s,%d", "Richard Nixon", 68,
        SP_SEQN, db_seqn++, SP_EPOCH, current_epoch,
        SP_EXPIRES, now.tv_sec+60*60*12,
        SP_KEYWORDS, "add", 0, 0);
.fi
.NP
.I spool
returns a spooled-message-id that can be used in subsequent queries concerning this message or to cancel this message.
.NP
The above example uses a ``sequence'' number and an ``epoch'' number, but the reader should be aware that these have no special meaning to the spooler. On the other hand, the spooler \fIdoes\fR assign all spooled messages a sequence number on a per-spool basis, which is incremented for each received message. The spooler delivers messages sequentially in order of increasing sequence number, except during replay, when messages from the start of the spool up to and including the current spool pointer are subject to a pattern and will not be replayed unless the pattern matches. The spooler sequence number for a message can be obtained by calling
.I spool_getseqn(mp).
This function returns 0 when applied to a message that was never spooled.
.NP
The destination group is considered to be on the local network of the caller unless the keyword SP_NETWORK is specified.
This keyword takes a single argument, which should be a network name defined in the ``networks'' configuration file for your installation. In this case, delivery will be on the indicated network. The network name ``local'' can be used to obtain a loop-back effect if desired for debugging.
.NP
.I spool_replay
triggers replay of a spool. Replay can be selective; for example, one can replay just the messages from a particular sender or just the messages with spooler sequence numbers larger than a specified value. A pattern is specified very much as the set of keys for a message, but where a key typically specifies a value, a replay pattern typically specifies a rule that the value must satisfy for the message to be replayed. If several replay constraints (patterns) are given, all must be satisfied for a given message to be replayed.
.NP
In the case of a numeric key, a low and high bound are given (either can be SP_INFINITY, however). Only messages that include the designated key and have a value greater than or equal to the low bound and less than or equal to the high bound are replayed. For example, spool_replay(sname, SP_SEQN, 55, SP_INFINITY, 0); replays all messages in the spool \fIsname\fR that carry the user-defined numeric key SP_SEQN with a value of 55 or greater.
.NP
The spooler's internal sequence number can be treated as a numeric pattern using the predefined keyword SP_SPSEQN. Note, however, that replay will only be applied to messages between the start of the spool and the current spool pointer.
.NP
The time at which a message was spooled can be used as part of a pattern. SP_ATIME places bounds on this time in absolute time units. SP_RTIME places bounds on this time relative to the time at which spool_replay was called.
.NP
The process that sent or spooled a message can also be part of the pattern. SP_SENDER takes a single address which is the address of the sender whose messages are to be replayed.
SP_SPOOLER works the same way, but takes the address of the process that invoked spool. Note that unless spool_m is being used, the sender is by definition the same process as the spooler. In the case of spool_m, however, the message could be one that was received from some other source.
.NP
If string keywords were specified, the pattern SP_KEYWORDS can be used to enforce a 1-1 exact match. The number of strings and their values must match for the message to be replayed.
.NP
To replay all the messages in a spool, one would call spool_replay(sname, 0).
.NP
After a spool_replay is done, the spooler plays through any messages that are received and that match the ``current'' replay pattern, with the single exception of any message received from a spool_and_discard request (in this case, the spooled message normally is a checkpoint, and hence playing it through would cause confusion). It will also spool these messages upon reception. This play-through behavior continues so long as the destination process group remains accessible, or until spool_play_through is called to inhibit further play-through.
.NP
.I spool_discard
is called just like
.I spool_replay.
It deletes those spooled messages that match the specified pattern, retaining in the spool any messages that DO NOT match the pattern. It is important to specify the spooler sequence number up to which messages should be discarded, as it would be an error to discard messages that have not yet been played through. Although such messages would still be played through, the effect would be to delete them from the spool ``prematurely'' -- before the application has actually received and executed them.
.NP
.I spool_and_discard
combines a call to
.I spool
and a call to
.I spool_discard
into one atomic operation. In the arguments associated with the message to be spooled one may specify SP_APPEND, in which case the new message will be stored at the end of the spool.
Otherwise, the new message is prepended to the spool, which is the appropriate place to store a checkpoint.
.NP
For example, say that a checkpoint is made after receiving a spooled message for which spool_getseqn(mp) returns 66. A good way to do this would be to call
.CE
spool_m_and_discard(spname, chpt-msg, SP_SPSEQN, SP_INFINITY, 66, 0);
This modifies the spool by prepending the checkpoint message to it and deleting the messages that the checkpoint ``covers'', while retaining all others.
.NP
.SH SET UP
.PP
To set up the spooler, you should add a line to the isis.rc files on the machines where you wish to have spool files created. This line should run ../bin/spooler under the name <isis-spooler>. The spooler takes a single, optional argument; if specified, this should be a ``networks'' configuration file in the format described in the next section. Spools are currently replicated over the full set of spoolers, but in future versions of the program will be replicated only to the extent needed for fault-tolerance and availability purposes.
.NP
.SH LONG-HAUL COMMUNICATION
.NP
A \fInetwork\fR is a set of sites within which ISIS is running. When an application spans multiple networks, the recommended way to do inter-net communication is through the long-haul spooler facility. Each network is defined by a name, assigned in a configuration file, and by a set of sites on which its spoolers run, and which can be contacted to establish an inter-network link. Network names look like group names in all respects. Each spooler site is defined by an internet host name/address, plus a tcp port number used to accept connection requests from remote networks.
.NP
The \fInetwork configuration file\fR is used to communicate this information to the spooler program. It is specified by supplying the spooler command with the option
.I -l network-config-file.
.NP
A network configuration file is formatted as follows.
The first line of the file contains a default tcp port number for contacting spooler sites; this must not be the same as any port number already in use by your applications, or already defined in your ISIS sites file. Each subsequent line of the file describes one specific network. Such a line is composed of the described network's name and a null-terminated list of host descriptors. Each host is specified either by its internet host name (in which case this name is prefixed by ``N:''), or by its internet address (in which case this address is prefixed by ``A:''). A host descriptor also contains the tcp port number. A host's name (or address) is separated from the reserved port number by the slash (`/') character. If the port number is zero, the long-haul package uses the default value.
.NP
.SH DESCRIPTION
The long-haul package establishes connections between the local and remote networks. For each remote network described in the networks configuration file, one of the running hosts is designated as the manager of the connection with this partner. Each designated manager tries to connect to one of the remote network's hosts. The designated manager tries different hosts in succession until one accepts the connection request.
.NP
Each long-haul process may be in charge of more than one long-haul connection. The long-haul package ensures automatic reconnection in the case of failure of an existing connection. It also preserves the state of a failed connection and makes it available to the new manager. This provides
.I at most once
delivery semantics in the presence of connection failures and host crashes.
.NP
To trigger long-haul communication, you should specify the remote network name using the SP_NETWORK option in a call to the
.I spool
or
.I spool_m
procedures. The messages will be transmitted the next time the spooler makes contact with the specified network.
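
To make the host descriptor syntax from the configuration file description concrete, here is a minimal C sketch of a parser for a single descriptor of the form ``N:name/port'' or ``A:address/port''. This is not the spooler's actual code: the struct host_desc type, the parse_host_desc function and the 255-character host limit are assumptions made for the illustration. The only rules taken from the text are the ``N:'' and ``A:'' prefixes, the slash before the port, and the substitution of the default port when the port is zero.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical representation of one host descriptor. */
enum host_kind { HOST_BY_NAME, HOST_BY_ADDR };

struct host_desc {
    enum host_kind kind;   /* "N:" prefix or "A:" prefix */
    char host[256];        /* host name or address, without the prefix */
    int port;              /* tcp port, default already substituted */
};

/* Parse "N:name/port" or "A:addr/port" into *out.
 * A port of 0 is replaced by default_port.
 * Returns 0 on success, -1 if the descriptor is malformed. */
int parse_host_desc(const char *text, int default_port, struct host_desc *out)
{
    const char *slash;
    char *end;
    long port;
    size_t len;

    assert(text != NULL && out != NULL);

    if (text[0] == 'N' && text[1] == ':')
        out->kind = HOST_BY_NAME;
    else if (text[0] == 'A' && text[1] == ':')
        out->kind = HOST_BY_ADDR;
    else
        return -1;
    text += 2;

    /* The host part runs up to the slash and must not be empty. */
    slash = strchr(text, '/');
    if (slash == NULL || slash == text)
        return -1;
    len = (size_t)(slash - text);
    if (len >= sizeof out->host)
        return -1;
    memcpy(out->host, text, len);
    out->host[len] = '\0';

    /* The port part must be a complete decimal number in range. */
    port = strtol(slash + 1, &end, 10);
    if (end == slash + 1 || *end != '\0' || port < 0 || port > 65535)
        return -1;
    out->port = (port == 0) ? default_port : (int)port;
    return 0;
}
```

A caller would read the default port from the first line of the file, then split each later line into a network name followed by descriptors and pass each descriptor to parse_host_desc. How the real file separates descriptors and terminates the list is not specified here, so that part is left out of the sketch.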
.NP
.SH CONSTANTS
.NP
The maximum number of networks you can define is MAXNETS, and the maximum number of sites in a network is MAXSITES.
.SH BUGS
.NP
The current version of the spooler loses spooled data if the spooler itself experiences a total failure. (On recovery, existing spool files are deleted). This will be corrected in the near future.
.SH "SEE ALSO"
isis_logentry(3), gettimeofday(2), ISIS(3)
ken@gvax.cs.cornell.edu (Ken Birman) (07/27/89)
>> From jlevy@arisia.xerox.com Thu Jul 27 11:51:01 1989
>> ....
>> Have I understood correctly that this in effect makes it possible
>> for two independent networks of ISIS-connected sites to
>> communicate? This means that in effect the limitation on the
>> number of sites is removed?

Yes, the new spooler/long-haul facility is intended to help you link "clusters" of ISIS sites over long-haul links. The idea is that if you are running ISIS at Cornell and ISIS at Xerox, you might want to build applications that span the two LAN's. ISIS doesn't normally allow this, but the spool mechanism offers a way to cobble something together explicitly.

Basically, you spool messages for the remote LAN (currently you need to use the SP_NETWORK option, but eventually we will move to a new group naming convention like /cardiology/ccu/bed11/alarm@columbia.new-york where "new-york" would be taken as the network name). ISIS needs to know how to talk to "new-york", and you tell it by means of the spooler's interlan configuration file, which lists some machines in new-york and the ports that they use to do this type of communication. Whenever it gets the chance, ISIS makes a connection and forwards the spool using a scheme that gives at most once delivery semantics. As long as the line is up, overhead is low -- your message goes to the spooler, out the line, into the spooler remotely, and is re-broadcast on arrival.

Because the scheme is completely asynchronous, you would have to layer any sort of reply mechanism on top of this as part of your application. There were technical obstacles to doing something reasonable with replies as part of our mechanism -- basically, failure cases that made it very hard to recover lost replies.

We recognize that the initial facility is pretty primitive. Messac Makpangou is working on this and would be interested in feedback or suggestions. Contact him at mak@cs.cornell.edu. He is working on extending the long-haul code, specifically, while I am responsible for the spooling/replay mechanisms.

Ken Birman
wunder@hp-ses.SDE.HP.COM (Walter Underwood) (07/28/89)
Well, for starters, don't use .NP (or .LP) for paragraphs. Those macros don't exist in System V man macros. Use .PP.

Also, only some of the lines in the SYNOPSIS section are bold, and many don't have the args declared. Looks kinda funny.

wunder