[comp.sys.isis] Seismic monitoring using ISIS

ken@gvax.cs.cornell.edu (Ken Birman) (12/15/89)

Here at SAIC in San Diego, we are using ISIS to build a worldwide
monitoring system that locates and identifies seismic events. The
system is intended to determine if particular events could be the
results of nuclear weapons testing.  Data come in from a variety of
sensors and several seismic arrays on a continuous basis.  One of our
design goals is to provide for continuous operation of the system
without requiring a large full-time staff.  We were attracted to ISIS
because of its strong support for fault tolerance and redundant
computation.  It has also proved flexible enough to allow us to bring
together pre-existing pieces of code in a natural fashion.

However, a couple of minor nits:

1) Executable size

Currently, the size of an executable after linking in ISIS is quite
large (~300K).  I suspect the libraries might include some
unnecessary code.

2) Program exit if ISIS is not running

Some of our programs can run standalone.  It would be nice if they
could attempt to connect to ISIS and get an error code back when it
is not running so that they could take appropriate action.  Another
possibility would be some sort of "autostart" feature that would 
start up ISIS when a connection was requested.

Thanks for the support.


+-----------------------------------------------------------------------------+
|   Jerry Jackson                       UUCP:  seismo!esosun!jackson          |
|   Geophysics Division, MS/12          ARPA:  jackson@esosun.css.gov         |
|   SAIC                                SOUND: (619)458-4924                  |
|   10210 Campus Point Drive                                                  |
|   San Diego, CA  92121                                                      |
+-----------------------------------------------------------------------------+

ken@gvax.cs.cornell.edu (Ken Birman) (12/15/89)

In article <35194@cornell.UUCP> jackson@gymer.css.gov (Jerry Jackson) writes:
>... a couple of minor nits:
>1) Executable size

As mentioned earlier, Jerry and I ended up sitting down and working out
the causes for this; the result is that I rebuilt the ISIS V2.0 alpha
release here at Cornell in a way that corrects this problem.

>2) Program exit if ISIS is not running
>
>Some of our programs can run stand alone.  It would be nice if they 
>could attempt to connect to ISIS and get an error code back when it
>is not running so that they could take appropriate action.  Another
>possibility would be some sort of "autostart" feature that would 
>start up ISIS when a connection was requested.

This is a good idea.  I am going to extend the isis_init interface as
follows:  I'll add a new call
	isis_init_l(client_port, flags)
where the flags are initially:
	ISIS_PANIC		/* Panic if connect fails; else returns -1 */
	ISIS_AUTOSTART		/* Auto restart ISIS if not already running */
The current isis_init becomes a synonym for isis_init_l(port, ISIS_PANIC).

The autorestart scheme is that if the client is not able to connect to ISIS,
it will try running "/bin/csh" on the file called "/usr/bin/startisis"; if this
fails or if a second attempt to connect fails after a delay of 90 seconds, the
system will panic/return -1 depending on whether ISIS_PANIC is specified.
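
A minimal sketch of the connect/autostart decision logic described above, not
the actual ISIS code.  Only the flag names come from this post; the flag
values, the 0-on-success return, the function name sketch_isis_init_l, and the
three hook functions are assumptions made so the logic can be tried without
ISIS installed.  It also reads the 90 seconds as a wait before the second
connection attempt, which is one possible interpretation.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>

/* Flag names are from the post; the numeric values are assumed. */
#define ISIS_PANIC     0x1   /* panic if connect fails; else return -1 */
#define ISIS_AUTOSTART 0x2   /* try to start ISIS if it is not running */

/* Hypothetical hooks standing in for the real steps. */
typedef int  (*connect_fn)(int client_port); /* 0 on success, -1 on failure */
typedef int  (*start_fn)(void);              /* 0 if the start script ran   */
typedef void (*delay_fn)(unsigned seconds);  /* wait before the retry       */

/* Either abort (ISIS_PANIC set) or hand -1 back to the caller. */
static int fail_or_panic(int flags)
{
    if (flags & ISIS_PANIC) {
        fprintf(stderr, "isis_init_l: cannot connect to ISIS\n");
        abort();
    }
    return -1;
}

/* Returns 0 once connected (assumed), -1 on failure without ISIS_PANIC. */
int sketch_isis_init_l(int client_port, int flags,
                       connect_fn try_connect, start_fn start_isis,
                       delay_fn wait_for)
{
    assert(try_connect && start_isis && wait_for);

    if (try_connect(client_port) == 0)
        return 0;                         /* ISIS was already running */

    if (!(flags & ISIS_AUTOSTART))
        return fail_or_panic(flags);      /* no autostart requested */

    /* A real hook might run: system("/bin/csh /usr/bin/startisis") */
    if (start_isis() != 0)
        return fail_or_panic(flags);      /* start script failed */

    wait_for(90);                         /* delay named in the post */
    if (try_connect(client_port) == 0)
        return 0;                         /* second attempt succeeded */

    return fail_or_panic(flags);          /* second attempt failed */
}

/* Fake hooks for exercising the sketch. */
int attempts = 0;
int connect_on_second_try(int port) { (void)port; return ++attempts >= 2 ? 0 : -1; }
int never_connects(int port)        { (void)port; ++attempts; return -1; }
int start_ok(void)                  { return 0; }
int start_fails(void)               { return -1; }
void no_wait(unsigned seconds)      { (void)seconds; }
```

A standalone program would pass flags without ISIS_PANIC, check for -1, and
fall back to running without ISIS; passing ISIS_PANIC alone corresponds to
the behavior of the current isis_init.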

Thanks for the suggestion...

By the way, I am adding a few other extensions along these lines:

1) ISIS_MONITOR_ENTER(count,cond) ISIS_MONITOR_EXIT(count,cond)
   For lightweight tasks that want a monitor style of critical section.
   Arguments are an integer counter and a condition variable.
2) THREAD_LEAVE_ISIS() and THREAD_REENTER_ISIS()
   Lets a lightweight thread run concurrently with the ISIS system and
   later re-enter it.  Useful for systems with real parallelism
3) cc_terminate_l(dest, <message format and args>)
   Currently, cc_terminate sends a message just to the cohorts of a coord-cohort
   computation and "reply" in such a computation sends a message to the sender
   of the original request and to the cohorts.  cc_terminate_l sends a message
   to <dest> and to the cohorts, atomically.  This is useful if the coordinator
   is supposed to send some message exactly once, and it isn't a reply.
4) group addresses now are preserved even when you leave the group
5) In the BYPASS stuff, you can now get at the protocol at several levels,
   giving you:
	a raw interface to a multicast transport, with little reliability
	an atomic "fbcast" interface (fifo from sender)
	a cbcast interface
	an abcast interface (not yet implemented)
   The scheme is such that you pay an incremental cost as you move up the
   hierarchy.  The raw interface is fastest, fbcast is about 2ms slower
   (constant overhead w.r.t. the raw protocol regardless of the number of
   destinations), and abcast is slowest of all.  Figures will be out on this
   shortly...
6) More flexible group addressing (multicast to members/members+clients/clients)
7) New message formats and types, faster and smaller mlib
8) a fast pg_join for use in special cases (added for Deceit file system)
9) A way to monitor a group for total failure even if you aren't a member
10) a version of pg_lookup that caches results and uses (9) to run very fast
11) automatic switch to MACH IPC for local communication in MACH settings

I think this covers the whole thing.  We will be alpha testing this version
of ISIS early in January, beta testing by late January, and hope for a release
around March 1.

Ken