[comp.simulation] SIMULATION DIGEST V2 N7

simulation@uflorida.cis.ufl.edu (Moderator: Paul Fishwick) (05/24/88)

Volume: 2, Issue: 7, Mon May 23 23:21:20 EDT 1988

+----------------+
| TODAY'S TOPICS |
+----------------+

(1) Using Chaos for Stochastic Variation
(2) Chaotic Behavior in the Cerberus multiprocessor simulator
(3) Distributed Simulation Methodology & Debugging
(4) Problem = Multiple Mailings...

Moderator: Paul Fishwick, Univ. of Florida
Send topical mail to: simulation@uflorida.cis.ufl.edu


------------------------------

Date: Sat, 21 May 88 16:58 EST
From: Ezra Zubrow <APYEZRA@ubvmsc.cc.buffalo.edu>
Subject: RE: SIMULATION DIGEST V2 N6
To: fishwick@fish.cis.ufl.EDU

I have done some small chaotic simulations, trying to introduce chaotic
variation rather than stochastic variation into behavioral phenomena. But
the man who has really been at it is Prof. Paul Kazarinoff, who can be
reached through the math department of SUNY at Buffalo--he has a whole
range of chaotic simulations which he has been running on both the
Cornell and Illinois supercomputers.
ezra 
zubrow

[Does anyone know Prof. Kazarinoff's net address? -paf]

------------------------------

Date: Sat, 21 May 88 15:19:59 PDT
From: gatech!ut-sally!uunet!umd5!ames!lll-lcc!lll-crg!brooks@bikini.cis.ufl.edu (Eugene D. Brooks III)
To: uflorida!simulation
Subject: Re: SIMULATION DIGEST V2 N6
Newsgroups: comp.simulation
Organization: Lawrence Livermore National Laboratory

>a simulation written in, say, GPSS or SLAM that has exhibited forms
>of chaotic behavior? In particular, I'm wondering if certain
>discrete event models that contain nonlinear methods might contain
>strange attractors -- perhaps we should take out the old time series
>data collected from old experiments and plot them in the phase plane!
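[The phase-plane suggestion above can be sketched concretely. The following
is a toy illustration, not anything from GPSS or SLAM: it uses the logistic
map as a stand-in for archived time-series data and reconstructs a 2-D
"phase plane" by delay embedding, i.e. pairing x(t) with x(t+tau). The
helper names and tolerance are assumptions of the sketch. -paf]

```python
# Hedged sketch: reconstruct a "phase plane" from a scalar time series by
# delay embedding.  The logistic map stands in for old simulation output;
# in the chaotic regime (r = 4) the embedded points trace out the map's
# parabola, whereas white noise would fill the plane.
def logistic_series(r=4.0, x0=0.3, n=1000):
    xs = [x0]
    for _ in range(n - 1):
        xs.append(r * xs[-1] * (1.0 - xs[-1]))
    return xs

def delay_embed(series, tau=1):
    """Pairs (x[t], x[t+tau]) -- the 2-D phase-plane reconstruction."""
    return list(zip(series, series[tau:]))

points = delay_embed(logistic_series())
# For the logistic map every embedded point lies on the curve y = 4x(1-x),
# which is exactly the structure a phase-plane plot would reveal.
```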

I am not sure if it's a strange attractor, but the behaviour of the
memory subsystem of the Cerberus multiprocessor simulator, an indirect
network with no global mechanism to induce synchronization and only
a trivial priority scheme to resolve internal conflicts, seems to
have a collective motion rise up out of the noise.  If the processors
start with random, unconstrained addresses and issue vector requests
of a common stride (with caveats for, say, a stride of 2), the network
falls into lockstep after a "settling time", with a strange periodic
behaviour of the memory subsystem.  If a disturbance is introduced after
the network falls into lockstep, it bubbles out the conflict and falls
into a different periodic motion.  It would be very interesting to understand
this theoretically, as all choices of network topology seem to have the
"attractive behaviour", but some choices settle in faster than others.
One could mix and match topologies and use simulations to score them under
common addressing conditions, but an analysis that could shed light on the
matter would be very helpful.
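[The settling-into-lockstep behaviour described above can be mimicked with
a toy model. The sketch below is an editor's stand-in, not the actual
Cerberus simulator: N processors issue constant-stride vector requests to
M memory banks, conflicts are broken by a fixed lowest-index-wins priority,
and losers stall. Since the dynamics on the bank-occupancy pattern are
deterministic, a repeated pattern marks the onset of periodic motion, and
the cycle of first repetition gives a crude "settling time". -paf]

```python
# Toy sketch (assumed model, not Cerberus): strided requests to banked
# memory with a trivial fixed-priority conflict scheme.
import random

def settle_time(n_procs=8, n_banks=16, stride=1, seed=0, max_cycles=20000):
    rng = random.Random(seed)
    # each processor starts at a random, unconstrained address
    addr = [rng.randrange(n_banks) for _ in range(n_procs)]
    seen = {}  # bank-occupancy pattern -> first cycle it appeared
    for cycle in range(max_cycles):
        state = tuple(a % n_banks for a in addr)
        if state in seen:  # repeated pattern => motion is now periodic
            return seen[state], cycle - seen[state]  # (settling time, period)
        seen[state] = cycle
        # each bank serves at most one request per cycle; the trivial
        # priority scheme: the lowest processor index wins the conflict
        served = set()
        for p in range(n_procs):
            bank = addr[p] % n_banks
            if bank not in served:
                served.add(bank)
                addr[p] += stride  # winner advances to its next element
            # losers stall and retry the same address next cycle
    return None, None

settle, period = settle_time()
```

Varying the topology parameters (here just the bank count) and scoring the
returned settling times is the mix-and-match experiment suggested above.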

Anyone interested in this item should send me mail as brooks@maddog.llnl.gov
or ...!maddog!brooks and I will happily send you copies of papers dealing
with the topic.



------------------------------

To: simulation@ufl.edu
Path: calgary!lomow
From: gatech!rutgers!uunet!ubc-cs!calgary!lomow@bikini.cis.ufl.edu (Greg Lomow)
Newsgroups: comp.simulation
Subject: Re: SIMULATION DIGEST VOL. 2, NUM. 5
Summary: Debugging Distributed Simulations
Date: 22 May 88 16:06:32 GMT
Organization: U. of Calgary, Calgary, Ab.


In response to an article in
  Volume: 2, Issue: 5, Mon May 16 12:16:29 EDT 1988

I have responded to the points in a slightly different order
than they were presented in the original posting.

> In any case, the need here is for "distributed debugging tools".

Agreed. We need some way to trace and get consistent snapshots
of an executing distributed simulation as well as being able to
print the values of process variables and messages. However,
there is no reason that our debugging tools cannot hide the
unpleasant details (such as rollback) of the underlying
synchronisation mechanism (see the rest of my posting).

> I agree with Jerry that the # of threads you could possibly
> follow is max 10.  In fact, for me, it is max 1!

I disagree. When you WRITE a sequential, event-list based,
process-oriented simulation (using, say, Simula's class
Simulation) you are in fact writing a simulation with multiple
threads of control. Just because it all runs in a single Unix
process does not change the fact that the abstraction provided
by the simulation language is one of multiple interacting
processes. These are fairly easy to WRITE because we do not have
to consider what every process is doing at every instant in
time; instead we code each process by considering only what it
is doing and what the processes it interacts with are doing (we
also take great comfort in knowing that the whole thing is going
to run sequentially).

> a.  Jerry talks of the difficulty of debugging multiple threads
> of LPs. I think this is a problem with the TimeWarp (TW) scheme.
> It should not be so much a problem with the BCM scheme.

I am not sure why everyone considers WRITING distributed
simulations so difficult. In the case of both sequential and
distributed simulation the programmer abstraction of multiple
interacting processes is the same. The biggest difference is
that in distributed simulations (using either the Chandy-Misra
(CM) or the Time Warp (TW) approach), processes are only allowed
to communicate via message passing (i.e., the processes are
loosely-coupled), whereas in sequential simulations we often
allow processes to communicate using global shared memory.

When talking about distributed simulation we need to
differentiate between the programming abstraction presented to
the user and the underlying synchronisation mechanism.

Most (all) papers on distributed simulation focus on the
distributed synchronisation mechanism (solving this problem is
hard). This leaves the reader with the impression that the
programmer has to take the underlying synchronisation mechanism
(CM or TW) into account when writing a simulation.

In fact, when using either CM or TW a process has access to the
following four primitives (none of which are tied to the
underlying synchronisation mechanism):

  self()
    - return this process's unique identifier

  time()
    - return this process's current simulation time

  send(rcvr, delay, event)
    - schedule "event" for the process identified by "rcvr" 
      and have it occur "delay" time units in the future; the
      timestamp of the event will be time()+delay

  receive()
    - return the next event; this is defined as the earliest
      event sent to this process that has not already been 
      processed

Each process receives and processes its events in increasing
timestamp order.

The fact that events are mapped to messages which are sent
between processes and processors is irrelevant to the
programmer. Likewise, the fact that the underlying synchronisation
mechanism is 1) sending NULL messages to avoid deadlock (CM),
2) detecting and breaking deadlocks (CM), or 3) allowing
processes to execute optimistically and rolling them back when
necessary (TW) is transparent to the programmer. It is possible to
write a simulation using these primitives and run it, without
modification, under CM, TW, or a sequential simulation package.
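[A minimal sequential event-list engine makes the point concrete. The
sketch below is an editor's illustration of the four primitives, not any
particular CM or TW implementation; `self` is spelled `self_` only because
Python reserves the name. Note that the ping-pong model at the bottom
never mentions the synchronisation mechanism. -paf]

```python
import heapq

class Engine:
    """Minimal sequential event-list engine exposing the four primitives.
    receive() is implicit here: the engine delivers each event to its
    process's handler in increasing timestamp order, exactly as a CM or
    TW runtime would."""
    def __init__(self):
        self.queue = []      # heap of (timestamp, seq, rcvr, event)
        self.seq = 0         # tie-breaker for identical timestamps
        self.now = 0.0
        self.current = None  # identifier of the process being dispatched
        self.handlers = {}

    def register(self, pid, handler):
        self.handlers[pid] = handler

    # --- the four primitives -------------------------------------------
    def self_(self):
        return self.current            # this process's unique identifier

    def time(self):
        return self.now                # this process's current sim time

    def send(self, rcvr, delay, event):
        # schedule "event" for "rcvr" at timestamp time()+delay
        heapq.heappush(self.queue, (self.now + delay, self.seq, rcvr, event))
        self.seq += 1

    # --- driver: events delivered in increasing timestamp order --------
    def run(self):
        while self.queue:
            self.now, _, pid, event = heapq.heappop(self.queue)
            self.current = pid
            self.handlers[pid](self, event)

# A two-process ping-pong written with no reference to CM, TW, or rollback.
log = []
def ping(eng, event):
    log.append((eng.time(), eng.self_(), event))
    if eng.time() < 3:
        eng.send("pong", 1, "ping!")
def pong(eng, event):
    log.append((eng.time(), eng.self_(), event))
    eng.send("ping", 1, "pong!")

eng = Engine()
eng.register("ping", ping)
eng.register("pong", pong)
eng.send("ping", 0, "start")
eng.run()
```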

> But my heart goes out to someone trying to follow 10 threads of
> logic WITH the possibility of rollback!

This is a non-issue. Someone using TW does not see processes
being rolled back and does not have to take rollback into
consideration when writing the simulation program.

Now Back To Debugging

Going back to my first point: if we can provide programming
abstractions that hide the underlying synchronisation mechanism,
we should be able to provide debugging abstractions that hide
it as well.

One of the things necessary for debugging distributed
simulations is "determinism". You would like to guarantee that
if you run the same simulation twice (same inputs and same
random number generators, etc.) you will get identical results
and every process will receive its events in the same order in
both executions. Determinism should hold even if the two
executions are run on a different number of processors and the
mapping of processes to processors is different between runs.
Determinism should also be independent of relative processor
speeds and communication delays.

If you don't have determinism it is very difficult (almost
impossible) to re-create some bugs and this can be very
frustrating.
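[One hedged sketch of how such determinism can be obtained (the field
names below are the editor's assumptions, not any particular system's):
break timestamp ties with a key drawn only from the logical computation,
e.g. (timestamp, sender id, sender-local sequence number), so that
delivery order is independent of arrival order, processor count, and
relative processor speeds. -paf]

```python
# Deterministic tie-breaking: a total order on events that depends only
# on the logical computation, never on physical arrival order.
def event_key(event):
    return (event["ts"], event["sender"], event["seq"])

# the same three events, arriving in two different physical orders...
arrival_a = [
    {"ts": 5, "sender": "B", "seq": 0, "data": "x"},
    {"ts": 5, "sender": "A", "seq": 3, "data": "y"},
    {"ts": 2, "sender": "A", "seq": 2, "data": "z"},
]
arrival_b = list(reversed(arrival_a))

# ...are nevertheless delivered identically in both runs, so every
# process sees its events in the same order in every execution
run1 = sorted(arrival_a, key=event_key)
run2 = sorted(arrival_b, key=event_key)
```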
-- 
Greg Lomow
	lomow@cpsc.calgary.cdn
      or
	....![ubc-vision,ihnp4]!alberta!calgary!lomow



------------------------------

Date: Mon, 23 May 88 23:14:50 EDT
From: Paul Fishwick <fishwick@fish.cis.ufl.edu>
To: simulation@ufl.edu
Subject: multiple mailings


Sorry about multiple mailings to those of you on the subscriber list (non-
usenetters)! The problem is due to an intermediate node on the internet
"timing out" which creates locked files and, apparently, all kinds of
headaches for 'sendmail'. The problem should be rectified soon!

-pf

 +------------------------------------------------------------------------+
 | Paul A. Fishwick.......... INTERNET: fishwick@uflorida.cis.ufl.edu     |
 | Dept. of Computer Science. UUCP: {gatech|ihnp4}!codas!uflorida!fishwick|
 | Univ. of Florida.......... PHONE: (904)-335-8036                       |
 | Bldg. CSE, Room 301....... FACS is available                           |
 | Gainesville, FL 32611.....                                             |
 +------------------------------------------------------------------------+



+--------------------------+
| END OF SIMULATION DIGEST |
+--------------------------+