[comp.sys.transputer] Scheduling

rwd@robots.oxford.ac.uk (Ron Daniel) (10/06/89)

I have a question concerning scheduling between PRI PARs and PARs.  We
have a real time control task on a network of transputers being
performed by the PRI PAR which must send packets of data to other
processes.  We have two ways of performing this:


PRI PAR
  WHILE TRUE
    SEQ
      ... external LOW LATENCY comms + little processing, always ready to run
      IF
        run.this.interval
          SEQ
            chan[0] ! data1
            chan[1] ! data2
            chan[2] ! data3
        TRUE
          SKIP
  PAR i=0 FOR 3
    WHILE TRUE
      SEQ
        chan[i] ? local.data[i]
        ... A long process of [i]

the other is

PRI PAR
  WHILE TRUE
    SEQ
      ...  as above
      chan ! data1
      chan ! data2
      chan ! data3
  WHILE TRUE
    SEQ i=0 FOR 3
      SEQ
        chan ? local.data[i]
        ... A long process of [i]

At first sight it would appear that the first version should allow the
PRI PAR to run with less latency in its comms, as the second version
has to wait for communications to synchronise between each SEQ in
turn.  Does the first method behave in a `sensible' manner?

According to the Inmos Compiler Writer's Guide, a process will only
deschedule under certain circumstances.  Consider the following set of
events in version 1:

PRI PAR outputs data1 on chan[0], deschedules, and PAR [0] picks up local.data[0].
PRI PAR reschedules and tries to output on chan[1], but further correct
operation in a `sensible' manner requires PAR [1] to be scheduled next to
pick up local.data[1], ie the behaviour of the PRI PAR with respect to
latency now depends on the order of scheduling of the replicated PARs.  Is it
good fortune that the next process to schedule is always PAR i+1, ie the
scheduler chooses the process waiting on input rather than the one that is
free to proceed?  If the process which is ready to perform a complex SEQ
were to be scheduled, the PRI PAR would remain descheduled as its comms is
blocked.

The point is, can the scheduling process make `intelligent' choices for the
next process to run so that it may complete its comms in the minimum time?
Is there some fancy proof that the PRI PAR always 'gets its channel'?
I can imagine a number of ways of organising the low priority PARs so that
it becomes necessary to search for the PAR with the other end of the channel
holding up the PRI PAR.  Has anyone produced a set of guidelines on programming
practice to ensure `sensible' behaviour in the above example?

Thanks for any helpful hints etc.

Ron Daniel

geraint@prg.oxford.ac.uk (Geraint Jones) (10/08/89)

This is a reply to Ron Daniel's question about communications between high and
low priority processes in a PRI PAR on a transputer.  The simple answer is:
no, there is no clever scheduling.  The complicated answer takes a bit
longer.

You can think of low-priority-ness as a real tar-baby: a high-priority process
that communicates on a channel the other end of which is held by a low-priority
process has to put up with the vagaries of the low-priority process scheduler.
If a high-priority process communicates on a channel with a process that is not
already waiting, it is descheduled.  If there is then no (other) ready high-
priority process, a low-priority process will be scheduled and time-sliced and
so on until one of them completes the synchronisation and restarts the high-
priority process again.

The examples in the question were, roughly

        PRI PAR                         PRI PAR
          WHILE ...                       WHILE ...
            SEQ                             SEQ
              P                               P
              SEQ i = 0 FOR n                 SEQ i = 0 FOR n
                c[i] ! e(i)                     c[i] ! e(i)
          PAR i = 0 FOR n                 WHILE ...
            WHILE ...                       SEQ i = 0 FOR n
              SEQ                             SEQ
                c[i] ? x[i]                     c[i] ? x[i]
                Q(i)                            Q(i)

and in both cases there can be periods when there is no high priority process
executing.

In the parallel case, the first time around the loops it can, in the worst
case, take as long as setting up all n of the inputs before any one of the
outputs can happen; but that is also the longest it can take for all of the
outputs to complete.  Thereafter, of course, each of the Q(i) has to have
completed before the next communications can happen, so it takes n times the
time it takes to run Q(i) before the next cycle of the high priority loop
completes.

In the sequential case, it is guaranteed that the high-priority process will
not be running while any of the Q(i), except the last, is running; so it will
take a time about (n-1) times as long as it takes to run Q(i) to complete the
n outputs to c[i].  The second is therefore significantly worse for keeping the
high-priority process `tied-up'.

There is a trick for keeping a high-priority process running even if it needs
to communicate with a low-priority process, and that is to insert a
`sacrificial' buffer.  The high-priority process communicates with the buffer
which also runs at high priority; the buffer communicates with the low-priority
process.  If you have pitch that needs to be touched without your becoming
defiled, you get someone else to do it for you.  In

        PRI PAR
          PAR
            WHILE ...           -- high priority
              SEQ
                ...
                c ! e
                ...
            WHILE ...           -- buffer, also high priority
              SEQ
                c ? b
                d ! b
          WHILE ...             -- low priority
            SEQ
              ...
              d ? x
              ...

the output c!e cannot be responsible for descheduling all of the high-priority
processes: the second process can get stuck at the d!b, but whichever of the
processes first reaches the communication on c, the first process is able to
continue after the c!e and a delay at most as long as it takes the second
process to complete an input and start an output.

Of course, if the first process gets around its loop before the buffer has
been emptied, it has to wait for the next c!e, but you can put in a several-
place buffer to smooth out any short bursts of high-priority activity.

If, however, the high-priority process loops faster on average than the low-
priority process, no finite buffer will prevent the high-priority process from
being held up.  If you want to guarantee latency you need to throw away some
of the communications: if you have to service interrupts that may come too
often, you have to be prepared to throw some of them away.  There is an example
of an `almost-buffer' which is always ready to accept input (after at most a
bounded delay) but which throws away input for which it has no room in
Chapter 9 of `Programming in occam2', Jones and Goldsmith, Prentice-Hall 1988.
(Adverts get everywhere these days.)  It is something like this:

  PROC leaky.buffer(CHAN OF BYTE input,
                    CHAN OF SIGNAL request,
                    CHAN OF BYTE reply      )
    -- buffers up to type.ahead number of bytes at a time
    INT reader, writer, count :
    SEQ
      reader, writer, count := 0, 0, type.ahead
      [type.ahead]BYTE datum :
      WHILE TRUE
        ALT
          BYTE lost :
          count = 0 & input ? lost
            SKIP  -- could signal the error here
          count > 0 & input ? datum[writer]
            count, writer := count - 1, (writer + 1) \ type.ahead
          count < type.ahead & request ? CASE signal
            SEQ
              reply ! datum[reader]
              count, reader := count + 1, (reader + 1) \ type.ahead
  :

and it can be used to guarantee always to be ready for communication on `input'
if it is used with another idiomatic construction, called a `request buffer'
in Jones and Goldsmith:

  PROC request.buffer(CHAN OF SIGNAL request,
                      CHAN OF BYTE reply,
                      CHAN OF BYTE output     )
    WHILE TRUE
      BYTE datum :
      SEQ
        request ! signal
        reply ? datum
        output ! datum
  :
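The dropping behaviour of the leaky buffer can be checked with a Go
transliteration (a sketch, not the book's code: a nil channel disables the
request alternative the way an occam ALT guard would, and the request/reply
protocol is driven by hand here):

```go
package main

import "fmt"

const typeAhead = 4 // capacity, standing in for the occam VAL type.ahead

// leakyBuffer mirrors the occam leaky.buffer: always willing to take input,
// dropping bytes when full, handing out one datum per request.
func leakyBuffer(input <-chan byte, request <-chan struct{}, reply chan<- byte) {
	var datum [typeAhead]byte
	reader, writer, count := 0, 0, typeAhead // count = free slots
	for {
		req := request
		if count == typeAhead { // buffer empty: refuse requests for now
			req = nil
		}
		select {
		case b := <-input:
			if count == 0 {
				_ = b // full: the byte is thrown away (could signal the error)
			} else {
				datum[writer] = b
				count, writer = count-1, (writer+1)%typeAhead
			}
		case <-req:
			reply <- datum[reader]
			count, reader = count+1, (reader+1)%typeAhead
		}
	}
}

// demoLeaky pushes six bytes into a four-place leaky buffer with no reader
// attached, then drains it: the last two bytes have been dropped.
func demoLeaky() string {
	input := make(chan byte)
	request := make(chan struct{})
	reply := make(chan byte)
	go leakyBuffer(input, request, reply)

	for b := byte('a'); b <= 'f'; b++ {
		input <- b // every send completes, full or not
	}
	out := make([]byte, 0, typeAhead)
	for i := 0; i < typeAhead; i++ {
		request <- struct{}{}
		out = append(out, <-reply)
	}
	return string(out)
}

func main() {
	fmt.Println(demoLeaky()) // prints "abcd": 'e' and 'f' were lost
}
```

All six sends complete even though only four bytes fit, which is the
always-ready property being claimed.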

They are used like this:

        PRI PAR
          PAR
            WHILE ...   -- this is the `real' high priority process
              SEQ
                ...
                c ! e
                ...
            leaky.buffer(c, request, reply)
            request.buffer(request, reply, d)

          WHILE ...     -- this is the low priority process
            SEQ
              ...
              d ? x
              ...

with both of the buffering processes also running at high priority.  Now only
the request.buffer can ever get stuck talking to a low-priority process, so the
conversation between the real high-priority process and the leaky.buffer is
carried out entirely at high priority.
                                                                            gj

iang@uowcsa.cs.uow.oz (Ian Gorton) (10/11/89)

In addition to Geraint Jones's description of scheduling and sacrificial
buffers, a very good and readable description is given by Peter Welch
(reference below).  It describes how to use occam for real-time
programming.

Welch, Peter  'Managing Hard Real-Time Constraints on Transputers',
Proceedings 7th Occam User Group Conference, Grenoble, Sept 1987

-Ian