[comp.lang.ada] Ada and OS tasks

jcallen@Encore.COM (Jerry Callen) (11/22/90)

In article <1990Nov20.205819.24040@sctc.com> stachour@sctc.com (Paul Stachour) writes:

>[ Interesting article on whether or not Ada runtimes should map Ada tasks
   to underlying OS tasks/threads/processes/etc. ]

I've worked on Ada runtimes that did it both ways. Generally, you have more
flexibility if Ada tasks use the underlying OS tasks, but you pay a price in
overhead; how much depends upon the system.

Paul missed a few issues; here are some more:

- The underlying OS may not provide much, if any, control over task scheduling,
  which renders Ada priorities meaningless.

- The OS may not provide much in the way of locking primitives; building your
  own (with, say, test-and-set and spin locks) may prove problematic. For
  instance, suppose a task holding a lock is blocked by the OS and another
  task tries to get the lock? You can burn a lot of CPU in spin locks.

  This is important because a multi-threaded Ada run-time system (RTS) is
  going to make heavy use of locking.

- How do Text_IO and the other predefined Ada I/O packages behave in the
  presence of tasks? Making them "do the right thing" may entail a lot
  of overhead that most programs don't need. (My vote: control via the
  Form string.)

So how have various Ada implementations done it? Here are a few I am familiar
with; I'd love to see folks post more. I have a lot of IBM/370 experience, which
is reflected in the following list...

- Intermetrics MVS Ada: one Ada task per MVS task. Task create times are
  unpleasantly long, but rendezvous isn't bad; as Paul pointed out, all
  I/O in MVS is inherently asynchronous, so the OS provides reasonably
  efficient "wait/post" primitives. Since MVS runs on multiprocessors, it
  is possible to really have multiple Ada tasks executing simultaneously.
  Shared memory isn't a problem since MVS tasks share an address space.

- Intermetrics CICS Ada: one Ada task per CICS task. I don't have any idea
  how well it performs.

- Telesoft MVS Ada: mapping controlled by pragma; I don't have details. 
  I would expect the performance to be similar to the Intermetrics RTS (at
  least regarding OS task create/rendezvous times).

- current Encore Ada (uses Verdix technology): two schemes available. The
  "sequential" Ada is the usual "one Unix process for the whole program"
  approach, with the Ada RTS handling dispatching. The "parallel" Ada
  allows a variable number of Unix processes to be allocated to a program,
  which the RTS multiplexes among the Ada tasks; it's sort of a hybrid of
  the two usual approaches. Memory is shared via Unix shared memory
  facilities. I/O is funnelled through a single I/O process.

- forthcoming Encore Ada (for Encore 88K based systems): one "very lightweight
  process" (thread, really) per Ada task. All threads share an address space.
  I'm currently working on this implementation; needless to say, it runs like
  a bat outta hell. :-)
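
The hybrid scheme in the list above (a variable number of OS processes
multiplexed among the Ada tasks) can be sketched roughly as follows. This is
only an illustration in Python, not Encore or Verdix code: threads stand in
for Unix processes, generators stand in for Ada tasks, and the names
`ada_task` and `run_tasks` are made up for the example. Each `yield` marks a
point where the runtime may switch tasks.

```python
import queue
import threading


def ada_task(name, steps, log):
    """Hypothetical task body; each yield is a possible task switch."""
    for i in range(steps):
        log.append((name, i))  # list.append is thread-safe in CPython
        yield


def run_tasks(tasks, n_workers):
    """Multiplex the given tasks over n_workers OS-level workers."""
    ready = queue.Queue()  # shared run queue of runnable tasks
    for task in tasks:
        ready.put(task)

    def worker():
        while True:
            task = ready.get()
            if task is None:  # shutdown sentinel
                return
            try:
                next(task)  # run the task up to its next switch point
            except StopIteration:
                pass  # task completed; drop it
            else:
                ready.put(task)  # still runnable; requeue before task_done
            ready.task_done()

    workers = [threading.Thread(target=worker) for _ in range(n_workers)]
    for w in workers:
        w.start()
    ready.join()  # returns once every task has run to completion
    for _ in workers:
        ready.put(None)
    for w in workers:
        w.join()
```

With `n_workers=1` this degenerates to the "sequential" scheme (one OS
process, with the runtime doing all the dispatching); with one worker per
task it approximates the one-to-one mapping.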

-- Jerry Callen
   jcallen@encore.com

jloup@nocturne.chorus.fr (Jean-Loup Gailly) (11/23/90)

In article <13325@encore.Encore.COM>, jcallen@Encore.COM (Jerry Callen) writes:

> I've worked on Ada runtimes that did it both ways. Generally, you have more
> flexibility if Ada tasks use the underlying OS tasks, but you pay a price in
> overhead; how much depends upon the system.

Quite correct. I have also worked (within Alsys) on Ada runtimes that
did it both ways. When Ada tasks are mapped to OS threads, the
rendezvous time is usually dominated by the time spent in the OS, even
when the Ada runtime does its best to minimize the number of system
calls.

> - The OS may not provide much in the way of locking primitives; building your
>   own (with, say, test-and-set and spin locks) may prove problematic. For
>   instance, suppose a task holding a lock is blocked by the OS and another
>   task tries to get the lock? You can burn a lot of CPU in spin locks.
> 
>   This is important because a multi-threaded Ada RTS is going to make
>   heavy use of locking.

Yes, spin locks should not be used, but there are alternatives. You
can build a very fast locking primitive by making an (expensive)
blocking system call only in the case of contention, that is, when a
test-and-set fails. Variants of this scheme are used in the Alsys Ada
runtimes which map tasks onto OS threads, and in the implementation of
mutexes in the Chorus operating system.
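
A rough sketch of the scheme just described: try a test-and-set first and
make the expensive blocking call only if it fails. This is not the Alsys or
Chorus code. Python has no user-level test-and-set instruction, so a
non-blocking `Lock.acquire` stands in for it and a `Condition` stands in for
the blocking system call; `ContentionLock`, `slow_path_entries` and `demo`
are names invented here.

```python
import threading


class ContentionLock:
    """Lock whose uncontended path never makes the blocking call."""

    def __init__(self):
        # Stand-in for a test-and-set word: only ever acquired non-blocking.
        self._flag = threading.Lock()
        # Stand-in for the OS blocking primitive, used only under contention.
        self._cond = threading.Condition()
        self._waiters = 0
        self.slow_path_entries = 0  # instrumentation for the example

    def acquire(self):
        if self._flag.acquire(blocking=False):  # fast path: test-and-set won
            return
        with self._cond:  # slow path: the test-and-set failed
            self.slow_path_entries += 1
            self._waiters += 1  # announce ourselves before retrying
            while not self._flag.acquire(blocking=False):
                self._cond.wait()  # sleep instead of spinning
            self._waiters -= 1

    def release(self):
        self._flag.release()  # clear the flag
        if self._waiters:  # wake a sleeper only if someone is waiting
            with self._cond:
                self._cond.notify()


def demo(n_threads=4, n_iters=1000):
    """Increment a shared counter under the lock from several threads."""
    lock = ContentionLock()
    total = 0

    def worker():
        nonlocal total
        for _ in range(n_iters):
            lock.acquire()
            total += 1
            lock.release()

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total
```

The waiter count is bumped before the retry so that a releasing thread cannot
miss a sleeper. The lock in this sketch is not fair: a thread on the fast
path may grab the flag ahead of one that has just been woken.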

> So how have various Ada implementations done it? Here are a few I am
> familiar with; I'd love to see folks post more.

- Alsys MVS Ada: mapping controlled by pragma and binder options. A variable
  number of MVS tasks can be allocated, each running a variable number of
  Ada tasks. So both extremes are possible (all Ada tasks mapped to one
  MVS task, or one Ada task per MVS task).

- Alsys Ada on LynxOS: the current implementation supports only the one-to-one
  mapping between Ada tasks and LynxOS threads.

- Alsys Ada on VRTX: also one-to-one mapping.

- Alsys Unix compilers: the usual "one Unix process for the whole program".
  Predefined IO "does the right thing", that is, one task blocked on IO does
  not block other tasks. However non-predefined IO (such as sockets) is
  blocking.
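
How a single-process runtime can keep one task's pending input from stalling
the others can be sketched as below. This is an assumption-laden
illustration, not a description of the Alsys runtime: generators stand in for
Ada tasks, a socket pair stands in for whatever descriptor the predefined IO
package reads, and `reader`, `counter`, `run` and `demo` are invented names.
The runtime polls with `select` while other tasks are runnable and blocks in
`select` only when every task is waiting.

```python
import select
import socket
from collections import deque


def reader(sock, log):
    """Task that waits for input; yielding the socket parks the task."""
    yield sock
    log.append(("reader", sock.recv(16)))


def counter(n, log, on_done):
    """Compute-only task; yielding None is a plain reschedule point."""
    for i in range(n):
        log.append(("counter", i))
        yield None
    on_done()


def run(tasks):
    ready = deque(tasks)  # runnable tasks
    waiting = {}  # socket -> task parked until it is readable
    while ready or waiting:
        if waiting:
            # Poll if something is runnable; block only when all tasks wait.
            timeout = 0 if ready else None
            readable, _, _ = select.select(list(waiting), [], [], timeout)
            for sock in readable:
                ready.append(waiting.pop(sock))
        if not ready:
            continue
        task = ready.popleft()
        try:
            wanted = next(task)
        except StopIteration:
            continue  # task finished
        if wanted is None:
            ready.append(task)
        else:
            waiting[wanted] = task  # the I/O wait parks only this task


def demo():
    a, b = socket.socketpair()
    log = []
    try:
        run([reader(a, log), counter(3, log, lambda: b.sendall(b"done"))])
    finally:
        a.close()
        b.close()
    return log
```

In `demo` the reader parks first, the counter keeps running, and the reader
resumes only once data has arrived. A call that bypasses `run` and reads the
descriptor directly would still stall the whole process, which is the
limitation noted above for non-predefined IO.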

				Jean-loup Gailly
E-mail: jloup@chorus.fr		Chorus systemes, 6 avenue Gustave Eiffel
Fax: +33 (1) 30 57 00 66	78182, St-Quentin-en-Yvelines-Cedex
Tel: +33 (1) 30 64 82 79	France