[comp.unix.xenix] Test SCO Xenix IPC reliability

jfh@rpp386.UUCP (The Beach Bum) (08/22/88)

Someone was screaming about how unreliable IPC (such as shared memory)
on SCO Xenix was.  I whipped this program up originally back during the
great volatile debate and only discovered it again tonight while cleaning
out my home directory.  When run it prints out

TICK ...
... TOCK

forever as each process gets a chance to execute.  The code is short
enough that you should be able to understand what is going on.  If
you can run this without any trouble then your shared memory is working
just fine.  Otherwise, you have troubles ...

- John.
---------------------- clip out and save as volatile.c ----------------
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <signal.h>

int	zero = 0;
int	*loc = &zero;

int	key = ('v' << 8) | 'o';

catch (sig)
int	sig;
{
	signal (sig, catch);
}

parent ()
{
	while (1) {
		while (*loc)
			;

		write (1, "TICK ....\n", 10);
		*loc = 1;
		kill (loc[2], SIGUSR1);
		pause ();
	}
}

child ()
{
	while (1) {
		while (! *loc)
			;

		write (1, ".... TOCK\n", 10);
		*loc = 0;
		kill (loc[1], SIGUSR1);
		pause ();
	}
}

main ()
{
	int	id;

	if ((id = shmget (key, 3 * sizeof (int), IPC_CREAT|0666)) == -1) {
		perror ("shmget");
		exit (1);
	}
	if ((loc = (int *) shmat (id, (char *) 0, 0)) == (int *) -1) {
		perror ("shmat");
		exit (1);
	}
	loc[0] = 0;
	switch (fork ()) {
		default:
			loc[1] = getpid ();
			signal (SIGUSR1, catch);
			parent ();
		case 0:
			loc[2] = getpid ();
			signal (SIGUSR1, catch);
			child ();
		case -1:
			perror ("fork");
			exit (1);
	}
	exit (1);
}
-- 
John F. Haugh II                 +--------- Cute Chocolate Quote ---------
HASA, "S" Division               | "USENET should not be confused with
UUCP:   killer!rpp386!jfh        |  something that matters, like CHOCOLATE"
DOMAIN: jfh@rpp386.uucp          |         -- apologizes to Dennis O'Connor

richard@neabbs.UUCP (RICHARD RONTELTAP) (08/23/88)

[ Tested the ticktock.c program ]
 
Firstly: 286 Xenix'ers should compile the test program of J.F. Haugh
(II?, come on!) to large model with the -Ml switch.
 
Welllll, I ran the test program on XENIX /386 2.2.1 and 2.2.3 with the
same results.
 
When the program is started the first time only one TICK/TOCK is
printed. When it is started the second time, TICK/TOCK is infinitely
printed.
 
I think what happens is:
When the shared memory is created, and the parent process has printed
TICK, the context is switched to the child process right after the
'signal' command and just before the 'pause' command. When the child
now signals the parent, the signal is caught and the parent goes on to
the next command: pause(), and waits forever!
 
The second time scheduling is different because the shared memory
doesn't have to be created.
 
 
All this is rather far fetched, but the only explanation I can think
of. At least no panics or core dumps.
 
Can anyone else post experiences?
Maybe Mr Chapman from SCO Kernel development can comment on this?
 
Richard
(...!mcvax!neabbs!richard)

jfh@rpp386.UUCP (The Beach Bum) (08/25/88)

In article <22012@neabbs.UUCP> richard@neabbs.UUCP (RICHARD RONTELTAP) writes:
>[ Tested the ticktock.c program ]
> 
>Firstly: 286 Xenix'ers should compile the test program of J.F. Haugh
>(II?, come on!) to large model with the -Ml switch.

[ yes, my uncle was john f. haugh.  he wasn't married when i was born
  so it was assumed he would remain childless.  my legal name is jfh2. ]

>When the program is started the first time only one TICK/TOCK is
>printed. When it is started the second time, TICK/TOCK is infinitely
>printed.

[ ... ]

>The second time scheduling is different because the shared memory
>doesn't have to be created.

this program should work regardless of scheduling.  on the first entry
into child() the busy loop will be executed because loc[0] was set to
zero prior to the fork.  the signal handler was set prior to entry to
child (but should have been set before the fork() - stupid me).

if parent() executes the kill() call before the child() executes the
signal() call, then you should have seen TICK ... with a hang forever.
the fix is to move the signal() call to before the fork().  if ... TOCK
is printed then signal() has been called.

>All this is rather far fetched, but the only explanation I can think
>of. At least no panics or core dumps.

no, it is very plausible if only TICK ... was printed.  this is why
concurrent programming is such a joy and volatile variables have to
be treated specially.  because this ain't easy sh*t.
-- 
John F. Haugh II (jfh@rpp386.UUCP)                           HASA, "S" Division

    "If the code and the comments disagree, then both are probably wrong."
                -- Norm Schryer

jbayer@ispi.UUCP (id for use with uunet/usenet) (08/25/88)

In article <22012@neabbs.UUCP>, richard@neabbs.UUCP (RICHARD RONTELTAP) writes:
> [ Tested the ticktock.c program ]
>  
> Welllll, I ran the test program on XENIX /386 2.2.1 and 2.2.3 with the
> same results.
>  
> When the program is started the first time only one TICK/TOCK is
> printed. When it is started the second time, TICK/TOCK is infinitely
> printed.
>  
> I think what happens is:
> When the shared memory is created, and the parent process has printed
> TICK, the context is switched to the child process right after the
> 'signal' command and just before the 'pause' command. When the child
> now signals the parent, the signal is caught and the parent goes on to
> the next command: pause(), and waits forever!
>  
> The second time scheduling is different because the shared memory
> doesn't have to be created.
>  
>  

I think Richard is right.  I added two sleep(1) to the program, one in
the child() and one in the parent().  With these additions the program
starts up and prints TICK/TOCK even when creating the shared memory
segment for the first time.  I enclosed the new program below:

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>
#include <signal.h>

int	zero = 0;
int	*loc = &zero;

int	key = ('v' << 8) | 'o';

catch (sig)
int	sig;
{
	signal (sig, catch);
}

parent ()
{
	while (1) {
		while (*loc)
			;

		write (1, "TICK ....\n", 10);
		*loc = 1;
		sleep(1);			/* added by JB 8/25/88 */
		kill (loc[2], SIGUSR1);
		pause ();
	}
}

child ()
{
	while (1) {
		while (! *loc)
			;

		write (1, ".... TOCK\n", 10);
		*loc = 0;
		sleep(1);			/* added by JB 8/25/88 */
		kill (loc[1], SIGUSR1);
		pause ();
	}
}

main ()
{
	int	id;

	if ((id = shmget (key, 3 * sizeof (int), IPC_CREAT|0666)) == -1) {
		perror ("shmget");
		exit (1);
	}
	if ((loc = (int *) shmat (id, (char *) 0, 0)) == (int *) -1) {
		perror ("shmat");
		exit (1);
	}
	loc[0] = 0;
	switch (fork ()) {
		default:
			loc[1] = getpid ();
			signal (SIGUSR1, catch);
			parent ();
		case 0:
			loc[2] = getpid ();
			signal (SIGUSR1, catch);
			child ();
		case -1:
			perror ("fork");
			exit (1);
	}
	exit (1);
}
- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

It does work fine now on 386 and 286 Xenix.


Jonathan Bayer

root@telmail.UUCP (Super user) (08/26/88)

In article <5786@rpp386.UUCP> jfh@rpp386.UUCP (The Beach Bum) writes:
>this program should work regardless of scheduling.  on the first entry
>into child() the busy loop will be executed because loc[0] was set to
>zero prior to the fork.  the signal handler was set prior to entry to
>child (but should have been set before the fork() - stupid me).
>
>if parent() executes the kill() call before the child() executes the
>signal() call, then you should have seen TICK ... with a hang forever.
>the fix is to move the signal() call to before the fork().  if ... TOCK
>is printed then signal() has been called.

That's not what I said in my article. Just to be sure, I've tried to move
the signal() before the fork, but got exactly the same results.

I'll try to explain again with a little code:

When I start the program the first time, I get 1 TICK/TOCK. The second time
I get infinite TICK/TOCK's. The result of the first time is caused by
unfortunate scheduling, I think, and here's why (first the fragment):

>parent ()
>{
>	while (1) {
>		while (*loc)
>			;
>
>		write (1, "TICK ....\n", 10);
>		*loc = 1;
>		kill (loc[2], SIGUSR1);
>		pause ();
>	}
>}
>
>child ()
>{
>	while (1) {
>		while (! *loc)
>			;
>
>		write (1, ".... TOCK\n", 10);
>		*loc = 0;
>		kill (loc[1], SIGUSR1);
>		pause ();
>	}
>}

Because loc[0] was initialised to 0, the child process waits if it happens
to get to the 'while' loop first. The parent process passes the loop, prints
TICK, changes *loc to 1 and signals the child process. AT THIS INSTANT, i.e.
BEFORE the parent reaches pause(), the scheduler transfers control to the
child process. (btw is this possible?)

The child process prints TOCK, sets *loc to 0, signals the parent, and pauses.
The parent catches the signal, and continues with the next instruction:
THE PAUSE() INSTRUCTION, and waits for a signal from the child forever.

Get it?

I don't know if the signals were a relevant part of the testing procedure,
but I've 'rewritten' the program without them, and it works just fine. Of
course it doesn't run as fast because of massive waiting in the 'while' loops,
waiting for the scheduler to transfer control to the child or vice versa.

Here is the new program:
-------------------------------------------------------------------
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

int 	*loc;
int	key = ('v' << 8) | 'o';

parent ()
{
	while (1) {
		while (*loc)
			;
		write (1, "TICK ....\n", 10);
		*loc = 1;
	}
}

child ()
{
	while (1) {
		while (! *loc)
			;
		write (1, ".... TOCK\n", 10);
		*loc = 0;
	}
}

main ()
{
	int	id;

	if ((id = shmget (key, sizeof (int), IPC_CREAT|0666)) == -1) {
		perror ("shmget");
		exit (1);
	}
	if ((loc = (int *) shmat (id, (char *) 0, 0)) == (int *) -1) {
		perror ("shmat");
		exit (1);
	}

	*loc = 0;
	switch (fork ()) {
		case -1: perror ("fork"); exit (1);
		case  0: child ();
		default: parent ();
	}
	exit (1);
}
----------------------------------------------------------
Richard
(...!mcvax!neabbs!richard)

john@jetson.UPMA.MD.US (John Owens) (08/26/88)

In article <5786@rpp386.UUCP>, jfh@rpp386.UUCP (The Beach Bum) writes:
> this program should work regardless of scheduling.

> if ... TOCK
> is printed then signal() has been called.

Not the first time.  If ... TOCK is printed once (after TICK ... is
printed), then parent() set loc[0] and child()'s while loop ended.
Yes, parent and child both have called signal(), but the signal
apparently doesn't go through.

I think that parent() executes
	kill (loc[2], SIGUSR1);
before the child process executes
	loc[2] = getpid();
and the child process never receives a signal.

-- 
John Owens		john@jetson.UPMA.MD.US
SMART HOUSE L.P.	uunet!jetson!john		(old uucp)
+1 301 249 6000		john%jetson.uucp@uunet.uu.net	(old internet)

jfh@rpp386.UUCP (The Beach Bum) (08/26/88)

In article <128@jetson.UPMA.MD.US> john@jetson.UPMA.MD.US (John Owens) writes:
>I think that parent() executes
>	kill (loc[2], SIGUSR1);
>before the child process executes
>	loc[2] = getpid();
>and the child process never receives a signal.

the original version busy waited and didn't use signals.  the version i
posted used signals to increase the number of iterations per second,
but wasn't tested very well ...

john has found Yet Another Bug(TM) in the code, which is still further
proof as to how difficult concurrent programming can get.  without some
form of p/v operations, that program is very difficult to write.

the new version uses message queues and screams like a banshee.  that
should be final proof as to how bullet proof the message queues are
under xenix.
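
For comparison, a minimal sketch of the tick/tock exchange written with
p/v operations, assuming System V semaphores (semget/semop/semctl) are
available.  This is not the author's code; ticktock(), sem_p(), sem_v()
and union semun_arg are names invented for the example.  Semaphore 0
means "parent's turn" and semaphore 1 means "child's turn".

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/sem.h>
#include <sys/wait.h>
#include <string.h>
#include <unistd.h>

/* Own name for the semctl() argument; some systems define union semun
 * themselves and some leave it to the caller. */
union semun_arg {
	int val;
	struct semid_ds *buf;
	unsigned short *array;
};

static int sem_change(int semid, int num, int delta)
{
	struct sembuf op;

	op.sem_num = num;
	op.sem_op = delta;
	op.sem_flg = 0;
	return semop(semid, &op, 1);
}

static int sem_p(int semid, int num) { return sem_change(semid, num, -1); }
static int sem_v(int semid, int num) { return sem_change(semid, num, 1); }

/* Write `rounds` TICK/TOCK pairs to fd.  Returns 0 on success, -1 on error. */
int ticktock(int rounds, int fd)
{
	static const char tick[] = "TICK ....\n";
	static const char tock[] = ".... TOCK\n";
	union semun_arg arg;
	int semid, i, failed = 0, status = 0;
	pid_t pid;

	semid = semget(IPC_PRIVATE, 2, IPC_CREAT | 0600);
	if (semid == -1)
		return -1;
	arg.val = 1;			/* the parent goes first */
	semctl(semid, 0, SETVAL, arg);
	arg.val = 0;
	semctl(semid, 1, SETVAL, arg);

	pid = fork();
	if (pid == -1) {
		semctl(semid, 0, IPC_RMID);
		return -1;
	}
	if (pid == 0) {
		for (i = 0; i < rounds; i++) {
			if (sem_p(semid, 1) == -1 ||
			    write(fd, tock, strlen(tock)) == -1 ||
			    sem_v(semid, 0) == -1)
				_exit(1);
		}
		_exit(0);
	}
	for (i = 0; i < rounds && !failed; i++) {
		if (sem_p(semid, 0) == -1 ||
		    write(fd, tick, strlen(tick)) == -1 ||
		    sem_v(semid, 1) == -1)
			failed = 1;
	}
	if (failed)			/* wakes a blocked child with an error */
		semctl(semid, 0, IPC_RMID);
	waitpid(pid, &status, 0);
	if (!failed)
		semctl(semid, 0, IPC_RMID);
	return (!failed && WIFEXITED(status) && WEXITSTATUS(status) == 0) ? 0 : -1;
}
```

Calling ticktock(3, 1) prints three TICK/TOCK pairs on standard output.
Because each process sleeps inside semop() until it is its turn, there
is no busy loop and no window between "send" and "wait" for a wakeup to
fall into.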

jfh@rpp386.UUCP (The Beach Bum) (08/27/88)

In article <5867@rpp386.UUCP> jfh@rpp386.UUCP (The Beach Bum) writes:
>the new version uses message queues and screams like a banshee.  that
>should be final proof as to how bullet proof the message queues are
>under xenix.

and here it is.  i actually developed this on pigs, a 68020 vme bus
machine.  the code compiled first time out on rpp386.  portable, no?

just a brief overview - the parent and child swap "TICK ...." and
".... TOCK" messages back and forth using a message queue.  two
different message types are used.  type 1 is from the parent and
is expected by the child.  type 2 is from the child and is expected
by the parent.  this ensures the two processes remain synchronized.

for a really good work out, run this on the console.  if you want
to prove there are NO bugs in the message passing code (despite
what certain SCO bashers will say) run this in the background with
a real high nice for a few days.  a bug fixed version of the shared
memory tester could also be run to further debunk the sco nay-sayers.
what the heck, run them both in the background with a nice of say,
plus 20, for a couple of days.  that should find any kinks.
------------------------ cut and save as msgque.c ----------------------
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <signal.h>

key_t	msgkey = ('m' << 8) | 's';
int	msgqid;
struct	mymsgbuf {
	long	mytype;		/* msgsnd()/msgrcv() expect a long type field */
	char	mytext[11];
};
struct	mymsgbuf pmsg = { 1, "TICK ....\n" };
struct	mymsgbuf cmsg = { 2, ".... TOCK\n" };

int	childpid;

parent ()
{
	struct	mymsgbuf buf;

	while (1) {
		memset (&buf, 0, sizeof buf);

		if (msgrcv (msgqid, &buf, sizeof buf.mytext, 2L, 0) < 0)
			perror ("parent: msgrcv");

		write (1, buf.mytext, sizeof buf.mytext);

		if (msgsnd (msgqid, &pmsg, sizeof pmsg.mytext, 0) < 0)
			perror ("parent: msgsnd");
	}
}

child ()
{
	struct	mymsgbuf buf;

	while (1) {
		memset (&buf, 0, sizeof buf);

		if (msgrcv (msgqid, &buf, sizeof buf.mytext, 1L, 0) < 0)
			perror ("child: msgrcv");

		write (1, buf.mytext, sizeof buf.mytext);

		if (msgsnd (msgqid, &cmsg, sizeof cmsg.mytext, 0) < 0)
			perror ("child: msgsnd");
	}
}

main ()
{
	if ((msgqid = msgget (msgkey, IPC_CREAT|0666)) == -1) {
		perror ("msgget");
		exit (1);
	}
	switch (childpid = fork ()) {
		default:
			/* prime the pump ... */
			if (msgsnd (msgqid, &pmsg, sizeof pmsg.mytext, 0)) {
				perror ("msgsnd");
				kill (childpid, 9);
				exit (1);
			}
			parent ();
		case 0:
			child ();
		case -1:
			perror ("fork");
			exit (1);
	}
	exit (1);
}

lab@sdgsunsdgsun.com (Larry Baird) (08/27/88)

in article <166@ispi.UUCP>, jbayer@ispi.UUCP (id for use with uunet/usenet) says:
> 
> I think Richard is right.  I added two sleep(1) to the program, one in
> the child() and one in the parent().  With these additions the program
> starts up and prints TICK/TOCK even when creating the shared memory
> segment for the first time.  I enclosed the new program below:

A better fix is to move the setting of loc[0] 
	(*loc = 1 and *loc = 0)
to after their respective kills.  
The first kill from parent to child will be ignored, but the 
kill from child to parent will sync up the whole process.


-- 
Larry A. Baird 				Software Design Group, Inc.  
Manager, Software Development             800 Trafalgar Ct. Suite 340
UUCP:ucf-cs!sdgsun!lab                    Maitland, FL 32751
CIS: 72355,171                            (407) 660-0006

woods@gpu.utcs.toronto.edu (Greg Woods) (08/27/88)

In article <5872@rpp386.UUCP> jfh@rpp386.UUCP (The Beach Bum) writes:
> In article <5867@rpp386.UUCP> jfh@rpp386.UUCP (The Beach Bum) writes:
> >the new version uses message queues and screams like a banshee.  that
> >should be final proof as to how bullet proof the message queues are
> >under xenix.
> 
> and here it is.  i actually developed this on pigs, a 68020 vme bus
> machine.  the code compiled first time out on rpp386.  portable, no?

I'll ignore that remark...

> for a really good work out, run this on the console.  if you want
> to prove there are NO bugs in the message passing code (despite
     ?????
> what certain SCO bashers will say) run this in the background with
Like ME for instance????
> a real high nice for a few days.  a bug fixed version of the shared
> memory tester could also be run to further debunk the sco nay-sayers.
> what the heck, run them both in the background with a nice of say,
> plus 20, for a couple of days.  that should find any kinks.

How about running it for a couple of weeks, with no nice factor, along
with a shm and a sem tester, in multiple incarnations.  Meanwhile, do
a WHOLE lot of disk and tty I/O.  In other words, push it to the limit.
Make the machine so slow as to be un-usable for anything else.

Come on guys.  Even the support people at SCO came up with a better test
programme, and still had no luck finding any bugs.  It works, but if you
work it too hard, it'll drop.  Now I know better:  don't try to do
something with the wrong tools.

I have no doubt Xenix is a nice little implementation of Unix for those
who can't justify non-PC hardware (all too many in these days of < $1000
clones), and who can't decide if they like SysIII, SysV, V7, or BSD.  A
nice little hack that gives you a little of each, but the best of none.
Mind you, I would rather have it than MS-DOS or OS/2.  [ and you'll note
I don't put a smiley after this sentence ]  I should also say that the
SCO support people do try, and care about the quality of their product.
It's just that they had a lot to do to make up for a poor start, and
they are working on the most unforgiving hardware in common use.
-- 
						Greg Woods.

UUCP: utgpu!woods, utgpu!{ontmoh, ontmoh!ixpierre}!woods
VOICE: (416) 242-7572 [h]		LOCATION: Toronto, Ontario, Canada

haugj@pigs.UUCP (Joe Bob Willie) (08/28/88)

In article <114@telmail.UUCP> root@telmail.UUCP (Richard Ronteltap) writes:
>Because loc[0] was initialised to 0, the child process waits if it happens
>to get to the 'while' loop first. The parent process passes the loop, prints
>TICK, changes *loc to 1 and signals the child process. AT THIS INSTANT, i.e.
>BEFORE the parent reaches pause(), the scheduler transfers control to the
>child process. (btw is this possible?)

what appears to have been happening is that if the parent ran all the
way to the kill() call before the child called signal(), the child dies
from the signal and the parent waits in pause() for the dead child to
kill the parent().  this can only happen with the parent because of setting
*loc = 0.  if the CHILD beat the PARENT through the loop after being
kill()'d to kill() the parent BEFORE the parent executes the pause(),
the parent waits in pause() forever for a signal which has already been
delivered, along with the child who is waiting for the parent.

it is possible for the scheduler to pick any runnable process at just
about any time (well, only certain times, but it appears suitably random
to the process) to run, and may suspend any running process to do so.
the only restriction on putting processes to sleep is that a process
running in system space can't be involuntarily put to sleep.  it must
call sleep() itself.
-- 
=-=-=-=-=-=-=-The Beach Bum at The Big "D" Home for Wayward Hackers-=-=-=-=-=-=
               Very Long Address: John.F.Haugh@rpp386.dallas.tx.us
                         Very Short Address: jfh@rpp386
                           "ANSI C: Just say no" -- Me

haugj@pigs.UUCP (Joe Bob Willie) (08/28/88)

In article <105@sdgsunsdgsun.com> lab@sdgsunsdgsun.com (Larry Baird) writes:
>in article <166@ispi.UUCP>, jbayer@ispi.UUCP (id for use with uunet/usenet) says:
>A better fix is to move the setting of loc[0] 
>	(*loc = 1 and *loc = 0)
>to after their respective kills.  
>The first kill from parent to child will be ignored, but the 
>kill from child to parent will sync up the whole process.

the original code didn't use signals.  unix signals can result in race
conditions since there is no atomic method to send a signal and wait for
the receipt of a signal with one system call.  so long as control returns
to the user between the kill() and the pause(), a race exists.

in this case, should the scheduler choose to execute the child immediately
after the parent (works either way, by the way) sets *loc = 1, the child
can go all the way around its loop and kill the parent before it gets a
chance to enter pause().

fr@icdi10.uucp (Fred Rump from home) (08/30/88)

Yes, Greg. That's exactly the point. And you said it.
Give us a reason to use xyz machine with abc software and we'll do it. In
the meantime Xenix on fast 386's runs just fine for the rest of us.
-- 
{allegra killer gatech!uflorida decvax!ucf-cs}!ki4pv!cdis-1!cdin-1!icdi10!fr    
26 Warren St.             or ...{bellcore,rutgers,cbmvax}!bpa!cdin-1!icdi10!fr
Beverly, NJ 08010       or...!bikini.cis.ufl.edu!ki4pv!cdis-1!cdin-1!icdi10!fr
609-386-6846          "Freude... Alle Menschen werden Brueder..."  -  Schiller

chip@vector.UUCP (Chip Rosenthal) (09/01/88)

A couple of comments and questions about the IPC test program -- the
shared memory version, not the message queue one.

>int	zero = 0;
>int	*loc = &zero;

Why is loc being set here?  One of the first actions in main() is:

>	if ((loc = (int *) shmat (id, (char *) 0, 0)) == (int *) -1) {

I don't understand the purpose of "zero".  Can anybody help out?

Second, wouldn't it be more realistic to drop the pause() and just do
a polling loop?  I would change:

>		while (*loc)
>			;

to something like:

>		while (*loc)
>			sleep(1);

In a multi-processing package, it is reasonable to fix the IPC service
id number to a known value.  (Grrrr...I've heard the performance arguments.
I *still* wish IPC mapped to a filesystem name rather than using a stupid,
magic ID number.)  But, is it realistic for the service requestor to know
the PID of the service server?  Furthermore, this would get rid of the
bugs which have been pointed out.  All of which are with signals and not
SysV IPC.  And we all know how reliable signals are :-(
-- 
Chip Rosenthal     chip@vector.UUCP | I've been a wizard since my childhood.
Dallas Semiconductor   214-450-0486 | And I've earned some respect for my art.

jfh@rpp386.Dallas.TX.US (The Beach Bum) (09/02/88)

In article <530@vector.UUCP> chip@vector.UUCP (Chip Rosenthal) writes:
>A couple of comments and questions about the IPC test program -- the
>shared memory version, not the message queue one.
>
>Why is loc being set here?  One of the first actions in main() is:

the code was old and moldy from other uses and i didn't clean it
up.

>I don't understand the purpose of "zero".  Can anybody help out?

originally it was there for paranoia.

>Second, wouldn't it be more realistic to drop the pause() and just do
>a polling loop?  I would change:

yes, the original did do polling, but the tick ... ... tock's came
at one second intervals as each process's quantum expired.  [ this
is only true on an idle system where no pre-emption is occurring. 
more or less ;-) ]  putting in the signals sped things up, so i
posted it.  i didn't ever expect it to fall under close scrutiny.

>In a multi-processing package, it is reasonable to fix the IPC service
>id number to a known value.  (Grrrr...I've heard the performance arguments.
>I *still* wish IPC mapped to a filesystem name rather than using a stupid,
>magic ID number.)  But, is it realistic for the service requestor to know
>the PID of the service server?

i suppose it would depend on the implementation.  if you are using
semaphores, then i doubt it.  for shared memory, why not?
-- 
John F. Haugh II (jfh@rpp386.Dallas.TX.US)                   HASA, "S" Division

    "If the code and the comments disagree, then both are probably wrong."
                -- Norm Schryer

chip@vector.UUCP (Chip Rosenthal) (09/04/88)

In article <6141@rpp386.Dallas.TX.US> jfh@rpp386.Dallas.TX.US (The Beach Bum) writes:
>In article <530@vector.UUCP> chip@vector.UUCP (Chip Rosenthal) writes:
>>is it realistic for the service requestor to know
>>the PID of the service server?
>if you are using semaphores, then i doubt it.  for shared memory, why not?

I guess you are right.  Probably do something like have the service requestor
attach to the segment, read out the PID of the service provider, leave the
request, and then signal the provider that a request is awaiting.

You wouldn't be beating on it as hard as the test case did, so the chance
of signal races is reduced.  The only limitation I see is that the requestor
needs to be the same UID as the provider to send the signal.