[net.unix-wizards] SHMOP

root%bostonu.csnet@CSNET-RELAY.ARPA (BostonU SysMgr) (10/23/85)

Ok, nobody leave the room...someone call the UNIX police...

From SHMOP(2):

char *shmat(...)
...

Upon successful completion, the return value is as follows:

	Shmat returns the data segment start address of the
	attached shared memory segment
...
Otherwise, a value of -1 is returned...
---
Yup, sure does, about the only one like it I know of, test has to go
something like:

	if(((int) (foo = (footype) shmat(args))) == -1) error...

anyone know any good reason why it (peculiarly) doesn't return NULL on
failure? Even that use of an (int) cast isn't quite right, long may be
closer, maybe, or some union tomfoolery. I would understand if this
were old but it isn't. I doubt very much NULL is a reasonable return
value on success. (note: SYSVR2/3B2 if it makes any difference.)

	-Barry Shein, Boston University

gwyn@BRL.ARPA (VLD/VMB) (10/23/85)

	char *sbrk( int incr );
also returns -1 (presumably this really means (char *)-1)
on failure.  This is a holdover from the good old days of
PDP-11 UNIX; it really ought to be NULL also.

(There is some hilarious code in a few SVR2 utilities
that checks the return of malloc() for -1.  Fixed at BRL.)
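
A minimal sketch of the two failure conventions side by side, assuming a modern libc where sbrk() and malloc() return void * rather than char *. The helper names grow_break and alloc_ok are hypothetical and exist only for illustration.

```c
#define _DEFAULT_SOURCE   /* assumption: some libcs need this to declare sbrk() */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

/* Hypothetical helper: move the break by incr bytes.
 * Returns 0 and stores the previous break in *old_break, or -1 on failure. */
int grow_break(intptr_t incr, void **old_break)
{
    void *p;

    assert(old_break != NULL);
    p = sbrk(incr);
    if (p == (void *)-1)        /* sbrk's failure value is -1, not NULL */
        return -1;
    *old_break = p;
    return 0;
}

/* Hypothetical helper: malloc reports failure with NULL, never with -1.
 * Call it with n > 0; malloc(0) may legitimately return NULL. */
int alloc_ok(size_t n, void **out)
{
    void *p;

    assert(out != NULL);
    p = malloc(n);
    if (p == NULL)
        return -1;
    *out = p;
    return 0;
}
```

Comparing a malloc() result against -1, as in the utilities mentioned above, can never detect a failure, because the failure value there is NULL.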

rjnoe@riccb.UUCP (Roger J. Noe) (10/24/85)

> char *shmat(...)
> ...
> Upon successful completion, the return value is as follows:
> ...
> Otherwise, a value of -1 is returned...
> ---
> Yup, sure does, about the only one like it I know of, test has to go
> something like:
> 
> 	if(((int) (foo = (footype) shmat(args))) == -1) error...
> 
> anyone know any good reason why it (peculiarly) doesn't return NULL on
> failure? Even that use of an (int) cast isn't quite right, long may be
> closer, maybe, or some union tomfoolery. I would understand if this
> were old but it isn't. I doubt very much NULL is a reasonable return
> value on success. (note: SYSVR2/3B2 if it makes any difference.)
> 
> 	-Barry Shein, Boston University

I prefer to test:

	char *cp;
	if((cp = shmat(...)) == (char *)(-1))
		perror("UNIX System V IPC system calls suck");

But what worries me is that the (char *)(-1) might be a legitimate value for
a pointer to a character.  I think only NULL is required by the language to
never point to anything.  The bit pattern for (char *)(-1) could conceivably
be indistinguishable from a successful return value.  Why is this not NULL?
Because little minds stuck to their foolish consistency that all system calls
should return -1 upon failure.  What they forgot was that all system calls
should return integers!  shmat() in particular should have been allowed
another argument, type (char **) in which one could pass &cp.
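
The out-parameter interface suggested above could be approximated with a wrapper, sketched here under some assumptions: shm_attach is a hypothetical name, not a real system call, and the modern prototype of shmat() (returning void *) is used rather than the char * form quoted from SHMOP(2).

```c
#define _XOPEN_SOURCE 700
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/shm.h>

/* Hypothetical wrapper: returns an int status like most system calls and
 * hands the attach address back through *out.
 * On failure it returns -1, leaves *out untouched, and errno is as set
 * by shmat(). */
int shm_attach(int shmid, int shmflg, char **out)
{
    void *p;

    assert(out != NULL);
    p = shmat(shmid, NULL, shmflg);   /* NULL address: let the system choose */
    if (p == (void *)-1)
        return -1;
    *out = (char *)p;
    return 0;
}
```

A caller would then write `if (shm_attach(id, 0, &cp) == -1) perror("shm_attach");` and never compare a pointer against -1 directly.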

The interprocess communication primitives included in System V (message
queues, semaphores, and shared memory) are all very messed up.  They were
clearly not thought out too well in either their design or implementation.
With any luck, they'll go away soon and be replaced by something that works.
--
Roger Noe			ihnp4!riccb!rjnoe

bilbo.niket@LOCUS.UCLA.EDU (Niket K. Patwardhan) (10/24/85)

I thought that every SYSTEM CALL returned -1 on error. For example, read
returns -1 if you give it a fid that hasn't been opened, is not readable etc.
EOF is NOT considered an error, and is signalled by a 0 return. Thus SHMOP
returning -1 on an error would be consistent with all other system calls.

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/24/85)

> I thought that every SYSTEM CALL returned -1 on error. For example, read
> returns -1 if you give it a fid that hasn't been opened, is not readable etc.
> EOF is NOT considered an error, and is signalled by a 0 return. Thus SHMOP
> returning -1 on an error would be consistent with all other system calls.

So tell me, what does that mean for a (char *)-valued function?

root%bostonu.csnet@CSNET-RELAY.ARPA (BostonU SysMgr) (10/25/85)

>From: "Niket K. Patwardhan" <bilbo.niket@locus.ucla.edu>
>
>I thought that every SYSTEM CALL returned -1 on error. For example, read
>returns -1 if you give it a fid that hasn't been opened, is not readable etc.
>EOF is NOT considered an error, and is signalled by a 0 return. Thus SHMOP
>returning -1 on an error would be consistent with all other system calls.

The real problem here is that very few system calls return pointers.
There is a broader C convention that says that on failure routines
which would return pointers return NULL because of the problems of
representing -1 as a pointer in all cases (and, conversely, the promise
that NULL is unique among pointers [please, let's not rehash that here.])

The de facto convention has two rules and goes something like:

	Routines which would return an integer return -1 on failure
	Routines which would return a pointer return NULL on failure

Yes, this is shaky ground; as Doug Gwyn noted to me, about the only
other candidate in UPM(II) (sbrk()) returns -1 also, tho it is very
old. Also, as far as I know this convention has never been codified
as a UNIX system call standard, perhaps only a C convention. Even the
above leaves something to be desired (consider atoi()).
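
To make the atoi() remark concrete: atoi() returns 0 both for the string "0" and for input with no digits at all, so neither rule gives it a usable failure value. A rough sketch of one way around that, using strtol() from the standard library; parse_long is a hypothetical helper name, and the status-plus-out-parameter shape is just one possible design.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

/* Hypothetical helper: convert a decimal string to long.
 * Returns 0 and stores the value in *out, or -1 if s is not a complete,
 * in-range decimal number.  Every long, including -1 and 0, stays
 * available as a successful result. */
int parse_long(const char *s, long *out)
{
    char *end;
    long v;

    assert(s != NULL && out != NULL);
    errno = 0;
    v = strtol(s, &end, 10);
    if (end == s || *end != '\0')   /* no digits, or trailing junk */
        return -1;
    if (errno == ERANGE)            /* value does not fit in a long */
        return -1;
    *out = v;
    return 0;
}
```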

So, you are not wrong, you may even be right. You may be stating the
way it works while I am stating the way it seems it should have worked.

The upshot is: Don't take it too seriously, you'll lose your mind.

	-Barry Shein, Boston University

bradbury@oracle.UUCP (Robert Bradbury) (10/26/85)

From Roger Noe:

> But what worries me is that the (char *)(-1) might be a legitimate value for
> a pointer to a character.  I think only NULL is required by the language to
> never point to anything.  The bit pattern for (char *)(-1) could conceivably
> be indistinguishable from a successful return value.  Why is this not NULL?
> Because little minds stuck to their foolish consistency that all system calls
> should return -1 upon failure.  What they forgot was that all system calls
> should return integers!  shmat() in particular should have been allowed
> another argument, type (char **) in which one could pass &cp.

From the X3J11 C Standard, C.2.2.3:
	"The integer constant 0 is converted to a pointer of the
	appropriate type that is guaranteed not to point to an OBJECT.
	Such a pointer, called a null pointer, must appear to be equal to the
	integer constant 0."
From D.1.1:
	"NULL ... [can be used] as an argument to represent the null pointer".

It is highly questionable whether the result of shmat() can be considered
to point to an "OBJECT", so it may be perfectly allowable from a C standard
point of view to return NULL.  I can imagine machines (I&D space PDP11s)
where you could detach your data space and attach a shared memory segment
at location zero, so NULL (or 0) is a reasonable result from shmat().

At the same time the result of shmat() must be on a SHMLBA boundary.
Out of the 8 or so machines we have used this on, the smallest boundary
I've seen is a 512 byte boundary, so although (char *)-1 may be a pointer
to a character, it is not a legal result from shmat() and may thus be used
to indicate failure of the system call.
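
The alignment argument can be checked with a little arithmetic. This is only a sketch: it assumes a flat address space in which (void *)-1 converts to an all-ones integer, and it takes the boundary as a parameter instead of using the system's SHMLBA. Both function names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Nonzero if addr is a multiple of boundary. */
int is_aligned(uintptr_t addr, uintptr_t boundary)
{
    assert(boundary != 0);
    return addr % boundary == 0;
}

/* Hypothetical check: could p be an attach address, given that attach
 * addresses must fall on the given boundary? */
int could_be_attach_address(const void *p, uintptr_t boundary)
{
    return is_aligned((uintptr_t)p, boundary);
}
```

An all-ones address is odd, so it fails the check for any boundary of 2 or more, including the 512-byte case mentioned above; address 0 passes for every boundary.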

> The interprocess communication primitives included in System V (message
> queues, semaphores, and shared memory) are all very messed up.  They were
> clearly not thought out too well in either their design or implementation.
> With any luck, they'll go away soon and be replaced by something that works.
> 

Are they really that messed up?  What criteria do you use to judge that?

We have used all of the calls (shared memory, messages and semaphores)
in implementations of Oracle running on a number of machines.  As articles
at the most recent USENIX, UNIX Review (8/85) and the Bell System Technical
Journal (11/82) indicate, they are critical to the implementation of high
performance DBMS packages [and good multi-user real-time games :-)].

If you were to take a look at shared memory implementations on other operating
systems (VM, MVS, VMS, AOS, etc), you might find that the UNIX facilities are
not all that bad.

My complaints with these calls would be:
 a) The documentation for semaphores is unintelligible, but if you can
    figure them out they work much better than using a signal/kill
    mechanism.  I think the basic problem is that the call is overloaded.
 b) The manufacturers do not test the calls.  We have found bugs (historically)
    in releases from AT&T (3B20,3B2), Amdahl (UTS), and Pyramid.
 c) If you could tell when a process associated with a message queue died
    (via a signal), you could use messages to replace pipes (they are more
    efficient).  Using an alarm() with a msgrcv() adds too much overhead.
 d) You never know where you can attach shared memory segments if you want
    to put them at fixed addresses.  You have to use test programs
    to determine the valid addresses because they aren't documented.
    The difference in location and direction of growth of shared memory
    segments on two supposedly identical machines (the 3B2 and 3B5)
    is a joke!  [However, this is a problem with AT&T's hardware and software
    people not talking to each other -- not a problem with shared memory.]

As far as going away, I sure hope not -- they are in the System 5 Interface
Definition.  And they really do work (despite Roger's comments).
Now, if the Berkeley people would only wake up and put them into 4.3 or 4.4
life would be just great :-).
-- 
Robert Bradbury
Oracle Corporation
(206) 364-1442                            {ihnp4!muuxl,hplabs}!oracle!bradbury

hokey@plus5.UUCP (Hokey) (10/28/85)

Robert, The SysV semaphore system implements a *stack* instead of a *queue*.

Should I assume you believe that behavior to be a *feature*?  How can it
possibly be useful in a multiuser environment?

Granted, this is a problem in the implementation of semaphores, but it has
not been fixed, and semaphores *still* provide a mutex method which is too
low-level and expensive to be useful for implementing shared database locks
in non-record oriented databases.

-- 
Hokey           ..ihnp4!plus5!hokey
		  314-725-9492

rml@hpfcla.UUCP (10/29/85)

Shmat(2) and sbrk(2) aren't the only two calls which return pointers;
look at signal(2).  This is one case where NULL wouldn't do as an
error return, since there are two out-of-band values for successful
returns, and SIG_DFL is generally #define'd as 0.  There has
been some effort toward easing type-correct code here, as BSD has
added

	#define	BADSIG		(int (*)())-1

and the SVID mentions under FUTURE DIRECTIONS that AT&T will add a
similar macro (with a different name, SIG_ERR).  Things are, at least,
greatly improved from V6, when SIG_DFL and SIG_IGN didn't exist and
the hard-coded constants 0 and 1 were used.
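
With SIG_ERR (which did later become part of standard C), the check can be written without any cast. A minimal sketch; set_handler is a hypothetical helper name.

```c
#include <assert.h>
#include <signal.h>

/* Hypothetical helper: install handler for signo.
 * Returns 0 on success, -1 if signal() reports an error.
 * NULL cannot serve as the error value here, since SIG_DFL is commonly
 * defined as 0 and is a valid previous handler. */
int set_handler(int signo, void (*handler)(int))
{
    assert(handler != SIG_ERR);   /* SIG_ERR is a return value, not a handler */
    if (signal(signo, handler) == SIG_ERR)
        return -1;
    return 0;
}
```

For example, `set_handler(SIGKILL, SIG_IGN)` is expected to return -1 on POSIX systems, because SIGKILL cannot be ignored.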

From the way most UN*X implementations (including the early ones) have
been written, *all* system calls return -1 on failure.  Since they all
pass through a common trap routine, they all pass back success/failure
indication and return values via a common mechanism.  The C library
stubs all check this common success/failure indication and, on error,
branch to common code which sets a return value of -1.  Of course it
would be quite easy to write the stub for a particular system call to
set a return value of NULL, but this hasn't been done.  It would also
have been possible to have shmat return its pointer through another
parameter.  Realistically, there's enough code which checks sbrk and
signal return values for -1 that a machine on which -1 was a valid
return for either would have portability problems.  I agree that at
least the error return from shmat could be documented as (char *)-1.

			Bob Lenk
			{ihnp4, hplabs}!hpfcla!rml

henry@utzoo.UUCP (Henry Spencer) (10/31/85)

> The interprocess communication primitives included in System V (message
> queues, semaphores, and shared memory) are all very messed up.  They were
> clearly not thought out too well in either their design or implementation.

It's no particular secret that the SysV IPC swill was devised by taking the
union of several different sets of IPC mechanisms devised by different groups
within the Bell System.  This is why there's virtually no software in the
distributed Unixes that employs them:  they were invented for Bell internal
applications work, too specialized to be distributed widely.  Since each
group invented its own mechanism, software using them was very specific
to individual groups and didn't transport well enough to gain general
acceptance.  Remember that System V started out as the attempt to pull all
of the Bell-internal Unix variants together.  Naturally, each of those groups
leaned heavily on the System V people to include *their* favorite IPC scheme,
unchanged, so they wouldn't have to fix all their software.  Given that
the System V group was short on both the technical creativity to do it right
and the political clout to make the result stick, it was inevitable that they
would implement the union of the schemes rather than the intersection.

> With any luck, they'll go away soon and be replaced by something that works.

Don't get your hopes too high, given that they got into the SysV Interface
Definition (despite attempts to keep them out).
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

thomas@utah-gr.UUCP (Spencer W. Thomas) (11/03/85)

In article <132000020@hpfcls.UUCP> rml@hpfcla.UUCP writes:
>From the way most UN*X implementations (including the early ones) have
>been written, *all* system calls return -1 on failure.  

Hoo boy!  Gotcha on this one.  Both the PDP-11(V6/V7) and
VAX(32V-4.3bsd) versions of Unix set the CARRY BIT to indicate error.
The actual error code is returned in r0.  Returning -1 for error is a C
LANGUAGE convention (since it's hard to test the carry bit in C).
Granted, the routine "cerror" always returns -1 on an error.  This is a
bitch when the system call can legitimately return -1.
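
The mechanism described above can be modeled in portable C. This is only a sketch of the idea, not the actual assembly-language cerror: struct trap_result and stub_return are hypothetical names standing in for the carry bit, r0, and the common library code.

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical model of what the kernel hands back from a trap. */
struct trap_result {
    int  carry;   /* nonzero when the kernel flagged an error */
    long r0;      /* error code if carry is set, otherwise the result */
};

/* What a C library stub does with it: on error, save the code in errno
 * and return -1; otherwise pass r0 through unchanged. */
long stub_return(struct trap_result r)
{
    assert(!r.carry || r.r0 > 0);   /* error codes are positive */
    if (r.carry) {
        errno = (int)r.r0;
        return -1;
    }
    return r.r0;
}
```

When a call can legitimately return -1, the return value alone is ambiguous. The usual workaround is to set errno to 0 before the call and treat -1 as a failure only if errno is nonzero afterwards.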

-- 
=Spencer   ({ihnp4,decvax}!utah-cs!thomas, thomas@utah-cs.ARPA)
	"When wrath runs rampage in your heart you must hold still
	 that rambunctious tongue!" - Sappho