[comp.std.c] Bug in users command

jfh@rpp386.cactus.org (John F Haugh II) (01/21/91)

[ I'm redirecting this to the ANSI-C group because it seems that Dan has
  some misunderstanding about what the current standards say about passing
  arrays and their addresses. ]

In article <24748:Jan2016:53:4291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>> How does a pointer to an unknown number of characters differ from a
>> pointer to a known number of characters when passed as an argument?
>
>It doesn't. It does, however, differ from a pointer to an *array* of
>characters.

How?  When an array of any size, known or unknown, is passed as an
argument to a function, the value which is passed is the address of
the first element in the array.  The address of an array, that is,
the address of each member of "char names[MAXUSERS][UT_NAMESIZE]"
where each member is an "array[UT_NAMESIZE] of type char", is the
address of the first member of that array.

>scmp() does not get passed a char [UT_NAMESIZE]. It gets passed a
>pointer to a char [UT_NAMESIZE]. You claim there is an equivalence
>between pointer to array of char and pointer to char? You claim it's
>spelled out in K&R? I don't believe you.

It gets passed the address of the first character in the
"array[UT_NAMESIZE] of type char".  K&R is explicit on this point.
A "pointer to a char [UT_NAMESIZE]" is a pointer to a char with
the value of the 0th element in the array.  This is why "&x[0]"
is identical to an unadorned "x" for all types of "x" and all sizes
of the array of "x"'s.

>Of course, most implementations use the same type for all pointers
>internally. But what would happen on a machine where pointers to words
>are stored differently (say as the number of bytes divided by 4) while
>pointers to characters are stored as byte indices? Then scmp() will
>treat its arguments as byte indices, when in fact they could be a factor
>of 4 off.

It really doesn't matter in this case.  The "names" array is merely
an "array [MAXUSERS][UT_NAMESIZE] of type char".  Pointer arithmetic
in this case is very well defined.  The value of the pointer should
be the address of the 0th element of the array, which is an "array
[UT_NAMESIZE] of char".  The value of that 0th element is again
the address of the 0th element of the array, or a (char *) which
points to the 0th character.  The address of the 1st, 2nd, or nth
element of the "names" array can then be computed by adding "n *
UT_NAMESIZE" to the address of the 0th element.
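
A minimal sketch of the address arithmetic described above. The sizes are
assumed values chosen for illustration (the real UT_NAMESIZE comes from
<utmp.h>), and row_offset_matches() is a hypothetical helper, not part of
users.c.

```c
#include <assert.h>
#include <stddef.h>

#define MAXUSERS    4   /* assumed small value for the sketch */
#define UT_NAMESIZE 8   /* assumed; the real value comes from <utmp.h> */

static char names[MAXUSERS][UT_NAMESIZE];

/* Returns 1 if names[n] starts n * UT_NAMESIZE bytes into the array. */
static int row_offset_matches(size_t n)
{
    char *base = (char *) names;   /* the whole array viewed as bytes */
    char *row = names[n];          /* names[n] decays to &names[n][0] */

    assert(n < MAXUSERS);
    return row == base + n * UT_NAMESIZE;
}
```

Calling row_offset_matches() for each valid index should return 1, since
array elements are laid out contiguously with no padding between rows.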

The only mistake which I see is that qsort() is called with a (char *)
parameter, not (void *), but I believe that there is an explicit
requirement that all (char *) pointers be identical to all (void *)
pointers.  Of course, a comparison function which declared its
parameters as (long *) and not (void *) =would= be called incorrectly
by qsort(), but then by the rule above,
(void *) and (char *) are indistinguishable.  There are machines
with bizarre pointer types, and I'm certain such a piece of code
would be incorrect, but this particular code is just fine.
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"While you are here, your wives and girlfriends are dating handsome American
 movie and TV stars. Stars like Tom Selleck, Bruce Willis, and Bart Simpson."

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (01/23/91)

For comp.std.c readers: This argument started when I said that the BSD
users.c appears to be incorrect. It passes a two-dimensional character
array to qsort(), but the comparison function was expecting just a
pointer to characters. John says that's correct.

In article <18969@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
> [ I'm redirecting this to the ANSI-C group because it seems that Dan has
>   some misunderstanding about what the current standards say about passing
>   arrays and their addresses. ]

Really? Then why can't you point out my mistake?

Suppose you have the following:

  foo(s) char *s; { ... }
  char c[300][500];
  foo(&(c[17]));

Is this correct?

I see it this way. c is an array of array of char. c[17] is an array of
char. &(c[17]) is a pointer to an array of char. foo does not expect a
pointer to an array of char; it expects a pointer to char.

We all know that the value of an array is the value of a pointer to the
first element of an array. But that doesn't apply here.

Is there a conversion that says ``a pointer to c[17] is a pointer to
c[17][0]''? I don't see any justification for that in K&R. I certainly
wouldn't use such a coding style, and I hope Saber-C and other program
checkers complain about it.

Now John is right that this code will work on most real machines---pcc
will even throw away the & in &(c[17]) and complain about it. But
suppose you have a machine where pointers to characters are stored as
byte addresses, while all larger pointers are aligned on word boundaries
and then divided by 4 internally. Then &(c[17]) will be a factor of 4
different from &(c[17][0]), and the code will fail miserably.
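
For illustration, a sketch of how the call can be written so that the
argument type matches what foo() expects, which sidesteps the question
entirely. Here foo() is a hypothetical stand-in that just measures the
string, and call_foo_on_row() is an assumed wrapper name.

```c
#include <assert.h>
#include <string.h>

static char c[300][500];

/* Hypothetical stand-in for foo(): it expects a pointer to char. */
static size_t foo(const char *s)
{
    return strlen(s);
}

static size_t call_foo_on_row(void)
{
    char (*row)[500] = &c[17];   /* pointer to an array of 500 char */

    assert(sizeof *row == 500);  /* *row is the whole array c[17] */
    strcpy(c[17], "dan");

    /* *row is an array, so it decays to a pointer to its first char,
       which is exactly the type foo() declares. */
    return foo(*row);
}
```

Passing c[17] directly would be equivalent to passing *row here; both
are array expressions that convert to (char *) without a cast.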

> In article <24748:Jan2016:53:4291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >> How does a pointer to an unknown number of characters differ from a
> >> pointer to a known number of characters when passed as an argument?
> >It doesn't. It does, however, differ from a pointer to an *array* of
> >characters.
> How?  When an array of any size, known or unknown, is passed as an
> argument to a function,

There is no array being passed. A *pointer* to the array is being
passed.

> The address of an array, that is,
> the address of each member of "char names[MAXUSERS][UT_NAMESIZE]"
> where each member is an "array[UT_NAMESIZE] of type char", is the
> address of the first member of that array.

We're talking about C pointers, not their most common implementation.

> >scmp() does not get passed a char [UT_NAMESIZE]. It gets passed a
> >pointer to a char [UT_NAMESIZE]. You claim there is an equivalence
> >between pointer to array of char and pointer to char? You claim it's
> >spelled out in K&R? I don't believe you.
> It gets passed the address of the first character in the
> "array[UT_NAMESIZE] of type char".  K&R is explicit on this point.

Where?

> A "pointer to a char [UT_NAMESIZE]" is a pointer to a char with
> the value of the 0th element in the array.  This is why "&x[0]"
> is identical to an unadorned "x" for all types of "x" and all sizes
> of the array of "x"'s.

No. Your second sentence is correct, but it is not logically connected
to the previous sentence.

---Dan

jfh@rpp386.cactus.org (John F Haugh II) (01/24/91)

In article <6182:Jan2222:06:3991@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>For comp.std.c readers: This argument started when I said that the BSD
>users.c appears to be incorrect. It passes a two-dimensional character
>array to qsort(), but the comparison function was expecting just a
>pointer to characters. John says that's correct.

Repeat after me ... the type which qsort() expects is a (void *) (or
in the older circles, (char *)).  The comparison function should also
be declared to accept two (void *)'s (or in the older circles, (char *)'s).
This is all in the documentation.

In the "Notes" section for my compiler -

	"The pointer to the base of the table should be of type
	 pointer to element, and cast to pointer to character."

This is the same vein as every other compiler in existence, and I
believe (because my ANSI is not here right now) that ANSI C declares
qsort() to have the following prototype (modulo a few "const"'s here
and there) -

void qsort (void *, unsigned, unsigned, int (*) (void *, void *));

That should make it pretty clear what type qsort() expects.
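
For reference, the prototype as it appears in the ANSI/ISO standard is
void qsort(void *base, size_t nmemb, size_t size,
int (*compar)(const void *, const void *)). The sketch below shows one way
a comparison function for the "names" table might be written against that
prototype. UT_NAMESIZE is given an assumed value, and sort_names() is a
hypothetical wrapper, not the BSD code.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define UT_NAMESIZE 8   /* assumed; the real value comes from <utmp.h> */

/* Comparison function with the parameter types qsort() requires.
   Each argument points at one element, i.e. one char[UT_NAMESIZE]. */
static int scmp(const void *pa, const void *pb)
{
    /* Convert back to the element pointer type; the explicit cast
       avoids qualifier warnings about pointers to arrays of const. */
    const char (*a)[UT_NAMESIZE] = (const char (*)[UT_NAMESIZE]) pa;
    const char (*b)[UT_NAMESIZE] = (const char (*)[UT_NAMESIZE]) pb;

    return strncmp(*a, *b, sizeof *a);
}

/* Sort `count` fixed-width names in place. */
static void sort_names(char (*names)[UT_NAMESIZE], size_t count)
{
    assert(names != NULL);
    qsort(names, count, sizeof names[0], scmp);
}
```

A caller would declare something like char names[3][UT_NAMESIZE] and call
sort_names(names, 3); the array argument converts to a pointer to its
first row, and that pointer converts to (void *) without a cast.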

>There is no array being passed. A *pointer* to the array is being
>passed.

According to the standard, a pointer to a character or a pointer to
a void is being passed.  There is no mechanism in the standard to
create a pointer to an arbitrary object, or to find out the type of
an object which has been passed to you.

You are more than free to do

int scmp (char (*a)[UT_NAMESIZE], char (*b)[UT_NAMESIZE])
{ ... }

but that most certainly is incorrect, according to the standard,
since that function is not (int (*) (void *, void *)).  The best
you could do is

int scmp (void *_a, void *_b)
{
	char (*a)[UT_NAMESIZE] = _a;
	char (*b)[UT_NAMESIZE] = _b;

	...
}

if that gives you some sense of moral superiority, but you are
stuck doing

	strncmp ((char *) a, (char *) b, sizeof *a);

by your logic in that case.

How would qsort() create the parameters?  You can't do

	(*compar) ((typeof base) (base + n), ...)

there simply is no mechanism to cast a pointer dynamically to some
other type.  How is qsort ever going to create the ((*c)[UT_NAMESIZE])
pointer?  There is no type information regarding what "base" was on
the other side of the function call.  qsort doesn't know, doesn't care
and can't find out.

Finally, by the rule which says that (void *) and (char *) have the
same representation, the BSD use of (char *) in place of (void *)
is perfectly acceptable.

Now are you happy?
-- 
John F. Haugh II                             UUCP: ...!cs.utexas.edu!rpp386!jfh
Ma Bell: (512) 832-8832                           Domain: jfh@rpp386.cactus.org
"13 of 17 valedictorians in Boston High Schools last spring were immigrants
 or children of immigrants"   -- US News and World Report, May 15, 1990

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (01/24/91)

In article <18981@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
> Repeat after me ... the type which qsort() expects is a (void *) (or
> in the older circles, (char *)).  The comparison function should also
> be declared to accept two (void *)'s (or in the older circles, (char *)'s).
> This is all in the documentation.

Yeah, but we're not talking about either side of the qsort() interface.
It's academic that when you're sorting blobs, you have to cast your blob
pointer to (void *) or (char *) before calling qsort(), and then cast
the (void *) or (char *) back to blob * inside the comparison function.

The question here is whether a pointer to an array of 10 blobs is the
same as a pointer to blob. I don't think so.

> int scmp (void *_a, void *_b)
> {
> 	char (*a)[UT_NAMESIZE] = _a;
> 	char (*b)[UT_NAMESIZE] = _b;

Yes! That's exactly my point. I wouldn't bother defining the variables
for this cast, but scmp() has to do something like your example. It's
wrong if you don't cast back to pointer-to-array-of-char.

> if that gives you some sense of moral superiority, but you are
> stuck doing
> 	strncmp ((char *) a, (char *) b, sizeof *a);
> by your logic in that case.

No! That strncmp() call is wrong, wrong, wrong.

The correct call is strncmp(&((*a)[0]),&((*b)[0]),sizeof *a)---or,
equivalently, strncmp(*a,*b,sizeof *a).

Chris or Doug or Karl or somebody out there, could you check me on this?

> How is qsort ever going to create the ((*c)[UT_NAMESIZE])
> pointer?

It doesn't. The only thing you know is that casting to a void * and back
*to the original type* will preserve your results. There may be no
portable way to write qsort(), but that's not my problem. (The most
obvious implementation, of course, is to use char *'s, and add bytes
manually. But I don't know if the standard guarantees that this will
work.)
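
A sketch of that "obvious implementation", under the assumption that
stepping through the table as bytes is acceptable. byte_sort() is a
hypothetical qsort()-style routine (an insertion sort, chosen for
brevity, not the algorithm any real library uses), and int_cmp() is an
example comparison function.

```c
#include <assert.h>
#include <stddef.h>

/* Exchange two non-overlapping elements of `size` bytes, byte by byte. */
static void swap_bytes(char *x, char *y, size_t size)
{
    while (size-- > 0) {
        char tmp = *x;
        *x++ = *y;
        *y++ = tmp;
    }
}

/* Hypothetical qsort()-style routine: it knows only the element size,
   never the element type, and finds elements by adding byte offsets. */
static void byte_sort(void *base, size_t nmemb, size_t size,
                      int (*compar)(const void *, const void *))
{
    char *bytes = base;
    size_t i, j;

    assert(size > 0);
    for (i = 1; i < nmemb; i++) {
        for (j = i; j > 0; j--) {
            char *cur = bytes + j * size;
            char *prev = cur - size;

            if (compar(prev, cur) <= 0)
                break;
            swap_bytes(prev, cur, size);
        }
    }
}

/* Example comparison function for int elements. */
static int int_cmp(const void *pa, const void *pb)
{
    int x = *(const int *) pa;
    int y = *(const int *) pb;

    return (x > y) - (x < y);
}
```

The comparison function is the only place that converts the (void *)
back to the element type, which matches the division of labour being
argued about here.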

> Now are you happy?

No, because your strncmp() call is wrong.

Crusading to turn BSD code into legal C...

---Dan

pmk@craycos.com (Peter Klausler) (01/24/91)

In article <12360:Jan2320:15:5291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>The question here is whether a pointer to an array of 10 blobs is the
>same as a pointer to blob. I don't think so.

They're different pointer types.
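
One visible consequence of the type difference, as a sketch: the two
pointer types step by different distances under pointer arithmetic.
struct blob and check_strides() are hypothetical names used only for
this illustration.

```c
#include <assert.h>
#include <stddef.h>

struct blob { int payload; };   /* hypothetical element type */

static struct blob table[2][10];

static void check_strides(void)
{
    struct blob (*row)[10] = &table[0];   /* pointer to array of 10 blobs */
    struct blob *elem = table[0];         /* pointer to a single blob */

    /* The pointed-to objects have different sizes ... */
    assert(sizeof *row == 10 * sizeof *elem);

    /* ... so adding 1 advances each pointer by a different byte count. */
    assert((char *) (row + 1) - (char *) row
           == (ptrdiff_t) (10 * sizeof(struct blob)));
    assert((char *) (elem + 1) - (char *) elem
           == (ptrdiff_t) sizeof(struct blob));
}
```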

>> int scmp (void *_a, void *_b)
>> {
>> 	char (*a)[UT_NAMESIZE] = _a;
>> 	char (*b)[UT_NAMESIZE] = _b;
>
>Yes! That's exactly my point. I wouldn't bother defining the variables
>for this cast, but scmp() has to do something like your example. It's
>wrong if you don't cast back to pointer-to-array-of-char.
>
>> if that gives you some sense of moral superiority, but you are
>> stuck doing
>> 	strncmp ((char *) a, (char *) b, sizeof *a);
>> by your logic in that case.
>
>No! That strncmp() call is wrong, wrong, wrong.

It's fine.

>The correct call is strncmp(&((*a)[0]),&((*b)[0]),sizeof *a)---or,
>equivalently, strncmp(*a,*b,sizeof *a).

Given the declaration
	char (*a)[N];

the expressions

	(char *) a
	&((*a)[0])
	*a

are all valid in ANS X3.159-1989, all of type "char *", and all yield pointers
to the same char object.
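
A small sketch that checks this claim on whatever implementation compiles
it. N is given an assumed value, and check_three_spellings() is a
hypothetical helper.

```c
#include <assert.h>

#define N 16   /* assumed size for the sketch */

static char buf[N];

static void check_three_spellings(void)
{
    char (*a)[N] = &buf;

    char *p1 = (char *) a;     /* explicit pointer conversion */
    char *p2 = &((*a)[0]);     /* address of the first element */
    char *p3 = *a;             /* array decays to pointer to first element */

    assert(p1 == p2);
    assert(p2 == p3);
    assert(p3 == &buf[0]);
}
```

The second and third forms rely only on the array-to-pointer conversion;
the first relies on the rules for casting between pointer types.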

The key is that arrays rarely remain lvalues as such; an array expression is
converted to a pointer to its first element unless it is the operand of
"sizeof" or unary "&", or is a string literal used to initialize a character
array.

Without this automatic conversion, "x[y]" could not be defined as identical
to "*(x+y)". Please consult section 3.2.2.1 of the standard for the exact
specification of the automatic array lvalue conversion, or any basic ANSI C
text for a more elementary discussion of pointers and arrays in C.

-Peter Klausler, compiler group, Cray Computer Corp.

diamond@jit345.swstokyo.dec.com (Norman Diamond) (01/24/91)

In article <18981@rpp386.cactus.org> jfh@rpp386.cactus.org (John F Haugh II) writes:
>In article <6182:Jan2222:06:3991@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>For comp.std.c readers: This argument started when I said that the BSD
>>users.c appears to be incorrect. It passes a two-dimensional character
>>array to qsort(), but the comparison function was expecting just a
>>pointer to characters. John says that's correct.

This is a tough one, and I don't dare to express an opinion yet.
But it's interesting to notice that certain well-known less careful
language lawyers also haven't answered this one yet.

>According to the standard, a pointer to a character or a pointer to
>a void is being passed.

Obviously.  And the question is, are
  char *a;
and
  char (*b)[35];
required to have the same representation, or perhaps even be treated
as compatible.

>There is no mechanism in the standard to
>create a pointer to an arbitrary object,

This is false, though.  This is why the preceding question exists.
Given
  char c[35];
b and &c have the same type.  (Most, if not all, pre-ANSI compilers
did not do this, but these are irrelevant.)

>or to find out the type of
>an object which has been passed to you.

This is true too, but irrelevant.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.

diamond@jit345.swstokyo.dec.com (Norman Diamond) (01/24/91)

In article <12360:Jan2320:15:5291@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

>The correct call is strncmp(&((*a)[0]),&((*b)[0]),sizeof *a)---or,
>equivalently, strncmp(*a,*b,sizeof *a).
>Chris or Doug or Karl or somebody out there, could you check me on this?

Mr. Bernstein appears to be right.  Here *a is an array, and it is not the
operand of sizeof or unary &, so it has to be converted to a pointer to the
first element of *a.  This is true in both of Mr. Bernstein's suggested calls.
--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.

jef@well.sf.ca.us (Jef Poskanzer) (01/24/91)

In the referenced message, brnstnd@kramden.acf.nyu.edu (Dan Bernstein) wrote:
}(The most
}obvious implementation, of course, is to use char *'s, and add bytes
}manually.

Yeah, that's what I did when I wrote a users clone last year:

    #define MAXNAMES 1000
    static char users[MAXNAMES][UT_NAMESIZE+1];
    (void) strncpy( users[nusers], u.ut_name, UT_NAMESIZE );
    users[nusers][UT_NAMESIZE] = '\0';

And yes, this will fail if more than 1000 users are logged in at
the same time.  Imagine how concerned I am.
---
Jef

  Jef Poskanzer  jef@well.sf.ca.us  {apple, ucbvax, hplabs}!well!jef
      If ignorance is bliss, why aren't there more happy people?

diamond@jit345.swstokyo.dec.com (Norman Diamond) (01/24/91)

In article <1991Jan24.004044.13362@craycos.com> pmk@craycos.com (Peter Klausler) writes:

>Given the declaration
>	char (*a)[N];
>the expressions
>	(char *) a
>	&((*a)[0])
>	*a
>are all valid in ANS X3.159-1989, all of type "char *", and all yield pointers
>to the same char object.

No.  The last two yield pointers to the same char object, the first char in *a.

The first one coerces a pointer-to-array to a pointer-to-char value.
The standard guarantees that you can take a pointer-to-something, cast it to
a pointer-to-char, cast it back to a pointer-to-something, and the result is
equivalent to the original pointer.  It does not guarantee that the value
with type pointer-to-char can actually be used for anything.
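
The round-trip guarantee mentioned above can be sketched as follows.
The array size 35 follows the earlier example in this thread, and
check_round_trip() is a hypothetical helper. The sketch exercises only
the cast-and-cast-back property, not any use of the intermediate value.

```c
#include <assert.h>

static char c[35];

static void check_round_trip(void)
{
    char (*orig)[35] = &c;                       /* &c needs no cast */
    char *as_char = (char *) orig;               /* pointer-to-array -> char * */
    char (*back)[35] = (char (*)[35]) as_char;   /* and back again */

    /* Converting back must yield a pointer equal to the original. */
    assert(back == orig);
}
```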

--
Norman Diamond       diamond@tkov50.enet.dec.com
If this were the company's opinion, I wouldn't be allowed to post it.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (01/24/91)

In article <1991Jan24.004044.13362@craycos.com> pmk@craycos.com (Peter Klausler) writes:
> Given the declaration
> 	char (*a)[N];
> the expressions
> 	(char *) a
> 	&((*a)[0])
> 	*a
> are all valid in ANS X3.159-1989, all of type "char *", and all yield pointers
> to the same char object.

The second and third are obviously correct, and I don't see why anyone
would code the first way when the third is so much simpler.

> The key is that arrays rarely remain lvalues as such; they are converted to
> pointer expressions when not an argument to "sizeof" or unary "&"

This has nothing to do with the first example.

> Without this automatic conversion, "x[y]" could not be defined as identical
> to "*(x+y)". Please consult section 3.2.2.1 of the standard for the exact
> specification of the automatic array lvalue conversion,

No, this has nothing to do with the first example. If you just look at
what's in 3.2.2, you would conclude that (char *) a is not valid.

However, as Doug Gwyn just told me: ``You were probably concentrating on
the spec in 3.2.2.3 and overlooked the one in 3.3.4 Semantics, which in
essence states that pointers can be cast freely to other pointer types,
provided that alignment constraints are properly observed.'' (He's
right, of course.)

So it is correct to call strncmp() with arguments cast to (char *),
provided that characters are aligned the same way as character arrays
(which I think is true).

---Dan

scjones@thor.UUCP (Larry Jones) (01/25/91)

In article <1991Jan24.060113.22461@tkou02.enet.dec.com>, diamond@jit345.swstokyo.dec.com (Norman Diamond) writes:
> [ Given "char (*a)[N];", are "(char *)a", "&(*a)[0]", and "*a" equivalent? ]
> 
> No.  The last two yield pointers to the same char object, the first char in *a.
> 
> The first one coerces a pointer-to-array to a pointer-to-char value.
> The standard guarantees that you can take a pointer-to-something, cast it to
> a pointer-to-char, cast it back to a pointer-to-something, and the result is
> equivalent to the original pointer.  It does not guarantee that the value
> with type pointer-to-char can actually be used for anything.

To rephrase, "Is a pointer to an array, when cast to the array element
type, equivalent to a pointer to the first element of the array?"

I believe that the standard imposes sufficient constraints on the
implementation that the answer to that question must be "yes".
----
Larry Jones, SDRC, 2000 Eastman Dr., Milford, OH  45150-2789  513-576-2070
Domain: scjones@thor.UUCP  Path: uunet!sdrc!thor!scjones
Nobody knows how to pamper like a Mom. -- Calvin