[net.lang.c] C subscripts

ryan@mouton.UUCP (10/26/84)

I hope I am not bringing up an old issue
but this discussion about nested comments brings
a similar niggle to mind; one that I am certain
should raise an equally fruitful controversy.

I am talking about C's arcane subscripting syntax.

Who else out there thinks that
	some_array_name[more][long][names]
is actually a good way to do things?

Just as we need nested comments, because after all other "great languages"
do, I propose that C support "normal" subscripting.

Mathematicians have been around longer than C; Fortran too;
so, in the name of compatibility, which is what the ANSI standard
is all about anyway, the compiler SHOULD accept 
	some_array_name[more,long,names]

Of course the older, hopefully obsolescent, method will be supported,
but it should be discouraged.

I would also propose one small, new, preprocessor built-in,
	SUBTYPE  which would be defined OLD or NEW
as in 
	#if SUBTYPE == OLD
		fprintf(stderr, "buy a new compiler");
so programmers know what's going on.
Not defining SUBTYPE would be considered in an optional subsection
to the standard.  These details I leave to those who know best.

Now, BEFORE the flames start I KNOW the comma operator exists.
This is not overloading because subscripts are expressions, not statements.
So don't start quoting K&R pg 192 to me.

If you're still confused think about it this way:
	a[x,y,z] saves 20% more disk space than a[x][y][z]
and is much easier to type too!

respectfully,
Tom Ryan

p.s.  Does anyone know of a good preprocessor macro to convert
	[a,b,c,...n] into [a][b][c][...][n] for n>1 ?

henry@utzoo.UUCP (Henry Spencer) (10/28/84)

> Who else out there thinks that
> 	some_array_name[more][long][names]
> is actually a good way to do things?

An equally-appropriate question is "who out there thinks this is a
sufficiently severe problem to be worth fixing?".

> Just as we need nested comments, because after all other "great languages"
> do, I propose that C support "normal" subscripting.

I suggest that C needs neither.

> Mathematicians have been around longer than C...

So they have, but that doesn't mean we should change C so that the
multiplication operator is implicit, like it is in math.  "They do it
that way in math" is not really a very relevant argument; we do lots of
things differently in programming languages.  Whether this is a good
thing is an interesting question, which I don't suggest debating here,
but we have ample precedent for being different.

> ... in the name of compatibility, which is what the ANSI standard
> is all about anyway, the compiler SHOULD ...

Compatibility *with* *what*?  Surely not with older C implementations,
which is the major compatibility concern of the ANSI standard; I'm not
aware of any C implementation that has done this.

> Now, BEFORE the flames start I KNOW the comma operator exists.
> This is not overloading because subscripts are expressions, not statements.
> So don't start quoting K&R pg 192 to me.

"expressions, not statements"?  Surely you are confused; a C "assignment
statement" is nothing but an expression with a semicolon after it.  The
distinction you're after, I think, is the one made in things like function
parameter lists:  commas are not comma operators unless within parentheses
or other bracketing.  The expression "x[2,3]" has a perfectly legitimate
(albeit peculiar and unlikely) meaning in C right now, and you are changing
it.  It would not surprise me if somebody, somewhere, had found a use for
the current behavior; to judge by some of the furor over cleaning up the
preprocessor, there is no feature of C so slimy that someone won't find
a real use for it.  Remember that "not breaking existing correct programs"
is the specific "compatibility" objective of the ANSI C committee; you
are proposing a violation of it, for reasons that seem thin.

> p.s.  Does anyone know of a good preprocessor macro to convert
> 	[a,b,c,...n] into [a][b][c][...][n] for n>1 ?

Sounds like you could do it with sed, if things weren't too complicated.
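
As for doing it inside the preprocessor: macros are invoked as NAME(args), so the preprocessor has no way to rewrite a bare [a,b,c]. The nearest approximation is probably a function-like macro per rank. This is a sketch only; AT2, AT3, and macro_demo are made-up names, and one macro per number of subscripts is assumed because a plain function-like macro takes a fixed argument count.

```c
#include <assert.h>

/* Hypothetical macros: the comma-separated indices become
   chained subscripts when the macro expands. */
#define AT2(a, i, j)    ((a)[i][j])
#define AT3(a, i, j, k) ((a)[i][j][k])

static void macro_demo(void)
{
    int cube[2][3][4] = {{{0}}};
    int grid[2][3] = {{0}};

    AT3(cube, 1, 2, 3) = 42;   /* expands to ((cube)[1][2][3]) = 42 */
    AT2(grid, 1, 0) = 7;       /* expands to ((grid)[1][0]) = 7 */

    assert(cube[1][2][3] == 42);
    assert(grid[1][0] == 7);
}
```

The expansion is an lvalue, so the macros work on either side of an assignment.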
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

toml@oliveb.UUCP (Dave Long) (10/28/84)

@ Who else out there thinks that
@        some_array_name[more][long][names]
@ is actually a good way to do things?

    I do!  I prefer to think of multi-dimensional arrays as arrays of arrays,
which the current syntax implies, rather than as different entities.  It
should be easy enough to write a program to convert between notations, though.
-- 
     -- Dave Long --
   {fortune,idi,ios,hplabs,tymix}!oliveb!toml
{allegra,ihnp4,msoft,tty3b,uvacs}!oliveb!toml

alan@drivax.UUCP (Alan Fargusson) (10/29/84)

> @ Who else out there thinks that
> @        some_array_name[more][long][names]
> @ is actually a good way to do things?
> 
>     I do!  I prefer to think of multi-dimensional arrays as arrays of arrays,
> which the current syntax implies, rather than as different entities.  It
> should be easy enough to write a program to convert between notations, though.

I would like both, however I see problems with array[sub,sub]. Specifically,
passing arrays as parameters would have to allow some way for the procedure
to know the size of the various dimensions to compute the index of elements
of the array correctly. This is difficult in a loosely typed language like C.
I think that pointers would have a similar problem.
-- 
---------------------
Alan Fargusson.

{ ihnp4, sftig, amdahl, ucscc, ucbvax!unisoft }!drivax!alan

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (10/30/84)

> Who else out there thinks that
> 	some_array_name[more][long][names]
> is actually a good way to do things?

I do.  There is a deeper meaning to C arrays than to Fortran arrays.

> Just as we need nested comments, because after all other "great languages"
> do, I propose that C support "normal" subscripting.

We DON'T need nested comments.  I brought up the issue solely because I
was getting bored and wanted a new topic for discussion.

> Mathematicians have been around longer than C; Fortran too;
> so, in the name of compatibility, which is what the ANSI standard
> is all about anyway, the compiler SHOULD accept 
> 	some_array_name[more,long,names]

Most mathematicians seldom write subscripts that way!
C is not FORTRAN.  Why should it worry about "compatibility" of appearance?

> I would also propose one small, new, preprocessor built-in,
> 	SUBTYPE  which would be defined OLD or NEW
> as in 
> 	#if SUBTYPE == OLD
> 		fprintf(stderr, "buy a new compiler");
> so programmers know what's going on.

This is exactly the sort of thing we DON'T want in the new standard.

> Now, BEFORE the flames start I KNOW the comma operator exists.
> This is not overloading because subscripts are expressions, not statements.
> So don't start quoting K&R pg 192 to me.

You're right that it's not overloading.  What it is, is AMBIGUOUS.

	type	data[M][N];	/* or as you wish, [M,N] (also ambiguous) */
	data[i];		/* unambiguously a pointer to (type) */
	data[i][j];		/* unambiguously a (type) */
	data[i,j];		/* under existing rules, a pointer to (type);
				   under your proposal, AMBIGUOUS */

Arrays are definitely second-class citizens in C; pointers are full citizens.
I hope the ANSI C committee cleans this up, but leaving [][] syntax intact
seems essential.

bmt@we53.UUCP ( B. M. Thomas ) (10/31/84)

If we have

	char *array[7][3];

It would be a shame if you couldn't say

	*array[3] = "stuff";
Put that into array(7,3) and by the time you have made your grammar smart
enough to do that, I'll be busy making sure that I have a backup copy of
cc so that I don't have to use yours!

andrew@hwcs.UUCP (Andrew Stewart) (11/01/84)

>I am talking about C's arcane subscripting syntax.
>Who else out there thinks that
>	some_array_name[more][long][names]
>is actually a good way to do things?

I'm sorry, but I do.

>Just as we need nested comments, because after all other "great languages"
>do, I propose that C support "normal" subscripting.
>
>Mathematicians have been around longer than C; Fortran too;
>so, in the name of compatibility, which is what the ANSI standard
>is all about anyway, the compiler SHOULD accept 
>	some_array_name[more,long,names]

Ah! We follow the hallowed traditions of F*rtr*n... I see.
I have a minor objection, though -
if you consider an array to be a vector of vectors (of vectors, etc, etc)
then the C method is reasonable. The "normal" method treats the array as
a homogeneous mass, rather than a hierarchical structure.
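
One way to see the hierarchical view in practice: a row of a two-dimensional array is itself an array and can be handed around on its own. A minimal sketch, assuming a 2-by-3 array of int; sum_row and rows_demo are hypothetical names.

```c
#include <assert.h>

/* Hypothetical helper: sums one row of three ints. */
static int sum_row(const int row[3])
{
    return row[0] + row[1] + row[2];
}

static void rows_demo(void)
{
    int grid[2][3] = { {1, 2, 3}, {4, 5, 6} };
    int (*second)[3] = &grid[1];    /* pointer to a whole row */

    /* grid[0] is an array of three ints in its own right. */
    assert(sum_row(grid[0]) == 6);

    /* Dereferencing the row pointer gives the row back. */
    assert(sum_row(*second) == 15);
    assert((*second)[2] == grid[1][2]);
}
```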

>Of course the older, hopefully obsolescent, method will be supported,
>but it should be discouraged.

I would prefer to see the C method adopted by other languages.

>Now, BEFORE the flames start I KNOW the comma operator exists.
>This is not overloading because subscripts are expressions, not statements.
>So don't start quoting K&R pg 192 to me.

Ahem - the expression
	array_name[more,long,names]

will evaluate the expression 'more', then the expression 'long' and then use the
expression 'names' as the subscript value.
It *is* a sequence of comma-operator expressions.
Not a readable facility, but (like so many gruesomes in C) when you need it,
you *need* it.

-- 
----------------------------------
"Not a bug, a feature! It's documented, dash it!"

Andrew Stewart, Dept. of Computer Science, Heriot-Watt University,
		Edinburgh,
		Scotland.

		..!ukc!edcaad!hwcs!andrew

jerry@oliveb.UUCP (Jerry Aguirre) (11/02/84)

As I recall FORTRAN (gerr!) also varied the leftmost subscript
fastest.  That is, an array was stored in memory in the order:
		a[1,1], a[2,1], a[3,1], a[1,2], a[2,2] ...

Is anyone suggesting that we change the order of array storage?  Maybe
the notation a[1,2] would have the leftmost order and the notation
a[2][1] would have the rightmost?
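
The two storage orders come down to two offset formulas. The sketch below uses 0-based indices throughout and assumes a 3-by-2 array of int; row_major, col_major, and order_demo are names invented here.

```c
#include <assert.h>
#include <stddef.h>

enum { ROWS = 3, COLS = 2 };   /* arbitrary sizes for the sketch */

/* Element offset when the rightmost subscript varies fastest (C). */
static size_t row_major(size_t i, size_t j) { return i * COLS + j; }

/* Element offset when the leftmost subscript varies fastest
   (the Fortran order listed above). */
static size_t col_major(size_t i, size_t j) { return j * ROWS + i; }

static void order_demo(void)
{
    int a[ROWS][COLS];
    size_t bytes = (size_t)((char *)&a[2][1] - (char *)a);

    /* C really does place a[2][1] at the row-major offset. */
    assert(bytes == row_major(2, 1) * sizeof(int));

    /* In Fortran order the element after (0,0) is (1,0), and
       (0,1) comes only after a whole column of ROWS elements. */
    assert(col_major(1, 0) == 1);
    assert(col_major(0, 1) == ROWS);
}
```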

And of course FORTRAN arrays always begin with a[1] instead of a[0].
I know a lot of people who get more confused by this than by the use of
a[0][0].  Pascal allows you to specify the starting value of the array.
Is anyone suggesting that we start arrays with other than 0?

Regarding enumerated types as array subscripts.  It would give a
cleaner appearance if the language allowed a typedef as an array size.
So that:
	typedef enum {red, blue, green} color;
	int value[color];

Would declare an array that could be subscripted with:
	value[red], value[blue], value[green]
    or:
	color x;
	value[x];

A warning or error message for an uncast subscript not of the correct
type would be ok.  But would this conflict with the proposed "enumerated
types as int" change?  Is the use of a typedef as an array size going
to confuse the syntax?
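
Something close to this can be had in C as it stands by adding a trailing enumerator that serves as the count. This is a workaround sketch, not the proposed syntax; NCOLORS and enum_demo are hypothetical names, and nothing here makes the compiler check the subscript's type.

```c
#include <assert.h>

/* NCOLORS is a hypothetical sentinel; because enumerators count
   up from 0, it equals the number of real colors. */
typedef enum { red, blue, green, NCOLORS } color;

static int value[NCOLORS];

static void enum_demo(void)
{
    color x = blue;

    value[red] = 1;
    value[x] = 2;       /* subscripting with a variable of the enum type */
    value[green] = 3;

    assert(sizeof value / sizeof value[0] == 3);
    assert(value[blue] == 2);
}
```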

On PORTABILITY:  The comment that you should disallow a feature because
that feature can be used to write unportable code is un-C-like as well
as impractical.  I consider the ability to write unportable code in C
an advantage!  Remember there are several classes of code which are
implicitly not portable.  The bottom level of device drivers is an
obvious example.  No way is the code that twiddles the I/O bits going to
be portable because on any other machine the bits would be different.
Removing that unportable feature might make it impossible
to write device drivers in C.  That would put us back in the dark ages
where drivers are written in assembler or the systems group uses one
language and the applications another.  This is the main failing of
Pascal.  One can't write compilers, editors, linkers, or operating
systems with standard Pascal because it is so over protective.

Remember: languages don't write bad code, programmers do!

				Jerry Aguirre @ Olivetti ATC
{hplabs|fortune|idi|ihnp4|ios|tolerant|allegra|tymix}!oliveb!jerry

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (11/02/84)

> ... however I see problems with array[sub,sub]. Specifically,
> passing arrays as parameters would have to allow some way for the procedure
> to know the size of the various dimensions to compute the index of elements
> of the array correctly. This is difficult in a loosely typed language like C.
> I think that pointers would have a similar problem.

There is already a problem with using variable-sized arrays as
parameters to C functions.  This could be solved in the obvious
way (a la Fortran) without breaking any existing code, but
there has been a remarkable lack of enthusiasm for the idea when
it has been suggested to members of the ANSI C committee.  As
things now stand, array parameters must have CONSTANT dimensions
specified for all but the first subscript.  To have them
determined dynamically at run-time (without being also passed in
as parameters), a mechanism known as "templates" or "dope vectors"
is required.  This is too much complexity for C's intended purpose.
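
The constant-dimension rule and the pass-the-sizes alternative might look like the sketch below. The function names sum_fixed, sum_flat, and param_demo are invented for illustration, and the flat version assumes the caller stores the elements in a one-dimensional array in row order.

```c
#include <assert.h>
#include <stddef.h>

/* Only the first dimension may be left open; the 3 is fixed
   at compile time. */
static int sum_fixed(int a[][3], size_t rows)
{
    int total = 0;
    size_t i, j;

    for (i = 0; i < rows; i++)
        for (j = 0; j < 3; j++)
            total += a[i][j];
    return total;
}

/* Both dimensions arrive as parameters and the index is
   computed by hand from them. */
static int sum_flat(const int *a, size_t rows, size_t cols)
{
    int total = 0;
    size_t i, j;

    for (i = 0; i < rows; i++)
        for (j = 0; j < cols; j++)
            total += a[i * cols + j];
    return total;
}

static void param_demo(void)
{
    int table[2][3] = { {1, 2, 3}, {4, 5, 6} };
    int flat[2 * 3] = { 1, 2, 3, 4, 5, 6 };

    assert(sum_fixed(table, 2) == 21);
    assert(sum_flat(flat, 2, 3) == 21);
}
```

The second form gives up the a[i][j] notation inside the function, which is the price of not having the dimensions carried with the array.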

david@ukma.UUCP (David Herron) (11/07/84)

> Summary: The logic of C's subscripting syntax
> 
> @ Who else out there thinks that
> @        some_array_name[more][long][names]
> @ is actually a good way to do things?
> 
>     I do!  I prefer to think of multi-dimensional arrays as arrays of arrays,
> which the current syntax implies, rather than as different entities.  It
> should be easy enough to write a program to convert between notations, though.
> -- 
>      -- Dave Long --
>    {fortune,idi,ios,hplabs,tymix}!oliveb!toml
> {allegra,ihnp4,msoft,tty3b,uvacs}!oliveb!toml
> 
> 

I agree too!  It is probably easier to learn as it reflects what 
is actually happening.  (The other notation looks like a function call).
-----------------------------------------
David Herron
Phone:	(606) 257-4244 (work, phone will usually be answered as "Vax Lab").
	(606) 254-7820

        Arpa-Net-----\
		      \   (or cbosgd!hasmed!qusavx!ukma!david)
	unmvax----\    \
	research   \____\____ anlams!ukma!david
	boulder    /      /
	ucbvax----/      /
                        /
	decvax!ucbvax--/

For arpa-net, anlams has the name ANL-MCS.  I have been having trouble
getting mail from arpa-net through anlams so maybe try a different route
or the user name "s".

mpackard@uok.UUCP (11/24/84)

[] <- Bug box

I have been using C for a short while but enjoy it very much.
The thought never occurred to me that others would hate it so
much.  I think it is way too late to change the semantics.
If you have trouble with the language semantics I would think
that you would move on and try something else.

adm@cbneb.UUCP (12/03/84)

>Now, BEFORE the flames start I KNOW the comma operator exists.
>This is not overloading because subscripts are expressions, not statements.
>So don't start quoting K&R pg 192 to me.

You're wrong; it IS overloading BECAUSE subscripts are expressions.
Thus, "array[t=3, t+2]" is equivalent to "array[5]".  Your proposed use of
the comma would therefore be ambiguous.
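
The equivalence is easy to confirm. A minimal sketch; the array contents and the name overload_demo are arbitrary.

```c
#include <assert.h>

static void overload_demo(void)
{
    int array[8] = {0, 10, 20, 30, 40, 50, 60, 70};
    int t = 0;
    int picked;

    /* The comma operator is a sequence point, so t = 3 is complete
       before t + 2 is evaluated; the subscript is therefore 5. */
    picked = array[t = 3, t + 2];

    assert(picked == array[5]);
    assert(t == 3);
}
```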