[net.lang.c] c programming style - READ THIS

guy@sun.uucp (Guy Harris) (07/18/85)

> They came up with a language called DASL (*), based very strongly on C
> which actually perverted the meaning of arithmetic in such a way that 
> "argv = argv + n" ACTUALLY MEANT "argv = argv + n*sizeof(argv *)" !!

Well, what do you mean by "actually meant"?  The C operation of adding an
integer to a pointer does ***!!NOT!!*** mean "take the integer value which
has the same bit pattern as the pointer, add the other integer value to it, and
stuff the bit pattern for the resulting value back into the pointer."  It
means (as has been pointed out several times to the readers of this
newsgroup who somehow have decided otherwise) that if the pointer is
considered to be pointing to the Nth element of an array of items of the
type given by dereferencing the pointer, it is to point to the N+Mth item,
where M is the integer value added to the pointer.  If your phrase "actually
meant" means "performs the specified operation on the underlying bit
patterns", "argv = argv + n" *does* "actually mean" (in that sense) "argv +
n*sizeof(*argv)".
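
The two readings of "actually meant" can be put side by side in a small sketch. The function names here are made up for illustration, and the code assumes "p + m" stays inside (or one past the end of) the array that "p" points into.

```c
#include <assert.h>
#include <stddef.h>

/* Distance in elements between p and p + m: always m,
   whatever the size of the element type. */
ptrdiff_t elements_advanced(const long *p, int m)
{
    assert(p != NULL);
    return (p + m) - p;
}

/* Distance in chars between the same two pointers:
   m * sizeof(*p), the "underlying" view of the same addition. */
ptrdiff_t bytes_advanced(const long *p, int m)
{
    assert(p != NULL);
    return (const char *)(p + m) - (const char *)p;
}
```

Given "long a[8];", elements_advanced(a, 3) is 3 on every implementation, while bytes_advanced(a, 3) is 3 * sizeof(long), which varies from machine to machine.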

However, one would hope that the C community has stopped thinking in terms
of the underlying bit patterns by now, and actually has a reasonable
abstract model of what C programs do.  Unfortunately, I'm afraid that's not
the case.  It may be that post-C programming languages will have to avoid
defining the result of adding an integer to a pointer and various other C
features that seem to cause a fair bit of confusion to some (such as array
names standing for a constant pointer to the first element of the array).

Now the READ THIS part:

	1) The only operation in C that adds an integer to the integer
	   with the same bit pattern as a given pointer and produces a
	   pointer value with the same bit pattern as the result is

	   (pointer_type) ((int)pointer + integer_value)

	2) "++a" and "a += 1" are completely equivalent expressions.
	   "a++" is almost equivalent; it yields the value before 1 is
	   added to "a", rather than the value after "1" is added.

and, while I'm at it,

	3) A pointer to "foo" and an array of "foo"s are not equivalent.
	   The declarations

		foo *p;

	   and

		foo p[666];

	   are not in any way, shape, or form equivalent - with one exception
	   (an unfortunate one, considering that every six months or so
	   somebody in net.lang.c asks why their program doesn't work when
	   they use the first declaration in one module, the second
	   declaration in another, and link the two modules together).  The
	   second style of declaration is interpreted as being the first
	   style when an argument to a procedure is declared.
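
A minimal sketch of point 3, assuming a hypothetical "typedef int foo" and made-up function names. It shows that the two declarations describe objects of different sizes, and that only in a parameter list is the array form read as the pointer form.

```c
#include <assert.h>
#include <stddef.h>

typedef int foo;    /* hypothetical element type standing in for "foo" */

/* As objects, the two declarations describe different things. */
size_t size_of_array(void)   { return sizeof(foo[666]); }  /* 666 foos */
size_t size_of_pointer(void) { return sizeof(foo *); }     /* one pointer */

/* The one exception: in a parameter list "foo p[666]" is treated as
   "foo *p", so p can be incremented like any other pointer. */
foo second_element(foo p[666])
{
    assert(p != NULL);
    p++;            /* legal only because p is really a pointer here */
    return *p;
}
```

This is also why mixing "extern foo *p;" in one module with "foo p[666];" in another goes wrong: the first module reads the leading bytes of the array as if they were a stored pointer value.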

Please read, digest, and remember.  Thank you.

	RTFM,
	Guy Harris

gwyn@brl-tgr.ARPA (Doug Gwyn <gwyn>) (07/19/85)

> 	1) The only operation in C that adds an integer to the integer
> 	   with the same bit pattern as a given pointer and produces a
> 	   pointer value with the same bit pattern as the result is
> 
> 	   (pointer_type) ((int)pointer + integer_value)

Well, almost.

I'm sure that Guy knows the problem -- I wonder if any of the
people recently showing by their postings on the subject of C
style that they aren't familiar enough with C to pay attention
to can spot the problem.  Please don't bother to post your
solution; no prizes are being offered.

throopw@rtp47.UUCP (Wayne Throop) (07/20/85)

In an otherwise OK article, Guy Harris says:

>        1) The only operation in C that adds an integer to the integer
>           with the same bit pattern as a given pointer and produces a
>           pointer value with the same bit pattern as the result is
>
>           (pointer_type) ((int)pointer + integer_value)
>
>        Guy Harris

This turns out not to be the case.  The expression "(int)p" where p is a
pointer does not in any way, shape, or form guarantee to yield an
integer with the "same bits" as the pointer p.

The "only way" to treat the bits of a pointer as an integer, do
arithmetic on this integer, and yield a pointer with the resulting
integer expression's bits is this:

    union { some_type *p; int i; } u;
           (u.p=pointer, (u.i += integer_value), u.p);

Note that even this code is non-portable, since it assumes the types
"int" and "some_type *" are the same size in bits.  (There IS no
portable way to do it.)
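
For readers with a later compiler: C99 added an optional integer type, uintptr_t in <stdint.h>, that is wide enough to hold a converted "void *". It did not exist when this was posted, and it does not change the point above. The standard promises only that the round trip gives back a pointer equal to the original, not that the integer has the same bits as the pointer or that arithmetic on it means anything. A minimal sketch, with made-up function names:

```c
#include <assert.h>
#include <stdint.h>

/* Convert a pointer to an integer and back.  Only the round trip is
   promised; the integer's bits are implementation-defined. */
void *round_trip(void *p)
{
    uintptr_t bits = (uintptr_t)p;
    void *back = (void *)bits;
    assert(back == p);   /* the one property the standard promises */
    return back;
}

/* Nonzero if a plain int is at least as wide as a pointer here.
   The (int) cast and the union trick both silently assume this. */
int int_is_wide_enough(void)
{
    return sizeof(int) >= sizeof(void *);
}
```

On a machine with 16-bit ints and 32-bit pointers, int_is_wide_enough() returns 0, which is exactly the case where the union above stops working.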
-- 
Wayne Throop at Data General, RTP, NC
<the-known-world>!mcnc!rti-sel!rtp47!throopw

guy@sun.uucp (Guy Harris) (07/22/85)

> In an otherwise OK article, Guy Harris says:
> 
> >        1) The only operation in C that adds an integer to the integer
> >           with the same bit pattern as a given pointer and produces a
> >           pointer value with the same bit pattern as the result is
> >
> >           (pointer_type) ((int)pointer + integer_value)
> >
> >        Guy Harris
> 
> This turns out not to be the case.  The expression "(int)p" where p is a
> pointer does not in any way, shape, or form guarantee to yield an
> integer with the "same bits" as the pointer p.

Yes, I knew that (having worked for N years on a machine where "int" was 16
bits and pointers were 32 bits), but adding in code to make it work
correctly and explaining it to people who don't understand that adding an
integer to a pointer makes the pointer advance by that integer number of the
storage objects pointed to by the pointer, not that integer number of
machine storage units, would probably have totally confused them.

	Guy Harris

doug@escher.UUCP (Douglas J Freyburger) (07/24/85)

> However, one would hope that the C community has stopped thinking in terms
> of the underlying bit patterns by now, and actually has a reasonable
> abstract model of what C programs do.  Unfortunately, I'm afraid that's not
> the case.  It may be that post-C programming languages will have to avoid
> defining the result of adding an integer to a pointer and various other C
> features that seem to cause a fair bit of confusion to some (such as array
> names standing for a constant pointer to the first element of the array).

[ Minor flame ]

I use C largely because it DOES allow (encourage?) me to
think in terms of the underlying bit patterns involved.  I
pay very close attention to just what is happening to
pointers when they are changed; that is the price I pay for
fast running programs.

When I am not concerned with performance, I often use a
different language that lets me play with the algorithm
without paying close attention to the actual machine
language generated, like PASCAL.  (Actually, after four
years full-time in C, I now just use a looser form of C
with simpler idioms, but it did take me thousands of lines
of C to get to that point.)  For just type-it-in-and-go
type programs where the engineer cost dominates, the
languages that are optimized for fast prototyping are
"better" anyways.

When I AM concerned with performance, I have to put in the
effort to track every pointer on my own, and all the other
neat stuff C lets me do without PASCAL or ForTran style
run-time checking.

Every language in its own place.  I place C in the high
performance but high programmer price category.

dave@lsuc.UUCP (David Sherman) (07/24/85)

In article <2439@sun.uucp> guy@sun.uucp (Guy Harris) writes:
||	2) "++a" and "a += 1" are completely equivalent expressions.

Well, almost. On at least some systems, "a++" won't work if a is
float, while "a += 1" will add 1.0 to a.

Dave Sherman
The Law Society of Upper Canada
Toronto
-- 
{  ihnp4!utzoo  pesnta  utcs  hcr  decvax!utcsri  }  !lsuc!dave

guy@sun.uucp (Guy Harris) (07/26/85)

> I use C largely because it DOES allow (encourage?) me to
> think in terms of the underlying bit patterns involved.  I
> pay very close attention to just what is happening to
> pointers when they are changed; that is the price I pay for
> fast running programs.

But is this a case of "thinking of the underlying bit patterns", or is it a
case of thinking of the underlying operations on an abstract machine?  I.e.,
do you think of a pointer as a natural number indexing a large array which
is your process' address space (which may not work on a segmented machine!),
or do you think of it as something that points to an object?  You can still
think in the latter terms without forgetting the important part of how
pointers are implemented - i.e., I doubt the ability to cast pointers to
some integral type and back again is needed, in general, for efficiency.

> When I AM concerned with performance, I have to put in the
> effort to track every pointer on my own, and all the other
> neat stuff C lets me do without PASCAL or ForTran style
> run-time checking.

What do you mean by "track every pointer on (your) own"?  True, you can't
use pointers in FORTRAN without run-time checking, but then you can't use
pointers in FORTRAN *with* run-time checking, since it doesn't *have*
pointers.  FORTRAN doesn't do much "run-time checking" because it doesn't
have much checking to do - no pointers, so no null-pointer checking; no
subrange or enumerated types, so no checking assignments to such types.  I
believe there are many PASCAL compilers that will allow you to turn the
checking off - of course, if you do so, you should have good reason to be
sure the errors that the checking would detect won't occur, considering
that those errors can bomb your program regardless of whether it was written
in PASCAL or C.  (For instance, if you get a pointer from a routine which
could return a null pointer, *always* check it before using it unless you
*know* that the particular call to that routine which returns the pointer
won't return a null pointer.  And make *sure* you really "know" it.  Just
because you created a file "/tmp/foo" and made it readable doesn't mean
'fopen("/tmp/foo", "r")' is not going to return NULL - you could get an I/O
error, or somebody else could have unlinked the file while you weren't
looking, or...)
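
The fopen() advice can be sketched as follows. The function name first_char_of is made up for this example; the only library calls used are fopen, getc, fclose and perror from <stdio.h>.

```c
#include <assert.h>
#include <stdio.h>

/* Return the first character of the named file, or EOF if the file
   cannot be opened or is empty. */
int first_char_of(const char *path)
{
    FILE *fp;
    int c;

    assert(path != NULL);
    fp = fopen(path, "r");
    if (fp == NULL) {       /* never assume the open succeeded */
        perror(path);
        return EOF;
    }
    c = getc(fp);
    fclose(fp);
    return c;
}
```

The check costs one comparison per open, which is unlikely to matter next to the cost of the open itself.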


You don't have to think of pointers as integers indexing a large array which
is your address space (and think of null pointers as being the integer 0) in
order to write efficient C code.  (And if you do think that way, what you
may end up with is C code that will break when somebody else tries to run it
on their machine - which will force *them* to fix the problem.)

	Guy Harris

keesan@bbncc5.UUCP (Morris M. Keesan) (07/29/85)

In article <734@lsuc.UUCP> dave@lsuc.UUCP (David Sherman) writes:
>In article <2439@sun.uucp> guy@sun.uucp (Guy Harris) writes:
>||	2) "++a" and "a += 1" are completely equivalent expressions.
>
>Well, almost. On at least some systems, "a++" won't work if a is
>float, while "a += 1" will add 1.0 to a.

Those systems are buggy.  The C compiler for the BBN C/70 used to not support
++ and -- on floats, because the compiler maintainer said they were
"nonsensical".  When I inherited the compiler, I added code to the first pass
which would convert "++a" to "a+=1" for floating a.  From the C Reference
Manual, section 7.2 (p. 187, K&R):  "The expression ++x is equivalent to x+=1.
See the discussions of addition and assignment operators for information on
conversions."  The "usual arithmetic conversions" in this case cause "a += 1"
to be equivalent to "a += 1.0".
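
A minimal sketch of the equivalence as the C Reference Manual words it, using made-up function names. It assumes a compiler that follows the manual; as noted elsewhere in this thread, some compilers of the time rejected "++" on floating operands.

```c
#include <assert.h>
#include <stddef.h>

/* ++x on a floating operand: adds 1 and yields the new value. */
double pre_increment(double *a)
{
    assert(a != NULL);
    return ++*a;
}

/* x += 1: the int constant 1 is converted to double before the add,
   so the result is the same as for ++x. */
double add_assign_one(double *a)
{
    assert(a != NULL);
    return *a += 1;
}

/* x++: also adds 1, but yields the value from before the addition. */
double post_increment(double *a)
{
    assert(a != NULL);
    return (*a)++;
}
```

Starting from 2.5, the first two functions return 3.5 and the third returns 2.5; in all three cases the variable ends up holding 3.5.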
-- 
Morris M. Keesan
keesan@bbn-unix.ARPA
{decvax,ihnp4,etc.}!bbncca!keesan

kimcm@diku.UUCP (Kim Christian Madsen) (08/04/85)

In article <734@lsuc.UUCP> dave@lsuc.UUCP (David Sherman) writes:
>In article <2439@sun.uucp> guy@sun.uucp (Guy Harris) writes:
>||	2) "++a" and "a += 1" are completely equivalent expressions.
>
>Well, almost. On at least some systems, "a++" won't work if a is
>float, while "a += 1" will add 1.0 to a.
>
>Dave Sherman
>The Law Society of Upper Canada
>Toronto
>-- 
>{  ihnp4!utzoo  pesnta  utcs  hcr  decvax!utcsri  }  !lsuc!dave

Well, you should know that `++' and `--' *ONLY* work on variables of type
scalar, where there is a defined successor or predecessor (You can compare
these operators to the Pascal functions pred() and succ() ), and since a
variable of type float has no defined successor/predecessor `++' and `--'
won't work on it!

					Regards
					Kim Chr. Madsen
				a.k.a.	kimcm@diku.uucp

guy@sun.uucp (Guy Harris) (08/06/85)

> Well, you should know that `++' and `--' *ONLY* work on variables of type
> scalar, where there is a defined successor or predecessor

No, I didn't know that.  Then again, neither did Kernighan nor Ritchie.  See
the sixth paragraph under the heading "7.2 Unary operators" in the C
Reference Manual (p. 187 of *The C Programming Language* - I'm sick of
typing in quotes from the CRM for people who won't go read their own copy;
you've got the reference, go read it) wherein it states quite clearly that
"++" and "--" work on any type which can have 1 added to it (type "scalar"?
There's no "scalar" in the index to *The C Programming Language.*  Sure
you're not thinking of Pascal?).

> (You can compare these operators to the Pascal functions pred() and succ()),

No, you can't.  My J&W says that "succ" and "pred" apply to all scalar types
except "real".  My K&R says that "++" applies to any type which you can add
1 to, which *includes* "float" and "double" and does *not* include "enum".
(Besides, "++" has a side effect, while "succ" is a function, but we won't
discuss that.)  The fact that "a + 1" in C and "succ(a)" in Pascal happen
to yield the same answer if "a" is an integral type in C and an "integer" or
"char" in Pascal shouldn't tempt you into stretching the similarity past its
breaking point.

	Guy Harris

jmc@inset.UUCP (John Collins) (08/10/85)

In article <247@bbncc5.UUCP> keesan@bbncc5.UUCP (Morris M. Keesan) writes:
>....................  From the C Reference
>Manual, section 7.2 (p. 187, K&R):  "The expression ++x is equivalent to x+=1.
>See the discussions of addition and assignment operators for information on
>conversions."  The "usual arithmetic conversions" in this case cause "a += 1"
>to be equivalent to "a += 1.0".

Interestingly enough the PDP11 C compiler could not do ++ and -- on floating
point numbers.... (Up to System III anyway).
-- 
John M Collins		....mcvax!ist!inset!jmc
Phone:	+44 727 57267
Snail Mail: 47 Cedarwood Drive, St Albans, Herts, AL4 0DN, England.

peter@baylor.UUCP (Peter da Silva) (08/12/85)

> > Well, you should know that `++' and `--' *ONLY* work on variables of type
> > scalar, where there is a defined successor or predecessor
> 
> No, I didn't know that.  Then again, neither did Kernighan nor Ritchie.  See

Ritchie did... since the Ritchie 'C' compiler complains about (float)++,
claiming it's not a scalar (or words to that effect). At least it did as
of Nov '80... I haven't tried since then.
-- 
	Peter da Silva (the mad Australian)
		UUCP: ...!shell!neuro1!{hyd-ptd,baylor,datafac}!peter
		MCI: PDASILVA; CIS: 70216,1076

guy@sun.uucp (Guy Harris) (08/16/85)

> > > Well, you should know that `++' and `--' *ONLY* work on variables of
> > > type scalar, where there is a defined successor or predecessor
> > 
> > No, I didn't know that.  Then again, neither did Kernighan nor Ritchie.
> 
> Ritchie did... since the Ritchie 'C' compiler complains about (float)++,
> claiming it's not a scalar (or words to that effect).

Well, then, he must have changed his mind at some point, considering that
K&R says "++<something>" is the same as "<something> += 1".  As you'd have
known if you'd read the article you were replying to, the article was
referring to K&R not to a particular compiler written by R; the former may
not be a great standard but it's a lot closer to a standard than the
behavior of one particular compiler.

	Guy Harris

peter@baylor.UUCP (Peter da Silva) (08/19/85)

> Well, then, he must have changed his mind at some point, considering that
> K&R says "++<something>" is the same as "<something> += 1".  As you'd have
> known if you'd read the article you were replying to, the article was
> referring to K&R not to a particular compiler written by R; the former may
> not be a great standard but it's a lot closer to a standard than the
> behavior of one particular compiler.
> 
> 	Guy Harris

Since the behaviour of the Ritchie compiler was used to resolve certain
ambiguities in K&R, it has to be considered part of the definition.
It's unfortunate that they never tried running Appendix A through YACC,
but them's the breaks. I guess the fact that YACC probably didn't exist
at the time of writing might have something to do with it.

Dennis? Can you confirm or deny this?
-- 
	Peter (Made in Australia) da Silva
		UUCP: ...!shell!neuro1!{hyd-ptd,baylor,datafac}!peter
		MCI: PDASILVA; CIS: 70216,1076

mer@prism.UUCP (08/28/85)

Since 'p+n', where p is a pointer and n is an integer, is equivalent to
adding n*sizeof(whatever p points to), the safe and portable way of adding
an integer to a pointer treated as an integer is 
	 (char *)p + n
since (I think) characters are always a byte wide.  If that's not always the
case, I apologize; on the other hand, it's probably safer than converting
a pointer to an int or a long;  I had trouble porting something from a VAX
to a Pyramid because of a cast of this sort.
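
A sketch of what the char-pointer cast does and does not give you, using made-up function names. In C, sizeof(char) is 1 by definition, so "(char *)p + n" steps by n chars; that says nothing about how many bits a char has, nor about what happens to the pointer's bit pattern.

```c
#include <assert.h>
#include <limits.h>
#include <stddef.h>

/* Step a pointer forward by n chars (sizeof units), not n elements. */
void *step_chars(void *p, size_t n)
{
    assert(p != NULL);
    return (char *)p + n;
}

/* Bits in one char on this implementation: at least 8,
   but not necessarily exactly 8. */
int bits_per_char(void)
{
    return CHAR_BIT;
}
```

Stepping an "int *" by sizeof(int) chars this way lands on the next int of the same array, which is the same place "p + 1" would have reached without any cast.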

eppstein@columbia.UUCP (David Eppstein) (08/31/85)

In article <5400011@prism.UUCP> mer@prism.UUCP writes:
> 
> Since 'p+n', where p is a pointer and n is an integer, is equivalent to
> adding n*sizeof(whatever p points to), the safe and portable way of adding
> an integer to a pointer treated as an integer is 
> 	 (char *)p + n
> since (I think) characters are always a byte wide.  If that's not always the
> case, I apologize; on the other hand, it's probably safer than converting
> a pointer to an int or a long;  I had trouble porting something from a VAX
> to a Pyramid because of a cast of this sort.

Not the case.  For instance on the DEC-20, adding integers to (int *)
looks like adding integers to integers but adding them to (char *)
does something totally different (divides by number of bytes per word,
adds word quotient as an integer, adds number of bytes remainder
multiplied by byte size shifted over by 30 bits; bytes per word and
bits per byte have to be calculated from the pointer itself because
we have both 7-bit bytes packed 5 to a word and 9-bit bytes packed
4 to a word (others too but those are the ones used in C); this is not
as bad as it sounds because there is a machine instruction to do it all).

On all implementations that I know about, coercing the pointer to (long) and
adding your integer will work.  However this is still somewhat nonportable
and you are better off doing whatever you are trying to do some other,
clean, portable way not involving adding integers as integers to pointers.

meissner@rtp47.UUCP (Michael Meissner) (09/01/85)

In article <5400011@prism.UUCP> mer@prism.UUCP writes:
>
>Since 'p+n', where p is a pointer and n is an integer, is equivalent to
>adding n*sizeof(whatever p points to), the safe and portable way of adding
>an integer to a pointer treated as an integer is 
>	 (char *)p + n
>since (I think) characters are always a byte wide.  If that's not always the
>case, I apologize; on the other hand, it's probably safer than converting
>a pointer to an int or a long;  I had trouble porting something from a VAX
>to a Pyramid because of a cast of this sort.
>

This will break down on machines which are bit or word oriented, rather than
byte oriented.  The real question is why do you need to do this in the first
place?

(for word oriented machines, char pointers are typically bigger than word
pointers, and truncation occurs when converting from un-aligned character
pointers back to word pointers;  the addition typically would create an
un-aligned character pointer)

(for bit oriented machines, adding n to a pointer would actually add 8*n)

	Michael Meissner
	Data General