[comp.lang.c] A not so nice macro

maart@cs.vu.nl (Maarten Litmaath) (06/23/89)

mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) writes:
\...
\(But, presumably, Maarten meant
\    "one is a Pascal dweeb => one tends to complain about C subscripts"
\rather than
\    "one tends to complain about C subscripts => one is a Pascal dweeb").

Right, but if you read between the lines... :-)
(Just some fuel for the flames, OK?)

\One application for non-0-based-subscripting is, in fact,
\> in the MINIX kernel's `proc' table user processes have positive
\> indices, while kernel tasks have negative. 

It's MINIX' very `proc_addr()' macro which `inspired' me to write the article.

\...
\Not an array "reference", whatever that means, but an expression
                           ^^^^^^^^^^^^^^^^^^^
			   Huh? I didn't invent the term, you know!

\...
\It might be disconcerting, as you note, to see
\        zork(5) = 20;
\in a C program.

VERY disconcerting...

\...  If you ever think that Chris Torek has
\made a mistake, it is prudent to think again.  The odds say that you
\made the mistake.

I know, I know. Hey! I just wanted to see if you all were paying attention!
>;->

\>	bar	_foo[HIGH - LOW + 1];                   /* #1 */
\>	#define foo         (_foo - LOW)                /* #2 */
\[with the example]
\>	foo[-4] == (_foo - -5)[-4] == *((_foo + 5) - 4) == /* #3 */
\>           *(_foo + 1) == _foo[1]
\...
\Objection 2 (minor): with this #define, constructs like
\        a = func foo;
\would be syntactically legal, and
\        foo = 10;
\would give a confusing error message.  I might prefer
\        #define foo     &_foo[-LOW]
\which is (more or less) equivalent.

Almost right: you fell into `your own' trap! `foo[-4]' would evaluate to
`&_foo[-LOW][-4]'. You need an extra pair of parentheses.
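
A minimal sketch of the parenthesization point, under assumptions that are not from the thread: the element type is int, LOW is -5 and HIGH is 5, and the backing array is named real_foo (following Objection 1) rather than _foo. The helper foo_demo() is hypothetical and exists only to exercise the macro.

```c
#include <assert.h>

#define LOW   (-5)   /* assumed bounds, chosen for illustration only */
#define HIGH  5

static int real_foo[HIGH - LOW + 1];

/* Without the outer parentheses, foo[-4] would expand to
 *     &real_foo[-LOW][-4]
 * which parses as &((real_foo[-LOW])[-4]): a subscript applied to an int.
 * That does not compile, so the unparenthesized form is only shown here:
 *     #define foo  &real_foo[-LOW]
 */

/* With the extra pair, the base address is formed first, then subscripted. */
#define foo  (&real_foo[-(LOW)])

/* Hypothetical helper: store through the real array, read through foo. */
int foo_demo(void)
{
    real_foo[1] = 42;
    return foo[-4];          /* (&real_foo[5])[-4], i.e. real_foo[1] */
}
```

Calling foo_demo() from any test driver should give back the value stored in real_foo[1]; `foo = 10;` still fails to compile, since the expansion is not an lvalue.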

\Objection 3: the result of the computation is undefined in pANS C.
\>	foo[-4] == (_foo - -5)[-4] == *((_foo + 5) - 4)
\OK so far, in a syntactic sense at least.
\>           == *(_foo + 1)
\Wrong, at least in pANS C.  pANS C is not permitted to rearrange
\expressions so as to ignore parentheses.

Completely right! My brain needed a reboot when I wrote the article. :-(

\Under what conditions is Maarten's scheme guaranteed to work?  (From
\here on, assume there's no integer overflow.)

[Nice calculations leading to:]

\        HIGH >= LOW
\        LOW <= 0
\        HIGH >= -1
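
One way to read the three conditions is that the offset base `array - LOW` must point into the array or one past its end. A rough sketch of checking them at compile time follows; the bounds, the name real_foo and the helper bounds_demo() are assumptions made for illustration, not anything posted in the thread.

```c
#include <assert.h>

#define LOW   (-5)   /* assumed bounds, for illustration only */
#define HIGH  5

/* Reject bounds for which the offset base would fall outside the array. */
#if HIGH < LOW || LOW > 0 || HIGH < -1
#error "offset base needs HIGH >= LOW, LOW <= 0 and HIGH >= -1"
#endif

static int real_foo[HIGH - LOW + 1];
#define foo  (real_foo - (LOW))

/* Hypothetical helper: returns 1 if the base lies in [real_foo, one past end]. */
int bounds_demo(void)
{
    return (foo - real_foo) >= 0 && (foo - real_foo) <= HIGH - LOW + 1;
}
```

With LOW = 1 (a 1-based array) the #error fires, which matches the `LOW <= 0` condition: the base would lie one element before the start of the array.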
\...
\The first #define, attributed to Chris Torek,

Again, I doubt Chris was involved at all; perhaps he's saved the article I was
referring to.

\...
\>	#define		foo_addr(n)	(&(foo)[  ((n)-(LOW))  ])

\There's a general rule of thumb: in the right-hand side of a macro
\definition, parenthesize everything, even where you think parentheses
\are unnecessary.  Here's yet one more example of where the rule of
\thumb is useful.

Right.

\% Actually, there's the "as if" rule, which permits an implementation
\to do anything it wants as long as the result derived is correct for
\all cases in which pANS C defines an answer.  For example, if "a+b"
\would overflow, the value of "(a+b)+c" is not defined under pANS C,
\and an implementation may do whatever it likes.  It might produce, in
\this case, "a+(b+c)", which might happen to be the numerically correct
\answer.  Or it might send 110 volts at 100 amps AC through your chair.

That's a `Quality of Implementation' issue! :-)

\# Actually, I'm implicitly assuming another identity, that
\        &*x == x
\for any address x.  I don't know about this identity under pANS C,
\either.

Apart from the normal restrictions (`x' is a valid pointer, etc.), I don't know
the pANS wording. How about segmented architectures with `near' and `far'
pointers?
Anyway, thanks a lot for destroying my nice macro! >:-(
Really, thanks for getting things straight.
-- 
"I HATE arbitrary limits, especially when |Maarten Litmaath @ VU Amsterdam:
   they're small."  (Stephen Savitzky)    |maart@cs.vu.nl, mcvax!botter!maart

mcdaniel@uicsrd.csrd.uiuc.edu (Tim McDaniel) (06/24/89)

Karl Heuer, "The Walking Lint", replied via e-mail to my reply to
Maarten.  He gave me permission to post his mail here.  My comments
follow.

----------------------------------------------------------------------

From: karl@ima.UUCP (Karl Heuer)
Subject: Re: A nice macro
Newsgroups: comp.lang.c
In-Reply-To: <1330@garcon.cso.uiuc.edu>
References: <2784@solo8.cs.vu.nl>
Organization: Interactive Systems, Boston

In article <1330@garcon.cso.uiuc.edu> mcdaniel@uicsrd.csrd.uiuc.edu
wrote:
>It might be disconcerting, as you [Maarten] note, to see
>        zork(5) = 20;
>in a C program.

Actually, I think "lmacros" are a pretty neat hack; they often are
cleaner than their "standard" counterparts.  (One that I've had
occasion to use is "push(stk)=val", with symmetry to "val=pop(stk)".)
They do take some getting used to.
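
Karl gives no definitions for push and pop, so the following is only a guess at what such an "lmacro" pair could look like. The struct layout, the stack_init macro, the pointer-taking calling convention and the helper lmacro_demo() are all assumptions, and there is no overflow or underflow checking.

```c
#include <assert.h>

struct stack {
    int  data[16];
    int *sp;                 /* points at the next free slot */
};

#define stack_init(stk)  ((stk)->sp = (stk)->data)
#define push(stk)        (*(stk)->sp++)   /* lvalue: the slot being filled */
#define pop(stk)         (*--(stk)->sp)   /* value of the most recent slot */

/* Hypothetical helper showing the symmetry push(s)=val / val=pop(s). */
int lmacro_demo(void)
{
    struct stack s;
    int val;

    stack_init(&s);
    push(&s) = 20;           /* macro call on the left of an assignment */
    push(&s) = 22;
    val  = pop(&s);          /* 22 */
    val += pop(&s);          /* 20 */
    return val;
}
```

The macro call on the left-hand side works because the expansion is a dereference, which is an lvalue; a function call returning int could not be assigned to.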

>Objection 1: in pANS C, many identifiers starting with "_" are
>reserved for the implementation.  Unfortunately, I don't have the
>rules handy, so I can't tell the circumstances under which this
>declaration would be legal.  If "_foo" is extern, I'm pretty sure it
>is illegal.  "real_foo" would be a better choice.

Given the pANS rules for macro expansion, you could use "foo" for both.
This solves the namespace problem, but creates a debugging nightmare.

>>	#define		foo_addr(n)	&foo[(n) - LOW]
>
>might have a similar problem.  "a[b]" was defined by K&R to be
>identical to "*(a+b)", so

Well, if you want to do it as a lexical replacement, it's "*(a+(b))"
(otherwise a[1<<1] would yield *(a+1<<1) which is illegal).  In fact
this is how it's stated in the pANS.

>        &foo[(n) - LOW] <==> (foo + (n) - LOW)
>I don't know whether pANS C specifies the same identity,# or what it
>says about evaluating expressions without parentheses.  If compilers
>are allowed to rearrange, a compiler might instead compute
>         (foo - LOW + (n))

Without the parens, "a+b+c" must be equivalent to "(a+b)+c" because of
the associativity rules.  But in view of the correct identity, we
actually have "(foo + ((n)-LOW))", which is guaranteed to do the right
thing.  So there's no problem here.
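
A small check of the identity Karl describes, with a[b] read as *(a+(b)). The array name table, its bounds and the helper addr_demo() are made up for this sketch; table_addr follows the fully parenthesized shape of the foo_addr macro quoted earlier in the thread.

```c
#include <assert.h>

#define LOW   (-2)   /* assumed bounds, for illustration only */
#define HIGH  3

static int table[HIGH - LOW + 1];

/* The subscript is evaluated as a unit before it is added to the base. */
#define table_addr(n)  (&(table)[(n) - (LOW)])

/* Hypothetical helper: returns 1 if every address matches table + ((n) - LOW). */
int addr_demo(void)
{
    int n, ok = 1;

    for (n = LOW; n <= HIGH; n++)
        if (table_addr(n) != table + (n - LOW))
            ok = 0;

    /* a[1<<1] means *(a + (1<<1)), so this is element 2. */
    if (&table[1 << 1] != table + 2)
        ok = 0;

    return ok;
}
```

Because `(n) - LOW` is computed first and stays within 0 .. HIGH - LOW, no intermediate pointer outside the array is ever formed.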

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

----------------------------------------------------------------------

So it appears that I neglected my own rule of thumb!  (The one that
says "If you think Chris Torek was wrong, think again".)  It seems
that foo_addr is OK as it stands.

BTW, I personally don't mind lmacros.  However, that goes into a style
issue: how much can you alter the syntactic appearance of C code
before it becomes unreadable?  The original source for the Bourne
shell is a notorious example of Obfuscated C.  Whether or not lmacros
are unbearable is a matter of personal style.  Questions of
programming style are very good generators of flame wars.
--
"Let me control a planet's oxygen supply, and I don't care who makes
the laws." - GREAT CTHUHLU'S STARRY WISDOM BAND (via Roger Leroux)
 __
   \         Tim, the Bizarre and Oddly-Dressed Enchanter
    \               mcdaniel@uicsrd.csrd.uiuc.edu
    /\       mcdaniel%uicsrd@{uxc.cso.uiuc.edu,uiuc.csnet}
  _/  \_     {uunet,convex,pur-ee}!uiucuxc!uicsrd!mcdaniel