[comp.lang.c] interesting type problem

henry@utzoo.UUCP (Henry Spencer) (09/20/87)

Consider the following problem.  We are trying to write a table-driven
scanner for a garden-variety programming language.  We decide on a fairly
quick approach:  we use the current character to index into a table,
which contains pointers to the appropriate next table for each character.
When we hit a NULL pointer, we're done.  We will have some cyclic structures
for tokens like identifiers that can be of unlimited length.

Question:  what is the type of one of those pointers?

It can't be done in strictly portable C using "efficient" pointers (where
character pointers and "void *" are "inefficient" because they require
extra-bulky and extra-slow representations on some machines).  The cyclic
structures are what kill it; somewhere along the line, one has to lie about
the types to avoid infinite regress.  And the only portable way to do such
lying is to use "void *".

The irksome aspect of it is that on every machine I can think of, including
several that do have an efficient/inefficient distinction in pointer
representations, the problem could be solved in assembler without using any
inefficient pointers!  "Pointer to pointer to X" is generally an efficient
pointer with a representation independent of X, and this algorithm never
uses any other flavor of pointer.

I'm not advocating a change to C to fix this.  I was just a bit bemused by
the discovery of the situation, and thought others might be interested.
-- 
"There's a lot more to do in space   |  Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry

henry@utzoo.UUCP (Henry Spencer) (09/21/87)

Mark Brader has pointed out that this could be solved by wrapping the
pointer in a struct, although in general this risks incurring overheads
like padding.  Seems a rather artificial way of handling it, though.
-- 
"There's a lot more to do in space   |  Henry Spencer @ U of Toronto Zoology
than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry

jss@hector.UUCP (Jerry Schwarz) (09/22/87)

In article <8626@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>We decide on a fairly
>quick approach:  we use the current character to index into a table,
>which contains pointers to the appropriate next table for each character.
>When we hit a NULL pointer, we're done.  We will have some cyclic structures
>for tokens like identifiers that can be of unlimited length.
>
>Question:  what is the type of one of those pointers?

Declare 

    struct ptr { 
	    struct ptr *p ;
	    } ;

And let your values be "struct ptr*". 

You won't need any casts (just selections) and on most machines with
multiple formats for pointers, the efficient one will be used for
pointers to structures.

Jerry Schwarz 
Bell Labs, Murray Hill

gwyn@brl-smoke.UUCP (09/24/87)

In article <8626@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>I'm not advocating a change to C to fix this.  I was just a bit bemused by
>the discovery of the situation, and thought others might be interested.

Whitesmiths had a similar problem declaring the type of the functions
whose addresses were to be passed to their their onexit() function; at
process termination, the onexit function stack "unwound", with each
function responsible for returning a pointer to the next such function.
That's obviously a recursive type...

throopw@xyzzy.UUCP (Wayne A. Throop) (09/28/87)

> henry@utzoo.UUCP (Henry Spencer)
> Consider the following problem. [A type 'T' at least partially composed of
> a pointer to type 'T']  [...]
> It can't be done in strictly portable C using "efficient" pointers

True.  The problem is that C has no way to introduce a type by name
before or during the definition of that type, except with struct, union,
or enum.  Thus, there is no way to specify a recursive type, except inside
structures, unions, or enumerations.  I personally think that "typedef"
ought to do this, so that one could say things like

        typedef foo_t *foo_t;

to declare foo_t as a pointer type, pointing to foo_t.  Of course, this
complicates the parse of such things beyond belief (or it seems to... I
haven't really tried hard to come up with a simple scheme because it
looks so hopeless).  Perhaps a new keyword, such as "forward", which
could introduce a recursive typedef, so that something like:

        typedef forward foo_t;
        typedef foo_t *foo_t;

would work just fine, because foo_t can now be recognized as both a type
and the name being defined.  Though maybe even this doesn't solve it.

Sigh.

--
Inigo hated it there.  Everybody was so dangerous, big, mean and
muscular, and so what if he was the greatest fencer in the world, who'd
know it to look at him?  He looked like a skinny Spanish guy it might be
fun to rob.  You couldn't walk around with a sign saying "Be careful,
this is the greatest fencer since the death of the Wizard of Corsica. Do
not burgle."
                        --- From The Princess Bride by William Goldman
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw