henry@utzoo.UUCP (Henry Spencer) (09/20/87)
Consider the following problem. We are trying to write a table-driven scanner for a garden-variety programming language. We decide on a fairly quick approach: we use the current character to index into a table, which contains pointers to the appropriate next table for each character. When we hit a NULL pointer, we're done. We will have some cyclic structures for tokens like identifiers that can be of unlimited length. Question: what is the type of one of those pointers? It can't be done in strictly portable C using "efficient" pointers (where character pointers and "void *" are "inefficient" because they require extra-bulky and extra-slow representations on some machines). The cyclic structures are what kill it; somewhere along the line, one has to lie about the types to avoid infinite regress. And the only portable way to do such lying is to use "void *". The irksome aspect of it is that on every machine I can think of, including several that do have an efficient/inefficient distinction in pointer representations, the problem could be solved in assembler without using any inefficient pointers! "Pointer to pointer to X" is generally an efficient pointer with a representation independent of X, and this algorithm never uses any other flavor of pointer. I'm not advocating a change to C to fix this. I was just a bit bemused by the discovery of the situation, and thought others might be interested. -- "There's a lot more to do in space | Henry Spencer @ U of Toronto Zoology than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry
henry@utzoo.UUCP (Henry Spencer) (09/21/87)
Mark Brader has pointed out that this could be solved by wrapping the pointer in a struct, although in general this risks incurring overheads like padding. Seems a rather artificial way of handling it, though. -- "There's a lot more to do in space | Henry Spencer @ U of Toronto Zoology than sending people to Mars." --Bova | {allegra,ihnp4,decvax,utai}!utzoo!henry
jss@hector.UUCP (Jerry Schwarz) (09/22/87)
In article <8626@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes: >We decide on a fairly >quick approach: we use the current character to index into a table, >which contains pointers to the appropriate next table for each character. >When we hit a NULL pointer, we're done. We will have some cyclic structures >for tokens like identifiers that can be of unlimited length. > >Question: what is the type of one of those pointers? Declare struct ptr { struct ptr *p ; } ; And let your values be "struct ptr*". You won't need any casts (just selections) and on most machines with multiple formats for pointers, the efficient one will be used for pointers to structures. Jerry Schwarz Bell Labs, Murray Hill
gwyn@brl-smoke.UUCP (09/24/87)
In article <8626@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes: >I'm not advocating a change to C to fix this. I was just a bit bemused by >the discovery of the situation, and thought others might be interested. Whitesmiths had a similar problem declaring the type of the functions whose addresses were to be passed to their their onexit() function; at process termination, the onexit function stack "unwound", with each function responsible for returning a pointer to the next such function. That's obviously a recursive type...
throopw@xyzzy.UUCP (Wayne A. Throop) (09/28/87)
> henry@utzoo.UUCP (Henry Spencer) > Consider the following problem. [A type 'T' at least partially composed of > a pointer to type 'T'] [...] > It can't be done in strictly portable C using "efficient" pointers True. The problem is that C has no way to introduce a type by name before or during the definition of that type, except with struct, union, or enum. Thus, there is no way to specify a recursive type, except inside structures, unions, or enumerations. I personally think that "typedef" ought to do this, so that one could say things like typedef foo_t *foo_t; to declare foo_t as a pointer type, pointing to foo_t. Of course, this complicates the parse of such things beyond belief (or it seems to... I haven't really tried hard to come up with a simple scheme because it looks so hopeless). Perhaps a new keyword, such as "forward", which could introduce a recursive typedef, so that something like: typedef forward foo_t; typedef foo_t *foo_t; would work just fine, because foo_t can now be recognized as both a type and the name being defined. Though maybe even this doesn't solve it. Sigh. -- Inigo hated it there. Everybody was so dangerous, big, mean and muscular, and so what if he was the greatest fencer in the world, who'd know it to look at him? He looked like a skinny Spanish guy it might be fun to rob. You couldn't walk around with a sign saying "Be careful, this is the greatest fencer since the death of the Wizard of Corsica. Do not burgle." --- From The Princess Bride by William Goldman -- Wayne Throop <the-known-world>!mcnc!rti!xyzzy!throopw