herndon@umn-cs.UUCP (02/18/86)
I wrote a mild flame in another newsgroup about this, but somebody out there might know another way.

The problem is this: I'd like to construct a jump table by putting lots of labels into an array, and then issuing a statement like

    goto jumptab[i];

Unfortunately, labels are no longer simple integers (and haven't been for most of a decade now). They can no longer be type coerced to integer or pointer-to-integer, nor can other types be coerced to type label. Is there any way around this? Admittedly, for the static case with a dense list, a switch statement usually constructs a jump table, but I'd like to construct one on the fly.

If I just want to execute machine code out of an array ("goto array_name;" used to be legal C) I can write a simple assembly language routine of one argument to do it. Is there a legal way to do this in pure C any more? (Yes, one can type coerce the address of the array to pointer-to-function and call it, but is it possible to jump to the array?)

Both of these problems are admittedly things one would not often do. They are, however, things one might do in an interpreter. The former operation may be considered to be "machine independent", as there is one reasonable, consistent interpretation. The latter is much more open to question, as the contents of the array are not machine independent. However, if one is to write portable interpreters, being able to jump to an array without assembly language help would be a plus. (Individual pseudo-machine operations can then be relegated to data tables, leaving the C code machine independent.)

Robert Herndon
...!{ihnp4|stolaf}!umn-cs!herndon
hfavr@mtuxo.UUCP (a.reed) (02/19/86)
Robert Herndon writes:
> I wrote a mild flame in another newsgroup about this, but
> somebody out there might know another way.
> The problem is this: I'd like to construct a jump table by
> putting lots of labels into an array, and then issuing a
> statement like
> goto jumptab[i];

Such a table is automatically built by the compiler when you do

    switch (i) {
    case 1: goto label_a;
    case 2: goto label_b;
    } /* etc. */

This can do everything you need, and gives you the benefit of readable labels.

Adam Reed (ihnp4!npois!adam)
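A minimal sketch of the switch-based dispatch described in the reply above. The function name dispatch_switch, the return values, and the default case are assumptions added for illustration; only the labels label_a and label_b come from the post.

```c
/* Dispatch through a switch whose cases only jump to labels.
 * A compiler is free to turn a dense switch like this into a jump table. */
int dispatch_switch(int i)
{
    switch (i) {
    case 1: goto label_a;
    case 2: goto label_b;
    default: return -1;      /* hypothetical: no label for this index */
    }

label_a:
    return 100;              /* hypothetical work for label_a */
label_b:
    return 200;              /* hypothetical work for label_b */
}
```

The mapping from index to label is fixed at compile time here, which is exactly the limitation the original poster objects to.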
bowles@cbosgd.UUCP (Jeff Bowles) (02/19/86)
(Gee, wasn't the article I'm replying to supposed to be here, too?)

Well, anyway, the article I'm following up is one where a programmer asks why:

    int i;
    ...
    goto labels[i]; /* "labels" contains pointers to, well,
                     * you get it... */

went away from C. [I guess it used to be valid. Hmmmm.]

If you feel the need to do something like this, you probably want to just use a structure:

    struct bletch {
        int l_token;            /* If you see this, */
        void (*l_whither)();    /* then call this routine */
    };

Then you can just do a table lookup, calling the appropriate routines for each item.

Or, :-), you can use the "computed goto" that C provides:

    switch (i) {
    case 0: ... break;
    case 1: ...
    }

The "table lookup" that C provides will usually do what you need, with minimal effort. If you're translating an assembler program and need to "jump to the location specified by the N'th element of this array", I suggest finding a BCPL compiler.

Jeff Bowles
Lisle, IL
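The table lookup suggested above might look something like the following sketch. The struct and its member names are taken from the post; the handlers do_add and do_sub, the table contents, the last_action variable, and dispatch_token are hypothetical names added so the example is complete.

```c
#include <stddef.h>

struct bletch {
    int l_token;               /* If you see this, */
    void (*l_whither)(void);   /* then call this routine */
};

static int last_action;        /* hypothetical: records which routine ran */

static void do_add(void) { last_action = 1; }
static void do_sub(void) { last_action = 2; }

/* Hypothetical table mapping tokens to routines. */
static const struct bletch table[] = {
    { '+', do_add },
    { '-', do_sub },
};

/* Linear search for the token; calls the matching routine.
 * Returns the action code, or 0 if the token is not in the table. */
int dispatch_token(int token)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; i++) {
        if (table[i].l_token == token) {
            table[i].l_whither();
            return last_action;
        }
    }
    return 0;
}
```

Because the table is ordinary data, it could also be built or modified at run time, unlike the cases of a switch.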
ark@alice.UucP (Andrew Koenig) (02/19/86)
In "pure C," the only thing you can do with a label is use it as the subject of a goto. The closest you can come is to call an element of an array rather than jumping to it; the array must then be an array of function pointers.
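For comparison, some compilers offer the original request as an extension. The sketch below uses the GNU C "labels as values" extension (unary && on a label, and goto *expr), which GCC and Clang accept; it is not standard C, and the function name dispatch_label and the return values are assumptions for illustration.

```c
/* NOT standard C: relies on the GNU C "labels as values" extension. */
int dispatch_label(int i)
{
    /* An array of label addresses; the entries can be reassigned at run
     * time, as long as they refer to labels in this same function. */
    void *jumptab[3];

    jumptab[0] = &&op_zero;
    jumptab[1] = &&op_one;
    jumptab[2] = &&op_two;

    if (i < 0 || i > 2)
        return -1;           /* hypothetical: out-of-range index */

    goto *jumptab[i];        /* the "goto jumptab[i];" of the question */

op_zero:
    return 100;
op_one:
    return 200;
op_two:
    return 300;
}
```

In standard C the statement about labels above still holds; this is a compiler-specific way around it.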
bzs@bu-cs.UUCP (Barry Shein) (02/20/86)
Re: what to do now that goto jumptab[x] is gone?

Well, the obvious solution is a switch statement but that does not fulfill all your requirements and you probably rejected that.

The way I do such things is to use an array of pointers to functions. In your example of jumping to on-the-fly generated code I suspect that is really what you are saying: On-the-fly generated functions, having an environment around the generated code is certainly not a bad thing and we assume you would like to come back sometime. Generating the prologue and epilogue to make the generated code a function should be more or less trivial as, besides stack offsets for autos, it's a cliche.

So, the answer would be to declare a table something like:

    int (*funtable[MAXFNS])() ; /* did I get that right? array of pointers
                                   to functions returning int */

and just malloc the storage for the generated code. Obviously the return value needn't be int. I can't think of any reason off hand why this isn't powerful enough for what you propose. It should be quite portable (code generator aside) and is legal C.

-Barry Shein, Boston University
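A small sketch of the function-pointer table proposed above, using ordinary compiled functions rather than generated code. The declaration of funtable follows the post; MAXFNS = 4, the handlers, and the helper names set_handler, call_handler and demo_funtable are assumptions for illustration.

```c
#include <assert.h>
#include <stddef.h>

#define MAXFNS 4                         /* hypothetical table size */

static int (*funtable[MAXFNS])(void);    /* array of pointers to functions returning int */

static int ret_one(void) { return 1; }
static int ret_two(void) { return 2; }

/* Bind (or rebind) slot i at run time. */
void set_handler(int i, int (*fn)(void))
{
    assert(i >= 0 && i < MAXFNS);
    funtable[i] = fn;
}

/* Call through slot i; returns -1 for an empty or out-of-range slot. */
int call_handler(int i)
{
    if (i < 0 || i >= MAXFNS || funtable[i] == NULL)
        return -1;
    return (*funtable[i])();
}

/* Shows that the same index can lead to different code over time. */
int demo_funtable(void)
{
    int first;

    set_handler(0, ret_one);
    first = call_handler(0);             /* 1 */
    set_handler(0, ret_two);             /* rebind slot 0 on the fly */
    return first * 10 + call_handler(0); /* 1 * 10 + 2 */
}
```

This covers the "construct one on the fly" part of the question; pointing an entry at malloc'ed machine code is a separate and much less portable step.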
kwh@bentley.UUCP (KW Heuer) (02/23/86)
In article <184@bu-cs.UUCP> bu-cs!bzs (Barry Shein) writes:
>declare a table something like: int (*funtable[MAXFNS])() ;
>and just malloc the storage for the generated code. ... It should be
>quite portable (code generator aside) and is legal C.

Well, some compilers will dislike the attempt to cast a (char *) into a (int (*)()) ; in fact I think some will call it an outright error (not just a warning). But in any case it is _not_ portable to the 3b2, because all programs are pure -- you can't goto/call data space, nor can you read from the instruction stream. Some sort of chastity belt in the hardware, I think.
herndon@umn-cs.UUCP (02/23/86)
[Here, bugs, bugs, bugs! Here, bu

Hmmm. Apparently my original posting wasn't too clear. Many responses were sent telling me that I should just use a switch statement, or an array of pointers to functions. Somebody else mailed me a note telling me I should not put code into an array, but should use an assembler. My original note explicitly mentioned the possibilities of using both switch statements and pointers to functions, and I've had to make do with these options. Sigh.

Let me restate my problem. Suppose I have an interpreter, which accepts input from a user. Something like a BASIC (Ugh!) or Lisp interpreter/compiler. I wish to convert a statement that the user enters into machine code, and be able to execute that machine code, RIGHT THEN(!). (I certainly don't wish to have to call an assembler and a loader.) This is perhaps an iffy operation, since some machines will not allow the execution of data.

Now, it is certainly not too difficult to generate my machine code and stick it into an array somewhere. If I could simply jump to it, I'd be very happy. This I can do by creating an assembly language procedure of one argument which jumps to the address given as the argument.

1) Can I do this without the assembly language help?

As a second alternative, I can put my machine code into an array, place the address of that array into a union as an integer, and CALL (not jump to!) the array by pulling the address out of the union as a pointer to a function. This is somewhat ugly, since I don't know what size a code address is, and C will NOT let me type cast an address into a pointer to a function. Therefore this CODE construct is not portable. (As I noted in my original article, I can generate code from machine-dependent DATA tables by using ifdefs and includes, but I'd like machine independent CODE.) Further, many machines (for instance, the VAX) insist on particular prologues and epilogues for procedures which I have no interest in and do not wish to generate code for.

2) Is there a machine independent way to coerce non-pointer-to-function values to pointer-to-function values?

A third alternative, definitely the least desirable from my particular perspective, is to do the whole thing a "proper way". I should generate nice intermediate code, stuff it into an array, and then write a routine to interpret the intermediate code. Presumably then I can use the switch statement everyone recommends to generate the jump-tables to get to the code to interpret my intermediate code. Slow. And I can't add new intermediate-opcodes without recompiling.

The fourth alternative (another "proper way") is to generate arrays of pointers to functions for code, where the pointers point to real, live C functions. Then, by stepping through the arrays and calling each function pointed to, I can indirectly interpret my code. (Something like a Forth interpreter.) Again, I can't add new intermediate-opcodes without recompiling. Sigh.

Oh, well, it was a hack anyhow. It was something that used to be possible and had occasional application, and then was rudely snatched away by "improvements" to C. I think it predates the existence of K&R's book.

Robert Herndon
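The fourth alternative in the post above (arrays of pointers to real C functions, stepped through in order) can be sketched roughly as follows. Everything concrete here is an assumption for illustration: the tiny operand stack, the operations op_two, op_three, op_add and op_mul, and the use of a NULL pointer to end a program.

```c
#include <assert.h>
#include <stddef.h>

#define STACK_MAX 16

typedef void (*op_fn)(void);

static int stack[STACK_MAX];   /* hypothetical operand stack */
static int sp;

static void push(int v) { assert(sp < STACK_MAX); stack[sp++] = v; }
static int  pop(void)   { assert(sp > 0); return stack[--sp]; }

/* The "real, live C functions" that act as opcodes. */
static void op_two(void)   { push(2); }
static void op_three(void) { push(3); }
static void op_add(void)   { int b = pop(); int a = pop(); push(a + b); }
static void op_mul(void)   { int b = pop(); int a = pop(); push(a * b); }

/* Step through a NULL-terminated array of function pointers,
 * calling each one in turn, and return the value left on top. */
int run_threaded(const op_fn *code)
{
    sp = 0;
    for (; *code != NULL; code++)
        (*code)();
    return pop();
}

/* A program built at run time as data: (2 + 3) * 2. */
int demo_threaded(void)
{
    op_fn program[6];
    int n = 0;

    program[n++] = op_two;
    program[n++] = op_three;
    program[n++] = op_add;
    program[n++] = op_two;
    program[n++] = op_mul;
    program[n]   = NULL;
    return run_threaded(program);
}
```

The program itself is assembled at run time, but, as the post notes, the set of operations is still fixed when the interpreter is compiled.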
guy@sun.uucp (Guy Harris) (02/23/86)
> >declare a table something like: int (*funtable[MAXFNS])() ;
> >and just malloc the storage for the generated code. ... It should be
> >quite portable (code generator aside) and is legal C.
>
> Well, some compilers will dislike the attempt to cast a (char *) into
> a (int (*)()) ; in fact I think some will call it an outright error
> (not just a warning). But in any case it is _not_ portable to the 3b2,
> because all programs are pure -- you can't goto/call data space, nor
> can you read from the instruction stream.

3B2, hell, that goes all the way back to separate I&D space on the PDP-11. It is quite unportable, and "lint" will justifiably complain about it, warning of a "questionable conversion of function pointer" (even if you're converting another kind of pointer *into* a function pointer, but that's life during UNIX).

Then again, if they're generating code on the fly, it's not going to be very portable anyway, so if you're doing this sort of thing worrying about whether the pointer conversion is portable is kind of silly. (If you really MUST do this sort of thing, you can probably get the OS to help by providing a call to convert a data segment into a code segment.)

--
Guy Harris
{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
guy@sun.arpa (yes, really)
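On later POSIX-style systems, the kind of OS help mentioned above is commonly available as mmap and mprotect. The sketch below is an illustration under several loudly stated assumptions, not something from the thread: it assumes a POSIX system with anonymous mappings, an x86 or AArch64 processor, and a system policy that permits executable anonymous memory. The name run_generated and the hand-assembled bytes (a function that just returns 42) are hypothetical, and the object-pointer to function-pointer cast is guaranteed by POSIX rather than by ISO C.

```c
#ifndef _DEFAULT_SOURCE
#define _DEFAULT_SOURCE            /* assumption: glibc; exposes MAP_ANONYMOUS */
#endif
#include <stddef.h>
#include <string.h>
#include <sys/mman.h>

#ifndef MAP_ANONYMOUS
#define MAP_ANONYMOUS MAP_ANON
#endif

#if defined(__x86_64__) || defined(__i386__)
#define HAVE_CODE 1
static const unsigned char code[] = {
    0xb8, 0x2a, 0x00, 0x00, 0x00,  /* mov eax, 42 */
    0xc3                           /* ret */
};
#elif defined(__aarch64__)
#define HAVE_CODE 1
static const unsigned char code[] = {
    0x40, 0x05, 0x80, 0x52,        /* mov w0, #42 */
    0xc0, 0x03, 0x5f, 0xd6         /* ret */
};
#else
#define HAVE_CODE 0                /* no machine code known for this CPU */
static const unsigned char code[] = { 0 };
#endif

/* Returns the generated function's result, or -1 if this machine or
 * operating system will not let us build and run it. */
int run_generated(void)
{
    void *mem;
    int (*fn)(void);
    int result;

    if (!HAVE_CODE)
        return -1;

    /* 1. Get writable data space and fill it with code. */
    mem = mmap(NULL, sizeof code, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (mem == MAP_FAILED)
        return -1;
    memcpy(mem, code, sizeof code);

    /* 2. Ask the OS to turn the data into code. */
    if (mprotect(mem, sizeof code, PROT_READ | PROT_EXEC) != 0) {
        munmap(mem, sizeof code);
        return -1;
    }
#if defined(__GNUC__)
    /* Some CPUs need the instruction cache synchronized first. */
    __builtin___clear_cache((char *)mem, (char *)mem + sizeof code);
#endif

    /* 3. Call (not jump to) the array through a function pointer. */
    fn = (int (*)(void))mem;
    result = fn();
    munmap(mem, sizeof code);
    return result;
}
```

As the post says, none of this is portable: the bytes are machine-specific, and a system that keeps code and data strictly separate can simply refuse the mprotect call.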
jph@houxf.UUCP (J.HARKINS) (02/26/86)
> In articles <184@bu-cs.UUCP> bu-cs!bzs (Barry Shein) writes: > >declare a table something like: int (*funtable[MAXFNS])() ; > >and just malloc the storage for the generated code. ... It should be > >quite portable (code generator aside) and is legal C. > > Well, some compilers will dislike the attempt to cast a (char *) into > a (int (*)()) ; in fact I think some will call it an outright error Huh??? This line of code DOES NOT cast a char * into an int. It is declaring that funtable is an array of MAXFNS elements, each of which is a pointer to a function that returns a int value. > (not just a warning). But in any case it is _not_ portable to the 3b2, BOLDERDASH!!! I have programs that use pointers to functions, some that run on 3B2/5's. The construct is totally(no flames, please) portable. As a matter of fact I have used this type of construct to allow emulation of UN*X signal processing on a non UN*X operating system that only allowed one routine to be specified for all signals. > because all programs are pure -- you can't goto/call data space, nor > can you read from the instruction stream. Some sort of chastity belt > in the hardware, I think. Whazat?? MOST(not all) programs are pure in this environment, yes. But that has nothing to do with being able to use a pointer to a function. The code that is executed is actually in the shared text region; it is only the pointer to the function that is in the data area. ------- Disclaimer: I hereby disclaim all my debts. ------ Jack Harkins @ AT&T Bell Labs Princeton Information (201) 949-3618 (201) 561-3370 houxf!jph
kwh@bentley.UUCP (KW Heuer) (02/26/86)
[ bu-cs!bzs (Barry Shein) ]
>> >declare a table something like: int (*funtable[MAXFNS])() ;
>> >and just malloc the storage for the generated code. ... It should be
>> >quite portable (code generator aside) and is legal C.
[ bentley!kwh (Karl Heuer) ]
>> Well, some compilers will dislike the attempt to cast a (char *) into
>> a (int (*)()) ; in fact I think some will call it an outright error
[ houxf!jph (Jack Harkins) ]
>Huh??? This line of code DOES NOT cast a char * into an int....
>I have programs that use pointers to functions, some that run on 3B2/5's.

Sorry, you seem to have lost the context. The original poster wanted to malloc space for the CODE ITSELF, not the pointer table; i.e. do something like

    funtable[0] = (int (*)())malloc(codesize);

and this line _does_ cast a (char *) (which is what malloc() returns) into a function pointer. (Actually a more likely sequence is

    char *s = malloc(codesize);
    s[0] = CLRW; s[1] = R0;
    funtable[0] = (int (*)())s;
    n = (*funtable[0])();

or something like that.)

[ bentley!kwh (Karl Heuer) ]
>> because all programs are pure -- you can't goto/call data space, nor
>> can you read from the instruction stream. Some sort of chastity belt
>> in the hardware, I think.
[ houxf!jph (Jack Harkins) ]
>Whazat??
>
>MOST(not all) programs are pure in this environment, yes. But that has
>nothing to do with being able to use a pointer to a function. The code
>that is executed is actually in the shared text region; it is only the
>pointer to the function that is in the data area.

When I said "all programs are pure", I meant that on the 3b2 it is _not_ _possible_ to write an impure program (as far as I can determine). The code fragment above can be made to work on a VAX (even without "ld -N"), but on the 3b2 it dies with a bus error. I hope I've cleared this up.
To any would-be flamers: the alignment is appropriate; the bus error occurs on the CALL instruction; don't flame me about what I'm "clearly" doing wrong unless you can demonstrate a way to do it right. On a 3b2. I've already checked things pretty carefully, including the source code in the kernel.
chris@umcp-cs.UUCP (Chris Torek) (02/27/86)
In article <1067@houxf.UUCP> jph@houxf.UUCP (Jack Harkins) . . . responds to the following from Barry Shein:

>>declare a table something like: int (*funtable[MAXFNS])() ;
>>and just malloc the storage for the generated code. ... It should be
>>quite portable (code generator aside) and is legal C.
>
>Well, some compilers will dislike the attempt to cast a (char *) into
>a (int (*)()) ; in fact I think some will call it an outright error

Jack Harkins says:

>This line of code DOES NOT cast a char * into an int. It is declaring
>that funtable is an array of MAXFNS elements, each of which is a pointer
>to a function that returns a int value.

You are both right. It is obvious that the line Jack refers to is:

    int (*funtable[MAXFNS])();

while the code Barry refers to is:

    char *malloc();
    funtable[n] = (int (*)()) malloc(codesize);

(which does not appear in the quoted text, but is implied nonetheless.) So when Barry says:

> it is _not_ portable to the 3b2,

he is correct: you cannot invoke the allocated function without turning it into `code' first, for the hardware will not execute `data'; and when Jack says:

>I have programs that use pointers to functions, some that run on 3B2/5's.
>The construct is totally(no flames, please) portable.

he is also correct: pointers to functions are portable. It is this specific usage---allocate data, fill with code, call data area as function---that is not.

Hoping this will forestall further confusion,
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs
ARPA: chris@mimsy.umd.edu
aaz@pucc-j (Marc Mengel) (02/27/86)
As far as I know, the following code is legal, and it works on
all the machines I have ever used. It is not necessarily portable
everywhere, since some machines may not like executing in the
data segment, but then again, if you are putting machine code in
an array and executing it, it isn't portable code in *any* case.

    char foo[BIGNUM];

    main()
    {
        int result;

        /* code to put machine code into foo[] */

        result = (* (int (*)()) foo)();
    }

--
Marc Mengel
Uucp: { decvax, icalqa, ihnp4, inuxc, sequent, uiucdcs }!pur-ee!pucc-j!aaz
{ decwrl, hplabs, icase, psuvax1, siemens, ucbvax }!purdue!pucc-j!aaz
USnail: 910 N. 9th street
Lafayette IN 47904
larry@cca.UUCP (Laurence Schmitt) (02/28/86)
> 2) Is there a machine independent way to coerce non-pointer-to-function
> values to pointer-to-function values?

Considering that the program in question would be generating machine code on the fly, an extremely machine *dependent* operation, it seems curious to complain that the jump operation itself cannot be made machine independent! :-)

--
Larry Schmitt
Computer Corporation of America
4 Cambridge Center
Cambridge, MA 02142
larry@cca
decvax!cca!larry
(617)-492-8860
nather@utastro.UUCP (Ed Nather) (03/01/86)
In article <1067@houxf.UUCP>, jph@houxf.UUCP (J.HARKINS) writes: > > BOLDERDASH!!! > It is a thrilling thing to see a new and needed word enter the language. I assume it means the dash has been overstruck ... -- Ed Nather Astronomy Dept, U of Texas @ Austin {allegra,ihnp4}!{noao,ut-sally}!utastro!nather nather@astro.UTEXAS.EDU
henry@utzoo.UUCP (Henry Spencer) (03/02/86)
> 2) Is there a machine independent way to coerce non-pointer-to-function
> values to pointer-to-function values?

No, because on some machines they are very different animals: a pointer to a function is not necessarily a pointer to the bytes comprising its code. Some machines want a rather more elaborate structure, in which a pointer to a function is a module identifier and a function-within-module identifier, and there is extra information somewhere that allows the machine to interpret this. Which leads to...

> ...Further, many machines (for instance, the VAX)
> insist on particular prologues and epilogues for procedures
> which I have no interest in and do not wish to generate
> code for.

If you want to treat something as a function, you *must* observe the conventions that your machine (and your compiler) want to see. There is no portable way around this. In fact, there's no entirely portable way to do what you want at all, because the basic nature of the conventions (never mind the details of them!) is machine-dependent. For example, machines that use a module+function form of function pointer will need some sort of module dictionary somewhere, which you're going to have to build.

Some machines won't let you do what you want at all, in fact, because on them, code is code and data is data and never the twain shall meet. (Examples: a pdp11 running split-space; a segmented machine that makes a distinction between code and data segments.)

--
Henry Spencer @ U of Toronto Zoology
{allegra,ihnp4,linus,decvax}!utzoo!henry
jph@houxf.UUCP (J.HARKINS) (03/02/86)
>> >> BOLDERDASH!!! >> >It is a thrilling thing to see a new and needed word enter the language. >I assume it means the dash has been overstruck ... > >-- >Ed Nather >Astronomy Dept, U of Texas @ Austin >{allegra,ihnp4}!{noao,ut-sally}!utastro!nather >nather@astro.UTEXAS.EDU Actually, it refers to spur of the moment sporting events held regularly in areas subject to rockslides:-) ------- Disclaimer: I hereby disclaim all my debts. ------ Jack Harkins @ AT&T Bell Labs Princeton Information (201) 949-3618 (201) 561-3370 houxf!jph
jph@houxf.UUCP (J.HARKINS) (03/02/86)
Sorry for the delay, haven't read news in a week ...

>and just malloc the storage for the generated code. ... It should be

This is the line I missed in the original posting, thus the motivation for my reply to Karl Heuer's posting was off the mark. Sorry about that Karl, and everyone else who has charred the Receive Data line on my modem for it.

>Huh??? This line of code DOES NOT cast a char * into an int....

I should have said (int (*)()) instead of int, and was referring to the declaration, not the attempt to assign code to a variable. As I said above, I conveniently missed the original point, trying to dynamically generate code, which is indeed illegal for separate I&D.

So, in the words of Gilda Radner: "Never mind". Consider this posting my reply to future flames on my original reply.

------ Disclaimer: I hereby disclaim all my debts. ------
Jack Harkins @ AT&T Bell Labs
Princeton Information
(201) 949-3618
(201) 561-3370
houxf!jph
gwyn@BRL.ARPA (VLD/VMB) (03/03/86)
I can think of two legal ways to implement the

    goto jumptab[i];

idea in C, assuming you need the flexibility of reassigning the destinations for each i:

(1) Use an array of jmpbufs, do something to get them initialized using setjmp (that's the hard part), and longjmp to the correct jmpbuf array member.

(2) Use an array of function pointers, initialize them as desired, and call via the appropriate function array member (watch out that you don't keep recursing deeper and deeper; it's probably best to have a common return from the functions).

I suspect that if we knew your intended application, better solutions would be possible. Just what do you think you need a jump table for?
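Option (1) above can be sketched within a single function, which keeps every longjmp target inside a still-active call as the C library requires. The function name jump_via_setjmp, the two destinations, and their return values are assumptions for illustration.

```c
#include <setjmp.h>

/* Emulates "goto jumptab[i];" with an array of jmp_bufs.
 * Each setjmp records a destination; longjmp later resumes there. */
int jump_via_setjmp(int i)
{
    jmp_buf jumptab[2];

    if (i < 0 || i > 1)
        return -1;                     /* hypothetical: no such entry */

    /* Initialization pass: setjmp returns 0 when called directly,
     * so both bodies are skipped the first time through. */
    if (setjmp(jumptab[0]) != 0)
        return 10;                     /* destination 0 */
    if (setjmp(jumptab[1]) != 0)
        return 20;                     /* destination 1 */

    /* The computed jump: control reappears at the chosen setjmp,
     * which now returns 1. */
    longjmp(jumptab[i], 1);
}
```

Any local variable modified between the setjmp and the longjmp would need to be declared volatile to have a reliable value afterwards, which is part of why the post calls the initialization "the hard part".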