throopw@xyzzy.UUCP (Wayne A. Throop) (12/08/87)
> rustcat@russell.STANFORD.EDU (Vallury Prabhakar)
> Is there an equivalent in C for the "funcall" utility (in Lisp)?

Not in general, but in the example below, there is hope.

Something to keep in mind if you already know LISP and are trying to
learn C: use pointers.  The things that LISP does naturally and easily
often correspond to the use of pointers.  The problem posed is a case in
point:

> I would like to have a portion of code, that compares a specific string
> with a predefined string, and if true, then call the corresponding function.
> {
>     static char *LIST = {"First", "Second", "Third"};
>     int First(), Second(), Third(),
>     int i;
>
>     for (i = 0; i < 3; i ++) {
>         if (strcmp (<input-string>, LIST[i]) == 0) {
>             (...statement that can call the corresponding routine...)
>         }
>     }
> }

Easy enough.  You can either have a "parallel" array of pointers to your
functions, or (more to my taste) you can have a list of name-function
"dotted pairs" (well, not really dotted pairs, but a struct with two
"cells", much like a cons, but I digress), like so:

    int invoke_named_function( name )
        char *name;
    {
        int strcmp();
        int First(), Second(), Third();
        static struct { char *name; int (*function)(); } table[] = {
            { "First", First },
            { "Second", Second },
            { "Third", Third }
        };
        int i;
        for( i=0; i<(sizeof(table)/sizeof(table[0])); ++i ){
            if( strcmp( name, table[i].name ) == 0 ){
                return( (*(table[i].function))() );
            }
        }
        return( -1 ); /* or some other error indication */
    }

You can simply add as many entries to the table as you like.  Note that
the (*(table[i].function))() used to invoke the function can in some
compilers be replaced by a simple table[i].function().

--
Everyone I know is having a more productive crisis than I am.
                                        --- Cathy
--
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw
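For readers on a compiler with function prototypes, the same name-to-function
table can be sketched roughly as follows.  This is only an illustration of the
technique described in the post above, not the poster's code: the bodies of
First(), Second() and Third() and their return values 1, 2 and 3 are made up
here so that the fragment compiles on its own.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical handlers; the return values are assumptions for the demo. */
static int First(void)  { return 1; }
static int Second(void) { return 2; }
static int Third(void)  { return 3; }

/* One name/function pair, the "two-cell" struct from the post above. */
struct entry {
    const char *name;
    int (*function)(void);
};

static const struct entry table[] = {
    { "First",  First  },
    { "Second", Second },
    { "Third",  Third  },
};

/* Look up name and call the matching function; -1 means "no such name". */
int invoke_named_function(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof table / sizeof table[0]; ++i) {
        if (strcmp(name, table[i].name) == 0)
            return table[i].function();
    }
    return -1;
}
```

A caller would write something like invoke_named_function("Second").  Note
that -1 doubles as an error code, so a real table would want handlers that
never return -1 themselves, or a separate "found" flag.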
mcdonald@uxe.cso.uiuc.edu (12/14/87)
>> Is there an equivalent in C for the "funcall" utility (in Lisp)?
>> I wasn't able to find any information about variable functions that can
>> be called in any of the C manuals. I'd appreciate any pointers/advice/
>> alternatives on how to go about doing this.
>>
>> -- Vallury

>C will allow you to refer to previously-compiled functions indirectly
>(through function pointers), but the feature you're talking about can
>only be implemented in an interpretive language like Lisp. Lisp will
>interpret commands as it encounters them, and hence a Lisp object can
>be used both as code and data. For example, if you define a list in

(section omitted)

>This will not work in C or in any compiled language such as Pascal,
>FORTRAN, COBOL, etc. etc., as these compile their code and keep their
>data separate from the code.

>Now, you could write, in C, a program that implements an interpreter
>that will take in character strings and execute them per some set of
>rules. In fact, there are several implementations of Lisp that are
>written in C. The unix shells are also examples of programs that are
>implemented in C and that are interpretive.

>So the point here is that in C, you can't do exactly what you described, and
>this is due to the fact that C is compiled, not interpreted.

Apparently C does not allow this as a general, required, feature.
However, I learned C with the explicit assumption that, since it allowed
both data and function pointers, one could take an array, write a program
that generated compiled code in that array, cast the address of (the first
byte of) the array to a function pointer, and call that function.
I have since learned that on some systems (i.e. 80286 Xenix in anything
except single-segment model) it won't work, and that ANSI C does not
require it to work (A FATAL FLAW).

BUT, on all the machines that I regularly use, including the VAX/VMS,
PDP-11/Decus-C, IBM-PC/DOS, and IBM-PC/OS2, it does work (although on
OS2 it requires a minor help from a system call.)

Just because a language is normally compiled, as C is, does not mean that
you can't write an interpreter in it (for the language of your choice),
or, as I have done, an incremental compiler. It's just that on your
particular system, the system implementer has stupidly prevented you
from executing the code you generate. Any system in which it is impossible
to write something as obvious as an incremental compiler is terminally
brain-damaged. (Actually, in most cases it can be done with the aid of
assembly language.)

Would someone explain to me why a system would prevent you from doing this?
You couldn't have a TurboPascal or even a Forth! (I wouldn't object if you
had to explicitly declare that a block of memory would be both data and
code.)

Doug McDonald

P.S. Please don't think I've done anything as ambitious as write a complete
language compiler. My little affair is a simple incremental compiler for
arithmetic expressions only. I first tried an interpreter, but the compiler
was faster by about a factor of 75.
kers@otter.HP.COM (Christopher Dollin) (12/15/87)
Duh ...

> This is an example of how, in Lisp, the "mylist" object can be used
> either as code or as data, and again, this is because Lisp is
> interpretive.
> Even in the case of Lisp "compilers", the compiled code has an interpreter
> built in.
>
> So the point here is that in C, you can't do exactly what you described, and
> this is due to the fact that C is compiled, not interpreted.

False. There exists at least one Lisp compiler, namely the Poplog Common
Lisp compiler, which has NO interpreter built in, not any, not even a bit.
[There are probably lots more, but that's the one I use the most].

The reason you can't do it (without work) in C is because C isn't designed
that way. That's all. Compilation has nowt to do with it, although
INCREMENTALITY (being able to compile before all the text is available)
may have.

Regards,
Kers | "Why Lisp if you can talk Poperly?"
pardo@uw-june.UUCP (David Keppel) (12/16/87)
[ why don't all machines let us execute code stored in an array ]

Some machines make optimizations about where the executable code and
data live, and trying to execute code from the data region breaks some
of those optimizations. [OK, not many, but a *few*]

    ;-D on ("And, as a consolation prize, you get Rice-A-Roni") Pardo
    "You can do anything you want, if you don't mind slow"
rwa@auvax.UUCP (Ross Alexander) (12/17/87)
In article <3826@uw-june.UUCP>, pardo@uw-june.UUCP (David Keppel) writes:
> [ why don't all machines let us execute code stored in an array ]

Ouch! Are people _still_ trying to do this sort of stuff? In my
undergrad days, it was a well-known kluge that Honeywell Fortran (on the
6050) wasn't too fussy about how you referenced external objects. This
let one declare an integer array, stuff 'magic constants' == 'machine
instructions' into it, and then call it. It did have its uses, but the
whole idea makes me cringe in retrospect. It was poor practise then, and
inexcusable now.

--
Ross Alexander @ Athabasca University, alberta!auvax!rwa
dag@chinet.UUCP (Daniel A. Glasser) (12/17/87)
In article <3826@uw-june.UUCP> pardo@uw-june.UUCP (David Keppel) writes:
>[ why don't all machines let us execute code stored in an array ]
>
>Some machines make optimizations about where the executable code and
>data live, and trying to execute code from the data region breaks some
>of those optimizations. [OK, not many, but a *few*]
>
[trailing stuff removed.]

To be more specific, many machines have separate instruction and data
address spaces, and unless they are mapped together, you cannot directly
execute the contents of arrays, since all instruction fetches will be done
from the instruction space and the array contents are stored in the data
address space.

The M68000 is capable of this separation, some PDP-11's support I/D
separation, the Intel 8086, Zilog Z8001/3, Intel 8051, and many other
"current" machines have this, but on systems where it is supported you
usually have the choice not to use it.

This particular discussion should move over to comp.arch, since it is
not a language issue.

--
Daniel A. Glasser   ...!ihnp4!chinet!dag
                    ...!ihnp4!mwc!dag
                    ...!ihnp4!mwc!gorgon!dag
One of those things that goes "BUMP!!! (ouch!)" in the night.
karl@haddock.ISC.COM (Karl Heuer) (12/17/87)
In article <47000025@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
>[Re generating code at runtime: (*(int (*)())&array[0])()]
>I have since learned that on some systems ... it won't work, and that ANSI C
>does not require it to work (A FATAL FLAW).  BUT, on all the machines that I
>regularly use ... it does work

So, it's a FATAL FLAW that ANSI C has to work on machines other than the
ones you regularly use?  Gimme a break.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
mcdonald@uxe.cso.uiuc.edu (12/18/87)
me:
>[Re generating code at runtime: (*(int (*)())&array[0])()]
>I have since learned that on some systems ... it won't work, and that ANSI C
>does not require it to work (A FATAL FLAW).  BUT, on all the machines that I
>regularly use ... it does work

Karl W. Z. Heuer:
So, it's a FATAL FLAW that ANSI C has to work on machines other than the
ones you regularly use?  Gimme a break.

Me again:
Yes, I really mean it. I can't see why it cannot be specified to be generally
useful. I realize that, somewhere, there might be a machine that CAN'T,
really, truly CAN'T, do what I want, but I have never seen one described to
me. There are machines where the architecture or the operating system
makes it hard, or not the default, but not impossible. The language
specifiers should put it in the language. Then, if a particular machine
simply can't do it, their C compiler would be sold with an asterisk *

    * due to the stupid blunder we made when we decided on this machine's
      architecture, it is impossible for us to allow you to write an
      incremental compiler. Therefore, we are unable to produce a completely
      implemented compiler. So sorry.

By fatal flaw, I mean that it is fatal to my programs. A substantial
fraction of the programs I have written for the IBM-PC are in fact
dependent on being incremental compilers. It is also fatal to the claim
that C is a general purpose language. It wouldn't matter in
special-purpose languages like Fortran or Cobol.

Again, I would be perfectly happy if you had to declare specifically that
you wanted code and data to be co-accessible. Could you give me an example
of a machine where this would be impossible? (perhaps some hard-wired Lisp
processor ?)

Doug McDonald
gwyn@brl-smoke.ARPA (Doug Gwyn ) (12/20/87)
In article <47000027@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
>I realize that, somewhere, there might be a machine that CAN'T,

Separation of Instruction and Data space is enforced by a large number
of modern computer systems.  It is generally considered to be a Good
Thing, since it prevents program errors from being quite as disastrous
as they might be were valid code to be overwritten with random data
while running.  I won't give you a list of such systems, as there
doesn't seem to be any point in an enumeration.

It is possible to implement a fairly fast interpretive language within
this architectural constraint.  I have implemented a few of these, and
you can find an example in Kernighan & Pike's "The UNIX Programming
Environment" (the chapter on the "hoc" language).
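As a rough illustration of the point in the post above, here is a minimal
sketch of a stack-machine interpreter whose "compiled code" is an ordinary
data array of function pointers and constants.  It never executes data as
machine instructions, so it works even with separate I and D spaces.  It is
only loosely in the spirit of hoc and is not taken from the book; every name
in it (cell, VM, op_push, op_add, op_mul, run, demo) is invented for this
example, and it does no stack bounds checking.

```c
#include <stddef.h>

#define STACK_MAX 64

typedef struct vm VM;
typedef void (*op_fn)(VM *);

/* One cell of "code": either an operation or an inline constant. */
typedef union cell {
    op_fn  op;
    double num;
} cell;

struct vm {
    const cell *pc;              /* next cell to execute */
    double stack[STACK_MAX];     /* operand stack (unchecked in this sketch) */
    int sp;                      /* number of values currently on the stack */
};

/* push: the constant is stored in the cell right after the opcode. */
static void op_push(VM *vm) { vm->stack[vm->sp++] = (vm->pc++)->num; }

/* add, mul: pop the top value and combine it into the one below it. */
static void op_add(VM *vm) { vm->sp--; vm->stack[vm->sp - 1] += vm->stack[vm->sp]; }
static void op_mul(VM *vm) { vm->sp--; vm->stack[vm->sp - 1] *= vm->stack[vm->sp]; }

/* Run until a cell whose op is NULL (the stop marker); return top of stack. */
double run(const cell *program)
{
    VM vm;

    vm.pc = program;
    vm.sp = 0;
    while (vm.pc->op != NULL) {
        op_fn op = (vm.pc++)->op;
        op(&vm);
    }
    return vm.sp > 0 ? vm.stack[vm.sp - 1] : 0.0;
}

/* "Compiled" form of (2 + 3) * 4, laid out by hand for the demo. */
double demo(void)
{
    cell prog[16];
    int n = 0;

    prog[n++].op = op_push;  prog[n++].num = 2.0;
    prog[n++].op = op_push;  prog[n++].num = 3.0;
    prog[n++].op = op_add;
    prog[n++].op = op_push;  prog[n++].num = 4.0;
    prog[n++].op = op_mul;
    prog[n++].op = NULL;     /* stop marker */
    return run(prog);
}
```

A parser would normally fill in the cell array instead of demo() doing it by
hand.  The cost per operation is one indirect call, which is slower than
native code but needs nothing from the architecture beyond function pointers.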
chip@ateng.UUCP (Chip Salzenberg) (12/22/87)
In article <47000027@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
>>[Re generating code at runtime: (*(int (*)())&array[0])()]
>Yes, I really mean it. I can't see why it cannot be specified to be generally
>useful. I realize that, somewhere, there might be a machine that CAN'T,
>really, truly CAN'T, do what I want, but I have never seen one described to
>me.

The iAPX 286 can't do this in protected mode, which is the only useful mode
for multitasking OS's.  (Executable segments are not writable.)

Look, if you're on a machine where executing data is allowed, you can just
cast the data address to a function pointer and call the result:

    int (*func)();
    char array[100];
    ...
    func = (int (*)()) array;
    (*func)();

And if the C compiler won't even do the cast, use a union.
As they say in Rome: Non perspiratum ("no sweat").

--
Chip Salzenberg                 "chip@ateng.UUCP" or "{codas,uunet}!ateng!chip"
A T Engineering                 My employer's opinions are not mine, but these are.
   "Gentlemen, your work today has been outstanding, and I intend to
    recommend you all for promotion -- in whatever fleet we end up serving."
                                                        - JTK
michael@stb.UUCP (Michael) (12/22/87)
(I came in late to this, so pardon any misunderstandings)
In article <47000025@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
(a question about referring to data as code)
(A reply saying "Ok in lisp, (where code and data are interchangeable) but
not in C (where they are separate)")
(A reply saying "You can compile code into an array and then execute
from that array")
Unfortunately, this is not the same. In lisp, you can create a data
structure that looks like SOURCE code, and execute it. In C, you have
to write a machine dependent compiler subroutine to compile the code,
and then execute it.
--
: Michael Gersten ihnp4!hermix!ucla-an!remsit!stb!michael
: sdcrdcf!trwrb!scgvaxd!stb!michael
: "Copy Protection? Just say 'Off site backup'. "
karl@haddock.ISC.COM (Karl Heuer) (12/22/87)
In article <47000027@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
>[Re generating code at runtime: (*(int (*)())&array[0])()]
>Yes, I really mean it. I can't see why it cannot be specified to be generally
>useful. ...  The language specifiers should put it in the language. Then, if
>a particular machine simply can't do it, their C compiler would be sold with
>[a disclaimer that it isn't full ANSI].

Certainly it's a useful option when the architecture and the operating
system can support it.  It has already been mentioned that this isn't
always the case, so I won't harp on that.  But even if ANSI were only
concerned with such "reasonable" architectures, it is clearly beyond their
jurisdiction to try to specify anything about the result of converting a
data pointer to a code pointer or vice versa.

However, I really don't think you have anything to worry about.  Most
likely, the C compilers on "reasonable" architectures will continue to
support such intersegment casts as a Common Extension.

There is some analogy with the casting of pointer to integer.  Have you
seen what the dpANS says (and doesn't say) about that?

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
aglew@ccvaxa.UUCP (12/24/87)
..> Executing data.

There are many architectures where executing data cannot be done easily.
But, since the loader (getxfile) has to read in data and execute it at
some point, a similar facility should be provided to the user (without
having to play tricks with temporary files).

In this posting, I discuss why casting an array to a function pointer is
NOT the way, I discuss the main architectural impediments, and I suggest
an interface for converting data to code that might be portable to a lot
of architectures (UNIX would have to be modified).

Casts are not the way
---------------------

Casts, however, are *NOT* the way.  Using a cast to function makes it much
too tempting to do little bit twiddles, like changing an ADD to a
MULTIPLY, while you are executing the code.  Almost *NO* modern
architectures permit this sort of thing to be done safely, without some
sort of possibly expensive synchronization of the instruction prefetch
buffer and memory.  This would create hell for an optimizing compiler.

Casts are also inappropriate because there are many architectures where
I and D are separate.  You can't make D into I.  You probably can,
however, copy between the two.

Finally, casts are inappropriate because they do not indicate HOW MUCH
data is going to be made into code.  Knowing how much is important,
because, as mentioned above, systems with separate I and D, where movement
is permitted between the two, have to do some sort of synchronization -
and the synchronization may be made more efficient if the amount of data
is known (page flush instead of entire cache flush).

Architectural Impediments to Data->Code
---------------------------------------

As discussed above, duty cycle - many architectures cannot execute writes
into the instruction stream immediately.  Some form of synchronization
must be done so that data written can be made into code.

Instructions and data may be truly in different spaces.  However it may
be possible to copy between them.

Entry point registry: Advanced architectures may register entry points
for security reasons.

Examples of Applications That Can Use Data->Code Conversion
-----------------------------------------------------------

Incremental compilers: although these are inherently machine dependent,
you can isolate much of the dependence in per-machine files.  It is
obviously desirable to be able to compile without going through
headstands to read the compiled code from a file.  (A standard routine
"Compile converting string to format obviously is useful).

Numerical Work: many large numerical packages actually used to compile
and load parts of their algorithms for efficiency.  Interpretation isn't
even in the ballpark, and even running compiled code with ifs is too
expensive...

Overlay Systems: there are still some systems with small address spaces.

Almost-Portable Interface for Data->Code Conversion
---------------------------------------------------

Completely separate I/D spaces

    Data Type

        Since some systems have truly disjoint I/D spaces, it is
        necessary to have a data type that is "uninitialized code".
        Suggested syntax:

            int f()[SIZE]

        where SIZE is in the same units as sizeof().  This is not to
        imply that code is measured in bytes; it is just to facilitate
        the description of sizes.

        Providing a prototype at declaration time may be appropriate for
        architectures that do entry point control.

    Dynamic Allocation

            funcptr = codealloc(size);

        This loses in that C doesn't have a "mode" type; but, it'll
        handle most architectures.

    Movement Between Data and Code Spaces

            codecpy( (char *)frombuf, (int ()*)tofunc, SIZE)

        (i)   Is legal only to correctly sized function buffers.
              Otherwise undefined.
        (ii)  Gives you a locus for doing all the sorts of
              synchronization that your architecture requires.
        (iii) Identifies tofunc as an entry point to the machine.

        And, obviously, you would want a vice versa function.

Simple interface

    The above lets you explicitly manage the code address space at a
    high level.  A simpler interface might be:

            funcptr = mkexecutable((char*)buf,size)

    where you basically say that it is not safe to modify buf while
    funcptr may be executing.  In this case, it would be possible for
    funcptr to be simply a cast, but mkexecutable() might very well
    manage the I address space, allocate, and copy, returning a pointer
    to the newly allocated space.

I think that an interface like this would be portable to many systems.

Andy "Krazy" Glew. Gould CSD-Urbana.
1101 E. University, Urbana, IL 61801
aglew@mycroft.gould.com    ihnp4!uiucdcs!ccvaxa!aglew    aglew@gswd-vms.arpa

My opinions are my own, and are not the opinions of my employer, or any
other organisation. I indicate my company only so that the reader may
account for any possible bias I may have towards our products.
jimp@cognos.uucp (Jim Patterson) (12/28/87)
In article <47000025@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
>I learned C with the explicit assumption that, since it allowed both
>data and function pointers, one could take an array, write a program that
>generated compiled code in that array, cast the address of (the first
>byte of) the array to a function pointer, and call that function.
> I have since learned that on some systems (i.e. 80286 Xenix in anything
>except single-segment model) it won't work, and that ANSI C does not
>require it to work (A FATAL FLAW).

This is hardly a FATAL FLAW, since it's an unusual application that
would depend on such an ability.  It's also a stance that is consistent
with the ANSI C objective of standardizing a version of C that is widely
implementable.  Executing code in data space is extremely difficult on
some architectures, for example the HP/3000 which has completely
segregated code and data spaces.

--
Jim Patterson                          Cognos Incorporated
UUCP:decvax!utzoo!dciem!nrcaer!cognos!jimp   P.O. BOX 9707
PHONE:(613)738-1440                    3755 Riverside Drive
                                       Ottawa, Ont  K1G 3Z4
ljz@fxgrp.UUCP (Lloyd Zusman, Master Byte Software) (01/06/88)
In article <47000025@uxe.cso.uiuc.edu> mcdonald@uxe.cso.uiuc.edu writes:
> ...
>I learned C with the explicit assumption that, since it allowed both
>data and function pointers, one could take an array, write a program that
>generated compiled code in that array, cast the address of (the first
>byte of) the array to a function pointer, and call that function.
> I have since learned that on some systems (i.e. 80286 Xenix in anything
>except single-segment model) it won't work, and that ANSI C does not
>require it to work (A FATAL FLAW).
> ...

I agree with some of the others here that this is anything but "A FATAL
FLAW".  It's hardly a flaw that a language that was designed as a
compiled language would not always work properly when some
machine-dependent side effects of the language that sometimes allow it
to be used in an interpretive manner don't work on all architectures and
hence are not required as features of the language.

Back in the days of FORTRAN, I wrote some code that would go through
memory and find the FORMAT statement strings, and that would alter them
at run-time.  This gave me execution-time-modifiable FORMAT statements
which, although useful, are not a feature of the language.  This code I
wrote only worked for a particular FORTRAN implementation on a particular
operating system on a particular machine.

Putting machine code into a block of memory at run-time and executing it
is a slick feature, but it's foolish to expect such a thing to be
portable ... or to even be possible at all on some architectures, as some
people here have already pointed out.  Making this a feature of a
language one of whose prime features is portability would be silly, in my
opinion.

Sure, if your operating system and machine architecture allow this to be
done, and if you don't care whether your code is portable, then by all
means do it if you must.  But don't expect someone to design C to make it
easy on you.

-------------------------------------------------------------------------
 Lloyd Zusman
 Master Byte Software
 Los Gatos, California          Internet:   fxgrp!ljz@ames.arpa
 "We take things well in hand." UUCP:       ...!ames!fxgrp!ljz