dmb@wam.UMD.EDU (David M. Baggett) (10/27/89)
Richard Covert writes:
>I have a friend who has written some shareware programs
>(ARCIT, ARCIT Shell, UNARCIT etc ) and he would like to
>make his code self-modifying. IN particular he would like
>to save the state of various buttons that he uses.

Just a minor point: This isn't technically what "self-modifying code"
means. The term usually refers to executable code that changes/creates
executable code at run-time. An example might be a bit of code that
writes a certain number of move instructions into a buffer, then jumps
into the buffer.

While self-modifying code can occasionally be useful (perhaps for loop
unrolling), it has two major problems:

1) It's incredibly difficult to understand, modify, debug, and maintain.
2) It won't work on machines with instruction caches.

Way back when, self-modifying code was thought to be a really powerful
technique. I seem to recall there was quite a debate about its merits,
out of which came the "Harvard architecture", where code and data are
kept in separate places (and never the twain shall meet). Anyway, I
think these days that most computer scientists would agree that it is
simply "tricky" and confusing, and should be avoided.

[Richard's stated options:]

>1) Write the data to an external file (*.CFG), but now
>you have to maintain a PRG, a RSC and a CFG file. Clumsy.

I don't see why this is clumsy. It makes the program much easier to
maintain. It would be even nicer if the config file were human-readable
and editable. C provides fprintf and fscanf for this sort of thing. If
you use these, every C programmer will know what you're doing right away.

>2) Write to the RSC file, but my friend wants to incorporate
>the RSC into the PRG (using RSCTOC or equivalent). So, there
>may not be a RSC to write to.
>
>3) Write to the PRG file. Best of all since you need it to
>run his program!!

No offense, but I really think this is a "sleazy hack". What's the point
of doing this? Then you have to worry about what happens when you
(heaven forbid) modify the source code and recompile, thereby changing
the size of the executable. And, as you pointed out, you have to come
up with some "clever" scheme to make sure you don't clobber something
useful (like executable code) in your binary. Whatever you come up
with will likely be non-portable as well.

If you don't want people to see the config information, why not put it
in a hidden file?

>[1st suggestion --- store a unique string in the binary and search for it]

Not only is this risky, but it's also going to be slow. Additionally,
putting things like

    static char config[1024] = "MAGIC STRING HERE";

in your source seems like one of the easiest ways to bewilder anybody
looking at the code (including you).

>[2nd suggestion --- reserve space at the end of the binary and lseek]

This isn't risky, but it's still slow. Here you're going to have to plow
through the whole binary to get to your config data. Is this an
improvement over simply opening a separate file?

>I am interested in this whole idea.

It's a nifty concept, but I think in practice you'll find that it isn't
worth the trouble it will cause you.

David Baggett
arpa: dmb@tis.com
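
A minimal sketch of the human-readable config file suggested in the post
above, using fprintf and fscanf. The struct, its three fields, and the
function names are made up for illustration; the thread only says the
program saves "the state of various buttons".

```c
#include <assert.h>
#include <stdio.h>

/* Hypothetical settings, standing in for the saved button states. */
struct config {
    int verbose;
    int confirm_overwrite;
    int compression_level;
};

/* Write one "name value" pair per line. Returns 0 on success, -1 on error. */
int save_config(const char *path, const struct config *cfg)
{
    FILE *fp;

    assert(path != NULL && cfg != NULL);
    fp = fopen(path, "w");
    if (fp == NULL)
        return -1;
    fprintf(fp, "verbose %d\n", cfg->verbose);
    fprintf(fp, "confirm_overwrite %d\n", cfg->confirm_overwrite);
    fprintf(fp, "compression_level %d\n", cfg->compression_level);
    return fclose(fp) == 0 ? 0 : -1;
}

/* Read the pairs back in the same order. *cfg is only updated if all
   three values parse. Returns 0 on success, -1 on error. */
int load_config(const char *path, struct config *cfg)
{
    struct config tmp;
    FILE *fp;
    int n;

    assert(path != NULL && cfg != NULL);
    fp = fopen(path, "r");
    if (fp == NULL)
        return -1;
    n = fscanf(fp, "verbose %d confirm_overwrite %d compression_level %d",
               &tmp.verbose, &tmp.confirm_overwrite, &tmp.compression_level);
    fclose(fp);
    if (n != 3)
        return -1;
    *cfg = tmp;
    return 0;
}
```

A program would call load_config at startup, fall back to built-in
defaults if it returns -1, and call save_config when the user changes a
setting. The fixed field order keeps the sketch short; a real file format
would probably want to tolerate reordered or missing lines.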
steve@thelake.UUCP (Steve Yelvington) (10/27/89)
In article <8910270333.AA14741@cscwam.UMD.EDU>,
     dmb@wam.UMD.EDU (David M. Baggett) writes ...

>Richard Covert writes:

<bunch of stuff about configuration files and self-modifying code deleted>

>>3) Write to the PRG file. Best of all since you need it to
>>run his program!!
>
>No offense, but I really think this is a "sleazy hack". What's the point of
>doing this? Then you have to worry about what happens when you
>(heaven forbid) modify the source code and recompile, thereby changing
>the size of the executable. And, as you pointed out, you have to come
>up with some "clever" scheme to make sure you don't clobber something
>useful (like executable code) in your binary. Whatever you come up
>with will likely be non-portable as well.

I use MicroEMACS 2.19, a small, fast text editor. I wanted to save the
margin settings and a few other characteristics, but having to load a
configuration file (a) slows down program invocation, and (b) provides
yet another opportunity for something to go wrong, i.e., the
configuration file gets lost.

I remembered an old CP/M communications program called MEX that had the
ability to "clone" itself -- to write a modified version of the running
program back to disk. So I whined at Dale Schumacher, who was handling
the MicroEMACS modifications, until I got him to add such a feature.

I assume that it indeed is nonportable, but is set off by #ifdef ATARI_ST
in the source code. I don't know any details about the technique. Perhaps
Dale can be persuaded to describe it.

Steve Yelvington, up at the lake in Minnesota
... pwcs.StPaul.GOV!stag!thelake!steve (Usenet)
... {playgrnd,moundst,class68}!thelake!steve (Citadel)
ericco@stew.ssl.berkeley.edu (Eric C. Olson) (10/28/89)
Really Self-Modifying Data

Lisp systems typically have a 'dumplisp' function which dumps an image
of the running system to disk. Thus, invoking the dumped lisp returns
you to the exact environment you dumped. This is easier to do in lisp
since it treats its source code as data.

Although modifying the executable file seems bad to me, I think that
modifying the resource file is a completely reasonable solution (and
simple). By parsing the structure of the resource file, your program can
quickly determine which parameter (object) needs to be modified.

In fact, if your program uses text, then you should put the text in the
resource file as well. This is how the resource file is supposed to be
used. By putting the text in the resource file, non-English speaking
people can replace it with meaningful non-English text.

Eric ericco@sag4.ssl.berkeley.edu
Eric ericco@ssl.berkeley.edu
7103_2622@uwovax.uwo.ca (Eric Smith) (11/14/89)
In article <89316.201227SML108@PSUVM.BITNET>, SML108@PSUVM.BITNET writes:
> Hi, I am writing an assembly language routine which modifies its own code in
> a tight loop in order to avoid having to do a decision statement at every
> iteration. Unfortunately, whatever code I am inserting is screwing things up
> royally, and although I have checked it fairly throughly, I cannot figure out
> what is going on. Question: Is there something screwy about executable and
> object files that would disallow self modifying. The block that gets modified
> is this:
>
>         lsr.w   d3
>         bne     cont
>         add.l   #8,a0
>         move.w  #$8000,d3
> cont:   nop

It would have helped if you had included the code that was doing the
modifications. The 68000 does instruction prefetch. If you're modifying
code that's really close to the instruction that does the modification,
then you can lose (the chip is executing the instruction it prefetched,
rather than the updated instruction in memory). You can get around this
by sticking some nops in. A better solution is to eliminate the self
modification entirely. I *strongly* suggest the latter, because your code
will almost certainly break on the TT (which has a 256 byte instruction
cache).

--
Eric R. Smith                     email:
Dept. of Mathematics              ERSMITH@uwovax.uwo.ca
University of Western Ontario     ERSMITH@uwovax.bitnet
London, Ont. Canada N6A 5B7
ph: (519) 661-3638
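
One way to eliminate the self-modification, as recommended in the post
above, is to make the decision once, outside the loop, and keep a separate
copy of the loop for each case. The sketch below shows the idea in C
rather than 68000 assembly. The two operations (set and clear) and all
names are assumptions, since the original post does not show what the
patched-in code did; only the mask stepping and the 8-byte advance follow
the quoted block.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical choice that the self-modifying version patched into the loop. */
enum op { OP_SET, OP_CLEAR };

/* Apply `op` to `count` consecutive bit positions, starting at the top
   bit of words[0]. When the mask shifts out to zero, reset it to 0x8000
   and advance 4 words (8 bytes), mirroring the quoted lsr.w/add.l block. */
void draw_span(uint16_t *words, size_t count, enum op op)
{
    uint16_t mask = 0x8000;
    size_t i;

    assert(words != NULL);

    /* The decision is tested once here, not once per iteration. */
    if (op == OP_SET) {
        for (i = 0; i < count; i++) {
            *words |= mask;
            mask >>= 1;
            if (mask == 0) {
                words += 4;
                mask = 0x8000;
            }
        }
    } else {
        for (i = 0; i < count; i++) {
            *words &= (uint16_t)~mask;
            mask >>= 1;
            if (mask == 0) {
                words += 4;
                mask = 0x8000;
            }
        }
    }
}
```

This costs some duplicated code, but nothing is ever written to memory
that is later fetched as instructions, so prefetch and instruction caches
are not an issue.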
apratt@atari.UUCP (Allan Pratt) (11/15/89)
7103_2622@uwovax.uwo.ca (Eric Smith) writes:
>In article <89316.201227SML108@PSUVM.BITNET>, SML108@PSUVM.BITNET writes:
>> [...things about self-modifying code...]
>It would have helped if you had included the code that was doing the
>modifications. The 68000 does instruction prefetch.

Yeah, and (as Eric points out) on a TT it will get you in BIG TROUBLE.
Writing to something as data and reading it as code is a BIG NO-NO
unless you invalidate the cache in between. In fact, on the 68030,
writing something in User mode and reading it in Super mode would
confuse the cache, were it not for a side effect of the write-allocate
bit.

People who do DMA into memory have to worry about that - the BIOS tries
to help, but you can still get in trouble. Your DMA driver should
execute the following instructions to clobber the cache after a DMA read
and before anybody actually looks at the data:

        movec.l cacr,d0         ; get current cache control register value
        or.w    #$808,d0        ; set both "clear" bits
        movec.l d0,cacr         ; write this new value back

The clear bits in the cacr are one-shots, so you don't have to clear
them again. The code above is harmless if the cache isn't enabled in the
first place, as it doesn't change the enable or other state bits.

The TT is going to open a whole new can of worms, people. We've been
dealing with it internally, of course, and TOS runs fine, but there are
things which you could get away with on the ST which you can't do on the
TT. For example, some programs use the high byte of a pointer for
something; with a 24-bit address bus, that's harmless. But with a full
32-bit bus, that gets you in trouble.

============================================
Opinions expressed above do not necessarily     -- Allan Pratt, Atari Corp.
reflect those of Atari Corp. or anyone else.      ...ames!atari!apratt
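
The pointer high-byte problem mentioned in the post above can be
illustrated with plain integers. In this sketch uint32_t values stand in
for addresses; the function names, the tag value, and the sample address
are arbitrary, and the two bus_address functions are only a model of what
reaches the address pins, not real hardware access.

```c
#include <assert.h>
#include <stdint.h>

#define ADDR_MASK_24 0x00FFFFFFu

/* Stash a one-byte tag in the top byte of a 24-bit address (the trick
   some ST programs used). */
uint32_t tag_pointer(uint32_t addr, uint8_t tag)
{
    assert((addr & ~ADDR_MASK_24) == 0);  /* address must fit in 24 bits */
    return ((uint32_t)tag << 24) | addr;
}

/* Model of a 24-bit address bus: the top byte never reaches memory. */
uint32_t bus_address_24(uint32_t ptr)
{
    return ptr & ADDR_MASK_24;
}

/* Model of a full 32-bit address bus: every bit is part of the address. */
uint32_t bus_address_32(uint32_t ptr)
{
    return ptr;
}
```

With addr = 0x0007A000 and tag = 0x80, bus_address_24 still yields
0x0007A000, while bus_address_32 yields 0x8007A000, a different location.
Keeping the tag in a separate field next to the pointer avoids the
problem on both kinds of machine.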