[comp.sys.amiga.tech] Whats wrong with self Modifying Code?

chymes@fribourg.csmil.umich.edu (Charles Hymes) (07/07/90)

Oh great gurus, enlighten this poor soul who hath not programmed in
Holy Assembler since the lowly 6502 was mighty.
Yea, I pray that thou dost not blast my humble screen with mighty
blasts of searing, wrathful flame, but knowest this, that I hath donned
the sacred asbestos suit, and awaitest thou's marvelous revelation.

I'd really like to know.

Charlweed Hymerfan A Major (but humble) Dude

valentin@cbmvax.commodore.com (Valentin Pepelea) (07/07/90)

In article <1990Jul6.201743.24777@csmil.umich.edu>
chymes@fribourg.csmil.umich.edu (Charles Hymes) writes:
>
>Oh great gurus, enlighten this poor soul who hath not programmed in
>Holy Assembler since the lowly 6502 was mighty.

Imagine you execute some code located at address $2000. While the CPU fetches
the instructions at address $2000, the cache retains those bytes. Next time
the CPU wants to execute the code located at address $2000, the instruction
bytes are already in the cache, so no external memory cycles have to be
performed. Fine.

Now you modify the instruction byte at address $2000. Next time the CPU wants
to execute at address $2000, it thinks it already has the bytes it wants in
the instruction cache, so it executes the wrong piece of code.

"But when I write to location $2000, will I not modify the contents of the
 cache?"

Yes, but you will modify the contents of the data cache, not the instruction
cache. Therefore, after modifying a piece of code, you must clear the
instruction cache.
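
The stale-fetch problem described above can be modelled in a few lines of C.
This is only a toy sketch: the struct, the function names and the 16-byte
memory are made up for illustration and say nothing about real 68030 cache
sizes or line layout. It shows a data write reaching memory while the
instruction side keeps serving the old byte until the cache is cleared.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define MEM_SIZE 16

/* Toy model: one main memory plus a separate instruction cache.
   Names and sizes are illustrative, not real 68030 parameters. */
struct machine {
    uint8_t memory[MEM_SIZE];
    uint8_t icache[MEM_SIZE];
    int     icache_valid[MEM_SIZE];
};

void machine_init(struct machine *m)
{
    memset(m, 0, sizeof *m);
}

/* Instruction fetch: served from the instruction cache when possible. */
uint8_t fetch_instruction(struct machine *m, unsigned addr)
{
    assert(addr < MEM_SIZE);
    if (!m->icache_valid[addr]) {
        m->icache[addr] = m->memory[addr];
        m->icache_valid[addr] = 1;
    }
    return m->icache[addr];
}

/* Data write: updates memory but never touches the instruction cache. */
void write_data(struct machine *m, unsigned addr, uint8_t value)
{
    assert(addr < MEM_SIZE);
    m->memory[addr] = value;
}

/* Clearing the instruction cache forces the next fetch to go to memory. */
void clear_icache(struct machine *m)
{
    memset(m->icache_valid, 0, sizeof m->icache_valid);
}
```

Fetching an address, writing a new byte there with write_data() and fetching
again returns the old byte; only after clear_icache() does the fetch see the
new one.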

Similarly, you must always clear the caches after a DMA (Direct Memory
Access) transfer. That's when an external device, such as a disk controller,
moves data into your motherboard RAM. An alternate solution to clearing the
caches is to declare regions of memory as non-cacheable.

Valentin
-- 
The Goddess of democracy? "The tyrants     Name:    Valentin Pepelea
may distroy a statue,  but they cannot     Phone:   (215) 431-9327
kill a god."                               UseNet:  cbmvax!valentin@uunet.uu.net
             - Ancient Chinese Proverb     Claimer: I not Commodore spokesman be

ckp@grebyn.com (Checkpoint Technologies) (07/07/90)

In article <1990Jul6.201743.24777@csmil.umich.edu> chymes@fribourg.csmil.umich.edu (Charles Hymes) writes:
>Oh great gurus, enlighten this poor soul who hath not programmed in
>Holy Assembler since the lowly 6502 was mighty.
>Yea, I pray that thou dost not blast my humble screen with mighty
>blasts of searing, wrathful flame, but knowest this, that I hath donned
>the sacred asbestos suit, and awaitest thou's marvelous revelation.
>
>I'd really like to know.
>
>Charlweed Hymerfan A Major (but humble) Dude

OK, OK, you can get up off your face, we won't hurt you... :-)

IMHO, the biggest and best reason not to allow self-modifying code is
that your running program doesn't match your source code.  This
complicates debugging to a large degree; I had to deal with
self-modifying stuff regularly, a few years ago.  Not again.

There are other reasons too.  A good thing to do in multi-tasking
systems is keep one copy of a program in memory, and have several 
tasks run it, right where it sits.  A program which has modified
itself isn't going to work for several tasks at the same time
(probably).  Another reason is that in some cases, on many different
machines and for many different reasons, it doesn't work or is
unreliable.  The 68030 instruction cache is a good example, but most
microprocessors have some kind of instruction prefetch mechanism.  If
you modify something very near to the current PC, it may be too late,
it's already in the pipeline.  Sure, you can measure the pipeline, but
it can *change* from one generation of your CPU to the next, and now
your code doesn't work anymore.
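
The point about several tasks sharing one copy of a program can be sketched
in C. The "patched" variable below is only a stand-in for an operand written
into the code itself, and all names here are hypothetical. Because there is a
single shared copy, whichever task patched it last wins; keeping the value in
per-caller data avoids that.

```c
#include <assert.h>

/* Stand-in for an immediate operand patched into the code itself.
   There is only one copy, shared by every task running the routine. */
static int patched_factor = 1;

void patch_factor(int factor)
{
    patched_factor = factor;
}

int scale_patched(int x)
{
    return x * patched_factor;
}

/* Re-entrant alternative: each caller carries its own factor. */
struct scale_ctx {
    int factor;
};

int scale_reentrant(const struct scale_ctx *ctx, int x)
{
    assert(ctx != 0);
    return x * ctx->factor;
}
```

If task A patches in 2 and is preempted by task B patching in 3, task A's
later call to scale_patched(10) yields 30 rather than the 20 it expected,
while two scale_ctx structures keep the two callers apart.
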
-- 
First comes the logo: C H E C K P O I N T  T E C H N O L O G I E S      / /  
                                                                    \\ / /    
Then, the disclaimer:  All expressed opinions are, indeed, opinions. \  / o
Now for the witty part:    I'm pink, therefore, I'm spam!             \/

cmcmanis@stpeter.Eng.Sun.COM (Chuck McManis) (07/07/90)

There isn't anything "wrong" with it per se as long as the system
supports it. When asked in the context of the Amiga, there are a couple
of issues :

Issue : How do you get reliable self modifying code execution
	when there is an instruction cache present ? Let's take
	for example the code that goes something like :
		lea	JumpTable(pc),a0	; Load A0 with address of table
		move.l	0(a0,d1.w),d0	; Fetch the table entry at offset D1
		move.l	d0,JumpTarget	; Store it in the Jump instruction
		dc.w	$4ef9		; opcode for a jump to an absolute long address
	JumpTarget:
		ds.l	1		; 4 byte jump address
	JumpTable:			; Table of addresses.
		dc.l	function1
		dc.l	function2
		dc.l	function3
	In this example, we allow the parameter passed in D1 to be used
	as an offset into a jump table. The final jump instruction is
	self modified, and the routine branches appropriately. Now on
	a machine with an instruction cache, all of the instructions get
	fetched into the cache. When the store into JumpTarget
	happens, it changes the version in memory, but the cache
	is _not_ changed. Poof! The code breaks on 68020 and 68030
	Amigas with the instruction cache turned on. Generally the
	68K family is flexible enough in its addressing modes that
	you can accomplish the same thing another, "legal" way that
	doesn't involve self modifying code. This allows your program
	to continue to function on high end Amigas.
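
One "legal" way to do what the jump-table example above does is to keep the
table as data and jump through it, never writing into the instruction stream.
In 68K assembler that means loading the table entry into an address register
and doing jmp (a0). The C sketch below shows the same idea; the three handler
bodies and the -1 error value are placeholders invented for illustration, and
the index here counts entries rather than bytes.

```c
#include <assert.h>
#include <stddef.h>

/* Placeholder handlers; the real function1..function3 are not shown
   in the thread, so these bodies are made up. */
static int function1(void) { return 1; }
static int function2(void) { return 2; }
static int function3(void) { return 3; }

typedef int (*handler_fn)(void);

/* The table is plain read-only data; nothing in the code is patched. */
static const handler_fn jump_table[] = { function1, function2, function3 };

#define JUMP_TABLE_LEN (sizeof jump_table / sizeof jump_table[0])

/* Returns the handler's result, or -1 for an out-of-range index. */
int dispatch(size_t index)
{
    if (index >= JUMP_TABLE_LEN)
        return -1;
    assert(jump_table[index] != NULL);
    return jump_table[index]();
}
```

Because the target address is read as data on every call, neither the
instruction cache nor a write-protected code hunk gets in the way.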

Issue : What do you do when a version of LoadSeg makes the code hunk
	of your program "execute-only" on the MMU?
	Using the same example above, when an MMU is present it is
	possible (and probable) that at some time in the future the
	MMU will "protect" your code from getting stomped on by making
	the memory it is running in "execute/read" only. When that is
	the case the store into JumpTarget generates a
	CPU exception. This exception will be treated by LoadSeg as
	either a runaway process or some other valid reason to shut
	down your task. Suddenly you are hosed again.

So the bottom line isn't that it is morally wrong, simply that it 
won't work on some legitimate Amiga systems and is thus "illegal"
in terms of having full Amiga compatibility. If one chooses to use
self modifying code in a commercial or even freeware Amiga program
they should be sure to spell out clearly on either the package or
in a README file that the code will not work reliably on a 68020 or
68030 system. 


--
--Chuck McManis						    Sun Microsystems
uucp: {anywhere}!sun!cmcmanis   BIX: <none>   Internet: cmcmanis@Eng.Sun.COM
These opinions are my own and no one elses, but you knew that didn't you.
"I tell you this parrot is bleeding deceased!"

mcmahan@netcom.UUCP (Dave Mc Mahan) (07/07/90)

 In a previous article, chymes@fribourg.csmil.umich.edu (Charles Hymes) writes:
>Oh great gurus, enlighten this poor soul who hath not programmed in
>Holy Assembler since the lowly 6502 was mighty.
>Yea, I pray that thou dost not blast my humble screen with mighty
>blasts of searing, wrathful flame, but knowest this, that I hath donned
>the sacred asbestos suit, and awaitest thou's marvelous revelation.
>
>I'd really like to know.
>
For one thing, It's not re-entrant.  That means you can't let more than one
process share the same physical piece of code without major overhead.  For
another, it's not even close to easy in a language like 'C' (or any language
other than assembler).  With a 68000, there really isn't a good reason to do it,
since there are other ways to run just as fast without it.  Finally, it's a
major pain in the ass to debug and document for the future.  I have seen people
dork the top value on the stack while in a subroutine and return to a different
place from where they came from, but that too is compiler specific and is
non-portable as all heck.  Just say 'no' to self-modifying code.

>Charlweed Hymerfan A Major (but humble) Dude

    -dave

dylan@cs.washington.edu (Dylan McNamee) (07/08/90)

In article <11749@netcom.UUCP> mcmahan@netcom.UUCP (Dave Mc Mahan) writes:
>
> In a previous article, chymes@fribourg.csmil.umich.edu (Charles Hymes) writes:
>>Oh great gurus, enlighten this poor soul who hath not programmed in
>>Holy Assembler since the lowly 6502 was mighty.
>>Yea, I pray that thou dost not blast my humble screen with mighty
>>blasts of searing, wrathful flame, but knowest this, that I hath donned
>>the sacred asbestos suit, and awaitest thou's marvelous revelation.
>>
>>I'd really like to know.
>>
>For one thing, It's not re-entrant.  That means you can't let more than one


Well, this isn't necessarily so.  For example, if the modified code leaves
the program in an executable state that no longer performs the self
modification, then it's reentrant.

The other posters mentioned difficulty of debugging such code, and I agree.
I was using the 'mon' monitor/disassembler on my copy of SimCity, to 
see what I could see...and they use self modifying code all over.  It plays
havoc with mon.  If you set a breakpoint in the self modified portion, the
program is in an invalid state when the breakpoint is reached.  If you put
the breakpoint in a 'stable' piece of code, all's cool.  Finally, if you let
the whole program run to exit, then select rerun on mon, it runs again
without accessing the disk--reentrant(!) 

Pretty neat stuff to watch.  (But kids--don't try this at home!)

>>Charlweed Hymerfan A Major (but humble) Dude
>
>    -dave
dylan

ked01@ccc.amdahl.com (Kim DeVaughn) (07/08/90)

In article <13104@cbmvax.commodore.com>, valentin@cbmvax.commodore.com (Valentin Pepelea) writes:
>
> An alternate solution to clearing the caches
> is to declare regions of memory as non-cacheable.

Does the 3000 provide for non-cacheable memory, and if so, does 2.0 support
it?  How does the user control/specify this?  Come to think of it, just how
does the user specify xlation tables, etc in general?

On a related note, anyone know if Lattice/Manx/DICE support "volatile"
correctly?

/kim



-- 
UUCP:  kim@uts.amdahl.com   -OR-   ked01@juts.ccc.amdahl.com
  or:  {sun,decwrl,hplabs,pyramid,uunet,oliveb,ames}!amdahl!kim
DDD:   408-746-8462
USPS:  Amdahl Corp.  M/S 249,  1250 E. Arques Av,  Sunnyvale, CA 94086
BIX:   kdevaughn     GEnie:   K.DEVAUGHN     CIS:   76535,25

gpsteffl@sunee.waterloo.edu (Glenn Patrick Steffler) (07/09/90)

I have been waiting for people to show all of the great ways to avoid
self modifying code.  But the examples have been contrived and not the
least instructive.

Let's take a real world example:

A spread sheet program which must perform several thousand iterations
of a formula while recalculating.  The formula can be "compiled" to
the stack, and run such that the execution time is considerably less
than if the formula had been interpreted each time.

A video device driver or some such that uses raster ops to write values
to the video display.  (Given the Amiga has a blitter with this ability
I do not ask for confirmation of the relevance of this argument)  The
driver can "compile" a raster operation fill algorithm into some small
code segment and run it.  This is indeed self modifying code, but is 
almost essential for speed, because the user hates to wait for screen
refresh.

Anyway, that was just some fodder for you guys.  I submit this for
rationalization or destruction.

In article <138523@sun.Eng.Sun.COM> cmcmanis@stpeter.Eng.Sun.COM (Chuck McManis) writes:
>There isn't anything "wrong" with it per se as long as the system
>supports it. When asked in the context of the Amiga, there are a couple
>of issues :
>
>Issue : How do you get reliable self modifying code execution
>	when there is an instruction cache present ? Let's take
[...]
>they should be sure to spell out clearly on either the package or
>in a README file that the code will not work reliably on a 68020 or
>68030 system. 

>--Chuck McManis						    Sun Microsystems
>uucp: {anywhere}!sun!cmcmanis   BIX: <none>   Internet: cmcmanis@Eng.Sun.COM
>These opinions are my own and no one elses, but you knew that didn't you.
>"I tell you this parrot is bleeding deceased!"

I agree this is a difficult thing to do, but aliasing the address of 
the self modifying code with several other addresses would make that
point moot.

Later

-- 
Co-Op Scum - U of Loo '91             "Bo doesn't know software" - George Brett

"Just got paid today, got myself a pocket full-o change" - ZZ top
                       Glenn Patrick Steffler

mwm@raven.pa.dec.com (Mike (Real Amigas have keyboard garages) Meyer) (07/10/90)

In article <1990Jul9.163607.18336@sunee.waterloo.edu> gpsteffl@sunee.waterloo.edu (Glenn Patrick Steffler) writes:

   Let's take a real world example:

   A spread sheet program which must perform several thousand iterations
   of a formula while recalculating.  The formula can be "compiled" to
   the stack, and run such that the execution time is considerably less
   than if the formula had been interpreted each time.

   A video device driver or some such that uses raster ops to write values
   to the video display.  (Given the Amiga has a blitter with this ability
   I do not ask for confirmation of the relevance of this argument)  The
   driver can "compile" a raster operation fill algorithm into some small
   code segment and run it.  This is indeed self modifying code, but is 
   almost essential for speed, because the user hates to wait for screen
   refresh.

Both examples are from a class that's often confused with self
modifying code, but aren't really self modifying unless badly written.
I call that class (until somebody suggests something better) self
generating code.

All programs of this class can be done without modifying code -
remember, you're creating it, not modifying it! For the first example
(several thousand iterations? trivial) I'd compile to interpreted
stack-machine code to run. For large applications (popi, for instance)
or those where speed is more critical (your video driver), compile to
an array and tell the OS you're launching a task loaded into that
chunk of memory, and use that. The OS should take care of making sure
the code in that memory actually gets run, and not whatever you put
there last time around.
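
A minimal sketch of the "compile to interpreted stack-machine code" option
for the spreadsheet case, in C. The opcode set, the struct layout and the
name run_formula are all assumptions made up for this illustration; the
thread does not describe any particular design. The formula is compiled once
into an array of instructions, which is pure data, and then evaluated as
often as needed without generating any machine code.

```c
#include <assert.h>
#include <stddef.h>

enum opcode { OP_PUSH_CONST, OP_PUSH_CELL, OP_ADD, OP_MUL, OP_END };

struct instr {
    enum opcode op;
    double      value;   /* constant for OP_PUSH_CONST */
    size_t      cell;    /* cell index for OP_PUSH_CELL */
};

#define STACK_MAX 32

/* Runs a compiled formula against an array of cell values. */
double run_formula(const struct instr *code, const double *cells,
                   size_t ncells)
{
    double stack[STACK_MAX];
    size_t sp = 0;

    for (; code->op != OP_END; ++code) {
        switch (code->op) {
        case OP_PUSH_CONST:
            assert(sp < STACK_MAX);
            stack[sp++] = code->value;
            break;
        case OP_PUSH_CELL:
            assert(sp < STACK_MAX && code->cell < ncells);
            stack[sp++] = cells[code->cell];
            break;
        case OP_ADD:
            assert(sp >= 2);
            stack[sp - 2] += stack[sp - 1];
            --sp;
            break;
        case OP_MUL:
            assert(sp >= 2);
            stack[sp - 2] *= stack[sp - 1];
            --sp;
            break;
        case OP_END:
            break;
        }
    }
    assert(sp == 1);   /* a well-formed formula leaves one result */
    return stack[0];
}
```

A formula such as (A + B) * 2 becomes the sequence PUSH_CELL 0, PUSH_CELL 1,
ADD, PUSH_CONST 2, MUL, END, and a recalculation is just a call to
run_formula() with the current cell values.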

Of course, for your video applications, you wouldn't want to use a
soft solution anyway. Either a blitter, or a hand-coded library for
all the common operations, or a graphics accelerator that does all the
real work. In any of those cases, you never need to generate or modify
code; that's already been done. For development, you'd want the
support code that finally runs the code you're developing to be as
straightforward as possible, and raw speed won't be that critical.

	<mike
--
My feet are set for dancing,				Mike Meyer
Won't you turn your music on.				mwm@relay.pa.dec.com
My heart is like a loaded gun,				decwrl!mwm
Won't you let the water run.

dillon@overload.UUCP (Matthew Dillon) (07/10/90)

>On a related note, anyone know if Lattice/Manx/DICE support "volatile"
>correctly?
>
>/kim

    I've been cleaning up DICE in that respect, but I cannot guarantee
    it yet.  It still sometimes stores a temporary result into the
    destination of an assignment and then does the final operation on
    said destination to produce the final result.
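
Why that behaviour matters for "volatile": if the destination is a hardware
register, an intermediate store is a visible extra write. A common defensive
habit, sketched below in C, is to build the whole value in a local and store
it once. The register here is a plain variable so the sketch runs anywhere,
and set_field and its parameters are hypothetical names; whether this habit
is enough for any particular compiler is not something the thread settles.

```c
#include <assert.h>

/* Hypothetical "register"; a plain variable stands in for hardware. */
volatile unsigned short fake_register;

/* Compute the full value in a local, then store exactly once. */
void set_field(volatile unsigned short *reg, unsigned short old,
               unsigned short mask, unsigned short bits)
{
    unsigned short value;

    assert(reg != 0);
    value = (unsigned short)((old & ~mask) | (bits & mask));
    *reg = value;   /* the only access to *reg in this function */
}
```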

--

			    *!* note new domain name!

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA

valentin@cbmvax.commodore.com (Valentin Pepelea) (07/10/90)

In article <a8x802lO01Rq01@JUTS.ccc.amdahl.com> ked01@ccc.amdahl.com
(Kim DeVaughn) writes:
>
> Does the 3000 provide for non-cacheable memory, and if so, does 2.0 support
> it?  How does the user control/specify this?  Come to think of it, just how
> does the user specify xlation tables, etc in general?

The A3000, as well as the 2500/30 and 2500/20 provide for non-cacheable memory,
both in hardware using the *CIIN (cache inhibit) pin and in software using the
CI bit in translation tables. The *CIIN is automatically asserted for CHIP and
I/O memory areas.

There is no OS provided way to control/specify this. But enterprising hackers
may write their own MMU code, and hope that future versions of the OS will not
break them.  :-)

Valentin
-- 
The Goddess of democracy? "The tyrants     Name:    Valentin Pepelea
may distroy a statue,  but they cannot     Phone:   (215) 431-9327
kill a god."                               UseNet:  cbmvax!valentin@uunet.uu.net
             - Ancient Chinese Proverb     Claimer: I not Commodore spokesman be

Chuck.Phillips@FtCollins.NCR.COM (Chuck.Phillips) (07/10/90)

>>>>> On 9 Jul 90 12:45:59 GMT, mwm@raven.pa.dec.com (Mike (Real Amigas have keyboard garages) Meyer) said:
Mike> All programs of this class can be done without modifying code -
Mike> remember, you're creating it, not modifying it! For the first example
Mike> (several thousand iterations? trivial) I'd compile to interpreted
Mike> stack-machine code to run. For large applications (popi, for instance)
Mike> or those where speed is more critical (your video driver), compile to
Mike> an array and tell the OS you're launching a task loaded into that
Mike> chunk of memory, and use that. The OS should take care of making sure
Mike> the code in that memory actually gets run, and not whatever you put
Mike> there last time around.

This brings up an interesting point about future versions of AmigaOS and
the MMU.  In future versions of AmigaOS, will we need to mark this area of
memory as executable to avoid MMU violations?  Do we write the code into
the stack or heap and then call something like LoadSeg() on it?

Expiring minds want to know!
--
Chuck Phillips  MS440
NCR Microelectronics 			Chuck.Phillips%FtCollins.NCR.com
Ft. Collins, CO.  80525   		uunet!ncrlnk!ncr-mpd!bach!chuckp

gpsteffl@sunee.waterloo.edu (Glenn Patrick Steffler) (07/11/90)

In article <MWM.90Jul9134559@raven.pa.dec.com> mwm@raven.pa.dec.com (Mike (Real Amigas have keyboard garages) Meyer) writes:
>In article <1990Jul9.163607.18336@sunee.waterloo.edu> gpsteffl@sunee.waterloo.edu (Glenn Patrick Steffler) writes:
>   A spread sheet program which must perform several thousand iterations
>   of a formula while recalculating.  The formula can be "compiled" to
>   the stack, and run such that the execution time is considerably less
>   than if the formula had been interpreted each time.
>
>Both examples are from a class that's often confused with self
>modifying code, but aren't really self modifying unless badly written.
>I call that class (until somebody suggests something better) self
>generating code.

Ok.  My goof.  The purpose of the article was to present a scenario
which almost "requires" code to be generated, and run from a non-code
segment, or some such.  I wanted someone to approach the article on the
basis of incompatibility with hardware, like the instruction/data cache
separation, or the MMU non-cache feature.

>All programs of this class can be done without modifying code -
>remember, you're creating it, not modifying it! For the first example

Agreed, and I had stated this in my article.

>(several thousand iterations? trivial) I'd compile to interpreted
>stack-machine code to run. For large applications (popi, for instance)
>or those where speed is more critical (your video driver), compile to
>an array and tell the OS you're launching a task loaded into that
>chunk of memory, and use that. The OS should take care of making sure
>the code in that memory actually gets run, and not whatever you put
>there last time around.

Fine, that's what I would do if it were not for the problems that may happen
on machines which cache lots of memory.  If I recompile into the array
and run code from it without a cache flush, the possibility of major
problems is quite large.

>Of course, for your video applications, you wouldn't want to use a
>soft solution anyway. Either a blitter, or a hand-coded library for
>all the common operations, or a graphics accelerator that does all the
>real work. In any of those cases, you never need to generate or modify
>code; that's already been done. For development, you'd want the
>support code that finally runs the code you're developing to be as
>straightforward as possible, and raw speed won't be that critical.

RAW speed is generally critical on slower architectures, and in areas
of serious competition like spreadsheet recalc times etc.  Video drivers
tend to give the user more a feeling of how fast a program is rather than
its computational efficiency.

>	<mike
>--
>My feet are set for dancing,				Mike Meyer
>Won't you turn your music on.				mwm@relay.pa.dec.com
>My heart is like a loaded gun,				decwrl!mwm
>Won't you let the water run.


-- 
Co-Op Scum - U of Loo '91             "Bo doesn't know software" - George Brett

"If I could only flag her down" -- ZZtop Afterburner
                       Glenn Patrick Steffler

mwm@raven.pa.dec.com (Mike (Real Amigas have keyboard garages) Meyer) (07/11/90)

In article <1990Jul10.201202.3378@sunee.waterloo.edu> gpsteffl@sunee.waterloo.edu (Glenn Patrick Steffler) writes:

   >stack-machine code to run. For large applications (popi, for instance)
   >or those where speed is more critical (your video driver), compile to
   >an array and tell the OS you're launching a task loaded into that
   >chunk of memory, and use that. The OS should take care of making sure
   >the code in that memory actually gets run, and not whatever you put
   >there last time around.

   Fine, that's what I would do if it were not for the problems that may happen
   on machines which cache lots of memory.  If I recompile into the array
   and run code from it without a cache flush, the possibility of major
   problems is quite large.

The last two sentences in the paragraph cover that. You don't just
branch into the code, you run it as a separate task (or thread or
whatever your favorite term is). If the OS fails to correctly deal
with caches in this case, then it's got a _serious_ bug. After all you
haven't done anything different from what happens when a shell starts
an application.

The OS may do a cache flush. Then again, it may not - it has more
information about what's going on in the system than you do.

	<mike
--
And then up spoke his own dear wife,			Mike Meyer
Never heard to speak so free.				mwm@relay.pa.dec.com
"I'd rather a kiss from dead Matty's lips,		decwrl!mwm
Than you or your finery."

jesup@cbmvax.commodore.com (Randell Jesup) (07/12/90)

In article <MWM.90Jul11114553@raven.pa.dec.com> mwm@raven.pa.dec.com (Mike (Real Amigas have keyboard garages) Meyer) writes:
>In article <1990Jul10.201202.3378@sunee.waterloo.edu> gpsteffl@sunee.waterloo.edu (Glenn Patrick Steffler) writes:
>   Fine, that's what I would do if it were not for the problems that may happen
>   on machines which cache lots of memory.  If I recompile into the array
>   and run code from it without a cache flush, the possibility of major
>   problems is quite large.
>
>The last two sentences in the paragraph cover that. You don't just
>branch into the code, you run it as a separate task (or thread or
>whatever your favorite term is). If the OS fails to correctly deal
>with caches in this case, then it's got a _serious_ bug. After all you
>haven't done anything different from what happens when a shell starts
>an application.
>
>The OS may do a cache flush. Then again, it may not - it has more
>information about what's going on in the system than you do.

	The OS does a cache flush when it needs to - when it does relocation
in LoadSeg(), it flushes the caches afterwards before returning.  There's
no reason for the Shell to do cache flushes - they're only needed after 
creating or modifying executable code.
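
The rule "flush after creating or modifying executable code" can be captured
in one helper so it is never forgotten. The C sketch below is illustrative
only: install_code, flush_fn and recording_flush are invented names, and the
flush itself is left as a hook because the real call is system specific. On
AmigaOS 2.0 the hook would presumably wrap an exec.library call such as
CacheClearU(), but check the autodocs before relying on that.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical hook: flush the instruction cache for [start, start+len). */
typedef void (*flush_fn)(void *start, size_t len);

/* Copies freshly generated code into place and always flushes afterwards. */
void install_code(void *dest, const void *code, size_t len, flush_fn flush)
{
    assert(dest != NULL && code != NULL && flush != NULL);
    memcpy(dest, code, len);
    flush(dest, len);
}

/* Demo hook that only records what it was asked to flush. */
static void    *last_flush_start;
static size_t   last_flush_len;
static unsigned flush_count;

void recording_flush(void *start, size_t len)
{
    last_flush_start = start;
    last_flush_len = len;
    ++flush_count;
}
```

A code generator would call install_code() every time it writes into its
buffer and only then branch into it; the recording hook exists purely to make
the sketch checkable on any machine.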

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.cbm.commodore.com  BIX: rjesup  
Common phrase heard at Amiga Devcon '89: "It's in there!"