[comp.sys.amiga.programmer] Starting another copy of your own code

rbabel@babylon.rmt.sub.org (Ralph Babel) (03/09/91)

In article <18cb4d63.ARN0b56@swinjm.UUCP>,
forgeas@swinjm.UUCP (Jean-Michel Forgeas) writes:

> I don't see where you find self-modifying code in the
> example above. All I see is data (while there is no
> process running on it). Effectively after CreateProc() on
                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
> it this data becomes code, but there is no modification of
  ^^^^^^^^^^^^^^^^^^^^^^^^^
> the code after CreateProc().

This is not true! It's always the function that brings the
code into memory (e.g. LoadSeg()) that is supposed to keep
the code cache up to date, not the function to execute this
code (e.g. CreateProc()). The same code might be executed
several times (even concurrently), so clearing the cache
every time would be wasteful.

These are the only legal ways of bringing code into memory:

- ROM-code
- DAC_WORDWIDE
- DAC_BYTEWIDE
- DAC_NIBBLEWIDE
- MakeFunctions()
- MakeLibrary()
- SetFunction()
- LoadSeg()

Putting an entry into the Captures/KickTags and rebooting
the system _might_ also be considered safe. 2.0 provides a
few additional functions (e.g. InternalLoadSeg()) plus some
cache control calls.

In article <1991Mar9.170859.4810@Sandelman.OCUnix.On.Ca>,
mcr@Sandelman.OCUnix.On.Ca (Michael Richardson) writes:

> Please suggest an alternative to doing this.

Simply use a function entry that is longword-aligned, e.g.
an assembly language stub after a "CNOP 0,4" or the first
(C-) function in an object module, and convert it to a
seglist:

void (*pfv)(void);
BPTR sl;

sl = (BPTR)(((ULONG)pfv >> 2) - 1);
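
As a rough illustration of why that one-liner wants a longword-aligned
entry: a BPTR is a byte address divided by four, and a seglist BPTR
points at the segment's next-segment longword, with the code starting
four bytes after it. The sketch below is host-side arithmetic only. The
helper names are hypothetical, and addresses are modelled as plain
32-bit integers rather than real Amiga pointers.

```c
#include <stdint.h>

/* Hypothetical helper names; addresses are modelled as plain 32-bit
 * integers so the arithmetic can be checked on any host. */

/* A BPTR is a byte address divided by 4, so the target must sit on a
 * longword boundary or the low bits are lost. */
int is_longword_aligned(uint32_t addr)
{
    return (addr & 3u) == 0;
}

/* Same arithmetic as the one-liner above: convert the function address
 * to a BPTR and step back one longword, so that the function occupies
 * the slot where a segment's code would start. */
uint32_t fn_to_seglist(uint32_t fn_addr)
{
    return (fn_addr >> 2) - 1;
}

/* Inverse direction: skip the longword the seglist BPTR points at (the
 * next-segment link) and convert back to a byte address. */
uint32_t seglist_entry(uint32_t seglist)
{
    return (seglist + 1) << 2;
}
```

An aligned address such as 0x00200004 survives the round trip through
fn_to_seglist() and seglist_entry(). A word-aligned address such as
0x00200006 comes back as 0x00200004, which is why the alignment
requirement matters.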

> So far code like this [PhonySegList] (Hmm. I don't have a
> SegSize at the beginning though) has been working on
> A3000s for some time now.

256 bytes of cache isn't a lot. It even works under 1.3.

> Is there now a 2.0 function for doing the same thing?

2.0 dos.library allows one to specify a regular function
address instead of a (longword-aligned) seglist.

> /* From Leoproc.zoo. By Leo Schwab. */
>
> CopyMem(&template, fakelist, sizeof(struct PhonySegList));
> fakelist->psl_EntryPoint = SlaveStart;

Sorry, self-modifying code. Even if Leo did it. :-)

Ralph

mcr@Sandelman.OCUnix.On.Ca (Michael Richardson) (03/10/91)

In article <06417.AA06417@babylon.rmt.sub.org> cbmvax.commodore.com!cbmehq!babylon!rbabel (Ralph Babel) writes:
>In article <19578@cbmvax.commodore.com>,
>ken@cbmvax.commodore.com (Ken Farinsky - CATS) writes:
>
>> LoadSeg() needs a BPTR to a seg list, which can be faked
>> like:
>>
>>         /* From Mike Sinz AmigaMail example */
>>         struct CodeHdr
>>                 {
>>                 ULONG SegSize;   /* sizeof(struct CodeHdr) */
>>                 ULONG NextSeg;   /* Must be NULL */
>>                 UWORD JumpInstr; /* set to 0x4EF9 (a jump instruction) */
>>                 APTR Function;   /* a pointer to the function */
>>                 }
>
>I'd call this self-modifying code.

  Please suggest an alternative to doing this.
  So far code like this (Hmm. I don't have a SegSize at the beginning
though) has been working on A3000s for some time now. 
  Is there now a 2.0 function for doing the same thing?

/* From Leoproc.zoo. By Leo Schwab. */
struct PhonySegList {
        BPTR    psl_NextSeg;            /*  BPTR to next element in list  */
        UWORD   psl_JMP;                /*  A 68000 JMP abs.l instruction  */
        void    (*psl_EntryPoint)();    /*  The address of the function  */
};

struct PhonySegList template = {
        NULL,                           /*  No next element.              */
        0x4EF9,                         /*  JMP abs.l                     */
        NULL                            /*  Argument for JMP instruction  */
};

...

  /*
   * Allocate a PhonySegList structure.
   */
  if((fakelist = AllocMem(sizeof(struct PhonySegList), NULL))==NULL ||
     (startup  = AllocMem(sizeof(struct SlaveStart),MEMF_PUBLIC))==NULL) {
    return(88);
  }

  /*
   * Copy the template into the allocated memory, and set the entry
   * point to the sub-process.
   */
  CopyMem(&template, fakelist, sizeof(struct PhonySegList));
  fakelist->psl_EntryPoint = SlaveStart;
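
For reference, these are the bytes the structure ends up holding once
the entry point is filled in. This is a host-side sketch under two
assumptions: the 68000 layout is big-endian, and there is no padding
between the UWORD and the pointer, giving ten bytes in total. The name
build_phony_seglist is hypothetical. The sketch only shows the layout
and does nothing about the instruction cache.

```c
#include <stdint.h>

#define PHONY_SEGLIST_SIZE 10  /* 4 (next) + 2 (opcode) + 4 (address) */

/* Store a 32-bit value most-significant byte first, as the 68000 does. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Hypothetical helper: lay out the PhonySegList image byte by byte. */
void build_phony_seglist(uint8_t out[PHONY_SEGLIST_SIZE], uint32_t entry)
{
    put_be32(out, 0);          /* psl_NextSeg: no next segment        */
    out[4] = 0x4E;             /* psl_JMP: JMP abs.l opcode, high byte */
    out[5] = 0xF9;             /* psl_JMP: low byte                    */
    put_be32(out + 6, entry);  /* psl_EntryPoint: operand of the JMP   */
}
```

The last six bytes are an instruction that was stored with ordinary
data writes.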




-- 
   :!mcr!:            |  The postmaster never | - Pay attention only
   Michael Richardson |    resolves twice.    | to _MY_ opinions. -  
 HOME: mcr@sandelman.ocunix.on.ca +   Small Ottawa nodes contact me
 Bell: (613) 237-5629             +    about joining ocunix.on.ca!

mcr@Sandelman.OCUnix.On.Ca (Michael Richardson) (03/11/91)

In article <06493.AA06493@babylon.rmt.sub.org> cbmvax.commodore.com!cbmehq!babylon!rbabel (Ralph Babel) writes:
>In article <1991Mar9.170859.4810@Sandelman.OCUnix.On.Ca>,
>mcr@Sandelman.OCUnix.On.Ca (Michael Richardson) writes:
>
>> Please suggest an alternative to doing this.
>
>Simply use a function entry that is longword-aligned, e.g.
>an assembly language stub after a "CNOP 0,4" or the first
>(C-) function in an object module, and convert it to a
>seglist:
>
>void (*pfv)(void);
>BPTR sl;
>
>sl = (BPTR)(((ULONG)pfv >> 2) - 1);

  Thank you. This had not occurred to me, and I'm a little hesitant
about it. We shall try this.

>> CopyMem(&template, fakelist, sizeof(struct PhonySegList));
>> fakelist->psl_EntryPoint = SlaveStart;
>
>Sorry, self-modifying code. Even if Leo did it. :-)

  No dispute about it being not exactly kosher. I had assumed that the
first long word (which was NULL, the next pointer) was used in some
way. Probably we'll wind up assembling a short assembly program to
'jmp' to the right place rather than depend on the function being
first. 
  Actually, we are probably going to pull the two processes into
separate binaries. It will allow a lot more customization. (And a lot
more user confusion :-)

-- 
   :!mcr!:            |  The postmaster never | - Pay attention only
   Michael Richardson |    resolves twice.    | to _MY_ opinions. -  
 HOME: mcr@sandelman.ocunix.on.ca +   Small Ottawa nodes contact me
 Bell: (613) 237-5629             +    about joining ocunix.on.ca!

cg@ami-cg.UUCP (Chris Gray) (03/12/91)

In article <06493.AA06493@babylon.rmt.sub.org> rbabel@babylon.rmt.sub.org
(Ralph Babel) writes:
[edited down, but you'll all remember it!]

>This is not true! It's always the function that brings the
>code into memory (e.g. LoadSeg()) that is supposed to keep
>the code cache up to date, not the function to execute this
>code (e.g. CreateProc()). The same code might be executed
>several times (even concurrently), so clearing the cache
>every time would be wasteful.
>
>These are the only legal ways of bringing code into memory:
>
>- ROM-code
>- DAC_WORDWIDE
>- DAC_BYTEWIDE
>- DAC_NIBBLEWIDE
>- MakeFunctions()
>- MakeLibrary()
>- SetFunction()
>- LoadSeg()
>
>Putting an entry into the Captures/KickTags and rebooting
>the system _might_ also be considered safe. 2.0 provides a
>few additional functions (e.g. InternalLoadSeg()) plus some
>cache control calls.
>
>Sorry, self-modifying code. Even if Leo did it. :-)

Ralph is saying that the technique of building a SegList structure at run-time
and then using it with CreateProc is self-modifying code. I won't argue the
definition (I don't think it is, I think it is dynamically created code).
The technique does not break on systems with caches, whereas the usual forms
of self-modifying code will. If Commodore wants to explicitly say that the
technique is invalid (Ralph suggested an alternative), that is their right,
but they should be clear on why. It is NOT because it breaks on CPUs with
caches, but because they don't want it done. There may be good reasons for
not doing it, like future considerations of MMUs and execute-only status.

I waited for someone else to jump on this, but nobody did, so I had to.
I've likely just crossed in the postings with half-a-dozen other replies.

--
Chris Gray   alberta!ami-cg!cg	 or   cg%ami-cg@scapa.cs.UAlberta.CA

rbabel@babylon.rmt.sub.org (Ralph Babel) (03/12/91)

In article <cg.7124@ami-cg.UUCP>, cg@ami-cg.UUCP (Chris
Gray) writes:

> Ralph is saying that the technique of building a SegList
> structure at run-time and then using it with CreateProc is
> self-modifying code. I won't argue the definition (I don't
> think it is, I think it is dynamically created code).

"dynamically-created code"? Chris, I guess you've been doing
too much compiler development lately! :-)

> The technique does not break on systems with caches, [...]

I think it does. Here's the scenario:

1. Process P1 executes its code in memory chunk M;
2. Parts of chunk M will be loaded into the code cache;
3. Process P1 terminates;
4. The memory area occupied by P1's code segment (including
   chunk M) is released to the free-memory pool;
5. Process P2 allocates a fake seglist; let's assume this
   seglist happens to be located in chunk M;
6. The code to be executed is copied to M, but this copying
   does _not_ update the _code_ cache!!!
7. Since CreateProc() doesn't clear the code cache, it will
   read the _old_ code from process P1 when entering the
   fake seglist.
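
The seven steps can be replayed on a toy model. This is a sketch under
loud assumptions: a direct-mapped instruction cache with one word per
line, data writes that go straight to memory, and no snooping of those
writes by the instruction cache. All names are hypothetical, and the
model does not reproduce any real 680x0 cache organization.

```c
#include <stdint.h>
#include <string.h>

#define MEM_WORDS   64
#define CACHE_LINES 16   /* direct-mapped, one word per line */

struct toy_cpu {
    uint16_t mem[MEM_WORDS];     /* main memory                          */
    uint16_t line[CACHE_LINES];  /* cached instruction words             */
    int      tag[CACHE_LINES];   /* address held by a line, -1 if empty  */
};

/* Invalidate every instruction cache line. */
void cpu_clear_icache(struct toy_cpu *c)
{
    int i;
    for (i = 0; i < CACHE_LINES; i++)
        c->tag[i] = -1;
}

void cpu_init(struct toy_cpu *c)
{
    memset(c->mem, 0, sizeof c->mem);
    memset(c->line, 0, sizeof c->line);
    cpu_clear_icache(c);
}

/* Data write (what a memory copy does): memory changes, but the
 * instruction cache is not told about it. */
void cpu_store(struct toy_cpu *c, int addr, uint16_t v)
{
    c->mem[addr] = v;
}

/* Instruction fetch: a hit returns the cached word, a miss fills the
 * line from memory first. */
uint16_t cpu_fetch(struct toy_cpu *c, int addr)
{
    int i = addr % CACHE_LINES;
    if (c->tag[i] != addr) {
        c->line[i] = c->mem[addr];
        c->tag[i] = addr;
    }
    return c->line[i];
}
```

Store 0x4E75 at some address and fetch it (steps 1 and 2), then store
0x4EF9 at the same address (step 6). The next fetch still returns
0x4E75 (step 7). Only after cpu_clear_icache() does the fetch see
0x4EF9.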

> It is NOT because it breaks on CPUs with caches,

It does break on CPUs with _separate_ code and data caches.

Ralph

rbabel@babylon.rmt.sub.org (Ralph Babel) (03/13/91)

In article <752@tnc.UUCP>, m0154@tnc.UUCP (GUY GARNETT)
writes:

> In the example code from AmigaMail, the 'code' is a single
> instruction which is built into a data structure. When it
> is branched to, the code cache will miss (that location
> has never been executed before;

How do you know? The same memory region might have been part
of a code hunk - and FreeMem() doesn't clear the code cache.

> The method is also always safe if you invalidate the cache
> right after you put a value in CodeHdr.Function;

With the 2.0 cache support functions it's certainly a lot
safer (and easier) than before, but just wait and see how
many programs will fail nevertheless with the 68040's
copyback cache enabled.

Ralph

m0154@tnc.UUCP (GUY GARNETT) (03/13/91)

I don't think that the examples described previously are
self-modifying code (at least, not from the point of view of a 680x0
family processor).  The reason: Self modifying code (the kind which
causes programs to break) is where some values got into the code
cache, and then were changed (by whatever agency or method; it's not
important).  When the program goes back to re-execute the instruction
(it's probably inside some kind of loop), if the cache hits (probably)
then we execute the old (before change) instruction (on a miss,
everything works fine).  In the example code from AmigaMail, the
'code' is a single instruction which is built into a data structure. 
When it is branched to, the code cache will miss (that location has
never been executed before; yes, the value is in the DATA cache, but
the two are separate entities).

Therefore, the method should be safe, as long as each separate task to
be launched has its own CodeHdr.  If you try to re-use a CodeHdr (by
stuffing a new value in CodeHdr.Function) then it *IS* self-modifying
code, and may fail.  The method is also always safe if you invalidate
the cache right after you put a value in CodeHdr.Function; this will
work no matter how many times you try to re-use the structure (of
course, trying to re-use CodeHdr will probably have other side-effects
as well).

Wildstar

cg@ami-cg.UUCP (Chris Gray) (03/13/91)

In article <06545.AA06545@babylon.rmt.sub.org> rbabel@babylon.rmt.sub.org (Ralp
>In article <cg.7124@ami-cg.UUCP>, cg@ami-cg.UUCP (Chris
>Gray) writes:
>
>> Ralph is saying that the technique of building a SegList
>> structure at run-time and then using it with CreateProc is
>> self-modifying code. I won't argue the definition (I don't
>> think it is, I think it is dynamically created code).
>
>"dynamically-created code"? Chris, I guess you've been doing
>too much compiler development lately! :-)

Naw - working full-time on a MUD for the Amiga - all interpreted.

>> The technique does not break on systems with caches, [...]
>
>I think it does. Here's the scenario:
>
>1. Process P1 executes its code in memory chunk M;
>2. Parts of chunk M will be loaded into the code cache;
>3. Process P1 terminates;
>4. The memory area occupied by P1's code segment (including
>   chunk M) is released to the free-memory pool;
>5. Process P2 allocates a fake seglist; let's assume this
>   seglist happens to be located in chunk M;
>6. The code to be executed is copied to M, but this copying
>   does _not_ update the _code_ cache!!!
>7. Since CreateProc() doesn't clear the code cache, it will
>   read the _old_ code from process P1 when entering the
>   fake seglist.
>
>> It is NOT because it breaks on CPUs with caches,
>
>It does break on CPUs with _separate_ code and data caches.

Ok, you win - I wuz wrong. Sigh. In reality I would guess that the I-cache
is flushed by CreateProc and/or LoadSeg, so it wouldn't matter, but I agree
we are talking principles here.

--
Chris Gray   alberta!ami-cg!cg	 or   cg%ami-cg@scapa.cs.UAlberta.CA

rbabel@babylon.rmt.sub.org (Ralph Babel) (03/14/91)

In article <19852@cbmvax.commodore.com>,
jesup@cbmvax.commodore.com (Randell Jesup) writes:

> If you build code yourself, a quick call to CacheClearE
> with CACRF_ClearI will solve your day.

With the 68040's copyback cache enabled, it won't. You have
to push the data cache to memory as well.
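
A toy model of that ordering problem, as a sketch only. It assumes a
copyback data cache that holds stores until they are pushed, and an
instruction cache that fills from main memory without looking into the
data cache. The names are hypothetical and the one-entry-per-address
layout is a simplification, not the 68040's actual organization.

```c
#include <stdint.h>
#include <string.h>

#define MEM_WORDS 64

struct toy040 {
    uint16_t mem[MEM_WORDS];     /* main memory                        */
    uint16_t dval[MEM_WORDS];    /* data cache contents                */
    int      ddirty[MEM_WORDS];  /* 1 = stored but not yet in memory   */
    uint16_t ival[MEM_WORDS];    /* instruction cache contents         */
    int      ivalid[MEM_WORDS];  /* 1 = instruction cache entry valid  */
};

void t040_init(struct toy040 *c)
{
    memset(c, 0, sizeof *c);
}

/* Copyback store: the word stays in the data cache; memory is stale. */
void t040_store(struct toy040 *c, int addr, uint16_t v)
{
    c->dval[addr] = v;
    c->ddirty[addr] = 1;
}

/* Instruction fetch: a miss fills from main memory, never from the
 * data cache. */
uint16_t t040_fetch(struct toy040 *c, int addr)
{
    if (!c->ivalid[addr]) {
        c->ival[addr] = c->mem[addr];
        c->ivalid[addr] = 1;
    }
    return c->ival[addr];
}

/* Invalidate the instruction cache only. */
void t040_clear_icache(struct toy040 *c)
{
    memset(c->ivalid, 0, sizeof c->ivalid);
}

/* Push every dirty data cache entry out to main memory. */
void t040_push_dcache(struct toy040 *c)
{
    int a;
    for (a = 0; a < MEM_WORDS; a++) {
        if (c->ddirty[a]) {
            c->mem[a] = c->dval[a];
            c->ddirty[a] = 0;
        }
    }
}
```

In this model, clearing the instruction cache right after a store
still fetches the old word, because the new one has not reached memory.
Pushing the data cache first and then clearing the instruction cache
gives the expected result.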

Ralph

jesup@cbmvax.commodore.com (Randell Jesup) (03/14/91)

In article <cg.7176@ami-cg.UUCP> cg@ami-cg.UUCP (Chris Gray) writes:
>In article <06545.AA06545@babylon.rmt.sub.org> rbabel@babylon.rmt.sub.org (Ralp
>>"dynamically-created code"? Chris, I guess you've been doing
>>too much compiler development lately! :-)
>
>Naw - working full-time on a MUD for the Amiga - all interpreted.

	Hmmmm.

>>It does break on CPUs with _separate_ code and data caches.
>
>Ok, you win - I wuz wrong. Sigh. In reality I would guess that the I-cache
>is flushed by CreateProc and/or LoadSeg, so it wouldn't matter, but I agree
>we are talking principles here.

	In 2.0, LoadSeg flushes the ICache after relocation, and CreateNewProc
flushes if it has to build any code for you (NP_Entry).  If you build code
yourself, a quick call to CacheClearE with CACRF_ClearI will solve your day.
It even takes a start location and length.

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)

peter@sugar.hackercorp.com (Peter da Silva) (03/14/91)

In article <cg.7124@ami-cg.UUCP> cg@ami-cg.UUCP (Chris Gray) writes:
> The technique does not break on systems with caches, whereas the usual forms
> of self-modifying code will.

Sure. What if your newly-allocated seglist just happens to be sitting in a
chunk of memory that an unloaded segment of code just vacated?
-- 
Peter da Silva.   `-_-'
<peter@sugar.hackercorp.com>.

dillon@overload.Berkeley.CA.US (Matthew Dillon) (03/17/91)

In article <1991Mar14.203146.20908@Sandelman.OCUnix.On.Ca> mcr@Sandelman.OCUnix.On.Ca (Michael Richardson) writes:
>In article <06493.AA06493@babylon.rmt.sub.org> cbmvax.commodore.com!cbmehq!babylon!rbabel (Ralph Babel) writes:
>>In article <1991Mar9.170859.4810@Sandelman.OCUnix.On.Ca>,
>>mcr@Sandelman.OCUnix.On.Ca (Michael Richardson) writes:
>>...
>  No dispute about it being not exactly kosher. I had assumed that the
>...

    I generally write a separate .A (assembly) file which BEGINS with
    the dummy segment... you get longword alignment AUTOMATICALLY because
    ALL OBJECT MODULES ARE LONGWORD ALIGNED!

    There is no need for CNOP or any other possibly screwy opcode (many
    assemblers will insert 0's instead of NOP's to do the alignment)

					    -Matt

--

    Matthew Dillon	    dillon@Overload.Berkeley.CA.US
    891 Regal Rd.	    uunet.uu.net!overload!dillon
    Berkeley, Ca. 94708
    USA