[comp.sys.mac.misc] CODE 0

jimchow@nexus.bison.mb.ca (Jim Chow) (03/15/91)

I've been exploring around with ResEdit on some applications. I'm 
particularly interested in the CODE 0 part of applications. Can anyone 
tell me what is in it and whether it can be reconstructed if it has 
been corrupted or damaged? I know this seems like a general question 
but I would like to know what is typically located in the first few 
bytes of it. ie does it have a count of all the resources? is it a map 
for which resources to load first?
Any info on this is appreciated.

woody@nntp-server.caltech.edu (William Edward Woody) (03/18/91)

In article <waBuy2w163w@nexus.bison.mb.ca> jimchow@nexus.bison.mb.ca (Jim Chow) writes:
>I've been exploring around with ResEdit on some applications. I'm 
>particularly interested in the CODE 0 part of applications. Can anyone 
>tell me what is in it and whether it can be reconstructed if it has 
>been corrupted or damaged? I know this seems like a general question 
>but I would like to know what is typically located in the first few 
>bytes of it. ie does it have a count of all the resources? is it a map 
>for which resources to load first?
>Any info on this is appreciated.

This perhaps belongs strictly on comp.sys.mac.programmer, I suppose, but I
will answer it here.  Further comments or replies should probably be redirected
to comp.sys.mac.programmer exclusively.

A Macintosh application is made up of multiple code resources, identified as
CODE resource 1 through N (typically).  For the code segments to call each
other, something has to record where the entry points in each CODE resource
are.  That job falls to a fixed code resource which records whether a CODE
segment is loaded, where that segment is (if it is loaded), and which CODE
resource to load if it is not.  (Remember that the location of a CODE resource
is not fixed in any sense of the word; CODE resources, like all others, may be
located anywhere in memory.)

This 'jump table' is stored as CODE resource 0.

According to Inside Macintosh Vol. #2, Chapter 2: Segment Loader Routines,
Page 60, "The Jump Table", (sounds a lot like quoting from the Bible, doesn't
it?) the CODE resource 0 stores the length of the global static space (in
bytes), the length of the jump table itself, and the contents of the jump
table.

The format of the CODE resource 0 header (which contains the size of these
various structures) is:

	OFFSET	LENGTH	CONTENTS
	0	4	"Above A5 size"; size in bytes from location pointed
			to by register A5 to upper end of application static
			global space.
	4	4	"Below A5 size"; size in bytes of application globals
			plus QuickDraw globals
	8	4	Length of jump table in bytes
	12	4	Offset to jump table from location pointed to by A5
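
As a rough illustration of the header layout above, here is a minimal Python
sketch that unpacks the four fields.  It assumes the usual big-endian 68000
byte order and that the raw bytes of CODE 0 are already in hand; the name
parse_code0_header, the dictionary keys, and the sample values are made up
for this example.

```python
import struct

def parse_code0_header(data: bytes) -> dict:
    """Unpack the 16-byte CODE 0 header: four big-endian 32-bit fields."""
    above_a5, below_a5, jt_length, jt_offset = struct.unpack(">4I", data[:16])
    return {
        "above_a5_size": above_a5,       # bytes from A5 up to the top of the space
        "below_a5_size": below_a5,       # application globals plus QuickDraw globals
        "jump_table_length": jt_length,  # jump table size in bytes
        "jump_table_offset": jt_offset,  # offset of the jump table from A5
    }

# Hypothetical header: a 16-byte jump table placed 32 bytes above A5.
sample_header = struct.pack(">4I", 32 + 16, 0x200, 16, 32)
header = parse_code0_header(sample_header)
```

With the sample values, the above-A5 size comes out as the jump table offset
plus the jump table length.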

Note:  All global variables are referred to by an offset relative to register
A5.  This register is initialized to a space in the Application's heap and A5
is pointed to that space before the application is loaded.

Note:  The typical offset to the jump table from the location pointed to by A5
is said to be 32 bytes.  I don't know why this is the case (why 32???), but
it follows that the "Above A5 size" is 32 + the length of the jump table.  When
an application jumps from one CODE segment to another, it typically does so
with a 'JMP NN(A5)' instruction (that is, jump to the address given by adding
NN to A5).

What follows this (at offset 16 from the beginning of CODE resource 0) is the
jump table itself.  The entries in this jump table are 8 bytes long, and in
the CODE resource have the following format:

	e + 0	2	Offset of this CODE segment entry point from the
			beginning of the CODE segment this jump table entry
			describes
	e + 2	4	A machine language instruction which pushes the CODE
			segment number onto the stack.  (This is how the CODE
			segment is encoded in the jump table entry, and is
			usually a push immediate operand onto stack.)
	e + 6	2	The _LOADSEG trap instruction.
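
To make the on-disk entry format concrete, here is a small Python sketch that
walks the jump table inside a CODE 0 resource.  The opcode values are my
assumption from the 68000 encodings (0x3F3C for MOVE.W #imm,-(SP) and 0xA9F0
for the _LoadSeg trap) and are not spelled out in the post; the function name
and the sample resource are hypothetical.

```python
import struct

MOVE_W_IMM_PUSH = 0x3F3C  # assumed encoding of MOVE.W #imm,-(SP)
LOADSEG_TRAP = 0xA9F0     # assumed trap word for _LoadSeg

def parse_unloaded_entries(code0: bytes) -> list:
    """Return (segment number, entry offset) for each 8-byte jump table entry."""
    jt_length = struct.unpack_from(">I", code0, 8)[0]  # header field at offset 8
    entries = []
    for pos in range(16, 16 + jt_length, 8):           # table starts after the header
        offset, push_op, segment, trap = struct.unpack_from(">HHHH", code0, pos)
        if push_op != MOVE_W_IMM_PUSH or trap != LOADSEG_TRAP:
            raise ValueError(f"entry at byte {pos} is not in unloaded format")
        entries.append((segment, offset))
    return entries

# Hypothetical CODE 0 with two entries, one in segment 1 and one in segment 2.
sample_code0 = (
    struct.pack(">4I", 32 + 16, 0x200, 16, 32)
    + struct.pack(">HHHH", 0x0004, MOVE_W_IMM_PUSH, 1, LOADSEG_TRAP)
    + struct.pack(">HHHH", 0x0120, MOVE_W_IMM_PUSH, 2, LOADSEG_TRAP)
)
entries = parse_unloaded_entries(sample_code0)
```

A check like this is also a cheap way to spot a damaged CODE 0: any entry that
does not match the expected pattern raises an error.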

What the Macintosh does with this is to load the entire jump table into memory
at the location relative to A5 that was given in the CODE 0 header, and (when
the application is first started) execute the first entry in the jump table.

When you execute a CODE segment by jumping through the jump table (by jumping
to byte offset 2 from the front of the jump table entry), the instructions
there push the code segment number onto the stack and execute the _LOADSEG
trap.  This trap loads the code segment whose number is on the stack, patches
the jump table to show that the CODE segment is now in memory, and then jumps
to the entry point originally specified in that jump table entry.
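
The NN in 'JMP NN(A5)' can be worked out from the numbers already given: the
jump table offset from A5, 8 bytes per entry, and the 2-byte skip past the
offset word at the front of the entry.  A minimal sketch, assuming the typical
32-byte offset and zero-based entry numbering (the function name is made up):

```python
def jump_displacement(entry_index: int, jt_offset: int = 32) -> int:
    """A5-relative displacement used to jump through jump table entry N."""
    entry_size = 8   # each jump table entry is 8 bytes long
    skip_offset = 2  # jump lands 2 bytes in, past the entry-offset word
    return jt_offset + entry_size * entry_index + skip_offset

first = jump_displacement(0)   # 32 + 0 + 2
fourth = jump_displacement(3)  # 32 + 24 + 2
```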

When a CODE segment is loaded, the entries in the jump table change to the
following format:

	e + 0	2	CODE Segment number
	e + 2	6	A JUMP to absolute address (which jumps to the routine
			described by this jump table entry).

When a CODE segment is unloaded, the entries in the jump table are put back to
the format originally stored in the CODE resource 0.
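
The two entry formats and the switch between them can be sketched in Python as
below.  This only models the byte patterns, not the real Segment Loader; the
opcode values (0x4EF9 for JMP to an absolute long address, 0x3F3C and 0xA9F0 as
in the unloaded format) are my assumption from the 68000 encodings, and
segment_base is a hypothetical address standing for the point in the loaded
segment that the entry offset is measured from.

```python
import struct

JMP_ABS_LONG = 0x4EF9     # assumed encoding of JMP to an absolute long address
MOVE_W_IMM_PUSH = 0x3F3C  # assumed encoding of MOVE.W #imm,-(SP)
LOADSEG_TRAP = 0xA9F0     # assumed trap word for _LoadSeg

def mark_loaded(entry: bytes, segment_base: int) -> bytes:
    """Rewrite an unloaded entry as: segment number, then JMP to the routine."""
    offset, _push_op, segment, _trap = struct.unpack(">HHHH", entry)
    return struct.pack(">HHI", segment, JMP_ABS_LONG, segment_base + offset)

def mark_unloaded(entry: bytes, segment_base: int) -> bytes:
    """Restore a loaded entry to: offset, push segment number, _LoadSeg."""
    segment, _jmp_op, address = struct.unpack(">HHI", entry)
    return struct.pack(">HHHH", address - segment_base,
                       MOVE_W_IMM_PUSH, segment, LOADSEG_TRAP)

# Hypothetical entry: routine at offset 0x0120 in segment 2, loaded at 0x00104000.
on_disk = struct.pack(">HHHH", 0x0120, MOVE_W_IMM_PUSH, 2, LOADSEG_TRAP)
loaded = mark_loaded(on_disk, 0x00104000)
restored = mark_unloaded(loaded, 0x00104000)
```

Note that the round trip only works because the loader knows where the segment
was placed; nothing in the loaded entry itself records the original offset.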

				---

The problem with trying to reconstruct this is that the values in CODE 0
require some supernatural knowledge of the application itself.  You would need
to know how many bytes of global space the application used, and (if you
were trying to patch the rest of the CODE 0 resource) you would need to know
the entry point offsets for all the routines in the rest of the application.

I suppose you could divine this by scanning the entire application for
references that use relative addressing off of register A5, but in some cases
(where an array is stored in the global space), simply looking for MOVE -NN(A5)
instructions just won't work, as the application may do a LEA -NN(A5)
(moving the address of the global in the global static space somewhere else),
and then use that address to compute array entry locations.
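
For what it's worth, the naive scan might look something like the Python sketch
below.  The bit masks are my own reading of the 68000 opcode layout (a MOVE
whose source is a 16-bit displacement off A5, and LEA with the same addressing
mode), not something from Inside Macintosh, and the function name is made up.
It will also match data words that merely look like these opcodes, and it says
nothing about how far past a LEA'd address the code goes on to index.

```python
import struct

def scan_a5_displacements(code: bytes) -> list:
    """Collect 16-bit displacements from words that look like A5-relative
    MOVE (source operand) or LEA instructions.  Heuristic only."""
    found = []
    for pos in range(0, len(code) - 3, 2):       # 68000 opcodes are word aligned
        opcode, disp = struct.unpack_from(">Hh", code, pos)
        is_lea = (opcode & 0xF1FF) == 0x41ED     # LEA d16(A5),An (assumed mask)
        is_move_src = ((opcode & 0xC03F) == 0x002D   # source mode is d16(A5)
                       and (opcode & 0x3000) != 0)   # size bits mark a MOVE
        if is_lea or is_move_src:
            found.append(disp)
    return found

# Hypothetical code fragment: MOVE.W -4(A5),D0 / LEA -256(A5),A0 / RTS
fragment = bytes.fromhex("302DFFFC" "41EDFF00" "4E75")
displacements = scan_a5_displacements(fragment)
lowest = min(displacements)  # most negative reference seen below A5
```

The most negative displacement gives at best a lower bound on the size of the
global space, which is exactly why the LEA case is a problem.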

This would require an awful lot of skill and a fair amount of luck to make
work.

						-- Bill

P.S.  One of the methods I have recommended for an application to disable itself
(for example, a beta application with a time-bomb inside itself) is to
delete CODE resource 0 or to overwrite the header of CODE resource 0, and
then immediately do an _ExitToShell (to quit the application).  This works
real well, so long as the _ExitToShell is executed immediately following the
application's internal surgery, and may even work if the application simply
continues without exiting first, as the contents of the CODE resource are
already in memory.

-- 
	William Edward Woody		   | Disclamer:
USNAIL	P.O.Box 50986; Pasadena, CA 91115  |
EMAIL	woody@tybalt.caltech.edu	   | The useful stuff in this message
ICBM	34 08' 44''N x 118 08' 41''W	   | was only line noise. 

Jim.Spencer@p510.f22.n282.z1.edgar.mn.org (Jim Spencer) (03/18/91)

Jim Chow writes in a message to All

JC> I've been exploring around with ResEdit on some applications. 
JC> I'm particularly interested in the CODE 0 part of applications. 
JC> Can anyone tell me what is in it and whether it can be reconstructed 
JC> if it has been corrupted or damaged? I know this seems like a 
JC> general question but I would like to know what is typically located 
JC> in the first few bytes of it. ie does it have a count of all 
JC> the resources? is it a map for which resources to load first? 
JC> Any info on this is appreciated. 

CODE ID=0 is the application's jump table.  It consists of a series of entries, one for each routine that can be called from code in another segment.  On the disk, the entries contain the code to load the segment in which the routine resides plus the offset of the routine from the beginning of the segment.  When an application is launched, CODE 0 is brought into memory and a jump is done to the first entry, which should contain the code necessary to load and jump to the entry point of the application.  When a segment is loaded, the entries in the memory resident version of the jump table are changed to be jump instructions that go straight to the routines.  When the segment is unloaded, the jump table entries are changed back to the load segment instructions.

I go into this detail in order to explain that there really isn't any way to rebuild a damaged CODE 0 resource.  You would have to know exactly what routine each entry pointed to and know that routine's segment and offset.  Even the author of the program doesn't know this: it's handled by the Linker during development.