[comp.sys.cbm] Power-C Linker Format

rickc@pogo.GPID.TEK.COM (Rick Clements) (11/04/88)

When porting and writing code, I often run into a problem: Power-C doesn't
handle initializing very many types of structures and arrays.  The solution
I would like to try is a program that would handle initialized structures.
I could then link to the structures.  The only problem is that I don't know
the linker format.

I hear there is a public domain assembler that will link with Power-C, so
I hope someone has the format.  None of the local libraries carry Transactor;
if the format was covered there, could someone post a summary?
-- 
Rick Clements (RickC@pogo.GPID.TEK.COM)

prindle@NADC.ARPA (Frank Prindle) (11/07/88)

This is a repost of C-Power linker object format in answer to a recent request
for same:

Here is, as I recall it, the format of a C Power (AKA Power C) relocatable
object (i.e. a ".o" or ".obj") file; others, please correct me if this is
wrong, though I checked it out with the C Power assembler (ASSM) source code
and the C Power reverse assembler (RA) source code:

There are 5 distinct parts to the object file; each part begins with a 2-byte
count in standard 6502 low-byte/high-byte format.  The 5 parts directly follow
each other in the following order:

	1. relocatable object code
	2. relocation entries
	3. external definition entries
	4. external reference entries
	5. uninitialized data block entries
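
Every part opens with the same kind of 2-byte low/high count, so a reader
for this format can start from one small helper.  What follows is only a
sketch in Python; the name read_u16 is made up for illustration and does not
come from C Power, ASSM, or RA.

```python
import struct

def read_u16(buf, pos):
    """Return (value, new_pos) for the 2-byte low/high value at buf[pos]."""
    # "<H" is an unsigned 16-bit integer, low byte first (6502 order).
    (value,) = struct.unpack_from("<H", buf, pos)
    return value, pos + 2

# The bytes $34 $12 encode the count $1234.
count, pos = read_u16(bytes([0x34, 0x12]), 0)
```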

I will describe each part in detail:

1. relocatable object code:
	The first 2 bytes are a byte count of the object code to follow.  What
	follows is simply the generated object code for the corresponding source
	code file.  For those instructions and .byte or .word pseudo ops which
	reference relocatable addresses within a function (typically for local
	jumps), the value in the operand field is the offset relative to the
	first byte of object code.  For those instructions and .byte or .word
	pseudo ops which reference externally defined addresses, the value of
	the operand field is irrelevant and is typically filled in by the
	compiler with bytes which duplicate the byte which immediately precedes
	them.  This part ends when the number of bytes of object code specified
	in the count has been encountered.
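
As a rough illustration of part 1, the sketch below pulls the object code
out of a buffer holding an object file.  The function name and the sample
bytes are invented for the example; only the layout (2-byte count, then
that many code bytes) comes from the description above.

```python
def read_code_section(buf, pos=0):
    """Return (code_bytes, next_pos) for the object code part at buf[pos]."""
    size = buf[pos] | (buf[pos + 1] << 8)   # low byte first
    start = pos + 2
    end = start + size
    if end > len(buf):
        raise ValueError("object code runs past the end of the file")
    return buf[start:end], end

# Hypothetical file fragment: a 3-byte code part followed by the start of
# the relocation part.
sample = bytes([0x03, 0x00, 0x4C, 0x00, 0x00, 0x02, 0x00])
code, next_pos = read_code_section(sample)
```

Here next_pos is where the relocation part's count would begin.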

2. relocation entries:
	The first 2 bytes are a count of the number of relocation entries to
	follow.  Each relocation entry is exactly 2 bytes long and consists of
	an offset relative to the first byte of object code.  This offset
	actually points to the byte before a 2-byte address which is to have
	added to it the absolute address of the first byte of object code; that
	is, for 3-byte instructions, this offset points to the op-code preceding
	an address to be relocated; for 2-byte addresses without an op-code
	(e.g. .word pseudo ops), the offset points to the byte before the
	address to be relocated.  1-byte addresses to be relocated (e.g. >addr
	or <addr) are not handled by this relocation mechanism, but rather as
	pseudo extdefs/extrefs (see below).  This part ends when the number of
	2-byte relocation entries specified in the count has been encountered.
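
The relocation step described above can be sketched as follows.  This is
not the C Power linker's code, just one reading of the description: each
entry points one byte before a 2-byte address, and the load address of the
module is added to that address.  The function name and the load address
$1800 are made up for the example.

```python
def apply_relocations(code, reloc_offsets, base):
    """Add base to the 2-byte address that follows each relocation offset."""
    out = bytearray(code)
    for off in reloc_offsets:
        lo, hi = off + 1, off + 2          # the address starts one byte later
        addr = out[lo] | (out[hi] << 8)
        addr = (addr + base) & 0xFFFF      # stay within the 64K address space
        out[lo] = addr & 0xFF
        out[hi] = addr >> 8
    return bytes(out)

# JMP to offset 4 within the module, then NOP, NOP, RTS.
code = bytes([0x4C, 0x04, 0x00, 0xEA, 0x60])
fixed = apply_relocations(code, [0], 0x1800)
```

After the call, the JMP operand holds $1804 instead of the module-relative
offset $0004.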

3. external definition entries:
	The first 2 bytes are a count of the number of external definition
	entries to follow.  Each extdef entry is a variable number of bytes
	long.  First appears the externally defined name, terminated with a
	zero byte.  Next is a 1-byte flag; if this flag is 0, the externally
	defined symbol has an absolute value; if this flag is a 1, the external-
	ly defined symbol has a relocatable value relative to the first byte
	of object code.  Finally, the last 2 bytes of each entry are the
	absolute value of the (absolute) symbol, or the offset of the (reloc-
	atable) symbol.  Whenever the compiler must reference only the low or
	high byte address of a local piece of static data (e.g. a string
	literal), that datum is given a "pseudo" external definition; that is,
	the compiler makes up a name for it consisting of several randomly
	generated special characters and additional identifier characters,
	then treats it as if it were an external definition.  This is done so
	that it may be referenced by an external reference entry to follow.
	This part ends when the number of multi-byte extdef entries
	specified in the count has been encountered.
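
A reader for the extdef part might look like the sketch below.  The
function name and sample symbols are hypothetical, and decoding names as
Latin-1 is purely an assumption made so that every byte value maps to some
character; the description above does not say how names are encoded.

```python
def read_extdefs(buf, pos):
    """Return ([(name, is_relocatable, value), ...], next_pos)."""
    count = buf[pos] | (buf[pos + 1] << 8)
    pos += 2
    entries = []
    for _ in range(count):
        end = buf.index(0, pos)                  # name ends at a zero byte
        name = buf[pos:end].decode("latin-1")    # encoding is an assumption
        flag = buf[end + 1]                      # 0 = absolute, 1 = relocatable
        value = buf[end + 2] | (buf[end + 3] << 8)
        entries.append((name, flag == 1, value))
        pos = end + 4
    return entries, pos

# Two made-up entries: a relocatable symbol at offset 0 and an absolute
# symbol with the value $D000.
sample = (bytes([0x02, 0x00])
          + b"main\x00" + bytes([1, 0x00, 0x00])
          + b"vic\x00" + bytes([0, 0x00, 0xD0]))
defs, pos = read_extdefs(sample, 0)
```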

4. external reference entries:
	The first 2 bytes are a count of the number of external reference
	entries to follow.  Each extref entry is a variable number of bytes
	long.  First appears the externally referenced name, terminated with
	a zero byte.  Next is a 2-byte word in low/hi format; the low 2 bits
	of this word indicate if this external reference is to a full 2-byte
	address (flag=0), a single byte to contain the high byte of the address
	(flag=1), or a single byte to contain the low byte of the address (flag=
	2).  The upper 14 bits of this word are an offset into the external
	object.  Finally, the last 2 bytes of each entry are the offset of the
	external referencing instruction (points 1 byte before the external
	address reference itself) relative to the first byte of object code.
	In the case of references to "pseudo" external definitions, the
	reference will be resolved by the matching external definition, and
	the flag will always be either 1 or 2.  This part ends when the number
	of multi-byte extref entries specified in the count has been
	encountered.
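
To make the flag/offset word concrete, here is a sketch of how a linker
might decode it and patch the code.  It assumes that "upper 14 bits" means
the offset is the word shifted right by 2, and that single-byte references
are patched at the byte after the recorded location; both are one reading
of the description, and the function names are made up.

```python
def decode_extref_word(word):
    """Split the 2-byte word into (flag, offset into the external object)."""
    return word & 0x03, word >> 2

def patch_extref(code, location, flag, target):
    """Store target (symbol address plus offset) according to the flag."""
    out = bytearray(code)
    if flag == 0:                      # full 2-byte address
        out[location + 1] = target & 0xFF
        out[location + 2] = target >> 8
    elif flag == 1:                    # high byte only
        out[location + 1] = target >> 8
    elif flag == 2:                    # low byte only
        out[location + 1] = target & 0xFF
    else:
        raise ValueError("unknown extref flag")
    return bytes(out)

# LDA #>symbol+4, with the operand filled with a copy of the op-code byte.
flag, offset = decode_extref_word(0x0011)
symbol_address = 0x2000                # hypothetical resolved address
target = (symbol_address + offset) & 0xFFFF
patched = patch_extref(bytes([0xA9, 0xA9]), 0, flag, target)
```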

5. uninitialized data block entries:
	The first 2 bytes are a count of the number of data block entries
	to follow.  Each data block entry is a variable number of bytes
	long.  First appears the data block name, terminated with a zero
	byte.  Lastly is a 2-byte size, representing the number of data
	bytes to be reserved by the linker (and zeroed by the run-time
	initialization code) for that named data block.  These entries are
	used to represent uninitialized static or external data to prevent
	large object modules filled with nothing but zeros.  Since the
	data block names are effectively externally defined, dummy "pseudo"
	extdef names are again created when local static data is to be
	allocated as an uninitialized data block.  The purpose of these
	randomly generated dummy names is to clue the linker that these are
	not real external definitions, and to prevent external name conflict
	with identically named local data in other object modules.  This part
	ends when the number of multi-byte data block entries specified in the
	count has been encountered.  At this point the object file is at
	end-of-file.
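
The data block part is the simplest of the variable-length parts.  A
sketch of a reader, with a made-up function name and sample block, and
with the name encoding (Latin-1) again being only an assumption:

```python
def read_data_blocks(buf, pos):
    """Return ([(name, size), ...], next_pos) for the data block part."""
    count = buf[pos] | (buf[pos + 1] << 8)
    pos += 2
    blocks = []
    for _ in range(count):
        end = buf.index(0, pos)                  # name ends at a zero byte
        name = buf[pos:end].decode("latin-1")    # encoding is an assumption
        size = buf[end + 1] | (buf[end + 2] << 8)
        blocks.append((name, size))
        pos = end + 3
    return blocks, pos

# One hypothetical block named "buffer" reserving 256 bytes.
sample = bytes([0x01, 0x00]) + b"buffer\x00" + bytes([0x00, 0x01])
blocks, pos = read_data_blocks(sample, 0)
at_eof = pos == len(sample)    # this is the last part of the file
```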

I hope the above is a reasonably complete and useful description of C Power
object code.  Refer also to the Transactor (March 1988) article "The link
between C and assembly".  Also note that ASSM faithfully adheres to the above
format so that ASSM generated object files may directly be linked with those
generated by C Power; however, ASSM uses a different algorithm for the
generation of pseudo extdef names, attempting to make those names more
readable without sacrificing their uniqueness.
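
For the original question (getting initialized data into a linkable
module), the five parts could in principle be written by a small tool.
The sketch below is untested against the real linker: the function name is
hypothetical, and the name encoding, and whether the linker expects any
decoration on C symbol names, are open assumptions.  It emits the data as
the "object code", one relocatable extdef at offset 0, and empty
relocation, extref, and data block parts.

```python
import struct

def build_data_object(name, data):
    """Return the bytes of an object file exporting data under name."""
    out = bytearray()
    out += struct.pack("<H", len(data)) + data      # 1. object code
    out += struct.pack("<H", 0)                     # 2. no relocations
    out += struct.pack("<H", 1)                     # 3. one extdef
    out += name.encode("latin-1") + b"\x00"         #    name (encoding assumed)
    out += bytes([1]) + struct.pack("<H", 0)        #    relocatable, offset 0
    out += struct.pack("<H", 0)                     # 4. no extrefs
    out += struct.pack("<H", 0)                     # 5. no data blocks
    return bytes(out)

obj = build_data_object("table", bytes([1, 2, 3, 4]))
```

A table containing pointers would also need relocation or extref entries
for each address it holds, which this sketch leaves out.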

Sincerely,
Frank Prindle
Prindle@NADC.arpa

mat@emcard.UUCP (Mat Waites) (11/07/88)

In response to a request for C-power object file format, here is a
posting from Frank Prindle from a while back:

-------------------------

From gatech!prindle Wed May 11 14:54:31 EDT 1988
Status: RO

I hope the above is a reasonably complete and useful description of C Power
object code.  Refer also to the Transactor (March 1988) article "The link
between C and assembly", which is enlightening, though incomplete.

Also note that ASSM faithfully adheres to the above format so that ASSM
generated object files may directly be linked with those
generated by C Power; however, ASSM uses a different algorithm for the
generation of pseudo extdef names, attempting to make those names more
readable without sacrificing their uniqueness.

Sincerely,
Frank Prindle
Prindle@NADC.arpa

----------------------

Good Luck,
Mat

-- 
W Mat Waites
gatech!emcard!mat
8-5 EDT phone: (404) 727-7197