[comp.sys.ibm.pc] Converting .EXE to .COM

bright@Data-IO.COM (Walter Bright) (01/17/89)

In article <16473@iuvax.cs.indiana.edu> bobmon@iuvax.cs.indiana.edu (RAMontante) writes:
>Help, somebody.  Occasionally I compile a program in TurboC's Tiny
>model, for subsequent conversion from a .exe to a .com file.  Then I use
>'exe2bin' or a similar program to do the conversion.
>The thing is, sometimes this conversion fails with a message sort of
>like "Program has relocatable segments, can't convert."  Nowhere in any
>of the documentation I have, can I find a statement of what this means,
>or what exe2bin/exe2com are looking for.  Or what they're actually
>doing, for that matter.

The trick to understanding .COM files is knowing how they're loaded and
run by DOS:

	.EXE file:
	1. All memory is allocated to the program.
	2. There is a header at the beginning of the file which
	   gives the load image size, the location and size of the
	   relocation table, and the initial SS, SP, CS and IP.
	3. The PSP is built for it.
	4. The load image portion of the EXE is copied into memory
	   after the PSP.
	5. The relocation table is read from the EXE file, and relocation
	   is performed. The table consists of a sequence of 4 byte
	   offset/segment pairs which are addresses within the EXE load
	   image. To each word an entry points to is added the segment
	   at which the load image starts (the PSP segment + 10h).
	6. DS and ES are set to the PSP segment.
	7. SS, SP, CS and IP are set to values specified by the header
	   (SS and CS have the load image's start segment added),
	   and the program is jumped to.
	.COM file:
	1. All memory is allocated to the program.
	2. The PSP (Program Segment Prefix) is built for it.
	3. The COM file must be < (64k - 100h) in length, since the
	   PSP and the file image share a single 64k segment.
	4. The COM file is read into memory following the PSP, i.e.
	   at offset 100h in the PSP segment.
	5. SP is set to the end of the program segment, and a word
	   of 0's is placed at the top of the stack.
	6. ES,DS,SS,CS are all set to the segment of the PSP, IP is
	   set to 100h, and execution starts.
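
For anyone who wants to poke at the steps above without writing a real
loader, here is a rough Python sketch of the two load sequences. It is an
illustration only, not DOS's actual code. The function names
(parse_mz_header, relocate, load_com) and the header field labels are made
up for this example. It assumes the load image sits directly after the PSP,
so the value added during relocation is the PSP segment + 10h, and load_com
assumes a full 64k is free, so SP ends up at FFFEh.

```python
import struct

PARAGRAPH = 16  # bytes per 8086 paragraph


def parse_mz_header(exe: bytes) -> dict:
    """Decode the 14 fixed words at the start of an .EXE file."""
    names = ("magic", "last_page_bytes", "pages", "reloc_count",
             "header_paragraphs", "min_alloc", "max_alloc", "ss", "sp",
             "checksum", "ip", "cs", "reloc_offset", "overlay")
    header = dict(zip(names, struct.unpack_from("<14H", exe, 0)))
    if header["magic"] != 0x5A4D:  # the bytes "MZ" as a little-endian word
        raise ValueError("not an MZ executable")
    return header


def load_image(exe: bytes, header: dict) -> bytearray:
    """EXE step 4: everything after the header, up to the stated file size."""
    size = header["pages"] * 512
    if header["last_page_bytes"]:
        size -= 512 - header["last_page_bytes"]
    return bytearray(exe[header["header_paragraphs"] * PARAGRAPH:size])


def relocate(exe: bytes, psp_segment: int):
    """EXE steps 5-7: patch segment references, then set up registers."""
    header = parse_mz_header(exe)
    image = load_image(exe, header)
    start_segment = (psp_segment + 0x10) & 0xFFFF  # image follows the PSP
    for i in range(header["reloc_count"]):
        # Each table entry is an offset word followed by a segment word.
        offset, segment = struct.unpack_from(
            "<HH", exe, header["reloc_offset"] + 4 * i)
        pos = segment * PARAGRAPH + offset
        (word,) = struct.unpack_from("<H", image, pos)
        struct.pack_into("<H", image, pos, (word + start_segment) & 0xFFFF)
    registers = {
        "DS": psp_segment, "ES": psp_segment,
        "SS": (header["ss"] + start_segment) & 0xFFFF, "SP": header["sp"],
        "CS": (header["cs"] + start_segment) & 0xFFFF, "IP": header["ip"],
    }
    return image, registers


def load_com(com: bytes, psp_segment: int):
    """COM steps: one 64k segment with the file image at offset 100h."""
    if len(com) >= 0x10000 - 0x100:
        raise ValueError("too large for a .COM file")
    segment = bytearray(0x10000)       # the PSP occupies the first 100h bytes
    segment[0x100:0x100 + len(com)] = com
    sp = 0xFFFE                        # assumption: a full 64k is available
    segment[sp:sp + 2] = b"\x00\x00"   # the word of zeros at the stack top
    registers = dict.fromkeys(("CS", "DS", "ES", "SS"), psp_segment)
    registers.update(IP=0x100, SP=sp)
    return segment, registers
```

Given the bytes of an .EXE, relocate(exe, 0x1000) returns the patched image
and the registers a loader would set up for a PSP at segment 1000h;
load_com(com, 0x1000) does the same for a .COM image.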

Interesting points:
	1. There is no relocation table in .COM files, so if EXE2BIN
	   tries to convert a .EXE file with a relocation table in
	   it, it fails. Also, for EXE2BIN to produce a .COM file, the
	   IP specification in the header must be 100h.
	2. Relocation table entries are generated whenever a program
	   explicitly references a segment value, as in a far function
	   call, or in a statement like:
		MOV AX,DGROUP	;load segment of DGROUP
	3. Contrary to popular belief, .COM files are not *required* to
	   have SS=DS=CS. They're only required to *start out* that way.
	   Zortech C exploits this fact to enable .COM programs it
	   generates to have DS != CS. Thus, the only restriction
	   is that ((sizeof code) + (sizeof static data) < 64k). The
	   near heap can expand to (64k - (sizeof static data)),
	   regardless of the size of the code portion. Some other
	   compilers restrict the near heap size for COM programs
	   to (64k - (sizeof COM file)).
	4. There are .COM files under OS/2. Closer examination, however,
	   reveals that they're just .EXE files with the extension
	   renamed! ARRRRGGGHHH! Belated point: why didn't Microsoft
	   invent a new extension (like, say, .PRG) for OS/2 only
	   executables? It would make life much easier, since you
	   could do a dir and see which programs were OS/2 only.
	   You could then have OS/2 and MS-DOS versions of the same
	   executable existing in the same directory. Also, by
	   definition, OS/2 programs couldn't be run under MS-DOS 'cuz
	   the loader wouldn't recognize the .PRG extension!
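
Points 1 and 2 boil down to a few header checks. The Python sketch below
shows roughly what a converter has to verify before it can strip the header.
It is not the actual EXE2BIN or exe2com algorithm: exe_to_com is a
hypothetical name, and the stack check (SS:SP both zero) is my assumption
about what such tools typically insist on.

```python
import struct


def exe_to_com(exe: bytes) -> bytes:
    """Return the .COM image for a tiny-model .EXE, or raise ValueError."""
    (magic, last_page_bytes, pages, reloc_count, header_paragraphs,
     _min_alloc, _max_alloc, ss, sp, _checksum, ip, cs,
     _reloc_offset, _overlay) = struct.unpack_from("<14H", exe, 0)
    if magic != 0x5A4D:  # the bytes "MZ" as a little-endian word
        raise ValueError("not an MZ executable")
    if reloc_count:
        # Any explicit segment reference (far call, MOV AX,DGROUP, ...)
        # leaves an entry here, and a .COM file has nowhere to keep it.
        raise ValueError("program has relocatable segments, can't convert")
    if (cs, ip) != (0, 0x100):
        raise ValueError("entry point must be at offset 100h")
    if (ss, sp) != (0, 0):
        # Assumption: a convertible program defines no stack of its own.
        raise ValueError("program must not define a stack segment")
    size = pages * 512
    if last_page_bytes:
        size -= 512 - last_page_bytes
    image = exe[header_paragraphs * 16:size]
    com = image[0x100:]  # drop the 100h bytes reserved for the PSP
    if len(com) >= 0x10000 - 0x100:
        raise ValueError("too large for a .COM file")
    return com
```

A far call or a segment load anywhere in the program, including in a library
routine you linked in without realizing it, is enough to trip the
reloc_count test.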

For further reference see the MS-DOS Tech Ref. It's also interesting
if you can obtain source code for a program loader.

** Turn up the volume, and rip off the knob! **

bobmon@iuvax.cs.indiana.edu (RAMontante) (01/17/89)

bright@dataio.Data-IO.COM (Walter Bright), in <1827@dataio.Data-IO.COM>:
	[ much good information, including... ]
-
-[.EXE files]
-	6. DS and ES are set to the PSP segment.
-
-[.COM files]
-	6. ES,DS,SS,CS are all set to the segment of the PSP, IP is
-	   set to 100h, and execution starts.
-
-For further reference see the MS-DOS Tech Ref. It's also interesting
-if you can obtain source code for a program loader.

Are the points 6, above, guaranteed, then?  Does this Tech. Ref. talk
about stuff like that?  Where can those of us who don't talk to IBM find
such things?
--
"Intel architectures build character."  "Segments are for worms."  "Feh."

bright@Data-IO.COM (Walter Bright) (01/21/89)

In article <16532@iuvax.cs.indiana.edu> bobmon@iuvax.cs.indiana.edu (RAMontante) writes:
>bright@dataio.Data-IO.COM (Walter Bright), in <1827@dataio.Data-IO.COM>:
>-For further reference see the MS-DOS Tech Ref. It's also interesting
>-if you can obtain source code for a program loader.
>Does this Tech. Ref. talk
>about stuff like that?  Where can those of us who don't talk to IBM find
>such things?

I never talk to IBM. My source is "Disk Operating System, Technical
Reference", sold by IBM. I bought it along with PC-DOS. I've seen it
documented elsewhere, also. Just go to your local bookstore, which is
usually jammed with books on MS-DOS, and flip through them till you find
it. I've found Tower Books to have a particularly wide selection of such
books, as do university book stores.

The info is not a secret, and isn't that hard to come by.