[net.micro.pc] .COM files

ali@bradley.UUCP (08/26/84)
#N:bradley:900004:000:3796
bradley!ali    Aug 12 23:16:00 1984

[]

    I'm not sure how familiar you are with the structure of COM files, but
basically a COM file is a memory image of executable code, with or without
data fields. The loader portion of DOS performs the following before
giving control to the module:
  [] A Program Segment Prefix is built and the appropriate parts of that
     prefix are initialized (such as FCB1, FCB2, and the INT 20H instruction
     at offset 0; this is covered in detail starting on page E-3 of the DOS
     manual).
  [] All the general purpose registers are set to zero, and all the segment
     registers are set to the segment address of the PSP (Program Segment
     Prefix).
  [] The instruction pointer (IP register) is set to 100H.
  [] The COM file is read in at location 100H (this is a straight load
     without any modification, editing or relocation).
  [] A word of zero is pushed on the stack (since SP is zero from step 2,
     the zero word actually ends up at the very end of the segment, at
     offset FFFEH. The zero word serves as a return address to location
     zero, which contains the instruction INT 20H).
  [] Finally, control is transferred through a long jump to PSP:100H
     (this is how CS and IP are set).
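
    To make the sequence above concrete, here is a rough Python model of
it. It is only a sketch that follows the description in this post (one
64K segment, SP starting at zero); the names load_com and near_ret are
made up for illustration and are not part of DOS.

```python
# Sketch of the COM load sequence, modelled on a 64K bytearray.
# load_com and near_ret are hypothetical helper names, not DOS services.

SEGMENT_SIZE = 0x10000   # one 64K segment
LOAD_OFFSET = 0x100      # the PSP occupies offsets 0..0FFH


def load_com(image: bytes):
    """Return (memory, registers) as they would look just before the jump."""
    if len(image) > SEGMENT_SIZE - LOAD_OFFSET - 2:
        raise ValueError("image leaves no room for the stack word")
    mem = bytearray(SEGMENT_SIZE)
    mem[0:2] = b"\xCD\x20"                             # INT 20H at PSP offset 0
    mem[LOAD_OFFSET:LOAD_OFFSET + len(image)] = image  # straight copy, no relocation
    sp = (0 - 2) & 0xFFFF                              # push from SP=0 wraps to FFFEH
    mem[sp:sp + 2] = b"\x00\x00"                       # return address 0000H
    return mem, {"IP": LOAD_OFFSET, "SP": sp}


def near_ret(mem, regs):
    """Model a near RET: pop a 16-bit return address into IP."""
    sp = regs["SP"]
    regs["IP"] = mem[sp] | (mem[sp + 1] << 8)
    regs["SP"] = (sp + 2) & 0xFFFF


mem, regs = load_com(bytes([0xC3]))   # a one-byte program: RET
near_ret(mem, regs)                   # IP is now 0, where INT 20H sits
```

    Running the one-byte program through near_ret shows why a plain RET
is enough to terminate a COM program: the popped zero word sends control
to the INT 20H at the start of the PSP.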

    It is very common to make the first three bytes in a COM file a jump
to the program start address. This is done by the programmer, not by the
loader or any other part of the system, usually so that data items can be
placed before they are referenced in the code; that way the assembler
knows the size of each data item and can generate the correct code.
    Creating COM files is restricted by the following:

  1) The program may not contain any DATA segments (you can change DS
     later in your code to point to anything you want, but the header of
     the EXE file produced by the linker must not contain a DATA segment).
  2) There must not be a stack segment (this is a trivial restriction, since
     a stack will be set up by the loader, and there is really no need for
     the program to have a separate stack unless the program is very large,
     in which case you can relocate the stack during the initialization part
     of your program).
  3) The entry point must be at 100H. If you did not specify an entry point
     using the
             END     entry-point
     statement in your program, the linker assumes an entry point at CS:0
     and sets the entry point address in the EXE file header to that, so
     exe2bin will perform a simple conversion to produce a COM file. If, on
     the other hand, you did specify an entry point, then you had better use
     the
             ORG     100H
     statement to move the program up, since exe2bin will delete the first
     100H locations while converting this EXE file to a COM one, and they
     had better not be part of your program.
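
    The three restrictions can be sketched as a small Python model of
the conversion. This is not the real exe2bin, only an approximation of
the rules listed above; the function name exe_to_com is made up, and the
header layout assumed is the standard 'MZ' EXE header (image size in
512-byte pages, header size in paragraphs, SS:SP, and CS:IP fields).

```python
import struct


def exe_to_com(exe: bytes) -> bytes:
    """Rough model of the conversion rules above (hypothetical helper)."""
    # First 24 bytes of the EXE header: signature plus eleven 16-bit words.
    (magic, last_page, pages, relocs, hdr_paras,
     _min_alloc, _max_alloc, ss, sp, _checksum, ip, cs) = struct.unpack_from(
        "<2s11H", exe, 0)
    if magic != b"MZ":
        raise ValueError("not an EXE file")
    if relocs:
        raise ValueError("segment fixups present: cannot convert")
    if ss or sp:
        raise ValueError("stack segment present: cannot convert")
    # Total file length: whole pages, minus the unused tail of the last page.
    size = pages * 512 - ((512 - last_page) if last_page else 0)
    image = exe[hdr_paras * 16:size]        # strip the EXE header
    if (cs, ip) == (0, 0):
        return image                        # no entry point: plain memory image
    if (cs, ip) == (0, 0x100):
        return image[0x100:]                # ORG 100H program: drop first 100H bytes
    raise ValueError("entry point must be 0000:0000 or 0000:0100")
```

    The last branch is the reason for the ORG 100H advice: when the entry
point is at 100H, the first 100H bytes of the image are thrown away, so
they should be empty space rather than code or data.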

    In any case try something close to the following format:

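    CODE     SEGMENT
             ASSUME  CS:CODE
             ORG     100H            ; leave room for the 100H-byte PSP

    BEGIN    PROC    NEAR
             JMP     INIT            ; skip over the data that follows
    BEGIN    ENDP

    MSG      DB      'Hello world.',0DH,0AH,24H   ; CR, LF, '$' terminator

    INIT     PROC    NEAR
             MOV     DX,OFFSET MSG   ; DS:DX -> string (DS = PSP segment)
             MOV     AH,9            ; DOS function 9: print string
             INT     21H
             RET                     ; pops the zero word, reaching INT 20H
    INIT     ENDP

    CODE     ENDS
             END     BEGIN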

   Of course, you can omit the PROC/ENDP stuff and use labels followed by
colons. You can also name the segment so that it will be combined with other
segments of the same name in a separately assembled program, but I suggest
you make it relocatable on a paragraph boundary:

    CODE    SEGMENT PARA PUBLIC 'name'
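
    The Hello world template shown earlier is small enough to assemble by
hand, which makes the "memory image" idea easy to see. The Python sketch
below builds the bytes a COM file for it could contain. It assumes the
three-byte near form of the jump (opcode E9H); an assembler may instead
emit a short jump plus a filler byte, which occupies the same three bytes.

```python
# Hand-assembled bytes for the template program; offsets assume ORG 100H.
ORG = 0x100

msg = b"Hello world.\r\n$"            # 0DH, 0AH, then 24H ('$') ends the string
msg_offset = ORG + 3                  # data sits right after the 3-byte jump
init_offset = msg_offset + len(msg)   # INIT follows the message

# E9 takes a 16-bit displacement measured from the next instruction.
disp = init_offset - (ORG + 3)
jump = bytes([0xE9]) + disp.to_bytes(2, "little")

init = (bytes([0xBA]) + msg_offset.to_bytes(2, "little")  # MOV DX,OFFSET MSG
        + bytes([0xB4, 0x09])                             # MOV AH,9
        + bytes([0xCD, 0x21])                             # INT 21H
        + bytes([0xC3]))                                  # RET

image = jump + msg + init             # exactly what would be loaded at 100H
print(image.hex(" "))
```

    Writing image to a file with a .COM extension is all that "creating
a COM file" amounts to here, since there is no header and no relocation
information.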

   I hope this will help you write COM programs, and I'm sorry for the length
of this response; reading it back, I noticed that I have included a lot of
information that is already available in the DOS manual.

Good luck.

 Ali Ezzet    {ihnp4,uiucdcs}!bradley!ali