anton@postgres (Jeff Anton) (01/07/88)
Would some kind soul tell me or point me at a definition of the
format of .TOS, .TTS, and .GEM files? I'm working on porting the gnu
C compiler to generate atari objects. I'm not attempting to have the
atari run gnu cc. Please send responses to me by mail. Also, a pointer
to a way to avoid ever starting gem when booting would be nice.
				Jeff Anton
dag@chinet.UUCP (Daniel A. Glasser) (01/09/88)
In article <81@pasteur.Berkeley.Edu> anton@postgres.berkely.edu writes:
>Would some kind soul tell me or point me at a definition of the
>format of .TOS, .TTS, and .GEM files? I'm working on porting the gnu
>C compiler to generate atari objects. I'm not attempting to have the
>atari run gnu cc. Please send responses to me by mail. Also, a pointer
>to a way to avoid ever starting gem when booting would be nice.
>				Jeff Anton

Well, I did reply to Jeff by mail, though I'm unsure of the path that
was used. However, I believe that this will be of interest to the Atari
ST community on the net, so here is the information from the mail that
I sent to Jeff. I'd suggest filing it and reading it at your leisure.

[.......................... Cut Here .....................]

Now for the information at hand --

GEMDOS executable file format:

    The GEMDOS executable file consists of a header, text segment,
    data segment, symbol segment and relocation segment.

    Conventions used:
        USHORT  = 16 bits unsigned
        ULONG   = 32 bits unsigned
        LONG    = 32 bits signed

GEMDOS header:

    struct gemhdr {
        USHORT  gh_magic;       /* always == 0x601A        */
        ULONG   gh_tsize;       /* size of text segment    */
        ULONG   gh_dsize;       /* size of data segment    */
        ULONG   gh_bsize;       /* size of bss segment     */
        ULONG   gh_ssize;       /* size of symbol segment  */
        ULONG   gh_reserved[2]; /* 2 longs, "must" be 0    */
        USHORT  gh_reserve;     /* Reserved,    "    "     */
    };

    The text and data segments must have even length.
    The magic number resolves into a "BRA .+1C" instruction, that is,
    a branch over the 0x1C-byte header.

The TEXT segment:

    This section of the file starts immediately following the header
    in the file (offset 0x1C). All relocatable references must be
    long and word-aligned. The GEMDOS program loader adds the address
    of the beginning of the TEXT segment to all relocation references.
    [see relocation format, below]

    After the program is loaded, execution begins at the first word
    in this segment.

The DATA segment:

    This section of the file starts immediately following the TEXT segment
    (offset 0x1C+text_seg_size). Relocatables are TEXT segment based,
    and are relocated in the same manner and with the same restrictions
    as the TEXT segment ones.

The BSS segment:

    This is not stored in the file. It is allocated at runtime
    by GEMDOS (actually, by the runtime startup.)

The SYMBOL segment:

    This contains symbol table information for DRI linker style
    executables. For Mark Williams C style executables pre 3.0, this
    segment is empty. As of 3.0, this segment holds debug and symbol
    table information in Mark Williams format. Write or call Mark
    Williams for more information on this.

    The DRI symbol table format is:

    struct drisym {
        char    ds_name[8];     /* null padded ident.      */
        USHORT  ds_type;        /* symbol type flags.      */
        LONG    ds_value;       /* signed 32 bit value.    */
    };

    Type flags for the ds_type field:

        DEFINED         0x8000  defined symbol
        EQUATED         0x4000  equated symbol
        GLOBAL          0x2000  global symbol
        EQUATED_REG     0x1000  equated register
        EXTERNAL_REF    0x0800  external reference
        DATA_RELOCATE   0x0400  data based relocatable
        TEXT_RELOCATE   0x0200  text based relocatable
        BSS_RELOCATE    0x0100  bss based relocatable

RELOCATION (fixup) segment:

    The relocation segment begins with a longword that specifies the
    offset into the text segment of the first relocatable reference.
    This is followed by a stream of unsigned bytes, each of which
    specifies the distance to the next fixup. A zero byte flags the
    end of the relocation table; a value of 1 means add 254 to the
    location counter and fetch the next byte; any other even value is
    added to the location counter and the longword at that address is
    relocated. All other odd valued bytes are reserved for future use.

    If the initial longword is 0, there is no relocation information.

Loading and relocating:

    When GEMDOS loads a program file through Pexec() it goes through
    the following steps:

    The largest segment of free memory is allocated to the process
    and a prototype basepage is built at the beginning of it. The
    program file header is read and the basepage is filled in.
    The text and data segments are then read into memory. The symbol
    segment is skipped and the first longword of relocation
    information is read into the fixup location pointer. If this
    value is 0, the system proceeds with the action specified by the
    Pexec() mode. Otherwise, the address of the text segment is added
    to the fixup location pointer and the longword at that address
    has the address of the text segment base added to it. The loader
    then reads the relocation stream until it gets a 0 byte. If a
    byte is 1, 254 is added to the fixup location pointer and the
    next byte is read from the relocation stream; any other even
    non-zero byte is added to the fixup location pointer and the
    longword at that location has the base address of the text
    segment added to it. Once a zero byte is encountered in the
    relocation stream, Pexec() proceeds to either set up for
    execution (loading appropriate registers with appropriate values
    and then jumping to the first location in the text segment) or
    returns the basepage address to the caller.

Notes on fixup generation:

    o   All relocatable values must be word aligned longwords.

    o   Relocation references to text segment locations are
        zero-based. Data is loaded directly after the text segment,
        and relocation references to data segment locations are based
        off of the text segment base, thus a relocatable reference to
        the first data location would be the text segment size. BSS
        is "loaded" immediately following the data segment, thus BSS
        references are the offset into the bss segment added to the
        combined size of the data and text segments. This makes
        one-pass linking difficult.

Other comments:

    o   All ROM versions of GEMDOS released so far have a bug which
        prevents loading of programs with relocation segments > 32K
        bytes. Atari has confirmed this bug and says it will be fixed
        in the next version of GEMDOS. The date this was written is
        8-January-1988.

    o   You will need references to GEMDOS to understand the basepage
        and other GEMDOS issues.
        I recommend the Mark Williams C manual for much of this
        information.

    o   It is up to the runtime startup code, which must begin in the
        first byte of the text segment, to set up the program stack
        and free unneeded memory back to the system. Look at just
        about any vendor's runtime startup module source for how this
        works.

    o   Some documents have the header wrong, listing 3 reserved
        longwords at the end. There are only 2, plus a short.

Disclaimer and plea:

    I work for Mark Williams Company. I am responsible for much of
    the Mark Williams C compiler package for the Atari ST, and have
    made considerable contributions to the documentation for that
    package. Despite this, the material contained in this message is
    not a Mark Williams Company product. All opinions in this message
    are my own and have not been cleared by my employer. Therefore,
    the information contained in this message is presented without
    warranty. Any errors in content, grammar or spelling are my own.

    Please don't call Mark Williams Company about this information;
    send me mail at one of the addresses in my signature or write me
    at 6030 N. Kenmore Ave., Apt. 512, Chicago, IL, 60660.

Final words:

    I hope the above answers the questions...

--
                    Daniel A. Glasser
                    ...!ihnp4!chinet!dag
                    ...!ihnp4!mwc!dag
                    ...!ihnp4!mwc!gorgon!dag
    One of those things that goes "BUMP!!! (ouch!)" in the night.
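For readers who want to experiment on a cross-development host, the
header layout and fixup stream described in the posting above can be
sketched in C roughly as follows. This is only an illustration under
stated assumptions, not GEMDOS's actual loader code: the names
gem_sizes, gem_read_header and gem_relocate are made up for this
example, the file is assumed to be already in memory, and the text and
data segments are assumed to sit contiguously in one buffer. Fields
are assembled byte by byte because the file is big-endian and the host
may not be.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define GEM_HDR_LEN 0x1C        /* 2 + 4*4 + 2*4 + 2 bytes */

/* Hypothetical holder for the header's sizes, in host byte order. */
struct gem_sizes { uint32_t tsize, dsize, bsize, ssize; };

static uint32_t be32(const uint8_t *p)
{
    return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
           ((uint32_t)p[2] << 8)  |  (uint32_t)p[3];
}

static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24); p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);  p[3] = (uint8_t)v;
}

/* Decode the header at the start of `file`. Returns 0, or -1 if the
 * buffer is too short or the magic number is not 0x601A. */
int gem_read_header(const uint8_t *file, size_t len, struct gem_sizes *out)
{
    assert(file != NULL && out != NULL);
    if (len < GEM_HDR_LEN || file[0] != 0x60 || file[1] != 0x1A)
        return -1;
    out->tsize = be32(file + 2);
    out->dsize = be32(file + 6);
    out->bsize = be32(file + 10);
    out->ssize = be32(file + 14);
    return 0;
}

/* Apply the fixup stream `rel` to `image` (text followed by data) as if
 * the text segment were loaded at address `base`. Returns the number of
 * longwords patched, or -1 if the stream is malformed. */
long gem_relocate(uint8_t *image, size_t image_len,
                  const uint8_t *rel, size_t rel_len, uint32_t base)
{
    size_t i = 4;
    long patched = 0;
    uint32_t loc;

    if (rel_len < 4)
        return -1;
    loc = be32(rel);                /* offset of the first fixup */
    if (loc == 0)
        return 0;                   /* no relocation information */
    for (;;) {
        uint8_t b;

        if ((loc & 1) || image_len < 4 || loc > image_len - 4)
            return -1;              /* odd or out-of-range fixup */
        put_be32(image + loc, be32(image + loc) + base);
        patched++;
        do {                        /* each 1 byte means "skip 254" */
            if (i >= rel_len)
                return -1;          /* stream ended without a 0 byte */
            b = rel[i++];
            if (b == 1)
                loc += 254;
        } while (b == 1);
        if (b == 0)
            return patched;         /* end of the relocation table */
        if (b & 1)
            return -1;              /* other odd values are reserved */
        loc += b;
    }
}
```

A caller would pass file + 0x1C as the image, tsize + dsize as its
length, and the bytes starting at file + 0x1C + tsize + dsize + ssize
as the fixup stream. Rejecting odd offsets and reserved odd bytes is a
choice made for this sketch; the posting does not say what the real
loader does with them.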
michael@garfield.UUCP (Mike Rendell) (01/12/88)
In article <81@pasteur.Berkeley.Edu> anton@postgres (Jeff Anton) writes:
>Would some kind soul tell me or point me at a definition of the
>format of .TOS, .TTS, and .GEM files? I'm working on porting the gnu
>C compiler to generate atari objects. I'm not attempting to have the
>atari run gnu cc. Please send responses to me by mail. Also, a pointer
>to a way to avoid ever starting gem when booting would be nice.
>				Jeff Anton

I tried to reply via mail but someone claims not to know about your site.
Maybe this header will be of some help to you:

] Delivery-date: Mon, 11 Jan 1988 15:44:43 UTC-0330
] Originator: ucbvax.Berkeley.EDU!MAILER-DAEMON@uunet.uucp
] Send-date: Mon, 11 Jan 1988 14:12:42 UTC-0330
] From: <ucbvax.Berkeley.EDU!MAILER-DAEMON@uunet.uucp>
] To: <garfield!michael.uucp>
] Subject: Returned mail: Service unavailable
]
]    ----- Transcript of session follows -----
] >>> RCPT To:<postgres!anton@pasteur.berkeley.edu>
] <<< 554 <postgres!anton@pasteur.berkeley.edu>... UUCP host name postgres not recognized at this site
] 554 <pasteur!postgres!anton>... Service unavailable

Anyway, the reason I am replying is that I have already done what you
are intending to do. Some minor changes needed to be made to gcc and
gas - these included stuff to tell gcc that ints were 16 bits (there
are problems with passing arguments for bios calls otherwise, and
16 bit ints are also much faster) and some changes to get gas to dump
.o files that are usable on the sun (so I could adb them there...).
Other stuff that was needed was a program to convert unix a.out
executables to gem format, long multiplication/division/modulo
routines for gcc, and of course libc (which is still under
construction).

If you want some/all of the stuff I have done just send a note (I
hacked 4.3bsd (vax) ld/strip/size/nm/ranlib so I can't just send them
to you - maybe diffs if you have a source license?)
The only thing that is really missing is floating point routines - any
suggestions as to PD versions (in C or assembler) for these would be
helpful.

Mike Rendell                Department of Computer Science
michael@garfield.uucp       Memorial University of Newfoundland
uunet!garfield!michael      St. John's, Nfld., Canada
(709) 737-4550              A1C 5S7
apratt@atari.UUCP (Allan Pratt) (01/12/88)
First, thanks to Daniel Glasser for his posting. There are one or two
things I want to clarify, though...

in article <2082@chinet.UUCP>, dag@chinet.UUCP (Daniel A. Glasser) says:
> ... All relocatable references must be long and word-aligned.

Clarification: all relocatable references must be longwords and must be
word-aligned. (Another reading of the above sentence is, "They must be
longword-aligned and word-aligned.")

> The BSS segment:
>
>     This is not stored in the file. It is allocated at runtime
>     by GEMDOS (actually, by the runtime startup.)

The BSS segment is allocated by GEMDOS. If you ask for 32K of BSS, your
program will get 32K of BSS (as you can see by checking your basepage).
What gets set up by the runtime startup is the HEAP, which is the space
between the end of your declared BSS and your initial stack pointer.
It is the size of the HEAP that you set when you assemble GEMSTART (for
instance).

> The SYMBOL segment:
>
>     Type flags for the ds_type field:
>
>         DEFINED         0x8000  defined symbol
>         EQUATED         0x4000  equated symbol
>         GLOBAL          0x2000  global symbol
>         EQUATED_REG     0x1000  equated register
>         EXTERNAL_REF    0x0800  external reference
>         DATA_RELOCATE   0x0400  data based relocatable
>         TEXT_RELOCATE   0x0200  text based relocatable
>         BSS_RELOCATE    0x0100  bss based relocatable

There are some more types than this: 0x0080 means "FILE" and is used by
the linker (well, by ALN, at least, and possibly LO68) to show where a
file starts. (The symbol name is the file name, and the symbol value
is the address of the start of the text segment of that file (even if
it doesn't have anything in the text segment)).

ALN also uses the next bit, 0x0040, to mean "ARCHIVE" -- this is an
ALN-specific extension, and is only used in conjunction with FILE.
The start of an archive is marked with a symbol of type ARCHIVE FILE,
where the symbol name is the archive name. The end of the archive is
marked with a symbol of type ARCHIVE FILE with NO name (all nulls).
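For anyone writing a symbol dumper, here is a minimal C sketch of how
the ds_type bits listed above, plus FILE and ARCHIVE, might be decoded.
The identifiers (DS_FILE, DS_ARCHIVE, dri_type_names, dri_marker_kind
and the dri_marker enum) are invented for this example and do not come
from any DRI or ALN header; the code assumes ds_type has already been
converted to host byte order.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define DS_FILE    0x0080   /* start of a file (ALN, possibly LO68) */
#define DS_ARCHIVE 0x0040   /* ALN extension, only used with DS_FILE */

static const struct { uint16_t mask; const char *name; } dri_flags[] = {
    {0x8000, "DEFINED"},       {0x4000, "EQUATED"},
    {0x2000, "GLOBAL"},        {0x1000, "EQUATED_REG"},
    {0x0800, "EXTERNAL_REF"},  {0x0400, "DATA_RELOCATE"},
    {0x0200, "TEXT_RELOCATE"}, {0x0100, "BSS_RELOCATE"},
    {DS_FILE, "FILE"},         {DS_ARCHIVE, "ARCHIVE"},
};

/* Write the names of all bits set in `type` into `buf`, separated by
 * spaces, and return `buf`. Output is cut short if `buf` is too small. */
char *dri_type_names(uint16_t type, char *buf, size_t buflen)
{
    size_t used = 0;

    assert(buf != NULL && buflen > 0);
    buf[0] = '\0';
    for (size_t i = 0; i < sizeof dri_flags / sizeof dri_flags[0]; i++) {
        int n;

        if (!(type & dri_flags[i].mask))
            continue;
        n = snprintf(buf + used, buflen - used, "%s%s",
                     used ? " " : "", dri_flags[i].name);
        if (n < 0 || (size_t)n >= buflen - used)
            break;                  /* out of room: stop here */
        used += (size_t)n;
    }
    return buf;
}

enum dri_marker {
    DRI_NOT_MARKER, DRI_FILE_START, DRI_ARCHIVE_START, DRI_ARCHIVE_END
};

/* Classify a symbol as a file or archive boundary marker. */
enum dri_marker dri_marker_kind(const char name[8], uint16_t type)
{
    static const char no_name[8];   /* all nulls */

    if (!(type & DS_FILE))
        return DRI_NOT_MARKER;
    if (!(type & DS_ARCHIVE))
        return DRI_FILE_START;
    return memcmp(name, no_name, 8) == 0 ? DRI_ARCHIVE_END
                                         : DRI_ARCHIVE_START;
}
```

A table of masks keeps the two extra bits next to the original eight,
so further linker-specific bits could be added in one place.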

Thanks again to Dan Glasser for this posting.

============================================
Opinions expressed above do not necessarily    -- Allan Pratt, Atari Corp.
reflect those of Atari Corp. or anyone else.      ...ames!atari!apratt