[comp.lang.c++] COFFing

lewie@EE.ECN.PURDUE.EDU (Jeff Lewis) (05/06/88)

    Steve Simmons sez:
    
    One of the "problems" with gcc/g++ and gas on some systems is the lack
    of support for coff.  I'm starting a look at hacking gas to output coff
    format files.  Anyone out there who has started this already or has
    advice/warnings, please drop me a note.
		
    Note that gdb/gdb+ would undoubtedly gain new uses should anyone want
    to start a parallel project.

I already have a working gas that produces COFF format object files
(68000 version).  Its shortcoming is that it doesn't yet accept the
debugging directives.  I'm currently working on that.

There are several options that come to mind for doing this.

1. I could have gcc output COFF-style directives and be nice and
compatible with the system's assembler (using SGS assembler format
all the way).  This way all you need to get running is the compiler.

2. I could bypass the system's assembler, output Motorola format
(or whatever it is that gas accepts), and hack gcc and gas to
output/accept COFF-style directives (with .'s in front of them!).
You'd need to get both the compiler and the assembler going.

3. I could leave gcc be and derive the COFF symbol entries from
the dbx symbol entries (gack! what a pain!).

4. What the heck, keep it BSD all the way to the loader.  You have
incompatible .o's, but at least you didn't need to hack on gas.
This would also simplify gcc wrt only knowing one assembler style.
Loader, however, would have to understand both BSD and native
formats, unless you wanted to translate your system libraries.

5. Go BSD all the way to the a.out.  Loader would need to stick something
on the front of an a.out to make the OS grok it.  You'd also need to
translate your system libraries into BSD format.  Gas could be pretty
much the same for any given processor and ld could be pretty much
the same across all systems.  RMS suggested this.

6. We could change gcc to generate a more generic symbol info output
from which you could easily generate either dbx or COFF output.
Assembler would have to know about different object formats,
but it would be in well-contained modules due to the more generic
nature of the input.  You'll need gcc & gas, but can use the native loader.

7. We could change gcc to instead of generating assembler ascii
output, simply transform the rtl into a machine specific rtl
that contained opcodes and addressing modes for the target machine,
retaining all symbol, file and line info.  This could be dumped
raw into a 'gnu object file' and later fed into an assembler/linker
for generation of relocatable or executable object files.
This approach avoids the assembler encoding/decoding step,
and makes it easy to generate whatever object file format you
want in the end.

I'm currently working on something along the lines of #6, but would
like to hear what everyone thinks about this.  I kinda lean toward
some variation on #7, though (as if you couldn't guess).

--Jeff Lewis (lewie@ee.ecn.purdue.edu, pur-ee!lewie)

P.S.  If anyone's interested in my current version of gas, let me know.
The differences are fairly well confined to write.c with several
niggling changes in various places.

lewie@EE.ECN.PURDUE.EDU (Jeff Lewis) (05/06/88)

Re:  making gas write COFF format .o's

[rms] I think this is not the right way to support GCC on sysV.  It would be
[rms] better to make sysV look more like a GNU system.
 
[rms] This involves two steps:

[rms] * Write a translator which will convert a library object module from
[rms] COFF to BSD format.  This isn't terribly hard because there is no
[rms] debugging information.

It's not terribly hard, but it unnecessarily adds to system maintenance.

[rms] * Modify GNU ld to add a minimal amount of stuff to the front of a
[rms] BSD format file to make it executable by the kernel.  The symbol table
[rms] will remain in BSD format.  This should be very little change.

You'll find the COFF format header to be very similar to what you describe.
It contains a file header followed by an a.out header (similar, but not
exactly like the BSD one).  You might as well be writing the COFF header.
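
For readers who haven't seen both layouts side by side, here is a rough
sketch of the headers being compared.  The field names follow the System V
<filehdr.h>/<aouthdr.h> and BSD <a.out.h> headers as best recalled, so treat
the exact layout as an assumption and check your system's headers.  The
struct names and the helper aouthdr_from_exec() are made up for this
illustration, and byte order on disk is not handled.

```c
#include <stdint.h>

/* COFF file header (assumed layout, 20 bytes). */
struct coff_filehdr {
    uint16_t f_magic;   /* machine type */
    uint16_t f_nscns;   /* number of section headers */
    int32_t  f_timdat;  /* time stamp */
    int32_t  f_symptr;  /* file offset of the symbol table */
    int32_t  f_nsyms;   /* number of symbol table entries */
    uint16_t f_opthdr;  /* size of the a.out header that follows */
    uint16_t f_flags;
};

/* COFF "optional" a.out header (assumed layout, 28 bytes). */
struct coff_aouthdr {
    int16_t magic;      /* 0407, 0410 or 0413, as in BSD */
    int16_t vstamp;     /* version stamp */
    int32_t tsize, dsize, bsize;
    int32_t entry;
    int32_t text_start; /* base address of text */
    int32_t data_start; /* base address of data */
};

/* BSD a.out header (assumed layout, 32 bytes). */
struct bsd_exec {
    uint32_t a_magic, a_text, a_data, a_bss;
    uint32_t a_syms, a_entry, a_trsize, a_drsize;
};

/* Hypothetical helper: build the COFF a.out header from a BSD one.
 * The section base addresses are not recorded in the BSD header,
 * so the caller has to supply them. */
struct coff_aouthdr aouthdr_from_exec(const struct bsd_exec *ex,
                                      int32_t text_start,
                                      int32_t data_start)
{
    struct coff_aouthdr h;
    h.magic      = (int16_t)(ex->a_magic & 0xffff);
    h.vstamp     = 0;
    h.tsize      = (int32_t)ex->a_text;
    h.dsize      = (int32_t)ex->a_data;
    h.bsize      = (int32_t)ex->a_bss;
    h.entry      = (int32_t)ex->a_entry;
    h.text_start = text_start;
    h.data_start = data_start;
    return h;
}
```

The sizes, entry point and magic number carry straight over; what COFF adds
is the file header in front and the section headers behind.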

The only complaint I have with sticking to BSD format is that COFF is
a maturation of the standard a.out format.  For example, it's designed to
hold debugging info in a structured manner - BSD hacks it in by encoding
types into a symbol name.
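
To make the contrast concrete: a COFF symbol entry carries a 16-bit n_type
field, with the base type in the low four bits and up to six two-bit
"derived type" fields above it.  A dbx-style entry instead puts the type in
the symbol string itself, something along the lines of "p:G2=*1" for a
global pointer to type 1 (that string is only illustrative).  The sketch
below decodes an n_type value.  The constants are as recalled from System V
<syms.h> and should be treated as assumptions; coff_describe_type() is a
hypothetical name.

```c
#include <stdint.h>
#include <stdio.h>

/* Field layout of n_type (assumed, after System V <syms.h>). */
#define N_BTMASK 0x0f  /* low 4 bits: base type */
#define N_BTSHFT 4     /* derived types start above the base type */
#define N_TSHIFT 2     /* each derived type takes 2 bits */

enum { T_INT = 4 };
enum { DT_NON = 0, DT_PTR = 1, DT_FCN = 2, DT_ARY = 3 };

/* Write an English description of n_type into buf (at most len bytes). */
void coff_describe_type(uint16_t n_type, char *buf, size_t len)
{
    static const char *const base[16] = {
        "null", "void", "char", "short", "int", "long", "float",
        "double", "struct", "union", "enum", "member of enum",
        "unsigned char", "unsigned short", "unsigned int", "unsigned long"
    };
    static const char *const derived[4] = {
        "", "pointer to ", "function returning ", "array of "
    };
    unsigned d = (unsigned)n_type >> N_BTSHFT;
    size_t used = 0;

    if (len == 0)
        return;
    buf[0] = '\0';

    /* The lowest derived field is the outermost derivation. */
    while ((d & 3u) != DT_NON && used < len) {
        int n = snprintf(buf + used, len - used, "%s", derived[d & 3u]);
        if (n < 0)
            return;
        used += (size_t)n;
        d >>= N_TSHIFT;
    }
    if (used < len)
        snprintf(buf + used, len - used, "%s", base[n_type & N_BTMASK]);
}
```

For example, (DT_PTR << N_BTSHFT) | T_INT describes "int *p".  A debugger
can test and strip these fields with shifts and masks, with no string
parsing.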

[rms] * Modify GDB to read such files.  This should be very little change.

Fortunately, someone's already written a COFF reader for gdb so there's
no change needed.

[rms] Now you can replace all the software for compilation, etc., with the
[rms] GNU software.

I understand perfectly your desire to stick with what you already have,
but a) it's not that difficult to support COFF, and b) you'll probably
end up improving on the loader interface later - current BSD format won't
support you into the future - why not do it now?

As per my previous letter, I should elaborate on my sketchy thoughts:

Re: Method #6 (have gcc output generic symtab info)

    RMS didn't like this 'cause it would add complexity to gcc/gas
and would require *another* format for debugging info.
    I intended this for a system where you get the compiler *and* the
assembler - that way you only need to support one output format.
In reality, you're gonna want to support BSD.  Gcc currently doesn't
support COFF anyway, so you piggyback COFF onto this format.  This
generic format *would be* the gdb format, so you merely have to extend
this to cover all your bases.  All in all you end up with only two
debug formats to support.
    It wouldn't add complexity to gcc as per above.  It would add
complexity to gas, I can't argue with that.  I think it would
be worth it - and I am willing to write the code.  As a plug, I write
fairly maintainable code.

Re: Method #7 (bag the traditional compiler/assembler/loader cycle)

    I was very unclear on this one.  What I really envisioned was writing
out pseudo object files directly from the compiler.  This wouldn't be
any more difficult than the current technique: instead of formatting
text for output, you'd just stick opcodes and addresses into a compact,
fixed format structure - essentially just what an assembler might
construct as it was scanning the input text.  All I'm suggesting is
ripping out all the steps in the middle.  The compiler dumps this pseudo
object file, and the 'assembler'/linker (there'd be no need to keep
the steps separate) would just suck in the 'object' file and do
all the final relocating and instruction optimizing.  You could have
this linker generate relocatable output if you needed an object
file understandable by a standard loader - or just put it
all together and make executables.
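
    One way to picture the "compact, fixed format structure" is a
fixed-size record per instruction.  Everything in this sketch is
hypothetical (the struct, its fields, the 16-byte size and the
big-endian packing are assumptions for illustration, not anything gcc
or gas defines):

```c
#include <stdint.h>

enum { POBJ_RECORD_SIZE = 16 };  /* hypothetical record size in bytes */

/* Hypothetical per-instruction record a compiler might dump
 * in place of a line of assembler text. */
struct pobj_insn {
    uint16_t opcode;     /* target opcode word */
    uint8_t  addr_mode;  /* addressing mode selector */
    uint8_t  flags;      /* bit 0: operand still needs relocation */
    uint32_t symbol;     /* index into the symbol table, if any */
    int32_t  addend;     /* constant offset from that symbol */
    uint32_t line;       /* source line, kept for the debugger */
};

/* Store a 32-bit value most significant byte first. */
static void put_be32(uint8_t *p, uint32_t v)
{
    p[0] = (uint8_t)(v >> 24);
    p[1] = (uint8_t)(v >> 16);
    p[2] = (uint8_t)(v >> 8);
    p[3] = (uint8_t)v;
}

/* Serialize one record into a fixed 16-byte slot, big-endian,
 * so the file does not depend on host struct layout. */
void pobj_pack(const struct pobj_insn *in, uint8_t out[POBJ_RECORD_SIZE])
{
    out[0] = (uint8_t)(in->opcode >> 8);
    out[1] = (uint8_t)(in->opcode & 0xff);
    out[2] = in->addr_mode;
    out[3] = in->flags;
    put_be32(out + 4,  in->symbol);
    put_be32(out + 8,  (uint32_t)in->addend);
    put_be32(out + 12, in->line);
}
```

    The 'assembler'/linker would read these records back, resolve the
symbol references, pick final instruction sizes, and write whatever
object or executable format the target wants.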

    This is all just skylarkin' for now anyway - but I would like to
hear what other people think about it.

--Jeff Lewis (lewie@ee.ecn.purdue.edu, pur-ee!lewie)

bob@acornrc.UUCP (Bob Weissman) (05/06/88)

Would somebody please tell me what COFF is?

Sorry, I have only BSD experience.

Thanks,
Bob Weissman
Internet:	bob@acornrc.uucp
UUCP:		...!{ ames | decwrl | oliveb | pyramid }!acornrc!bob
Arpanet:	bob%acornrc.uucp@ames.arc.nasa.gov