[comp.lang.modula2] Modula-2 "object" format for def'n module?

josef@nixdorf.de (Moellers) (10/29/90)

Hi,
I am currently busy writing my own Modula-2 compiler.
Although progress is very slow, I would like to know if someone could
give me any hints on the format of compiled "DEFINITION MODULE"s.


--
| Josef Moellers		| c/o Siemens Nixdorf Informationssysteme AG |
|  USA: mollers.pad@nixdorf.com	| Abt. PXD-S14				    |
| !USA: mollers.pad@nixdorf.de	| Heinz-Nixdorf-Ring			    |
| Phone: (+49) 5251 104662	| D-4790 Paderborn			    |

a665@mindlink.UUCP (Anthon Pang) (10/31/90)

josef@nixdorf.de writes:
> Hi,
> I am currently busy writing my own Modula-2 compiler.
> Although progress is very slow, I would like to know if someone could
> give me any hints on the format of compiled "DEFINITION MODULE"s.

Compiled DEFINITION MODULES?  Oh...you mean .SBM or symbol files...

I don't believe there is a universal standard for symbol files, i.e. they are
not portable between different compilers.  But as a "hint", your symbol file
would contain a key (for revision control), the names of the exported
identifiers, and flags or sections for constants, types, and exported
variables, along with the values of the constants and the types of the TYPEs
& VARs.  And don't forget PROCEDUREs: parameters passed, values returned, and
their types.  [whew]  Good
luck!
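
As a rough illustration of that list, here is a minimal Python sketch of what
one symbol file might hold once it is in memory.  Every class and field name
here (SymbolFile, ConstEntry, key, and so on) is invented for the example; it
does not describe the .SBM format or any other real compiler's format.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class ConstEntry:
    name: str
    type_name: str
    value: object          # the constant's compile-time value


@dataclass
class TypeEntry:
    name: str
    definition: str        # e.g. "ARRAY [0..9] OF CHAR"; "" for an opaque type


@dataclass
class VarEntry:
    name: str
    type_name: str


@dataclass
class ProcEntry:
    name: str
    params: List[Tuple[str, str, bool]]   # (name, type, is a VAR parameter)
    result_type: Optional[str] = None     # None for a proper procedure


@dataclass
class SymbolFile:
    module: str
    key: int               # revision key (hypothetical: any value that changes
                           # whenever the definition module is recompiled)
    consts: List[ConstEntry] = field(default_factory=list)
    types: List[TypeEntry] = field(default_factory=list)
    variables: List[VarEntry] = field(default_factory=list)
    procs: List[ProcEntry] = field(default_factory=list)

    def exported_names(self) -> List[str]:
        """All exported identifiers, grouped by section."""
        entries = self.consts + self.types + self.variables + self.procs
        return [e.name for e in entries]
```

A client module compiled against one key can then be flagged as stale when the
symbol file it imports carries a different key.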

v056ped5@ubvmsd.cc.buffalo.edu (Brian M McNamara) (11/01/90)

In article <josef.657189394@peun11>, josef@nixdorf.de (Moellers) writes...
>Hi,
>I am currently busy writing my own Modula-2 compiler.
>Although progress is very slow, I would like to know if someone could
>give me any hints on the format of compiled "DEFINITION MODULE"s.
>| Josef Moellers		| c/o Siemens Nixdorf Informationssysteme AG |


I may be wrong, but I always thought of DEFINITION MODULEs as a text
interface to compiled code. If I am wrong, would someone drop me a line
on how exactly you would use a compiled DEF MOD? Also, anyone who uses
FST... could you drop a line on how to use FOREIGN DEFINITION MODULES?
I am trying to interface some C graphics routines, but the compiler won't
even accept the DEF MOD, much less the OBJ code.

Thanx


Brian

lins@Apple.COM (Chuck Lins) (11/01/90)

In article <43710@eerie.acsu.Buffalo.EDU> v056ped5@ubvmsd.cc.buffalo.edu writes:
>In article <josef.657189394@peun11>, josef@nixdorf.de (Moellers) writes...
>>Hi,
>>I am currently busy writing my own Modula-2 compiler.
>>Although progress is very slow, I would like to know if someone could
>>give me any hints on the format of compiled "DEFINITION MODULE"s.
>>| Josef Moellers		| c/o Siemens Nixdorf Informationssysteme AG |
>
>
>I may be wrong, but I always thought of DEFINITION MODULE's as a text
>interface to compiled code. If I am wrong would someone drop me a line

Symbol files are usually a much more compact format than straight source text.
There is an article in IEEE Software (Sept or Nov 87/88) by Jurg Gutknecht
describing the mechanism. You'll have to add the fine details to get it
working, but the article provides enough information to get the main logic
down.


-- 
Chuck Lins               | "Is this the kind of work you'd like to do?"
Apple Computer, Inc.     | -- Front 242
20525 Mariani Avenue     | Internet:  lins@apple.com
Mail Stop 37-BD          | AppleLink: LINS@applelink.apple.com
Cupertino, CA 95014      | "Self-proclaimed Object Oberon Evangelist"
The intersection of Apple's ideas and my ideas yields the empty set.

gkt@iitmax.IIT.EDU (George Thiruvathukal) (11/01/90)

In article <josef.657189394@peun11>, josef@nixdorf.de (Moellers) writes:
> Hi,
> I am currently busy writing my own Modula-2 compiler.
> Although progress is very slow, I would like to know if someone could
> give me any hints on the format of compiled "DEFINITION MODULE"s.

I have never written a compiler for Modula-2 but can perhaps shed some light
on how one might implement DEFINITION modules.  There are two approaches which
have been used by the various Modula-2 compiler implementors:

  1. When an IMPORT is encountered, parse the DEFINITION module for its symbols
     and incorporate the symbols into the symbol table for the current module
     (or unit) of compilation.

  2. When an IMPORT is encountered, look for a symbol file corresponding to the
     DEFINITION module and load the symbols into the symbol table for the
     current module (or unit) of compilation.

Each of these implementation strategies has its merits and shortcomings.  Let
us examine strategy #1.  Parsing the DEFINITION module at each import treats it
much like an include file in C.  If a number of modules in a given project
import objects from a common module, the compiler has to parse the same text
over and over again (as occurs in C in a multiple "module" project).  For large
DEFINITION modules, the compile time for the entire project can be long.  The
advantage of treating DEFINITION modules as include files is simplicity.
From the standpoint of the compiler, the state of the scanner and the parser
must merely be pushed onto a stack (a simple context switch) and the scanner
restarted on the new file (the DEFINITION module).  It is trivial, which is why
many compiler vendors implement it.  Frequently, the vendors of Modula-2
compilers argue that processors are getting faster and memory much
cheaper, so why bother to follow another avenue?
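
A toy Python sketch of strategy #1 may make the cost visible.  It is only an
illustration under loud assumptions: the DEFINITIONS dictionary stands in for
files on disk, the regular expressions stand in for a real scanner and parser,
and the recursive call stands in for pushing the scanner/parser state.  All
names are made up for the example.

```python
import re

# Hypothetical in-memory "file system": module name -> definition module text.
DEFINITIONS = {
    "Lists": "DEFINITION MODULE Lists; TYPE List; PROCEDURE New(): List; END Lists.",
    "Stacks": "DEFINITION MODULE Stacks; IMPORT Lists; TYPE Stack; END Stacks.",
}

parse_count = {}  # how often each definition module's text was parsed


def parse_definition(name):
    """Parse one definition module and return the names it declares.

    The recursive call plays the role of the state stack: the outer parse
    is suspended until the imported text has been processed.
    """
    parse_count[name] = parse_count.get(name, 0) + 1
    text = DEFINITIONS[name]
    for imported in re.findall(r"\bIMPORT\s+(\w+)\s*;", text):
        parse_definition(imported)
    return re.findall(r"\b(?:TYPE|PROCEDURE|VAR|CONST)\s+(\w+)", text)


def compile_client(imports):
    """Build a client's symbol table by re-parsing every imported module."""
    symbols = {}
    for module in imports:
        for ident in parse_definition(module):
            symbols[module + "." + ident] = module
    return symbols
```

Calling compile_client(["Lists", "Stacks"]) for two different client modules
leaves parse_count["Lists"] at 4, because the same text is parsed again for
every direct and indirect import.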

Strategy #2 remedies the potentially long compilation times for real-world
projects.  The first time an import of a module occurs, a symbol file is
sought (with the same prefix as the module).  If it is not found, the DEFINITION
module is parsed for its declarations and translated into a symbol file.  The
symbol file can be virtually any format, but it would make sense for the format
to be an "image" form which could be rapidly read into the symbol table for the
current unit of compilation.  If the symbol file is found, it is either
up-to-date or out-of-date.  If it is out-of-date, it must be regenerated from
the DEFINITION module exactly as if it were not present; otherwise, it is read
into the symbol table for the current unit of compilation.  The idea of
symbol files requires more work (and creates more clutter) but can pay big
dividends for large projects (which is real-world stuff) where the imported
files can be common to many modules and can be very large.  It is not so trivial
to implement, and it certainly involves an implementation of strategy #1.  One
can circumvent the partial implementation of strategy #1 by leaving it to Joe
User to "compile" every DEFINITION module into a symbol file.  But why should
the user have to do "compiler" work?
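
The lookup logic of strategy #2 can be sketched in a few lines of Python.  This
is a sketch under assumptions, not any vendor's scheme: the .def and .sym
extensions, the use of file modification times to decide "out-of-date", the
pickle format standing in for an "image", and the function names are all
choices made for the example.

```python
import os
import pickle
import re


def parse_definition_text(text):
    """Stand-in for a real parser: collect the declared identifiers."""
    return sorted(set(re.findall(r"\b(?:TYPE|PROCEDURE|VAR|CONST)\s+(\w+)", text)))


def load_symbols(def_path):
    """Return (symbols, how), where how is "cached" or "rebuilt".

    The symbol file shares the definition module's prefix (Foo.def -> Foo.sym).
    It is rebuilt when it is missing or older than the definition module.
    """
    sym_path = os.path.splitext(def_path)[0] + ".sym"
    fresh = (os.path.exists(sym_path)
             and os.path.getmtime(sym_path) >= os.path.getmtime(def_path))
    if fresh:
        with open(sym_path, "rb") as f:
            return pickle.load(f), "cached"
    # Missing or stale: fall back to parsing the source (strategy #1), then
    # save the result so that later imports can skip the parse.
    with open(def_path, "r") as f:
        symbols = parse_definition_text(f.read())
    with open(sym_path, "wb") as f:
        pickle.dump(symbols, f)
    return symbols, "rebuilt"
```

The first import of a module pays for the parse; later imports only read the
saved image until the definition module changes again.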

Well, I hope I have adequately defined the working theory for the management
of DEFINITION modules in Modula-2 compilers with which I have worked.  If I
were doing a compiler for Modula-2, I would opt (as a maverick) for strategy
#2.  Although our computing resources are apparently infinite these days, I
am under the impression that one should always endeavor to design software
with these guidelines:
   1. it should work efficiently
   2. it should use the resources well and avoid waste!
   3. it should not be user-hostile (it should do the dirty work)
   4. it should be modular

Thank you for reading.  I am elated to notice some topics of substance (and
not Satan) emerging in our treasured newsgroup.

George Thiruvathukal
gkt@iitmax.iit.edu

eepjm@cc.nu.oz.au (11/02/90)

In article <josef.657189394@peun11>, josef@nixdorf.de (Moellers) writes...

>I am currently busy writing my own Modula-2 compiler.
>Although progress is very slow, I would like to know if someone could
>give me any hints on the format of compiled "DEFINITION MODULE"s.
 
In article <43710@eerie.acsu.Buffalo.EDU>, v056ped5@ubvmsd.cc.buffalo.edu
(Brian M McNamara) replies: 

> I may be wrong, but I always thought of DEFINITION MODULE's as a text
> interface to compiled code. If I am wrong would someone drop me a line
> on how exactly you would use a compiled DEF MOD? Also, anyone who uses

I haven't decoded the format used by any of the compilers available at
present, so can't give a PRECISE answer to the first question, but I think
I can clear away some of the confusion.  Here goes ...

How is a definition module used (by the compiler)?  Obviously, when the
compiler hits an IMPORT declaration.  As v056ped appears to have assumed,
one could *in principle* read the definition module at that point, but it
would be unreasonably inefficient (especially if you import from a lot of
modules, which often happens in Modula-2 programming. Now if only somebody
could invent a super-module or module of modules ... but that's another
story).  So in practice we do it this way:

1. User compiles the definition module, and from this the compiler creates
   a "symbol file".  The compiler which I use most of the time (FTL) calls
   this the .LMS file, many others call it the .SYM file.  In any case, it
   is quite distinct from the "object file" which is produced when
   compiling an implementation module.
2. Whenever the compiler hits an IMPORT, it reads in the appropriate
   symbol file and adds the information in it to its internal symbol
   tables.  It's better to read the symbol file rather than the original
   definition module file at this point because:
    (a) The definition module file is full of lots of redundant stuff
        like comments, and it would slow the compiler down to have to
        process all of that again.  The symbol file is the same
        information in a more compact form.
    (b) The compiler designer can, if he/she wishes, design the symbol
        file format so that it is an exact image of the symbol table which
        the compiler is going to want to set up in memory as a result of
        processing the IMPORT declaration.
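
To make points (a) and (b) concrete, here is a small Python sketch that writes
a symbol table as a compact binary image and reads it straight back into a
dictionary.  The record layout (a 4-byte key, a 2-byte entry count, then a
one-byte kind and two length-prefixed strings per entry) is invented for the
example and is not the FTL .LMS layout or any other real format.

```python
import struct

KINDS = ["CONST", "TYPE", "VAR", "PROCEDURE"]
HEADER = "<IH"   # little-endian: 4-byte module key, 2-byte entry count


def _put_str(out, s):
    data = s.encode("ascii")
    out.extend(struct.pack("<B", len(data)))   # 1-byte length prefix
    out.extend(data)


def write_symbols(key, entries):
    """entries: list of (name, kind, type_description) tuples."""
    out = bytearray(struct.pack(HEADER, key, len(entries)))
    for name, kind, type_desc in entries:
        out.extend(struct.pack("<B", KINDS.index(kind)))
        _put_str(out, name)
        _put_str(out, type_desc)
    return bytes(out)


def read_symbols(blob):
    """Rebuild (key, {name: (kind, type_description)}) from the image."""
    key, count = struct.unpack_from(HEADER, blob, 0)
    pos = struct.calcsize(HEADER)

    def get_str():
        nonlocal pos
        n = blob[pos]
        s = blob[pos + 1:pos + 1 + n].decode("ascii")
        pos += 1 + n
        return s

    table = {}
    for _ in range(count):
        kind = KINDS[blob[pos]]
        pos += 1
        name = get_str()
        table[name] = (kind, get_str())
    return key, table
```

Comments and layout from the source never reach the image, and the reader does
no scanning or parsing at all: it just copies fields into the table.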

Point 2(b) is sort of an answer to the original question.  Think of it in
terms of a top-down design: what information is the compiler going to want
to access when it encounters an imported name in, for example, a procedure
call?  That tells you something about how to design the symbol table(s) in
memory.  Once you have that design right, that tells you what you need as
information in the symbol file, and how to format it.  Consequence: actually,
you don't even need to design the symbol file format until you have already
written a lot of your compiler and have reached the point of thinking about
type checking and code generation and things like that.  In my opinion,
this needs-driven approach to design is far superior to the approach most
people seem to use, which is (a) decide on a format for the symbol file;
(b) much later, while writing various parts of the compiler, try to
find ingenious methods for accessing the information held in a
badly-designed symbol file.

One final point: I personally find it easiest to have a separate symbol
table in the compiler for each module imported via the "IMPORT ModuleName"
construct, but to put symbols imported via "FROM ModuleName IMPORT ... "
into the same symbol table as used for global symbols belonging to the
current module.  (Also, you need special treatment for nested modules.)
That way, it's easier to set up the search path when the compiler is
looking for a name.  But that's just a personal opinion, others might
have different ideas.
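
A minimal Python sketch of that arrangement, assuming each imported module's
exports have already been loaded into a plain dictionary.  The Scope class and
its method names are invented for the example, and nested modules are left
out.

```python
class Scope:
    """Symbol tables for one compilation unit."""

    def __init__(self):
        self.globals = {}    # own declarations plus "FROM M IMPORT x" names
        self.modules = {}    # "IMPORT M": module name -> its own symbol table

    def import_module(self, module, exported):
        # IMPORT M: keep M's symbols in a separate table, reached as M.x
        self.modules[module] = dict(exported)

    def from_import(self, exported, names):
        # FROM M IMPORT a, b: copy the named symbols into the global table
        for name in names:
            self.globals[name] = exported[name]

    def lookup(self, ident):
        if "." in ident:                       # qualified name, e.g. "Stacks.Push"
            module, name = ident.split(".", 1)
            return self.modules[module][name]
        return self.globals[ident]             # unqualified name
```

With this split, an unqualified name needs a search of only one table, and a
qualified name goes directly to the table of the module that it names.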

Hope this helps.

Peter Moylan                           eepjm@cc.nu.oz.au
Dept. Elec. Eng. & Comp. Sci., Univ. of Newcastle, NSW 2308, Australia.

randy@m2xenix.psg.com (Randy Bush) (11/03/90)

> I would like to know if someone could give me any hints on the format of
> compiled "DEFINITION MODULE"s.

"Compilation of Data Structures: A new Approach to Efficient Modula-2 Symbol
Files",  J. Gutknecht, ETH #64, July 1985 has some ideas worth study.
-- 
..!{uunet,qiclab,intelhf}!m2xenix!randy