[comp.lang.modula2] How to make libraries smaller?

DUG@CZHETH5A.BITNET (05/25/90)

Comment: Mail not returnable through this path, use the list address

Author: AEOLUS::NOTES9B "Wehrli Christoph"
Topic 784.1
Time: 25-MAY-1990 09:59 ZRH

    -< Smart linker >-

    The Logitech Linker V3.0 and later does intelligent linking from
    all imported modules, i.e. only code actually referenced is included
    in the final .EXE file. Linking is just an implementation feature
    and not a question of Modula2 or C.

a665@mindlink.UUCP (Anthon Pang) (05/26/90)

> DUG@CZHETH5A.BITNET writes:
> 
>     -< Smart linker >-
> 
>     The Logitech Linker V3.0 and later does intelligent linking from
>     all imported modules, i.e. only code actually referenced is included
>     in the final .EXE file. Linking is just an implementation feature
>     and not a question of Modula2 or C.

The problem with the Logitech Linker is all the interdependency of the standard
modules on nonstandard ones...the result is that the optimized size isn't much
smaller than the unoptimized size of a portable M2 program.

borchert@MATHEMATIK.UNI-ULM.DE (Andreas Borchert) (05/28/90)

Tom Tkacik writes:

> I am working on a convoluted method of making smaller executables.
> If I could get it to work, I should be able to have Modula-2 executables
> as small as C executables.  I need help.  Can it be done?
>
> What I was thinking of doing is to remove all but a single procedure from
> a source library file (perhaps using the C preprocessor and its
> #ifdef directive) and compile the file.
> Then do this for each procedure in the file.  All of the compiled
> files would be given different names, (the linker does not
> care what the file is called), and collected into a single library.
>
> This seems all well and good, except for global variables.
> If a global variable is used by two procedures, it would have to remain in
> the source file twice, and be compiled twice.  The result would be a
> multiply defined variable.  But if I left it out during one of the compiles,
> the compiler would complain.  I can't IMPORT it from another file, because
> then the compiler would give it the wrong name.

> What is needed is some way to simulate C's extern.  I need to declare a
> variable as external, but with a local name.  Is this at all possible?

You cannot simulate C's extern in Modula-2. C externals do not have
a "local name". They are simply common variables, i.e. one or more
declarations of the same variable identifier refer to the same
variable.

> Is there another solution to the problem?  I am fairly new to Modula-2,
> (does my C background show? :-)

There are at least two ways to solve this problem: on the level of
the original Modula-2 source and on the level of the compiler output.

If you want to be free of dependencies on your compiler and your
environment, you are forced to produce a set of modules out of the
big one (that's what you've tried). But this implies that every
global variable is declared once and only once. Modula-2 doesn't support common
variables in the sense of Fortran or C.
Simply collect all global types and variables into one module and
import all that stuff in each of the other partitions.

If your Modula-2 compiler generates assembly output it should be rather
easy (using awk) to cut it into a set of smaller entities.
Global variables in Modula-2 are part of the bss segment
(they cannot be initialized). Variables in the bss segment are
declared as common (or at least can be declared to be common).
So you simply need to repeat the code for the variables for each procedure
of the module.

Some pitfalls if you are cutting the assembly text:

(1)	Don't separate local procedures from the procedures they belong to.
	It should be easy to distinguish local procedures from
	global procedures: the symbol of the local procedure is not extern.
(2)	Check how floating point constants are generated.
	Not every machine accepts immediate floating point values.
	In this case the constants are generated elsewhere.
(3)	String constants have local symbols and are part of the text segment.
	Simply copy them like global variables.
(4)	Don't separate the initialization part from the initialization flag.
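
A rough sketch of the cutting step, in Python rather than awk. The assembly
syntax here is an assumption made up for illustration (`.comm` for common
variables, `.globl` in front of each exported procedure, local procedures
emitted directly after the procedure they belong to); a real compiler's output
will differ, and pitfalls (2) to (4) are not handled. The names
`split_assembly` and `SAMPLE` are hypothetical.

```python
def split_assembly(text):
    """Split toy assembly text into one chunk per exported procedure.

    Every chunk gets a copy of all .comm (common variable) declarations,
    and non-exported labels stay with the preceding exported procedure.
    """
    common = []    # .comm declarations, repeated in every chunk
    chunks = {}    # exported symbol -> list of lines
    current = None
    for line in text.splitlines():
        stripped = line.strip()
        if not stripped:
            continue
        if stripped.startswith(".comm"):
            common.append(stripped)
        elif stripped.startswith(".globl"):
            # A new exported procedure starts a new chunk.
            current = stripped.split()[1]
            chunks[current] = [stripped]
        elif current is None:
            # This sketch refuses input it does not understand.
            raise ValueError("code before first .globl: " + stripped)
        else:
            # Local labels and instructions stay with their owner.
            chunks[current].append(line.rstrip())
    return {name: "\n".join(common + body) + "\n"
            for name, body in chunks.items()}


# Assumed toy input: one common variable, two exported procedures,
# and one local helper (fmt) that belongs to Put.
SAMPLE = """\
.comm counter,4
.globl Put
Put:
    call fmt
    ret
fmt:
    ret
.globl Get
Get:
    ret
"""

if __name__ == "__main__":
    for name, chunk in split_assembly(SAMPLE).items():
        print("---", name)
        print(chunk, end="")
```

Each chunk could then be assembled separately and the object files collected
into one library, so that the linker pulls in only the chunks it needs.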

Andreas Borchert

RCAPENER@cc.utah.edu (05/29/90)

In article <9005281240.AA07902@mathematik.uni-ulm.de>, borchert@MATHEMATIK.UNI-ULM.DE (Andreas Borchert) writes:
> Tom Tkacik writes:
> 
	[lots of stuff removed]
> 
> If you want to be free of dependencies on your compiler and your
> environment, you are forced to produce a set of modules out of the
> big one (that's what you've tried). But this implies that every
> global variable is declared once and only once. Modula-2 doesn't
> support common variables in the sense of Fortran or C.
> Simply collect all global types and variables into one module and
> import all that stuff in each of the other partitions.
>
	[more stuff removed]

I can see that this makes lots of sense for global types (very similar
to C's typedef in an #include file), and will indeed work.  But will
variables be the SAME variables in all the modules that import it, or
will they be DIFFERENT?  It seems to me that it will be the latter,
since the storage space will be allocated separately each and every
time you compile the object files that import it.  I assume that this
could be a compiler design issue, but is more likely to be a linker
problem, but don't know for sure.  Have you tested it with various
compilers/linkers to see what they all do?  It seems reasonable that
you may be correct, but then again, is this really defined in the
language standard?


Sincerely

rcapener@csulx.weber.edu (csulx was formerly wsccs)

Peter.M..Perchansky@f101.n273.z1.fidonet.org (Peter M. Perchansky) (05/29/90)

Hello:

    TopSpeed Modula-2's linker is smart, and will only include procedures,
    constants, and variables used within a given module.  Turbo Pascal and
    Turbo C sport the same features.

    Most linkers, as you are already aware, do not boast such an ability.



--  
uucp: uunet!m2xenix!puddle!273!101!Peter.M..Perchansky
Internet: Peter.M..Perchansky@f101.n273.z1.fidonet.org

jensting@skinfaxe.diku.dk (Jens Tingleff) (05/30/90)

RCAPENER@cc.utah.edu writes:

[.......]

>								 But will
>variables be the SAME variables in all the modules that import it, or
>will they be DIFFERENT?  It seems to me that it will be the latter,
>since the storage space will be allocated separately each and every
>time you compile the object files that import it.  

Ehhrmm, why ? 

This is such a silly restriction that I can't believe it's in the language. 
Since I don't have my Wirth at the ready, all I can say is that
with a few tens of thousands of lines of M-2 behind me, I'm merely 99.99 % sure
that variables are only allocated space ONCE (i.e. in the module where
they are defined, be it in connection with the compilation of the
IMPLEMENTATION or the DEFINITION module).

Since this is comp.lang.modula2, perhaps someone with a PIM2 at hand could
put us all at ease here ..... 



	Jens
jensting@diku.dk is
Jens Tingleff MSc EE, Research Assistent at DIKU
	Institute of Computer Science, Copenhagen University
Snail mail: DIKU Universitetsparken 1 DK2100 KBH O

cspw.quagga@p0.f4.n494.z5.fidonet.org (cspw quagga) (05/31/90)

 
 > tkacik@rphroy.UUCP (Tom Tkacik) writes ...
 
 >When linking a Modula-2 program, the entire library is included,
 >even routines that are not needed.  If any of those unused routines call
 >a routine from another library, that entire library is also included.
 >The result is a short program burdened by a lot of unused library code.
 >(I know this is an implementation problem, not a Modula-2 problem.)
 
 >I have the source code for those libraries, and would like to re-create
 >them so that each routine is in its own .o file, and use the archiver
 >to collect them into a single library.
 
TopSpeed's "smart linker" solves this by recognizing that the unit of
linking ought to be the object (i.e. procedure/variable), rather than
the whole module.  (I haven't seen their Modula 2 yet, but the TopSpeed
C does this, and a direct result is that their library sources are in a
couple of big source files, rather than the hundreds of small ones we
have become accustomed to.  But the executable code stays small.)
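
To make the difference in linking granularity concrete, here is a small Python
sketch. It is not TopSpeed's algorithm, only a plain reachability walk over a
symbol reference graph. The graph `REFS` is invented for illustration (the
references between library routines are assumptions, and
`Conversions.StrToInt` is a hypothetical name).

```python
def reachable(refs, roots):
    """Object-level linking: keep only symbols reachable from the roots."""
    seen = set()
    stack = list(roots)
    while stack:
        sym = stack.pop()
        if sym in seen:
            continue
        seen.add(sym)
        stack.extend(refs.get(sym, ()))
    return seen


def module_of(sym):
    """'InOut.WriteString' -> 'InOut' (a bare name is its own module)."""
    return sym.split(".")[0]


def reachable_by_module(refs, roots):
    """Module-level linking: needing one symbol pulls in its whole module."""
    seen = set()
    stack = list(roots)
    while stack:
        sym = stack.pop()
        if sym in seen:
            continue
        # Pull in every symbol that lives in the same module as sym.
        siblings = [s for s in refs if module_of(s) == module_of(sym)]
        for s in siblings + [sym]:
            if s not in seen:
                seen.add(s)
                stack.extend(refs.get(s, ()))
    return seen


# Assumed reference graph: symbol -> symbols it refers to.
REFS = {
    "main": ["InOut.WriteString"],
    "InOut.WriteString": ["Terminal.Write"],
    "InOut.ReadInt": ["Terminal.Read", "Conversions.StrToInt"],
    "Terminal.Write": [],
    "Terminal.Read": [],
    "Conversions.StrToInt": [],
}

if __name__ == "__main__":
    print(sorted(reachable(REFS, ["main"])))
    print(sorted(reachable_by_module(REFS, ["main"])))
```

In this toy graph the object-level walk keeps three symbols, while the
module-level walk drags in all six, because the unused `InOut.ReadInt` pulls
in further modules of its own.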
 
 >What I was thinking of doing is to remove all but a single procedure from
 >a source library file (perhaps using the C preprocessor and its
 >#ifdef directive) and compile the file.
 >Then do this for each procedure in the file.  All of the compiled
 >files would be given different names, (the linker does not
 >care what the file is called), and collected into a single library.
 
 >This seems all well and good, except for global variables.
 >If a global variable is used by two procedures, it would have to remain in
 >the source file twice, and be compiled twice.  The result would be a
 >multiply defined variable.  But if I left it out during one of the compiles,
 >the compiler would complain.  I can't IMPORT it from another file, because
 >then the compiler would give it the wrong name.
 
I cannot see any mechanisms in Modula that would allow you to
transparently 'break' one module into N fragments, while still
allowing the user to believe that they are all part of the same module.
 
I think you're trying to solve the problem at the wrong level.  It is
the linker that needs improvement.  The Modula libraries are organized into
LOGICALLY coherent modules.  The fact that the linker tries to treat
these as PHYSICALLY coherent units is a mess.
 
Rather devote your efforts to providing a smart linker.  A quick and
dirty approach may require nothing more than a script file or two.
 
For example, you could link with the usual mechanisms and produce a map
file or a list of all the routines you actually need.  Then use the
standard library maintenance utilities to extract from the real
libraries only the bits you want for that application, and re-build
them into a special purpose lean-'n-mean library.  Then just re-link
with the new improved library.  (I'm not sure all library managers would
be able to extract pieces of a module, but it is worth a try.)
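
The selection step of such a script might look roughly like this in Python.
Everything about the formats is an assumption: the map listing is taken to be
one "address symbol" pair per line, and the library index (member name to the
symbols it defines) is assumed to come from whatever listing the library
manager can produce. The names `needed_symbols`, `members_to_keep`,
`MAP_TEXT` and `INDEX` are hypothetical.

```python
def needed_symbols(map_text):
    """Collect symbol names from a toy map listing ('<address> <symbol>')."""
    names = set()
    for line in map_text.splitlines():
        fields = line.split()
        if len(fields) == 2:
            names.add(fields[1])
    return names


def members_to_keep(index, needed):
    """Return the library members that define at least one needed symbol."""
    return sorted(member for member, defined in index.items()
                  if needed & set(defined))


# Assumed map output of a first, ordinary link.
MAP_TEXT = """\
0010 InOut_WriteString
0150 Terminal_Write
"""

# Assumed library index: member -> symbols defined in that member.
INDEX = {
    "wrstr.o": ["InOut_WriteString"],
    "rdint.o": ["InOut_ReadInt"],
    "termwr.o": ["Terminal_Write"],
    "termrd.o": ["Terminal_Read"],
}

if __name__ == "__main__":
    print(members_to_keep(INDEX, needed_symbols(MAP_TEXT)))
```

The sketch only helps if the library members are finer-grained than whole
modules. If a member holds an entire module, the filter keeps the entire
member, which is the caveat about library managers mentioned above.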
 
Of course, you are not worried about the code size during development
(are you?), so you'd only need to do this once when your product
is ready for release.  This way the slowwww linking wouldn't be so bad!
 
Pete

--
EP Wentworth - Dept. of Computer Science - Rhodes University - Grahamstown.
Internet: cspw.quagga@f4.n494.z5.fidonet.org
Uninet: cspw@quagga
uucp: ..uunet!m2xenix!quagga!cspw



--  
uucp: uunet!m2xenix!puddle!5!494!4.0!cspw.quagga
Internet: cspw.quagga@p0.f4.n494.z5.fidonet.org

Jon.Guthrie@p2.f70.n226.z1.fidonet.org (Jon Guthrie) (06/03/90)

 >> Simply collect all global types and variables into one module and
 >> import all that stuff in each of the other partitions.

 > I can see that this makes lots of sense for global types (very similar
 > to C's typedef in an #include file), and will indeed work.  But will
 > variables be the SAME variables in all the modules that import it, or
 > will they be DIFFERENT?  

It depends...if you IMPORT a TYPE (call it Type1) and then declare a variable   
of type Type1 with the name FOO in module BAR1 and declare another variable of   
Type1 named FOO in module BAR2 then the variables will be different.

If, however, you declare a Type1 variable named FOO in the included file and   
IMPORT that into BAR1 and BAR2 then FOO will be the same variable in both   
modules.  Clear?  (Try it, it works!)

...You trust them with your fortunes, let them guard your lives 

--  
uucp: uunet!m2xenix!puddle!226!70.2!Jon.Guthrie
Internet: Jon.Guthrie@p2.f70.n226.z1.fidonet.org

RCAPENER@cc.utah.edu (06/03/90)

In article <6608.2664C206@puddle.fidonet.org>,
> Peter.M..Perchansky@f101.n273.z1.fidonet.org (Peter M. Perchansky) writes:
> Hello:
> 
>     TopSpeed Modula-2's linker is smart, and will only include procedures,
>     constants, and variables used within a given module.  Turbo Pascal and
>     Turbo C sport the same features.
> 
>     Most linkers, as you are already aware, do not boast such an ability.

Well, I wouldn't know about MOST linkers.  Most FORTRAN linkers
bring it all in whether you need it or not.  If you have seen
one that doesn't display this behavior let me know.

On the other hand, I have yet to see a C linker bring in anything
more than is absolutely needed, but this assumes you specify the
proper options on UNIX systems.  For example, the NeXT requires
the -lsys_s flag to link in the shared libraries, whereas most
SYSV or BSD machines link shared libs by default.  Regardless,
they do NOT link in unused functions or vars!  If you have a
C system in mind that contradicts my experience with the following
C compilers please let me know.  I KNOW the following systems
will NOT bring in unused functions or variables.

Aztec C			Lattice C		MicroSoft C
C86-C			Turbo-C			High-C
pcc (portable cc)	gcc (Gnu cc)		vcc (VAX-ULTRIX CC)
						    (also VMS)

The VAX C compiler will produce huge files if you don't link in
the shared library, otherwise they are about 10-20% larger in
size than what pcc produces on an Ultrix (MicroVAX-II hardware
for both VMS and ULTRIX for a fair comparison).

C' ya later

aubrey@rpp386.cactus.org (Aubrey McIntosh) (06/03/90)

The Logitech 3.4 Linker is a smart linker as well.  It has an option
switch telling it to link quickly, or link small.
It's well on its way toward being an all-round general purpose linker.

-- 
Aubrey McIntosh  	"Find hungry samurai." -- The Old Man        
1502 Devon Circle       comp.os.minix, comp.lang.modula2         
Austin, TX 78723 
1-(512)-452-1540  (v)

jensting@skinfaxe.diku.dk (Jens Tingleff) (06/08/90)

cspw.quagga@p0.f4.n494.z5.fidonet.org (cspw quagga) writes:

[....]
> 
>I cannot see any mechanisms in Modula that would allow you to
>transparently 'break' one module into N fragments, while still
>allowing the user to believe that they are all part of the same module.

Oh yes, all you have to do is to introduce a new ``level'' of modules.
So, your original module with the procedures P1, P2, .. , Pn becomes
a module that imports the procedures P1, P2, .. , Pn from the 
*separately compiled* modules M1, M2, .. , Mn. So, the original
module is reduced to consist of IMPORT statements and initialisation
code for the module global variables. SIMPLE AS THAT!

Remember that unqualified IMPORT is totally transparent, name-wise.

OK, it may not be ``transparent'', but it will work.

Even if some of the procedures depend on each other, the dependency
will be resolved by the programmer (doing the relevant IMPORT in
the relevant M.. files), and by the linker, the traditional way.

The way to do it (in my humble opinion) is to generate a global
DEFINITION MODULE containing *all* module global vars. Write a
DEFINITION MODULE for each of the M.. files (simply by copying the 
procedure declaration out, using an editor). Write a ``standard''
IMPORT section (a header) for each of the M.. files (simply importing all
the vars from the global DEFINITION MODULE). Write each of the
IMPLEMENTATION modules M.. by 
	-1 copying the standard header to a file
	-2 copying the procedure declaration and body 
	   into that file.
This shouldn't take too long, using a nice editor (not to mention macros..).
The effort should be smaller than that for writing a smart linker.. ;^)
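
For those who would rather script it than use editor macros, the recipe could
be sketched in Python as below. The generator emits Modula-2 source text only;
it does not parse an existing module. Assumptions: definition modules need no
EXPORT list (PIM3 or later), files are named `.def` and `.mod`, and the
procedure headings and bodies are supplied by hand. The names
`make_fragments`, `Globals`, `Incr` and `Get` are hypothetical.

```python
def make_fragments(globals_name, global_vars, procedures):
    """Return {filename: source text} for one shared-globals module
    and one module M1..Mn per procedure.

    global_vars: list of (name, type) pairs
    procedures:  list of (name, heading, body) triples,
                 heading given without the trailing ';'
    """
    files = {}
    var_lines = "".join(f"  {name}: {typ};\n" for name, typ in global_vars)
    # The variables are declared (and allocated) once, in this module.
    files[f"{globals_name}.def"] = (
        f"DEFINITION MODULE {globals_name};\nVAR\n{var_lines}"
        f"END {globals_name}.\n"
    )
    files[f"{globals_name}.mod"] = (
        f"IMPLEMENTATION MODULE {globals_name};\nEND {globals_name}.\n"
    )
    # The "standard" header: import every global variable.
    names = ", ".join(name for name, _ in global_vars)
    header = f"FROM {globals_name} IMPORT {names};\n"
    for i, (pname, heading, body) in enumerate(procedures, start=1):
        mod = f"M{i}"
        files[f"{mod}.def"] = (
            f"DEFINITION MODULE {mod};\n{heading};\nEND {mod}.\n"
        )
        files[f"{mod}.mod"] = (
            f"IMPLEMENTATION MODULE {mod};\n{header}"
            f"{heading};\nBEGIN\n{body}END {pname};\nEND {mod}.\n"
        )
    return files


# Assumed example: one global counter shared by two procedures.
FILES = make_fragments(
    "Globals",
    [("count", "CARDINAL")],
    [
        ("Incr", "PROCEDURE Incr", "  INC(count);\n"),
        ("Get", "PROCEDURE Get(): CARDINAL", "  RETURN count;\n"),
    ],
)

if __name__ == "__main__":
    for filename in sorted(FILES):
        print("----", filename)
        print(FILES[filename], end="")
```

Because both M1 and M2 import `count` from the same module, they refer to the
same variable, which is the point made earlier in the thread.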

	Jens

jensting@diku.dk is
Jens Tingleff MSc EE, Research Assistent at DIKU
	Institute of Computer Science, Copenhagen University
Snail mail: DIKU Universitetsparken 1 DK2100 KBH O

Peter.M..Perchansky@f101.n273.z1.fidonet.org (Peter M. Perchansky) (06/13/90)

Hello:

    MicroSoft C does not perform smart linking.  Every procedure and function referenced in a given library module will be brought into the final .EXE (used or not used).



--  
uucp: uunet!m2xenix!puddle!273!101!Peter.M..Perchansky
Internet: Peter.M..Perchansky@f101.n273.z1.fidonet.org