[net.lang.c] Proposal to add modules to C

chris@umcp-cs.UUCP (Chris Torek) (05/15/85)

It seems to me that the main (if not only) reason to avoid using global
variables (as opposed to ``module-level'' variables, i.e., accessible
to foo.c and bar.c but not baz.c) is to avoid name collisions.  If the
variable makes sense as a statically-allocated unit, why would you care
whether baz.c can get at it, unless you want to ensure that no accidental
changes or references are made?  [I will mention functions later.]

Anyway, assuming that I'm not already way off base, this can be done
quite easily in C as it stands right now, by using long names (let's not
start *that* again: I only need long internals for this argument, and
the current ANSI draft has 31 character internal names) and structures.

For example, if files foo1 and foo2 are to share access to a data
structure, you might create a corresponding foo.h file:

	struct foodata {		/* private globals for foo */
		int	fd_int;
		double	fd_fp;
		/* and more */
	};

	extern struct foodata foodata;	/* defined in fooglob.c */

	/* Now we hide the fact that they are really in a struct */
	#define foo_int foodata.fd_int
	#define foo_fp	foodata.fd_fp

Within foo1 and foo2, these variables appear to be globals.  To anyone
who does not include foo.h, there is only one name to worry about (and
it need not make much sense; indeed you can hide the ugly 6-character
externals this way---#define's are internal names, and therefore have
at least 31 significant characters).
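
As a rough sketch of how the pieces above might fit together, here is the
whole arrangement collapsed into one translation unit so it can be compiled
as is.  The function names foo1_store, foo2_total and baz_local are made up
for illustration, and the #undef lines exist only because baz.c has been
folded into the same file as foo.h; in a real layout baz.c would simply not
include the header.

```c
#include <assert.h>

/* ---- foo.h: included by foo1.c and foo2.c only ---- */
struct foodata {                /* private globals for foo */
    int    fd_int;
    double fd_fp;
};
extern struct foodata foodata;  /* defined in fooglob.c */
#define foo_int foodata.fd_int
#define foo_fp  foodata.fd_fp

/* ---- fooglob.c: the single definition ---- */
struct foodata foodata = { 0, 0.0 };

/* ---- foo1.c (hypothetical) ---- */
void foo1_store(int i, double d)
{
    foo_int = i;                /* expands to foodata.fd_int = i */
    foo_fp = d;
}

/* ---- foo2.c (hypothetical) ---- */
double foo2_total(void)
{
    return foo_int + foo_fp;
}

/* ---- baz.c: never includes foo.h, so foo_int is free for reuse ---- */
int baz_local(void)
{
#undef foo_int                  /* only needed because this sketch is one file */
#undef foo_fp
    int foo_int = 7;            /* baz's own variable, unrelated to foodata */
    assert(foo_int == 7);
    return foo_int;
}
```

Note that baz.c can still reach the data by declaring foodata itself; the
scheme removes name collisions, not access.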

This admittedly leaves function names out in the cold....  I'd suggest
using function pointers inside foodata, and initialization routines, but
someone would gripe about run time.  (Personally, I'll just use my long
identifier names.)  Well, if you want more, try Mesa.  It's got more
modules (and more strict type checking) than any Unix hacker would know
what to do with....
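
For what it is worth, the function-pointer variant might look something
like the sketch below.  This is only one way to read the suggestion: the
names fd_bump, bump_impl, foo_init and foo_use are invented here, and
foo_init itself is one more external name, so the saving only shows up
once the module has several functions.

```c
#include <assert.h>
#include <stddef.h>

/* ---- foo.h ---- */
struct foodata {
    int fd_int;
    int (*fd_bump)(int);        /* a "module-level" function */
};
extern struct foodata foodata;
#define foo_int  foodata.fd_int
#define foo_bump (*foodata.fd_bump)

/* ---- fooglob.c ---- */
struct foodata foodata;

/* static, so the linker never sees this name outside the file */
static int bump_impl(int by)
{
    foo_int += by;
    return foo_int;
}

/* The initialization routine: must run before any foo_bump() call. */
void foo_init(void)
{
    foo_int = 0;
    foodata.fd_bump = bump_impl;
}

/* ---- foo1.c (hypothetical caller) ---- */
int foo_use(void)
{
    assert(foodata.fd_bump != NULL);    /* catches a forgotten foo_init() */
    foo_bump(2);                        /* indirect call through the struct */
    return foo_bump(3);
}
```

Each call costs an extra fetch of the pointer, which is the run time
someone would gripe about.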
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

henry@utzoo.UUCP (Henry Spencer) (05/17/85)

> It seems to me that the main (if not only) reason to avoid using global
> variables (as opposed to ``module-level'' variables, i.e., accessible
> to foo.c and bar.c but not baz.c) is to avoid name collisions.  If the
> variable makes sense as a statically-allocated unit, why would you care
> whether baz.c can get at it, unless you want to ensure that no accidental
> changes or references are made?

Because I want to ensure that no *deliberate* changes or references are
made!!  That variable is a detail of the *implementation* of the module
in question, and outsiders are not entitled to use it because the
implementation is subject to change.  There are some really ugly places
in Unix where programs *know* details of the implementation of stdio,
for example.  Making things like that globally accessible essentially makes
them part of the specification of the module, hence very hard to change
even if they are later seen as serious mistakes.
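
A minimal sketch of the kind of hiding meant here, using what C already
offers: a file-scope static plus access functions.  The module and its
names (counter_reset, counter_next) are hypothetical, and the approach
only works when the whole module fits in a single source file, which is
exactly the limitation under discussion.

```c
#include <assert.h>
#include <limits.h>

/* ---- counter.c: a hypothetical one-file module ---- */

/* Implementation detail: static, so no other file can name it. */
static int count;

/* The specification of the module is just these two functions. */
void counter_reset(void)
{
    count = 0;
}

int counter_next(void)
{
    assert(count < INT_MAX);    /* refuse to overflow */
    return ++count;
}
```

Since outsiders can only call the two functions, the representation of
count can change later without breaking them.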
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

chris@umcp-cs.UUCP (Chris Torek) (05/20/85)

> > [me]
> > If the variable makes sense as a statically-allocated unit, why would
> > you care whether baz.c can get at it, unless you want to ensure that no
> > accidental changes or references are made?

> [Henry Spencer]
> Because I want to ensure that no *deliberate* changes or references are
> made!!

Then beat the programmer on the head with a stick! :-)

Seriously, I think there is no way that you can make a language force
people not to do bad things, and that even attempting it is a mistake.
(This is purely a personal opinion.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@maryland

jans@mako.UUCP (Jan Steinman) (05/20/85)

The NS32000 Series has hardware support for this concept.  (I'm surprised
Henry didn't mention this in his posting -- he seems to know the NS32000.)
It is quite well explained in the National literature, so I won't go over it
in detail.  The trick to resolving external module addresses at run time is to
store them in a "link table", then use a known index into the table to access
the routine.  The extra fetch adds to the (already considerable) overhead of
procedure/function calls; I know of no NS32000 compilers that make use of this
feature, but then I don't know of too many!  For instance, the function
"printf()" gets compiled to the following assembly code in "traditional" use:

	jsr	_printf	;Does not exist at assemble time, must be linked.

The linker then loads in all the printf() code (which probably includes a lot
of dead code if you only wanted to do "printf("%s\n", string);") and resolves
the address in the "jsr" call.  Using National's external addressing, the
compiler-generated instruction

	cxp	1234	;(or some other meaningless-to-humans index)

causes a jsr (sort of) through the 1234th location in the module's link table.
Linking is then a simple matter of contiguous loading with no address
resolution needed.  The many address fetches needed make this a slow process;
National does it about as efficiently as can be expected.
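
The general idea of calling through an indexed table can be modelled in C
with an array of function pointers, roughly as below.  This is only an
analogy for the indirection; it does not try to reproduce what the cxp
instruction actually does, and the names link_table, call_external, twice
and negate are invented for the sketch.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*entry_t)(int);

/* Two stand-ins for routines living in some other module. */
static int twice(int x)  { return 2 * x; }
static int negate(int x) { return -x; }

/* The "link table": one slot per external routine, filled in at load time. */
static entry_t link_table[] = { twice, negate };

/* Indices the compiler would have assigned; meaningless to humans. */
enum { LT_TWICE = 0, LT_NEGATE = 1 };

/* Callers know only the index, never the routine's address. */
int call_external(size_t index, int arg)
{
    assert(index < sizeof link_table / sizeof link_table[0]);
    return link_table[index](arg);      /* extra fetch, then the jump */
}
```

Calling call_external(LT_TWICE, 21) fetches slot 0 and jumps through it;
the caller's code never contains the address of twice, so nothing in the
caller needs patching at link time.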

This might be less than accurate, and I've left out detail, but the concept
of "modules" is supported by NS32000 hardware.  Look in their book for more.
-- 
:::::: Jan Steinman		Box 1000, MS 61-161	(w)503/685-2843 ::::::
:::::: tektronix!tekecs!jans	Wilsonville, OR 97070	(h)503/657-7703 ::::::

grs@ncoast.UUCP (Gregg R. Siegfried) (05/27/85)

In article <5606@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes:
>
>Because I want to ensure that no *deliberate* changes or references are
>made!!  That variable is a detail of the *implementation* of the module
>in question, and outsiders are not entitled to use it because the
>implementation is subject to change.  There are some really ugly places
>in Unix where programs *know* details of the implementation of stdio,
>for example.  Making things like that globally accessible essentially makes
>them part of the specification of the module, hence very hard to change
>even if they are later seen as serious mistakes.
>-- 

	I think a change like this would corrupt the language, rather than
provide a way to implement hidden types.  I found Modula-2 quite useful
for doing just what you described.  That there don't seem to be any Modula-2
compilers for Unix systems should be a small concern, however.  If there
isn't one under construction, there *should* be.

					Gregg Siegfried

..decvax!cwruecmp!ncoast!grs

[When all else fails...use assembler]
"If you want to catch a lot of fish, you need a big net."