[comp.lang.c] portability and standards

meissner@dg_rtp.UUCP (Michael Meissner) (06/21/87)

In article <17198@amdcad.AMD.COM> tim@amdcad.UUCP (Tim Olson) writes:
	/* nested comments removed */
> We tend to write our .h files (for global variables) something like:
> 
> #ifdef	GLOBAL
> #define	EX
> #else
> #define	EX	extern
> #endif
> 
> EX int	foo, bar;
> 
> Then, in one .c file which includes this .h file, we #define GLOBAL. 
> The external variables will be declared in all files, but defined only
> in the file which #defines GLOBAL.  This still has the problem that
> explicit initialization cannot be done, but it is much more portable.

I use a somewhat similar scheme that allows simple initializations:

#ifndef	GLOBAL
#define	GLOBAL	extern
#define	INIT(x)
#endif

GLOBAL	char	*p INIT("initialization");

in the main program before the includes, I define:

#define	GLOBAL
#define	INIT(x)	= x
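
A minimal single-file sketch of the scheme, assuming the header's contents
are pasted inline where a hypothetical globals.h would be included.  The
variable "retries" and the function globals_ok() are illustrative names,
not part of the original post.

```c
#include <string.h>

/* What the main program does before its includes. */
#define GLOBAL
#define INIT(x) = x

/* ---- contents of a hypothetical globals.h ---- */
#ifndef GLOBAL
#define GLOBAL extern   /* every other file: declaration only */
#define INIT(x)         /* ...and the initializer disappears  */
#endif

GLOBAL char *p INIT("initialization");
GLOBAL int retries INIT(3);   /* hypothetical second variable */
/* ---- end of globals.h ---- */

/* Returns 1 if both globals still hold their initial values. */
int globals_ok(void)
{
    return strcmp(p, "initialization") == 0 && retries == 3;
}
```

Because INIT takes a single macro argument, an initializer with a top-level
comma (a brace-enclosed list, say) would be split into several arguments,
which is why the scheme only covers simple initializations.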
-- 
	Michael Meissner, Data General	Uucp: ...mcnc!rti!dg_rtp!meissner

It is 11pm, do you know what your sendmail and uucico are doing?

karl@haddock.UUCP (Karl Heuer) (06/23/87)

In article <2166@dg_rtp.UUCP> meissner@dg_rtp.UUCP (Michael Meissner) writes:
>I use a somewhat similar scheme that allows simple initializations:
>  #ifndef GLOBAL
>  #define GLOBAL extern
>  #define INIT(x)
>  #endif
>in the main program before the includes, I define:
>  #define GLOBAL
>  #define INIT(x) = x

I used to use an equivalent trick, until I discovered that it's okay to have a
declaration *and* a definition in the same file.  Now I put declarations into
the include file (without the above macros), and also include this file in the
main program or globals.c or whatever, where I have the definitions.  If they
don't match, the compiler will say so.

This means I have to update both the header file and the globals.c file if I
change things, but I consider that a small price to pay.  On the flip side, I
can change the value of a global without editing the header; this is useful in
a "make" environment.  (Also, it neatens things up -- I never liked having to
use those macros.)
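
A rough sketch of this arrangement, with the header pasted inline so that it
fits in one file.  The file names, variable names and record_budget() are
assumptions made up for illustration.

```c
/* ---- contents of a hypothetical globals.h: declarations only ---- */
extern int  verbosity;
extern long max_records;
/* ---- end of globals.h ---- */

/* ---- globals.c, which also includes globals.h: the definitions ---- */
int  verbosity   = 1;
long max_records = 1000L;
/* Writing "short verbosity = 1;" here instead would not compile: both
 * declarations are visible in the same file, so the compiler reports
 * the conflicting types. */

/* Hypothetical user of the globals. */
long record_budget(void)
{
    return verbosity > 0 ? max_records : 0L;
}
```

Changing an initial value only touches globals.c, so files that include
globals.h need not be rebuilt by make.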

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

DRIEHUIS%HLERUL5.BITNET@wiscvm.wisc.EDU (06/26/87)

> Since VMS does not, as far as I know, have lint, there doesn't seem to
> be a good test.
No, it doesn't have lint, and I dread that ! Lint is much nicer than the
'portability' option of VMS C (ever heard of a compiler complaining
about its own include files ? VMS C does !).
> Under Primos there is another wrinkle to take into account.  The Primos
> C compiler insists that every file have an actual entry point.  (I.e.,
> you can't have a file only consisting of definitions.)
VMS C has exactly the same bug. It took me quite some time to find this
out, because for some reason neither the compiler nor the linker complained
about what must have been an unresolved external in some source file.
                                        - Bert
---------------------------------------------------------------------------
Bert Driehuis, LICOR Leiden, <DRIEHUIS@HLERUL5>,
                and VNG The Hague <no-name@no-net>
                (I speak for neither of the above)

daniels@cae780.TEK.COM (Scott Daniels) (06/27/87)

In article <8051@brl-adm.ARPA> DRIEHUIS%HLERUL5.BITNET@wiscvm.wisc.EDU writes:
>> The Primos C compiler insists that every file have an actual entry point.  
>> (I.e., you can't have a file only consisting of definitions.)
>VMS C has exactly the same bug. It took me quite some time to find this
>out, because for some reason neither the compiler nor the linker complained
>about what must have been an unresolved external in some source file.

In VMS-C "extern"s are considered implemented as labelled common 
(variable name = label).  Therefore, there is "no such thing" as an
undefined extern (an extern statement will be sufficient to create 
storage).  In addition, their linker will not pull things out of 
libraries based simply on data references.  This combination of 
behaviors might make you think the modules must have code, but it 
is not true.  VMS-C does allow data-only modules.

FROM:   Scott Daniels, Tektronix CAE
	5302 Betsy Ross Drive, Santa Clara, CA  95054
UUCP:   tektronix!teklds!cae780!daniels
	{ihnp4, decvax!decwrl}!amdcad!cae780!daniels 
        {nsc, hplabs, resonex, qubix, leadsv}!cae780!daniels 

woerz@iaoobelix.UUCP (06/29/87)

> /***** iaoobelix:comp.lang.c / brl-adm!DRIEHUIS%HLERUL5.BITNET@w /  5:49 pm  Jun 26, 1987*/
> ...
> > Under Primos there is another wrinkle to take into account.  The Primos
> > C compiler insists that every file have an actual entry point.  (I.e.,
> > you can't have a file only consisting of definitions.)
> VMS C has exactly the same bug. It took me quite some time to find this
> out, because for some reason neither the compiler nor the linker complained
> about what must have been an unresolved external in some source file.
>                                         - Bert
> ---------------------------------------------------------------------------
> Bert Driehuis, LICOR Leiden, <DRIEHUIS@HLERUL5>,
>                 and VNG The Hague <no-name@no-net>
>                 (I speak for neither of the above)
> /* ---------- */

I don't know what version of VAX C you have, with version 2.2-015 it
worked well to have only definitions of variables in a file. It
compiled without problems.

------------------------------------------------------------------------------

Dieter Woerz
Fraunhofer Institut fuer Arbeitswirtschaft und Organisation
Abt. 453
Holzgartenstrasse 17
D-7000 Stuttgart 1
W-Germany

BITNET: iaoobel.uucp!woerz@unido.bitnet
UUCP:   ...{seismo!unido, pyramid}!iaoobel!woerz

edstrom%UNCAEDU.BITNET@wiscvm.wisc.EDU (06/30/87)

About initialized globals in header files:

I don't understand the need for macros in defining and initializing global
variables. Maybe I am missing the point but what I do for header files with
variables shared by many .c modules is:

headerfile.h

extern double example;
double example = 33.3;

I use this on VAX VMS and it works fine. Is there some reason why this approach
is not "safe" or "proper"?

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/30/87)

In article <8113@brl-adm.ARPA> edstrom%UNCAEDU.BITNET@wiscvm.wisc.EDU writes:
>headerfile.h
>
>extern double example;
>double example = 33.3;
>
>I use this on VAX VMS and it works fine.

I don't understand how this could work on VMS; every file that includes
the header file would be attempting to initialize the external storage
allocated for "example".  The fact that they are using the same initializer
must be being taken into account somehow by the VMS linker, but in general
this usage is an error.  There must be no more than one explicit static
initialization of any particular datum.

eric@snark.UUCP (Eric S. Raymond) (07/03/87)

In article <3399@ihlpg.ATT.COM>, bgb@ihlpg.UUCP writes:
>
> [a sensible technique for preventing inconsistencies in globals usage]
>

I submit, though, that this whole discussion rests on a false premise --
that it is somehow a Good Thing to have a separate globals module with no code.
Every time I've seen such a module it has been evidence for poor data
design in the program. Globals are bad style, the data design equivalent
of goto statements.

Object-oriented languages like Simula, Smalltalk and C++ have demonstrated that
the best route to a clean and powerful design is by partitioning that design
into a set of communicating abstract data types -- black boxes with sealed
innards and narrow, well-defined interfaces.

[Yes, this is going to relate back to good C style in a bit. Bear with me...]

The major difference between abstract data type (ADT) decomposition and the
more primitive kind of functional decomposition that Algol-descended languages
like Pascal and C were designed to support is precisely the status of
global data. In an ADT design, data elements associated with a module
are also considered 'inside the box', to be accessed through the same kind
of narrow interfaces as the code elements.

In languages like Smalltalk that were built from the ground up for ADT, every
piece of storage is part of an instance of an ADT. There are no globals.
This seems very odd when you're just getting used to ADT design, but once
you get used to thinking in ADT terms you don't miss them. In fact, in
an ADT design environment you quickly develop a perception that globals
are ugly, because they represent places where the data elements of what
should be ADTs are leaking into each other.

[Returning to earth now...]

C wasn't particularly designed for ADT building, but it's powerful enough
to support it pretty well. In my C code, each .c source file implements an
ADT and has an associated .h file that defines the interface. Each data area
in the program is allocated in the module that defines the ADT it's part of.
If it has to be accessible to other modules, it's declared (exactly once)
in the corresponding .h file. There may be lots of data areas that are
publicly visible, but there are no 'globals' that aren't definitely owned
by some ADT somewhere.

I find this approach leads to code that is cleaner, better organized and
much easier to maintain. The mental effort required to identify would-be
'globals' as part of an ADT including their handler code is usually
fairly trivial, nor is there any run-time overhead necessarily involved
in partitioning things this way.

I recommend the elimination of globals, and the more general habit of thinking
in ADTs, to C programmers everywhere. The grief you save may be your own.
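
A small sketch of what such a module might look like, assuming a fixed-size
integer stack as the ADT.  The names (stack.h, stack_push and so on) are
invented for illustration; the interface and implementation are shown in one
file only to keep the example self-contained.

```c
#include <stddef.h>

/* ---- hypothetical stack.h: the narrow interface ---- */
int    stack_push(int value);   /* 0 on success, -1 if the stack is full  */
int    stack_pop(int *out);     /* 0 on success, -1 if the stack is empty */
size_t stack_depth(void);
/* ---- end of stack.h ---- */

/* ---- hypothetical stack.c: the sealed innards ---- */
#define STACK_MAX 16

static int    items[STACK_MAX];   /* static: invisible outside this file */
static size_t depth = 0;

int stack_push(int value)
{
    if (depth == STACK_MAX)
        return -1;
    items[depth++] = value;
    return 0;
}

int stack_pop(int *out)
{
    if (depth == 0)
        return -1;
    *out = items[--depth];
    return 0;
}

size_t stack_depth(void)
{
    return depth;
}
```

The storage is owned by the module that manipulates it; other files can
reach it only through the three functions declared in the header.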

tim@amdcad.AMD.COM (Tim Olson) (07/03/87)

In article <107@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
+-----
| I submit, though, that this whole discussion rests on a false premise --
| that it is somehow a Good Thing to have a separate globals module with no code.
| Every time I've seen such a module it has been evidence for poor data
| design in the program. Globals are bad style, the data design equivalent
| of goto statements.
| 
| I recommend the elimination of globals, and the more general habit of thinking
| in ADTs, to C programmers everywhere. The grief you save may be your own.
+-----
This is all well and good when the program *can* be partitioned into
modules, each with their own data structures and procedures which
operate on them, but there are many cases where this doesn't work.  For
example, we used this technique wherever possible when we wrote the
Am29000 processor simulator, but there are still *many* signals (global
variables) which are visible to all modules, and don't have a logical
owner: the clock, various pipeline hold conditions, global control
registers, etc.  In this case, the most logical place to put them was a
global.h file which was included in every .c file which required them. 
The #ifdef GLOBAL technique was used to prevent multiple copies (.c and
.h) of the global variables.

	-- Tim Olson
	Advanced Micro Devices
	(tim@amdcad.amd.com)
	

jfh@killer.UUCP (07/06/87)

In article <107@snark.UUCP>, eric@snark.UUCP (Eric S. Raymond) writes:
> In article <3399@ihlpg.ATT.COM>, bgb@ihlpg.UUCP writes:
> >
> > [a sensible technique for preventing inconsistencies in globals usage]
> >
> 
> I submit, though, that this whole discussion rests on a false premise --
> that it is somehow a Good Thing to have a separate globals module with no code.
> Every time I've seen such a module it has been evidence for poor data
> design in the program. Globals are bad style, the data design equivalent
> of goto statements.

I have written code that needs goto's to keep from growing into a massive
maze of if-then-else's, unneeded functions, and totally unreadable code.

The goto is nice when you want to give up, or handle a special case - if you
go using it as a way to avoid for, while and do loops you need to have
your head examined for remnants of FORTRAN or BASIC.

But then I follow some-body-or-other's advice and *document* my goto's with
come-from's - i.e. How Did I Get Here? 

> The major difference between abstract data type (ADT) decomposition and the
> more primitive kind of functional decomposition that Algol-descended languages
> like Pascal and C were designed to support is precisely the status of
> global data.

I thought C came from BCPL?  And besides, what the h*ll is he saying here?
I used globals in Pascal whenever they were needed.  Some variables get to
be a real bother to pass all over God's creation, and wrapping them up into
a big structure you pass all over the place is a bother.
 
>  In fact, in
> an ADT design environment you quickly develop a perception that globals
> are ugly, because they represent places where the data elements of what
> should be ADTs are leaking into each other.

All this is good and fine for toy code.  No, I take that back.  Some things
*do* work as ADT's very well.  But the entire world does not live in ADT
land.  What do you do with a simple status variable like errno that is
modified by all of the system calls?  Add pointers in the file status
blocks to all point to errno?  Ask a VMS programmer whether they prefer
FAB's and RAB's to file descriptors and errno.
 
> [Returning to earth now...]

Please do.

> C wasn't particularly designed for ADT building, but it's powerful enough
> to support it pretty well. In my C code, each .c source file implements an
> ADT and has an associated .h file that defines the interface.
> [ stuff deleted ] There may be lots of data areas that are
> publicly visible, but there are no 'globals' that aren't definitely owned
> by some ADT somewhere.

Sounds like nice design, when it applies.  But what if the variable has
to live for quite a while, is used by 15 or 20 routines, and isn't all
that abstract?  For example, a file descriptor returned by a special file
open routine?

> I find this approach leads to code that is cleaner, better organized and
> much easier to maintain. The mental effort required to identify would-be
> 'globals' as part of an ADT including their handler code is usually
> fairly trivial, nor is there any run-time overhead necessarily involved
> in partitioning things this way.

I suppose all of the ADT glue is useful for libraries and packages that are
used many times, and may have many different functions.  Well de[fs]i(n|gn)ed
interfaces do not exclude global variables however.  Nor do global variables
have to produce unreadable code. That seems to be what /* */ are for - to
add some intelligence to the source code.  The most readable code I've
seen has tended to be assembler - for just that reason.  Every assembler
programmer I know has always documented well, even to excess in many
occasions.  The best C code is the same way.  Look at the source for
the chess program that was just posted.  I seem to remember that it was
well documented.  That or rogue, I forget ...

- John.

jimp@cognos.uucp (Jim Patterson) (07/07/87)

In article <8113@brl-adm.ARPA> edstrom%UNCAEDU.BITNET@wiscvm.wisc.EDU writes:
>I don't understand the need for macros in defining and initializing global
>variables. Maybe I am missing the point but what I do for header files with
>variables shared by many .c modules is:
>
>headerfile.h
>
>extern double example;
>double example = 33.3;
>
>I use this on VAX VMS and it works fine. Is there some reason why this approach
>is not "safe" or "proper"?

A good explanation of the data models supported in various C
implementations can be found in the ANSI C Rationale document which was
distributed with the ANSI C (X3J11) draft documents, in the section
3.1.2.2, Linkage of Identifiers.  The same section in the ANSI C
draft explains the rules that ANSI has actually adopted.

VAX C adopts a style of externs called the Common model. All objects
with external scope are placed in what the linker refers to as PSECTs;
they are equivalent to FORTRAN NAMED COMMON blocks.  This model is
very unrestrictive; it effectively ignores the extern keyword and
allows any number of initializations of a given external object.  The
only time you might see a complaint is when the same external object
is initialized to two distinctly different and non-zero values.

A large number of other C implementations adopt what is referred to as
the REF/DEF (or Reference/Definition) model.  In this model, an object
with external scope may have any number of references identified by
explicit use of the extern keyword (and no initializer), but can have
only one definition (no extern keyword and an optional initializer).
This model is actually termed the Strict REF/DEF model and has been
adopted by the ANSI C committee on the grounds that it will break the
fewest implementations. A Relaxed REF/DEF model, used by some C
implementations as well, allows multiple definitions but still
distinguishes between references and definitions based on the extern
keyword.

Some compilers rely on the presence or absence of initializers to determine
which declaration forms the definition of an object.  This model is
termed the Initialization model.

So, to answer your query, your approach isn't portable and isn't
supported by the ANSI definition. (This isn't to say that DEC won't
continue to support it; I suspect that they will). The Common model is
supported by a number of implementations, but isn't supported by many
others.  ANSI C has adopted REF/DEF because it is compatible with the
majority of existing implementations.  If you want to be maximally
portable, this seems to be the best approach.  Also you should include
initializers on each definition if you need to be able to port to
implementations using the Initialization model.
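
As a sketch of that advice applied to the header in question: the header
carries references only, and exactly one source file carries the
definitions, each with an explicit initializer.  The variable "hits" and the
function scaled_example() are hypothetical additions for illustration, and
the two files are shown together only so the example stands alone.

```c
/* ---- hypothetical headerfile.h: references only ---- */
extern double example;
extern int    hits;
/* ---- end of headerfile.h ---- */

/* ---- exactly one .c file: the definitions ---- */
double example = 33.3;
int    hits    = 0;   /* explicit initializer even for zero, so that an
                       * Initialization-model compiler also treats this
                       * line as the definition */

/* Hypothetical user of the shared variable. */
double scaled_example(int factor)
{
    return example * factor;
}
```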
-- 

Jim Patterson          decvax!utzoo!dciem!nrcaer!cognos!jimp
Cognos Incorporated    

gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/10/87)

In article <1068@aldebaran.UUCP> jimp@cognos.UUCP (Jim Patterson) writes:
>ANSI C has adopted REF/DEF because it is compatible with the
>majority of existing implementations.

More importantly, many linkers place unacceptable restrictions on
COMMON storage (e.g.: 4Kb alignment for each external COMMON name;
no more than 256 distinct COMMON names).  The C implementor often
does not have much if any control over the linker that he must use.
Practically all linkers have usable DEF/REF support.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (07/10/87)

In article <1103@killer.UUCP> jfh@killer.UUCP (John Haugh) writes:
>What do you do with a simple status variable like errno that is
>modified by all of the system calls?

A botch.  Consider what happens in multi-threaded applications,
such as a signal handler that has to do system calls.  Of course
there are ways around this misfeature, but they shouldn't have
been necessary.
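
One commonly used workaround, sketched here as an assumption rather than as
the specific remedy meant above, is for the handler to save errno on entry
and restore it on exit.  noisy_call() is a made-up stand-in for any call
that may overwrite errno.

```c
#include <errno.h>

/* Hypothetical stand-in for a system call that fails and sets errno. */
static void noisy_call(void)
{
    errno = EDOM;
}

/* Handler-style function: whatever it does internally, the interrupted
 * code sees the same errno afterwards as it did before. */
void on_signal(int signo)
{
    int saved_errno = errno;

    (void)signo;          /* unused in this sketch */
    noisy_call();         /* clobbers errno */
    errno = saved_errno;  /* put the caller's value back */
}
```

A real handler would be installed with signal() from <signal.h> and should
restrict itself to calls that are safe to make from a handler.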

Don't knock ADT or other structured software design methodologies
until you've learned to use them.  A lot of very smart people have
worked very hard over the past few decades to figure out how to
turn software construction into a solid engineering discipline
rather than a guild craft.  Much real progress has been made in
understanding the fundamental issues and in developing suitable
methodology, although as usual many practitioners remain ignorant
of theoretical developments (especially in the UNIX community,
which has more than its fair share of random hackers).

steele@unc.cs.unc.edu (Oliver Steele) (07/10/87)

John Haugh (jfh@killer.UUCP) writes:
]What do you do with a simple status variable like errno that is
]modified by all of the system calls?

Doug Gwyn (VLD/VMB) <gwyn> (gwyn@brl.arpa) writes:
>A botch.  Consider what happens in multi-threaded applications,
>such as a signal handler that has to do system calls.  Of course
>there are ways around this misfeature, but they shouldn't have
>been necessary.

Agreed.  There are other examples that aren't (see below).

>Don't knock ADT or other structured software design methodologies
>until you've learned to use them.  A lot of very smart people have
>worked very hard over the past few decades to figure out how to
>turn software construction into a solid engineering discipline
>rather than a guild craft.  Much real progress has been made in
>understanding the fundamental issues and in developing suitable
>methodology, although as usual many practitioners remain ignorant
>of theoretical developments (especially in the UNIX community,
>which has more than its fair share of random hackers).

(I've intentionally blurred the distinction between ADTs and OOP below.)

Even very object oriented languages such as Smalltalk don't manage to
completely eliminate globals: it is still useful (in the case of
Smalltalk) to have such objects as Transcript, Disk, Sensor, and Display.
In C you can write functions that assign and return what are really
globals and restrict the scope of the actual globals to the file defining
these functions (similar to class variables in Smalltalk), but it's
arguably clearer to use an object than a function to represent entities
such as the standard input stream that people tend to think of as
objects.

Another way to 'fake' globals is to have a set of functions (or macros)
such as putchar() and printf() which take one less parameter than your
base set and hide the global inside themselves.  In the case of printf()
this is convenient; in a drawing module it would require DrawGrayCircle,
DrawBlackCircle, DrawWhiteCircle, etc., just to hide Gray, Black, and
White from the extra-modular code.
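
Both ideas can be sketched in a few lines, assuming a small logging module.
The names log_to, log_msg and log_redirect are invented for the example: the
"global" is a file-scope static reached only through functions, and the
convenience function takes one parameter fewer than the base function.

```c
#include <stdio.h>

/* Base function: the destination is named explicitly, as with fprintf(). */
int log_to(FILE *out, const char *msg)
{
    return fprintf(out, "log: %s\n", msg);
}

/* The hidden "global".  NULL means "use stderr"; stderr itself cannot
 * portably be used as a static initializer. */
static FILE *log_stream = NULL;

/* The only way for other files to change the hidden stream. */
void log_redirect(FILE *out)
{
    log_stream = out;
}

/* Convenience function: one parameter fewer, as with printf(). */
int log_msg(const char *msg)
{
    return log_to(log_stream ? log_stream : stderr, msg);
}
```

Callers that never need another destination use log_msg() alone and never
see the stream variable.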

This discussion branch started as a question on where to put globals when
you do use them; even if they can all be eliminated, most programmers are
going to learn ADTs long before they eliminate all globals.

Most objects conceptually belong with certain modules.  For instance, if I
were working exclusively with one size of matrix, I would place the zero
and identity matrices in the module defining the matrix ADT.

------------------------------------------------------------------------------
Oliver Steele				  ...!{decvax,ihnp4}!mcnc!unc!steele
							steele%unc@mcnc.org

	"They're directly beneath us, Moriarty.  Release the piano!"

wong@llama.rtech.UUCP (J. Wong) (07/10/87)

In article <8113@brl-adm.ARPA> edstrom%UNCAEDU.BITNET@wiscvm.wisc.EDU writes:
>I don't understand the need for macros in defining and initializing global
>variables. Maybe I am missing the point but what I do for header files with
>variables shared by many .c modules is:
>
>headerfile.h
>
>extern double example;
>double example = 33.3;
>
>I use this on VAX VMS and it works fine. Is there some reason why this approach
>is not "safe" or "proper"?
>
Although this is "safe" and "proper", it is not space efficient.  The DEC C
implementation allocates all global variables separately, each in a separate
PSECT.  Since a PSECT is a page, this results in a lot of wasted space.  In
addition, if you happen to have a declaration that is not used, it still
forces the definition to be linked into the image.

Better to use globalref/globaldef with DEC C (see the manual.)  Of course,
this is not portable ... (unless you use defines of some sort.)
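
A sketch of the "defines of some sort", under the assumption that the
compiler predefines VAXC (or vaxc); check the manual for the symbol your
version actually provides.  GLOBALREF, GLOBALDEF and page_count are
hypothetical names.

```c
#if defined(VAXC) || defined(vaxc)      /* assumed predefined by VAX C */
#define GLOBALREF globalref
#define GLOBALDEF globaldef
#else                                   /* everywhere else: plain REF/DEF */
#define GLOBALREF extern
#define GLOBALDEF
#endif

/* In the header, seen by every file: */
GLOBALREF int page_count;

/* In exactly one source file: */
GLOBALDEF int page_count = 64;
```

On other compilers this expands to an ordinary extern declaration and a
single initialized definition.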
				J. Wong		ucbvax!mtxinu!rtech!wong

****************************************************************
You start a conversation, you can't even finish it.
You're talking alot, but you're not saying anything.
When I have nothing to say, my lips are sealed.
Say something once, why say it again.		- David Byrne

leichter@yale.UUCP (Jerry Leichter) (07/12/87)

In article <1057@rtech.UUCP> wong@llama.UUCP (J. Wong) writes:
>Although this is "safe" and "proper", it is not space efficient.  The DEC C
>implementation allocates all global variables separately, each in a separate
>PSECT.  Since a PSECT is a page, this results in a lot of wasted space.  In
>addition, if you happen to have a declaration that is not used, it still
>forces the definition to be linked into the image.
>

The above comment is nonsense.  PSECT's can be any size; the alignment they
are forced to can be set to anything from a byte to a page.  Data PSECT's
produced for C extern's are given the alignment appropriate for the data
they contain.  Starting with V2.3 of VAX C, you can even specify the alignment
you want (with the VAX C-specific _align storage class modifier).

							-- Jerry