[comp.lang.c] Caution for those considering MSC 5.0

mc35+@andrew.cmu.edu (Mark Chance) (02/05/88)

In recompiling my application under 5.0 I discovered an annoying
feature which effectively prevents me from using 5.0.  My application is
pretty large and I need a lot of stack space.  I am using the large
model and by adding the -Gt option to force data items to their own
segment things are pretty cool.  Now along comes 5.0 and I specify 20000
bytes stack space, the linker says 'Error: stack+data>64K'.  I say how
can that be?  Well the punch line is that 5.0 puts strings in the
CONST space which is in the stack segment !!! :-(.  So cluttering up my
precious stack space is 40K worth of strings that used to be distributed
among the various FAR-DATA segments.  I intend to complain to Microsoft
about this since I have not found any compiler switches to avoid this.
Any other ideas?

Mark Chance                Information Technology Center
mc35+@andrew.cmu.edu       Carnegie-Mellon Univ.

jrl@anuck.UUCP (j.r.lupien) (02/07/88)

Microsoft has a fix for pass two for this problem. I have tried
it, and it seems to work. Call them up, and ask for:
the C 5.0 / GT FIX disk.
It comes with instructions on how to install.

gof@crash.cts.com (Jerry Fountain) (02/08/88)

In article <UW2DrLy00VsGZ8I0CK@andrew.cmu.edu> mc35+@andrew.cmu.edu (Mark Chance) writes:
>In recompiling my application under 5.0 I discovered an annoying
>feature which effectively prevents me from using 5.0.  My application is
>pretty large and I need a lot of stack space.  I am using the large
>model and by adding the -Gt option to force data items to their own
>segment things are pretty cool.  Now along comes 5.0 and I specify 20000
>bytes stack space, the linker says 'Error: stack+data>64K'.  I say how
>can that be?  Well the punch line is that 5.0 puts strings in the
>CONST space which is in the stack segment !!! :-(.  So cluttering up my
>precious stack space is 40K worth of strings that used to be distributed
>among the various FAR-DATA segments.  I intend to complain to Microsoft
>about this since I have not found any compiler switches to avoid this.
>Any other ideas?
>
>Mark Chance                Information Technology Center
>mc35+@andrew.cmu.edu       Carnegie-Mellon Univ.


I think you should have contacted MS before putting out this minor flame
(or at least have posted it as a question to the net rather than a complaint)

What you need to do is call MS, and tell them you need the 'Gt fix for the
second pass of the compiler.'  This makes 5.0 work like 4.0 (makes CONST
-Gt aware).  They will send you a replacement second pass for the compiler.
Note:  When I spoke with them about the 'bug' they stated they did not
consider it a bug but simply a design change.  In reality it only affects
those with a large stack requirement or a large number of CONSTs.


-- 
-----Jerry Fountain-----
UUCP: {hplabs!hp-sdd,sdcsvax,nosc}!crash!pnet01!gof   ARPA: crash!gof@nosc.mil
MAIL: 523 Glen Oaks Dr., Alpine, Calif. 92001         INET: gof@pnet01.CTS.COM

PEPRBV%CFAAMP.BITNET@husc6.harvard.EDU (Bob Babcock) (02/11/88)

mc35+@andrew.cmu.edu (Mark Chance) writes:
>Well the punch line is that [MSC] 5.0 puts strings in the
>CONST space which is in the stack segment ....
>that used to be distributed
>among the various FAR-DATA segments.

I ran into  just the opposite  problem.   Having  just  purchased
Turbo-C  1.5 and  MSC 5.0 as possible  replacements  for Computer
Innovations C86, I found that some global variables which were in
DGROUP under Turbo-C were put into another segment  by MSC.  This
caused my assembly language  subroutines  to quickly go south.  I
would  have  expected  the linker  to warn me that something  was
wrong, but it didn't.   Anyway,  my question  is: can I force MSC
5.0 to put all global variables  into DGROUP when using the large
model?  The manual seems to indicate that only initialized global
data  will  go  here,  but  isn't  all  global   data  implicitly
initialized to zero if not otherwise specified?

john@viper.Lynx.MN.Org (John Stanley) (02/15/88)

In article <11754@brl-adm.ARPA> 
PEPRBV%CFAAMP.BITNET@husc6.harvard.EDU (Bob Babcock) writes:
.......
 >The manual seems to indicate that only initialized global
 >data  will  go  here,  but  isn't  all  global   data  implicitly
 >initialized to zero if not otherwise specified?

  It's true on "many", but not all systems.  Rule of thumb is NEVER 
assume any UN-initialized variable contains zero (or NULL)...  If you
haven't explicitly put something into a variable, assume it's set to 
a random value or the constant most likely to cause your procedure to 
bomb....

--- 
John Stanley (john@viper.UUCP)
Software Consultant - DynaSoft Systems
UUCP: ...{amdahl,ihnp4,rutgers}!meccts!viper!john

ekb@ho7cad.ATT.COM (Eric K. Bustad) (02/17/88)

In article <620@viper.Lynx.MN.Org>, john@viper.Lynx.MN.Org (John Stanley) writes:
> In article <11754@brl-adm.ARPA> PEPRBV%CFAAMP.BITNET@husc6.harvard.EDU (Bob Babcock) writes:
> .......
>  >The manual seems to indicate that only initialized global
>  >data  will  go  here,  but  isn't  all  global   data  implicitly
>  >initialized to zero if not otherwise specified?
> 
>   It's true on "many", but not all systems.  Rule of thumb is NEVER 
> assume any UN-initialized variable contains zero (or NULL)...  If you
> haven't explicitly put something into a variable, assume it's set to 
> a random value or the constant most likely to cause your procedure to 
> bomb....

Actually, K&R says on page 198:  "Static and external variables which are
not initialized are guaranteed to start off as 0".  If a compiler does not
arrange for this to happen, then it is not a C compiler.  As a matter of
style, however, I agree with John that one should explicitly initialize
any variable if your code makes any use of its initial value.

As for how MSC 5.0 handles "uninitialized" globals, I imagine that it is
doing the same thing as all of the C compilers I've ever used.  All of the
static variables which are implicitly initialized to zero are placed in a
separate section called BSS.  Only the size of this section needs to be
stored in the object file, saving mucho disk space.
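
A minimal sketch of the guarantee quoted above, for anyone who wants to
check it on their own compiler.  The variable and function names are made
up for illustration; whether the objects actually land in a BSS section is
up to the toolchain and is only assumed in the comments.

```c
#include <stddef.h>

/* No initializers anywhere below: the language guarantees these start as 0. */
int         counter;            /* external (global) variable */
static long total;              /* file-scope static */
static char big_buffer[8192];   /* typically BSS: only its size is recorded */

long get_total(void)
{
    return total;
}

/* Returns 1 if every byte of big_buffer is still 0, else 0. */
int buffer_is_all_zero(void)
{
    size_t i;

    for (i = 0; i < sizeof big_buffer; i++)
        if (big_buffer[i] != 0)
            return 0;
    return 1;
}

/* A function-local static is covered by the same guarantee. */
int next_id(void)
{
    static int id;   /* starts at 0 and keeps its value between calls */

    return ++id;
}
```

Calling next_id() twice in a fresh program should hand back 1 and then 2,
without id ever having been assigned an initial value by hand.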

= Eric Bustad

karl@haddock.ISC.COM (Karl Heuer) (02/19/88)

In article <278@ho7cad.ATT.COM> ekb@ho7cad.ATT.COM (Eric K. Bustad) writes:
>As a matter of style, however, I agree with John that one should explicitly
>initialize any variable if your code makes any use of its initial value.

I sometimes use the opposite approach: if I have an object of static duration
whose initial value is irrelevant, I may emphasize this fact by writing e.g.
"int x = ARB;".  Usually I do this within partially-initialized aggregates:
"static struct foo x = { &y, 4L, '\0', ARB };", which indicates that the
missing member will be filled in by the subsequent code.

(The value of ARB is, of course, truly arbitrary; I generally #define it as 0
so that it can be used to initialize either numbers or pointers.)
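
Spelled out as a compilable sketch, the ARB idiom might look like the
following.  The member names of struct foo and the foo_setup() helper are
assumptions made up for the example; only the initializer line itself comes
from the text above.

```c
#define ARB 0   /* arbitrary placeholder: the value is never relied upon */

struct foo {
    int  *target;
    long  count;
    char  tag;
    int   cursor;   /* given its real value later by foo_setup() */
};

static int y;

/* ARB marks `cursor` as deliberately unspecified at this point. */
static struct foo x = { &y, 4L, '\0', ARB };

/* Fills in the member that the initializer left arbitrary. */
void foo_setup(int start)
{
    x.cursor = start;
}

int foo_cursor(void)
{
    return x.cursor;
}

long foo_count(void)
{
    return x.count;
}
```

The point of the idiom is documentation: a reader sees ARB and knows not to
depend on the value until foo_setup() (or whatever the later code is) has
run.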

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

brett@wjvax.UUCP (Brett Galloway) (02/20/88)

In article <2635@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <278@ho7cad.ATT.COM> ekb@ho7cad.ATT.COM (Eric K. Bustad) writes:
>>As a matter of style, however, I agree with John that one should explicitly
>>initialize any variable if your code makes any use of its initial value.

It has long bothered me that C guarantees that uninitialized data of static
extent gets initialized to 0.  Personally, I *never* rely on that fact, for
two reasons.  One is a matter of style (see above).  The other is a matter
of principle.  This guarantee is useless and can introduce enormous
inefficiencies.  Unless your machine obeys the convenient kludge that binary 0
translates to a 0 object of every type, then in general a copy of the entire
uninitialized data space must be put in the executable and loaded into
memory.  Something like BSS is completely useless then.

In those cases where I have a lot of data that needs to be initialized, it
is usually the case that it is fairly simple to initialize with code, and so
I just keep a static initialized flag to indicate whether or not to
initialize the rest.
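
A rough sketch of the flag approach just described, assuming a small lookup
table as the data to be set up.  The names (squares, table_ready,
square_of) are invented for the example and do not come from any real
program.

```c
#define TABLE_SIZE 100

static int squares[TABLE_SIZE];
static int table_ready = 0;   /* explicitly initialized flag */

/* Builds the table with code instead of a long initializer list. */
static void init_table(void)
{
    int i;

    for (i = 0; i < TABLE_SIZE; i++)
        squares[i] = i * i;
    table_ready = 1;
}

/* Returns i*i for 0 <= i < TABLE_SIZE, or -1 if i is out of range. */
int square_of(int i)
{
    if (i < 0 || i >= TABLE_SIZE)
        return -1;
    if (!table_ready)
        init_table();   /* first call pays the setup cost */
    return squares[i];
}
```

Only the one-word flag needs a stored initial value; the table contents are
produced the first time square_of() is called.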

I can see where the inefficiency above could raise even more efficiency
problems for a non-hosted system (on PROM).  A C application in PROM must
keep a copy of the static data area in PROM and copy it to RAM prior to
running.   Implicitly initializing the entire static data area can in general
greatly increase the size of PROM needed.  An equivalent assembly language
implementation would of course just use BSS, and would be *much* smaller.

Anyhow, I think that this is a botch in C; it is a shame that it can't be
changed.  Maybe D?

-- 
-------------
Brett D. Galloway
{ac6,calma,cerebus,isi,isieng,pyramid,tymix}!wjvax!brett

chris@trantor.umd.edu (Chris Torek) (02/20/88)

In article <1221@wjvax.UUCP> brett@wjvax.UUCP (Brett Galloway) writes:
>It has long bothered me that C guarantees that uninitialized data of static
>extent gets initialized to 0.  Personally, I *never* rely on that fact, for
>two reasons.  One is a matter of style (see above).  The other is a matter
>of principle.  This guarantee is useless and can introduce enormous
>inefficiencies.  Unless your machine obeys the convenient kludge that binary 0
>translates to a 0 object of every type, then in general a copy of the entire
>uninitialized data space must be put in the executable and loaded into
>memory.  Something like BSS is completely useless then.

I will not argue on the point of style, but the latter is wrong.
C has very few basic types (there are more in the dpANS than in
K&R, but still few enough); hence something like BSS can always be
used, even if the machine architecture has something like tag bits.
The key idea here is that a small amount of code can replace a
large amount of data:  16 (if that happens to be the number) separate
kinds of `bss' can be used to initialise the 16 kinds of zero.
Arrays of structures containing differing kinds of zeros can be set
with runtime startup loops:
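
One way to picture such a loop, written out by hand.  This is only a sketch:
struct record and zero_records() are hypothetical names, and on a real
implementation the compiler or its startup code would generate the
equivalent before main() runs rather than leaving it to the programmer.

```c
#include <stddef.h>

struct record {
    long    count;
    double  weight;
    char   *name;
};

#define NRECORDS 64

/* Stands for storage that the loader did not fill in bit-for-bit. */
struct record records[NRECORDS];

/* Hypothetical startup hook: store each member's own kind of zero. */
void zero_records(struct record *r, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++) {
        r[i].count  = 0L;
        r[i].weight = 0.0;
        r[i].name   = NULL;   /* null pointer, whatever its bit pattern */
    }
}
```

A dozen instructions of loop replace NRECORDS copies of the structure in
the executable, which is the trade being argued for here.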

>In those cases where I have a lot of data that needs to be initialized, it
>is usually the case that it is fairly simple to initialize with code, and so
>I just keep a static initialized flag to indicate whether or not to
>initialize the rest.

The compiler can do this implicitly.  C++ does this sort of thing
for static constructors, for instance.
-- 
In-Real-Life: Chris Torek, Univ of MD Computer Science, +1 301 454 7163
(hiding out on trantor.umd.edu until mimsy is reassembled in its new home)
Domain: chris@mimsy.umd.edu		Path: not easily reachable

gwyn@brl-smoke.ARPA (Doug Gwyn ) (02/21/88)

In article <1221@wjvax.UUCP> brett@wjvax.UUCP (Brett Galloway) writes:
>Unless your machine obeys the convenient kludge that binary 0
>translates to a 0 object of every type, then in general a copy of the entire
>uninitialized data space must be put in the executable and loaded into
>memory.  Something like BSS is completely useless then.

No, on those architectures where different object types require
different representations for zero, it is possible to initialize
arrays (other types are best done via explicit .data) using a
small amount of run-time support code that interprets a data
descriptor template and plops down the right kinds of zero data
before main() is called.
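
A minimal sketch of what such run-time support might look like.  The
descriptor layout (struct zero_desc), the enum, and the function name are
all assumptions invented for illustration, not any real compiler's format.

```c
#include <stddef.h>

/* Kinds of zero this hypothetical startup code knows how to write. */
enum zero_kind { Z_LONG, Z_DOUBLE, Z_PTR };

/* One template entry: `count` objects of kind `kind` start at `addr`. */
struct zero_desc {
    enum zero_kind kind;
    void          *addr;
    size_t         count;
};

/* Walks the template and stores the right zero for each entry. */
void apply_zero_template(const struct zero_desc *d, size_t ndesc)
{
    size_t i, j;

    for (i = 0; i < ndesc; i++) {
        switch (d[i].kind) {
        case Z_LONG:
            for (j = 0; j < d[i].count; j++)
                ((long *)d[i].addr)[j] = 0L;
            break;
        case Z_DOUBLE:
            for (j = 0; j < d[i].count; j++)
                ((double *)d[i].addr)[j] = 0.0;
            break;
        case Z_PTR:
            for (j = 0; j < d[i].count; j++)
                ((char **)d[i].addr)[j] = NULL;
            break;
        }
    }
}
```

The template costs a few words per array no matter how large the array is,
so the executable stays small even when all-bits-zero is not the right
representation for every type.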

wes@obie.UUCP (Barnacle Wes) (02/22/88)

In article <11754@brl-adm.ARPA>, PEPRBV%CFAAMP.BITNET@husc6.harvard.EDU (Bob Babcock) writes:
> The manual seems to indicate that only initialized global
> data will go here

This is correct.  This is also as it should be.

> but isn't all global data implicitly
> initialized to zero if not otherwise specified?

No, as a matter of fact, it ISN'T supposed to be zeroed, it's supposed
to be left as it was!  MSC isn't wrong here, your code is wrong
because it was written to work with Turbo C, which is wrong.

The MSC compiler is the same compiler as Microsoft's Xenix compiler;
it generates three types of segments: text, data, and bss.

Text segments contain executable program text, they are executable but
not writable on protected systems.  The contents of the text segments
are, of course, stored in the executable file.

Data segments contain initialized variables, they are non-executable,
readable, and writable.  The values for the data segments are stored
in the executable file, similar to the text segments.

BSS (Block Started by Symbol) segments are for pre-allocated,
uninitialized data items that are NOT stack based.  In C, this means
all STATIC variables (of which globals are a sub-set) that are not
initialized.  The stack segment is basically just a special bss
segment.

This is the way C compilers are supposed to generate code.  As usual,
Borland made good ol' Turbo C do it wrong, and if you work with TC at
any depth at all, it will screw you every time!  (IMHO) your best bet
is to do what I did: pitch Turbo C in the garbage, either cry or rage
for a FEW minutes over what a waste of money it was, and buy Quick C.
Of course, since you have MSC 5.0, you *already have* Quick C.  Lucky
you.
-- 
    /\              -  "Against Stupidity,  -    {backbones}!
   /\/\  .    /\    -  The Gods Themselves  -  utah-cs!utah-gr!
  /    \/ \/\/  \   -   Contend in Vain."   -  uplherc!sp7040!
 / U i n T e c h \  -       Schiller        -     obie!wes

flaps@csri.toronto.edu (Alan J Rosenthal) (02/22/88)

In article <620@viper.Lynx.MN.Org> john@viper.UUCP (John Stanley) writes:
>It's true on "many", but not all systems.  Rule of thumb is NEVER 
>assume any UN-initialized variable contains zero (or NULL)...  If you
>haven't explicitly put something into a variable, assume it's set to 
>a random value or the constant most likely to cause your procedure to 
>bomb....

ahem.  global and static variables are initialized to zero if they're
arithmetic, NULL if they're pointers, recursively through structs and
arrays but not unions (depending on who you ask); it's true that
automatic variables are not initialized.  (Unfortunately, vax bsd does
initialize them, analogous to making *(char *)0 == 0.)
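
To make the "recursively through structs and arrays" part concrete, here is
a small sketch.  The types and the checking function are invented for the
example.  It deliberately leaves unions out, and it never reads an
uninitialized automatic variable, since that value is garbage.

```c
#include <stddef.h>

struct point { int x, y; };

struct shape {
    struct point corners[4];   /* nested array of structs */
    double       area;
    const char  *label;
};

static struct shape default_shape;   /* static, no initializer */

/* Returns 1 if every member holds its own type's zero, else 0. */
int default_shape_is_zero(void)
{
    int i;

    for (i = 0; i < 4; i++)
        if (default_shape.corners[i].x != 0 ||
            default_shape.corners[i].y != 0)
            return 0;
    return default_shape.area == 0.0 && default_shape.label == NULL;
}
```

On a conforming implementation default_shape_is_zero() should return 1 in
a freshly started program.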

There may be some implementations out there in which globals are not
properly initialized, but they're not full implementations of C, by
definition.

Your rule of thumb is valid for stylistic reasons, but stylistic
reasons only.

ajr
-- 
"noalias considered sailaon"

henry@utzoo.uucp (Henry Spencer) (02/23/88)

>  >...  isn't  all  global   data  implicitly
>  >initialized to zero if not otherwise specified?
> 
>   It's true on "many", but not all systems...

This is like saying that 2+2 == 4 is true on many but not all systems.
2+2 == 4 and implicit initialization of globals to zero are both guaranteed
facts in correct implementations of C.  Your own environment will determine
how interested you are in buying and using incorrect implementations.
Personally I think it's a waste of money and time.

bts@sas.UUCP (Brian T. Schellenberger) (02/26/88)

In article <620@viper.Lynx.MN.Org> john@viper.UUCP (John Stanley) writes:
|In article <11754@brl-adm.ARPA> 
|PEPRBV%CFAAMP.BITNET@husc6.harvard.EDU (Bob Babcock) writes:
|.......
| >isn't  all  global   data  implicitly
| >initialized to zero if not otherwise specified?
|
|  It's true on "many", but not all systems.  

Any systems on which it is not true are  *WRONG* !

(K&R, section 4.9, p. 82)
-- 
                                                         --Brian.
(Brian T. Schellenberger)				 ...!mcnc!rti!sas!bts


m5@bobkat.UUCP (Mike McNally ) (02/26/88)

Explicit initialization of global variables usually causes the compiler/linker
to place those variables in the initialized data segment rather than the BSS
segment.  This causes object and executable files to be larger.
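
A small sketch of the difference, with made-up array names.  Where each
object ends up is decided by the particular compiler and linker, so the
placement comments describe the usual case only; some compilers also put
objects explicitly initialized to all zeros in BSS.

```c
#define N 4096

/* No initializer: typically BSS, so only the size goes in the file. */
long implicit_table[N];

/* Initializer present: typically stored in full in the data segment. */
long explicit_table[N] = { 1L };

long first_implicit(void) { return implicit_table[0]; }
long first_explicit(void) { return explicit_table[0]; }
long last_explicit(void)  { return explicit_table[N - 1]; }
```

Comparing the object file sizes with and without the second array is an
easy way to see the effect on a given toolchain.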

-- 
Mike McNally, mercifully employed at Digital Lynx ---
    Where Plano Road the Mighty Flood of Forest Lane doth meet,
    And Garland fair, whose perfumed air flows soft about my feet...
uucp: {texsun,killer,infotel}!pollux!bobkat!m5 (214) 238-7474

dc@gcm (Dave Caswell) (02/28/88)

In article <66@obie.UUCP> wes@obie.UUCP (Barnacle Wes) writes:
)
)> but isn't all global data implicitly
)> initialized to zero if not otherwise specified?
)
)No, as a matter of fact, it ISN'T supposed to be zeroed, it's supposed
)to be left as it was!  MSC isn't wrong here, your code is wrong
)because it was written to work with Turbo C, which is wrong.

You are wrong wrong wrong.  

In article <61@obie.UUCP>, wes@obie.UUCP (Barnacle Wes) writes:
) Try Stephen Kochan's book "Programming in C" from Hayden Books.  As a
) matter of fact, Hayden has a "Hayden Books Unix System Library," which
) is edited by Kochan and Pat Wood, all of which I have found useful.
) I've heard there is a book on Unix communications in this series which
) I am still looking for.  I heartily recommend any book in this series

Do these books really say that globals are not initialized to zero?
If so they are worthless.

bob@gen1.UUCP (Robert Kamins) (02/29/88)

In article <417@white.gcm> dc@gcm (Dave Caswell) writes:

>In article <66@obie.UUCP> wes@obie.UUCP (Barnacle Wes) writes:
>)
>)> but isn't all global data implicitly
>)> initialized to zero if not otherwise specified?
>)
>)No, as a matter of fact, it ISN'T supposed to be zeroed, it's supposed
>)to be left as it was!  MSC isn't wrong here, your code is wrong
>)because it was written to work with Turbo C, which is wrong.
>
>You are wrong wrong wrong. [etc] 

     Kernighan & Ritchie, "The C Programming Language", Appendix 
"A", Section 8.6 (p. 198, re: "Initialization") says:

    "Static and external variables which are not [specifically]
     initialized are guaranteed to start off as 0; automatic and 
     register variables which are not initialized are guaranteed 
     to start off as garbage."

     I have inserted "specifically" in the first line since the
context implies it.

     Bob Kamins.