[comp.unix.xenix] Answers to

dyer@spdcc.COM (Steve Dyer) (04/27/87)
What follows are several replies mailed to me clarifying the allocation of
stack and memory segments when building large systems, particularly
when using large model.  Ironically, it turned out that the particular
application I was building (MH 6.5) finally didn't require anything larger
than sep. I&D space, after carefully choosing options (although MH brought
its own set of portability problems, which I'll summarize in another note.)
However, all the comments were much appreciated and went a long way towards
clearing up my confusion.  I hope this will help others.

Steve Dyer
-------------
From:	sco!md
Steve,

	Let me try and fill in some of the gaps in the Xenix-286
	documentation - I think that this will also answer your
	specific questions.

	Large model programs running under Xenix-286 do NOT
	have a separate segment for the stack. The stack always
	occupies the area at the high end of the program's first
	data segment (and SS == DS). The low end of the first data
	segment is used for initialised data - uninitialised data
	is placed in one or more additional data segments.
	Thus the total size of stack and initialised data cannot
	exceed 64K - exec() refuses to exec a binary if this
	size is > 64K and returns ENOMEM.

	The stack size is contained in the x.out header, and
	can easily be altered by using "fixhdr -F ....."
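
A minimal sketch of the arithmetic behind that limit, in C. The function
names and the SEGMENT_LIMIT macro are made up for illustration; this only
models the rule as described above (initialised data plus stack must not
exceed 64K) and is not the actual exec() check.

```c
/* Size of one 80286 segment in bytes (64K). */
#define SEGMENT_LIMIT 65536UL

/* Hypothetical helper: returns 1 if initialised data plus stack fit in
 * the first data segment, 0 if the combined size exceeds 64K (the case
 * in which exec() is described as failing with ENOMEM). */
int fits_first_segment(unsigned long init_data, unsigned long stack)
{
    return init_data + stack <= SEGMENT_LIMIT;
}

/* Hypothetical helper: largest stack size that still fits beside
 * init_data bytes of initialised data; 0 if the data alone fills
 * the segment. */
unsigned long max_stack(unsigned long init_data)
{
    return init_data >= SEGMENT_LIMIT ? 0 : SEGMENT_LIMIT - init_data;
}
```

For instance, 61K of initialised data would leave at most 3072 bytes for
the stack under this model.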

	If you wish to maximise the available stack space, then
	it follows that you must minimise the amount of initialised
	data. A certain amount of this data is in the C libraries
	and nothing can be done about this, however there are
	two things that you can do which will reduce the amount
	of program data in this segment.
	    a) remove the initialisers from any data that is
	       explicitly initialised to 0. This will cause it
	       to be moved out to one of the "far" bss segments
	    b) move data that MUST be initialised into a
	       separate module, and compile it using the -ND
	       compiler flag - this allows you to specify a
	       different data segment name from the default
	       which is "_DATA" and will cause the data in
	       that module to be allocated in a different
	       segment.
	       eg:	cc -c -ND dataseg1 .... prog.c

	       It is **extremely** important that modules
	       compiled in this way only contain data -
	       if they contain any code then it is likely
	       that the DS register will be reloaded with
	       the segment selector for this "extra"
	       data segment - the previous value will not
	       be saved, and problems will arise when you
	       try and access any data that IS in the
	       first data segment.
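
To illustrate point (a), here is a small C sketch. The array names and the
count_nonzero() helper are invented for the example. Both arrays read as
all zeros at startup, because C guarantees that static storage without an
initialiser starts out zeroed; only the first one counts as initialised
data in the sense used above. Where each array actually lands depends on
the compiler and loader, so check the link map rather than trusting this
sketch.

```c
/* Explicitly initialised to zero: per the description above, this is
 * treated as initialised data and competes with the stack for room. */
int explicit_zero[1000] = { 0 };

/* No initialiser: still zero at program start (C guarantees this for
 * static storage), but it is uninitialised data as far as the
 * toolchain is concerned. */
int implicit_zero[1000];

/* Hypothetical helper: counts the non-zero elements of an int array,
 * to confirm that both declarations behave the same at run time. */
int count_nonzero(const int *a, int n)
{
    int i, count = 0;

    for (i = 0; i < n; i++)
        if (a[i] != 0)
            count++;
    return count;
}
```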

Michael Davidson,
SCO Languages Group
(sco!md)
--------------
From:     laskin@gryphon.CTS.COM
>
>Is the -F 0xxxx flag which can be given both to "cc" and "ld" used only by
>the loader or does the compiler itself use this too?  My guess would be
>the loader only; otherwise, the standard libraries would have to be special-
>cased.  If this is so, then is it permissible to play around with
>"fixhdr -F 0xxxx x.out" increasing the stacksize as necessary?
>
cc just passes -F through to the loader.

>What exactly does the 'd' flag do in "cc", as in "cc -Mld ...".
>There is only the vaguest allusion to this in the manual page;
>it says "instructs the compiler not to assume SS=DS."  When would
>this be used?  I would assume that SS<>DS in any large model program.

In large model, the _DATA segment gets all the static initialized
data (the strings mostly).  The loader then adds the default stack
size (or the stack size set by -F 0xxxxx) to the segment size
and arranges for SP to be initialized to the END of this segment.
The default stack allocation is 2K.  Uninitialized static data
goes into FAR_BSS and gets its own (and as many as required)
data segment(s).

There is a case where the loader fills up _DATA with, say, 61K of
data which leaves no room for the stack.  The -Md flag causes the
compiler to manage SS and DS on calls leaving a module.  Normally,
SS is assumed to be the same throughout the program.  DS is
changed only when you enter (or leave) a module with a declared (-ND)
data segment name.  You're probably asking for it if you use -Md.

If you use -ND to name a data module, it gets its own segment, however
you can't make any library calls from the module.  The reasons are
arcane and have to do with DS pointing at the wrong place for the
library routines.  I've heard a report that explicitly named
data segments >32K fail but I haven't confirmed it.

>I'm trying to bring up MH-6.5, a system which unfortunately assumes
>too much about the machine it's running on and defines gratuitously
>large (in 8086 terms) automatic arrays.  Naturally, this is guaranteed

Use -m mapfile and get a link map.  Look at the size of the _DATA
segment.  What's left in that segment can be used by the stack.
Try -Md (I don't think it will fix it).  Then you get out xstr and
extract the strings from all the modules (hope you have an xstr
that works, SCO's seems to be broken ... I have a binary xstr that
will run under SCO) and do something like I did in this makefile for
ELM:

[ S.D.--I didn't have any problem with SCO's "xstr", although both 4.XBSD
and SCO's "xstr" screw up when hitting compile-time initializations like:

char str[100] = "foobar";

since that gets translated into the meaningless construct

char str[100] = &xstr[NNNN];

]
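
One possible way around that xstr problem (a suggestion, not something
taken from the replies above) is to drop the compile-time initialiser and
copy the string in at run time. The init_str() name is made up for this
sketch. The literal is then an ordinary function argument, and a pointer
into the xstr array is valid in that position.

```c
#include <string.h>

/* Instead of:  char str[100] = "foobar";
 * The array has no initialiser, so it starts out zeroed. */
char str[100];

/* Hypothetical helper: fills str at run time.  Call it once, before
 * str is first used. */
void init_str(void)
{
    strcpy(str, "foobar");
}
```

A side effect is that str no longer counts as initialised data.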

{ many makefile lines deleted }

../bin/elm: ${OBJS} ${EXTRA} ${HEADERS} ../hdrs/elm.h
	xstr
	${CC} -c -Ml -ND STRING xs.c
	${CC} -F 8000 -o ${BIN}/elm -Ml -m elm.map  ${OBJS} xs.o ${LIBS} ${LIB2}
	chgrp mail ../bin/elm


.c.o:   ${HEADERS}
	xstr -c $*.c
	${CC} -c ${CFLAGS} -NM $* -NT $* ${DEFINE} x.c 
	mv x.o $*.o

----------------------------------------------------------------
The map:
Stack Allocation = 32768 bytes

 Start     Length Name                   Class
 003f:0000 00000H _CSU_TEXT              CODE
 003f:0000 00090H ~CRT0_TEXT             CODE
 003f:0090 01169H ADDR_UTILS             CODE
	[ bunch of module names]
 004f:7458 00000H C_ETEXT                ENDCODE
 --------------------------------------------------------------
 0057:0000 01990H _DATA                  DATA
 0057:1990 00000H XIB                    DATA
 0057:1990 00000H XI                     DATA
 0057:1990 00000H XIE                    DATA
 0057:1990 00000H XCB                    DATA
 0057:1990 00000H XC                     DATA
 0057:1990 00000H XCE                    DATA
 0057:1990 0042aH CONST                  CONST
   -- you get from here to the end of the segment for the stack --
 0057:1dba 00000H STACK                  STACK
 0057:1dba 00000H EDATA                  ENDDATA
 0057:1dba 04942H _BSS                   BSS
 0057:66fc 00022H c_common               BSS
 0057:671e 00000H EEND                   ENDDS
 005f:0000 00000H STRING                 FAR_DATA
 005f:0000 00000H STRING_CONST           FAR_DATA
 005f:0000 00000H STRING_BSS             FAR_DATA
--- note 32K worth of strings end up here instead of in _DATA ----
 0067:0000 08925H XS1_DATA               FAR_DATA
 006f:0000 07adeH FAR_BSS                FAR_BSS

and remember, no arrays greater than 64K or you have to go to huge model
or play with the 'far' and 'huge' keywords. 
Avoid that at all costs :-).
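
The sizes in the map above can be checked with a few lines of C. This is
only a sketch under two assumptions: that the stack is added on top of
everything the map lists in segment 0057, and that the -F argument is read
as hexadecimal (the makefile passes "-F 8000" and the map reports 32768
bytes, which is consistent with that). The helper names are invented for
the example.

```c
#include <stdlib.h>

/* Size of one 80286 segment in bytes (64K). */
#define SEGMENT_LIMIT 0x10000UL

/* Non-zero lengths copied from the map lines for segment 0057. */
static const unsigned long seg0057_lengths[] = {
    0x01990UL,  /* _DATA    */
    0x0042aUL,  /* CONST    */
    0x04942UL,  /* _BSS     */
    0x00022UL   /* c_common */
};

/* Total bytes the map shows in segment 0057, excluding the stack. */
unsigned long seg0057_used(void)
{
    unsigned long total = 0;
    size_t i;

    for (i = 0; i < sizeof seg0057_lengths / sizeof seg0057_lengths[0]; i++)
        total += seg0057_lengths[i];
    return total;
}

/* Assumption: the -F argument is a hexadecimal byte count. */
unsigned long stack_from_flag(const char *arg)
{
    return strtoul(arg, NULL, 16);
}

/* Bytes left in the 64K segment after data and stack; 0 if none. */
unsigned long headroom(unsigned long used, unsigned long stack)
{
    unsigned long total = used + stack;

    return total >= SEGMENT_LIMIT ? 0 : SEGMENT_LIMIT - total;
}
```

The lengths add up to 0x671e, matching the EEND offset in the map. With a
32768-byte stack, that leaves 6370 bytes of slack in the segment under the
assumptions above.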

Don't flame me.  I didn't design the chip.  I just program the damn thing.
--
Greg Laskin           
Trailing Edge Technologies, Ltd. (but not very)
INTERNET:     laskin@gryphon.CTS.COM
UUCP:         {akgua, hplabs!hp-sdd, sdcsvax, ihnp4, nosc}!crash!gryphon!laskin
--------------------
-- 
Steve Dyer
dyer@harvard.harvard.edu
dyer@spdcc.COM aka {ihnp4,harvard,linus,ima,bbn,halleys}!spdcc!dyer