[comp.sys.apollo] Huge data segments in SR10

moj@tatti.utu.fi (Matti Jokinen) (11/06/88)

SR10 appears to use about 10 times more space per process than SR9.7 did.
Here is a sample output from `ps axv':

  PID TTY     STAT  TIME PAGEIN SIZE  RSS   LIM TSIZ COMMAND
 8194 ?       S     0:24     18 1888  421  1857   96 nntpd
  102 ?       S <  22:55   6630 2816  227  1860  288 rgyd
 7282 ttyp0   S     0:10    270  640  184  1872  160 ksh
 8343 ttyp0   R     0:00      0  640  170  1854   96 ps
   99 ?       S    17:13   1252 3808   55  1881  128 glbd
   85 ?       R   102:06   1300 1728   53  1872   32 tcpd
   88 ?       S     0:39   4142 1856   44  1872   96 syslogd
  111 ?       S     0:34   2290 1824   43  1878   64 sendmail
   94 ?       S     1:43   2044 1824   40  1860   64 inetd
 7281 ttyp0   S     0:21     45 1856   34  1872   96 telnetd
  106 ?       S     8:04   1365  480   29  1872   32 cron
    1 ?       S <   0:28   2573 1120    0  1860  160 init
    6 ?       S     0:00      0    0    0     0    0 pinger
    5 ?       S    11:32      0    0    0     0    0 unwired_dxm
   10 ?       S     0:13      0    0    0     0    0 netrequest
    7 ?       S    35:27      0    0    0     0    0 netreceive
    9 ?       S     1:43      0    0    0     0    0 wired_dxm
    4 ?       S     0:10      0    0    0     0    0 purifier
    3 ?       S     7:42      0    0    0     0    0 purifier
   97 ?       S     0:02    297 1856    0  1860   64 llbd
    8 ?       S     1:26      0    0    0     0    0 netpaging
  116 ?       S <  21:54  17469 1472    0  1860  224 dm
    2 ?       R   8152:35      0    0    0     0    0 null

The value of the SIZE field approximately matches the amount of disk space
lost when the process is started.  Daemons alone consume more than 15 MB,
and user processes are no smaller.  Even without diskless clients and remote
users the peak load may go up to 30 MB if, for instance, cron and inetd
spawn a few background processes while you have two or three active windows
(I once ran out of disk space this way).

What is the motivation for such apparently wasteful use of resources?  Or is
there something wrong with our system?  Since the programs ran with much
smaller data segments in SR9.7, it seems obvious that the bulk of the space
is unused.

----------------
Matti Jokinen			Internet: moj@utu.fi
University of Turku		UUCP: ...mcvax!tut!tucos!moj
Finland

dbfunk@ICAEN.UIOWA.EDU (David B. Funk) (11/11/88)

In posting <130@tatti.utu.fi> Matti Jokinen (moj@utu.fi) asks:

>SR10 appears to use about 10 times more space per process than SR9.7 did.
>Here is a sample output from `ps axv':
(listing deleted)
>The value of the SIZE field approximately matches the amount of disk space
>lost when the process is started.  Daemons alone consume more than 15 MB,
>and user processes are no smaller.  Even without diskless clients and remote
>users the peak load may go up to 30 MB if, for instance, cron and inetd
>spawn a few background processes while you have two or three active windows
>(I once ran out of disk space this way).
>What is the motivation for such apparently wasteful use of resources?  Or is
>there something wrong with our system?  Since the programs ran with much
>smaller data segments in SR9.7, it seems obvious that the bulk of the space
>is unused.


    In "Domain System Software Release Notes" for SR10.0 there is a discussion
of the changes to the system routines malloc() & rws_$alloc on page 1-18.
The important part is that the requested storage is marked as "used" on
disk at allocation time, not delayed until the pages are used. The net
result is that a process's entire stack is allocated at process
creation time. Thus each new process you start eats up at least 500K bytes
of disk space, even if it's just a little thing like pwd. Also, pre-SR10
in-process execution saved you the cost of a new stack for simple command
execution.
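
    To make that concrete, here is a rough C sketch (generic code, not
Apollo-specific; the 4 MB figure is only an illustrative number). At SR10
the disk backing is claimed inside malloc() itself, so a NULL return is
where a shortage shows up, on top of the stack space every process now
pays for at creation time:

    #include <stdio.h>
    #include <stdlib.h>

    int main(void)
    {
        /* Hypothetical worst-case working buffer (4 MB, illustrative). */
        size_t want = 4L * 1024 * 1024;
        char *buf = malloc(want);

        if (buf == NULL) {
            /* At SR10 the disk reservation happens here: if the disk
             * cannot back all 4 MB, malloc fails right away instead of
             * letting the program die later when a page is first touched. */
            fprintf(stderr, "no backing store for %lu bytes\n",
                    (unsigned long) want);
            return 1;
        }

        /* ... use buf ... */
        free(buf);
        return 0;
    }
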
    In the release notes, page 2-1 recommends:
"All systems should have about four to five megabytes free disk space to
boot and run a small number of server and user processes." (Emphasis on the
word "small".) Note that the registry daemon, rgyd (new at SR10), wants to keep
a copy of the whole registry database in memory. The other NCS daemons that
are needed at SR10 add to the memory demand.
    The bottom line is that SR10 uses up lots of resources; gone are the days
when you could run a node with 1 meg of RAM and 34 meg of disk. Nice new things
like long path & variable names, NCS, big directories, etc., cost resources.
At an SR10 transition class, I heard a joking comment: "buy stock in disk drive
companies."
    One thing that might help: you may be able to bind your small programs
with a stack size smaller than the 256K-byte default.

Dave Funk

mishkin@apollo.COM (Nathaniel Mishkin) (11/11/88)

In article <8811110459.AA25851@umaxc.weeg.uiowa.edu> dbfunk@ICAEN.UIOWA.EDU (David B. Funk) writes:
>In posting <130@tatti.utu.fi> Matti Jokinen (moj@utu.fi) asks:
>>(I once ran out of disk space this way).
>>What is the motivation for such apparently wasteful use of resources?  Or is
>>there something wrong with our system?  Since the programs ran with much
>>smaller data segments in SR9.7, it seems obvious that the bulk of the space
>>is unused.
>    In "Domain System Software Release Notes" for SR10.0 there is a discussion
>of the changes to the system routines malloc() & rws_$alloc on page 1-18.
>The important part is that the requested storage is marked as "used" on
>disk at allocation time, not delayed until the pages are used. The net
>result is that a process's entire stack is allocated at process
>creation time. Thus each new process you start eats up at least 500K bytes
>of disk space, even if it's just a little thing like pwd. Also, pre-SR10
>in-process execution saved you the cost of a new stack for simple command
>execution.

On vanilla Unix, when you change the break (perhaps via malloc), swap
space disk blocks are reserved to back the new VA space.  If there's
no swap space, the call to set the break, and hence the call to malloc,
fails.
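
In C terms, a minimal sketch of that behavior, assuming the classical
sbrk() interface (the helper name grow_heap is mine, not part of any
library):

    #include <stdio.h>
    #include <unistd.h>

    /* Sketch of the vanilla-Unix behavior described above: growing the
     * break reserves swap blocks right away, so a shortage is reported
     * by the sbrk()/brk() call itself (and hence by malloc). */
    int grow_heap(long bytes)
    {
        if (sbrk(bytes) == (void *) -1) {
            perror("sbrk");     /* typically ENOMEM: no swap space left */
            return -1;
        }
        return 0;
    }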

DOMAIN/OS doesn't segregate blocks on our disks between swap space and
file system space.  The idea was that this is a more flexible scheme:
If you want more swap space, delete some files; if you want more space
for files, be prepared for the fact that you can't run programs that
want a lot of heap storage.  (In vanilla Unix changing your mind requires
dumping and reloading your disk.)

The problem with the DOMAIN/OS scheme prior to 10.0 was that changing
the break caused more temp file space to be mapped into your address
space, but there was no guarantee that when you actually touched that
part of your address space for the first time (perhaps long after you
changed the break) there'd be disk space to back it up.
Thus, the malloc would succeed, but your program could then fail randomly
later.  Lots of people complained about this.  So at 10.0 we changed
things so that the disk space is reserved at the time the break is changed,
thus ensuring the same behavior as vanilla Unix.  I think the net effect
is that you require no more disk space than would be required on a vanilla
Unix system.  However, the benefits of being able to easily juggle space
between backing storage for heap space and for file system data are retained
in DOMAIN/OS.
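
A defensive trick a program could use against the pre-10.0 behavior would
have been to touch each page right after allocating it, so that a shortage
of backing store surfaced at a known point instead of at some arbitrary later
reference.  A rough sketch (the 1 KB page size is an assumption, purely for
illustration):

    #include <stddef.h>
    #include <stdlib.h>

    #define PAGE 1024   /* assumed page size, illustration only */

    /* Allocate and immediately touch every page.  Under the pre-10.0
     * scheme malloc() could succeed with no disk behind the pages;
     * writing one byte per page claims the backing store right away,
     * so a shortage shows up here, at a predictable point, rather
     * than long after the allocation.  At 10.0 the reservation
     * happens when the break is changed and this loop is unnecessary. */
    char *alloc_touched(size_t n)
    {
        char *p = malloc(n);
        size_t i;

        if (p == NULL)
            return NULL;
        for (i = 0; i < n; i += PAGE)
            p[i] = 0;
        return p;
    }
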
-- 
                    -- Nat Mishkin
                       Apollo Computer Inc., Chelmsford, MA
                       mishkin@apollo.com

krowitz@RICHTER.MIT.EDU (David Krowitz) (11/14/88)

Sweet Jesus!!! You stupid SOBs AREN'T PAYING A SINGLE BIT OF
ATTENTION TO ANYTHING WE SAY! All you can come back with is a
whining bit about "vanilla Unix is just as bad ..."
WE DIDN'T LAY OUT THE PREMIUM BUCKS IT TAKES TO BUY AN APOLLO
SO THAT WE COULD SUFFER ALONG WITH THE REST OF THE CROWD!! IF
WE WANTED A SYSTEM THAT ACTED LIKE A VANILLA UNIX SYSTEM, WE'D
HAVE BOUGHT SUN WORKSTATIONS! We have dozens of programs which
are designed to handle the maximum case (i.e. they allocate big
working arrays), but are used most of the time for much smaller
problems. When we have a large problem to process, we will roll
some files off the disk onto tape temporarily. Now we have to
keep 100MB of disk space free at all times in order to run the
small problems! Explain to me how the hell this is better than
a SUN-3!!! I have 60 (count 'em, sixty) inexperienced Fortran
programmers here; they are engaged in solving complicated
geophysical problems and they have enough work to do without
having to figure out complicated dynamic memory allocation
schemes! YOU ARE MAKING APOLLO WORKSTATIONS LESS USEFUL AND
MORE EXPENSIVE (and don't hand me the lines about your list
prices -- Apollo has consistently removed the software which
is needed to run a group of graphics workstations from the OS
and made it an optional -- and expensive -- add-on).

If you had people complaining that the system would let
their job run a bit before it crashed, then why didn't you
add a switch or environment variable which would check the
disk space for them when the program was loaded rather than
removing the ability to run programs which don't necessarily
touch all of their address space every time they are run?
I write complicated print servers for a commercial customer
which require 64MB of buffers in order to handle the worst-case
printing requirements. Most of the time they can get along
just fine on 15MB, and run just fine on a machine with
20 to 30 MB of free disk space. They give an error message
when they try to touch a new page and there isn't enough
disk space. Now you tell me that when this customer wants
to upgrade to SR10, they'll have to keep 64MB of disk space
free just in order to start the print server, which means
that it won't run on a machine with a 155MB disk (take that
disk, format it, add SR10, add the VM for the required
system servers and a few pads and windows, and then see if
you can find the free disk space). What am I supposed to do?
Tell them that every printer they sell requires a $5000
disk upgrade?
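
The pattern those servers use is roughly this: allocate the worst case once,
touch pages only as a job actually needs them, and turn a failed touch into
an error message.  Below is a generic C sketch of that idea -- not the actual
server code, and the signal-based recovery is only illustrative:

    #include <setjmp.h>
    #include <signal.h>

    /* Generic sketch only: the worst-case buffer is allocated once,
     * pages are touched as a job needs them, and a failed touch is
     * turned into an error return instead of a silent crash. */

    static jmp_buf touch_failed;

    static void fault_handler(int sig)
    {
        (void) sig;
        longjmp(touch_failed, 1);
    }

    /* Returns 0 if the page at p could be touched, -1 if there was
     * no space left to back it. */
    int touch_page(volatile char *p)
    {
        void (*old)(int) = signal(SIGSEGV, fault_handler);
        int ok = 0;

        if (setjmp(touch_failed) == 0)
            *p = 0;             /* first write claims backing store */
        else
            ok = -1;
        signal(SIGSEGV, old);
        return ok;
    }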


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter@eddie.mit.edu
krowitz%richter@athena.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)