[comp.unix.wizards] System V.2 and growing processes

mike@cimcor.mn.org (Michael Grenier) (01/05/89)

I posted this originally to comp.unix.questions but got no reply...
perhaps a wizard is called for :-)

My Microport V/AT box has a very very slow brk/sbrk call. It seems that
when a process wants to increase its brk value, the entire process must
be moved in memory or out to swap space EVEN when there is available
and contiguous memory right next to (at higher addresses) the growing process.

I can't figure out why the kernel can't just assign the space to the process
instead of moving the entire process to some new spot in memory which
gets progressively slower as the process gets bigger.

Is this normal behavior for a System V.2 kernel?

    -Mike Grenier
    mike@cimcor.mn.org
    uunet!rosevax!cimcor!mike

debra@alice.UUCP (Paul De Bra) (01/06/89)

In article <627@cimcor.mn.org> mike@cimcor.mn.org (Michael Grenier) writes:
>I posted this originally to comp.unix.questions but got no reply...
>perhaps a wizard is called for :-)
>
>My Microport V/AT box has a very very slow brk/sbrk call. It seems that
>when a process wants to increase its brk value, the entire process must
>be moved in memory or out to swap space EVEN when there is available
>and contiguous memory right next to (at higher addresses) the growing process.
>
>I can't figure out why the kernel can't just assign the space to the process
>instead of moving the entire process to some new spot in memory which
>gets progressively slower as the process gets bigger.
>
>Is this normal behavior for a System V.2 kernel?

I have noticed a similar behaviour in an old System III box (68000 based)
when a process tried to grow to about 500k on a system with 1Meg of ram.
Although no other processes were active, the big process was swapped out and
in each time it wanted to grow. I do believe it is very implementation
dependent. I often run big processes on SCO Xenix 286 and never noticed
any swapping.

Moving the process around in memory is clearly stupid. So is swapping when
there is enough memory. However, the standard Unix swapping algorithm when
all memory is used is as follows (yes I looked at the source of a non-paging
system):
1) the process that wants more memory is swapped out.
2) other processes that are sleeping are swapped out to free up more memory.
   (if all processes are active the algorithm will swap out active processes
   after a while too)
3) the growing process is swapped in again and continues to run (until it
   wants more memory which is not available, in which case we return to 1).
This algorithm is less than optimal :-( because the kernel could just as
well only do step 2 and swap out idle processes to get more memory. However,
there is a catch since the growing process that is waiting to get more
memory is also idle in a way...

To find out if what you experience is this swapping problem, you should run
the program twice: the first run will swap out all idle processes, and they
will not be swapped in again as long as they remain idle. In the second run
you should not experience swapping anymore. (That was the behaviour on the
old System III box.) If you still experience swapping, you indeed have
brain-damaged memory management.
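
For anyone who wants to measure this rather than eyeball it, here is a
rough sketch of a timing test (illustrative only, not from any of the
postings; GROW_STEP and NSTEPS are arbitrary figures, and an old System V
box may want "extern char *sbrk();" in place of <unistd.h>).  It times
each break increase, so you can see whether the later steps really get
slower and whether a second run behaves better:

    #include <stdio.h>
    #include <unistd.h>      /* sbrk() */
    #include <sys/times.h>   /* times(), for elapsed time in clock ticks */

    #define GROW_STEP (16 * 1024)   /* bytes added to the break per step */
    #define NSTEPS    32            /* 32 * 16K = 512K of total growth */

    int main()
    {
        struct tms unused;
        clock_t before, after;
        int i;

        for (i = 0; i < NSTEPS; i++) {
            before = times(&unused);            /* ticks before growing */
            if (sbrk(GROW_STEP) == (void *) -1) {
                perror("sbrk");
                return 1;
            }
            after = times(&unused);             /* ticks after growing */
            printf("step %2d: %ld ticks\n", i, (long) (after - before));
        }
        return 0;
    }

Run it twice in a row; if the second run is fast, you are seeing the
swap-out-the-idle-processes effect described above rather than a
fundamentally slow brk.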

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

jfh@rpp386.Dallas.TX.US (John F. Haugh II) (01/07/89)

In article <8680@alice.UUCP> debra@alice.UUCP () writes:
>In article <627@cimcor.mn.org> mike@cimcor.mn.org (Michael Grenier) writes:
>>My Microport V/AT box has a very very slow brk/sbrk call. It seems that
>>when a process wants to increase its brk value, the entire process must
>>be moved in memory or out to swap space EVEN when there is available
>>and contiguous memory right next to (at higher addresses) the growing process.

The UNIX kernel does not contain a realloc()-like call.  In the swapping
kernels, the only core-allocation routines present are malloc() and mfree().

A process which wishes to grow cannot do so by simply claiming the
adjacent memory.  It must be able to allocate an entirely new contiguous
region of core, or be swapped.  It would be possible to avoid this
behavior if a realloc existed, by testing for available memory at higher
or lower addresses adjacent to the growing process.  This is not done.
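
To make that concrete, here is a sketch of what such an adjacency test
might look like against a V7-style resource map (illustrative only, not
the actual patch or any licensed source; mgrow(), the field names, and
the zero-size terminator are assumptions).  The map holds the free holes
sorted by address, and the routine tries to extend an allocation in
place, returning 1 on success and 0 when the caller must fall back to
allocate-and-copy or a swap:

    struct map {
        unsigned m_size;        /* size of this free hole, in clicks */
        unsigned m_addr;        /* starting address of the hole */
    };  /* entries are sorted by address; the list ends with m_size == 0 */

    int
    mgrow(struct map *mp, unsigned addr, unsigned size, unsigned need)
    {
        struct map *bp;

        for (bp = mp; bp->m_size; bp++) {
            if (bp->m_addr == addr + size && bp->m_size >= need) {
                /* a hole begins right where the allocation ends and is
                 * big enough: shave "need" clicks off the front of it */
                bp->m_addr += need;
                bp->m_size -= need;
                if (bp->m_size == 0) {
                    /* hole used up entirely: slide the rest of the map
                     * down to close the gap */
                    do {
                        bp[0] = bp[1];
                        bp++;
                    } while (bp[-1].m_size);
                }
                return (1);
            }
            if (bp->m_addr > addr + size)
                break;          /* map is sorted; no adjacent hole exists */
        }
        return (0);
    }

The process-growing path would try this first and fall back to the usual
allocate-a-new-region-or-swap logic only when it returns 0.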

>>I can't figure out why the kernel can't just assign the space to the process
>>instead of moving the entire process to some new spot in memory which
>>gets progressively slower as the process gets bigger.

Well, now you know ;-)  Had a realloc been added these problems wouldn't
exist.  I had added a realloc() which tested for adjacency and it resolved
this problem partially.  The conclusion I reached was that processes
should allocate physical memory in large chunks and then manage the
unused space, rather than growing one click at a time.
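
As an illustration of that last point (again just a sketch, not anyone's
posted code; CHUNK and chunk_alloc() are invented names), a program can
get most of the benefit without any kernel change by moving its break in
large steps and carving the space up itself, so the expensive copy or
swap happens once per chunk instead of once per small request:

    #include <stddef.h>             /* NULL */
    #include <unistd.h>             /* sbrk() */

    #define CHUNK (64 * 1024L)      /* grow the break this much at a time */

    static char *arena;             /* next unused byte in current chunk */
    static long  left;              /* bytes remaining in current chunk */

    char *
    chunk_alloc(long n)
    {
        char *p;

        n = (n + 7) & ~7L;          /* keep returned pointers aligned */
        if (n > left) {             /* chunk exhausted: grab another */
            long grab = (n > CHUNK) ? n : CHUNK;
            char *space = sbrk(grab);

            if (space == (char *) -1)
                return NULL;        /* no core and no swap space left */
            arena = space;          /* the tail of the old chunk is simply
                                     * abandoned in this crude sketch */
            left = grab;
        }
        p = arena;
        arena += n;
        left -= n;
        return p;
    }

Nothing is ever freed here, and mixing this with the library malloc() is
asking for trouble since both move the break, but it does turn many small
break increases into a few large ones.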

>I have noticed a similar behaviour in an old System III box (68000 based)
>when a process tried to grow to about 500k on a system with 1Meg of ram.
>Although no other processes were active, the big process was swapped out and
>in each time it wanted to grow. I do believe it is very implementation
>dependent. I often run big processes on SCO Xenix 286 and never noticed
>any swapping.

The behavior on the 68K box is due to having a process exceed half the
size of the available memory.  You can't have a free hole larger than the
current allocation if that allocation already takes up more than half of
the available memory.  Only the government is allowed to use all three
halves of anything ;-)

I believe the difference with your Xenix box is due to the difference in
architecture.  A segment may only be 64K, so you would not swap until
there were no longer any free 64K regions.  Other architectures require
the whole process to be contiguous in memory; the 80286 doesn't.

>Moving the process around in memory is clearly stupid. So is swapping when
>there is enough memory. However, the standard Unix swapping algorithm when
>all memory is used is as follows (yes I looked at the source of a non-paging
>system):

What you didn't mention is that this may be corrected with the addition
of a routine to perform the realloc() function.  The code addition is
extremely small, and with the simplification that the region never grows
down, very little table munging needs to be performed.
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"Now with 12 percent less fat than
InterNet: jfh@rpp386.Dallas.TX.US       | last years model ..."
UucpNet : <backbone>!killer!rpp386!jfh  +--------------------------------------

ckl@uwbln.UUCP (Christoph Kuenkel) (01/09/89)

In article <10746@rpp386.Dallas.TX.US>, jfh@rpp386.Dallas.TX.US (John F. Haugh II) writes:
> Well, now you know ;-)  Had a realloc been added these problems wouldn't
> exist.  I had added a realloc() which tested for adjacency and it resolved
> this problem partially.  The conclusion I reached was that processes
> should allocate physical memory in large chunks and then manage the
> unused space, rather than growing one click at a time.
> 
My solution is to estimate the maximum size of dynamic memory a process will
need and code something like:

    ...
    register char *p;

    p = malloc(ESTIMATED_SIZE);			/* 1. */
    (void) malloc(1);				/* 2. */
    free(p);

Step 1 is to grow the process with a single swap out/swap in, step 2 is
to avoid shrinking of the process's break value.

Seems to work quite fine.  Is it much more stupid than I realize?
(Yes, I don't use the maximum size for long-running programs with extensively
varying memory usage.)

christoph
-- 
# include <std/disclaimer.h>
Christoph Kuenkel/UniWare GmbH       Kantstr. 152, 1000 Berlin 12, West Germany
ck@tub.BITNET                ckl@uwbln             {unido,tmpmbx,tub}!uwbln!ckl

allbery@ncoast.UUCP (Brandon S. Allbery) (01/12/89)

As quoted from <627@cimcor.mn.org> by mike@cimcor.mn.org (Michael Grenier):
+---------------
| I can't figure out why the kernel can't just assign the space to the process
| instead of moving the entire process to some new spot in memory which
| gets progressively slower as the process gets bigger.
| 
| Is this normal behavior for a System V.2 kernel?
+---------------

It's normal behavior for swapping systems from at least V7 to 5.2.  The
solution is to get a paging system like V.3.

++Brandon
-- 
Brandon S. Allbery, comp.sources.misc moderator and one admin of ncoast PA UN*X
uunet!hal.cwru.edu!ncoast!allbery		    ncoast!allbery@hal.cwru.edu
 ncoast is registering as "ncoast.org" -- watch for the official announcement!
      Send comp.sources.misc submissions to comp-sources-misc@<backbone>.

mike@cimcor.mn.org (Michael Grenier) (01/12/89)

From article <1289@uwbull.uwbln.UUCP>, by ckl@uwbln.UUCP (Christoph Kuenkel):

  John is talking about a standard UNIX System V.2 kernel, which Microport
appears to use for their V/AT product (to their credit). I had asked
why the brk call always requires the entire process to be relocated
in memory or swapped out to disk.

> In article <10746@rpp386.Dallas.TX.US>, jfh@rpp386.Dallas.TX.US (John F. Haugh II) writes:
>> Well, now you know ;-)  Had a realloc been added these problems wouldn't
>> exist.  I had added a realloc() which tested for adjacency and it resolved
>> this problem partially.  The conclusion I reached was that processes
>> should allocate physical memory in large chunks and then manage the
>> unused space, rather than growing one click at a time.
>> 
+ My solution is to estimate the maximum size of dynamic memory a process will
+ need and code something like:
+ 
+     ...
+     register char *p;
+ 
+     p = malloc(ESTIMATED_SIZE);			/* 1. */
+     (void) malloc(1);				/* 2. */
+     free(p);
+ 
+ Step 1 is to grow the process with a single swap out/swap in, step 2 is
+ to avoid shrinking of the process's break value.
+ 
+ Seems to work quite fine.  Is it much more stupid than I realize?
+ (Yes, I don't use the maximum size for long-running programs with extensively
+ varying memory usage.)

Chris,
    Your method would work fine on a non-brain-damaged CPU. My problem,
however, exists on an Intel 80286 processor, where one can only malloc
or increase the break value by one segment (less than 64K bytes) at a
time. Thus if I know the process is going to consume 1 megabyte of RAM,
the process has to allocate at least 1M/64K = 16 times. Each time I
allocate a block the process swaps, and more slowly each time, which is
very frustrating when it's obvious that the memory is available.
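
In other words (an illustrative sketch, not code from the thread; SEGSTEP
and TARGET are just example figures), getting to a megabyte of data space
has to look something like this, with every trip through the loop risking
another copy or swap of an ever-larger process:

    #include <stdio.h>
    #include <unistd.h>     /* sbrk(); older boxes: extern char *sbrk(); */

    #define SEGSTEP 0xF000L                 /* a bit under one 64K segment */
    #define TARGET  (1024L * 1024L)         /* one megabyte of data space */

    int main()
    {
        long total;

        for (total = 0; total < TARGET; total += SEGSTEP) {
            /* each of these calls may move or swap the whole process */
            if (sbrk(SEGSTEP) == (void *) -1) {
                perror("sbrk");
                return 1;
            }
        }
        printf("grew by %ld bytes in %ld sbrk calls\n",
               total, total / SEGSTEP);
        return 0;
    }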

Moreover, the 80286 processor wouldn't even need that memory to be
contiguous with the rest of the process, since its selector-based virtual
memory will map the process's addresses onto segments scattered throughout
physical RAM.

John's method of adding a realloc call within the kernel is probably
a good idea on an 80286 if reasonably fast brk/sbrk/malloc calls are to
be achieved for user processes. Hopefully, Microport can look
into this problem.

    -Mike Grenier
	mike@cimcor.mn.org
       uunet!rosevax!cimcor!mike

dorman@sdsioa.UUCP (Leroy Dorman X42406) (01/13/89)

Back in '86 we incorporated 'realloc'-like functionality
into the generic V.2-286 (for Multibus 1 Intel 310's) and
improved performance on our 1Mb RAM machines for ld jobs
by up to 40%! (operations like rebuilding kernels, etc.)  I guess
uPort assumes all AT's have lots of core . . .

Eric Dorman			siolmd!eric@sdsioa

ka@june.cs.washington.edu (Kenneth Almquist) (01/13/89)

mike@cimcor.mn.org (Michael Grenier) writes:
> My Microport V/AT box has a very very slow brk/sbrk call. It seems that
> when a process wants to increase its brk value, the entire process must
> be moved in memory or out to swap space EVEN when there is available
> and contiguous memory right next to (at higher addresses) the growing process.
>
> Is this normal behavior for a System V.2 kernel?

This approach is the one that is used on the PDP-11.  It works reasonably
well on the PDP-11 because processes are limited to 64K of data on that
machine.  In the PDP-11 scheme the data area (which grows up) is stored
below the stack (which grows down), so when either the stack or the data
area is grown, the process must be moved, even if there is contiguous
memory right next to the growing process.  When UNIX was ported to the VAX,
a different scheme was implemented.  Processes are not stored contiguously
in memory.  Instead, the pages of a process are scattered throughout memory.
Thus the problem you describe does not occur on VAX.

You may wonder why the approach used on the VAX isn't used everywhere.
The answer is that on some machines, devices can only do I/O to contiguous
regions of physical memory.  On these machines, there are two reasons for
keeping processes physically contiguous in memory.  First, if the
pages of a process are scattered randomly throughout memory on such a
machine, swapping the process requires a separate I/O operation for each
page of the process, which is slow.  Second, raw disk and tape device
drivers which perform DMA directly to the user's address space won't work
if the user process is not contiguous.

Apparently the memory management in your kernel is based upon the PDP-11
code rather than the VAX code, probably for the reasons listed in the
preceding paragraph.

If you have kernel source, you could try modifying the layout of the
process so that the data area came last, and then modifying the sbrk
system call to take advantage of this.  This scheme isn't perfect
since UNIX uses a first fit memory allocation strategy (meaning that
the next process to be loaded into memory is likely to be loaded right
above yours in memory), but on a single user system the growing process
is likely to be the only running process so nothing else is likely to
be swapped in.  Another approach is to implement a hybrid scheme which
always reads processes into contiguous areas of memory initially, but
adds noncontiguous memory to the process when the process allocates
more memory.  If a noncontiguous process tries to do raw disk I/O, you
have a problem.  One solution is to swap the process out, forcing it to
be contiguous when it is swapped back in.

If you have a binary only system, about all you can do is complain to
your vendor.
				Kenneth Almquist

debra@alice.UUCP (Paul De Bra) (01/13/89)

In article <6936@june.cs.washington.edu> ka@june.cs.washington.edu (Kenneth Almquist) writes:
}mike@cimcor.mn.org (Michael Grenier) writes:
}> My Microport V/AT box has a very very slow brk/sbrk call. It seems that
}> when a process wants to increase its brk value, the entire process must
}> be moved in memory or out to swap space EVEN when there is available
}> and contiguous memory right next to (at higher addresses) the growing process.
}>
}> Is this normal behavior for a System V.2 kernel?
}
}This approach is the one that is used on the PDP-11.  It works reasonably
}well on the PDP-11 because processes are limited to 64K of data on that
}machine.  In the PDP-11 scheme the data area (which grows up) is stored
}below the stack (which grows down), so when either the stack or the data
}area is grown, the process must be moved, even if there is contiguous
}memory right next to the growing process.  When UNIX was ported to the VAX,
}a different scheme was implemented.  Processes are not stored contiguously
}in memory.  Instead, the pages of a process are scattered throughout memory.
}Thus the problem you describe does not occur on VAX.
>...

Aha, this may explain why SCO Xenix 286 does not have the same problem
Microport has: in SCO Xenix the stack doesn't grow at all. When you compile
(actually link) a program you tell the linker how much stack space to
allocate. So the stack can be below the data, avoiding the problems of
growing processes. Of course, having a fixed-size stack is sometimes a
problem in itself... (try writing alloca, for instance :-( )

Paul.
-- 
------------------------------------------------------
|debra@research.att.com   | uunet!research!debra     |
------------------------------------------------------

jfh@rpp386.Dallas.TX.US (John F. Haugh II) (01/15/89)

In article <6936@june.cs.washington.edu> ka@june.cs.washington.edu (Kenneth Almquist) writes:
>          In the PDP-11 scheme the data area (which grows up) is stored
>below the stack (which grows down), so when either the stack or the data
>area is grown, the process must be moved, even if there is contiguous
>memory right next to the growing process.

This decision was based more on keeping the kernel simple [ apparently,
my name is not Dennis Ritchie ;-) ] than on the hardware.  The MMU
registers on 11's with memory management were capable of separating the
stack and data segments, or of scattering the data pages throughout
memory.  The kernel COULD have gotten this right, it just didn't.

>You may wonder why the approach used on the VAX isn't used everywhere.
>The answer is that on some machines, devices can only do I/O to contiguous
>regions of physical memory.

This is correct, but only for the reasons you outlined.  Many processes
do not perform raw I/O, so there is little need to ensure a process is
accessible to the device controller.  The kernel could have handled the
special case and given the performance advantage to the others.

>If you have kernel source, you could try modifying the layout of the
>process so that the data area came last, and then modifying the sbrk
>system call to take advantage of this.

More is needed than munging malloc or sbreak.  Malloc has no way of
knowing if you are trying to expand into an adjacent region, as sbreak
doesn't even provide this information.  Something like

        if (! (newseg = realloc (seg, size))) { /* try to grow in place */
                mfree (seg);            /* no luck: give back the old region */
                newseg = malloc (size); /* and grab a new one (copy or swap) */
        }
        return (newseg);

was needed.  [ The arguments to malloc are the map and the size, the
old region is never even mentioned ... ]

>This scheme isn't perfect
>since UNIX uses a first fit memory allocation strategy (meaning that
>the next process to be loaded into memory is likely to be loaded right
>above yours in memory), but on a single user system the growing process
>is likely to be the only running process so nothing else is likely to
>be swapped in.

I modified malloc to perform any of first-, best- and worst-fit allocation.
First fit was so-so, for the reasons given.  Best fit was even worse, for
more of the reasons given ;-).  The best allocation scheme I found, in the
presence of a realloc-style modification, was find-worst-fit.  It reduced
swapping to an apparent minimum because there were all these little holes
left about.  Growing processes filled up the holes as needed, instead of
being wedged into spaces they were unable to grow inside of.  [ modulo
number of users - in a single-user environment worst-fit smelt a lot like
any other fit. ]
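
For readers who have not stared at an allocator lately, the three policies
differ only in which hole they pick; a toy illustration (invented names,
not the actual kernel modification):

    #include <stdio.h>

    enum policy { FIRST_FIT, BEST_FIT, WORST_FIT };

    /* return the index of the hole the given policy would choose for a
     * request of "want" clicks, or -1 if nothing is big enough */
    int
    pick(unsigned holes[], int nholes, unsigned want, enum policy p)
    {
        int i, choice = -1;

        for (i = 0; i < nholes; i++) {
            if (holes[i] < want)
                continue;               /* too small to use at all */
            if (p == FIRST_FIT)
                return i;               /* first big-enough hole wins */
            if (choice == -1 ||
                (p == BEST_FIT && holes[i] < holes[choice]) ||
                (p == WORST_FIT && holes[i] > holes[choice]))
                choice = i;
        }
        return choice;
    }

    int main()
    {
        unsigned holes[] = { 40, 300, 96, 150 };    /* free holes, in clicks */

        printf("first fit picks hole %d\n", pick(holes, 4, 64, FIRST_FIT));
        printf("best  fit picks hole %d\n", pick(holes, 4, 64, BEST_FIT));
        printf("worst fit picks hole %d\n", pick(holes, 4, 64, WORST_FIT));
        return 0;
    }

With an in-place realloc available, worst-fit has the nice property that
each allocation is carved out of the biggest hole, and so tends to have
the most room left right above it to grow into.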

>If you have a binary only system, about all you can do is complain to
>your vendor.

True.
-- 
John F. Haugh II                        +-Quote of the Week:-------------------
VoiceNet: (214) 250-3311   Data: -6272  |"UNIX doesn't have bugs,
InterNet: jfh@rpp386.Dallas.TX.US       |         UNIX is a bug."
UucpNet : <backbone>!killer!rpp386!jfh  +--------------------------------------