[comp.unix.internals] whay can't processes shrink as well as grow?

ggm@brolga.cc.uq.oz.au (George Michaelson) (10/04/90)

My understanding is that existing brk/sbrk/malloc in generic *nix
doesn't allow the process to shrink again once mem has been allocated.

Could somebody explain to a non-wizard what stops some method being
used to detect free mem pages or compress the heap such that memory
CAN be freed? I dont mean "for free" ie explain what would have to
be different in HOW malloc/alloc/sbrk/free work to get this behaviour.

I'm expecting to be told that keeping back-pointers to entities using
allocated memory so you could dynamically update their position if
you'd compressed the heap is more pain than it's worth. If thats not
the case, I see some attractive uses of grabbing 5-10 Mb of space, using
it and then releasing it without having to exit.

	-George

-- 
	G.Michaelson
Internet: G.Michaelson@cc.uq.oz.au                     Phone: +61 7 377 4079
  Postal: George Michaelson, Prentice Computer Centre
          The University of Queensland, St Lucia, QLD Australia 4067.

cpcahil@virtech.uucp (Conor P. Cahill) (10/04/90)

In article <1990Oct3.225943.4691@brolga.cc.uq.oz.au> ggm@brolga.cc.uq.oz.au (George Michaelson) writes:
>My understanding is that existing brk/sbrk/malloc in generic *nix
>doesn't allow the process to shrink again once mem has been allocated.

True.

>Could somebody explain to a non-wizard what stops some method being
>used to detect free mem pages or compress the heap such that memory
>CAN be freed? I dont mean "for free" ie explain what would have to
>be different in HOW malloc/alloc/sbrk/free work to get this behaviour.

Nothing stops this from being implemented and it would not require a 
change to the way free() is *allowed* to work.  However, some programs
depend upon an old free() behavior that says something to the effect that
the free'd segment is still available until the next call to one of the
allocation functions.

Besides that "compatibility" reason, you have to consider the usefulness of 
that kind of change for the "average" program.  What I mean here is what 
measurable effect would adding that functionality have to the system as 
a whole?  If it is not measurable (like being able to support x% more
users with the same amount of memory, or a x% increase in overall system
performance) then why do it?

Even though free does not actually return the memory to the system, it
does mark it for re-use, so your programs do not grow indefinately (unless,
of course, you have a bug in your program).

>I'm expecting to be told that keeping back-pointers to entities using
>allocated memory so you could dynamically update their position if
>you'd compressed the heap is more pain than it's worth.

Heap compression is a different animal all together.  You would run into
zillions of problems with programs that got the pointer returned from
malloc and copied it elsewhere, etc.  To allow for this you would have
to require that any access to a malloced area be made through one of your
functions that would then do the translation to a real address (sort of
your own virtual memory manager).  talk about a bad effect on performance.

>If thats not
>the case, I see some attractive uses of grabbing 5-10 Mb of space, using
>it and then releasing it without having to exit.

You could probably do with other mechanisms that do not require the jump 
to 10MB of mem and the fall back to 0, or that use a child program to 
process the 10MB of data and passes it's output back to the parent and
exits (thereby freeing the allocated memory).

-- 
Conor P. Cahill            (703)430-9247        Virtual Technologies, Inc.,
uunet!virtech!cpcahil                           46030 Manekin Plaza, Suite 160
                                                Sterling, VA 22170

slb@neptune.dsc.com (Steve Baur) (10/04/90)

ggm@brolga.cc.uq.oz.au (George Michaelson) writes:

>My understanding is that existing brk/sbrk/malloc in generic *nix
>doesn't allow the process to shrink again once mem has been allocated.

>Could somebody explain to a non-wizard what stops some method being
>used to detect free mem pages or compress the heap such that memory
>CAN be freed?

The problem has to do with the fact that generic *nix is basically a
swapping system, with virtual memory paging (if it exists at all) added
as an after thought.  This is true through System V/R3.

The UNIX system process data space looks like this:

+-------------------------------+
|       Process Stack           |
|             |                 |
|             V                 |
|                               |
+-------------------------------+
|                               |
|       Free Memory             |
|                               |
+-------------------------------+   <----   "Break"
|                               |
|             ^                 |
|             |                 |
|            Heap               |
|                               |
+-------------------------------+
|                               |
|             BSS               |
|                               |
+-------------------------------+
|                               |
|            Data               |
|                               |
+-------------------------------+

The line above the heap (called the "break") is adjusted via the brk(2) and
sbrk(2) system calls and is adjusted upwards when more memory is needed.
It could also be adjusted downwards when allocated memory just below the
break is released, but that does not seem to be performed through System
V/R3.

A more basic problem exists with this scheme though.  Imagine allocating
a 100k chunk of memory, then a 10 byte chunk of data, then deallocating the
100k chunk.  Now the allocated 10 byte chunk of data sits just below the
break, and hence it cannot be adjusted downwards to reflect the release
of the 100k chunk.  No holes are allowed to exist between the top of BSS
and the break.

Anyway, to properly deal with a paging system, something like a getvmpage/
releasevmpage (that treats virtual memory as pages rather than as an
area below an address) is needed.

--
"The speed of light ...  It's not just a good idea, it's the law."
Steve Baur (slb%neptune%dschub@hub.ucsb.edu)

jef@well.sf.ca.us (Jef Poskanzer) (10/05/90)

In the referenced message, cpcahil@virtech.UUCP (Conor P. Cahill) wrote:
}In article <1990Oct3.225943.4691@brolga.cc.uq.oz.au> ggm@brolga.cc.uq.oz.au (George Michaelson) writes:
}>I see some attractive uses of grabbing 5-10 Mb of space, using
}>it and then releasing it without having to exit.
}
}You could probably do with other mechanisms that do not require the jump 
}to 10MB of mem and the fall back to 0, or that use a child program to 
}process the 10MB of data and passes it's output back to the parent and
}exits (thereby freeing the allocated memory).

The child process idea is good.  Another trick that I've used is to have
the program restart itself with an "execvp(*argv, argv);"  This is for
situations when the process's interaction model is "wait for a long time,
do something that needs a lot of memory, go back to waiting".  For example,
a screen saver.
---
Jef

  Jef Poskanzer  jef@well.sf.ca.us  {ucbvax, apple, hplabs}!well!jef
    "Do you think we can use battery operated devices under water?"

chapman (Brian Chapman) (10/10/90)

ggm@brolga.cc.uq.oz.au (George Michaelson) writes:

>My understanding is that existing brk/sbrk/malloc in generic *nix
>doesn't allow the process to shrink again once mem has been allocated.

At the system call level brk() and sbrk() allow you to give the
memory back to the system.  [doesn't it say that on your man page?]

The malloc()/free() stuff implemented on top of that in libc.a has
traditionally never bothered.  Malloc() will brk() for more memory
if it needs it but it free() doesn't brk() the level down if you
give the memory back.  There is no reason for this that I know of
except complexity and execution time trade offs.  (ie. most people
don't care and they don't want free() spending execution time
thinking about it)

>Could somebody explain to a non-wizard what stops some method being
>used to detect free mem pages or compress the heap such that memory
>CAN be freed? I dont mean "for free" ie explain what would have to
>be different in HOW malloc/alloc/sbrk/free work to get this behaviour.

Don't use malloc() and free() just call brk and use the raw address
space you allocate.  This requires an understanding of the process
address space model of your version of Unix.  And is a hell of a lot
less portable than using malloc()/free().  [Well,  OK, I guess it
could be done in a portable way, sort of...]

>I'm expecting to be told that keeping back-pointers to entities using
>allocated memory so you could dynamically update their position if
>you'd compressed the heap is more pain than it's worth. If thats not
>the case, I see some attractive uses of grabbing 5-10 Mb of space, using
>it and then releasing it without having to exit.
>	-George

Yup, malloc() and free() don't work that way, but nothing stops you
from implmenting your own system like you describe above on the system
call brk() sbrk() interface
	-- Chapman
-- 
Brian Chapman		uunet!sco!chapman
Pay no attention to the man behind the curtain!

pcg@cs.aber.ac.uk (Piercarlo Grandi) (10/11/90)

On 3 Oct 90 22:59:43 GMT, ggm@brolga.cc.uq.oz.au (George Michaelson) said:

ggm> My understanding is that existing brk/sbrk/malloc in generic *nix
ggm> doesn't allow the process to shrink again once mem has been allocated.

It depends on what you mean, and the kernel you are working on...

ggm> Could somebody explain to a non-wizard what stops some method being
ggm> used to detect free mem pages or compress the heap such that memory
ggm> CAN be freed? I dont mean "for free" ie explain what would have to
ggm> be different in HOW malloc/alloc/sbrk/free work to get this behaviour.

Here you seem to imply that for you 'free' means 'freed at the malloc
level. Well, almost none of the C heap storage managers will lower the
break if this is possible. I have written (and posted to alt.sources) my
own malloc/free package that does lower the break if one frees a block
of store just under the break. It does not not to compaction of the
arena though.

There are C/C++ automatic storage reclaimers that use compacting garbage
collection; I do not know whether after compaction they lower the break,
if possible. If so, it would not be difficult to modify them to do so.

ggm> I'm expecting to be told that keeping back-pointers to entities using
ggm> allocated memory so you could dynamically update their position if
ggm> you'd compressed the heap is more pain than it's worth.

Read some papers on garbage collection, especially conservative g.c. in
C/C++. The sweeper of a g.c. reclaimer will in one way or another
simulate those backpointers.

ggm> If thats not the case, I see some attractive uses of grabbing 5-10
ggm> Mb of space, using it and then releasing it without having to exit.

Now there is a catch: even if your storage reclaimer detectes that a
block of memory just under the break has become free, and lowers the
break as a result, in many UNIX kernels this regrettably will not
release any pages to the kernel mmeory pool -- in some it will not even
invalidate page table entries beyond the break. This is probably because
so few C/C++ sotorage reclaimers bother to lower the break, so it is
perceived that actually shrinking the data and address space of a
process when the break is lowered is not worth its while.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

ggm@brolga.cc.uq.oz.au (George Michaelson) (10/11/90)

Thankyou for all the followups and individual emails. It looks as if there
is no guaranteed method of achieving what I wanted, although on some systems
there *are* ways of coming close if not all the way. modified malloc sounds
closest in being simple to retrofit, and may offer benefit if slipped 
under existing programs, certainly if large single blocks of space are
malloc'ed and freed. I dont think compaction sounds feasible short of a C++
approach providing it in the infrastructure, unless you are prepared to
do ALL the ptr management yourself. If you have a way to model some
data in a humungous space (eg trying to do data persistance by dumping memory
to disk between process invokations) you're probably doing 1/2 of it already. 

I *did* try R'ing the FM in this area, and must say that on SunOS4.1 at least
it is less than perspex clear on brk/sbrk behaviour, at least for anybody
not well versed in system internals.  Clearly an area fools like me are not 
meant to tread. Seems a shame, given programs which have transient need for
large dynamic data areas but can then become vastly smaller again. What we
get from the opsys is simple to implement, but also seems lacking in 
functionality.

	-george
-- 
	G.Michaelson
Internet: G.Michaelson@cc.uq.oz.au                     Phone: +61 7 377 4079
  Postal: George Michaelson, Prentice Computer Centre
          The University of Queensland, St Lucia, QLD Australia 4067.

chris@mimsy.umd.edu (Chris Torek) (10/11/90)

In article <PCG.90Oct10212600@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk
(Piercarlo Grandi) writes:
>Now there is a catch: even if your storage reclaimer detectes that a
>block of memory just under the break has become free, and lowers the
>break as a result, in many UNIX kernels this regrettably will not
>release any pages to the kernel mmeory pool ....

(from /sys/kern/vm_drum.c in the 4.{1,2,3,3-tahoe,3-reno} system:)

/*
 * Expand or contract the virtual swap segment mapped
 * by the argument diskmap so as to just allow the given size.
 *
 * FOR NOW CANT RELEASE UNLESS SHRINKING TO ZERO, SINCE PAGEOUTS MAY
 * BE IN PROGRESS... TYPICALLY NEVER SHRINK ANYWAYS, SO DOESNT MATTER MUCH
 */
vsexpand(vssize, dmp, canshrink)
	register segsz_t vssize;
	register struct dmap *dmp;
{
	...
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

dvv@hq.demos.su (Dmitry V. Volodin) (10/11/90)

In article <PCG.90Oct10212600@odin.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
>Now there is a catch: even if your storage reclaimer detectes that a
>block of memory just under the break has become free, and lowers the
>break as a result, in many UNIX kernels this regrettably will not
>release any pages to the kernel mmeory pool -- in some it will not even
>invalidate page table entries beyond the break. This is probably because

More of that - in some paging evnvironments brk() is a dummy! When
the process addresses memory beyond the brk level the kernel just adds
another one page to porcess's memory pool.

-- 
Dmitry V. Volodin <dvv@hq.demos.su>     |
fax:    +7 095 233 5016                 |      Call me Dima (D-'ee-...)
phone:  +7 095 231 2129                 |