[comp.sys.3b1] swap space

raj@utopia.calvin.edu (Ross A. Jekel;;;772-4870;ATRajT Unix PC) (04/30/91)

Sorry to ask this again.  I know I've seen it on the network
before.  Could someone send me the info about increasing swap
space? (please email me at s83949@ursa.calvin.edu).

Thanks.
Ross A. Jekel
Project:  Post TeX 3.14 and METAFONT 2.7 on OSU.

dt@yenta.alb.nm.us (David B. Thomas) (06/10/91)

How does the 3b1 know where to find swap space, and can it use the "swap
partition" on the second hard disk?

						little david
-- 
Unix is not your mother.

dnichols@ceilidh.beartrack.com (DoN Nichols) (06/11/91)

In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>How does the 3b1 know where to find swap space, and can it use the "swap
>partition" on the second hard disk?
>
>						little david

	It looks for the special device file "/dev/swap".  Here, we see that
it is the same major and minor device numbers as /dev/fp001, and probably
should be linked, but isn't on this system.  If you do so, you should use
the permissions and ownership of the "swap" listing.

br--------  1 root    root      0,  1 Dec 30 21:42 /dev/fp001
brw-r-----  1 sys     sys       0,  1 Dec 30 21:50 /dev/swap
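
If you want to double-check those numbers yourself rather than trust the
ls listing, a quick stat(2) program along these lines will print them (a
sketch only -- the 8-bit major/minor split is my assumption; use the
major()/minor() macros if your headers have them):

#include <stdio.h>
#include <sys/types.h>
#include <sys/stat.h>

/* Print the major/minor device numbers of the two block specials. */
int
main()
{
	char *files[2];
	struct stat sb;
	int i;

	files[0] = "/dev/swap";
	files[1] = "/dev/fp001";
	for (i = 0; i < 2; i++) {
		if (stat(files[i], &sb) < 0) {
			perror(files[i]);
			continue;
		}
		printf("%s: major %d, minor %d\n", files[i],
		    (int)(sb.st_rdev >> 8), (int)(sb.st_rdev & 0xff));
	}
	return 0;
}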

	You should be able to simply replace "/dev/swap" with a link to the
selected partition on your other drive, but I know of no way to have it use
both.  It would be nice if it did paging from one drive, and swapping to the
other, so you could increase the space to handle emergency conditions, since
the actual "swap" procedure is used only under relative emergency
conditions, with paging used for normal conditions.

	You should probably wait a while to see if any reply postings from
the "kernel cognosenti" appear with warnings before trying this, since I
don't have access to kernel source, and am just speaking from my readings
and general understanding.  If I gave wrong advice, I'm sure I'll hear about
it :-)

	Good Luck
		DoN.

-- 
Donald Nichols (DoN.)		| Voice (Days):	(703) 664-1585
D&D Data			| Voice (Eves):	(703) 938-4564
Disclaimer: from here - None	| Email:     <dnichols@ceilidh.beartrack.com>
	--- Black Holes are where God is dividing by zero ---

adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) (06/13/91)

In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
>In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>>How does the 3b1 know where to find swap space, and can it use the "swap
>>partition" on the second hard disk?
>>
>>						little david
>
>	It looks for the special device file "/dev/swap".  Here, we see that
>it is the same major and minor device numbers as /dev/fp001, and probably
>should be linked, but isn't on this system.  If you do so, you should use
>the permissions and ownership of the "swap" listing.
>
>br--------  1 root    root      0,  1 Dec 30 21:42 /dev/fp001
>brw-r-----  1 sys     sys       0,  1 Dec 30 21:50 /dev/swap
>
>	You should be able to simply replace "/dev/swap" with a link to the
>selected partition on your other drive, but I know of no way to have it use
>both.  It would be nice if it did paging from one drive, and swapping to the
>other, so you could increase the space to handle emergency conditions, since
>the actual "swap" procedure is used only under relative emergency
>conditions, with paging used for normal conditions.
>
>	You should probably wait a while to see if any reply postings from
>the "kernel cognosenti" appear with warnings before trying this, since I
>don't have access to kernel source, and am just speaking from my readings
>and general understanding.  If I gave wrong advice, I'm sure I'll hear about
>it :-)

I am strongly inclined to doubt this.  With the exception of
/dev/console (for telinit to work from a remote terminal), all the
kernel hooks are by major/minor device number, not name.  Linking
/dev/swap to some other device is unlikely to have any effect.  The
kernel will continue to page/swap on 0,1.  I suppose you might be able
to hack the kernel object files with adb to change to another
partition/disk, but I can see little benefit in doing this.  In fact,
most UNIX ports don't even have a /dev/swap entry... the swap device(s)
are invisible to the filesystem.

There also seems to be some confusion concerning swapping.  Swapping and
paging are the same process.  Only the amount of data transferred is
different.  The kernel resorts to swapping entire processes as a last
resort when the memory is nearly full and it needs a bigger chunk of
memory than it can get from swapping unused pages.  All of this memory
must still reside within the virtual address space of the system.  Thus,
the notion of swapping to one device and paging to another is
impossible.

I am unsure of one other point:  As I understand it, the total (not the
per-process limit, which is clearly 2.5MB) virtual address space of the
UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
available swap partition beyond its default maximum of around 4.5 MB
(to allow for alternate blocks, filesystem overhead, etc.) cannot
result in any benefit.  I have never been able to get a straight answer
from anyone at AT&T/Convergent about this.  On most other 32-bit
systems, the virtual address space is 4GB (3GB for 4.x BSD on a VAX). 
Thus, the possible swap space is much larger than the amount of RAM
likely to be present.  My understanding is that Convergent decided to
allow just 4MB of address space for memory because they used a rather
wasteful allocation of available addressing to provide direct
addressability for almost every conceivable peripheral device/expansion
card.  Sort of like MS-DOS where you have only 640K out of 1MB (1024K)
address space addressable as RAM.

In any case, swapped-out processes MUST reside in the virtual address
space of the system, since they are not swapped _in_ as whole processes. 
The System V swapping algorithm is not the same as the old Version 7 one
where the entire process had to fit in the RAM.  In SysVr2.1 the
swapping algorithm is essentially the same as the 4.1 BSD algorithm. 
The process must therefore fit in the VIRTUAL address space since it may
be paged in rather than swapped in as a whole entity.  Swapping in the
sense of V7 UNIX does not occur.  In fact, a new process will not be
allowed to run if it cannot fit in the virtual space available.  It will
be killed in the memory allocation stage.  If you examine the kernel .o
files, you will see that the only swapping program is vmswap.c.  This
program manages/shares the same virtual address space as vmpage.c.

-- 
       Jim Adams              Department of Physiology and Biophysics
adams@ucunix.san.uc.edu     University of Cincinnati College of Medicine      
      "I like the symbol visually, plus it will confuse people." 
                                       ... Jim Morrison

floyd@ims.alaska.edu (Floyd Davidson) (06/13/91)

In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
>In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
>>In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>>>How does the 3b1 know where to find swap space, and can it use the "swap
>>>partition" on the second hard disk?
>>>
>>>						little david
>>
>>	It looks for the special device file "/dev/swap".  Here, we see that
[...]
>>	You should be able to simply replace "/dev/swap" with a link to the

>
>I am strongly inclined to doubt this.  With the exception of

You are right.  The /dev/swap is there so you can do i/o to it,
not for the kernel.  

> ...  I suppose you might be able
>to hack the kernel object files with adb to change to another
>partition/disk, but I can see little benefit in doing this.  

This could be done.  With 3.51 the object files are available
and the swap device can be changed via conf.h, but as you say,
why?  (Has anyone tried it just to see if anything unexpected
happened?  Seems like 2-3 years ago someone like Lenny or Gil
said something about this...)

>There also seems to be some confusion concerning swapping.  Swapping and
>paging are the same process.  Only the amount of data transferred is
>different.  The kernel resorts to swapping entire processes as a last
>resort when the memory is nearly full and it needs a bigger chunk of
>memory than it can get from swapping unused pages.  All of this memory
>must still reside within the virtual address space of the system.  Thus,
>the notion of swapping to one device and paging to another is
>impossible.

I'm no kernel expert...  My understanding is that paging and swapping
are two different ways to accomplish virtual memory...

>I am unsure of one other point:  As I understand it, the total (not the
>per-process limit, which is clearly 2.5MB) virtual address space of the
>UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
>available swap partition beyond its default maximum of around 4.5 MB
>(to allow for alternate blocks, filesystem overhead, etc.) cannot
>result in any benefit.

Let's say you have 4Mb of swap space.  You run two programs that
each take up 1.5Mb of memory.  If you start one more there isn't
enough swap space to run it.  If you had 6Mb of swap space you
could run 3 processes that needed 1.5Mb of memory (and have some
left over...).

In practice the only way I've found to bump that limit is with
gcc, which will take all the memory it can get.  I'm using 8Mb
of swap space.  At this moment 6Mb is free.  I've seen it down
to less than 2Mb. (And at the same time the load averages were
up around 4.00 and response time was long, and it was time to
do something else for a few hours while gcc compiled two 
different versions of itself at the same time...  Not exactly
something you would normally do every day.)

>  I have never been able to get a straight answer
>from anyone at AT&T/Convergent about this.  On most other 32-bit
>systems, the virtual address space is 4GB (3GB for 4.x BSD on a VAX). 

That must be the maximum amount that can be allocated, not the
actual amount useable.  The amount you can use is the size of
the disk space allocated.

>Thus, the possible swap space is much larger than the amount of RAM
>likely to be present.  My understanding is that Convergent decided to
>allow just 4MB of address space for memory because they used a rather
>wasteful allocation of available addressing to provide direct
>addressability for almost every conceivable peripheral device/expansion
>card.  Sort of like MS-DOS where you have only 640K out of 1MB (1024K)
>address space addressable as RAM.

While there is some validity to that comparison of the memory map,
that probably isn't the reason the swap space is so small.  First
off, it obviously can be changed; second, they didn't see much need
(I assume) on a single-user system to have more swap space than
total memory.  And remember, they only had one hard disk that was
no bigger than 67Mb.  Under normal circumstances their choice of
4Mb has proven just about exactly right.

I think the gripe we have about the way it got put together is
that limit of 2.3-2.5Mb on per process virtual memory that can't
be changed.  I've got a 68020 system that also has 4Mb of real
memory, but with 8Mb of swap it can allocate 11Mb of memory to one
process (the difference is the kernel etc.).  If I had some horrible
need to do it, it could be set up for 200Mb of swap and allocate
that much to one process.  Don't laugh too loud, I saw a post once
where some guy running some kind of modeling program was doing
exactly that on a Sun.

>In any case, swapped-out processes MUST reside in the virtual address
>space of the system, since they are not swapped _in_ as whole processes. 
>The System V swapping algorithm is not the same as the old Version 7 one
>where the entire process had to fit in the RAM.  In SysVr2.1 the
>swapping algorithm is essentially the same as the 4.1 BSD algorithm. 
>The process must therefore fit in the VIRTUAL address space since it may
>be paged in rather than swapped in as a whole entity.  Swapping in the
>sense of V7 UNIX does not occur.

That fits my understanding of what is happening.  That is the
difference between swapping and paging.

>  In fact, a new process will not be
>allowed to run if it cannot fit in the virtual space available.  It will
>be killed in the memory allocation stage.  If you examine the kernel .o
>files, you will see that the only swapping program is vmswap.c.  This
>program manages/shares the same virtual address space as vmpage.c.

I don't know one way or the other on this.  Does it kill the new
process or kill an old process?  I know if existing processes
ask for more memory and swap space is full it will start killing off
processes.  But I don't know what the algorithm is.

Floyd
-- 
Floyd L. Davidson   | Alascom, Inc. pays me, |UA Fairbanks Institute of Marine
floyd@ims.alaska.edu| but not for opinions.  |Science suffers me as a guest.

thad@public.BTR.COM (Thaddeus P. Floryan) (06/13/91)

In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
>[...] Thus,
>the notion of swapping to one device and paging to another is
>impossible.

Not with DEC's VAX/VMS.  There's both a "swap" and a "page" ``file''.  (Gawd,
I never thought I'd be defending VMS, the penultimate Vomit Making System :-)

>[...]
>I am unsure of one other point:  As I understand it, the total (not the
>per-process limit, which is clearly 2.5MB) virtual address space of the
>UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
>available swap partition beyond its default maximum of around 4.5 MB
>(to allow for alternate blocks, filesystem overhead, etc.) cannot
>result in any benefit. [...]

Not true.  As I discovered earlier this year while having gcc compile the
"ephem" program in the background and doing some other "online" emacs and
gcc work, increasing the swap partition on my HD from the default multi-user
5MB to 12MB made a BIG difference (those processes simply would NOT run
before ("Out of swap space"); now they do.)

Thad Floryan [ thad@btr.com (OR) {decwrl, mips, fernwood}!btr!thad ]

tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) (06/13/91)

In article <1991Jun13.065207.10089@ucunix.san.uc.edu>,
adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
|> 
|> I am unsure of one other point:  As I understand it, the total (not the
|> per-process limit, which is clearly 2.5MB) virtual address space of the
|> UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
|> available swap partition beyond its default maximum of around 4.5 MB
|> (to allow for alternate blocks, filesystem overhead, etc.) cannot
|> result in any benefit.  I have never been able to get a straight answer
|> from anyone at AT&T/Convergent about this.  On most other 32-bit
|> systems, the virtual address space is 4GB (3GB for 4.x BSD on a VAX). 
|> Thus, the possible swap space is much larger than the amount of RAM
|> likely to be present.  My understanding is that Convergent decided to
|> allow just 4MB of address space for memory because they used a rather
|> wasteful allocation of available addressing to provide direct
|> addressability for almost every conceivable peripheral device/expansion
|> card.

There are two virtual memory sizes being thrown around, plus swap space.
What are these, and where do they come from?

4Mb.  This is the total virtual address space of the 3b1.  This IS
per process.  This is limited by the memory management unit of the 3b1.
Only 22 address lines are used by the MMU.  This not only made addressing
peripherals easier, it allowed for a reasonably sized page table.
Unlike most workstations today that keep the page tables in main memory,
the 3b1 has separate RAM for that (making memory management simpler).
This was the obvious thing to do back in 1984, or whenever
Convergent designed the UnixPC.  Nobody complained then.
Who would need that much virtual memory? :-)

2.5Mb.  I'm not sure why you say the per-process limit is 'clearly' 2.5Mb?
Where this number comes from is not obvious (or clear).
This is the virtual address space available for a user program,
(not process).  The kernel, shared libraries, and perhaps other sundry
stuff, is included in the virtual memory space of every process.  On the
3b1, this amounts to 1.5Mb of virtual memory.
The virtual space available for a user's program is the total virtual space
minus that used for the kernel, etc., or 2.5Mb.  Every process has
4Mb virtual memory.  You just can't get at all of it.

Swap space.  This is simply where the machine puts processes when they do not
fit into main memory.  There is no limit (ok, a very large limit) as to how
much swap space you can have.  Having more swap space will allow you to run
more (or larger) processes.

Swap space and per process virtual memory are independent of each other.

|> In any case, swapped-out processes MUST reside in the virtual address
|> space of the system, since they are not swapped _in_ as whole processes.

This is not true.  Every process must reside in the virtual address space.
You can think of the swap space as multiple virtual address spaces.
One for each running process.  Thus, more swap space, more running processes.
Each virtual address space will be
only as large as is needed by the process.  Because the kernel and
shared libraries do not change, they are not swapped out,
and are not included in the virtual address space of the swapped process.

Part of the memory management is to keep track of those multiple
address spaces in swap space.

Main memory is broken into pages, each page given to a
particular process.  The page tables are used to manage this.  Each time
a new process is scheduled to run, the page tables have to be modified to
reflect that current running process.  If all of the pages needed
are already in main memory, nothing gets swapped,
(though the page tables still need to be modified).
Only if the pages currently needed are on disk, does paging occur.
  
|>                                   In fact, a new process will not be
|> allowed to run if it cannot fit in the virtual space available.  It will
|> be killed in the memory allocation stage.

Do not confuse virtual address space with swap space.  What you are saying
is correct, but I think that you mean to say that a new process will
not be allowed to run if it cannot fit in that swap space available,
which is also correct.
This is what happens when you try to run too many large processes.

--
Tom Tkacik
GM Research Labs
tkacik@hobbes.cs.gmr.com
tkacik@kyzyl.mi.org

tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) (06/13/91)

In article <1991Jun13.122803.25362@ims.alaska.edu>, floyd@ims.alaska.edu
(Floyd Davidson) writes:
|> In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
|> >In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes:

|> >  In fact, a new process will not be
|> >allowed to run if it cannot fit in the virtual space available.  It will
|> >be killed in the memory allocation stage.  If you examine the kernel .o
|> >files, you will see that the only swapping program is vmswap.c.  This
|> >program manages/shares the same virtual address space as vmpage.c.
|> 
|> I don't know one way or the other on this.  Does it kill the new
|> process or kill an old process?  I know if existing processes
|> ask for more memory and swap space is full it will start killing off
|> processes.  But I don't know what the algorithm is.

When a process asks for more memory than there is virtual memory
(via malloc, or sbrk(2)), an error will be returned, and the process will not
get any bigger.  It will not be killed by the kernel either; it will usually
call exit by itself, or fail with a segmentation violation as it mindlessly
tries to use the NULL pointer returned by malloc.
(Of course, a proper program will try to clean up after itself and handle the
lack of memory in a graceful fashion. :-)
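
The graceful way is nothing fancy; just test the pointer before using it.
A trivial sketch (nothing 3b1-specific about it):

#include <stdio.h>
#include <stdlib.h>

/* Ask for a big chunk and bail out cleanly if the system says no,
 * instead of charging ahead with a NULL pointer. */
int
main()
{
	char *p;

	p = malloc(512 * 1024);		/* half a megabyte */
	if (p == NULL) {
		fprintf(stderr, "out of memory (or swap); giving up\n");
		exit(1);
	}
	/* ... use p ... */
	free(p);
	return 0;
}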

I believe that if there is still plenty of virtual memory but
very little swap space left, and a program then calls malloc (or sbrk),
the same error will be returned, and the program will not grow.  I do not think
that a running program will be killed because there is not enough memory to
grow into; it simply will not be allowed to grow anymore.

If a new process tries to start running, and there is no more memory,
the kernel will not let it run.  I do not think that it will start and then
be killed by the kernel.  The kernel should never just start killing
off processes when swap space runs out.  Individual processes will start to
die off as they need more memory and cannot get it.

--
Tom Tkacik
GM Research Labs
tkacik@hobbes.cs.gmr.com
tkacik@kyzyl.mi.org

pschmidt@athena.mit.edu (Peter H. Schmidt) (06/14/91)

In [a bunch of] article[s] [several people reply to Don Nichols' post about
swapping, etc.]:

Just a couple of points I wanted to clarify:

o The 3b1 imposes a *per process* limit of 4M of *virtual memory*.  Thus, every
process can malloc up to 4M for itself, as long as there is enough physical
memory to back it up.

o Without some clever hardware hacks (involving installing a memory board in
a way which fools the 3b1 about what slot it is in), there is a 3.5M
RAM limit, i.e. you can't have more than 3.5M of RAM.

o Physical memory = RAM + sizeof(swap partition) - sizeof(kernel)


Therefore, with a 2M mother board, no memory cards, a 4M default (5000 ~=
4 * 1024 * 1k) swap partition, and a ~300k kernel:

  Physical memory = 2M + 4M -.3M = 5.7M

Thus, all your daemons, your shell and everything else have up to 5.7M to
split between them.  If, like me, you tend to run two GNU-Emacses w/tons
of elisp loaded, increasing the swap partition to 8M, say, can be a big help,
eliminating the annoying "doing vfork: not enough space" message.  This I know
from experience :-)

The absolute maximum physical memory a hacked-to-ridiculousness 3b1 could have
would be:
	PM = 4M + 180M -.3M = 183.7M
...with memory board hack, 3.51, WD2010, one of those big 180M disks, and a 0k
file partition (/dev/fp002).  Of course, it wouldn't be a very *useful*
machine configured this way...
 
As to the issue of "swapping off one disk and paging off the other", I can
state that it is perfectly possible (leaving aside the point about the
distinction between paging and swapping - I mean here that your VM backing
stores can perfectly well reside on multiple disk partitions).  For example,
BBN's nX derivative of MACH implements the pager as a process outside of the
kernel, and it pages off of whatever mounted device has the most free space
(it just grabs blocks off the free list).  Thus, paging can often be spread
among several disks or partitions, and it works fine, because the pager has a
nice little table telling it where to find the page that has virtual address
0xDEADBEEF when your program tries to touch it.  (That's a *little*
oversimplified ;-)


So, to review: each process has a hope of grabbing up to 4M, but can only get
as much of that 4M as there is physical memory left free (which may well be
all of the 4M).  OK?

--Peter

--
Peter H. Schmidt	| ...mit-eddie!winter!pschmidt
3 Colonial Village, #10	| winter!pschmidt@mit-eddie.mit.edu
Arlington, MA  02174	| -- Speaking for myself.

andyw@aspen32.cray.com (Andy Warner) (06/14/91)

In article <1991Jun11.030216.6155@ceilidh.beartrack.com>, dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
> In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
> >How does the 3b1 know where to find swap space, and can it use the "swap
> >partition" on the second hard disk?
> >
> >						little david
> 
> 	It looks for the special device file "/dev/swap".  Here, we see that
> it is the same major and minor device numbers as /dev/fp001, and probably
> should be linked, but isn't on this system.  If you do so, you should use
> the permissions and ownership of the "swap" listing.
> 
> br--------  1 root    root      0,  1 Dec 30 21:42 /dev/fp001
> brw-r-----  1 sys     sys       0,  1 Dec 30 21:50 /dev/swap
> 
> 	You should be able to simply replace "/dev/swap" with a link to the
> selected partition on your other drive, but I know of no way to have it use
> .....

This probably won't work. I'll defer to anyone with access to the
source, but the kernel doesn't use /dev/swap, that's just for things like
ps. Most V.2 kernels use three variables rootdev, swapdev & pipedev to
find their root, swap & pipe device. These are either compiled into the
binary (though they are variables, so you can patch them with adb), or
are picked up from the Volume Header Block of the boot disk. I don't
know if the Convergent VHB holds information about what a slice is to
be used for (I've got a feeling it doesn't). One good thing is that
I'm sure they pick up the size of the swap device from the VHB, so you
don't have to tweak that.

So you can do it, but you'll need to patch swapdev, and then ALSO
change /dev/swap.

Hope this helps,
--
andyw.	(W0/G1XRL)

andyw@aspen.cray.com	Andy Warner, Cray Research, Inc.	(612) 683-5835

adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) (06/14/91)

In article <1991Jun13.122803.25362@ims.alaska.edu> floyd@ims.alaska.edu
(Floyd Davidson) writes:
>[...]
>In article <1991Jun13.065207.10089@ucunix.san.uc.edu>
>adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes: 

>>I am unsure of one other point:  As I understand it, the total (not the 
>>per-process limit, which is clearly 2.5MB) virtual address space of the 
>>UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the 
>>available swap partition beyond its default maximum of around 4.5 MB 
>>(to allow for alternate blocks, filesystem overhead, etc.) cannot 
>>result in any benefit. 
> 
>Let's say you have 4Mb of swap space.  You run two programs that 
>each take up 1.5Mb of memory.  If you start one more there isn't 
>enough swap space to run it.  If you had 6Mb of swap space you 
>could run 3 processes that needed 1.5Mb of memory (and have some 
>left over...). 
> 
>In practice the only way I've found to bump that limit is with 
>gcc, which will take all the memory it can get.  I'm using 8Mb 
>of swap space.  At this moment 6Mb is free.  I've seen it down 
>to less than 2Mb. (And at the same time the load averages were 
>up around 4.00 and response time was long, and it was time to 
>do something else for a few hours while gcc compiled two  
>different versions of itself at the same time...  Not exactly 
>something you would normally do every day.) 
> 

Based on this experience, the 4MB limit must apply only to physically
addressable RAM and not to virtual memory.  (I assume the physical RAM
limit of 4MB could be increased by hardware modification, but as I
don't have the H/W manual I'll leave that topic be.) Therefore, you
should be able to allocate as much swap space as you can afford by
expanding partition 1 on disk 0, up to a (theoretical :-) maximum of
4GB.  (See below, however...) Unfortunately, the 3B1 kernel has only
one swap device configured, so it won't interleave the swap over two
(or more) drives the way 4.3BSD and later versions of System V can.

In summary:

	Virtual Address Space:  Determined by processor word size and
	MMU design (as in MVS/XA).  32 bits for 68010, so = 2^32 or 4GB.
	Only 3GB on the VAX due to hardware limitations.

	Swap space:  Amount of the above you can actually use at any one
	time, present on one or more reserved disk partitions.  This
	space is mapped into the physical RAM of the machine by the
	kernel.  In "true" System V systems (see below) the available
	space can range from (size of swap) to (size of swap + size of
	physical ram + size of shared text and data) depending on the
	particular state of the system at any given moment.

System V rel. 2.1 and later and 4.1BSD and later adopted different
philosophies regarding the implementation of demand-paged virtual
memory.  The BSD systems, being research-oriented and likely to be used
by programmers with a more intimate knowledge of the inner workings of
the O/S, chose simple, less elegant approaches to memory management with
provisions, such as the vfork() call, for the programmer to tune the
system to the application.  In 4.xBSD, swap space is allocated at the
birth of a process, with enough space being allocated to contain the
entire virtual image of the process in its initial state
(text+data+BSS), excepting of course malloc'd (heap) space obtained
via sbrk() or stack space pushdown expansion, and the user structure and
page tables which are RAM-resident for an active (non-swapped) process.

System V took the approach of being as elegant as possible (e.g., using
copy-on-write and dynamic allocation of all tables except the actual
physical memory map) and hiding the inner workings of the system from
programmer intervention.  In the System V scheme, swap space is
allocated by the page stealer as it swaps out pages or processes
as needed.  Thus, allocation of swap space is not made until it is
actually used.  Since process sharable text and data may be
demand-paged in from the filesystem, virtual pages may be mapped to one
of three places: physical RAM, swap space, or the executable program
file.

Unfortunately, the UNIX-PC kernel was developed approximately concurrent
with, and relatively independently from, the SVr2.1 kernel.  This means
that the UNIX-PC kernel is a somewhat bastard hybrid of SVr2.0 and
4.1BSD.  In this case, based on examination of the data structures in
the kernel .o files, we have inherited the 'philosophy' of System V (no
vfork() call, etc.) with the memory management algorithm of 4.1BSD
(core-map based memory allocation).

[...]
>
>I think the gripe we have about the way it got put together is
>that limit of 2.3-2.5Mb on per process virtual memory that can't
>be changed.  I've got a 68020 system that also has 4Mb of real
>memory, but with 8Mb of swap it can allocate 11Mb of memory to one
>process (the difference is the kernel etc.).  If I had some horrible
>need to do it it could be set up for 200Mb of swap and allocate
>that much to one process.  Don't laugh too loud, I saw a post once
>where some guy running some kind of modeling program was doing
>exactly that on a Sun.

The DECsystem 5000 this is being posted from has 216MB of swap
configured with 64MB physical RAM.  The Sun 3 I am typing on has 38MB
interleaved over 2 drives with 24MB physical RAM.  With the massive
sizes of some executables (try running a SPICE simulation of a small
microprocessor!) it will not be unreasonable to see systems with the
full 4GB 32-bit address space supported by swap in the near future.

Anyway, the problem with the BSD-derived core-map structure is that it
restricts the size not only of the physical RAM, but also the number
and size of mounted filesystems (including swap space), and the number
of processes and shared text images.  In "true" System V, the
per-process virtual space is fairly easily tunable and is usually
configured based on the size of the pfdata table (physical RAM).  In
our kernel, changing the 2.5MB limit requires a complete rebuild of the
kernel from source to enlarge the coremap structure accordingly.

>>In any case, swapped-out processes MUST reside in the virtual address
>>space of the system, since they are not swapped _in_ as whole processes. 
>>The System V swapping algorithm is not the same as the old Version 7 one
>>where the entire process had to fit in the RAM.  In SysVr2.1 the
>>swapping algorithm is essentially the same as the 4.1 BSD algorithm. 
>>The process must therefore fit in the VIRTUAL address space since it may
>>be paged in rather than swapped in as a whole entity.  Swapping in the
>>sense of V7 UNIX does not occur.
>
>That fits my understanding of what is happening.  That is the
>difference between swapping and paging.

Swapping under SysVr2.1 and later is merely an adaptation of the normal
page-stealer (vhand) daemon in which entire processes are marked
'swapped but ready-to-run' making their entire working set (physical
RAM pages) available for allocation until the memory utilization falls
below the 'high water mark' that triggers swapping.  For each process
so marked, the swapper makes another such process ready to run.  These
processes, however, are NOT 'swapped in' in their entirety but are
allowed to page-fault in just like any normal process.  Both page
stealing and swapping are spawned from process 0.

Under 4.xBSD, page reclamation is initiated by a separate kernel
process, the pagedaemon or process 2.  The swapper is process 0.  The
swapper will become active under several conditions where memory is
critical or a process has slept for over 20 seconds.  Swapped processes
page-fault in as above, but are treated specially by the system.  In
general, swapping under 4.xBSD slows the system abruptly because
the swapper attempts to guess the amount of memory a process being
swapped in will need and will attempt to reserve memory for processes
being swapped in.  In this sense, swapping under 4.xBSD resembles more
closely the V7 process.

>>  In fact, a new process will not be
>>allowed to run if it cannot fit in the virtual space available.  It will
>>be killed in the memory allocation stage.  If you examine the kernel .o
>>files, you will see that the only swapping program is vmswap.c.  This
>>program manages/shares the same virtual address space as vmpage.c.
>
>I don't know one way or the other on this.  Does it kill the new
>process or kill an old process?  I know if existing processes
>ask for more memory and swap space is full it will start killing off
>processes.  But I don't know what the algorithm is.

It is dependent on the system (BSD or SV) and the state of the system. 
In any case, the process requesting swap space at the time is usually
the one that will die.  Under SysV, this could be a process in core
that's being paged out in response to one being demand-paged from disk.
The timing is rather critical.  In the BSD case, it will generally be
the new process since swap space for the entire virtual image is
allocated up front.  On the other hand, if the BSD swapper wakes up, it
attempts to allocate additional swap space for the user structure and
page tables which are normally RAM-resident for any active process.  If
this fails, the swapper will attempt to swap out another process.  If
the memory shortage is critical the system will not allow any processes
other than those currently resident or being swapped in and out to run.

Note that the above is based on published information rather than kernel
source code, so some details may be inconsistent with individual
implementations of the algorithms.

Also, thad@public.BTR.COM (Thaddeus P. Floryan) writes:
#
#In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
#>[...] Thus,
#>the notion of swapping to one device and paging to another is
#>impossible.
#
#Not with DEC's VAX/VMS.  There's both a "swap" and a "page" ``file''.  (Gawd,
#I never thought I'd be defending VMS, the penultimate Vomit Making System :-)
#
#>[...]
#>I am unsure of one other point:  As I understand it, the total (not the
#>per-process limit, which is clearly 2.5MB) virtual address space of the
#>UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
#>available swap partition beyond its default maximum of around 4.5 MB
#>(to allow for alternate blocks, filesystem overhead, etc.) cannot
#>result in any benefit. [...]
#
#Not true.  As I discovered earlier this year while having gcc compile the
#"ephem" program in the background and doing some other "online" emacs and
#gcc work, increasing the swap partition on my HD from the default multi-user
#5MB to 12MB made a BIG difference (those processes simply would NOT run
#before ("Out of swap space"); now they do.)

Another confirmation that swap space on the UNIX-PC _IS_ expandable.

In VMS, the paging algorithm is not 'system-wide' but there are multiple
partitions of memory, each of which has an independent paging daemon.  I
am not very familiar with VMS but would hazard a guess that this is the
reason for the separate files.  This approach allows the sysadmin to
assure that certain classes of programs are always allocated at least a
certain amount of system memory.

In any event, it is also possible to envision a system in which there is
a virtual space of some size (say, 4GB -- 2^32 -- 32-bit addressing) that
is managed by a swapping algorithm which has a much larger, separate,
swap space from which entire processes are 'swapped' into and out of the
virtual space in the manner of V7 UNIX.  Those processes resident in the
virtual space would themselves be paged in and out of main memory.  In
some respects this is a similar concept to background utilization systems
that run processes on idle CPUs in a network.

-- 
       Jim Adams              Department of Physiology and Biophysics
adams@ucunix.san.uc.edu     University of Cincinnati College of Medicine      
      "I like the symbol visually, plus it will confuse people." 
                                       ... Jim Morrison

tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) (06/14/91)

In article <1991Jun13.170111.28789@athena.mit.edu>,
pschmidt@athena.mit.edu (Peter H. Schmidt) writes:
|> In [a bunch of] article[s] [several people reply to Don Nichols' post about
|> swapping, etc.]:
|> 
|> Just a couple of points I wanted to clarify:
|> 
|> o The 3b1 imposes a *per process* limit of 4M of *virtual memory*.  Thus, every
|> process can malloc up to 4M for itself, as long as there is enough physical
|> memory to back it up.

Whoa there!  That's not right.  Just because there is a 4Mb virtual limit
does not mean that you can malloc 4Mb.  The number floating around is 2.5Mb.
That is how much of the 4Mb you have access to.  The rest is the kernel and
shared libraries, which for some reason apparently take up 1.5Mb.

|> o Without some clever hardware hacks (involving installing a memory board in
|> a way which fools the 3b1 about what slot it is in), there is a 3.5M
|> RAM limit, i.e. you can't have more than 3.5M of RAM.

Wrong again.  You can have 4Mb of physical memory in a 3b1.  No need for
hardware hacks either; unless you call installing a 2Mb memory card
a hardware hack.  I speak from experience when I say that my UnixPC has
4Mb of physical memory in it.

|> o Physical memory = RAM + sizeof(swap partition) - sizeof(kernel)

I have never heard of this definition, but it's no worse than any other.

|> Therefore, with a 2M mother board, no memory cards, a 4M default (5000 ~=
|> 4 * 1024 * 1k) swap partition, and a ~300k kernel:
|> 
|>   Physical memory = 2M + 4M -.3M = 5.7M

Be careful with your numbers.  I believe that the kernel takes up
more than .3Mb, you have a bunch of buffers and things in there too.

I have never gotten a reply on a question I have.  If you have 4Mb RAM, and
7Mb swap space, can you run 11Mb worth of programs, or are you limited to
how much swap space you have?  I'm not sure that they add up that way.

|> Thus, all your daemons, your shell and everything else have up to 5.7M to
|> split between them.  If, like me, you tend to run two GNU-Emacses w/tons
|> of elisp loaded, increasing the swap partition to 8M, say, can be a big help,
|> eliminating the annoying "doing vfork: not enough space" message.  This I know
|> from experience :-)

I would be very reluctant to try to do the math you just did.
There are other variables you need to take into account, like how many kernel
buffers there are.
Increasing swap space is a good way to allow running two or three
gcc compiles and emacs at the same time (I have a 10Mb swap partition at home);
just don't try to be so exact, it's not that important.

(A quick note.  The 3b1 does not have vfork.  That is a Berkeleyism.)

|> The absolute maximum physical memory a hacked-to-ridiculousness 3b1 could have
|> would be:
|> 	PM = 4M + 180M -.3M = 183.7M
|> ...with memory board hack, 3.51, WD2010, one of those big 180M disks, and a 0k
|> file partition (/dev/fp002).  Of course, it wouldn't be a very *useful*
|> machine configured this way...

You neglect the fact that you could install a second hard disk.

|> As to the issue of "swapping off one disk and paging off the other", I can
|> state that it is perfectly possible (leaving aside the point about the
|> distinction between paging and swapping - I mean here that your VM backing
|> stores can perfectly well reside on multiple disk partitions).  For example,
|> BBN's nX derivative of MACH implements the pager as a process outside of the
|> kernel, and it pages off of whatever mounted device has the most free space
|> (it just grabs blocks off the free list).  Thus, paging can often be spread
|> among several disks or partitions, and it works fine, because the pager has a
|> nice little table telling it where to find the page that has virtual address
|> 0xDEADBEEF when your program tries to touch it.  (That's a *little*
|> oversimplified ;-)

I think that this discussion was pertaining to the 3b1.  You can only
have a single swap partition.  Other machines can, and do, implement
things differently.

|> So, to review: each process has a hope of grabbing up to 4M, but can only get
|> as much of that 4M as there is physical memory left free (which may well be
|> all of the 4M).  OK?

You've got the right idea.  Some of the details are questionable.

--
Tom Tkacik
GM Research Labs
tkacik@hobbes.cs.gmr.com
tkacik@kyzyl.mi.org

clewis@ferret.ocunix.on.ca (Chris Lewis) (06/14/91)

In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
>In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
>>In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>>	It looks for the special device file "/dev/swap".  Here, we see that
>>it is the same major and minor device numbers as /dev/fp001, and probably
>>should be linked, but isn't on this system.  If you do so, you should use
>>the permissions and ownership of the "swap" listing.

/dev/swap is *usually* just a convenient invariant name for ps and
other user-level tools that rummage around inside other processes' address
space to find the swap area.  Most UNIX kernels have swapdev hardcoded as a major/minor
pair in the binary.  These machines usually need to have the kernel relinked
to change swapdev.  "/dev/swap" is just a convenience so you don't have to
recompile the user-level processes like ps.  There are a few machines that
postpone swap device assignment to somewhere early during the boot phase and use
special system calls to do it.  But I don't think the 3b1 is one of those.
If you simply rename /dev/swap the kernel won't notice, but ps will, and
will rummage thru the "wrong" swap area.
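
For the curious, a ps-like tool typically finds the real swapdev roughly
like this: look the symbol up in /unix with nlist(3), then read the value
out of /dev/kmem.  (A sketch only; the symbol name and the types here are
assumptions on my part, not something pulled from 3b1 source.)

#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <nlist.h>
#include <sys/types.h>

struct nlist nl[] = {
	{ "swapdev" },		/* may be "_swapdev" on some systems */
	{ 0 }
};

/* Find swapdev's address in the kernel's symbol table, then read
 * its current value out of kernel memory. */
int
main()
{
	int kmem;
	dev_t swapdev;

	if (nlist("/unix", nl) < 0 || nl[0].n_value == 0) {
		fprintf(stderr, "can't find swapdev in /unix\n");
		return 1;
	}
	if ((kmem = open("/dev/kmem", O_RDONLY)) < 0) {
		perror("/dev/kmem");
		return 1;
	}
	lseek(kmem, (long)nl[0].n_value, 0);
	read(kmem, (char *)&swapdev, sizeof(swapdev));
	printf("swapdev = %d,%d\n",
	    (int)(swapdev >> 8), (int)(swapdev & 0xff));
	return 0;
}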

On SVR3 on a 3b2 or 386, on the other hand, you link in the initial swapdev,
and you can append swap devices during multiuser so that you can have
your paging area spread across physical devices.

>There also seems to be some confusion concerning swapping.  Swapping and
>paging are the same process.  Only the amount of data transferred is
>different.

Not quite.  When you have a pure swapping system (eg: V7, early System V's,
32V), the process gets completely swapped out, and the process can't be
restarted until the whole process gets swapped in.  These machines
either cannot support demand load (eg: 68000's), or didn't want to bother...
On dual systems, they frequently swap out, and then page fault back in.

>The kernel resorts to swapping entire processes as a last
>resort when the memory is nearly full and it needs a bigger chunk of
>memory than it can get from swapping unused pages.  All of this memory
>must still reside within the virtual address space of the system.  Thus,
>the notion of swapping to one device and paging to another is
>impossible.

In systems capable of both swapping and paging (eg: BSD 4.x), the algorithms
for deciding which to do are rather system dependent.  One common reason
for swapping on "simpler" machines is when you have to do raw I/O with
buffers that span page boundaries, and you have to do a page shuffle to
get the pages contiguous.

>I am unsure of one other point:  As I understand it, the total (not the
>per-process limit, which is clearly 2.5MB) virtual address space of the
>UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
>available swap partition beyond its default maximum of around 4.5 MB
>(to allow for alternate blocks, filesystem overhead, etc.) cannot

(alternate blocks and filesystem overhead on a swap partition?  It's
not a file system!)

>In any case, swapped-out processes MUST reside in the virtual address
>space of the system, since they are not swapped _in_ as whole processes. 

This may be true of a specific paging algorithm, but I think you may
be getting a bit confused yourself.  Are you speaking in general terms,
or with a specific paging implementation in mind?  In general, there's no
particular reason that a paged out process be part of the address space of
anything - the kernel only needs to associate a given page fault from a specific
process with the set of page descriptors describing that process from
which to find a disk address.  In many cases, the swap space need only be
constrained by the size of the integer used for disk addresses, and
the virtual space of the kernel is "just" the physical memory.  There
are several examples of systems with virtual memory and paging where the
total of process virtual address spaces considerably exceed the number of
address lines that the processor has (eg: MVS and VM especially with
pre-XA 24 bit physical addresses).
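
Put another way, all the kernel really needs per virtual page is a little
record saying where that page's contents currently live.  Something like
this, conceptually (the names and layout are invented for illustration;
they are nobody's actual structures):

/* One entry per virtual page of a process. */
struct pagedesc {
	unsigned valid  : 1;	/* page is resident in physical memory   */
	unsigned onswap : 1;	/* if not: copy lives on the swap device */
	unsigned pfn    : 22;	/* physical page frame number, if valid  */
	long	diskaddr;	/* block on swapdev (or in the a.out)    */
};

On a page fault the kernel looks up the faulting process's entry, finds the
disk address, and schedules the read; nothing about that requires the swap
area itself to sit inside anybody's address space.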

I have this awful feeling that the total virtual address space limit on
a 3b1 is simply because they fixed the size of the page table pool or
some other reason.  The 4mb physical limit is a different story.
-- 
Chris Lewis, Phone: (613) 832-0541, Domain: clewis@ferret.ocunix.on.ca
UUCP: ...!cunews!latour!ecicrl!clewis; Ferret Mailing List:
ferret-request@eci386; Psroff (not Adobe Transcript) enquiries:
psroff-request@eci386 or Canada 416-832-0541.  Psroff 3.0 in c.s.u soon!

floyd@ims.alaska.edu (Floyd Davidson) (06/14/91)

In article <55913@rphroy.UUCP> tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) writes:
>In article <1991Jun13.122803.25362@ims.alaska.edu>, floyd@ims.alaska.edu
>(Floyd Davidson) writes:
>|> 
>|> I don't know one way or the other on this.  Does it kill the new
>|> process or kill an old process?  I know if existing processes
>|> ask for more memory and swap space is full it will start killing off
>|> processes.  But I don't know what the algorithm is.
>
>When a process asks for more memory than there is virtual memory,
>(via malloc, or sbrk(2)), an error will be returned, and the process will not
>get any bigger.  It will not be killed by the kernel either, it will usually
>call exit by itself, or fail with a segmentation violation as it mindlessly
>tries to use the NULL pointer returned by malloc.
>(Of course, a proper program will try to clean up after itself and handle the
>lack of memory in a graceful fashion.:-)
>
>I believe that if there is still plenty of virtual memory but
>very little swap space left, if a program then calls malloc (or sbrk)
>the same error will be returned, and the program will not grow.  I do not think
>that a running program will be killed because there is not enough memory to
>grow into, it simply will not be allowed to grow anymore.
>
>If a new process tries to start running, and there is no more memory,
>the kernel will not let it run.  I do not think that it will start and then
>be killed by the kernel.  The kernel should never just start killing
>off processes when swap space runs out.  Individual processes will start to
>die off as they need more memory and cannot get it.


This is pretty easy to check out, so I did.  I took the memory
allocation check routine from hard-params.c in the gcc-1.40
distribution and made a little program that allocates all the
memory it can, tells how much it was, and sleeps for two minutes.
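
For anyone who wants to repeat the experiment, the program was essentially
the following (my own reconstruction along the lines of hard-params.c, not
the exact code):

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

#define CHUNK	(16 * 1024)	/* allocate in 16k pieces */

/* Grab memory until malloc refuses, report the total, then hold on
 * to it for two minutes so sysinfo can be watched in the meantime. */
int
main()
{
	long total = 0;

	while (malloc(CHUNK) != NULL)
		total += CHUNK;

	printf("Allocate %.3f Mb\n", total / (1024.0 * 1024.0));
	fflush(stdout);
	sleep(120);
	return 0;
}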

Then I ran it several times in the background.  Interesting.
The first three invocations were always repeatable:

At the start:
  -- Free memory: 2.368 Mb, Free swap 6.412 Mb

Allocate 2.481 Mb
  -- Free memory: 2.296 Mb, Free swap 3.828 Mb

Allocate 2.481 Mb
  -- Free memory: 2.236 Mb, Free swap 1.260 Mb

Allocate 1.021 Mb
  -- Free memory: 2.164 Mb, Free swap 0.228 Mb

At this point things work differently every time.  Sometimes
it will allocate 20-40k of memory, sometimes it won't.

When it won't, a message that I assume comes from the kernel,
but could be from the shell, says "Killed".

Any attempt at running ls results in "Killed".  However,
several smaller utilities like who and ps will run.

On one occasion I did get a message definitely from ksh
that said something like "ksh:  failed to fork()".  I'm not
sure exactly, but that was the essence of it.  The fork
failed to get enough memory even to start another process.
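
That matches what fork(2) is supposed to do when it can't get the resources:
it returns -1 to the parent rather than killing anything, and it's up to the
shell (or whoever called it) to cope.  A minimal sketch of the check:

#include <stdio.h>
#include <errno.h>
#include <unistd.h>
#include <sys/wait.h>

/* Try to fork and report failure instead of assuming success.  When
 * swap is exhausted, fork() fails (typically EAGAIN) and no child is
 * ever created. */
int
main()
{
	int pid;

	pid = fork();
	if (pid < 0) {
		fprintf(stderr, "fork failed, errno %d%s\n", errno,
		    errno == EAGAIN ? " (no room for another process)" : "");
		return 1;
	}
	if (pid == 0)
		_exit(0);		/* child: nothing to do */
	wait((int *)0);			/* parent: reap the child */
	return 0;
}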

Nothing else got killed, and there was nothing running that
would have tried mallocing more memory.  The free memory and
free swap space numbers came from sysinfo (updating every five
seconds).

It does appear that normally programs would in fact handle
a full swap device by whatever means they handle an out of
memory condition, because malloc definitely returns a NULL
when there is no swap space, even if there is sufficient
free memory.  But it also appears that the kernel will
kill a process attempting to start if it can't allocate swap
space for it.

Now as to whether the kernel has specifically a built in
per process limit, or whether it is just a case of running
out of virtual memory (at 4Mb max) due to the program size
and shared libs and the kernel adding up,  here is an easy
way to find out.  If a few people will run the hard-params
binary that comes with the gcc distribution and report 
exactly how much memory it says can be allocated, we can
tell.  That is the last thing hard-params reports, so just
let it run and read the last couple lines.

My kernel is a slightly modified 3.51m version (re-linked
with space for 10 mounted drives).  A standard 3.51m, 3.51a,
or 3.51 might show up slightly different if the limit is
merely how much virtual memory is available.  But it will
be the same if the kernel actually has a limit.  It might
also be the same if the actual difference in size between
various kernels is less than one page in memory too, though.
(Running size on my kernel shows 157,840 if that's worth
anything.)

Floyd
-- 
Floyd L. Davidson   | Alascom, Inc. pays me, |UA Fairbanks Institute of Marine
floyd@ims.alaska.edu| but not for opinions.  |Science suffers me as a guest.

ignatz@wam.umd.edu (Mark J. Sienkiewicz) (06/15/91)

In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
>In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>>How does the 3b1 know where to find swap space, and can it use the "swap
>>partition" on the second hard disk?
>
>	It looks for the special device file "/dev/swap".  Here, we see that

This is not correct.  It uses partition #1 on hard disk #0.  This just happens
to be named /dev/swap, but the kernel does not look in /dev to find it.  It
just uses the block device with major number 0, minor number 1.

>	You should be able to simply replace "/dev/swap" with a link to the

as described above, this doesn't work.

>other, so you could increase the space to handle emergency conditions, since
>the actual "swap" procedure is used only under relative emergency
>conditions, with paging used for normal conditions.

The swap area is used both to swap processes and to store DATA pages when
they are paged out.  Read only TEXT pages are never swapped out, but they
are sometimes thrown away and re-read from the a.out file you are executing.

>and general understanding.  If I gave wrong advice, I'm sure I'll hear about
>it :-)

Congrats! I wish more people on the net had this attitude.

There are variables in most unix kernels:
	_swapdev	major/minor # of swap device
	_swplo		first block number on device available for swapping
	_nswap		number of blocks available
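
On kernels where these are fixed at link time, the declarations typically
live in something like a conf.c.  A generic V.2-flavored sketch, not the
actual 3b1 file; the device numbers and sizes below are only examples, and
the 8-bit major/minor split is my assumption:

#include <sys/types.h>

/* Generic System V style configuration variables.  The kernel reads
 * these at boot time; it never looks at /dev/swap by name. */
dev_t	rootdev = 0x0000;	/* major 0, minor 0: disk 0, slice 0 (root) */
dev_t	swapdev = 0x0001;	/* major 0, minor 1: disk 0, slice 1 (swap) */
dev_t	pipedev = 0x0000;	/* pipes live on the root device            */

daddr_t	swplo = 0;		/* first usable block of the swap slice     */
int	nswap = 10000;		/* swap size in blocks (example figure)     */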

Not all unixes have these.  If you are knowledgeable, brave, or foolhardy,
you can use ADB to modify your kernel to change these.  DO NOT DO THIS TO
A RUNNING KERNEL.  cp /unix /unix.hacked; adb /unix.hacked; then boot the
modified kernel.

happy hacking.

dnichols@ceilidh.beartrack.com (DoN Nichols) (06/16/91)

In article <100425.7286@timbuk.cray.com> andyw@aspen32.cray.com (Andy Warner) writes:
>
	[ ... my mistaken assumption deleted ... ]
>
>This probably won't work. I'll defer to anyone with access to the
>source, but the kernel doesn't use /dev/swap, that's just for things like
>ps. Most V.2 kernels use three variables rootdev, swapdev & pipedev to
>find their root, swap & pipe device. These are either compiled into the
>binary (though they are variables, so you can patch them with adb), or
>are picked up from the Volume Header Block of the boot disk. I don't
>know if the Convergent VHB holds information about what a slice is to
>be used for (I've got a feeling it doesn't). One good thing is that
>I'm sure they pick up the size of the swap device from the VHB, so you
>don't have to tweak that.

	Looking at the /usr/include/sys/gdisk.h file to check for this, I
found something that looks interesting.  Does this mean that I can have a
filesystem automatically mount itself (aside from under the control of
/etc/rc)?  If so, when would this happen?  Does it happen only for
removable-media drives like the Syquest?  How about a floppy?  I can see why
I might want to have removable-media drives automount, but is there any
reason for doing things this way with a normal disk partition?  Is there any
good reason for this to exist?  Why am I asking all these questions ... ?

/*      volume home block on disk       */
struct vhbd {
        uint    magic;          /* S4 disk format code */
        int     chksum;         /* adjustment so that the 32 bit sum starting

	[ ... ]

        char    fpulled;        /* dismounted last time? */
***>>>  struct  mntnam mntname[MAXSLICE];       /* names for auto mounting.
                                                        null string means no
                                                        auto mount */
        long    time;           /* time last came on line */

	[ ... ]
};


	Thanks
		DoN.
-- 
Donald Nichols (DoN.)		| Voice (Days):	(703) 664-1585
D&D Data			| Voice (Eves):	(703) 938-4564
Disclaimer: from here - None	| Email:     <dnichols@ceilidh.beartrack.com>
	--- Black Holes are where God is dividing by zero ---

adam@cs.UAlberta.CA (Michel Adam; Gov't of NWT) (06/16/91)

In article <1991Jun14.225216.23361@cfctech.cfc.com> kevin@cfctech.cfc.com (Kevin Darcy) writes:
>In article <1991Jun13.192444.7500@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
>>In article <1991Jun13.122803.25362@ims.alaska.edu> floyd@ims.alaska.edu
>>(Floyd Davidson) writes:
>>>[...]
>>>In article <1991Jun13.065207.10089@ucunix.san.uc.edu>
>>>adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes: 
>>
>>System V rel. 2.1 and later and 4.1BSD and later adopted different
>>philosophies regarding the implementation of demand-paged virtual
>>memory.  The BSD systems, being research-oriented and likely to be used
>>by programmers with a more intimate knowledge of the inner workings of
>>the O/S, chose simple, less elegant approaches to memory management with
>>provisions, such as the vfork() call, for the programmer to tune the
>>system to the application.  In 4.xBSD, swap space is allocated at the
>>birth of a process, with enough space being allocated to contain the
>>entire virtual image of the process in its initial state
>>(text+data+BSS), excepting of course malloc'd (heap) space obtained

[ ... ]

>
>Although I'm no kernel hacker, I will point out, from a sysadmin's point-of-
>view (with advice from the higher levels of AT&T support/development people)
>that later SysV implementations still retain vestiges of the old virtual 
>memory scheme you describe: on an AT&T 3B2/600 running r3, for instance, the 
>kernel always attempts to map two -entire- sets of -contiguous- pages in the 
>swap area for every process upon startup, whenever sbrk() is called for more 
>memory, and so on. One of these sets is for paging, and the other, for 
>swapping. If, for some reason, there is not enough contiguous space available 
>to allocate one of these sets, the kernel spits out a "getcpages: cannot 
>allocate X contiguous pages" console message (X being some-number-or-other). 
>If there is insufficient contiguous space to map EITHER page, new processes 
>will die on startup, or ungrowable processes will (usually) seg violate, with 
>all of this mayhem accompanied by delightful kernel messages such as "growreg: 
>unable to allocate ...", or "dupreg: unable to allocate..." or "uballoc: 
><something-or-other>" spewing on the console.
>
>The algorithm for mapping this swap space is unknown to me, but it appears to 
>be imperfect - from time to time, one of our 3B2's or another will suddenly 
>experience these symptoms for no particular reason, and plenty of available 
>swap space. Once it starts, even programs with relatively small run-time 
>images (e.g. /bin/cat) will then die on startup, with the kernel complaining 
>that it cannot allocate swap-area space. Obviously, the only cure at that 
>point is to reboot...
>
>Perhaps AT&T has been slow in -implementing- the VM "elegance" of which you 
>speak? I see very little of it on their own products.
>

I seem to remember a discussion on this subject, 6 to 12 months ago, on the
problems with insufficient RAM available on the AT&T 386 machines (it must
have been SYS V Ver. 3.something).  One comment that stuck in my mind at the
time was that, for some reason, the recommendation was to have as much real
RAM as the swap partition's size (I may be mistaken on that ...), to avoid
some kind of thrashing.  There was an explicit reference to the implementation
by Convergent, for the 7300, being somewhat different and managing to avoid
this problem.  There were numerous references to the Bach book, but I don't
remember if the difference between the CT version for the 3B1 and the other
versions running on 386 machines was described.

Can someone who archived these articles look up the conclusion? What was it
that CT did differently in their implementation?

(Can someone with access to the kernel source find out?)

>>-- 

>>       Jim Adams              Department of Physiology and Biophysics
>>adams@ucunix.san.uc.edu     University of Cincinnati College of Medicine      
>>      "I like the symbol visually, plus it will confuse people." 
>>                                       ... Jim Morrison
>
>------------------------------------------------------------------------------
>kevin@cfctech.cfc.com 		  | Kevin Darcy, Unix Systems Administrator
>...sharkey!cfctech!kevin 	  | Technical Services (CFC)
>Voice: (313) 759-7140 		  | Chrysler Corporation
>Fax:   (313) 758-8173 		  | 25999 Lawrence Ave, Center Line, MI 48015
>------------------------------------------------------------------------------



Michel Adam

adam@cs.ualberta.ca   (...!alberta!adam)

or

adam@iceman.UUCP   (...!alberta!iceman!adam)

tkacik@kyzyl.mi.org (Tom Tkacik) (06/21/91)

In article <55907@rphroy.UUCP>, I wrote:
> In article <1991Jun13.065207.10089@ucunix.san.uc.edu>,
> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
> |> 
> |> I am unsure of one other point:  As I understand it, the total (not the
> |> per-process limit, which is clearly 2.5MB) virtual address space of the
> |> UNIX-PC is 4 megabytes.  If this is in fact the case, increasing the
> 
> There are two virtual memory sizes being thrown around, plus swap space.
> What are these, and where do they come from?
> 
> 4Mb.  This is the total virtual address space of the 3b1.  This IS
> 
> 2.5Mb.  I'm not sure why you say the per-process limit is 'clearly' 2.5Mb?
> Where this number comes from is not obvious (or clear).
> This is the virtual address space available for a user program,
> (not process).  The kernel, shared libraries, and perhaps other sundry

I have been perusing /usr/include/sys to try to see where that
2.5MB virtual memory limit comes from, and found in /usr/include/sys/param.h
the following defines.

#define VUSER_START	0x80000		/* start address of user process */
#define VUSER_END	0x300000	/* end address of user process */
#define SHLIB_START	VUSER_END	/* start address of shared lib */
#define SHLIB_END	0x380000	/* end address of shared lib */
#define KVMEM_VBASE	SHLIB_END	/* start addr of kernel vm */
#define KVMEM_VLIMIT	0x400000	/* end addr of kernel vm	*/

These define how virtual memory space is used in each process.
They clearly show that the user process goes from 0x80000 to 0x300000.
This is the 2.5MB.  Shared libraries must fit into the 0.5MB between
0x300000 and 0x380000, while the kernel must fit into the 0.5MB
between 0x380000 and 0x400000 (the very top of virtual memory).
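
A two-minute sanity check (untested, and assuming param.h really does define
the symbols exactly as quoted above) makes the arithmetic explicit:

	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/param.h>

	main()
	{
		/* sizes of the three regions defined above, in kilobytes */
		printf("user process: %ldK\n", (long)(VUSER_END - VUSER_START) / 1024);
		printf("shared libs:  %ldK\n", (long)(SHLIB_END - SHLIB_START) / 1024);
		printf("kernel vm:    %ldK\n", (long)(KVMEM_VLIMIT - KVMEM_VBASE) / 1024);
		exit(0);
	}

That should print 2560K, 512K, and 512K, for a total of 3584K.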

But that's only 3.5MB of the available 4MB.  I cannot find what
could possibly be in that space from 0x00000 to 0x80000.
The ifiles (in /lib) all instruct ld(1) to start the user
program at 0x80000.

Does anybody have any idea what is in that low part of memory?
Shouldn't we be able to get 3MB of virtual memory for our programs?
What would happen if some guinea pig modified the ifile to start a program
much lower in memory?  How much lower could you safely go?  Any takers?

-- 
Tom Tkacik                |
tkacik@kyzyl.mi.org       |     To rent this space, call 1-800-555-QUIP.
tkacik@hobbes.cs.gmr.com  |

alex@umbc4.umbc.edu (Alex S. Crain) (06/21/91)

In article <384@kyzyl.mi.org> tkacik@kyzyl.mi.org (Tom Tkacik) writes:

	[This is the pertinent magic describing the process environment]

>#define VUSER_START	0x80000		/* start address of user process */
>#define VUSER_END	0x300000	/* end address of user process */
>#define SHLIB_START	VUSER_END	/* start address of shared lib */
>#define SHLIB_END	0x380000	/* end address of shared lib */
>#define KVMEM_VBASE	SHLIB_END	/* start addr of kernel vm */
>#define KVMEM_VLIMIT	0x400000	/* end addr of kernel vm	*/

	[Now the questions, reformatted for clarity]

1] This defines how virtual memory space is used in each process.

	Yup.

2] They clearly show that the user process goes from 0x80000 to 0x300000.

	Yup

3] This is the 2.5MB.  Shared libraries must fit into the 0.5MB
between 0x300000 and 0x380000, while the kernel must fit into the
0.5Mb between 0x380000 and 0x400000 (the very top of virtual memory).

	Bzzzzt! Wrong.  The shared libraries do live between 0x300000 and
0x380000, but the top 0.5MB is dynamic memory for the kernel; the kernel
itself lives below 0x80000.

4] But that's only 3.5MB of the available 4MB.  I cannot find what
could possibly be in that space from 0x00000 to 0x80000.  The ifiles
(in /lib) all instruct ld(1) to start the user program at 0x80000.

	Yup. From the kernel's point of view, the world looks like this:

KERNEL 0x400000:--------------------------------------------------------------
KERNEL            Kernel dynamic memory pages (for system buffers, etc)
KERNEL 0x380000:--------------------------------------------------------------
USER              Shared library area
USER   0x300000:--------------------------------------------------------------
USER              User stack
USER   0x2fe???:----------------------
USER              User text/data 
USER 
USER 
USER 
USER   0x080000:--------------------------------------------------------------
KERNEL            User structure (8K)
KERNEL            Kernel stack
KERNEL            Kernel data
KERNEL            Kernel text
KERNEL            Interrupt vector table
KERNEL 0x000000:--------------------------------------------------------------


It looks like that from the perspective of a user program too, except that
the areas labeled KERNEL can't be read by user programs (segfaults).  Unix
works by changing the user pages, but the kernel pages are always there.
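
A quick (untested) way to convince yourself of that from user mode is to poke
at an address in the kernel region and catch the fault; the address below is
just the bottom of the kernel dynamic area from the map above.

	#include <stdio.h>
	#include <signal.h>
	#include <setjmp.h>

	jmp_buf env;

	int
	catch()
	{
		longjmp(env, 1);	/* back out of the faulting read */
	}

	main()
	{
		long *p = (long *)0x380000;	/* kernel dynamic memory area */

		signal(SIGSEGV, catch);
		if (setjmp(env) == 0)
			printf("read 0x%lx from the kernel area?!\n", *p);
		else
			printf("SIGSEGV, as expected\n");
		exit(0);
	}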

5] Shouldn't we be able to get 3MB of virtual memory for our programs?

	You can, but you have to put some of the code in the shared library
(yes, I know that wasn't what you were looking for). This is actually
pretty reasonable, as long as you *add* it to the existing library image
without disturbing any addresses. There is a package to do this in the
archives, shlib.something.Z. The directions are pretty obtuse, but it
works.

6] What would happen if some guinea pig modified the ifile to start a program
much lower in memory?  How much lower could you safely go?  Any takers?

	It would crash with a segmentation violation and be very boring.

&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&&

	A better way to run very large programs would be to use overlays.
This is where a program links several text areas into the same memory space
and does its own swapping. It's pretty common on architectures with very
small address spaces, like PDP-11s, C-64s, etc. The C-128 had some really
slick hardware support for overlays, where they put the OS in ROM, and you
could toggle a register to make it appear in different places in memory.
The idea was to have the OS load your program, and then make the OS go away
and have your program appear in its place. When you needed to make a system
call, you set up the call, toggled the OS back into memory, and jumped. Cute.

Anywho, the system 5 loader is pretty smart, and can do overlays just fine.
You could then toggle the memory in as shared memory or, if you're a little
bolder, you could just diddle the page table.
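
To make the shared-memory flavour of that concrete, here is a rough,
untested sketch.  It assumes the SysV IPC shared-memory calls are configured
into your kernel; the key, size, and attach address are all invented for
illustration.

	#include <stdio.h>
	#include <sys/types.h>
	#include <sys/ipc.h>
	#include <sys/shm.h>

	#define OVL_KEY  ((key_t)0x4f564c31)	/* made-up key */
	#define OVL_SIZE 0x20000		/* 128K overlay, made up */
	#define OVL_ADDR ((char *)0x280000)	/* unused spot in user space, made up */

	main()
	{
		int id;
		char *p;

		/* create (or find) the segment holding the overlay image */
		id = shmget(OVL_KEY, OVL_SIZE, IPC_CREAT | 0666);
		if (id < 0) {
			perror("shmget");
			exit(1);
		}

		/* toggle the overlay in at a fixed address ... */
		p = (char *)shmat(id, OVL_ADDR, 0);
		if (p == (char *)-1) {
			perror("shmat");
			exit(1);
		}

		/* ... call into it here ... */

		/* ... and toggle it back out when done */
		shmdt(p);
		exit(0);
	}

Getting position-correct code into the segment in the first place is left as
an exercise.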



-- 
#################################		           :alex.
#Disclaimer: Anyone who agrees  #                 Systems Programmer
#with me deserves what they get.#    University of Maryland Baltimore County
#################################	    alex@umbc3.umbc.edu

res@colnet.uucp (Rob Stampfli) (06/22/91)

In article <1991Jun21.151926.15624@umbc3.umbc.edu> alex@umbc4.umbc.edu (Alex S. Crain) writes:
>
>6]What would happen if some guinea pog modified the ifile to start a program
>much lower in memory?  How much lower could you safely go?  Any takers?
>
>	It would crash with a segmentation violation and be very boring.

Correct, with one corollary:  On 3.51 (and I believe 3.5) the first page
of virtual memory is read-only and filled with zeros.  This was done, I
believe, to permit programs that (incorrectly) dereference the NULL pointer
to work without causing the segmentation fault Alex describes above.  (There
were so many examples of this problem floating around, it was giving the
Unix-PC OS people nightmares.)

Fine, I thought:  They just allocated 4K and set up the MMU for R/O access.
However, on closer inspection of the hardware, I found what I believe is a
prohibition against accessing memory below 0x80000 in the MMU firmware itself:
The firmware appears to use the supervisor/user mode state signal from the
68010 to prohibit user-level access to the lower 0x80000 bytes, regardless of
the actual MMU state for page zero.  Can anyone confirm or refute this, and
if it is so, how does the OS allow read access to page zero?

-- 
Rob Stampfli, 614-864-9377, res@kd8wk.uucp (osu-cis!kd8wk!res), kd8wk@n8jyv.oh

jmm@eci386.uucp (John Macdonald) (06/22/91)

In article <384@kyzyl.mi.org> tkacik@kyzyl.mi.org (Tom Tkacik) writes:
|I have been perusing /usr/include/sys to try to see where that
|2.5MB virtual memory limit comes from, and found in /usr/include/sys/param.h
|the following defines.
|
|#define VUSER_START	0x80000		/* start address of user process */
|#define VUSER_END	0x300000	/* end address of user process */
|#define SHLIB_START	VUSER_END	/* start address of shared lib */
|#define SHLIB_END	0x380000	/* end address of shared lib */
|#define KVMEM_VBASE	SHLIB_END	/* start addr of kernel vm */
|#define KVMEM_VLIMIT	0x400000	/* end addr of kernel vm	*/
|
|This defines how virtual memory space is used in each process.
|They clearly show that the user process goes from 0x80000 to 0x300000.
|This is the 2.5MB.  Shared libraries must fit into the 0.5MB between
|0x300000 and 0x380000, while the kernel must fit into the 0.5Mb
|between 0x380000 and 0x400000 (the very top of virtual memory).
|
|But that's only 3.5MB of the available 4MB.  I cannot find what
|could possibly be in that space from 0x00000 to 0x80000.
|The ifiles (in /lib) all instruct ld(1) to start the user
|program at 0x80000.
|
|Does anybody have any idea what is in that low part of memory?
|Shouldn't we be able to get 3MB of virtual memory for out programs?
|What would happen if some guinea pog modified the ifile to start a program
|much lower in memory?  How much lower could you safely go?  Any takers?

I would guess that the low 0.5 Meg is non-virtual memory for the
kernel - i.e. the portion of the kernel that is fixed when the
kernel is built.  The virtual memory for the kernel would then
include loadable device drivers and allocatable space...
-- 
Usenet is [like] the group of people who visit the  | John Macdonald
park on a Sunday afternoon. [...] luckily, most of  |   jmm@eci386
the people are staying on the paths and not pissing |
on the flowers - Gene Spafford

adam@cs.UAlberta.CA (Michel Adam; Gov't of NWT) (06/22/91)

In article <224@kas.helios.mn.org> rhealey@kas.helios.mn.org (Rob Healey) writes:
>In article <1991Jun15.195720.25820@cs.UAlberta.CA> adam@cs.UAlberta.CA (Michel Adam;  Gov't of NWT) writes:
>=
>=Can someone who archived these articles look up the conclusion? What was it
>=that CT did differently in their implementation?
>=
>	They used a better processor family... B^). They also grafted 4.1BSD
>	VM onto a System V R2 kernel. As such, you HAVE to have enough

I thought that demand-paged virtual memory only came with 4.2 or 4.3 ...?

>	swap to page in the whole program to swap and then memory. I.e. I
>	ran into troubles on a UNIX PC where I had 700k free memory but
>	only 100k of swap, the system REFUSED to load a 400k program; sucked

   Disk space is cheap, relative to RAM anyway ... It seems a fairly minor
compromise, if it avoids the thrashing that seems to have been a common
problem with the other implementations of SYS V R2 or R3. Of course it may
not be a 'purist' approach, but I'm in business, not academia ... The real
world tends to intrude ...

>	rocks. This just goes to show that both methods have their "evil twin"
>	side. SVR3.x chews up too much swap, Early BSD needs enough swap for the
>	whole initial program even if there is enough main memory but not
>	enough swap/paging.

   I was under the impression that the actual 'a.out' file was used for paging
in the code (read 'text'), and that this reduced the need for disk space in
the swap partition. I once tried to delete a program and got an error message
about the file being in use. Did I miss something here? What about the
shared library? Isn't it supposed to be locked when in use?

   Could someone send me a list of recommended books on the design of the
BSD system, particularly the version that was used as a 'donor' for the
UNIX-PC kernel. A list of all the BSDisms in this kernel would also be
interesting.

   I'll summarize if there is interest.

>
>	I believe SVR4 and 4.3 BSD are better at paging and are more
>	reasonable with it all. If nothing else you can add paging files
>	under both to solve the running out of swap problem, you just take
>	performance hits when the paging file blocks are scattered all
>	over a disk.
>
>		-Rob



Michel Adam
Yellowknife, N.W.T.

adam@cs.ualberta.ca
(...!alberta!adam)

tkacik@kyzyl.mi.org (Tom Tkacik) (06/23/91)

In article <1991Jun21.215503.26210@colnet.uucp>, res@colnet.uucp (Rob Stampfli) writes:
> Correct, with one corrollary:  On 3.51 (and I believe 3.5) the first page
> of virtual memory is read/only and filled with zeros.  This was done, I
> believe, to permit programs that (incorrectly) dereference the NULL pointer
> to work without causing the segmentation fault Alex describes above.  (There
 
> Fine, I thought:  They just allocated 4K and set up the MMU for R/O access.
> However, on closer inspection of the hardware, I found what I believe is a
> prohibition against accessing memory below x80000 in the MMU firmware itself:
> The firmware appears to use the supervisor/user mode state signal from the
> 68010 to prohibit user level access to the lower x80000 bytes, regardless of
> the actual MMU state for page zero.  Can anyone confirm or refute this, and
> if it is so, how does the OS allow read access to page zero?
 
I just did a quick test of reading page 0.  My program ran fine and printed
that location 0 does indeed contain the value 0.  However, if I tried
reading from any other location (even 1), the program crashed with
a core dump.
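
(The actual test program wasn't posted; it was presumably a one-liner along
these lines:)

	main()
	{
		printf("%d\n", *(int *)0);	/* change the 0 to any other
						   address and this dumps core */
	}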

Perhaps it is not done in the MMU.  Instead of writing zeros to the first
page and giving you read permission, the bus error handler checks to
see if the bus error was due to a read of location 0, and returns a
value of zero.  It could be done entirely in software.

I remember trying this test a couple of years ago, and not being able
to read location 0.  This must be a new "feature" of 3.51 (maybe 3.5).

-- 
Tom Tkacik                |
tkacik@kyzyl.mi.org       |     To rent this space, call 1-800-555-QUIP.
tkacik@hobbes.cs.gmr.com  |

res@colnet.uucp (Rob Stampfli) (06/24/91)

In article <389@kyzyl.mi.org> tkacik@kyzyl.mi.org (Tom Tkacik) writes:
>In article <1991Jun21.215503.26210@colnet.uucp>, res@colnet.uucp (Rob Stampfli) writes:
>> Correct, with one corrollary:  On 3.51 (and I believe 3.5) the first page
>> of virtual memory is read/only and filled with zeros.  This was done, I
>> believe, to permit programs that (incorrectly) dereference the NULL pointer
>> to work without causing the segmentation fault Alex describes above.  (There
> 
>> Fine, I thought:  They just allocated 4K and set up the MMU for R/O access.
>> However, on closer inspection of the hardware, I found what I believe is a
>> prohibition against accessing memory below x80000 in the MMU firmware itself:
>> The firmware appears to use the supervisor/user mode state signal from the
>> 68010 to prohibit user level access to the lower x80000 bytes, regardless of
>> the actual MMU state for page zero.  Can anyone confirm or refute this, and
>> if it is so, how does the OS allow read access to page zero?
> 
>I just did a quick test of reading page 0.  My program ran fine and printed
>that location 0 does indeed contain the value 0.  However, if I tried
>reading from any other location, (even 1), the program chrashed with
>a core dump.
>
>Perhaps it is not done in the MMU.  Instead of writing zeros to the first
>page and giving you read permission, the bus error handler checks to
>see if the bus error was due to a read of location 0, and returns a
>value of zero.  I could be done entirely in software.
>
>I remember trying this test a couple of years ago, and not being able
>to read location 0.  This must be a new "feature" of 3.51 (maybe 3.5).

Hmm.  Here is the program I used to do the test:

	/* walk up from address 0 one int at a time, printing each
	   word read, until something faults */
	main()
	{
		int *i;
		for(i = 0;;i++)
			printf("%d=%d\n",i,*i);
	}

It produces printf output up to i=4092, and then crashes and burns.  I
suspect that if Tom tried accessing location 1 as anything other than a
character, it might have caused a bus error for alignment reasons.

However, based on what Tom said, I tried this program:

	/* time a million reads through pointer j; compile with
	   -DNULLPTRTST to make j a NULL pointer instead */
	main()
	{
		int i,k;
		int *j;
	#ifdef NULLPTRTST
		j = 0;
	#else
		j = &k;
	#endif
		for(i = 0; i < 1000000; i++)
			k = *j;
	}

The results are quite interesting.  With j=&k, the program runs in a
reasonable amount of time, and is cpu bound.  With j=0, the program takes
over 15 minutes to run on an idle machine, and most of the time is spent
as system time (I used "time a.out" to try this).  This seems to indicate
that Tom's supposition that [ the error interrupt still occurs, but the
Kernel code recovers and returns control to the program in the case of a
read from low memory locations ] is correct.  One day, I will have to
delve into the code and see for sure.  Now, I wonder: was it just coincidence
that the value read was zero?  Perhaps if the code were buried deeply in
other code, and the registers were dirtied, the value returned might be some
other random value.
-- 
Rob Stampfli, 614-864-9377, res@kd8wk.uucp (osu-cis!kd8wk!res), kd8wk@n8jyv.oh