raj@utopia.calvin.edu (Ross A. Jekel;;;772-4870;ATRajT Unix PC) (04/30/91)
Sorry to ask this again. I know I've seen it on the network before. Could someone send me the info about increasing swap space? (please email me at s83949@ursa.calvin.edu). Thanks. Ross A. Jekel Project: Post TeX 3.14 and METAFONT 2.7 on OSU.
dt@yenta.alb.nm.us (David B. Thomas) (06/10/91)
How does the 3b1 know where to find swap space, and can it use the "swap partition" on the second hard disk? little david -- Unix is not your mother.
dnichols@ceilidh.beartrack.com (DoN Nichols) (06/11/91)
In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>How does the 3b1 know where to find swap space, and can it use the "swap
>partition" on the second hard disk?
>
> little david

It looks for the special device file "/dev/swap". Here, we see that it has the same major and minor device numbers as /dev/fp001, and probably should be linked, but isn't on this system. If you do so, you should use the permissions and ownership of the "swap" listing.

br-------- 1 root root 0, 1 Dec 30 21:42 /dev/fp001
brw-r----- 1 sys  sys  0, 1 Dec 30 21:50 /dev/swap

You should be able to simply replace "/dev/swap" with a link to the selected partition on your other drive, but I know of no way to have it use both. It would be nice if it did paging from one drive, and swapping to the other, so you could increase the space to handle emergency conditions, since the actual "swap" procedure is used only under relative emergency conditions, with paging used for normal conditions.

You should probably wait a while to see if any reply postings from the "kernel cognoscenti" appear with warnings before trying this, since I don't have access to kernel source, and am just speaking from my readings and general understanding. If I gave wrong advice, I'm sure I'll hear about it :-)

Good Luck
DoN.
--
Donald Nichols (DoN.) | Voice (Days): (703) 664-1585
D&D Data | Voice (Eves): (703) 938-4564
Disclaimer: from here - None | Email: <dnichols@ceilidh.beartrack.com>
--- Black Holes are where God is dividing by zero ---
adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) (06/13/91)
In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes: >In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes: >>How does the 3b1 know where to find swap space, and can it use the "swap >>partition" on the second hard disk? >> >> little david > > It looks for the special device file "/dev/swap". Here, we see that >it is the same major and minor device numbers as /dev/fp001, and probably >should be linked, but isn't on this system. If you do so, you should use >the permissions and ownership of the "swap" listing. > >br-------- 1 root root 0, 1 Dec 30 21:42 /dev/fp001 >brw-r----- 1 sys sys 0, 1 Dec 30 21:50 /dev/swap > > You should be able to simply replace "/dev/swap" with a link to the >selected partition on your other drive, but I know of no way to have it use >both. It would be nice if it did paging from one drive, and swapping to the >other, so you could increase the space to handle emergency conditions, since >the actual "swap" proceedure is used only under relative emergency >conditions, with paging used for normal conditions. > > You should probably wait a while to see if any reply postings from >the "kernel cognosenti" appear with warnings before trying this, since I >don't have access to kernel source, and am just speaking from my readings >and general understanding. If I gave wrong advice, I'm sure I'll hear about >it :-) I am strongly inclined to doubt this. With the exception of /dev/console (for telinit to work from a remote terminal), all the kernel hooks are by major/minor device number, not name. Linking /dev/swap to some other device is unlikely to have any effect. The kernel will continue to page/swap on 0,1. I suppose you might be able to hack the kernel object files with adb to change to another partition/disk, but I can see little benefit in doing this. In fact, most UNIX ports don't even have a /dev/swap entry... 
the swap device(s) are invisible to the filesystem. There also seems to be some confusion concerning swapping. Swapping and paging are the same process. Only the amount of data transferred is different. The kernel resorts to swapping entire processes as a last resort when the memory is nearly full and it needs a bigger chunk of memory than it can get from swapping unused pages. All of this memory must still reside within the virtual address space of the system. Thus, the notion of swapping to one device and paging to another is impossible. I am unsure of one other point: As I understand it, the total (not the per-process limit, which is clearly 2.5MB) virtual address space of the UNIX-PC is 4 megabytes. If this is in fact the case, increasing the available swap partition beyond its default maximum of around 4.5 MB (to allow for alternate blocks, filesystem overhead, etc.) cannot result in any benefit. I have never been able to get a straight answer from anyone at AT&T/Convergent about this. On most other 32-bit systems, the virtual address space is 4GB (3GB for 4.x BSD on a VAX). Thus, the possible swap space is much larger than the amount of RAM likely to be present. My understanding is that Convergent decided to allow just 4MB of address space for memory because they used a rather wasteful allocation of available addressing to provide direct addressability for almost every conceivable peripheral device/expansion card. Sort of like MS-DOS where you have only 640K out of 1MB (1024K) address space addressable as RAM. In any case, swapped-out processes MUST reside in the virtual address space of the system, since they are not swapped _in_ as whole processes. The System V swapping algorithm is not the same as the old Version 7 one where the entire process had to fit in the RAM. In SysVr2.1 the swapping algorithm is essentially the same as the 4.1 BSD algorithm. 
The process must therefore fit in the VIRTUAL address space since it may be paged in rather than swapped in as a whole entity. Swapping in the sense of V7 UNIX does not occur. In fact, a new process will not be allowed to run if it cannot fit in the virtual space available. It will be killed in the memory allocation stage. If you examine the kernel .o files, you will see that the only swapping program is vmswap.c. This program manages/shares the same virtual address space as vmpage.c.
--
Jim Adams Department of Physiology and Biophysics
adams@ucunix.san.uc.edu University of Cincinnati College of Medicine
"I like the symbol visually, plus it will confuse people." ... Jim Morrison
floyd@ims.alaska.edu (Floyd Davidson) (06/13/91)
In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes: >In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes: >>In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes: >>>How does the 3b1 know where to find swap space, and can it use the "swap >>>partition" on the second hard disk? >>> >>> little david >> >> It looks for the special device file "/dev/swap". Here, we see that [...] >> You should be able to simply replace "/dev/swap" with a link to the > >I am strongly inclined to doubt this. With the exception of You are right. The /dev/swap is there so you can do i/o to it, not for the kernel. > ... I suppose you might be able >to hack the kernel object files with adb to change to another >partition/disk, but I can see little benefit in doing this. This could be done. With 3.51 the object files are available and the swap device can be changed via conf.h, but as you say, why? (Has anyone tried it just to see if anything unexpected happened? Seems like 2-3 years ago someone like Lenny or Gil said something about this...) >There also seems to be some confusion concerning swapping. Swapping and >paging are the same process. Only the amount of data transferred is >different. The kernel resorts to swapping entire processes as a last >resort when the memory is nearly full and it needs a bigger chunk of >memory than it can get from swapping unused pages. All of this memory >must still reside within the virtual address space of the system. Thus, >the notion of swapping to one device and paging to another is >impossible. I'm no kernel expert... My understanding is that paging and swapping are two different ways to accomplish virtual memory... >I am unsure of one other point: As I understand it, the total (not the >per-process limit, which is clearly 2.5MB) virtual address space of the >UNIX-PC is 4 megabytes. 
If this is in fact the case, increasing the
>available swap partition beyond its default maximum of around 4.5 MB
>(to allow for alternate blocks, filesystem overhead, etc.) cannot
>result in any benefit.

Let's say you have 4Mb of swap space. You run two programs that each take up 1.5Mb of memory. If you start one more there isn't enough swap space to run it. If you had 6Mb of swap space you could run 3 processes that needed 1.5Mb of memory (and have some left over...).

In practice the only way I've found to bump that limit is with gcc, which will take all the memory it can get. I'm using 8Mb of swap space. At this moment 6Mb is free. I've seen it down to less than 2Mb. (And at the same time the load averages were up around 4.00 and response time was long, and it was time to do something else for a few hours while gcc compiled two different versions of itself at the same time... Not exactly something you would normally do every day.)

> I have never been able to get a straight answer
>from anyone at AT&T/Convergent about this. On most other 32-bit
>systems, the virtual address space is 4GB (3GB for 4.x BSD on a VAX).

That must be the maximum amount that can be allocated, not the actual amount usable. The amount you can use is the size of the disk space allocated.

>Thus, the possible swap space is much larger than the amount of RAM
>likely to be present. My understanding is that Convergent decided to
>allow just 4MB of address space for memory because they used a rather
>wasteful allocation of available addressing to provide direct
>addressability for almost every conceivable peripheral device/expansion
>card. Sort of like MS-DOS where you have only 640K out of 1MB (1024K)
>address space addressable as RAM.

While there is some validity to that comparison of the memory map, that probably isn't the reason the swap space is so small.
First off, it obviously can be changed; second, they didn't see much need (I assume) on a single-user system to have more swap space than total memory. And remember they only had one hard disk that was no bigger than 67Mb. Under normal circumstances their choice of 4Mb has proven just about exactly right.

I think the gripe we have about the way it got put together is that 2.3-2.5Mb limit on per-process virtual memory that can't be changed. I've got a 68020 system that also has 4Mb of real memory, but with 8Mb of swap it can allocate 11Mb of memory to one process (the difference is the kernel etc.). If I had some horrible need to do it, it could be set up for 200Mb of swap and allocate that much to one process. Don't laugh too loud, I saw a post once where some guy running some kind of modeling program was doing exactly that on a Sun.

>In any case, swapped-out processes MUST reside in the virtual address
>space of the system, since they are not swapped _in_ as whole processes.
>The System V swapping algorithm is not the same as the old Version 7 one
>where the entire process had to fit in the RAM. In SysVr2.1 the
>swapping algorithm is essentially the same as the 4.1 BSD algorithm.
>The process must therefore fit in the VIRTUAL address space since it may
>be paged in rather than swapped in as a whole entity. Swapping in the
>sense of V7 UNIX does not occur.

That fits my understanding of what is happening. That is the difference between swapping and paging.

> In fact, a new process will not be
>allowed to run if it cannot fit in the virtual space available. It will
>be killed in the memory allocation stage. If you examine the kernal .o
>files, you will see that the only swapping program is vmswap.c. This
>program manages/shares the same virtual address space as vmpage.c.

I don't know one way or the other on this. Does it kill the new process or kill an old process?
I know if existing processes ask for more memory and swap space is full it will start killing off processes. But I don't know what the algorithm is. Floyd -- Floyd L. Davidson | Alascom, Inc. pays me, |UA Fairbanks Institute of Marine floyd@ims.alaska.edu| but not for opinions. |Science suffers me as a guest.
thad@public.BTR.COM (Thaddeus P. Floryan) (06/13/91)
In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes: >[...] Thus, >the notion of swapping to one device and paging to another is >impossible. Not with DEC's VAX/VMS. There's both a "swap" and a "page" ``file''. (Gawd, I never thought I'd be defending VMS, the penultimate Vomit Making System :-) >[...] >I am unsure of one other point: As I understand it, the total (not the >per-process limit, which is clearly 2.5MB) virtual address space of the >UNIX-PC is 4 megabytes. If this is in fact the case, increasing the >available swap partition beyond its default maximum of around 4.5 MB >(to allow for alternate blocks, filesystem overhead, etc.) cannot >result in any benefit. [...] Not true. As I discovered earlier this year while having gcc compile the "ephem" program in the background and doing some other "online" emacs and gcc work, increasing the swap partition on my HD from the default multi-user 5MB to 12MB made a BIG difference (those processes simply would NOT run before ("Out of swap space"); now they do.) Thad Floryan [ thad@btr.com (OR) {decwrl, mips, fernwood}!btr!thad ]
tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) (06/13/91)
In article <1991Jun13.065207.10089@ucunix.san.uc.edu>, adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
|>
|> I am unsure of one other point: As I understand it, the total (not the
|> per-process limit, which is clearly 2.5MB) virtual address space of the
|> UNIX-PC is 4 megabytes. If this is in fact the case, increasing the
|> available swap partition beyond its default maximum of around 4.5 MB
|> (to allow for alternate blocks, filesystem overhead, etc.) cannot
|> result in any benefit. I have never been able to get a straight answer
|> from anyone at AT&T/Convergent about this. On most other 32-bit
|> systems, the virtual address space is 4GB (3GB for 4.x BSD on a VAX).
|> Thus, the possible swap space is much larger than the amount of RAM
|> likely to be present. My understanding is that Convergent decided to
|> allow just 4MB of address space for memory because they used a rather
|> wasteful allocation of available addressing to provide direct
|> addressability for almost every conceivable peripheral device/expansion
|> card.

There are two virtual memory sizes being thrown around, plus swap space. What are these, and where do they come from?

4Mb. This is the total virtual address space of the 3b1. This IS per process. It is limited by the memory management unit of the 3b1: only 22 address lines are used by the MMU. This not only made addressing peripherals easier, it allowed for a reasonably sized page table. Unlike most workstations today that keep the page tables in main memory, the 3b1 has separate RAM for that (making memory management simpler). This was the obvious thing to do back in 1984, or whenever Convergent designed the UnixPC. Nobody complained then. Who would need that much virtual memory? :-)

2.5Mb. I'm not sure why you say the per-process limit is 'clearly' 2.5Mb; where this number comes from is not obvious (or clear). This is the virtual address space available for a user program (not process).
The kernel, shared libraries, and perhaps other sundry stuff, is included in the virtual memory space of every process. On the 3b1, this amounts to 1.5Mb of virtual memory. The virtual space available for a user's program is the total virtual space minus that used for the kernel, etc., or 2.5Mb. Every process has 4Mb virtual memory. You just can't get at all of it. Swap space. This is simply where the machine puts processes when they do not fit into main memory. There is no limit (ok, a very large limit) as to how much swap space you can have. Having more swap space will allow you to run more (or larger) processes. Swap space and per process virtual memory are independent of each other. |> In any case, swapped-out processes MUST reside in the virtual address |> space of the system, since they are not swapped _in_ as whole processes. This is not true. Every process must reside in the virtual address space. You can think of the swap space as multiple virtual address spaces. One for each running process. Thus, more swap space, more running processes. Each virtual address space will be only as large as is needed by the process. Because the kernel and shared libraries do not change, they are not swapped out, and are not included in the virtual address space of the swapped process. Part of the memory management is to keep track of those multiple address spaces in swap space. Main memory is broken into pages, each page given to a particular process. The page tables are used to manage this. Each time a new process is scheduled to run, the page tables have to be modified to reflect that current running process. If all of the pages needed are already in main memory, nothing gets swapped, (though the page tables still need to be modified). Only if the pages currently needed are on disk, does paging occur. |> In fact, a new process will not be |> allowed to run if it cannot fit in the virtual space available. It will |> be killed in the memory allocation stage. 
Do not confuse virtual address space with swap space. What you are saying is correct, but I think that you mean to say that a new process will not be allowed to run if it cannot fit in that swap space available, which is also correct. This is what happens when you try to run too many large processes. -- Tom Tkacik GM Research Labs tkacik@hobbes.cs.gmr.com tkacik@kyzyl.mi.org
tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) (06/13/91)
In article <1991Jun13.122803.25362@ims.alaska.edu>, floyd@ims.alaska.edu (Floyd Davidson) writes: |> In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes: |> >In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes: |> > In fact, a new process will not be |> >allowed to run if it cannot fit in the virtual space available. It will |> >be killed in the memory allocation stage. If you examine the kernal .o |> >files, you will see that the only swapping program is vmswap.c. This |> >program manages/shares the same virtual address space as vmpage.c. |> |> I don't know one way or the other on this. Does it kill the new |> process or kill an old process? I know if existing processes |> ask for more memory and swap space is full it will start killing off |> processes. But I don't know what the algorithm is. When a process asks for more memory than there is virtual memory, (via malloc, or sbrk(2)), an error will be returned, and the process will not get any bigger. It will not be killed by the kernel either, it will usually call exit by itself, or fail with a segmentation violation as it mindlessly tries to use the NULL pointer returned by malloc. (Of course, a proper program will try to clean up after itself and handle the lack of memory in a graceful fashion.:-) I believe that if there is still plenty of virtual memory but very little swap space left, if a program then calls malloc (or sbrk) the same error will be returned, and the program will not grow. I do not think that a running program will be killed because there is not enough memory to grow into, it simply will not be allowed to grow anymore. If a new process tries to start running, and there is no more memory, the kernel will not let it run. I do not think that it will start and then be killed by the kernel. The kernel should never just start killing off processes when swap space runs out. 
Individual processes will start to die off as they need more memory and cannot get it. -- Tom Tkacik GM Research Labs tkacik@hobbes.cs.gmr.com tkacik@kyzyl.mi.org
pschmidt@athena.mit.edu (Peter H. Schmidt) (06/14/91)
In [a bunch of] article[s] [several people reply to Don Nichols' post about swapping, etc.]: Just a couple of points I wanted to clarify:

o The 3b1 imposes a *per process* limit of 4M of *virtual memory*. Thus, every process can malloc up to 4M for itself, as long as there is enough physical memory to back it up.

o Without some clever hardware hacks (involving installing a memory board in a way which fools the 3b1 about what slot it is in), there is a 3.5M RAM limit, i.e. you can't have more than 3.5M of RAM.

o Physical memory = RAM + sizeof(swap partition) - sizeof(kernel)

Therefore, with a 2M mother board, no memory cards, a 4M default (5000 ~= 4 * 1024 * 1k) swap partition, and a ~300k kernel:

Physical memory = 2M + 4M - .3M = 5.7M

Thus, all your daemons, your shell and everything else have up to 5.7M to split between them. If, like me, you tend to run two GNU-Emacses w/tons of elisp loaded, increasing the swap partition to 8M, say, can be a big help, eliminating the annoying "doing vfork: not enough space" message. This I know from experience :-)

The absolute maximum physical memory a hacked-to-ridiculousness 3b1 could have would be:

PM = 4M + 180M - .3M = 183.7M

...with memory board hack, 3.51, WD2010, one of those big 180M disks, and a 0k file partition (/dev/fp002). Of course, it wouldn't be a very *useful* machine configured this way...

As to the issue of "swapping off one disk and paging off the other", I can state that it is perfectly possible (leaving aside the point about the distinction between paging and swapping - I mean here that your VM backing stores can perfectly well reside on multiple disk partitions). For example, BBN's nX derivative of MACH implements the pager as a process outside of the kernel, and it pages off of whatever mounted device has the most free space (it just grabs blocks off the free list).
Thus, paging can often be spread among several disks or partitions, and it works fine, because the pager has a nice little table telling it where to find the page that has virtual address 0xDEADBEEF when your program tries to touch it. (That's a *little* oversimplified ;-) So, to review: each process has a hope of grabbing up to 4M, but can only get as much of that 4M as there is physical memory left free (which may well be all of the 4M). OK? --Peter -- Peter H. Schmidt | ...mit-eddie!winter!pschmidt 3 Colonial Village, #10 | winter!pschmidt@mit-eddie.mit.edu Arlington, MA 02174 | -- Speaking for myself.
andyw@aspen32.cray.com (Andy Warner) (06/14/91)
In article <1991Jun11.030216.6155@ceilidh.beartrack.com>, dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
> In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
> >How does the 3b1 know where to find swap space, and can it use the "swap
> >partition" on the second hard disk?
> >
> > little david
>
> It looks for the special device file "/dev/swap". Here, we see that
> it is the same major and minor device numbers as /dev/fp001, and probably
> should be linked, but isn't on this system. If you do so, you should use
> the permissions and ownership of the "swap" listing.
>
> br-------- 1 root root 0, 1 Dec 30 21:42 /dev/fp001
> brw-r----- 1 sys sys 0, 1 Dec 30 21:50 /dev/swap
>
> You should be able to simply replace "/dev/swap" with a link to the
> selected partition on your other drive, but I know of no way to have it use
> .....

This probably won't work. I'll defer to anyone with access to the source, but the kernel doesn't use /dev/swap; that's just for things like ps. Most V.2 kernels use three variables, rootdev, swapdev & pipedev, to find their root, swap & pipe devices. These are either compiled into the binary (though they are variables, so you can patch them with adb), or are picked up from the Volume Header Block of the boot disk. I don't know if the Convergent VHB holds information about what a slice is to be used for (I've got a feeling it doesn't). One good thing is that I'm sure they pick up the size of the swap device from the VHB, so you don't have to tweak that.

So you can do it, but you'll need to patch swapdev, and then ALSO change /dev/swap.

Hope this helps,
--
andyw. (W0/G1XRL) andyw@aspen.cray.com Andy Warner, Cray Research, Inc. (612) 683-5835
adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) (06/14/91)
In article <1991Jun13.122803.25362@ims.alaska.edu> floyd@ims.alaska.edu (Floyd Davidson) writes: >[...] >In article <1991Jun13.065207.10089@ucunix.san.uc.edu> >adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes: >>I am unsure of one other point: As I understand it, the total (not the >>per-process limit, which is clearly 2.5MB) virtual address space of the >>UNIX-PC is 4 megabytes. If this is in fact the case, increasing the >>available swap partition beyond its default maximum of around 4.5 MB >>(to allow for alternate blocks, filesystem overhead, etc.) cannot >>result in any benefit. > >Lets say you have 4Mb of swap space. You run two programs that >each take up 1.5Mb of memory. If you start one more there isn't >enough swap space to run it. If you had 6Mb of swap space you >could run 3 processes that needed 1.5Mb of memory (and have some >left over...). > >In practice the only way I've found to bump that limit is with >gcc, which will take all the memory it can get. I'm using 8Mb >of swap space. At this moment 6Mb is free. I've seen it down >to less than 2Mb. (And at the same time the load averages were >up around 4.00 and response time was long, and it was time to >do something else for a few hours while gcc compiled two >different versions of itself at the same time... Not exactly >something you would normally do every day.) > Based on this experience, the 4MB limit must apply only to physically addressable RAM and not to virtual memory. (I assume the physical RAM limit of 4MB could be increased by hardware modification, but as I don't have the H/W manual I'll leave that topic be.) Therefore, you should be able to allocate as much swap space as you can afford by expanding partition 1 on disk 0, up to a (theoretical :-) maximum of 4GB. (See below, however...) Unfortunately, the 3B1 kernel has only one swap device configured, so it won't interleave the swap over two (or more) drives the way 4.3BSD and later versions of System V can. 
In summary:

Virtual Address Space: Determined by processor word size and MMU design (as in MVS/XA). 32 bits for 68010, so = 2^32 or 4GB. Only 3GB on the VAX due to hardware limitations.

Swap space: Amount of the above you can actually use at any one time, present on one or more reserved disk partitions. This space is mapped into the physical RAM of the machine by the kernel. In "true" System V systems (see below) the available space can range from (size of swap) to (size of swap + size of physical ram + size of shared text and data) depending on the particular state of the system at any given moment.

System V rel. 2.1 and later and 4.1BSD and later adopted different philosophies regarding the implementation of demand-paged virtual memory. The BSD systems, being research-oriented and likely to be used by programmers with a more intimate knowledge of the inner workings of the O/S, chose simple, less elegant approaches to memory management with provisions, such as the vfork() call, for the programmer to tune the system to the application. In 4.xBSD, swap space is allocated at the birth of a process, with enough space being allocated to contain the entire virtual image of the process in its initial state (text+data+BSS), excepting of course malloc'd (heap) space obtained via sbrk() or stack space pushdown expansion, and the user structure and page tables which are RAM-resident for an active (non-swapped) process.

System V took the approach of being as elegant as possible (e.g., using copy-on-write and dynamic allocation of all tables except the actual physical memory map) and hiding the inner workings of the system from programmer intervention. In the System V scheme, swap space is allocated as needed by the page stealer to swap out pages or processes as needed. Thus, allocation of swap space is not made until it is actually used.
Since process sharable text and data may be demand-paged in from the filesystem, virtual pages may be mapped to one of three places: physical RAM, swap space, or the executable program file.

Unfortunately, the UNIX-PC kernel was developed approximately concurrently with, and relatively independently from, the SVr2.1 kernel. This means that the UNIX-PC kernel is a somewhat bastard hybrid of SVr2.0 and 4.1BSD. In this case, based on examination of the data structures in the kernel .o files, we have inherited the 'philosophy' of System V (no vfork() call, etc.) with the memory management algorithm of 4.1BSD (core-map based memory allocation). [...]

>
>I think the gripe we have about the way it got put together is
>that limit of 2.3-2.5Mb on per process virtual memory that can't
>be changed. I've got a 68020 system that also has 4Mb of real
>memory, but with 8Mb of swap it can allocate 11Mb of memory to one
>process (the difference is the kernel etc.). If I had some horrible
>need to do it it could be set up for 200Mb of swap and allocate
>that much to one process. Don't laugh too loud, I saw a post once
>where some guy running some kind of modeling program was doing
>exactly that on a Sun.

The DECsystem 5000 this is being posted from has 216MB of swap configured with 64MB physical RAM. The Sun 3 I am typing on has 38MB interleaved over 2 drives with 24MB physical RAM. With the massive sizes of some executables (try running a SPICE simulation of a small microprocessor!) it will not be unreasonable to see systems with the full 4GB 32-bit address space supported by swap in the near future.

Anyway, the problem with the BSD-derived core-map structure is that it restricts the size not only of the physical RAM, but also the number and size of mounted filesystems (including swap space), and the number of processes and shared text images.
In "true" System V, the per-process virtual space is fairly easily tunable and is usually configured based on the size of the pfdata table (physical RAM). In our kernel, changing the 2.5MB limit requires a complete rebuild of the kernel from source to enlarge the coremap structure accordingly. >>In any case, swapped-out processes MUST reside in the virtual address >>space of the system, since they are not swapped _in_ as whole processes. >>The System V swapping algorithm is not the same as the old Version 7 one >>where the entire process had to fit in the RAM. In SysVr2.1 the >>swapping algorithm is essentially the same as the 4.1 BSD algorithm. >>The process must therefore fit in the VIRTUAL address space since it may >>be paged in rather than swapped in as a whole entity. Swapping in the >>sense of V7 UNIX does not occur. > >That fits my understanding of what is happening. That is the >difference between swapping and paging. Swapping under SysVr2.1 and later is merely an adaptation of the normal page-stealer (vhand) daemon in which entire processes are marked 'swapped but ready-to-run' making their entire working set (physical RAM pages) available for allocation until the memory utilization falls below the 'high water mark' that triggers swapping. For each process so marked, the swapper makes another such process ready to run. These processes, however, are NOT 'swapped in' in their entirety but are allowed to page-fault in just like any normal process. Both page stealing and swapping are spawned from process 0. Under 4.xBSD, page reclamation is initiated by a separate kernel process, the pagedaemon or process 2. The swapper is process 0. The swapper will become active under several conditions where memory is critical or a process has slept for over 20 seconds. Swapped processes page-fault in as above, but are treated specially by the system. 
In general, swapping under 4.xBSD slows the system abruptly because the swapper attempts to guess the amount of memory a process being swapped in will need and will attempt to reserve memory for processes being swapped in. In this sense, swapping under 4.xBSD more closely resembles the V7 process.

>> In fact, a new process will not be
>>allowed to run if it cannot fit in the virtual space available. It will
>>be killed in the memory allocation stage. If you examine the kernel .o
>>files, you will see that the only swapping program is vmswap.c. This
>>program manages/shares the same virtual address space as vmpage.c.
>
>I don't know one way or the other on this. Does it kill the new
>process or kill an old process? I know if existing processes
>ask for more memory and swap space is full it will start killing off
>processes. But I don't know what the algorithm is.

It is dependent on the system (BSD or SV) and the state of the system. In any case, the process requesting swap space at the time is usually the one that will die. Under SysV, this could be a process in core that's being paged out in response to one being demand-paged from disk. The timing is rather critical.

In the BSD case, it will generally be the new process, since swap space for the entire virtual image is allocated up front. On the other hand, if the BSD swapper wakes up, it attempts to allocate additional swap space for the user structure and page tables, which are normally RAM-resident for any active process. If this fails, the swapper will attempt to swap out another process. If the memory shortage is critical, the system will not allow any processes other than those currently resident or being swapped in and out to run.

Note that the above is based on published information rather than kernel source code, so some details may be inconsistent with individual implementations of the algorithms.

Also, thad@public.BTR.COM (Thaddeus P.
Floryan) writes:
#
#In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
#>[...] Thus,
#>the notion of swapping to one device and paging to another is
#>impossible.
#
#Not with DEC's VAX/VMS. There's both a "swap" and a "page" ``file''. (Gawd,
#I never thought I'd be defending VMS, the penultimate Vomit Making System :-)
#
#>[...]
#>I am unsure of one other point: As I understand it, the total (not the
#>per-process limit, which is clearly 2.5MB) virtual address space of the
#>UNIX-PC is 4 megabytes. If this is in fact the case, increasing the
#>available swap partition beyond its default maximum of around 4.5 MB
#>(to allow for alternate blocks, filesystem overhead, etc.) cannot
#>result in any benefit. [...]
#
#Not true. As I discovered earlier this year while having gcc compile the
#"ephem" program in the background and doing some other "online" emacs and
#gcc work, increasing the swap partition on my HD from the default multi-user
#5MB to 12MB made a BIG difference (those processes simply would NOT run
#before ("Out of swap space"); now they do.)

Another confirmation that swap space on the UNIX-PC _IS_ expandable.

In VMS, the paging algorithm is not 'system-wide'; rather, there are multiple partitions of memory, each of which has an independent paging daemon. I am not very familiar with VMS but would hazard a guess that this is the reason for the separate files. This approach allows the sysadmin to assure that certain classes of programs are always allocated at least a certain amount of system memory.

In any event, it is also possible to envision a system in which there is a virtual space of some size (say, 4GB -- 2^32 -- 32-bit addressing) that is managed by a swapping algorithm which has a much larger, separate swap space from which entire processes are 'swapped' into and out of the virtual space in the manner of V7 UNIX.
Those processes resident in the virtual space would themselves be paged in and out of main memory. In some respects this is a concept similar to background utilization systems that run processes on idle CPUs in a network.
--
 Jim Adams                      Department of Physiology and Biophysics
adams@ucunix.san.uc.edu         University of Cincinnati College of Medicine
 "I like the symbol visually, plus it will confuse people."
                                                ... Jim Morrison
tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) (06/14/91)
In article <1991Jun13.170111.28789@athena.mit.edu>, pschmidt@athena.mit.edu (Peter H. Schmidt) writes:
|> In [a bunch of] article[s] [several people reply to Don Nichols' post about
|> swapping, etc.]:
|>
|> Just a couple of points I wanted to clarify:
|>
|> o The 3b1 imposes a *per process* limit of 4M of *virtual memory*. Thus, every
|> process can malloc up to 4M for itself, as long as there is enough physical
|> memory to back it up.

Whoa there! That's not right. Just because there is a 4Mb virtual limit does not mean that you can malloc 4Mb. The number floating around is 2.5Mb. That is how much of the 4Mb you have access to. The rest is the kernel and shared libraries, which for some reason apparently take up 1.5Mb.

|> o Without some clever hardware hacks (involving installing a memory board in
|> a way which fools the 3b1 about what slot it is in), there is a 3.5M
|> RAM limit, i.e. you can't have more than 3.5M of RAM.

Wrong again. You can have 4Mb of physical memory in a 3b1. No need for hardware hacks either, unless you call installing a 2Mb memory card a hardware hack. I speak from experience when I say that my UnixPC has 4Mb of physical memory in it.

|> o Physical memory = RAM + sizeof(swap partition) - sizeof(kernel)

I have never heard of this definition, but it's no worse than any other.

|> Therefore, with a 2M mother board, no memory cards, a 4M default (5000 ~=
|> 4 * 1024 * 1k) swap partition, and a ~300k kernel:
|>
|> Physical memory = 2M + 4M - .3M = 5.7M

Be careful with your numbers. I believe that the kernel takes up more than .3Mb; you have a bunch of buffers and things in there too.

I have never gotten a reply to a question I have. If you have 4Mb RAM and 7Mb swap space, can you run 11Mb worth of programs, or are you limited to how much swap space you have? I'm not sure that they add up that way.

|> Thus, all your daemons, your shell and everything else have up to 5.7M to
|> split between them.
|> If, like me, you tend to run two GNU-Emacses w/tons
|> of elisp loaded, increasing the swap partition to 8M, say, can be a big help,
|> eliminating the annoying "doing vfork: not enough space" message. This I know
|> from experience :-)

I would be very reluctant to try to do the math you just did. There are other variables you need to take into account, like how many kernel buffers there are. Increasing swap space is a good way to allow running two or three gcc compiles and emacs at the same time (I have a 10Mb swap partition at home); just don't try to be so exact, it's not that important. (A quick note: the 3b1 does not have vfork. That is a Berkeleyism.)

|> The absolute maximum physical memory a hacked-to-ridiculousness 3b1 could have
|> would be:
|> PM = 4M + 180M - .3M = 183.7M
|> ...with memory board hack, 3.51, WD2010, one of those big 180M disks, and a 0k
|> file partition (/dev/fp002). Of course, it wouldn't be a very *useful*
|> machine configured this way...

You neglect the fact that you could install a second hard disk.

|> As to the issue of "swapping off one disk and paging off the other", I can
|> state that it is perfectly possible (leaving aside the point about the
|> distinction between paging and swapping - I mean here that your VM backing
|> stores can perfectly well reside on multiple disk partitions). For example,
|> BBN's nX derivative of MACH implements the pager as a process outside of the
|> kernel, and it pages off of whatever mounted device has the most free space
|> (it just grabs blocks off the free list). Thus, paging can often be spread
|> among several disks or partitions, and it works fine, because the pager has a
|> nice little table telling it where to find the page that has virtual address
|> 0xDEADBEEF when your program tries to touch it. (That's a *little*
|> oversimplified ;-)

I think that this discussion was pertaining to the 3b1. You can only have a single swap partition.
Other machines can, and do, implement things differently.

|> So, to review: each process has a hope of grabbing up to 4M, but can only get
|> as much of that 4M as there is physical memory left free (which may well be
|> all of the 4M). OK?

You've got the right idea. Some of the details are questionable.
--
Tom Tkacik                      GM Research Labs
tkacik@hobbes.cs.gmr.com        tkacik@kyzyl.mi.org
clewis@ferret.ocunix.on.ca (Chris Lewis) (06/14/91)
In article <1991Jun13.065207.10089@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
>In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
>>In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>> It looks for the special device file "/dev/swap". Here, we see that
>>it is the same major and minor device numbers as /dev/fp001, and probably
>>should be linked, but isn't on this system. If you do so, you should use
>>the permissions and ownership of the "swap" listing.

/dev/swap is *usually* just a convenient invariant name for ps and other user-level tools that rummage around inside other processes' address space to find the swap area. Most UNIX kernels have swapdev hardcoded as a major/minor pair in the binary. These machines usually need to have the kernel relinked to change swapdev. "/dev/swap" is just a convenience so you don't have to recompile user-level processes like ps.

There are a few machines that postpone swap device assignment to somewhere early during the boot phase and use special system calls to do it. But I don't think the 3b1 is one of those. If you simply rename /dev/swap the kernel won't notice, but ps will, and will rummage thru the "wrong" swap area.

On SVR3 on a 3b2 or 386, on the other hand, you link in the initial swapdev, and you can append swap devices during multiuser operation so that you can have your paging area spread across physical devices.

>There also seems to be some confusion concerning swapping. Swapping and
>paging are the same process. Only the amount of data transferred is
>different.

Not quite. When you have a pure swapping system (eg: V7, early System V's, 32V), the process gets completely swapped out, and the process can't be restarted until the whole process gets swapped in. These machines either cannot support demand load (eg: 68000's), or didn't want to bother...
On dual systems, they frequently swap out, and then page-fault back in.

>The kernel resorts to swapping entire processes as a last
>resort when the memory is nearly full and it needs a bigger chunk of
>memory than it can get from swapping unused pages. All of this memory
>must still reside within the virtual address space of the system. Thus,
>the notion of swapping to one device and paging to another is
>impossible.

In systems capable of both swapping and paging (eg: BSD 4.x), the algorithms for deciding which to do are rather system dependent. One common reason for swapping on "simpler" machines is when you have to do raw I/O with buffers that span page boundaries, and you have to do a page shuffle to get the pages contiguous.

>I am unsure of one other point: As I understand it, the total (not the
>per-process limit, which is clearly 2.5MB) virtual address space of the
>UNIX-PC is 4 megabytes. If this is in fact the case, increasing the
>available swap partition beyond its default maximum of around 4.5 MB
>(to allow for alternate blocks, filesystem overhead, etc.) cannot

(alternate blocks and filesystem overhead on a swap partition? It's not a file system!)

>In any case, swapped-out processes MUST reside in the virtual address
>space of the system, since they are not swapped _in_ as whole processes.

This may be true of a specific paging algorithm, but I think you may be getting a bit confused yourself. Are you speaking in general terms, or with a specific paging implementation in mind? In general, there's no particular reason that a paged-out process be part of the address space of anything - the kernel only needs to associate a given page fault from a specific process with the set of page descriptors describing that process, from which to find a disk address. In many cases, the swap space need only be constrained by the size of the integer used for disk addresses, and the virtual space of the kernel is "just" the physical memory.
There are several examples of systems with virtual memory and paging where the total of process virtual address spaces considerably exceeds the number of address lines that the processor has (eg: MVS and VM, especially with pre-XA 24-bit physical addresses).

I have this awful feeling that the total virtual address space limit on a 3b1 is simply because they fixed the size of the page table pool or some other reason. The 4mb physical limit is a different story.
--
Chris Lewis, Phone: (613) 832-0541, Domain: clewis@ferret.ocunix.on.ca
UUCP: ...!cunews!latour!ecicrl!clewis; Ferret Mailing List: ferret-request@eci386;
Psroff (not Adobe Transcript) enquiries: psroff-request@eci386
or Canada 416-832-0541. Psroff 3.0 in c.s.u soon!
floyd@ims.alaska.edu (Floyd Davidson) (06/14/91)
In article <55913@rphroy.UUCP> tkacik@hobbes.cs.gmr.com (Tom Tkacik CS/50) writes:
>In article <1991Jun13.122803.25362@ims.alaska.edu>, floyd@ims.alaska.edu
>(Floyd Davidson) writes:
>|>
>|> I don't know one way or the other on this. Does it kill the new
>|> process or kill an old process? I know if existing processes
>|> ask for more memory and swap space is full it will start killing off
>|> processes. But I don't know what the algorithm is.
>
>When a process asks for more memory than there is virtual memory
>(via malloc, or sbrk(2)), an error will be returned, and the process will not
>get any bigger. It will not be killed by the kernel either; it will usually
>call exit by itself, or fail with a segmentation violation as it mindlessly
>tries to use the NULL pointer returned by malloc.
>(Of course, a proper program will try to clean up after itself and handle the
>lack of memory in a graceful fashion. :-)
>
>I believe that if there is still plenty of virtual memory but
>very little swap space left, if a program then calls malloc (or sbrk)
>the same error will be returned, and the program will not grow. I do not think
>that a running program will be killed because there is not enough memory to
>grow into; it simply will not be allowed to grow anymore.
>
>If a new process tries to start running, and there is no more memory,
>the kernel will not let it run. I do not think that it will start and then
>be killed by the kernel. The kernel should never just start killing
>off processes when swap space runs out. Individual processes will start to
>die off as they need more memory and cannot get it.

This is pretty easy to check out, so I did. I took the memory allocation check routine from hard-params.c in the gcc-1.40 distribution and made a little program that allocates all the memory it can, tells how much it was, and sleeps for two minutes. Then I ran it several times in the background. Interesting.
The first three invocations were always repeatable:

At the start:       -- Free memory: 2.368 Mb, Free swap 6.412 Mb
Allocate 2.481 Mb   -- Free memory: 2.296 Mb, Free swap 3.828 Mb
Allocate 2.481 Mb   -- Free memory: 2.236 Mb, Free swap 1.260 Mb
Allocate 1.021 Mb   -- Free memory: 2.164 Mb, Free swap 0.228 Mb

At this point things work differently every time. Sometimes it will allocate 20-40k of memory, sometimes it won't. When it won't, a prompt that I assume comes from the kernel, but could be from the shell, says "Killed". Any attempt at running ls results in "Killed". However, several smaller utilities like who and ps will run. On one occasion I did get a message definitely from ksh that said something like "ksh: failed to fork()". I'm not sure exactly, but that was the essence of it. The fork failed to get enough memory even to start another process. Nothing else got killed, and there was nothing running that would have tried mallocing more memory. The free memory and free swap space numbers came from sysinfo (updating every five seconds).

It does appear that normally programs would in fact handle a full swap device by whatever means they handle an out-of-memory condition, because malloc definitely returns a NULL when there is no swap space, even if there is sufficient free memory. But it also appears that the kernel will kill a process attempting to start if it can't allocate swap space for it.

Now as to whether the kernel has specifically a built-in per-process limit, or whether it is just a case of running out of virtual memory (at 4Mb max) due to the program size and shared libs and the kernel adding up, here is an easy way to find out. If a few people will run the hard-params binary that comes with the gcc distribution and report exactly how much memory it says can be allocated, we can tell. That is the last thing hard-params reports, so just let it run and read the last couple lines.
My kernel is a slightly modified 3.51m version (re-linked with space for 10 mounted drives). A standard 3.51m, 3.51a, or 3.51 might show up slightly different if the limit is merely how much virtual memory is available. But it will be the same if the kernel actually has a limit. It might also be the same if the actual difference in size between various kernels is less than one page in memory, though. (Running size on my kernel shows 157,840 if that's worth anything.)

Floyd
--
Floyd L. Davidson   | Alascom, Inc. pays me, | UA Fairbanks Institute of Marine
floyd@ims.alaska.edu| but not for opinions.  | Science suffers me as a guest.
ignatz@wam.umd.edu (Mark J. Sienkiewicz) (06/15/91)
In article <1991Jun11.030216.6155@ceilidh.beartrack.com> dnichols@ceilidh.beartrack.com (DoN Nichols) writes:
>In article <1991Jun9.170520.4087@yenta.alb.nm.us> dt@yenta.alb.nm.us (David B. Thomas) writes:
>>How does the 3b1 know where to find swap space, and can it use the "swap
>>partition" on the second hard disk?
>
> It looks for the special device file "/dev/swap". Here, we see that

This is not correct. It uses partition #1 on hard disk #0. This just happens to be named /dev/swap, but the kernel does not look in /dev to find it. It just uses the block device with major number 0, minor number 1.

> You should be able to simply replace "/dev/swap" with a link to the

As described above, this doesn't work.

>other, so you could increase the space to handle emergency conditions, since
>the actual "swap" procedure is used only under relative emergency
>conditions, with paging used for normal conditions.

The swap area is used both to swap processes and to store DATA pages when they are paged out. Read-only TEXT pages are never swapped out, but they are sometimes thrown away and re-read from the a.out file you are executing.

>and general understanding. If I gave wrong advice, I'm sure I'll hear about
>it :-)

Congrats! I wish more people on the net had this attitude.

There are variables in most unix kernels:

	_swapdev	major/minor # of swap device
	_swplo		first block number on device available for swapping
	_nswap		number of blocks available

Not all unixes have these. If you are knowledgeable, brave, or foolhardy, you can use ADB to modify your kernel to change these. DO NOT DO THIS TO A RUNNING KERNEL. cp /unix /unix.hacked; adb /unix.hacked; then boot the modified kernel.

happy hacking.
dnichols@ceilidh.beartrack.com (DoN Nichols) (06/16/91)
In article <100425.7286@timbuk.cray.com> andyw@aspen32.cray.com (Andy Warner) writes:
> [ ... my mistaken assumption deleted ... ]
>
>This probably won't work. I'll defer to anyone with access to the
>source, but the kernel doesn't use /dev/swap, that's just for things like
>ps. Most V.2 kernels use three variables rootdev, swapdev & pipedev to
>find their root, swap & pipe device. These are either compiled into the
>binary (though they are variables, so you can patch them with adb), or
>are picked up from the Volume Header Block of the boot disk. I don't
>know if the Convergent VHB holds information about what a slice is to
>be used for (I've got a feeling it doesn't). One good thing is that
>I'm sure they pick up the size of the swap device from the VHB, so you
>don't have to tweak that.

Looking at the /usr/include/sys/gdisk.h file to check for this, I found something that looks interesting. Does this mean that I can have a filesystem automatically mount itself (aside from under the control of /etc/rc)? If so, when would this happen? Does it happen only for removable-media drives like the Syquest? How about a floppy? I can see why I might want to have removable-media drives automount, but is there any reason for doing things this way with a normal disk partition? Is there any good reason for this to exist? Why am I asking all these questions ... ?

	/* volume home block on disk */
	struct vhbd {
		uint	magic;		/* S4 disk format code */
		int	chksum;		/* adjustment so that the 32 bit sum starting
		[ ... ]
		char	fpulled;	/* dismounted last time? */
***>>>		struct mntnam mntname[MAXSLICE];
					/* names for auto mounting.
					   null string means no auto mount */
		long	time;		/* time last came on line */
		[ ... ]
	};

Thanks
	DoN.
--
Donald Nichols (DoN.)	| Voice (Days): (703) 664-1585
D&D Data		| Voice (Eves): (703) 938-4564
Disclaimer: from here - None | Email: <dnichols@ceilidh.beartrack.com>
--- Black Holes are where God is dividing by zero ---
adam@cs.UAlberta.CA (Michel Adam; Gov't of NWT) (06/16/91)
In article <1991Jun14.225216.23361@cfctech.cfc.com> kevin@cfctech.cfc.com (Kevin Darcy) writes:
>In article <1991Jun13.192444.7500@ucunix.san.uc.edu> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
>>In article <1991Jun13.122803.25362@ims.alaska.edu> floyd@ims.alaska.edu
>>(Floyd Davidson) writes:
>>>[...]
>>>In article <1991Jun13.065207.10089@ucunix.san.uc.edu>
>>>adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
>>
>>System V rel. 2.1 and later and 4.1BSD and later adopted different
>>philosophies regarding the implementation of demand-paged virtual
>>memory. The BSD systems, being research-oriented and likely to be used
>>by programmers with a more intimate knowledge of the inner workings of
>>the O/S, chose simple, less elegant approaches to memory management with
>>provisions, such as the vfork() call, for the programmer to tune the
>>system to the application. In 4.xBSD, swap space is allocated at the
>>birth of a process, with enough space being allocated to contain the
>>entire virtual image of the process in its initial state
>>(text+data+BSS), excepting of course malloc'd (heap) space obtained
[ ... ]
>
>Although I'm no kernel hacker, I will point out, from a sysadmin's point-of-
>view (with advice from the higher levels of AT&T support/development people)
>that later SysV implementations still retain vestiges of the old virtual
>memory scheme you describe: on an AT&T 3B2/600 running r3, for instance, the
>kernel always attempts to map two -entire- sets of -contiguous- pages in the
>swap area for every process upon startup, whenever sbrk() is called for more
>memory, and so on. One of these sets is for paging, and the other, for
>swapping. If, for some reason, there is not enough contiguous space available
>to allocate one of these sets, the kernel spits out a "getcpages: cannot
>allocate X contiguous pages" console message (X being some-number-or-other).
>If there is insufficient contiguous space to map EITHER page, new processes
>will die on startup, or ungrowable processes will (usually) seg violate, with
>all of this mayhem accompanied by delightful kernel messages such as "growreg:
>unable to allocate ...", or "dupreg: unable to allocate..." or "uballoc:
><something-or-other>" spewing on the console.
>
>The algorithm for mapping this swap space is unknown to me, but it appears to
>be imperfect - from time to time, one of our 3B2's or another will suddenly
>experience these symptoms for no particular reason, and plenty of available
>swap space. Once it starts, even programs with relatively small run-time
>images (e.g. /bin/cat) will then die on startup, with the kernel complaining
>that it cannot allocate swap-area space. Obviously, the only cure at that
>point is to reboot...
>
>Perhaps AT&T has been slow in -implementing- the VM "elegance" of which you
>speak? I see very little of it on their own products.
>

I seem to remember a discussion on this subject, 6 to 12 months ago, on the problems with insufficient RAM available on the AT&T 386 machines (must have been a SYS V Ver. 3.something). One comment that struck my mind at that time was that for some reason there was a recommendation to have as much real RAM as the swap partition's size (I may be mistaken on that ...), to avoid some kind of thrashing. There was an explicit reference to the implementation by Convergent, for the 7300, being somewhat different and managing to avoid this problem. There were numerous references to the Bach book, but I don't remember if the difference between the CT version for the 3B1 and the other versions running on 386 machines was described.

Can someone who archived these articles look up the conclusion? What was it that CT did differently in their implementation? (Can someone with access to the kernel source find out?)
>>--
>> Jim Adams                   Department of Physiology and Biophysics
>>adams@ucunix.san.uc.edu      University of Cincinnati College of Medicine
>> "I like the symbol visually, plus it will confuse people."
>>                                              ... Jim Morrison
>
>------------------------------------------------------------------------------
>kevin@cfctech.cfc.com    | Kevin Darcy, Unix Systems Administrator
>...sharkey!cfctech!kevin | Technical Services (CFC)
>Voice: (313) 759-7140    | Chrysler Corporation
>Fax:   (313) 758-8173    | 25999 Lawrence Ave, Center Line, MI 48015
>------------------------------------------------------------------------------

Michel Adam
adam@cs.ualberta.ca (...!alberta!adam) or adam@iceman.UUCP (...!alberta!iceman!adam)
tkacik@kyzyl.mi.org (Tom Tkacik) (06/21/91)
In article <55907@rphroy.UUCP>, I wrote:
> In article <1991Jun13.065207.10089@ucunix.san.uc.edu>,
> adams@ucunix.san.uc.edu (J. Adams - SunOS Wizard) writes:
> |>
> |> I am unsure of one other point: As I understand it, the total (not the
> |> per-process limit, which is clearly 2.5MB) virtual address space of the
> |> UNIX-PC is 4 megabytes. If this is in fact the case, increasing the
>
> There are two virtual memory sizes being thrown around, plus swap space.
> What are these, and where do they come from?
>
> 4Mb. This is the total virtual address space of the 3b1. This IS
>
> 2.5Mb. I'm not sure why you say the per-process limit is 'clearly' 2.5Mb?
> Where this number comes from is not obvious (or clear).
> This is the virtual address space available for a user program,
> (not process). The kernel, shared libraries, and perhaps other sundry

I have been perusing /usr/include/sys to try to see where that 2.5MB virtual memory limit comes from, and found in /usr/include/sys/param.h the following defines:

#define VUSER_START	0x80000		/* start address of user process */
#define VUSER_END	0x300000	/* end address of user process */
#define SHLIB_START	VUSER_END	/* start address of shared lib */
#define SHLIB_END	0x380000	/* end address of shared lib */
#define KVMEM_VBASE	SHLIB_END	/* start addr of kernel vm */
#define KVMEM_VLIMIT	0x400000	/* end addr of kernel vm */

These define how virtual memory space is used in each process. They clearly show that the user process goes from 0x80000 to 0x300000. This is the 2.5MB. Shared libraries must fit into the 0.5MB between 0x300000 and 0x380000, while the kernel must fit into the 0.5Mb between 0x380000 and 0x400000 (the very top of virtual memory).

But that's only 3.5MB of the available 4MB. I cannot find what could possibly be in that space from 0x00000 to 0x80000. The ifiles (in /lib) all instruct ld(1) to start the user program at 0x80000.

Does anybody have any idea what is in that low part of memory?
Shouldn't we be able to get 3MB of virtual memory for our programs? What would happen if some guinea pig modified the ifile to start a program much lower in memory? How much lower could you safely go? Any takers?
--
Tom Tkacik               | tkacik@kyzyl.mi.org
tkacik@hobbes.cs.gmr.com | To rent this space, call 1-800-555-QUIP.
alex@umbc4.umbc.edu (Alex S. Crain) (06/21/91)
In article <384@kyzyl.mi.org> tkacik@kyzyl.mi.org (Tom Tkacik) writes:

[This is the pertinent magic describing the process environment]

>#define VUSER_START	0x80000		/* start address of user process */
>#define VUSER_END	0x300000	/* end address of user process */
>#define SHLIB_START	VUSER_END	/* start address of shared lib */
>#define SHLIB_END	0x380000	/* end address of shared lib */
>#define KVMEM_VBASE	SHLIB_END	/* start addr of kernel vm */
>#define KVMEM_VLIMIT	0x400000	/* end addr of kernel vm */

[Now the questions, reformatted for clarity]

1] This defines how virtual memory space is used in each process.

Yup.

2] They clearly show that the user process goes from 0x80000 to 0x300000.

Yup.

3] This is the 2.5MB. Shared libraries must fit into the 0.5MB between 0x300000 and 0x380000, while the kernel must fit into the 0.5Mb between 0x380000 and 0x400000 (the very top of virtual memory).

Bzzzzt! Wrong. The shared libraries live between 0x300000-0x380000. The top .5 megs is dynamic memory for the kernel, which lives below 0x80000.

4] But that's only 3.5MB of the available 4MB. I cannot find what could possibly be in that space from 0x00000 to 0x80000. The ifiles (in /lib) all instruct ld(1) to start the user program at 0x80000.

Yup.
From the kernel's point of view, the world looks like this:

KERNEL	0x400000:--------------------------------------------------------------
KERNEL		Kernel dynamic memory pages (for system buffers, etc)
KERNEL	0x380000:--------------------------------------------------------------
USER		Shared library area
USER	0x300000:--------------------------------------------------------------
USER		User stack
USER	0x2fe???:----------------------
USER		User text/data
USER
USER
USER
USER	0x080000:--------------------------------------------------------------
KERNEL		User structure (8K)
KERNEL		Kernel stack
KERNEL		Kernel data
KERNEL		Kernel text
KERNEL		Interrupt vector table
KERNEL	0x000000:--------------------------------------------------------------

It looks like that from the perspective of a user program too, except that the areas labeled KERNEL can't be read by user programs (segfaults). Unix works by changing the user pages, but the kernel pages are always there.

5] Shouldn't we be able to get 3MB of virtual memory for our programs?

You can, but you have to put some of the code in the shared library (yes, I know that wasn't what you were looking for). This is actually pretty reasonable, as long as you *add* it to the existing library image without disturbing any addresses. There is a package to do this in the archives, shlib.something.Z. The directions are pretty obtuse, but it works.

6] What would happen if some guinea pig modified the ifile to start a program much lower in memory? How much lower could you safely go? Any takers?

It would crash with a segmentation violation and be very boring.

A better way to run very large programs would be to use overlays. This is where a program links several text areas into the same memory space and does its own swapping. It's pretty common on architectures with very small address spaces, like PDP-11s, C-64s, etc.
The C-128 had some really slick hardware support for overlays, where they put the OS in ROM, and you could toggle a register to make it appear in different places in memory. The idea was to have the OS load your program, and then make the OS go away and have your program appear in its place. When you needed to make a system call, you set up the call, toggled the OS back into memory, and jumped. Cute.

Anywho, the System 5 loader is pretty smart, and can do overlays just fine. You could then toggle the memory in as shared memory or, if you're a little bolder, you could just diddle the page table.
--
#################################	:alex.
#Disclaimer: Anyone who agrees  #	Systems Programmer
#with me deserves what they get.#	University of Maryland Baltimore County
#################################	alex@umbc3.umbc.edu
res@colnet.uucp (Rob Stampfli) (06/22/91)
In article <1991Jun21.151926.15624@umbc3.umbc.edu> alex@umbc4.umbc.edu (Alex S. Crain) writes:
>
>6] What would happen if some guinea pig modified the ifile to start a program
>much lower in memory? How much lower could you safely go? Any takers?
>
>	It would crash with a segmentation violation and be very boring.

Correct, with one corollary: On 3.51 (and I believe 3.5) the first page of
virtual memory is read-only and filled with zeros. This was done, I believe,
to permit programs that (incorrectly) dereference the NULL pointer to work
without causing the segmentation fault Alex describes above. (There were so
many examples of this problem floating around, it was giving the Unix-PC OS
people nightmares.)

Fine, I thought: they just allocated 4K and set up the MMU for R/O access.
However, on closer inspection of the hardware, I found what I believe is a
prohibition against accessing memory below 0x80000 in the MMU firmware
itself: the firmware appears to use the supervisor/user mode state signal
from the 68010 to prohibit user-level access to the lower 0x80000 bytes,
regardless of the actual MMU state for page zero. Can anyone confirm or
refute this, and if it is so, how does the OS allow read access to page zero?
--
Rob Stampfli, 614-864-9377, res@kd8wk.uucp (osu-cis!kd8wk!res), kd8wk@n8jyv.oh
jmm@eci386.uucp (John Macdonald) (06/22/91)
In article <384@kyzyl.mi.org> tkacik@kyzyl.mi.org (Tom Tkacik) writes:
|I have been perusing /usr/include/sys to try to see where that
|2.5MB virtual memory limit comes from, and found in /usr/include/sys/param.h
|the following defines.
|
|#define VUSER_START	0x80000		/* start address of user process */
|#define VUSER_END	0x300000	/* end address of user process */
|#define SHLIB_START	VUSER_END	/* start address of shared lib */
|#define SHLIB_END	0x380000	/* end address of shared lib */
|#define KVMEM_VBASE	SHLIB_END	/* start addr of kernel vm */
|#define KVMEM_VLIMIT	0x400000	/* end addr of kernel vm */
|
|This defines how virtual memory space is used in each process.
|They clearly show that the user process goes from 0x80000 to 0x300000.
|This is the 2.5MB. Shared libraries must fit into the 0.5MB between
|0x300000 and 0x380000, while the kernel must fit into the 0.5MB
|between 0x380000 and 0x400000 (the very top of virtual memory).
|
|But that's only 3.5MB of the available 4MB. I cannot find what
|could possibly be in that space from 0x00000 to 0x80000.
|The ifiles (in /lib) all instruct ld(1) to start the user
|program at 0x80000.
|
|Does anybody have any idea what is in that low part of memory?
|Shouldn't we be able to get 3MB of virtual memory for our programs?
|What would happen if some guinea pig modified the ifile to start a program
|much lower in memory? How much lower could you safely go? Any takers?

I would guess that the low 0.5 Meg is non-virtual memory for the kernel -
i.e. the portion of the kernel that is fixed when the kernel is built. The
virtual memory for the kernel would then include loadable device drivers
and allocatable space...
--
Usenet is [like] the group of people who visit the  | John Macdonald
park on a Sunday afternoon. [...] luckily, most of  |   jmm@eci386
the people are staying on the paths and not pissing |
on the flowers   - Gene Spafford
adam@cs.UAlberta.CA (Michel Adam; Gov't of NWT) (06/22/91)
In article <224@kas.helios.mn.org> rhealey@kas.helios.mn.org (Rob Healey) writes:
>In article <1991Jun15.195720.25820@cs.UAlberta.CA> adam@cs.UAlberta.CA (Michel Adam; Gov't of NWT) writes:
>=
>=Can someone who archived these articles look up the conclusion? What was it
>=that CT did differently in their implementation?
>=
>	They used a better processor family... B^). They also grafted 4.1BSD
>	VM onto a System V R2 kernel. As such, you HAVE to have enough

I thought that demand-paged virtual memory only came with 4.2 or 4.3...?

>	swap to page in the whole program to swap and then memory. I.e. I
>	ran into troubles on a UNIX PC where I had 700k free memory but
>	only 100k of swap, the system REFUSED to load a 400k program; sucked

Disk space is cheap, relative to RAM anyway... It seems a fairly minor
compromise, if it avoids the thrashing that seems to have been a common
problem with the other implementations of SYS V R2 or R3. Of course it may
not be a 'purist' approach, but I'm in business, not academia... Real world
tends to intrude...

>	rocks. This just goes to show that both methods have their "evil twin"
>	side. SVR3.x chews up too much swap. Early BSD needs enough swap for
>	the whole initial program even if there is enough main memory but not
>	enough swap/paging.

I was under the impression that the actual 'a.out' file was used for paging
in the code (read 'text'), and that this decreased the need for disk space
in the swap partition. I once tried to delete a program and got an error
message about the file being in use. Did I miss something here? What about
the shared library? Isn't it supposed to be locked when in use?

Could someone send me a list of recommended books on the design of the BSD
system, particularly the version that was used as a 'donor' for the UNIX-PC
kernel? A list of all the BSDisms in this kernel would also be interesting.
I'll summarize if there is interest.
>	I believe SVR4 and 4.3 BSD are better at paging and are more
>	reasonable with it all. If nothing else you can add paging files
>	under both to solve the running out of swap problem, you just take
>	performance hits when the paging file blocks are scattered all
>	over a disk.
>
>		-Rob

Michel Adam
Yellowknife, N.W.T.		adam@cs.ualberta.ca  (...!alberta!adam)
tkacik@kyzyl.mi.org (Tom Tkacik) (06/23/91)
In article <1991Jun21.215503.26210@colnet.uucp>, res@colnet.uucp (Rob Stampfli) writes:
> Correct, with one corollary: On 3.51 (and I believe 3.5) the first page
> of virtual memory is read-only and filled with zeros. This was done, I
> believe, to permit programs that (incorrectly) dereference the NULL pointer
> to work without causing the segmentation fault Alex describes above.
>
> Fine, I thought: They just allocated 4K and set up the MMU for R/O access.
> However, on closer inspection of the hardware, I found what I believe is a
> prohibition against accessing memory below 0x80000 in the MMU firmware itself:
> The firmware appears to use the supervisor/user mode state signal from the
> 68010 to prohibit user-level access to the lower 0x80000 bytes, regardless of
> the actual MMU state for page zero. Can anyone confirm or refute this, and
> if it is so, how does the OS allow read access to page zero?

I just did a quick test of reading page 0. My program ran fine and printed
that location 0 does indeed contain the value 0. However, if I tried reading
from any other location (even 1), the program crashed with a core dump.

Perhaps it is not done in the MMU. Instead of writing zeros to the first
page and giving you read permission, the bus error handler could check to
see if the bus error was due to a read of location 0, and return a value of
zero. It could be done entirely in software.

I remember trying this test a couple of years ago and not being able to
read location 0. This must be a new "feature" of 3.51 (maybe 3.5).
--
Tom Tkacik				| tkacik@kyzyl.mi.org
To rent this space, call 1-800-555-QUIP.	| tkacik@hobbes.cs.gmr.com
res@colnet.uucp (Rob Stampfli) (06/24/91)
In article <389@kyzyl.mi.org> tkacik@kyzyl.mi.org (Tom Tkacik) writes:
>In article <1991Jun21.215503.26210@colnet.uucp>, res@colnet.uucp (Rob Stampfli) writes:
>> Correct, with one corollary: On 3.51 (and I believe 3.5) the first page
>> of virtual memory is read-only and filled with zeros.
>
>I just did a quick test of reading page 0. My program ran fine and printed
>that location 0 does indeed contain the value 0. However, if I tried
>reading from any other location (even 1), the program crashed with
>a core dump.
>
>Perhaps it is not done in the MMU. Instead of writing zeros to the first
>page and giving you read permission, the bus error handler checks to
>see if the bus error was due to a read of location 0, and returns a
>value of zero. It could be done entirely in software.
>
>I remember trying this test a couple of years ago, and not being able
>to read location 0. This must be a new "feature" of 3.51 (maybe 3.5).

Hmm. Here is the program I used to do the test:

	main()
	{
		int *i;

		for (i = 0; ; i++)
			printf("%d=%d\n", i, *i);
	}

It produces printf output up to i=4092, and then crashes and burns. I
suspect that if Tom tried accessing location 1 as anything other than a
character, it might have caused a bus error for alignment reasons.
However, based on what Tom said, I tried this program:

	main()
	{
		int i, k;
		int *j;

	#ifdef NULLPTRTST
		j = 0;
	#else
		j = &k;
	#endif
		for (i = 0; i < 1000000; i++)
			k = *j;
	}

The results are quite interesting. With j=&k, the program runs in a
reasonable amount of time, and is cpu bound. With j=0, the program takes
over 15 minutes to run on an idle machine, and most of the time is spent as
system time (I used "time a.out" to try this). This seems to indicate that
Tom's supposition [that the error interrupt still occurs, but the kernel
code recovers and returns control to the program in the case of a read from
low memory locations] is correct. One day, I will have to delve into the
code and see for sure.

Now, I wonder: was it just coincidence that the value read was zero?
Perhaps if the code were buried deeply in other code, and the registers
were dirtied, the value returned might be some other random value.
--
Rob Stampfli, 614-864-9377, res@kd8wk.uucp (osu-cis!kd8wk!res), kd8wk@n8jyv.oh