staff@cadlab.sublink.ORG (Alex Martelli) (11/08/90)
Maybe it's the pageable kernel, who knows, but for sure AIX 3 is rather funny wrt programs that do dynamic allocation (don't they all?-). We have an interactive program (a solid modeler) which allocates memory dynamically, depending on what the user is doing, with straight calls to malloc().

On most Unix platforms (...all of them, except AIX 3...!), if the program (=its user) is too ambitious for the amount of paging space, malloc() will eventually return NULL - the program religiously tests for this (to free all that's freeable, retry the malloc(), and if it still fails gracefully inform the user). On some machines the failing malloc()'s give console messages suggesting expansion of paging space, which is bothersome enough (we DO NOT want to expand paging space to infinity, there's NO limit on the complexity of solid models that a user will *attempt* to build, we JUST want to inform the user that a given model takes more memory than he/she's got, is that such an unusual approach on our part, grumble grumble!-).

On AIX 3, things are worse. It appears that malloc() does succeed, BUT then "system paging space" gets low, and funny things happen. If our application does not catch SIGDANGER, it gets killed; if it DOES catch SIGDANGER, the *X Window System* (under which the app's running) gets killed instead!

The app does not appear to be able to really "free" memory to the system, i.e. normal free() probably does not sbrk() (this happens on MANY platforms... the malloc()/free() pair seems to attempt to minimize system-call overhead). We could try funneling our malloc()'s through a safemalloc() which will check psdanger() and refuse to allocate if this would take paging-space too low, but this does not appear to solve the problem: I believe Xlib, Xt, Ingres, and whatever else we link with our app, use raw malloc()'s.
Ok, so I COULD completely rewrite the malloc() package and fix things for OUR process, but this STILL would not solve it - it's quite likely that the malloc() from OUR process happens when everything's fine, but right after that any process running some application whose source we don't have might well allocate more memory and cause the danger condition!

It upsets me that an application programmer is supposed to fix these low-level, system-oriented things, and the only alternative appears to be forgoing dynamic memory allocation completely! Just having a program (or the X Window System indispensable for interfacing to it) die on the user when he/she attempts construction of a complex solid model does NOT appear to be a viable approach for a commercial application!!!

A system-level solution would be best, but I can't find a good one. I think we now have ALL the documentation IBM supplies, but I don't see a way there to reserve some amount of resource (paging space) for the kernel, or for root-owned processes, or whatever. The limits file allows things like fixing the maximum amount of data area PER PROCESS, but what good will this do me??? I can't predict how many processes WILL be running when a dangerous situation approaches! Why oh why can't brk()/sbrk() just REFUSE to expand space into a dangerous situation?

Suggestions will be appreciated, particularly on how to AVOID danger situations, but also on how to GRACEFULLY HANDLE them. Our plight does NOT appear to me to be a very strange one, so I would really hope somebody else's "been there before"! Thanks in advance.
--
Alex Martelli - CAD.LAB s.p.a., v. Stalingrado 45, Bologna, Italia
Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).
richard@locus.com (Richard M. Mathews) (11/13/90)
staff@cadlab.sublink.ORG (Alex Martelli) writes:
>On AIX 3, things are worse. It appears that malloc() does succeed, BUT
>then "system paging space" gets low, and funny things happen.

It seems that some customers insist on this behavior because they malloc huge areas which are only sparsely used. They want to have a virtual process size much greater than the paging space of the system. At least that is what I was told when I was asked to write the SIGDANGER code for AIX 1.x.

Richard M. Mathews			Freedom for Lithuania
Locus Computing Corporation		Laisve!
richard@locus.com
lcc!richard@seas.ucla.edu
...!{uunet|ucla-se|turnkey}!lcc!richard
marc@arnor.uucp (11/15/90)
malloc fails when the request causes the heap to exceed the ulimit for data. It has nothing to do with paging space.

In AIX V3 the default data limit is quite large, which is why it appears to behave differently.

Marc Auslander
rhoover@arnor.uucp (11/15/90)
In article <MARC.90Nov14153807@marc.watson.ibm.com>, marc@arnor.uucp writes:
|> malloc fails when the request causes the heap to exceed the
|> ulimit for data. It has nothing to do with paging space.
|>
|> In AIX V3 the default data limit is quite large, which is why
|> it appears to behave differently.
|>
|> Marc Auslander

Well, this is not true under SunOS. For example, consider the following program (called big.c):

#include <stdio.h>
#include <stdlib.h>

main()
{
    while (malloc(1024*1024*4) != NULL)
        fprintf(stderr, "Another 4 meg\n");
    fprintf(stderr, "That's all folks\n");
}

cirrus% limit
cputime         unlimited
filesize        unlimited
datasize        524280 kbytes
stacksize       8192 kbytes
coredumpsize    unlimited
memoryuse       unlimited
cirrus% /etc/pstat -s
15312k allocated + 4816k reserved = 20128k used, 29772k available
cirrus% big
Another 4 meg
Another 4 meg
Another 4 meg
Another 4 meg
Another 4 meg
Another 4 meg
Another 4 meg
That's all folks
cirrus%

Every unix system that I have ever used has returned 0 when malloc can no longer allocate usable memory. When I malloc storage, I check for NULL, and if my application has files to be written out, etc., I free some storage and clean up.

I see this malloc issue as one of compatibility. Programs should not have to be rewritten in order to run on IBM machines. If the /6000 version of malloc is faster, then a new call (vmalloc()?) should be provided for fast memory allocation under the new semantics. Would you have been a happy camper if berkeley had replaced fork() with the vfork() semantics and had provided psfork() for compatibility?

roger
rhoover@ibm.com
dennis@gpu.utcs.utoronto.ca (Dennis Ferguson) (11/16/90)
In article <1990Nov14.223820.29154@arnor.uucp> rhoover@cirrus.watson.ibm.com (Roger Hoover) writes:
>In article <MARC.90Nov14153807@marc.watson.ibm.com>, marc@arnor.uucp writes:
>|> malloc fails when the request causes the heap to exceed the
>|> ulimit for data. It has nothing to do with paging space.
>|>
> Well, this is not true under sunos. For example, consider the
> following program (called big.c):
[...]

It has been a while since I understood this really well, but I think what is being described is a System V versus BSD Unix variation.

Under BSD Unix (and very old System V derivatives?), sufficient space on the paging device to back all active virtual memory is allocated when the memory is allocated, no matter whether the paging space is ever used or not. The effect of this is that you run out of paging space only when allocating new virtual memory, i.e. when exec()ing a new program (in which case the shell probably emits a "No memory" message) or when growing an existing process (in which case malloc() returns a NULL value). If a process is started successfully it will never be terminated due to paging space exhaustion, though requests for more memory may be denied. Other implications are that you can't run a BSD Unix system with no paging space, and if the size of the paging space doesn't exceed the size of your physical memory you won't be able to use all of the latter.

System V (or at least the release I was familiar with) doesn't do this. Instead it allocates page space dynamically, when you need to page something out. Running processes have no page space allocation unless they actually have pages out on the backing store. The good effects of this are that you can run System V systems with no page space at all if need be, and that the total in-use memory allowed is related to (physical memory + page space) rather than just page space.
The problem is that System V doesn't know if page space is exhausted when it allocates new memory, but rather finds this out only when it needs to page something out. To avoid deadlock, the process which is being paged out is killed.

I think AIX exhibits the latter behaviour exactly. Malloc() never returns NULL because the kernel doesn't know page space is exhausted at that point. Sometime later, however, a process will die to pay for this. Note that the process which dies is hardly ever the process which grew itself, since the latter process is obviously active and needs its pages, but rather something that was recently active but which is now idle, like your shell, the window system or a daemon. Something has to die when you get to this point, since it isn't normally possible to free memory back to the system. It sounds like the SIGDANGER thing was added to AIX to give the (more likely to be guilty) active process a chance to commit hara kiri before an innocent dies.

To tell the truth, I too like the BSD behaviour a lot better (though the implementation in a vanilla BSD Unix is old, grotty, and still suspects all the world is a vax). Having random processes die is truly annoying.

Dennis Ferguson
University of Toronto
mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (11/16/90)
>On 15 Nov 90 18:09:19 GMT, dennis@gpu.utcs.utoronto.ca (Dennis Ferguson) said:

Dennis> System V (or at least the release I was familiar with) doesn't
Dennis> do this. Instead it allocates page space dynamically, when
Dennis> you need to page something out. Running processes have no
Dennis> page space allocation unless they actually have pages out on
Dennis> the backing store. The good effects of this are that you can
Dennis> run System V systems with no page space at all if need be, and
Dennis> that the total in-use memory allowed is related to (physical
Dennis> memory + page space) rather than just page space.

Dennis> I think AIX exhibits the latter behaviour exactly.

This does not mesh with my experience with AIX. Under AIX 3.1 on my RS/6000, I find that I cannot run jobs for which there is not enough paging space available on the disk --- even though there is plenty of memory to contain the job. Examples:

(1) With 16 MB paging space, the O/S used 12 MB and reported 4 MB free. With either 8 MB or 32 MB installed in the machine, I was unable to run jobs with an *active working set* larger than 4 MB.

(2) With 36 MB paging space, the O/S used 12 MB and reported 24 MB free. With 32 MB RAM, I was able to run jobs with *active working sets* right up to the 24 MB paging space limit. A check with 'ps v' showed that the jobs were completely in RAM.

So what do I mean by *active working set*?
Well, I'm not sure how the O/S figures it out, but the following program runs until the part of the array that is *actually used* gets too big for the currently available paging space:

      parameter (n = 2**22)
      double precision a(n)
      do length = 65536, n, 65536
         do i = 1, length
            a(i) = float(i)
         end do
         print *, 'Size (MB) :', float(length*8)/float(2**20)
      end do
      end

So somewhere this datum has to fit into theories on how AIX does paging....
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@vax1.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET
dennis@gpu.utcs.utoronto.ca (Dennis Ferguson) (11/16/90)
In article <MCCALPIN.90Nov15150308@pereland.cms.udel.edu> mccalpin@perelandra.cms.udel.edu (John D. McCalpin) writes:
>This does not mesh with my experience with AIX. Under AIX 3.1 on my
>RS/6000, I find that I cannot run jobs for which there is not enough
>paging space available on the disk --- even though there is plenty of
>memory to contain the job.
 [some interesting examples]
>So what do I mean by *active working set*? Well, I'm not sure how the
>O/S figures it out, but the following program runs until the part of
>the array that is *actually used* gets too big for the currently
>available paging space:

John,

Figuring out what memory has been modified is fairly simple since the memory management hardware keeps track of this. Note that the only memory which may need to go out to paging space is modified data. Text and unmodified, initialized data pages can be paged in from the binary, while unmodified, uninitialized data pages need not exist at all until they are touched. BSD kernels don't worry so much about any of this (i.e. they allocate page space in advance for all (potentially modifiable) data pages), but I think System V kernels do.

If you allocate page space on the fly you can do things like allocate huge chunks of memory which you only use little bits of, without having to have a huge, mostly unused swap area to back it. Of course, if you do this what you lose are the nice "No memory" messages and NULL return values from malloc() when you run out.
You are right that your examples do not match my memory of the behaviour of System V kernels; in fact they seem to exhibit a combination of some of the worst characteristics of both System V and BSD paging strategies. What would be interesting to know, however, is whether you were ignoring or catching the SIGwhatever which indicates low memory when you were running these and, if you weren't, whether the behaviour you see changes if you do ignore this. If the latter is true, what you are seeing may be the result of an IBM value-added feature rather than a characteristic of the underlying System V kernel.

Indeed, what would be really interesting is if someone who actually knew what they were talking about would explain how AIX paging works to both of us. Whatever they have done, the results are often unpleasant and are not helped by the greedy memory consumption of AIX and some of its utilities.

Dennis Ferguson
University of Toronto
madd@world.std.com (jim frost) (11/17/90)
marc@arnor.uucp writes:
>malloc fails when the request causes the heap to exceed the
>ulimit for data. It has nothing to do with paging space.

You are mistaken -- either can cause malloc to fail.

jim frost
saber software
jimf@saber.com
geoff@edm.uucp (Geoff Coleman) (11/17/90)
From article <311@cadlab.sublink.ORG>, by staff@cadlab.sublink.ORG (Alex Martelli):
> A system-level solution would be best, but I can't find a good one.
> I think we now have ALL the documentation IBM supplies, but I don't
> see a way there to reserve some amount of resource (paging space) for
> the kernel, or for root-owned processes, or whatever. The limits file
> allows things like fixing maximum amount of data area PER PROCESS, but
> what good will this do me??? I can't predict how many processes WILL
> be running when a dangerous situation approaches! Why oh why can't
> brk()/sbrk() just REFUSE to expand space to a dangerous situation?

But you really should read the man page for the limits file. It resides in hardcopy in "Files Reference". It is without a doubt the funniest AIX man page I've found yet. Of 6 parameters, all but fsize are marked as "not used" (so why are they there?). So even if data=xxxx would theoretically solve your problem, it wouldn't really, because the value is ignored.

Geoff Coleman
rogers@rogers.austin.ibm.com (Mark D. Rogers/100000) (11/21/90)
Regarding Dennis Ferguson's request for an explanation of how AIX paging space allocation works, here goes:

- paging disk slots are allocated on first reference to a page (early disk allocation, similar to the BSD style Dennis described, although we do not implement quotas).
- malloc() (actually brk()/sbrk()) does permit over-allocation of paging space.

SIGDANGER was quite correctly interpreted by Mr. Ferguson in a prior posting as being something that allows a process to `gracefully' exit. I forget what the algorithm for determining which process to kill is, but you understand the mechanism correctly.

SIGDANGER was actually invented for the IBM RT, and migrated to the RISC System/6000. The RT Virtual Memory Subsystem did late paging space allocation, and allowed malloc() to over-allocate, exactly like the System V description. For the RISC System/6000, a completely new file system which has directory journalling and is tightly coupled with the paging supervisor was written. The pager does most of the file i/o via internally mapped files. Early allocation was done in the new pager, essentially, for two reasons:

- to allow for potential future accounting mechanisms to be implemented which take advantage of the fact that virtual memory is always backed one-to-one with a disk slot (what you guys want). I don't know what we are going to do in this area yet (if anything).
- the new journalled file system/paging subsystem is quite complex, having many objects with many states in and among themselves to manage. Early allocation simplified the paging subsystem design, and allows for the potential to implement accounting of sorts.

As to whether we will or will not do accounting/quotas, I really don't know at the moment, as I am not the person tracking that.

With regard to the malloc() issue:

- you are quite right in saying that we have customers who want `sparse' virtual memory. That really is a big deal among certain applications.
Being able to have a large memory object and only pay for what you really use can be a nice programming construct.

- We did provide a `safemalloc()' which goes and touches all the pages & checks for SIGDANGER. I thought we shipped that as a sample.

We have at least two very distinct classes of customers where paging space allocation is concerned:

1. Those such as yourselves, who want your application to either run, or not, based upon how much backing storage you have. If your app doesn't run, go buy another disk & add paging space.
2. Customers who like to allocate all the virtual memory they can, knowing that it will never all be used. This alleviates the need for any complicated run-time memory management schemes in their model. It is a very convenient programming construct.

From a historical perspective (for your information), we had a number of `type 2' customers early on, on the RT, and that influenced the System V-like behaviour of the RT somewhat. It also has something to do with why we, on the RISC System/6000, still want to allow large virtual memory objects efficiently.

Basically, what we have on the RISC System/6000 is a hybrid attempt at allowing both types of customers. SIGDANGER is a compromise attempt to allow them to co-exist. Admittedly, it is not perfect; however, no matter what route one goes in on this issue, we have found thorns in the path. We are continuing to investigate the entire issue, and welcome any comments.

Mark D. Rogers
AIX Operating System Architecture
Austin, Texas
alex@am.sublink.org (Alex Martelli) (11/23/90)
rogers@rogers.austin.ibm.com (Mark D. Rogers/100000) writes:
... [after a clear explanation of what AIX is doing re malloc()]
>We have at least two very distinct classes of customers where
>paging space allocation is concerned:
>1. Those such as yourselves, who want your application to either
>   run, or not, based upon how much backing storage you have.
>   If your app doesn't run, go buy another disk & add paging space.
>2. Customers who like to allocate all the virtual memory they can,
>   knowing that it will never all be used. This alleviates the
>   need for any complicated run-time memory management schemes in
>   their model. It is a very convenient programming construct.

Thanks for the explanation. However, in our case it's not really that the application "would not run" - it would run fine as long as the user was only doing solid models whose complexity was compatible with his/her amount of paging space, and would give a harmless warning if a model turned out to be too complex for that (and in many cases there will be some work-around for such resource limitations). I would assume that many applications where the user interactively asks for memory-consuming operations, from symbolic maths to statistical data analysis, would be similar to solid modeling in this respect. Pity that this class of apps has been sacrificed to others who apparently need garbage-collection stuff and appear to be too lazy to do it:-).

Another consideration is that many applications will be PORTED from the huge existing Unix base to AIX; these will NOT expect unreliable, over-committed malloc()! For newly developed apps it may be a draw, but considering potential portings to AIX, I believe the choice of type 2 over type 1 was inferior. Pity!

Touching all pages and calling psdanger() on EACH malloc() is WAY too much overhead, I think.
I believe I will have to rewrite a malloc()/free()/realloc() package, putting all the overhead only on the sbrk(); it's either that, or having to explain to customers how and why our solid modeler is so fragile on AIX, while it's solid as a rock on HP-UX, Ultrix, SunOS, SONY/NeWS, and so on and on!

>From a historical perspective (for your information) we had
>a number of `type 2' customers early on, on the RT, and
>that influenced the System V-like behaviour of the RT somewhat.
>It also has something to do with why we, on the RISC System/6000,
>still want to allow large virtual memory objects efficiently.
>Basically what we have on the RISC System/6000 is a hybrid attempt
>at allowing both types of customers. SIGDANGER is a compromise attempt
>to allow them to co-exist. Admittedly, it is not perfect; however,
>no matter what route one goes in on this issue, we have found
>thorns in the path. We are continuing to investigate the entire
>issue, and welcome any comments.

I hope this input is some use! If you would just include, say, a libmalloc.a which does all necessary checks on sbrk(), for "compatibility with non-overcommitting Unices", it would be nice. Best, I believe, would be a tunable parameter to force sbrk() to non-overcommittance on a system-wide basis; I don't really see how you could make SOME processes overcommit while all the rest default to safe allocation, or vice versa, but if you could, that would definitely be a jump upwards in quality, to go with the others that AIX undoubtedly has.

Feel free to follow this up either here or by email (but in this case to staff@cadlab.sublink.org, please - it's not really a "personal interest" thing, although I'm replying from home - I can't really afford an RS/6000 machine at home...:-). And thanks again for your clear explanation and comments.
--
Alex Martelli - (home snailmail:) v. Barontini 27, 40138 Bologna, ITALIA
Email: (work:) staff@cadlab.sublink.org, (home:) alex@am.sublink.org
Phone: (work:) ++39 (51) 371099, (home:) ++39 (51) 250434
Fax: ++39 (51) 366964 (work only), Fidonet: 332/401.3 (home only).