dave@smaug.UUCP (Dave Cornutt) (06/17/86)
Keywords: malloc, brk
Summary: beware of stdio doing malloc behind your back
Line eater: uh-huh

If you want to use brk() or sbrk() to do memory allocation, there is something you have to keep in mind: the standard I/O library does mallocs to allocate buffers for files that it handles. The problem is that, at least on our system (UTX/32, a 4.2BSD derivative), malloc maintains its own notion of where the top of memory is, and it doesn't know about brk or sbrk calls done external to it. (Actually, this makes sense from a performance standpoint: if malloc had to check the actual top of memory, that would mean a system call for every malloc.)

What happens is this: say you use sbrk to allocate space at the top of memory for something (like an indeterminately large array). You do some stuff in this space, and then you do some printf's or scanf's on some file (or some other stdio operation, like puts/gets, putc/getc, etc.). If the printf/whatever is the first I/O operation done on that file, stdio will malloc a buffer for it at that time. Since malloc thinks that the top of memory is still where it was originally, it will happily allocate a hunk of your array and let stdio scribble on it. I ran into this problem a few months ago, and you would not believe the amount of time we spent running CPU and memory diags because of it.

The moral of the story is this: (1) once you have done a brk or sbrk, don't do any mallocs, and (2) if you use stdio and brk/sbrk in a program together, make sure that all files being handled by stdio have buffers allocated to them prior to doing any brk/sbrk, either by doing I/O on them (which forces a buffer to be allocated) or by calling setbuf() to explicitly allocate a buffer (alternatively, you can set the file to unbuffered, although this usually hurts performance). Don't forget about stdin/out/err.

Now for the flame: There should be some way to confine malloc to a certain heap space by setting an upper limit on how high it can allocate memory. This way, the programmer could use malloc and still have clear memory above the top of the heap space.

Dave Cornutt
Gould Computer Systems
Ft. Lauderdale, FL
"The opinions expressed herein are not necessarily those of my employer, not necessarily mine, and probably not necessary."
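
Point (2) of the moral can be sketched roughly as follows. This is only an illustration, not code from the post: the helper names pin_stream_buffer and grab_above_break are hypothetical, and setvbuf() (the ANSI C relative of setbuf()) is swapped in because it reports failure. The idea is simply that the stream gets a caller-owned buffer before the break is moved, so stdio has no reason to malloc one on first I/O. On 4.3BSD-style libraries the FILE structure itself may still come from malloc, which this sketch does not address.

```c
#define _DEFAULT_SOURCE         /* assumption: glibc needs this to declare sbrk() */
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Hypothetical helper: hand the stream a caller-owned buffer so that stdio
 * does not need to allocate one on the first I/O operation.  Must be called
 * before any I/O on the stream; buf must stay valid until the stream is
 * closed.  Returns 0 on success, nonzero on failure (as setvbuf does).
 */
int pin_stream_buffer(FILE *fp, char *buf, size_t size)
{
    return setvbuf(fp, buf, _IOFBF, size);
}

/*
 * Hypothetical helper: extend the data segment by nbytes with sbrk().
 * Returns the start of the new region, or NULL on failure.
 */
void *grab_above_break(size_t nbytes)
{
    void *p;

    if (nbytes > (size_t)INTPTR_MAX)
        return NULL;
    p = sbrk((intptr_t)nbytes);
    return p == (void *)-1 ? NULL : p;
}
```

A caller would pin buffers on every stream it intends to use (stdin, stdout and stderr included) and only then call grab_above_break(). With plain setbuf(fp, buf) the buffer must be at least BUFSIZ bytes.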
dave@onfcanim.UUCP (Dave Martindale) (06/20/86)
In article <49@houligan.UUCP> dave@smaug.UUCP (Dave Cornutt) writes:
>Keywords: malloc, brk
>Summary: beware of stdio doing malloc behind your back
>
>If you want to use brk() or sbrk() to do memory allocation, there is
>something you have to keep in mind: the standard I/O library does
>mallocs to allocate buffers for files that it handles.
>
> .....
>
>The moral of the story is this: (1) once you have done a brk or sbrk,
>don't do any mallocs, and (2) if you use stdio and brk/sbrk in a
>program together, make sure that all files being handled by stdio have
>buffers allocated to them prior to doing any brk/sbrk by either
>doing I/O on them (which forces a buffer to be allocated), or by
>calling setbuf() to explicitly allocate a buffer (alternatively,
>you can set it to unbuffered, although this usually hurts performance).
>Don't forget about stdin/out/err.

The problem with doing things this way is 1) it is vulnerable to being broken by the next person who works on the code and doesn't understand the interaction and 2) it may break when you reload it with a new version of some library that does a malloc where it formerly used a static buffer.

Much better is to use a single consistent memory allocation scheme. If you are currently using sbrk just to get a large chunk of memory, just malloc the chunk instead, and then there will be no conflict. If you really need the flexibility of having a massive chunk of memory that you can dynamically grow and shrink, then you do need to use brk; in that case you can write your own malloc that is compatible with whatever private memory allocation scheme you need to use, and stdio will then use your malloc.

Either of these is a lot more robust than having two different memory allocation strategies that you carefully arrange not to interfere with each other, you think.

>Now for the flame: There should be some way to confine malloc to a certain
>heap space by setting an upper limit on how high it can allocate memory.
>This way, the programmer could use malloc and still have clear memory
>above the top of the heap space.

I disagree. Malloc is designed to be the single memory manager for the vast majority of C programs (and it's more portable than sbrk/brk). For that it works adequately, and I don't think it should be more complicated in order to handle an unusual situation like yours. There will always be some strange situation it will be unable to handle no matter how it is written, and you can always supply your own version.
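
For the common case of "an array of unknown size", the single-allocator approach might look something like the sketch below. The type and function names (big_array, big_array_resize, big_array_push) are made up for illustration; the only library calls are realloc() and free(). Whether shrinking with realloc() actually returns memory to the system depends on the malloc implementation.

```c
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical growable array that lives entirely inside malloc's heap. */
struct big_array {
    double *data;   /* backing store, or NULL when empty */
    size_t  len;    /* elements in use */
    size_t  cap;    /* elements allocated */
};

/*
 * Resize the backing store to exactly new_cap elements (0 frees it).
 * Returns 0 on success, -1 on failure with the array left unchanged.
 */
int big_array_resize(struct big_array *a, size_t new_cap)
{
    double *p;

    if (new_cap == 0) {
        free(a->data);
        a->data = NULL;
        a->len = a->cap = 0;
        return 0;
    }
    if (new_cap > SIZE_MAX / sizeof *a->data)
        return -1;                      /* size computation would overflow */
    p = realloc(a->data, new_cap * sizeof *a->data);
    if (p == NULL)
        return -1;
    a->data = p;
    a->cap = new_cap;
    if (a->len > new_cap)
        a->len = new_cap;               /* shrinking truncates the contents */
    return 0;
}

/* Append one value, doubling the capacity when the array is full. */
int big_array_push(struct big_array *a, double v)
{
    if (a->len == a->cap) {
        size_t new_cap;

        if (a->cap > SIZE_MAX / 2)
            return -1;
        new_cap = a->cap ? a->cap * 2 : 1024;
        if (big_array_resize(a, new_cap) != 0)
            return -1;
    }
    a->data[a->len++] = v;
    return 0;
}
```

Start from a zeroed struct big_array, push values as they arrive, call big_array_resize(&a, a.len) to trim once the final size is known, and big_array_resize(&a, 0) to release everything.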
chris@umcp-cs.UUCP (Chris Torek) (06/21/86)
In article <49@houligan.UUCP> dave@smaug.UUCP (Dave Cornutt) writes:
>... at least on our system (UTX/32, a 4.2BSD derivative), malloc
>maintains its own notion of where the top of memory is, and it doesn't
>know about brk or sbrk calls done external to it. (Actually, this makes
>sense from a performance standpoint: if malloc had to check the actual
>top of memory, that would mean a system call for every malloc.)

Not so; and someone may have changed the allocator for UTX, for I believe that the standard 4.2 malloc did not mind programs increasing the break. It does not require one system call per malloc, only one extra system call per `morecore' (an internal routine used to get more space by calling sbrk).

Note, however, that decreasing the break will confuse most mallocs, including the standard 4.2BSD version.

The `moral' in the quoted article does, with some changes, apply to any code that is to be called `portable': mix not malloc() and brk(), lest ye someday sorrow.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs
ARPA: chris@mimsy.umd.edu
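
To make the `morecore' point concrete, here is a toy bump allocator, emphatically not the 4.2BSD malloc and with invented names (toy_alloc, toy_morecore), that tolerates someone else raising the break. It consults the system only when it runs out of space, and compares the address sbrk() hands back with where it believed the break to be. In this simplified form the return value of sbrk() already carries that information, so no separate call is needed. Like the real thing, it makes no attempt to cope with the break being lowered, and it never frees.

```c
#define _DEFAULT_SOURCE         /* assumption: glibc needs this to declare sbrk() */
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>

#define TOY_CHUNK 4096u         /* minimum amount requested from the system */
#define TOY_ALIGN 16u           /* alignment of returned blocks (power of two) */

static char *toy_next;          /* next free byte in our region */
static char *toy_end;           /* where we believe the break is; NULL at start */

/* Get at least `need' more usable bytes.  Returns 0 on success, -1 on failure. */
static int toy_morecore(size_t need)
{
    size_t grow = need + TOY_ALIGN;     /* slack for re-aligning if needed */
    char *old;

    if (grow < TOY_CHUNK)
        grow = TOY_CHUNK;
    if (grow > (size_t)INTPTR_MAX)
        return -1;
    old = sbrk((intptr_t)grow);         /* returns the previous break */
    if (old == (char *)-1)
        return -1;
    if (old != toy_end) {
        /* The break is not where we left it: someone else moved it (or this
         * is the first call).  Abandon the old tail and start afresh at the
         * first aligned address in the new space. */
        uintptr_t a = ((uintptr_t)old + TOY_ALIGN - 1) & ~(uintptr_t)(TOY_ALIGN - 1);
        toy_next = (char *)a;
    }
    toy_end = old + grow;
    return 0;
}

/* Allocate n bytes; there is no corresponding free in this sketch. */
void *toy_alloc(size_t n)
{
    void *p;

    if (n == 0)
        n = 1;
    if (n > SIZE_MAX / 2)
        return NULL;
    n = (n + TOY_ALIGN - 1) & ~(size_t)(TOY_ALIGN - 1);
    if (toy_next == NULL || (size_t)(toy_end - toy_next) < n) {
        if (toy_morecore(n) != 0)
            return NULL;
    }
    p = toy_next;
    toy_next += n;
    return p;
}
```

An allocator that instead caches the top of memory and never compares it against what the system reports is the kind that would hand out a hunk of someone's sbrk'd array.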
jpl@allegra.UUCP (John P. Linderman) (06/25/86)
Chris Torek notes:
> The `moral' in the quoted article does, with some changes, apply to
> any code that is to be called `portable': mix not malloc() and brk(),
> lest ye someday sorrow.

The immediate corollary is ``Thou shalt not use brk()'', because there is no way to avoid malloc(), at least if you use that paragon of portability, the standard i/o package. With 4.3 and ULTRIX, not only i/o buffers but the FILE structures themselves are acquired with malloc. You can supply your own buffers, but I know of no way to avoid the malloc for the FILE structures.

But if I can't use brk(), and I have an application that uses a LOT of memory for a short time, then much less memory for a very long time (sort, as always, comes to mind), then my implementation will hog memory and perform poorly on those machines that aren't paged. Programs don't port just because they compile and run, they have to perform respectably. In a very real sense, an application may be less portable if it is denied the use of brk().

If this were just a problem with sorts and brk, I wouldn't worry much. The more important, underlying issue, is ``Why should the use of brk result in reduced portability?''. I'll accept a response like ``Malloc is a more general paradigm,'' (and I'll put #ifdef's in my code so it doesn't rely on the existence of brk). But I am troubled by responses like ``Some implementations of malloc fail if brk is invoked outside of malloc.'' As far as I'm concerned, such an implementation is broken.

There's no hope of writing programs that port to broken systems. You can add a lot of complexity to your code, and thereby make it less likely to work on systems that aren't broken, but you can never anticipate all the ways a library routine might misbehave. And if you are not careful, you can ``enshrine'' bugs in such a way that your code will break if the bugs are fixed.

Of course, ``broken'' and ``portable'' are squishy terms. For the most part, I am quite happy if my programs work on BSD, System V, and ULTRIX distributions, and, for that reason, I will accommodate ``features'' of these systems that I would regard as ``bugs'' in less widely used systems. But, at some point, I have to decide that my programs are ``portable enough'', and not worry about the implications of different implementations, broken or otherwise.

John P. Linderman
Department of Ported Sorts
allegra!jpl
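
The #ifdef arrangement mentioned above could be sketched along these lines. Everything here is an assumption made for illustration: the USE_SBRK macro, the struct scratch type and both function names are hypothetical, and this is not taken from any actual sort. With USE_SBRK defined, the large temporary region comes from sbrk() and is handed back with brk(), but only if it is still the topmost thing in the data segment. Otherwise the code falls back to malloc()/free().

```c
#define _DEFAULT_SOURCE         /* assumption: glibc needs this to declare brk/sbrk */
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>
#ifdef USE_SBRK                 /* hypothetical build-time switch */
#include <unistd.h>
#endif

/* Hypothetical handle for a large, short-lived work area. */
struct scratch {
    void  *base;
    size_t size;
};

/* Acquire `size' bytes of scratch space.  Returns 0 on success, -1 on failure. */
int scratch_acquire(struct scratch *s, size_t size)
{
    void *p;

    if (size == 0)
        return -1;
#ifdef USE_SBRK
    if (size > (size_t)INTPTR_MAX)
        return -1;
    p = sbrk((intptr_t)size);
    if (p == (void *)-1)
        return -1;
#else
    p = malloc(size);
    if (p == NULL)
        return -1;
#endif
    s->base = p;
    s->size = size;
    return 0;
}

/* Give the scratch space back. */
void scratch_release(struct scratch *s)
{
#ifdef USE_SBRK
    /* Lower the break only if our region is still at the very top.  If
     * anything (malloc included) has allocated above us since, pulling the
     * break down would take that memory away, so the region is left in place. */
    if (sbrk(0) == (void *)((char *)s->base + s->size))
        (void)brk(s->base);
#else
    free(s->base);
#endif
    s->base = NULL;
    s->size = 0;
}
```

The check in scratch_release() is a design choice: it trades a possible failure to shrink for never pulling memory out from under another allocator. It does nothing for an allocator that caches the break and never looks again.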