misha@daisy.UUCP (Mike Umansky) (06/13/86)
I have a need in my application to find the size of a previously malloc'd object (maybe an array or structure). For example, one routine may allocate some array or just one big chunk of memory and pass a pointer to the chunk to another routine. This other routine now needs to know the size of the area pointed to by this pointer. Also, this other routine has no idea about the structure of this area and doesn't even care. Now, the question, how can this other routine get the size of the area pointed to by passed pointer. Would (sizeof(*ptr)) work or will it just give me the size of the pointer type? Another possibility would be to write a routine which would scan malloc's allocation array and try to match the pointers; then the size is easily obtained. Any help and comments on this would be greately appreciated. Either post your answers or mail them directly please. Thanks for consideration. -- -- NAME: Michael Umansky (misha) E-MAIL: ucbvax!hplabs!nsc!daisy!misha WORK: Daisy Systems Corp. 700B Middlefield Road Mountain View, CA 94039-7006 (415) 960-7166 (work) HOME: 94 Cassia Ct. Hayward, CA 94544 (415) 886-4805 (home)
thomas@utah-gr.UUCP (Spencer W. Thomas) (06/13/86)
In article <165@daisy.UUCP> misha@daisy.UUCP (Mike Umansky) writes: > I have a need in my application to find the size of a > previously malloc'd object (maybe an array or structure). We had this same need, and for many years satisfied it by writing a macro SIZE(ptr) that `knew' how malloc saved the block size (usually in the word preceding ptr). Finally, with the 4.3 release, and in the interests of easier portability (we were distributing our own malloc with our code), we went to a different scheme: We now do storage allocation via two routines `new' and `dispose'. The function `new' asks malloc for the size requested plus one word, puts the size in the extra word, and returns a pointer to the next word (obviously, it has to worry about alignment, too, so it may not be quite this simple). The function `dispose' readjusts the pointer passed to it and calls free. A third routine, `size', can then easily look at the word preceding the pointer passed to it and return the size of the block. This scheme has the advantage of being blissfully independent of the underlying storage allocation management. -- =Spencer ({ihnp4,decvax}!utah-cs!thomas, thomas@utah-cs.ARPA)
ed@valid.UUCP (Ed Patriquin) (06/13/86)
> Xref: pesnta net.unix:21451 net.unix-wizards:22133 net.arch:21118 net.lang.c:21930 > > > I have a need in my application to find the size of a > previously malloc'd object (maybe an array or structure). > Also, this other routine has no idea about the structure > of this area and doesn't even care. Now, the question, > how can this other routine get the size of the area pointed > to by passed pointer. > NAME: Michael Umansky (misha) > Daisy Systems Corp. Well in UNIX(TM) one could write their own memory allocator and include a chunksize (ptr) function, I don't know how you would do it in DNIX(TM). But really, why not establish a protocol between the offending functions. Since some function somewhere knows how big the chunk is, create a data structure which contains the information and make it available to the rest of the program. It is really hard to come up which general purpose algorithmic solutions to problems without having even a glimmer of what environment you are talking about, i.e. passing around display lists, or strings or matrix inversion. -- IRS: Ed Patriquin USENET: ...!ucbvax!hplabs!pesnta!valid!ed USPS: Valid Logic, 2820 Orchard Parkway, San Jose, Calif 95134 ATT: (408) 945-9400 This disclaimer is to disclaim any knowledge of a disclaimer.
gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/15/86)
In article <165@daisy.UUCP> misha@daisy.UUCP (Mike Umansky) writes:
- I have a need in my application to find the size of a
- previously malloc'd object (maybe an array or structure).
- For example, one routine may allocate some array or just
- one big chunk of memory and pass a pointer to the chunk to
- another routine. This other routine now needs to know the
- size of the area pointed to by this pointer. Also, this
- other routine has no idea about the structure of this area
- and doesn't even care. Now, the question, how can this
- other routine get the size of the area pointed to by passed
- pointer. Would (sizeof(*ptr)) work or will it just give
- me the size of the pointer type? Another possibility would
- be to write a routine which would scan malloc's allocation
- array and try to match the pointers; then the size is
- easily obtained.
Sorry to be the bearer of bad tidings, but there is NO reliable
way to do what you're attempting. malloc() may even have allocated
more space than you asked for, but in any case there's no portable
way to find out how much. You'll just have to keep track of the
size of your malloc()ed objects yourself.
jer@peora.UUCP (J. Eric Roskos) (06/17/86)
It is really hard to come up which general purpose algorithmic solutions to problems without having even a glimmer of what environment you are talking about ... Sure it is. Just write your own routine to call malloc; allocate a sizeof(int) worth of extra space, then store the size of the thing you malloc'ed in the int at the front of the allocated block, advance the pointer past the place where your stored the size, and return that as the pointer to the block you allocated. The size of the object is then found in the word preceeding the location pointed to by the object pointer. Also provide a routine to deallocate the block by backing the pointer up before calling free(). This approach is portable, simple, and easy to understand. Also it doesn't require any assumptions about what kind of objects are being allocated. -- E. Roskos "They are a nexus of commercially produced artifact and individual message that expresses a dialect central to popular culture." -- Sociologist explaining the reason for having Florida postcards.
chris@umcp-cs.UUCP (Chris Torek) (06/18/86)
In article <2206@peora.UUCP>, jer@peora.UUCP (J. Eric Roskos) writes: >... allocate a sizeof(int) worth of extra space, then store the >size of the thing you malloc'ed in the int at the front of the >allocated block, advance the pointer past the place where your >stored the size, and return that as the pointer to the block you >allocated. > >This approach is portable, simple, and easy to understand. Also >it doesn't require any assumptions about what kind of objects are >being allocated. Unfortunately, it is not necessarily portable. I can imagine a machine where `double' arguments must be aligned on an eight-byte boundary, but integers are only four bytes wide. In this case calling the new routine to allocate doubles would result in an odd-address-style trap when the returned value is used. What can be done is this: #include <stdio.h> /* change all `char *'s to `void *'s in ANSI C */ struct mymem { struct mymem *m_next; /* linked list */ struct mymem *m_prev; /* doubly, for easy removal */ unsigned m_size; /* size of mem, in bytes */ char *m_mem; /* storage */ }; /* queue head */ static struct mymem m_base = { &m_base, &m_base }; extern char *malloc(); /* * Allocate some memory, remembering its size. */ char * myalloc(size) unsigned size; { register struct mymem *m; if ((m = (struct mymem *) malloc(sizeof (*m))) == NULL) return (NULL); if ((m->m_mem = malloc(size)) == NULL) { free((char *) m); return (NULL); } m->m_size = size; /* * Insert at head: change to tail if mem usage is more * `queueish' than `stackish'. */ m->m_next = m_base.m_next; m->m_prev = &m_base; m_base.m_next->m_prev = m; m_base.m_next = m; return (m->m_mem); } /* * Locate the queue element corresponding to the given memory * region. */ static struct mymem * myfind(mem) register char *mem; { register struct mymem *m = &m_base; while ((m = m->m_next) != &m_base) if (m->m_mem == mem) return (m); return (NULL); } /* * Free a region, and drop it from the queue. */ myfree(mem) char *mem; { register struct mymem *m = myfind(mem); if (m == NULL) { /* * Complain? Abort? Or what? * Change to suit local tastes... */ (void) fprintf(stderr, "myfree: invalid free\n"); (void) fflush(stderr); abort(); } /* remove from queue */ m->m_next->m_prev = m->m_prev; m->m_prev->m_next = m->m_next; /* and discard */ free(m->m_mem); free((char *) m); } unsigned mysize(mem) char *mem; { struct mymem *m; if ((m = myfind(mem)) == NULL) return (0); /* not allocated */ return (m->m_size); } This definitely *is* portable, if slow for large remembrance queues. Of course, malloc() knows all along how big the allocated region is (though the number it knows might be larger than the original request); so it is a shame to have to go through all this effort. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516) UUCP: seismo!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@mimsy.umd.edu
jer@peora.UUCP (J. Eric Roskos) (06/18/86)
> Sure it is. Just write your own routine to call malloc; allocate a > sizeof(int) worth of extra space, then store the size of the thing you > malloc'ed in the int at the front of the allocated block, advance the > pointer past the place where your stored the size, and return that as > the pointer to the block you allocated. The size of the object is then > found in the word preceeding the location pointed to by the object pointer. hansen@pegasus points out that for this to work, the distance you advance the pointer by has to be sufficient to put the pointer on the boundary malloc usually aligns on. So you have to allocate sizeof(ALIGN) extra space for it to work for all possible types of data, where ALIGN is the largest alignment boundary required for the machine. E.g., on a typical machine it might work for chars and ints, but not double-precision floating point numbers. -- E. Roskos "Hold on, Figment." "Why?" "The Idea Bag is full." "Then, let's imagine--" "--No, wait." -- Dialogue from "Journey Into Imagination," Kodak exhibit, Epcot Center, Orlando, FL.
tps@sdchem.UUCP (Tom Stockfisch) (06/19/86)
In article <2206@peora.UUCP> jer@peora.UUCP (J. Eric Roskos) writes: > > It is really hard to come up which general purpose algorithmic > solutions to problems without having even a glimmer of what > environment you are talking about ... > >Sure it is. Just write your own routine to call malloc; allocate a >sizeof(int) worth of extra space, then store the size of the thing you >malloc'ed in the int at the front of the allocated block, advance the >pointer past the place where your stored the size, and return that as >the pointer to the block you allocated. The size of the object is then >found in the word preceeding the location pointed to by the object pointer. >... >This approach is portable, simple, and easy to understand. >Also it doesn't require any assumptions about what kind of objects are >being allocated. >-- >E. Roskos This is NOT portable. Suppose your machine requires "doubles" to be aligned on addresses divisible by 8 and sizeof(int) == 4. You do double *dp = (double *)my_malloc( (unsigned)100 * sizeof(double) ); If my_malloc() calls malloc() and then increments the pointer returned by malloc by 4, you will get a segmentation fault as soon as your allocated space is accessed. A *portable* way to do this (which wastes a little memory, but doesn't require that you know the worst-case alignment size) is to hand both the unit size and number of units to my_malloc() and allocate one extra *unit* to store the size of the space in. Of course, if the unit size < sizeof(int) you have to do a little extra diddling. Usually when I need to know the end of the allocated space I need to access it too often to be done by a subroutine call so I set up a structure to replace the pointer: struct foo { arraytype *beg; /* begining of array */ arraytype *end; /* end of array */ } bar; -- Tom Stockfisch, UCSD Chemistry
brett@wjvax.UUCP (06/20/86)
In article <2206@peora.UUCP> jer@peora.UUCP (J. Eric Roskos) writes: > > It is really hard to come up which general purpose algorithmic > solutions to problems without having even a glimmer of what > environment you are talking about ... > >Sure it is. Just write your own routine to call malloc; allocate a >sizeof(int) worth of extra space, then store the size of the thing you >malloc'ed in the int at the front of the allocated block, advance the >pointer past the place where your stored the size, and return that as >the pointer to the block you allocated. The size of the object is then >found in the word preceeding the location pointed to by the object pointer. > >Also provide a routine to deallocate the block by backing the pointer up >before calling free(). > >This approach is portable, simple, and easy to understand. >Also it doesn't require any assumptions about what kind of objects are >being allocated. In fact, this assumes that the resulting pointer (malloc(len+sizeof(int))+sizeof(int)) is still well-aligned. This is only true if sizeof(int) is the most coarse alignment boundary on the machine. If, for example, structures require more coarse alignment, attempting to cast the above pointer to a structure pointer will fail. ------------- Brett Galloway {pesnta,twg,ios,qubix,turtlevax,tymix,vecpyr,certes,isi}!wjvax!brett
jer@peora.UUCP (06/23/86)
> Unfortunately, it is not necessarily portable. I can imagine a > machine where `double' arguments must be aligned on an eight-byte > boundary, but integers are only four bytes wide. That's a good point. It's unfortunate that in all the system-specific #defines that one can include, there isn't one that specifies the alignment to be used in malloc's, etc; malloc has it #defined in it, as do a number of other programs (e.g., Peter Honeyman's "pathalias" program). Personally I still favor putting the size up at the front, eventhough it requires now making the alignment a configuration parameter (i.e., the user has to specify it for various machines), because maintaining a linked list or other external record of the size introduces a consistency problem -- you have to keep your external size information consistent with what malloc knows the size to be, i.e., the physical separation of the allocated block and the array or linked list that maintains the information makes it more likely that they can become inconsistent. (Notice, incidentally, that this is essentially a flaw in C -- there is an essential piece of information, the alignment constant, that is not available from within the language, eventhough the language allows one to do things that require this information.) Actually there's a trick you could use to get the alignment constant, maybe, however: char *p, *q; int diff; p = malloc(1); q = malloc(1); diff = q - p; if (diff < 0) diff = p - q; But it's not really portable either. Although the argument to malloc is defined to be a number of "bytes", and although in C a "byte" is an "unspecified unit ... which is the same size as a char" (K&R p. 126), and subtracting two pointers gives "the number of elements between p and q" (K&R p. 98), where in this case an "element" is a char which is the same size as a "byte", which means the distance computation itself is correct, you're not guaranteed that the two consecutive mallocs will give consecutive blocks of memory -- they certainly won't be likely to after some other mallocs and frees have been done. It's hard to see a situation where the initial allocations wouldn't be consecutive, but it's not guaranteed, and that's sufficient reason to have doubts. (The reason for the "if" above is to handle mallocs that allocate consecutively upwards or downwards; both of which, like stacks that grow upwards or downwards, do exist.) Of course, our "compiler wizard" who is sitting here in my office reading the newspaper says "just align it on a quadword". :-) -- E. Roskos
zben@umd5.UUCP (Ben Cranston) (06/23/86)
In article <2081@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes: > In article <2206@peora.UUCP>, jer@peora.UUCP (J. Eric Roskos) writes: >> ... allocate a sizeof(int) worth of extra space, then store the >> size of the thing you malloc'ed in the int at the front of the >> allocated block, advance the pointer past the place where your >> stored the size, and return that as the pointer to the block you >> allocated. > Unfortunately, it is not necessarily portable. I can imagine a > machine where `double' arguments must be aligned on an eight-byte > boundary, but integers are only four bytes wide. In this case > calling the new routine to allocate doubles would result in an > odd-address-style trap when the returned value is used. I can see three solutions to this problem. In order of lowest to highest degree of hacking to Un*x itself: 1. Erect a superstructure upon Malloc that maintains the required additional information (this is what Chris did in the deleted part of his posting). Requires *no* changes to Un*x, best when the additional gore of the new superstructure can be swamped in the gore of a large system within which the my-malloc is being distributed (hi Chris!). 2. Require something like malloc.h to include not only the function headers for malloc and free, but either one of these two: #define ALIGNTYPE double -or- #define ALIGNSIZE 8 The definitions should be obvious. This requires only the provenance of the appropriate header file (and unfortunately standardization of the particulars of the implementation... :-). 3. Define the Malloc package portably in C, so the information required can be extracted from the underlying data base. The problem with this approach is that the primitives upon which Malloc is based, (e.g. sbrk() ) would then have to be standardized (hey, maybe this posting belongs in this [net.arch] group after all :-). If anybody is going to do this, and would like to use a Fibonacci number variant of the binary buddy system, see Rick's and my article in CACM: "A Simpified Recombination Scheme for the Fibonacci Buddy System", I think it was either June or July of 1975... > Of course, malloc() knows all along how big the allocated region > is (though the number it knows might be larger than the original > request); so it is a shame to have to go through all this effort. Yes, but I think there is a question whether you want to tell the user about 1. how much he ASKED for, or 2. how much happens to BE there... I think there is an argument for either side, so perhaps there needs to be two ways of doing this. Feeping Creaturism??? -- umd5.UUCP <= {seismo!umcp-cs,ihnp4!rlgvax}!cvl!umd5!zben Ben Cranston zben @ umd2.UMD.EDU Kingdom of Merryland Sperrows 1100/92 umd2.BITNET "via HASP with RSCS"
desj@brahms.BERKELEY.EDU (David desJardins) (06/24/86)
I can't believe all of this discussion about clever tricks to try to guess how malloc does alignment, or what the maximum alignment boundary is, or whatever. No matter what you do along these lines, it is going to be possible to devise a perfectly valid C implementation in which it will fail. It seems obvious that the solution to the whole problem is to define a structure which contains two fields: the pointer to memory returned by malloc, and an int (or long int, or whatever) which contains the size of the block requested from malloc. This is guaranteed to work, as opposed to the many alternate implementations that have been proposed and are guaranteed to fail on at least some architectures. Note that in a strongly-typed language you would have had to do the right thing from the beginning... -- David desJardins
ark@alice.UucP (Andrew Koenig) (06/24/86)
There is a very simple way to do this portably: struct chunk { unsigned size; char *mem; }; struct chunk challoc (n) unsigned n; { struct chunk ch; ch.size = n; ch.mem = malloc (n); return ch; } void chfree (ch) struct chunk ch; { if (ch.mem) free (ch.mem); } Now rewrite your program to deal with memory through chunk structures instead of through simple pointers. It should be possible to make this easier by using macros.
franka@mmintl.UUCP (Frank Adams) (06/24/86)
In article <2216@peora.UUCP> jer@peora.UUCP writes: ~>determine alignment by difference between consecutive malloc'ed objects >you're not guaranteed that the two consecutive mallocs will give >consecutive blocks of memory -- they certainly won't be likely to after >some other mallocs and frees have been done. It's hard to see a situation >where the initial allocations wouldn't be consecutive, but it's not >guaranteed, and that's sufficient reason to have doubts. (1) If you are using the "buddy system"[1] for storage allocation, and the total storage available is not a power of two, initial allocations might well not be consecutive. (2) Some version of malloc have header information about the allocated block immediately before the block. For such versions, the distance between blocks always includes at least one such header. (3) Storage may be allocated and freed in the process of starting the program, before your code gets control. (4) Storage for all programs may be allocated from a single heap, so that other tasks would have done malloc's and free's before (or even between) yours. In short, this is completely unreliable. Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Multimate International 52 Oakland Ave North E. Hartford, CT 06108 [1] For a description of the buddy system, see Knuth, _The_Art_of_Computer _Programming,_vol._1,_Fundamental_Algorithms_.
tanner@ki4pv.UUCP (Tanner Andrews) (06/24/86)
The problem with alignments, it would appear to me, would be easily solved by allowing for the largest machine type, which should be double. Thus (cast the routine type as needed when you call it): /* allocate core -- must free using my_free() */ char *my_malloc(sz) short int sz; { register char *p; if (p = malloc(sz+sizeof(double))) { *((short int *)p) = sz; p += sizeof(double); } return(p); } /* size of allocated object -- works for any p such that p is return * value from my_malloc * which has not been my_free'd */ short int my_sizeof(p) char *p; { return(*((short int *)(p - sizeof(double))) } -- <std dsclm, copies upon request> Tanner Andrews
jiml@uwslh.UUCP (Jim Leinweber) (06/24/86)
One nasty efficiency quirk probably ought to be mentioned. The schemes which add extra user information to remember the size of the object are okay, modulo some pointer alignment details. Care with the size of the objects may be needed if you are allocating lots of them, though. Some memory allocators allocate things in fixed sizes, such as power-of-two minus some internal overhead. If a program is allocating funny sized buffers scaled to cooperate with brain damaged allocators, and then adds extra overhead, the resulting size may cause poor allocators to waste lots of space. Of course, 'knowing' the size of blocks actually allocated is not portable. But if you are tight on address space or physical memory, it could matter. Jim Leinweber usenet: jiml@uwslh.uucp State Hygiene Laboratory {seismo, harvard, topaz, ihnp4,...}! University of Wisconsin uwvax!uwslh!jiml 465 Henry Mall internet: uwslh!jiml@wisc.edu Madison, Wisconsin, USA 53706 phone: (608) 262-8092 -- James E. Leinweber ...!{seismo,harvard,topaz,ihnp4}!uwvax!uwslh!jiml Wisconsin State Hygiene Lab (608) 262-8092 University of Wisconsin; 465 Henry Mall; Madison, WI 53706
arnold@ucsfcgl.UUCP (Ken Arnold%CGL) (06/25/86)
In article <2081@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes: >In article <2206@peora.UUCP>, jer@peora.UUCP (J. Eric Roskos) writes: >>... allocate a sizeof(int) worth of extra space, then store the >>size of the thing you malloc'ed in the int at the front of the >>allocated block, advance the pointer past the place where your >>stored the size, and return that as the pointer to the block you >>allocated. >> > >Unfortunately, it is not necessarily portable. I can imagine a >machine where `double' arguments must be aligned on an eight-byte >boundary, but integers are only four bytes wide. In this case >calling the new routine to allocate doubles would result in an >odd-address-style trap when the returned value is used. So put the int at the end of the space, after increasing the size to be allocated by sizeof (int) + enough space to align the int on a byte which is a multiple of sizeof (int). Then return the original malloc()ed pointer. Can we say "hack"? Ken Arnold
ark@alice.UucP (Andrew Koenig) (06/25/86)
> One nasty efficiency quirk probably ought to be mentioned. The schemes which > add extra user information to remember the size of the object are okay, > modulo some pointer alignment details. Care with the size of the objects > may be needed if you are allocating lots of them, though. Some memory > allocators allocate things in fixed sizes, such as power-of-two minus some > internal overhead. If a program is allocating funny sized buffers scaled to > cooperate with brain damaged allocators, and then adds extra overhead, the > resulting size may cause poor allocators to waste lots of space. Of course, > 'knowing' the size of blocks actually allocated is not portable. But if you > are tight on address space or physical memory, it could matter. Translated into English, this says: Some memory allocators are broken. Thus we should all warp our programs into forms that work well with these broken memory allocators.
jer@peora.UUCP (J. Eric Roskos) (06/25/86)
> Yes, but I think there is a question whether you want to tell the user > about 1. how much he ASKED for, or 2. how much happens to BE there... The only portable value is how much he asked for. If he got more than that, it is just an artifact of the memory allocation scheme, which would be implementation-dependent, and thus he shouldn't use it; allowing him to know how much extra he really got is a step in the direction of promoting him to use it, which would seem not to be a good idea. -- E. Roskos
arnold@ucsfcgl.UUCP (Ken Arnold%CGL) (06/26/86)
In article <9894@ucsfcgl.ucsfcgl.UUCP> I write: >So put the int at the end of the space, after increasing the size >to be allocated by sizeof (int) + enough space to align the int >on a byte which is a multiple of sizeof (int). Then return the >original malloc()ed pointer. > >Can we say "hack"? I don't know what came over me. I only wish I could say that I was on drugs at the time I wrote this, since at least my embarassment could have been partially compensated by the entertainment. Please forgive my obvious error, and just chalk it up to a long, tired day. Ken Arnold
desj@brahms.BERKELEY.EDU (David desJardins) (06/26/86)
In article <9894@ucsfcgl.UUCP> arnold@ucsfcgl.UUCP (Ken Arnold) writes: >So put the int at the end of the space, after increasing the size >to be allocated by sizeof (int) + enough space to align the int >on a byte which is a multiple of sizeof (int). Then return the >original malloc()ed pointer. This is the silliest thing I've seen yet. If you have a pointer to the beginning of a region, and store the size of the region at the end of the region, how are you ever going to be able to recover this information?? -- David desJardins
chris@umcp-cs.UUCP (Chris Torek) (06/26/86)
In article <6922@ki4pv.UUCP> tanner@ki4pv.UUCP (Tanner Andrews) writes: >The problem with alignments, it would appear to me, would be easily >solved by allowing for the largest machine type, which should be >double. Why? If you mean to state that in every C compiler now available, the largest machine type is `double', then I might believe you. But if you mean that the largest machine type in any future C compiler will always be `double'---well, to put it mildly, I doubt it. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516) UUCP: seismo!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@mimsy.umd.edu
flaps@utcs.UUCP (06/27/86)
A way to get a sufficient alignment constant should be as follows:
union ALIGNED {
char garbage0;
unsigned char garbage1;
short garbage2;
int garbage3;
unsigned garbage4;
long garbage5;
unsigned long garbage6;
float garbage7;
double garbage8;
};
(sorry if I've left out any)
Then you can use sizeof(ALIGNED) as the alignment constant. Am I mistaken?
The problem is that this number might be greater than the actual
alignment constant, for example if a "double" has length 8 but only
needs quad-word alignment.
--
Alan J Rosenthal
{decvax|cbosgd|linus}!ihnp4!utcs!flaps, utzoo!utcs!flaps
greg@utcsri.UUCP (Gregory Smith) (06/27/86)
In article <1986Jun27.01:36:11.4839@utcs.uucp> flaps@utcs.UUCP (Alan J Rosenthal) writes: > >A way to get a sufficient alignment constant should be as follows: > >union ALIGNED { > char garbage0; > unsigned char garbage1; > short garbage2; > int garbage3; > unsigned garbage4; > long garbage5; > unsigned long garbage6; > float garbage7; > double garbage8; >}; > >(sorry if I've left out any) > >Then you can use sizeof(ALIGNED) as the alignment constant. Am I mistaken? ^ union >The problem is that this number might be greater than the actual >alignment constant, for example if a "double" has length 8 but only >needs quad-word alignment. > How about: struct foo{ char ignatz; union ALIGNED wumpus; /* union ALIGNED as above */ }; then the most strict alignment is given by sizeof( struct foo ) - sizeof( union ALIGNED ) Interesting aside: I first thought of declaring 'struct foo xyzzy' and using: ((char *)&xyzzy.wumpus - (char*)&xyzzy) The problem is that this is not a run time constant, but it could be if the semantics of '-' in constant expressions allowed pointer subtraction if both pointers were based at the same label. The portability of this is dubious. The interesting part: int i = ((char *)&xyzzy.wumpus - (char*)&xyzzy); this gave the following on PCC: "test.c", line 20: compiler error: expression causes compiler loop: try simplifying -- "Shades of scorpions! Daedalus has vanished ..... Great Zeus, my ring!" ---------------------------------------------------------------------- Greg Smith University of Toronto UUCP: ..utzoo!utcsri!greg
levy@ttrdc.UUCP (Daniel R. Levy) (06/29/86)
In article <9894@ucsfcgl.ucsfcgl.UUCP>, arnold@ucsfcgl.UUCP writes: >In article <2081@umcp-cs.UUCP> chris@umcp-cs.UUCP (Chris Torek) writes: >So put the int at the end of the space, after increasing the size >to be allocated by sizeof (int) + enough space to align the int >on a byte which is a multiple of sizeof (int). Then return the >original malloc()ed pointer. > >Can we say "hack"? > > Ken Arnold Now tell me please, how is one supposed to get at the int containing the size later on (only knowing the pointer)? You have to know the size in order to calculate the offset to reach the size data! Hardly sounds useful (that's why the other suggestions have been to PREpend the size data, so its location is known, though details have to be ironed out which has been fueling much of the debate).
gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/29/86)
In article <9895@ucsfcgl.ucsfcgl.UUCP> arnold@ucsfcgl.UUCP (Ken Arnold%CGL) writes: -In article <9894@ucsfcgl.ucsfcgl.UUCP> I write: ->So put the int at the end of the space, after increasing the size ->to be allocated by sizeof (int) + enough space to align the int ->on a byte which is a multiple of sizeof (int). Then return the ->original malloc()ed pointer. - -I don't know what came over me. I only wish I could say that I was on -drugs at the time I wrote this, since at least my embarassment could -have been partially compensated by the entertainment. Please forgive -my obvious error, and just chalk it up to a long, tired day. I dunno, Ken -- I thought the idea of putting the size at the end of the data was pretty amusing. It might be worth observing that several requests to these nets take the general form: "How do I do X? Here's what I tried so far..." where "X" is only part of a solution to the REAL problem. Usually, there are other solutions to the REAL problem that do not have "X" as a subcomponent. It often happens that solving "X" is much harder than solving the REAL problem. So, if you find yourself having difficulties like this, it may be that you've prematurely settled on a difficult approach and that it might be worthwhile to step back and consider whether there is some simpler, more straightforward method. Now, if you find yourself truly needing to keep track of the size of dynamically-allocated chunks of data space, you can take either of two good general approaches. The first is: typedef struct { void *loc; size_t size; } alloc_info; /* this should be in a header file */ alloc_info * my_alloc( size_t length ) /* use this instead of malloc */ { extern void free(void *), *malloc(size_t); register alloc_info *p; if ( (p = (alloc_info *)malloc( sizeof(alloc_info) )) != 0 && (p->loc = malloc( p->size = length )) == 0 ) { free( (void *)p ); p = 0; } return p; } and the second is to supply one's own memory allocator (Knuth's "boundary tag" method is particularly suitable for this application). Make sure that your allocator can coexist with malloc(), since the C library may have to invoke malloc(). One way to guarantee this is to rip off a large pool of memory with malloc() early on, then have one's own allocator restricted to allocating from this pool. If you're more ambitious, you could try implementing your own virtual memory scheme using buffers into a disk file (this makes data access more complicated than simply using pointers, however). The best implementation I ever had was under single-job RT-11, where I could directly manipulate the KT-11 memory management hardware registers. (Not feasible under UNIX, but perhaps worth considering for stand-alone applications.)
jer@peora.UUCP (J. Eric Roskos) (06/29/86)
> It seems obvious that the solution to the whole problem is to define a > structure which contains two fields: the pointer to memory returned by malloc, > and an int (or long int, or whatever) which contains the size of the block > requested from malloc. This is guaranteed to work, as opposed to the many > alternate implementations that have been proposed and are guaranteed to fail > on at least some architectures. > Note that in a strongly-typed language you would have had to do the right > thing from the beginning... Wouldn't malloc() be something of a hole in a strongly-typed language anyway? I mean, if you are going to allocate an object in a strongly-typed language, you should have to allocate an object of a particular type, not just "a block of memory that is aligned on the worst-case alignment boundary". The above "obvious" solution has the shortcoming that either you return a pointer to the structure (in which case the memory allocator doesn't work like standard malloc), or you have to have a way to coordinate the structure with the allocated block of memory: when the user frees a block, you have to find the structure given only the address of the block -- which is stored in the structure, but not in the block you're freeing. From this discussion I have become firmly convinced (well, semi-firmly) that this reflects a problem in the C language. Specifically, C does allow constraints on alignment, which is reasonable, but there isn't any way to find out anything about alignment from within the language. The ANSI C folks should look into that. Maybe there should be an operator "alignment" that works the same as "sizeof" but gives the alignment required for a specified type; and some defined constant (like MAXALIGN) that gives the worst-case alignment. For example, how can you write a portable malloc? After all, that's really what we've been trying to do here. You can argue "malloc is sort of part of the language so it is OK for it to have to have changes made to it when you port it to a different machine," (or put "is too close to the hardware" in place of "is part of the language", or some similar argument) but the problem is, that seems to give malloc an unduly privileged status. What if you do want to write your own malloc? Malloc really isn't "part of the language," and there's no reason why it should have to be considered "too close to the hardware" except inasmuch as there is this problem with C that you have to tell your program something you've determined from an outside source that couldn't be found out from within the language itself. But of course the compiler knows about alignment requirements, and we have here a demonstrated "need to know," only the compiler won't tell you what it knows because there's nothing in C to let it do so! Well, the above is my argument for adding information about alignment to the C language. -- E. Roskos
mash@mips.UUCP (John Mashey) (06/30/86)
Although the topic is interesting, I suggest that it's not so crucial as to be posted to to 4 groups, whose readership overlaps heavily anyway. The discussion has gotten especially far from anything related to net.arch. How about moving this to net.lang.c? -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: {decvax,ucbvax,ihnp4}!decwrl!mips!mash, DDD: 408-720-1700, x253 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
aglew@ccvaxa.UUCP (07/01/86)
... > alignment E. Rosko is right - C needs to be able to specify alignment better than double. Of course it's non-portable, but C is used to write one hell of a lot of machine dependent programs in an almost portable manner. I've needed to be able to specify an alignment spec in programs where it really made a big difference to have data structures aligned on cache boundaries. Doesn't Ada have an alignment pragma? If not Ada, then some flavour of Pascal that I've seen somewhere in my travels. Andy "Krazy" Glew. Gould CSD-Urbana. USEnet: ihnp4!uiucdcs!ccvaxa!aglew 1101 E. University, Urbana, IL 61801 ARPAnet: aglew@gswd-vms
franka@mmintl.UUCP (Frank Adams) (07/01/86)
In article <2219@peora.UUCP> jer@peora.UUCP writes: >> Yes, but I think there is a question whether you want to tell the user >> about 1. how much he ASKED for, or 2. how much happens to BE there... > >The only portable value is how much he asked for. If he got more than >that, it is just an artifact of the memory allocation scheme, which would >be implementation-dependent, and thus he shouldn't use it; allowing him to >know how much extra he really got is a step in the direction of promoting >him to use it, which would seem not to be a good idea. Depends on what you mean by "portable". Certainly, it is possible to use this number in an implementation-dependent way; but there are natural and non-implementation-dependent ways to use the information. E.g., if one is allocating character strings and one of them grows, one can try to fit the expanded string into the allocated size, and only reallocate (and reset all the pointers to the data) if it won't fit. The philosphy of C is to give the programmer access to information if there are reasonable ways to use it, not to try to withhold the information if there are unreasonable ways to use it. Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Multimate International 52 Oakland Ave North E. Hartford, CT 06108
jer@peora.UUCP (J. Eric Roskos) (07/03/86)
> Depends on what you mean by "portable". Certainly, it is possible to use > this number in an implementation-dependent way; but there are natural and > non-implementation-dependent ways to use the information. E.g., if one is > allocating character strings and one of them grows, one can try to fit the > expanded string into the allocated size, and only reallocate (and reset all > the pointers to the data) if it won't fit. Well, actually it depends on some other factors, just fairly abstract ones (and thus debatable). According to the System V manual, "Malloc returns a pointer to a block of at least _size_ bytes suitably aligned for any use." Notice that this places no upper bound on the size that's actually allocated. Now, you can argue "well, obviously it can't be more than the number of bytes representable in an int", or something of that nature, but such an argument wouldn't hold water in any formal proof. The only thing it says is *at least* that many bytes. The problem is that, unless the definition is constrained more, it makes it impossible to prove a program that uses the "actual size allocated" correct. Suppose you store the actual size allocated in an int in your program. Then I can argue that on some implementation, the maximum actual size allocated requires a long int. If you use a long int, I can argue that it requires 2* the number of bits in a long int. And so on. Since there's no upper bound specified for the size, there's always a risk of truncation. This seems to be somewhat abstract and theoretical, but the problem is really that there's an "unknown variable" involved that's outside the program, and it makes it hard to make assertions about the program's portability to other environments since the variable changes according to the environment. You can think of more realistic examples; e.g., the situation where you want to save space by storing the size in a small number of bits in a word, and it'll work fine for one implementation but not another; but that gets into other portability problems of C (e.g., not knowing how many bits is really in an "int", etc.) and it wasn't my intent to be arguing over the existing problems with C, just the ones that might be avoided in a new feature. I agree, however, that for "most cases" the "actual size allocated" might be useful. But it's difficult to see why, other than to avoid wasted space, you would really need it; and my original reasoning in my original comment was that you shouldn't provide features that introduce these sorts of problems just out of a "liberty and freedom" approach to C -- I think you shouldn't introduce arbitrary constraints, but there's a fine but definite line between arbitrary constraints, in many cases, and purposeful ones. It's just that the purposeful ones are sometimes esoteric. If you wonder why anyone would be concerned with formal proofs, it's just that more and more "real world" applications (especially those for the gov't) require them, and the inability to do them well for C is one of the reasons so many people favor Ada. Personally I like C and would rather have a "disciplined C". -- E. Roskos (See net.net-people.)
franka@mmintl.UUCP (Frank Adams) (07/09/86)
In article <2233@peora.UUCP> jer@peora.UUCP writes: >[Frank Adams] >> Depends on what you mean by "portable". Certainly, it is possible to use >> this number in an implementation-dependent way; but there are natural and >> non-implementation-dependent ways to use the information. E.g., if one is >> allocating character strings and one of them grows, one can try to fit the >> expanded string into the allocated size, and only reallocate (and reset all >> the pointers to the data) if it won't fit. > >Well, actually it depends on some other factors, just fairly abstract ones >(and thus debatable). According to the System V manual, "Malloc returns a >pointer to a block of at least _size_ bytes suitably aligned for any use." > >The problem is that, unless the definition is constrained more, it makes >it impossible to prove a program that uses the "actual size allocated" >correct. So constrain the definition more. It isn't written in stone. I dare say that there is no system in existence that would fail under the assumption that the actual size allocated fits in a long -- probably none would fail assuming an int. Compared to the total task of dealing with the C environment in such a way that correctness proofs become possible, this is a very tiny issue, easily dealt with. Frank Adams ihnp4!philabs!pwa-b!mmintl!franka Multimate International 52 Oakland Ave North E. Hartford, CT 06108