tarvaine@tukki.jyu.fi (Tapani Tarvainen) (06/19/89)
What exactly does pANS say about size_t?  In Turbo C 2.0 it is
defined as unsigned int in all memory models, yet in huge model
array indices are long.  Is this a bug?
(If not, what is size_t good for, anyway?)
-- 
Tapani Tarvainen            BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi
dfp@cbnewsl.ATT.COM (david.f.prosser) (06/20/89)
In article <934@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>What exactly does pANS say about size_t?  In Turbo C 2.0 it is
>defined as unsigned int in all memory models, yet in huge model
>array indices are long.  Is this a bug?
>(If not, what is size_t good for, anyway?)

Section 4.1.5 of the pANS:

	The types [defined in <stddef.h>] are ...

		size_t

	which is the unsigned integral type of the result of the sizeof
	operator ...

It is also defined in <stdio.h>, <stdlib.h>, <string.h> and <time.h>.

Dave Prosser	...not an official X3J11 answer...
karl@haddock.ima.isc.com (Karl Heuer) (06/20/89)
In article <934@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>What exactly does pANS say about size_t?

It says (among other things) that |size_t| is big enough to hold the size
of the largest declarable object.

>In Turbo C 2.0 it is defined as unsigned int in all memory models, yet in
>huge model array indices are long.  Is this a bug?

Yes.  In huge model, |size_t| should be |unsigned long int|, and
|ptrdiff_t| should be |long int|.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
manderso@ugly.cs.ubc.ca (mark c anderson) (06/20/89)
In article <934@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>What exactly does pANS say about size_t?  In Turbo C 2.0 it is
>defined as unsigned int in all memory models, yet in huge model
>array indices are long.  Is this a bug?

As has already been noted, size_t is defined as the "unsigned integral type
of the result of the sizeof operator", i.e. unsigned int (at least in this
case).  I'm not sure how Turbo C handles the huge memory model, but I was
interested to read how Microsoft deals with it: if you cast the result of a
sizeof operation on a huge array to unsigned long, the correct result is
produced.  A similar extension allows you to cast the result of a
pointer-difference operation on huge pointers to long, and get the desired
result, i.e.

	char huge *p;
	char huge *q;	/* "char huge *p, *q;" would make q a
			   default-sized pointer, not a huge one */
	long size;
	...
	size = (long) (p - q);

---
Mark Anderson <manderso@ugly.cs.ubc.ca>
{att!alberta,uw-beaver,uunet}!ubc-cs!{good,bad,ugly}!manderso
Am I suspended in Gaffa?
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (06/20/89)
In article <845@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>Section 4.1.5 of the pANS:
>
>	The types [defined in <stddef.h>] are ...
>
>		size_t
>
>	which is the unsigned integral type of the result of the sizeof
>	operator ...
>
>It is also defined in <stdio.h>, <stdlib.h>, <string.h> and <time.h>.

If this is all pANS says about it, Turbo C's behaviour is legal.
As static objects can't exceed 64K, the sizeof result will always
fit in unsigned int; but in huge model one can have a dynamic
array with more than 64K elements.  Therefore:

What should be done when one has an array whose indices may not fit
in an int?  Is there a suitable type for that in pANS?

If I use long, TC will warn that "Conversion may lose significant digits"
every time in models other than huge, not to mention that it is a waste
of resources, but in huge model int won't do.  Suggestions, anyone?

(BTW, thank you David for informative answers, even though the info on
realloc was somewhat unfortunate: it seems I'll need an extra variable
for every pointer into the buffer that is reallocated.)
-- 
Tapani Tarvainen            BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi
geoff@cs.warwick.ac.uk (Geoff Rimmer) (06/21/89)
In article <845@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
> Section 4.1.5 of the pANS:
>
>	The types [defined in <stddef.h>] are ...
>
>		size_t
>
>	which is the unsigned integral type of the result of the sizeof
>	operator ...
>
> It is also defined in <stdio.h>, <stdlib.h>, <string.h> and <time.h>.

Does this mean that size_t should be a #define rather than a typedef?
If it were a typedef, and I #include <stdio.h> AND <stdlib.h> (which
is a perfectly reasonable thing to do!), I would get errors.

> Dave Prosser	...not an official X3J11 answer...

/---------------------------------------------------------------\
|	GEOFF RIMMER						|
|	email	: geoff@cs.warwick.ac.uk			|
|		  geoff@uk.ac.warwick.cs			|
|	address	: Computer Science Dept, Warwick University,	|
|		  Coventry, England.				|
|	PHONE	: +44 203 692320				|
|	FAX	: +44 865 726753				|
\---------------------------------------------------------------/
"First one I've had in twenty years and I won't be here to enjoy it."
	- Filthy, "Filthy Rich and Catflap" (best comedy series EVER)
karl@haddock.ima.isc.com (Karl Heuer) (06/27/89)
In article <2284@ubc-cs.UUCP> manderso@ugly.cs.ubc.ca (mark c anderson) writes:
>size_t is defined as the "unsigned integral type of the result of the sizeof
>operator", i.e. unsigned int (at least in this case).

Which just means that (in the case of TC huge model) both |size_t| and
|sizeof| are wrong.  If the compiler allows you to create an object larger
than 65535 bytes, then |size_t| should not be a 16-bit type.

>I'm not sure how Turbo C handles the huge memory model, but I was interested
>to read how Microsoft deals with it: if you cast the result of a sizeof
>operation on a huge array to unsigned long, the correct result is produced.

A hollow voice says, "Kludgh".

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
gwyn@smoke.BRL.MIL (Doug Gwyn) (06/28/89)
In article <GEOFF.89Jun21011005@onyx.cs.warwick.ac.uk> geoff@cs.warwick.ac.uk (Geoff Rimmer) writes:
>Does this mean that size_t should be a #define rather than a typedef?

No, size_t must be a genuine type name.

>If it were a typedef, and I #include <stdio.h> AND <stdlib.h> (which
>is a perfectly reasonable thing to do!), I would get errors.

It is the implementor's job to make sure that there is no such problem.
I think it makes an interesting exercise to figure out how this can be
implemented.
dfp@cbnewsl.ATT.COM (david.f.prosser) (07/01/89)
In article <941@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>In article <845@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>>Section 4.1.5 of the pANS:
>>
>>	The types [defined in <stddef.h>] are ...
>>
>>		size_t
>>
>>	which is the unsigned integral type of the result of the sizeof
>>	operator ...
>>
>>It is also defined in <stdio.h>, <stdlib.h>, <string.h> and <time.h>.
>
>If this is all pANS says about it, Turbo C's behaviour is legal.
>As static objects can't exceed 64K, the sizeof result will always
>fit in unsigned int; but in huge model one can have a dynamic
>array with more than 64K elements.

That is essentially all that the pANS says directly about size_t, but that
doesn't mean that you can draw the following conclusions.  The sizeof
operator must be able to express the size, in bytes, of any object created
by a strictly conforming program.  If such a program can allocate an object
at runtime through malloc, calloc, or realloc that is too big for the
sizeof operator, then the implementation is not conforming.  Since the
constraints for size_t are the same as for sizeof (at least in terms of
representation), size_t must be big enough to hold the number of bytes of
any validly created object.

>Therefore:
>
>What should be done when one has an array whose indices may not fit
>in an int?  Is there a suitable type for that in pANS?

By the pANS, size_t should work, for any strictly conforming program.  Of
course, if there is a "hugealloc()" function provided which is the only
access to objects that are bigger than what sizeof or size_t can describe,
this is still a conforming implementation.  If a program makes use of such
a function, then a larger than size_t integral type would be necessary.

>If I use long, TC will warn that "Conversion may lose significant digits"
>every time in models other than huge, not to mention that it is a waste
>of resources, but in huge model int won't do.
>Suggestions, anyone?

I, personally, dislike compilers that try to be "lint" at the same time,
but that may be my UNIX system biases showing through.  A conversion of a
larger integral type to a smaller unsigned type (such as size_t) is well
defined, even if the value being converted doesn't "fit".  It may be that
TC will be quiet if you put an explicit cast on the assignments.

>(BTW, thank you David for informative answers, even though the info on
>realloc was somewhat unfortunate: it seems I'll need an extra variable
>for every pointer into the buffer that is reallocated.)

I've found that often it is better to use pointers to such buffers only in
local situations and to keep only offsets in the shared (file scope) parts.
When a buffer is then realloc()ed, there are no readjustments of a slew of
pointers.  Another favorite approach is to determine, if possible, the
maximum extent necessary for the buffer based on early information (for
example, by knowing the size of the input file), and never needing to grow
the buffer at all.  This can save a lot of program complexity.

Dave Prosser	...not an official X3J11 answer...
dfp@cbnewsl.ATT.COM (david.f.prosser) (07/06/89)
In article <971@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes:
>Something related which I would call a bug is the behaviour of
>calloc() that e.g., calloc(1000,1000) won't give an error or NULL but
>silently truncates the product to 16960 (== 1000000 & 0xFFFF) and
>allocates that amount.  What does the pANS say about overflow handling
>in this situation?

There is a general statement in section 4.1.6 for the arguments to the
library functions.  It allows undefined behavior in the library if a
function is passed arguments with invalid values, or values outside of
the function's domain.

Since calloc() must produce an object with no more bytes than can be
counted in a size_t, a pair of arguments that, while individually valid,
cannot be multiplied to produce a result that fits in a size_t will
result in undefined behavior for calloc().  If there were some special
part of calloc()'s description that constrained the function to handle
this case, the behavior would be otherwise.

Dave Prosser	...not an official X3J11 answer...
roelof@idca.tds.PHILIPS.nl (R. Vuurboom) (07/11/89)
In article <1003@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes: >In article <971@tukki.jyu.fi> tarvaine@tukki.jyu.fi (Tapani Tarvainen) writes: >>Something related which I would call a bug is the behaviour of >>calloc() that e.g., calloc(1000,1000) won't give an error or NULL but >>silently truncates the product to 16960 (== 1000000 && 0x0ffff) and >>allocates that amount. What does the pANS say about overflow handling >>in this situation? >>-- > >There is a general statement in section 4.1.6 for the arguments to the >library functions. It allows undefined behavior in the library if a >function is passed arguments with invalid values, or values outside of >the function's domain. True but not particularly relevant (me thinks) since each argument _is_ valid and well within (I hope!) the domain that size_t can handle. >Since calloc() must produce an object with no more bytes than can be >counted in a size_t, Care to quote the relevant sentence? Doesn't seem to have made it into my Jan 88 draft :-). Quoting from the Mark Williams Ansi C - A lexical guide: calloc allocates a portion of memory large enough to hold count items, each of which is size bytes long. It then initializes every byte within the portion to zero. calloc returns a pointer to the portion allocated. The pointer is aligned for any type of object. If it cannot allocate the amount of memory requested, it returns NULL. My guess is that the above implementation is broken. In the above case, if you're using calloc to allocate memory for an array then you can't use sizeof to find the size of your array (in bytes) since sizeof returns size_t. Could this be the source of confusion? >Dave Prosser ...not an official X3J11 answer... Roelof Vuurboom ...still not an official X3J11 answer... -- Roelof Vuurboom SSP/V3 Philips TDS Apeldoorn, The Netherlands +31 55 432226 domain: roelof@idca.tds.philips.nl uucp: ...!mcvax!philapd!roelof
dfp@cbnewsl.ATT.COM (david.f.prosser) (07/11/89)
In article <149@ssp1.idca.tds.philips.nl> roelof@idca.tds.PHILIPS.nl (R. Vuurboom) writes:
>In article <1003@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>>There is a general statement in section 4.1.6 for the arguments to the
>>library functions.  It allows undefined behavior in the library if a
>>function is passed arguments with invalid values, or values outside of
>>the function's domain.
>
>True but not particularly relevant (me thinks) since each argument _is_
>valid and well within (I hope!) the domain that size_t can handle.

True, each argument value is within the range of values representable by
size_t, but that's not sufficient in this case.

>>Since calloc() must produce an object with no more bytes than can be
>>counted in a size_t,
>
>Care to quote the relevant sentence?  Doesn't seem to have made it into my
>Jan 88 draft :-).

I did slightly misstate the above.  Let me go into the argument in more
detail.  What you are probably looking for is a statement somewhere in the
memory allocation portion of the pANS (section 4.10.3) that explicitly
requires that any allocated object's size must be no bigger than can be
sized by size_t, or that the multiplication of the arguments must not be
bigger than a size_t.  You won't find such a statement simply because this
is not the way the pANS is written.  Instead, the pANS handles dynamically
allocated objects just the same as other objects, as much as possible.

The relevant part of the pANS is that section 3.3.3.4 requires that the
sizeof operator evaluate to the size of its operand in bytes, and that the
type of a sizeof expression is an unsigned integral type--the same as the
typedef size_t.  It is possible for calloc to allocate an object bigger
than can be described by size_t, but it is not required to do so, just as
an implementation can choose not to accept a request for a statically
allocated object bigger than can be described by size_t.  (There is a
requirement that objects of at least 32767 bytes must be accepted.)

As a consequence, a strictly conforming program cannot request an object
bigger than 32767 bytes.  Given the semantics of C, calloc(1000,1000) is
such a request.  (It takes a bunch of references before this can be fully
supported, but I'm taking this as given for this discussion.)  Therefore,
the portable domain of calloc has been left behind.  Since there are no
explicit statements that override the generic "behavior is undefined given
out-of-bounds arguments" for calloc, an implementation has no behavior
constraints.  In fact, a valid implementation can "choose" to "core dump"
when the multiplication overflows in calloc!

Thus, my statement that calloc cannot allocate an object larger than can
be sized by size_t was inaccurate: a strictly conforming program cannot
attempt to allocate an object bigger than size_t as in the example because
the behavior of the library is undefined.

>Quoting from the Mark Williams Ansi C - A Lexical Guide:
>
>	calloc allocates a portion of memory large enough to hold count
>	items, each of which is size bytes long.  It then initializes
>	every byte within the portion to zero.
>
>	calloc returns a pointer to the portion allocated.  The pointer
>	is aligned for any type of object.  If it cannot allocate the
>	amount of memory requested, it returns NULL.

This is how calloc must behave when given strictly conforming argument
values.  Since calloc(1000,1000) cannot be part of a strictly conforming
program, the implementation can choose to behave in virtually any manner,
including exec'ing "rogue", as has been noted in earlier postings.

>My guess is that the above implementation is broken.
>
>In the above case, if you're using calloc to allocate memory for an array
>then you can't use sizeof to find the size of your array (in bytes) since
>sizeof returns size_t.  Could this be the source of confusion?

In any strictly conforming program, sizeof *must* be able to return the
number of bytes in *any* object.  The pANS only describes the behavior of
strictly conforming programs, and translators that accept all strictly
conforming programs.  Since, as I have argued above, a program that
contains a call to calloc(1000,1000) that is executed is not strictly
conforming, the pANS does not constrain calloc's behavior.

All of this does not mean that I believe that calloc(1000,1000) should
"not work"; this has all been in the realm of what the pANS requires if
calloc(1000,1000) occurs.  Moreover, as the argument hinges on the less
than strictly conforming nature of the call, and since everything except
the result of the multiplication is strictly conforming, the argument may
well be tenuous.  I, nevertheless, am sticking by my guns.

Dave Prosser	...not an official X3J11 answer... (of course)
gwyn@smoke.BRL.MIL (Doug Gwyn) (07/12/89)
In article <1062@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>It is possible for calloc to allocate an object bigger than can
>be described by size_t, but it is not required to do so, ...
>In fact, a valid implementation can "choose" to
>"core dump" when the multiplication overflows in calloc!

Hey, Dave -- let's not get carried away here!  The Standard requires that
it either allocate storage or, *if the space cannot be allocated*, return
a null pointer.  The meaning of "unable to allocate space" is not
specified, which leaves it up to the implementor.

There is no direct way to feed the actual (dynamically-allocated) object
pointed to by the pointer returned from one of the *alloc() routines to
the sizeof operator, so there is no operational way that the semantics of
sizeof can be "probed" by the huge object that we're presuming may be
actually allocated.  One can try to cast the pointer from void* into a
pointer to a huge array, but that permits compile-time determination that
the object size limit (e.g. representability in a size_t) is being
violated, and in any case is something that would only be done AFTER
*alloc() is called.  I see no reason to say that all these size
considerations prevent the implementation of *alloc() from doing its job
properly, i.e. either correctly allocating contiguous storage or else
returning a null pointer.

>... a strictly conforming program cannot attempt to allocate an object
>bigger than size_t as in the example because the behavior of the library
>is undefined.

I don't see that at all, unless related to the 32,767 bytes mentioned in
2.2.4.1 (about which more below).  sizeof is not required to work properly
when applied to huge objects, but if a program avoids trying to do that I
don't see that it becomes non-conforming just because it handles huge
objects (that WOULD break sizeof IF fed to it, but which AREN'T).

>In any strictly conforming program, sizeof *must* be able to return the
>number of bytes in *any* object

... to which it is applied!  I don't see how a program could be considered
to be in violation for something that it doesn't do.

Re: 2.2.4.1:  The 32,767-byte object size can be argued to be among the
"minimum implementation limits" that a strictly conforming program shall
not exceed.  On the other hand, one can argue that *alloc() is obliged to
return a null pointer if an implementation "hard limit" really would be
exceeded by the *alloc() request.  This issue probably deserves
consideration by X3J11 for a formal ruling.

I would be most upset if *alloc() gave me a non-null pointer which then
wouldn't work right.  It is simpler to let calloc() tell me at run time
when I've exceeded such a limit than to have to keep performing such
checks myself in my application code.
roelof@idca.tds.PHILIPS.nl (R. Vuurboom) (07/13/89)
In article <1062@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>In article <149@ssp1.idca.tds.philips.nl> roelof@idca.tds.PHILIPS.nl (R. Vuurboom) writes:
> [Interesting and intricate argument why calloc(1000,1000) could even
>  dump its core on the floor]

phew!

>calloc(1000,1000) occurs.  Moreover, as the argument hinges on the less
>than strictly conforming nature of the call, and since everything except
>the result of the multiplication is strictly conforming, the argument may
>well be tenuous.  I, nevertheless, am sticking by my guns.

Well... as long as you don't shoot yourself in the foot :-)

I think your train of argument is (formally) correct.  I agree with Doug
Gwyn, however, that allowing calloc to fail just to ensure that a
_possible_ sizeof call would succeed seems to be putting the cart before
the horse.  It makes more sense to me to just define nonconformant
behaviour for sizeof on objects larger than size_t can handle.

>Dave Prosser	...not an official X3J11 answer... (of course)

Of course :-)
tarvaine@tukki.jyu.fi (Tapani Tarvainen) (08/09/89)
In article <975@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>			... size_t must be big enough
>to hold the number of bytes of any validly created object.
	...
>Of course, if there is a "hugealloc()" function provided which is the
>only access to objects that are bigger than what sizeof or size_t can
>describe, this is still a conforming implementation.  If a program
>makes use of such a function, then a larger than size_t integral type
>would be necessary.

It turns out this is exactly the case with Turbo C: malloc(), calloc() and
realloc() won't allocate blocks bigger than 64K.  If you need such blocks,
you must use farmalloc(), farcalloc() and farrealloc(), which expect the
block size as a long, so TC appears to be conforming in this respect after
all.

Unfortunately this apparently means there is no standard-conforming way to
create objects bigger than 64K in TC, or indeed to use the huge model in
any useful way.  I do hope Borland does something about this in a future
version of TC: either change the behaviour of huge, or provide a separate
ANSI-huge model where everything is long that needs to be, and pointer
declarations and arithmetic automatically work, so that I can take a
conforming program that needs big blocks and compile it without any
changes, just by setting a compiler option.

Something related which I would call a bug is the behaviour of calloc():
e.g., calloc(1000,1000) won't give an error or NULL but silently truncates
the product to 16960 (== 1000000 & 0xFFFF) and allocates that amount.
What does the pANS say about overflow handling in this situation?
-- 
Tapani Tarvainen            BitNet:    tarvainen@finjyu
Internet:  tarvainen@jylk.jyu.fi  -- OR --  tarvaine@tukki.jyu.fi