[comp.lang.c] malloc

g-rh@cca.CCA.COM (Richard Harter) (04/29/88)

In article <5253@sdcrdcf.UUCP> markb@sdcrdcf.UUCP (Mark Biggar) writes:

>Actually malloc is a case where rolling your own can provide enormous
>speed up in your program.  Do you realize just how much overhead is added
>to malloc just so you can free things?  I wrote a large (20000 lines)
>program that used malloc everywhere.  The longer it ran the slower it got.
>On profiling it I discovered that the program was spending 47% of its time
>in malloc!  The program never freed anything, so I replaced malloc with a
>much simpler one that just gave me some memory and didn't do any of the
>screwy things that the regular malloc does (like chase down a linked list of
>every block ever allocated to see if you might just have freed a block big
>enough to honor the current request).  This gave me a 40%+ speedup in
>the program and the program stopped getting slower.  By putting the simple
>replacement malloc in its own file, I made the program just as portable as
>before because you didn't have to use my malloc.  (Although mine would work
>correctly on both bsd and SV type unix systems.)

As someone else notes, the right thing to do is to make your own routine
with its own name.
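
[A minimal sketch of the kind of never-freeing allocator described in the
quoted article, given its own name as suggested.  The names bump_alloc,
CHUNK_SIZE and ALIGN are illustrative assumptions, not taken from the
original program, and ALIGN is assumed to be strict enough for any object
on the target machine.]

```c
#include <stddef.h>
#include <stdlib.h>

#define CHUNK_SIZE 65536u   /* assumed chunk size; tune to taste */
#define ALIGN      16u      /* assumed strictest alignment on the target */

static char  *chunk_pos;    /* next unused byte in the current chunk */
static size_t chunk_left;   /* bytes remaining in the current chunk */

/* Hand out memory that is never individually freed. */
void *bump_alloc(size_t n)
{
    void *p;

    if (n == 0)
        n = 1;
    if (n > (size_t)-1 - (ALIGN - 1))
        return NULL;                        /* rounding up would overflow */
    n = (n + ALIGN - 1) / ALIGN * ALIGN;    /* round up to a multiple of ALIGN */

    if (n > chunk_left) {
        /* Start a new chunk; whatever was left of the old one is abandoned. */
        size_t want = n > CHUNK_SIZE ? n : CHUNK_SIZE;
        char *c = malloc(want);

        if (c == NULL)
            return NULL;
        chunk_pos = c;
        chunk_left = want;
    }
    p = chunk_pos;
    chunk_pos += n;
    chunk_left -= n;
    return p;
}
```

Each call costs a comparison and a pointer bump, and nothing is ever
searched.  The price is that memory can only be returned to the system
when the program exits.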

The question I have is: is malloc usually implemented as badly as described?
We did our own allocator for sundry reasons, mostly because what we had
heard of malloc implementations was a little disturbing.  Some of the
things that we did were:

(a)	Remove allocation control data from the allocated space so that
an overwrite would not crash the allocator.

(b)	Use a table of linked lists for blocks of small and medium lengths
so that allocation usually involves no search for a block of the right
size.

(c)	Keep the allocated addresses in a hash table so that it is possible
to check whether a deallocation is legitimate.

(d)	Associate a time and origin stamp with each allocated block [added
for storage allocation leak analysis]

	The point of all this is that it is not all that hard to implement
an allocator that is fast, efficient, and reasonably secure against abuse.
How good are typical implementations of malloc?
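
[Point (b) can be sketched roughly as follows.  This is not the allocator
described in this post: the names small_alloc, small_free, GRANULE and
NCLASSES are hypothetical, the caller is assumed to pass the original
request size back on free, and, unlike point (a), the list link is kept
inside each free block to keep the sketch short.]

```c
#include <stddef.h>
#include <stdlib.h>

#define GRANULE   16u                    /* assumed size-class spacing */
#define NCLASSES  8u                     /* classes of 16, 32, ..., 128 bytes */
#define MAXSMALL  (GRANULE * NCLASSES)

struct freeblk { struct freeblk *next; };

static struct freeblk *freelist[NCLASSES];   /* one list head per size class */

/* Map a request of 1..MAXSMALL bytes to its list index. */
static size_t class_of(size_t n)
{
    return (n + GRANULE - 1) / GRANULE - 1;
}

void *small_alloc(size_t n)
{
    size_t c;
    struct freeblk *b;

    if (n == 0 || n > MAXSMALL)
        return malloc(n);                /* not a small request: defer to malloc */
    c = class_of(n);
    b = freelist[c];
    if (b != NULL) {                     /* common case: pop the head, no search */
        freelist[c] = b->next;
        return b;
    }
    return malloc((c + 1) * GRANULE);    /* list empty: get a block of the class size */
}

/* The caller passes the size it originally asked for. */
void small_free(void *p, size_t n)
{
    struct freeblk *b = p;

    if (p == NULL)
        return;
    if (n == 0 || n > MAXSMALL) {
        free(p);
        return;
    }
    b->next = freelist[class_of(n)];     /* push onto the list for this class */
    freelist[class_of(n)] = b;
}
```

A freed block of a given class is reused by the next request in the same
class, so the common path is a list pop rather than a search.
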
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

chad@lakesys.UUCP (Chad Gibbons) (04/03/89)

	I have seen a somewhat strange style of coding a malloc() call on
some systems.  Usually, given struct foo, you would execute the call such
as this:

	struct foo tmp = (struct foo *)malloc(sizeof(struct foo));

The style I have seen used recently around here has been this:

	struct foo tmp = (struct foo *)malloc(sizeof *tmp);


	I ran tests of this on several different systems, and they all
compiled and worked fine...however, this seems to be a poor programming
practice at best, and a shoestring at worst.  Anyone have any comments
about this?  I would assume it wouldn't be a good idea to use it, but
then again, you never know.
-- 
D. Chadwick Gibbons, chad@lakesys.lakesys.com, ...!uunet!marque!lakesys!chad

chris@mimsy.UUCP (Chris Torek) (04/03/89)

In article <510@lakesys.UUCP> chad@lakesys.UUCP (Chad Gibbons) writes:
>	I have seen a somewhat strange style of coding a malloc() call on
>some systems.  Usually, given struct foo, you would execute the call such
>as this:
>
>	struct foo tmp = (struct foo *)malloc(sizeof(struct foo));

No: rather, `struct foo *tmp = <as above>'

>The style I have seen used recently around here has been this:
>
>	struct foo tmp = (struct foo *)malloc(sizeof *tmp);

Perfectly legal (given `struct foo *tmp' rather than `struct foo
tmp').  The argument to sizeof is examined only to find its type; any
compiler that actually tries to find `*tmp' during program execution
has a severe bug (one on the order of computing 1|2 as -1 instead of 3,
or 1||2 as -1 instead of 1).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn ) (04/04/89)

In article <9132@alice.UUCP> andrew@alice.UUCP (Andrew Hume) writes:
>the ONLY justification put forward is some stuff about zero-sized
>objects (gwyn admits to being the point of contact). the only
>point actually mentioned is devising semantics for zero-sized objects; ...

I believe that was in fact the only technical issue involved in the
decision to permit malloc(0) to fail for reasons other than running out
of memory.  There were several X3J11 members who, rightly recognizing
that if the standard permitted portable programs to rely on malloc(0)
succeeding under normal circumstances it would also have to address the
issue of semantics for 0-sized objects, balked at having to cross that
threshold.  I'm a big fan of 0-sized objects, but I've become convinced
that formally recognizing them would have significant impact in several
places in the standard.  The amount of work to get all the technical
points right was considered more than could be justified simply to
support the style of malloc() usage you're interested in.

Incidentally, our last-minute reworking of the wording about last-plus-
one element of arrays, etc. may have provided most of the scaffolding
needed to formally support 0-sized objects as well.  However, 0-sized
objects could not be allowed without explicit full committee action,
since they had been explicitly disallowed in previous voting.

Some people might make the argument that it is more likely when malloc()
is called with a 0 size request that there is a bug in the program than
that it is a consistent and sensible thing to be attempting.  Certainly
it COULD be used for legitimate purposes as Andrew apparently does, but
most of the malloc() applications I've seen are already in trouble if
they get to the point where they would be malloc(0)ing.  Thus I don't
think the ANSI-portable malloc() behavior poses a significant problem.

#define	myalloc(n)	malloc((n)?(n):1)
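
[One caveat with the macro form is that it evaluates its argument twice,
so a call such as myalloc(i++) would misbehave.  A function version avoids
that; the name myalloc_fn below is hypothetical and only for illustration.]

```c
#include <stdlib.h>

/* Like the myalloc() macro, but evaluates its argument exactly once. */
void *myalloc_fn(size_t n)
{
    return malloc(n != 0 ? n : 1);   /* never pass 0 through to malloc */
}
```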

gwyn@smoke.BRL.MIL (Doug Gwyn ) (04/04/89)

In article <510@lakesys.UUCP> chad@lakesys.UUCP (Chad Gibbons) writes:
>The style I have seen used recently around here has been this:
>	struct foo tmp = (struct foo *)malloc(sizeof *tmp);
>compiled and worked fine...however, this seems to be a poor programming
>practice at best, and a shoestring at worst.

sizeof comes in two flavors, sizeof(type) and sizeof object.  In the
latter case the object-expression is not evaluated, only its type is
used.  Therefore the above usage is perfectly legitimate.  As to
whether it is better or worse than the alternative style, there
don't seem to be really strong arguments on either side.  I personally
prefer sizeof(type) since to me the other form is just a corruption of
this fundamental definition, but I'm sure other programmers disagree.
It doesn't seem to be worth arguing about.
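
[A small sketch of the two spellings and of the fact that the operand is
not evaluated.  The layout of struct foo and the function names are
assumptions made up for the illustration.]

```c
#include <stddef.h>
#include <stdlib.h>

struct foo { int id; double weight; };   /* hypothetical layout */

/* Allocate one struct foo using the sizeof-expression form. */
struct foo *foo_alloc(void)
{
    struct foo *tmp = malloc(sizeof *tmp);   /* *tmp is never dereferenced here */

    return tmp;
}

/* Returns 1 if both spellings agree and the operand's side effect did not run. */
int sizeof_forms_agree(void)
{
    int i = 0;
    size_t by_type = sizeof(struct foo);
    size_t by_expr = sizeof *(struct foo *)0;   /* null pointer is fine: no evaluation */
    size_t unused  = sizeof(i++);               /* i++ is not executed */

    (void)unused;
    return by_type == by_expr && i == 0;
}
```

Some compilers warn about a side effect inside sizeof, precisely because
it will never happen.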

jym@wheaties.ai.mit.edu (Jym Dyer) (04/04/89)

In article <9969@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>In article <510@lakesys.UUCP> chad@lakesys.UUCP (Chad Gibbons) writes:
>>	struct foo tmp = (struct foo *)malloc(sizeof *tmp);
>
> Sizeof comes in two flavors, sizeof(type) and sizeof object.  In the
> latter case the object-expression is not evaluated, only its type is
> used.  Therefore the above usage is perfectly legitimate.

It should be pointed out that object can be in parentheses.  This fact
 is not lost on those who prefer to use sizeof like a function instead
  of the operator that it is.

Also, the example has a typo in it.  The `tmp' variable must be declared
 as a pointer to struct foo, not as a struct foo.
  <_Jym_>

ark@alice.UUCP (Andrew Koenig) (04/04/89)

I have found the following style useful:

	#define new(T) ((T *) malloc (sizeof (T)))

	struct Foo *fp = new(struct Foo);

This is, of course, reminiscent of C++.
-- 
				--Andrew Koenig
				  ark@europa.att.com

guy@auspex.auspex.com (Guy Harris) (04/04/89)

>	I ran tests of this on several different systems, and they all
>compiled and worked fine...however, this seems to be a poor programming
>practice at best, and a shoestring at worst.  Anyone have any comments
>about this?

Well, in the December 7, 1988 dpANS, it gives as an example of "sizeof",
on page 46:

	double *dp = alloc(sizeof *dp);

so I presume they intend it to be usable, and even that they don't think
it's too weird.  As for more formal guarantees, we have on page 45:

	3.3.3.4 The "sizeof" operator

	...

	...The size is determined from the type of the operand, which is
	not itself evaluated.

so I'd say that "dp" doesn't have to point to anything reasonable for
"sizeof *dp" to be valid.

richard@pantor.UUCP (Richard Sargent) (04/04/89)

> In article <9969@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn ) writes:
> In article <510@lakesys.UUCP> chad@lakesys.UUCP (Chad Gibbons) writes:
> >The style I have seen used recently around here has been this:
> >	struct foo tmp = (struct foo *)malloc(sizeof *tmp);
> >compiled and worked fine...however, this seems to be a poor programming
> >practice at best, and a shoestring at worst.
> 
> sizeof comes in two flavors, sizeof(type) and sizeof object.  In the
> latter case the object-expression is not evaluated, only its type is
> used.  Therefore the above usage is perfectly legitimate.  As to
> whether it is better or worse than the alternative style, there
> don't seem to be really strong arguments on either side.  I personally
> prefer sizeof(type) since to me the other form is just a corruption of
> this fundamental definition, but I'm sure other programmers disagree.
> It doesn't seem to be worth arguing about.


There is a very valid reason why one uses sizeof(object) rather than 
sizeof(type):  no matter what happens to the declaration of the object
during the software's lifetime, sizeof(object) will always remain
correct.  It is very easy to see where sizeof(type) requires dependency
changes in code: for example

        int   count;
        ...
        ... fread( ... sizeof(int) ... );

If the data structure were changed to a long, then maintainers MUST
go to the trouble of analysing the entire program to ensure that
any type dependencies are changed too.  This can be a very expensive
proposition for software development companies.

Yes, this is the voice of experience.  I have been burned this way.

I will close by agreeing that there are always going to be circumstances
where one form makes better sense than the other.  You just have to
figure out which case is which.
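
[The fread case might look like the sketch below, assuming a scratch file
from tmpfile() is available.  The function name roundtrip_count is
hypothetical.  Because the sizes are written as sizeof count and
sizeof *out, changing the declaration from int to long needed no other
edits.]

```c
#include <stdio.h>

/* Write one value to a scratch file and read it back.  Returns 0 on success. */
int roundtrip_count(long in, long *out)
{
    long count = in;            /* imagine this was an int in an earlier revision */
    FILE *fp = tmpfile();
    int ok;

    if (fp == NULL)
        return -1;
    ok = fwrite(&count, sizeof count, 1, fp) == 1;   /* size follows the object */
    rewind(fp);
    ok = ok && fread(out, sizeof *out, 1, fp) == 1;  /* likewise on the way back */
    fclose(fp);
    return ok ? 0 : -1;
}
```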

Richard Sargent
Systems Analyst

daveb@gonzo.UUCP (Dave Brower) (04/05/89)

In <9969@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <510@lakesys.UUCP> chad@lakesys.UUCP (Chad Gibbons) writes:
>>The style I have seen used recently around here has been this:
>>	struct foo tmp = (struct foo *)malloc(sizeof *tmp);
>>compiled and worked fine...however, this seems to be a poor programming
>>practice at best, and a shoestring at worst.
>
>...  As to
>whether it is better or worse than the alternative style, there
>don't seem to be really strong arguments on either side.  I personally
>prefer sizeof(type) since to me the other form is just a corruption of
>this fundamental definition, but I'm sure other programmers disagree.
>It doesn't seem to be worth arguing about.

Of course I disagree, or I wouldn't post :-).  And I think I can make an
argument that is at least partly convincing from a software engineering
or reliability perspective.

In the trivial example shown, it really doesn't make any difference. 
However, the real case is quite often:

	struct foo *tmp;

	/* tons-o-code deleted */

	tmp  = (struct foo *)malloc( /*your choice here*/ );

It is all too easy to forget the type of "tmp" at this point.  If you do
your malloc argument as sizeof(*tmp), you cannot go wrong.  Further, if
you change the type of the definition of tmp above, you will not need to
change the argument to sizeof.

Compare this to the use of manifest constants.  If you can define
something in one place, you are stylistically better off.  Since C can't
verify that the arg to the malloc sizeof is the right type, better to
not need to define it multiple places either.

Of course, this argument bogs down when you realize you already needed
to know the type to get the cast on the malloc return correct.  But
given the choice, I'd still rather only have to get the type right once
(in the cast) rather than in the cast _and_ the sizeof.  (And it takes
less space if you have vars named "tmp" or "p" :-).

-dB

PS,

So my bias shows, I'm also in the minority that prefers

	if( CONST == var )

to let the compiler check that I didn't do

	if( CONST = var )
	
by mistake.  Some of my co-workers truly revile this style quirk.

-- 
"I came here for an argument." "Oh.  This is getting hit on the head"
{sun,mtxinu,amdahl,hoptoad}!rtech!gonzo!daveb	daveb@gonzo.uucp

piet@cs.ruu.nl (Piet van Oostrum) (04/06/89)

In article <624@gonzo.UUCP>, daveb@gonzo (Dave Brower) writes:
 `However, the real case is quite often:
 `
 `	struct foo *tmp;
 `
 `	/* tons-o-code deleted */
 `
 `	tmp  = (struct foo *)malloc( /*your choice here*/ );
 `
 `It is all too easy to forget the type of "tmp" at this point.  If you do
 `your malloc argument as sizeof(*tmp), you cannot go wrong.  Further, if
 `you change the type of the definition of tmp above, you will not need to
 `change the argument to sizeof.
 `
 `
 `Of course, this argument bogs down when you realize you already needed
 `to know the type to get the cast on the malloc return correct.  But
 `given the choice, I'd still rather only have to get the type right once
 `(in the cast) rather than in the cast _and_ the sizeof.

The following example has a cast that is different from the type of the
variable:

	typedef ..... FOO;
	FOO buf [BUFSIZ];
	FOO *bufcopy;
	....
	bufcopy = (FOO *) malloc (sizeof (buf));
	bcopy (buf, bufcopy, sizeof (buf));	/* bcopy takes the source first */

Of course you can use sizeof (FOO[BUFSIZ]), but why would you?  It is much
cleaner (IMHO) to use sizeof buf.  This example won't happen very often in
practice, I think, but it is better to use a single programming style.
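
[The same idea as a self-contained sketch, using the standard memcpy
(destination first) in place of bcopy.  The element type chosen for FOO,
the array length NBUF and the function name copy_buf are assumptions for
the illustration.]

```c
#include <stdlib.h>
#include <string.h>

typedef double FOO;          /* hypothetical element type */
#define NBUF 64              /* hypothetical array length */

static FOO buf[NBUF];

/* Return a heap copy of buf, or NULL if allocation fails. */
FOO *copy_buf(void)
{
    FOO *bufcopy = malloc(sizeof buf);       /* size of the whole array */

    if (bufcopy != NULL)
        memcpy(bufcopy, buf, sizeof buf);    /* destination first, then source */
    return bufcopy;
}
```

Note that sizeof buf gives the size of the whole array only because buf is
an array here; applied to a pointer it would give the size of the pointer.
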
-- 
Piet van Oostrum, Dept of Computer Science, University of Utrecht
Padualaan 14, P.O. Box 80.089, 3508 TB Utrecht, The Netherlands
Telephone: +31-30-531806. piet@cs.ruu.nl (mcvax!hp4nl!ruuinf!piet)

scs@adam.pika.mit.edu (Steve Summit) (08/20/89)

In article <1527@cbnewsl.ATT.COM> dfp@cbnewsl.ATT.COM (david.f.prosser) writes:
>Strictly speaking, malloc must return a pointer to an object that
>can be accessed by a type commensurate with its size in bytes.
>Moreover, it may well be possible to argue that unless the requested size
>is a multiple of the size of an int, the returned pointer need not be
>aligned appropriately for an int.  For example, ``malloc(5)''.

I hope not.  This would break the variable-sized structure trick
(discussed here at length not long ago):

	struct string
		{
		int length;
		char text[1];	/* actually text[length] */
		};

where we allocate a "string" with something like:

	(struct string *)malloc(sizeof(struct string)-1+stringlen)

Henry Spencer says that Dennis Ritchie calls this "unwarranted
chumminess with the compiler," but it's widely used.

(Note that (struct string *)malloc(sizeof(struct string)) _would_
work, since sizeof(struct string) will typically be 2*sizeof(int).)
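
[For comparison, C99 later standardized this idiom as the "flexible array
member".  The sketch below uses that form rather than text[1], so it
assumes a C99 or later compiler; the function name string_new is
hypothetical.]

```c
#include <stdlib.h>
#include <string.h>

struct string {
    int  length;
    char text[];             /* C99 flexible array member */
};

/* Allocate a struct string holding a copy of s, including its terminator. */
struct string *string_new(const char *s)
{
    size_t len = strlen(s);
    struct string *p = malloc(sizeof *p + len + 1);   /* header plus text plus NUL */

    if (p == NULL)
        return NULL;
    p->length = (int)len;
    memcpy(p->text, s, len + 1);
    return p;
}
```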

                                            Steve Summit
                                            scs@adam.pika.mit.edu

pvarner@blackbird.afit.af.mil (Paul A. Varner) (03/28/90)

To all you 'C' gurus:
  I am in the process of porting some C code to be as portable as
possible across many different machines.  Anyhow, one of the calls I
have to make is malloc(something), where something is a BIG value, such
as 300,000.  Anyhow, the code works perfectly on a Unix machine I have
access to, but bombs out in Turbo C, using the huge memory model.
When I say bomb, it locks the machine up.  Anyhow, I really need to
keep the malloc call in the code.  Any suggestions??

Please Email responses and I will summarize to the net if there is enough
interest.

					Paul Varner
Address: pvarner@blackbird.afit.af.mil
or       pvarner@afit-ab.arpa

 
                                        _/|
                                        \'o,O'
                                        =(___)=
                                           U

				Brain Fried - Core dumped 

pvarner@blackbird.afit.af.mil (Paul A. Varner) (04/08/90)

In article <1550@blackbird.afit.af.mil> I wrote:
>To all you 'C' gurus:
>  I am in the process of porting some C code to be as portable as
>possible across many different machines.  Anyhow, one of the calls I
>have to make is malloc(something), where something is a BIG value, such
>as 300,000.  Anyhow, the code works perfectly on a Unix machine I have
>access to, but bombs out in Turbo C, using the huge memory model.
>When I say bomb, it locks the machine up.  Anyhow, I really need to
>keep the malloc call in the code.  Any suggestions??

In response to this, I have received many suggestions.  I appreciate all
the help and thank the following people who sent me their advice.

rds95@leah.Albany.EDU (Robert Seals)
raymond@math.berkeley.edu (Raymond Chen)
D. Richard Hipp <drh@cs.duke.edu>
tarvaine@jyu.fi (Tapani Tarvainen)
grimlok@hubcap.clemson.edu (Mike Percy)
einari@rhi.hi.is (Einar Indridason)
Alvin I Rosenthal <air@unix.cis.pitt.edu>
kdq@demott.com (Kevin D. Quitt)
"Rich Walters" <raw@math.arizona.edu>
Marvin Rubenstein <marv@ism.isc.com>
gordon@sneaky.lonestar.org (Gordon Burditt)
G.Toal%edinburgh.ac.uk@NSFnet-Relay.AC.UK
HENRIK SANDELL <E89HSE@rigel.efd.lth.se>
npw@nbsr.nbsr.duke.edu (Nicholas Wilt)
s.michnowicz@trl.oz.au (Simon Michnowicz - A Free Spirit)
kim@wacsvax.OZ (Kim Shearer)

My problem was caused by several things.  The first was that malloc is
declared to take a parameter of type size_t.  On the Unix system I was using,
size_t is the size of a long; in Turbo C, it is the size of an int (16 bits).
I was therefore mallocing 300,000 mod 65536 = 37,856 bytes and expecting to
have all 300,000.  The second was that, since I was actually allocating
some memory, the malloc call was not returning NULL.  The solution to this
problem was to use the function farmalloc, which takes a long as its
parameter.  Finally, in order to address all of the memory allocated, I
needed to cast the result to a huge pointer.
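
[The truncation can be reproduced on any machine by modelling a 16-bit
size_t with uint16_t.  The function names below are hypothetical, and the
sketch only illustrates the arithmetic; it does not call the Turbo C
library.]

```c
#include <stdint.h>

/* What a 16-bit size_t would make of a larger request. */
unsigned long truncated_request(unsigned long wanted)
{
    return (uint16_t)wanted;     /* keeps only the low 16 bits */
}

/* Returns 1 if `wanted` survives conversion to a size_t whose maximum is size_max. */
int request_fits(unsigned long wanted, unsigned long size_max)
{
    return wanted <= size_max;
}
```

A portable program could make a check along the lines of request_fits
before calling malloc, comparing against the largest value size_t can
hold, so that an oversized request fails cleanly instead of being
silently shortened.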

philbo@arasta.uucp (Phillip Lindsay) (06/22/91)

Is there defined behavior in the ANSI C specification for the
result of a malloc() of zero bytes? Microsoft 5.1 returns
a non-NULL result where Borland C++ returns a NULL result.
Please EMAIL results; I am not fed this newsgroup. Many
thanks in advance.

Phillip Lindsay "Those environmentalists are just trying to ruin the world."
Internet: ???????????????  Phone: Work7143852311 Home7142891201
UUCP    : {spsd,zardox,felix}!dhw68k!arasta!philbo
USMAIL  : 152A S. Cross Creek Rd, Orange, Ca. 92669