[comp.lang.c] BIT Counting: C programming problem

jim@athsys.uucp (Jim Becker) (01/10/89)

	I was recently asked a problem to which there are a number of
solutions; I'm looking for opinions as to the best of those different
solutions.

	The problem is the most efficient method to simply count the
number of on bits in a sixteen bit short. This is a really easy
exercise, but one that has a variety of different approaches. I would
be interested in the opinions of programmers out there: an efficient
algorithm that is also understandable.

	Assume that this is a basic sample algorithm:

		int	
		BitCount( num )
		short	num;
		{
			int	count	= 0;

			while( num ) {

				count	+= (num &  1);
				num	 = (num >> 1);
			}

			return count;
		}

	Is this efficient? Is this understandable? What is a better
algorithm? Please email your responses. I'll report the results of
algorithms when the votes are in. Thanx!

-Jim Becker 	...!sun!athsys!jim

bader+@andrew.cmu.edu (Miles Bader) (01/10/89)

jim@athsys.uucp (Jim Becker) writes:
>         The problem is the most efficient method to simply count the
> number of on bits in a sixteen bit short. This is a really easy
> exercise, but one that has a variety of different approaches. I would
> be interested in the opinions of programmers out there: an efficient
> algorithm that is also understandable.

Chunk it:

#define CHUNKSIZE 8

/* table of #-bits for each index */
char numBits[1<<CHUNKSIZE]={....}; /* if big, you could init it at run-time */

int bitCount(num)
unsigned num;	/* unsigned, so negative arguments can't defeat the loop */
{
    int bc=0;

    while(num>0){
	bc+=numBits[num&((1<<CHUNKSIZE)-1)];
	num>>=CHUNKSIZE;
    }

    return bc;
}
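
[The run-time initialization that comment alludes to can be a short loop; a
minimal sketch, not part of the original posting, with the helper name
initBits invented here.  It uses the fact that the bit count of i equals
the bit count of i>>1 plus the low bit of i.]

initBits()	/* hypothetical helper: fill numBits once before first use */
{
    int i;

    numBits[0] = 0;
    for (i = 1; i < (1<<CHUNKSIZE); i++)
	numBits[i] = numBits[i>>1] + (i&1);
}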

cosell@bbn.com (Bernie Cosell) (01/11/89)

In article <225@tityus.UUCP> jim@athsys.uucp (Jim Becker) writes:
}
}	The problem is the most efficient method to simply count the
}number of on bits in a sixteen bit short. 

Try this:

int BitCount(num)
    short num;
{
    int count = 0;

    while (num != 0)
	num &= num-1, count += 1;
    return (count);
}

   __
  /  )                              Bernie Cosell
 /--<  _  __  __   o _              BBN Sys & Tech, Cambridge, MA 02238
/___/_(<_/ (_/) )_(_(<_             cosell@bbn.com

worley@ardent.UUCP (John Worley) (01/11/89)

> 	The problem is the most efficient method to simply count the
> number of on bits in a sixteen bit short. ...
> 
> 		int	
> 		BitCount( num )
> 		short	num;
> 
> -Jim Becker 	...!sun!athsys!jim

    First, the parameter num must be declared unsigned.  Otherwise, the high
order bit is propagated in the shift; if set, this is an infinite loop.  Also,
since parameters are passed as ints, there may be extra high order bits set
when num is negative.

    I prefer the following algorithm:

		/*  Count the number of bits set in num.  While num is not
		 *  zero, increment the count and turn off the lowest bit
		 *  set.
		 */

		int
		BitCount(num)
			unsigned short num;
		{
			int cnt;

			for (cnt = 0; num != 0; ++cnt)
				num &= num - 1;

			return (cnt);
		}

    For N bit parameters, there will be on average N/2 iterations if the
numbers are randomly distributed, since the loop runs once per set bit and
a random N-bit value has N/2 bits set on average; the simple mask and shift
will need an average of N - 1 iterations, since it runs until the highest
set bit has been shifted out (proofs on request).  You could speed up the
shift and mask scheme by looking at more than one bit each iteration:

		int
		BitCount(num)
			unsigned short num;
		{
			static int value2bits[] = {
				0, 1, 1, 2,
				1, 2, 2, 3,
				1, 2, 2, 3,
				2, 3, 3, 4
			};

			int cnt;

			for (cnt = 0; num != 0; num >>= 4)
				cnt += value2bits[num & 0xf];

			return (cnt);
		}

    However, this requires extra data space, an index computation and a
memory load for each iteration, whereas the previous approaches could all
run in registers.  For certain architectures and applications, however,
it could run faster.

    For the fans (like me) of falling through in switch statements:

		int
		BitCount(num)
			unsigned short num;
		{
			int cnt;

			for (cnt = 0; num != 0; num >>= 2)
				switch (num & 0x3) {
				case 3:
					++cnt;

				case 2:
				case 1:
					++cnt;

				case 0:
					break;
				}

			return (cnt);
		}
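
[To trace it with an example of my own: for num = 0xB (binary 1011), the
first pass sees num & 0x3 == 3 and bumps cnt twice, the second pass sees
0x2 and bumps it once, and the loop exits with cnt == 3.]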

					Regards,
					John Worley
					Ardent Computer
					uunet!ardent!worley

chuck@Morgan.COM (chuck) (01/11/89)

In article <225@tityus.UUCP> jim@athsys.uucp (Jim Becker) writes:
>
>	I was recently asked a problem to which there are a number of
>solutions; I'm looking for opinions as to the best of those different
>solutions.
>
>	The problem is the most efficient method to simply count the
>number of on bits in a sixteen bit short. This is a really easy
>exercise, but one that has a variety of different approaches. I would
>be interested in the opinions of programmers out there: an efficient
>algorithm that is also understandable.
>

A pretty simple bit counting algorithm can be done with a table lookup
if the calculation of the count is critical enough to warrant the space for
the table.  There need only be one 256-byte table containing the number of
bits in each byte value.  Just look up the number of bits in the lower and
upper bytes of the word.

static char bitcounts[] = { 0, 1, 1, 2, 1, ..., 7, 8 };

#define BITCOUNT(s) ((int)bitcounts[(s) & 0xff] + bitcounts[((s) & 0xff00) >> 8])

If it's REALLY critical, make a 64K table which is accessed using the whole
short as an index.  This is not much memory if you're talking about a really
heavily used loop.

-- 
+------------------+   Chuck Ocheret, Sr. Staff Engineer   +------------------+
| chuck@morgan.com |       Morgan Stanley & Co., Inc.      |  (212)703-4474   |
|   Duty now ...   |19th Floor, 1251 Avenue of the Americas| for the future.  |
+------------------+         New York, N.Y.  10020         +------------------+

djones@megatest.UUCP (Dave Jones) (01/11/89)

From article <34389@bbn.COM), by cosell@bbn.com (Bernie Cosell):
) In article <225@tityus.UUCP) jim@athsys.uucp (Jim Becker) writes:
) }
) }	The problem is the most efficient method to simply count the
) }number of on bits in a sixteen bit short. 
) 
) Try this:
) 
) int BitCount(num)
)     short num;
) {
)     int count = 0;
) 
)     while (num != 0)
) 	num &= num-1, count += 1;
)     return (count);
) }
) 

Note well that he did not claim that it works. He only said, "Try this."

barmar@think.COM (Barry Margolin) (01/11/89)

I haven't seen anyone so far suggest the far end of the time/space
trade-off: have a 64K table, with each element N containing the number
of bits in N (you'll probably have to use one of the slower mechanisms
to compute the contents of this table, but it only needs to be run
once when writing the program and then incorporated into the program
text as an initial value).  Someone else came close by suggesting a
256-element table which would be consulted twice and then the results
added.  But is an extra 64Kbytes of data much of a problem in these
days of large and/or virtual memories?
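
[One way to get such an initial value, sketched here and not part of the
original posting, is a throwaway generator that runs a slow loop once and
emits the table as C source to be pasted into the real program:]

#include <stdio.h>

/* illustrative generator: prints a 64K bit-count table as C source */
int main()
{
    long n;
    unsigned b, count;

    printf("char bitCount[0x10000] = {\n");
    for (n = 0; n < 0x10000L; n++) {
	for (count = 0, b = (unsigned) n; b != 0; b >>= 1)
	    count += b & 1;			/* the slow way, run once */
	printf("%u,%c", count, (n % 16 == 15) ? '\n' : ' ');
    }
    printf("};\n");
    return 0;
}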

One argument against this in a paging environment is that this is more
likely to cause a page fault than other mechanisms that use less data
and more computation, and a page fault is at least an order of
magnitude more expensive than arithmetic instructions.  And in
swapping systems it will increase process loading time.  But on a 1Mb PC
it's only 6% of your memory capacity, and it might be worth it
depending on the application.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

leichter@cs.yale.edu (Jerry Leichter) (01/11/89)

You might find the following amusing.  It works ONLY on 32-bit quantities
(though an analogous algorithm can be derived for any word size).  It's not
a particularly good way to count bits at run-time (though I suppose it might
lead to an optimally-short program on some architectures).  What the "best"
algorithm is depends on what you want, but along most measures of space and
speed using a 256-entry table of "bits per byte" count and then adding up the
counts for each byte is usually good.  This algorithm's claim to fame (other
than obscurity!) is that it can be evaluated completely at compile time by
C's limited macro pre-processor (plus a constant folder) if the argument is a
constant.  There are simpler algorithms with this property - you could always
write a sum of 32 expressions testing each bit, or (better) a recursive sub-
division into halves until you use such an expression on small pieces - but
they produce much larger macro expansions, often expansions so large that
some macro processors die.  (If they don't die, they run for a LONG time.)

							-- Jerry

/*
 * Macros to compute a bit value from a bit number, bit number from bit value,
 * and the bit count (number of bits on) for 32-bit quantities.  The results
 * are compile-time constants if the arguments are.
 *
 * The definition of BITCOUNT is based on an algorithm condensed from one
 * extracted from a graph coloring register allocator which manipulates
 * bitvectors.  It was inspired by an analogous algorithm for counting bits in
 * 36 bit words in MIT's HAKMEM, designed by Chris Smith, and written by Jane
 * Hoffman.  The original conversion to a form suitable for C on a VAX was by
 * Brad Merrill; that code was fixed and re-written by Jerry Leichter.  It
 * assumes a 32-bit word.  Note that the argument is evaluated multiple times,
 * so should not have side-effects.  A full explanation of the algorithm is
 * too long to include here; in summary, BX_(x) replaces each 4-bit sequence
 * in x by the number of bits in that sequence (which will, of course, fit in
 * 4 bits).  BITCOUNT folds the 4-bit sequences into 8-bit sequences.  The
 * 8-bit sequences are then added up by recognizing that they can be viewed as
 * a base 256 number, and the sum of the digits of a base b number is the same
 * mod b-1 as the number mod b-1.
 *
 * The original implementation produced results in groups of 3, then 6 bits.
 * While natural for a PDP-10's 36-bit word, this requires special case
 * handling of the sign bit on a VAX, since the top 3-bit sequence is really
 * only 2 bits long. 
 */
#define BITVAL(n)	(1 << (n))
#define BITNUM(b)	BITCOUNT((b)-1)
#define BITCOUNT(x)	(((BX_(x)+(BX_(x)>>4)) & 0x0F0F0F0F) % 255)
#define  BX_(x)		((x) - (((x)>>1)&0x77777777)			\
			     - (((x)>>2)&0x33333333)			\
			     - (((x)>>3)&0x11111111))
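
[To illustrate the compile-time property, with example values of my own and
assuming an ANSI-style preprocessor that does 32-bit arithmetic:]

/* hypothetical uses: both contexts require integer constant expressions */
#if BITCOUNT(0x000F00FF) != 12
#error "BITCOUNT is broken"
#endif

static char pieces[BITCOUNT(0xF0F0F0F0)];	/* dimension 16 */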

jlh@uplherc.UUCP (Jordan Henderson) (01/11/89)

In article <225@tityus.UUCP>, jim@athsys.uucp (Jim Becker) writes:
> 	Is this efficient? Is this understandable? What is a better
> algorithm? Please email your responses. I'll report the results of
> algorithms when the votes are in. Thanx!
> 
  This is not really in reply to Mr. Becker's article above.  I'm just
pointing out to all of those who have posted on the bit counting problem
that Mr. Becker requested that responses be emailed to him, and that he
would summarize.  Let's all be responsible.  This is the proper
way to handle this kind of thing.  We don't need to see your clever
solution right away no matter how good it is.  Let's save some band-
width and some trouble reading this newsgroup (how many duplicate solutions
and comments have there already been?).

  Yes, I know, I'm wasting your time with this remark...

					Jordan Henderson
					utah-cs!uplherc!jlh

malloy@nprdc.arpa (Sean Malloy) (01/12/89)

In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>I haven't seen anyone so far suggest the far end of the time/space
>trade-off: have a 64K table, with each element N containing the number
>of bits in N  . . .
		< some text deleted >
>       . . .                Someone else came close by suggesting a
>256-element table which would be consulted twice and then the results
>added.  But is an extra 64Kbytes of data much of a problem in these
>days of large and/or virtual memories?

>   . . .     But on 1Mb PC
>it's only 6% of your memory capacity, and it might be worth it
>depending on the application.

However, it's 10% of the effective address space for DOS, and using
64K for a bleeping lookup table shoots any hope of using any of
the memory models that don't let you use more than 64K of data space.
Two or three TSR programs that had a 64K lookup table in addition to
whatever data and code space the programs took up would drive your
free memory into the floor.

Taking 256 times as much memory just so that you can use one lookup
rather than two lookups, a shift, an AND, and an add is the kind of
programming I would expect from someone who's never had to deal with
limited memory space. It reminds me of the posting
a while back about the guy who was porting a program from a mainframe
to a PC and found that the first thing the program did was malloc 200K
bytes four times, and actually _used_ about 60K of one of those
spaces.


	Sean Malloy
	Navy Personnel Research & Development Center
	San Diego, CA 92152-6800
	malloy@nprdc.arpa

jcm@mtunb.ATT.COM (was-John McMillan) (01/12/89)

In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>I haven't seen anyone so far suggest the far end of the time/space
>trade-off: have a 64K table, with each element N containing the number
...

Actually, I believe the originating author requested E-Mail of the
suggestions and stated he would summarize the results.  This minimizes
network traffic and other resource consumption.

And explains why there are probably several suggestions you've missed
-- I hope! ;-)

jc mcmillan	-- att!mtunb!jcm

nobody@tekecs.TEK.COM (-for inetd server command) (01/12/89)

In article <225@tityus.UUCP> jim@athsys.uucp (Jim Becker) writes:
}
}	The problem is the most efficient method to simply count the
}number of on bits in a sixteen bit short. 

Try this:

char bitCount[0x10000] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    /* rest of table left as an exercise to the reader */
};

#define BITCOUNT(num)	(bitCount[(unsigned short)(num)])

Or try this (a little slower, uses less space,
watch out for parameters with side effects):

char bitCount[0x100] = {
    0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
    1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
    /* rest of table left as an exercise to the reader */
};

#define BITCOUNT(num)	(bitCount[(unsigned char)(num)]\
			    + bitCount[(unsigned char)((num) >> 8)])
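
[The side-effect warning is because the macro argument appears twice in the
expansion; a contrived example of my own:]

unsigned short buf[2], *p = buf;
int n;

n = BITCOUNT(*p++);	/* WRONG: *p++ is evaluated twice, so p advances
			   twice and the two lookups read different shorts */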

Paul Scherf, Tektronix, Box 1000, MS 61-028, Wilsonville, OR, USA
paulsc@orca.GWD.Tek.COM     503-685-2734     tektronix!orca!paulsc

desnoyer@Apple.COM (Peter Desnoyers) (01/12/89)

In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>I haven't seen anyone so far suggest the far end of the time/space
>trade-off: have a 64K table, with each element N containing the number
>...

Note that if disk transfer speed is 1MByte/sec (counting seeks), this
will add 64ms to the total program execution time. If you save 1us per
call with this table, you need to use the function 64000 times just to
break even. Moral - huge lookup tables are not free - you have to load
them in before you start. (unless you are running from rom :-)

				Peter Desnoyers

greggy@infmx.UUCP (greg yachuk) (01/12/89)

In article <1171@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
:From article <34389@bbn.COM), by cosell@bbn.com (Bernie Cosell):
:) 
:) Try this:
:) 
:) int BitCount(num)
:)     short num;
:) {
:)     int count = 0;
:) 
:)     while (num != 0)
:) 	num &= num-1, count += 1;
:)     return (count);
:) }
:) 
:
:Note well that he did not claim that it works. He only said, "Try this."

Guess what, Dave.  It *does* work.  Check it out manually, if necessary!

Greg Yachuk	Informix Software Inc., Menlo Park, CA	(415) 322-4100
{uunet,pyramid}!infmx!greggy	why yes, I DID choose that login myself

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/12/89)

In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>But is an extra 64Kbytes of data much of a problem in these
>days of large and/or virtual memories?

Yes!

The reason for stopping at 8-bit chunks is that all modern architectures
can efficiently access that size, for a very good time/space tradeoff.

desnoyer@Apple.COM (Peter Desnoyers) (01/13/89)

In article <1310@skinner.nprdc.arpa> malloy@nprdc.arpa (Sean Malloy) writes:
>In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>>I haven't seen anyone so far suggest the far end of the time/space
>>trade-off: have a 64K table, with each element N containing the number
>>of bits in N  . . .

More thoughts on large tables and program load time - if the disk the
program is running off of is slow, it may be faster to initialize the
table by calculating it (using a slower algorithm) than to load it as
part of the code. (!) Someone suggested in email that load overhead is
avoided with shareable libraries - it is not, unless another process
is already using that function. The library still resides on disk
before being faulted in.

				Peter Desnoyers

jbuck@epimass.EPI.COM (Joe Buck) (01/13/89)

Bernie Cosell's simple bit-counting algorithm can easily be demonstrated
to be correct, and the basic idea can be used in other ways (I've used it
in a tiny little real-time O/S for a DSP chip to allocate event flags).


int BitCount(num)
int num;
{
     int count = 0;
     while (num != 0) {
 	num &= num-1;
	count += 1;
     }
     return count;
}

Given an integer N, what is N & (N-1), where "&" is the bitwise AND operation?
It is a word identical to N, except that the least significant "1" bit is
cleared.  So the result of the operation has one fewer bit set, and
for a number with M "1" bits, repeating the operation M times reduces the
result to zero.

So here's a picture to show how it works.  Let's say N has the pattern

N       = XXXXXX10000
N-1     = XXXXXX01111
N&(N-1) = XXXXXX00000 

and the least significant set bit is cleared.

In the application I referred to, I had 32 global event flags, represented
as bits in a word.

int
alloc_evf ()
{
   int tmp, evf;

   if (AvailEvents != 0) {
       tmp = AvailEvents & (AvailEvents - 1);	/* clear the lowest set bit */
       evf = AvailEvents - tmp;			/* isolate the bit just cleared */
       AvailEvents = tmp;
       return evf;
   }
   else ERROR; /* no event flags available */
}
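
[An aside not in the original posting: the evf computed above is the lowest
set bit of AvailEvents, which on a two's complement machine can also be
written as]

evf = AvailEvents & -AvailEvents;	/* isolates the lowest set bit */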
-- 
- Joe Buck	jbuck@epimass.epi.com, or uunet!epimass.epi.com!jbuck,
		or jbuck%epimass.epi.com@uunet.uu.net for old Arpa sites
I am of the opinion that my life belongs to the whole community, and as long
as I live it is my privilege to do for it whatever I can.  -- G. B. Shaw

cramer@optilink.UUCP (Clayton Cramer) (01/13/89)

In article <35329@think.UUCP., barmar@think.COM (Barry Margolin) writes:
. I haven't seen anyone so far suggest the far end of the time/space
. trade-off: have a 64K table, with each element N containing the number
. of bits in N (you'll probably have to use one of the slower mechanisms
. to compute the contents of this table, but it only needs to be run
. once when writing the program and then incorporated into the program
. text as an initial value).  Someone else came close by suggesting a
. 256-element table which would be consulted twice and then the results
. added.  But is an extra 64Kbytes of data much of a problem in these
. days of large and/or virtual memories?

Someone must own stock in DRAM companies.

To be fair, my first summer programming jobs involved having to
sort a randomly entered sequence of part numbers, and two data values
associated with each part number.  (Part numbers were hex -- my boss
hadn't figured out how to unpack binary to decimal in assembler
language -- that's why he hired me).  Since I didn't know how many
records I would get, and thus didn't know how much memory to allocate,
I created a file with four byte records (initialized to -1), used 
the part numbers as a random access key, and wrote the four bytes of 
data as records.  Then, when I was finished, I reopened the file for 
sequential access, and read through the file.

At the time, I thought, "What a neat way to sort all these records".
Of course, I was only 16, so I suppose it was forgivable. :-)
-- 
Clayton E. Cramer
{pyramid,pixar,tekbspa}!optilink!cramer
Disclaimer?  You must be kidding!  No company would hold opinions like mine!

gwyn@smoke.BRL.MIL (Doug Gwyn ) (01/14/89)

In article <11025@rpp386.Dallas.TX.US> jfh@rpp386.Dallas.TX.US (John F. Haugh II) writes:
-That wasn't what he wrote, he wrote
-	num &= num - 1 , count += 1;
-		       ^-- oops.

Excuse my asking, but what's supposed to be the problem with that?

scs@adam.pika.mit.edu (Steve Summit) (01/15/89)

Pathetically flogging his long-dead equine quadruped, the style
advocate tonelessly repeats his sad lament:

	"First code it in the obvious way, deferring any attempt
	at optimization until profiling demonstrates that the
	bit-counting code contributes significantly to
	unacceptable run time."

Judging from the attention these "cute problems" continually
receive here (last month it was finding the high-order bit), one
would have to conclude that virtually every C program requires
such arcane bit manipulations, and that infants would perish,
space shuttles crash, and empires topple if the answer were not
obtained in the absolute minimal, micro-optimized time.

The suggestions for 64K lookup tables had implied smiley faces, right?
Please?

(Smiley face here, too; I know we're all just having good, clean,
computer nurd fun.)

                                            Steve Summit
                                            scs@adam.pika.mit.edu

flaps@dgp.toronto.edu (Alan J Rosenthal) (01/16/89)

There seems to be debate over whether or not the algorithm posted works, and
there was even a ``proof'' that it does work, which is very interesting.

It doesn't work for MININT unless (MININT - 1) happens to evaluate to MAXINT
(as it frequently does in real life).  On a one's complement machine, I doubt
that MININT - 1 == MAXINT very often; it's probably -0 (if it's not trapped).

If MININT - 1 is an overflow, the posted algorithm certainly won't give the
right answer for MININT (the right answer being 1 on a two's complement
machine, or log2(MAXINT + 1) on a one's complement machine).
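
[One portable dodge, a sketch of my own rather than part of the posting, is
to copy the value into an unsigned variable first; unsigned arithmetic wraps
modulo a power of two, so u - 1 can never overflow:]

int BitCount(num)
    short num;
{
    unsigned u = (unsigned short) num;	/* MININT becomes 0x8000: no overflow */
    int count = 0;

    while (u != 0) {
	u &= u - 1;
	count += 1;
    }
    return count;
}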

ajr

ejablow@dasys1.UUCP (Eric Robert Jablow) (01/23/89)

In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>I haven't seen anyone so far suggest the far end of the time/space
>trade-off: have a 64K table, with each element N containing the number
>of bits in N (you'll probably have to use one of the slower mechanisms
>to compute the contents of this table, but it only needs to be run
>once when writing the program and then incorporated into the program
>text as an initial value).  [Remainder deleted.]
>Barry Margolin
>Thinking Machines Corp.
>barmar@think.com
>{uunet,harvard}!think!barmar

	"Why do you hoard?"
	"Why do you waste?"
		The Inferno,
		Dante

That is a ridiculous idea.  For microscopic savings in speed, you
waste 63K of space.  Were this program intended to be a standalone
program, your suggestion might be acceptable, but this program is
obviously intended to be a part of some larger project.  Certainly,
the surrounding program will be much larger than the bitcount routine;
what if it needs 400K to run effectively, or even 640K?  An extra 64K
will put most users over the edge.  Besides, more and more people are
going to multiprogramming environments, even on PCs, like Windows or
DesqView.  Then, it is vital to keep all programs as small as
possible, so that the other programs have room to work.  If you can
assure me that the speed is the overriding condition, I might accept
the 63K penalty.  But I'd hate myself for it.

-- 
Eric Jablow                      {allegra,philabs,cmcl2}!phri\
Big Electric Cat Public Unix           {bellcore,cmcl2}!cucard!dasys1!ejablow
New York, NY, USA	 	 
New address:	jessica!eric@sbee.sbcc.edu.

mlandau@bbn.com (Matt Landau) (01/25/89)

In comp.lang.c (<8398@dasys1.UUCP>), ejablow@dasys1.UUCP (Eric Robert Jablow) 
writes:
>[Barry Margolin suggests using a 64K lookup table for something.]
>
>That is a ridiculous idea.  For microscopic savings in speed, you
>waste 63K of space...what if it needs 400K to run effectively, or 
>even 640K...Besides, more and more people are going to multiprogramming 
>environments, even on PCs, like Windows or DesqView.  

Anyone else noticed that the "all the world's a VAX" mentality seems to be 
turning into an "all the world's a PC" mentality?  Positively frightening.
--
 Matt Landau			Waiting for a flash of enlightenment
 mlandau@bbn.com			  in all this blood and thunder

bph@buengc.BU.EDU (Blair P. Houghton) (01/26/89)

In article <12571@diamond.BBN.COM> mlandau@bbn.com (Matt Landau) writes:
>In comp.lang.c (<8398@dasys1.UUCP>), ejablow@dasys1.UUCP (Eric Robert Jablow) 
>writes:
>>[Barry Margolin suggests using a 64K lookup table for something.]
[Barry is at Thinking Machines, builders of the Connection Machine.]
>>
>>That is a ridiculous idea.  For microscopic savings in speed, you
>>waste 63K of space...what if it needs 400K to run effectively, or 
>>even 640K...Besides, more and more people are going to multiprogramming 
>>environments, even on PCs, like Windows or DesqView.  
>
>Anyone else noticed that the "all the world's a VAX" mentality seems to be 
>turning into an "all the world's a PC" mentality?  Positively frightening.

Anyone else notice that Barry could very well assign each number to a
different _computer_ in the initialization...?

				--Blair
				  "Life...is a tree."

djones@megatest.UUCP (Dave Jones) (01/26/89)

From article <8398@dasys1.UUCP>, by ejablow@dasys1.UUCP (Eric Robert Jablow):
> In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
>>I haven't seen anyone so far suggest the far end of the time/space
>>trade-off: have a 64K table, with each element N containing the number
>>of bits in N 

....

> That is a ridiculous idea.  For microscopic savings in speed, you
> waste 63K of space.

...

As I type this letter, I am using what is, by today's standards, an
excellent, but rather commonplace workstation: a Sun3/60.  It has 8Meg
of RAM and a virtual memory of about 42Meg, if I remember correctly.
So 64K is less than two tenths of one percent of the virtual memory.

Always bear in mind that "ridiculous" depends on the point of view of
the ridiculer.


		Dave J.

andrew@alice.UUCP (Andrew Hume) (01/26/89)

> >[Barry Margolin suggests using a 64K lookup table for something.]
> >
> >That is a ridiculous idea.  For microscopic savings in speed, you
> >waste 63K of space...what if it needs 400K to run effectively, or 
> >even 640K...Besides, more and more people are going to multiprogramming 
> >environments, even on PCs, like Windows or DesqView.  


the time saving is not microscopic (i measure a 1.8x speedup between
8 bit lookup and 16 bit lookup) unless you don't call the bitcount
routine. i do not encourage mindless devouring of memory but i
figure that with that kind of speedup, 64KB is not much of a
tradeoff to worry about.

jeenglis@nunki.usc.edu (Joe English) (01/27/89)

djones@megatest.UUCP (Dave Jones) writes:
<From article <8398@dasys1.UUCP>, by ejablow@dasys1.UUCP (Eric Robert Jablow):
<> In article <35329@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes:
<>>I haven't seen anyone so far suggest the far end of the time/space
<>>trade-off: have a 64K table, with each element N containing the number
<>>of bits in N 

<> That is a ridiculous idea.  For microscopic savings in speed, you
<> waste 63K of space.

<As I type this letter, I am using what is, by today's standards, an
<excellent, but rather commonplace workstation: a Sun3/60.  It has 8Meg
<of RAM and a virtual memory of about 42Meg, if I remember correctly.
<So 64K is less than two tenths of one percent of the virtual memory.

<Always bear in mind that "ridiculous" depends on the point of view of
<the ridiculer.

Well, this here ridiculer also thinks the idea is
ridiculous: Even though you've got 8 Meg of RAM, if
the program is running under Unix it's not going to
have it all to itself, and that 64K table is likely
to be swapped out.  So now the ultrafast straight
lookup algorithm may take a _large_ disk read and
has lost whatever speed advantages it might have
had.

Besides that, a 64K lookup table for bit-counting
strikes me as just plain inelegant and wasteful,
even if it's going to be run on a machine with 8
gigabytes of core.  Don't you know there are
starving Lisp processes out there? :-)


--Joe English

  jeenglis@nunki.usc.edu

ejablow@dasys1.UUCP (Eric Robert Jablow) (01/30/89)

Point taken.  I acted too PC-centric.  Still, it seems
extraordinarily wasteful to expend 64K of resources without reason.
Bloated and clumsy programs can be written on any architecture and any
computer.  If programmers can say "64 extra K--who cares?"  today,
tomorrow it will be 128K, and the next day 1M, and so on.  Let's keep
our code lean and mean, folks.  On any computer.

-- 
Eric Jablow                      {allegra,philabs,cmcl2}!phri\
Big Electric Cat Public Unix           {bellcore,cmcl2}!cucard!dasys1!ejablow
New York, NY, USA	 	 
New address:	jessica!eric@sbee.sbcc.edu.