[comp.lang.c] count of bits set in a long

nwagner@ut-emx.uucp (Neal R. Wagner) (09/25/90)

  I need a fast method to count the number of bits that are set in a 32-bit
integer on a Sun 3/80 running 4.0.3 SunOS.  The brute-force method follows.
Is there a faster way in C?  How close in speed to a hand-coded assembly
routine would a fast method in C be (after the latter is run through the
optimizer)?
  Thanks in advance for your help.  Send e-mail and I will summarize for the
net.

===============================================================================
#define	BIT_SETSIZE	32
typedef	unsigned long	bit_set;
#define BIT_ISSET(bit,bitset)	(((bitset) & (1<<(bit))) >> (bit))
int bit_count();

main()
{
	bit_set i;
	for (i=0; i<=16; i++)
		printf("%d has %d bits on\n", i, bit_count(i));
}

intint bit_count(i)
bit_set i;
{
	int j, k;
	k = 0;
	for (j=0; j<BIT_SETSIZE; j++) k = k + BIT_ISSET(j, i);
	return k;
}
===============================================================================

dbrooks@osf.osf.org (David Brooks) (09/25/90)

In article <37545@ut-emx.uucp> nwagner@ut-emx.uucp (Neal R. Wagner) writes:
>
>  I need a fast method to count the number of bits that are set in a 32-bit
>integer on a Sun 3/80 running 4.0.3 SunOS.

I wish people who aren't using the net to do their homework, but look
as though they are, would say so.

>Is there a faster way in C?  How close in speed to a hand-coded assembly
>routine would a fast method in C be (after the latter is run through the
>optimizer)?

Well, if you don't restrict youself to the Sun3, I once knew a machine
that could do it in one instruction.  I doubt if any compiler could
compile your code into that.
-- 
David Brooks				dbrooks@osf.org
Systems Engineering, OSF		uunet!osf.org!dbrooks
"A Loaf of Bread, a Jug of Wine, and Six Spades Redoubled -- Omar somebody.

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (09/25/90)

In article <37545@ut-emx.uucp>, nwagner@ut-emx.uucp (Neal R. Wagner) writes:
>   I need a fast method to count the number of bits that are set in a 32-bit

A simple way is to set up a table:
	static char n_bits_set[256] = {0, 1, 1, 2, ... };

	int bit_count(n)
	    unsigned long int n;
	    {
		return n_bits_set[(n >> 24)      ]
		     + n_bits_set[(n >> 16) & 255]
		     + n_bits_set[(n >>  8) & 255]
		     + n_bits_set[(n      ) & 255];
	    }

There's a rather nice loop that does one iteration per bit, and works
for any size, but I'll leave that for others.
-- 
Heuer's Law:  Any feature is a bug unless it can be turned off.

eyal@moses.canberra.edu.au (Eyal Lebedinsky) (09/25/90)

In article <37545@ut-emx.uucp> nwagner@ut-emx.uucp (Neal R. Wagner) writes:
>
>  I need a fast method to count the number of bits that are set in a 32-bit
>integer on a Sun 3/80 running 4.0.3 SunOS.  The brute-force method follows.
>Is there a faster way in C?  How close in speed to a hand-coded assembly
>routine would a fast method in C be (after the latter is run through the
>optimizer)?
>  Thanks in advance for your help.  Send e-mail and I will summarize for the
>net.
>[...program follows...]

If speed if VERY important, try the bulk method:
break the int into 4 bytes, have an array of 256 that holds the bit-count
for any value from 0-255, and add the four counts. You can go dirty and
redefine (union) the int because byte-ordering is not important. If the
array is of type char then the indexing is very efficient. The program
should DEFINITLY have no loops, just an addition of four array elements.


-- 
Regards
	Eyal

md@doc.ic.ac.uk (Mark Dawson) (09/25/90)

In <3820@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
>There's a rather nice loop that does one iteration per bit, and works
>for any size ...

Here it is:

	num_set_bits = 0;
	while (word != 0) { /* there must be at least one bit set */
		num_set_bits++;
		word &= (word - 1); /* remove the least significant set bit */
		/* iterate... */
	}

Mark

jc@atcmp.nl (Jan Christiaan van Winkel) (09/25/90)

From article <37545@ut-emx.uucp>, by nwagner@ut-emx.uucp (Neal R. Wagner):
> 
>   I need a fast method to count the number of bits that are set in a 32-bit
> integer on a Sun 3/80 running 4.0.3 SunOS.  The brute-force method follows.

How about:

int bit_count(i)
unsigned long i;
{
	int k=0;
	for (; i!=0; i>>=1) k += (i & 1);
	return k;
}
-- 
___  __  ____________________________________________________________________
   |/  \   Jan Christiaan van Winkel      Tel: +31 80 566880  jc@atcmp.nl
   |       AT Computing   P.O. Box 1428   6501 BK Nijmegen    The Netherlands
__/ \__/ ____________________________________________________________________

gallag@hp-and.HP.COM (Mike Gallagher) (09/26/90)

>   I need a fast method to count the number of bits that are set in a 32-bit
> integer on a Sun 3/80 running 4.0.3 SunOS.  The brute-force method follows.
> Is there a faster way in C?  How close in speed to a hand-coded assembly
> routine would a fast method in C be (after the latter is run through the
> optimizer)?

Try this:

/**************************************************************************/

typedef	unsigned long bit_set;
static int bit_count();

static char bit_array[] = {
0, 1, 1, 2, 1, 2, 2, 3, 1, 2, 2, 3, 2, 3, 3, 4,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
1, 2, 2, 3, 2, 3, 3, 4, 2, 3, 3, 4, 3, 4, 4, 5,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
2, 3, 3, 4, 3, 4, 4, 5, 3, 4, 4, 5, 4, 5, 5, 6,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
3, 4, 4, 5, 4, 5, 5, 6, 4, 5, 5, 6, 5, 6, 6, 7,
4, 5, 5, 6, 5, 6, 6, 7, 5, 6, 6, 7, 6, 7, 7, 8
};

main()
{
        bit_set i;
        for (i=0; i<=256; i++)
                printf("%d has %d bits on\n", i, bit_count(i));
}

static int bit_count(i)
bit_set i;
{

        int count;

        count = bit_array[i&0xff];
        i >>= 8;
        count += bit_array[i&0xff];
        i >>= 8;
        count += bit_array[i&0xff];
        i >>= 8;
        count += bit_array[i&0xff];
        return count;

}

/**************************************************************************/

Mike Gallagher

parke@star.enet.dec.com (Bill Parke) (09/26/90)

In article <661@atcmpe.atcmp.nl>, jc@atcmp.nl (Jan Christiaan van Winkel)
writes:
|> From: jc@atcmp.nl (Jan Christiaan van Winkel)
|> Newsgroups: comp.lang.c
|> Subject: Re: count of bits set in a long
|> 
|> From article <37545@ut-emx.uucp>, by nwagner@ut-emx.uucp (Neal R. Wagner):
|> > 
|> >   I need a fast method to count the number of bits that are set in a
32-bit
|> > integer on a Sun 3/80 running 4.0.3 SunOS.  The brute-force method
follows.
|> 
|> How about:
|> 
|> int bit_count(i)
|> unsigned long i;
|> {
|> 	int k=0;
|> 	for (; i!=0; i>>=1) k += (i & 1);
|> 	return k;
|> }
|> -- 
|> ___  __ 
____________________________________________________________________
|>    |/  \   Jan Christiaan van Winkel      Tel: +31 80 566880  jc@atcmp.nl
|>    |       AT Computing   P.O. Box 1428   6501 BK Nijmegen    The
Netherlands
|> __/ \__/
____________________________________________________________________
|> 

Actually I like:

int bitcount(int i)
{
	register int c = 0 ;
	while(i) {
	    i &= (i-1) ;
	    c++ ;
	    }
	return c ;
}

Better.

--
Bill Parke 			parke%star.enet.dec@decwrl.dec.com
VMS Development			decwrl!star.enet.dec.com!parke
Digital Equipment Corp		parke@star.enet.dec.com
110 Spit Brook Road ZK01-1/F22, Nashua NH 03063

The views expressed are my own.

peter@neccan.oz (Peter Miller) (09/26/90)

In article <37545@ut-emx.uucp> nwagner@ut-emx.uucp (Neal R. Wagner) writes:
>
>  I need a fast method to count the number of bits that are set in a 32-bit
>integer on a Sun 3/80 running 4.0.3 SunOS.

HACKMEM 169

you can do it in straight line code, no loop.
A handful of shifts, a couple of subtracts,
and a modulo.  (Hint: modulo 63)

Look in the X11 sample server sources.  Grep for "HACKMEM".
D#! rnews 1014
Newsgroups: dcs.general
Path: anucsd!anucsd!daw
From: daw@anucsd.anu.oz.au (David Wanless)
Subject: Seamail - Wilderness diaries and calendars
Message-ID: <daw.654311041@anucsd>
Sender: news@anucsd.anu.oz.au (USENET News software)
Organization: ANU Department of Computer Science
Distribution: dcs
Date: 26 Sep 90 01:04:01 GMT

Wondering what to send to your overseas colleagues and relatives for
Christmas (knowing that seamail postage deadlines are fast approaching) ?
Why not send them an Australian wilderness calendar or diary ?  Wilderness
Society landscape and wildlife calendars, and ACF diaries, are all profusely
illustrated with beautiful photographs, and all proceeds go towards the
campaigns of TWS and ACF.  If interested, contact me, or
visit the Wilderness Shop in the Griffin Centre in Civic (ph. 249 8011). 

                    David Wanless, 
                    (Room 225, Computer Science Department, 
                     phone 5142 (afternoons),
                     e-mail to daw@anucsd).

sa1z+@andrew.cmu.edu (Sudheer Apte) (09/26/90)

> From article <37545@ut-emx.uucp>, by nwagner@ut-emx.uucp (Neal R. Wagner):
> > 
> >   I need a fast method to count the number of bits that are set in a 32-bit
> > integer on a Sun 3/80 running 4.0.3 SunOS. ...

jc@atcmp.nl (Jan Christiaan van Winkel) writes:

> [... a really neat iteration, one per bit at most, deleted ]

That was probably what Karl Heuer was talking about leaving for others.
I was thinking along these lines:

static int num_bits[256] = {0, 1, 1, 2, 1, ... /* etc. */ };

struct four_pack { char a; char b; char c; char d;};

int bit_count(i)
unsigned long i;
{
	struct four_pack f;
	f = (struct four_pack) i;
	return num_bits[f.a] + num_bits[f.b] + num_bits[f.c] +
		num_bits[f.d];
}

Would this be faster, in reality, than shifts and iteration? 

	Sudheer.
------------------
...{harvard, uunet}!andrew.cmu.edu!sa1z
sa1z%andrew@CMCCVB.BITNET

paul@athertn.Atherton.COM (Paul Sander) (09/26/90)

In article <37545@ut-emx.uucp> nwagner@ut-emx.uucp (Neal R. Wagner) writes:
>
>  I need a fast method to count the number of bits that are set in a 32-bit
>integer on a Sun 3/80 running 4.0.3 SunOS.  The brute-force method follows.
>Is there a faster way in C?  How close in speed to a hand-coded assembly
>routine would a fast method in C be (after the latter is run through the
>optimizer)?
>  Thanks in advance for your help.  Send e-mail and I will summarize for the
>net.
>
>===============================================================================
>#define	BIT_SETSIZE	32
>typedef	unsigned long	bit_set;
>#define BIT_ISSET(bit,bitset)	(((bitset) & (1<<(bit))) >> (bit))
>int bit_count();
>
>main()
>{
>	bit_set i;
>	for (i=0; i<=16; i++)
>		printf("%d has %d bits on\n", i, bit_count(i));
>}
>
>intint bit_count(i)
>bit_set i;
>{
>	int j, k;
>	k = 0;
>	for (j=0; j<BIT_SETSIZE; j++) k = k + BIT_ISSET(j, i);
>	return k;
>}
>===============================================================================

I can't think of a faster way to do it, but can think of a more portable
way (portable to two's complement machines, anyway):

int bit_count(i)
bit_set i;
{
	int k;
	for (k = 0 ; i != 0; i = i >> 1) k += (i & 1);
	return k;
}

Or:

int bit_count(i)
bit_set i;
{
	int k;
	for (k = 0 ; i != 0; i = i >> 1)
		if (i & 1) k++;
	return k;
}

Note that this requires the bit_set typedef to be unsigned, otherwise the
right-shifts are sign-extended and the loop becomes infinite.
-- 
Paul Sander        (408) 734-9822  | "Passwords are like underwear," she said,
paul@Atherton.COM                  | "Both should be changed often."
{decwrl,pyramid,sun}!athertn!paul  | -- Bennett Falk in "Mom Meets Unix"

ts@cup.portal.com (Tim W Smith) (09/26/90)

How about something like this?

int	NumberOfBitsInLong( register unsigned long input )
{
	register int count;

	for ( count = 0; input != 0; count++ )
		input &= input-1;

	return count;
}

						Tim Smith

davea@quasar.wpd.sgi.com (David B.Anderson) (09/26/90)

In article <md.654260871@achilles>, md@doc.ic.ac.uk (Mark Dawson) writes:
> In <3820@goanna.cs.rmit.oz.au> ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) writes:
> >There's a rather nice loop that does one iteration per bit, and works
> >for any size ...
> 
> Here it is:

See Harbison & Steele, C A Reference Manual,
first and second edition, Pages 174-176.

[ David B. Anderson  Silicon Graphics  (415)335-1548  davea@sgi.com ]
[``What can go wrong?''                          --Calvin and Hobbes]

roger@everexn.com (Roger House) (09/27/90)

In <37545@ut-emx.uucp> nwagner@ut-emx.uucp (Neal R. Wagner) writes:

>  I need a fast method to count the number of bits that are set in a 32-bit
>integer on a Sun 3/80 running 4.0.3 SunOS.  The brute-force method follows.
>Is there a faster way in C?  How close in speed to a hand-coded assembly
>routine would a fast method in C be (after the latter is run through the
>optimizer)?


The following approach should run faster since it operates on 4 bytes
rather than on 32 bits.  However, a table of 256 bytes is required.


static char num1bits[256] =
{
	/*   0 */	0,		/* num1bits[i] = no. of 1 bits	*/
	/*   1 */	1,		/*   in the binary representa-	*/
	/*   2 */	1,		/*   tion of the number i	*/
	/*   3 */	2,
	/*   4 */	1,
	/*   5 */	2,
		...
	/* 254 */	7,
	/* 255 */	8,
};



int count1bits(unsigned long x)

{
	return (
		num1bits[ x     & 0xff] + num1bits[(x>8)  & 0xff] + 
		num1bits[(x>16) & 0xff] + num1bits[(x>24) & 0xff]
	       );
}


						Roger House

parke@star.enet.dec.com (Bill Parke) (09/27/90)

In article <34281@cup.portal.com>, ts@cup.portal.com (Tim W Smith) writes:
|> From: ts@cup.portal.com (Tim W Smith)
|> Newsgroups: comp.lang.c
|> Subject: Re: count of bits set in a long
|> 
|> How about something like this?
|> 
|> int	NumberOfBitsInLong( register unsigned long input )
                                     ^^^^^^^^
		Not necessary
|> {
|> 	register int count;
|> 
|> 	for ( count = 0; input != 0; count++ )
|> 		input &= input-1;
	This will count bits in a signed number (see
	my previous posting of this using "while" )
|> 
|> 	return count;
|> }
|> 
|> 						Tim Smith
|> 
--
Bill Parke 			parke%star.enet.dec@decwrl.dec.com
VMS Development			decwrl!star.enet.dec.com!parke
Digital Equipment Corp		parke@star.enet.dec.com
110 Spit Brook Road ZK01-1/F22, Nashua NH 03063

The views expressed are my own.

rsalz@bbn.com (Rich Salz) (09/27/90)

X11R4/src/mit/server/dix/devices.c:
    int
    Ones(mask)                /* HACKMEM 169 */
	Mask mask;
    {
	register Mask y;

	y = (mask >> 1) &033333333333;
	y = mask - y - ((y >>1) & 033333333333);
	return (((y + (y >> 3)) & 030707070707) % 077);
    }

X11R4/src/mit/X11/X.h:
    typedef unsigned long Mask;
-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.
Use a domain-based address or give alternate paths, or you may lose out.

kdq@demott.COM (Kevin D. Quitt) (09/27/90)

In article <Mazyneu00WB6EAjWo8@andrew.cmu.edu> sa1z+@andrew.cmu.edu (Sudheer Apte) writes:
>I was thinking along these lines:
>
>static int num_bits[256] = {0, 1, 1, 2, 1, ... /* etc. */ };
>
>struct four_pack { char a; char b; char c; char d;};
>
>int bit_count(i)
>unsigned long i;
>{
>	struct four_pack f;
>	f = (struct four_pack) i;
>	return num_bits[f.a] + num_bits[f.b] + num_bits[f.c] +
>		num_bits[f.d];
>}
>
>Would this be faster, in reality, than shifts and iteration? 
>

    Massively.  Look at the code produced - it is generally two or three
instructions per byte. 

-- 
 _
Kevin D. Quitt         demott!kdq   kdq@demott.com
DeMott Electronics Co. 14707 Keswick St.   Van Nuys, CA 91405-1266
VOICE (818) 988-4975   FAX (818) 997-1190  MODEM (818) 997-4496 PEP last

                96.37% of all statistics are made up.

chris@mimsy.umd.edu (Chris Torek) (09/28/90)

In article <2859@litchi.bbn.com> rsalz@bbn.com (Rich Salz) posts the
`hackmem 169' technique for counting set bits in a 32 bit word:
>	y = (mask >> 1) &033333333333;
>	y = mask - y - ((y >>1) & 033333333333);
>	return (((y + (y >> 3)) & 030707070707) % 077);

This is similar to one of the methods summarized earlier, which also
used remainders (63 here, 255 there).  Note that remainder (`%') is
a `heavy' operation on many computers---you can typically do anywhere
from four to a several dozen `regular' instructions in the time it
takes to compute a remainder.  One of the other methods is therefore
likely to be faster.

(The nice thing about the `x &= x-1' loop is that it is independent of
the word size, and only runs for as many bits as are set.  For a 32 bit
word and a random distribution, that is 16 iterations average, so the
look-up-each-byte version is likely to be faster, requiring only four
table lookups and three additions.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 405 2750)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

tac@cs.brown.edu (Theodore A. Camus) (09/28/90)

>       unsigned long mask,y;
>
>	y = (mask >> 1) &033333333333;
>	y = mask - y - ((y >>1) & 033333333333);
>	return (((y + (y >> 3)) & 030707070707) % 077);


As chris@mimsy.umd.edu noted, '%' is usually a costly operation.
However, 077 in octal is 63 in decimal, and I believe the following
relationship is true : x % 63 = [(x % 64) + (x / 64)] % 63 and can 
be efficiently calculated by : 

        while (x > 63) {             /* 2 assembly instructions */
          x =  (x & 63)+(x >> 6);    /* 3 assembly instructions */
        }
        if (x = 63)                  /* 1 assembly instruction  */
          return 0                   /* 1 assembly instruction  */
        else                         /* fall through, take value of x */
          return x;                  /*    =  0 instructions         */
         
The loop is not expected to execute approx. log  x times, i.e. not many.
                                                63
Furthermore, although the lookup table method is conceptually elegant,
it has essentially NO locality of reference, and could easily cause several
cache misses, forcing a line fill just to access one byte, whereas this
code-only method will benefit from code pre-fetching etc.


my $2.0e-2  -  Ted



  CSnet:     tac@cs.brown.edu                          Ted Camus  
  ARPAnet:   tac%cs.brown.edu@relay.cs.net             Box 1910 CS Dept
  BITnet:    tac@browncs.BITNET                        Brown University
  "An ounce of example is worth a pound of theory."    Providence, RI 02912

jrv@sdimax2.mitre.org (VanZandt) (10/01/90)

tac@cs.brown.edu (Theodore A. Camus) writes...
> As chris@mimsy.umd.edu noted, '%' is usually a costly operation.
> However, 077 in octal is 63 in decimal, and I believe the following
> relationship is true : x % 63 = [(x % 64) + (x / 64)] % 63 ...

The corresponding algorithm for finding x%9 in the decimal world is
known as "casting out nines".

> ...and can be efficiently calculated by : 
...
>        if (x = 63)
               ^ should be ==
>          return 0 
                   ^ needs ;

The function is then...

bitcount(unsigned long mask)
{	unsigned long y;
	
	y = (mask >> 1) &033333333333L;
	y = mask - y - ((y >>1) & 033333333333L);
	y = (y + (y >> 3)) & 030707070707L;

			/* find y%077 using the relation 
				y % 077 = [(y % 0100) + (y / 0100)] % 077 */
	while (y > 077) {				/* 2 assembly instructions */
		y =  (y & 077)+(y >> 6);	/* 3 assembly instructions */
		}
	if (y == 077)					/* 1 assembly instruction  */
		return 0;					/* 1 assembly instruction  */
	else							/* fall through, take value of y */
		return y;					/*    =  0 instructions   */
}

Would the corresponding algorithm using 255 rather than 63 be faster?

                           - Jim Van Zandt

jrv@sdimax2.mitre.org (VanZandt) (10/02/90)

I think I have finally figured out how the first part of that bit count routine
is working.  I describe the algorithm as follows:

	1.	Establish a count in each 3 bit field of the number of bits set
		in the corresponding field of the input number.
	2.	Successively "double up": combine pairs of fields while summing
		their contents.
	3.	Sum up the contents of all the fields using a modulo operation.

We have to start with a 3 bit field, because a 3 bit counter can count
the bits in a 6 bit field, but a 2 bit counter can't count the bits in
a 4 bit field.  We have to "double up" at least once, because a 3 bit
counter can't count the bits in a 32 bit word, while a 6 bit counter
can.  NOTE: with a 64 bit word, we would have to "double up" at least TWICE.

We have two implementation decisions - how many times to "double up",
and whether to do the modulo operation directly or with a loop as
proposed by tac@cs.brown.edu.  The loop is favored if '%' is expensive, as
suggested by chris@mimsy.umd.edu, or if high order bits are usually not
set.

If the loop is used, it could pay to double up an extra time so fewer
iterations are needed.  (Finding x%077 requires log to the base 63 of
x, or ln(x)/ln(63), iterations.  x%07777 requires ln(x)/ln(07777)
iterations, or about half as many).  

If we double up four times, no modulo operation is needed at all.

For example...

bitcount4(unsigned long mask)
{	unsigned long y;

	y = (mask >> 1) &033333333333L;
	y = mask - y - ((y >>1) & 033333333333L);
			/* each 3 bit field now contains the count of 
				bits set in the corresponding field of 'mask' */
	y = ((y >> 3) + y) & 030707070707L;		/* double up */
			/* each 6 bit field now contains the count of 
				bits set in the corresponding field of 'mask' */
	y = ((y >> 6) + y) & 007700770077L;		/* double up 2nd time */
			/* each 12 bit field now contains the count of 
				bits set in the corresponding field of 'mask' */
	return y % 07777;
}

I coded up six variations in all using Turbo C: 

bitcount3: naive routine using (y>>1) 32 times
bitcount1: doubling up once, finding y%077 using loop in separate function
bitcount0: doubling up once, finding y%077 using loop 
bitcount5: doubling up 4 times, no modulo operation needed
bitcount4: doubling up twice, finding y%07777 directly
bitcount2: doubling up once, finding y%077 directly

The relative execution times on a '386, as recorded by the Turbo Profiler, are:

_bitcount3         2.14 sec  22% |**************************
_bitcount1+_mod63  0.80 sec   7% |********
_bitcount0         0.64 sec   6% |*******
_bitcount5         0.38 sec   3% |****
_bitcount4         0.35 sec   3% |****
_bitcount2         0.15 sec   1% |*

For this architecture, neither extra doublings nor use of the loop
appear helpful.  I seem to remember that an 8088 needs many more cycles
for an integer division, suggesting that the loop might help there.

                         - Jim Van Zandt

davidsen@antarctica.crd.GE.COM (william E Davidsen) (10/02/90)

  There have been lots of fast but non-portable solutions posted. I
think this problem lends itself to a non-portable solution for speed,
but please remember that if you assume (a) all longs have 32 bits or (b)
all computers have 2's complement arithmetic, or (c) all bytes are 8
bits...

                ** All the world's not a VAX (or Sun) **

and many of the solutions posted are not going to work.

  I like the one which said (paraphrase)

	while (n) {
	  count += (n & 1);
	  n >>= 1;
	}
-- 
Bill Davidsen (davidsen@crdos1.crd.ge.com, uunet!crdgw1!crdos1!davidsen)
  GE Corp R&D Center, Schenectady NY
  Moderator of comp.binaries.ibm.pc and 386users mailing list
       "This is your PC. This is your PC on OS/2. Any questions?"

kdq@demott.COM (Kevin D. Quitt) (10/03/90)

In article <12330@crdgw1.crd.ge.com> davidsen@antarctica.crd.GE.COM (william E Davidsen) writes:
>
>  There have been lots of fast but non-portable solutions posted.
>
>  I like the one which said (paraphrase)
>
>	while (n) {
>	  count += (n & 1);
>	  n >>= 1;
>	}

    This one, of course, assumes that the shift is logical, not arithmetic,
or your first negative number will keep you occupied for quite a while.  I
think the best solution is:

	char	*ptr	= (char *)&lvar;
	int	 count	= 0;

	for ( i = 0; i < sizeof( long ); i++ )
	    count	+= bits_in_byte[ *p++ ];
-- 
 _
Kevin D. Quitt         demott!kdq   kdq@demott.com
DeMott Electronics Co. 14707 Keswick St.   Van Nuys, CA 91405-1266
VOICE (818) 988-4975   FAX (818) 997-1190  MODEM (818) 997-4496 PEP last

                96.37% of all statistics are made up.

karl@haddock.ima.isc.com (Karl Heuer) (10/03/90)

In article <12330@crdgw1.crd.ge.com> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>[Aren't these solutions non-portable?]

Didn't the original problem ask for the bit count of "a 32-bit word"?  In
any case, most of the solutions can be easily configured to use the constants
from <limits.h>.  (I may have left them out of my solution, for brevity.)

The two's complement assumption isn't a problem if you define the argument to
be an unsigned type (which it *should* be, if you're doing bit manipulations).

Karl W. Z. Heuer (karl@kelp.ima.isc.com or ima!kelp!karl), The Walking Lint