[comp.lang.c] which bits are set

hartley@emily.uvm.edu (Stephen J. Hartley) (12/13/90)

  Remember the series of articles on counting the number of bits that are
set in a 32-bit integer?  I have a related problem: determining the position
numbers of all the bits that are set in a 32-bit integer.  The "brute force"
approach follows.  Is there a faster (clever) way to do this in C?  No, I am
not a student trying to get the net to do my homework.
  Thanks in advance for your help.  Send e-mail and I will summarize for the
net.
                        Steve Hartley
                        Computer Science faculty
                        Trinity University in San Antonio

===============================================================================
#define BIT_SETSIZE     32
typedef unsigned long   bit_set;

#define NELEM(array) (sizeof(array)/sizeof(array[0]))

void which_bits_set();

main() {
/*
 * Some test data.
 */
        static bit_set test[] = {
                0x7, 0x40000006, 0x7007, 0xf2f4, 0x1f2001f4, 0x80000000,
                0xffffffff, 0xabcabcdd, 0x0, 0x10000000, 0x00070000,
                0x11111111, 0x22222222, 0x33333333, 0x44444444
        };
        int i, m;
        bit_set set_of_bits;
        int bits_are_set[BIT_SETSIZE];

        for (i=0; i<NELEM(test); i++) {
                set_of_bits = test[i];
                printf("0x%08x has these bits set:", set_of_bits);
                which_bits_set(bits_are_set, set_of_bits);
                for (m=0; m<BIT_SETSIZE; m++) {
                        if (bits_are_set[m] < 0) break;
                        printf(" %d", bits_are_set[m]);
                }
                printf("\n");
        }
}

/*
 * This procedure determines which bits are set in a 32-bit long integer
 * and fills up an array with the position numbers (zero based) of the
 * set bits.  A -1 is put into the array if there is an unused slot to
 * serve as a terminator.
 */

void which_bits_set(bits_are_set, set_of_bits)
int bits_are_set[]; bit_set set_of_bits;
{
        int j, m;

        m = 0;
        for (j=0; j<BIT_SETSIZE; j++) {
                if (set_of_bits & 1) {
                        bits_are_set[m] = j;
                        m++;
                }
                set_of_bits = set_of_bits >> 1;
        }
        if (m < BIT_SETSIZE) bits_are_set[m] = -1; /* terminator if room */
}
===============================================================================
Stephen J. Hartley, Assistant Professor, phone O:(512)736-7480, H:344-6523
Department of Computer Science, Trinity University, San Antonio, TX 78212
hartley@uvm.edu || ...!uvm.edu!hartley || ...!uunet!uvm-gen!hartley

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/17/90)

In article <1990Dec12.205114.2376@uvm.edu> hartley@emily.uvm.edu (Stephen J. Hartley) writes:
> I have a related problem: determining the position
> numbers of all the bits that are set in a 32-bit integer.  The "brute force"
> approach follows.  Is there a faster (clever) way to do this in C?

The brute force (i.e., fast) approach is to use a big table.

Even better, represent sets of numbers as integers.

---Dan

sef@kithrup.COM (Sean Eric Fagan) (12/17/90)

In article <3047:Dec1618:51:1590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>The brute force (i.e., fast) approach is to use a big table.

Eek.  Why do that?  The only way I can think of to do it faster than the
presented method is to get rid of the shift and loop:

	set_bits[0] = val&1;
	set_bits[1] = val&2;
	set_bits[2] = val&4;
	/* ... */
	set_bits[31] = val&0x80000000;

Takes 32 sequential statements; on some machines, it will take 32
instructions, while on others, it might take 64.  It should not take more
than that, though (r0 <= val&mask; (set_bits+x) <= r0).

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (12/18/90)

In article <1990Dec17.071404.6544@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:
> In article <3047:Dec1618:51:1590@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> >The brute force (i.e., fast) approach is to use a big table.
> Eek.  Why do that?

For speed. The exact answer depends on how you want to store the sets.
If you take my half-facetious suggestion of storing the sets as
integers, it takes 0 operations.

> The only way I can think of to do it faster than the
> presented method is to get rid of the shift and loop:
  [ ... ]
> Takes 32 sequential statements; on some machines, it will take 32
> instructions, while on others, it might take 64.

Eek. Why do you want to make this so slow?

---Dan

sef@kithrup.COM (Sean Eric Fagan) (12/18/90)

In article <15598:Dec1804:57:0390@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>Eek. Why do you want to make this so slow?

Well, let's see.  The problem was to convert something like

	0x12345678

into something like

	bit_set[0] = 0;
	bit_set[1] = 0;
	bit_set[2] = 0;
	bit_set[3] = 1;
/* ... */

Using a table, the way I would think of is something like

	struct bits_set_struct {
		/* int val; -- optional */
		int bit1, bit2, bit3, bit4;
	} bits_set_array[16] = {
		{ 0, 0, 0, 0 },
		{ 1, 0, 0, 0 },
		{ 0, 1, 0, 0 },
		{ 1, 1, 0, 0 },
	/* ... */
	};

	/* ... */

	get_bits_set(word32_t val, int *array) {
		word32_t temp;
		int i;

		for (i=0; i<8; i++) {
			temp = val&0xf;
			memcpy (array, bit_set_array[temp], 4*sizeof(int));
			array+=4;
			val>>=4;
		}
	}

Right?  You could increase the size of the tables, of course, at expense of
memory.

But, to be honest, I don't think that (the one I just did) is going to be
any faster than the method I suggested earlier.  Since I don't really have a
lot of interest in this (just some spare time before going to sleep 8-)), I
haven't tested and timed them.  They're probably so close it doesn't matter
for most purposes (i.e., I'm not going to run a few million or billion tests
to see which one is faster).

-- 
Sean Eric Fagan  | "I made the universe, but please don't blame me for it;
sef@kithrup.COM  |  I had a bellyache at the time."
-----------------+           -- The Turtle (Stephen King, _It_)
Any opinions expressed are my own, and generally unpopular with others.

gceych@juliet.caltech.edu (Eychaner, Glenn C.) (12/19/90)

In article <1990Dec18.093905.17196@kithrup.COM>, sef@kithrup.COM (Sean Eric Fagan) writes...
>In article <15598:Dec1804:57:0390@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>>Eek. Why do you want to make this so slow?
> 
Ok, to throw my two cents in, this is the algorithm I use...

int bit_set_array [SIZE];

void test_for_bits (int test_int)
{
int j = 0;

while (test_int) {
    bit_set_array [j] = test_int & 1;
    test_int >>= 1;
    j++;
    }
for (; j < SIZE; j++) bit_set_array [j] = 0;
}

Note that this has the advantage of dropping out if test_int is small.
Alternatively:

while (j < SIZE) {

has the advantage of not filling the array at the end.
(I hope this is right.  It's a conversion from my parity calculation program,
 which is:)

parity = 0;
while (test_int) {
    parity ^= test_int & 1;
    test_int >>= 1;
    }
return (parity);

I love bitwise operators!

Glenn Eychaner   |Eychaner@SunCub.Caltech.edu |Remember: It is easier to ride a
40386 N Shore Ln |gceych@iago.caltech.edu     |camel through the eye of a needle
Big Bear City, CA| Big Bear Solar Observatory |than to drive a Buick through the
            92314| !*** G O  N I N E R S ***! |hole in a doughnut.

hilfingr@rama.cs.cornell.edu (Paul N. Hilfinger) (12/19/90)

In article <1990Dec18.093905.17196@kithrup.COM> sef@kithrup.COM (Sean Eric Fagan) writes:

>Well, let's see.  The problem was to convert something like
>
>	0x12345678
>
>into something like
>
>	bit_set[0] = 0;
>	bit_set[1] = 0;
>	bit_set[2] = 0;
>	bit_set[3] = 1;

Well, no it wasn't.  The original problem was to convert 0x12345678
into something like

	bit_set[0] = 3;
	bit_set[1] = 4;
	bit_set[2] = 5;
	bit_set[3] = 6;
	bit_set[4] = 9;
	...
	bit_set[15] = -1;   /* Excuse me if my counting is off. */

for which I suspect Mr. Bernstein's proposed solution sketch is
probably pretty good.

P. Hilfinger