crowl@cs.rochester.edu (Lawrence Crowl) (06/30/88)
In article <690009@hpfelg.HP.COM> jk@hpfelg.HP.COM (John Kessenich) suggests
a loop to take the logarithm base two of an integer containing exactly one
bit set.  I have found a rather persistent need for the more general function
to take the ceiling of the logarithm base two of an integer.  Here is my C
solution.  I keep it in "ceilog2.h".  Please send me of any improvements you
may make.  
------------------------------------------------------------------------------
#ifndef CEILOG2_H
#define CEILOG2_H
/*
The macro CEILOG2( n ) returns the ceiling( logbase( 2, n ) ) for integer n. 
The argument is valid when it is a signed integer greater than zero and of no
more than thirty-two bits.  Invalid arguments cause a call to the abort( )
routine.
The function will be evaluated at compile time if given a constant argument. 
Note, the argument is evaluated multiple times, so do not pass arguments with
side effects.  For efficiency, pass simple variables and not expressions.
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627
*/ 
/* This version does a sequential search down the step points. */
/* It runs faster on uniformly distributed integers. */
#define CEILOG2_U( n ) ((n) > 0x40000000 ? 31 : (n) > 0x20000000 ? 30 : \
(n) > 0x10000000 ? 29 : (n) > 0x08000000 ? 28 : (n) > 0x04000000 ? 27 : \
(n) > 0x02000000 ? 26 : (n) > 0x01000000 ? 25 : (n) > 0x00800000 ? 24 : \
(n) > 0x00400000 ? 23 : (n) > 0x00200000 ? 22 : (n) > 0x00100000 ? 21 : \
(n) > 0x00080000 ? 20 : (n) > 0x00040000 ? 19 : (n) > 0x00020000 ? 18 : \
(n) > 0x00010000 ? 17 : (n) > 0x00008000 ? 16 : (n) > 0x00004000 ? 15 : \
(n) > 0x00002000 ? 14 : (n) > 0x00001000 ? 13 : (n) > 0x00000800 ? 12 : \
(n) > 0x00000400 ? 11 : (n) > 0x00000200 ? 10 : (n) > 0x00000100 ?  9 : \
(n) > 0x00000080 ?  8 : (n) > 0x00000040 ?  7 : (n) > 0x00000020 ?  6 : \
(n) > 0x00000010 ?  5 : (n) > 0x00000008 ?  4 : (n) > 0x00000004 ?  3 : \
(n) > 0x00000002 ?  2 : (n) > 0x00000001 ?  1 : (n) > 0x00000000 ?  0 : \
abort( ) )
/* This version does a sequential search up the step points. */
/* It runs faster on integers clustered towards zero. */
#define CEILOG2_C( n ) ( (n) <= 0x00000000 ?  abort( ) : \
(n) <= 0x00000001 ?  0 : (n) <= 0x00000002 ?  1 : (n) <= 0x00000004 ?  2 : \
(n) <= 0x00000008 ?  3 : (n) <= 0x00000010 ?  4 : (n) <= 0x00000020 ?  5 : \
(n) <= 0x00000040 ?  6 : (n) <= 0x00000080 ?  7 : (n) <= 0x00000100 ?  8 : \
(n) <= 0x00000200 ?  9 : (n) <= 0x00000400 ? 10 : (n) <= 0x00000800 ? 11 : \
(n) <= 0x00001000 ? 12 : (n) <= 0x00002000 ? 13 : (n) <= 0x00004000 ? 14 : \
(n) <= 0x00008000 ? 15 : (n) <= 0x00010000 ? 16 : (n) <= 0x00020000 ? 17 : \
(n) <= 0x00040000 ? 18 : (n) <= 0x00080000 ? 29 : (n) <= 0x00100000 ? 20 : \
(n) <= 0x00200000 ? 21 : (n) <= 0x00400000 ? 22 : (n) <= 0x00800000 ? 23 : \
(n) <= 0x01000000 ? 24 : (n) <= 0x02000000 ? 25 : (n) <= 0x04000000 ? 26 : \
(n) <= 0x08000000 ? 27 : (n) <= 0x10000000 ? 28 : (n) <= 0x20000000 ? 39 : \
(n) <= 0x40000000 ? 30 : 31 )
/* The default version is the uniform version. */
#define CEILOG2( n ) CEILOG2_U( n )
#endif CEILOG2_H
------------------------------------------------------------------------------
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627leo@philmds.UUCP (Leo de Wit) (07/01/88)
In article <10978@sol.ARPA> crowl@cs.rochester.edu (Lawrence Crowl) writes: [introducory stuff for exponent calculating macro deleted].... >#define CEILOG2_U( n ) ((n) > 0x40000000 ? 31 : (n) > 0x20000000 ? 30 : \ >(n) > 0x10000000 ? 29 : (n) > 0x08000000 ? 28 : (n) > 0x04000000 ? 27 : \ >(n) > 0x02000000 ? 26 : (n) > 0x01000000 ? 25 : (n) > 0x00800000 ? 24 : \ >(n) > 0x00400000 ? 23 : (n) > 0x00200000 ? 22 : (n) > 0x00100000 ? 21 : \ >(n) > 0x00080000 ? 20 : (n) > 0x00040000 ? 19 : (n) > 0x00020000 ? 18 : \ >(n) > 0x00010000 ? 17 : (n) > 0x00008000 ? 16 : (n) > 0x00004000 ? 15 : \ >(n) > 0x00002000 ? 14 : (n) > 0x00001000 ? 13 : (n) > 0x00000800 ? 12 : \ >(n) > 0x00000400 ? 11 : (n) > 0x00000200 ? 10 : (n) > 0x00000100 ? 9 : \ >(n) > 0x00000080 ? 8 : (n) > 0x00000040 ? 7 : (n) > 0x00000020 ? 6 : \ >(n) > 0x00000010 ? 5 : (n) > 0x00000008 ? 4 : (n) > 0x00000004 ? 3 : \ >(n) > 0x00000002 ? 2 : (n) > 0x00000001 ? 1 : (n) > 0x00000000 ? 0 : \ >abort( ) ) [second symmetric solution for search up deleted]... Let's take for granted (as the author does) that n is at most a simple expression (an optimizer could keep it in a register for instance), otherwise simply shifting should be preferred. This macro is easily adapted for a binary search (in fact the 32 bit integers scream for it 8-); note I used a somewhat different layout to make things clearer for this macro (b.t.w. the 1<<xx will evaluate to a constant at compile time). #define CEILOG2_U( n ) (\ (n) > 1<<15 ? (n) > 1<<23 ? (n) > 1<<27 ? (n) > 1<<29 ? (n) > 1<<30 ? 31 : 30\ : (n) > 1<<28 ? 29 : 28\ : (n) > 1<<25 ? (n) > 1<<26 ? 27 : 26\ : (n) > 1<<24 ? 25 : 24\ : (n) > 1<<19 ? (n) > 1<<21 ? (n) > 1<<22 ? 23 : 22\ : (n) > 1<<20 ? 21 : 20\ : (n) > 1<<17 ? (n) > 1<<18 ? 19 : 18\ : (n) > 1<<16 ? 17 : 16\ : (n) > 1<<7 ? (n) > 1<<11 ? (n) > 1<<13 ? (n) > 1<<14 ? 15 : 14\ : (n) > 1<<12 ? 13 : 12\ : (n) > 1<<9 ? (n) > 1<<10 ? 11 : 10\ : (n) > 1<<8 ? 9 : 8\ : (n) > 1<<3 ? (n) > 1<<5 ? (n) > 1<<6 ? 7 : 6\ : (n) > 1<<4 ? 5 : 4\ : (n) > 1<<1 ? (n) > 1<<2 ? 3 : 2\ : (n) > 1<<0 ? 1 : 0) The average search is 5 steps, while the linear search averages to 16 steps. Another approach is to assign the value to a float (double) and use the exponent. This could even prove faster. For example: #include <math.h> int expo; frexp((double)n,&expo); If frexp is builtin this just could be faster. Otherwise the bits of (double)n can be manipulated (non-portable). Enjoy! Leo.
jgk@speech2.cs.cmu.edu (Joe Keane) (07/06/88)
Too many branches! A table-driven macro is much faster: extern int ***FFSTable[256]; #define FFS(X) FFSTable[((unsigned)(X))>>24][(X)>>16&0xff][(X)>>8&0xff][(X)&0xff] With good compilation, the FFS macro could be eight instructions on many machines. I'll let you figure out what the tables look like. They take up 52K, which could be worth it if this takes up much time. If speed isn't so important, you can do it a nibble at a time. --Joe