[comp.lang.c] Divide and C

karln@uunet.uu.net (Karl Nicholas) (03/28/91)

    Greetings most knowledgable C experts.

    I have a problem.


    I get a number. I need to use this number to index into a
    2 dimensional array.

    EX: int array[10][10];
        val = 24;
        ++array[val/10][val%10];

    This is the only way I know how to do this.
    My problem with it is that is requires TWO divides.
    
    Divide 1: val/10. Here the result is used. The remainder
                      is thrown away.

    Divide 2: val%10. Here the remainder is used, the result is thrown away.

    I happen to know that divide instructions are one of the MOST time
    consuming instructions around. I measured it.

    Wanting my program to run almost twice as fast, I wrote an ASSEMBLY
    lang subroutine, to be called from C.

    void ASMDIVIDE( int dividend, int divisor, int *result, int* remain );

    so now I have in my code:

    int, result, remain;
    int array[10][10];
  
    ASMDIVIDE ( val, 10, &result, &remain );
    ++array[result][remain];

    And got most of my desired increase in performance.

    Since I do not consider ASSEMBLY to be portable, how do I do
    this in a C only and portable way?

    I will summarize any e-mailed responses.

    Thanks all.

    Karl Nicholas

    karln!karln@uunet.uu.net

bhoughto@hopi.intel.com (Blair P. Houghton) (03/28/91)

In article <1991Mar27.185804.7221@uunet.uu.net> karln!karln@uunet.uu.net (Karl Nicholas) writes:
>    EX: int array[10][10];
>        val = 24;
>        ++array[val/10][val%10];

ANS:    int array[10][10];              /* for example */
        int dividend, remainder;        /* to use as array indices */

	val = 24;			/* for example */

	dividend = val/10;
	remainder = val - val*dividend;	/* the definition of `%' */
	++array[dividend][remainder];	/* dividend more significant */

This depends on the fact that an integer division
takes even longer than two assignments, a multiplication,
a subtraction, and whatever hocus-pocus the compiler
uses when it sees variables in the array indices...

				--Blair
				  "Lawyer:  never ask a question to which
				   you don't already know the answer.
				   Programmer:  never ask a question to
				   which you do already know the answer..."

darcy@druid.uucp (D'Arcy J.M. Cain) (03/28/91)

In article <1991Mar27.185804.7221@uunet.uu.net> Karl Nicholas writes:
>    EX: int array[10][10];
>        val = 24;
>        ++array[val/10][val%10];
>
>    This is the only way I know how to do this.
>    My problem with it is that is requires TWO divides.

How about ZERO divides?  Use "++array[val];"  The following test program
works under GCC.

#include	<stdio.h>
int		main(void)
{
	int		a[10][10], *p, v = 24;

	p = (int *)(a);
	p[v] = 0x1234;
	printf("%x\n", a[2][4]);
	return(0);
}
-- 
D'Arcy J.M. Cain (darcy@druid)     |
D'Arcy Cain Consulting             |   There's no government
Toronto, Ontario, Canada           |   like no government!
+1 416 424 2871                    |

volpe@camelback.crd.ge.com (Christopher R Volpe) (03/29/91)

In article <3508@inews.intel.com>, bhoughto@hopi.intel.com (Blair P.
Houghton) writes:
|>In article <1991Mar27.185804.7221@uunet.uu.net>
karln!karln@uunet.uu.net (Karl Nicholas) writes:
|>>    EX: int array[10][10];
|>>        val = 24;
|>>        ++array[val/10][val%10];
|>
|>ANS:    int array[10][10];              /* for example */
|>        int dividend, remainder;        /* to use as array indices */
|>
|>	val = 24;			/* for example */
|>
|>	dividend = val/10;
|>	remainder = val - val*dividend;	/* the definition of `%' */
|>	++array[dividend][remainder];	/* dividend more significant */

Since what Karl obviously wants is the <val>th entry into the
2D array, in "storage-order", this can be done without any divisions
whatsoever. Remember, after you explicitly compute the quotient and
remainder, the compiler is going to do the complete opposite when it needs
to compute the offset from the array indices. Therefore, just bypass both
steps like this:

     (&array[0][0])[val]

I don't think you can get much more efficient than that. "&array[0][0]"
is evaluated at compile time as a pointer to the first int in the 2D array,
and then [val] just offsets it as if it were a regular one-dimensional array.

-Chris

P.S.:
  I was going to suggest the following:
   ((int *)array)[val]

which seems cleaner to me because it doesn't have the "&" and "[]" which
cancel each other out, so to speak. However, I figured someone would
come up with a reason why casting from "pointer to array [10] of int"
to "pointer to int" is not a great idea. Hmm. Considering that
"array[0]" is of type "array [10] of int" which decays to "pointer to int"
in a value context, perhaps we can just say this:
   (array[0])[val]
or, equivantly(?)
   array[0][val]

Anyone care to comment on the strictly-conformingness of the above
four expressions?
                        
==================
Chris Volpe
G.E. Corporate R&D
volpecr@crd.ge.com

steve@taumet.com (Stephen Clamage) (03/29/91)

karln!karln@uunet.uu.net (Karl Nicholas) writes:

>    I get a number. I need to use this number to index into a
>    2 dimensional array.

>    EX: int array[10][10];
>        val = 24;
>        ++array[val/10][val%10];

>    This is the only way I know how to do this.
>    My problem with it is that is requires TWO divides.

Some compilers are smart enough to recognize a pattern like
	array[val/10][val%10]
and do a single division, using the resulting quotient and remainder.
Some are smart enough to recognize this pattern across statement
boundaries, e.g.,
	i = a/b; j = a%b;

If your compiler is not this smart and you know that the time spent in
the extra division is making your program unacceptably slow, you can
try the Standard C library function div(), which takes a numerator
and denominator and returns a structure containing the quotient and
remainder.  Use of this function might turn out to be slower than
the extra divide.
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

lamont@uts.amdahl.com (Duane Richard LaMont) (03/29/91)

In article <1991Mar27.185804.7221@uunet.uu.net> karln!karln@uunet.uu.net (Karl Nicholas) writes:
>    EX: int array[10][10];
>        val = 24;
>        ++array[val/10][val%10];
>
>    This is the only way I know how to do this.
>    My problem with it is that is requires TWO divides.
>    
>    Wanting my program to run almost twice as fast, I wrote an ASSEMBLY
>    lang subroutine, to be called from C.

Considering that you've already resorted to assembly language, the
following hack shouldn't bother you.  Provided that the array is
defined as in your example with both dimensions specified, you could
use:

	++((int *) array)[val];

or even:

	++*(*array + val);

However, if your program uses arrays of pointers to arrays of
integers, the above won't work and there is no counterpart solution.
See the FAQ if you don't know what I'm talking about.


Rick LaMont

karln@uunet.uu.net (03/30/91)

>
>If your compiler is not this smart and you know that the time spent in
>the extra division is making your program unacceptably slow, you can
>try the Standard C library function div(), which takes a numerator
>and denominator and returns a structure containing the quotient and
>remainder.  Use of this function might turn out to be slower than
>the extra divide.
>
>Steve Clamage, TauMetric Corp, steve@taumet.com

	This shows a good question someone pointed out
about the speed of an assembly divide plus a multiply
verses doing the real binary divide in C with subtracts and
shifts. Here is the posting that did not make into the
summary.


int twodivide(val,divisor,&result,&remain)
{
        int r;
        r=val/divisor;
        *result=r;
        *remain=val-divisor*r;
}
 
that's with a multiply (divisor*r) - but that's probably a lot faster 
than looping, subtracting divisor each time, adding 1 onto r, you know.

Russell Schulz             ersys!rschulz@nro.cs.athabascau.ca
Edmonton Remote Systems:  Serving Northern Alberta since 1982


  Do you suppose that is true? I do have that 'Binary Divide' code around 
somewhere. I think I will find it and do a benchmark. Stay tuned. 

	Karl Nicholas
	karln!karln@uunet.uu.net

PS. I wonder if the optimizer will recoqnize the Binary Divide
routine and just do an Assembly Divide :-)  

HARM@SLACVM.SLAC.STANFORD.EDU (03/30/91)

An unportable but fun way;
         waste some space to make your storage 2i X 2j
         use the binary representation of the index
                 clip it between 2j and 2(j+1) with mask
                 shift upper to byte boundary and evaluate
both numbers determined by shift and mask. Kept in registers
could be pretty quick. if dimensions are under 256X256 it
could even maintain a great deal of portability.

       Would have sent to you directly but !'s are confusing to my
mailer.







       For that which has spewn forth, blame not my company, my university,
nor my country.  'Tis my own thoughts bubbled over's all.  Blame most
properly the oversugaring of coffee by me.
                                               Jim Harm
                            SLACVM.SLAC.STANFORD.EDU

neufeld@aurora.physics.utoronto.ca (Christopher Neufeld) (03/30/91)

   On the subject of slow divides, I managed to cut ten percent off the
execution time of a program by doing division of floating points by
powers of two directly. My machine doesn't have a coprocessor (yet). I
wrote a test program to see just how much the divide was costing me:

   for (i=1;i<=N;i++)
     for (j=1;j<=N;j++)
       a[i][j] = b[i][j] / 4.0;

with 'a' and 'b' pointers to pointers to doubles (64-bit IEEE in my
compiler).

   I compared that to the following:
   int *ptr;
   double hold;

   for (i=1;i<=N;i++)
     for (j=1;j<=N;j++) {
       hold = b[i][j];
       ptr = (int *) &hold;
       ptr += 3;
       *ptr -= 0x20;
       a[i][j] = hold;
     }

and despite all the extra junk in there, it still worked almost three
times faster. Multiplication was faster by about a factor of two. In the
end, I didn't implement this in any code because it's obscure, to say
the least, and because it can fail badly if the operation is being
applied to a number less than about 2^(-1022), or about 1e-306.

   Are there compilers which go to the trouble of hardcoding something
like this if they see division or multiplication by integer powers of
two, or do they all get lazy and assume a math coprocessor?


-- 
 Christopher Neufeld....Just a graduate student  | Flash: morning star seen
 neufeld@aurora.physics.utoronto.ca    Ad astra! | in evening! Baffled
 cneufeld@{pnet91,pro-cco}.cts.com               | astronomers: "could mean
 "Don't edit reality for the sake of simplicity" | second coming of Elvis!"

gwyn@smoke.brl.mil (Doug Gwyn) (03/30/91)

In article <1991Mar29.182919.20728@helios.physics.utoronto.ca> neufeld@aurora.physics.utoronto.ca (Christopher Neufeld) writes:
>   Are there compilers which go to the trouble of hardcoding something
>like this if they see division or multiplication by integer powers of
>two, or do they all get lazy and assume a math coprocessor?

Most compilers I've dealt with have optimized integer division and
multiplication by constant powers of two into shift instructions,
but not for floating-point operands.

By the way, it is not necessarily "lazy" to rely on an FPU for speed.
Anyone who has serious floating-point demands is going to get one
anyway.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/01/91)

In article <b7mN01vS4cVi00@amdahl.uts.amdahl.com> lamont@amdahl.uts.amdahl.com (Duane Richard LaMont) writes:
> 	++((int *) array)[val];
> However, if your program uses arrays of pointers to arrays of
> integers, the above won't work and there is no counterpart solution.

This is a common argument of Fortran programmers against the common C
trick of keeping a table of pointers to each row of an array. It's an
incorrect argument. You can allocate the array contiguously, and use the
pointer to the first row *exactly* the same way as ``array'' in the
above example.

(Btw, I'm not accusing you of being a Fortran programmer. :-) )

---Dan