[comp.lang.c] Careful "for" Loops

torek@elf.ee.lbl.gov (Chris Torek) (03/20/91)

In article <MCDANIEL.91Mar19124111@dolphin.adi.com> mcdaniel@adi.com
(Tim McDaniel) writes:
>Case a)
>   semi-open intervals, like [low,high).  low is in the range, but
>   high is not.  If high==0, the range extends from low through all
>   higher memory.  The problem is that high==0 is likely to prove a
>   special case in iteration.
>Case b)
>   closed intervals, like [low,high].  The range is inclusive: high is
>   in the range. The problem is that upper bounds look ugly, like
>   0x01ffffff.

>... Zero-length ranges are a possibility.

Languages that do this sort of thing usually take closed intervals,
although there are ways to handle both.

For instance, the loop

	for i := 1 to 300 do foo

(in Pascal) is usually generated as

	if (1 <= 300) {
		for (i = 1;; i++) {
			foo;
			if (i == 300)
				break;
		}
	}

(equivalent C code).  (Yes, it is OK to leave `i' unset; Pascal index
variables may not be examined after the loop ends, until they are set
to some other values.)

If there is a step, the loop must compute the terminator value (or
the number of iterations):

	for i := 1 to 4 by 2

should compare for 3 (for ending) or compute a count of ((4-1)+1)/2
iterations.  In some cases this can be done at compile time.

To do the same for half-open intervals, simply subtract one from the
end value (using unsigned arithemtic, if necessary, to avoid underflow)
and do the same.  The loop

	for i in [m..n) do foo;

can be `compiled' to

	if (m < n) {
		stop = n - 1;
		for (i = 0;; i++) {
			foo;
			if (i == stop)
				break;
		}
	}

Iteration count computations are identical except that instead of

	((end - start) + 1) / incr

you simply use

	(end - start) / incr
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov

gwyn@smoke.brl.mil (Doug Gwyn) (03/21/91)

In article <MCDANIEL.91Mar19124111@dolphin.adi.com> mcdaniel@adi.com (Tim McDaniel) writes:
>1) How do you code a "for" loop to avoid overflow gotchas?

(a) Use pointers instead.
(b) Don't push against the very limit of the representable range.
(c) Use unsigned types and compare for equality with zero.
(d) Consider each case on its own merits.

willcr@bud.sos.ivy.isc.com (Will Crowder) (03/22/91)

In article <15524@smoke.brl.mil>, gwyn@smoke.brl.mil (Doug Gwyn) writes:
|> In article <MCDANIEL.91Mar19124111@dolphin.adi.com> mcdaniel@adi.com (Tim McDaniel) writes:
|> >1) How do you code a "for" loop to avoid overflow gotchas?
|> 
|> (a) Use pointers instead.
|> (b) Don't push against the very limit of the representable range.
|> (c) Use unsigned types and compare for equality with zero.
|> (d) Consider each case on its own merits.

On a somewhat related note, the ANSI C standard guarantees that the address of
the element one past the end of an array is representable and won't overflow.
This makes loops like

{
	int a[10];
	int *p, *q;

	/* point q and at end a */

	q = &a[sizeof a / sizeof a[0]];

	for (p = a; p != q; p++)
		useful_work_goes_here();

}

work properly.  [The (sizeof a / sizeof a[0]) expression keeps the explicit 
knowledge of the size of the array in its declaration.]  The implementation
must place the array a such that &a[10] is representable and will not
overflow.

Will

--------------------------------------------------------------------------------
Will Crowder, MTS            | "I belong to no organized politcal party.  
(willcr@ivy.isc.com)         |  I'm a democrat."  
INTERACTIVE Systems Corp.    |		-- Will Rogers

berg@marvin.e17.physik.tu-muenchen.de (Stephen R. van den Berg) (03/26/91)

Tim McDaniel writes:
>Case a)
>   semi-open intervals, like [low,high).
>Case b)
>   closed intervals, like [low,high].

Use case a) like intervals, seems to be more intuitive (more "round" numbers).

>1) How do you code a "for" loop to avoid overflow gotchas?

 unsigned long i,low,high,incr;
	for(i=low;(long)(high-i)>0;i+=incr)

The only restrictions here are: high-low<=LONG_MAX && incr<=LONG_MAX && incr>0

The first restriction can be avoided if you code it as follows:

 unsigned long i,low,high,incr; 
        for(i=low;i-high>=incr;i+=incr)

Restriction is now: high-low<=UNSIGNED_LONG_MAX-incr
--
Sincerely,                 berg@marvin.e17.physik.tu-muenchen.de
           Stephen R. van den Berg.
"I code it in 5 min, optimize it in 90 min, because it's so well optimized:
it runs in only 5 min.  Actually, most of the time I optimize programs."

browns@iccgcc.decnet.ab.com (Stan Brown) (03/26/91)

In article <1991Mar21.181807.24059@ism.isc.com>, willcr@bud.sos.ivy.isc.com (Will Crowder) writes:
> On a somewhat related note, the ANSI C standard guarantees that the address of
> the element one past the end of an array is representable and won't overflow.
> This makes loops like
> 
> {
> 	int a[10];
> 	int *p, *q;
> 
> 	/* point q and at end a */
> 
> 	q = &a[sizeof a / sizeof a[0]];
> 
> 	for (p = a; p != q; p++)
> 		useful_work_goes_here();
> }
> 
> work properly.  [The (sizeof a / sizeof a[0]) expression keeps the explicit 
> knowledge of the size of the array in its declaration.]  The implementation
> must place the array a such that &a[10] is representable and will not
> overflow.

I use
        for (p=a; p<a; p++)
Then if my loop also increments p, the loop still terminates.

(Yes, I know my loop _shouldn't_ increment p, but sometimes I
make coding errors.)

My opinions are mine:  I don't speak for any other person or company.
                   email (until 91/4/30): browns@iccgcc.decnet.ab.com
Stan Brown, Oak Road Systems, Cleveland, Ohio, USA    +1 216 371 0043

jlg@cochiti.lanl.gov (Jim Giles) (03/26/91)

In article <4176@rwthinf.UUCP>, berg@marvin.e17.physik.tu-muenchen.de
(Stephen R. van den Berg) writes:
|> Tim McDaniel writes:
|> >Case a)
|> >   semi-open intervals, like [low,high).
|> >Case b)
|> >   closed intervals, like [low,high].
|> [...]
|>  unsigned long i,low,high,incr; 
|>         for(i=low;i-high>=incr;i+=incr)
|> 
|> Restriction is now: high-low<=UNSIGNED_LONG_MAX-incr

Since this is a Case a) loop, the restriction can be further
narrowed (by just the value of incr):

  unsigned long i, j, low, high, incr, trip_count;
            ...
            trip_count = (high-low)/incr;
            for (i=low, j=0; j<trip_count; i+=incr, j++) ...

The restrictions are now: high-low <= UNSIGNED_LONG_MAX, incr != 0,
and high >= low.  (These last two conditions are obvious, but should
be bourne in mind anyway.)

Similarly, Case b) can be handled with the same restrictions:

  unsigned long i, j, low, high, incr, trip_count;
            ...
            trip_count = (high-low)/incr;
            i=low; j=0;
            while (1){                  /* loop forever (sort of) */
               ...
               if (j==trip_count) exit; /* exit condition in middle */
               i+=incr; j++;
            }

Note that in Case a) the trip_count variable is actually the number
of times the loop body is executed.  In case b), trip_count is one
less than the number of passes through the loop.

J. Giles

rjohnson@shell.com (Roy Johnson) (03/27/91)

In article <3942.27edea2f@iccgcc.decnet.ab.com> browns@iccgcc.decnet.ab.com (Stan Brown) writes:
> I use
>	   for (p=a; p<a; p++)
> Then if my loop also increments p, the loop still terminates.

I would guess it would terminate before it ever went into the loop,
so it doesn't matter what the loop does.  8^)

> (Yes, I know my loop _shouldn't_ increment p, but sometimes I
> make coding errors.)

Oh.  Then you meant for (p=0; p<a; p++) ??
--
======= !{sun,psuvax1,bcm,rice,decwrl,cs.utexas.edu}!shell!rjohnson =======
Feel free to correct me, but don't preface your correction with "BZZT!"
Roy Johnson, Shell Development Company