[comp.bugs.sys5] Is this an `awk' bug?

rns@se-sd.SanDiego.NCR.COM (Rick Schubert) (10/18/90)

In <4023@se-sd.SanDiego.NCR.COM> rns@se-sd.SanDiego.NCR.COM I wrote:

>-----------------------CUT HERE for test.awk---------------------
>BEGIN {
>	lastn = 0
>}
>{
>	n = substr($1, index($1,"#") + 1, 2)
>	while (++lastn < n) {
>		printf " %2d",lastn
>	}
>	printf " %2d",n
>}
>END {
>	while (++lastn < 26) {
>		printf " %2d",lastn
>	}
>	printf "\n"
>}
>-----------------------END of test.awk--------------------------
>-----------------------CUT HERE for test.in---------------------
>#01
>#04
>#07
>#08
>#12
>#14
>#16
>#17
>#21
>#22
>----------------------------END of test.in---------------------

>The expected output is:
>  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25
>(the while loops are intended to fill in the missing numbers in the input
>sequence)

>instead I get:
>  1  4  7  8 12 14 16 17 21 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24 25

>When I add some debugging statements, the problem goes away!  Is it me
>or `awk'?

I posted this 3 days ago and haven't gotten any responses, but I figured out
the problem myself.  I don't know if this was the wrong newsgroup to post to,
or if people just didn't want to try to debug my `awk' program.  Anyway, last
night (while I was away from my system) I was thinking and trying to figure
out if there was anything I could have been doing wrong.  My program seemed
too simple to turn up a bug in at least 2 different versions of `awk', at least
one of which has been around for many years.  I came up with a hypothesis for
where there might be a bug in my program, and first thing this morning I
checked it out.  Sure enough, I found out how to fix it.  The problem has
to do with conversions between digit strings and numbers.  I had the
statements:


>	n = substr($1, index($1,"#") + 1, 2)
>	while (++lastn < n) {

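Now, `substr' returns a string (it just happens to be composed of 2 digits
in my case).  Since variables in `awk' are untyped and simply take on the
type of whatever is assigned to them, `n' is a string as well.  In comparing
`++lastn' with `n', a number is being compared with a string, so one of the
operands has to be converted to the type of the other.  My guess is that the
number is converted to a string and the two are then compared as strings.
For the first 9 input lines `lastn' was a single digit from 1 to 9 while `n'
was a 2 digit string (with a leading `0' for numbers less than 10), and none
of those digits sorts before the corresponding 2 digit string, so the loop
body never ran.  On the last input line `lastn' had reached 10, and "10" is
less than "22" as a string, which is why the fill-in numbers start showing
up at 10.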
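
Here is a small sketch of the two comparisons that matter, assuming a
POSIX-style `awk' that compares a number with a string as strings (which is
my guess above, not something I have confirmed for every version).  The
names `first', `last' and `cmp_result' are made up for the illustration.
The first comparison is the one from the second input line; the second is
the one from the last input line.

```sh
# Assumes an awk on the PATH; prints one 0/1 result per comparison.
cmp_result=$(awk 'BEGIN {
	first = (2 < substr("#04", 2, 2))    # "2" vs "04" as strings: false
	last = (10 < substr("#22", 2, 2))    # "10" vs "22" as strings: true
	print first, last
}')
echo "$cmp_result"
```

On the versions of `awk' that behave the way I describe, this prints "0 1".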

I changed the assignment to:

	n = substr($1, index($1,"#") + 1, 2) + 0

This forces `n' to be a number, so the comparison with `++lastn' is done
numerically.
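
For anyone who wants to try it, here is the whole program with that one
change, fed the same ten lines as test.in.  The shell wrapper and the
variable name `fixed_output' are just scaffolding I added for the sketch;
the `awk' program is otherwise the one I posted.

```sh
# Corrected test.awk run against the ten lines of test.in.
fixed_output=$(printf '%s\n' '#01' '#04' '#07' '#08' '#12' \
	'#14' '#16' '#17' '#21' '#22' |
awk '
BEGIN {
	lastn = 0
}
{
	# The trailing "+ 0" makes n a number, so "<" compares numerically.
	n = substr($1, index($1, "#") + 1, 2) + 0
	while (++lastn < n) {
		printf " %2d", lastn
	}
	printf " %2d", n
}
END {
	while (++lastn < 26) {
		printf " %2d", lastn
	}
	printf "\n"
}')
echo "$fixed_output"
```

This should print the numbers 1 through 25 in order, with the gaps filled
in.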

I think I thought of this based on some deeply-buried memory of the `man'
page, which says under "RESTRICTIONS" (nee "BUGS"):

   There are no explicit conversions between numbers and strings.
   To force an expression to be treated as a number add 0 to it;
   to force it to be treated as a string concatenate the null string to it.
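
A minimal sketch of both idioms from that paragraph, again assuming a
POSIX-style `awk'; all of the variable names are hypothetical.  Each result
is 1 if the comparison is true and 0 if it is false.

```sh
conversions=$(awk 'BEGIN {
	a = "9"; b = "10"                     # two digit strings
	as_strings = (a < b)                  # compared as strings: false
	as_numbers = (a + 0 < b + 0)          # adding 0 forces numbers: true
	x = 9; y = 10                         # two numbers
	forced_strings = ((x "") < (y ""))    # concatenating "" forces strings: false
	print as_strings, as_numbers, forced_strings
}')
echo "$conversions"
```

With an `awk' that follows these rules the output is "0 1 0".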

There's an acronym that people on the Net use in situations like this, but
it's not one that I like (the acronym references an obscenity, even though
the obscenity is not printed; besides, the obscenity is an adjective
modifying the wrong noun).

-- Rick Schubert (rns@se-sd.sandiego.NCR.COM)