root@qetzal.UUCP (Admin) (12/18/87)
[Submitted courtesy of John Rupley, reachable at arizona!rupley!local]
(The following is for those who have purchased, for ca. $300, or
otherwise have legal access to the ATT source code for the
new awk. If you use awk regularly for text or database
manipulation, the new awk is really worth its price. See the
book, "The AWK Programming Language", by Aho, Kernighan and
Weinberger (Addison-Wesley, 198[78]) for a description.)
The new awk compiles and runs on an 80286 system (Microport
sys5/AT 2.2), but with some problems that are not found
on a Vax:
(1) The system hangs on "print" of numerical values that need more
than 31 bits. Work-around is to use "printf".
(2) Regular expressions containing a bracketed character group
(character class) dump core. Fix is to correct several int and
int * declarations, casts and sizeofs.
(3) There are various compilation problems, not related to code,
that have fixes or work-arounds.
(4) Richard Stevens has reported a bug in error() in lib.c,
along with a fix.
Details of (1) to (4) are given below.
I would appreciate learning of a fix for (1).
Anyone else have nawk problems that they have resolved? I
suspect that there are some still submerged monsters that will
surface when running on a 286 machine.
The new awk is truly improved and appears to be worth the trouble
of bringing it up.
John Rupley
uucp: ..{ihnp4 | hao!noao}!arizona!rupley!local
internet: rupley!local@megaron.arizona.edu
telex: 9103508679(JARJAR)
(H) 30 Calle Belleza, Tucson AZ 85716 - (602) 325-4533
(O) Dept. Biochemistry, Univ. Arizona, Tucson AZ 85721 - (602) 621-3929
**************************************************************************
Sat Nov 21 16:01:23 MST 1987
NEW AWK-- ON MICROPORT SYS5/AT (80286)
(1) The system hangs on "print" of numerical values larger than
31 bits.
BUG: The following hangs the system with $1=46341; ok with
$1=46340 (input mode does not matter-- same effect with
4.6341e4):
echo $1 | awk '{print $1; x = $1*$1; print x}'
WORK-AROUND: The following code is fine for large values, e.g.
$1=1e100 (although *very* large values still hang the system):
echo $1 | awk '{print $1; x = $1*$1; printf("%20.6e\n", x)}'
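COMMENTS: The system hangs when nawk attempts to "print" a
numerical result that needs more than 31 bits, i.e. one above
2^31 - 1 (about 2.147 * 10^9). The two test values straddle that
limit: 46340^2 = 2147395600 fits, 46341^2 = 2147488281 does not.
This was found through testing under the awktest.a programs t.i.x
and T.misc.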
The hang is *only* with the function "print". Large numbers
(greater than 10^100) can be handled with "printf". Similarly,
there is no problem with computation ( *, exp(), ...), except
again for a hang at very large results, much greater than 10^100.
There is no problem with nawk on the vax, for which the limit is
127 bits in the output of the "print" function (at least in the
exercises used).
The uport v2.2 distribution version of old awk seems to hang on
the above print statement also.
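
The arithmetic behind the 46340/46341 boundary can be checked with
a few lines of C. This is only a sketch of the threshold, not of
nawk's print code: the names fits_in_31_bits, square and
check_threshold are made up here, and the guess that "print" goes
wrong when it converts the double to a 32 bit long is an
assumption, not something traced in the source.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: 1 if v fits in a signed 32 bit integer
 * (31 value bits plus sign), 0 otherwise. */
int fits_in_31_bits(double v)
{
    return v >= (double)INT32_MIN && v <= (double)INT32_MAX;
}

/* Square computed in floating point, as awk does for $1*$1. */
double square(double x)
{
    return x * x;
}

/* The two inputs from the bug report sit on either side of 2^31 - 1. */
void check_threshold(void)
{
    assert(fits_in_31_bits(square(46340.0)));   /* 2147395600 */
    assert(!fits_in_31_bits(square(46341.0)));  /* 2147488281 */
}
```

Calling check_threshold() passes silently, which agrees with the
observation that 46340 is the last input whose square still prints.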
(2) Regular expressions containing a bracketed character group
dumped core.
BUG:
any /re/ of the type /..[..]../ dumps core.
FIX (NOT GUARANTEED):
in awk.h, declare lval as int *
in b.c, change several casts and sizeofs, from int to
int * or int * to int, as appropriate
(See the diff file below.)
COMMENTS: The structure element re[].lval is used to store both
characters (as int) and pointers to arrays of characters. A vax
is happy with the structure declaration of lval as int, since int
and pointer are both 32 bits there. With uport's 16 bit ints and
the 32 bit pointers of the large model, a pointer does not fit in
an int.
The kludged code passes the awktest.a tests, but no guarantees--
I did as little tracing of the logic as possible, to the extent
of not even listing the code for b.c, much less that for the
other routines.
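
As a rough illustration of the point about lval (not the nawk
source itself), the sketch below keeps one pointer-wide field and
stores either a character code or a pointer in it. The struct
re_entry, its ltype member and the four helper functions are
hypothetical names; the casts through intptr_t are a modern
stand-in for the (int) and (int *) casts in the diff.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for one entry of the re[] table. */
struct re_entry {
    int  ltype;  /* assumed tag saying what lval currently holds */
    int *lval;   /* pointer-wide, so it can hold either kind of value */
};

/* Store a character code in the pointer-wide field. */
void set_char(struct re_entry *e, int c)
{
    assert(e != NULL);
    e->lval = (int *)(intptr_t)c;
}

/* Read a character code back; mirrors the (int) cast in the diff. */
int get_char(const struct re_entry *e)
{
    assert(e != NULL);
    return (int)(intptr_t)e->lval;
}

/* Store a pointer to an array of character codes. */
void set_set(struct re_entry *e, int *members)
{
    assert(e != NULL);
    e->lval = members;
}

/* Read the pointer back unchanged. */
int *get_set(const struct re_entry *e)
{
    assert(e != NULL);
    return e->lval;
}
```

Had lval been a 16 bit int, set_set() would have had to squeeze a
32 bit large model pointer into it, which is presumably where the
core dump came from.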
(3A) Compilation problem--
BUG:
a getsval() call that stands alone as a statement
does not get through the compiler
FIX:
insert dummy lvalue, in lib.c and run.c, at several
places
(See the diff file below.)
COMMENTS: PCC under uport coughs, apparently at a conditional
expression whose value is not assigned. Specifically, getsval() is
#defined as a conditional expression with several branches, some
of which are plain values rather than calls, so the expanded line
does not constitute a statement that PCC will accept.
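
The shape of the problem and of the dummy-lvalue fix can be
sketched as follows. This is a simplified, hypothetical version:
struct cell, its members and this two-branch getsval() are
invented for the example and are not the definitions in awk.h;
only the idea of assigning the macro's result to a dummy is taken
from the diff.

```c
#include <assert.h>
#include <stddef.h>

typedef unsigned char uchar;

/* Hypothetical cell: a cached string value plus bookkeeping. */
struct cell {
    int    is_str;  /* nonzero once sval is up to date */
    uchar *sval;    /* string value */
    int    calls;   /* how often the slow path ran */
};

/* Slow path: runs only when the cached string is not valid yet. */
uchar *r_getsval(struct cell *p)
{
    p->calls++;
    p->is_str = 1;
    return p->sval;
}

/* Conditional-expression macro: one branch is a plain value,
 * the other a function call. */
#define getsval(p) ((p)->is_str ? (p)->sval : r_getsval(p))

/* Called only for the side effect; the result goes to a dummy so
 * that the expanded line is an ordinary assignment. */
void refresh(struct cell *p)
{
    uchar *dummy;

    assert(p != NULL);
    dummy = getsval(p);
    (void)dummy;  /* keeps modern compilers quiet about the unused value */
}
```

A first refresh() of a cell takes the slow path once; later calls
take the plain-value branch and do nothing.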
(3B) Other remarks on compilation on a 286 machine--
The version of yacc from uport gave an out of space error in the
nawk compilation.
Yacc from sys5 source, compiled with the "HUGE" size option,
worked.
Alternatively, one might get y.tab.[ch] and lex.yy.c by running
yacc and lex on a vax, i.e. generate C code from awk.g.y and
awk.lx.l on a vax, then move it over and compile it on a uport
system.
Small model compilation of nawk gave a working version, but its
data segment is too small for reasonable temporaries, etc, as
found by testing with the awktest.a programs and data files,
which are not particularly large.
Large model compilation (-Ml) gave a version that passes the
awktest.a tests, after repair of the problems noted above,
excepting the two tests that hang the system.
For the debug option (nawk -d ....) to work, various dprint
statements must have %o changed to %lo.
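
The %o versus %lo point comes down to how wide an argument printf
takes off the stack. A small sketch, assuming the values being
printed are long (the helper name format_octal is made up, and
snprintf is used here for convenience although it postdates the
compilers discussed):

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: write v in octal into buf (at most n bytes,
 * including the terminating NUL); returns the number of digits
 * that the full value needs. */
int format_octal(unsigned long v, char *buf, size_t n)
{
    assert(buf != NULL && n > 0);
    /* %lo consumes a full long. %o would consume only an int,
     * which is 16 bits under the uport compiler, so the output
     * and any following arguments would come out wrong. */
    return snprintf(buf, n, "%lo", v);
}
```

With v = 65536, a value that does not fit in a 16 bit int, the
buffer receives the six digits 200000.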
(4) There is a bug in the print operations of error() of lib.c,
reported by Richard Stevens in comp.bugs.sys5. His fix is
included in the diff file below.
*********************diff output***************************
(distribution compared with modified)
awk.h compared with ../nawk/awk.h
203c203
< int lval;
---
> int *lval;
b.c compared with ../nawk/b.c
115c115
< if ((f->posns[0] = (int *) Calloc(1, *(f->re[0].lfollow)*sizeof(int))) == NULL)
---
> if ((f->posns[0] = (int *) Calloc(1, *(f->re[0].lfollow)*sizeof(int *))) == NULL)
117c117
< if ((f->posns[1] = (int *) Calloc(1, sizeof(int))) == NULL)
---
> if ((f->posns[1] = (int *) Calloc(1, sizeof(int *))) == NULL)
137c137
< if ((f->posns[2] = (int *) Calloc(1, (k+1)*sizeof(int))) == NULL)
---
> if ((f->posns[2] = (int *) Calloc(1, (k+1)*sizeof(int *))) == NULL)
278c278
< f->re[(int) left(v)].lval = (int) right(v);
---
> f->re[(int) left(v)].lval = (int *) right(v);
283c283
< if ((p = (int *) Calloc(1, (setcnt+1)*sizeof(int))) == NULL)
---
> if ((p = (int *) Calloc(1, (setcnt+1)*sizeof(int *))) == NULL)
439c439
< if ((f->posns[2] = (int *) Calloc(1, (k+1)*sizeof(int))) == NULL)
---
> if ((f->posns[2] = (int *) Calloc(1, (k+1)*sizeof(int *))) == NULL)
491c491
< if ((f->posns[2] = (int *) Calloc(1, (k+1)*sizeof(int))) == NULL)
---
> if ((f->posns[2] = (int *) Calloc(1, (k+1)*sizeof(int *))) == NULL)
707c707
< if (k == CHAR && c == f->re[p[i]].lval
---
> if (k == CHAR && c == (int ) f->re[p[i]].lval
754c754
< if ((p = (int *) Calloc(1, (setcnt+1)*sizeof(int))) == NULL)
---
> if ((p = (int *) Calloc(1, (setcnt+1)*sizeof(int *))) == NULL)
lib.c compared with ../nawk/lib.c
183a184
> uchar *dummy;
190c191
< getsval(recloc);
---
> dummy = getsval(recloc);
400a402,404
> /* bug fix richard stevens */
> register int c;
>
404a409,411
>
> /* bug fix richard stevens */
> /*
405a413,417
> */
> while (c = *s++)
> putc(c, stderr);
> /* end of bug fix */
>
makefile compared with ../nawk/makefile
10c10,11
< CFLAGS = -g
---
> #CFLAGS = -g
> CFLAGS = -g -Ml
run.c compared with ../nawk/run.c
901a902
> uchar *dummy;
905,906c906,907
< getsval(x);
< getsval(y);
---
> dummy = getsval(x);
> dummy = getsval(y);
1346a1348
> uchar *dummy;
1349c1351
< getsval(x);
---
> dummy = getsval(x);
--
//////////////////286 Moderator -- comp.unix.microport\\\\\\\\\\\\\\\\\
Email to microport@uwspan for info on the newsgroup comp.unix.microport.
Otherwise mail to microport@uwspan with a Subject containing one of:
386 286 Bug Source Merge or "Send Buglist" (rutgers!uwvax!uwspan!microport)