[net.bugs] YAAB

bill@ur-cvsvax.UUCP (Bill Vaughn) (09/09/85)

DESCRIPTION:
Awk's version of 'printf' core dumps or prints garbage when a field width
variable is used. The manual page refers to printf(3S) so I assume this
feature of printf should be implemented.

REPEAT BY:
Examine the following script. Try it yourself.

Script started on Mon Sep  9 12:58:06 1985
% cat test.c
main(){printf("%*s\n",5,"a");}
% cc test.c ; a.out
    a
% echo 5 |awk '{printf("%*s\n",$1,"a")}'
                                                                            l$l3lll)l:lHlSl]lhl(m%n
% echo 5 |awk '{printf("%*s\n",5,"a")}'
Segmentation fault (core dumped)
% ^D
script done on Mon Sep  9 13:01:08 1985

FIX:
I don't have one.  Can anyone out there help?
I haven't been able to kludge around this problem as yet.

_________________________________
\          Bill Vaughn          /
 \  Center for Visual Science  /
  |  University of Rochester  |
 /     Rochester, NY 14627     \
/seismo!rochester!ur-cvsvax!bill\
---------------------------------

naiman@pegasus.UUCP (Ephrayim J. Naiman) (09/11/85)

<munch, munch>

> Awk's version of 'printf' core dumps or prints garbage when a field width
> variable is used. The manual page refers to printf(3S) so I assume this
> feature of printf should be implemented.

I also found this bug a long time ago and gave up on a fix.
Incidentally, I have a version of the unreleased new awk and it also
core dumps.  I hope someone is listening.
-- 

==> Ephrayim J. Naiman @ AT&T Information Systems Laboratories (201) 576-6259
Paths: [ihnp4, allegra, mtuxo, maxvax, cbosgd, lzmi, ...]!pegasus!naiman

mp@allegra.UUCP (Mark Plotnick) (09/13/85)

As long as we're telling our favorite awk bugs, here's mine.

(Karl Heuer, whuxlb!kwh, discovered this bug)

In run.c, routine program(), there's a call to tempfree(x) at a point where
x may still be uninitialized if the awk script doesn't have any
BEGIN actions and doesn't have any input.  

	awk 'END { print NR }' </dev/null

will run into the problem.

On 4.2bsd, x is an automatic variable (a struct), and the garbage it
contains will usually cause nothing bad to happen.
But, on SVR2, x is a register ptr.  On a vax it's register 10, and
its value is inherited from whatever was in register 10 at the time awk
was exec'ed (e.g. the variable p in the execs() routine in the shell).
When run under ksh, the above awk program gets a memory fault.  When
run under sh or sdb, it works OK.  Do you know what a pain it is to have
a program fail under the shell and work fine when run under a debugger?
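
The failure mode described above, releasing a value that no code path ever set, can be sketched in a few lines of C. This is only an illustration under stated assumptions: Cell, run_action(), release() and run_program() are hypothetical stand-ins, not the actual declarations in run.c, and the pointer form follows the SVR2 description in this post.

```c
#include <stdlib.h>

/* Hypothetical stand-ins; the real Cell, execute() and tempfree() in
 * run.c look different. */
typedef struct { double fval; } Cell;

static int live_cells = 0;	/* counts temporaries not yet released */

static Cell *run_action(void)	/* stands in for executing one action */
{
	Cell *c = malloc(sizeof *c);

	if (c != NULL) {
		c->fval = 0.0;
		live_cells++;
	}
	return c;
}

static void release(Cell *c)	/* stands in for tempfree() */
{
	if (c == NULL)		/* nothing was ever produced: do nothing */
		return;
	free(c);
	live_cells--;
}

/* Returns the number of temporaries still live; 0 means no leak. */
int run_program(int begin_actions, int input_records)
{
	Cell *x = NULL;		/* the fix: start from a known value */
	int i;

	for (i = 0; i < begin_actions + input_records; i++) {
		release(x);	/* drop the previous temporary */
		x = run_action();
	}
	release(x);		/* safe even when the loop never ran */
	return live_cells;
}
```

In this sketch, run_program(0, 0) corresponds to the case above: no BEGIN actions and no input. Without the NULL initialisation, the final release(x) would act on whatever happened to be in the variable.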

	Mark Plotnick
	allegra!mp

vijay@topaz.RUTGERS.EDU (P. Vijay) (09/13/85)

> DESCRIPTION:
> Awk's version of 'printf' core dumps or prints garbage when a field width
> variable is used. The manual page refers to printf(3S) so I assume this
> feature of printf should be implemented.
> 
> REPEAT BY:
> % echo 5 |awk '{printf("%*s\n",$1,"a")}'
> Segmentation fault (core dumped)

CAUSE:
	All printf(fmt,v1,v2,...,vn) statements are tokenised.
Eventually, format() is called for each individual token, with the
format string pointer being updated on every invocation. The format
string is parsed in a rather simplistic fashion, which does not look
out for '%*'-like constructs. So, a statement like 'printf("%*s",$1,"a")'
results in a call 'sprintf("%*s",$1)', which supplies one argument where
the format expects two (a width and a string), so sprintf ends up
accessing some bogus pointer, thus dumping core.
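
For reference, the C library treats each '*' in a conversion as a request for one extra int argument placed ahead of the value being formatted. A minimal sketch in modern C follows; pad_string is a hypothetical helper, not part of awk, and snprintf is used here in place of the sprintf of the period.

```c
#include <stdio.h>

/* Hypothetical helper, not part of awk: format s right-justified in a
 * field of `width` columns. "%*s" consumes TWO arguments, an int width
 * and then the string; omitting the width (as the unpatched awk does)
 * is undefined behaviour. */
int pad_string(char *out, size_t outsize, int width, const char *s)
{
	return snprintf(out, outsize, "%*s", width, s);
}
```

Calling pad_string(buf, sizeof buf, 5, "a") leaves four spaces followed by "a" in buf, matching the output of test.c in the original report.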

FIX:
	Make the format string parsing more sophisticated, and set
booleans to indicate presence of variable field width and/or precision.
Then, depending on these booleans, call sprintf with the appropriate
arguments.

CAVEAT:
	Since this situation does not occur very often ("%*" in
printf), and since printf itself is typically a heavily called routine
in awk programs, the parsing for '%*' is done in a not so robust
fashion. Calls such as 'printf("%**.22s\n",$1,"Foo")' might cause the
awk program to croak without indicating the proper error.
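
For comparison, a stricter scan that would catch cases such as '%**.22s' might look like the sketch below. It is not the code in the patch that follows: scan_stars is a hypothetical function, it expects exactly one conversion specification (from '%' through the conversion letter), and like the original loop it assumes that lowercase letters are contiguous, as in ASCII.

```c
#include <string.h>

/* Hypothetical scanner, not the code in run.c. `spec` is one conversion
 * specification starting at '%' (for example "%-*.*s"). On success it
 * stores 1 or 0 in *width_star and *prec_star and returns 0; it returns
 * -1 for malformed input such as "%**s". */
int scan_stars(const char *spec, int *width_star, int *prec_star)
{
	const char *p = spec;

	*width_star = *prec_star = 0;
	if (*p++ != '%')
		return -1;
	while (*p != '\0' && strchr("-+ #0", *p) != NULL)	/* flags */
		p++;
	if (*p == '*') {			/* width from an argument */
		*width_star = 1;
		p++;
	} else {
		while (*p >= '0' && *p <= '9')	/* literal width */
			p++;
	}
	if (*p == '.') {
		p++;
		if (*p == '*') {		/* precision from an argument */
			*prec_star = 1;
			p++;
		} else {
			while (*p >= '0' && *p <= '9')
				p++;
		}
	}
	while (*p == 'l')			/* length modifier, as awk allows */
		p++;
	/* Exactly one lowercase conversion letter must end the spec. */
	if (*p < 'a' || *p > 'z' || p[1] != '\0')
		return -1;
	return 0;
}
```

Unlike a test of fmt[1] alone, this form also recognises a '*' that follows a flag, as in "%-*s", and it rejects a doubled '*' instead of passing it on to sprintf.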

----------P-A-T-C-H----T-O----RUN.C----F-O-L-L-O-W-S------------
*** run.c.ORIG	Fri Feb 10 06:53:37 1984
--- run.c	Fri Sep 13 04:04:26 1985
***************
*** 321,326
  	int flag = 0;
  	awkfloat xf;
  
  	os = s;
  	p = buf = (char *)malloc(RECSIZE);
  	while (*s) {

--- 321,339 -----
  	int flag = 0;
  	awkfloat xf;
  
+ 	/* [P. Vijay - Sep 1985] Fix the problem of awk dumping core when
+ 	   '*' is present either in the precision or field width spec. in
+ 	   the format string.
+ 	   
+ 	   Reason: Presently this routine does not check for the presence 
+ 	   of '*' in the said places, and as a result, sprintf is called 
+ 	   with insufficient params. [e.g., "sprintf("%*s\n",x.opt->sval)"].
+ 	 */
+ 
+ 	int VarFieldWidth;	/* Set 'true' if '*' used to spec. fld. wid. */
+ 	int VarPrecision;	/* Set 'true' if '*' used to spec. precision */
+ 	int FieldWidth , Precision; /* Hold value of respective specifier */
+ 
  	os = s;
  	p = buf = (char *)malloc(RECSIZE);
  	while (*s) {
***************
*** 334,340
  			continue;
  		}
  		for (t=fmt; (*t++ = *s) != '\0'; s++)
! 			if (*s >= 'a' && *s <= 'z' && *s != 'l')
  				break;
  		*t = '\0';
  		if (t >= fmt + sizeof(fmt))

--- 347,353 -----
  			continue;
  		}
  		for (t=fmt; (*t++ = *s) != '\0'; s++)
! 		  if (*s >= 'a' && *s <= 'z' && *s != 'l')
  				break;
  		*t = '\0';
  		if (t >= fmt + sizeof(fmt))
***************
*** 339,344
  		*t = '\0';
  		if (t >= fmt + sizeof(fmt))
  			error(FATAL, "format item %.20s... too long", os);
  		switch (*s) {
  		case 'f': case 'e': case 'g':
  			flag = 1;

--- 352,367 -----
  		*t = '\0';
  		if (t >= fmt + sizeof(fmt))
  			error(FATAL, "format item %.20s... too long", os);
+ 
+ 		/* [P. Vijay - Sep 1985] Check for presence of '%*', '%*.',
+ 		   and/or '%.*'. This is a QuikFix!! No checking for specs.
+ 		   such as '%**.12', etc. Someday, this should be fixed to
+ 		   do the task in a more robust fashion...
+ 		 */
+ 		
+ 		VarFieldWidth = fmt[1] == '*';
+ 		VarPrecision  = (*(s-2) == '.') && (*(s-1) == '*');
+ 		
  		switch (*s) {
  		case 'f': case 'e': case 'g':
  			flag = 1;
***************
*** 368,373
  			p += strlen(p);
  			continue;
  		}
  		if (a == NULL)
  			error(FATAL, "not enough arguments in printf(%s)", os);
  		x = execute(a);

--- 391,410 -----
  			p += strlen(p);
  			continue;
  		}
+ 		if (VarFieldWidth){
+ 		    if (a == NULL)
+ 			error(FATAL, "not enough arguments in printf(%s)", os);
+ 		    x = execute(a);
+ 		    FieldWidth = getfval(x.optr);
+ 		    a = a->nnext;
+ 	        }
+ 		if (VarPrecision){
+ 		    if (a == NULL)
+ 			error(FATAL, "not enough arguments in printf(%s)", os);
+ 		    x = execute(a);
+ 		    Precision = getfval(x.optr);
+ 		    a = a->nnext;
+ 	        }
  		if (a == NULL)
  			error(FATAL, "not enough arguments in printf(%s)", os);
  		x = execute(a);
***************
*** 374,383
  		a = a->nnext;
  		if (flag != 4)	/* watch out for converting to numbers! */
  			xf = getfval(x.optr);
! 		if (flag==1) sprintf(p, fmt, xf);
! 		else if (flag==2) sprintf(p, fmt, (long)xf);
! 		else if (flag==3) sprintf(p, fmt, (int)xf);
! 		else if (flag==4) sprintf(p, fmt, x.optr->sval==NULL ? "" : getsval(x.optr));
  		tempfree(x);
  		p += strlen(p);
  		s++;

--- 411,434 -----
  		a = a->nnext;
  		if (flag != 4)	/* watch out for converting to numbers! */
  			xf = getfval(x.optr);
! 
! 		/* [P. Vijay - Sep 1985] To avoid a mess of repetitive
! 		   code sequences, define 'Xsprintf' macro to do the
! 		   actual sprintf's.
! 		 */
! 
! #define Xsprintf(str,fmt,var)\
! 		if (!VarFieldWidth)\
! 		  if (!VarPrecision) sprintf(str,fmt,(var));\
! 		  else sprintf(str,fmt,Precision,(var));\
! 		else\
! 		  if (!VarPrecision) sprintf(str,fmt,FieldWidth,(var));\
! 		  else sprintf(str,fmt,FieldWidth,Precision,(var))
! 
! 		if (flag==1) Xsprintf(p, fmt, xf);
! 		else if (flag==2) Xsprintf(p, fmt, (long)xf);
! 		else if (flag==3) Xsprintf(p, fmt, (int)xf);
! 		else if (flag==4) Xsprintf(p, fmt, x.optr->sval==NULL ? "" : getsval(x.optr));
  		tempfree(x);
  		p += strlen(p);
  		s++;

ado@elsie.UUCP (Arthur David Olson) (09/19/85)

For fellow conservatives, here's a change to "run.c" that:
	is small in magnitude;
	gets rid of core dumps caused by '*' characters in awk format strings
	(and by excessively long format strings);
	and ensures that awk scripts that work on your system will be portable
	to other systems.
The disadvantage is that '*' characters in formats are not allowed--if one is
found, a fatal error message is produced.  See "substr" for workarounds.

As usual, the trade secret status of the code involved precludes a clearer
posting.

#ifdef OLDVERSION
		for (t=fmt; (*t++ = *s) != '\0'; s++)
			if (*s >= 'a' && *s <= 'z' && *s != 'l')
				break;
		*t = '\0';
		if (t >= fmt + sizeof(fmt))
			error(FATAL, "format item %.20s... too long", os);
#else
		for (t=fmt; (*t++ = *s) != '\0'; s++)
			if (t >= fmt + sizeof fmt)
				error(FATAL,
					"format item %.20s... too long", os);
			/*
			** Ought to use ctype, but that might change the
			** behavior on non-ASCII machines.
			*/
			else if (*s >= 'a' && *s <= 'z' && *s != 'l')
				break;
			else if (*s == '*')
				error(FATAL, "format item %s has wild '*'", os);
		*t = '\0';
#endif

--
Bugs is a Warner Brothers trademark.
Awk is a Polly the Parrot trademark.
--
	UUCP: ..decvax!seismo!elsie!ado    ARPA: elsie!ado@seismo.ARPA
	DEC, VAX and Elsie are Digital Equipment and Borden trademarks

ado@elsie.UUCP (Arthur David Olson) (09/20/85)

A simple, portable workaround is to replace awk nonstatements such as

	printf "%*.*s\n", a, b, c

with awk statements such as

	printf "%" a "." b "s\n", c
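
The same idea, building the format string from the numbers instead of passing them through '*', can be sketched in C for readers more at home there. The helper name format_without_star is hypothetical, and the sketch assumes prec is non-negative; a negative value would produce an invalid format.

```c
#include <stdio.h>

/* Hypothetical helper: emulate "%*.*s" without using '*', by first
 * building a format such as "%8.3s" from the two numbers, the same way
 * the awk expression "%" a "." b "s" does through concatenation.
 * Assumes prec >= 0. */
int format_without_star(char *out, size_t outsize,
			int width, int prec, const char *s)
{
	char fmt[32];	/* room for "%", two ints, ".", "s" and NUL */

	snprintf(fmt, sizeof fmt, "%%%d.%ds", width, prec);
	return snprintf(out, outsize, fmt, s);
}
```

With width 6 and precision 2, the string "abcdef" comes out as four spaces followed by "ab", the same result "%*.*s" would give.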