[gnu.utils.bug] SED bug

dar@nucleus.mi.org (Dario Alcocer) (07/29/89)

Hello, I wanted to report a bug in GNU sed.  Whenever a
regular expression range is specified as an address, the
negate operator (!) does not work correctly.  I think I have
a fix, but I wanted to check with you first.  My fix is
contained in the following lines; this is output from Bob
Stout's diff program:

 DIFF 2.01B 
< \TEMP\SEDEXEC.C
> \TMP\SEDEXEC.C
------------------------------------
 16,16
< #include <stdio.h>      /* {f}puts, {f}printf, getc/putc, f{re}open, fclose */
< #include <ctype.h>      /* for isprint(), isdigit(), toascii() macros */
< #include "sed.h"        /* command type structures & miscellaneous constants */
< #include "sed.dcl"
---
> /*
> BUG FIX:
>    Changed match() so that regular expression addresses will NOT be matched
> if the allbut flag is true. Changed by Dario Alcocer 7/12/89.
> */
 21,22
< extern char     *strcpy();      /* used in dosub */
---
> #include <stdio.h>      /* {f}puts, {f}printf, getc/putc, f{re}open, fclose */
> #include <ctype.h>      /* for isprint(), isdigit(), toascii() macros */
> #include "sed.h"        /* command type structures & miscellaneous constants */
> #include "sed.dcl"
 23,27
< /***** shared variables imported from the main ******/
---
> extern char     *strcpy();      /* used in dosub */
 29
> /***** shared variables imported from the main ******/
> 
 213,219
<         }
<         else
<         {
<                 if (ipc->flags.allbut)
<                         return(TRUE);
<                 ipc++;
<                 return(FALSE);
---
>                 /*
>                  * Code added 7/12/89 by Dario Alcocer to prevent
>                  * regular expression addresses from being matched if
>                  * the allbut flag is true.
>                  */
>                 if (ipc->flags.allbut)
>                         return(FALSE);
 227
>         else
>         {
>                 if (ipc->flags.allbut)
>                         return(TRUE);
>                 ipc++;
>                 return(FALSE);
>         }
eof \TEMP\SEDEXEC.C
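
For readers unfamiliar with the behavior being reported, here is a minimal
sketch of how a negated range address is generally expected to act: the
range is tracked exactly as usual, and the `!` only inverts the final
decision for each line. This is not the match() routine from sedexec.c.
The struct, its field names, and selects() are hypothetical, and a plain
substring test stands in for real regular expression matching.

```c
#include <string.h>

/* Hypothetical range address: active from a line containing `start`
 * through the next later line containing `end`.  strstr() is a stand-in
 * for a real regular expression match. */
struct range_addr {
    const char *start;
    const char *end;
    int negate;     /* nonzero when the command was written with '!' */
    int in_range;   /* state carried from one input line to the next */
};

/* Return nonzero if the command should run on `line`. */
static int selects(struct range_addr *a, const char *line)
{
    int hit;

    if (a->in_range) {
        hit = 1;
        if (strstr(line, a->end) != NULL)
            a->in_range = 0;        /* range closes after this line */
    } else if (strstr(line, a->start) != NULL) {
        hit = 1;
        a->in_range = 1;            /* range opens on this line */
    } else {
        hit = 0;
    }

    /* Negation flips the decision but never the range tracking. */
    return a->negate ? !hit : hit;
}
```

With start "BEGIN", end "END", and negate set, the sketch selects only the
lines that lie outside the BEGIN..END block.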


Please let me know if this fix is correct.  I can be
reached by voice at 619-450-1667 ext. 6363, or at
dar@nucleus.mi.org (I try to log on regularly, but
sometimes I can't get on for days...).

Thanks for your attention,

Dario Alcocer

tarvaine@tukki.jyu.fi (Tapani Tarvainen) (09/18/89)

The y-command doesn't seem to work with Gnu SED 1.02: e.g.,

echo a | sed 'y/a/A/'

complains "extra characters after command".  I tried it under
SunOS 4.0 and MS-DOS with identical results.

Is 1.02 the latest release, and if not, where could a newer one
be found?  (I just might try fixing this myself.)

-- 
Tapani Tarvainen    (tarvaine@tukki.jyu.fi, tarvainen@finjyu.bitnet)

tarvaine@tukki.jyu.fi (Tapani Tarvainen) (09/18/89)

In article <1311@tukki.jyu.fi> tarvaine@tukki.jyu.fi I wrote:
>The y-command doesn't seem to work with Gnu SED 1.02

The bug turned out to be that the program assumed that there is
always '\n' at the end of a line, but it can be EOF as well.
Here's my fix:

*** sed.old	Sun Sep 17 22:28:15 1989
--- sed.c	Sun Sep 17 22:42:38 1989
***************
*** 704,710 ****
  				cur_cmd->x.translate[*string++]=ch;
  			}
  			flush_buffer(b);
! 			if(inchar()!=slash || inchar()!='\n')
  				bad_prog(LINE_JUNK);
  			break;
  
--- 704,710 ----
  				cur_cmd->x.translate[*string++]=ch;
  			}
  			flush_buffer(b);
! 			if(inchar()!=slash || ((ch=inchar())!='\n' && ch!=EOF))
  				bad_prog(LINE_JUNK);
  			break;
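
The check in the patch above can be illustrated in isolation. The following
is a minimal sketch, not code from sed.c: next_char() is a hypothetical
stand-in for inchar() that reads from a script held in memory, and
y_tail_ok() is a hypothetical name for the test on what follows the last
delimiter. The character is kept in an int so that EOF stays distinct from
every valid character value.

```c
#include <stdio.h>

/* Hypothetical stand-in for inchar(): return the next character of a
 * script held in memory, or EOF once the string is exhausted. */
static int next_char(const char **p)
{
    if (**p == '\0')
        return EOF;
    return (unsigned char)*(*p)++;
}

/* `tail` points at what should be the closing delimiter of a y command.
 * Return nonzero if the delimiter is present and is followed by either
 * a newline or the end of the script. */
static int y_tail_ok(const char *tail, int slash)
{
    int ch;     /* int, not char, so EOF can be told apart */

    if (next_char(&tail) != slash)
        return 0;
    ch = next_char(&tail);
    return ch == '\n' || ch == EOF;
}
```

A script given on the command line, as in the echo example earlier in the
thread, ends without a trailing newline, which is the case the second half
of the test is meant to accept.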
  
-- 
Tapani Tarvainen    (tarvaine@tukki.jyu.fi, tarvainen@finjyu.bitnet)

tarvaine@tukki.jyu.fi (Tapani Tarvainen) (09/18/89)

I just found yet another bug in the y-command of Gnu SED 1.02:
characters >127 won't get translated properly.  This depends on
the compiler, though; the offending line is

				cur_cmd->x.translate[*string++]=ch;

and the problem is that string is declared as char *,
and if chars are signed and *string is negative, the ANSI
promotion rules keep it negative when converted to an int,
so the subscript falls outside the translate table.

The easiest fix (or the first that I could think of) is to declare
string as unsigned char *; lint &c will complain about mixing
pointers to signed and unsigned chars, but who cares (if you do,
throw in a few casts or declare all chars as unsigned).
At least it seems to work.
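
The same idea can be shown outside sed.c. This is a minimal sketch under
stated assumptions: build_table() and translate() are hypothetical names,
not functions from GNU sed, and the 256-entry table assumes 8-bit chars.
Every byte is read through an unsigned char pointer before it is used as
a subscript, so values above 127 index the table correctly whether plain
char is signed or not.

```c
/* Fill `table` with the identity mapping, then map each byte of `from`
 * to the byte at the same position in `to`.  Bytes are read through
 * unsigned char pointers so the subscript is never negative. */
static void build_table(unsigned char table[256],
                        const char *from, const char *to)
{
    const unsigned char *f = (const unsigned char *)from;
    const unsigned char *t = (const unsigned char *)to;
    int i;

    for (i = 0; i < 256; i++)
        table[i] = (unsigned char)i;
    while (*f != '\0' && *t != '\0')
        table[*f++] = *t++;
}

/* Translate `line` in place using `table`. */
static void translate(const unsigned char table[256], char *line)
{
    unsigned char *p = (unsigned char *)line;

    for (; *p != '\0'; p++)
        *p = table[*p];
}
```

Had the subscript been a plain char on a signed-char compiler, the byte
0344 (octal) would have become a negative index instead of 228.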

Anyway, here's the diff (this includes the earlier patch):

*** sed.old	Sun Sep 17 22:28:15 1989
--- sed.c	Mon Sep 18 01:58:58 1989
***************
*** 412,418 ****
  	int	ch;
  	int	slash;
  	VOID	*b;
! 	char	*string;
  	int	num;
  
  	FILE *compile_filename();
--- 412,418 ----
  	int	ch;
  	int	slash;
  	VOID	*b;
! 	unsigned char	*string;
  	int	num;
  
  	FILE *compile_filename();
***************
*** 704,710 ****
  				cur_cmd->x.translate[*string++]=ch;
  			}
  			flush_buffer(b);
! 			if(inchar()!=slash || inchar()!='\n')
  				bad_prog(LINE_JUNK);
  			break;
  
--- 704,710 ----
  				cur_cmd->x.translate[*string++]=ch;
  			}
  			flush_buffer(b);
! 			if(inchar()!=slash || ((ch=inchar())!='\n' && ch!=EOF))
  				bad_prog(LINE_JUNK);
  			break;
  


-- 
Tapani Tarvainen    (tarvaine@tukki.jyu.fi, tarvainen@finjyu.bitnet)