[gnu.utils.bug] Bugs in Gnu SED 1.06

tarvaine@tukki.jyu.fi (Tapani Tarvainen) (09/28/89)

This is a report of problems I encountered in porting Gnu SED (1.06)
to MS-DOS.  Some of them are genuine bugs in the code (not related to
DOS weirdness):

(1) sed "s/x//" gave "Couldn't allocate memory".  It turned out that
when the replacement string was the null string, sed tried to allocate
zero bytes with ck_malloc, which called malloc(0), got NULL and
errored out.  Curiously this didn't happen on a Sun, because there
malloc(0) returned some non-NULL value, and the value returned by
ck_malloc was never used in this situation.  I fixed this simply by
adding an if to skip the ck_malloc() and following bcopy() calls
when replace_length==0.  This works here OK, though it would be
safer to set replacement=NULL just in case somebody uses this code
in a context where the memory is later free()d.
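
A minimal sketch of the "safer" variant mentioned in (1), assuming the
caller is happy with a NULL pointer for an empty replacement.  The name
copy_replacement is hypothetical and does not appear in sed.c; the
message and exit code merely imitate what panic() prints there.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper, not part of sed: copy LEN bytes of SRC into
   fresh storage.  An empty string yields NULL without ever calling
   malloc(0), so a NULL from malloc() always means real exhaustion,
   and a later free() of the result is harmless. */
char *copy_replacement(const char *src, size_t len)
{
	char *copy;

	if (len == 0)
		return NULL;		/* nothing to allocate or copy */
	assert(src != NULL);
	copy = malloc(len);
	if (copy == NULL) {
		fprintf(stderr, "Couldn't allocate memory\n");
		exit(4);
	}
	memcpy(copy, src, len);
	return copy;
}
```

Whether malloc(0) returns NULL or a unique pointer is left to the
implementation, which is why the zero case is handled before malloc()
is reached at all.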

(2) 'N' at EOF returned nothing, which may have been intentional but I
found it rather unfriendly compared to how UNIX (SunOS) sed behaves
(quits the script), because an error in the script may cause an
infinite loop trying to 'N' forever.  I changed this by inserting 'if
feof() goto quit' before append_pattern_space call (and putting label
quit at where 'q' command goes).  Consequently the initial EOF test in
append_pattern_space could be removed.

(3) 'n' at EOF repeatedly produced copies of the last line.  I changed
it to quit just like 'N' above.

(4) append_pattern_space failed when line.alloc==line.length in the
beginning (wrote past the end of the buffer).  This was easily fixed
by moving the n==0 (buffer full) test to the beginning of the loop.
(I'd guess this bug resulted from copying code from read_pattern_space,
where the buffer is never initially full.)
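
The point of (4) can be sketched outside sed as follows.  The struct
and function names are hypothetical (sed uses its own struct line and
ck_realloc); the only thing taken from the fix is the ordering: test
for a full buffer before each store, not after it.

```c
#include <assert.h>
#include <stdlib.h>

struct growbuf {
	char *text;
	size_t alloc;	/* bytes allocated, must be > 0 */
	size_t length;	/* bytes in use, <= alloc */
};

/* Append SRC[0..N) to B, doubling the buffer whenever it is full.
   The fullness test comes BEFORE each store, so the call is safe
   even when length == alloc on entry.  alloc * 2 is not checked for
   overflow here; see (12).  Returns 0 on success, -1 if realloc
   fails. */
int growbuf_append(struct growbuf *b, const char *src, size_t n)
{
	size_t i;

	assert(b->alloc > 0 && b->length <= b->alloc);
	for (i = 0; i < n; i++) {
		if (b->length == b->alloc) {
			char *t = realloc(b->text, b->alloc * 2);
			if (t == NULL)
				return -1;
			b->text = t;
			b->alloc *= 2;
		}
		b->text[b->length++] = src[i];
	}
	return 0;
}
```

With the test placed after the store instead, the first byte would be
written one past the end of a buffer that starts out exactly full.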

(5) This is no real bug, but I wanted it to give "usage:" message
when run with no arguments, rather than "No program to run".  As it
was it actually never gave the "usage" message, which may explain why
it was rather odd (namely, "SED: Usage: ...").

(6) Functions strdup, memchr and memmove defined in sed.c are part of
ANSI standard library, and the standard doesn't like anybody
redefining them.  Memchr and memmove are identical with the standard
routines (or close enough: they only don't declare their arguments as
"const"), so they can simply be isolated with #ifndef __STDC__, but
strdup differs in that it gives an error rather than returning NULL
when there isn't enough memory.  Not wishing to lose this feature I
added ck_strdup which calls strdup and errors out if it returns NULL
(of course only if __STDC__ is defined, otherwise it behaves as before).  

(7) I also made a few small changes to get rid of some warnings: 
VOID was defined as "void" only when __GNUC__ was defined, char
otherwise;  I changed it to be "void" whenever __STDC__ is defined.  
Then I changed the header of panic() (in __STDC__ case) into
standard-conforming "panic(char *str,...)" and the redeclaration of
init_buffer() in compile_address() from "char *" into "VOID *".

(8) Gnu sed isn't quite compatible with other seds I've used (SunOS
and HP-UX): it doesn't understand \{m,n\} convention, but does
interpret ? as at-most-once operator (? is a normal character to the
others).  Actually I like the way ? works, although perhaps it should
be \? instead for compatibility.

(9) Glob.c would require complete rewriting to work in DOS, however it
isn't really necessary at all: I simply isolated the relevant pieces
of code in compile_filename (two) with #ifndef NOGLOB.  The only
drawback is that things like "r *.x" won't work, but since that would
only be legal if there was exactly one matching file, it's no great
loss.  Getopt.c and alloca.c I got from the e?grep distribution, no big
changes were needed (only changing include file names &c).  I did add
an out-of-memory check to alloca, though.

(10) In more places than I cared to count pointer-valued functions are
used without declaration, zeros are passed as arguments without cast
where pointers are expected &c, i.e., sizeof(pointer)==sizeof(int) is
assumed, which obviously caused a disaster when I tried to use large
data models.  There is a fairly easy way to correct this, however: give
prototypes for all functions (easy because I'd just written a sed
script to generate prototypes...).  So I added ANSI-style prototypes
everywhere (= sed.c, regex.c, regex.h, getopt.c and alloca.c, inside
#ifdef __STDC__ of course), and it seems to work just fine. 

(11) In even more places pointers are subtracted and the result
assumed to be int, which also won't do in general.  This is not as
serious, however: it is only a problem if you actually allocate a
block >32K, so I didn't bother fixing it.

It could be fixed by using long (or ptrdiff_t, if compilers could be
relied on to define it in a standard-conforming way, which they can't)
instead of int where appropriate, or size_t (unsigned int) which would
be more efficient (in large data models in TC ptrdiff_t is long but
size_t is unsigned int - and yes, that *is* correct, see below) but
require rearranging various pieces of code so that unsigned arithmetic
would work.

For those interested in MS-DOS C compiler pathology: TurboC 2.0
defines size_t as unsigned int in all models except huge, which is
legal (standard-conforming) since malloc() won't allocate blocks >64K.
Microsoft C 5.1 defines it as unsigned int also in huge model, which
doesn't matter much since huge requires nonstandard kludges anyway.
TC defines ptrdiff_t as int in small data models (which is *not* OK as
you can allocate blocks >32K) and long in large (as it should, despite
the fact that it won't allocate anything >64K anyway).  MSC defines
ptrdiff_t as int always (yuck).  Note that you can fix these by
yourself, if you wish: they haven't yet discovered a way to ship
include files in a non-readable form (but of course then you cannot
give your sources and expect them to work without giving patches to
the include files, too).

If you want to be able to allocate blocks >64K there's no even
remotely standard way to do it (it can be done using the huge model,
but it's a real mess).

(12) Out-of-memory tests in the code are not waterproof.  For example,

	line.text=ck_realloc(line.text,line.alloc*2);

has the problem that line.alloc*2 may cause integer overflow.  (OK,
it's not very likely - but assume a script has a bug that leads to an
endless or very long loop with 'N' in it ...)  (OK, so I'm paranoid.
But that doesn't mean they're not out to get me.)

Amusingly, the EXTEND_BUFFER macro in regex.c does catch overflows in
a somewhat roundabout way: The "allocated" field is declared as long,
and when it exceeds 1L<<16 it is replaced with that, which is then
passed on to malloc(), implicitly cast as unsigned and becomes 0, so
malloc returns NULL just as if it'd run out of memory.
Note that it can have a value between 2^15 and 2^16, which is why
it has to be long.
However, making it (and "used"-field) long won't necessarily save the day 
if that actually happens (see (11) above), so we might as well let them
be ints and change EXTEND_BUFFER slightly:  I replaced (1L<<16) with
MAX_BUFFER in the definition and inserted 

#ifndef MAX_BUFFER
#define MAX_BUFFER (1<<16)
#endif

Then compiling with -DMAX_BUFFER=(unsigned)32767 does the rest.
It must be unsigned, because the test

    bufp->allocated *= 2;                                               \
    if (bufp->allocated > MAX_BUFFER) bufp->allocated = MAX_BUFFER;     \

would otherwise fail (allocated overflows).  (Or you could test before
multiplying - would also be safer, some machine might trap or something
on signed-multiply overflow).
Besides guarding against overflows this also improves efficiency
(replaces 32-bit operations with 16-bit ones).  
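
The "test before multiplying" alternative might look like the sketch
below.  next_buffer_size is a hypothetical helper, not the actual
EXTEND_BUFFER macro from regex.c, and returning 0 to signal "cannot
grow" is an assumption made for this example.

```c
#include <assert.h>

/* Hypothetical helper: the size to grow to from CUR, capped at MAX.
   Returns 0 when CUR is already at the cap, which the caller would
   treat as out of memory.  Both comparisons happen before the
   multiplication, so cur * 2 is only evaluated when it cannot
   exceed MAX and therefore cannot overflow. */
unsigned next_buffer_size(unsigned cur, unsigned max)
{
	assert(cur > 0);
	if (cur >= max)
		return 0;		/* cannot grow further */
	if (cur > max / 2)
		return max;		/* doubling would pass the cap */
	return cur * 2;
}
```

With a cap of 32767, a 16384-byte buffer grows to 32767 rather than
wrapping around, and a further request reports failure.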

-- The "right" way to do this would (IMHO) be to use size_t for
allocated and used, check all other ints if they should really be
size_t or ptrdiff_t, and change expressions that rely on something
being signed that no longer is.  Then it should still work when a
compiler comes along that has sizeof(size_t) > sizeof(int) (i.e.,
allows malloc()ing more than 64K), assuming of course it defines
size_t and ptrdiff_t correctly (sigh).

(13) After I got it working with TurboC I tried Microsoft C 5.1.  It
compiled with no changes (after I discovered that MSC defines MSDOS
instead of TC's __MSDOS__), and seems to work OK, _except_

echo a | sed "s/a/b/"

returns a, yet

echo b | sed "s/b/c/"

returns c as it should.  A little experimentation shows that it fails
whenever the string to be replaced is a single character with least
significant bit 1!  What kind of bug could cause that??


That was all: as far as I can tell it now works (when compiled with TC).
It also still works when compiled in a Sun 4.
I'll put diffs for sed.c here; regex.[ch], getopt.c and alloca.c can
be used as they are (from e?grep package for example).  If you want
the complete package, though, just send me mail.

Oh, to compile it use
TCC -A -I. sed.c regex.c alloca.c getopt.c
or equivalent in the integrated environment.  Add optimizations to taste
(I used  TCC -I. -A -O -f- -d -k- -G -Z  and it still seems to work).


*** 1.06/sed.c	Mon Sep 18 17:17:24 1989
--- sed.c	Thu Sep 28 08:34:47 1989
***************
*** 1,6 ****
--- 1,7 ----
  
  /*  GNU SED, a batch stream editor.
      Copyright (C) 1989, Free Software Foundation, Inc.
+ 	Last changed by Tapani Tarvainen 28 September 1989
  
      This program is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published by
***************
*** 42,47 ****
--- 43,70 ----
  	1.05	Fixed error in 'r' (now does things in the right order)
   */
  
+ /*	Changes by Tapani Tarvainen, September 1989:
+ 	- MS-DOS specifics (include files &c)
+ 	- fixed 'n' and 'N' at EOF
+ 	- fixed a bug in append_pattern_space
+ 	  (moved n==0 test to beginning of loop)
+ 	- fixed 's/any//' so it won't do ck_malloc(0)
+ 	- changed error and usage messages slightly (print usage
+ 	  if no arguments, remove path from name in messages)
+ 	- moved some ANSI-routine replacements inside #ifndef __STDC__
+ 	  (introduced ck_strdup)
+ 	- added ANSI-style prototypes for everything (so this compiles
+ 	  OK even when sizeof(int) != sizeof(pointer) ...)
+ */
+ 
+ #ifdef MSDOS	/* MicroSoft C 5.1 defines this	*/
+ #define __MSDOS__
+ #endif
+ #ifdef __MSDOS__
+ #define USG
+ #define NOGLOB
+ #endif
+ 
  #ifdef USG
  #define bcopy(s, d, n) ((void)memcpy((d),(s), (n)))
  #endif
***************
*** 88,94 ****
  #define ADDR_NUM	1
  #define ADDR_REGEX	2
  #define ADDR_LAST	3
! 	
  struct addr {
  	int	addr_type;
  	struct re_pattern_buffer *addr_regex;
--- 111,117 ----
  #define ADDR_NUM	1
  #define ADDR_REGEX	2
  #define ADDR_LAST	3
! 
  struct addr {
  	int	addr_type;
  	struct re_pattern_buffer *addr_regex;
***************
*** 174,193 ****
  } file_ptrs[NUM_FPS];
  
  
! /* This for all you losing compilers out there that can't handle void * */
! #ifdef __GNU__
  #define VOID void
  #else
  #define VOID char
- #endif
- 
- extern int optind;
- extern char *optarg;
  extern int getopt();
- 
  extern char *memchr();
  extern VOID *memmove();
! 
  extern VOID *ck_malloc(),*ck_realloc();
  extern VOID *init_buffer();
  extern char *get_buffer();
--- 197,247 ----
  } file_ptrs[NUM_FPS];
  
  
! #ifdef __STDC__
! #include <stdlib.h>
! #include <string.h>
  #define VOID void
+ int compile_string(char *str);
+ int compile_file(char *str);
+ struct vector * compile_program(struct vector *vector);
+ int bad_prog(char *why);
+ int inchar(void);
+ void savchar(int ch);
+ int compile_address(struct addr *addr);
+ struct sed_label * setup_jump(struct sed_label *list, struct sed_cmd *cmd, struct vector *vec);
+ FILE * compile_filename(int readit);
+ void read_file(char *name);
+ void execute_program(struct vector *vec);
+ int match_address(struct addr *addr);
+ int read_pattern_space(void);
+ void append_pattern_space(void);
+ void line_copy(struct line *from, struct line *to);
+ void line_append(struct line *from, struct line *to);
+ void str_append(struct line *to, char *string, int length);
+ int usage(void);
+ int panic(char *str, ...);
+ char * __fp_name(FILE *fp);
+ FILE * ck_fopen(char *name, char *mode);
+ void ck_fwrite(char *ptr, int size, int nmemb, FILE *stream);
+ void ck_fclose(FILE *stream);
+ void * ck_malloc(int size);
+ void * ck_realloc(void *ptr, int size);
+ char * ck_strdup(char *s);
+ void * init_buffer(void);
+ void flush_buffer(void *bb);
+ int size_buffer(void *b);
+ void add_buffer(void *bb, char *p, int n);
+ void add1_buffer(void *bb, int ch);
+ char * get_buffer(void *bb);
+ int getopt (int argc, char **argv, char *optstring);
  #else
+ /* This for all you losing compilers out there that can't handle void * */
  #define VOID char
  extern int getopt();
  extern char *memchr();
  extern VOID *memmove();
! extern char *strdup();
! #define ck_strdup strdup
  extern VOID *ck_malloc(),*ck_realloc();
  extern VOID *init_buffer();
  extern char *get_buffer();
***************
*** 196,204 ****
  extern void ck_fwrite();
  extern void flush_buffer();
  extern void add1_buffer();
- 
- extern char *strdup();
- 
  struct vector *compile_program();
  void savchar();
  struct sed_label *setup_jump();
--- 250,255 ----
***************
*** 207,212 ****
--- 258,266 ----
  void append_pattern_space();
  void read_file();
  void execute_program();
+ #endif
+ extern int optind;
+ extern char *optarg;
  
  #ifndef HAS_UTILS
  char *myname;
***************
*** 308,319 ****
  	int compiled = 0;
  	struct sed_label *go,*lbl;
  
! 	myname=argv[0];
  	while((opt=getopt(argc,argv,"ne:f:"))!=EOF) {
  		switch(opt) {
  		case 'n':
  			if(no_default_output)
! 				panic(USAGE);
  			no_default_output++;
  			break;
  		case 'e':
--- 362,386 ----
  	int compiled = 0;
  	struct sed_label *go,*lbl;
  
! #ifdef __MSDOS__
! 	if (myname = strrchr(argv[0], '\\')) {
! 		char *p;
! 		++myname;
! 		if (p = strrchr(myname, '.'))
! 			*p = 0;
! 	}
! #else
! 	if (myname = strrchr(argv[0], '/'))
! 		++myname;
! #endif
! 	else
! 		myname=argv[0];
! 
  	while((opt=getopt(argc,argv,"ne:f:"))!=EOF) {
  		switch(opt) {
  		case 'n':
  			if(no_default_output)
! 				usage();
  			no_default_output++;
  			break;
  		case 'e':
***************
*** 328,334 ****
  	}
  	if(!compiled) {
  		if(argc<=optind)
! 			panic("No program to run\n");
  		compile_string(argv[optind]);
  		optind++;
  	}
--- 395,404 ----
  	}
  	if(!compiled) {
  		if(argc<=optind)
! 			if (argc==1)
! 				usage();
! 			else
! 				panic("No program to run\n");
  		compile_string(argv[optind]);
  		optind++;
  	}
***************
*** 635,643 ****
  				} else
  					add1_buffer(b,ch);
  			}
! 			cur_cmd->x.cmd_regex.replace_length=size_buffer(b);
! 			cur_cmd->x.cmd_regex.replacement=ck_malloc(cur_cmd->x.cmd_regex.replace_length);
! 			bcopy(get_buffer(b),cur_cmd->x.cmd_regex.replacement,cur_cmd->x.cmd_regex.replace_length);
  			flush_buffer(b);
  
  			cur_cmd->x.cmd_regex.flags=0;
--- 705,714 ----
  				} else
  					add1_buffer(b,ch);
  			}
! 			if (cur_cmd->x.cmd_regex.replace_length=size_buffer(b)) {
! 				cur_cmd->x.cmd_regex.replacement=ck_malloc(cur_cmd->x.cmd_regex.replace_length);
! 				bcopy(get_buffer(b),cur_cmd->x.cmd_regex.replacement,cur_cmd->x.cmd_regex.replace_length);
! 			}
  			flush_buffer(b);
  
  			cur_cmd->x.cmd_regex.flags=0;
***************
*** 788,794 ****
  {
  	int	ch;
  	int	num;
! 	char	*b,*init_buffer();
  
  	ch=inchar();
  
--- 859,866 ----
  {
  	int	ch;
  	int	num;
! 	char	*b;
! 	VOID	*init_buffer();
  
  	ch=inchar();
  
***************
*** 862,868 ****
  	tmp=(struct sed_label *)ck_malloc(sizeof(struct sed_label));
  	tmp->v=vec;
  	tmp->v_index=cmd-vec->v;
! 	tmp->name=strdup(get_buffer(b));
  	tmp->next=list;
  	flush_buffer(b);
  	return tmp;
--- 934,940 ----
  	tmp=(struct sed_label *)ck_malloc(sizeof(struct sed_label));
  	tmp->v=vec;
  	tmp->v_index=cmd-vec->v;
! 	tmp->name=ck_strdup(get_buffer(b));
  	tmp->next=list;
  	flush_buffer(b);
  	return tmp;
***************
*** 878,885 ****
--- 950,959 ----
  	int n;
  	VOID *b;
  	int ch;
+ #ifndef NOGLOB
  	char **globbed;
  	extern char **glob_filename();
+ #endif
  
  	if(inchar()!=' ')
  		bad_prog("missing ' ' before filename");
***************
*** 888,893 ****
--- 962,968 ----
  		add1_buffer(b,ch);
  	add1_buffer(b,'\0');
  	file_name=get_buffer(b);
+ #ifndef NOGLOB
  	globbed=glob_filename(file_name);
  	if(globbed==0 || globbed==(char **)-1)
  		bad_prog("can't parse filename");
***************
*** 895,900 ****
--- 970,976 ----
  		bad_prog("multiple files");
  	if(globbed[0])
  		file_name=globbed[0];
+ #endif
  	for(n=0;n<NUM_FPS;n++) {
  		if(!file_ptrs[n].name)
  			break;
***************
*** 906,912 ****
  		}
  	}
  	if(n<NUM_FPS) {
! 		file_ptrs[n].name=strdup(file_name);
  		file_ptrs[n].readit=readit;
  		file_ptrs[n].phile=ck_fopen(file_name,readit ? "r" : "a");
  		flush_buffer(b);
--- 982,988 ----
  		}
  	}
  	if(n<NUM_FPS) {
! 		file_ptrs[n].name=ck_strdup(file_name);
  		file_ptrs[n].readit=readit;
  		file_ptrs[n].phile=ck_fopen(file_name,readit ? "r" : "a");
  		flush_buffer(b);
***************
*** 1137,1147 ****
--- 1213,1225 ----
  			break;
  
  		case 'n':
+ 			if (feof(input_file)) goto quit;
  			ck_fwrite(line.text,1,line.length,stdout);
  			read_pattern_space();
  			break;
  
  		case 'N':
+ 			if (feof(input_file)) goto quit;
  			append_pattern_space();
  			break;
  
***************
*** 1159,1165 ****
  			break;
  
  		case 'q':
! 			quit_cmd++;
  			end_cycle++;
  			break;
  
--- 1237,1243 ----
  			break;
  
  		case 'q':
! quit:			quit_cmd++;
  			end_cycle++;
  			break;
  
***************
*** 1166,1172 ****
  		case 'r':
  			{
  				int n;
! 				char tmp_buf[1024];
  
  				rewind(cur_cmd->x.io_file);
  				while((n=fread(append.text+append.length,sizeof(char),append.alloc-append.length,cur_cmd->x.io_file))>0) {
--- 1244,1250 ----
  		case 'r':
  			{
  				int n;
! 				/*char tmp_buf[1024];*/
  
  				rewind(cur_cmd->x.io_file);
  				while((n=fread(append.text+append.length,sizeof(char),append.alloc-append.length,cur_cmd->x.io_file))>0) {
***************
*** 1407,1415 ****
  
  	input_line_number++;
  	replaced=0;
- 	if(feof(input_file))
- 		return;
  	for(;;) {
  		ch=getc(input_file);
  		if(ch==EOF) {
  			if(n==line.alloc)
--- 1485,1497 ----
  
  	input_line_number++;
  	replaced=0;
  	for(;;) {
+ 		if(n==0) {
+ 			line.text=ck_realloc(line.text,line.alloc*2);
+ 			p=line.text+line.alloc;
+ 			n=line.alloc;
+ 			line.alloc*=2;
+ 		}
  		ch=getc(input_file);
  		if(ch==EOF) {
  			if(n==line.alloc)
***************
*** 1427,1438 ****
  			line.length=line.alloc-n;
  			break;
  		}
- 		if(n==0) {
- 			line.text=ck_realloc(line.text,line.alloc*2);
- 			p=line.text+line.alloc;
- 			n=line.alloc;
- 			line.alloc*=2;
- 		}
  	}
  	ch=getc(input_file);
  	if(ch!=EOF)
--- 1509,1514 ----
***************
*** 1487,1492 ****
--- 1563,1576 ----
  	to->length+=length;
  }
  
+ /* print usage info and exit */
+ usage()
+ {
+ 	fprintf(stderr,USAGE,myname);
+ 	exit(4);
+ }
+ 
+ 
  #ifndef HAS_UTILS
  /* These routines were written as part of a library (by me), but since most
     people don't have the library, here they are.  */
***************
*** 1495,1502 ****
  #include "stdarg.h"
  
  /* Print an error message and exit */
! panic(str)
! char *str;
  {
  	va_list iggy;
  
--- 1579,1585 ----
  #include "stdarg.h"
  
  /* Print an error message and exit */
! panic(char *str, ...)
  {
  	va_list iggy;
  
***************
*** 1643,1648 ****
--- 1726,1743 ----
  	return ret;
  }
  
+ #ifdef __STDC__
+ char *
+ ck_strdup(char *s)
+ {
+ 	char *ret;
+ 	if (!(ret = strdup(s)))
+ 		panic("Couldn't allocate memory");
+ 	return ret;
+ }
+ 
+ #else /* not __STDC__ */
+ 
  /* Return a malloc()'d copy of a string */
  char *
  strdup(str)
***************
*** 1655,1661 ****
  	return ret;
  }
  
- 
  /*
   * memchr - search for a byte
   *
--- 1750,1755 ----
***************
*** 1713,1718 ****
--- 1807,1813 ----
  
  	return(dst);
  }
+ #endif /* not __STDC__ */
  
  /* Implement a variable sized buffer of 'stuff'.  We don't know what it is,
     nor do we care, as long as it doesn't mind being aligned by malloc. */


-- 
Tapani Tarvainen    (tarvaine@tukki.jyu.fi, tarvainen@finjyu.bitnet)

tarvaine@tukki.jyu.fi (Tapani Tarvainen) (09/29/89)

In article <1397@tukki.jyu.fi> tarvaine@tukki.jyu.fi I wrote:
>This is a report of problems I encountered in porting Gnu SED (1.06)
>to MS-DOS.
...
>That was all: as far as I can tell it now works

Well, I was a bit hasty:  I've found some more bugs:

* Comments at end of file were not treated properly:  The vector->v_length
is incremented before comments are detected.  After junking the comment
a jump is made to a point after it has been incremented (where addresses
are processed), so there's no problem if a real command follows, but if
it's at EOF, v_length is left one too big.  I fixed this by decrementing
v_length after skipping the comment text and jumping to an earlier point
(where EOF and blank lines are skipped before incrementing v_length).

* 'r' reallocation of append buffer looked like this:

case 'r':
    {
        int n;
        char tmp_buf[1024];

        rewind(cur_cmd->x.io_file);
        while((n=fread(append.text+append.length,sizeof(char),
                append.alloc-append.length,cur_cmd->x.io_file))>0) {
            append.length += n;
            if(append.length==append.alloc) {
                append.text = ck_realloc(append.text, 
                    append.alloc + cur_cmd->x.cmd_txt.text_len);
                append.alloc += cur_cmd->x.cmd_txt.text_len;
            }
        }

There are two problems with this:
First, it is possible that append.length==append.alloc initially
('a' won't increase append if it gets exactly full).
Second, cur_cmd->x.cmd_txt.text_len doesn't contain anything useful
here (maybe the code was copied from 'a'?).

The first can be fixed by making it a do-loop.

The second I fixed by doubling the size of the buffer when it gets full.
After looking at how 'a' works I changed it to do the same:  it used to
increase the buffer size just enough to hold the new line, which works
but is inefficient.  (This technique is used also in regex &c.  It is
quite efficient in general, but in tight memory situations can result
in an unnecessary out-of-memory error.  Would it be a good idea to write
an allocation routine which is given minimum and maximum amount of
memory required, instead of an exact amount?)
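
One possible shape for such a routine, as a sketch only.  The name
alloc_between, its interface, and the halving strategy are all
assumptions made for illustration; nothing like it exists in sed or
regex, and a real version would more likely wrap realloc().

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical allocator: try for WANT bytes, and on failure keep
   halving the surplus over NEED until an attempt succeeds or even
   NEED bytes cannot be had.  On success the size actually obtained
   is stored in *GOT.  Returns NULL only if NEED bytes are
   unavailable. */
void *alloc_between(size_t need, size_t want, size_t *got)
{
	size_t size = want;
	void *p;

	assert(need > 0 && need <= want && got != NULL);
	for (;;) {
		p = malloc(size);
		if (p != NULL) {
			*got = size;
			return p;
		}
		if (size == need)
			return NULL;	/* even the minimum failed */
		size = need + (size - need) / 2;	/* shrink toward NEED */
	}
}
```

A doubling caller would pass the space it must have as the minimum and
twice the current allocation as the maximum, then record *got as the
new allocation size.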

Finally, tmp_buf isn't used for anything at all, so I removed it.
I also removed another unused variable ('file' in compile_file()).
(Yes, I know, a good compiler would not allocate memory for unused
variables - but I'm not going to port gcc to MS-DOS!)

To save space I'll put here diff from my previous version to current.
If you want diffs from 1.06 to this just let me know (preferably by
mail: news arrive here about one week old for the time being).

*** sed.old	Thu Sep 28 08:34:47 1989
--- sed.c	Fri Sep 29 08:47:29 1989
***************
*** 1,7 ****
  
  /*  GNU SED, a batch stream editor.
      Copyright (C) 1989, Free Software Foundation, Inc.
! 	Last changed by Tapani Tarvainen 28 September 1989
  
      This program is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published by
--- 1,7 ----
  
  /*  GNU SED, a batch stream editor.
      Copyright (C) 1989, Free Software Foundation, Inc.
! 	Last changed by Tapani Tarvainen 29 September 1989
  
      This program is free software; you can redistribute it and/or modify
      it under the terms of the GNU General Public License as published by
***************
*** 49,54 ****
--- 49,58 ----
  	- fixed a bug in append_pattern_space
  	  (moved n==0 test to beginning of loop)
  	- fixed 's/any//' so it won't do ck_malloc(0)
+ 	- fixed 'r' reallocation of append buffer
+ 	- changed 'a' reallocation: double append buffer when it overflows
+ 	  instead of adding just enough for the new line
+ 	- fixed handling of comments at end of file
  	- changed error and usage messages slightly (print usage
  	  if no arguments, remove path from name in messages)
  	- moved some ANSI-routine replacements inside #ifndef __STDC__
***************
*** 457,463 ****
  compile_file(str)
  char *str;
  {
! 	FILE *file;
  	int ch;
  
  	prog_start=prog_cur=prog_end=0;
--- 461,467 ----
  compile_file(str)
  char *str;
  {
! /*	FILE *file;	*/
  	int ch;
  
  	prog_start=prog_cur=prog_end=0;
***************
*** 505,510 ****
--- 509,515 ----
  		vector->next_one = 0;
  	}
  	for(;;) {
+ 	skip_comment:
  		do ch=inchar();
  		while(ch!=EOF && (isspace(ch) || ch=='\n' || ch==';'));
  		if(ch==EOF)
***************
*** 523,529 ****
  		cur_cmd->aflags=0;
  		cur_cmd->cmd=0;
  
- 	skip_comment:
  		if(compile_address(&(cur_cmd->a1))) {
  			ch=inchar();
  			if(ch==',') {
--- 528,533 ----
***************
*** 547,552 ****
--- 551,557 ----
  				bad_prog(NO_ADDR);
  			do ch=inchar();
  			while(ch!=EOF && ch!='\n');
+ 			vector->v_length--;
  			goto skip_comment;
  		case '!':
  			if(cur_cmd->aflags & ADDR_BANG_BIT)
***************
*** 1085,1093 ****
  			break;
  
  		case 'a':
! 			if(append.alloc-append.length<cur_cmd->x.cmd_txt.text_len) {
! 				append.text=ck_realloc(append.text,append.alloc+cur_cmd->x.cmd_txt.text_len);
! 				append.alloc+=cur_cmd->x.cmd_txt.text_len;
  			}
  			bcopy(cur_cmd->x.cmd_txt.text,append.text+append.length,cur_cmd->x.cmd_txt.text_len);
  			append.length+=cur_cmd->x.cmd_txt.text_len;
--- 1090,1098 ----
  			break;
  
  		case 'a':
! 			while(append.alloc-append.length<cur_cmd->x.cmd_txt.text_len) {
! 				append.alloc *= 2;
! 				append.text=ck_realloc(append.text,append.alloc);
  			}
  			bcopy(cur_cmd->x.cmd_txt.text,append.text+append.length,cur_cmd->x.cmd_txt.text_len);
  			append.length+=cur_cmd->x.cmd_txt.text_len;
***************
*** 1243,1259 ****
  
  		case 'r':
  			{
! 				int n;
! 				/*char tmp_buf[1024];*/
  
  				rewind(cur_cmd->x.io_file);
! 				while((n=fread(append.text+append.length,sizeof(char),append.alloc-append.length,cur_cmd->x.io_file))>0) {
  					append.length += n;
  					if(append.length==append.alloc) {
! 						append.text = ck_realloc(append.text, append.alloc + cur_cmd->x.cmd_txt.text_len);
! 						append.alloc += cur_cmd->x.cmd_txt.text_len;
  					}
! 				}
  				if(ferror(cur_cmd->x.io_file))
  					panic("Read error on input file to 'r' command\n");
  			}
--- 1248,1263 ----
  
  		case 'r':
  			{
! 				int n=0;
  
  				rewind(cur_cmd->x.io_file);
! 				do {
  					append.length += n;
  					if(append.length==append.alloc) {
! 						append.alloc *= 2;
! 						append.text = ck_realloc(append.text, append.alloc);
  					}
! 				} while((n=fread(append.text+append.length,sizeof(char),append.alloc-append.length,cur_cmd->x.io_file))>0);
  				if(ferror(cur_cmd->x.io_file))
  					panic("Read error on input file to 'r' command\n");
  			}
-- 
Tapani Tarvainen    (tarvaine@tukki.jyu.fi, tarvainen@finjyu.bitnet)