[mod.sources] vstr - a dynamic string package

sources-request@genrad.UUCP (06/04/85)

From: jordan@greipa (Jordan K. Hubbard)

-------- cut here ---------
# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by greipa!jordan on Fri May 31 02:15:19 PDT 1985
# Contents:  Makefile OPERATION README examp.c v_alloc.c v_append.c v_clear.c
#	v_cursor.c v_debug.c v_delete.c v_find.c v_free.c v_get.c v_move.c
#	v_put.c v_rewind.c v_string.c v_utils.c vstr.h vstr.man
 
echo x - Makefile
sed 's/^@//' > "Makefile" <<'@//E*O*F Makefile//'
# Makefile for vstr package, 04/16/85 Jordan K. Hubbard

INC=/usr/include/local
LIB=/usr/lib
OBJS=	v_put.o v_get.o v_alloc.o v_debug.o v_rewind.o v_append.o v_utils.o\
		v_string.o v_delete.o v_move.o v_cursor.o v_free.o v_clear.o v_find.o

INCS=vstr.h

@.c.o:
	cc -O -c $<

libvstr.a: $(OBJS)
	ar r libvstr.a $(OBJS)
	ranlib libvstr.a

$(OBJS): $(INCS)

install: libvstr.a
	cp libvstr.a $(LIB)
	cp $(INCS) $(INC)
	cp vstr.man /usr/man/manl/vstr.l
	ranlib $(LIB)/libvstr.a
@//E*O*F Makefile//
chmod u=rw,g=r,o=r Makefile
 
echo x - OPERATION
sed 's/^@//' > "OPERATION" <<'@//E*O*F OPERATION//'
The operation of this package is fairly straightforward. A "virtual string"
(actually, I should have called it a "dynamic" string , but it was too late
as I didn't want to have to change all those 'v's to 'd's..) Is nothing
more than a linked list of 'units', each containing a previous and next
(unit) pointer, a data area and a length word.

The header block allocated by new_vstr contains a head and tail pointer that
points to the first and last units in the 'chain'. This makes it easy to
to 'rewind' and 'append' the string. Also in the header block is the 'unit'
size for that particular vstr. A variable unit size is necessary so that
v-strings can be 'tuned' for various applications. A larger unit size can
save an appreciable amount of space on machines where ints and pointers
are 32 bits (here I have a 10 byte overhead for each unit).
However, if massive editing is being done, or the expected data is small,
smaller units pay off. Memory conscious folks should see the footnote at
the end of this file.

Finally, and probably most important, is a 'cursor'. The cursor is nothing
more that a structure containing a unit pointer and an offset into it.
The cursor 'points' nowhere when the string is first created, then moves 
as characters are added to the string. The cursor will always point
to one of three locations when the string contains text:

1)	In the 'rewound' state, the cursor points to the first unit in
	the chain with an offset of -1 (not on any character).

2)	In the 'appended' state, the cursor will point at the last unit
	with an offset equal to the number of characters in the last unit.
	Since arrays start at zero, this puts it 'off the end' of the data
	area for that unit, not on any character.

3)	After a 'get', 'put', 'find' etc. operation, the cursor will point to the
	character affected by the last operation. If one had just done a 'get'
	for instance, the cursor would point to the character just gotten.


There are also other operations can make use of 'cursors' that the
user allocates himself. They are described in the manual page.

One might ask why I put a length word in each unit instead of just
null terminating it. Well, in a word, it's a lot faster when calculating
the length of the string (which I seem to do a lot) and memory is
cheap on this machine. This is another area in which I welcome feedback.

One last small thing, I got kind of carried away with the 'debugging'
features. I found them rather useful for tracing the code as I debugged
it. You might wish to look at some of the 'wrapper' defines in vstr.h
and the code in v_debug.c (if the 'id' stack confuses you, just think
about nested procedure calls). It's all kind of crude, but effective.
It's also quite usable elsewhere. Please don't flame me for my
handling of 'variable' arguments. A lot of people don't have varargs
and I'm certainly not going to vax-kludge it (doesn't work on pyramids
anyway).

Footnote:

* For people that DON'T have a lot of memory, you might want to consider
the following (people wishing to create very large vstr's might also look
into it). Look at the define called LENTYPE in vstr.h. As shipped, it
is defined as a short (16 bits on my machine). This allows me to
create and index fairly large units (useful when I slurp in something
like the termcap file). Some people may find a limit of 127 adequate
and can change it to a signed char (must be signed). Inversely, some
may wish to make this an int.
@//E*O*F OPERATION//
chmod u=rw,g=r,o=r OPERATION
 
echo x - README
sed 's/^@//' > "README" <<'@//E*O*F README//'
You might want to edit the makefile to put the library and include file
somewhere else. Currently it's set for /usr/lib and /usr/include/local,
respectivly. The manual page automatically tries to go in /usr/man/manl,
you may want to change this as well.

This program has not really been used at all, I'm releasing it 
mainly to get some feedback on it (this is not to say that it doesn't
work, there are no KNOWN bugs..). There are two inconsistancies
in the find and delete commands that might raise some ire, see the
documentation.

You will probably want to edit vstr.h to suit your tastes, read OPERATION
for some more vaguely useful information.

There's also a very crude test driver called examp.c.. It wasn't written,
it just grew.. I include it only so that you may do some rudimentary testing
if you wish to take the easy way out. No documentation, no support, if
it was a child I'd disown it. You figure it out.

Please mail any bugs/flames to me so that I may improve the code (the package,
not examp.c).

I'm also interested in any uses made of it, I haven't figured out
any myself [ :-) ]... This package can be redistributed and used
freely with appropriate credit. I am quite curious about the ways
you may find to use it, any mail/interesting code describing
this would be appreciated.


Future plans include block operations, compaction on the fly, regular
expression searches (currently not included due to the differences
in SysV and 4.2 regex(p) code) and any speedups I can think of.

Suggestions gratefully accepted.



Jordan Hubbard
May 30th, 1985.
{decwrl, dual, pesnta}!greipa!jordan
@//E*O*F README//
chmod u=rw,g=r,o=r README
 
echo x - examp.c
sed 's/^@//' > "examp.c" <<'@//E*O*F examp.c//'
#include "vstr.h"
#include <stdio.h>
#include <ctype.h>

int i;

eatnl() {
	register char ch;
	register int foundnum;

	while ((ch = getchar()) != '\n')
		if (isdigit(ch) || ch == '-') {
			ungetc(ch, stdin);
			scanf("%d", &i);
			foundnum = 1;
		}
	if (!foundnum)
		i = 0;
}

main() {
		register p_vstr testv = 0;
		register char *st, fname[40];
		char ch;
		struct cursor save;
		int going = 1, mode = FWD, cnt, chr;
		register FILE *fp;

		while (going) {
			printf("Command> ");
			ch = getchar();
			if (ch == '\n') {
				eatnl();
				continue;
			}
			eatnl();
			switch (ch) {

				case 'p':
					while ((ch = getchar()) != '~') {
						v_put(testv, ch, mode);
						if (mode == HERE)
							break;
					}
					eatnl();
					break;

				case 'g':
					if (mode == INS) {
						printf("Now in FWD mode\n");
						mode = FWD;
					}
					while ((ch = v_get(testv, mode)) !=
					  (mode == FWD ? ATEND : ATBEG)) {
						putchar(ch);
						if (mode == HERE)
							break;
					}
					printf("*END*\n");
					break;

				case 's':
					v_savecursor(testv, &save);
					printf("saved\n");
					break;

				case 'r':
					v_setcursor(testv, &save);
					printf("restored\n");
					break;

				case 'S':
					v_show(testv);
					break;

				case 'm':
					if (i < 0)
						i = v_move(testv, BACK, -i);
					else
						i = v_move(testv, FWD, i);
					printf("Moved %d positions\n", i);
					break;

				case 'd':
					printf("Deleted %d chars\n", v_delete(testv, i));
					break;

				case 'q':
					going = 0;
					break;

				case 'f':
					v_free(testv);
					testv = 0;
					break;

				case 'n':
					testv = new_vstr(i);
					printf("Address is %x\n", testv);
					break;

				case 'c':
					v_clear(testv, END);
					break;

				case 'R':
					v_rewind(testv);
					printf("Rewound\n");
					break;

				case 'A':
					v_append(testv);
					printf("Appended\n");
					break;

				case 'M':
					switch(i) {

						case 0:
							mode = FWD;
							printf("Forward\n");
							break;

						case 1:
							mode = BACK;
							printf("Back\n");
							break;

						case 2:
							mode = HERE;
							printf("Here\n");
							break;

						case 3:
							mode = INS;
							printf("Insert\n");
							break;

						default:
							printf("Unknown mode. Set to forward\n");
							mode = FWD;
							break;
					}
				break;

				case '>':
					cnt = 0;
					printf("To file name: ");
					scanf("%s", fname);
					eatnl();
					fp = fopen(fname, "w");
					if (!fp) {
						printf("Can't open %s for writing\n", fname);
						continue;
					}
					while ((ch = v_get(testv, mode)) !=
					  (mode == FWD ? ATEND : ATBEG)) {
						fputc(ch, fp);
						cnt++;
						if (mode == HERE)
							break;
					}
					printf("%d chars written\n", cnt);
					fclose(fp);
					break;

				case '<':
					cnt = 0;
					printf("From file name: ");
					scanf("%s", fname);
					eatnl();
					fp = fopen(fname, "r");
					if (!fp) {
						printf("Can't open %s for writing\n", fname);
						continue;
					}
					while((chr = fgetc(fp)) != EOF) {
						cnt++;
						v_put(testv, chr, FWD);
					}
					fclose(fp);
					printf("%d chars read, length of vstr is now %d\n",
					  cnt, v_length(testv, BEG));
					break;

				case '/':
					/* look ma, no statement */
					{
						register p_cursor cp;

					printf("Search for char: ");
					scanf("%c", &ch);
					if (cp = v_find(testv, ch)) {
						printf("Found.\n");
						v_setcursor(testv, cp);
					}
					} /* yuk. So I'm too lazy to declare my locals earlier.
						this is just a lousy test program.
						What do you want from me? Style? */
					break;

				default:
					printf("Unknown command: %c.\n", ch);
			}
	}
}
@//E*O*F examp.c//
chmod u=rw,g=r,o=r examp.c
 
echo x - v_alloc.c
sed 's/^@//' > "v_alloc.c" <<'@//E*O*F v_alloc.c//'
/*
 * V_ALLOC Version 1.0, 4/16/85
 * Jordan K. Hubbard
 */

#include "vstr.h"

IMPORT char *malloc();

ROUTINE p_unit new_unit(v)
register p_vstr v;
{
	register p_unit tmp;

	entry(new_unit)

	tmp = (p_unit)malloc(sizeof *tmp);
	if (!tmp)
		freak_out(M_NOSPACE);
	tmp->next = tmp->prev = NIL;
	tmp->len = 0;
	tmp->data = malloc(v->unit_size);
	if (!tmp->data)
		freak_out(M_NOSPACE);
	ret(tmp)
}

ROUTINE p_vstr new_vstr(i)
register int i;
{
	register p_vstr tmp;

	entry(new_vstr)

	tmp = (p_vstr)malloc(sizeof *tmp);
	if (!tmp)
		freak_out(M_NOSPACE);
	tmp->head = tmp->tail = tmp->cur.here = NIL;
	tmp->cur.u_pos = -1;
	tmp->unit_size = i;
	ret(tmp)
}
@//E*O*F v_alloc.c//
chmod u=rw,g=r,o=r v_alloc.c
 
echo x - v_append.c
sed 's/^@//' > "v_append.c" <<'@//E*O*F v_append.c//'
#include "vstr.h"

ROUTINE void v_append(v)
register p_vstr v;
{
	entry(v_append)

	v->cur.here = v->tail;
	v->cur.u_pos = v->cur.here->len;
	end
}
@//E*O*F v_append.c//
chmod u=rw,g=r,o=r v_append.c
 
echo x - v_clear.c
sed 's/^@//' > "v_clear.c" <<'@//E*O*F v_clear.c//'
#include "vstr.h"

ROUTINE void v_clear(v, how)
register p_vstr v;
register int how;
{
	register p_unit temp;

	entry(v_clear)

	if (!v && !v->head)
		end
	if (how == END) {
		temp = v->cur.here;
		if (v->cur.u_pos) {
			temp->len = v->cur.u_pos;
			temp = temp->next;
		}
	}
	else
		temp = v->head;
	while (temp) {
		free(temp);
		temp = temp->next;
	}
	v->head = v->tail = v->cur.here = 0;
	v->cur.u_pos = -1;
	end
}
@//E*O*F v_clear.c//
chmod u=rw,g=r,o=r v_clear.c
 
echo x - v_cursor.c
sed 's/^@//' > "v_cursor.c" <<'@//E*O*F v_cursor.c//'
#include "vstr.h"

ROUTINE p_cursor v_savecursor(v, c)
register p_vstr v;
register p_cursor c;
{
	entry(v_savecursor)

	c->here = v->cur.here;
	c->u_pos = v->cur.u_pos;
#ifdef DEBUG
yelp("saving cursor at unit %x, pos %d\n", c->here, c->u_pos);
#endif
	ret(c)
}

ROUTINE void v_setcursor(v, c)
register p_vstr v;
register p_cursor c;
{
	entry(v_setcursor)

	v->cur.here = c->here;
	v->cur.u_pos = c->u_pos;
#ifdef DEBUG
yelp("setting cursor to unit %u, pos %d\n", c->here, c->u_pos);
#endif
	end
}

ROUTINE void v_copycursor(c1, c2)
register p_cursor c1, c2;
{
	entry(v_copycursor)

	c1->here = c2->here;
	c1->u_pos = c2->u_pos;
	end
}

ROUTINE int v_span(c1, c2)
register p_cursor c1, c2;
{
	register int cnt = 0;
	register p_unit tmp = c1->here;

	entry(v_span)

	while (tmp && (tmp != c2->here)) {
		cnt += tmp->len;
		tmp = tmp->next;
	}
	if (tmp)
		cnt += c2->u_pos;
	ret(cnt)
}
@//E*O*F v_cursor.c//
chmod u=rw,g=r,o=r v_cursor.c
 
echo x - v_debug.c
sed 's/^@//' > "v_debug.c" <<'@//E*O*F v_debug.c//'
/*
 * V_DEBUG Version 1.0 4/16/85
 * Jordan K. Hubbard
 */

#include "vstr.h"
#include <stdio.h>

#define PUSH	1
#define POP		2
#define LOOK	3

#ifdef DEBUG
char *id();

ROUTINE void yelp(fmt, p1, p2, p3, p4, p5, p6)
register char *fmt;
register unsigned char *p1, *p2, *p3, *p4, *p5, *p6;
{
	fprintf(stderr, "DEBUG: [%s] ", id(0, LOOK));
	fprintf(stderr, fmt, p1, p2, p3, p4, p5, p6);
	return;
}

ROUTINE char *id(s, mode)
register char *s;
register int mode;
{
	/* if we get more than 50 subroutine levels deep, something's wrong */
	/* with our program anyway (or we've taken modular programming too far!) */
	static char *stack[50];
	static int pointer = 0;
	
	switch (mode) {

		case PUSH:
			stack[pointer++] = s;
			if (pointer == 50) {
				yelp("stack overflow in id!\n");
				exit(1);
			}
			break;

		case POP:
			if (pointer - 1 < 0) {
				yelp("stack underflow in id!\n");
				exit(1);
			}
			return(stack[--pointer]);
			break;

		case LOOK:
			if (pointer - 1 < 0) {
				yelp("empty stack for 'look' in id!\n");
				exit(1);
			}
			return(stack[pointer - 1]);
			break;
	}
}

ROUTINE void _end()
{
	yelp("Exiting with no returned value\n");
	id(0, POP);
	return;
}

ROUTINE void _entry(s)
register char *s;
{
	id(s, PUSH);
	yelp("Entering\n");
	return;
}

ROUTINE void _ret(l)
unsigned char *l;
{
	yelp("Exiting with %u (hex: %x)\n", l, l);
	id(0, POP);
	return;
}

#else

EXPORT char *_Curr_rtn;

#endif

/* Makes character 'visible' by turning it into a printable control */
/* character if it is one or its octal \nnn form if it is greater than */
/* the upper limit for printable ascii */

ROUTINE char *visible(ch)
register char ch;
{
	static char buf[5];

	entry(visible)

	buf[0] = '\0';
	if (ch < 32) {
		buf[0] = '^';
		ch += 64;
	}
	if (ch > 31 && ch < 127) {
		if (buf[0]) {
			buf[1] = ch;
			buf[2] = '\0';
		}
		else {
			buf[0] = ch;
			buf[1] = '\0';
		}
	}
	else if (ch > 126) {
		sprintf(buf, "\\%03o", ch);
		buf[4] = '\0';
	}
	ret(buf)
}

ROUTINE void v_show(v)
register p_vstr v;
{
	register p_unit temp;
	register int n_units = 0, n_chars = 0, l;
	register int n_funits = 0;

	entry(v_show)

	if (!v) {
		fprintf(stderr, "v_show: vstr is NIL\n");
		end
	}
	fprintf(stderr, "Debugging information for vstr at %X\n", v);
	fprintf(stderr, "Head is at %X, Tail at %X\n", v->head, v->tail);
	fprintf(stderr, "Unit size is %d\n", v->unit_size);
	fprintf(stderr, "Cursor points to unit %X, position %d\n\n",
	  v->cur.here, v->cur.u_pos);
	temp = v->head;
	while (temp) {
		fprintf(stderr, "prev     unit     next\n");
		fprintf(stderr, "(%X) <= [%X] => (%X)   ", temp->prev, temp,
		  temp->next);
		if (temp == v->head)
			fprintf(stderr, "HEAD  ");
		if (temp == v->tail)
			fprintf(stderr, "TAIL  ");
		fprintf(stderr, "#%d\n", n_units);
		fprintf(stderr, "\tLength = %d", temp->len);
		if (temp->len) {
			n_chars += temp->len;
			fprintf(stderr, "  Data = '");
			for (l = 0; l < temp->len; l++)
				fprintf(stderr, "%s", visible(temp->data[l]));
			fprintf(stderr, "'\n\n");
			if (temp->len == v->unit_size)
				n_funits++;
		}
		else
			fprintf(stderr, "\n\n");
		temp = temp->next;
		n_units++;
	}
	fprintf(stderr, "\n\nTotal: %d characters in %d units. %d full, %d not.\n",
	    n_chars, n_units, n_funits, n_units - n_funits);
	end
}

ROUTINE void freak_out(fmt, p1, p2, p3, p4, p5, p6)
register char *fmt;
register int p1, p2, p3, p4, p5, p6;
{
#ifdef DEBUG
	fprintf(stderr, "%s: ", id(0, LOOK));
#else
	fprintf(stderr, "%s: ", _Curr_rtn);
#endif
	fprintf(stderr, fmt, p1, p2, p3, p4, p5, p6);
	exit(1);
}
@//E*O*F v_debug.c//
chmod u=rw,g=r,o=r v_debug.c
 
echo x - v_delete.c
sed 's/^@//' > "v_delete.c" <<'@//E*O*F v_delete.c//'
#include "vstr.h"

ROUTINE int v_delete(v, cnt)
register p_vstr v;
register int cnt;
{
	register p_unit temp;
	register int actual = 0, n;

	entry(v_delete)

	if (!cnt)
		end
	while (cnt) {
		temp = v->cur.here;
		n = temp->len - v->cur.u_pos;
		if (!n) {
			if (!temp->next)
				ret(actual)
			else {
				v->cur.here = temp->next;
				v->cur.u_pos = 0;
			}
			continue;
		}
		if (n <= cnt) {
			if (!v->cur.u_pos) {
				cnt -= temp->len;
				actual += temp->len;
				linkout(v, temp);
				if (v->cur.here == temp->prev)
					cnt = 0;
			}
			else {
				temp->len -= n;
				cnt -= n;
				actual += n;
				if (temp->next) {
					v->cur.here = temp->next;
					v->cur.u_pos = 0;
				}
				else
					cnt = 0;
			}
		}
		else {
			strncpy(temp->data + v->cur.u_pos,
				temp->data + v->cur.u_pos + cnt, n);
			temp->len -= cnt;
			actual += cnt;
			cnt = 0;
		}
	}
	ret(actual)
}
@//E*O*F v_delete.c//
chmod u=rw,g=r,o=r v_delete.c
 
echo x - v_find.c
sed 's/^@//' > "v_find.c" <<'@//E*O*F v_find.c//'
#include "vstr.h"

ROUTINE p_cursor v_find(v, ch, dir)
register p_vstr v;
register char ch;
register int dir;
{
	static struct cursor found;
	struct cursor save;
	register int i;

	entry(v_find)

	if (dir != FWD && dir != BACK)
		freak_out("called with bad mode (not BACK or FWD)\n");
	v_savecursor(v, &save);
	while ((i = v_get(v, dir)) != (dir == FWD ? ATEND : ATBEG))
		if (i == ch) {
			v_savecursor(v, &found);
			v_setcursor(v, &save);
			ret(&found)
		}
	v_setcursor(v, &save);
	ret(0)
}
@//E*O*F v_find.c//
chmod u=rw,g=r,o=r v_find.c
 
echo x - v_free.c
sed 's/^@//' > "v_free.c" <<'@//E*O*F v_free.c//'
#include "vstr.h"

ROUTINE void v_free(v)
register p_vstr v;
{
	register p_unit temp;

	entry(v_free)

	if (!v)
		freak_out(M_NILVSTR);
	temp = v->head;
	while (temp) {
		free(temp);
		temp = temp->next;
	}
	free(v);
	end
}
@//E*O*F v_free.c//
chmod u=rw,g=r,o=r v_free.c
 
echo x - v_get.c
sed 's/^@//' > "v_get.c" <<'@//E*O*F v_get.c//'
/*
 * V_GET Version 1.0, 4/16/85
 * Jordan K. Hubbard
 */

#include "vstr.h"

ROUTINE int v_get(v, mode)
register p_vstr v;
register int mode;
{
	register int i;

	entry(v_get)

#ifdef DEBUG
yelp("called with vstr at %x, mode = %d\n", v, mode);
#endif
	if (!v)
		freak_out(M_NILVSTR);
	if (!v->head)
		ret((mode == BACK) ? ATBEG : ATEND)
	switch(mode) {

		case FWD:
			if (v_move(v, FWD, 1) == ATEND)
				ret(ATEND)
			i = (int)v->cur.here->data[v->cur.u_pos];
			break;

		case BACK:
			if (v_move(v, BACK, 1) == ATBEG)
				ret(ATBEG)
			i = (int)v->cur.here->data[v->cur.u_pos];
			break;

		case HERE:
			if (IS_BACK(v))
				ret(ATEND)
			i = (int)v->cur.here->data[v->cur.u_pos];
			break;
	}
	ret(i)
}
@//E*O*F v_get.c//
chmod u=rw,g=r,o=r v_get.c
 
echo x - v_move.c
sed 's/^@//' > "v_move.c" <<'@//E*O*F v_move.c//'
#include "vstr.h"

ROUTINE int v_move(v, dir, cnt)
register p_vstr v;
register int dir, cnt;
{
	register int actual = 0;

	entry(v_move)

#ifdef DEBUG
yelp("called with vstr at %x, count of %d, direction = %d\n", v, cnt, dir);
#endif
	if (!cnt)
		ret(0)
	switch (dir) {

		case FWD:
			while (cnt && !IS_BACK(v)) {
				if (++v->cur.u_pos >= v->cur.here->len)
					if (!(v->cur.here = v->cur.here->next)) {
						v->cur.here = v->tail;
						v->cur.u_pos = v->tail->len;
						break;
					}
					else
						v->cur.u_pos = 0;
				cnt--;
				actual++;
			}
			break;

		case BACK:
			while (cnt && !IS_FRONT(v)) {
				if (--v->cur.u_pos < 0)
					if (!(v->cur.here = v->cur.here->prev)) {
						v->cur.here = v->head;
						v->cur.u_pos = -1;
						break;
					}
					else
						v->cur.u_pos = v->cur.here->len - 1;
				actual++;
				cnt--;
			}
			break;
	}
	if (!actual)
		ret((dir == FWD) ? ATEND : ATBEG)
	else
		ret(actual)
}
@//E*O*F v_move.c//
chmod u=rw,g=r,o=r v_move.c
 
echo x - v_put.c
sed 's/^@//' > "v_put.c" <<'@//E*O*F v_put.c//'
/*
 *  V_PUT Version 1.0, 4/16/85
 * Jordan K. Hubbard
 */

#include "vstr.h"

IMPORT p_unit linkin();

ROUTINE void v_put(v, c, mode)
register p_vstr v;
register char c;
register int mode;
{
	register int n;

	entry(v_put)

#ifdef DEBUG
yelp("called with vstr at %x, char = %c, mode = %d\n", v, c, mode);
if (!v->head)
	yelp("vstr is cleared\n");
#endif
	if (!v)
		freak_out(M_NILVSTR);
	if (!v->head) {
		v->head = v->tail = v->cur.here = new_unit(v);
		v->cur.u_pos = -1;
	}

	/*
	 * The awful nasty evil label here is so that some cases can mung
	 * around with the text and then change the mode and goto. It's
	 * crude, but it seems better that a recursive function call.. Look
	 * at the 'END' or 'BEG' case and you'll see what I mean..
	 */
sw:	switch(mode) {

		case FWD:
			if (v_move(v, FWD, 1) == ATEND) {
				if (v->cur.here->len < v->unit_size) {
					v->cur.here->data[v->cur.here->len] = c;
					v->cur.u_pos++;
				}
				else {
					v->cur.here = v->tail = linkin(v, FWD);
					*(v->cur.here->data) = c;
					v->cur.u_pos = 0;
				}
			}
			else
				v->cur.here->data[v->cur.u_pos] = c;
			if (v->cur.here->len <= v->cur.u_pos)
				v->cur.here->len++;
			break;

		case BACK:
			if (v_move(v, BACK, 1) == ATBEG) {
				if (v->cur.here->len < v->unit_size) {
					shift(v->cur.here, 0);
					*(v->cur.here->data) = c;
					v->cur.u_pos = 0;
				}
				else {
					v->cur.here = v->head = linkin(v, BACK);
					v->cur.here->len = 1;
					v->cur.u_pos = 0;
					*(v->cur.here->data) = c;
				}
			}
			else
				v->cur.here->data[v->cur.u_pos] = c;
			break;

		case HERE:
			if (IS_BACK(v))
				end;
			v->cur.here->data[v->cur.u_pos] = c;
			break;

		case BEG:
			v_rewind(v);
			mode = BACK;
			goto sw;
			break;

		case END:
			v_append(v);
			mode = FWD;
			goto sw;
			break;

		case INS:
			/*
			 * If we're at the end, then this isn't really an insert.
			 * I know it's kludge, but I don't wanna rewrite the append
			 * code!
			 */
			if (v_move(v, FWD, 1) == ATEND) {
				mode = FWD;
				goto sw;
			}
			if (v->cur.here->len < v->unit_size) {
				if (v->cur.u_pos < v->cur.here->len) 
					shift(v->cur.here, v->cur.u_pos);
				else 
					v->cur.here->len++;
			}
			else if (!ripple(v)) {
				linkin(v, FWD);
				ripple(v);
			}
			v->cur.here->data[v->cur.u_pos] = c;
			break;

		default:
			freak_out("Bad mode passed, mode = %d, char = %d\n", mode, c);
			break;
	}
	end
}
@//E*O*F v_put.c//
chmod u=rw,g=r,o=r v_put.c
 
echo x - v_rewind.c
sed 's/^@//' > "v_rewind.c" <<'@//E*O*F v_rewind.c//'
#include "vstr.h"

ROUTINE void v_rewind(v)
register p_vstr v;
{
	entry(v_rewind)

	v->cur.here = v->head;
	v->cur.u_pos = -1;
	end
}
@//E*O*F v_rewind.c//
chmod u=rw,g=r,o=r v_rewind.c
 
echo x - v_string.c
sed 's/^@//' > "v_string.c" <<'@//E*O*F v_string.c//'
#include "vstr.h"

IMPORT char *malloc();

ROUTINE void v_stov(s, v)
register char *s;
register p_vstr v;
{
	register int i, l;

	entry(v_stov)

	l = strlen(s);
#ifdef DEBUG
yelp("Putting string at %x, length of %d into vstr at %x\n", s, l, v);
#endif
	for (i = 0; i < l; i++)
		v_put(v, s[i], FWD);
	end
}

ROUTINE char *v_vtos(v)
register p_vstr v;
{
	register char *s;
	register i = 0;
	register char ch;

	entry(v_vtos)

	s = malloc(v_length(v, HERE) + 1);
#ifdef DEBUG
yelp("making string at %x, %d chars long for vstr at %x", s, v_length(v, HERE) +
  1, v);
#endif
	if (!s)
		freak_out(M_NOSPACE);
	while ((ch = v_get(v, FWD)) != ATEND)
		s[i++] = ch;
	s[i] = '\0';
	ret(s)
}

ROUTINE char *v_nvtos(v, n)
register p_vstr v;
register int n;
{
	register char *s;
	register i = 0;
	register char ch;

	entry(v_nvtos)

	s = malloc(n + 1);
#ifdef DEBUG
yelp("making string at %x, %d chars long for vstr at %x", s, n + 1, v);
#endif
	if (!s)
		freak_out(M_NOSPACE);
	while ((ch = v_get(v, FWD)) != ATEND && n) {
		s[i++] = ch;
		n--;
	}
	s[i] = '\0';
	ret(s)
}
@//E*O*F v_string.c//
chmod u=rw,g=r,o=r v_string.c
 
echo x - v_utils.c
sed 's/^@//' > "v_utils.c" <<'@//E*O*F v_utils.c//'
#include "vstr.h"
#include <stdio.h>

ROUTINE p_unit linkin(v, mode)
register p_vstr v;
register int mode;
{
	register p_unit temp, p1, u = v->cur.here;

	entry(linkin)

#ifdef DEBUG
yelp("called with vstr at %x, mode = %d\n", v, mode);
#endif
	switch(mode) {

		case FWD:
			temp = new_unit(v);
			p1 = u->next;
			temp->prev = u;
			u->next = temp;
			temp->next = p1;
			if (p1)
				p1->prev = temp;
			break;

		case BACK:
			temp = new_unit(v);
			p1 = u->prev;
			temp->next = u;
			u->prev = temp;
			temp->prev = p1;
			if (p1)
				p1->next = temp;
			break;

		default:
			freak_out("passed weird mode %d\n", mode);
			end
			break;
	}
	ret(temp)
}

ROUTINE void linkout(v, u)
register p_vstr v;
register p_unit u;
{
	register p_unit p1, p2;

	entry(linkout)

#ifdef DEBUG
yelp("linking out unit at %x from vstr at %x\n", u, v);
#endif
	p1 = u->prev;
	p2 = u->next;
	if (p1)
		p1->next = p2;
	else
		v->head = p2;
	if (p2)
		p2->prev = p1;
	else
		v->tail = p1;
	if (p2) {
		v->cur.here = p2;
		v->cur.u_pos = 0;
	}
	else {
		v->cur.here = p1;
		v->cur.u_pos = (p1 ? p1->len : 0);
	}
	free(u);
	end
}

ROUTINE void shift(u, pos)
register p_unit u;
register int pos;
{
	register int i = u->len;

	entry(shift)

#ifdef DEBUG
yelp("shifting from %d to %d in unit at %x\n", i, pos, u);
#endif
	if (!i)
		freak_out("passed an empty unit at %u\n", u);
	while(i != pos) {
		u->data[i] = u->data[i - 1];
		i--;
	}
	u->len++;
	end
}

ROUTINE boolean ripple(v)
register p_vstr v;
{
	struct cursor pos;
	register int i = COST, p;
	register p_unit tmp = v->cur.here;
	register p_unit dest_u = tmp;
	register int dest_p = v->cur.u_pos;
	
	entry(ripple)

	/* Scan for a hole */
	pos.here = 0;
	while (i--) {
		if (!tmp->next) {
			v->tail = tmp->next = new_unit(v);
			tmp->next->prev = tmp;
			pos.here = tmp->next;
			pos.u_pos = 0;
#ifdef DEBUG
yelp("found end while scanning, linked in unit at %x\n", pos.here);
#endif
			break;
		}
		else if (tmp->len < v->unit_size) {
			pos.u_pos = tmp->len;
			pos.here = tmp;
#ifdef DEBUG
yelp("Found hole at %x(%d)\n", pos.here, pos.u_pos);
#endif
			break;
		}
		else
			tmp = tmp->next;
	}
	if (!pos.here)
		ret(0)
	pos.here->len++;
#ifdef DEBUG
yelp("rippling from %x(%d) to %x(%d)\n", pos.here, pos.u_pos, dest_u, dest_p);
#endif
	while (!(pos.here == dest_u && pos.u_pos == dest_p)) {
		if (!pos.u_pos) {
			tmp = pos.here->prev;
			pos.u_pos = tmp->len - 1;
			*(pos.here->data) = tmp->data[pos.u_pos];
			pos.here = tmp;
		}
		else {
			pos.here->data[pos.u_pos] = pos.here->data[pos.u_pos - 1];
			pos.u_pos--;
		}
	}
	ret(1)
}

ROUTINE int v_length(v, how)
register p_vstr v;
register int how;
{
	register int i = 0;
	register p_unit temp;

	entry(v_length)

#ifdef DEBUG
yelp("Getting length for vstr at %x, how = %d\n", v, how);
#endif
	if (!v || !v->head)
		ret(0)
	switch(how) {

		case HERE:
			temp = v->head;
			while (temp) {
				i += temp->len;
				temp = temp->next;
			}
			break;

		case BEG:
			if (IS_FRONT(v))
				break;
			temp = v->cur.here;
			if (v->cur.u_pos < temp->len) {
				i = v->cur.u_pos + 1;
				temp = temp->prev;
			}
			while(temp) {
				i += temp->len;
				temp = temp->prev;
			}
			break;

		case END:
			if (IS_BACK(v))
				break;
			temp = v->cur.here;
			i = temp->len - v->cur.u_pos;
			while (temp) {
				i += temp->len;
				temp = temp->next;
			} 
			break;
	}
	ret(i)
}
@//E*O*F v_utils.c//
chmod u=rw,g=r,o=r v_utils.c
 
echo x - vstr.h
sed 's/^@//' > "vstr.h" <<'@//E*O*F vstr.h//'
/* The number of units to scan ahead for holes when inserting characters */
/* (as opposed to just creating a new unit). Scanning isn't so */
/* much overhead as is copying characters to the hole, so it's best not */
/* to define COST too large unless you're using large units and/or doing */
/* a fair number of deletions. Perhaps some more clever way of doing */
/* insertions will be implemented some day. (I welcome any suggestions) */

#define COST 4

/* To turn on routine tracing and general annoying messages #define
 * DEBUG, as usual. Modify _end(), _entry() and _ret() for more advanced
 * debugging.
 */

/* #define DEBUG 1  */

/*
 * Define this to be the type of variable you want to store the length
 * and index variables in. Read OPERATION for more details..
 */

#define LENTYPE short

#ifdef DEBUG

#define entry(s) _entry("s");
#define ret(s) { _ret(s); return(s); }
#define end { _end(); return; }

#else

#define entry(s) _Curr_rtn = "s";
#define ret(s) return(s);
#define end return;

#endif

/*
 * utility macros
 */

#define IMPORT extern
#define EXPORT
#define ROUTINE
#define forever for(;;)
#define IS_FRONT(v) (v->cur.here == v->head && v->cur.u_pos == -1)
#define IS_BACK(v) (v->cur.here == v->tail \
 && (v->cur.u_pos == v->cur.here->len))

typedef unsigned char boolean;

/*
 * messages/error codes/constants
 */

#define M_NOSPACE "out of memory/arena corrupted\n"
#define M_NILVSTR "null virtual string passed\n"

#define ATEND		-1
#define ATBEG		-2

#define NIL		0
#define BACK	1
#define FWD		2
#define HERE	3
#define BEG		4
#define END		5
#define INS		6

/*
 * declarations
 */

struct unit {
	struct unit *prev, *next;
	LENTYPE len;
	char *data;
};

typedef struct unit *p_unit;

struct cursor {
	struct unit *here;
	LENTYPE u_pos;
};

typedef struct cursor *p_cursor;

struct vstr {
	p_unit head, tail;
	LENTYPE unit_size;
	struct cursor cur;
};

typedef struct vstr *p_vstr;

#ifndef DEBUG
IMPORT char *_Curr_rtn;
#endif

IMPORT char *v_nvtos();
IMPORT char *v_vtos();
IMPORT int v_delete();
IMPORT int v_get();
IMPORT int v_length();
IMPORT int v_move();
IMPORT int v_span();
IMPORT p_cursor v_find();
IMPORT p_cursor v_savecursor();
IMPORT p_unit new_unit();
IMPORT p_vstr new_vstr();
IMPORT void freak_out();
IMPORT void v_append();
IMPORT void v_clear();
IMPORT void v_copycursor();
IMPORT void v_free();
IMPORT void v_put();
IMPORT void v_rewind();
IMPORT void v_setcursor();
IMPORT void v_show();
IMPORT void v_stov();
@//E*O*F vstr.h//
chmod u=rw,g=r,o=r vstr.h
 
echo x - vstr.man
sed 's/^@//' > "vstr.man" <<'@//E*O*F vstr.man//'
@.TH STRING 3-ucb
@.SH NAME

new_vstr, v_append, v_clear, v_copycursor, v_delete, v_find, v_free, v_get,
v_length, v_move, v_vtos, v_nvtos, v_put, v_rewind, v_savecursor, v_setcursor,
v_show, v_span, v_stov

\-virtual string operations

@.SH ORIGIN
4.2BSD
@.SH SYNOPSIS
@.nf
@.B p_vstr new_vstr(unitsize)
@.B int unitsize;
@.PP
@.B void v_append(v)
@.B p_vstr v;
@.PP
@.B void v_clear(v, how)
@.B p_vstr v;
@.B int how;
@.PP
@.B void v_copycursor(c1, c2)
@.B p_cursor c1, c2;
@.PP
@.B int v_delete(v, cnt)
@.B p_vstr v;
@.B int cnt;
@.PP
@.B int v_find(v, ch, dir)
@.B p_vstr v;
@.B char ch;
@.B int dir;
@.PP
@.B void v_free(v)
@.B p_vstr v;
@.PP
@.B int v_get(v, mode)
@.B p_vstr v;
@.B int mode;
@.PP
@.B int v_length(v, how)
@.B p_vstr v;
@.B int how;
@.PP
@.B char *vtos(v)
@.B p_vstr v;
@.PP
@.B char *nvtos(v, n)
@.B p_vstr v;
@.B int n;
@.PP
@.B void v_put(v, ch, mode)
@.B p_vstr v;
@.B char ch;
@.B int mode;
@.PP
@.B void v_rewind(v)
@.B p_vstr v;
@.PP
@.B void v_setcursor(v, c)
@.B p_vstr v;
@.B p_cursor c;
@.PP
@.B p_cursor v_savecursor(v, c)
@.B p_vstr v;
@.B p_cursor c;
@.PP
@.B void v_show(v)
@.B p_vstr v;
@.PP
@.B int v_span(c1, c2)
@.B p_cursor c1, c2;
@.PP
@.B void v_stov(s, v)
@.B char *s;
@.B p_vstr v;
@.PP
@.fi
@.SH DESCRIPTION
These functions allow the creation and manipulation of 'virtual' strings.
That is, strings that expand and contract as characters are added or deleted.
Unlike conventional strings, virtual strings have 'cursors' that can be
moved around in the string determining where operations will take place.
@.PP
@.I new_vstr
allocates space for a new virtual string header and returns a pointer
to it. Subsequent memory allocations will be in
@.I unitsize
byte chunks..
@.PP
@.I v_append
moves the cursor to the last (append) position in string
@.I v.
@.PP
@.I v_clear
deletes characters from string
@.I v
in one of two ways. If
@.I how
is set to the constant 'END' then text is deleted from the cursor
position to the end of the string. If set to anything else, the entire
string is deleted.
@.PP
@.I v_copycursor
copies the contents of cursor
@.I c2
to
@.I c1.
@.PP
@.I v_delete
deletes
@.I cnt
characters from
@.I v
starting at the
@.I current
cursor position until
@.I cnt
is exhausted or the end of the string is hit. Does not move the cursor
(Characters slide over from the right). Returns the number of characters
actually deleted.
@.PP
@.I v_find
searches through string
@.I v
for char
@.I ch
in direction
@.I dir
until found, or end is hit. If found, a pointer to a cursor is returned
containing the location of the found character. Since this cursor
is declared as a local static in
@.I v_find,
care must be taken to save the contents in an auxiliary cursor if the
contents are to be preserved across subsequent
@.I v_find
calls.
@.PP
@.I v_free
deallocates all space used by string
@.IR v.
@.PP
@.I v_get
gets a character from
@.I v
in the following ways depending on
@.I mode.
If
@.I mode
is the constant 'HERE' the character at the cursor is returned. Otherwise
a character is returned after moving cursor one position in direction
@.I mode
(BACK or FWD).
Returns the constant ATBEG or ATEND if the cursor cannot be moved
backwards or forward (the string is rewound or appended).
@.PP
@.I v_length
returns number of characters in string
@.I v
in one of three ways, depending on the value of
@.IR how.
If
@.I how
is the constant 'HERE', then the total number of characters is returned.
If
@.I how
is the constant 'BEG', then the number of characters from the current
cursor position to the beginning of the string is returned.
Finally, if
@.I how
is the constant 'END', then the number of characters from the current
cursor position to the end of the string is returned.
(easy enough..)
@.PP
@.I v_vtos
slurps up characters from the current cursor position in string
@.I v
to the end. Returns a malloc'd string containing them.
@.PP
@.IR v_nvtos
is like
@.I vtos
but only slurps (at most)
@.I n
characters.
@.PP
@.I v_put
places character
@.I ch
in string
@.I v
depending on the value of
@.I mode.
If
@.I mode
is 'HERE', the character is placed
at the cursor position without moving. If set to 'BACK' or 'FWD, it
moves backwards or forward and places the character. If mode is 'INS',
then the character is inserted after the cursor position and
the cursor is moved forward onto the inserted character.
@.PP
@.I v_rewind
moves the cursor in string
to the beginning (rewound) position in string
@.I v.
@.PP
@.I v_setcursor
sets the cursor in
@.I v
to the contents of the cursor
@.I c.
@.PP
@.I v_savecursor
copies the cursor in
@.I v
to the cursor
@.I c.
The address of cursor
@.I c
is returned.
@.PP
@.I v_show
is mostly for debugging. It displays the header contents of
@.I v
and shows the structure and contents of the linked list. Some types
of list corruption are also detected and flagged. Kind of cute, but of
limited use.
@.PP
@.I v_span
returns the number of characters between cursor
@.I c1
and
@.I c2.
The characters under the cursors are included. This function is
useful in computing counts for
@.I v_delete, v_move, v_nvtos
ect.  If one is clever, certain block operations between cursors can be
done with the aid of this call.
@.PP
@.I v_stov
puts string
@.I s
into
@.I v
at the current cursor position. String is added in a forward direction.
@.PP
@.SH BUGS
@.I v_find
returns the cursor position of the found character. If a subsequent
@.I v_get
or
@.I v_put
is done, you'll get the
@.I next
character since they both advance the cursor before returning a character.
This is not really a bug, but potentially confusing.
@.I v_delete
also acts on the current cursor position, rather that moving then
deleting. Another potential confuser. (Unless you're using it with
@.I v_find)
@.LP
@//E*O*F vstr.man//
chmod u=rw,g=r,o=r vstr.man
 
echo Inspecting for damage in transit...
temp=/tmp/shar$$; dtemp=/tmp/.shar$$
trap "rm -f $temp $dtemp; exit" 0 1 2 3 15
cat > $temp <<\!!!
      23      55     472 Makefile
      64     631    3461 OPERATION
      40     273    1612 README
     206     449    3670 examp.c
      42      83     656 v_alloc.c
      11      18     148 v_append.c
      29      65     419 v_clear.c
      56     125     884 v_cursor.c
     183     538    3551 v_debug.c
      55     129     877 v_delete.c
      25      74     495 v_find.c
      19      30     224 v_free.c
      44      93     708 v_get.c
      52     123     928 v_move.c
     121     335    2405 v_put.c
      11      18     134 v_rewind.c
      66     185    1061 v_string.c
     205     505    3360 v_utils.c
     121     376    2464 vstr.h
     263    1009    5594 vstr.man
    1636    5114   33123 total
!!!
wc  Makefile OPERATION README examp.c v_alloc.c v_append.c v_clear.c v_cursor.c v_debug.c v_delete.c v_find.c v_free.c v_get.c v_move.c v_put.c v_rewind.c v_string.c v_utils.c vstr.h vstr.man | sed 's=[^ ]*/==' | diff -b $temp - >$dtemp
if [ -s $dtemp ]
then echo "Ouch [diff of wc output]:" ; cat $dtemp
else echo "No problems found."
fi
exit 0