[comp.sources.misc] v03i053: hc -- side-by-side file catenator

srcs@frito.UUCP (06/05/88)

comp.sources.misc: Volume 3, Issue 53
Submitted-By: "A. Nonymous" <srcs@frito.UUCP>
Archive-Name: hc

["pr -m -t" with better control over the columns.  ++bsa]

Here's a little utility that I wrote, based on the documentation for
Knowlogy's Unica hc, which was originally available only under CP/M.
The code is original; the design is acknowledged in the header comments.
I have packed the documentation, test files and source into the single
source file.  No makefile is needed, just
	cc -o hc hc.c		and run the tests.

Questions/bugs/flames to Bill Cox, (714)631-4452 (voice)
			or uunet!ccicpg!frito!bill

Please let me know when you receive this, and if you have any comments or
suggestions before posting it.  THANKS for all the work that you do to
make stuff like this available to people.

thanks,
Bill

[Cheap shar from inside the editor, why can't/don't people read?!  ++bsa]
sed 's/^X//' << \EOF > hc.c
X/*
X * SYNOPSIS
X *	hc [-s] [+tab | -col | -l str | filename] ...
X *
X * DESCRIPTION
X *
X *	hc  concatenates files horizontally.   It concatenates correspon-
X *	ding lines from files.   The effects of hc are best visualized by 
X *	imagining  the files to be printed on long strips of  paper  with 
X *	Velcro  strips  on the left and right margins;   hc sticks  these 
X *	files together to make one wide file on the standard output.
X *
X *	Besides file names,  the command line can contain tab  positions, 
X *	column numbers, and literal strings.  A maximum of thirteen files 
X *	can be specified.
X *
X *	a  TAB  POSITION  is denoted by a positive  decimal  number.   It 
X *	specifies  that  the next item should be placed at  a  particular 
X *	column.   The  first column number is one.   For example,  a  tab 
X *	position of five would cause the next item to be placed at column 
X *	5 in the output.   If the last argument is a tab  position,  then 
X *	each  line of output will contain enough trailing spaces to bring 
X *	it to that column.
X *
X *	a START POSITION is denoted by a negative decimal.  It  specifies 
X *	the  left margin of the next file to be concatenated.   The first 
X *	column is number one, and normally the left margin number is one.  
X *	For  a start column number greater than one,  the  first  (col-1) 
X *	characters of each line of the next file are discarded.
X *
X *	a LITERAL STRING is preceded by a '-l' flag. It may be surrounded
X *	by double quotes (under DOS, at least).  It specifies  characters
X *	to  be  inserted in every output line at its relative position in
X *	the argument list.   In an extension to  the  Knowlogy  HC,  this
X *	version interprets the standard C backslashed escape codes.   The
X *	following  command  interleaves lines from the two files, separa-
X *	ting them by a '\n' or newline:
X *	        hc file1 -l "\n" file2
X *
X *	Of necessity, all tabs are expanded to spaces upon input, so that 
X *	columns can be determined.  On output, spaces are compressed back 
X *	to  tabs,  and,  unless a TAB POSITION argument follows the  last 
X *	file name, trailing spaces are removed from each line.
X *
X *	All  files but the longest are conceptually padded at the  bottom 
X *	with  zero-length  lines  to  bring  all  files  to  the   same 
X *	length.   That is, concatenation continues until the end of every 
X *	file is encountered.   However,  if there are one or more literal 
X *	strings in the argument list, at least one line of output will be 
X *	written even if all files are empty or no files are specified.
X *
X *	The -s flag prevents the output from including any tabs; that is, 
X *	the normal tab compression of the output line is suppressed.
X *
X * EXAMPLES
X *
X *	Assume the files A, B, and C, whose contents are as follows:
X *
X *	00000000011111111112
X *	12345678901234567890
X *	-- this is file A --
X *	Little Miss Muffet
X *	Sat on her tuffet,
X *	Eating her curds
X *	  and whey.
X *	Along came a spider
X *	Who sat down
X *	  beside her,
X *	And frightened
X *	  Miss Muffet away.
X *
X *	000000000111111111122222222223
X *	123456789012345678901234567890
X *	     -- this is file B --
X *	Four score and seven years
X *	ago, our fathers brought forth
X *	upon this continent a new
X *	nation, conceived in liberty,
X *	and dedicated to the
X *	proposition that all men are
X *	created equal.
X *
X *	0000000001
X *	1234567890
X *	- file C -
X *	gimme a U
X *	gimme an N
X *	gimme an I
X *	gimme a C
X *	gimme an A
X *	whazzat
X *	  spell?
X *	whazzat
X *	  spell?
X *	arright!
X *	
X * Example: hc A B C		simply concatenate the files, making
X *				no attempt to justify any margins.
X *
X *	000000000111111111120000000001111111111222222222230000000001
X *	123456789012345678901234567890123456789012345678901234567890
X *	-- this is file A --     -- this is file B --- file C -
X *	Little Miss MuffetFour score and seven yearsgimme a U
X *	Sat on her tuffet,ago, our fathers brought forthgimme an N
X *	Eating her curdsupon this continent a newgimme an I
X *	  and whey.nation, conceived in liberty,gimme a C
X *	Along came a spiderand dedicated to thegimme an A
X *	Who sat downproposition that all men arewhazzat
X *	  beside her,created equal.  spell?
X *	And frightenedwhazzat
X *	  Miss Muffet away.  spell?
X *	arright!
X *	
X * Example: hc A +21 B +51 C	This concatenates the three files,
X *				with file A starting at column 1,
X *				file B starting at column 21, and
X *				file C starting at column 51.
X *
X *	000000000111111111120000000001111111111222222222230000000001
X *	123456789012345678901234567890123456789012345678901234567890
X *	-- this is file A --     -- this is file B --     - file C -
X *	Little Miss Muffet  Four score and seven years    gimme a U
X *	Sat on her tuffet,  ago, our fathers brought forthgimme an N
X *	Eating her curds    upon this continent a new     gimme an I
X *	  and whey.         nation, conceived in liberty, gimme a C
X *	Along came a spider and dedicated to the          gimme an A
X *	Who sat down        proposition that all men are  whazzat
X *	  beside her,       created equal.                  spell?
X *	And frightened                                    whazzat
X *	  Miss Muffet away.                                 spell?
X *	                                                  arright!
X *
X * Example: hc B +40 A		Concatenate files B, starting at column 1,
X *				and A, starting at column 40.  The two files
X *				are displayed side by side for comparison
X *				viewing.
X *
X *	000000000111111111122222222223         00000000011111111112
X *	123456789012345678901234567890         12345678901234567890
X *	     -- this is file B --              -- this is file A --
X *	Four score and seven years             Little Miss Muffet
X *	ago, our fathers brought forth         Sat on her tuffet,
X *	upon this continent a new              Eating her curds
X *	nation, conceived in liberty,            and whey.
X *	and dedicated to the                   Along came a spider
X *	proposition that all men are           Who sat down
X *	created equal.                           beside her,
X *	                                       And frightened
X *	                                         Miss Muffet away.
X *
X *
X * Example: hc -5 B +17		Chop the first 4 and last 10 columns off
X *				of file B.
X *	0000011111111112
X *	5678901234567890
X *	 -- this is file
X *	 score and seven
X *	 our fathers bro
X *	 this continent 
X *	on, conceived in
X *	dedicated to the
X *	osition that all
X *	ted equal.      
X *
X *
X * Example: hc -s -5 A +10 +15 -5 C +50	Concatenate file A (starting with
X *					its fifth column), and cut it off
X *	000001111     000001            when it reaches column 9 (just
X *	567890123     567890            before column 10);  then add file C
X *	his is fi     le C -            (starting with its fifth column)
X *	le Miss M     e a U             beginning at column 15;  then pad
X *	on her tu     e an N            out with spaces through column 49
X *	ng her cu     e an I            (just before column 50).  Don't
X *	d whey.       e a C             compress spaces to tabs.
X *	g came a      e an A                             
X *	sat down      zat                                
X *	side her,     ell?                               
X *	frightene     zat                                
X *	ss Muffet     ell?                               
X *	              ght!                               
X *	                                                 
X *
X * Example: hc A -l "-----" B	Concatenate files A and B, with
X *				five dashes between them.
X *
X *	00000000011111111112-----000000000111111111122222222223
X *	12345678901234567890-----123456789012345678901234567890
X *	-- this is file A -------     -- this is file B --
X *	Little Miss Muffet-----Four score and seven years
X *	Sat on her tuffet,-----ago, our fathers brought forth
X *	Eating her curds-----upon this continent a new
X *	  and whey.-----nation, conceived in liberty,
X *	Along came a spider-----and dedicated to the
X *	Who sat down-----proposition that all men are
X *	  beside her,-----created equal.
X *	And frightened-----
X *	  Miss Muffet away.-----
X *
X *
X * Example: hc -l "| " A +24 -l "|| " C +38 -l "|"	
X *					Concatenate files A and C,
X *					with vertical walls sur-
X *					rounding them.
X *
X *	| 00000000011111111112 || 0000000001 |
X *	| 12345678901234567890 || 1234567890 |
X *	| -- this is file A -- || - file C - |
X *	| Little Miss Muffet   || gimme a U  |
X *	| Sat on her tuffet,   || gimme an N |
X *	| Eating her curds     || gimme an I |
X *	|   and whey.          || gimme a C  |
X *	| Along came a spider  || gimme an A |
X *	| Who sat down         || whazzat    |
X *	|   beside her,        ||   spell?   |
X *	| And frightened       || whazzat    |
X *	|   Miss Muffet away.  ||   spell?   |
X *	|                      || arright!   |
X *
X *
X * Example: ls ? | hc -s -l "sp " - -l " | pr" > tmp
X *
X *	sp A  | pr			This example builds a shell
X *	sp B  | pr			script to be run.  The script
X *	sp C  | pr			will run the spelling checker
X *	sp D  | pr			on each file and print its
X *					output.
X *
X * Example: hc -s input > tmp		Expand tabs to spaces while
X *					copying input to tmp.
X *
X * Example: ... | hc - > tmp		Compact spaces to tabs while
X *					copying stdin to tmp.
X *
X * From the Unica hc utility published by Knowlogy, Inc. in 1982, for
X * use under CP/M.  As far as I know, Knowlogy no longer exists.  The
X * above documentation is quoted nearly verbatim from the Unica manual.
X * My thanks to Knowlogy's people for a most useful design.
X *
X * Comments and bugs to:
X * Bill Cox	(714)631-4452 (voice)
X */
X
X#include <stdio.h>
X#include <ctype.h>
X#include <string.h>
X/* #include <stdlib.h>			/* for atoi */
X
X#define TRUE	1
X#define FALSE	0 
X
X#define LINELEN 256
X#define LITLEN  80
X#define NARGS	13
X#define TABINT	8			/* tab interval, in columns */
X
X#define FIL 0				/* remember what type arg */
X#define LIT 1				/* stored in parmtype */
X#define TAB 2
X#define STR 3
X
Xint ch,
X    sflag = 0,				/* TRUE when "-s" flag specified */
X    filesact = 0,			/* count of active files */
X    filecnt = 0,			/* count of specified file names */
X    actarg = 0,				/* actual argument-table index */
X    saweof[NARGS],			/* TRUE when this file reaches EOF */
X    parmtype[NARGS],			/* type of current argument */
X    tabposn[NARGS],			/* TAB POSITION arguments here */
X    startcol[NARGS];			/* START COLUMN arguments here */
Xchar literals[NARGS][LITLEN];		/* LITERAL arguments here */
XFILE *infiles[NARGS];			/* FILE argument handles here */
X
Xextern void exit();
Xvoid usage();				/* forward declaration */
X
Xmain(argc, argv)
Xint argc;
Xchar *argv[];
X    {
X    int argno, i, j, k;
X
X    if (argc == 1)				/* informational prompt */
X	usage();
X/*
X * collect input line arguments, initialize
X */
X    tabposn[0] = startcol[0] = 0;
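X    for (argno = 1; argno < argc; argno++) {
X	if (actarg >= NARGS) {		/* argument tables are full */
X	    (void)fprintf(stderr, "hc: too many arguments\n");
X	    exit(2);
X	    }
X	saweof[actarg] = TRUE;		/* clear the slot about to be filled */
X	startcol[actarg] = 0;
X	tabposn[actarg] = 0;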
X	switch (argv[argno][0]) {
X	/*
X	 * tab position - the next literal or input line will 
X	 * be placed at this column in the output line.
X	 */
X        case '+':			
X	    tabposn[actarg] = atoi(&argv[argno][1])-1;
X	    parmtype[actarg++] = TAB;
X	    break;			
X
X	case '-':
X	    switch (argv[argno][1]) {
X	    case 's': case 'S':
X		sflag = TRUE;
X		break;
X
X	    case 'l': case 'L':
X		argno++;			/* NEXT arg is literal */
X		if (argno >= argc) {
X		    (void)fprintf(stderr, "hc: missing literal argument\n");
X		    exit(4);
X		    }
X		if (strlen(argv[argno]) > LITLEN-1) {
X		    (void)fprintf(stderr, "hc: literal %d too long\n", argno-1);
X		    exit(3);
X		    }
X		/*
X		 * copy string literal, interpreting backslash notation
X		 */
X		for (i = j = 0; (ch = argv[argno][j++]) != '\0';)
X		    if (ch != '\\')
X		        literals[actarg][i++] = ch;
X		    else
X			switch (ch = argv[argno][j++]) { /* ch = escaped char */
X			case 'n':			/* newline */
X			    literals[actarg][i++] = '\n'; break;
X			case 't':			/* horizontal tab */
X			    literals[actarg][i++] = '\t'; break;
X			case 'v':			/* vertical tab */
X			    literals[actarg][i++] = '\v'; break;
X			case 'b':			/* backspace */
X			    literals[actarg][i++] = '\b'; break;
X			case 'r':			/* carriage return */
X			    literals[actarg][i++] = '\r'; break;
X			case 'f':			/* form feed */
X			    literals[actarg][i++] = '\f'; break;
X			case '\\':			/* actual backslash */
X			    literals[actarg][i++] = '\\'; break;
X			case '\'':			/* single quote */
X			    literals[actarg][i++] = '\''; break;
X			case '\"':			/* double quote */
X			    literals[actarg][i++] = '\"'; break;
X			case '\0':			/* '\' at end  */
X			    literals[actarg][i++] = '\\'; 
X			    --j;			/* let FOR see null */
X			    break;
X			default:			/* not recognized */
X			    literals[actarg][i++] = '\\';
X			    literals[actarg][i++] = ch;
X			    break;
X			} /* switch */
X		    literals[actarg][i] = '\0';
X		    parmtype[actarg++] = LIT;
X		    break;
X
X		case '0': case '1': case '2': case '3':
X		case '4': case '5': case '6': case '7':
X		case '8': case '9':		/* col number */
X		    startcol[actarg] = atoi(&argv[argno][1])-1;
X		    parmtype[actarg++] = STR;
X		    break;
X
X		case '\0':			/* lone '-' is ref to stdin */
X		    filesact++;
X		    filecnt++;
X		    infiles[actarg] = stdin;
X		    saweof[actarg] = FALSE;
X		    parmtype[actarg++] = FIL;
X		    break;
X		    
X		default:
X		    (void)fprintf(stderr, "hc: unrecognized option=%s\n",
X				    argv[argno]);
X		    usage();
X	        } /* inner switch */
X		break;
X
X	    default:			/* open the file */
X		if (actarg >= NARGS) {		/* no free slot in the tables */
X		    (void)fprintf(stderr, "hc: too many file names\n");
X		    continue;
X		    }
X		if ((infiles[actarg] = fopen(argv[argno], "r")) != NULL) {
X		    saweof[actarg] = FALSE;
X		    filesact++;
X		    filecnt++;
X		    }
X		else {
X		    (void)fprintf(stderr, "hc: can't access %s\n", argv[argno]);
X		    exit(2);
X		    }
X		actarg++;
X		break;
X	    } /* outer switch */
X	} /* for */
X/* 
X * process input lines, generate output
X */
X
X   do {
X	char bufin[NARGS][LINELEN],	/* input file line buffers here */
X	     intmp[LINELEN],		/* buffer for tab expansion */
X	     bufout[LINELEN];		/* output buffer */
X
X 	bufout[0] = '\0';			/* ready for output */
X	for (argno = 0; argno < actarg; argno++) {
X	    switch (parmtype[argno]) {
X		case TAB:			/* pad/truncate bufout */
X		    if ((j = tabposn[argno]-strlen(bufout)) < 0)
X			bufout[tabposn[argno]] = '\0';
X		    else 
X			for (; j > 0; j--)
X			    strcat(bufout, " ");			
X		    break;
X
X		case LIT:			/* output the literal */
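X		    k = (argno > 0) ? startcol[argno-1] : 0;
X		    if (k > (int)strlen(literals[argno]))
X			k = strlen(literals[argno]);	/* don't skip past end */
X		    for (j = strlen(bufout);
X			(bufout[j] = literals[argno][k++]) != '\0'; j++);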
X		    break;	 	
X
X		case FIL:			/* get next line into intmp */
X    		    if (saweof[argno] == TRUE)
X			intmp[0] = '\0';
X		    else if (fgets(intmp, LINELEN, infiles[argno]) != NULL) {
X			j = strlen(intmp);
X			if (j > 0 && intmp[j-1] == '\n')
X			    intmp[j-1] = '\0';
X			else if (j == LINELEN-1) {	/* buffer full, no newline */
X			    (void)fprintf(stderr, "hc: input line length > ");
X			    (void)fprintf(stderr, "%d chars, aborting.\n",
X				    LINELEN);
X			    exit(2);
X			    }
X			/* otherwise: last line lacks a newline; keep it as is */
X			}
X		    else {
X			intmp[0] = '\0';
X			filesact--;
X			saweof[argno] = TRUE;
X			}
X		    /*
X		     * expand TABs from intmp into spaces in bufin
X		     */			
X		    for (i = 0, j = 0; (ch = intmp[j++]) != '\0';)
X			if (ch == '\t')
X			    for (k = TABINT - (i % TABINT); k-- > 0;)
X				bufin[argno][i++] = ' ';
X			else
X			    bufin[argno][i++] = ch;
X		    bufin[argno][i] = '\0';
X		/*
X		 * move bufin into bufout, if it's not null string
X		 */
X		    if (bufin[argno][0] != '\0') {
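X			k = (argno > 0) ? startcol[argno-1] : 0;
X			if (k > (int)strlen(bufin[argno]))
X			    k = strlen(bufin[argno]);	/* line ends before margin */
X			for (j = strlen(bufout);
X			    (bufout[j] = bufin[argno][k++]) != '\0'; j++);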
X			}
X		    break;
X		} /* switch */
X	    } /* for */
X
X	/*
X	 * compress tabs in bufout
X	 */
X	if (!sflag) {
X	    int sp;
X
X	    strcpy(intmp, bufout);
X	    for (i = 0, j = 0, sp = 0; (ch = intmp[j++]) != '\0';) {
X		if (ch == ' ') {
X		    sp++;
X		    if ((j % TABINT) == 0) {	/* run ends at a tab stop */
X			bufout[i++] = (sp > 1) ? '\t' : ' ';
X			sp = 0;
X			}
X		    }
X		else {
X		    while (sp > 0) {
X		        bufout[i++] = ' ';
X			sp--;
X		        }
X		    bufout[i++] = ch;
X		    }
X		}
X	    while (sp-- > 0)		/* flush spaces pending at end of line */
X		bufout[i++] = ' ';
X	    bufout[i] = '\0';
X	    }
X
X	/*
X	 * remove trailing blanks
X	 */
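X	if (actarg == 0 || parmtype[actarg-1] != TAB) {
X	    j = strlen(bufout);
X	    /* trailing spaces may already have been compressed to tabs */
X	    while (j > 0 && (bufout[j-1] == ' ' || bufout[j-1] == '\t'))
X		j--;
X	    bufout[j] = '\0';
X	    }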
X/*
X * if there were input files specified, and all files have reached EOF,
X * don't bother to output this line.
X */
X	if ((filecnt == 0) || (filesact != 0)) {
X	    fputs(bufout, stdout);
X	    fputc('\n', stdout);
X	    }
X	} while (filesact > 0);
X     exit(0);				/* successful return */
X     } /* hc routine */
X
X
Xvoid usage() {
X    (void)fprintf(stderr, "usage:");
X    (void)fprintf(stderr, "  hc [-s] [+tab | -col | -l \"str\" | filename] ...\n");
X    exit(2);
X    }
EOF
exit 0
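
[A minimal sketch, not part of hc.c.  The three helpers below illustrate
the column arithmetic that the header comment describes: tab expansion on
input, a "+tab" TAB POSITION, and a "-col" START POSITION.  The names
sketch_expand_tabs, sketch_tab_to and sketch_append_from, and the constant
SKETCH_TABINT, are made up for this illustration; only the 8-column tab
interval is taken from hc.c.  Unlike hc.c, the sketch checks buffer sizes.]

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define SKETCH_TABINT 8		/* assumed tab interval, as TABINT in hc.c */

/* Copy src to dst, replacing each tab with spaces up to the next tab stop.
 * dst holds at most dstsize bytes, including the terminating NUL. */
void sketch_expand_tabs(char *dst, size_t dstsize, const char *src)
{
    size_t i = 0;

    assert(dstsize > 0);
    for (; *src != '\0' && i + 1 < dstsize; src++) {
        if (*src == '\t') {
            size_t n = SKETCH_TABINT - (i % SKETCH_TABINT);
            while (n > 0 && i + 1 < dstsize) {
                dst[i++] = ' ';
                n--;
            }
        } else {
            dst[i++] = *src;
        }
    }
    dst[i] = '\0';
}

/* "+col": pad out with spaces, or truncate it, so that the next item
 * appended lands at 1-based column col. */
void sketch_tab_to(char *out, size_t outsize, int col)
{
    size_t len = strlen(out);
    size_t want;

    assert(outsize > 0 && col >= 1);
    want = (size_t)(col - 1);
    if (want > outsize - 1)
        want = outsize - 1;		/* never write past the buffer */
    while (len < want)
        out[len++] = ' ';		/* pad when out is too short */
    out[want] = '\0';			/* truncates when out is too long */
}

/* "-col": append line to out, discarding its first (col-1) characters. */
void sketch_append_from(char *out, size_t outsize, const char *line, int col)
{
    size_t len = strlen(out);
    size_t skip;

    assert(outsize > 0 && col >= 1);
    skip = (size_t)(col - 1);
    if (skip > strlen(line))
        skip = strlen(line);		/* line ends before the left margin */
    for (line += skip; *line != '\0' && len + 1 < outsize; line++)
        out[len++] = *line;
    out[len] = '\0';
}
```

Under these assumptions, the first text line of the "hc A +21 B" example
can be built by appending "Little Miss Muffet" with column 1, calling
sketch_tab_to(out, sizeof out, 21), and then appending "Four score and
seven years" with column 1.  The "hc -5 B +17" example corresponds to
sketch_append_from(..., 5) followed by sketch_tab_to(..., 17).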