[pe.cust.sources] nroff source for Axel's UNIX Clinic

carl@nrcaero.UUCP (Carl P. Swail) (02/19/85)

: '!/bin/sh'
: 'This is a shell archive, meaning:'
: '1. Remove everything above the #!/bin/sh line.'
: '2. Save the resulting text in a file.'
: '3. Execute the file with /bin/sh (not csh) to create the files: '
: '      clinic1'
: 'This archive created: Tue Feb 19 08:37:53 1985'
export PATH; PATH=/bin:$PATH
echo shar: extracting "'clinic1'" '(16364 characters)'
if test -f 'clinic1'
then
	echo shar: over-writing existing file "'clinic1'"
fi
sed 's/^X//' << \SHAR_EOF > 'clinic1'
X.so /usr/lib/tmac/tmac.s
X.pl 11i
X.ll 6.5i
X.po 1i
X.m1
X.m4
X.ds Sp \fIUNIX clinic\fR
X.TL
X\*(Sp
X.sp
X.fi
X.PP
XThis column first appears in the German quarterly
X.I unix/mail
X(Hanser Verlag, Munich, Germany).
XIt is copyrighted: copyright 1984 by Axel T. Schreiner, Ulm, West Germany.
XIt may be reproduced
Xas long as the copyright notice is included
Xand reference is made to the original publication.
X.PP
XThe column attempts to discuss typical approaches
Xto problem solving using the
X.UX
Xsystem.
XIt emphasizes what the author considers to be
Xgood programming practices and appropriate choice of tools.
X.SH
X/lib/cpp
X.PP
XThis quarter's column deals with uses and abuses of the
XC preprocessor.
XWe demonstrate some techniques
Xwhich can save a lot of work (and even more errors).
XThe discussion applies to programming in C in general,
Xand it assumes only very elementary prerequisites:
X.QP
XC programs are run through a preprocessor
X.I before
Xthey are handed to the actual compiler.
XThe preprocessor performs (parametrized) text substitution
X(\fB#define\fR),
Xinserts
X.I "header files"
X(\fB#include\fR),
Xand can exclude parts of the source from compilation
X(\fB#if\fR).
X.QP
XSince the preprocessor is independent of the actual compiler
X\(em and does not know C at all \(em
Xone can use it in particular to extend the C language.
XOnly one's taste limits one's imagination here...
X.SH
XExcluding text
X.PP
XEvery programmer presumably writes occasional comments.
XSometimes we comment quite intentionally to exclude
Xprogram parts from a compilation.
XSince in Standard C comments may not be nested,
Xthere is considerable temptation not to comment
Xsuch excluded program parts any more.
X.PP
XThe following technique for text exclusion is much more appropriate:
X.DS
X#ifdef	not_defined
X	crash_the_system(NOW); /* this definitely goes wrong */
X#endif	not_defined
X.DE
X.LP
XOf course, the name
X.B not_defined
Xshould really not be defined...
X.SH
XVector dimensions
X.PP
XIn principle one can determine the size of a vector using the
X.B sizeof
Xoperator. However,
X.B sizeof
Xyields the size in bytes, not in elements.
XThe following macro determines the number of elements in an arbitrary vector:
X.DS
X#define	DIM(x)	(sizeof (x) / sizeof ((x)[0]))
X.DE
X.PP
X.B sizeof
Xdoes not really need parentheses,
Xif it is used to determine the size of an object
Xand not of a data type.
XOne should, however, enclose macro parameters in parentheses.
XThen things work out for a vector with more than one dimension, too:
X.DS
Xmain()
X{	struct { int a; char b; } v[10][20][30];
X
X	printf("%d %d %d\en", DIM(v), DIM(v[1]), DIM(v[1][2]));
X}
X.DE
X.PP
XThe program produces the values
X.B 10 ,
X.B 20
Xand
X.B 30 .
X.PP
XParentheses should not be necessary in this use of
X.B sizeof
Xsince a vector subscript should have precedence over
X.B sizeof .
XAt least my copy of the Mark Williams CP/M-86 C compiler
Xdoes not seem to know this...
X.PP
XWe can carry these ideas somewhat further.
XThe last element of a vector is
X.DS
X#define	LAST(x)	((x)[DIM(x)-1])
X.DE
X.LP
Xand the customary
X.B for
Xloop is for example
X.DS
X#define	END(x)	((x) + DIM(x)-1)
X
Xint vector[10], * vp;
X	...
X	for (vp = vector; vp <= END(vector); ++ vp)
X		...
X.DE
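X.PP
XA complete program may make the point clearer.
XThe following is only a sketch;
Xthe contents of
X.B vector
Xare invented for the illustration.
XSince the dimension is left to the compiler,
Xelements can be added to the initializer
Xwithout touching the rest of the program:
X.DS
X#include <stdio.h>
X
X#define	DIM(x)	(sizeof (x) / sizeof ((x)[0]))
X#define	LAST(x)	((x)[DIM(x)-1])
X#define	END(x)	((x) + DIM(x)-1)
X
Xint vector[] = { 3, 1, 4, 1, 5 };	/* sample data */
X
Xint main()
X{	int * vp, sum = 0;
X
X	for (vp = vector; vp <= END(vector); ++ vp)
X		sum += *vp;	/* visits every element once */
X	printf("%d %d %d\en", (int) DIM(vector), LAST(vector), sum);
X	return 0;
X}
X.DE
X.LP
XThis should print
X.B "5 5 14" .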
X.PP
X.B sizeof
Xis evaluated by the compiler and may therefore be used in constant expressions.
XThis can be used to determine the length of constant strings
Xin an efficient and flexible fashion:
X.DS
X#define	STRLEN(s) (sizeof s - 1)
X
Xchar buf[STRLEN("model") + 1];
X	...
X	strcpy(buf, "model");
X.DE
X.LP
XThere is the danger, however, that
X.B STRLEN
Xis used for other objects, i.e., non-strings,
Xby mistaking it for
X.B strlen ...
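X.PP
XThe danger is easy to demonstrate.
XIn the following sketch (the variable names are made up)
Xthe macro is right for a string constant and for a character vector,
Xbut for a pointer it silently yields the size of the pointer less one:
X.DS
X#include <stdio.h>
X
X#define	STRLEN(s) (sizeof s - 1)
X
Xchar array[] = "model";		/* a vector of 6 characters */
Xchar * pointer = "model";	/* merely points to such a vector */
X
Xint main()
X{
X	printf("%d\en", (int) STRLEN("model"));	/* 5 */
X	printf("%d\en", (int) STRLEN(array));	/* 5 */
X	printf("%d\en", (int) STRLEN(pointer));	/* sizeof(char *) - 1 */
X	return 0;
X}
X.DE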
X.SH
XTrace
X.PP
XIt is well known that a
X.I "macro call"
Xis not recognized in a constant string.
XLess well known, but more useful, is perhaps
Xthat a
X.I "macro parameter"
Xis recognized and replaced even inside a string within the replacement text
Xof a macro definition.
XRather than
X.DS
Xprintf("variable = %d\en", variable);
Xprintf("formula = %f\en", formula);
X.DE
X.LP
Xwe write
X.DS
X#define SHOW(val,fmt) fprintf(stderr,"SHOW: val = fmt\en",val)
X
X	SHOW(variable, %d);
X	SHOW(formula, %f);
X.DE
X.LP
XThe latter is easier to use
Xand conveys more information
Xsince
X.B val
Xis replaced in the format by the entire macro argument.
X.PP
XA bit of caution is required:
Xif the
X.B %
Xoperator is used within
X.B val
Xthere will be problems with the format.
XThis can be corrected as follows:
X.DS
X#define SHOW(val,fmt) fprintf(stderr,"%s = fmt\en", "val",val)
X.DE
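X.PP
XBoth definitions depend on a preprocessor
Xwhich replaces parameters inside strings.
XPreprocessors which follow ANSI C do not do this.
XFor them a similar effect can be sketched with the
X.B #
Xoperator, which turns a macro argument into a string,
Xand with the concatenation of adjacent string constants;
Xnote that in this variant the format must be passed as a string:
X.DS
X#include <stdio.h>
X
X#define SHOW(val,fmt) fprintf(stderr, "%s = " fmt "\en", #val, val)
X
Xint main()
X{	int variable = 7;
X
X	SHOW(variable, "%d");		/* variable = 7 */
X	SHOW(variable % 4, "%d");	/* variable % 4 = 3 */
X	return 0;
X}
X.DE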
X.PP
XA macro can be defined without a replacement text.
XUses of
X.B SHOW
Xcan thus easily be eliminated from the compiled program altogether.
XAlternatively we can specify a condition:
X.DS
X#ifdef	DEBUG
X	char debugflag;
X#	define	SHOW(val,fmt)	(debugflag && fprintf(...))
X#else	! DEBUG
X#	define	SHOW(val,fmt)	/* null */
X#endif	DEBUG
X.DE
X.PP
XIn this example
X.B SHOW
Xis always used as a statement and not as an expression.
XUsing
X.B &&
Xrather than
X.B if
Xhas two advantages:
Xthis way we do not
X\fIhave\fR
Xto use
X.B SHOW
Xas a statement, and a use of
X.B SHOW
Xdoes not invite an unintentional
X.B else ...
X.PP
X.B debugflag ,
Xby the way,
Xshould be used as a bit vector,
Xe.g.:
X.DS
X#define SHOW(level,val,fmt) (debugflag & 1<<level && \e
X			       fprintf(...))
X.DE
X.LP
XNow we can maintain different sets of trace information at
X.B level
X.B 0
Xthrough
X.B 7 .
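X.PP
XA small sketch shows the selection at work.
XIt assumes a simplified
X.B SHOW
Xwith a fixed format and an arbitrary setting of
X.B debugflag ;
Xneither is part of the definitions above:
X.DS
X#include <stdio.h>
X
Xchar debugflag = 1<<0 | 1<<2;	/* levels 0 and 2 are switched on */
X
X#define SHOW(level,val) (debugflag & 1<<(level) && \e
X		fprintf(stderr, "level %d: %d\en", level, val))
X
Xint main()
X{	int n = 42;
X
X	SHOW(0, n);	/* printed */
X	SHOW(1, n);	/* suppressed */
X	SHOW(2, n);	/* printed */
X	return 0;
X}
X.DE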
X.SH
XGlobal variables
X.PP
XOf course,
Xyou like modular programs, too,
Xwith lots of sources, a
X.I makefile ,
Xa central header file,
Xand the (feeble) hope
Xthat all global declarations really match?
XYou like to
X.I lint ,
Xtoo?
X.PP
XThe following technique simplifies maintaining global variables.
XThe central header file contains about the following:
X.DS
X#ifndef	GLOBAL
X#	define	GLOBAL	extern
X#endif	GLOBAL
X
XGLOBAL	int	global_variable;
X.DE
X.LP
XIf nothing else is arranged,
Xa variable declared
X.B GLOBAL
Xthus is declared
X.B extern .
X.PP
XWithin exactly
X.I one
Xof the source files which include the header file
Xwe have to take care that the variables which were declared
X.B extern
Xelsewhere are really defined.
XIn the main source file we therefore write:
X.DS
X#define	GLOBAL	/* to define global variables */
X#include "definitions.h"
X.DE
X.PP
XOne can even initialize global variables in this context
X.I without
Xresorting to the
X.B \(mim
Xflag instructing the loader
X.I ld
Xto accept multiple definitions:
X.DS
X#ifdef	GLOBAL
X#	define	INIT(x)	= x
X#else	! GLOBAL
X#	define	GLOBAL	extern
X#	define	INIT(x)
X#endif	GLOBAL
X
XGLOBAL	int variable INIT(10);
X.DE
X.PP
XThis technique is not very practical for aggregates:
Xthe commas in an initializer enclosed in braces
Xwould be taken to separate macro arguments.
XThe following variant is easier to use:
X.DS
X#ifdef	GLOBAL
X#	define	INIT(x)	= x
X#	define	GINIT
X#else	! GLOBAL
X#	define	GLOBAL	extern
X#	define	INIT(x)	;
X#	undef	GINIT
X#endif	GLOBAL
X
XGLOBAL	struct { int a; char b; } variable INIT()
X#ifdef	GINIT
X		{ 10, 'b' };
X#endif	GINIT
X.DE
X.PP
XThis method requires
Xthat the C preprocessor permits a macro call with an empty argument list
Xand that the C compiler does not complain about superfluous semicolons
Xbetween global declarations.
XThis method is admittedly no longer very elegant
Xbut it has the significant advantage
Xthat the text of central definitions exists only once in all cases.
X.SH
X/bin/lex
X.SH
XNow you see it...
X.PP
X.I lex
Xprograms have lots in common with fashions:
Xthe effect is not always what the pattern promises...
XIf a function generated by
X.I lex
Xis used as a front end for a parser generated by
X.I yacc
Xit is sometimes very hard to decide
Xwhere to place the blame for a bug:
Xis there a bug in the grammar presented to
X.I yacc
Xor are the patterns which were processed by
X.I lex
Xat fault?
X.PP
XThe following technique
X.FS
XThis technique was developed for the book
X.I "Introduction to Compiler Construction"
Xby A. T. Schreiner and H. G. Friedman Jr.,
Xto be published in January 1985 by Prentice-Hall.
X.FE
Xpermits the construction of a source file for
X.I lex
Xwhich is conditionalized so that a debugging version
Xcan be compiled at any time without any changes to the source.
XIn order to test the results of
X.I lex ,
Xall inputs which the parser is to receive later
Xare first presented to the debugging version.
XThis version of the front end then prints a mnemonic version
Xof the values which the parser would receive:
X.DS
X%{
X#ifdef	TRACE
X
X#	include	"assert.h"
X
X	main()
X	{	char * cp;
X
X		assert(sizeof(int) >= sizeof(char *));
X		while (cp = (char *) yylex())
X		    printf("%-.10s is \e"%s\e"\en",cp,yytext);
X	}
X
X#	define	token(x)	(int) "x"
X
X#else	! TRACE
X
X#	include	"y.tab.h"
X#	define	token(x)	x
X
X#endif	TRACE
X%}
X.DE
X.PP
XNormally
X.B TRACE
Xis undefined and the
X.I tokens ,
Xi.e., the values which are to be returned to the parser,
Xare defined in the file
X.I y.tab.h
Xgenerated by
X.I yacc
Xas:
X.DS
X#define	NAME	257
X	...
X.DE
X.LP
XThese defined names are used directly in the source presented to
X.I lex
Xand are returned as a result of the function
X.B yylex() .
X.PP
XIf
X.B TRACE
Xis defined,
X.I y.tab.h
Xneed not yet exist.
XIn this case, i.e., in the debugging version,
Xwe want to return a string as a result of
X.B yylex()
Xwhich is then printed by the
X.B main()
Xprogram included in this case.
X.FS
XThe technique requires that a pointer to a character string
Xcan be returned in place of an
X.B int
Xvalue. This is not possible across all implementations of C,
Xe.g., it is probably not allowed on the 7300 systems.
XWe guard against a portability problem using
X.B assert() .
X.FE
XAnalyzing the debugging output is most easily accomplished
Xif the output uses exactly those words which later will appear in
X.I y.tab.h ,
Xi.e., which are a result of
X.B %token
Xstatements in the source presented to
X.I yacc .
X.PP
XWe are using the fact that macro parameters
Xare replaced within strings in the replacement text of a macro.
X\fBtoken(\fIx\fB)\fR
Xeither returns
X.I x
Xitself (to be passed on to
X.I yacc ),
Xor a string
X\fB"\fIx\fB"\fR
Xfor the purposes of
X.B TRACE .
X.PP
XThe remainder of the
X.I lex
Xprogram is now quite obvious:
X.DS
X%%
X
X[0-9]+				return token(NUMBER);
X[a-z_A-Z][a-z_A-Z0-9]*		return word();
X[ \et\en]+			;
X\&.				return token(yytext[0]);
X
X%%
X
Xstruct reserved { char * text; int yylex; } reserved[] = {
X	{ "begin", token(BEGIN) },
X	{ "end", token(END) },
X	(char *) 0 };
X
Xint word()
X{	struct reserved * rp;
X
X	for (rp = reserved; rp->text; ++ rp)
X		if (strcmp(yytext, rp->text) == 0)
X			return rp->yylex;
X	return token(NAME);
X}
X.DE
X.PP
XYes \(em there should have been a binary search,
Xbut we are dealing only with the principles...
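X.PP
XFor the record, here is a sketch of such a search.
XIt assumes that the table is kept sorted by
X.B text ;
Xthe token values and the
X.B main()
Xprogram are invented
Xso that the fragment can be tried without
X.I lex
Xand
X.I yacc :
X.DS
X#include <stdio.h>
X#include <string.h>
X
X#define	DIM(x)	(sizeof (x) / sizeof ((x)[0]))
X#define	NAME	257	/* hypothetical token values */
X#define	BEGIN	258
X#define	END	259
X
Xchar * yytext;		/* normally supplied by lex */
X
Xstruct reserved { char * text; int yylex; } reserved[] = {
X	{ "begin", BEGIN },	/* must be sorted by text */
X	{ "end", END },
X};
X
Xint word()
X{	int lo = 0, hi = DIM(reserved) - 1, mid, c;
X
X	while (lo <= hi)	/* halve the range each time */
X	{	mid = (lo + hi) / 2;
X		c = strcmp(yytext, reserved[mid].text);
X		if (c == 0)
X			return reserved[mid].yylex;
X		if (c < 0)
X			hi = mid - 1;
X		else
X			lo = mid + 1;
X	}
X	return NAME;	/* not a reserved word */
X}
X
Xint main()
X{	yytext = "end"; printf("%d\en", word());	/* 259 */
X	yytext = "x"; printf("%d\en", word());		/* 257 */
X	return 0;
X}
X.DE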
X.SH
X/usr/src/main.c
X.SH
XArgument standards
X.PP
XCommand arguments are always good for surprises.
XSometimes several options may be combined into one argument;
Xsometimes each option must be a separate argument;
Xsometimes a parameter value follows as part of the argument;
Xsometimes it does not;
Xall of the above; some of the above... ?
X.PP
XIf one consults the sources of certain
X.UX
Xutilities, one learns to appreciate the flexibility of C
X(or the infinite patience of the C compiler?):
Xeverybody does his own thing,
Xand most do it differently in every program!
XHowever, it would be so simple to develop a standard:
X.DS
X#include <stdio.h>
X
X#define	show(x)	printf("x = %d\en", x)
X#define	USAGE	fputs("cmd [-f] [-v #]\en", stderr), exit(1)
X
Xmain(argc, argv)
X	int argc;
X	char ** argv;
X{	int f = 0, v = 0;
X
X	while (--argc > 0 && **++argv == '-')
X	{	switch (*++*argv) {
X		case 0:                         /* - */
X			--*argv;
X			break;
X		case '-':
X			if (! (*argv)[1])       /* -- */
X			{	++ argv, -- argc;
X				break;
X			}
X		default:
X			do
X			{	switch (**argv) {
X				case 'f':         /* -f */
X				    ++ f;
X				    continue;
X				case 'v':
X				    if (*++*argv)
X					;        /* -v# */
X				    else if (--argc > 0)
X					++argv; /* -v # */
X				    else
X					break;
X				    v = atoi(*argv);
X				    *argv += strlen(*argv)-1;
X				    continue;
X				}
X				USAGE;
X			} while (*++*argv);
X			continue;
X		}
X		break;
X	}
X	show(f), show(v), show(argc);
X	if (argc) puts(*argv);
X}
X.DE
X.PP
XWhen
X.B show()
Xis called,
X.B argc
Xcontains the number of arguments which have not yet been processed
Xand
X.B *argv
Xis the first one of these.
XThis argument can be a single
X.B \(mi
Xcharacter \(em
Xin some ancient
X(\fIcat\fR)
Xand almost new
X(\fItar\fR)
Xutilities this indicates that standard input or output is to be used
Xin place of a file argument.
X.PP
XFlags can be combined at will.
XIf an option requires a value,
Xit can follow immediately (as the rest of the same argument)
Xor it can be an argument of its own.
X.PP
XFollowing a standard proposed in the USENIX newsletter
X.I ;login: ,
Xan option
X.B \(mi\(mi
Xserves to terminate processing of the option list.
XApart from that, options must start with
X.B \(mi
Xand they must precede other arguments.
XThese rules, however, still do not cover all possibilities of
X.I pr ...
X.PP
XThe skeleton above is useful
Xbut anatomically somewhat terrifying.
XThe following incarnation is perhaps more attractive:
X.DS
X#include <stdio.h>
X#include "main.h"
X
X#define	show(x)	printf("x = %d\en", x)
X#define	USAGE	fputs("cmd [-f] [-v #]\en", stderr), exit(1)
X
XMAIN
X{	int f = 0, v = 0;
X
X	OPT
X	ARG 'f':
X		++ f;
X	ARG 'v': PARM
X		v = atoi(*argv);
X		NEXTOPT
X	OTHER
X		USAGE;
X	ENDOPT
X	show(f), show(v), show(argc);
X	if (argc) puts(*argv);
X}
X.DE
X.PP
XThe trick of course is concealed in the header file
X.I main.h :
Xhere the macros
X.B OPT ,
X.B ARG ,
X.B PARM ,
X.B NEXTOPT ,
X.B OTHER ,
Xand
X.B ENDOPT
Xmust be defined using exactly those texts
Xwhich were given explicitly in the previous example:
X.DS
X#define MAIN    main(argc, argv)                          \e
X			int argc;                         \e
X			char ** argv;
X#define OPT     while (--argc > 0 && **++argv == '-')     \e
X		{       switch (*++*argv) {               \e
X			case 0:                           \e
X				--*argv;                  \e
X				break;                    \e
X			case '-':                         \e
X				if (! (*argv)[1])         \e
X				{       ++ argv, -- argc; \e
X					break;            \e
X				}                         \e
X			default:                          \e
X				do                        \e
X				{	switch (**argv) {
X#define ARG                                     continue; \e
X					case
X#define OTHER                                   continue; \e
X					}
X#define ENDOPT                  } while (*++*argv);       \e
X				continue;                 \e
X			}                                 \e
X			break;                            \e
X		}
X#define PARM if (*++*argv);                               \e
X	     else if (--argc > 0)++argv; else break;
X#define	NEXTOPT	*argv += strlen(*argv)-1;
X.DE
X.PP
XThe definitions are not exactly beautiful
X\(em especially if they need to be compacted so that the
XC preprocessor accepts the lengthy replacement texts \(em
Xbut they need to be developed only once
Xto make the argument standard available for all applications.
XAn application then is almost self-documenting:
X.RS
X.IP \fBMAIN\fR
Xis the function header of the main program.
X.IP \fBOPT\fR
Xstarts the loop during which the options are processed.
X.IP \fBENDOPT\fR
Xcompletes this loop.
X.IP \fBARG\fR
Xwithin the loop starts the processing of one option;
Xthe name of the option
X(a single character)
Xenclosed in single quotes and a colon must follow.
X.IP \fBPARM\fR
Xfollows the option specification if the option has a value parameter.
XThe parameter itself is then available as
X.B *argv .
X.IP \fBNEXTOPT\fR
Xis used in particular once such a parameter has been processed
Xto advance to the next command argument.
X.IP \fBOTHER\fR
Xmust follow all options;
Xfollowing this, one specifies what should be done
Xif an option could not be recognized.
X.B NEXTOPT
Xmay be specified in this case, too.
XThe unknown option itself is
X.B **argv .
X.RE
X.PP
XAfter the
X.B OPT
X.B ENDOPT
Xloop
X.B argc
Xcontains the number of command arguments which have not yet been processed
Xand
X.B *argv
Xis the first such argument.
XArbitrarily many (different) options
X.B ARG
Xcan be specified.
X.I pr
Xwould be implemented approximately as follows:
X.DS
XMAIN
X{
X	do
X	{	OPT
X		ARG 'h': PARM
X			header = *argv;
X			NEXTOPT
X		ARG 'w': PARM
X			width = atoi(*argv);
X			NEXTOPT
X		ARG 'l': PARM
X			length = atoi(*argv);
X			NEXTOPT
X		ARG 't':
X			tflag = 1;
X		ARG 's': PARM
X			delimiter = **argv;
X			++*argv;
X			NEXTOPT
X		ARG 'm':
X			mflag = 1;
X		OTHER
X			if (isdigit(**argv))
X				columns = atoi(*argv), NEXTOPT
X			else
X				USAGE, exit(1);
X		ENDOPT
X
X		if (argc)
X		{	if (**argv == '+')
X			{	PARM
X				first_page = atoi(*argv);
X				continue;
X			}
X	
X			dopr(*argv);
X		}
X		else
X			dopr("-");
X	} while (argc > 1);
X}
X.DE
X.LP
XThere is a blemish:
X\fB\(mi\fIcolumns\fR
Xmust be specified as a
X.I single
Xargument
X(since
X.B \(mi
Xalone refers to standard input).
X
X
X
X
SHAR_EOF
if test 16364 -ne "`wc -c 'clinic1'`"
then
	echo shar: error transmitting "'clinic1'" '(should have been 16364 characters)'
fi
: '      End of shell archive'
exit 0
-- 

Carl Swail      Mail: National Research Council of Canada
		      Building U-66, Montreal Road
		      Ottawa, Ontario, Canada K1A 0R6
		Phone: (613) 998-3408
USENET:
{pesnta,lsuc}!nrcaero!carl
{cornell,uw-beaver}!utcsrgv!dciem!nrcaero!carl
{allegra,decvax,duke,floyd,ihnp4,linus}!utzoo!dciem!nrcaero!carl