[comp.unix.questions] software tools question

jkrueger@dgis.dtic.dla.mil (Jon) (12/08/89)

I can't be the only person who needs to do this...
so maybe this is appropriate for the net.

I have a table that is common to many different documents.  I decided
to keep it in a separate file, so I can maintain a single copy, and
include it into my documents as needed.  I'm using text processing
tools commonly found on the UNIX (registered trademark of AT&T)
timesharing system:  troff and tbl.  So I placed my tbl definitions
into a file we'll call mytable.  I figured I'd just use the .so
command in troff: .so mytable.  Right?

Wrong: the .so command includes mytable *after* tbl is done.  It needs
to be included before.  So I went looking for an interpolation tool.
The right tool on UNIX seemed to be a simple macro preprocessor: m4.  I
changed the .so mytable to include(mytable), and generated my document
with m4 mydoc | tbl | troff.  This worked pretty well.

One day, the word "shift" appeared at the beginning of one of the lines
in my document.  This generated the message "m4: shift not yet
implemented", and that line was dropped from output.  More careful
examination of the m4 man page revealed that m4 keywords like
"include", "define", and "divert" could easily occur in my documents.  M4 will
drop them, and if they're at the beginning of the line, the rest of the
line with them.

Well, I gave up.  The following lex program mimics the m4 include
behavior, and passes through all else:

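	%{
		/* Link with -ll for the default main() and yywrap(). */
		FILE	*fp;		/* the file being included */
		char	*index();	/* BSD name; strchr() on System V */
		int	gotc;		/* int, not char, so EOF is distinguishable */
	%}
	%%
	^include\([^)]*\)$	{
				/* cut yytext at ')' so the file name is NUL-terminated */
				*(index(yytext, ')')) = '\0';
				fp = fopen(index(yytext, '(') + 1, "r");
				/* an unreadable file is skipped silently */
				if (fp != NULL) {
					while ((gotc = getc(fp)) != EOF)
						putchar((char) gotc);
					fclose(fp);
				}
			}
	%%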

It solves my immediate problem, and certainly wasn't that terrible to
write.  But the software tools question keeps coming back to me:  is
there some tool already on UNIX that I'm missing here?  Sed can't do
this at all, as far as I can tell.  Awk can, but doesn't seem better
suited than lex; it would be even more, uh, awkward.  Cpp has fewer
keywords to collide with, but it isn't really a general-purpose tool and
will introduce stray text such as line-number directives.  What is the
right tool for this?
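
For comparison, here is a rough sketch of how the same include(...)
expansion might look in awk, wrapped in a shell function.  The name
expand_includes is hypothetical, not a standard utility, and the sketch
assumes each include(FILE) sits alone on its line, as in the lex rule
above.  Every other line, including ones that start with m4 keywords
such as "shift", is passed through untouched.

```shell
# expand_includes: copy the named files to standard output, replacing any
# line of the exact form include(FILE) with the contents of FILE.
# (Hypothetical helper name; an unreadable FILE is skipped silently.)
expand_includes() {
    awk '
    /^include\([^)]*\)$/ {
        # Drop the leading "include(" (8 characters) and the trailing ")".
        file = substr($0, 9, length($0) - 9)
        while ((getline line < file) > 0)
            print line
        close(file)
        next
    }
    { print }
    ' "$@"
}
```

It would be used in place of m4 in the pipeline, for example
expand_includes mydoc | tbl | troff.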

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
The Philip Morris Companies, Inc: without question the strongest
and best argument for an anti-flag-waving amendment.

henry%angel@Sun.COM (Henry McGilton -- Software Products) (12/09/89)

In Article <19856@dgis.dtic.dla.mil> jkrueger (Jon Krueger) writes:

    *  I have a table that is common to many different
    *  documents.  I decided to keep it in a separate file, so
    *  I can maintain a single copy, and include it into my
    *  documents as needed.  I'm using text processing tools
    *  commonly found on the UNIX (registered trademark of
    *  AT&T) timesharing system:  troff and tbl.  So I placed
    *  my tbl definitions into a file we'll call mytable.  I
    *  figured I'd just use the .so command in troff:
	    .so mytable.
    *  Right?

    *  Wrong: the .so command includes mytable *after* tbl is done.

This is indeed true.  .so  is a troff request, and as such is
processed by troff, which, regrettably, comes after its
preprocessors in the pipeline.

    *  It needs to be included before.  So I went
    *  looking for an interpolation tool.  The right tool on
    *  UNIX seemed to be a simple macro preprocessor: m4.  I
    *  changed the .so mytable to include(mytable), and
    *  generated my document with m4 mydoc | tbl | troff.
    *  This worked pretty well.

    *  . . .  lines deleted detailing why m4 is not the whole answer.

This requirement occurs all the time.  If you have access
to any UNIX system running the 4.x BSD (Berkeley) flavor of
UNIX, you should find a program called `soelim'.  While
soelim was originally written for another purpose, it turns
out that it does just what you want, namely, to do the
source'ing of included files before the troff preprocessors
do their work.  Then you run the pipeline of commands in the form:

	soelim sourcefiles . . . | pic | tbl | eqn | troff . . .

I don't know if soelim is available on System V at this
point.  If you have access to a BSD system, grab the source
from such a system.  soelim was written by Bill Joy in
1977, and, as far as I know, is not copyrighted.
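
On a system without soelim, its core behavior can be approximated with a
few lines of awk.  The sketch below is only a rough stand-in, not the
BSD program: the name so_expand is hypothetical, it recognizes only
lines that begin with ".so" followed by a file name, and it does nothing
to protect against a file that includes itself.  Nested .so requests
inside included files are expanded as well.

```shell
# so_expand: hypothetical stand-in for soelim.  Copies the named files to
# standard output, replacing each ".so FILE" line with the contents of
# FILE, and expanding further .so lines found inside FILE.
so_expand() {
    awk '
    function expand(file,    line, parts) {
        while ((getline line < file) > 0) {
            if (line ~ /^\.so[ \t]+[^ \t]/) {
                # parts[1] is ".so", parts[2] is the file name
                split(line, parts, /[ \t]+/)
                expand(parts[2])
            } else
                print line
        }
        close(file)
    }
    # Only a BEGIN rule: the file arguments are read explicitly above.
    BEGIN { for (i = 1; i < ARGC; i++) expand(ARGV[i]) }
    ' "$@"
}
```

It slots into the same place in the pipeline, for example
so_expand sourcefiles | tbl | troff.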

Anybody wish to elucidate on the issue of whether a
status of `not copyrighted' implies `publicly distributable'?

	................ Henry
+------------------+------------------------+---------------------------+
| Henry McGilton   | I saw the future,      | arpa: hmcgilton@sun.com   |
| Sun Microsystems | and it didn't work.    | uucp: ...!sun!angel!henry |
| Mt. View, CA     |                        |                           |
+------------------+------------------------+---------------------------+

skwu@boulder.Colorado.EDU (WU SHI-KUEI) (12/09/89)

With System V by far the easiest solution to including a table in multiple
documents is to process the table source with 'tbl', redirecting the output
to a file which can be included anywhere.  I.e.:

	tbl table1.src > table1.fmt
	tbl table2.src > table2.fmt

In the text source:

	blah blah blah
	.so table1.fmt
	more blah blah blah
	.so table2.fmt

There is no need for any other software tools (At least with 'mm' or any 
of the other System V text formatting tools).
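
If the tables change from time to time, the preprocessing step can be
scripted so that each .fmt file is rebuilt only when its .src file is
newer.  The following is a minimal sketch under some assumptions: the
function name format_tables and the TBL variable are hypothetical, the
sources are assumed to be named NAME.src, and the -nt test is assumed to
be available (it is in bash, dash, and ksh).  The .src files are never
modified or removed, since they remain the only editable form of the
tables.

```shell
# format_tables: for each NAME.src argument, regenerate NAME.fmt when it
# is missing or older than its source.  The formatter defaults to tbl and
# can be overridden through the (hypothetical) TBL variable.
format_tables() {
    for src in "$@"
    do
        fmt="${src%.src}.fmt"
        if [ ! -f "$fmt" ] || [ "$src" -nt "$fmt" ]
        then
            # Write to a temporary file so a failed run leaves no
            # half-written .fmt behind.
            if ${TBL:-tbl} "$src" > "$fmt.tmp"
            then
                mv "$fmt.tmp" "$fmt"
            else
                rm -f "$fmt.tmp"
                return 1
            fi
        fi
    done
}
```

For example, format_tables table1.src table2.src would be run before
formatting any document that contains .so table1.fmt.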

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (12/10/89)

  Many versions of UNIX include 'soelim' to process the .so commands for
you.  I think a version was posted to the net, but I don't have archives
handy.  Several years ago I posted 'sop', which does much the same thing.
If you want a copy, let me know and I'll repost it to alt.sources.  It's
quite short, and I don't remember where I posted it the first time, so it
may not be archived.
-- 
	bill davidsen - sysop *IX BBS and Public Access UNIX
davidsen@sixhub.uucp		...!uunet!crdgw1!sixhub!davidsen

"Getting old is bad, but it beats the hell out of the alternative" -anon

henry%angel@Sun.COM (Henry McGilton -- Software Products) (12/12/89)

In article <14720@boulder.Colorado.EDU>, skwu@boulder.Colorado.EDU (WU SHI-KUEI) writes:
    *  With System V by far the easiest solution to including
    *  a table in multiple documents is to process the table
    *  source with 'tbl', redirecting the output to a file
    *  which can be included anywhere.  I.e.:

	    tbl table1.src > table1.fmt
	    tbl table2.src > table2.fmt

There are some minor (possibly major) problems with this approach:

    o   if your source has grap or pic programs, you must
	then preprocess the stuff using those programs first:

	    grap table1.src | pic | tbl > table1.fmt

    o   if your source has equations, you must
	then process the stuff using eqn as well:

	    grap table1.src | pic | tbl | eqn > table1.fmt

This leads to a couple of problems:

    1.  you use a lot more disk space keeping all
	the intermediate forms of the stuff around.
	The expansion factor through tbl and eqn can
	be around 10 to 1.

    2.  time after time after time I've seen neophyte
	users do exactly this process, followed by removing
	the original sources.  They don't understand that
	it's a one-way process, and that getting from
	expanded tbl output back to a tbl description is
	hard work.

    *  There is no need for any other software tools (At least
    *  with 'mm' or any of the other System V text formatting tools).

soelim or one of its clones is a safer answer.

	............. Henry

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/12/89)

In article <129057@sun.Eng.Sun.COM> henry%angel@Sun.COM (Henry McGilton -- Software Products) writes:
>	    grap table1.src | pic | tbl | eqn > table1.fmt

This doesn't quite address the original problem, but since I've found
it handy I thought I'd post it:

#!/usr/5bin/sh
#	doctype -- synthesize proper command line for troff
#	adapted from Kernighan & Pike

#	last edit:	88/09/19	D A Gwyn
#	SCCS ID:	@(#)doctype.sh	1.9

PATH=/usr/5bin:/bin:/usr/bin
if pdp11
then	MACDIR=/usr/lib/tmac
else	MACDIR=/usr/5lib/tmac		# BRL System V emulation
fi

eopt=
macs=
opts=
topt=
for i
do	case "$i" in
	-e)	eopt="$i";		shift;;
	-m*)	macs="$macs $i";	shift;;
	-T*)	topt=" $i";		shift;;
	--)				shift;	break;;
	-)					break;;
	-*)	opts="$opts $i";	shift;;
	*)					break;;
	esac
done

if [ $# -gt 0 ]
then	s="cat $* | "
else	s=
fi

t=`cat $* |
egrep '^\.(EQ|TS|\[|P|G1|IS|SH)' |
sort -u |
awk '
/^\.SH/ { man++ }
/^\.P$/ { mm++ }
/^\.PP/ { ms++ }
/^\.EQ/ { eqn++ }
/^\.TS/ { tbl++ }
/^\.PS/ { pic++ }
/^\.PF/ { pic++ }
/^\.G1/ { grap++ }
/^\.IS/ { ideal++ }
/^\.\[/ { refer++ }
END {
	if (refer > 0)	printf "refer | "
	if (grap > 0)	printf "grap | "
	if (grap > 0 || pic > 0)	printf "_PIC_ | "
	if (ideal > 0)	printf "ideal | "
	if (tbl > 0)	printf "tbl | "
	if (eqn > 0)	printf "_EQN_ | "
	printf "_TROFF_"
	if (grap > 0 || pic > 0)	printf " -mpic"
	if (man > 0)	printf " -man"
	if (mm > 0 && man == 0)	printf " -mm"
	if (ms > 0 && mm == 0 && man == 0)	printf " -ms"
	printf " -\n"
} ' | sed -e s/_PIC_/"pic$topt"/ -e s/_EQN_/"eqn$topt"/ \
	-e s/_TROFF_/"troff$topt$opts$macs"/ -e s%' -m'%" $MACDIR/tmac."%g`

if [ -n "$eopt" ]
then	eval "$s$t"
else	echo $s$t
fi