[comp.text] Troff Macro Programming A: Part 2

lee@anduk.co.uk (Liam R. Quin) (05/18/89)

Intended Audience: anyone who writes troff macros, or who wants to
learn how to do so.  A very basic understanding of troff is assumed;
you could probably get that from reading part 1.


This article covers some troff programming techniques.  It's always
difficult to decide whether to do this before or after page layout, but
I prefer to do it this way if I'm talking to reasonably technical people,
as I can then use this stuff to do page layout!

Thus, the next article (if I get enough replies to merit continuing) will
show you how to use all this `macro programming' in practice, doing page
headers, margins, etc..., and the one after that will come back to some more
programming examples, including Callout Lists, and more complex pages.

Then we'll use all *that* to do balanced and parallel columns... whew!

More comments and feedback would be very helpful!  (although I have
already had some encouraging mail).

If you know other or better ways to do these things, say so!

Lee

=====================
2.0 Tricks With Names

Recall that troff reads its input a character at a time, so that some
\-escapes are expanded (`interpolated') everywhere.  In particular,
\n and \* are always expanded.	

Pedantic Aside:
	there are some obscure exceptions, but
	they will be obvious if you're in those situations!
	The most common is if you change the escape character
	with the .ec request.

Suppose I have four friends:			| old troff:
	.ds Friend1 "Simon			| .ds F1 "Simon
	.ds Friend2 "Lorella			| .ds F2 "Lorella
	.ds Friend3 "David			| .ds F3 "David
	.ds Friend4 "Aggie			| .ds F4 "Aggie
I can access David with \*[Friend3].		| \*(F3

But I can also do this:
	.nr Whom 4				| .nr W? 4
	\*[Friend\n[Whom]]			| \*(F\n(W?

When troff reads this, it gets as far as
	\*
and calls the code to read a string name a character at a time.
The name that troff actually sees is thus Friend4, because the \n is
interpolated as soon as it is seen.

So now we can do arrays in troff (well, sort of!).
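
Here is a minimal sketch of the same trick wrapped in a macro, assuming a
troff that accepts long names (groff takes this syntax; it is untested on
sqtroff).  The macro name WhoIs is made up for the example, and the doubled
backslashes inside the macro body are explained further on.  The .tm request
writes to standard error, so nothing is typeset.

```troff
.ds Friend1 Simon
.ds Friend2 Lorella
.ds Friend3 David
.ds Friend4 Aggie
.\" WhoIs n -- hypothetical helper: report friend number n on stderr
.de WhoIs
.tm friend \\$1 is \\*[Friend\\$1]
..
.WhoIs 3
.WhoIs 1
```

This should report `friend 3 is David' and then `friend 1 is Simon': the
argument is spliced into the string name before the string is looked up.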

Later we'll look at some more useful examples, but it's very important
to be clear on this topic, as it is used extensively by the majority
of troff macro packages.

Aside on Old Troff:
	If you have old troff, you have to be very careful not to have
	arrays with more than 10 elements... because
		\*(F11
	will interpolate the value of the string `F1' and follow it
	with a `1'!

	You can get around that by using a string instead of a number,
	and using a...z and A...Z as well as 0...9, but this is too
	messy to show here...  You would probably define a macro which
	would take the name of a string and a number, and would set the
	string to an appropriate character....

===================
2.1 Data Structures

Most programming texts describe Stacks and Queues.

Unless overwhelmed with demand, I am not going to go into great detail
about what these are.

But I will show how to implement them with some examples.

============
2.1.0 Stacks

The example output I posted last week (don't worry if you didn't save it
or don't have the right kind of printer) showed an example of a list.

Here's another short example.
	0.  Introduction

	1.  Overview

	2.  Tricks With Names
	    a.  Stacks
	    b.  Queues
	    c.  Callout Lists

	3.  Page Layout

Each level of list has an associated counter and format.

In other words, when we come back from the sublist (a,b,c), we have to
remember that the next element (Page Layout) is going to be number 3.

A very good way to do this is to use a stack.

Suppose the input looks like
	.L 1 \" start numeric list
	.I \" item
	Introduction
	.I
	Overview
	.I
	Tricks With Names
	. L a \" start alphabetic list
	. I \" item:
	Stacks
	. I \" item..
	etc
	. /L \" end list
	.I
	Page Layout
	./L \" end outer list
(the space between the . and the macro name is ignored by troff, but
helps readability.  SGML uses a / for an end-element, and I'll do the
same here).

(We'll discuss input styles more later).

Let's assume that the L macro
*   sets up the indent so that the list-marker (1. or a.) is actually in
    the left margin -- i.e., indent to
    ^----------------------------------here, and calls this \n[ListIndent]

*   sets a tab stop at that point, so we can move from the end of the
    marker to the start of the text

*   sets a number register Item to contain one less than the number of
    the next list item.			| old troff:
					| I suggest that you figure out
					| the examples as they are, and then
					| convert them.  There are some
					| simple versions at the end of the
					| article.
	.nr ListIndent 4n \" only need to set this up once

	.de L \" wrong version
	. ta \\n[ListIndent]u
	. in +\\n[ListIndent]u
	. nr Item 0
	. af Item \\$1 \" set the format to 1. or a. or A. or whatever
	..

We could then make the Item macro like this:
	.de I
	. nr Item +1 \" increment the item number
	. sp 0.5 \" leave a gap
	. ti -\\n[ListIndent]u
	\\n[Item].\t\c
	..
The last line uses \t to represent a tab (type it in as \t too), and \c
to indicate that the line continues (like \c in the Unix echo command,
or echo -n on BSD Unix).

Aside:
	The sequence \t only works in `copy mode'.
	We will talk about `copy mode' later.  Beware that most of the 
	books on troff get \t seriously wrong.

The end-list macro simply has to undo the indent:
	.de /L
	. br
	. in -\\n[ListIndent]u
	..

Now, some questions:
*  Why have I written \\n everywhere and not \n as before?

   If I just wrote \n[Item], the register's value would be interpolated
   when the macro was defined, and not when it was called.

   The same applies for \\$ (accessing the arguments supplied to a macro
   when it is called), and for \\* for strings.

*  Does it work, and if so, why did I call it `wrong version'?

   Yes, it works.  Sort of.

   But it is `wrong' because it doesn't let you nest lists.  The example
   above won't work.
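
The difference between \n and \\n inside a definition can be seen with a
small experiment.  This is only a sketch: the register Count and the macros
Early and Late are names invented for it, and the long-name syntax assumes
groff or a similar troff.

```troff
.nr Count 1
.\" Early: single backslash, so Count is read while the macro is DEFINED
.de Early
.tm Early sees \n[Count]
..
.\" Late: doubled backslash, so Count is read each time the macro is CALLED
.de Late
.tm Late sees \\n[Count]
..
.nr Count 2
.Early
.Late
```

Early should report 1 (the value when it was defined) and Late should
report 2 (the value when it was called).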

If we defined the /L macro so that it reset everything (including the
value and format of `Item', for example) and not just the indent, it'd
be fine, as then .I wouldn't need to know that there had been an intervening
nested list.
[This is about as close to data hiding as you can get with troff :-)]

First attempt:

	.de L \" slightly better version
	. nr OldItemFormat \\g[Item]
	. af Item 0\" so we can read it with \n and get the value
	. nr OldItem \\n[Item]
	. ta \\n[ListIndent]u
	. in +\\n[ListIndent]u
	. nr Item 0
	. af Item \\$1 \" set the format to 1. or a. or A. or whatever
	..


	.de /L
	. br
	. in -\\n[ListIndent]u
	. nr Item \\n[OldItem] \" Restore Item too
	. af Item \\n[OldItemFormat]
	..

This will probably work, but not if you nest lists more than two deep.
You may also get a troff warning, because the `Item' register is used
before it is set in the definition of `L'.

Clearly we need a more general mechanism to save a set of parameters,
and you've probably guessed by now that we could use a stack.

So let's go back to looking at stacks, and define some macros:
					| old troff aside:  because of the
					| shortage of names, you will
					| probably prefer to include the
					| stack stuff directly in the list
					| macros.
s1	.de nstack \" .nstack name -- declare a new number-stack
s2	. if !\\n[.$]=1 .error fatal "Usage: nstack name" \" one argument only
s3	. if r _nstack.i.\\$1 .error fatal "number-stack \\$1 already in use"
s4	. nr _nstack.i.\\$1 0 \" nothing in the stack at the moment
s5	..

u1	.de npush \" npush nstackname value -- push a number onto a stack
u2	. if !\\n[.$]=2 .error fatal "Usage: npush nstack number"
u3	. if !r _nstack.i.\\$1 .error fatal "nstack \\$1 not declared"
u4	. nr _nstack.v.\\$1.\\n[_nstack.i.\\$1] 0\\$2
u5	. nr _nstack.i.\\$1 +1 \" increment stack pointer
u6	..

o1	.de npop \" npop stackname registername -- pop value
o2	. if !\\n[.$]=2 .error fatal "Usage: npop nstack register"
o3	. if !r _nstack.i.\\$1 .error fatal "nstack \\$1 not declared"
o4	. if \\n[_nstack.i.\\$1]<=0 .error fatal "npop: stack \\$1 is empty"
o5	. nr _nstack.i.\\$1 -1 \" decrement stack pointer
o6	. nr \\$2 0\\n[_nstack.v.\\$1.\\n[_nstack.i.\\$1]]
o7	..

These may look a little fearsome... so we'll take them a bit at a time.
The idea is that you can say
	.nstack items
to declare a stack of numbers called "items".

Then if you say
	.npush items 12
	.npush items -5
	.npush items 1i

	.npop items width
	.npop items boy
	.npop items girl

we end up with \n[width] as 1i, \n[boy] as -5, and \n[girl] as 12.
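
To try this out, here is a self-contained sketch of the three macros with
the line labels removed.  It assumes groff (long names and the `r' test).
The `error' macro is a stand-in of my own, since the article only assumes
that one exists.  Two details are choices made for this sketch: nstack
tests for one argument, and npop puts a 0 in front of the popped value.

```troff
.\" error severity message -- stand-in: fatal errors abort, others warn
.de error
.ie '\\$1'fatal' .ab \\$2
.el .tm \\$2
..
.de nstack \" .nstack name -- declare a new number-stack
.if !\\n[.$]=1 .error fatal "Usage: nstack name"
.if r _nstack.i.\\$1 .error fatal "number-stack \\$1 already in use"
.nr _nstack.i.\\$1 0
..
.de npush \" .npush nstackname value
.if !\\n[.$]=2 .error fatal "Usage: npush nstack number"
.if !r _nstack.i.\\$1 .error fatal "nstack \\$1 not declared"
.nr _nstack.v.\\$1.\\n[_nstack.i.\\$1] 0\\$2
.nr _nstack.i.\\$1 +1
..
.de npop \" .npop nstackname registername
.if !\\n[.$]=2 .error fatal "Usage: npop nstack register"
.if !r _nstack.i.\\$1 .error fatal "nstack \\$1 not declared"
.if \\n[_nstack.i.\\$1]<=0 .error fatal "npop: stack \\$1 is empty"
.nr _nstack.i.\\$1 -1
.\" leading 0 so that a negative value is stored, not subtracted
.nr \\$2 0\\n[_nstack.v.\\$1.\\n[_nstack.i.\\$1]]
..
.nstack items
.npush items 12
.npush items -5
.npop items boy
.npop items girl
.tm boy is \n[boy], girl is \n[girl]
```

The last line should report `boy is -5, girl is 12', the values coming back
in the reverse of the order in which they were pushed.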

Line by line, then...
s2	. if !\\n[.$]=1 .error fatal "Usage: nstack name"
Check that we've got exactly one argument, the stack name (npush and npop
make the same test against 2).  I assume that you have defined a macro
called `error'.  The number register .$ contains the number of arguments
given to this macro, and the test !\\n[.$]=1 will be true if .$ is NOT 1.

A common error is to try to use \\n[.$]!=2, which would be more logical (for
C programmers), and which wouldn't work.
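
A quick way to see the negated test at work, as a sketch only: Need2 is a
made-up macro that complains on standard error unless it is given exactly
two arguments.

```troff
.\" Need2 -- hypothetical macro that insists on exactly two arguments
.de Need2
.if !\\n[.$]=2 .tm Need2: expected 2 arguments, got \\n[.$]
..
.Need2 one
.Need2 one two
```

The first call should report `Need2: expected 2 arguments, got 1'; the
second call should say nothing.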

s3	. if r _nstack.i.\\$1 .error fatal "number-stack \\$1 already in use"
If there is already a register called _nstack.i.\\$1, complain.  We will use
_nstack.i.henry as an index into the stack "henry" (like a Stack Pointer).

If you're not using sqtroff, change this to
s3B	.if \\n[_nstack.i.\\$1] .error .....
and use 1 to represent an empty stack instead of 0.

s4	. nr _nstack.i.\\$1 0 \" nothing in the stack at the moment
Mark the stack as in use by defining the register, and empty by setting
the register (_nstack.i.henry perhaps) to 0.

u1	.de npush
u2 -- the same test as s2, but npush takes two arguments
u3	. if !r _nstack.i.\\$1 .error fatal "nstack \\$1 not declared"
Check that the stack is declared.  This really helps detect programming
errors.  If the troff input is coming from a database, you could delete
much of the error-checking and get more speed, but it really isn't worth
the hassle.

u4	. nr _nstack.v.\\$1.\\n[_nstack.i.\\$1] 0\\$2
We use _nstack.v.henry.0, _nstack.v.henry.1 etc... to store the values.
The leading 0 is necessary because otherwise
	.npush henry -4
would expand to something like
	.nr _nstack.v.henry.34 -4
which would subtract 4 from the current value of that register.
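
The effect of the leading 0 is easy to check on its own.  In this small
sketch the registers a and b are just scratch names:

```troff
.nr a 10
.\" a bare signed value is an increment: a becomes 10-4
.nr a -4
.tm a is \n[a]
.nr b 10
.\" with the leading 0 the expression 0-4 is assigned: b becomes -4
.nr b 0-4
.tm b is \n[b]
```

This should report `a is 6' and then `b is -4'.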

u5	. nr _nstack.i.\\$1 +1 \" increment stack pointer
The stack pointer always refers to the next *unused* slot, because it starts
out that way.

o1	.de npop
o2   -- as above
o3   -- as above
o4	. if \\n[_nstack.i.\\$1]<=0 .error fatal "npop: stack \\$1 is empty"
Check that the stack is not empty.

o5	. nr _nstack.i.\\$1 -1 \" decrement stack pointer
The pointer always refers to an empty slot, so we have to subtract one in
order to find the slot with the required value

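o6	. nr \\$2 0\\n[_nstack.v.\\$1.\\n[_nstack.i.\\$1]]
If the input was
	.npop henry james
then line o6 might have the effect of
	. nr james 0\n[_nstack.v.henry.3]
to return the value in slot 3 of the stack (slots are numbered from 0)
in register "james".  The leading 0 is there for the same reason as in u4:
without it, a negative value would be subtracted from "james" instead of
being stored in it.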


So now we have stacks of numbers.

If you want to do stacks of strings and macros, the trick is to pass the
name of the string/macro instead of the value, and to change the .nr to
a .rn (rename).  Then spush/mpush will have the side-effect of deleting
the original macro or string, but this is usually OK.
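
As a rough sketch of that idea (not a finished design), here is a string
stack built on .rn.  The names spush, spop and the _sstack. prefix are
invented here, the stack pointer is set up by hand, and all of the error
checking shown for the number stack is left out to keep it short.

```troff
.\" spush stackname stringname -- hypothetical string analogue of npush
.de spush
.rn \\$2 _sstack.v.\\$1.\\n[_sstack.i.\\$1]
.nr _sstack.i.\\$1 +1
..
.\" spop stackname stringname -- move the top entry back to stringname
.de spop
.nr _sstack.i.\\$1 -1
.\" remove any current definition first, so the rename cannot clash
.rm \\$2
.rn _sstack.v.\\$1.\\n[_sstack.i.\\$1] \\$2
..
.nr _sstack.i.tags 0
.ds Tag bullet
.spush tags Tag
.ds Tag dash
.tm inner tag: \*[Tag]
.spop tags Tag
.tm outer tag: \*[Tag]
```

This should report `inner tag: dash' and then `outer tag: bullet'.  Between
the spush and the second .ds the string Tag does not exist at all, which is
the side-effect mentioned above.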

If you are using old troff, you have to use names like S0, S1... for the
items in the stack, and then you can only have 1 stack (perhaps N
for numbers and S for strings).  You could use S0 and N0 for the stack
pointers, to save on the name-space.


Let's go back to lists, then, and end with a properly working example.

	.nstack ListItems \" only do this once
	.nr Item -1 \" facilitate detection of no-lists-left

	.de L\" working version
	. npush ListItems \\g[Item]
	. af Item 0
	. npush ListItems \\n[Item]
	. ta \\n[ListIndent]u
	. in +\\n[ListIndent]u
	. nr Item 0
	. af Item \\$1 \" set the format to 1. or a. or A. or whatever
	..

	.de /L
	. br
	. in -\\n[ListIndent]u
	. npop ListItems Item \" restore old value
	. npop ListItems ItemFormat
	. af Item \\n[ItemFormat]
	. rr ItemFormat
	..

=========
Exercises

1.  The list example doesn't let you use bullets or other strings.
    Modify it to keep a stack of strings, and make a .Ls macro to start a
    list that uses strings.  One might say
	.Ls +
    to get a list with a + instead of a number.

2.  In a bullet list, there's no full stop (.) after the item tag
    *  modify the example to remember if dots are needed.

3.  Change the design so that you can say
	.L
    or
	.Ls
    to get default formats.

4.  Allow an array -- perhaps called ListTagFormats -- that specifies the
    format of each level of list.  For example, you could do
	.ds SListTagFormats.3 ".
    to use a dot as the default string tag for level 3 slists (.Ls), and
	.ds ListTagFormats.3 "A
    to use A, B, C... for level 3 lists (.L).

===============
Note on Design:

Names:

I've used names like _nstack.v.henry.12, where I could have used v12.
This is because troff only has global names;  when we add a list of
figures waiting to print at the top of the page, we don't want to have to
check which variables are in use.
So I give each separate `module' a prefix (like _nstack.).  These end in a
dot so there'd be no risk of conflict with a package using `_n' (say).

The leading _ is (as in the Unix C libraries) used to indicate names that
the user would not call or refer to directly.

The stack name supplied by the user is inserted into some of the generated
names (such as _nstack.v.henry.12), so I have to ensure that a stack named
henry2 with 2 elements can't interfere with element 22 in stack `henry'!
One will have
	_nstack.v.henry2.2
and the other
	_nstack.v.henry.22
Clearly if the dots weren't there there would be a problem.
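
The clash is easy to provoke.  In this sketch BadPut is a made-up macro
that builds register names with no separating dots, which is exactly what
the scheme above avoids:

```troff
.\" BadPut stack index value -- hypothetical: joins the name parts directly
.de BadPut
.nr v\\$1\\$2 \\$3
..
.BadPut henry 22 100
.BadPut henry2 2 200
.tm element 22 of henry is now \n[vhenry22]
```

Both calls write to the register vhenry22, so this should report
`element 22 of henry is now 200'.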

Stacks:

There are several other ways of doing this.

* You can include stacks in all the macros that use them.  This is the
  most machine-efficient, but is often not very human-efficient (i.e.
  hard to maintain).

* You can have ``polymorphic'' stacks.  These let you put anything on the
  stack -- for example
    push programmers string Name "Lee Barefoot
    push programmers number Age "26
    push programmers diversion description
  but the programming is more complex, and the benefit is very slight.


=======================================
Simple versions of Stacks for old troff

The following will work in most versions of troff.

At the very end is part of a sqtroff debugging trace, to help you follow
how it works.  This is *not* in the shar, so watch out.

If you use these macros for other things, change the #-signs in E! to
control-G, or use the version I might post later if I talk about error
handling.  You might find that \n(.F is always zero, if you have a very
old troff.  DWB nroff and troff use .F and c. as the current filename (!)
and line number, respectively.

Lee.

#! /bin/sh
# This is a shell archive.  Remove anything before this line, then unpack
# it by saving it into a file and typing "sh file".  If this archive is
# complete, you will see the following message at the end:
#		"End of shell archive."
# Contents:  old-stack
# Wrapped by lee@anduk.uucp on Wed May 17 19:27:26 1989
PATH=/bin:/usr/bin:/usr/lbin:/usr/ucb:$PATH; export PATH
if test -f old-stack -a "${1}" != "-c" ; then 
  echo shar: Will not over-write existing file \"old-stack\"
else
echo shar: Extracting \"old-stack\" \(3009 characters\)
sed "s/^X//" >old-stack <<'END_OF_old-stack'
X.\" I have tested this only with nroff, but it did work.  First time,
X.\" too, which maybe shows the value of starting with clear concepts
X.\" and a readable form (the sqtroff version) and converting...
X.\" Or maybe not :-)
X.\" Names used:
X.\" x0	(for any ASCII character d) holds the stack pointer for
X.\"	stack "x"
X.\" The values are in x1...x9, xa...xz, xA...xZ, so you have to
X.\" be very careful in your choice of X.
X.\" Also, string and register __ is temporary, and
X.\" macro _i is used.
X.
X.
X.de nS\" .nS X -- declare X as a numeric stack
X. if \\n(\\$10 .E! fatal "stack \\%$1 already in use!" nS
X. nr \\$10 1 \" stack defined but empty
X..
X.
X.\" n+ stackname value
X.\" put the given numeric value on the stack.  The value can be
X.\" any numeric expression, but it will have a 0 pre-pended so it
X.\" should start with a + or a - if it uses (parentheses).
X.\" Also, the expression must not contain \\ (or \\\\ if called from
X.\" a macro), as it has to be a complete single number when it's
X.\" put on the stack.
X.
X.de n+\" nPush value
X. if !\\n(\\$10 .E! fatal "stack \\$1 not declared with nS" n+
X. nr __ \\n(\\$10
X. _i __\" make an index
X. nr \\$1\\*(__ 0\\$2\" put the value on the stack
X. nr \\$10 +1
X. rr __
X. rm __
X..
X.
X.\" n- stackname register
X.\" take the top value from the named stack, and put into the
X.\" named register.  The stack must not be empty.
X.\" You can test \\n(X0 to see if stack X is empty (it will have
X.\" a value of 1 or less in this case).
X.\" retrieve a numeric value from the stack
X.de n- \" nPop value
X. if !\\n(\\$10 .E! fatal "nstack \\$1 not declared with nS" n-
X. if \\n(\\$10<2 .E! fatal "npop: nstack \\$1 is empty" n-
X. nr \\$10 -1
X. nr __ \\n(\\$10
X. _i __\" make __ be the appropriate index
X. nr \\$2 \\n(\\$1\\*(__ \" retrieve from the stack
X. rr __
X. rm __
X..
X.\" ._i xx
X.\" makes a string called "xx" be a single character, one of
X.\" 0-9, a-z, or A-Z, depending on the value of xx taken as a *number
X.\" register* (remember that number registers and strings inhabit
X.\" separate name-spaces)
X.\"
X.
X.de _i \" ._i 2-char-name -- make a 1-char index
X. af \\$1 0 \" default: use 0...9
X. ie \\n(\\$1>9 \{.\" use letters
X.  af \\$1 a \" a...z
X.  if \\n(\\$1>(9+26) .af \\$1 A\" use A...Z
X.  if \\n(\\$1>(9+26+26) .E! fatal "Stack overflow" _I
X. \}
X. ds \\$1 \\n(\\$1
X..
X.
X.\" error handling is covered properly later, here's a very simple one:
X.de E! \" E! severity message caller
X. ds __ tm
X. if #\\$1#fatal# .ds __ ab \" .ab prints a message and exits
X. \\*(__ \\n(.F: \\n(c.: \\$1: \\$3: \\$2
X..
X.
X.
X.\" A very simple example
X.nS @
X.n+ @ 12
X.n+ @ 7
X.nS +
X.n+ + 22
X.n+ + 23
X.n+ @ 32+12-5
X.n+ @ -99
X.n- @ a
X.tm pop 1 gives \na
X.n- @ a
X.tm pop 2 gives \na
X.n- @ a
X.tm pop 3 gives \na
X.n- @ a
X.tm pop 4 gives \na
X.n- + a
X.tm pop 5 gives \na
X.n- + a
X.tm pop 6 gives \na
X.\" some errors,  You'll have to try them one at a time!
X.\" .n- + a
X.\" .tm pop 7 gives \na (should be an error)
X.\".n- @ a \" another error
X.\".nS @ \" already declared
X.n+ A 45 \" no such stack
END_OF_old-stack
if test 3009 -ne `wc -c <old-stack`; then
    echo shar: \"old-stack\" unpacked with wrong size!
fi
# end of overwriting check
fi
echo shar: End of shell archive.
exit 0

Lee
Enjoy!

X ps 300 1 1				-- I'm using a PostScript device
% T *.STARTUP 21.0c 29.7c @ 21u 17u 56u 25u 21.0c 29.7c
Y P default A4 2480 3507 21 17 56 25 2480 3507	-- with A4 paper
% T *.ds .pageinfo "P default A4 2480 3507 21 17 56 25 2480 3507
% T *.ds .T "ps

% T *.de nS			-- define the macros...
% T *.de n+
% T *.de n- 
% T *.de _i 
% T *.de E! 

% T .nS (CALL) "@" 		-- now call them
% T .nS *.if 0  (FALSE)
% T .nS *.nr @0 1 
% T .nS (RETURN) 

[A couple of calls to .n+ left out here -- Liam]

% T .n+ (CALL) "@" "32+12-5" 	-- call npush
% T .n+ *.if !3  (FALSE)	-- check the stack is not undefined
% T .n+ *.nr __ 3		-- find the index character:
% T .n+ ._i (CALL) "__" 
% T .n+ ._i *.af __ 0			-- less than 10, so use a digit
% T .n+ ._i *.ie 3>9  (FALSE)
% T .n+ ._i *.ds __ 3			-- (3 in this case)
% T .n+ ._i (RETURN) 
% T .n+ *.nr @3 032+12-5	-- notice the leading zero
% T .n+ *.nr @0 +1		-- and increment the stack pointer
% T .n+ *.rr __			-- finally, cleanup.
% T .n+ *.rm __
% T .n+ (RETURN) 

% T .n+ (CALL) "@" "-99" 
% T .n+ *.if !4  (FALSE)
% T .n+ *.nr __ 4
% T .n+ ._i (CALL) "__" 
% T .n+ ._i *.af __ 0 
% T .n+ ._i *.ie 4>9  (FALSE)
% T .n+ ._i *.ds __ 4
% T .n+ ._i (RETURN) 
% T .n+ *.nr @4 0-99		-- this is why we need the leading 0
% T .n+ *.nr @0 +1
% T .n+ *.rr __
% T .n+ *.rm __
% T .n+ (RETURN) 

% T .n- (CALL) "@" "a" 
% T .n- *.if !5  (FALSE)
% T .n- *.if 5<2  (FALSE)
% T .n- *.nr @0 -1
% T .n- *.nr __ 4
% T .n- ._i (CALL) "__" 
% T .n- ._i *.af __ 0 
% T .n- ._i *.ie 4>9  (FALSE)
% T .n- ._i *.ds __ 4
% T .n- ._i (RETURN) 
% T .n- *.nr a -99 
% T .n- *.rr __
% T .n- *.rm __
% T .n- (RETURN) 
% T *.tm pop 1 gives -99
pop 1 gives -99

% T .n- (CALL) "@" "a" 
% T .n- *.if !4  (FALSE)
% T .n- *.if 4<2  (FALSE)
% T .n- *.nr @0 -1
% T .n- *.nr __ 3
% T .n- ._i (CALL) "__" 
% T .n- ._i *.af __ 0 
% T .n- ._i *.ie 3>9  (FALSE)
% T .n- ._i *.ds __ 3
% T .n- ._i (RETURN) 
% T .n- *.nr a 39 
% T .n- *.rr __
% T .n- *.rm __
% T .n- (RETURN) 

% T *.tm pop 2 gives 39

[I deleted a lot of trace output here -- Liam]
% T *.tm pop 6 gives 22
pop 6 gives 22

Finally, here's an error:
% T .n+ (CALL) "A" "45" 
% T .n+ *.if !0  (TRUE)
% T .n+ .E! (CALL) "fatal" "stack A not declared with nS" "n+" 
% T .n+ .E! *.ds __ tm
% T .n+ .E! *.if #fatal#fatal# (TRUE)
% T .n+ .E! *.ds __ ab 
% T .n+ .E! *.ab  old-stack: 104: fatal: n+: stack A not declared with nS
old-stack: 104: fatal: n+: stack A not declared with nS

End of Part 2.

Key Concepts:
	Building up names with expressions
	Arrays
	Stacks
	Making bullet-lists
Secondary Concepts:
	tabs and indenting

Next Issue (if wanted):
	page layout
	    fonts & sizes
	    margins
	    lining things up
	page traps
	    page breaks
	    first margin-note example
and (if there's room)
	callout lists (which are a more general kind of trap)

-- 
Lee Russell Quin, Unixsys UK Ltd, The Genesis Centre, Birchwood,
Warrington, ENGLAND, WA3 7BH; Tel. +44 925 828181, Fax +44 925 827834
	lee%anduk.uucp@ai.toronto.edu;  {utzoo,uunet}!utai!anduk!lee
UK:	uu.warwick.ac.uk!anduk.co.uk!lee

scott@dtscp1.UUCP (Scott Barman) (05/25/89)

I apologize for posting this publicly, but my mail reply got bounced
back.  I also apologize if this has been asked before (I've been away
for a while).

In article <19@nx32s.anduk.co.uk> lee@anduk.co.uk (Liam R. Quin) writes:
>Intended Audience: anyone who writes troff macros, or who wants to
>learn how to do so.  A very basic understanding of troff is assumed;
>you could probably get that from reading part 1.

Could you clarify one thing for me:
	You mention "old" troff and "new" troff (also mixing in sqtroff
as a synonym for new troff).  By old troff do you mean the AT&T distributed
version whether or not it is for the C/A/T-4 or device independent and new
troff being the commercially available product "sqtroff"?  I have worked
extensively with PWB 1.0 and DWB 2.0 but have never seen this "new" stuff
and I am wondering where it came from.

-- 
scott barman
{gatech, emory}!dtscp1!scott