[comp.text] troff and eof

msb@sq.com (Mark Brader) (03/09/89)

Detailed troff questions are both a UNIX topic and a text processing topic,
so I've crossposted this article to comp.text, and I've directed followups
there as well.

This was said in comp.unix.questions:

> > So, I use .em to set up an end-of-input trap ...
> > However, I want that text on a new page.
> > So, I add a .bp request before the text.  Unfortunately, this does not
> > work.  n/troff (and ditroff) exit *within* the .bp call

> It's worse than that.  Try closing and ejecting a diversion with .em.
> When *roff is finished reading its input files, it is determined that
> there shall be no new pages of output.  It would take major surgery
> on *roff to change this behavior.

As a person who has done major surgery on troff in general and on
this area in particular, I can say with some authority that this is wrong.
There are other problems with shutdown behavior, but this isn't one of them.

The way it actually works is that troff (including nroff, ditroff, and our
ditroff-based version sqtroff) is determined that there shall be no new
*unnecessary* pages of output.  The way that it determines necessity,
though, is (to put it politely) eccentric.

Troff thinks that new pages are necessary if and only if the buffers that
it uses for processing fill-mode paragraphs are NOT empty when it gets to
(ready now?) the end of the output page that it was processing when the
end of input occurred.  So, if the end-macro from .em causes output that
ought to be on the next page or may last for more than a page, you must
make sure that those buffers are not empty at that time.

The simplest case is where you have 'bp right inside the end-macro and
there's no bottom-of-page trap.  Then all you have to do is precede the
'bp request with the line:

	\c

(without the tab at the beginning; that was just indentation for this
message).
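
Put together, the simplest case might look something like the sketch below.
The macro name EM and the closing text are made up for illustration, I am
assuming that no bottom-of-page trap is set, and I have not run this exact
input through every troff variant:

	.de EM
	\c
	'bp
	This text comes out on a page of its own.
	..
	.em EM

The \c has to come before the 'bp, so that the null partial word is already
in place when troff gets to the end of the page it was working on.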

This much is described in the V7 UNIX documentation, though omitted from some
later versions.  It's in the short "Tutorial Examples" section that falls
between the Nroff/Troff manual proper and the separate "Troff Tutorial",
specifically in section T6.  As that writeup expresses it, the \c serves to
"deposit a null partial word"; i.e., in effect it sets the flag that says
the buffers are nonempty without actually putting anything in there.

(Digression.)

The usual use of \c, for those who don't know, is for constructions like

	This is
	.ul
	un\c
	usual input.

which is another way of saying

	This is \fIun\fPusual input.

The \c, which is called "interrupt text processing", serves to indicate
that the end of the input line does not mean the end of the word (here,
the word "unusual").  It works in nofill as well as fill mode.  It is not
the same as the backslash-newline escape sequence, which is simply swallowed
altogether; if the c were omitted from the .ul example above (leaving a bare
backslash at the end of the "un" line), the input would instead be
equivalent to

	This is \fIunusual input.\fP
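
For the record, the backslash-newline input I mean here is just the first
example with the c deleted:

	This is
	.ul
	un\
	usual input.

Since .ul counts input lines, and the swallowed newline turns "unusual input."
into a single input line, the whole of that line gets the underlining
(italics, in troff).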

(End of digression.)

A more difficult case is that where the end-macro does not itself contain
a 'bp, but causes one to occur indirectly -- e.g. by reading back a diversion,
leading to the bottom-of-page trap being sprung, and this trap then does
something like

	.ev 1
	.sp
	.tl ''\fBpage %\fP''
	.bp
	.ev

or perhaps some even more complicated stuff with footnotes.  Here, the \c
line must be put inside the bottom-of-page trap, in this case by changing
the .bp request to:

	\c
	'bp

If there was no bottom-of-page trap (you don't like bottom margins?!)
you'd have to create one.  Note that the \c line is generally ONLY wanted
on the last page; you might want to set a flag, say a number register
called LP, in the end-macro and do something like

	.if \n(LP \c
	.nr LP 0

in place of the simple \c above.  Yes, the \c works on the .if line like that.
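
As a rough sketch of how the pieces might fit together: the macro names EM
and BT, the diversion name XX, and the trap position -3 are all assumptions
made up for this illustration, not anything from the original question, and
the footer is just the one shown above.  The end-macro sets the flag and
reads back the diversion; the bottom-of-page trap deposits the null partial
word only while the flag is set, and then clears it.

	.de EM
	.nr LP 1
	.nf
	.XX
	..
	.em EM
	.de BT
	.ev 1
	.sp
	.tl ''\fBpage %\fP''
	.if \n(LP \c
	.nr LP 0
	'bp
	.ev
	..
	.wh -3 BT

On ordinary pages LP is still zero (an unset number register reads as 0), so
the .if line does nothing and the trap behaves as it did before.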

I won't deal here with still more complicated situations such as those that
might arise if the end-macro itself directly or indirectly causes another
.em request!  Yes, there are legitimate uses for this.

The empty-buffer test is only ever done ONCE.  After troff finds the buffers
not to be empty, it continues processing input (from the end-macro and any
traps it causes to be invoked) until there is no more.  Then it finishes
the output page it's then producing (possibly springing more traps along
the way), and shuts down no matter what.

(Final digression.)

The rationale behind this whole empty-buffer business, by the way, is that
it makes possible a simple technique of having one macro serve as both top
and bottom of page trap.  For instance, suppose you say:

	.wh -3 BT
	.de BT
	'sp
	.tl ''- % -''
	'bp
	'sp 3
	..

That's about as simple as you can get.  But look what would happen if
troff didn't terminate in the circumstances I've described.  Each time
the trap is sprung, it causes something (3 lines of space) to be put onto
a new page.  If the input is exhausted, troff will then finish the page.
But on the way to doing that, it springs the trap.  Which puts 3 lines of
space onto a new page.  Infinite loop.

(End digression.)

Mark Brader			 "... one of the main causes of the fall of the
SoftQuad Inc., Toronto		  Roman Empire was that, lacking zero, they had
utzoo!sq!msb, msb@sq.com	   no way to indicate successful termination of
416-963-8337, in US 800-387-2777    their C programs."		-- Robert Firth