[net.unix-wizards] Broff and a proposed net project

budd@arizona.UUCP (07/13/83)

        The design of broff, the successor to [nt]roff, is proceeding at
a fast clip.  I have received several comments on my earlier note on
the design of the new system, and these are summarized at the end of this
submission.  One idea that was proposed by several people, and which seems
intriguing, is to have the software produced as a net project.  The purpose of
the first part of this note is to describe the idea of the net project and
to request volunteers.

        The idea of a net project is as follows.  My job is to act as moderator
("chief programmer" for you IBM types).  I have already drawn up a high level
design on the system (and in fact have coded a good part of it), but various
important subtasks have been left incomplete.  These subtasks are to be
distributed to various interested parties to complete, perhaps even giving
some tasks to more than one party to work on.  My job as moderator is to
collect these parts, choosing or combining different solutions to the same
problem, producing the final product.

Some problems I can see.

1.      Communication.  Suppose way down in the lowest level of some
task (say font manipulation) someone discovers a serious flaw in the design
that has widespread ramifications.  How do you communicate this to all
the other people working on various aspects of the project?  (I guess that is
part of my job as moderator.)

2.      The myth of egoless programming.  I think redundancy is a good idea.
Since I have no actual control over the people producing the code (i.e., their
jobs are not on the line) I cannot control deadlines or even ensure that
a commitment to produce some part of the project will be satisfied.
Furthermore, since I'm not hiring these people (nor even meeting them face
to face) I have no way of evaluating programming skills or experience.
Thus having multiple parties working on the same project maximizes the
chance that someone will come through with a good version of the desired
code.  Nevertheless, suppose that three different groups are working on
some project.  They all spend a considerable amount of time coming up with
a solution.  They all submit solutions, and my job as moderator is to choose
the best.  It will be difficult not to have two groups quite angry with me.


        I'm sure there would be other problems with this idea, and welcome
any communication on this topic.  Nevertheless, I'm interested enough in
the idea to go forward.

        I have implemented a general framework for the new system, for now
called lroff (for "little roff", later will come broff).  The system is
crude in its capabilities but generally good, I believe, in its design.
At this point I can see the following areas requiring further work:

1.      Line and/or page layout.  This is a real biggie.  The troff
algorithm is well known to be bad.  Perhaps better algorithms, such as those
described by Knuth, can be incorporated into the system without changing the
user interface.

2.      Fonts, font description, virtual fonts, ligatures, special fonts.
This is closely related to 1, but in view of the size of 1 it would probably
be worthwhile to try to define a clean interface and divide the project up.
I have pondered the implementation of virtual fonts (see my earlier note
for a definition of this term) a little, and see some real difficulties in
realizing them.  I would like to see someone spend more time thinking about
this.

3.      Hyphenation algorithm.  Knuth gives a better (although more expensive)
algorithm than the one used by [nt]roff.  This is not as big a project as
1 or 2, but clearly separable.

4.      Device specifications, device independence, device drivers, etc.
I have pondered this a bit, and am not sure the direction taken by ditroff
was the right one, other than for expediency's sake.  There is a world of
difference between different devices you might like to drive, and I am
not sure any concise description would be adequate.  Perhaps a better
solution is to define a clean interface, and write short drivers for each
device you would like, and then have a troff shell script choose the program
to be run in any particular case.  In any event, this is an area deserving
some more thought.

5.      Macros.  I have written a short set of MS like macros.  One major
motivation for this project is to provide a set of tools sufficient to bring
the task of macro writing and reading within the grasp of the average
programmer.  (Anybody who has ever peeked inside the ms macro
package will know what I am talking about.)  The current package is very
simple; it could stand being expanded.

6.      Symbol table management.  This is a small task, but a necessary
one.  In lroff I just used linear table lookup for symbol table routines,
since I didn't want to spend any more time than necessary in writing
this section of code.  This should be changed.
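
As an illustration of item 6, the linear lookup could give way to a small
chained hash table.  What follows is only a sketch under assumptions: the
names sym_lookup and sym_define, the struct layout, and the table size are
all hypothetical and are not taken from lroff.

```c
#include <stdlib.h>
#include <string.h>

#define NBUCKETS 211            /* hypothetical table size; any prime will do */

struct sym {
    char *name;                 /* register, string, or macro name */
    char *value;                /* its current definition */
    struct sym *next;           /* next name hashing to the same bucket */
};

static struct sym *buckets[NBUCKETS];

/* Simple multiplicative string hash, reduced to a bucket index. */
static unsigned hash(const char *s)
{
    unsigned h = 0;

    while (*s)
        h = h * 31 + (unsigned char)*s++;
    return h % NBUCKETS;
}

/* Heap copy of s, or NULL if memory runs out. */
static char *copy_string(const char *s)
{
    char *t = malloc(strlen(s) + 1);

    if (t != NULL)
        strcpy(t, s);
    return t;
}

/* Return the entry for name, or NULL if it has never been defined. */
struct sym *sym_lookup(const char *name)
{
    struct sym *p;

    for (p = buckets[hash(name)]; p != NULL; p = p->next)
        if (strcmp(p->name, name) == 0)
            return p;
    return NULL;
}

/* Define or redefine name; returns NULL only if memory runs out. */
struct sym *sym_define(const char *name, const char *value)
{
    struct sym *p = sym_lookup(name);
    char *v = copy_string(value);
    unsigned h;

    if (v == NULL)
        return NULL;
    if (p != NULL) {            /* redefinition: swap in the new value */
        free(p->value);
        p->value = v;
        return p;
    }
    p = malloc(sizeof *p);
    if (p == NULL || (p->name = copy_string(name)) == NULL) {
        free(p);
        free(v);
        return NULL;
    }
    p->value = v;
    h = hash(name);
    p->next = buckets[h];       /* push onto the front of the chain */
    buckets[h] = p;
    return p;
}
```

Because names are ordinary C strings here, the same table would serve one
character, two character, and longer names alike.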



        If anyone is interested in participating in this (historic) project,
drop me a note indicating your area of interest and I will send you a copy
of the system as it stands now, plus more detailed comments on my ideas
for extensions in the particular direction indicated.

--tim budd

        {utah-cs, cornell, taklabs, purdue, ucbvax, kpno} ! arizona ! budd


Now, to summarize the comments I have received:

        The notion of a net project was first suggested by
Ian Darwin (utcsstat!ian).

        Almost everyone commented on the most obvious limitation of [nt]roff,
which is the ridiculous 1 or 2 character name restriction.  Unfortunately,
few people seemed to realise the implication of this change (which was the
reason for my not mentioning it in my first design document).  Consider the
common string OQ which prints an open quote mark.  This is frequently placed
immediately next to text, as in \*(OQfoo\*(CQ.  Unfortunately, I do think
there should be some attempt to be "somewhat" upward compatible, and thus
this is somewhat of a problem.  The best solution came from Mario Ruggiero,
also of Toronto (utcsstat!mario) who suggests

        \*A             for one character names
        \*(AB           for two character names
        \*{ABCDEFG}     for N character names.
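
A minimal sketch of how a reader of these three forms might pick the name
apart, assuming it is handed the text that follows the "\*".  The function
name parse_name and its calling convention are hypothetical, not part of
lroff.

```c
#include <stddef.h>
#include <string.h>

/*
 * Parse the name that follows "\*" in the proposed syntax:
 *     A            one character name
 *     (AB          two character name
 *     {ABCDEFG}    name of any length
 * Copies the name into buf (of size bufsize) and returns the number of
 * input characters consumed, or 0 if the input is malformed or the name
 * does not fit in buf.
 */
size_t parse_name(const char *s, char *buf, size_t bufsize)
{
    const char *end;
    size_t len;

    if (bufsize == 0 || s[0] == '\0')
        return 0;

    if (s[0] == '(') {                      /* \*(AB */
        if (s[1] == '\0' || s[2] == '\0' || bufsize < 3)
            return 0;
        buf[0] = s[1];
        buf[1] = s[2];
        buf[2] = '\0';
        return 3;
    }
    if (s[0] == '{') {                      /* \*{ABCDEFG} */
        end = strchr(s + 1, '}');
        if (end == NULL)
            return 0;                       /* no closing brace */
        len = (size_t)(end - (s + 1));
        if (len == 0 || len >= bufsize)
            return 0;
        memcpy(buf, s + 1, len);
        buf[len] = '\0';
        return len + 2;                     /* name plus both braces */
    }
    if (bufsize < 2)                        /* \*A */
        return 0;
    buf[0] = s[0];
    buf[1] = '\0';
    return 1;
}
```

With this scheme the text after \*(OQ in \*(OQfoo is left alone, since the
two character form still consumes exactly two name characters.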

        Bill Tuthill (ucbvax!G:tut) suggests \n be changed to \#, so that
\n could take on its more conventional "newline" meaning.  After just
woofing about compatibility, I agree.

        watmath!idallen suggests that predefined register names have more
meaningful names, for example \#(pl for page length instead of \#(.p.  I agree.

        Ian Utting (ukc!iau) suggests the line and page breaking algorithms
be rewritten a la Knuth.  I agree.  Can we make it a truly trans-atlantic
project, Ian?
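
To give a feel for what "a la Knuth" means for line breaking, here is a
much simplified sketch: it picks all the breaks of a paragraph together by
dynamic programming instead of filling each line greedily.  It is not
Knuth's algorithm itself (no stretchable glue, penalties, or hyphenation);
the cost function (squared trailing blanks on every line but the last), the
name break_lines, and the MAXWORDS limit are assumptions made for
illustration.

```c
#include <limits.h>

#define MAXWORDS 256            /* hypothetical limit for this sketch */

/*
 * Choose line breaks for n words (n <= MAXWORDS) of lengths len[0..n-1],
 * with one blank between words and at most width columns per line, so as
 * to minimize the sum of squared trailing blanks over all lines except
 * the last.  On return, for each word i that starts a line, brk[i] is the
 * index of the word starting the next line (n after the final line).
 * Returns the minimum cost, or -1 if some word is wider than the line.
 */
long break_lines(const int *len, int n, int width, int *brk)
{
    long cost[MAXWORDS + 1];    /* cost[i]: best cost of setting words i.. */
    long best, slack, c;
    int i, j, used;

    if (n < 0 || n > MAXWORDS)
        return -1;
    cost[n] = 0;
    for (i = n - 1; i >= 0; i--) {
        best = LONG_MAX;
        brk[i] = -1;
        used = -1;              /* columns taken by words i..j with blanks */
        for (j = i; j < n; j++) {
            used += len[j] + 1;
            if (used > width)
                break;          /* words i..j no longer fit on one line */
            if (cost[j + 1] == LONG_MAX)
                continue;       /* the rest cannot be set at all */
            slack = width - used;
            c = (j == n - 1) ? 0 : slack * slack + cost[j + 1];
            if (c < best) {
                best = c;
                brk[i] = j + 1;
            }
        }
        cost[i] = best;
    }
    return cost[0] == LONG_MAX ? -1 : cost[0];
}
```

For word lengths 3, 2, 2, 5 and a width of 6, a greedy fill puts the first
two words on one line and leaves the third alone on a nearly empty line;
the sketch instead breaks after the first word, which spreads the blank
space more evenly.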

        Rick Zaccone (psuvax!zaccone) suggests better debugging features.
I've added a "debug" command (.db) that can be followed by a single character
modifier to type out various information (registers, strings, diversion,
etc).  He also wanted to see a better method of controlling widows and orphans
(sounds rather Dickens-like, doesn't it).  This is probably tied up in a better
line and page breaking algorithm.

        Guy Harris (rlgvax!guy) suggests a driver be written to produce
the ANSI X3.64 terminal escape sequence.  Sounds good.  What's that?

        Kenneth Almquist (spanky!ka) suggested that macros be radically
redesigned to be more C like, including local variables, strings, etc.
This has apparently been done in at least one site (see below) but it is
not public.  An interesting idea, but I'm dubious that it would make
reading or writing macros any easier.

        Finally, I got a note describing a C program that recoded some of the
MS macros directly into C, and thus produces documents much faster than nroff.
This note was written a long time ago on a machine in a galaxy far far away ....
Unfortunately, I am not at liberty to describe how the note came into
my hands.

jaap@mcvax.UUCP (07/19/83)

Of course it is nice to do a project and communicate about it on the net.
It is one of the reasons why the unix network exists.

I don't consider a rewrite of *roff to be a serious project. (BROFF: a
Burlesque Rewrite Off a Famous Formatter?).

First, my boss doesn't pay me to have hobbies like this.

Second, it is a bad idea to make a look alike with more or less the same input
requirements. (By the way, it will take a two-second edit job to change
\n to \# and a little sed script for tbl, eqn, ideal and pic to let
them know about it.)
What is really needed is a new approach with new concepts for text processing.
I don't think that a text processing in a programming language flavour is a
solution. This will be just a tool made by programmers for programmers.
It would certainly not help our typist pool.
To have things done more interactively would be a better idea.

By the way, if you are continuing this project, keep the hyphenation algorithm
as clean as it is now. It's now trivial to put a different algorithm in, f.i.
for Dutch.

Final remark:
	\*(OQ doesn't address the character OQ but a macro defined by .ds OQ.

	Jaap "Not wanting to recode years 60's software in the 80's" Akkerhuis.
	{philabs,mcvax}!jaap.

guy@rlgvax.UUCP (07/19/83)

	What is really needed is a new approach with new concepts for
	text processing.  I don't think that a text processing in a
	programming language flavour is a solution.  This will be just
	a tool made by programmers for programmers.  It would certainly
	not help our typist pool.  To have things done more interactively
	would be a better idea.

Amen.  Admittedly, I'm prejudiced, having written most of an interactive
editor-formatter (i.e., a word processor) that runs under UNIX, but I find
that it's a lot nicer to work with than nroff/troff.  For one thing, there's
a lot less of the "edit-nroff-correct-nroff again-..." cycle.  For some
applications, like producing a book, a post-processor might be better (although
I don't know that it is); but even there, it's possible to do a lot better
than nroff/troff - for example, Knuth's TeX seems more powerful and seems to
have a less baroque and difficult language.  Our main use for nroff/troff here
is with documents that have already been written using it, and for UNIX manual
pages.  "broff" would be useful for programmers and other people working
at shops which make heavy use of "nroff" and don't want to convert to
something new, however.

	By the way, if you are continuing this project, keep the
	hyphenation algorithm as clean as it is now. It's now trivial
	to put a different algorithm in, f.i. for Dutch.

You might want to post something about how to do this, in case other people
have the same requirements.  If you know how the ".ht" (Hyphenation Threshold)
request works (from what I can see, there's some value assigned to each digram,
and only digrams with values at or above the threshold are hyphenated), you
might post that too...

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

jaap@mcvax.UUCP (07/25/83)

	- for example, Knuth's TeX seems more powerful and seems to
	have a less baroque and difficult language.

Maybe TeX is more powerful, but it is still a difficult language.
One should read some TUG-boat newsletter (TeX user group bulletin  of
advanced typesetting ?)  and see all the articles about how to do things,
how the Branching Mechanics really works, Basic Kludges etc.
The newsletter looks like an anti-propaganda bulletin...
Also the error messages are quite cryptic. The first one I got was:
"Whoaa, you can't do this before that." Hm, talking about user friendly
systems...  So far the "Joy of TeX".

	... "broff" would be useful for programmers and other people working
	at shops which make heavy use of "nroff" and don't want to convert ...

That's another reason for not changing \n to \#.
In general, keep it fully upward compatible with the existing implementations
of n/troff. Hardly anything is as frustrating as something that is
just about compatible.

	You might want to post something about hyphenation....

Just a short remark, from what I remember from two years ago, the last time
I looked at it.
The routine hyphen() in n8.c expects a pointer to a word. As far as I remember,
it returns a pointer to the same word, but with a bit set (0200) at each
character before which the hyphenation may take place. See also suftab.c.
It refuses to hyphenate words with numbers, local motions or other funny
non-ascii things inside. So it also refuses to hyphenate words with ligatures!
There is still an old plan to remove this last limitation, but the road to hell
is paved with good intentions, so...
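
The flagging convention described above can be sketched as follows.  This
is not the code from n8.c: the names mark_hyphens and show_hyphens are
hypothetical, and the decision about where a break is allowed is left to
the caller, who passes in the candidate positions.

```c
#include <ctype.h>
#include <string.h>

#define HYPHEN_BIT 0200         /* the flag bit mentioned above */

/* Return 1 if word is non-empty and consists only of ASCII letters. */
static int hyphenatable(const unsigned char *word)
{
    if (*word == '\0')
        return 0;
    for (; *word != '\0'; word++)
        if (*word >= 0200 || !isalpha(*word))
            return 0;
    return 1;
}

/*
 * Flag each character before which a break is allowed.  pos[] holds npos
 * character indexes; indexes outside 1..length-1 are ignored.  Returns
 * the number of points marked, or 0 if the word is refused because it
 * contains digits or other non-letters.
 */
int mark_hyphens(unsigned char *word, const int *pos, int npos)
{
    size_t len;
    int i, n = 0;

    if (!hyphenatable(word))
        return 0;
    len = strlen((const char *)word);
    for (i = 0; i < npos; i++) {
        if (pos[i] > 0 && (size_t)pos[i] < len
            && !(word[pos[i]] & HYPHEN_BIT)) {
            word[pos[i]] |= HYPHEN_BIT;
            n++;
        }
    }
    return n;
}

/*
 * Copy word into out with a '-' at each flagged point and the flag bits
 * stripped from the copy.  out needs room for twice the word length plus
 * one.  Returns out.
 */
char *show_hyphens(const unsigned char *word, char *out)
{
    char *p = out;

    for (; *word != '\0'; word++) {
        if (*word & HYPHEN_BIT)
            *p++ = '-';
        *p++ = (char)(*word & ~HYPHEN_BIT);
    }
    *p = '\0';
    return out;
}
```

Keeping the marks inside the word itself means no separate list of break
points has to travel with it, at the price of restricting words to 7-bit
characters.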

	If you know how the ".ht" (Hyphenation Threshold)
	request works (from what I can see, there's some value assigned
	to each digram, and only digrams with values at or above the
	threshold are hyphenated), you might post that too...

The .ht request is for debugging the hyphenation algorithm. Digram values are
calculated by looking in the hytab.c values. There are tables that look like
char bxh[26][13]{
	0000, 0050, .... etc.
I guess these values give the chance that hyphenating between `b' and `h'
for certain characters might be correct.
So to hyphenate f.i. frysk, you have to adjust these values accordingly, and
maybe change the table(-names) to f.i. char fxr[26][8].
For Dutch we just put a completely different algorithm in, which is smaller,
faster, and needs fewer tables. (It's a spin-off from an old research project,
which dealt with word frequencies in newspapers.)
So we have nroff and nlroff (for Dutch hyphenation). I planned (again) a new
request .la <one or two charname> which will allow dynamic change of the
hyphenation algorithm.
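
A rough sketch of the threshold idea as guessed at in the quoted question:
every pair of adjacent letters has a value, and a break is allowed only
where that value reaches the threshold.  The single square table passed in
by the caller is an assumption made to keep the example short; it is not
the layout of hytab.c, and the name digram_breaks is hypothetical.

```c
#include <ctype.h>

/*
 * Collect the positions in word at which the digram value reaches the
 * threshold.  value[a][b] is the caller's score for breaking between the
 * letters a and b (0 = 'a' ... 25 = 'z').  Stores at most maxpos indexes
 * in pos[] and returns how many were stored; an index i means a break is
 * allowed before word[i].
 */
int digram_breaks(const char *word, unsigned char value[26][26],
                  int threshold, int *pos, int maxpos)
{
    unsigned char a, b;
    int i, n = 0;

    if (word[0] == '\0')
        return 0;
    for (i = 1; word[i] != '\0' && n < maxpos; i++) {
        a = (unsigned char)word[i - 1];
        b = (unsigned char)word[i];
        if (!islower(a) || !islower(b))
            continue;           /* only plain lower-case letters score */
        if (value[a - 'a'][b - 'a'] >= threshold)
            pos[n++] = i;
    }
    return n;
}
```

Raising the threshold makes hyphenation more conservative, which fits the
description of .ht as an aid for debugging the algorithm.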

Hope this information is useful for someone out there.

	Jaap Akkerhuis
	Mathematisch Centrum (soon to be Centrum voor Wiskunde en Informatica)
	Amsterdam

PS
An apology.
I know this article should have gone in net.text, but we don't get it (yet).

barmar@mit-eddie.UUCP (07/27/83)

Since you have indicated that you agree that there are problems
with *roff-style formatters, why are you designing another one?
There is currently an ISO Draft-Proposed Standard for a Text
Processing Programming Language and for a Text Processing
Markup Language.  Perhaps you should look into this, rather
than redoing a twenty-year-old formatter.  The TPPL is a highly
structured language, and the markup language takes all the good
ideas from Scribe and then does it right.
-- 
			Barry Margolin
			ARPA: barmar@MIT-Multics
			UUCP: ..!genrad!mit-eddie!barmar

mel@houxm.UUCP (07/28/83)

Barry,
  About that ISO Draft-Proposed Standard for a Text Processing Programming
Language and Markup Language -- how do we find out about it ???  I did a bit
of research in our library and couldn't find any information about it.  They
had ISO Standards, but not Draft Standards listed.
  What better place than Usenet to air a new proposed standard ?
So please tell us about it.  Where do we get a copy of the
proposal ?  Can you post it ? or an abstract, at least ?  Are there any 
articles published on it ?  Who are the authors ?  Is an implementation in
the works ?  who ? when ? where ? on what ?
  A smug "I know more than you do" note may be great for the ego, but a note
with information will win friends and further the cause.  Please post more
information.  If this becomes a popular issue, perhaps we should form a new
newsgroup (net.docp ?) for document preparation discussions.   Thanks.
  Mel Haas  ,  houxm!mel

zrm@mit-eddie.UUCP (Zigurd R. Mednieks) (07/30/83)

Layout (formatting), and text entry have traditionally been separated.
While not all traditions are worth carrying forward, there usually is
some reason for their existence in the first place: Text entry must be
highly interactive and responsive, while text formatting must be very
powerful and flexible. The computer power it takes to provide the
flexibility and power for formatting is much greater than that required
by, say, emacs. Due to this imbalance, putting formatting functions in
an editor frequently causes the editor to become unacceptably slow. 
Also, formatting and text entry have two separate goals. Text entry
provides content, formatting provides form.

Most of the people involved in this discussion are familiar with
powerful text editors, but many of us are quite unaware of how much
effort goes into laying out a book or newspaper with a computerized
layout system. Further, computer layout systems are undergoing a
revolution in functionality and expressive power. And we are just
beginning to understand how to represent the graphic information that is
typically found on the printed page.

For these reasons I don't expect to see an editor that combines layout
functions with text entry. An interactive layout program, even a slow
one, that allows graphics to be merged with text, pictures to be
halftone-screened, and page layout to be easily manipulated would be
well worth hacking.

Cheers,
Zig

barmar@mit-eddie.UUCP (Barry Margolin) (07/30/83)

I have been bombarded with so many requests for information on the ISO
Text Processing Programming Language & Text Processing Markup Language,
that I am going to just post what I know to the net.

The order number is: ISO TC97/SC5/EG CLPT-X3J6 N177

I don't know if this document is actually publicly available yet; I got
it from a colleague, who may even be on the subcommittee.  It is too
large for me to make copies of (at least, not enough to make copies for
everyone who asked).  If it is available, then I believe that one
should order it from the National Technical Information Service (NTIS)
in Wash., D.C.
-- 
			Barry Margolin
			ARPA: barmar@MIT-Multics
			UUCP: ..!genrad!mit-eddie!barmar

guy@rlgvax.UUCP (Guy Harris) (08/01/83)

	Layout (formatting), and text entry have traditionally been separated.
	While not all traditions are worth carrying forward, there usually is
	some reason for their existence in the first place: Text entry must be
	highly interactive and responsive, while text formatting must be very
	powerful and flexible. The computer power it takes to provide the
	flexibility and power for formatting is much greater than that required
	by, say, emacs. Due to this imbalance, putting formatting functions in
	an editor frequently causes the editor to become unacceptably slow.
	Also, formatting and text entry have two separate goals. Text entry
	provides content, formatting provides form.

Well, for fancy layout problems, like books, you might be right.  For less
demanding tasks, like letters, memos, technical documentation, etc. - well,
let's just say Wang probably sells more of their word processing equipment
(with an editor/formatter) in a month than text-editor-plus-nroff systems
are sold as word processing systems in a year.

Also, the claim that two functions are "logically separate" may not mean that
they should be done by two separate programs.  How many times have we all
gone through a cycle of edit-nroff-proofread-edit-nroff-..., not to correct
typos, spelling errors, and other semantic errors but just to correct the FORM
of the underlying document?  In addition, dealing with a separate editor and
formatter involves a process of abstraction that can make the work more
difficult and which doesn't necessarily serve any purpose.

But I think the prime argument is still the "fifty million Frenchmen can't
be wrong" one.  Yes, there are people who argue that tube amplifiers are better
than transistor ones, but given the sales figures I think the battle has been
won.  The benefits conveyed by transistor amplifiers outweigh those conveyed
by tube amplifiers for the vast majority of buyers.  The same is true for
editor/formatters and separate editor/formatter packages; all the commercial
text editing systems sold for office use, except for a specialized few, are
editor/formatters.  The Seybold Report on Office Systems has repeatedly
criticized systems with separate editors and formatters, and said that those
systems would never catch on; they were and are right.  I can do just about
anything with our editor/formatter that you can with "nroff", and more easily
and better to boot.

As for the claim about CPU resources used by an editor/formatter and "emacs";
I'd have to see "emacs" in action before I believed that.  I've been told that
"emacs" requires more CPU resources than "vi", and "vi" has on occasion looked
competitive with our editor/formatter.  It's all in how you implement it.
There are several commercial word processing systems that use Z80s as their
main CPU, so obviously it's not an impossible task to make it efficient.

	Most of the people involved in this discussion are familiar with
	powerful text editors, but many of us are quite unaware of how much
	effort goes into laying out a book or newspaper with a computerized
	layout system. Further, computer layout systems are undergoing a
	revolution in functionality and expressive power. And we are just
	beginning to understand how to represent the graphic information that is
	typically found on the printed page.

	For these reasons I don't expect to see an editor that combines layout
	functions with text entry. An interactive layout program, even a slow
	one, that allows graphics to be merged with text, pictures to be
	halftone-screened, and page layout to be easily manipulated would be
	well worth hacking.

Well, go down to your nearest Xerox Office Systems Division sales office and
ask to see a demo of the "8010 Information System".  You may know of it as
the "Star".  Yes, it's slow, BUT - it includes an interactive editor/formatter
which understands:

	multiple fonts
	multiple type sizes
	graphics within text
	multi-column text
	page layouts
	mathematical formulae
	tables

and does much, although not all, of the formatting as you type.  I agree that
dynamic as-you-type PAGE composition is probably very difficult and may not
be desirable to implement.  And if you want a paragraph formatting algorithm
like Knuth's, where a small change to one line can affect lines above it as
well as below it, you may not want to implement it within an editor/formatter
(or not be able to).  And it may not be possible to do book, newspaper, or
magazine layout with such a system.  HOWEVER, for most relatively simple layout
problems, such as one-column text, two-column text of the type done by the
"-ms" macro package, and the other sorts of things asked for by letters,
technical documentation, technical papers, reports, memoranda, and the like,
the battle has already been fought and won - by the editor/formatters.  In
those cases, there is no good case for separate editors and formatters.  We
know enough now about how to do editor/formatters that the benefits of
separating the editor and formatter have disappeared.

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

zrm@mit-eddie.UUCP (Zigurd R. Mednieks) (08/02/83)

It's quite true that most business word processing consists of letter
writing, where what-you-see-is-what-you-get is exactly the right thing.
Since only one or two pages are being dealt with, you don't run into
re-pagination lossage. Also, the way a business letter looks is
important, and if the letter's content is, say, just a "thank-you", the
appearance might be more important than the content.

Wang has, therefore, done well with an editor that shows you pretty much
what you would see on paper. But you pay a price: The text files contain
formatting information, so they can't be easily viewed without the text
processor.  On most Unix systems, mimicking the Wang display style would
mean transmitting a dozen or more characters for each character typed!
Yow, Vaxen eaten alive by ravenous DZs! Of course this matters less, if
at all, on personal workstation machines.

But, if you strip off some of the features, you can build a good
mixed-mode text processing system that does most things as you type, and
time consuming things later. A recent Seybold Report had good things to
say about one Wangish text processor by NBI, but it requires hardware
support (for the IBM PC), and one mixed-mode system, by Mark of the
Unicorn. (And if you think that's an odd name, NBI stands for Nothing
But Initials.) And Wang has their own re-implementation of their text
processor running on their PC.

If you say emacs is expensive, you better say which one. MINCE (stands
for MINCE Is Not Complete Emacs) is probably the fastest visual editor
available for Unix. (Venturcom sells it.) The author did not call it
emacs because emacs ("emacs" comes from Editor MACroS) means extensible,
and while you can compile new functions into MINCE quite cleanly, you
don't have an implementation language like Mock-Lisp or TECO(!!!) that
lets you add things on the fly. This means that a "real" emacs has to
carry around the overhead of any interpretive system.

Parenthetically, here is an incomplete list of emacses and near-emacses:
The original emacs, written in TECO by Richard Stallman at MIT is
available free (if you've got a PDP10 to run it). Multics Emacs, written
in Multics MacLisp, by Bernie Greenberg, is available from Honeywell.
Gosling's Emacs, written by James Gosling at CMU, runs on Unixes,
available from Unipress. MINCE, written by Craig Finseth, in C, running
on everything from CP/M systems on up, available for Unix through
Venturcom, and for all else through Mark of the Unicorn. Perfect Writer
is an early version of MINCE, available through Perfect Software.
Another emacs has just recently been written in Scheme for the HP200 at
MIT by persons unknown to me. NILE (NILE Is Like Emacs) comes with the
NIL language for VMS, written by Richard Soley and others. TV is an
emacs for the Perkin-Elmer 32-bit machines running under a Multics-like
operating system, written by Ted Anderson. Lastly, and mostly, is
ZEMACS, written in Zetalisp and running on the Lisp Machine, available
from Symbolics and Lisp Machine Inc, and including more features than I
have ever seen in any five systems of any sort, but NO WHAT-YOU-SEE... 
Whew! But I'm sure there are more. 

Now which emacs do you mean?

But back to "Unix meets Wangwriter." Another reason Wang-like text
processing does not fit in well with Unix is that almost every file that
contains data on Unix is human readable. So unless you opt for complete
integration -- like in the Lisa or Star -- awk, sed, diction, spell,
etc. are going to have as hard a time dealing with your business
letters as you would without the editor.

What to do? One possibility would be to keep formatting information in a
separate companion file. If the free text gets bashed, then the editor
tries its best to figure out how it was bashed (the companion file could
be a superset of the free text, including its contents plus formatting
info) and rebuilds the companion file. But one could speculate forever.
I, for one, have so far resisted the urge to write The Great American
Text Editor.

Cheers,
Zig

guy@rlgvax.UUCP (08/02/83)

	Wang has, therefore, done well with an editor that shows you pretty
	much what you would see on paper. But you pay a price: The text files
	contain formatting information, so they can't be easily viewed without
	the text processor.  On most Unix systems, mimicking the Wang display
	style would mean transmitting a dozen or more characters for each
	character typed! Yow, Vaxen eaten alive by ravenous DZs! Of course
	this matters less, if at all, on personal workstation machines.

Yes, our system does require you to use either the word processing editor or
print program to view a document.  This isn't too much of a hardship.  (Also,
any ordinary UNIX text file can be used with the word processor, but the NL
at the end of every line is treated as a hard return.)  We used to display
a column counter as you typed, but don't anymore, to cut down on that
transmission overhead (also, we use DHs, not DZs, for our users).

	But, if you strip off some of the features, you can build a good
	mixed-mode text processing system that does most things as you type,
	and time consuming things later. A recent Seybold Report had good
	things to say about one Wangish text processor by NBI, but it requires
	hardware support (for the IBM PC), and one mixed-mode system, by Mark
	of the Unicorn. (And if you think that's an odd name, NBI stands for
	Nothing But Initials.) And Wang has their own re-implementation of
	their text processor running on their PC.

The reason the NBI (which originally stood for something *really weird* until
the inventor of the name left NBI and took the rights to the name with him;
they then retroactively changed it to Nothing But Initials) word processor
requires hardware support is that the NBI 3000 uses a Motorola 6800, the
software is written in assembly language, and they figured it was easier and
quicker to put in a 6800 card rather than rewriting it in 808* assembler or
a high-level language.  Microsoft has an editor/formatter for the PC (Multi-Tool
Word) which uses the bitmap to display characters; it may be slow, but if you
wanted to use the character generator you could do a full editor/formatter
without any special hardware.

	If you say emacs is expensive, you better say which one.

The EMACS I was told was a hog was one of the common UNIX EMACSs, probably
not MINCE (and the reason it's not as fast as MINCE is probably that it
does support MockLisp.)

	But back to "Unix meets Wangwriter." Another reason Wang-like text
	processing does not fit in well with Unix is that almost every file that
	contains data on Unix is human readable. So unless you opt for complete
	integration -- like in the Lisa or Star -- awk, sed, diction, spell,
	etc. are going to have as hard a time dealing with your business
	letters as you would without the editor.

A simple solution, which we have adopted (although we haven't made the program
an "official" part of the system yet), is to have the moral equivalent of
"deroff" (yes, "nroff" sometimes suffers from the same problem) which works
on our word processor's document.  When you say UNIX, by the way, perhaps you
should say "the UNIX text processing utilities".  A UNIX system being used
for laboratory instrument control may not spend very much of its time piping
text processing utilities together....  And we *did* opt for such integration
("awk" isn't too happy working on free text anyway, as it's really oriented
towards records with fields); we have a spelling checker based on Proximity
Devices' lexicon and code that *INTERACTIVELY* scans through the document,
finds words not in the lexicon (it's not a hashed spelling checker, so the only
way you get a false hit or miss is if the lexicon itself is bad), and when it
encounters one it stops, highlights the word FROM WITHIN THE EDITOR, and
permits you to correct it interactively.  You can even see a selection from
the lexicon if you aren't sure of the correct spelling.  MUCH nicer than running
a batch program, getting a list of words not in the dictionary, and having to
go through the document and find and correct them yourself.  I have run our
documents through "diction" (and found it to be a bigger pain in the *ss than
"lint", in terms of complaining about things that you really can't fix) and
"style", using our "deroff"-style program.  Next to NO work needed to be done
to do that, once the program was written; just tweak the "style" shell file
slightly.

The statement that "Wang-style text processing doesn't fit in well with UNIX"
is simply false.  I don't like to see certain applications deemed "unfit for
UNIX" because they don't fit the common model of text processing filters
piped together.  UNIX is fit to run any application that isn't grossly
inconvenient or inefficient to implement under it, regardless of whether it
accesses text files sequentially or not.  (Note that both Bell and
Berkeley have started working on putting IPC facilities - shared memory,
semaphores, message mechanisms - into UNIX, precisely because you can open up
UNIX's applicability greatly without doing violence to the system by adding
such facilities.)

I remember somebody claiming that Fortran didn't belong under UNIX
because programs written in Fortran didn't fit the "UNIX style".  What this
would mean in practice is "Well, you unwashed Fortran programmers don't
deserve to have a hierarchical directory structure to organize your files,
or UNIX's facilities to edit, compile, and debug your code, or... - sorry,
you'll have to use RSX-11M."  We used UNIX at Harvard College Observatory mostly
as a base for running scientific applications written in Fortran.  It is a LOT
more "user-friendly" than RSX-11M.  "What do you mean I don't just type ^C to
abort my program, I then have to type 'ABO' if I ran it with 'RUN' and
'ABO ...PIP' if I ran it with 'PIP'?  For that matter, what do you mean I
have to say 'PIP FILE2.FOR=FILE1.FOR' instead of 'cp file1.for file2.for'?"
(Anybody who thinks "cp" is cryptic and doesn't suggest copying, while "PIP"
is clear and obviously suggests copying, raise your hand.)

I just don't buy the notion that UNIX should be a narrowly targeted system.  If
that approach is taken, a small group of people will love it and the vast bulk
of the world will never bother with it.  And I especially don't think that the
model of building an application by joining filters together should be taken as
the way to do *everything*.  This model fits several applications very poorly,
and text editing/formatting (as opposed to the sort of text processing that
some of the Writer's Workbench software does) is one of them.  Given the choice
between the "conceptual cleanliness" of an editor and a formatter, and the
"user-friendliness" of getting immediate, rather than delayed, feedback, I'll
take the latter every time.  The technique of organizing an application as
filters is a tool like any other tool, to be used only when it is better than
other tools.  Don't use a saber saw to slice bread.

The Seybold Report, by the way, has a strong bias in favor of "what-you-see-
is-what-you-get" systems.  They even prefer to show margin justification on
the screen, which we and Syntrex do (Syntrex's Aquarius is another system done
by a bunch of UNIX hackers, by the way).  For that matter, another UNIX system
with a full editor/formatter is Fortune, with their Wang-clone, and another is
the new NBI 68010 workstation.  As I've said, the battle between editor/
formatters and separate editor/formatter combinations has already been fought
and won by the former; you're fighting a rear-guard action.  For complicated
layouts, the separate editor and formatter may still win, although it is not
inconceivable that faster hardware or cleverer designs may make an interactive
editor/formatter possible even for them.

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

cak@purdue.ARPA (08/02/83)

From:  Christopher A Kent <cak@purdue.ARPA>

Left unsaid in all this discussion (so far) is the fact that macros
are wonderful (all those who disagree may stop reading here). The
standard example is "I have this paper and I want to submit it
to CACM and IEEE Spectrum but they have different formats and I
don't want to type it twice". (Or maybe that should be CACM and Byte.)
With troff, you write the thing using macros, and run it through troff
twice, once with one set of definitions, once with the other.
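
The idea can be sketched in a few lines, under loud assumptions: this is not
troff, the request names .TITLE and .PARA are invented for illustration, and
STYLE_A and STYLE_B are hypothetical stand-ins for two journals' macro
packages.  The point is only that one tagged source renders two ways when the
definitions are swapped.

```python
# Two hypothetical definition sets, standing in for two macro packages.
STYLE_A = {
    "TITLE": lambda s: s.upper(),
    "PARA": lambda s: "    " + s,
}
STYLE_B = {
    "TITLE": lambda s: "== " + s + " ==",
    "PARA": lambda s: s,
}

def render(source, style):
    """Expand lines of the form '.NAME argument' using the given definitions.

    Lines that do not start with '.' are passed through unchanged.
    """
    out = []
    for line in source.splitlines():
        if line.startswith("."):
            name, _, arg = line[1:].partition(" ")
            out.append(style[name](arg))
        else:
            out.append(line)
    return "\n".join(out)

SOURCE = ".TITLE A Note on Formatting\n.PARA Macros separate markup from layout."

print(render(SOURCE, STYLE_A))
print(render(SOURCE, STYLE_B))
```

The source text is typed once; only the table passed to render changes between
the two runs.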

This is extremely powerful! I first learned about text processing
with nroff/troff, and when I discovered Bravo (a what-you-see
editor/formatter for Altos) I couldn't understand how they had
missed this. Turns out that a follow-on, BravoX, attacked just
this dilemma. They have what-you-see, but there are magic marks
in the text that mark the beginning of a paragraph or what have you,
and you can modify a description file and reformat.

Commercial word processors don't deal with this because most business
applications don't need it. It's just another case of analyzing
your target market's needs.

Cheers,
chris

JPAYNE@BBNG.ARPA@sri-unix.UUCP (08/02/83)

Hey!  What about ...

JOVE.  It stands for Jonathan's Own Version of Emacs and is so full of
features that it is simply amazing that it fits on an 11/70 with 10k
I space to spare.  Jove is FAST too, and probably better for UNIX than
MINCE because it takes advantage of pipes for things like running shell
commands into buffers, which means "make" into a buffer with parse commands ...
And it's free!!
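
A rough sketch of the "command output into a buffer, then parse it" idea,
not of JOVE itself.  The "file:line: message" error format and the names
run_into_buffer and parse_errors are assumptions made for illustration; how
JOVE's parse commands actually work is not described here.

```python
import re
import subprocess
import sys

# Assumed error format: "file:line: message".
ERROR_LINE = re.compile(r"^(?P<file>[^:\s]+):(?P<line>\d+):\s*(?P<msg>.*)$")

def run_into_buffer(argv):
    """Run a command through a pipe and return its output as a list of lines."""
    proc = subprocess.run(
        argv,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,  # compilers usually report errors on stderr
        text=True,
    )
    return proc.stdout.splitlines()

def parse_errors(buffer):
    """Pick out (file, line number, message) from lines that look like errors."""
    errors = []
    for text in buffer:
        m = ERROR_LINE.match(text)
        if m:
            errors.append((m.group("file"), int(m.group("line")), m.group("msg")))
    return errors

# Stand-in for "make": a child process that prints one error-shaped line.
buffer = run_into_buffer([sys.executable, "-c", "print('main.c:12: syntax error')"])
for filename, lineno, message in parse_errors(buffer):
    print(filename, lineno, message)
```

An editor would use each (file, line) pair to visit the offending source line
instead of printing it.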

PHEW!  Now I feel better.

mullen@nrl-css@sri-unix.UUCP (08/02/83)

From:  Preston Mullen <mullen@nrl-css>

    Date: 1 Aug 83 21:30:34-PDT (Mon)
    From: decvax!genrad!mit-eddie!zrm@ucb-vax
    Subject: Re: Broff and a proposed net project

    What to do? One possibility would be to keep formatting information in
    a separate companion file. If the free text gets bashed, then the
    editor tries its best to figure out how it was bashed (the companion
    file could be a superset of the free text, including its contents plus
    formatting info) and rebuilds the companion file....

That's sort of how FORTUNE (tm?) systems do it.  (They run Unix (tm) with
a menu-based shell -- you can also escape into a Bourne shell.  One
application is a Wang (tm?) word processor look-alike.  The keyboard even
has all the special keys in the same configuration as Wang's.)

They keep three (two?) files for each word processor document.  I don't
know exactly what's in each file, but I think one contains only "format
lines" (line spacing and tab settings).  I do not believe that they
duplicate the "free text" in two files, one with imbedded formatting
controls and one without.  (One might assume that the word processor
would typically be used as a self-contained subsystem; it then wouldn't
pay to always keep two versions of the text.)  I do believe they have
programs that can map the text with imbedded formatting information
into an easily-readable plain text file, but I did not actually see such
programs.  (I imagine you could figure out the representation and write
such a program yourself in a couple of hours.)

This is what I recall from a demo of the system and a few minutes of
browsing in its files.  This is not an endorsement of any product, and
I am not connected with any of the companies that make or sell the
aforementioned products.

bob@ucla-locus@sri-unix.UUCP (08/03/83)

From:            Bob English <bob@ucla-locus>

I'm a little bit amazed at how discussions of favorite emacs
versions wind up under "broff".  Aren't reply programs wonderful?

--bob--

guy@rlgvax.UUCP (Guy Harris) (08/05/83)

Actually, some commercial systems do have the ability to define generic
formats that can be changed in one place and have the entire document change;
NBI and Convergent Technologies systems come to mind.

It is a good idea, but it's not a point in favor of separate editors and
formatters; as you point out, systems of both kinds can have this feature.

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

guy@rlgvax.UUCP (Guy Harris) (08/06/83)

I don't know about tube vs. transistor amplifiers, but editor/formatters vs.
separate editors and formatters for long documents depends on the person
composing the document.  We do a number of long documents here with an
editor/formatter (which can handle footnotes, and numbering of headers and
sections, automatically) and find it quite convenient.

This is an interesting discussion, but it should probably move to net.text at
this point.  I'm posting this to net.text and net.unix-wizards.

	Guy Harris
	{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

ron@brl-bmd@sri-unix.UUCP (08/10/83)

From:      Ron Natalie <ron@brl-bmd>

I always preferred the "You asked for it, you got it" type of text editing
system.

-Ron

jack@vu44.UUCP (Jack Jansen) (08/11/83)

If I play guitar, I want a tube amplifier.
Also, if I write >= 5 pages, I want ex/nroff.
	Jack Jansen. ({philabs|decvax}!mcvax!vu44!jack)