[comp.society.futures] Is DTP Dead?

bzs@std.com (Barry Shein) (10/02/89)

(DTP == DeskTop Publishing)

Well, I don't believe that DTP is dead. But it's clear we may be
currently on a bad evolutionary path.

If you follow the TCP-IP list (discussion of the ARPAnet protocols)
you've been watching a fascinating discussion about how to store the
ARPAnet RFC's (Requests For Comment, the documents which define the
networking protocols. It doesn't matter if you understand this, just
that there are several hundred on-line documents, each from one to
about twenty-five or so pages on technical subjects, very important
to some people AND freely redistributable.)

A proposal had been made and accepted I guess to allow new RFC's to be
submitted in Postscript (if you don't know what postscript is you
probably should find out, it's a fancy language for creating fancy
documents on fancy printers mostly, and fancy CRT's.) The desire was
to make them prettier to print out and allow the inclusion of fancy
diagrams and/or graphics, the sort of thing Postscript is very good
at.

The problem is that a postscript document is usually generated by some
program and is mostly unreadable to a human and looks like:

	2 p
	%%Page: 2 2
	12 s 0 xH 0 xS 1 f
	2203 384(-)N
	2259(2)X
	2331(-)X
	555 672(Are)N
	733(You)X
	932(There)X
	1191(\(AYT\):)X
	1513(A)X
	1616(way)X
	1810(for)X
	1955(the)X
	2106(user)X

Not very readable, although you can find the text if you look hard in
this one; most are much harder. It's not obvious whether there are any
paragraph breaks, etc.

One big problem is that unless you have some very fancy software it's
pretty hard to do something which is easy to do on plain text files
(like this mail message) -- search for certain word patterns,
particularly if you want to search through hundreds of documents
automatically with a program.
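
As a rough illustration of what such a search has to do, here is a minimal Python sketch that pulls the string literals out of a fragment like the one above and searches the result. The names `extract_words`, `contains_phrase`, and `SAMPLE` are made up for this example. It assumes every word arrives as its own unnested string literal with only single-character backslash escapes; output that splits words for kerning, nests parentheses, or uses octal escapes would defeat it.

```python
import re

# One PostScript string literal: "(" ... ")" with backslash escapes.
# Assumption: no nested unescaped parentheses (real PostScript allows them).
PS_STRING = re.compile(r"\(((?:\\.|[^\\()])*)\)")


def extract_words(ps_source):
    """Return the string literals found in ps_source, in file order."""
    words = []
    for match in PS_STRING.finditer(ps_source):
        # Undo single-character escapes such as \( and \).
        words.append(re.sub(r"\\(.)", r"\1", match.group(1)))
    return words


def contains_phrase(ps_source, phrase):
    """Naive search: join the literals with spaces, ignore case."""
    text = " ".join(extract_words(ps_source)).lower()
    return phrase.lower() in text


# Part of the fragment quoted above.
SAMPLE = r"""
12 s 0 xH 0 xS 1 f
2203 384(-)N
2259(2)X
2331(-)X
555 672(Are)N
733(You)X
932(There)X
1191(\(AYT\):)X
1513(A)X
"""
```

Even when this works, it recovers only the words. Paragraph breaks, columns, and table structure are still buried in the coordinates.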

Now, it seems like on-line, computerized document repositories are at
least as important as being able to use old english fonts in your
submission to a journal. And if we have on-line libraries then it
would be nice to be able to search them efficiently. Ideally
everything would be indexed but indexes have to be built in advance
and it's not possible to know what anyone might want to ask in
advance. So, sometimes we just have to search the full text body
itself. And it works. But it's much harder if it's in a format
like the above.

Before the clever hackers out there say "gee, I could just throw
something together which turns that into plain text" remember that
you'll also have to figure out things like tables which instead of
looking like:


			 Madison Jefferson  Adams
	Total Votes	| 11,240|   18,220| 9,270

Look something like, well, the stuff I showed you earlier. In text
format a lot of tables are easy to search even if error-prone.

Anyhow, perhaps after all these years of trying to come up with
formats (Postscript isn't the only culprit, in fact its
standardization might help encourage solutions!) which are good on
both printers and screens we missed the point. We actually wanted the
stuff to also be good on computers!

	-Barry Shein

Software Tool & Die, Purveyors to the Trade
1330 Beacon Street, Brookline, MA 02146, (617) 739-0202
Internet: bzs@skuld.std.com  UUCP: uunet!skuld!bzs

amanda@intercon.com (Amanda Walker) (10/03/89)

Well, there are a couple of issues here.  The biggest one, I think, is that
for documents such as the RFCs, universal accessibility is very important,
and as popular as PostScript is, there is one and only one universal document
format: line printer text.  There are minor variations, such as using EBCDIC
vs. ASCII, but straight monospaced line-by-line text is the only document
representation that everybody can read, and it is likely to be this way for
a long time.  There are still plenty of 70's vintage CRTs in use out there,
for example.

I have been struggling with this issue myself.  I have written (and my company
is about to start shipping) a Macintosh interface to news (with a mail
interface done by a coworker).  Part of this is a text editing module.  The
only reasons it doesn't handle high-quality text, basic graphics, and so on
are that (a) there's no way for me to send such a message so that anyone
else can read it, and (b) everybody else is sending messages that are
line printer text.

There are two approaches I can think of that can overcome this barrier, and
I don't like either of them :-).  They are based on the idea that the document
should be readable and comprehensible if treated as line printer text, but
have more structure if interpreted by a smarter piece of software.  UNIX
does something like this with nroff output, which underlines by using
"underscore-backspace-character" sequences, and boldface by using "character-
backspace-character" sequences.  Both of them look fine on a printer or a
CRT, but a screen viewer that knows how can do appropriate things and show
real underlining (or italics) and boldfacing.
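
A viewer that "knows how" might decode those sequences along the lines of the following Python sketch. The function names are invented for this example, and it assumes exactly one overstrike per character; output that strikes a character several times for a darker bold would need extra handling.

```python
def decode_overstrikes(line):
    """Split nroff-style output into (char, style) pairs.

    style is "underline" for underscore-backspace-char,
    "bold" for char-backspace-same-char, otherwise "plain".
    """
    result = []
    i = 0
    while i < len(line):
        # An overstrike is three characters: first, backspace, second.
        if i + 2 < len(line) and line[i + 1] == "\b":
            first, second = line[i], line[i + 2]
            if first == "_":
                result.append((second, "underline"))
            elif first == second:
                result.append((second, "bold"))
            else:
                # Unknown overstrike: keep the character printed last.
                result.append((second, "plain"))
            i += 3
        else:
            result.append((line[i], "plain"))
            i += 1
    return result


def strip_overstrikes(line):
    """Return only the visible characters, for dumb displays or searching."""
    return "".join(ch for ch, _ in decode_overstrikes(line))
```

A smart viewer would map the style names to real underlining or boldface, while `strip_overstrikes` gives the plain text that a search tool wants.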

Another example is that if a viewer knows that a document consists of a
stream of paragraphs separated by blank lines (most news articles, for
example), it can reformat the paragraphs themselves, ignoring the line
breaks in the document.
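
That kind of reformatting is only a few lines of Python. The sketch below uses the standard `textwrap` module; the function name `reflow` and the default width of 72 are choices made for this example. It assumes the input really is nothing but paragraphs, so tables, indented listings, and quoted text would be mangled.

```python
import textwrap


def reflow(text, width=72):
    """Re-wrap each blank-line-separated paragraph to the given width."""
    paragraphs = []
    current = []
    for line in text.splitlines():
        if line.strip():
            current.append(line.strip())
        elif current:
            # A blank line ends the paragraph collected so far.
            paragraphs.append(" ".join(current))
            current = []
    if current:
        paragraphs.append(" ".join(current))
    return "\n\n".join(textwrap.fill(p, width=width) for p in paragraphs)
```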

In my opinion, what we need is a simple text-like format that can be printed
off or viewed on a dumb CRT, but that can also be postprocessed into
PostScript or whatever else (this adds extra flexibility, as well--I could,
for example, print RFCs in Garamond Light instead of Times Roman or Courier).

I've thought of a couple of things, such as using "space-backspace" (which
would print or view as a blank) to toggle between proportional and
monospaced text, and so on.  It's kind of icky, but it would work :-).

The biggest problem is graphics.  You just can't do graphics on a line
printer (aside from Snoopy calendars :-)).  You might be able to do something
with approximating line drawing with +, -, and | (the way the RFC's do now)
and some rules for turning them back into lines and boxes, but anything
more complex is going to be a bear.
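
For the simplest case, rectangles, the "rules for turning them back into lines and boxes" could look something like this Python sketch. The name `find_boxes` is hypothetical, and the rule used here (a box is four `+` corners joined by unbroken runs of `-` and `|`) is an assumption for illustration, not a convention the RFCs define. Arrows, diagonals, and labels that interrupt an edge are not handled.

```python
def find_boxes(art):
    """Return (top, left, bottom, right) for each rectangle drawn with + - |."""
    grid = art.splitlines()

    def at(r, c):
        # Treat anything outside the drawing as blank.
        if 0 <= r < len(grid) and 0 <= c < len(grid[r]):
            return grid[r][c]
        return " "

    def h_edge(r, c1, c2):
        return all(at(r, c) in "-+" for c in range(c1 + 1, c2))

    def v_edge(c, r1, r2):
        return all(at(r, c) in "|+" for r in range(r1 + 1, r2))

    corners = [(r, c) for r, row in enumerate(grid)
               for c, ch in enumerate(row) if ch == "+"]
    boxes = []
    for top, left in corners:
        for bottom, right in corners:
            if bottom <= top or right <= left:
                continue
            if at(top, right) != "+" or at(bottom, left) != "+":
                continue
            if (h_edge(top, left, right) and h_edge(bottom, left, right)
                    and v_edge(left, top, bottom) and v_edge(right, top, bottom)):
                boxes.append((top, left, bottom, right))
    return boxes
```

Each tuple could then be handed to a real drawing routine. Anything that is not a plain rectangle still needs its own rule, which is where it becomes a bear.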

--
Amanda Walker
amanda@intercon.com

elm@chilli.Berkeley.EDU (ethan miller) (10/03/89)

In article <1476@intercon.com> amanda@intercon.com (Amanda Walker) writes:
%There are two approaches I can think of that can overcome this barrier, and
%I don't like either of them :-).  They are based on the idea that the document
%should be readable and comprehensible if treated as line printer text, but
%have more structure if interpreted by a smarter piece of software.  UNIX
%does something like this with nroff output, which underlines by using
%"underscore-backspace-character" sequences, and boldface by using "character-
%backspace-character" sequences.  Both of them look fine on a printer or a
%CRT, but a screen viewer that knows how can do appropriate things and show
%real underlining (or italics) and boldfacing.

So what's wrong with writing a PostScript interpreter that produces
line-printer text?  It's always much easier to reduce the complexity
of a document than increase it.  If you don't have proportional spacing,
you get regular spacing.  If you can't switch fonts, everything is in
the same font.  Drawings get simplified or just not printed (it can't
be worse than before, with those horrid ASCII drawings).  Tables can
be simulated pretty easily.  This is no different from what nroff
does; it's just that the input language is much less human-readable.
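
To make the "reduce the complexity" idea concrete, here is a small Python sketch of only the last step of such a converter: snapping positioned strings onto a monospaced grid. It assumes some interpreter has already reduced the page to `(x, y, text)` items, that y grows down the page, and that `char_width` and `line_height` are known in the same device units. All of these names and parameters are assumptions made for the example. Proportionally spaced text can overlap once snapped to fixed cells, in which case the later item simply overwrites the earlier one.

```python
def to_line_printer(items, char_width, line_height):
    """Render (x, y, text) items onto a monospaced character grid."""
    rows = {}
    for x, y, text in items:
        row = round(y / line_height)
        col = round(x / char_width)
        cells = rows.setdefault(row, {})
        for offset, ch in enumerate(text):
            # Later items overwrite earlier ones that land on the same cell.
            cells[col + offset] = ch
    if not rows:
        return ""
    lines = []
    for row in range(min(rows), max(rows) + 1):
        cells = rows.get(row, {})
        width = max(cells) + 1 if cells else 0
        lines.append("".join(cells.get(c, " ") for c in range(width)))
    return "\n".join(lines)
```

The hard part, running the PostScript program to obtain the items in the first place, is not shown.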

%In my opinion, what we need is a simple text-like format that can be printed
%off or viewed on a dumb CRT, but that can also be postprocessed into
%PostScript or whatever else (this adds extra flexibility, as well--I could,
%for example, print RFCs in Garamond Light instead of Times Roman or Courier).

As I said above, I think the reverse is true.  Let the document
creator define a "preferred" style for printing out, and if people
can't do that, then convert into line-printer style.

%The biggest problem is graphics.  You just can't do graphics on a line
%printer (aside from Snoopy calendars :-)).  You might be able to do something
%with approximating line drawing with +, -, and | (the way the RFC's do now)
%and some rules for turning them back into lines and boxes, but anything
%more complex is going to be a bear.

Indeed.  Converting from PostScript into line-printer text is tough, but
much easier than getting a good laser-quality drawing from + and |.

ethan
=================================
ethan miller--cs grad student   elm@ginger.berkeley.edu
#include <std/disclaimer.h>     {...}!ucbvax!ginger!elm
"I like the Austrian way better." -- Dr. Henry Jones, Jr.

peter@ficc.uu.net (Peter da Silva) (10/03/89)

.\" #!nroff -ms
.DS L
In article <31661@ucbvax.BERKELEY.EDU>, elm@chilli.Berkeley.EDU (ethan miller) writes:
> So what's wrong with writing a PostScript interpreter that produces
> line-printer text?  It's always much easier to reduce the complexity
> of a document than increase it.  If you don't have proportional spacing,
.DE
.IP
"It's always much easier to reduce the complexity of a document than
increase it."
.PP
Basically you're saying that Postscript is a higher level language than
ASCII text.
.PP
Indeed. So rather than ship Postscript, ship some markup language (TeX,
nroff, or whatever silly acronyms the standards community is using. SGML?).
This is even a higher level than PS, and a lot better designed for use
by software: whether to generate line-printer text or to include as
references in further documents.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"That is not the Usenet tradition, but it's a solidly-entrenched            U
 delusion now." -- brian@ucsd.Edu (Brian Kantor)

kent@WSL.DEC.COM (10/04/89)

But then you need to standardize on a markup language, and they're all
bad in some dimension (in particular, most don't handle figures
portably). And chances are that I don't have the right formatter or
macro package for your message. Which is why people degenerate to PostScript.

peter@ficc.uu.net (Peter da Silva) (10/04/89)

In article <8910040136.AA11446@gnomee.pa.dec.com>, kent@WSL.DEC.COM writes:
> But then you need to standardize on a markup language, and they're all
> bad in some dimension

... as opposed to postscript, which is bad in two dimensions...

(or is that...

1913(...)P
45(as)P
561(opposed)P

...?)
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
``I feel that any [environment] with users in it is "adverse".''           'U`
	-- Eric Peterson <lcc.eric@seas.ucla.edu>

GLOBALCP@UVVM.BITNET (Melcir Erskine-Richmond) (10/05/89)

You might like to ponder this one.  I know little about the jargon you
are all au fait with.  No matter.  I would like to transfer many of my
MacWrite files (which include different sized fonts) on-line.  However,
since the material was all previously prepared for Desk-Top Publishing
- or any other publishing process for that matter, I seem to now be
stuck.  But I am stunned that no one using on-line communications seems
to have come up with this technology for different font sizes, or for
online illustration transfer.  Is it that difficult?  Or am I just not
yet aware of some magical tools already available?

Best wishes
Melcir


RETURN ADDRESS:
Melcir Erskine-Richmond

BITNET: GLOBALCP@UVVM            * UNIX: globalcp@uvcw.UVic.ca
POSTAL: GlobalCP
        C% U. Vic. Chapter - World Future Society
        S. U. Bldg.
        University of Victoria
        P.O. BOX 1700
        VICTORIA, B.C., V8W 2Y2
        CANADA
   FAX: 604-721-8653               |  TEL: 604-721-4763

=========================================================================
If we plan collectively *NOW* for a healthy and sustainable global
bio-system in the 21st century, we can still achieve this goal.

Acknowledge-To: <GLOBALCP@UVVM>

daven@ibmpcug.co.uk (D R Newman) (10/06/89)

So what's wrong with DCA (Document Content Architecture) format for articles,
which is read by a lot of word processors - at least until we have proper
hypertext and hyperdata publishing systems?

D.R.Newman@kingston.ac.uk
-- 
Automatic Disclaimer:
The views expressed above are those of the author alone and may not
represent the views of the IBM PC User Group.

janssen@holmes (Bill Janssen) (10/06/89)

This interesting discussion is similar to one in comp.mail.multi-media
a bit earlier, but a bit different with the reference to RFC's and a base
of stored documents.

It is important to realize that the format for the documents should be
a mark-up language, not just raw text.  The mark-up language should be
chosen so that it marks ideas, not formatting.  The text should not
have marks that indicate that a certain word is "italic" or "bold",
but rather that that word is an "important-concept" or
"reference-to-system".  This allows semantic content to be preserved,
and, with the addition of a file defining appearances, can be
processed into a nice presentation format as well (which might in fact
be PostScript, or TeX dvi, or InterPress, or raw line-printer output).
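
A minimal Python sketch of the idea, with everything in it invented for illustration: the `\tag{text}` syntax, the two tag names borrowed from the paragraph above, and the two style tables standing in for the "file defining appearances". The same marked-up source is rendered twice, once for a line printer and once into LaTeX-style commands.

```python
import re

# Hypothetical semantic markup: \tag-name{text}, with no nesting.
TAGGED = re.compile(r"\\([a-z-]+)\{([^{}]*)\}")

# Two "appearance files": tag name -> how that idea should look.
LINE_PRINTER = {
    "important-concept": "*{}*",
    "reference-to-system": "<{}>",
}
LATEX_LIKE = {
    "important-concept": "\\emph{{{}}}",
    "reference-to-system": "\\texttt{{{}}}",
}


def render(source, styles):
    """Replace each tagged span using styles; unknown tags render as bare text."""
    def replace(match):
        tag, body = match.group(1), match.group(2)
        return styles.get(tag, "{}").format(body)
    return TAGGED.sub(replace, source)


DOC = r"Send \important-concept{AYT} to \reference-to-system{telnetd}."
```

The document itself never says "italic" or "bold". Changing the presentation means swapping the style table, and a search tool can look for every `important-concept` directly.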

Some have pointed out that with appropriate defs, PostScript can be
used as a semantic mark-up language.  The mechanism used in PostScript
for string constants seems to be a little clumsy for that, but no
doubt it is possible to get around that.  This use seems inappropriate
for PostScript, though, considering that it was designed for page
description.

Another criterion for the mark-up language would be to be reasonably
readable, even in raw form, so that it could be edited with a dumb editor,
and the documents printed without the formatting programs.  Some system
such as LaTeX seems to provide a better model for this type of language
than does PostScript.

Of course, two programs, for turning the marked-up documents into
PostScript and line-printer, should be written and placed into the
document repository for readers to use.

Bill
--
 Bill Janssen        janssen.pa@xerox.com      (415) 494-4763
 Xerox Palo Alto Research Center
 3333 Coyote Hill Road, Palo Alto, California   94304

ALLEN@BROWNVM.BITNET ("Allen Renear, CIS, Brown Univ. 401-863-7312") (10/06/89)

You are in a huge underground office.  A fierce snake bars the exit.
There is some computer equipment there. There is a network connection.
Over the next few days you will be receiving thousands of pages of
technical documents.  Your survival will depend on these documents.
Would you like these documents (1) in postscript, (2) in plain tty
ascii text, or (3) in an (unspecified) high-level markup language?

>What kind of computer equipment is it?

Sorry, I can't tell you that.

>(3)

(Really, is there any doubt about this?)