bzs@std.com (Barry Shein) (10/02/89)
(DTP == DeskTop Publishing)

Well, I don't believe that DTP is dead. But it's clear we may be
currently on a bad evolutionary path.

If you follow the TCP-IP list (discussion of the ARPAnet protocols)
you've been watching a fascinating discussion about how to store the
ARPAnet RFC's (Requests For Comment, that's what they call the
documents which define the networking protocols, it doesn't matter if
you understand this, just that it's several hundred on-line documents
each from one to about twenty-five or so pages on technical subjects
and very important to some people AND freely redistributable.)

A proposal had been made and accepted I guess to allow new RFC's to be
submitted in Postscript (if you don't know what postscript is you
probably should find out, it's a fancy language for creating fancy
documents on fancy printers mostly, and fancy CRT's.)

The desire was to make them prettier to print out and allow the
inclusion of fancy diagrams and/or graphics, the sort of thing
Postscript is very good at.

The problem is that a postscript document is usually generated by some
program and is mostly unreadable to a human and looks like:

	2 p
	%%Page: 2 2
	12 s 0 xH 0 xS 1 f
	2203 384(-)N
	2259(2)X
	2331(-)X
	555 672(Are)N
	733(You)X
	932(There)X
	1191(\(AYT\):)X
	1513(A)X
	1616(way)X
	1810(for)X
	1955(the)X
	2106(user)X

Not very readable although you can find the text if you look hard in
this one, most are much harder. Not obvious if there are any paragraph
breaks etc.

One big problem is that unless you have some very fancy software it's
pretty hard to do something which is easy to do on plain text files
(like this mail message) -- search for certain word patterns,
particularly if you want to search through hundreds of documents
automatically with a program.

Now, it seems like on-line, computerized document repositories are at
least as important as being able to use old english fonts in your
submission to a journal.
And if we have on-line libraries then it would be nice to be able to
search them efficiently. Ideally everything would be indexed but
indexes have to be built in advance and it's not possible to know what
anyone might want to ask in advance.

So, sometimes we just have to search the full text body itself. And it
works. But it's much harder if it's in a format like the above.

Before the clever hackers out there say "gee, I could just throw
something together which turns that into plain text" remember that
you'll also have to figure out things like tables which instead of
looking like:

		Madison	Jefferson	Adams
Total Votes	| 11,240|   18,220|   9,270

Look something like, well, the stuff I showed you earlier. In text
format a lot of tables are easy to search even if error-prone.

Anyhow, perhaps after all these years of trying to come up with
formats (Postscript isn't the only culprit, in fact its
standardization might help encourage solutions!) which are good on
both printers and screens we missed the point.

We actually wanted the stuff to also be good on computers!

	-Barry Shein

Software Tool & Die, Purveyors to the Trade
1330 Beacon Street, Brookline, MA 02146, (617) 739-0202
Internet: bzs@skuld.std.com
UUCP: uunet!skuld!bzs
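The "throw something together which turns that into plain text" idea
can be sketched in a few lines of Python. This is only a rough
illustration, assuming the text always sits in parenthesised string
literals as in the quoted sample; the names SAMPLE, extract_strings and
contains_phrase are made up for the example, and nested parentheses,
octal escapes and hex strings are not handled.

```python
import re

# Part of the generated PostScript quoted in the post above.
SAMPLE = r"555 672(Are)N 733(You)X 932(There)X 1191(\(AYT\):)X 1513(A)X 1616(way)X"

# A parenthesised string literal: ordinary characters or backslash escapes.
# Assumption: no nested, unescaped parentheses inside the string.
PS_STRING = re.compile(r"\(((?:[^()\\]|\\.)*)\)")


def extract_strings(ps_text):
    """Return the string literals in ps_text, with "\\x" reduced to "x"."""
    return [re.sub(r"\\(.)", r"\1", m.group(1)) for m in PS_STRING.finditer(ps_text)]


def contains_phrase(ps_text, phrase):
    """Naive full-text search: join the fragments with spaces, then look."""
    return phrase in " ".join(extract_strings(ps_text))


if __name__ == "__main__":
    print(" ".join(extract_strings(SAMPLE)))
    print(contains_phrase(SAMPLE, "You There"))
```

Note what the sketch throws away: the coordinates in front of each
string. Without them there is no way to tell a paragraph break from a
table column, and a word that the generating program split into two
strings would no longer match a search.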
amanda@intercon.com (Amanda Walker) (10/03/89)
Well, there are a couple of issues here. The biggest one, I think, is
that for documents such as the RFCs, universal accessibility is very
important, and as popular as PostScript is, there is one and only one
universal document format: line printer text. There are minor
variations, such as using EBCDIC vs. ASCII, but straight monospaced
line-by-line text is the only document representation that everybody
can read, and it is likely to be this way for a long time. There are
still plenty of 70's vintage CRTs in use out there, for example.

I have been struggling with this issue myself. I have written (and my
company is about to start shipping) a Macintosh interface to news (with
a mail interface done by a coworker). Part of this is a text editing
module. The only reasons it doesn't handle high-quality text, basic
graphics, and so on are that (a) there's no way for me to send such a
message so that anyone else can read it, and (b) everybody else is
sending messages that are line printer text.

There are two approaches I can think of that can overcome this barrier,
and I don't like either of them :-). They are based on the idea that
the document should be readable and comprehensible if treated as line
printer text, but have more structure if interpreted by a smarter piece
of software. UNIX does something like this with nroff output, which
underlines by using "underscore-backspace-character" sequences, and
boldface by using "character-backspace-character" sequences. Both of
them look fine on a printer or a CRT, but a screen viewer that knows
how can do appropriate things and show real underlining (or italics)
and boldfacing.

Another example is that if a viewer knows that a document consists of a
stream of paragraphs separated by blank lines (most news articles, for
example), it can reformat the paragraphs themselves, ignoring the line
breaks in the document.
In my opinion, what we need is a simple text-like format that can be
printed off or viewed on a dumb CRT, but that can also be postprocessed
into PostScript or whatever else (this adds extra flexibility, as
well--I could, for example, print RFCs in Garamond Light instead of
Times Roman or Courier). I've thought of a couple of things, such as
using "space-backspace" (which would print or view as a blank line) to
toggle between proportional or monospaced text, and so on. It's kind
of icky, but it would work :-).

The biggest problem is graphics. You just can't do graphics on a line
printer (aside from Snoopy calendars :-)). You might be able to do
something with approximating line drawing with +, -, and | (the way the
RFC's do now) and some rules for turning them back into lines and
boxes, but anything more complex is going to be a bear.

--
Amanda Walker
amanda@intercon.com
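The nroff overstrike convention described in this post is simple enough
to decode mechanically. The following Python sketch shows one way a
"smarter" viewer might do it; the function names and the *bold* /
_underline_ output markers are assumptions made for the example, and
only a single overstrike per character is recognised (output that
strikes a character three or four times for extra darkness is not).

```python
import itertools


def decode_overstrike(line):
    """Turn overstruck text into (char, style) pairs.

    "_\\bX" is read as underlined X and "X\\bX" as bold X; anything else
    is plain. "_\\b_" is ambiguous and is treated as underline here.
    """
    out = []
    i = 0
    while i < len(line):
        if i + 2 < len(line) and line[i + 1] == "\b":
            first, second = line[i], line[i + 2]
            if first == "_":
                out.append((second, "underline"))
                i += 3
                continue
            if first == second:
                out.append((second, "bold"))
                i += 3
                continue
        out.append((line[i], "plain"))
        i += 1
    return out


def to_plain(line):
    """What a dumb device effectively shows: the text without styling."""
    return "".join(ch for ch, _ in decode_overstrike(line))


def to_markup(line):
    """Wrap each styled run in visible markers instead of overstrikes."""
    marks = {"plain": "", "bold": "*", "underline": "_"}
    pieces = []
    for style, run in itertools.groupby(decode_overstrike(line), key=lambda p: p[1]):
        pieces.append(marks[style] + "".join(ch for ch, _ in run) + marks[style])
    return "".join(pieces)


if __name__ == "__main__":
    sample = "_\bR_\bF_\bC is B\bBO\bOL\bLD\bD"
    print(to_plain(sample))
    print(to_markup(sample))
```

The same list of (char, style) pairs could just as well drive a font
change on a bitmapped display, which is the point of the approach: the
file stays printable as-is, and the extra structure is there for any
program that cares to look.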
elm@chilli.Berkeley.EDU (ethan miller) (10/03/89)
In article <1476@intercon.com> amanda@intercon.com (Amanda Walker) writes:
%There are two approaches I can think of that can overcome this barrier, and
%I don't like either of them :-). They are based on the idea that the document
%should be readable and comprehensible if treated as line printer text, but
%have more structure if interpreted by a smarter piece of software. UNIX
%does something like this with nroff output, which underlines by using
%"underscore-backspace-character" sequences, and boldface by using "character-
%backspace-character" sequences. Both of them look fine on a printer or a
%CRT, but a screen viewer that knows how can do appropriate things and show
%real underlining (or italics) and boldfacing.
So what's wrong with writing a PostScript interpreter that produces
line-printer text? It's always much easier to reduce the complexity
of a document than increase it. If you don't have proportional spacing,
you get regular spacing. If you can't switch fonts, everything is in
the same font. Drawings get simplified or just not printed (it can't
be worse than before, with those horrid ASCII drawings). Tables can
be simulated pretty easily. This is no different from what nroff
does; it's just that the input language is much less human-readable.
%In my opinion, what we need is a simple text-like format that can be printed
%off or viewed on a dumb CRT, but that can also be postprocessed into
%PostScript or whatever else (this adds extra flexibility, as well--I could,
%for example, print RFCs in Garamond Light instead of Times Roman or Courier).
As I said above, I think the reverse is true. Let the document
creator define a "preferred" style for printing out, and if people
can't do that, then convert into line-printer style.
%The biggest problem is graphics. You just can't do graphics on a line
%printer (aside from Snoopy calendars :-)). You might be able to do something
%with approximating line drawing with +, -, and | (the way the RFC's do now)
%and some rules for turning them back into lines and boxes, but anything
%more complex is going to be a bear.
Indeed. Converting from PostScript into line-printer text is tough, but
much easier than getting a good laser-quality drawing from + and |.
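
One small piece of such a PostScript-to-line-printer converter can be
sketched in Python: snapping already-extracted strings onto a
fixed-pitch character grid. This is a rough sketch under stated
assumptions, not a PostScript interpreter. It assumes the strings and
their page coordinates have been recovered somehow, that y grows down
the page, and that 32 units per column and 50 units per row are
reasonable (both numbers, and the name to_line_printer, are invented
for the example).

```python
def to_line_printer(items, x_per_col=32, y_per_row=50):
    """Snap (x, y, text) items onto a fixed-pitch character grid.

    Strings that would collide after snapping are pushed right so that
    at least one space separates them; fonts and graphics are ignored.
    """
    rows = {}
    for x, y, text in sorted(items, key=lambda it: (it[1] // y_per_row, it[0])):
        row = rows.setdefault(y // y_per_row, [])
        col = x // x_per_col
        if row and col <= len(row):
            # Proportional text squeezed into fixed columns: keep one space.
            col = len(row) + 1
        row.extend(" " * (col - len(row)))
        row.extend(text)
    if not rows:
        return ""
    # Rows with no text become blank lines, preserving vertical gaps.
    return "\n".join("".join(rows.get(r, [])) for r in range(min(rows), max(rows) + 1))


if __name__ == "__main__":
    # Coordinates taken from the PostScript fragment quoted earlier in the thread.
    words = [(555, 672, "Are"), (733, 672, "You"), (932, 672, "There"),
             (1191, 672, "(AYT):"), (1513, 672, "A"), (1616, 672, "way")]
    print(to_line_printer(words))
```

Going this direction only discards information (fonts, exact spacing,
drawings), which is the sense in which reducing complexity is the
easier job.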
ethan
=================================
ethan miller--cs grad student elm@ginger.berkeley.edu
#include <std/disclaimer.h> {...}!ucbvax!ginger!elm
"I like the Austrian way better." -- Dr. Henry Jones, Jr.
peter@ficc.uu.net (Peter da Silva) (10/03/89)
.\" #!nroff -ms
.DS L
In article <31661@ucbvax.BERKELEY.EDU>, elm@chilli.Berkeley.EDU (ethan miller) writes:
> So what's wrong with writing a PostScript interpreter that produces
> line-printer text? It's always much easier to reduce the complexity
> of a document than increase it. If you don't have proportional spacing,
.DE
.IP "It's always much easier to reduce the complexity of a document than increase it."
.PP
Basically you're saying that Postscript is a higher level language than
ASCII text.
.PP
Indeed. So rather than ship Postscript, ship some markup language (TeX,
nroff, or whatever silly acronyms the standards community is using.
SGML?). This is even a higher level than PS, and a lot better designed
for use by software: whether to generate line-printer text or to include
as references in further documents.
--
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"That is not the Usenet tradition, but it's a solidly-entrenched           U
 delusion now." -- brian@ucsd.Edu (Brian Kantor)
kent@WSL.DEC.COM (10/04/89)
But then you need to standardize on a markup language, and they're all bad in some dimension (in particular, most don't handle figures portably). And chances are that I don't have the right formatter or macro package for your message. Which is why people degenerate to PostScript.
peter@ficc.uu.net (Peter da Silva) (10/04/89)
In article <8910040136.AA11446@gnomee.pa.dec.com>, kent@WSL.DEC.COM writes:
> But then you need to standardize on a markup language, and they're all
> bad in some dimension

... as opposed to postscript, which is bad in two dimensions...

(or is that... 1913(...)P 45(as)P 561(opposed)P ...?)
--
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
``I feel that any [environment] with users in it is "adverse".''           'U`
	-- Eric Peterson <lcc.eric@seas.ucla.edu>
GLOBALCP@UVVM.BITNET (Melcir Erskine-Richmond) (10/05/89)
You might like to ponder this one. I know little about the jargon you
are all au fait with. No matter.

I would like to transfer many of my MacWrite files (which include
different sized fonts) on-line. However, since the material was all
previously prepared for Desk-Top Publishing - or any other publishing
process for that matter, I seem to now be stuck.

But I am stunned that no one using on-line communications seems to have
come up with this technology for different font sizes, or for online
illustration transfer. Is it that difficult? Or am I just not yet aware
of some magical tools already available?

Best wishes
Melcir

RETURN ADDRESS: Melcir Erskine-Richmond
BITNET: GLOBALCP@UVVM * UNIX: globalcp@uvcw.UVic.ca
POSTAL: GlobalCP C% U. Vic. Chapter - World Future Society
        S. U. Bldg. University of Victoria
        P.O. BOX 1700
        VICTORIA, B.C., V8W 2Y2 CANADA
FAX: 604-721-8653 | TEL: 604-721-4763
=========================================================================
If we plan collectively *NOW* for a healthy and sustainable global
biosystem in the 21st century, we can still achieve this goal.
Acknowledge-To: <GLOBALCP@UVVM>
daven@ibmpcug.co.uk (D R Newman) (10/06/89)
So what's wrong with DCA (Document Content Architecture) format for
articles, which is read by a lot of word processors - at least until we
have proper hypertext and hyperdata publishing systems?

D.R.Newman@kingston.ac.uk
--
Automatic Disclaimer:
The views expressed above are those of the author alone and may not
represent the views of the IBM PC User Group.
janssen@holmes (Bill Janssen) (10/06/89)
This interesting discussion is similar to one in comp.mail.multi-media
a bit earlier, but a bit different with the reference to RFC's and a
base of stored documents.

It is important to realize that the format for the documents should be
a mark-up language, not just raw text. The mark-up language should be
chosen so that it marks ideas, not formatting. The text should not
have marks that indicate that a certain word is "italic" or "bold", but
rather that that word is an "important-concept" or
"reference-to-system". This allows semantic content to be preserved,
and, with the addition of a file defining appearances, can be processed
into a nice presentation format as well (which might in fact be
PostScript, or TeX dvi, or InterPress, or raw line-printer output).

Some have pointed out that with appropriate defs, PostScript can be
used as a semantic mark-up language. The mechanism used in PostScript
for string constants seems to be a little clumsy for that, but no doubt
it is possible to get around that. This use seems inappropriate for
PostScript, though, considering that it was designed for page
description.

Another criterion for the mark-up language would be to be reasonably
readable, even in raw form, so that it could be edited with a dumb
editor, and the documents printed without the formatting programs.
Some system such as LaTeX seems to provide a better model for this type
of language than does PostScript.

Of course, two programs, for turning the marked-up documents into
PostScript and line-printer, should be written and placed into the
document repository for readers to use.

Bill
--
Bill Janssen        janssen.pa@xerox.com        (415) 494-4763
Xerox Palo Alto Research Center
3333 Coyote Hill Road, Palo Alto, California 94304
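
The split between marks for ideas and a separate definition of
appearances can be illustrated with a small Python sketch. Everything
concrete in it is an assumption made for the example: the LaTeX-like
\tag{text} syntax, the two appearance tables, and the function names.
Only the tag names "important-concept" and "reference-to-system" come
from the post, and nested tags are not supported.

```python
import re

# Hypothetical mark-up: \tag-name{text}, with no nesting.
TAGGED = re.compile(r"\\([a-z-]+)\{([^{}]*)\}")

# Hypothetical "files defining appearances", one per output device.
# Each tag maps to the text placed before and after the marked words.
LINE_PRINTER = {"important-concept": ("*", "*"), "reference-to-system": ("<", ">")}
LATEX = {"important-concept": (r"\emph{", "}"), "reference-to-system": (r"\texttt{", "}")}


def render(source, appearance):
    """Replace each tagged span using the given appearance table."""
    def replace(match):
        before, after = appearance.get(match.group(1), ("", ""))
        return before + match.group(2) + after
    return TAGGED.sub(replace, source)


def find_marked(source, tag):
    """Search by meaning: list every span marked with the given tag."""
    return [m.group(2) for m in TAGGED.finditer(source) if m.group(1) == tag]


if __name__ == "__main__":
    doc = r"Send \important-concept{Are You There} to \reference-to-system{telnetd}."
    print(render(doc, LINE_PRINTER))
    print(render(doc, LATEX))
    print(find_marked(doc, "reference-to-system"))
```

The source stays readable in a dumb editor, the same text feeds either
output program, and a search can ask for every "reference-to-system" in
a repository, which a file that only records "bold" or "italic" could
not answer.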
ALLEN@BROWNVM.BITNET ("Allen Renear, CIS, Brown Univ. 401-863-7312") (10/06/89)
You are in a huge underground office. A fierce snake bars the exit.
There is some computer equipment there. There is a network connection.
Over the next few days you will be receiving thousands of pages of
technical documents. Your survival will depend on these documents.
Would you like these documents (1) in postscript, (2) in plain tty
ascii text, or (3) in an (unspecified) high-level markup language?

>What kind of computer equipment is it?

Sorry, I can't tell you that.

>(3)

(Really, is there any doubt about this?)