jef@unisoft.uucp (Jef Poskanzer) (10/15/87)
In the referenced message, dce@mips.UUCP (David Elliott) wrote:
}A year or so ago, I began noticing news postings with lines longer
}than 80 characters. These can be a real pain to read, and at one
}time I actually had my global rn kill file set up to junk all
}articles from Apollo (where most of these were coming from at the
}time).
}
}Anyway, with the proliferation of window systems on the net, I
}believe that we may be seeing more and more of this type of thing.
}
}First of all, is this a problem? If so, what can we do about it?
}If not, convince me that I shouldn't care (remember that it may
}be a while before I can get a wide terminal for home, where I read
}news).

It is a problem, and what we should do about it is fix the news-reading and news-transferring programs to handle such messages in a reasonable manner. Real soon now, >80 character lines will become the norm, and we had better be ready for them.

Many people have now discovered that the easiest and most natural way to make text be screen-width-independent is to use <newline> as a paragraph separator, not a line separator. The program that displays the text to the user then becomes responsible for breaking the paragraphs up into screen lines. You would not believe how much nicer this makes things. Not only does it solve the problem of different people using different size windows and different width fonts, it also makes composing text much more of a pleasure - no more reformatting.

Unfortunately, many programs have built-in limits on line length. For example, pretty much every mailer on the DOD Internet does disgusting things to lines >80 characters. The SMTP protocol specifies a maximum line length of 1000 characters. And of course, vi simply loses.

You can be sure that any programs I write can handle arbitrary-length lines. The rest of you had better start hacking...
---
Jef

Jef Poskanzer  unisoft!jef@ucbvax.Berkeley.Edu  ...ucbvax!unisoft!jef
Fools rush in and get the best seats.
...and now, a word from our sponsor: "The opinions expressed are those of the author and do not necessarily represent those of UniSoft Corp, its staff, or its management."
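A minimal sketch of the display-side wrapping described in the post above, assuming each newline-terminated line is one paragraph. It uses Python's standard textwrap module; the function name render and the default width of 80 are illustrative choices, not anything from the post.

```python
import textwrap


def render(article, width=80):
    """Wrap each newline-delimited paragraph to the reader's screen width."""
    shown = []
    for paragraph in article.split("\n"):
        if paragraph.strip():
            # One stored line becomes as many screen lines as it needs.
            shown.append(textwrap.fill(paragraph, width=width))
        else:
            # Keep empty paragraphs as blank screen lines.
            shown.append("")
    return "\n".join(shown)


if __name__ == "__main__":
    text = "This paragraph was typed as one long line. " * 5
    print(render(text, width=40))
```

The stored article never changes; only the width passed at display time does, which is the point being argued.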
rees@apollo.uucp (Jim Rees) (10/16/87)
    Many people have now discovered that the easiest and most natural way
    to make text be screen-width-independent is to use <newline> as a
    paragraph separator, not a line separator.  The program that displays
    the text to the user then becomes responsible for breaking the paragraphs
    up into screen lines.  You would not believe how much nicer this makes
    things.  Not only does it solve the problem of different people using
    different size windows and different width fonts, it also makes composing
    text much more of a pleasure - no more reformatting.
I don't see why we should have to change the format of the text as sent.
It's easy to tell where lines and paragraphs end with the existing
format.  Lines end in a single NL, paras end in a double NL.  You can
still write a filter that reformats paras to your favorite line length.
This is in fact what the news reading interface (emacs based) that I
used to use did.
david@ms.uky.edu (David Herron -- Resident E-mail Hack) (10/18/87)
>> Many people have now discovered that the easiest and most natural way
>> to make text be screen-width-independent is to use <newline> as a
>> paragraph separator, not a line separator.

PLEASE

We (ukma) exchange a lot of news with BITNET sites. In particular, an
IBM machine at the U of Pennsylvania, and a VMS Vax cluster at the U of
Louisville. In both cases their operating systems limit text files to
some maximum number of characters per line. (The IBM machine limits it
to 132 columns and I don't know what the VMS machine limits itself to).

In addition ... the file transfers are going over BITNET. In this case,
BITNET means CARD PUNCHES virtual style. The news is transferred using
a PUNCH deck (Maybe a print deck ... same problems) in fixed length
records. We're talking truncation city folks!

The point is that this network is rapidly growing away from its roots
as a UUCP-only network. We've got greater use of the Internet going on
as well as (potentially) BITNET. To an extent we can't violate the
standards of other networks and expect to get away with it. Instead,
we need to be able to live with them.
--
<---- David Herron, Local E-Mail Hack, david@ms.uky.edu, david@ms.uky.csnet
<---- {rutgers,uunet,cbosgd}!ukma!david, david@UKMA.BITNET
<---- I thought that time was this neat invention that kept everything
<---- from happening at once. Why doesn't this work in practice?
fair@ucbarpa.Berkeley.EDU (Erik E. Fair) (10/18/87)
David, are you telling me that we are bound by the most restrictive
set of standards network-wide that any one transport forces on us?

That's not reasonable. The reasonable approach is to do a trivial
encapsulation or encoding that makes it possible to move USENET
articles (no matter what their characteristics are) through BITNET,
or any other strange network.

	Erik E. Fair	ucbvax!fair	fair@ucbarpa.berkeley.edu
david@ms.uky.edu (David Herron -- Resident E-mail Hack) (10/18/87)
In article <21314@ucbvax.BERKELEY.EDU> fair@ucbarpa.Berkeley.EDU (Erik E. Fair) writes:
>David, are you telling me that we are bound by the most restrictive
>set of standards network-wide that any one transport forces on us?

hmmmm .... weeeelll...

>That's not reasonable. The reasonable approach is to do a trivial
>encapsulation or encoding that makes it possible to move USENET
>articles (no matter what their characteristics are) through BITNET,
>or any other strange network.

yes, I did exactly that for a long time with a news feed we had
coming from GaTech's sole Unix machine on BITNET (gtfelix). We used
a little pipeline of "compress -d file | btoa" on the sending side
and "atob | uncompress" on the receiving side. I still use that same
set of stuff with the feed to the VMS machine.

BUT ... compress and atob/btoa don't run on the IBM 308x that's
our other neighbor on BITNET. ALSO, in both cases their underlying
operating systems have that record-oriented mentality.

I agree that it's ridiculous that silly details of the transport
system, or other operating systems' storage methods, should
cause us to stunt the development of the software.

BUT

Some of us (you included) are trying to free this network from its
reliance on Unix. Building the WorldNet and such like. But what will
the IBM people on BITNET think if they start seeing every article come
in with 2000 character long lines because someone on a Unix machine
wanted "automatic formatting" of his paragraphs? They'll only be able
to read the first 80 (132?) characters of each paragraph.

YES ... that 3081 at Penn State and the VMS machine at U of L could
patch up their news to use some other storage method. But they will
gripe every inch of the way and will end up with a slower system to
boot. (likely).

In essence you're looking down your noses at these people, and just
continuing the old tradition of saying "My <x> is better than yours".

Of course, they do it just as much as we do.

WHICH DOESN'T MAKE IT ANY MORE CORRECT A THING TO DO.

Each <x> has its good points and bad points. BASIC is still around
because it's an easy to use language and is very good at certain tasks
that just need to be solved quickly. IBM's are still around because
some people just prefer that mind-set. (I personally don't understand
why, they just do).

All I wanted to say in my original posting was that we should always
keep in mind the least-common-denominator. At the moment it's 80x24
screens. But I really like the 66 line by 96 column display on
my Blit... :-)

>	Erik E. Fair	ucbvax!fair	fair@ucbarpa.berkeley.edu
--
<---- David Herron, Local E-Mail Hack, david@ms.uky.edu, david@ms.uky.csnet
<---- {rutgers,uunet,cbosgd}!ukma!david, david@UKMA.BITNET
<---- I thought that time was this neat invention that kept everything
<---- from happening at once. Why doesn't this work in practice?
blarson@skat.usc.edu (Bob Larson) (10/19/87)
In article <7526@g.ms.uky.edu> david@ms.uky.edu (David Herron -- Resident E-mail Hack) writes:
>In article <21314@ucbvax.BERKELEY.EDU> fair@ucbarpa.Berkeley.EDU (Erik E. Fair) writes:
>>That's not reasonable. The reasonable approach is to do a trivial
>>encapsulation or encoding that makes it possible to move USENET
>>articles (no matter what their characteristics are) through BITNET,
>>or any other strange network.
>yes, I did exactly that for a long time with a news feed we had
>BUT ... compress and atob/btoa don't run on the IBM 308x that's
>our other neighbor on BITNET.

Who said it had to be compress and btoa?

>I agree that it's ridiculous that silly details of the transport
>system, or other operating systems' storage methods, should
>cause us to stunt the development of the software.
>Some of us (you included) are trying to free this network from its
>reliance on Unix. Building the WorldNet and such like. But what will
>the IBM people on BITNET think if they start seeing every article come
>in with 2000 character long lines because someone on a Unix machine
>wanted "automatic formatting" of his paragraphs? They'll only be able
>to read the first 80 (132?) characters of each paragraph.

So who's forcing them to truncate???? Why can't they set up some
continuation line convention? A possible example would be to put a \
in column 80 to indicate that the next line is really part of the
current line. The only programs that would have to know about such a
convention already have to do ascii <-> ebcdic conversion, etc. (So it
looks ugly to the news readers on the IBM system. If they care, they
can fix their software.)

While we're talking about fixing the news problems caused by bitnet,
could they standardize an ascii <-> ebcdic conversion table for this
use and make sure that tabs don't get converted to spaces? (The
conversion breaks patch files, sendmail.cf files, etc.)

>In essence you're looking down your noses at these people, and just
>continuing the old tradition of saying "My <x> is better than yours".

No, we're saying if you are a single person who wants to talk to
several thousand that already speak the same language, trying to
insist that those thousands always use a subset of their language that
you happen to speak so you don't have to bother to learn the rest of
the language probably won't get you far.
--
Bob Larson	Arpa: Blarson@Ecla.Usc.Edu
Uucp: {sdcrdcf,cit-vax}!oberon!skat!blarson	blarson@skat.usc.edu
Prime mailing list (requests):	info-prime-request%fns1@ecla.usc.edu
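A rough sketch of the column-80 continuation convention proposed above, in Python. The post only says "put a \ in column 80"; the rule used here for a line that is itself exactly 80 characters and ends in a backslash (split it anyway) is an assumption added to keep decoding unambiguous. The names fold and unfold are hypothetical, and records are assumed not to be blank-padded to a fixed length.

```python
WIDTH = 80   # record length taken from the discussion
MARK = "\\"  # continuation marker placed in column 80


def fold(line):
    """Split one long line into records of at most WIDTH characters."""
    records = []
    # Keep splitting while the line is too long, or while sending it whole
    # would put a literal backslash in column 80 (assumed tie-break rule).
    while len(line) > WIDTH or (len(line) == WIDTH and line.endswith(MARK)):
        records.append(line[:WIDTH - 1] + MARK)
        line = line[WIDTH - 1:]
    records.append(line)
    return records


def unfold(records):
    """Rejoin records produced by fold() into the original lines."""
    lines, current = [], ""
    for record in records:
        if len(record) == WIDTH and record.endswith(MARK):
            current += record[:-1]   # drop the marker, expect more
        else:
            lines.append(current + record)
            current = ""
    return lines


if __name__ == "__main__":
    article_line = "x" * 200
    assert unfold(fold(article_line)) == [article_line]
```

Only the gateway would need this; a record that is shorter than 80 columns, or does not end in a backslash, is always the end of a line.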
henry@utzoo.UUCP (Henry Spencer) (10/19/87)
> It is a problem, and what we should do about it is fix the news-reading
> and news-transferring programs to handle such messages in a reasonable
> manner. Real soon now, >80 character lines will become the norm, and
> we had better be ready for them.

Better yet, we should stay compatible with existing practice -- a matter
of considerable importance in a network like this, where coordinated
software updates are utterly impossible -- and let the long-linists fix
*their* software to present text the way they like it while adhering to
existing standards for inter-system transmission.

> Many people have now discovered that the easiest and most natural way
> to make text be screen-width-independent is to use <newline> as a
> paragraph separator, not a line separator...

Actually, text formatters discovered that it was quite possible to have
text be output-device-width-independent without this silly incompatibility
some twenty or more years ago. Just notice the empty line that separates
the paragraphs. (Oh yes, and read the ASCII standard about the meaning of
newline, so you know what you're trying to be compatible with.)
--
"Mir" means "peace", as in           | Henry Spencer @ U of Toronto Zoology
"the war is over; we've won".        | {allegra,ihnp4,decvax,utai}!utzoo!henry
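The alternative argued here, leaving the transmitted format alone and refilling on the reader's side, can be sketched the same way. This is a minimal illustration in Python, assuming paragraphs are separated by one or more empty lines; the name refill and the width of 72 are made up for the example, and it deliberately ignores quoting and indentation.

```python
import re
import textwrap


def refill(article, width=72):
    """Rejoin each blank-line-separated paragraph and wrap it to width."""
    paragraphs = re.split(r"\n\s*\n", article.strip("\n"))
    refilled = []
    for paragraph in paragraphs:
        # Existing line breaks inside a paragraph are treated as spaces.
        joined = " ".join(paragraph.split())
        refilled.append(textwrap.fill(joined, width=width))
    return "\n\n".join(refilled)


if __name__ == "__main__":
    sample = "Lines end in a single NL,\nparas end in a double NL.\n\nSecond para."
    print(refill(sample, width=30))
```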
fair@ucbarpa.Berkeley.EDU (Erik E. Fair) (10/20/87)
There are two issues here:
	1. Netnews transport
	2. Netnews presentation
The first issue is perfectly clear to me: all systems should be
able to transmit netnews articles through their gizzards without
change (excepting those changes in the headers that are mandated
by normal netnews operation, like updating "path:"). If there is
some type of netnews article that some minority of the network
can't swallow, then they're broken, and should be fixed, or left
on the periphery of netnews distribution so that their brokenness
won't affect the rest of the network. I do not intend to preclude
IBM systems from storing things in some internal format that is
more efficient for them; I just want them to understand that when
they transmit such an article to the outside world that the article
should be converted back to what the rest of the network views as
"normal": ASCII, with no transliterations, substitutions, or other
information loss.
The second issue is a bit more thorny. Taken to logical extreme,
we need to write articles in some formatting or page description
language, which the user interfaces interpret for whatever display
the user is using. SGML, anyone? Or perhaps {n,t,dit}roff? Maybe
PostScript?
Whatever we finally choose should be relatively easy to interpret,
easy to learn and write things in (nroff with -ms isn't so bad, if
you don't do anything too fancy), and yet powerful enough to do the
sort of fancy things you might see on a Sun or Macintosh. Not a
weekend hack project, it seems to me.
Given that none of the existing user interfaces is prepared to deal
with this sort of thing automatically (sure, you can pipe articles
to external interpreters, but that's not the point), we have to
make some assumptions, and the prevailing assumptions are 24 lines
of 80 ASCII characters, with various format effectors like tabs,
blank lines and form feeds. People who violate these assumptions
should bear in mind that, in making their articles harder to read
on what is certainly the standard display size on the USENET today,
they are decreasing the probability that their message will be read
and understood.
	Erik E. Fair	ucbvax!fair	fair@ucbarpa.berkeley.edu
david@ms.uky.edu (David Herron -- Resident E-mail Hack) (10/20/87)
In article <4756@oberon.USC.EDU> blarson@skat.usc.edu (Bob Larson) writes:
>In article <7526@g.ms.uky.edu> david@ms.uky.edu (David Herron -- Resident E-mail Hack) writes:
>>yes, I did exactly that for a long time with a news feed we had
>>BUT ... compress and atob/btoa don't run on the IBM 308x that's
>>our other neighbor on BITNET.
>Who said it had to be compress and btoa?

compress/btoa happen to be what I used since it worked with a couple
of the sites I wanted to exchange news with. It could be something
else.

>>Some of us (you included) are trying to free this network from its
	[ I am speaking to Erik here ... ]
>>reliance on Unix. Building the WorldNet and such like. But what will
>>the IBM people on BITNET think if they start seeing every article come
>>in with 2000 character long lines because someone on a Unix machine
>>wanted "automatic formatting" of his paragraphs? They'll only be able
>>to read the first 80 (132?) characters of each paragraph.
>So who's forcing them to truncate????

BITNET itself is doing the truncation. Think that IBM only makes
high speed CARD PUNCHs and BITNET begins to make sense.

> Why can't they set up some
>continuation line convention?

To be honest, I don't think they really see the problem. Also there's
a bit of a chicken-and-egg problem. There are news readers and
transport agents for IBM mainframes... But the use isn't very
widespread. And the people doing it don't really understand that
tab preservation or { and } preservation are needed things.

Also ... most of the equivalent sort of traffic gets handled by their
LISTSERVers. It's a distributed mailing list handler which allows
people to subscribe/unsubscribe by themselves, and automagically
subscribes them to the nearest LISTSERV.

I don't really see a good solution. Potentially they could be
a valuable addition to this WorldNet thingie we're trying to build.
Doing things which are against their operating systems' assumptions
is as irritating to them as their doing things against our os's
assumptions.

> A possible example would be to put a \
>in column 80 to indicate that the next line is really part of the
>current line.

Bad example. Suppose a Makefile is posted which has a line exactly 80
characters long with a \ as the last character ...

I just remembered there's already one format in use that could be
used ... The Listserv-punch format ... it can at least handle long
records virtually within an 80 column punch file. And there's
software around to encode/decode it already ... I'll have to look
into this ...

I think their current software works sort-of like ours. A batch
arrives, gets put into a /usr/spool/news equivalent, and is
transmitted from there. In order to not transmit the munged copy
of the article across the net they'd have to do a non-trivial
re-working of their systems to achieve an effect which won't even
be visible to themselves.

>While we're talking about fixing the news problems caused by bitnet,
>could they standardize an ascii <-> ebcdic conversion table for this
>use and make sure that tabs don't get converted to spaces? (The
>conversion breaks patch files, sendmail.cf files, etc.)

For my part ... I've determined that the only munging (at least for
the news software at psuvm.bitnet) happens for articles which go:

	psuvax1 -> psuvm.bitnet -> ukma
	(i.e. were transmitted by UREP)

For our feeds to the outside world I block any articles which arrived
here via psuvax1 ...

The only disaster I know of from a munged article which went through
these links was one of the patches-for-patch. Its tabs were changed
to spaces causing it to be useless unless you used the -l switch, but
many people didn't know about -l. (It so happened that
sdcsvax -> burdvax -> psuvax1 -> ukma -> cbosgd was VERY VERY fast
that day).

I think we all understand the problems.

Personally I don't like the idea of using VERY VERY long lines anyway.
It looks ugly and like the person doesn't know how to use their editor
very well.

Many people have pointed out that you look for paragraph breaks by
blank lines. Yah, I use blank lines for paragraph breaks, but not all
do. For another case, what will happen to the above quotations if they get
automatically-formatted on display? Won't they stop looking like quotations?
--
<---- David Herron, Local E-Mail Hack, david@ms.uky.edu, david@ms.uky.csnet
<---- {rutgers,uunet,cbosgd}!ukma!david, david@UKMA.BITNET
<---- I thought that time was this neat invention that kept everything
<---- from happening at once. Why doesn't this work in practice?
owens@psuvax1.psu.edu (Robert Michael Owens) (10/21/87)
first, ascii machine -> ascii machine over bitnet using urep can be done
so that an image of the file is transferred using existing code. as alf
would say -- no problem. the problem occurs when an ascii machine ->
unknown machine transfer occurs. in this case the ascii machine must
assume the lowest common denominator (a brain damaged ibm machine).
hence, the unix ascii byte stream must be converted into either ibm
punch or print records. most ibm systems require punch records to be
exactly 80 characters (fixed length records) and print records to be
less than or equal to 132 characters (variable length records). also a
ccc (would you believe channel command code) or asa character should be
prepended to a print record. furthermore several records (which may
have nop ccc's so the user can't really see them) may have to be
prepended to the file (how many can say :read card). to make things
worse there is no one ebcdic standard (hence the left and right bracket
problem). the ebcdic tab is not correctly interpreted by some (most)
ibm packages (editors, etc), etc. urep converts a file so that if the
file is printed on either the ascii or ebcdic host, the listings are
the same. (how many can say skip to prime page)

second, some (most) ibm hosts can handle very long records. the problem
is not all hosts can. big blue solves this problem by encapsulating the
file (how many can say diskdump or netdata) when it is xfered. also,
some jes'es and rscs2 (as does urep 3.0) can also handle spanned
records. rscs1 also had spanned records but in a way which was pretty
much incompatible with every thing else (special ccc's).

owens

hay. i don't know what i'm talking about either. i just wrote the code.
clewis@stm386.UUCP (Chris Lewis) (10/21/87)
Regarding the discussions about 80 character truncations etc...
I haven't actually seen BITNET, but isn't it primarily VM/CMS machines
communicating by RSCS?  (Good 'ol "CP SET PUN ROUTE...", "PUNCH FILE..." 
etc.)  (I used to be a bit of a VM/CMS hacker till I saw the light :-)
That's a little archaic - if you changed the "PUNCH FILE" to "DISK DUMP"
it would be able to transmit any kind of file (read: RECFM V, 
LRECL=anything).  "DISK LOAD" on the other side.  And, it's relatively
easy to have the software figure out itself which to do (to maintain
compatibility with both).  Depending on the software
there might be some things you'd have to do with the article reading
code ("rnews" equivalent).  Mind you, this is hacking - and if you
think that getting people to upgrade their USENET software on UUCP is
hard...
IBM had (3-4 years ago) an internal news system that worked sort of similarly -
I believe that it was an ultra-trivial set of EXECs that merely read
incoming "punch" files, figured out which "newsgroup" each was in and
then appended the article to a "newsgroup file".  There was a central
repository where you sent articles, and it broadcast them to all sites
that "subscribed to the net".  The user interface
was merely "link to news disk" and then the user xedit'd the newsgroups
they wanted to see.  Xedit can handle files of any width.  As you can 
well imagine, the traffic wasn't particularly high.
The point I'm trying to make is that, yes, the virtual card punch is
RECFM=F, LRECL=80, but with the standard software available for
handling the PUNCH this is no longer a limitation on what you can 
send.  Must we remain compatible with a limit on a "peripheral network"
that hasn't been a limit on those systems for ages?   (predates VM/SP!)
Yes, lines over 80 columns are a bit of a pain on my VT100, but c'est la 
vie.
Of course, ASCII<->EBCDIC translation is a real b***h with those thingies.
Braces (there are two different pairs of codes for these - depending
on the peripheral), square brackets (is anybody's 3274 GENed to display
these?), tildes (Yes Virginia, there is a tilde in EBCDIC... somewhere),
carets (Weeelll, you can download a font for one...), tabs? (what's a tab?)
Actually, as far as articles originating in ASCII-land are concerned,
I would prefer that BITNET store them totally unchanged (In ASCII, RECFM=U).
Then, when a user wants to see something, BITNET figures out whether
to use its normal article "presenter", or to run it thru an ASCII-EBCDIC
translation when the user wants to see it.  Then, ASCII-land articles
that go to another ASCII-land site via some BITNET site have no changes
whatsoever.  Of course though, I'm dreaming...
-- 
Chris Lewis, International Semi-Tech Microelectronics Inc.
{uunet|utzoo}!mnetor!stm386!clewis
karl@haddock.ISC.COM (Karl Heuer) (10/22/87)
In article <37e7ff5a.b8ab@apollo.uucp> rees@apollo.uucp (Jim Rees) writes:
>I don't see why we should have to change the format of the text as sent.
>It's easy to tell where lines and paragraphs end with the existing
>format.  Lines end in a single NL, paras end in a double NL.

I wish this were true. Unfortunately, there are some folks out there who use
"\n[ \t][ \t]*" rather than "\n\n" as their paragraph separator.

Write a filter that recognizes both formats, you say? Good idea, but now I
have to worry about the people who think that indentation is a good way to
highlight quoted text. And their counterparts who believe that the quoted
text should be left as is, and the reply indented. Intelligent intervention
is required at this point, and since AI doesn't exist, that means a human.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
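To make the difficulty concrete, here is a small Python sketch of a filter that accepts both separators named above: an empty line, or a newline followed by indentation. The pattern and the name split_paragraphs are illustrative assumptions, not an existing tool. As the post predicts, it misfires on indented quotations, treating every indented line as a new paragraph.

```python
import re

# A paragraph break is either an empty (or whitespace-only) line, or a
# newline that is followed by indentation on the next line.
PARA_BREAK = re.compile(r"\n[ \t]*\n|\n(?=[ \t])")


def split_paragraphs(text):
    """Split text on both blank-line and indentation paragraph breaks."""
    return [p for p in PARA_BREAK.split(text) if p.strip()]


if __name__ == "__main__":
    article = "First para\ncontinues.\n\nSecond para.\n  Third, indented.\nmore"
    print(split_paragraphs(article))

    # Indentation used to highlight a quotation defeats the heuristic:
    reply = "Reply text\n    quoted line one\n    quoted line two"
    print(len(split_paragraphs(reply)))
```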
karl@haddock.ISC.COM (Karl Heuer) (10/22/87)
In article <7541@e.ms.uky.edu> david@ms.uky.edu (David Herron -- Resident E-mail Hack) writes:
>In article <4756@oberon.USC.EDU> blarson@skat.usc.edu (Bob Larson) writes:
>>A possible example would be to put a \ in column 80 to indicate that the
>>next line is really part of the current line.
>
>Bad example. Suppose a Makefile is posted which has a line exactly 80
>characters long with a \ as the last character ...

Then, by this convention, the first 79 characters would be displayed on the
first line, followed by a \ for continuation, and the 80th character (\)
would appear alone on the second line. Completely unambiguous, although ugly.

>Personally I don't like the idea of using VERY VERY long lines anyway.
>It looks ugly and like the person doesn't know how to use their editor
>very well.

Well, the suggestion was that the newsreading program should know how to
display it properly.

>... For another case, what will happen to the above quotations if they get
>automatically-formatted on display? Won't they stop looking like quotations?

Again, not if the newsreader formatter is smart. The quoted text would look
like ">very long line of text\n" internally, but would display as if it were
">very\n>long\n>line\n>of text\n\n". More generally, the internal format
should be something like ">\{margin}very long line of text\{para}" so that
strings other than ">" can be properly replicated.

(Btw, I use a similar convention when writing C programs. I try to avoid
breaking a line just because it's getting close to the margin -- the person
reading the code may have a different screen width or tab stops. Now someone
just needs to write an editor that will display such long lines in a more
conventional format.)

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
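A sketch of what such a "smart" formatter might do with a quoted long line, in Python. It assumes the quote margin is a leading run of the characters >, } or | (the three styles used in this thread) plus one optional space, and it replicates that margin on every screen line. The name wrap_quoted and the margin pattern are assumptions for illustration; greedy wrapping will not necessarily break at the same words as the hand-written example in the post.

```python
import re
import textwrap

# Assumed quote margin: a run of quote characters plus one optional space.
QUOTE_MARGIN = re.compile(r"[>}|]+[ \t]?")


def wrap_quoted(line, width=80):
    """Wrap one stored line, repeating its quote margin on each screen line."""
    match = QUOTE_MARGIN.match(line)
    margin = match.group(0) if match else ""
    wrapped = textwrap.fill(
        line[len(margin):],
        width=width,
        initial_indent=margin,
        subsequent_indent=margin,
    )
    # textwrap.fill returns "" for an empty body; keep the bare margin then.
    return wrapped or margin.rstrip()


if __name__ == "__main__":
    print(wrap_quoted(">very long line of text that keeps going and going", 20))
```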
jerry@oliveb.UUCP (Jerry Aguirre) (10/22/87)
In article <7526@g.ms.uky.edu> david@ms.uky.edu (David Herron -- Resident E-mail Hack) writes:
>All I wanted to say in my original posting was that we should always
>keep in mind the least-common-denominator. At the moment it's 80x24
>screens. But I really like the 66 line by 96 column display on
>my Blit... :-)

ACTUALLY I HAVE SEEN MORE THAN A FEW ARTICLES THAT WERE POSTED ON UPPERCASE
ONLY TERMINALS. SOME WERE EVEN RESTRICTED TO 40 COLUMNS. IF WE ARE GOING TO
RESTRICT OURSELVES TO THE LEAST-COMMON-DENOMINATOR THEN LET US USE 40 COLUMN
UPPERCASE ONLY. OH, NO BRACES, TILDE, OR PIPE SYMBOLS BECAUSE THEY DON'T
PRINT ON SOME TERMINALS.

All :-) if you couldn't tell.

				Jerry Aguirre
				Systems Administration
				Olivetti ATC

(Actually when I read an all UPPER CASE article I am left with the
impression that the writer has been SHOUTING at me.)
nick@nswitgould.OZ (Nick Andrew) (10/22/87)
in article <7526@g.ms.uky.edu>, david@ms.uky.edu (David Herron -- Resident E-mail Hack) says:
|
| Some of us (you included) are trying to free this network from its
| reliance on Unix. Building the WorldNet and such like. But what will
| the IBM people on BITNET think if they start seeing every article come
| in with 2000 character long lines because someone on a Unix machine
| wanted "automatic formatting" of his paragraphs? They'll only be able
| to read the first 80 (132?) characters of each paragraph.
|

Gee whiz, if the IBM OS can't handle it then it must be accomplished
by the gateway machine(s). How many gateways are there between BITNET and
UUCP? It should be a (relatively) simple matter for each gateway processor
to fold long lines before the IBMs get hold of it.

Slower? Nah ... a couple of instructions!

ACSnet:  nick@nswitgould.oz   zeta@runx.ips.oz
UUCP:    ...!uunet!munnari!nswitgould.oz!nick
Fidonet: 3:713/602   ACSgate: 3:713/603 (nick@zeta.fido@nswitgould.oz in development)
"Anything that is moral for a group to do is moral for one person to do"
	- Clark Fries in Heinlein's "Podkayne of Mars".
IRWIN@pucc.Princeton.EDU (Irwin Tillman) (10/22/87)
I distribute a VM/CMS implementation of netnews, and it deals properly
with lines > 80 characters.  Since it has hooks for communicating with
other sites, it is up to the local news admin to specify a transport
mechanism that will preserve long lines (and do ASCII/EBCDIC character
translation "properly" if it is necessary).  Two methods that may be
available (depending on software and hardware available at each site)
are SENDFILE and ftp.
 
Irwin Tillman           BITNET: IRWIN@PUCC
Princeton University    UUCP: {allegra,ihnp4,cbosgd}!psuvax1!PUCC.BITNET!IRWIN
allbery@ncoast.UUCP (Brandon Allbery) (10/23/87)
As quoted from <7526@g.ms.uky.edu> by david@ms.uky.edu (David Herron -- Resident E-mail Hack):
+---------------
| In article <21314@ucbvax.BERKELEY.EDU> fair@ucbarpa.Berkeley.EDU (Erik E. Fair) writes:
| >That's not reasonable. The reasonable approach is to do a trivial
| >encapsulation or encoding that makes it possible to move USENET
| >articles (no matter what their characteristics are) through BITNET,
| >or any other strange network.
|
| BUT ... compress and atob/btoa don't run on the IBM 308x that's
| our other neighbor on BITNET. ALSO, in both cases their underlying
| operating systems have that record-oriented mentality.
+---------------

At least one Fido-compatible system uses an encoding such that lines end in
^M and paragraphs end in ^M^J; and the Fido standard is paragraph-oriented,
NOT line-oriented, so as to encourage word wrapping. I suggest that a system
like this, with ^M inserted between words to force lines to < 80 characters,
would work fine without breaking filesystems based on fixed-length records
rather than variable-length ones (i.e. lines).

(For UNIX, ^M and ^J would seem to be natural choices. These can easily be
changed to ^M and ^M^J for non-UNIX sites, a' la "text mode" umodem and
kermit.)
--
Brandon S. Allbery		necntc!ncoast!allbery@harvard.harvard.edu
{{harvard,mit-eddie}!necntc,well!hoptoad,sun!mandrill!hal}!ncoast!allbery
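One possible reading of the UNIX variant of this suggestion, sketched in Python: ^M (carriage return) marks a soft break that a reader may re-wrap, and ^J (newline) ends the paragraph. The function names and the choice of 79 as the segment width are assumptions; the sketch also assumes words are separated by single spaces, since the soft break replaces one space.

```python
import textwrap

SOFT = "\r"  # ^M: line break that carries no meaning, safe to re-wrap
HARD = "\n"  # ^J: real end of paragraph


def encode_paragraph(paragraph, width=79):
    """Break a paragraph into segments of < 80 characters joined by ^M."""
    segments = textwrap.wrap(paragraph, width=width)
    return SOFT.join(segments) + HARD


def decode(text):
    """Recover paragraphs: soft breaks become spaces, ^J ends a paragraph."""
    return [p.replace(SOFT, " ") for p in text.split(HARD) if p]


if __name__ == "__main__":
    para = " ".join(["word"] * 40)
    encoded = encode_paragraph(para)
    assert decode(encoded) == [para]
```

A record-oriented system sees only short segments, while a paragraph-oriented reader can still discard the soft breaks and re-wrap to any width.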
rhorn@infinet.UUCP (Rob Horn) (10/24/87)
I don't think that these simple solutions will work well. There are
many aspects of an article that have to be handled properly:
indentation of quotations, marking inclusions from previous postings,
poetry, pictures, etc.

A new format that can convey all this information and still meet the
needs of the most restrictive transport mechanisms would be to define
a minimal set of TeX macros that encompass the kinds of text
structures that news articles need. Then the news reading software
can tailor the formatted display to the capabilities of the display
hardware. The super fancy bitmap displays get spiffy formats and the
CRT users get the same old stuff.

This would have one drawback(?). Since information like ``prior
article inclusion'' is now formatted locally a poster could not
control whether the display uses >>'s, font, or indentation to signify
inclusion. Similarly other text structures might display differently
on different devices.

I don't think this overall approach is yet practical. A recognizing
filter seems plausible enough, although more complex than these early
posters seem to realize. But on the display side I think that the
computational load would be excessive. I can just imagine our poor
little 11/750 attempting to run multiple TeX's for all our news
readers. Maybe some fast CRT oriented substitute could be dreamed up.

Another set of problems is implementing this in a manner that allows
for a rational transition period. It must be able to coexist with
prior versions of software for several years --- this being the
approximate lifespan of obsolete versions of news software. It must
be easy to add as an upgrade. Both pose real difficulties.

TeX is not the only suitable system, but it is well suited to
conveying structure independently from text. If the other problems
can be overcome then the selection of what formatting system to use
becomes interesting. I have my doubts about the suitability of either
troff or Postscript (and thus NeWS) because both of these are too
close to the display device and have already mapped some of the
textual structures into specific formatting concepts.
--
				Rob  Horn
	UUCP:	...harvard!adelie!infinet!rhorn
	Snail:	Infinet,  40 High St., North Andover, MA
	(Note: harvard!infinet path is in maps but not working yet)
dce@mips.UUCP (David Elliott) (10/25/87)
I'm really glad to see that my question has sparked so much thought
and discussion.
With C news in its Alpha stage, it might be nice to have some kind
of interim solution. For example, the new news posting mechanism
(still inews?) could look at the message and, if it finds any long
lines (where "long" can be arbitrarily set to 80 characters for now),
print the message:
	Warning: This article contains lines longer than 80 characters,
	         making it difficult for some people to read. The
	         article has been sent, but you should refrain from
	         doing this in the future.
This may be better to do in the news posting front ends (postnews,
Pnews, etc.), which could allow users to edit the article again to
remedy the situation, but this is more work to implement.
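
[Editorial sketch, not part of the original post.] The check described above is simple enough to show in a few lines of Python. This is only an illustration under assumptions: `long_lines` and `warn_about_long_lines` are hypothetical names, the shortened warning wording is not the post's exact text, tabs are assumed to expand to 8-column stops, and nothing here reflects how inews, postnews, or Pnews are actually written.

```python
import sys

LIMIT = 80  # "long" is arbitrarily set to 80 characters, as in the post


def long_lines(article, limit=LIMIT):
    """Return (line_number, width) for each line wider than `limit`."""
    found = []
    for number, line in enumerate(article.splitlines(), start=1):
        width = len(line.expandtabs(8))  # assumption: tab stops every 8 columns
        if width > limit:
            found.append((number, width))
    return found


def warn_about_long_lines(article, limit=LIMIT, out=sys.stderr):
    """Print a warning if needed; return True when one was printed."""
    offenders = long_lines(article, limit)
    if not offenders:
        return False
    numbers = ", ".join(str(n) for n, _ in offenders)
    print(f"Warning: This article contains lines longer than {limit} "
          f"characters (lines {numbers}).", file=out)
    return True
```

A posting front end could use the returned line numbers to send the user back into the editor at the offending lines, which is the extra work the paragraph above alludes to.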
Another idea might be to have some sites (backbones) scan articles
for long lines, and send mail to the poster with a message similar 
to the warning above.
Yet another idea might be to add a Max-Line-Length: header field,
generated by inews for articles with lines longer than 80 (again,
chosen arbitrarily, and I would even suggest 40 in this case).
This field could be used by news software to reformat articles if
the user wishes, or in the rn KILL file to junk such messages (as
I said, if it looks too hard to read, I tend to toss it).
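
[Editorial sketch, not part of the original post.] One possible reading of the Max-Line-Length: idea, in Python. It assumes an article is a header block and a body separated by the first blank line, and that the field carries the length of the longest body line; the post leaves the field's value unspecified, and `add_max_line_length` is a hypothetical name, not an existing news-software function.

```python
def add_max_line_length(article, threshold=80):
    """Append a Max-Line-Length: header when the body has lines over `threshold`."""
    # Headers and body are separated by the first blank line.
    headers, sep, body = article.partition("\n\n")
    longest = max((len(line) for line in body.splitlines()), default=0)
    if not sep or longest <= threshold:
        return article  # no body found, or nothing long enough to flag
    return f"{headers}\nMax-Line-Length: {longest}{sep}{body}"
```

Reading software could then compare the field against the user's window width and decide whether to reformat, and a kill file could match on the header line alone without scanning the body.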
-- 
David Elliott		dce@mips.com  or  {ames,decwrl,prls}!mips!dce
henry@utzoo.UUCP (Henry Spencer) (10/27/87)
> ... Unfortunately, there are some folks out there who use > "\n[ \t][ \t]*" rather than "\n\n" as their paragraph separator... > Write a filter that recognizes both formats, you say? Good idea, but now I > have to worry about the people who think that indentation is a good way to > highlight quoted text... [and so on] However, it is probably easier (if that is the word) and less painful to convince people to adhere to standards in such things than to convince them to shift to a new and *incompatible* standard. The former can be at least partly automated, by the way. -- PS/2: Yesterday's hardware today. | Henry Spencer @ U of Toronto Zoology OS/2: Yesterday's software tomorrow. | {allegra,ihnp4,decvax,utai}!utzoo!henry
kimcm@ambush.UUCP (Kim Chr. Madsen) (10/30/87)
In article <8831@utzoo.UUCP> henry@utzoo.UUCP (Henry Spencer) writes: >However, it is probably easier (if that is the word) and less painful to >convince people to adhere to standards in such things than to convince them >to shift to a new and *incompatible* standard. The former can be at least >partly automated, by the way. Which standard????? The standards for writing style are dependent upon several things: 1) Whom are you writing to (Newspaper, Tech. Journal, Letter to Mom, etc.) 2) Where do you come from (different countries have different style standards). etc. etc. Kim Chr. Madsen.
henry@utzoo.UUCP (Henry Spencer) (11/07/87)
> Which standard????? > The standards for writing style are dependent upon several things: > 1) Whom are you writing to (Newspaper, Tech. Journal, Letter > to Mom, etc.) So we set up a specific standard for Usenet. No big deal, except for the highly non-trivial problem of getting people to adhere to it. My point remains: getting people to use a new standard will be easier if it doesn't require scrapping everything that exists and starting over. -- Those who do not understand Unix are | Henry Spencer @ U of Toronto Zoology condemned to reinvent it, poorly. | {allegra,ihnp4,decvax,utai}!utzoo!henry