greyham@hades.ausonics.oz.au (Greyham Stoney) (07/19/90)
Why don't all you people divert your energies into making your news system handle 8 bit news rather than developing new and incompatible ways of bitbashing your files into a format that both news and your unpacking program (be it /bin/sh, sed, awk or whatever) can cope with? Considering the advantages (especially to binaries groups), it's gotta be the most worthwhile step to take. It may mean changes at lots of sites, but we gotta start somewhere.
[ Insert prediction of imminent death-of-net here should net decide that status-quo is more important than advancing with the times :-) ]
Greyham.
--
/* Greyham Stoney: Australia: (02) 428 6476
 * greyham@hades.ausonics.oz.au - Ausonics Pty Ltd, Lane Cove, Sydney, Oz.
 * Neurone Server: Brain Cell not Responding. */
tneff@bfmny0.BFM.COM (Tom Neff) (07/20/90)
In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:
>Why don't all you people divert your energies into making your news system
>handle 8 bit news rather than developing new and incompatible ways of
>bitbashing your files into a format that both news and your unpacking program
>(be it /bin/sh, sed, awk or whatever) can cope with?
8 bit news would help only slightly with things OTHER than the transmission of binary files via news. Seven bit is basically doing the job now; the remaining issues (envelope consistency, line lengths, character sets, paragraph wrapping etc) aren't going to be solved by going to 8 bits.
As for transmitting binary files, 8 bit alone is insufficient. No binary ought to be transmitted without self contained integrity checking as well as the means to split it up into pieces of acceptable size. Hence some kind of packaging is unavoidable. Given that fact, why not go the small additional distance and have the packaging map into 7 bits?
I have never thought, and do not think now, that transmitting binaries is an appropriate activity for Usenet... but a significant minority disagrees, and since they can control who does and doesn't carry the binary bandwidth, it's fine with me. Either way, 8 bit articles don't fix anything fundamentally broken, so I'd concentrate energies elsewhere.
>[ Insert prediction of imminent death-of-net here should net decide that
> status-quo is more important than advancing with the times :-) ]
[ Insert ritual threat to take dollies and go home here. :-) ]
--
1955-1975: 36 Elvis movies. | Tom Neff
1975-1989: nothing.         | tneff@bfmny0.BFM.COM
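The packaging argument above (integrity check, splitting into parts, mapping into 7 bits) can be sketched roughly as follows. This is only an illustration: the header layout, the function names `package` and `unpack`, and the use of base64 with CRC-32 are assumptions made for the example, not the format of any tool named in the thread.

```python
import base64
import zlib

def package(data: bytes, part_size: int = 50_000) -> list[str]:
    """Split data into parts; each part is 7-bit text carrying its own CRC."""
    chunks = [data[i:i + part_size] for i in range(0, len(data), part_size)] or [b""]
    total = len(chunks)
    parts = []
    for n, chunk in enumerate(chunks, start=1):
        # Hypothetical one-line header: part number, part count, checksum.
        header = f"part {n}/{total} crc32 {zlib.crc32(chunk):08x}"
        body = base64.encodebytes(chunk).decode("ascii")
        parts.append(header + "\n" + body)
    return parts

def unpack(parts: list[str]) -> bytes:
    """Reassemble parts in numeric order, rejecting any part whose CRC fails."""
    decoded = []
    for part in parts:
        header, _, body = part.partition("\n")
        fields = header.split()
        number = int(fields[1].split("/")[0])
        chunk = base64.decodebytes(body.encode("ascii"))
        if f"{zlib.crc32(chunk):08x}" != fields[3]:
            raise ValueError(f"checksum mismatch in part {fields[1]}")
        decoded.append((number, chunk))
    return b"".join(chunk for _, chunk in sorted(decoded))
```

Note that `part_size` counts raw bytes; the encoded text of each part is roughly a third larger. Because every part carries its own checksum, a damaged part can be identified and re-requested without resending the rest.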
kibo@pawl.rpi.edu (James 'Kibo' Parry) (07/21/90)
In article <15688@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:
>>Why don't all you people divert your energies into making your news system
>>handle 8 bit news rather than developing new and incompatible ways of
>>bitbashing your files into a format that both news and your unpacking program
>>(be it /bin/sh, sed, awk or whatever) can cope with?
>
>8 bit news would help only slightly with things OTHER than the transmission
>of binary files via news. Seven bit is basically doing the job now;
>the remaining issues (envelope consistency, line lengths, character sets,
>paragraph wrapping etc) aren't going to be solved by going to 8 bits.
Going to eight bits WOULD be nice for people using languages other than English; as it is now, if you're in, say, Finland, and you have a terminal that does the Finnish variant of ASCII, people outside Finland are going to see braces, brackets, backslashes, etc., wherever you say something with an accented character.
However, there are some good 8-bit character sets (IBM PC's, HP's, ECMA-94 Latin, etc.) which could be used so that when someone in Germany types an "o" with an umlaut, in France it'll appear as an "o" with an umlaut and not as a question mark or a bracket or something. Each foreign character could have exactly one representation, as opposed to the current scheme where some systems use ASCII, some use the Swedish variant, etc...
Doing this would probably be best handled by (a) picking a standard (let's say we decide that non-English characters will be located in the ECMA character set.) and then (b) we either give everyone in the world a terminal that can display them (which seems very unfeasible) or else we just build into the next versions of the news-reading software a little option that maps the plus-128 characters onto the PC, HP, ECMA, ASCII, whatever character set as it displays articles.
I'm sure someone will be able to poke holes in this idea, but it seems like something we should at least consider, given that the United States no longer accounts for as much of the Usenet readership as it used to. Comments? -- james "kibo" parry, 138 birch lane, scotia, ny 12302 <-- close to schenectady. kibo@pawl.rpi.edu _________________________________________________ kibo%pawl.rpi.edu@rpi.edu / Kibology / Anything I say is my opinion, userfe0n@rpitsmts.bitnet / is better! / and is the opposite of Xibo's.
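The display-time mapping proposed in step (b) might look something like the sketch below. It assumes, purely for illustration, that the agreed standard on the wire is ISO 8859-1 and that the reader knows the name of the terminal's character set; the function name `display` is hypothetical.

```python
def display(article: bytes, terminal_charset: str) -> bytes:
    """Re-encode an article stored as ISO 8859-1 for one terminal's character set.

    Characters the terminal cannot show are replaced with '?'.
    """
    text = article.decode("latin-1")
    return text.encode(terminal_charset, errors="replace")

# An "o" with an umlaut is byte 0xF6 in ISO 8859-1.
koeln = b"K\xf6ln"
print(display(koeln, "cp437"))   # IBM PC character set
print(display(koeln, "ascii"))   # plain 7-bit terminal
```

The stored article is never altered; only the bytes sent to each terminal differ, so one copy on disk can serve readers with different equipment.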
brad@looking.on.ca (Brad Templeton) (07/21/90)
Why should binaries be split up into smaller bits, particularly bits as small as 50K?
If I'm going to lose parts of a multi-part binary, I may as well lose the whole thing. I wrote the ABE/DABE system to make that easier to deal with, but even so, having to deal with missing parts and assembly, etc., is a pain.
At the very least, raise the limit to something manageable like 500K or 1 meg, set it as an explicit limit in the next RFC, and leave splitting to only the very largest binaries.
But there is a problem. Say we sit down and make a news system that can handle 500K 8 bit binary files. Great. Slowly, people start to run that system. But what moderator is going to post these in his/her binary group, knowing that they will break at many sites for a LONG time to come? So there will have to be two groups, one for pure binaries and one for split ones. And thus we double the load, and we have nothing to encourage the sites running old software to upgrade. So in the end we gain nothing.
This is nothing new. The last major changes in the format of news articles were Supersedes: and References: These were added around 1985 -- that's centuries ago in the computer/networking world.
AND WE STILL CAN'T USE THEM TODAY!!!!
Not a good sign. Drastic measures are needed. I would support a move to design a binary transmission format, have the new releases of B and C news support them, and have all the moderators switch, thus forcing anybody who wants binaries to get off their duffs and upgrade. This would work as binaries are one of the biggest draws of usenet for many sites.
But before doing this, I would say we should sit down at a usenix and list out all the other new features we want, then implement them, because we won't get another chance for 5 years to upgrade the format.
--
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
jfriedl@frf.omron.co.jp (NFF) (07/21/90)
In article <15688@bfmny0.BFM.COM>, tneff@bfmny0.BFM.COM (Tom Neff) writes: > Either way, 8 bit articles don't > fix anything fundamentally broken, so I'd concentrate energies elsewhere. Well, it's certainly not the most important thing [at least to many people], but not having 8 bits is a pain (for example) for me when I send mail with Japanese text in it. I've been able to send mail here around Japan and not have it stripped, but try anything outside the country and you end up with a bunch of gook ("gook" -- that's a technical term, in case you aren't up on your science, for 8-bit text stripped to 7 bits). Again, maybe for 99.44% of the traffic within The States, it doesn't matter in this respect, but as so many students^H^H^H^H^H^H^H^HAmericans don't seem to know, the world is not only a VAX^H^H^H^H^HAmerica. minor digression: My brother told me about some (I think) circa 1981 Unipress Emacs code he was working on long ago which used the high bit as a marker for something in the text. In the code where they were dealing with this was the comment: /* sorry japan */ He always thought that was funny. Me too, now. *jeff* ----------------------------------------------------------------------------- Jeffrey Eric Francis Friedl jfriedl@nff.ncl.omron.jp Direct path from uunet: ...!uunet!othello!jfriedl Omron Electronics, Central R&D Lab, RNA Nagaokakyo, Kyoto 617, Japan Fax: 011-81-75-955-2442 Phone: 011-81-75-951-5111 x154 "sorry, but I can't spell" -me "current memory prices are 4600$ a megabyte on VAX (4/22/81)" -- my '/usr/include/vmparam.h'
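What stripping to 7 bits does to such text can be shown in a few lines. The encoding is an assumption made for the example: the post does not say which one was in use, and EUC-JP is chosen here because every byte of a kanji has its high bit set.

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear bit 7 of every byte, as a 7-bit-only transport would."""
    return bytes(b & 0x7F for b in data)

# Three kanji written with escapes so this file itself stays 7-bit clean.
# Assumption: the sender's system encodes them as EUC-JP.
original = "\u65e5\u672c\u8a9e".encode("euc_jp")
stripped = strip_high_bit(original)

print(original.hex(), "->", stripped.hex())
print(stripped.decode("ascii"))  # still printable, but no longer Japanese
```

The damage cannot be undone at the receiving end, because nothing in the stripped bytes records which of them originally had the high bit set.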
jv@mh.nl (Johan Vromans) (07/21/90)
In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes: > Why don't all you people divert your energies into making your news system > handle 8 bit news ... > Considering the advantages, (especially to binaries groups), ... Very short-sighted. Major advantage is to allow information exchange in local languages that require special character sets. And don't tell me that the "language of the news" is English, since that is only true for agreed-upon international newsgroups. Johan -- Johan Vromans jv@mh.nl via internet backbones Multihouse Automatisering bv uucp: ..!{uunet,hp4nl}!mh.nl!jv Doesburgweg 7, 2803 PL Gouda, The Netherlands phone/fax: +31 1820 62911/62500 ------------------------ "Arms are made for hugging" -------------------------
eps@toaster.SFSU.EDU (Eric P. Scott) (07/21/90)
We have an 8-bit standard: ISO 8859. Great for us Western European-American types. It doesn't help the fj newsgroups much. (or kremvax!gorby) -=EPS=-
Dan@dna.lth.se (Dan Oscarsson) (07/21/90)
In article <+7Y$AV&@rpi.edu> kibo@pawl.rpi.edu (James 'Kibo' Parry) writes:
>In article <15688@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>>In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:
>>>Why don't all you people divert your energies into making your news system
>>>handle 8 bit news rather than developing new and incompatible ways of
>>>bitbashing your files into a format that both news and your unpacking program
>>>(be it /bin/sh, sed, awk or whatever) can cope with?
>>
>>8 bit news would help only slightly with things OTHER than the transmission
>>of binary files via news. Seven bit is basically doing the job now;
>>the remaining issues (envelope consistency, line lengths, character sets,
>>paragraph wrapping etc) aren't going to be solved by going to 8 bits.
>
>Going to eight bits WOULD be nice for people using languages other than
>English; as it is now, if you're in, say, Finland, and you have a terminal
>that does the Finnish variant of ASCII, people outside Finland are going
>to see braces, brackets, backslashes, etc., wherever you say something
>with an accented character.
>
Yes, it is time to start thinking about using an international character set in netnews. This means that 8-bit bytes are used, but not that binary files can be transmitted without encapsulation. Binary files must still be converted into an encoded format that can be checked and unpacked in a controlled manner.
Only one character set should be used for transmitting articles, as it is impossible for everyone to handle every character set in the world. In European talks about a character set to use for mail, ISO 10646 is the best candidate, and it should be fine for netnews also. ISO 10646 has both ASCII and ISO 8859-1 as true subsets, and that will ease compatibility problems.
Local netnews readers will have to convert from ISO 10646 into the character set used locally. Using ISO 10646 allows nearly every letter in the world to be written.
Dan -- Dan Oscarsson Department of Computer Science Lund Institute of Technology e-mail: Dan@dna.lth.se Box 118 S-221 00 Lund, Sweden
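The "true subset" property and the local conversion step can be illustrated with a short sketch. It leans on one assumption worth stating: Python's `str` type works in Unicode code points, which are kept aligned with ISO/IEC 10646 as eventually published, so it stands in for ISO 10646 here. The function names are hypothetical.

```python
def is_true_subset(codec: str, size: int) -> bool:
    """True if byte value n in `codec` always names code point n."""
    return all(ord(bytes([n]).decode(codec)) == n for n in range(size))

def to_local(text: str, local_charset: str) -> bytes:
    """Newsreader-side conversion; unrepresentable characters become '?'."""
    return text.encode(local_charset, errors="replace")

print(is_true_subset("ascii", 128))     # ASCII occupies the first 128 positions
print(is_true_subset("latin-1", 256))   # ISO 8859-1 occupies the first 256
print(is_true_subset("cp437", 256))     # the IBM PC set does not line up
print(to_local("Malm\u00f6", "ascii"))
```

Because the first 256 positions coincide with ISO 8859-1, existing ASCII and ISO 8859-1 text needs no remapping of character values; only the local display conversion differs from site to site.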
scs@lokkur.dexter.mi.us (Steve Simmons) (07/22/90)
jv@mh.nl (Johan Vromans) writes:
:> Why don't all you people divert your energies into making your news system
:> handle 8 bit news ...
:> Considering the advantages, (especially to binaries groups), ...
:Very short-sighted. Major advantage is to allow information exchange
:in local languages that require special character sets.
:And don't tell me that the "language of the news" is English, since
:that is only true for agreed-upon international newsgroups.
The sarcasm lamp is now lit. :-)
So what's your point?
Leave us not forget that 8-bits isn't the answer for all languages,
either. Of course, there aren't many of us reading the current news
who want to read those kanji, katakana, and ghu-only-knows what other
variants. Still, we should all be forced to make software that is
capable of handling it so that the English readers can look at the
Hindi, Korean and Russian postings that flow by.
And I'm sure those guys back at Duke said, "Hey, let's have some
agreed-upon international newsgroups that'll be only in English
and we'll implement to enforce it."
The sarcasm lamp is now off. :-)
Hey, news is ASCII-based, written in English-speaking countries for
English-speaking readers. The fact that it works *at all* for
international and non-English stuff is a wonderful plus. If regional
newsgroups have regional needs, they should go ahead and fill them.
But neither side should expect interoperability.
My understanding is that a number of nordic installations now have
appropriate hacks to encode/decode/display their national character
sets. That's super; I hope that software propagates its way across
the water. But proper gateways and translations (into 7-bit) will
be needed or the postings are gonna break a lot of systems.
henry@zoo.toronto.edu (Henry Spencer) (07/22/90)
In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes: >I would support a move to design a binary transmission format, have the >new releases of B and C news support them... Unless there is something I have overlooked, C News neither knows nor cares whether the body of a message is text or binary. C News is, by intent and I think in practice, 8-bit clean, and it does not care whether the body is split into lines or not (although the headers must follow the standards). Furthermore, it doesn't care whether the article is 5KB or 5MB. (One caution: I speak here of relaynews, expire, etc. The inews shell script uses many existing Unix tools that aren't so tolerant. This is an issue only on sites that post odd messages, though, not on ones that receive them.) There are occasional problems with transport subsystems -- in particular, links that transmit news by mail without encoding it are a major problem -- and the readers are a can of worms, but I don't think C News needs any modifications for this. B News might or might not; I'm not sure. -- NFS: all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology and its performance and security too. | henry@zoo.toronto.edu utzoo!henry
guy@auspex.auspex.com (Guy Harris) (07/22/90)
>We have an 8-bit standard: ISO 8859. Great for us Western >European-American types. It doesn't help the fj newsgroups much. >(or kremvax!gorby) ISO 8859 doesn't help the "fj" newsgroups much, but "kremvax!gorby" could use ISO 8859/5, Latin-Cyrillic alphabet. Did you perhaps mean "ISO 8859/1" rather than "ISO 8859"?
guy@auspex.auspex.com (Guy Harris) (07/22/90)
>This is nothing new. The last major changes in the format of news articles >were Supersedes: and References: These were added around 1985 -- that's >centuries ago in the computer/networking world. > >AND WE STILL CAN'T USE THEM TODAY!!!! I use "References:" all the time; you're just not using the right newsreader. (Hey Wayne! Isn't it soup yet? :-)) No, the References: lines aren't always correct, or complete; "trn"'s thread-constructor also works from Subject: lines (philosophical objections to the null device, please, linking by subject *gets the job done better than only using the reference lines* in today's imperfect world). However, they *are* used, and *do* come in handy....
amanda@mermaid.intercon.com (Amanda Walker) (07/22/90)
In article <1990Jul21.054016.10409@looking.on.ca>, brad@looking.on.ca (Brad Templeton) writes: > I would say we should sit down at a usenix and > list out all the other new features we want, then implement them, because > we won't get another chance for 5 years to upgrade the format. I think it's probably too late for that. My view at this point is that Usenet has become large enough (and established enough as an operational network) that it is effectively frozen. A more fruitful approach is probably to start building "Usenet II". It's easier to migrate people from one system to another than it is to upgrade them in place. Just to take an example, I didn't switch intercon.com over to C news until I had to bring up news from scratch on a new piece of hardware, simply because it was not worth risking a breakdown of the existing service while I got the new stuff running. This may sound shortsighted, but it is an example of what you have to deal with when a system is being used for everyday operation. Sometimes a disruption is worse than putting up with current limitations. Myself, I lean towards the techno-nerd side, and always want the latest and greatest toys. However, any plans for Usenet have to take the majority of the sites and users into account. Rather than telling people they should upgrade, I'd rather build something better and have them decide for themselves that it would be a good idea :-). It minimizes aggravation for everyone, and keeps people from bugging you while you're building the better mousetrap... -- Amanda Walker <amanda@intercon.com> InterCon Systems Corporation
brad@looking.on.ca (Brad Templeton) (07/22/90)
C news may indeed *handle* arbitrary byte stream messages, but does it *support* them?
In particular, input programs like inews must deal with them, and a new header line must be added to classify article body types.
Many body types are possible, including:
    ascii + underlining (current default)
    extended ascii (international char sets, etc.)
    rich text (of some format)
    andrew message
    non-interpreted binary (programs, etc.)
    binary with text header
    XXX format bitmap (gif, tif)
In particular, I would devise code words for every format, such that those code words could conveniently be the names of output programs in some directory, with the default built into the reader, of course.
--
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
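A reader could resolve such code words along the lines of the sketch below. Everything concrete in it is an assumption for illustration: the header name `Body-Type`, the default code word `ascii`, and the function name `viewer_for` do not come from any existing news software.

```python
import os
from typing import Optional

DEFAULT_TYPE = "ascii"  # hypothetical code word for the built-in default

def viewer_for(headers: dict, viewer_dir: str) -> Optional[str]:
    """Return the output program named after the body type.

    None means "use the display code built into the reader".
    """
    body_type = headers.get("Body-Type", DEFAULT_TYPE).strip().lower()
    if body_type == DEFAULT_TYPE:
        return None
    # Accept plain code words only, so a header cannot name a path
    # outside the viewer directory.
    if not body_type.isalnum():
        raise ValueError(f"bad body type: {body_type!r}")
    path = os.path.join(viewer_dir, body_type)
    if not os.access(path, os.X_OK):
        raise LookupError(f"no output program for body type {body_type!r}")
    return path
```

Supporting a new format then amounts to dropping one more executable into the directory, with no change to the reader itself.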
brad@looking.on.ca (Brad Templeton) (07/22/90)
I did not mean that there was no software to use References lines, I have some myself. I meant that the chains are broken far too often. About 6% of followups have no references line, and that results in broken chains on about 40% of usenet messages (every followup to such a followup is broken as well.) So you can't use it like you should. As for supersedes, it works in limited cases, but try *really* using it like I did and you will find a lot of sites don't run it at all, and that there's a bug in the *design* of supersedes, such that it breaks with batching. (Try superseding twice, close enough that they are both in the same news batch) Like I said, these are the last 2 additions, from 5 years ago, and they still do not work today. -- Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473
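One way to see how a 6% omission rate can grow into a much larger share of broken chains is a simple probability model. The model is an assumption, not a reconstruction of how the figures above were measured: it treats each followup as independently missing its References line with the same probability, and the depth of 8 is chosen only as an example.

```python
def broken_fraction(missing_rate: float, depth: int) -> float:
    """Chance that a chain of `depth` followups contains at least one
    article with no References line, assuming independent omissions."""
    return 1.0 - (1.0 - missing_rate) ** depth

for depth in (1, 2, 4, 8):
    print(depth, round(broken_fraction(0.06, depth), 3))
```

Under this model a chain eight followups deep is broken a little under 40% of the time, which is the same order as the figure quoted for the net as a whole.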
frisk@rhi.hi.is (Fridrik Skulason) (07/22/90)
Well - some of us have 8-bit news already - I am for example using an 8-bit 'rn' right now. The program only required a few minor modifications to work properly. The reason we went to 8-bit news and E-mail is quite simple - our alphabet contains 10 characters not found in standard ASCII. Of course I can only post 8-bit articles to our local newsgroups - the rest of the world is still only 7-bit :-(
I fully agree that we need an 8-bit news system (as well as 8-bit E-mail), as this would make life a lot easier for those of us not using English. Modifying the news software to permit the transmission of 8-bit data is trivial - the real problem is the character set issue.
I don't know if the readers of this group are familiar with a similar discussion regarding automatic translation between character sets in the Kermit program. The conclusions reached there seem to apply to the 8-bit News/E-mail discussion as well, though.
Some possible solutions:
(1) Each machine posts articles using the user's character set of choice. To indicate which character set is used, a new field is added to the header. Examples:
    Character-set: CP 870
    Character-set: ISO 8859/4
This is easy to implement, but has one serious drawback - all machines are required to be able to handle all possible character sets.
(2) On every machine the article is translated into one of the ISO 8859/x series of character sets. 8859/1 would probably be most used, as it covers most of the languages of Western Europe. 8859/2, 8859/3, 8859/4 etc. would solve the needs of those using Greek, various Eastern European languages and (I think) Hebrew and Arabic. This would not solve the problem of those using a 16-bit character set. Also, I am not sure if Esperanto is included in any of the ISO 8859/x standards.
(3) All text is transmitted according to the ISO 10646 standard. This has one advantage compared to (2) - it allows the transmission of documents containing 16-bit characters, as well as documents containing characters from more than one of the 8859/x standards. For example, one could send a message with the first part in Russian and the second part in Greek.
My opinion is that (3) is more of a long-term goal - for 95 % of users of Usenet, (2) is all that is needed. But what changes would (2) require ?
Change #1: Any ASCII computer on Usenet must accept 8-bit news and E-mail, and be able to forward articles without changes (in other words - don't strip the eighth bit !!!) This is the only change required from the "English-only" ASCII-sites, where no 8-bit articles would originate or be read.
Change #2: Any computer on Usenet using an extended version of ASCII (CP 437, ISO 8859/x etc) must translate all postings to one of the 8859/x character sets and indicate (in the header) which one is used. This change would be required from European and other non-English-language users.
Change #3: Any computer not using ASCII, but rather EBCDIC (or something else), must translate all postings to one of the 8859/x character sets, instead of just translating to ASCII.
Change #4: Any computer must accept postings in one of the 8859/x character sets and be able to translate them to the character set used by each user.
Problem #1: If the local character set is not able to represent all the characters in the original posting, they must be represented as well as possible. For example - a 7-bit computer receiving a text containing accented vowels might be expected just to drop the accent marks.
Problem #2: Different users - even on the same machine - have different capabilities to display 8-bit text. For example, in Scandinavia it is common for terminals to use a 7-bit character set, where some of the characters (for example { [ ] } |) have been replaced by non-ASCII characters. Other users in the same countries have fully 8-bit terminals (for example PCs running a terminal emulator). The computer must store incoming articles as they arrive and the news/E-mail software must be updated to display them according to the capabilities of each terminal, as indicated by an environment variable.
So - what now ? Is there any interest in creating a "working group" to attack the problem ? Any of the authors of rn, nn, elm or other news/e-mail software out there ? We are of course willing to share our modifications to the programs, and with a bit of work we should be able to have 8-bit news/email running in a few months.
So - any volunteers ?
--
Fridrik Skulason      University of Iceland  |
Technical Editor of the Virus Bulletin (UK)  |   Reserved for future expansion
E-Mail: frisk@rhi.hi.is    Fax: 354-1-28801  |
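Solution (2) together with Problems #1 and #2 can be sketched as a reader-side routine: look up the posting's character set from the header, then render for the user's terminal, dropping accent marks where the terminal cannot show them. The environment variable name `NEWS_CHARSET`, the function names, and the choice to treat a missing header as ISO 8859/1 are all assumptions made for this example.

```python
import os
import unicodedata

def codec_for(header_value: str) -> str:
    """Map a header value such as 'ISO 8859/4' to a Python codec name."""
    family, _, part = header_value.strip().partition("/")
    if family.upper().replace(" ", "") != "ISO8859" or not part.isdigit():
        raise ValueError(f"unsupported character set: {header_value!r}")
    return f"iso8859_{int(part)}"

def drop_accents(text: str) -> str:
    """Decompose accented letters and discard the combining marks."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(ch for ch in decomposed if not unicodedata.combining(ch))

def render(headers: dict, body: bytes, environ=os.environ) -> bytes:
    """Translate a stored article for the terminal named in the environment."""
    text = body.decode(codec_for(headers.get("Character-set", "ISO 8859/1")))
    terminal = environ.get("NEWS_CHARSET", "ascii")  # hypothetical variable
    try:
        return text.encode(terminal)
    except UnicodeEncodeError:
        # Represent the text "as well as possible": drop accents first,
        # then fall back to '?' for anything still unrepresentable.
        return drop_accents(text).encode(terminal, errors="replace")
```

The article is stored exactly as it arrived, so two users on the same machine with different terminals each get the best rendering their equipment allows. Letters with no accent-free decomposition, such as the Icelandic thorn, still come out as '?' on a 7-bit terminal.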
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/22/90)
>>Why don't all you people divert your energies into making your news system
>>handle 8 bit news rather than developing new and incompatible ways of
>
>I have never thought, and do not think now, that transmitting binaries
>is an appropriate activity for Usenet... but a significant minority
Files that use all 8 bits are not necessarily binaries: other languages, bitmaps, etc. Even control characters get munged. There definitely needs to be a more transparent standard for the transmission of news articles. Sites capable of transparently transmitting all 8 bit characters shouldn't have to pay the penalty of 7 bit encoding.
In the meantime, a standard encoding method (and a header to indicate its use) would be useful. Maybe the encoding method could address the "too big for one article" problem too. Keep the transport problems away from the users.
Maybe we do need checksums. At least we could throw away munged articles. Start doing that and I suspect that people would fix their software.
--
Jon Zeeff (NIC handle JZ) zeeff@b-tech.ann-arbor.mi.us
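A checksum header of the kind suggested here could be as small as the sketch below. The header name `X-Body-CRC32` and both function names are hypothetical, and CRC-32 is just one convenient choice of checksum.

```python
import zlib

CHECKSUM_HEADER = "X-Body-CRC32"  # hypothetical header name

def add_checksum(headers: dict, body: bytes) -> dict:
    """Return a copy of the headers with a checksum of the body added."""
    return {**headers, CHECKSUM_HEADER: f"{zlib.crc32(body):08x}"}

def is_intact(headers: dict, body: bytes) -> bool:
    """True if the body matches its checksum, or if no checksum was sent."""
    expected = headers.get(CHECKSUM_HEADER)
    if expected is None:
        return True
    return expected == f"{zlib.crc32(body):08x}"

original = b"hello\tworld\n"
headers = add_checksum({}, original)
print(is_intact(headers, original))
print(is_intact(headers, b"hello   world\n"))  # a relay expanded the tab
```

A check this strict rejects even harmless changes such as tab expansion, so a site that discards every failing article would also discard many that are still readable.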
henry@zoo.toronto.edu (Henry Spencer) (07/23/90)
In article <1990Jul22.062034.20896@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes: >C news may indeed *handle* arbitrary byte stream messages, but does it >*support* them? >In particular, input programs like inews must deal with them, and a new >header line must be added to classify article body types... C news "supports" arbitrary byte-stream messages to exactly the same extent as it "supports", say, poetry: it doesn't give a damn what's inside the message or what's in any headers other than the ones it needs to know about. With the exception of the inews problem, you're purely and simply talking about reader issues, not transport issues, and no changes to C News are necessary. The inews business is a nuisance, but the blame rests primarily with the Unix utilities rather than with inews proper. There is an inews rewrite already in the works, which might improve things somewhat. It might also be worthwhile to define a new input interface without all the goo and dribble that have crept into inews over the years. Being backward compatible was a real pain there. A clean interface could eliminate a lot of messy handling of stuff that inews should not have to care about. (For example, the specs say that inews must try to guess whether the beginning of the input text looks like headers, in which case it *is* headers. This could be eliminated by demanding that the -h flag be used in such cases.) As for content classification, I believe there has already been some work done on this for RFC822-X.400 interfacing, although I don't know the details. -- NFS: all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology and its performance and security too. | henry@zoo.toronto.edu utzoo!henry
henry@zoo.toronto.edu (Henry Spencer) (07/23/90)
In article <==H&NB&@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes: >Maybe we do need checksums. At least we could throw away munged articles. >Start doing that and I suspect that people would fix their software. Geoff and I thought hard about this during C News development. The trouble with checksums is that most people would prefer a slightly mangled copy of an article to no copy of the article. There are all too many transmission channels that do in fact slightly mangle articles (expanding tabs, fiddling with the definition of newline, etc.). Some early test versions of C News did generate a checksum header. We scrapped it because we could not think of anything to do with it that people would want. -- NFS: all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology and its performance and security too. | henry@zoo.toronto.edu utzoo!henry
guy@auspex.auspex.com (Guy Harris) (07/23/90)
>I meant that the chains are broken far too often. About 6% of followups >have no references line, and that results in broken chains on about >40% of usenet messages (every followup to such a followup is broken as >well.) > >So you can't use it like you should. It works well enough for me. It's not perfect, but what is? 1) Merely grouping all the articles in a thread together is a big win (no, this is not a hypothetical assertion, it is an observation based on using "trn" for quite a while); even with no references line on the followup, "trn" (or, more correctly, its thread builder program) does that by subject matching. 2) *Enough* of the articles have proper reference lines that I can go up and down the article tree often enough to make it worthwhile.
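Grouping by References with a fallback to subject matching can be sketched as follows. This is not the algorithm used by "trn" or its thread builder; it is a minimal illustration with a hypothetical article representation (dicts with `id`, `subject`, and `references` keys), processed in arrival order.

```python
def normalize_subject(subject: str) -> str:
    """Strip any number of leading 'Re:' prefixes and ignore case."""
    s = subject.strip()
    while s.lower().startswith("re:"):
        s = s[3:].strip()
    return s.lower()

def group_threads(articles: list) -> dict:
    """Return {thread key: [article ids]}, keyed by each thread's first article."""
    thread_of = {}    # message id -> thread key
    by_subject = {}   # normalized subject -> thread key
    threads = {}
    for art in articles:
        # Prefer a References link to an article we have already placed.
        key = next((thread_of[r] for r in art.get("references", [])
                    if r in thread_of), None)
        subject = normalize_subject(art["subject"])
        if key is None:
            # No usable reference: fall back to matching on the subject.
            key = by_subject.get(subject, art["id"])
        thread_of[art["id"]] = key
        by_subject.setdefault(subject, key)
        threads.setdefault(key, []).append(art["id"])
    return threads
```

With the subject fallback, a followup that arrives without a References line still lands in the right group, and later articles that reference it are pulled in along with it.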
david@twg.com (David S. Herron) (07/23/90)
In article <15688@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes: >In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes: >I have never thought, and do not think now, that transmitting binaries >is an appropriate activity for Usenet... but a significant minority >disagrees, and since they can control who does and doesn't carry the >binary bandwidth, it's fine with me. Either way, 8 bit articles don't >fix anything fundamentally broken, so I'd concentrate energies elsewhere. Transmitting 8-bit files can be useful beyond the fairly narrowly defined thought of software packages. Think about all the multi-media gadgetry floating around in X.400. Voice, animation, still pictures in various formats, etc. This is something you'd need an Amiga to do justice to :-) But, no, it doesn't require 8-bit article formats to support all that. It can be encoded in 7-bit files, without too much problem and so forth. BTW.. I hafta make this warning.. BITNET won't be able to handle any 8-bit file format very easily. More than just bitnet, but also things like VMS will have problems. One of the better things about Usenet is that, since it's text files, it's immediately portable across all sorts of OS's. Non-text files tend to be non-portable. -- <- David Herron, an MMDF weenie, <david@twg.com> <- Formerly: David Herron -- NonResident E-Mail Hack <david@ms.uky.edu> <- <- Sign me up for one "I survived Jaka's Story" T-shirt!
eps@toaster.SFSU.EDU (Eric P. Scott) (07/23/90)
In article <3721@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes: >Did you perhaps mean "ISO 8859/1" rather than "ISO 8859"? Yes, I did. Mea culpa. -=EPS=-
diamond@tkou02.enet.dec.com (diamond@tkovoa) (07/23/90)
In article <1990Jul21.174535.8281@lokkur.dexter.mi.us> scs@lokkur.dexter.mi.us (Steve Simmons) writes:
>The sarcasm lamp is now lit. :-)
Not mine though. (I have a large sarcasm lamp but it's not lit this time.)
>Of course, there aren't many of us reading the current news
>who want to read those kanji, katakana, and ghu-only-knows what other
>variants.
There sure are. I can't read them very well, but there are many who do.
>Still, we should all be forced to make software that is
>capable of handling it so that the English readers can look at the
>Hindi, Korean and Russian postings that flow by.
You're right; you aren't forced. If you don't, then the rest of the world will leave you behind. But you aren't forced.
>The sarcasm lamp is now off. :-)
>Hey, news is ASCII-based, written in English-speaking countries for
>English-speaking readers.
This was true 10 years ago.
>The fact that it works *at all* for
>international and non-English stuff is a wonderful plus. If regional
>newsgroups have regional needs, they should go ahead and fill them.
>But neither side should expect interoperability.
They'll go ahead and fill them, believe me. And they will market systems in the U.S. that have interoperability too. Businesses that agree with your opinion will go bankrupt. (And there are a lot of them. The U.S. is moving towards losing the software market, just as it did for cars and home electronics.)
--
Norman Diamond, Nihon DEC diamond@tkou02.enet.dec.com
This is me speaking. If you want to hear the company speak, you need DECtalk.
brian@ucsd.Edu (Brian Kantor) (07/23/90)
The NNTP extensions (that I'll get into an RFC soon, I promise!) support a CHARSET extension and 8-bit transmission, so it can allow the transfer of (i hope) any character set. Perhaps news will follow suit. A standard for transmitting multi-byte characters would presumably specify what order the bytes are to be sent; that is NOT properly part of the news system nor of NNTP. - Brian
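Why the byte order has to be pinned down somewhere can be seen from a small example. The 16-bit fixed-width layout and the function names are assumptions for illustration only; the post does not say what form a multi-byte standard would take.

```python
import struct

def encode_16bit(code_points: list, big_endian: bool = True) -> bytes:
    """Pack each character value as two bytes in the chosen byte order."""
    fmt = (">" if big_endian else "<") + "H" * len(code_points)
    return struct.pack(fmt, *code_points)

def decode_16bit(data: bytes, big_endian: bool = True) -> list:
    """Unpack two-byte character values in the chosen byte order."""
    fmt = (">" if big_endian else "<") + "H" * (len(data) // 2)
    return list(struct.unpack(fmt, data))

wire = encode_16bit([0x65E5], big_endian=True)
print(wire.hex())
print([hex(v) for v in decode_16bit(wire, big_endian=False)])  # wrong order
```

An 8-bit-clean transport delivers the same bytes either way; whether they mean the intended character depends entirely on both ends agreeing on the order, which is why that agreement belongs to the character set standard rather than to news or NNTP.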
tneff@bfmny0.BFM.COM (Tom Neff) (07/23/90)
In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes: >Why should binaries be split up into smaller bits, particularly bits as small >As 50K? > >If I'm going to lose parts of a multi-part binary, I may as well lose the >whole thing. I agree completely. Binaries shouldn't be split. If a binary is bigger than 50K, it has NO BUSINESS BEING BROADCAST AS NEWS! Post a pointer for FTP and anonymous UUCP, and let those who want it pay for it. THAT'S how Usenet ought to work. (This is not a flame at Brad of course -- it's a flame at BBS refugees who discover Usenet and expect it to work just like a bigger BBS.) >I wrote the ABE/DABE system to make that easier to deal with, but even so, >having to deal with missing parts and assembly, etc., is a pain. In all fairness, when a properly packaged source archive (i.e., what Usenet SHOULD be broadcasting) arrives in a dozen pieces, installation is VERY easy on most platforms. When a part is lost or corrupted, the penalty for rebroadcast is a lot smaller than it would be if the whole kit had to be resent. -- "Take off your engineering hat = "The filter has | Tom Neff and put on your management hat." = discreting sources." | tneff@bfmny0.BFM.COM
Dan@dna.lth.se (Dan Oscarsson) (07/23/90)
In article <1857@krafla.rhi.hi.is> frisk@rhi.hi.is (Fridrik Skulason) writes: > >Some possible solutions: > >(2) On every machine the article is translated into one of the ISO 8859/x >series of character sets. 8859/1 would probably be most used, as it covers >most of the languages of Western Europe. 8859/2, 8859/3, 8859/4 etc. would >solve the needs of those using Greek, various Eastern European languages and >(I think) Hebrew and Arabic. This would not solve the problem of those using >a 16-bit character set. Also, I am not sure if Esperanto is included in any >of the ISO 8859/x standards. > >(3) All text is transmitted according to the ISO 10646 standard. This has >one advantage compared to (2) - it allows the transmission of documents >containing 16-bit characters, as well as documents containing characters from >more than one of the 8859/x standards. For example, one could send a message >with the first part in Russian and the second part in Greek. > >My opinion is that (3) is more of a long-term goal - for 95 % of users of >Usenet, (2) is all that is needed. > I think (3) is better. (2) is more or less a subset of (3) and it would not be much more work to implement (3) than (2). Using (3) we have one character set only and an article can contain any character. ISO 10646 can be sent in a way so that ascii and iso 8859-1 articles can be sent without any change. Also, if ISO 10646 is chosen it will fit well with the international sendmail patches that are under development. If we choose (2) we will have to change to (3) in a few years. -- Changing netnews is somewhat different from changing mail. In netnews the articles are stored in a central database used both for reading and for sending the articles onward to the next site. This means that we cannot convert incoming articles into the local character set used at a site; instead each newsreader must do the conversion from ISO 10646 into the local character set. 
To handle "old" sites that cannot handle 8-bit articles, the articles will have to be converted into ascii. So backbone sites must upgrade their software to let 8 bits through for this to work. -- When the patches for international sendmail that I and someone in Denmark are developing are ready, they will include conversion routines that could probably be used in a netnews reader. Dan -- Dan Oscarsson Department of Computer Science Lund Institute of Technology e-mail: Dan@dna.lth.se Box 118 S-221 00 Lund, Sweden
peter@ficc.ferranti.com (Peter da Silva) (07/23/90)
In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes: > I would support a move to design a binary transmission format, have the > new releases of B and C news support them, and have all the moderators switch, > thus forcing anybody who wants binaries to get off their duffs and upgrade. This is a great idea. It will break binaries at enough sites that maybe the bloody things will finally dry up and blow away and maybe even some of the stuff currently posted in binary form will start showing up as source. Binaries and News go together about as well as rock videos and Masterpiece Theatre. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` <peter@ficc.ferranti.com>
scs@iti.org (Steve Simmons) (07/23/90)
diamond@tkou02.enet.dec.com (diamond@tkovoa) writes: >In article <1990Jul21.174535.8281@lokkur.dexter.mi.us> scs@lokkur.dexter.mi.us (Steve Simmons) writes: >>The sarcasm lamp is now lit. :-) >Not mine though. (I have a large sarcasm lamp but it's not lit this time.) *chuckle* I was kind of looking forward to it... :-) >>Of course, there aren't many of us reading the current news >>who want to read those kanji, katakana, and ghu-only-knows what other >>variants. >There sure are. I can't read them very well, but there are many who do. 'Many' here is a relative term. And my sarcastic point (which you quote in the next note) is that use of national language sets for the purpose of using foreign languages is largely irrelevant to those who do not speak that language. I freely admit english postings are of little interest to non-english speakers. :-) >You're right; you aren't forced. If you don't, then the rest of the >world will leave you behind. But you aren't forced. I agree with you -- things will change here (USA, Canada, UK, Australia, NZ) only when there is sufficient software available and things worth reading which require 8bit (or more). >>Hey, news is ASCII-based, written in english-speaking countries for >>english-speaking readers. >This was true 10 years ago. And it's 90 or 95% true today. 100% of all news transport interfaces were done originally by English speakers using ASCII *or* deliberately designed to be compatible with same. 80 or 90% of newsreaders are the same, the only exception I know of is nn (which, by the by, is the best damn fine newsreader around) (and for all I know was written by a native English speaker).
jbuck@galileo.berkeley.edu (Joe Buck) (07/24/90)
In article <1857@krafla.rhi.hi.is>, frisk@rhi.hi.is (Fridrik Skulason) writes: |> I fully agree that we need an 8-bit news system (as well as 8-bit E-mail), |> as this would make life a lot easier for those of us not using English. I agree. But any solution must take into account Problem #0: Many sites will continue to run the software they are using now, and no amount of cajoling will cause them to install new, 8-bit compatible software. In some cases, this is because the organization gives news and mail a low priority (doesn't bring in money, etc). Many sites are still running obsolete software and will continue to do so. To install new software people need an incentive. There is a very small incentive, unfortunately, for sites in English-speaking countries to install software to support 8-bit character sets. This means that any new software must co-exist with the current environment. One way to do this is to have gateway sites do conversion. There are relatively few connections between the US and Europe -- most traffic across the Atlantic goes through uunet. Character translation could be done on the uunet-mcsun link -- stripping accents on articles arriving from Europe, remapping characters so when an American types braces in articles in comp.lang.c, readers in Europe see braces, instead of language-specific characters. -- Joe Buck jbuck@galileo.berkeley.edu {uunet,ucbvax}!galileo.berkeley.edu!jbuck
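The accent stripping that a gateway might perform, as suggested above, can be sketched in a few lines. This is only an illustration under modern assumptions: it uses Python's Unicode tables as a stand-in for whatever mapping tables a real gateway would carry, and the function name to_ascii and the "?" placeholder are hypothetical choices, not part of any news software.

```python
import unicodedata


def to_ascii(text, placeholder="?"):
    """Reduce text to 7-bit ASCII by dropping accents (illustrative only).

    Characters are decomposed into base letter + combining marks; the marks
    are dropped. Anything still outside ASCII (for example a currency sign
    with no decomposition) becomes the placeholder.
    """
    out = []
    for ch in unicodedata.normalize("NFKD", text):
        if unicodedata.combining(ch):
            continue  # drop the accent, keep the base letter
        out.append(ch if ord(ch) < 128 else placeholder)
    return "".join(out)


if __name__ == "__main__":
    print(to_ascii("caf\u00e9 na\u00efve"))
    print(to_ascii("\u00a35"))
```

Note that this is lossy by design: the accents are gone for good, and characters without an ASCII base letter collapse to the placeholder.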
davison%drivax@uunet.uu.net (Wayne Davison) (07/24/90)
guy@auspex.auspex.com (Guy Harris) wrote: > I use "References:" all the time; you're just not using the right > newsreader. (Hey Wayne! Isn't it soup yet? :-)) The soup's been simmering for a long time now, but I think it's finally time to serve. And the NNTP sites out there might even appreciate the last ingredient that was added (and thus, took extra time to slow-cook) -- NNTP support has been added to the thread creator, and the whole stew has been tested as trrn. I'm planning to ship the package off to the comp.sources.unix group sometime this week, so keep your eyes peeled (pun intended :-). -- Wayne Davison \ /| / /|\/ /| /(_) davison%drivax@uunet.uu.net davison@drivax.UUCP (_)/ |/ /\|/ / |/ \ ...!uunet!drivax!davison (W A Y N e)
ed@braaten.doit.sub.org (Ed Braaten) (07/24/90)
scs@lokkur.dexter.mi.us (Steve Simmons) writes: >Hey, news is ASCII-based, written in english-speaking countries for >english-speaking readers. That fact that it works *at all* for >international and non-English stuff is a wonderful plus. If regional >newgroups have regional needs, they should go ahead and fill them. That's funny Steve - 50% of the news I read is in German. I'm living in Germany right now. Although many of the 60+ million Germans here can speak English (often better than we Americans ;-), the language in this country is German. And the German-language news here is not limited to "regional" consumption. I'm aware of several sites there in the good ole USA that are carrying the German groups also. I'm willing to bet there is a LOT of non-English stuff floating around out there. So why don't we drop the provincial attitudes - let's hear it for 8-bit news! It won't make English any harder to read, but it would certainly make life easier for the rest of the USENET. >But neither side should expect interoperatbility. Say what? Interoperability and the free exchange of information is in my opinion exactly what makes USENET so successful... --------------------------------------------------------------------------- Ed Braaten | Jesus answered, "I am the way and the Work: ed@imuse.de.intel.com | truth and the life. No one comes to the Home: ed@braaten.doit.sub.org | Father except through me." John 14:6 ---------------------------------------------------------------------------
storm@texas.dk (Kim F. Storm) (07/24/90)
(I've cross-posted this to news.software.nn since it contains some information about future directions of nn. Followups are directed to .b only. ++Kim). In news.software.b, frisk@rhi.hi.is (Fridrik Skulason) writes: >Well - some of us have 8-bits news already - I am for example using an 8-bit >'rn' right now. nn release 6.4 supports presentation of 8-bit data (and with pl 6 it also accepts 8-bit command input). >I fully agree that we need an 8-bit news system (as well as 8-bit E-mail), >as this would make life a lot easier for those of us not using English. Certainly depends on who "we" are. I think that if we are going to attack this problem, we had better do it properly from the start, and not *just* solve the 8-bit news problem (which you cannot solve anyway). >Modifying the news software to permit the transmission of 8-bit data is >trivial - the real problem is the charcter set issue. Changing any software is just *so easy*. But it is *impossible* to get people to install the changes unless they have a personal interest in doing so. I speak from experience: More than one year has gone since the initial release of nn worldwide (rel. 6.3.0). Since then, about 20 patches including a new release have been posted, but there are still some sites out there running 6.3.0 (or .1 or .2) which you can recognize from the RFC-violating Re^2: prefixes in the Subject: lines on some postings. I still get complaints about how stupid nn is, although this problem is fixed (oh yes, it was *trivial*). But getting people to update.... So if you want this to work on a world-wide scale within a timeframe of less than 3-6 years I believe it must be done in the news reader software at the end-points which need this, and we must design a transport protocol for 8-bit, 16-bit and even 32-bit data which can: a) be transparently sent through the current 7-bit restrictive channels, and b) still be interpreted sensibly on systems and by news readers which are not adapted to this new scheme. 
Keld Simonsen from the Danish UNIX-system Users Group (DKUUG) has defined a new naming scheme for international characters based on 10646 using primarily two-character names which attempt to be as close to the real character as possible. For example, e' is an e with ' above it, o: is an o with two dots above, etc. Now this sounds rather trivial, but it has specifically been designed for the above purposes: (a) it uses only a subset of the ASCII character set (e.g. { and } are not used since they are used for national characters in many older 7-bit character sets based on ISO 646), and (b) as the example shows, the character name is a close approximation to the actual character. So you can actually read a letter written using the character names! (Of course, "a" is named "a", "b" is "b", "A" is "A", etc.). A letter can then be written in any of the 8859/x variants, in various EBCDIC and other IBM codepages, etc. using N-bit codes supported on the local system. However, when such a letter is transmitted to a remote system, all the international characters are *encoded* by replacing each of them by an "escape character" followed by the (two character) character name. The result is a pure 7-bit letter. At the receiving end, the encoded letter can be converted back to the originating character set, or to the character set used on the local system as far as that is possible. But that is the choice of the recipient end! Or, it can just be read without conversion since the encoding is "readable" (the only problem being the escape character). In the sendmail used on the Danish DKnet backbone, Keld has implemented this and it is running very well, supporting about 50 different character sets. By default it uses ^] as the escape character which has the benefit of being invisible on most terminals, but it can use any escape character you like. Both the escape character and the originating character set are specified in the article's header. 
(more on this below) >Some possible solutions: >(1) Each machine posts articles using the user's character set of choice. >To indicate which character set is used, a new field is added to the header. > examples: Character-set: CP 870 > Character-set: ISO 8859/4 This is what Keld's sendmail extensions support today. >This is easy to implement, but has one serious drawback - all machines are >required to be able to handle all possible character sets. Not with Keld's solution: - If you know the character set, you can convert to it. - If you use another character set, you can convert to that instead since all international characters have been given *unique* names. - If your software doesn't understand any of it, you can still read the message with little or no problems. And in the sendmail case, the Danish backbone is actually doing the encoding *and* decoding for the Danish sites for which it has been told which character set they prefer. So if one site runs 8859/1, they send 8-bit 8859/1 data directly to the backbone, and if the recipient is a known EBCDIC site, the backbone converts the letter to EBCDIC before delivery! If it is to an unknown site, it will be converted to the "encoded" character set, and it is thus the task of the recipient to handle it. So in Denmark we not only run 8-bit mail, but *multi character set* mail. And it is transparent for all practical uses. >(2) On every machine the article is translated into one of the ISO 8859/x >character sets.... Too limited, and which one should you choose? >(3) All text is transmitted according to the ISO 10646 standard. This has >one advantage compared to (2) - it allows the transmission of documents >containing 16-bit characters, as well as documents containing characters from >more than one of the 8859/x standards. For example, one could send a message >with the first part in Russian and the second part in Greek. 
Currently, I think Keld has defined about 1000 characters *including* Greek, Russian (Cyrillic), Hebrew, Arabian, all 8859 sets, EBCDIC, PC character sets and more. And there are "hooks" reserved to include longer names for kanji characters and the like. So you can say that Keld has defined a 10646 character set representation using only a limited 7-bit character set. >My opinion is that (3) is more of a long-term goal - for 95 % of users of >Usenet, (2) is all that is needed. And if you want to keep it that way, sure limit yourself to (2). >But what changes would (2) require ? >Change #1: Any ASCII computer on Usenet must accept 8-bit news and E-mail, > and be able to forward articles without changes (in other words - > don't strip the eight bit !!!) This is the only change required > from the "English-only" ASCII-sites, where no 8-bit articles > would originate or be read. The "only" change, yes, but a change which you simply cannot expect to be done. No hope at all! >Change #2: Any computer on Usenet using an extended version of ASCII (CP 437, > ISO 8859/x etc) must translate all postings to one of the 8859/x > charcter sets and indicate (in the header) which one is used. This wouldn't do it if the recipient end cannot handle that character set. Or said in another way: which one of the 8859/x character sets should you use? 8859/x is probably *the* answer for use within a certain country on *most* UNIX boxes, but what about all the PC character sets, EBCDIC hosts etc. Don't you think a little more than 5% of the users are in that category? >Change #3: Any computer not using ASCII, but rather EBDIC (or something else), > must translate all postings to one of the 8859/x character sets, > instead of just translating to ASCII. If they have to translate, they can just as well translate into something which has a good chance of getting through the network - and 8859 doesn't have a chance there. 
>Change #4: Any computer must accept postings in one of the 8859/x character > sets and be able to translate them to the character set used > by each user. But what if I support 8859/1 and get an article written in 8859/7 (greek?) If we use your scheme, *all* the 8859/x sets must be accepted! >Problem #1: If the local character set is not able to represent all the > charactes in the original posting, they must be represented as > well as possible. For example - a 7-bit computer receiving a text > containing accented wovels might be expected just to drop the > accent marks. Which may in some cases completely change the meaning! >Problem #2: Different users - even on the same machine - have different > capabilities to display 8-bit text. For example, in Scandinavia > it is common for terminals to use a 7-bit character set, where > some of the characters (for example { [ ] } |) have been replaced > by non-ASCII characters. Other users in the same countries have > fully 8-bit terminals (for example PCs running an terminal > emulator). The computer must store incoming articles as they > arrive and the news/E-mail software must be updated to display > them according to the capabilities of each terminal, as indicated > by an environment variable. Exactly, and that is definitely easiest if everybody agrees on *one* common "carrier character set" (my suggested term for such a character set). >So - what now ? >Is there any interest in creating a "working group" to attack the problem ? >Any of the authors of rn, nn, elm or other news/e-mail software out there ? Yes, support for Keld's multi character set handling is planned for an upgrade to nn 6.4 later this year. We have been looking at what can be used as the escape character in news, and this is definitely a problem, since inews traditionally is very restrictive with respect to what it will pass through (^] is filtered out as most other control characters). 
But we believe we have found the right solution, which will pass through at least Bnews' inews, and is supposed to be *transparent* to most news interfaces: We use a double escape character consisting of a "space" followed by a "backspace". When output to a screen this will be invisible and most pagers will handle backspace properly (i.e. move the cursor back over the space). And we think it is very unlikely that this sequence will occur in normal postings (we see no purpose for it). And since only articles which have the proper header specifying that this is really an encoded article will be "decoded", the filters which encode the articles at the originating end can check that no such sequences exist in the original text. >We are of course willing to share our modifications to the programs, and with >a bit of work we should be able to have 8-bit news/email running in a few >months. nn users world-wide can soon exchange multi character news - other users can read it (without problems), and we will publish our code and specifications so that other interfaces can support it as well. >So - any volunteers ? Yes, but is there any interest in what we plan to do??? And will our "space-backspace" escape pass through Cnews, NNTP and other inews/relaynews/whatever implementations (without modification)? -- Kim F. Storm <storm@texas.dk> No news is good news, Texas Instruments A/S, Denmark but nn is better!
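The encoding described in this post (an escape sequence followed by a two-character name, with space plus backspace proposed as the escape for news) can be sketched roughly as below. This is a minimal illustration, not Keld's implementation: only the names e' and o: come from the post, the other two table entries are guesses made by analogy, and the function names encode and decode are hypothetical.

```python
# Tiny illustrative table. Only e' and o: are taken from the post above;
# the remaining names are assumptions made by analogy, not Keld's list.
MNEMONICS = {
    "\u00e9": "e'",  # e with acute accent
    "\u00f6": "o:",  # o with two dots above
    "\u00e4": "a:",  # assumed by analogy
    "\u00fc": "u:",  # assumed by analogy
}
NAMES = {name: char for char, name in MNEMONICS.items()}

ESCAPE = " \b"  # space followed by backspace, as proposed for news


def encode(text, escape=ESCAPE):
    """Replace each known non-ASCII character by escape + two-character name."""
    if escape in text:
        # The originating filter must make sure the escape is unambiguous.
        raise ValueError("escape sequence already occurs in the text")
    out = []
    for ch in text:
        if ch in MNEMONICS:
            out.append(escape + MNEMONICS[ch])
        elif ord(ch) < 128:
            out.append(ch)
        else:
            raise ValueError(f"no mnemonic known for {ch!r}")
    return "".join(out)


def decode(text, escape=ESCAPE):
    """Turn escape + name back into the character; unknown names stay as-is."""
    out = []
    i = 0
    while i < len(text):
        if text.startswith(escape, i):
            name = text[i + len(escape): i + len(escape) + 2]
            if name in NAMES:
                out.append(NAMES[name])
                i += len(escape) + 2
                continue
        out.append(text[i])
        i += 1
    return "".join(out)


if __name__ == "__main__":
    wire = encode("caf\u00e9 M\u00f6bius")
    print(repr(wire))
    print(decode(wire))
```

A reader that knows nothing about the scheme still sees something close to the intended text, since the escape is meant to be invisible on screen and the names resemble the characters; a reader that does know it can map the names into whatever local character set it uses.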
henry@zoo.toronto.edu (Henry Spencer) (07/24/90)
In article <7647@gollum.twg.com> david@twg.com (David S. Herron) writes: >BITNET won't be able to handle any 8-bit file format very easily. It doesn't handle 7-bit file formats very reliably, actually; Bitnet has all kinds of ugly properties as a transmission subsystem. However, with a suitable encoding it can still be used to get clean 8-bit data from point A to point B. The bencode/bdecode stuff shipped with C News, originally written at Waterloo, is Bitnet-proof by design. (Uuencode is not, by the way.) -- NFS: all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology and its performance and security too. | henry@zoo.toronto.edu utzoo!henry
eps@toaster.SFSU.EDU (Eric P. Scott) (07/24/90)
In article <7647@gollum.twg.com> david@twg.com (David S. Herron) writes: >BITNET won't be able to handle any 8-bit file format very easily. >More than just bitnet, but also things like VMS will have problems. Say what? VAX/VMS got 8-bit-ized way back when the VT2xx terminals came out--I think that was around V4.0. A better question is whether ANU NEWS does The Right Thing. Geoff? -=EPS=-
jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) (07/24/90)
In article <Z4V4=DF@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes: >In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes: >> I would support a move to design a binary transmission format, have the >> new releases of B and C news support them, and have all the moderators switch, >> thus forcing anybody who wants binaries to get off their duffs and upgrade. >This is a great idea. It will break binaries at enough sites that maybe the >bloody things will finally dry up and blow away and maybe even some of the >stuff currently posted in binary form will start showing up as source. This is a rotten idea. That source is the prevalent form of software distribution in the Unix environment is more an artifact of the diversity of Unix systems than anything else. The IBM-PC world (the only one I'm intimately familiar with; I don't own an Amiga/Atari/...) won't switch away from the various compressed archivers it's using to a pure source distribution, for several reasons: 1) You can't stuff an executable in a shar, and comparatively few people own each individual language environment, so they can't recompile the programs. 2) .ARC/.ZIP files are easy to transport, and explode into the component parts with just a single tool, instead of requiring a shell and several utilities. 3) The cultural history doesn't include people improving on the source and sharing the improvements; if anything, it's more along the lines of stealing the code and giving no credit. Breaking binaries on Usenet will get rid of them, all right, but at the cost of cutting Usenet users off completely from nearly all redistributed programs. You won't get the authors to do things your way. -- Jay Maynard, EMT-P, K5ZC, PP-ASEL | Never ascribe to malice that which can jay@splut.conmicro.com (eieio)| adequately be explained by stupidity. "It's a hardware bug!" "It's a +---------------------------------------- software bug!" 
"It's two...two...two bugs in one!" - _Engineer's Rap_
bob@MorningStar.Com (Bob Sutterfield) (07/24/90)
In article <15692@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes: In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes: If I'm going to lose parts of a multi-part binary, I may as well lose the whole thing. I agree completely. Binaries shouldn't be split. If a binary is bigger than 50K, it has NO BUSINESS BEING BROADCAST AS NEWS! Post a pointer for FTP and anonymous UUCP, and let those who want it pay for it. THAT'S how Usenet ought to work. If the binary thing is a computer program, it has no business being transmitted as news. Binaries are inherently nonportable, and of little value to a majority of sites passing them through. If the binary thing is a multimedia document, then (at the user interface level) it should certainly appear in one chunk. ...when a properly packaged source archive (i.e., what Usenet SHOULD be broadcasting)... Yes, if the thing you're talking about is a program for a computer, it certainly should be distributed as source. Does anyone actually trust an encoded binary that they found in some newsgroup? How quaint, how naive! However, there are (as has been abundantly pointed out) plenty of examples of things that aren't program binaries but that still should be transmissible via news-like mechanisms and that break the current news implementations. ...arrives in a dozen pieces, installation is VERY easy on most platforms. When a part is lost or corrupted, the penalty for rebroadcast is a lot smaller than it would be if the whole kit had to be resent. Sequencing and reassembly are problems for the session layer, not the user interface layer. Users should never see that a document was split into transport layer-sized chunks, which is what the 50K article-size limit really is.
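The point about sequencing and reassembly belonging below the user interface can be illustrated with a small sketch. Everything here is hypothetical: the field names (part, total, crc, payload) and the functions split and reassemble are made up for illustration and do not correspond to any existing news software. Only the 50K part size comes from the discussion.

```python
import zlib

PART_SIZE = 50 * 1024  # the 50K article-size limit mentioned in the thread


def split(data, part_size=PART_SIZE):
    """Cut a byte string into numbered parts that each carry the whole-file CRC."""
    total = max(1, -(-len(data) // part_size))  # ceiling division
    crc = zlib.crc32(data)
    return [
        {
            "part": n + 1,
            "total": total,
            "crc": crc,
            "payload": data[n * part_size:(n + 1) * part_size],
        }
        for n in range(total)
    ]


def reassemble(parts):
    """Rebuild the original bytes from parts that arrived in any order."""
    if not parts:
        raise ValueError("no parts received")
    total = parts[0]["total"]
    by_number = {p["part"]: p for p in parts}
    missing = [n for n in range(1, total + 1) if n not in by_number]
    if missing:
        # Only the missing parts need to be requested again.
        raise ValueError(f"missing parts: {missing}")
    data = b"".join(by_number[n]["payload"] for n in range(1, total + 1))
    if zlib.crc32(data) != parts[0]["crc"]:
        raise ValueError("checksum mismatch after reassembly")
    return data


if __name__ == "__main__":
    document = bytes(range(256)) * 500
    pieces = split(document)
    print(len(pieces))
    print(reassemble(list(reversed(pieces))) == document)
```

With something like this underneath, a reader would present one document to the user and ask for a resend of individual parts only when reassembly fails.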
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/24/90)
>header line must be added to classify article body types. Many body types >are possible, including: > ascii + underlining (current default) > extended ascii (international char sets, etc.) > rich text (of some format) > andrew message > non-interpreted binary (programs, etc.) > binary with text header > XXX format bitmap (gif, tif) > Maybe a digital audio type also. I like the idea of reading the next news article and having the news reader decide that is a gif file and automatically displaying the graphics. It might be useful to be able to mix types within an article (for example, graphics with text telling you what it is along with a binary to produce it). This would mean some imbedded esc sequences instead of a header line. Re character sets, ISO10646 sounds good but I'd hate to see news volume double (16 bit chars vs 8). Options for it and ISO8859/x sound more efficient. So we need three things - a standard, newsreaders to handle it, and transfer mechanisms that don't munge things (like most of C News). At some point, I expect sites would start refusing feeds from munging sites and dumping munged articles. -- Jon Zeeff (NIC handle JZ) zeeff@b-tech.ann-arbor.mi.us Dolphins! What about the tuna?
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (07/25/90)
In article <1990Jul22.195243.28379@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes: | Geoff and I thought hard about this during C News development. The trouble | with checksums is that most people would prefer a slightly mangled copy of | an article to no copy of the article. There are all too many transmission | channels that do in fact slightly mangle articles (expanding tabs, fiddling | with the definition of newline, etc.). This is a good point, but there are some ground rules which could help eliminate this. When I do a CRC on text, I ignore all whitespace, and put a delimiter (@@start and @@stop) around the text. This makes it work even when fairly heavily munged in the usual ways. The problems of line ending, conversion of blanks to tabs and back, adding or deleting blanks at end of line, can all be ignored this way. Even if lines are folded you can get a CRC if you ignore whitespace. This is not to say you're wrong, just that a partial solution is available. I have used this for some time, and it seems to be critical enough to be useful, and forgiving enough to avoid dropping things which are still readable. I use brik for error checking on c.b.i.p postings, for historical reasons, and I haven't had a complaint in six months. A header field for CRC would be great, even if all the reader did was output a message indicating that the data was damaged. -- bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen) "Stupidity, like virtue, is its own reward" -me
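The whitespace-insensitive CRC described above might look something like the following sketch. The @@start and @@stop delimiters are the ones named in the post; the function name body_crc is hypothetical, and zlib's CRC-32 is used here merely as a convenient stand-in, with no claim that it matches what brik computes.

```python
import zlib

START, STOP = "@@start", "@@stop"


def body_crc(article):
    """CRC of the text between @@start and @@stop, ignoring all whitespace."""
    begin = article.index(START) + len(START)
    end = article.index(STOP, begin)
    # split() with no argument breaks on any run of blanks, tabs, CR or LF,
    # so re-joining the pieces removes every whitespace character.
    squeezed = "".join(article[begin:end].split())
    return zlib.crc32(squeezed.encode("utf-8"))


if __name__ == "__main__":
    posted = "Header: x\n@@start\nint main()\n{\n\treturn 0;\n}\n@@stop\n"
    # Same text after typical transport damage: tabs expanded, CRLF line
    # endings, trailing blanks added, a line folded, a trailer appended.
    munged = ("Header: y\r\n@@start\r\nint\r\nmain()  \r\n{\r\n"
              "        return 0;\r\n}\r\n@@stop\r\n-- sig\r\n")
    print(body_crc(posted) == body_crc(munged))
```

Because only the text between the delimiters is checked, changes to headers or an added trailer do not affect the result, while a changed letter inside the body does.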
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (07/25/90)
In article <1864@tkou02.enet.dec.com> diamond@tkou02.enet.dec.com (diamond@tkovoa) writes: | They'll go ahead and fill it, believe me. And they will market systems | in the U.S. that have interoperability too. Businesses that agree with | your opinion will go bankrupt. Is this bash the USA week? Net software is given away. There are no businesses selling news software, and if someone gives away better software it will be used, if not the useful features will be added. -- bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen) "Stupidity, like virtue, is its own reward" -me
guy@auspex.auspex.com (Guy Harris) (07/25/90)
>80 or 90% of newsreaders are the same, the only exception I know of is >nn (which, by the by, is the best damn fine newsreader around) Well, Wayne says it's going to be soup soon; we'll see whether things change (although I'm told one of the GNU EMACS newsreaders also deals with threads reasonably, and there were some mutterings about "nn" picking up some of the threads stuff from "trn"). >(and for all I know was written by a native English speaker). The author posted it from a TI site in Denmark, where, as I remember, he works; I don't know if Kim is a Dane, an expatriate from an English-speaking country, or an expatriate from a non-English-speaking country.
peter@ficc.ferranti.com (Peter da Silva) (07/25/90)
In article <JCP&4:&@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes: > Breaking binaries on Usenet will get rid of them, all right, but at the > cost of cutting Usenet users off completely from nearly all > redistributed programs. You won't get the authors to do things your way. Breaking binaries on UNIX didn't have that result. You're contradicting yourself. Remember, Usenet is not a BBS. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` <peter@ficc.ferranti.com>
dansmith@well.sf.ca.us (Daniel Smith) (07/25/90)
[ongoing discussion about 8 bit news...] I agree that 8 bit news would be a Good Thing. But some of the same people that would be solving that problem are attempting (or at least, should be attempting :-) to speed along these two: 1) 8 bit email 2) worldwide agreement on email addressing... I don't think 8 bit email is that hard, but then again, I haven't really looked at the code...I just think that since uucp and other low-level transport mechanisms can handle it, it shouldn't be too hard to adapt MTAs and readers for it. I could be way off base on this :-) A problem for both 8 bit email and news is: how to protect your screen (use a pager like "less", which automatically displays control-whatever as 2 chars)? How do you decide which escape sequences you want to affect your screen and which you want to filter out? How do you know which language (character set) to display in? Which convention do you use to handle EOL? Will people agree on generic ways of including bold text, font changes, etc? Sure, there are some standards in these areas...but... I guess the problem really breaks down into three: how to transport (MTAs and underlying news software), how to display (allowing for different terminals, local conventions, etc) and how to save the message (save everything in binary mode?, save it the way you saw it?, etc.) A big problem I see (2) is all the *%&^ ways of getting a message from point A to point B. Not in the routing sense, but in the type-of-addressing-used sense. Sure it's getting better, but not quickly enough. One has to remember foo@bar.com vs this!that!other vs something::other For instance, in my mail today I have: From: Someone in England <uunet!stl.stc.co.uk!That.Person> X-Vms-Mail-To: INET%"daniel%bermuda%island@mcsun.uucp" Now, I'm glad this made it to me (I've changed the name, but see what I mean? lots of mailers can choke on "That.Person"...it's not a domain!) 
As for the INET line...yea, I think I understand it, but god what a kludge just to get a letter from one place to another! My wish for today is to have the world convert to user@site.domain once and for all...problem is (getting back to the real world!) all the different twists on this that you see (don't they do some of the domain part in reverse in England? sigh :-) The real test of this will be "can I explain in 1 minute to someone brand-new to email how to address to any site?". When we can do that, without all sorts of exceptions based on machine type, OS, country, local net, etc., we'll all benefit (less bounced mail, more understandable). I realize many people are currently working on this, and I thank them! [followup to some more appropriate group, since this is starting to get off the Subject] Daniel -- Dan "Bucko" Smith dansmith@well.sf.ca.us daniel@island.uu.net unicom!daniel@pacbell.com ph: (415) 332 3278 (h), 258 2136 (w) disclaimer: Island's coffee was laced :-) My mind likes Cyberstuff, my eyes films, my hands guitar, my feet skiing...
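The screen-protection idea in the post above (show each control character as two printable characters, the way a pager such as "less" does) can be sketched roughly as follows. This is only an illustration, not how any particular newsreader or pager implements it: the function name `caret_escape` is made up here, and the choice to show bytes above 0x7F as `\xNN` is an assumption, since what to do with them depends on the character set question the post raises.

```python
def caret_escape(data: bytes, keep: bytes = b"\n\t") -> str:
    """Render raw article bytes so that nothing can drive the terminal.

    Hypothetical helper: control characters become two-character caret
    notation (ESC -> ^[), DEL becomes ^?, and bytes with the eighth bit
    set are shown as \\xNN because their meaning depends on a character
    set that the article does not declare.
    """
    out = []
    for b in data:
        if b in keep:                 # newline and tab pass through
            out.append(chr(b))
        elif b < 0x20:                # C0 control: ^@ .. ^_
            out.append("^" + chr(b + 0x40))
        elif b == 0x7F:               # DEL
            out.append("^?")
        elif b < 0x7F:                # printable ASCII
            out.append(chr(b))
        else:                         # eighth bit set
            out.append("\\x%02X" % b)
    return "".join(out)


if __name__ == "__main__":
    # An escape sequence that would clear the screen is made harmless.
    print(caret_escape(b"\x1b[2Jhello\n"), end="")
```

A reader that wanted to honor some escape sequences (bold, say) and filter the rest would need a whitelist in place of the blanket rule used here.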
chris@vision.UUCP (Chris Davies) (07/25/90)
In article <37713@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes: [...] >This means that any new software must co-exist with the current environment. >One way to do this is to have gateway sites do conversion. There are relatively >few connections between the US and Europe -- most traffic across the Atlantic >goes through uunet. Character translation could be done on the uunet-mcsun >link -- stripping accents on articles arriving from Europe, remapping characters >so when an American types braces in articles in comp.lang.c, readers in >Europe see braces, instead of language-specific characters. I'd guess that there are far more unpublished links between the US and Europe than might be expected from the maps... Particularly companies with both US and European offices (no, I won't name names :-) Would these sites have to provide gateway-based character translation too? I can't really see that happening, so we'd get a mix of translated and untranslated articles both sides of the Atlantic. Also, stripping accents off characters could completely change the meaning of the word(s). Other characters (UK sterling symbol, for instance) would have no easy translation (unless you map it to hash, #) in 7-bit US ASCII. Just my pennyworth, Chris -- VISIONWARE LTD | UK: chris@vision.uucp JANET: chris%vision.uucp@ukc 57 Cardigan Lane | US: chris@vware.mn.org OTHER: chris@vision.co.uk LEEDS LS4 2LE | BANGNET: ...{backbone}!ukc!vision!chris England | VOICE: +44 532 788858 FAX: +44 532 304676 -------------- "VisionWare: The home of DOS/UNIX/X integration" --------------
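Both objections in the post above (accent stripping can merge distinct words, and some characters have no 7-bit equivalent) show up in even a small sketch of such a gateway filter. The one below assumes the incoming article is ISO 8859-1 and leans on Python's `unicodedata` module to find base letters; the function name `to_ascii_lossy` and the `?` fallback are illustrative choices, not part of any news software.

```python
import unicodedata


def to_ascii_lossy(raw: bytes, fallback: str = "?") -> str:
    """Strip accents from ISO 8859-1 text (an assumed input charset).

    Hypothetical gateway filter: accented letters are reduced to their
    base letter; characters with no ASCII base (such as the pound sign)
    are replaced by `fallback`.
    """
    out = []
    for ch in raw.decode("latin-1"):
        if ord(ch) < 128:
            out.append(ch)
            continue
        # NFD splits an accented letter into base letter + combining marks.
        base = unicodedata.normalize("NFD", ch)[0]
        out.append(base if ord(base) < 128 else fallback)
    return "".join(out)


if __name__ == "__main__":
    # French "pêche" (peach, fishing) and "péché" (sin) become identical.
    print(to_ascii_lossy("pêche".encode("latin-1")))
    print(to_ascii_lossy("péché".encode("latin-1")))
    # The sterling symbol has no base letter to fall back on.
    print(to_ascii_lossy("£5".encode("latin-1")))
```

The conversion is one-way: an article filtered like this cannot be restored on the far side, which is why a mix of translated and untranslated copies would be hard to reconcile.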
henry@zoo.toronto.edu (Henry Spencer) (07/25/90)
In article <632@texas.dk> storm@texas.dk (Kim F. Storm) writes: >And will our "space-backspace" escape pass through Cnews, NNTP and >other inews/relaynews/whatever implementations (without modification)? I can't answer for NNTP et al, but it should be fine with C News, even inews. Our inews strips out a lot of control characters, but it leaves backspace alone. -- NFS: all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology and its performance and security too. | henry@zoo.toronto.edu utzoo!henry
Dan@dna.lth.se (Dan Oscarsson) (07/25/90)
In article <!YJ&ZKC@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes: > >Re character sets, ISO10646 sounds good but I'd hate to see news volume >double (16 bit chars vs 8). Options for it and ISO8859/x sound more >efficient. > No, ISO 10646 will not double the volume, even though ISO 10646 is a 32-bit character set. Only 8 bits per character are needed when ASCII or ISO 8859-1 is used. A few additional bytes are used when switching to a part of ISO 10646 other than the base part. Dan -- Dan Oscarsson Department of Computer Science Lund Institute of Technology e-mail: Dan@dna.lth.se Box 118 S-221 00 Lund, Sweden
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/25/90)
>>Maybe we do need checksums. At least we could throw away munged articles. >>Start doing that and I suspect that people would fix their software. > >Geoff and I thought hard about this during C News development. The trouble >with checksums is that most people would prefer a slightly mangled copy of >an article to no copy of the article. There are all too many transmission >channels that do in fact slightly mangle articles (expanding tabs, fiddling >with the definition of newline, etc.). Some early test versions of C News There is a certain class of sites that munge all articles in this way. If someone found that all articles from their feed site were being dropped, they would tend to find a new feed (or if they still accepted it, it would be less likely that they would pass it on). So the choice wouldn't be between slightly mangled news and no news, but between mangled news and a new feed. As it is, I can't identify munged articles even if I want to. If I could identify them, I could at least hold out to get better copies from another site. Without a crc, we just don't have the tools we need to fix the problem. -- Jon Zeeff (NIC handle JZ) zeeff@b-tech.ann-arbor.mi.us Dolphins! What about the tuna?
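A rough sketch of the kind of check being asked for: stamp a CRC of the body into a header at posting time and compare it on receipt, so that a munged copy can at least be identified. The header name `X-Body-CRC32` is invented for this illustration and is not part of any news standard; the sketch uses CRC-32 from Python's `zlib` and assumes headers and body are separated by a blank line.

```python
import zlib


def body_crc(body: bytes) -> str:
    """CRC-32 of the article body as eight hex digits."""
    return "%08x" % (zlib.crc32(body) & 0xFFFFFFFF)


def add_checksum(article: bytes) -> bytes:
    """Append a hypothetical X-Body-CRC32 header to the header block."""
    head, sep, body = article.partition(b"\n\n")
    stamp = b"\nX-Body-CRC32: " + body_crc(body).encode("ascii")
    return head + stamp + sep + body


def check(article: bytes):
    """Return True/False if a checksum header is present, else None."""
    head, _, body = article.partition(b"\n\n")
    for line in head.split(b"\n"):
        name, _, value = line.partition(b":")
        if name.lower() == b"x-body-crc32":
            return value.strip().decode("ascii") == body_crc(body)
    return None  # unstamped article: nothing to verify


if __name__ == "__main__":
    stamped = add_checksum(b"Subject: test\n\n\tindented line\n")
    print(check(stamped))                                # intact copy
    print(check(stamped.replace(b"\t", b"        ")))    # tabs expanded
```

Returning `None` for unstamped articles leaves the policy question open: a site could keep a mangled copy, drop it, or hold out for a better copy from another feed, as the post suggests.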
jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) (07/26/90)
In article <14W43HC@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes: >In article <JCP&4:&@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes: >> Breaking binaries on Usenet will get rid of them, all right, but at the >> cost of cutting Usenet users off completely from nearly all >> redistributed programs. You won't get the authors to do things your way. >Breaking binaries on UNIX didn't have that result. You're contradicting >yourself. ...huh? Your statement doesn't make much sense. Binaries haven't been useful for Unix sites since it was first ported off a PDP-11. >Remember, Usenet is not a BBS. Agreed...but how is that relevant to the current discussion about binaries versus source? -- Jay Maynard, EMT-P, K5ZC, PP-ASEL | Never ascribe to malice that which can jay@splut.conmicro.com (eieio)| adequately be explained by stupidity. "It's a hardware bug!" "It's a +---------------------------------------- software bug!" "It's two...two...two bugs in one!" - _Engineer's Rap_
peter@ficc.ferranti.com (Peter da Silva) (07/26/90)
In article <9SQ&.FC@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes: > ...huh? Your statement doesn't make much sense. Binaries haven't been > useful for Unix sites since it was first ported off a PDP-11. Well, that's the point. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` <peter@ficc.ferranti.com>
storm@texas.dk (Kim F. Storm) (07/27/90)
guy@auspex.auspex.com (Guy Harris) writes: >>80 or 90% of newsreaders are the same, the only exception I know of is >>nn (which, by the by, is the best damn fine newsreader around) >Well, Wayne says it's going to be soup soon; we'll see whether things >change (although I'm told one of the GNU EMACS newsreaders also deals >with threads reasonably, and there were some mutterings about "nn" >picking up some of the threads stuff from "trn"). I've had some talks with Wayne some time back about adding his thread handling to nn, but I have been busy getting the 6.4 release into a good shape (which it is now!) So next on the agenda for nn is: Internationalisation (multicharacter support) Thread handling (maybe based on trn - I still have to see it) >>(and for all I know was written by a native English speaker). You are as wrong as you can be - ever heard me *speak* :-) :-) :-) >The author posted it from a TI site in Denmark, where, as I remember, he >works; I don't know if Kim is a Dane, an expatriate from an >English-speaking country, or an expatriate from a non-English-speaking >country. In case anybody is interested I'm Danish. -- Kim F. Storm <storm@texas.dk> No news is good news, Texas Instruments A/S, Denmark but nn is better!
michael@fts1.uucp (Michael Richardson) (07/29/90)
In article <37713@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes: >Many sites will continue to run the software they are using now, and no amount >of cajoling will cause them to install new, 8-bit compatible software. In >some cases, this is because the organization gives news and mail a low priority >(doesn't bring in money, etc). Many sites are still running obsolete software Agreed. My experience says that these people tend to be either terminal sites, or extremely heavily loaded sites, whose future existence is not necessarily assured. (fts1's feed, nrcaer is such a site) >There is a very small incentive, unfortunately, for sites in English-speaking >countries to install software to support 8-bit character sets. Sites in Canada, particularly the Canadian government would probably be able to justify the time to support the use of French. >few connections between the US and Europe -- most traffic across the Atlantic >goes through uunet. Character translation could be done on the uunet-mcsun >link -- stripping accents on articles arriving from Europe, remapping >characters So long as I can get a feed from uunet, untranslated. Obviously the translation programs would be easiest to put into the batchers. I'm not exactly sure how ISO 10646 (is that the right number?) relates to T61 and NAPLPS (which I'm quite familiar with), but I think that a method of getting 8 bit data through a 7 bit site could be devised with the proper set of filters. I just wish that uuencoded and tarmail could be better dealt with and transferred in binary form. tar | compress | uuencode | *news* | uux <-Telebit-> uuxqt | *news* | uudecode | uncompress | tar x is a necessary evil, but perhaps the blow could be reduced somewhat. Particularly if "*news*" involves batching and doing MORE compression.. >so when an American types braces in articles in comp.lang.c, readers in >Europe see braces, instead of language-specific characters. 
And the rest of the world that receives stuff from uunet? -- :!mcr!: | < political commentary currently undergoing Senate > Michael Richardson | < committee review. Returning next house session. > Play: mcr@julie.UUCP Work: michael@fts1.UUCP Fido: 1:163/109.10 1:163/138 Amiga----^ - Pay attention only to _MY_ opinions. - ^--Amiga--^
tmatimar@watmath.waterloo.edu (Ted M A Timar) (07/30/90)
In article <37713@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes: >Many sites will continue to run the software they are using now, and no amount >of cajoling will cause them to install new, 8-bit compatible software. In >some cases, this is because the organization gives news and mail a low priority >(doesn't bring in money, etc). Many sites are still running obsolete software >and will continue to do so. To install new software people need an incentive. >There is a very small incentive, unfortunately, for sites in English-speaking >countries to install software to support 8-bit character sets. I don't see why we spend so much time catering to people who are unwilling to upgrade their software to more recent versions. I think, to be reasonable, we must maintain backward compatibility to the version immediately before the current one, and possibly a bit more, so that people with different transport agents (not bnews/cnews, or PC's running news ...) can catch up. To be fair, the problem is, in part, that news installation is not trivial, and making it trivial isn't trivial. This discourages sites from upgrading when they have other problems to deal with. But they do slow down progress much more than we might desire. >This means that any new software must co-exist with the current environment. >One way to do this is to have gateway sites do conversion. There are >relatively >few connections between the US and Europe -- most traffic across the Atlantic >goes through uunet. Character translation could be done on the uunet-mcsun >link -- stripping accents on articles arriving from Europe, remapping >characters >so when an American types braces in articles in comp.lang.c, readers in >Europe see braces, instead of language-specific characters. Unfortunately, there are many gateway sites to Europe. There are also many gateway sites to Quebec. (Both Europe and Quebec can use NNTP to get news from almost anywhere now.) 
I would like to see a Usenet II / Usenet with very few gateways. No new software would be written for Usenet, and all news would be gatewayed both ways. Those gatewayed in would, in many cases be entirely gibberish. This would encourage sites to upgrade as soon as possible. Furthermore, I would recommend that a "Language:" header be added to all articles. And a reverse KILL format to the newsreaders, so that people could kill all but the languages they could read. -- Just my .0002 cents worth Ted Timar tmatimar@watmath.waterloo.edu
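The "Language:" header and "reverse KILL" suggested above could look something like the sketch below: instead of listing what to kill, the reader lists the languages it wants and everything else is skipped. No such header or newsreader option is known to exist; the header values (`en`, `da`, `fr`), the function names, and the decision to show untagged articles by default are all assumptions made for illustration.

```python
def header_value(article: str, name: str):
    """Return the value of the first header called `name`, or None."""
    head = article.split("\n\n", 1)[0]
    for line in head.split("\n"):
        key, sep, value = line.partition(":")
        if sep and key.strip().lower() == name.lower():
            return value.strip()
    return None


def readable(article: str, languages: set, show_untagged: bool = True) -> bool:
    """Reverse kill: keep only articles in one of the reader's languages.

    `languages` holds lowercase tags chosen by the reader (hypothetical
    values). Articles without a Language header are shown by default,
    since existing software would not add one.
    """
    lang = header_value(article, "Language")
    if lang is None:
        return show_untagged
    return lang.lower() in languages


if __name__ == "__main__":
    mine = {"en", "da"}
    print(readable("Subject: hej\nLanguage: da\n\nHej alle", mine))
    print(readable("Subject: salut\nLanguage: fr\n\nBonjour", mine))
    print(readable("Subject: old software\n\nNo header here", mine))
```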
zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/30/90)
It looks like the latest trn (threaded rn) will support the display of eighth bit set characters. With it and cnews*, one can already run a local or controlled distribution group with an 8 bit character set (eg, iso8859/1). * - If the utilities inews uses aren't 8 bit clean, you may be out of luck until Henry does a new version. I wonder if anyone out there reads this like I do - "risumi". -- Jon Zeeff (NIC handle JZ) zeeff@b-tech.ann-arbor.mi.us
henry@zoo.toronto.edu (Henry Spencer) (07/30/90)
In article <N8N&T+D@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes: >* - If the utilities inews uses aren't 8 bit clean, you may be out >of luck until Henry does a new version. Actually, Geoff is the one who would do it (and is working on it, as time permits). I would hope that a lot of the sites that need to post 8-bit articles already have 8-bit-clean utilities, since I would expect they'd need them for other purposes. -- The 486 is to a modern CPU as a Jules | Henry Spencer at U of Toronto Zoology Verne reprint is to a modern SF novel. | henry@zoo.toronto.edu utzoo!henry
jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) (07/30/90)
In article <5IX4X2D@ggpc2.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes: >In article <9SQ&.FC@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes: >> ...huh? Your statement doesn't make much sense. Binaries haven't been >> useful for Unix sites since it was first ported off a PDP-11. >Well, that's the point. ...except that your point doesn't port to the IBM-PC environment, where binaries are useful; the mountain of PC packages delivered in object form, often in some compressed archive format with documentation in a neat little bundle, is an existence proof. Usenet is more than Unix. -- Jay Maynard, EMT-P, K5ZC, PP-ASEL | Never ascribe to malice that which can jay@splut.conmicro.com (eieio)| adequately be explained by stupidity. "It's a hardware bug!" "It's a +---------------------------------------- software bug!" "It's two...two...two bugs in one!" - _Engineer's Rap_
tneff@bfmny0.BFM.COM (Tom Neff) (07/30/90)
In article <NLV&_.D@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes: >...except that your point doesn't port to the IBM-PC environment, where >binaries are useful; the mountain of PC packages delivered in object >form, often in some compressed archive format with documentation in a >neat little bundle, is an existence proof. > >Usenet is more than Unix. Oh yeah, but much less (and more) than a BBS. Binaries *WILL BREAK OUR BACKS* if we open the doors wide! And they are intrinsically parochial, where text and source are ecumenical. It doesn't matter how many new kinds of computers join: the net's ESSENTIAL CHARACTER needs preserving. This is a lesson it would be unfair (unrealistic anyway) to expect PC BBS refugees to understand right away. A lot of them show up and want to chat the sysop, as it were. Post three revisions a week of those DR. FILEGOOD binaries, the Bart Simpson pictures, and Zeke's Monthly List of 3,000 BBS's in ZIP form to preserve every superfluous ^M. The "existence proof" of all the PC binaries carried on BBS's and services is also a sufficiency proof. These folks HAVE their BBS's already, and CompuServe, and GEnie, and BIX, etc, etc. They don't need to turn Usenet into another one. -- If the human mind were simple enough to understand, =)) Tom Neff we'd be too simple to understand it. -- Emerson Pugh ((= tneff@bfmny0.BFM.COM
peter@ficc.ferranti.com (Peter da Silva) (07/30/90)
In article <1990Jul29.171112.4093@watmath.waterloo.edu> tmatimar@watmath.waterloo.edu (Ted M A Timar) writes: > I don't see why we spend so much time catering to people who are unwilling > to upgrade their software to more recent versions. Unwilling? Or unable? I have not been able to get B news at a higher patch level than 14 to run on Xenix System III/286. We don't have the option of upgrading to a newer O/S: we depend on certain programs that won't run on anything else. I have sidestepped the problem by switching to C news, at the cost of a 13 hour unpaid day during the Christmas shutdown, plus days of after hours hacking before and after. And recent versions of C news have broken again... I'm now several patches behind. News 3.0 was never an option... it's just too big. -- Peter da Silva. `-_-' +1 713 274 5180. 'U` <peter@ficc.ferranti.com>
cudep@warwick.ac.uk (Ian Dickinson) (08/01/90)
In article <15710@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes: >[Binaries] are intrinsically parochial, where text and source are ecumenical. Good stuff - I like the way that was put. >The "existence proof" of all the PC binaries carried on BBS's and >services is also a sufficiency proof. These folks HAVE their BBS's >already, and CompuServe, and GEnie, and BIX, etc, etc. They don't need >to turn Usenet into another one. Most people from the BBS crowd who I know like Usenet *BECAUSE* it is ^M different. They wouldn't want it to change into a monolithic BBS. ^M And they're prepared to put in the effort to make sure it stays that way. ^M ^M :-) -- \/ato. Ian Dickinson. GNU's not got BSE. Food: Hungarian Acacia Honey. vato@cu.warwick.ac.uk Plinth. Machine: Sun SPARCserver 330. vato@tardis.cs.ed.ac.uk Sabeq. Footwear: Airwalk Vic Blotch. gdd046@cck.cov.ac.uk Oxymoron: Intelligent Life.
VERKADE@CTSS.CO.UK (Herman Verkade) (08/02/90)
A couple of comments on 8 bit news. It seems to me that it is not necessary to convert the whole net to 8 bit. The 7 bit restriction is only a problem for specific newsgroups: newsgroups in languages other than English and newsgroups containing binary data, such as bitmaps, .gif files, etc. So, I don't think **everybody** needs to upgrade to some implementation that supports 8 bits. Only those that wish to carry newsgroups that need it. All we would need is a standard, not necessarily a world-wide upgrade of software. For example, if the Germans and the Finns decide that their local language newsgroups will be 8 bit, then that is no business of the Americans, British, Spanish or Russians. As long as there is some software that supports 8 bits (And C News seems to do that or alternatively a few changes in B News) and maybe some way of indicating that a particular group expects 8-bit, so that when posting to a 7 bit group another signature can be used; one that doesn't have 8 bit characters in it. And if people want to start a new newsgroup for .gif files in 8 bit mode then, again, the sites that want to carry it must install an 8 bit version. If your feed doesn't support it, get another feed for that newsgroup (A similar situation exists in the UK, where UKC doesn't carry nor forward `alt.sex.pictures'. Sites that want it get another feed for that group). The next problem would be how to read an 8 bit group, both for groups that use different character sets and for groups that carry other 8 bit data. As someone earlier suggested, maybe an extra header should be added to the standard that indicates what type of data it is. For mail there are RFC-1049 (Content-Type) and RFC-1154 (Encoding), which are extensions to RFC-822. The extra header fields would only need to be interpreted by the news reader. So, only if you want to read an 8 bit group, get an 8 bit reader. As long as we can agree on some standard. 
My proposal would be RFC-1154-style, because it also allows one message to contain encodings in different parts and could therefore also be used to automatically convert different parts of a message in 7 bit groups. For example, a message containing a uuencoded file preceded by some explanation in ASCII and a signature at the bottom, could have a header such as: Encoding: 10 text, 1045 uuencode, 5 text A smart news reader could display the two text parts and ask whether you want the uuencode bit to be uudecoded. For an article containing a header like: Encoding: 15 text, 637 uugif, 5 text the reader could then automatically extract the uuencoded .gif file and display an image instead. Etc, etc, etc. And only users that want such functionality switch to a news reader that supports it. I realise that I am discussing two separate topics here: 1) Provide 8 bit transport mechanisms so that international character sets can be used, but enable 8 bits only on a newsgroup by newsgroup basis with either a designated character set for such a group, or an Encoding header to indicate the character set. 2) An Encoding: header for carrying data other than text (in either 7 or 8 bit groups). For both I suggest to provide a standard, but not to force anybody to upgrade to new software. I think this proposal provides for backward compatibility and meets the requirements of a fair number of net.people. Herman Verkade
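What a reader would do with a header such as "Encoding: 10 text, 1045 uuencode, 5 text" can be sketched as below. The sketch assumes the numbers are line counts, as the examples in the post suggest, and handles only the simple "count keyword" form shown there; it is not a full RFC 1154 parser, and the function names are made up for this illustration.

```python
def parse_encoding(value: str):
    """Turn '10 text, 1045 uuencode, 5 text' into [(10, 'text'), ...]."""
    parts = []
    for item in value.split(","):
        fields = item.split()
        if len(fields) != 2 or not fields[0].isdigit():
            raise ValueError("bad Encoding item: %r" % item)
        parts.append((int(fields[0]), fields[1].lower()))
    return parts


def split_body(body: str, encoding_value: str):
    """Cut the body into (keyword, lines) sections by line count."""
    lines = body.splitlines()
    sections, pos = [], 0
    for count, kind in parse_encoding(encoding_value):
        sections.append((kind, lines[pos:pos + count]))
        pos += count
    return sections


if __name__ == "__main__":
    body = "Here is the file.\nbegin 644 demo\nend\n-- \nHerman\n"
    for kind, lines in split_body(body, "1 text, 2 uuencode, 2 text"):
        if kind == "text":
            print("\n".join(lines))          # display as-is
        else:
            # A real reader would offer to decode this section.
            print("[%d lines of %s data]" % (len(lines), kind))
```

A reader that does not know a keyword (say "uugif") can still skip exactly that many lines and show the text around it, which is what makes the scheme backward compatible for readers.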
amanda@mermaid.intercon.com (Amanda Walker) (08/03/90)
In article <900802011259.00001B1F@MARVIN.CTSS.CO.UK>, VERKADE@CTSS.CO.UK (Herman Verkade) writes: > The 7 bit restriction is only a problem for > specific newsgroups: newsgroups in languages other than English and newsgroups > containing binary data, such as bitmaps, .gif files, etc. It's also a problem for any newsgroup that has non-English speakers posting to it. For example, on alt.sca, the name of one common poster comes out as "]ke"-- he's Swedish, and the "]" should really be an A with a ring over it... Even in groups whose traffic is conducted only in English, there is a growing proportion of people whose names cannot be spelled with 7-bit ASCII. -- Amanda Walker <amanda@intercon.com> InterCon Systems Corporation
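The "]ke" effect comes from the 7-bit national character sets, which reuse a handful of ASCII code positions for local letters. In the Swedish variant of ISO 646 the codes that US ASCII shows as `[ \ ] { | }` are displayed as Ä Ö Å ä ö å. The sketch below just applies that mapping to show both sides of the problem; the function name is made up, and it assumes the poster's terminal uses that Swedish variant.

```python
# Code positions shared by US ASCII and the Swedish 7-bit variant,
# with the glyph each one shows on a Swedish terminal.
SWEDISH_7BIT = str.maketrans("[\\]{|}", "ÄÖÅäöå")


def show_as_swedish(text: str) -> str:
    """Show 7-bit text the way a Swedish-variant terminal would."""
    return text.translate(SWEDISH_7BIT)


if __name__ == "__main__":
    # What the poster typed and sees locally:
    print(show_as_swedish("]ke"))
    # What a Swedish reader sees when an American posts C code:
    print(show_as_swedish("main() { return a[0] | b; }"))
```

The same seven-bit codes carry different letters in different countries, so neither side can tell from the bytes alone which reading was intended.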
ed@braaten.doit.sub.org (Ed Braaten) (08/05/90)
VERKADE@CTSS.CO.UK (Herman Verkade) writes: >A couple of comments on 8 bit news. It seems to me that it is not necessary to >convert the whole net to 8 bit. The 7 bit restriction is only a problem for >specific newsgroups: newsgroups in languages other than English and newsgroups >containing binary data, such as bitmaps, .gif files, etc. So, I don't think >**everybody** needs to upgrade to some implementation that supports 8 bits. >Only those that wish to carry newsgroups that need it. All we would need >is a standard, not necessarily a world-wide upgrade of software. I think this is the right approach to the problem. If it works, don't fix it! ;-) But give the non-English and binary people a chance. A standard, however, is an absolute must. >My proposal would be RFC-1154-style, because it also allows one message to >contain encodings in different parts and could therefore also be used to >automatically convert different parts of a message in 7 bit groups. For >example, a message containing a uuencoded file preceded by some explanation >in ASCII and a signature at the bottom, could have a header such as: > Encoding: 10 text, 1045 uuencode, 5 text >A smart news reader could display the two text parts and ask whether you want >the uuencode bit to be uudecoded. For an article containing a header like: > Encoding: 15 text, 637 uugif, 5 text >the reader could then automatically extract the uuencoded .gif file and >display an image instead. Etc, etc, etc. And only users that want such >functionality switch to a news reader that supports it. How about it? Could we get the author of nn sold on this? (I'm crossposting this article to n.s.nn to find out...) >I realise that I am discussing two separate topics here: >1) Provide 8 bit transport mechanisms so that international character sets > can be used, but enable 8 bits only on a newsgroup by newsgroup basis > with either a designated character set for such a group, or an Encoding > header to indicate the character set. 
>2) An Encoding: header for carrying data other than text (in either 7 or > 8 bit groups). I like your suggestions Herman. What about the rest of the net? Opinions? Comments? Greetings from Munich, Ed --------------------------------------------------------------------------- Ed Braaten | Jesus answered, "I am the way and the Work: ed@imuse.de.intel.com | truth and the life. No one comes to the Home: ed@braaten.doit.sub.org | Father except through me." John 14:6 ---------------------------------------------------------------------------
richard@pegasus.com (Richard Foulk) (08/06/90)
In article <N8N&T+D@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes: >It looks like the latest trn (threaded rn) will support the display of >eighth bit set characters. With it and cnews*, one can already run a >local or controlled distribution group with an 8 bit character set (eg, >iso8859/1). > Yes, but trn is total vaporware. -- Richard Foulk richard@pegasus.com
mcmahon@tgv.com (John McMahon) (08/06/90)
In article <3863a@braaten.doit.sub.org>, ed@braaten.doit.sub.org (Ed Braaten) writes... >>My proposal would be RFC-1154-style, because it also allows one message to >>contain encodings in different parts and could therefore also be used to >>automatically convert different parts of a message in 7 bit groups. For >>example, a message containing a uuencoded file preceded by some explanation >>in ASCII and a signature at the bottom, could have a header such as: > >> Encoding: 10 text, 1045 uuencode, 5 text My understanding is that an RFC is in the works for "non-textual transmission of data via E-mail". I suspect this could be easily expanded to include USENET NEWS. Watch the NIC for announcements of new RFCs... John 'Fast-Eddie' McMahon : MCMAHON@TGV.COM : TTTTTTTTTTTTTTTTTTTTTTTT TGV, Incorporated : : T GGGGGGG V V 603 Mission Street : HAVK (abha) Gur bayl : T G V V Santa Cruz, California 95060 : bcrengvat flfgrz gb : T G GGGG V V 408-427-4366 or 800-TGV-3440 : or qrfgeblrq ol znvy : T GGGGGGG V