[news.software.b] Time for 8 bit news, isn't it?????.

greyham@hades.ausonics.oz.au (Greyham Stoney) (07/19/90)

Why don't all you people divert your energies into making your news system
handle 8 bit news rather than developing new and incompatible ways of
bitbashing your files into a format that both news and your unpacking program
(be it /bin/sh, sed, awk or whatever) can cope with?.

Considering the advantages, (especially to binaries groups), It's gotta be the
most worthwhile step to take. It may mean changes at lots of sites, but we
gotta start somewhere.

[ Insert prediction of imminent death-of-net here should net decide that
  status-quo is more important than advancing with the times :-) ]

								Greyham.
-- 
/*  Greyham Stoney:                            Australia: (02) 428 6476
 *  greyham@hades.ausonics.oz.au - Ausonics Pty Ltd, Lane Cove, Sydney, Oz.
 *		Neurone Server: Brain Cell not Responding.
 */

tneff@bfmny0.BFM.COM (Tom Neff) (07/20/90)

In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:
>Why don't all you people divert your energies into making your news system
>handle 8 bit news rather than developing new and incompatible ways of
>bitbashing your files into a format that both news and your unpacking program
>(be it /bin/sh, sed, awk or whatever) can cope with?.

8 bit news would help only slightly with things OTHER than the transmission
of binary files via news.   Seven bit is basically doing the job now;
the remaining issues (envelope consistency, line lengths, character sets,
paragraph wrapping etc) aren't going to be solved by going to 8 bits.

As for transmitting binary files, 8 bit alone is insufficient.  No binary
ought to be transmitted without self contained integrity checking as
well as the means to split it up into pieces of acceptable size.  Hence
some kind of packaging is unavoidable.  Given that fact, why not go the
small additional distance and have the packaging map into 7 bits?
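
A minimal sketch of the kind of packaging described above: split the file
into parts, give each part its own integrity check, and map it into 7-bit
text.  The header layout ("part N/M crc32 X"), the function names, and the
choice of base64 with CRC-32 are assumptions made for illustration; they are
not any existing news packaging format.

```python
import base64
import zlib


def package(data: bytes, part_size: int = 50_000) -> list[str]:
    """Split data into parts of 7-bit text, each led by a CRC-32 header line."""
    chunks = [data[i:i + part_size] for i in range(0, len(data), part_size)] or [b""]
    parts = []
    for n, chunk in enumerate(chunks, start=1):
        # Hypothetical header: part number, part count, checksum of the raw chunk.
        header = f"part {n}/{len(chunks)} crc32 {zlib.crc32(chunk):08x}"
        body = base64.encodebytes(chunk).decode("ascii")
        parts.append(header + "\n" + body)
    return parts


def unpackage(parts: list[str]) -> bytes:
    """Reassemble parts (assumed to be supplied in order), verifying each checksum."""
    out = []
    for part in parts:
        header, _, body = part.partition("\n")
        fields = header.split()
        chunk = base64.decodebytes(body.encode("ascii"))
        if f"{zlib.crc32(chunk):08x}" != fields[3]:
            raise ValueError(f"checksum mismatch in part {fields[1]}")
        out.append(chunk)
    return b"".join(out)
```

Note that part_size counts raw bytes here; the encoded text of each part is
roughly a third larger.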

I have never thought, and do not think now, that transmitting binaries
is an appropriate activity for Usenet... but a significant minority
disagrees, and since they can control who does and doesn't carry the
binary bandwidth, it's fine with me.  Either way, 8 bit articles don't
fix anything fundamentally broken, so I'd concentrate energies elsewhere.

>[ Insert prediction of imminent death-of-net here should net decide that
>  status-quo is more important than advancing with the times :-) ]

[ Insert ritual threat to take dollys and go home here. :-) ]
-- 
 1955-1975: 36 Elvis movies.  |  Tom Neff
 1975-1989: nothing.          |  tneff@bfmny0.BFM.COM

kibo@pawl.rpi.edu (James 'Kibo' Parry) (07/21/90)

In article <15688@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:
>>Why don't all you people divert your energies into making your news system
>>handle 8 bit news rather than developing new and incompatible ways of
>>bitbashing your files into a format that both news and your unpacking program
>>(be it /bin/sh, sed, awk or whatever) can cope with?.
>
>8 bit news would help only slightly with things OTHER than the transmission
>of binary files via news.   Seven bit is basically doing the job now;
>the remaining issues (envelope consistency, line lengths, character sets,
>paragraph wrapping etc) aren't going to be solved by going to 8 bits.

Going to eight bits WOULD be nice for people using languages other than
English;  as it is now, if you're in, say, Finland, and you have a terminal
that does the Finnish variant of ASCII, people outside Finland are going
to see braces, brackets, backslashes, etc., wherever you say something
with an accented character.
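
To illustrate the effect, here is a small sketch of how the same six byte
values read on a terminal using the Swedish/Finnish 7-bit national variant
and on a plain US ASCII terminal.  The function name is made up for the
example; the six substitutions are the usual ones for that variant.

```python
# In the national 7-bit variant, six ASCII punctuation positions carry letters.
FINNISH_7BIT = {0x5B: "Ä", 0x5C: "Ö", 0x5D: "Å", 0x7B: "ä", 0x7C: "ö", 0x7D: "å"}


def show(raw: bytes, national: bool) -> str:
    """Render 7-bit bytes as a national-variant or a US ASCII terminal would."""
    return "".join(FINNISH_7BIT.get(b, chr(b)) if national else chr(b) for b in raw)


raw = b"p{iv{{"                    # the bytes behind the Finnish word "päivää"
print(show(raw, national=True))    # päivää
print(show(raw, national=False))   # p{iv{{
```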

However, there are some good 8-bit character sets (IBM PC's, HP's,
ECMA-94 Latin, etc.) which could be used so that when someone in Germany
types an "o" with an umlaut, in France it'll appear as an "o" with an
umlaut and not as a question mark or a bracket or something.  Each foreign
character could have exactly one representation, as opposed to the
current scheme where some systems use ASCII, some use the Swedish variant,
etc...

Doing this would probably be best handled by (a) picking a standard (let's
say we decide that non-English characters will be located in the ECMA character
set.) and then (b) we either give everyone in the world a terminal that
can display them (which seems very unfeasible) or else we just build into
the next versions of the news-reading software a little option that maps
the plus-128 characters onto the PC, HP, ECMA, ASCII, whatever character
set as it displays articles.
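
A rough sketch of such a reader option, assuming articles travel in the
ECMA-94 Latin set (treated here as what Python calls latin-1) and that the
reader knows which set the local terminal uses.  The table of terminal names
and the function name are invented for the example.

```python
# Hypothetical reader setting -> Python codec for the local terminal.
LOCAL_CODECS = {"ecma": "latin-1", "pc": "cp437", "ascii": "ascii"}


def for_terminal(article: bytes, terminal: str) -> bytes:
    """Map an article from the wire character set onto the local one.

    Characters the local set lacks are replaced with '?'.
    """
    text = article.decode("latin-1")
    return text.encode(LOCAL_CODECS[terminal], errors="replace")
```

For example, an "o" with an umlaut arrives as byte 0xF6, is shown on an IBM
PC as byte 0x94, and degrades to "?" on a plain ASCII terminal.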

I'm sure someone will be able to poke holes in this idea, but it seems like
something we should at least consider, given that the United States no
longer accounts for as much of the Usenet readership as it used to.

Comments?

-- 
james "kibo" parry, 138 birch lane, scotia, ny 12302 <-- close to schenectady.
kibo@pawl.rpi.edu            _________________________________________________
kibo%pawl.rpi.edu@rpi.edu   / Kibology    /  Anything I say is my opinion,
userfe0n@rpitsmts.bitnet   /  is better! /   and is the opposite of Xibo's.

brad@looking.on.ca (Brad Templeton) (07/21/90)

Why should binaries be split up into smaller bits, particularly bits as small
as 50K?

If I'm going to lose parts of a multi-part binary, I may as well lose the
whole thing.

I wrote the ABE/DABE system to make that easier to deal with, but even so,
having to deal with missing parts and assembly, etc., is a pain.

At the very least, raise the limit to something manageable like 500K or
1 meg, set it as an explicit limit in the next RFC, and leave splitting
to only the very largest binaries.

But there is a problem.  Say we sit down and make a news system that
can handle 500K 8 bit binary files.  Great.  Slowly, people start to run
that system.

But what moderator is going to post these in his/her binary group?  Knowing
that they will break at many sites, for a LONG time to come.

So there will have to be two groups, one for pure binaries and one for split
ones.  And thus we double the load, and we have nothing to encourage the
sites running old software to upgrade.

So in the end we gain nothing.


This is nothing new.  The last major changes in the format of news articles
were Supersedes: and References:   These were added around 1985 -- that's
centuries ago in the computer/networking world.

AND WE STILL CAN'T USE THEM TODAY!!!!

Not a good sign.  Drastic measures are needed.

I would support a move to design a binary transmission format, have the
new releases of B and C news support them, and have all the moderators switch,
thus forcing anybody who wants binaries to get off their duffs and upgrade.

This would work as binaries are one of the biggest draws of usenet for many
sites.

But before doing this, I would say we should sit down at a usenix and
list out all the other new features we want, then implement them, because
we won't get another chance for 5 years to upgrade the format.
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

jfriedl@frf.omron.co.jp (NFF) (07/21/90)

In article <15688@bfmny0.BFM.COM>, tneff@bfmny0.BFM.COM (Tom Neff) writes:
> Either way, 8 bit articles don't
> fix anything fundamentally broken, so I'd concentrate energies elsewhere.

Well, it's certainly not the most important thing [at least to many people],
but not having 8 bits is a pain (for example) for me when I send mail with
Japanese text in it.  I've been able to send mail here around Japan and
not have it stripped, but try anything outside the country and you end
up with a bunch of gook ("gook" -- that's a technical term, in case you
aren't up on your science, for 8-bit text stripped to 7 bits).
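
A small sketch of what that stripping does, assuming the text is encoded in
EUC-JP (one common 8-bit encoding for Japanese; the post does not say which
encoding was actually in use).

```python
def strip_high_bit(data: bytes) -> bytes:
    """Clear the top bit of every byte, as a 7-bit-only relay would."""
    return bytes(b & 0x7F for b in data)


original = "日本語".encode("euc_jp")     # every byte has its high bit set
stripped = strip_high_bit(original)

print(original.hex(), stripped.hex())
print(stripped.decode("ascii"))          # printable ASCII, but no longer Japanese
```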

Again, maybe for 99.44% of the traffic within The States, it doesn't
matter in this respect, but as so many students^H^H^H^H^H^H^H^HAmericans
don't seem to know, the world is not only a VAX^H^H^H^H^HAmerica.

minor digression:
  My brother told me about some (I think) circa 1981 Unipress Emacs
  code he was working on long ago which used the high bit as a marker
  for something in the text.  In the code where they were dealing with
  this was the comment: /* sorry japan */
  He always thought that was funny. Me too, now.

	*jeff*
-----------------------------------------------------------------------------
Jeffrey Eric Francis Friedl                          jfriedl@nff.ncl.omron.jp
Direct path from uunet:                             ...!uunet!othello!jfriedl
Omron Electronics, Central R&D Lab, RNA          Nagaokakyo, Kyoto 617, Japan
Fax: 011-81-75-955-2442                        Phone: 011-81-75-951-5111 x154

      "sorry, but I can't spell"
                        -me
      "current memory prices are 4600$ a megabyte on VAX (4/22/81)"
                                              -- my '/usr/include/vmparam.h'   

jv@mh.nl (Johan Vromans) (07/21/90)

In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:

> Why don't all you people divert your energies into making your news system
> handle 8 bit news ...
> Considering the advantages, (especially to binaries groups), ...

Very short-sighted. Major advantage is to allow information exchange
in local languages that require special character sets.
And don't tell me that the "language of the news" is English, since
that is only true for agreed-upon international newsgroups.

	Johan
-- 
Johan Vromans				       jv@mh.nl via internet backbones
Multihouse Automatisering bv		       uucp: ..!{uunet,hp4nl}!mh.nl!jv
Doesburgweg 7, 2803 PL Gouda, The Netherlands  phone/fax: +31 1820 62911/62500
------------------------ "Arms are made for hugging" -------------------------

eps@toaster.SFSU.EDU (Eric P. Scott) (07/21/90)

We have an 8-bit standard: ISO 8859.  Great for us Western
European-American types.  It doesn't help the fj newsgroups much.
(or kremvax!gorby)

					-=EPS=-

Dan@dna.lth.se (Dan Oscarsson) (07/21/90)

In article <+7Y$AV&@rpi.edu> kibo@pawl.rpi.edu (James 'Kibo' Parry) writes:
>In article <15688@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>>In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:
>>>Why don't all you people divert your energies into making your news system
>>>handle 8 bit news rather than developing new and incompatible ways of
>>>bitbashing your files into a format that both news and your unpacking program
>>>(be it /bin/sh, sed, awk or whatever) can cope with?.
>>
>>8 bit news would help only slightly with things OTHER than the transmission
>>of binary files via news.   Seven bit is basically doing the job now;
>>the remaining issues (envelope consistency, line lengths, character sets,
>>paragraph wrapping etc) aren't going to be solved by going to 8 bits.
>
>Going to eight bits WOULD be nice for people using languages other than
>English;  as it is now, if you're in, say, Finland, and you have a terminal
>that does the Finnish variant of ASCII, people outside Finland are going
>to see braces, brackets, backslashes, etc., wherever you say something
>with an accented character.
>

Yes it is time to start thinking about using an international character set
in netnews. This means that 8bit bytes are used but not that binary files
can be transmitted without encapsulation. Binary files must still be
converted into an encoded format that can be checked and unpacked in a controlled
manner.

Only one character set should be used for transmitting articles as it is 
impossible for everyone to handle all in the world. In European talks about
a character set to use for mail, ISO 10646 is the best candidate, and it
should be fine for netnews also. ISO 10646 has both ASCII and ISO 8859-1 as
true subsets, and that will ease compatibility problems.

Local netnews readers will have to convert from ISO 10646 into the character
set used locally.
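
The "true subset" property can be checked mechanically.  The sketch below
leans on the fact that Python strings use Unicode code points, which track
ISO 10646 as it was later published; the draft under discussion here may
have differed in detail.  The function name is made up.

```python
def is_true_subset(codec: str, size: int) -> bool:
    """True if byte value n in `codec` is the character with code point n."""
    return all(bytes([n]).decode(codec) == chr(n) for n in range(size))


print(is_true_subset("ascii", 128))     # True
print(is_true_subset("latin-1", 256))   # True
print(is_true_subset("cp437", 256))     # False: the IBM PC set needs a real table
```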

Using ISO 10646 allows nearly every letter in the world to be written.

    Dan

-- 
Dan Oscarsson                              Department of Computer Science
                                           Lund Institute of Technology
e-mail:  Dan@dna.lth.se                    Box 118
                                           S-221 00 Lund, Sweden

scs@lokkur.dexter.mi.us (Steve Simmons) (07/22/90)

jv@mh.nl (Johan Vromans) writes:

:> Why don't all you people divert your energies into making your news system
:> handle 8 bit news ...
:> Considering the advantages, (especially to binaries groups), ...

:Very short-sighted. Major advantage is to allow information exchange
:in local languages that require special character sets.
:And don't tell me that the "language of the news" is English, since
:that is only true for agreed-upon international newsgroups.

The sarcasm lamp is now lit.  :-)

So what's your point?

Leave us not forget that 8-bits isn't the answer for all languages,
either.  Of course, there aren't many of us reading the current news
who want to read those kanji, katakana, and ghu-only-knows what other
variants.  Still, we should all be forced to make software that is
capable of handling it so that the English readers can look at the
Hindi, Korean and Russian postings that flow by.

And I'm sure those guys back at Duke said, "Hey, let's have some
agreed-upon international newsgroups that'll be only in English
and we'll implement to enforce it."

The sarcasm lamp is now off.  :-)

Hey, news is ASCII-based, written in english-speaking countries for
english-speaking readers.  The fact that it works *at all* for
international and non-English stuff is a wonderful plus.  If regional
newsgroups have regional needs, they should go ahead and fill them.
But neither side should expect interoperability.

My understanding is that a number of nordic installations now have
appropriate hacks to encode/decode/display their national character
sets.  That's super; I hope that software propagates its way across
the water.  But proper gateways and translations (into 7-bit)  will
be needed or the postings are gonna break a lot of systems.

henry@zoo.toronto.edu (Henry Spencer) (07/22/90)

In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>I would support a move to design a binary transmission format, have the
>new releases of B and C news support them...

Unless there is something I have overlooked, C News neither knows nor cares
whether the body of a message is text or binary.  C News is, by intent and
I think in practice, 8-bit clean, and it does not care whether the body is
split into lines or not (although the headers must follow the standards).
Furthermore, it doesn't care whether the article is 5KB or 5MB.

(One caution:  I speak here of relaynews, expire, etc.  The inews shell
script uses many existing Unix tools that aren't so tolerant.  This is
an issue only on sites that post odd messages, though, not on ones that
receive them.)

There are occasional problems with transport subsystems -- in particular,
links that transmit news by mail without encoding it are a major problem --
and the readers are a can of worms, but I don't think C News needs any
modifications for this.  B News might or might not; I'm not sure.
-- 
NFS:  all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology
and its performance and security too.  |  henry@zoo.toronto.edu   utzoo!henry

guy@auspex.auspex.com (Guy Harris) (07/22/90)

>We have an 8-bit standard: ISO 8859.  Great for us Western
>European-American types.  It doesn't help the fj newsgroups much.
>(or kremvax!gorby)

ISO 8859 doesn't help the "fj" newsgroups much, but "kremvax!gorby"
could use ISO 8859/5, Latin-Cyrillic alphabet.

Did you perhaps mean "ISO 8859/1" rather than "ISO 8859"?

guy@auspex.auspex.com (Guy Harris) (07/22/90)

>This is nothing new.  The last major changes in the format of news articles
>were Supersedes: and References:   These were added around 1985 -- that's
>centuries ago in the computer/networking world.
>
>AND WE STILL CAN'T USE THEM TODAY!!!!

I use "References:" all the time; you're just not using the right
newsreader.  (Hey Wayne!  Isn't it soup yet? :-))

No, the References: lines aren't always correct, or complete; "trn"'s
thread-constructor also works from Subject: lines (philosophical
objections to the null device, please, linking by subject *gets the job
done better than only using the reference lines* in today's imperfect
world).  However, they *are* used, and *do* come in handy....

amanda@mermaid.intercon.com (Amanda Walker) (07/22/90)

In article <1990Jul21.054016.10409@looking.on.ca>, brad@looking.on.ca (Brad
Templeton) writes:
> I would say we should sit down at a usenix and
> list out all the other new features we want, then implement them, because
> we won't get another chance for 5 years to upgrade the format.

I think it's probably too late for that.  My view at this point is that
Usenet has become large enough (and established enough as an operational
network) that it is effectively frozen.  A more fruitful approach is probably
to start building "Usenet II".

It's easier to migrate people from one system to another than it is to
upgrade them in place.  Just to take an example, I didn't switch
intercon.com over to C news until I had to bring up news from scratch on a
new piece of hardware, simply because it was not worth risking a breakdown
of the existing service while I got the new stuff running.  This may sound
shortsighted, but it is an example of what you have to deal with when a
system is being used for everyday operation.  Sometimes a disruption is
worse than putting up with current limitations.

Myself, I lean towards the techno-nerd side, and always want the latest
and greatest toys.  However, any plans for Usenet have to take the majority
of the sites and users into account.

Rather than telling people they should upgrade, I'd rather build something
better and have them decide for themselves that it would be a good idea :-).
It minimizes aggravation for everyone, and keeps people from bugging you
while you're building the better mousetrap...

--
Amanda Walker <amanda@intercon.com>
InterCon Systems Corporation

brad@looking.on.ca (Brad Templeton) (07/22/90)

C news may indeed *handle* arbitrary byte stream messages, but does it
*support* them?

In particular, input programs like inews must deal with them, and a new
header line must be added to classify article body types.  Many body types
are possible, including:
	ascii + underlining	(current default)
	extended ascii		(international char sets, etc.)
	rich text		(of some format)
	andrew message
	non-interpreted binary	(programs, etc.)
	binary with text header
	XXX format bitmap	(gif, tif)

In particular, I would devise code words for every format, such that those
code words could conveniently be the names of output programs in some
directory, with the default built into the reader, of course.
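
One way the code-word idea might look in a reader, as a sketch only.  The
header name "Body-Type", the directory path, and the function name are all
assumptions; no such header exists in the current article format.

```python
import posixpath
from email.parser import HeaderParser
from typing import Optional

DEFAULT_TYPE = "ascii"   # handled inside the reader itself


def viewer_for(article: str, viewer_dir: str = "/usr/lib/news/viewers") -> Optional[str]:
    """Return the path of the output program for an article, or None for the default."""
    headers = HeaderParser().parsestr(article)
    code_word = headers.get("Body-Type", DEFAULT_TYPE).strip().lower()
    if code_word == DEFAULT_TYPE:
        return None
    # The code word becomes a file name, so refuse anything that is not plain
    # letters and digits (no slashes, no "..").
    if not code_word.isalnum():
        raise ValueError(f"bad body type: {code_word!r}")
    return posixpath.join(viewer_dir, code_word)
```

Adding a new body type would then mean dropping one more program into the
directory, with no change to the reader.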

-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

brad@looking.on.ca (Brad Templeton) (07/22/90)

I did not mean that there was no software to use References lines, I have
some myself.

I meant that the chains are broken far too often.  About 6% of followups
have no references line, and that results in broken chains on about
40% of usenet messages (every followup to such a followup is broken as
well.)
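
A back-of-the-envelope model of how a 6% miss rate compounds, assuming each
followup independently drops its References line with the same probability.
The independence assumption and the function names are mine; the post does
not say how the 40% figure was measured.

```python
import math


def broken_fraction(p_missing: float, depth: int) -> float:
    """Chance that an article `depth` followups below the root has a broken chain."""
    return 1.0 - (1.0 - p_missing) ** depth


def depth_for(p_missing: float, broken: float) -> float:
    """Followup depth at which the broken-chain chance reaches `broken`."""
    return math.log(1.0 - broken) / math.log(1.0 - p_missing)


print(round(broken_fraction(0.06, 8), 3))   # 0.39
print(round(depth_for(0.06, 0.40), 1))      # 8.3
```

Under this model the two figures agree if a typical message sits about
eight followups deep.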

So you can't use it like you should.

As for supersedes, it works in limited cases, but try *really* using it
like I did and you will find a lot of sites don't run it at all, and that
there's a bug in the *design* of supersedes, such that it breaks with
batching.   (Try superseding twice, close enough that they are both in
the same news batch)

Like I said, these are the last 2 additions, from 5 years ago, and they
still do not work today.
-- 
Brad Templeton, ClariNet Communications Corp. -- Waterloo, Ontario 519/884-7473

frisk@rhi.hi.is (Fridrik Skulason) (07/22/90)

Well - some of us have 8-bit news already - I am for example using an 8-bit
'rn' right now.  The program only required a few minor modifications to work
properly.  The reason we went to 8-bit news and E-mail is quite simple - our
alphabet contains 10 characters not found in standard ASCII.  Of course I can
only post 8-bit articles to our local newsgroups - the rest of the world is
still only 7-bit   :-(

I fully agree that we need an 8-bit news system (as well as 8-bit E-mail),
as this would make life a lot easier for those of us not using English.

Modifying the news software to permit the transmission of 8-bit data is
trivial - the real problem is the character set issue.

I don't know if the readers of this group are familiar with a similar
discussion regarding automatic translation between character sets in the
Kermit program.  The conclusions reached there seem to apply to the 8-bit
News/E-mail discussion as well, though.

Some possible solutions:

(1)  Each machine posts articles using the user's character set of choice.
To indicate which character set is used, a new field is added to the header.

                 examples:     Character-set: CP 870
	                       Character-set: ISO 8859/4

This is easy to implement, but has one serious drawback - all machines are
required to be able to handle all possible character sets.

(2)  On every machine the article is translated into one of the ISO 8859/x
series of character sets.  8859/1 would probably be most used, as it covers
most of the languages of Western Europe.  8859/2, 8859/3, 8859/4 etc. would
solve the needs of those using Greek, various Eastern European languages and
(I think) Hebrew and Arabic.  This would not solve the problem of those using
a 16-bit character set.  Also, I am not sure if Esperanto is included in any
of the ISO 8859/x standards.

(3)  All text is transmitted according to the ISO 10646 standard.  This has
one advantage compared to (2) - it allows the transmission of documents
containing 16-bit characters, as well as documents containing characters from
more than one of the 8859/x standards.  For example, one could send a message
with the first part in Russian and the second part in Greek.

My opinion is that (3) is more of a long-term goal - for 95 % of users of
Usenet, (2) is all that is needed.
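
A sketch of how a reader might act on the proposed header, assuming the
"Character-set:" field from option (1) names one of the 8859/x sets as in
option (2).  The name table and function name are illustrative, and the
headers themselves are assumed to stay plain ASCII.

```python
from email.parser import HeaderParser

# Header value -> Python codec name; extend as further sets are agreed on.
CODECS = {"ISO 8859/1": "iso-8859-1", "ISO 8859/4": "iso-8859-4"}


def decode_body(raw_article: bytes) -> str:
    """Decode an article body according to its Character-set header."""
    head, _, body = raw_article.partition(b"\n\n")
    headers = HeaderParser().parsestr(head.decode("ascii"))
    name = headers.get("Character-set", "ASCII").strip()
    # Unknown or missing names fall back to ASCII, with bad bytes marked.
    codec = CODECS.get(name, "ascii")
    return body.decode(codec, errors="replace")
```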

But what changes would (2) require ?

Change #1:  Any ASCII computer on Usenet must accept 8-bit news and E-mail,
            and be able to forward articles without changes (in other words - 
            don't strip the eighth bit !!!)  This is the only change required
            from the "English-only" ASCII-sites, where no 8-bit articles
            would originate or be read.

Change #2:  Any computer on Usenet using an extended version of ASCII (CP 437,
            ISO 8859/x etc) must translate all postings to one of the 8859/x
            character sets and indicate (in the header) which one is used.
            This change would be required of European and other
            non-English-speaking users.

Change #3:  Any computer not using ASCII, but rather EBCDIC (or something else),
            must translate all postings to one of the 8859/x character sets,
            instead of just translating to ASCII.  

Change #4:  Any computer must accept postings in one of the 8859/x character
            sets and be able to translate them to the character set used
	    by each user.

Problem #1: If the local character set is not able to represent all the
            characters in the original posting, they must be represented as
            well as possible.  For example - a 7-bit computer receiving a text
            containing accented vowels might be expected just to drop the
            accent marks.

Problem #2: Different users - even on the same machine - have different
            capabilities to display 8-bit text.  For example, in Scandinavia
            it is common for terminals to use a 7-bit character set, where
            some of the characters (for example { [ ] } |) have been replaced
            by non-ASCII characters.  Other users in the same countries have
            fully 8-bit terminals (for example PCs running a terminal
            emulator).  The computer must store incoming articles as they
            arrive and the news/E-mail software must be updated to display
            them according to the capabilities of each terminal, as indicated
            by an environment variable.
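
Problems #1 and #2 could be handled at display time roughly as follows.
This is a sketch under assumptions: articles are stored in 8859/1, the
environment variable is called NEWS_CHARSET (a made-up name), and a 7-bit
terminal gets accents dropped and anything else shown as "?".

```python
import unicodedata


def drop_accents(text: str) -> str:
    """Replace accented letters by their base letters where a base letter exists."""
    decomposed = unicodedata.normalize("NFKD", text)
    return "".join(c for c in decomposed if not unicodedata.combining(c))


def render(stored: bytes, env: dict) -> bytes:
    """Convert a stored 8859/1 article to the character set of this user's terminal."""
    text = stored.decode("iso-8859-1")
    target = env.get("NEWS_CHARSET", "ascii")   # hypothetical variable name
    if target == "ascii":
        text = drop_accents(text)
    # Letters with no counterpart in the target set (thorn, eth, ...) become '?'.
    return text.encode(target, errors="replace")
```

A reader would call render(article, os.environ), so two users on the same
machine can see the same stored article differently.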

So - what now ?

Is there any interest in creating a "working group" to attack the problem ?
Any of the authors of rn, nn, elm or other news/e-mail software out there ?

We are of course willing to share our modifications to the programs, and with
a bit of work we should be able to have 8-bit news/email running in a few
months.

So - any volunteers ?


-- 
Fridrik Skulason      University of Iceland  |       
Technical Editor of the Virus Bulletin (UK)  |  Reserved for future expansion
E-Mail: frisk@rhi.hi.is    Fax: 354-1-28801  |   

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/22/90)

>>Why don't all you people divert your energies into making your news system
>>handle 8 bit news rather than developing new and incompatible ways of
>
>I have never thought, and do not think now, that transmitting binaries
>is an appropriate activity for Usenet... but a significant minority

Files that use all 8 bits are not necessarily binaries.  Other languages,
bitmaps, etc.  Even control characters get munged.

There definitely needs to be a more transparent standard for the 
transmission of news articles.  Sites capable of transparently 
transmitting all 8 bit characters shouldn't have to pay the penalty of 
7 bit encoding.  

In the meantime, a standard encoding method (and a header to indicate its
use) would be useful.  Maybe the encoding method could address the "too big
for one article" problem too.

Keep the transport problems away from the users.

Maybe we do need checksums.  At least we could throw away munged articles.
Start doing that and I suspect that people would fix their software.
-- 
Jon Zeeff (NIC handle JZ)	 zeeff@b-tech.ann-arbor.mi.us

henry@zoo.toronto.edu (Henry Spencer) (07/23/90)

In article <1990Jul22.062034.20896@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>C news may indeed *handle* arbitrary byte stream messages, but does it
>*support* them?
>In particular, input programs like inews must deal with them, and a new
>header line must be added to classify article body types...

C news "supports" arbitrary byte-stream messages to exactly the same
extent as it "supports", say, poetry:  it doesn't give a damn what's
inside the message or what's in any headers other than the ones it
needs to know about.  With the exception of the inews problem, you're
purely and simply talking about reader issues, not transport issues,
and no changes to C News are necessary.

The inews business is a nuisance, but the blame rests primarily with
the Unix utilities rather than with inews proper.  There is an inews
rewrite already in the works, which might improve things somewhat.

It might also be worthwhile to define a new input interface without
all the goo and dribble that have crept into inews over the years.
Being backward compatible was a real pain there.  A clean interface
could eliminate a lot of messy handling of stuff that inews should not
have to care about.  (For example, the specs say that inews must try
to guess whether the beginning of the input text looks like headers,
in which case it *is* headers.  This could be eliminated by demanding
that the -h flag be used in such cases.)
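
To see why the guessing is messy, consider this sketch.  It is not the
actual inews heuristic; the regular expression and both function names are
invented to show the ambiguity and the simpler flag-driven alternative.

```python
import re

# Something that looks like "Name: value" at the start of a line.
HEADER_LINE = re.compile(r"^[A-Za-z][A-Za-z0-9-]*: ")


def looks_like_headers(text: str) -> bool:
    """Guess whether the input starts with headers, judging by its first line."""
    first_line = text.split("\n", 1)[0]
    return bool(HEADER_LINE.match(first_line))


def split_input(text: str, h_flag: bool) -> tuple[str, str]:
    """No guessing: headers are present if and only if the caller said so."""
    if not h_flag:
        return "", text
    head, _, body = text.partition("\n\n")
    return head, body
```

A body that opens with "Note: the server is down" fools the guesser, while
the flag-driven version never has to look at the text at all.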

As for content classification, I believe there has already been some
work done on this for RFC822-X.400 interfacing, although I don't know the
details.
-- 
NFS:  all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology
and its performance and security too.  |  henry@zoo.toronto.edu   utzoo!henry

henry@zoo.toronto.edu (Henry Spencer) (07/23/90)

In article <==H&NB&@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>Maybe we do need checksums.  At least we could throw away munged articles.
>Start doing that and I suspect that people would fix their software.

Geoff and I thought hard about this during C News development.  The trouble
with checksums is that most people would prefer a slightly mangled copy of
an article to no copy of the article.  There are all too many transmission
channels that do in fact slightly mangle articles (expanding tabs, fiddling
with the definition of newline, etc.).  Some early test versions of C News
did generate a checksum header.  We scrapped it because we could not think
of anything to do with it that people would want.
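
A small illustration of the dilemma, assuming a CRC-32 over the body (the
post does not say what checksum the test versions used).  Expanding one tab
leaves the article perfectly readable, yet the checksum no longer matches.

```python
import zlib


def checksum(body: str) -> str:
    """CRC-32 of the article body, as eight hex digits."""
    return f"{zlib.crc32(body.encode('latin-1')):08x}"


original = "if (x)\n\treturn;\n"
mangled = original.expandtabs(8)      # what a tab-expanding channel delivers

print(checksum(original) == checksum(mangled))   # False
```

A receiver enforcing the checksum would discard the mangled copy, which is
exactly the outcome most readers would not want.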
-- 
NFS:  all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology
and its performance and security too.  |  henry@zoo.toronto.edu   utzoo!henry

guy@auspex.auspex.com (Guy Harris) (07/23/90)

>I meant that the chains are broken far too often.  About 6% of followups
>have no references line, and that results in broken chains on about
>40% of usenet messages (every followup to such a followup is broken as
>well.)
>
>So you can't use it like you should.

It works well enough for me.  It's not perfect, but what is?

1) Merely grouping all the articles in a thread together is a big win
   (no, this is not a hypothetical assertion, it is an observation based
   on using "trn" for quite a while); even with no references line on
   the followup, "trn" (or, more correctly, its thread builder program)
   does that by subject matching.

2) *Enough* of the articles have proper reference lines that I can go up
   and down the article tree often enough to make it worthwhile.
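
The two points above can be sketched as a thread builder that uses
References when present and falls back to subject matching otherwise.  This
is not trn's actual algorithm; the article layout (dicts with "id",
"subject", "references") and the assumption that parents arrive before
their followups are simplifications for the example.

```python
def normalize_subject(subject: str) -> str:
    """Strip leading "Re:" prefixes and case so followups match their parent."""
    s = subject.strip()
    while s[:3].lower() == "re:":
        s = s[3:].strip()
    return s.lower()


def build_threads(articles: list[dict]) -> dict[str, list[str]]:
    """Group message IDs into threads, keyed by the ID of each thread's first article."""
    thread_of: dict[str, str] = {}     # message ID -> thread key
    by_subject: dict[str, str] = {}    # normalized subject -> thread key
    threads: dict[str, list[str]] = {}
    for art in articles:
        subject = normalize_subject(art["subject"])
        known = [r for r in art["references"] if r in thread_of]
        if known:
            key = thread_of[known[0]]          # proper reference chain
        elif subject in by_subject:
            key = by_subject[subject]          # fallback: subject matching
        else:
            key = art["id"]                    # start of a new thread
        thread_of[art["id"]] = key
        by_subject.setdefault(subject, key)
        threads.setdefault(key, []).append(art["id"])
    return threads
```

A followup with no References line still lands in the right thread, and its
own followups rejoin the chain through it.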

david@twg.com (David S. Herron) (07/23/90)

In article <15688@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>In article <777@hades.ausonics.oz.au> greyham@hades.ausonics.oz.au (Greyham Stoney) writes:
>I have never thought, and do not think now, that transmitting binaries
>is an appropriate activity for Usenet... but a significant minority
>disagrees, and since they can control who does and doesn't carry the
>binary bandwidth, it's fine with me.  Either way, 8 bit articles don't
>fix anything fundamentally broken, so I'd concentrate energies elsewhere.

Transmitting 8-bit files can be useful beyond the fairly narrowly
defined thought of software packages.

Think about all the multi-media gadgetry floating around in X.400.
Voice, animation, still pictures in various formats, etc.  This is
something you'd need an Amiga to do justice to :-)

But, no, it doesn't require 8-bit article formats to support all that.
It can be encoded in 7-bit files, without too much problem and so forth.

BTW..  I hafta make this warning..

BITNET won't be able to handle any 8-bit file format very easily.
More than just bitnet, but also things like VMS will have problems.

One of the better things about Usenet is that, since it's text
files, it's immediately portable across all sorts of OS's.  Non-text
files tend to be non-portable.


-- 
<- David Herron, an MMDF weenie, <david@twg.com>
<- Formerly: David Herron -- NonResident E-Mail Hack <david@ms.uky.edu>
<-
<- Sign me up for one "I survived Jaka's Story" T-shirt!

eps@toaster.SFSU.EDU (Eric P. Scott) (07/23/90)

In article <3721@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>Did you perhaps mean "ISO 8859/1" rather than "ISO 8859"?

Yes, I did.  Mea culpa.

					-=EPS=-

diamond@tkou02.enet.dec.com (diamond@tkovoa) (07/23/90)

In article <1990Jul21.174535.8281@lokkur.dexter.mi.us> scs@lokkur.dexter.mi.us (Steve Simmons) writes:

>The sarcasm lamp is now lit.  :-)

Not mine though.  (I have a large sarcasm lamp but it's not lit this time.)

>Of course, there aren't many of us reading the current news
>who want to read those kanji, katakana, and ghu-only-knows what other
>variants.

There sure are.  I can't read them very well, but there are many who do.

>Still, we should all be forced to make software that is
>capable of handling it so that the English readers can look at the
>Hindi, Korean and Russian postings that flow by.

You're right; you aren't forced.  If you don't, then the rest of the
world will leave you behind.  But you aren't forced.

>The sarcasm lamp is now off.  :-)

>Hey, news is ASCII-based, written in english-speaking countries for
>english-speaking readers.

This was true 10 years ago.

>That fact that it works *at all* for
>international and non-English stuff is a wonderful plus.  If regional
>newgroups have regional needs, they should go ahead and fill them.
>But neither side should expect interoperatbility.

They'll go ahead and fill it, believe me.  And they will market systems
in the U.S. that have interoperability too.  Businesses that agree with
your opinion will go bankrupt.  (And there are a lot of them.  The U.S.
is moving towards losing the software market, just as it did for cars and
home electronics.)

-- 
Norman Diamond, Nihon DEC     diamond@tkou02.enet.dec.com
This is me speaking.  If you want to hear the company speak, you need DECtalk.

brian@ucsd.Edu (Brian Kantor) (07/23/90)

The NNTP extensions (that I'll get into an RFC soon, I promise!) support
a CHARSET extension and 8-bit transmission, so it can allow the transfer
of (i hope) any character set.  Perhaps news will follow suit.

A standard for transmitting multi-byte characters would presumably
specify what order the bytes are to be sent; that is NOT properly part
of the news system nor of NNTP.
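
To see why byte order has to be pinned down by such a standard, here is a
minimal sketch of one 16-bit character code sent in the two possible
orders.  The value 0x3042 and the helper name pack_16bit are arbitrary
choices for illustration, not anything taken from NNTP or the news
software.

```python
import struct

def pack_16bit(code_point: int, byte_order: str) -> bytes:
    """Pack one 16-bit character code as two bytes.

    byte_order is "big" (high byte first) or "little" (low byte first).
    """
    fmt = ">H" if byte_order == "big" else "<H"
    return struct.pack(fmt, code_point)

code = 0x3042  # arbitrary 16-bit value, chosen only for illustration
big = pack_16bit(code, "big")        # high byte travels first
little = pack_16bit(code, "little")  # low byte travels first

# A receiver that assumes the wrong order reads a different character code.
misread = struct.unpack(">H", little)[0]
```

The transport moves the same two bytes either way; only an agreement
outside the transport tells the receiver which reading is the intended one.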
	- Brian

tneff@bfmny0.BFM.COM (Tom Neff) (07/23/90)

In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>Why should binaries be split up into smaller bits, particularly bits as small
>As 50K?
>
>If I'm going to lose parts of a multi-part binary, I may as well lose the
>whole thing.

I agree completely.  Binaries shouldn't be split.  If a binary is bigger
than 50K, it has NO BUSINESS BEING BROADCAST AS NEWS!  Post a pointer
for FTP and anonymous UUCP, and let those who want it pay for it.
THAT'S how Usenet ought to work.  (This is not a flame at Brad of course
-- it's a flame at BBS refugees who discover Usenet and expect it to
work just like a bigger BBS.)

>I wrote the ABE/DABE system to make that easier to deal with, but even so,
>having to deal with missing parts and assembly, etc., is a pain.

In all fairness, when a properly packaged source archive (i.e., what
Usenet SHOULD be broadcasting) arrives in a dozen pieces, installation
is VERY easy on most platforms.  When a part is lost or corrupted, the
penalty for rebroadcast is a lot smaller than it would be if the whole
kit had to be resent.

-- 
"Take off your engineering hat   = "The filter has      | Tom Neff
and put on your management hat." = discreting sources." | tneff@bfmny0.BFM.COM

Dan@dna.lth.se (Dan Oscarsson) (07/23/90)

In article <1857@krafla.rhi.hi.is> frisk@rhi.hi.is (Fridrik Skulason) writes:
>
>Some possible solutions:
>
>(2)  On every machine the article is translated into one of the ISO 8859/x
>series of character sets.  8859/1 would probably be most used, as it covers
>most of the languages of Western Europe.  8859/2, 8859/3, 8859/4 etc. would
>solve the needs of those using Greek, various Eastern European languages and
>(I think) Hebrew and Arabic.  This would not solve the problem of those using
>a 16-bit character set.  Also, I am not sure if Esperanto is included in any
>of the ISO 8859/x standards.
>
>(3)  All text is transmitted according to the ISO 10646 standard.  This has
>one advantage compared to (2) - it allows the transmission of documents
>containing 16-bit characters, as well as documents containing characters from
>more than one of the 8859/x standards.  For example, one could send a message
>with the first part in Russian and the second part in Greek.
>
>My opinion is that (3) is more of a long-term goal - for 95 % of users of
>Usenet, (2) is all that is needed.
>
I think (3) is better.
(2) is more or less a subset of (3) and it would not be much more work
to implement (3) than (2).
Using (3) we have one character set only and an article can contain any
character. ISO 10646 can be sent in a way such that ASCII and ISO 8859-1
articles can be sent without any change.
Also, if ISO 10646 is chosen it will fit well with the international
sendmail patches that are under development.
If we choose (2) we will have to change to (3) in a few years.

--
Changing netnews is somewhat different from changing mail. In netnews the
articles are stored in a central database used both for reading and for
sending the articles onward to the next site. This means that we cannot
convert incoming articles into the local character set used at a site;
instead each newsreader must do the conversion from ISO 10646 into the
local character set.
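
A rough sketch of that read-time conversion follows.  It assumes the
stored article is already available as a Python string (standing in for
ISO 10646 text) and that the reader knows its local character set by a
Python codec name.  The function name to_local_charset is hypothetical,
and falling back to "?" is only one possible policy.

```python
def to_local_charset(article_text: str, local_charset: str) -> bytes:
    """Render an article for a terminal that only understands local_charset.

    Characters the local set cannot represent are replaced by '?'.
    """
    return article_text.encode(local_charset, errors="replace")

# One stored article containing both a Latin-1 letter and Greek letters.
stored = "Lund, Malm\u00f6 and \u0391\u03b8\u03ae\u03bd\u03b1"

as_latin1 = to_local_charset(stored, "iso8859_1")  # keeps the o-umlaut, loses the Greek
as_ascii = to_local_charset(stored, "ascii")       # loses both
```

The stored copy is never modified, so the same article can still be
forwarded unchanged to the next site while each reader sees its own
rendering.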

To handle "old" sites that cannot handle 8-bit articles the articles will
have to be converted into ASCII. So backbone sites must upgrade their
software to let 8-bit data through for this to work.

--
When the patches for international sendmail that I and someone in Denmark
are developing are ready, they will include conversion routines that could
probably be used in a netnews reader.

   Dan


-- 
Dan Oscarsson                              Department of Computer Science
                                           Lund Institute of Technology
e-mail:  Dan@dna.lth.se                    Box 118
                                           S-221 00 Lund, Sweden

peter@ficc.ferranti.com (Peter da Silva) (07/23/90)

In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
> I would support a move to design a binary transmission format, have the
> new releases of B and C news support them, and have all the moderators switch,
> thus forcing anybody who wants binaries to get off their duffs and upgrade.

This is a great idea. It will break binaries at enough sites that maybe the
bloody things will finally dry up and blow away and maybe even some of the
stuff currently posted in binary form will start showing up as source.

Binaries and News go together about as well as rock videos and Masterpiece
Theatre.
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
<peter@ficc.ferranti.com>

scs@iti.org (Steve Simmons) (07/23/90)

diamond@tkou02.enet.dec.com (diamond@tkovoa) writes:

>In article <1990Jul21.174535.8281@lokkur.dexter.mi.us> scs@lokkur.dexter.mi.us (Steve Simmons) writes:

>>The sarcasm lamp is now lit.  :-)

>Not mine though.  (I have a large sarcasm lamp but it's not lit this time.)

*chuckle*  I was kind of looking forward to it...  :-)

>>Of course, there aren't many of us reading the current news
>>who want to read those kanji, katakana, and ghu-only-knows what other
>>variants.

>There sure are.  I can't read them very well, but there are many who do.

'Many' here is a relative term.  And my sarcastic point (which you quote
in the next note) is that use of national language sets for the purpose
of using foreign languages is largely irrelevant to those who do not
speak that language.  I freely admit english postings are of little
interest to non-english speakers.  :-)

>You're right; you aren't forced.  If you don't, then the rest of the
>world will leave you behind.  But you aren't forced.

I agree with you -- things will change here (USA, Canada, UK, Australia,
NZ) only when there is sufficient software available and things worth
reading which require 8bit (or more).

>>Hey, news is ASCII-based, written in english-speaking countries for
>>english-speaking readers.

>This was true 10 years ago.

And it's 90 or 95% true today.  100% of all news transport interfaces
were done originally by English speakers using ASCII *or* deliberately
designed to be compatible with same.  80 or 90% of newsreaders are
the same, the only exception I know of is nn (which, by the by, is
the best damn fine newsreader around) (and for all I know was written
by a native English speaker).

jbuck@galileo.berkeley.edu (Joe Buck) (07/24/90)

In article <1857@krafla.rhi.hi.is>, frisk@rhi.hi.is (Fridrik Skulason) writes:
|> I fully agree that we need an 8-bit news system (as well as 8-bit E-mail),
|> as this would make life a lot easier for those of us not using English.

I agree.  But any solution must take into account

Problem #0:

Many sites will continue to run the software they are using now, and no amount
of cajoling will cause them to install new, 8-bit compatible software.  In
some cases, this is because the organization gives news and mail a low priority
(doesn't bring in money, etc).  Many sites are still running obsolete software
and will continue to do so.  To install new software people need an incentive.
There is a very small incentive, unfortunately, for sites in English-speaking
countries to install software to support 8-bit character sets.

This means that any new software must co-exist with the current environment.
One way to do this is to have gateway sites do conversion.  There are
relatively few connections between the US and Europe -- most traffic
across the Atlantic goes through uunet.  Character translation could be
done on the uunet-mcsun link -- stripping accents on articles arriving
from Europe, remapping characters so when an American types braces in
articles in comp.lang.c, readers in Europe see braces, instead of
language-specific characters.
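
As an illustration of the accent-stripping half of that idea, here is a
minimal sketch using Python's unicodedata module.  This is a modern
stand-in, not anything a 1990 gateway ran, and strip_accents is a
hypothetical name.  Letters with no unaccented base form come out as "?".

```python
import unicodedata

def strip_accents(text: str) -> str:
    """Reduce text to plain ASCII by dropping accent marks."""
    # Split each accented letter into a base letter plus combining marks.
    decomposed = unicodedata.normalize("NFKD", text)
    # Drop the combining marks, keeping only the base letters.
    without_marks = "".join(
        ch for ch in decomposed if not unicodedata.combining(ch)
    )
    # Anything still outside ASCII has no base letter; show it as '?'.
    return without_marks.encode("ascii", errors="replace").decode("ascii")

plain = strip_accents("Sk\u00falason writes about caf\u00e9s")
```

The conversion is lossy and one-way, which is why it only suits traffic
heading toward sites that cannot display the original characters.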

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck	

davison%drivax@uunet.uu.net (Wayne Davison) (07/24/90)

guy@auspex.auspex.com (Guy Harris) wrote:
> I use "References:" all the time; you're just not using the right
> newsreader.  (Hey Wayne!  Isn't it soup yet? :-))

The soup's been simmering for a long time now, but I think it's finally
time to serve.  And the NNTP sites out there might even appreciate the last
ingredient that was added (and thus, took extra time to slow-cook) -- NNTP
support has been added to the thread creator, and the whole stew has been
tested as trrn.  I'm planning to ship the package off to the comp.sources.unix
group sometime this week, so keep your eyes peeled (pun intended :-).
-- 
Wayne Davison            \  /| / /|\/ /| /(_)     davison%drivax@uunet.uu.net
davison@drivax.UUCP     (_)/ |/ /\|/ / |/  \         ...!uunet!drivax!davison
                           (W   A  Y   N   e)

ed@braaten.doit.sub.org (Ed Braaten) (07/24/90)

scs@lokkur.dexter.mi.us (Steve Simmons) writes:

>Hey, news is ASCII-based, written in english-speaking countries for
>english-speaking readers.  That fact that it works *at all* for
>international and non-English stuff is a wonderful plus.  If regional
>newgroups have regional needs, they should go ahead and fill them.

That's funny, Steve - 50% of the news I read is in German.  I'm living
in Germany right now.  Although many of the 60+ Million Germans here can
speak English (often better than we americans ;-), the language in this 
country is German.  And the german language news here is not limited to
"regional" consumption.  I'm aware of several sites there in the good
ole USA that are carrying the german groups also.  I'm willing to bet 
there is a LOT of non-English stuff floating around out there.  So why
don't we drop the provincial attitudes - let's hear it for 8-bit news!
It won't make English any harder to read, but it would certainly make 
life easier for the rest of the USENET.

>But neither side should expect interoperatbility.

Say what?  Interoperability and the free exchange of information is 
in my opinion exactly what makes USENET so successful...


---------------------------------------------------------------------------
        Ed Braaten             |  Jesus answered,  "I am the way and the
Work: ed@imuse.de.intel.com    |  truth and the life.  No one comes to the
Home: ed@braaten.doit.sub.org  |  Father except through me."   John 14:6 
---------------------------------------------------------------------------

storm@texas.dk (Kim F. Storm) (07/24/90)

(I've cross-posted this to news.software.nn since it contains some information
about future directions of nn.  Followups are directed to .b only.  ++Kim).

In news.software.b, frisk@rhi.hi.is (Fridrik Skulason) writes:

>Well - some of us have 8-bits news already - I am for example using an 8-bit
>'rn' right now.

nn release 6.4 supports presentation of 8-bit data (and with pl 6 it
also accepts 8-bit command input).

>I fully agree that we need an 8-bit news system (as well as 8-bit E-mail),
>as this would make life a lot easier for those of us not using English.

Certainly depends on who "we" are.  I think that if we are going to attack
this problem, we better do it properly from the start, and not *just* solve
the 8-bit news problem (which you cannot solve anyway).

>Modifying the news software to permit the transmission of 8-bit data is
>trivial - the real problem is the charcter set issue.

Changing any software is just *so easy*.  But it is *impossible* to get
people to install the changes unless they have a personal interest in
doing so.

I speak from experience:  More than one year has gone by since the initial
worldwide release of nn (rel. 6.3.0).  Since then, about 20 patches,
including a new release, have been posted, but there are still some sites
out there running 6.3.0 (or .1 or .2), which you can recognize from the
RFC-violating Re^2: prefixes in the Subject: lines on some postings.

I still get complaints about how stupid nn is, although this problem is
fixed (oh yes, it was *trivial*).  But getting people to update....

So if you want this to work on a world-wide scale within a timeframe of
less than 3-6 years, I believe it must be done in the news reader software
at the end-points which need this, with a transport protocol designed for
8-bit, 16-bit and even 32-bit data which can:

a) be transparently sent through the current 7-bit restrictive channels,
   and

b) still be interpreted sensibly on systems and by news readers which
   are not adapted to this new scheme.


Keld Simonsen from the Danish UNIX-system Users Group (DKUUG) has
defined a new naming scheme for international characters based on 10646 
using primarily two-character names which attempts to be as close to
the real character as possible.  For example, e' is an e with ' above it,
o: is o with two dots above, etc.

Now this sounds rather trivial, but it has specifically been designed
for the above purposes:  (a) it uses only a subset of the ASCII character
set (e.g. { and } are not used since they are used for national characters
in many older 7-bit character sets based on ISO 646), and (b) as the example
shows, the character name is a close approximation to the actual character.
So you can actually read a letter written using the character names!
(Of course, "a" is named "a", "b" is "b", "A" is "A", etc.).

A letter can then be written in any of the 8859/x variants, in various
EBCDIC and other IBM codepages, etc. using N-bit codes supported on the
local system.  However, when such a letter is transmitted to a remote
system, all the international characters are *encoded* by replacing all
the international characters by an "escape character" followed by the
(two character) character name.  The result is a pure 7-bit letter.

At the receiving end, the encoded letter can be converted back
to the originating character set, or to the character set used on the local
system, as far as that is possible.  But that is the choice of
the recipient end!  Or, it can just be read without conversion since
the encoding is "readable" (the only problem being the escape character).
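
The encode and decode steps just described can be sketched roughly as
follows.  The table is illustrative only: e' and o: are the names given
above, a: is an assumption by analogy, and the function names are
hypothetical.  This is not Keld's actual table or code.

```python
ESCAPE = "\x1d"  # ^] (Ctrl-]), the default escape character mentioned in this post

# Illustrative two-character names.  e' and o: come from the post;
# a: is an assumption made by analogy.
NAMES = {
    "\u00e9": "e'",  # e with acute accent
    "\u00f6": "o:",  # o with two dots above
    "\u00e4": "a:",  # a with two dots above (assumed name)
}
CHARS = {name: ch for ch, name in NAMES.items()}

def encode(text: str, escape: str = ESCAPE) -> str:
    """Replace each named character by escape + two-character name."""
    if escape in text:
        raise ValueError("text already contains the escape character")
    out = []
    for ch in text:
        if ch in NAMES:
            out.append(escape + NAMES[ch])
        elif ord(ch) < 128:
            out.append(ch)  # plain ASCII passes through unchanged
        else:
            raise ValueError(f"no name defined for {ch!r}")
    return "".join(out)

def decode(text: str, escape: str = ESCAPE) -> str:
    """Turn escape + name back into the character, if the name is known."""
    out = []
    i = 0
    while i < len(text):
        if text[i] == escape:
            name = text[i + 1:i + 3]
            out.append(CHARS.get(name, name))  # unknown names stay readable
            i += 3
        else:
            out.append(text[i])
            i += 1
    return "".join(out)

encoded = encode("caf\u00e9 B\u00f6rje")
readable = encoded.replace(ESCAPE, "")  # what a reader sees if ^] is invisible
```

Here the encoded form is pure 7-bit, decode() restores the original, and
a reader with no decoder at all still sees something close to the
intended words.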

In the sendmail used on the Danish DKnet backbone, Keld has implemented
this and it is running very well, supporting about 50 different
character sets.  By default it uses ^] as the escape character which has the
benefit of being invisible on most terminals, but it can use any escape
character you like.  Both the escape character and the originating
character set are specified in the article's header.  (more on this below)


>Some possible solutions:

>(1)  Each machine posts articles using the user's character set of choice.
>To indicate which character set is used, a new field is added to the header.

>                 examples:     Character-set: CP 870
>	                       Character-set: ISO 8859/4

This is what Keld's sendmail extensions support today.

>This is easy to implement, but has one serious drawback - all machines are
>required to be able to handle all possible character sets.

Not with Keld's solution:
- If you know the character set, you can convert to it.

- If you use another character set, you can convert to that instead since
all international characters have been given *unique* names.

- If your software doesn't understand any of it, you can still read the
message with little or no problems.

And in the sendmail case, the Danish backbone is actually doing the
encoding *and* decoding for the Danish sites for which it has been told
which character set they prefer.  So if one site runs 8859/1, they send
8-bit 8859/1 data directly to the backbone, and if the recipient is a known
EBCDIC site, the backbone converts the letter to EBCDIC before delivery!
If it is to an unknown site, it will be converted to the "encoded" character
set, and it is thus the task of the recipient to handle it.

So in Denmark we not only run 8-bit mail, but *multi character set* mail.
And it is transparent for all practical uses.

>(2)  On every machine the article is translated into one of the ISO 8859/x
>character sets....

Too limited, and which one should you choose?

>(3)  All text is transmitted according to the ISO 10646 standard.  This has
>one advantage compared to (2) - it allows the transmission of documents
>containing 16-bit characters, as well as documents containing characters from
>more than one of the 8859/x standards.  For example, one could send a message
>with the first part in Russian and the second part in Greek.

Currently, I think Keld has defined about 1000 characters *including* Greek,
Russian (Cyrillic), Hebrew, Arabic, all 8859 sets, EBCDIC, PC character sets
and more.  And there are "hooks" reserved to include longer names for
kanji characters and the like.  So you can say that Keld has defined a
10646 character set representation using only a limited 7-bit character set.


>My opinion is that (3) is more of a long-term goal - for 95 % of users of
>Usenet, (2) is all that is needed.

And if you want to keep it that way, sure limit yourself to (2).

>But what changes would (2) require ?

>Change #1:  Any ASCII computer on Usenet must accept 8-bit news and E-mail,
>            and be able to forward articles without changes (in other words - 
>            don't strip the eight bit !!!)  This is the only change required
>            from the "English-only" ASCII-sites, where no 8-bit articles
>            would originate or be read.

The "only" change, yes, but a change which you simply cannot expect to be done.
No hope at all!

>Change #2:  Any computer on Usenet using an extended version of ASCII (CP 437,
>            ISO 8859/x etc) must translate all postings to one of the 8859/x
>            charcter sets and indicate (in the header) which one is used.

This wouldn't do it if the recipient end cannot handle that character set.
Or said in another way: which one of the 8859/x character sets should you
use?  8859/x is probably *the* answer for use within a certain country on
*most* UNIX boxes, but what about all the PC character sets, EBCDIC hosts
etc.  Don't you think a little more than 5% of the users are in that
category?

>Change #3:  Any computer not using ASCII, but rather EBDIC (or something else),
>            must translate all postings to one of the 8859/x character sets,
>            instead of just translating to ASCII.  

If they have to translate, they can just as well translate into something
which has a good chance of getting through the network - and 8859 doesn't
have a chance there.

>Change #4:  Any computer must accept postings in one of the 8859/x character
>            sets and be able to translate them to the character set used
>	    by each user.

But what if I support 8859/1 and get an article written in 8859/7 (greek?)
If we use your scheme, *all* the 8859/x sets must be accepted!

>Problem #1: If the local character set is not able to represent all the
>            charactes in the original posting, they must be represented as
>            well as possible.  For example - a 7-bit computer receiving a text
>            containing accented wovels might be expected just to drop the
>            accent marks.

Which may in some cases completely change the meaning!

>Problem #2: Different users - even on the same machine - have different
>            capabilities to display 8-bit text.  For example, in Scandinavia
>            it is common for terminals to use a 7-bit character set, where
>            some of the characters (for example { [ ] } |) have been replaced
>            by non-ASCII characters.  Other users in the same countries have
>            fully 8-bit terminals (for example PCs running an terminal
>            emulator).  The computer must store incoming articles as they
>            arrive and the news/E-mail software must be updated to display
>            them according to the capabilities of each terminal, as indicated
>            by an environment variable.

Exactly, and that is definitely easiest if everybody agrees on *one*
common "carrier character set" (my suggested term for such a character set).
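
As a sketch of display-time conversion driven by an environment variable:
the variable name NEWS_TERMINAL_CHARSET and its values are made up for
this example.  The 7-bit table follows the Swedish national variant of
ISO 646, where the codes for { | } [ \ ] show up as national letters on
the terminal.

```python
import os

# Swedish 7-bit convention: the terminal draws these ASCII codes as
# national letters, so we send the code that will *display* correctly.
SWEDISH_7BIT = str.maketrans({
    "\u00c4": "[", "\u00d6": "\\", "\u00c5": "]",
    "\u00e4": "{", "\u00f6": "|", "\u00e5": "}",
})

def render_for_terminal(text: str, environ=os.environ) -> bytes:
    """Convert a stored article to bytes suited to the user's terminal."""
    # NEWS_TERMINAL_CHARSET is a hypothetical variable name.
    capability = environ.get("NEWS_TERMINAL_CHARSET", "ascii")
    if capability == "iso646-se":
        return text.translate(SWEDISH_7BIT).encode("ascii", errors="replace")
    if capability == "iso8859-1":
        return text.encode("iso8859_1", errors="replace")
    return text.encode("ascii", errors="replace")

word = "sm\u00f6rg\u00e5s"
on_7bit = render_for_terminal(word, {"NEWS_TERMINAL_CHARSET": "iso646-se"})
on_8bit = render_for_terminal(word, {"NEWS_TERMINAL_CHARSET": "iso8859-1"})
on_plain = render_for_terminal(word, {})
```

Because the article is stored in one common form, only this last step
differs from user to user on the same machine.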

>So - what now ?

>Is there any interest in creating a "working group" to attack the problem ?
>Any of the authors of rn, nn, elm or other news/e-mail software out there ?

Yes, support for Keld's multi character set handling is planned for an
upgrade to nn 6.4 later this year.  We have been looking at what can be
used as the escape character in news, and this is definitely a problem,
since inews traditionally is very restrictive with respect to what it
will pass through (^] is filtered out as most other control characters).

But we believe we have found the right solution, which will pass
through at least Bnews' inews, and is supposed to be *transparent* to
most news interfaces:  We use a double escape character consisting of
a "space" followed by a "backspace".  When output to a screen this will be
invisible and most pagers will handle backspace properly (i.e. move the
cursor back over the space).  And we think it is very unlikely that this
sequence will occur in normal postings (we see no purpose for it).

And since only articles which have the proper header specifying that this
is really an encoded article will be "decoded", the filters which encode
the articles at the originating end can check that no such sequences exist
in the original text.
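
To make the space-backspace idea concrete, here is a small sketch.  The
names table, the function names and the very simple terminal model are
all assumptions made for illustration; this is not the planned nn code.

```python
ESCAPE = " \b"  # a space followed by a backspace, as proposed above

def encode(text: str, names: dict) -> str:
    """Prefix each named character's two-character name with the escape."""
    # The originating filter refuses text that already holds the sequence.
    if ESCAPE in text:
        raise ValueError("original text already contains space-backspace")
    return "".join(ESCAPE + names[ch] if ch in names else ch for ch in text)

def as_seen_on_screen(line: str) -> str:
    """Model a terminal where backspace moves the cursor left one cell."""
    cells = []
    cursor = 0
    for ch in line:
        if ch == "\b":
            cursor = max(0, cursor - 1)
        else:
            if cursor < len(cells):
                cells[cursor] = ch  # overwrite the cell under the cursor
            else:
                cells.append(ch)
            cursor += 1
    return "".join(cells)

names = {"\u00e9": "e'"}  # one illustrative entry
wire = encode("caf\u00e9", names)
shown = as_seen_on_screen(wire)
```

On such a terminal the space is overwritten by the first character of
the name, so the escape leaves no visible trace for readers whose
software knows nothing about the scheme.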

>We are of course willing to share our modifications to the programs, and with
>a bit of work we should be able to have 8-bit news/email running in a few
>months.

nn users world-wide can soon exchange multi character news - other users can
read it (without problems), and we will publish our code and specifications
so that other interfaces can support it as well.

>So - any volunteers ?

Yes, but is there any interest in what we plan to do???

And will our "space-backspace" escape pass through Cnews, NNTP and
other inews/relaynews/whatever implementations (without modification)?

-- 
Kim F. Storm  <storm@texas.dk>		No news is good news,
Texas Instruments A/S, Denmark		  but nn is better!

henry@zoo.toronto.edu (Henry Spencer) (07/24/90)

In article <7647@gollum.twg.com> david@twg.com (David S. Herron) writes:
>BITNET won't be able to handle any 8-bit file format very easily.

It doesn't handle 7-bit file formats very reliably, actually; Bitnet has
all kinds of ugly properties as a transmission subsystem.  However, with
a suitable encoding it can still be used to get clean 8-bit data from
point A to point B.  The bencode/bdecode stuff shipped with C News,
originally written at Waterloo, is Bitnet-proof by design.  (Uuencode
is not, by the way.)
-- 
NFS:  all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology
and its performance and security too.  |  henry@zoo.toronto.edu   utzoo!henry

eps@toaster.SFSU.EDU (Eric P. Scott) (07/24/90)

In article <7647@gollum.twg.com> david@twg.com (David S. Herron) writes:
>BITNET won't be able to handle any 8-bit file format very easily.
>More than just bitnet, but also things like VMS will have problems.

Say what?  VAX/VMS got 8-bit-ized way back when the VT2xx
terminals came out--I think that was around V4.0.

A better question is whether ANU NEWS does The Right Thing.
Geoff?

					-=EPS=-

jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) (07/24/90)

In article <Z4V4=DF@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
>> I would support a move to design a binary transmission format, have the
>> new releases of B and C news support them, and have all the moderators switch,
>> thus forcing anybody who wants binaries to get off their duffs and upgrade.
>This is a great idea. It will break binaries at enough sites that maybe the
>bloody things will finally dry up and blow away and maybe even some of the
>stuff currently posted in binary form will start showing up as source.

This is a rotten idea. That source is the prevalent form of software
distribution in the Unix environment is more an artifact of the
diversity of Unix systems than anything else. The IBM-PC world (the only
one I'm intimately familiar with; I don't own an Amiga/Atari/...) won't
switch away from the various compressed archivers it's using to a pure
source distribution, for several reasons:
1) You can't stuff an executable in a shar, and comparatively few people
own each individual language environment, so they can't recompile the
programs.
2) .ARC/.ZIP files are easy to transport, and explode into the component
parts with just a single tool, instead of requiring a shell and several
utilities.
3) The cultural history doesn't include people improving on the source
and sharing the improvements; if anything, it's more along the lines of
stealing the code and giving no credit.

Breaking binaries on Usenet will get rid of them, all right, but at the
cost of cutting Usenet users off completely from nearly all
redistributed programs. You won't get the authors to do things your way.

-- 
Jay Maynard, EMT-P, K5ZC, PP-ASEL   | Never ascribe to malice that which can
jay@splut.conmicro.com       (eieio)| adequately be explained by stupidity.
"It's a hardware bug!" "It's a      +----------------------------------------
software bug!" "It's two...two...two bugs in one!" - _Engineer's Rap_

bob@MorningStar.Com (Bob Sutterfield) (07/24/90)

In article <15692@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
   In article <1990Jul21.054016.10409@looking.on.ca> brad@looking.on.ca (Brad Templeton) writes:
      If I'm going to lose parts of a multi-part binary, I may as well
      lose the whole thing.

   I agree completely.  Binaries shouldn't be split.  If a binary is
   bigger than 50K, it has NO BUSINESS BEING BROADCAST AS NEWS!  Post
   a pointer for FTP and anonymous UUCP, and let those who want it pay
   for it.  THAT'S how Usenet ought to work.

If the binary thing is a computer program, it has no business being
transmitted as news.  Binaries are inherently nonportable, and of
little value to a majority of sites passing them through.

If the binary thing is a multimedia document, then (at the user
interface level) it should certainly appear in one chunk.

   ...when a properly packaged source archive (i.e., what Usenet
   SHOULD be broadcasting)...

Yes, if the thing you're talking about is a program for a computer, it
certainly should be distributed as source.  Does anyone actually trust
an encoded binary that they found in some newsgroup?  How quaint, how
naive!

However, there are (as has been abundantly pointed out) plenty of
examples of things that aren't program binaries but that still should
be transmissible via news-like mechanisms and that break the current
news implementations.

   ...arrives in a dozen pieces, installation is VERY easy on most
   platforms.  When a part is lost or corrupted, the penalty for
   rebroadcast is a lot smaller than it would be if the whole kit had
   to be resent.

Sequencing and reassembly are problems for the session layer, not the
user interface layer.  Users should never see that a document was
split into transport layer-sized chunks, which is what the 50K
article-size limit really is.
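
A minimal sketch of what that hidden layer might do: cut a document into
50K parts, tag each with its number and the total, and put them back
together regardless of arrival order.  The tuple layout and function
names are assumptions for illustration, not an existing news format.

```python
from typing import List, Tuple

PART_SIZE = 50 * 1024  # the 50K article-size limit discussed above

Part = Tuple[int, int, bytes]  # (part number, total parts, data)

def split(payload: bytes, part_size: int = PART_SIZE) -> List[Part]:
    """Cut payload into numbered parts of at most part_size bytes."""
    chunks = [
        payload[i:i + part_size] for i in range(0, len(payload), part_size)
    ] or [b""]
    total = len(chunks)
    return [(number, total, chunk) for number, chunk in enumerate(chunks, start=1)]

def reassemble(parts: List[Part]) -> bytes:
    """Rebuild the payload; report exactly which parts are missing."""
    if not parts:
        raise LookupError("no parts received")
    total = parts[0][1]
    by_number = {number: chunk for number, _, chunk in parts}
    missing = [n for n in range(1, total + 1) if n not in by_number]
    if missing:
        raise LookupError(f"missing parts: {missing}")
    return b"".join(by_number[n] for n in range(1, total + 1))

document = bytes(range(256)) * 500  # 128000 bytes of sample data
parts = split(document)
restored = reassemble(list(reversed(parts)))  # arrival order does not matter
```

Knowing which part numbers are missing is what keeps the rebroadcast
penalty small: only those parts need to be resent.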

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/24/90)

>header line must be added to classify article body types.  Many body types
>are possible, including:
>	ascii + underlining	(current default)
>	extended ascii		(international char sets, etc.)
>	rich text		(of some format)
>	andrew message
>	non-interpreted binary	(programs, etc.)
>	binary with text header
>	XXX format bitmap	(gif, tif)
>

Maybe a digital audio type also.  I like the idea of reading the next 
news article and having the news reader decide that it is a GIF file and
automatically display the graphics.  It might be useful to be able
to mix types within an article (for example, graphics with text
telling you what it is along with a binary to produce it).  This would
mean some embedded escape sequences instead of a header line.

Re character sets, ISO10646 sounds good but I'd hate to see news volume
double (16 bit chars vs 8).  Options for it and ISO8859/x sound more
efficient.
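
The doubling is easy to check.  This sketch uses Python's "utf-16-be"
codec as a stand-in for a fixed 16-bit encoding, which is an assumption;
ISO 10646 may end up with other transfer forms.

```python
text = "Options for it and ISO8859/x sound more efficient."

eight_bit = text.encode("iso8859_1")    # one byte per character
sixteen_bit = text.encode("utf-16-be")  # two bytes per character, no byte-order mark

ratio = len(sixteen_bit) / len(eight_bit)
```

For text like this, every character costs twice as much on the wire.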

So we need three things - a standard, newsreaders to handle it, and 
transfer mechanisms that don't munge things (like most of C News).  At 
some point, I expect sites would start refusing feeds from munging 
sites and dumping munged articles.

-- 
Jon Zeeff (NIC handle JZ)	 zeeff@b-tech.ann-arbor.mi.us
Dolphins!  What about the tuna?

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (07/25/90)

In article <1990Jul22.195243.28379@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:

| Geoff and I thought hard about this during C News development.  The trouble
| with checksums is that most people would prefer a slightly mangled copy of
| an article to no copy of the article.  There are all too many transmission
| channels that do in fact slightly mangle articles (expanding tabs, fiddling
| with the definition of newline, etc.).

  This is a good point, but there are some ground rules which could help
eliminate this. When I do a CRC on text, I ignore all whitespace, and
put a delimiter (@@start and @@stop) around the text. This makes it work
even when fairly heavily munged in the usual ways.

  The problems of line ending, conversion of blanks to tabs and back,
adding or deleting blanks at end of line, can all be ignored this way.
Even if lines are folded you can get a CRC if you ignore whitespace.

  This is not to say you're wrong, just that a partial solution is
available. I have used this for some time, and it seems to be critical
enough to be useful, and forgiving enough to avoid dropping things which
are still readable.
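
  A rough sketch of that approach, using zlib.crc32 as a stand-in (the
CRC actually used may differ) and a hypothetical function name:

```python
import zlib

START, STOP = "@@start", "@@stop"

def whitespace_blind_crc(article: str) -> int:
    """CRC of the text between the delimiters, ignoring all whitespace."""
    begin = article.index(START) + len(START)
    end = article.index(STOP, begin)
    # split() with no arguments drops blanks, tabs and every kind of line end.
    significant = "".join(article[begin:end].split())
    return zlib.crc32(significant.encode("utf-8"))

original = (
    "Some header text\n"
    "@@start\n"
    "int\tmain(void)\n{\n\treturn 0;\n}\n"
    "@@stop\n"
    "-- \nsignature\n"
)
# Typical transport damage: tabs expanded, line ends changed, blanks added.
munged = original.replace("\t", "        ").replace("\n", " \r\n")
# Real damage: a character inside the checked region has changed.
damaged = original.replace("return 0", "return 1")

same = whitespace_blind_crc(original) == whitespace_blind_crc(munged)
differs = whitespace_blind_crc(original) != whitespace_blind_crc(damaged)
```

  The price of being this forgiving is that damage which only adds or
removes whitespace goes undetected.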

  I use brik for error checking on c.b.i.p postings, for historical
reasons, and I haven't had a complaint in six months. A header field for
CRC would be great, even if all the reader did was output a message
indicating that the data was damaged.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
            "Stupidity, like virtue, is its own reward" -me

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (07/25/90)

In article <1864@tkou02.enet.dec.com> diamond@tkou02.enet.dec.com (diamond@tkovoa) writes:

| They'll go ahead and fill it, believe me.  And they will market systems
| in the U.S. that have interoperability too.  Businesses that agree with
| your opinion will go bankrupt.

  Is this bash the USA week? Net software is given away. There are no
businesses selling news software, and if someone gives away better
software it will be used; if not, the useful features will be added.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
            "Stupidity, like virtue, is its own reward" -me

guy@auspex.auspex.com (Guy Harris) (07/25/90)

>80 or 90% of newsreaders are the same, the only exception I know of is
>nn (which, by the by, is the best damn fine newsreader around)

Well, Wayne says it's going to be soup soon; we'll see whether things
change (although I'm told one of the GNU EMACS newsreaders also deals
with threads reasonably, and there were some mutterings about "nn"
picking up some of the threads stuff from "trn").

>(and for all I know was written by a native English speaker).

The author posted it from a TI site in Denmark, where, as I remember, he
works; I don't know if Kim is a Dane, an expatriate from an
English-speaking country, or an expatriate from a non-English-speaking
country.

peter@ficc.ferranti.com (Peter da Silva) (07/25/90)

In article <JCP&4:&@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes:
> Breaking binaries on Usenet will get rid of them, all right, but at the
> cost of cutting Usenet users off completely from nearly all
> redistributed programs. You won't get the authors to do things your way.

Breaking binaries on UNIX didn't have that result. You're contradicting
yourself.

Remember, Usenet is not a BBS.
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
<peter@ficc.ferranti.com>

dansmith@well.sf.ca.us (Daniel Smith) (07/25/90)

[ongoing discussion about 8 bit news...]

	I agree that 8 bit news would be a Good Thing.  But some of the
same people that would be solving that problem are attempting (or at
least, should be attempting :-) to speed along these two:

	1) 8 bit email

	2) worldwide agreement on email addressing... 

	I don't think 8 bit email is that hard, but then again, I haven't
really looked at the code...I just think that since uucp and other
low-level transport mechanisms can handle it, that it shouldn't be
too hard to adapt MTAs and readers for it.  I could be way off base on this :-)

	Problems for both 8 bit email and news are:  how to protect
your screen (use a pager like "less", which automatically displays
control-whatever as 2 chars)?  How do you decide which escape sequences
you want to affect your screen and which you want to filter out?
How do you know which language (character set) to display in?  Which
convention do you use to handle EOL?  Will people agree on generic ways
of including bold text, font changes, etc?  Sure, there are some standards
in these areas...but...  I guess the problem really breaks down into
three: how to transport (MTAs and underlying news software), how to
display (allowing for different terminals, local conventions, etc) and
how to save the message (save everything in binary mode?, save it
the way you saw it?, etc.)
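
	As a rough illustration of the "protect your screen" step, here is a
sketch of a filter that shows control characters as two printable characters
(caret notation), in the spirit of what a pager does. The function name is
hypothetical, and keeping newline and tab as-is is an assumption about what
a reader would want.

```python
def make_visible(text: str) -> str:
    """Replace control characters with ^X notation; leave the rest alone."""
    out = []
    for ch in text:
        code = ord(ch)
        if ch in "\n\t":
            out.append(ch)                      # keep layout characters
        elif code < 0x20:
            out.append("^" + chr(code + 0x40))  # ESC (0x1B) becomes ^[
        elif code == 0x7F:
            out.append("^?")                    # DEL
        else:
            out.append(ch)                      # printable, including 8-bit
    return "".join(out)


print(make_visible("bold\x1b[1m text\x07"))
```

This only neutralises escape sequences; deciding which ones to honour, and
which character set the remaining bytes are in, is a separate problem.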

	A big problem I see (2) is all the *%&^ ways of getting a message
from point A to point B.  Not in the routing sense, but in type of
addressing used sense.  Sure it's getting better, but not quickly enough.
One has to remember foo@bar.com vs this!that!other vs something::other
For instance, in my mail today I have:

From: Someone in England <uunet!stl.stc.co.uk!That.Person>
X-Vms-Mail-To: INET%"daniel%bermuda%island@mcsun.uucp"

	Now, I'm glad this made it to me (I've changed the name,
but see what I mean?  lots of mailers can choke on "That.Person"...it's
not a domain!)  As for the INET line...yea, I think I understand it,
but god what a kludge just to get a letter from one place to another!
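
	To make the variety concrete, here is a small sketch that sorts
addresses into the styles mentioned above purely by their punctuation. The
category names and the function are made up for illustration; real mailers
apply far more involved rules.

```python
def address_style(addr: str) -> str:
    """Guess the addressing convention from the separators in an address."""
    if "::" in addr:
        return "double-colon"      # something::other
    if "!" in addr and "@" in addr:
        return "mixed"             # bang path plus a domain part
    if "!" in addr:
        return "bang-path"         # this!that!other
    if "%" in addr and "@" in addr:
        return "percent-routed"    # user%host@relay
    if "@" in addr:
        return "domain"            # foo@bar.com
    return "unknown"


for a in ("foo@bar.com",
          "uunet!stl.stc.co.uk!That.Person",
          "daniel%bermuda%island@mcsun.uucp",
          "something::other"):
    print(a, "->", address_style(a))
```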

	My wish for today is to have the world convert to user@site.domain
once and for all...problem is (getting back to the real world!) all the
different twists on this that you see (don't they do some of the domain
part in reverse in England?  sigh :-)  The real test of this will
be "can I explain in 1 minute to someone brand-new to email how to
address to any site?".  When we can do that, without all sorts of
exceptions based on machine type, OS, country, local net, etc., we'll
all benefit (less bounced mail, more understandable).  I realize many
people are currently working on this, and I thank them!

[followup to some more appropriate group, since this is starting to get
off the Subject]

				Daniel
-- 
                         Dan "Bucko" Smith
   dansmith@well.sf.ca.us   daniel@island.uu.net   unicom!daniel@pacbell.com
ph: (415) 332 3278 (h), 258 2136 (w) disclaimer: Island's coffee was laced :-)
My mind likes Cyberstuff, my eyes films, my hands guitar, my feet skiing...

chris@vision.UUCP (Chris Davies) (07/25/90)

In article <37713@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
[...]
>This means that any new software must co-exist with the current environment.
>One way to do this is to have gateway sites do conversion.  There are
>relatively
>few connections between the US and Europe -- most traffic across the Atlantic
>goes through uunet.  Character translation could be done on the uunet-mcsun
>link -- stripping accents on articles arriving from Europe, remapping
>characters
>so when an American types braces in articles in comp.lang.c, readers in
>Europe see braces, instead of language-specific characters.

I'd guess that there are far more unpublished links between the US and Europe
than might be expected from the maps...  Particularly companies with both
US and European offices (no, I won't name names  :-)  Would these sites have
to provide gateway-based character translation too?  I can't really see that
happening, so we'd get a mix of translated and untranslated articles both
sides of the Atlantic.

Also, stripping accents off characters could completely change the meaning
of the word(s).  Other characters (UK sterling symbol, for instance) would
have no easy translation (unless you map it to hash, #) in 7-bit US ASCII.
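
A sketch of what gateway-style accent stripping might look like, and where
it loses information. It uses Python's Unicode tables as a modern stand-in
for whatever translation table a gateway would really carry; the function
name and the fallback table (sterling to hash) are assumptions for
illustration.

```python
import unicodedata

FALLBACKS = {"\u00a3": "#"}   # hypothetical: pound sterling mapped to hash


def to_ascii(text: str) -> str:
    """Strip accents; use a fallback or '?' when no ASCII base exists."""
    out = []
    for ch in text:
        if ch in FALLBACKS:
            out.append(FALLBACKS[ch])
            continue
        # NFD splits an accented letter into base letter + combining marks.
        decomposed = unicodedata.normalize("NFD", ch)
        base = "".join(c for c in decomposed if not unicodedata.combining(c))
        out.append(base if base.isascii() else "?")
    return "".join(out)


# Two different French words collapse to the same 7-bit spelling.
print(to_ascii("p\u00eache"), to_ascii("p\u00e9ch\u00e9"))
print(to_ascii("\u00a35"))
```

Once two distinct words come out identical, no later gateway can restore
the original, so the translation is one-way.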

Just my pennyworth,
		Chris
-- 
VISIONWARE LTD         | UK: chris@vision.uucp     JANET: chris%vision.uucp@ukc
57 Cardigan Lane       | US: chris@vware.mn.org    OTHER: chris@vision.co.uk
LEEDS LS4 2LE          | BANGNET:  ...{backbone}!ukc!vision!chris
England                | VOICE:   +44 532 788858   FAX:   +44 532 304676
-------------- "VisionWare:   The home of DOS/UNIX/X integration" --------------

henry@zoo.toronto.edu (Henry Spencer) (07/25/90)

In article <632@texas.dk> storm@texas.dk (Kim F. Storm) writes:
>And will our "space-backspace" escape pass through Cnews, NNTP and
>other inews/relaynews/whatever implementations (without modification)?

I can't answer for NNTP et al, but it should be fine with C News, even inews.
Our inews strips out a lot of control characters, but it leaves backspace
alone.
-- 
NFS:  all the nice semantics of MSDOS, | Henry Spencer at U of Toronto Zoology
and its performance and security too.  |  henry@zoo.toronto.edu   utzoo!henry

Dan@dna.lth.se (Dan Oscarsson) (07/25/90)

In article <!YJ&ZKC@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>
>Re character sets, ISO10646 sounds good but I'd hate to see news volume
>double (16 bit chars vs 8).  Options for it and ISO8859/x sound more
>efficient.
>
No, ISO 10646 will not double the volume, even though ISO 10646 is a
32-bit character set. Only 8 bits per character are needed when ASCII or
ISO 8859-1 is used. A few additional bytes are used when switching to a
part of ISO 10646 other than the base part.
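
The general point, that a large character set need not cost four bytes per
character on the wire, can be illustrated with encodings available today.
The sketch below uses UTF-8, which was defined after this discussion and is
not the compaction scheme described above; it also spends two bytes on the
upper half of ISO 8859-1, so the numbers are only indicative.

```python
def encoded_sizes(text: str) -> tuple[int, int]:
    """Return (variable-width size, fixed 32-bit size) in bytes."""
    return len(text.encode("utf-8")), len(text.encode("utf-32-be"))


print(encoded_sizes("hello"))       # pure ASCII text
print(encoded_sizes("\u00c5ke"))    # one non-ASCII letter
```

For mostly-ASCII articles the variable-width form stays close to one byte
per character, while the fixed-width form always costs four.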

   Dan

-- 
Dan Oscarsson                              Department of Computer Science
                                           Lund Institute of Technology
e-mail:  Dan@dna.lth.se                    Box 118
                                           S-221 00 Lund, Sweden

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/25/90)

>>Maybe we do need checksums.  At least we could throw away munged articles.
>>Start doing that and I suspect that people would fix their software.
>
>Geoff and I thought hard about this during C News development.  The trouble
>with checksums is that most people would prefer a slightly mangled copy of
>an article to no copy of the article.  There are all too many transmission
>channels that do in fact slightly mangle articles (expanding tabs, fiddling
>with the definition of newline, etc.).	Some early test versions of C News

There is a certain class of sites that munge all articles in this 
way.  If someone found that all articles from their feed site were 
being dropped, they would tend to find a new feed (or if they still 
accepted it, it would be less likely that they would pass it on).  So 
the choice wouldn't be between slightly mangled news and no news, but 
between mangled news and a new feed.  

As it is, I can't identify munged articles even if I want to.  If I 
could identify them, I could at least hold out to get better copies 
from another site.  

Without a crc, we just don't have the tools we need to fix the problem.

-- 
Jon Zeeff (NIC handle JZ)	 zeeff@b-tech.ann-arbor.mi.us
Dolphins!  What about the tuna?

jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) (07/26/90)

In article <14W43HC@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>In article <JCP&4:&@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes:
>> Breaking binaries on Usenet will get rid of them, all right, but at the
>> cost of cutting Usenet users off completely from nearly all
>> redistributed programs. You won't get the authors to do things your way.
>Breaking binaries on UNIX didn't have that result. You're contradicting
>yourself.

...huh? Your statement doesn't make much sense. Binaries haven't been
useful for Unix sites since it was first ported off a PDP-11.

>Remember, Usenet is not a BBS.

Agreed...but how is that relevant to the current discussion about
binaries versus source?

-- 
Jay Maynard, EMT-P, K5ZC, PP-ASEL   | Never ascribe to malice that which can
jay@splut.conmicro.com       (eieio)| adequately be explained by stupidity.
"It's a hardware bug!" "It's a      +----------------------------------------
software bug!" "It's two...two...two bugs in one!" - _Engineer's Rap_

peter@ficc.ferranti.com (Peter da Silva) (07/26/90)

In article <9SQ&.FC@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes:
> ...huh? Your statement doesn't make much sense. Binaries haven't been
> useful for Unix sites since it was first ported off a PDP-11.

Well, that's the point.
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
<peter@ficc.ferranti.com>

storm@texas.dk (Kim F. Storm) (07/27/90)

guy@auspex.auspex.com (Guy Harris) writes:

>>80 or 90% of newsreaders are the same, the only exception I know of is
>>nn (which, by the by, is the best damn fine newsreader around)

>Well, Wayne says it's going to be soup soon; we'll see whether things
>change (although I'm told one of the GNU EMACS newsreaders also deals
>with threads reasonably, and there were some mutterings about "nn"
>picking up some of the threads stuff from "trn").

I've had some talks with Wayne some time back about adding his thread
handling to nn, but I have been busy getting the 6.4 release into a
good shape (which it is now!)  So next on the agenda for nn is:
	Internationalisation (multicharacter support)
	Thread handling (maybe based on trn - I still have to see it)
	

>>(and for all I know was written by a native English speaker).

You are as wrong as you can be - ever heard me *speak* :-) :-) :-)

>The author posted it from a TI site in Denmark, where, as I remember, he
>works; I don't know if Kim is a Dane, an expatriate from an
>English-speaking country, or an expatriate from a non-English-speaking
>country.

In case anybody is interested I'm Danish.

-- 
Kim F. Storm  <storm@texas.dk>		No news is good news,
Texas Instruments A/S, Denmark		  but nn is better!

michael@fts1.uucp (Michael Richardson) (07/29/90)

In article <37713@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
>Many sites will continue to run the software they are using now, and no amount
>of cajoling will cause them to install new, 8-bit compatible software.  In
>some cases, this is because the organization gives news and mail a low priority
>(doesn't bring in money, etc).  Many sites are still running obsolete software

  Agreed. My experience says that these people tend to be either terminal
sites, or extremely heavily loaded sites, whose future existence is not
necessarily assured. (fts1's feed, nrcaer, is such a site.)
  
>There is a very small incentive, unfortunately, for sites in English-speaking
>countries to install software to support 8-bit character sets.

  Sites in Canada, particularly the Canadian government, would probably be
able to justify the time to support the use of French.

>few connections between the US and Europe -- most traffic across the Atlantic
>goes through uunet.  Character translation could be done on the uunet-mcsun
>link -- stripping accents on articles arriving from Europe, remapping
>characters

  So long as I can get a feed from uunet, untranslated. Obviously
the translation programs would be easiest to put into the batchers.

  I'm not exactly sure how ISO 10646 (is that the right number?) relates
to T61 and NAPLPS (which I'm quite familiar with), but I think that a method
of getting 8 bit data through a 7 bit site could be devised with the proper
set of filters.
  I just wish that uuencoded and tarmail could be better dealt with and 
transferred in binary form. 
  tar | compress | uuencode | *news* | uux <-Telebit-> uuxqt | *news* |
uudecode | uncompress | tar x

  is a necessary evil, but perhaps the blow could be reduced somewhat,
particularly if "*news*" involves batching and doing MORE compression.
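
  A rough sketch of what the compress-then-uuencode steps cost in size. It
assumes zlib as a stand-in for compress(1), encodes only the body lines
(no "begin"/"end" lines), and the helper names are made up for the example.

```python
import binascii
import zlib


def uuencode_body(data: bytes) -> bytes:
    """uuencode data as body lines, 45 input bytes per line."""
    return b"".join(binascii.b2a_uu(data[i:i + 45])
                    for i in range(0, len(data), 45))


def uudecode_body(text: bytes) -> bytes:
    """Inverse of uuencode_body."""
    return b"".join(binascii.a2b_uu(line)
                    for line in text.splitlines(keepends=True))


payload = bytes(range(256)) * 20          # 5120 bytes of sample data
packed = zlib.compress(payload)           # stand-in for compress(1)
armoured = uuencode_body(packed)          # 7-bit safe form for news

print(len(payload), len(packed), len(armoured))
print(len(uuencode_body(b"x" * 45)))      # bytes needed for 45 input bytes
```

Each full line carries 45 bytes of data in 62 bytes of text (a length
character, 60 encoded characters and a newline), so the 7-bit armour gives
back roughly a third of whatever the compression step saved.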

>so when an American types braces in articles in comp.lang.c, readers in
>Europe see braces, instead of language-specific characters.

  And the rest of the world that receives stuff from uunet?

-- 
   :!mcr!:            | < political commentary currently undergoing Senate >
   Michael Richardson | < committee review. Returning next house session.  >
 Play: mcr@julie.UUCP Work: michael@fts1.UUCP Fido: 1:163/109.10 1:163/138
    Amiga----^     - Pay attention only to _MY_ opinions. -   ^--Amiga--^

tmatimar@watmath.waterloo.edu (Ted M A Timar) (07/30/90)

In article <37713@ucbvax.BERKELEY.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
>Many sites will continue to run the software they are using now, and no amount
>of cajoling will cause them to install new, 8-bit compatible software.  In
>some cases, this is because the organization gives news and mail a low priority
>(doesn't bring in money, etc).  Many sites are still running obsolete software
>and will continue to do so.  To install new software people need an incentive.
>There is a very small incentive, unfortunately, for sites in English-speaking
>countries to install software to support 8-bit character sets.

I don't see why we spend so much time catering to people who are unwilling
to upgrade their software to more recent versions.  I think, to be reasonable,
we must maintain backward compatibility to the version immediately before the
current one, and possibly a bit more, so that people with different transport
agents (not bnews/cnews, or PC's running news ...) can catch up.

To be fair, the problem is, in part, that news installation is not trivial,
and making it trivial isn't trivial.  This discourages sites with many other
problems from upgrading when they have other problems to deal with.  But
they do slow down progress much more than we might desire.

>This means that any new software must co-exist with the current environment.
>One way to do this is to have gateway sites do conversion.  There are
>relatively
>few connections between the US and Europe -- most traffic across the Atlantic
>goes through uunet.  Character translation could be done on the uunet-mcsun
>link -- stripping accents on articles arriving from Europe, remapping
>characters
>so when an American types braces in articles in comp.lang.c, readers in
>Europe see braces, instead of language-specific characters.

Unfortunately, there are many gateway sites to Europe.  There are also
many gateway sites to Quebec.  (Both Europe and Quebec can use NNTP to
get news from almost anywhere now.)

I would like to see a Usenet II / Usenet with very few gateways.  No new
software would be written for Usenet, and all news would be gatewayed
both ways.  Those gatewayed in would, in many cases be entirely
gibberish.  This would encourage sites to upgrade as soon as possible.

Furthermore, I would recommend that a "Language:" header be added to
all articles.  And a reverse KILL format to the newsreaders, so that
people could kill all but the languages they could read.
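
A sketch of how a newsreader might apply such a "reverse KILL". The header
name "Language:" is the one proposed above; the values ("en", "fr"), the
function name, and the choice to show untagged articles are assumptions made
for the example.

```python
from email import message_from_string


def readable(article: str, languages: set[str]) -> bool:
    """Keep an article only if its Language: header is one the user reads."""
    msg = message_from_string(article)
    lang = msg.get("Language")
    if lang is None:
        return True          # assumption: untagged articles are still shown
    return lang.strip().lower() in languages


french = "Subject: bonjour\nLanguage: fr\n\nSalut tout le monde\n"
untagged = "Subject: hello\n\nHi all\n"
print(readable(french, {"en"}), readable(french, {"en", "fr"}),
      readable(untagged, {"en"}))
```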

-- 
Just my .0002 cents worth
Ted Timar
tmatimar@watmath.waterloo.edu

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (07/30/90)

It looks like the latest trn (threaded rn) will support the display of 
eighth bit set characters.  With it and cnews*, one can already run a 
local or controlled distribution group with a 8 bit character set (eg,
iso8859/1).  

* - If the utilities inews uses aren't 8 bit clean, you may be out
of luck until Henry does a new version.


I wonder if anyone out there reads this like I do  - "risumi".
-- 
Jon Zeeff (NIC handle JZ)	 zeeff@b-tech.ann-arbor.mi.us

henry@zoo.toronto.edu (Henry Spencer) (07/30/90)

In article <N8N&T+D@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>* - If the utilities inews uses aren't 8 bit clean, you may be out
>of luck until Henry does a new version.

Actually, Geoff is the one who would do it (and is working on it, as
time permits).

I would hope that a lot of the sites that need to post 8-bit articles
already have 8-bit-clean utilities, since I would expect they'd need
them for other purposes.
-- 
The 486 is to a modern CPU as a Jules  | Henry Spencer at U of Toronto Zoology
Verne reprint is to a modern SF novel. |  henry@zoo.toronto.edu   utzoo!henry

jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) (07/30/90)

In article <5IX4X2D@ggpc2.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>In article <9SQ&.FC@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes:
>> ...huh? Your statement doesn't make much sense. Binaries haven't been
>> useful for Unix sites since it was first ported off a PDP-11.
>Well, that's the point.

...except that your point doesn't port to the IBM-PC environment, where
binaries are useful; the mountain of PC packages delivered in object
form, often in some compressed archive format with documentation in a
neat little bundle, is an existence proof.

Usenet is more than Unix.

-- 
Jay Maynard, EMT-P, K5ZC, PP-ASEL   | Never ascribe to malice that which can
jay@splut.conmicro.com       (eieio)| adequately be explained by stupidity.
"It's a hardware bug!" "It's a      +----------------------------------------
software bug!" "It's two...two...two bugs in one!" - _Engineer's Rap_

tneff@bfmny0.BFM.COM (Tom Neff) (07/30/90)

In article <NLV&_.D@splut.conmicro.com> jay@splut.conmicro.com (Jay "you ignorant splut!" Maynard) writes:
>...except that your point doesn't port to the IBM-PC environment, where
>binaries are useful; the mountain of PC packages delivered in object
>form, often in some compressed archive format with documentation in a
>neat little bundle, is an existence proof.
>
>Usenet is more than Unix.

Oh yeah, but much less (and more) than a BBS.  Binaries *WILL BREAK OUR
BACKS* if we open the doors wide!  And they are intrinsically parochial,
where text and source are ecumenical.

It doesn't matter how many new kinds of computers join: the net's
ESSENTIAL CHARACTER needs preserving.  This is a lesson it would be
unfair (unrealistic anyway) to expect PC BBS refugees to understand
right away.  A lot of them show up and want to chat the sysop, as it
were.  Post three revisions a week of those DR. FILEGOOD binaries, the
Bart Simpson pictures, and Zeke's Monthly List of 3,000 BBS's in ZIP form
to preserve every superfluous ^M.

The "existence proof" of all the PC binaries carried on BBS's and
services is also a sufficiency proof.  These folks HAVE their BBS's
already, and CompuServe, and GEnie, and BIX, etc, etc.  They don't need
to turn Usenet into another one.

-- 
If the human mind were simple enough to understand,   =))  Tom Neff
we'd be too simple to understand it. -- Emerson Pugh  ((=  tneff@bfmny0.BFM.COM

peter@ficc.ferranti.com (Peter da Silva) (07/30/90)

In article <1990Jul29.171112.4093@watmath.waterloo.edu> tmatimar@watmath.waterloo.edu (Ted M A Timar) writes:
> I don't see why we spend so much time catering to people who are unwilling
> to upgrade their software to more recent versions.

Unwilling? Or unable?

I have not been able to get B news at a higher patch level than 14 to run
on Xenix System III/286. We don't have the option of upgrading to a newer
O/S: we depend on certain programs that won't run on anything else. I have
sidestepped the problem by switching to C news, at the cost of a 13 hour
unpaid day during the Christmas shutdown, plus days of after hours hacking
before and after. And recent versions of C news have broken again... I'm
now several patches behind. News 3.0 was never an option... it's just too
big.
-- 
Peter da Silva.   `-_-'
+1 713 274 5180.   'U`
<peter@ficc.ferranti.com>

cudep@warwick.ac.uk (Ian Dickinson) (08/01/90)

In article <15710@bfmny0.BFM.COM> tneff@bfmny0.BFM.COM (Tom Neff) writes:
>[Binaries] are intrinsically parochial, where text and source are ecumenical.

Good stuff - I like the way that was put.

>The "existence proof" of all the PC binaries carried on BBS's and
>services is also a sufficiency proof.  These folks HAVE their BBS's
>already, and CompuServe, and GEnie, and BIX, etc, etc.  They don't need
>to turn Usenet into another one.

Most people from the BBS crowd who I know like Usenet *BECAUSE* it is ^M
different.  They wouldn't want it to change into a monolithic BBS. ^M
And they're prepared to put in the effort to make sure it stays that way. ^M
^M :-)
--
\/ato.  Ian Dickinson.    GNU's not got BSE.  Food: Hungarian Acacia Honey.
vato@cu.warwick.ac.uk          Plinth.        Machine: Sun SPARCserver 330.
vato@tardis.cs.ed.ac.uk        Sabeq.         Footwear: Airwalk Vic Blotch.
gdd046@cck.cov.ac.uk                          Oxymoron: Intelligent Life.

VERKADE@CTSS.CO.UK (Herman Verkade) (08/02/90)

A couple of comments on 8 bit news. It seems to me that it is not necessary to
convert the whole net to 8 bit. The 7 bit restriction is only a problem for
specific newsgroups: newsgroups in languages other than English and newsgroups
containing binary data, such as bitmaps, .gif files, etc. So, I don't think
**everybody** needs to upgrade to some implementation that supports 8 bits.
Only those that wish to carry newsgroups that need it. All we would need
is a standard, not necessarily a world-wide upgrade of software.

For example, if the Germans and the Finns decide that their local language
newsgroups will be 8 bit, then that is no business of the Americans,
British, Spanish or Russians. As long as there is some software that supports
8 bits (And C News seems to do that or alternatively a few changes in B News)
and maybe some way of indicating that a particular group expects 8-bit, so
that when posting to a 7 bit group another signature can be used; one that
doesn't have 8 bit characters in it. And if people want to start a new
newsgroup for .gif files in 8 bit mode then, again, the sites that want
to carry it must install an 8 bit version. If your feed doesn't support it,
get another feed for that newsgroup (A similar situation exists in the UK,
where UKC doesn't carry nor forward `alt.sex.pictures'. Sites that want it
get another feed for that group).

The next problem would be how to read an 8 bit group, both for groups that
use different character sets and for groups that carry other 8 bit data. As
someone earlier suggested, maybe an extra header should be added to the
standard that indicates what type of data it is. For mail there are RFC-1049
(Content-Type) and RFC-1154 (Encoding), which are extensions to RFC-822. The
extra header fields would only need to be interpreted by the news reader. So,
only if you want to read an 8 bit group, get an 8 bit reader. As long as we
can agree on some standard.

My proposal would be RFC-1154-style, because it also allows one message to
contain encodings in different parts and could therefore also be used to
automatically convert different parts of a message in 7 bit groups. For
example, a message containing a uuencoded file preceded by some explanation
in ASCII and a signature at the bottom, could have a header such as:

    Encoding: 10 text, 1045 uuencode, 5 text

A smart news reader could display the two text parts and ask whether you want
the uuencode bit to be uudecoded. For an article containing a header like:

    Encoding: 15 text, 637 uugif, 5 text

the reader could then automatically extract the uuencoded .gif file and
display an image instead. Etc, etc, etc. And only users that want such
functionality switch to a news reader that supports it.
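
A sketch of how a reader could use such a header. It handles only the
simple "count keyword" form shown in the examples above and assumes the
counts are line counts, as in RFC-1154; the function names are made up for
illustration.

```python
def parse_encoding(value: str) -> list[tuple[int, str]]:
    """Turn '10 text, 1045 uuencode, 5 text' into (line count, type) pairs."""
    parts = []
    for item in value.split(","):
        count, kind = item.split()
        parts.append((int(count), kind.lower()))
    return parts


def split_body(lines: list[str], parts: list[tuple[int, str]]):
    """Cut the body into sections according to the parsed header."""
    sections, pos = [], 0
    for count, kind in parts:
        sections.append((kind, lines[pos:pos + count]))
        pos += count
    return sections


parts = parse_encoding("10 text, 1045 uuencode, 5 text")
body = [f"line {n}" for n in range(1060)]
for kind, chunk in split_body(body, parts):
    print(kind, len(chunk))
```

A reader that does not know a keyword such as "uugif" can still skip that
section by its line count and display the text parts around it.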

I realise that I am discussing two separate topics here:
1) Provide 8 bit transport mechanisms so that international character sets
   can be used, but enable 8 bits only on a newsgroup by newsgroup basis
   with either a designated character set for such a group, or an Encoding
   header to indicate the character set.
2) An Encoding: header for carrying data other than text (in either 7 or
   8 bit groups).

For both I suggest providing a standard, but not forcing anybody to upgrade
to new software. I think this proposal provides for backward compatibility
and accommodates the requirements of a fair number of net.people.

Herman Verkade

amanda@mermaid.intercon.com (Amanda Walker) (08/03/90)

In article <900802011259.00001B1F@MARVIN.CTSS.CO.UK>, VERKADE@CTSS.CO.UK
(Herman Verkade) writes:
> The 7 bit restriction is only a problem for
> specific newsgroups: newsgroups in languages other than english and newsgroups
> containing binary data, such as bitmaps, .gif files, etc.

It's also a problem for any newsgroup that has non-English speakers posting
to it.  For example, on alt.sca, the name of one common poster comes out
as "]ke"-- he's Swedish, and the "]" should really be an A with a ring over
it...
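
A sketch of why that happens, assuming the poster's system uses the usual
Swedish 7-bit national variant, in which six ASCII punctuation positions
are reassigned to letters. The table and function name are illustrative.

```python
# Positions that ASCII uses for [ \ ] { | } hold letters in the Swedish set.
SWEDISH_7BIT = str.maketrans("[\\]{|}", "\u00c4\u00d6\u00c5\u00e4\u00f6\u00e5")


def as_swedish(text: str) -> str:
    """Show 7-bit text the way a Swedish-variant terminal would."""
    return text.translate(SWEDISH_7BIT)


print(as_swedish("]ke"))                  # the name as its owner sees it
print(as_swedish("if (a[i]) { x(); }"))   # C source on the same terminal
```

The same byte values are being displayed in both places; only the
interpretation differs, which is why C source full of braces and brackets
looks just as odd on the Swedish side.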

Even in groups whose traffic is conducted only in English, there is a growing
proportion of people whose names cannot be spelled with 7-bit ASCII.

--
Amanda Walker <amanda@intercon.com>
InterCon Systems Corporation

ed@braaten.doit.sub.org (Ed Braaten) (08/05/90)

VERKADE@CTSS.CO.UK (Herman Verkade) writes:

>A couple of comments on 8 bit news. It seems to me that it is not necessary to
>convert the whole net to 8 bit. The 7 bit restriction is only a problem for
>specific newsgroups: newsgroups in languages other than English and newsgroups
>containing binary data, such as bitmaps, .gif files, etc. So, I don't think
>**everybody** needs to upgrade to some implementation that supports 8 bits.
>Only those that wish to carry newsgroups that need it. All we would need
>is a standard, not necessarily a world-wide upgrade of software.

I think this is the right approach to the problem.  If it works, don't
fix it! ;-)  But give the non-English and binary people a chance.  A 
standard, however, is an absolute must.

>My proposal would be RFC-1154-style, because it also allows one message to
>contain encodings in different parts and could therefore also be used to
>automatically convert different parts of a message in 7 bit groups. For
>example, a message containing a uuencoded file preceded by some explanation
>in ASCII and a signature at the bottom, could have a header such as:

>    Encoding: 10 text, 1045 uuencode, 5 text

>A smart news reader could display the two text parts and ask whether you want
>the uuencode bit to be uudecoded. For an article containing a header like:

>    Encoding: 15 text, 637 uugif, 5 text

>the reader could then automatically extract the uuencoded .gif file and
>display an image instead. Etc, etc, etc. And only users that want such
>functionality switch to a news reader that supports it.

How about it?  Could we get the author of nn sold on this?  (I'm
crossposting this article to n.s.nn to find out...)

>I realise that I am discussing two separate topics here:
>1) Provide 8 bit transport mechanisms so that international character sets
>   can be used, but enable 8 bits only on a newsgroup by newsgroup basis
>   with either a designated character set for such a group, or an Encoding
>   header to indicate the character set.
>2) An Encoding: header for carrying data other than text (in either 7 or
>   8 bit groups).

I like your suggestions, Herman.  What about the rest of the net?
Opinions?  Comments?

Greetings from Munich,

Ed

---------------------------------------------------------------------------
        Ed Braaten             |  Jesus answered,  "I am the way and the
Work: ed@imuse.de.intel.com    |  truth and the life.  No one comes to the
Home: ed@braaten.doit.sub.org  |  Father except through me."   John 14:6 
---------------------------------------------------------------------------

richard@pegasus.com (Richard Foulk) (08/06/90)

In article <N8N&T+D@b-tech.uucp> zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) writes:
>It looks like the latest trn (threaded rn) will support the display of 
>eighth bit set characters.  With it and cnews*, one can already run a 
>local or controlled distribution group with a 8 bit character set (eg,
>iso8859/1).  
>

Yes, but trn is total vaporware.


-- 
Richard Foulk		richard@pegasus.com

mcmahon@tgv.com (John McMahon) (08/06/90)

In article <3863a@braaten.doit.sub.org>, ed@braaten.doit.sub.org (Ed Braaten) writes...
>>My proposal would be RFC-1154-style, because it also allows one message to
>>contain encodings in different parts and could therefore also be used to
>>automatically convert different parts of a message in 7 bit groups. For
>>example, a message containing a uuencoded file preceded by some explanation
>>in ASCII and a signature at the bottom, could have a header such as:
> 
>>    Encoding: 10 text, 1045 uuencode, 5 text

My understanding is that an RFC is in the works for "non-textual transmission of
data via E-mail".  I suspect this could be easily expanded to include USENET
NEWS.

Watch the NIC for announcements of new RFCs...

John 'Fast-Eddie' McMahon    :    MCMAHON@TGV.COM    : TTTTTTTTTTTTTTTTTTTTTTTT
TGV, Incorporated            :                       :    T   GGGGGGG  V     V
603 Mission Street           : HAVK (abha) Gur bayl  :    T  G          V   V
Santa Cruz, California 95060 : bcrengvat flfgrz gb   :    T  G    GGGG   V V
408-427-4366 or 800-TGV-3440 : or qrfgeblrq ol znvy  :    T   GGGGGGG     V