[news.misc] Random order news

sommar@enea.se (Erland Sommarskog) (06/20/88)

I am no Unix hacker, nor am I a news wizard, just an ordinary
Usenetter. I realize that what I'm asking for is not easy to achieve,
yet I think the problem should be addressed. It is a nuisance, and
I am not the only one to be annoyed.

The problem: News articles seem to come in an almost random order.
Seeing the follow-up before the original is just as common as the 
opposite. Yes, there are easy explanations to this, I know, but
sometimes you have to wonder. An example: There were a lot of articles
in rec.music.misc/gdead with the subject "First and last visit to
a Greatful Dead concert". I saw some 10-15 follow-ups before
I got the original article. Some of the premature ones were posted in
June, while the original dated from May 24th. A friend of mine at
Yale told me he received them in the right order.
  Since the end of May, mcvax has had problems/overload, so the
news feed to Europe has been (and still is) behind. You are on
June 8th, when suddenly articles from the 14th turn up. And then
back to the 8th, and so on. (All of them from America.)

Where is the problem?  Mcvax takes all American news from uunet,
so the local problems shouldn't interfere, should they? Is the problem
within the American part of Usenet? But why did Yale get it in the right
order? Or is the problem at uunet or in the uunet-husc6 connection?
  Questions, questions. If anyone could answer I'd be happy. If
anyone could say "we're planning improvements in the news software",
I'd be even more grateful. (E.g. mcvax could take news
from uunet in date order, instead of in the order of arrival at uunet.
Then at least every batch would be correct.)

If you follow up, I would be glad if you mailed me a copy. As I said,
the news flow here is slow, and I don't normally read this group.
-- 
Erland Sommarskog           
ENEA Data, Stockholm        
sommar@enea.UUCP            
Mail your NO votes for rec.music.rock to: jfc%Athena.mit.edu@mit-eddie.UUCP

rmtodd@uokmax.UUCP (Richard Michael Todd) (06/21/88)

Well, I don't claim to know all the details of why Usenet propagation behaves
the way it does, but I'll try.  No doubt if I really screw it up someone
will point it out....
In article <3543@enea.se> sommar@enea.se (Erland Sommarskog) writes:
>I am no Unix hacker, nor am I a news wizard, just an ordinary
>Usenetter. I realize that what I'm asking for is not easy to achieve,
>yet I think the problem should be addressed. It is a nuisance, and
>I am not the only one to be annoyed.
>The problem: News articles seem to come in an almost random order.
...
>I got the original article. Some of the premature ones were posted in
>June, while the original dated from May 24th. A friend of mine at
>Yale told me he received them in the right order.

Happens here too, though usually not as drastically--propagation times
are shorter here.  The fact that one site saw them in the right order
is no guarantee whatsoever that others will.  

>Where is the problem?  

  First of all, a brief lecture on how netnews works.  When a site
receives a news batch (or a single article) it takes each article, copies
it into the appropriate /usr/spool/news directory where you end up reading
it, and queues it in the batches it prepares to feed to neighbor
sites.  (Actually, at least in C News, the article IDs are merely listed
in a file and the batches are made up later.  I think B News works similarly,
but I've only got ready access to a copy of C News.)  So if two articles
arrive in the same batch and can go out in the same batch (batches are usually
made up to be about the same size, and there is no requirement that 
the outgoing batch size be the same as the incoming), their order will
be preserved.  
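
To make the batching step concrete, here is a rough sketch in Python. It is
not what B News or C News actually run; the function name, the (message ID,
size) pairs and the size limit are all invented for illustration. It only
shows why articles queued together keep their relative order.

```python
from typing import Iterable, List, Tuple


def make_batches(queued: Iterable[Tuple[str, int]], max_bytes: int) -> List[List[str]]:
    """Pack (message_id, size) pairs into batches of about max_bytes each.

    Hypothetical helper: articles are taken strictly in queue order, so two
    articles that land in the same batch keep their relative order.
    """
    batches: List[List[str]] = []
    current: List[str] = []
    used = 0
    for message_id, size in queued:
        # Close the current batch if this article would push it over the limit.
        if current and used + size > max_bytes:
            batches.append(current)
            current, used = [], 0
        current.append(message_id)
        used += size
    if current:
        batches.append(current)
    return batches
```

Note that nothing here says anything about the order in which the finished
batches are later transmitted.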
  However, if the articles arrive in different batches, there is no guarantee
that they will arrive in order.  The batches are stored up in the outgoing
uucp directory waiting transfer to the other system, but there is no
guarantee whatsoever that the current uucp software will transmit the
batches in the same order they were queued.  Although I don't know for
sure just how uucico scans for files awaiting transfer, I strongly suspect
it just opens the directory and reads each directory entry looking for 
files destined for the system (that's the simplest way to do it). Especially
on very active sites where files are being added to and deleted from the uucp
spool directory all the time, the order in which the files appear in the
directory will bear almost no resemblance to the order in which they were
put into the directory.
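
As an illustration only (this is not how any real uucico works, and the
function name is made up), a scan that wanted queue order rather than raw
directory order would have to stat every file and sort, roughly like this
in Python. It assumes the file modification time reflects when the job was
queued.

```python
import os
from typing import List


def queued_oldest_first(spool_dir: str) -> List[str]:
    """Return the plain files in spool_dir ordered by modification time.

    Hypothetical sketch: a bare directory read gives whatever order the
    filesystem happens to store entries in, so we sort explicitly.
    """
    with os.scandir(spool_dir) as entries:
        files = [(e.stat().st_mtime, e.name) for e in entries if e.is_file()]
    # Ties on mtime fall back to the file name, so the result is deterministic.
    return [name for _, name in sorted(files)]
```

The cost is one stat call per queued file plus the sort, which adds up on a
site with a week's worth of batches waiting.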
  I believe that overseas sites like mcvax only contact uunet on weekends
so that their phone bills are lower; at any rate the two sites apparently
make contact relatively infrequently (compared to other Usenet links).  
If mcvax only gets news batches every week from uunet, by the time it
dials in there is a whole week's worth of news batches waiting for it.
And as I mentioned above, they're not going to be in any particular order.
Seeing a June 14 posting in the middle of a whole bunch from June 8 is
hardly surprising.  

>the American part of Usenet? But why did Yale get it in the right order?
Presumably they get news batches in often enough so that only, say, 1
day's worth of news is sitting in the queue getting randomly shuffled.
It all depends on how often one's site contacts its neighbors for news
(and how often the upstream sites contact *their* neighbors).

>anyone could say "we're planning improvements in the news software",
>I'd be even more grateful. (E.g. mcvax could take news
>from uunet in date order, instead of in the order of arrival at uunet.
>Then at least every batch would be correct.)

It would probably help some.  The question is, does Rick Adams really
want to hack on his uucp software?  Given that his site supports a whole
horde of uucp sites, I'm not sure I'd want to risk messing with the 
software, as long as it still worked.  For that matter, the extra overhead
involved in sorting queued uucp jobs by time may itself be too great to
incur in uucico; there's little merit in making sure the jobs are arranged
in order if the connection has timed out by the time you've finished
sorting.  
  I should mention in closing that this problem of Usenet not preserving
the order of articles is hardly new.  It's been around as long as I can
remember reading netnews (ca. 4 yrs now).  I even remember all the chaos
it caused with notes, which implicitly assumes that base notes always
arrive before followups.  (Remember "Orphaned Response"?)  Thankfully this
site no longer has to put up with notes...
  Final note: this problem is hardly unique to Usenet, as one can readily
discover by checking out one's local Fidonet Echos (an Echo is the
Fido analogue of a newsgroup).  Scrambled message ordering occurs almost
all the time on Fidonet, probably for the same reasons.
-- 
Richard Todd		Dubious Domain: rmtodd@uokmax.ecn.uoknor.edu
USSnail:820 Annie Court,Norman OK 73069 	Fido:1:147/1
UUCP: {many AT&T sites}!occrsh!uokmax!rmtodd
"MSDOS is a Neanderthal operating system" - Henry Spencer

bill@carpet.WLK.COM (Bill Kennedy) (06/21/88)

In article <3543@enea.se> sommar@enea.se (Erland Sommarskog) writes:

[ Complaints about the sequence with which news arrives deleted ]

For whatever reason, news seems to operate on GMT (UTC?).  The time
stamp in the post is that way, and it appears to be presented that
way when it's set up for reading.  How about if some wizard figures
out how to sort these before rnews squirrels them away?

>Erland Sommarskog           
>ENEA Data, Stockholm        

I think that the rest of us should pay special attention to our European
neighbors.  Actually our intercontinental neighbors.  Erland explains a
problem that plagues most any site, how 'bout someone tackling it?
-- 
Bill Kennedy  Internet:  bill@ssbn.WLK.COM
                Usenet:  { killer | att-cb | ihnp4!tness7 }!ssbn!bill

fair@ucbarpa.Berkeley.EDU (Erik E. Fair) (06/22/88)

This is a user interface problem - when the user interfaces learn
to sort by date, you win. Until then, you are at the mercy of the
transport system (which is not going to change *that* fundamentally)
which delivers things out of order.

	it's only a small matter of software,

	Erik E. Fair	ucbvax!fair	fair@ucbarpa.berkeley.edu

P.S.	Of course, if you *really* want to see who responded to what,
	you should be looking at Message-IDs and References.
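
A minimal sketch of that idea in Python, assuming each article's headers
have already been parsed into a dict, and assuming the usual convention that
the last ID in References is the immediate parent. The function name and the
sample IDs in the usage note are invented.

```python
from collections import defaultdict
from typing import Dict, List, Optional


def build_thread_index(articles: List[Dict[str, str]]) -> Dict[Optional[str], List[str]]:
    """Map each parent Message-ID (None for thread roots) to its direct replies.

    Arrival order does not matter: a follow-up that shows up before its
    original is simply filed under the parent ID until the parent arrives.
    """
    children: Dict[Optional[str], List[str]] = defaultdict(list)
    for art in articles:
        refs = art.get("References", "").split()
        # Assumption: the last reference names the article being answered.
        parent = refs[-1] if refs else None
        children[parent].append(art["Message-ID"])
    return dict(children)
```

Feeding it a follow-up `<2@b>` (References: `<1@a>`) before the original
`<1@a>` still files `<2@b>` under `<1@a>`, whatever the dates say.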

jerry@oliveb.olivetti.com (Jerry Aguirre) (06/23/88)

In article <4025@pasteur.Berkeley.Edu> fair@ucbarpa.Berkeley.EDU (Erik E. Fair) writes:
>This is a user interface problem - when the user interfaces learn
>to sort by date, you win.

I have thought about this and I have a suggestion.  Having the user
interface open all the articles in a group so that it can parse the date
is too much overhead.  A simple solution is to have rnews set the
modification time of each article to the date in the header.  It is then
possible to get the article date by just stat'ing the file.
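
Roughly, and only as a sketch in Python rather than anything rnews actually
does (the function name is made up, and it assumes the article file has a
parseable Date: header), the stamping step could look like this:

```python
import os
from email.utils import parsedate_to_datetime


def stamp_with_header_date(path: str) -> float:
    """Set the file's modification time to its Date: header.

    Returns the posting time as a Unix timestamp. Raises ValueError if the
    header section contains no Date: line; a malformed date also raises.
    """
    with open(path, "r", errors="replace") as f:
        for line in f:
            if line == "\n":  # a blank line ends the header section
                break
            if line.lower().startswith("date:"):
                posted = parsedate_to_datetime(line[5:].strip()).timestamp()
                os.utime(path, (posted, posted))  # sets atime and mtime
                return posted
    raise ValueError("no Date: header found in " + path)
```

After that, `os.stat(path).st_mtime` gives the posting date without opening
the article.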

Once this is done many things become practical.  Aside from the news
readers, a utility can easily be written to sort a batch/nntp input file
by article date instead of arrival time.  This wouldn't be reliable for
UUCP transmission but would still help even out the propagation delay.
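
Such a utility might be sketched as follows in Python. It assumes the batch
has already been split into individual article texts (the splitting is not
shown), and it puts articles with a missing or unparseable Date: at the end
in arrival order. The function name is invented.

```python
from email.parser import HeaderParser
from email.utils import parsedate_to_datetime
from typing import List, Tuple


def sort_by_posting_date(articles: List[str]) -> List[str]:
    """Return the articles ordered by their Date: header, oldest first."""

    def key(indexed: Tuple[int, str]) -> Tuple[int, float, int]:
        position, text = indexed
        date = HeaderParser().parsestr(text, headersonly=True)["Date"]
        try:
            stamp = parsedate_to_datetime(date).timestamp()
        except (TypeError, ValueError):
            # No usable date: sort after all dated articles, by arrival order.
            return (1, 0.0, position)
        return (0, stamp, position)

    return [text for _, text in sorted(enumerate(articles), key=key)]
```

This orders by whatever the poster's clock claimed, so a badly set clock
upstream still shows up as a misplaced article.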

Of course some of the strange dates in messages are going to create some
strange file times.  But hey, that kind of thing makes usenet
interesting.  ("Rn won't let me read this article yet because it was
written tomorrow.")

sommar@enea.se (Erland Sommarskog) (07/04/88)

Erik E. Fair (fair@ucbarpa.Berkeley.EDU) writes:
>This is a user interface problem - when the user interfaces learn
>to sort by date, you win. Until then, you are at the mercy of the
>transport system (which is not going to change *that* fundamentally)
>which delivers things out of order.

I don't think the user interface is a good place to handle this.
Every time I entered a group there would be a start-up time equal
to the time for my first "k" in a group today. A little annoying.
  Also, this would have to be done for every user, instead of once
if it were the responsibility of the transport system.
  Another drawback is that it only helps me within the same session.
It is not unusual for the predecessor to come in a different batch
from the follow-up, and that could be days later. And it is these
big delays that are really annoying.

To me it doesn't seem unreasonable that more sophisticated
selection algorithms could be implemented at the transport level,
at least at the more central sites, like uunet or mcvax.
-- 
Erland Sommarskog           
ENEA Data, Stockholm        
sommar@enea.UUCP            
"It all looks quite impressive really, but is it really necessary?" Radio Stars

shields@ists.yorku.ca (Paul Shields) (07/07/88)

In article <3652@enea.se>, sommar@enea.se (Erland Sommarskog) writes:
> Erik E. Fair (fair@ucbarpa.Berkeley.EDU) writes:
> >This is a user interface problem - when the user interfaces learn
> >to sort by date, you win. Until then, you are at the mercy of the
> >transport system (which is not going to change *that* fundamentally)
> >which delivers things out of order.
> 
> I don't think the user interface is a good place to handle this.
> [...]

Why not implement it in both the user interface AND transport layers?
Before you call me excessive, look at this: 

    If you implement it in the transport layers alone, you have to
re-order every time a new batch comes in.  This is ludicrous based on the
current file structures.

Implementing part of the re-ordering in, say, sendbatch, will help 
you to keep the out-of-order articles from getting too far apart.
Then your user interface comes along, and doesn't have to do very much
work to complete the reordering.  
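
One way to picture the split, as a sketch in Python under the assumption that
each outgoing batch is sorted by posting time before it leaves. Both function
names are hypothetical, and articles are reduced to (timestamp, message ID)
pairs.

```python
import heapq
from typing import Iterable, Iterator, List, Tuple

Article = Tuple[float, str]  # (posting time as a Unix timestamp, message ID)


def presort_batch(batch: Iterable[Article]) -> List[Article]:
    """Transport side: order one outgoing batch by posting time."""
    return sorted(batch)


def reader_merge(batches: Iterable[List[Article]]) -> Iterator[Article]:
    """Reader side: merge batches that are each already sorted.

    heapq.merge never re-sorts inside a batch; it only interleaves them.
    """
    return heapq.merge(*batches)
```

The reader's remaining work is then a merge of sorted runs rather than a full
sort of everything in the group.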

Besides, I'd like to be able to sort by posting-time.  Now all we have to
do is get everyone's clocks synchronised.  Probably a harder problem! :-)

Paul Shields, shields@ists.yorku.CA, shields@yunccn.UUCP
(...utzoo!yunexus!ists, ...mnetor!ontmoh!yunccn)!shields
It's amazing just how long it takes to get nothing done.

weemba@garnet.berkeley.edu (Obnoxious Math Grad Student) (07/08/88)

In article <170@ists>, shields@ists (Paul Shields) writes:
>Besides, I'd like to be able to sort by posting-time.

Gnews could do that.  I don't know if anyone has written the necessary
code, but it would be fairly straightforward.

ucbvax!garnet!weemba	Matthew P Wiener/Brahms Gang/Berkeley CA 94720