[news.admin] List of sites with broken Followup

brad@looking.UUCP (Brad Templeton) (05/10/89)

Recently I wrote some software to make use of the References line in news
articles.  Much to my dismay, I found that most articles don't have a valid
References: line!   The reason is a small number of sites with broken posting
programs that don't include a References: line on followups.  I scanned
65,000 news articles, and 3700 of them matched this expression:

	if( !followup && subject has "^re:" )
		-- bad article --
(The subject starts with re: but there is no References: line)
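
In Python, the test above might look roughly like this.  It is only a
sketch: it assumes each article is available as raw text with RFC 822-style
headers, and the name is_suspect_followup is made up for illustration; it
is not the scanning software described in this post.

```python
from email.parser import HeaderParser


def is_suspect_followup(raw_article: str) -> bool:
    """True when Subject starts with "Re:" but there is no References: line."""
    # Parse only the header block; the body is irrelevant to this check.
    headers = HeaderParser().parsestr(raw_article, headersonly=True)
    subject = (headers.get("Subject") or "").lstrip()
    has_references = bool((headers.get("References") or "").strip())
    # Case-insensitive match on the "re:" prefix, as in the pattern "^re:".
    return subject.lower().startswith("re:") and not has_references
```

Like the pattern itself, this also flags subjects where a user typed "Re:"
by hand, so a hit is a hint rather than proof of a broken posting program.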

Now only 5%, that's not so bad, right?  Wrong.  Every followup to one of
these bad articles ALSO has a broken reference chain, and is not linked to
the parent.   All it takes is just a few sites to make the References: line
useless.

We have been trying to get that line to be usable for years, but it will never
be unless we insist that people follow the RFC1036 News Article format.

I propose we give these sites a while to clean up their act.  Then we should
install an automatic program at some site.  This program should, when it
detects an article matching the above pattern, send a mail message to the
postmaster or usenet user at the poster's domain and the sender's domain
(if different) informing them of the bug in their program and asking them
to fix it.

Annoying?  Perhaps.   What you do on your own site, and what readers you
run are your own business.  But what you POST has to follow the standards,
and that's everybody's business.

If we can't fix this problem, we will just have to abandon the References:
line and all the useful things that can be done with it.  (Like USENET
hypertext threads etc.)

I have the full details on sender/from pairs for these bad articles.  That's
too much to post, so I will just list the offending Sender: and From: line
sites below, with the counts of bad articles.   One clear problem is
messages from Fidonet and some BITNET relays.   For Unix sites, working
postnews software is available now, for free.

(If your count is very small compared to your news output, it may be
a local bug for some types of posting, or it may be users who simply
typed in 're:' in a manual subject line.)

65450 valid articles, 3696 invalid articles

Here are the offending sites found in 'Sender:' lines:

 811 rice.edu
 513 ucbvax.berkeley.edu
 178 andrew.cmu.edu
  70 orbit.uucp
  55 bunker.uucp
  53 gryphon.com
  53 dogie.macc.wisc.edu
  46 watmath.waterloo.edu
  46 cbnews.att.com
  44 vector.dallas.tx.us
  44 lindy.stanford.edu
  43 tank.uchicago.edu
  43 canremote.uucp
  36 tut.cis.ohio-state.edu
  34 ames.arc.nasa.gov
  30 udel.edu
  30 crash.cts.com
  29 rpi.edu
  29 hubcap.clemson.edu
  27 eddie.mit.edu
  26 geneva.rutgers.edu
  26 bloom-beacon.mit.edu
  26 bbn.com
  25 sei.cmu.edu
  23 sun.eng.sun.com
  23 adm.brl.mil
  21 ucdavis.ucdavis.edu
  20 yale.uucp
  20 stjhmc.fidonet.org
  20 agate.berkeley.edu
  19 watdragon.waterloo.edu
  18 van-bc.uucp
  18 pyramid.pyramid.com
  18 iuvax.cs.indiana.edu
  17 mamab.fidonet.org
  15 aber-cs.uucp
  14 jumbo.dec.com
  14 eos.uucp
  13 stiatl.uucp
  13 cs.ucla.edu
  13 arizona.edu
  12 unix.sri.com
  11 ruby.dec.com
  11 orion.cf.uci.edu
  10 sdcsvax.ucsd.edu
  10 psuvm.bitnet
  10 pacbell.com
  10 isishq.fidonet.org
  10 heimat.uucp
  10 b.gp.cs.cmu.edu
   9 yvax.byu.edu
   9 saturn.ucsc.edu
   9 rutgers.rutgers.edu
   9 oxy.edu
   9 egvideo.uucp
   9 alice.uucp
   8 watcgl.waterloo.edu
   8 uswat.uucp
   8 tekgvs.labs.tek.com
   8 psuarlc.bitnet
   8 polya.stanford.edu
   8 mudos.ann-arbor.mi.us
   8 garth.uucp
   8 fidogate.fidonet.org
   8 ai.toronto.edu
   7 tekred.cna.tek.com
   7 sequent.uucp
   7 rust.dec.com
   7 ncoast.uucp
   7 mindlink.uucp
   7 cs.utexas.edu
   7 brunix.uucp
   6 spice.cs.cmu.edu
   6 spdcc.com
   6 portia.stanford.edu
   6 paris.ics.uci.edu
   6 nmtsun.nmt.edu
   6 net.bio.net
   6 mountn.dec.com
   6 mcastl.fidonet.org
   6 ihlpa.att.com
   6 hcx1.ssd.harris.com
   6 apollo.com
   5 vax1.tcd.ie
   5 usc.edu
   5 uhura.cc.rochester.edu
   5 twwells.uucp
   5 ssc-vax.uucp
   5 solbourne.com
   5 pasteur.berkeley.edu
   5 orchid.waterloo.edu
   5 nunki.usc.edu
   5 netnews.upenn.edu
   5 mitisft.convergent.com
   5 mathcs.emory.edu
   5 littlei.uucp
   5 ihlpf.att.com
   5 ibmpcug.uucp
   5 gpu.utcs.utoronto.ca
   5 gollum.uucp
   5 ecsvax.uucp
   5 dartvax.dartmouth.edu
   5 cunyvm.cuny.edu
   5 cs.cmu.edu
   5 cmhgate.fidonet.org
   5 barilvm.bitnet
   4 zodiac.uucp
   4 wpi.wpi.edu
   4 vicorp.uucp
   4 uxe.cso.uiuc.edu
   4 uxa.cso.uiuc.edu
   4 ux1.cso.uiuc.edu
   4 sun.soe.clarkson.edu
   4 stag.uucp
   4 serene.uucp
   4 sbsvax.uucp
   4 rb-dc1.uucp
   4 raybed2.uucp
   4 psuecl.bitnet
   4 omepd.uucp
   4 ncsuvx.ncsu.edu
   4 mailcom.fidonet.org
   4 m.cs.uiuc.edu
   4 linus.uucp
   4 leah.albany.edu
   4 lafcol.uucp
   4 jsheese.fidonet.org
   4 jarthur.claremont.edu
   4 hogbbs.fidonet.org
   4 hannah.dec.com
   4 csli.stanford.edu
   4 cs.swarthmore.edu
   4 computing-maths.cardiff.ac.uk
   4 cisunx.uucp
   4 chris
   4 cbnewsl.att.com
   4 brains.uucp
   4 bmug.fidonet.org
   4 apple.com
   4 amelia.nas.nasa.gov
   3 vaxa.uwa.oz
   3 uxf.cso.uiuc.edu
   3 uwovax.uwo.ca
   3 uvm-gen.uucp
   3 usl-pc.usl.edu
   3 sgi.sgi.com
   3 quanta.eng.ohio-state.edu
   3 psych.stanford.edu
   3 pbseps.uucp
   3 otter.hpl.hp.com
   3 nmsu.edu
   3 munsell.uucp
   3 muadib.fidonet.org
   3 ms.uky.edu
   3 mips.com
   3 mas.uucp
   3 lucid.com
   3 leadsv.uucp
   3 kuhub.cc.ukans.edu
   3 isl.stanford.edu
   3 ihuxz.att.com
   3 hwlab.columbia.ncr.com
   3 hub.ucsb.edu
   3 hal.uucp
   3 gssc.uucp
   3 fas.ri.cmu.edu
   3 discg1.uucp
   3 cs.rochester.edu
   3 cs.rit.edu
   3 cps3xx.uucp
   3 caen.engin.umich.edu
   3 busker.fidonet.org
   3 bigq.dec.com
   3 batcomputer.tn.cornell.edu
   3 aplcomm.jhuapl.edu
   3 ?
   2 yunccn.uucp
   2 white.toronto.edu
   2 wheaton.uucp
   2 vi.ri.cmu.edu
   2 versatc.uucp
   2 vax1.acs.udel.edu
   2 uts.amdahl.com
   2 ut-emx.uucp
   2 urartu.uucp
   2 uklirb.uucp
   2 tmsoft.uucp
   2 think.uucp
   2 telesoft.uucp
   2 tekcrl.labs.tek.com
   2 taux01.uucp
   2 swrinde.nde.swri.edu
   2 swan.ulowell.edu
   2 stsci.edu
   2 srhqla.uucp
   2 spool.cs.wisc.edu
   2 skivs.uucp
   2 servax0.essex.ac.uk
   2 sdsu.uucp
   2 sdcc15.ucsd.edu
   2 sam.cs.cmu.edu
   2 quintus.uucp
   2 pucc.princeton.edu
   2 phoenix.princeton.edu
   2 ntvax.uucp
   2 ns.network.com
   2 ncsuvm.bitnet
   2 nadc.arpa
   2 mva.cs.liv.ac.uk
   2 munnari.oz
   2 mit-amt
   2 mimsy.uucp
   2 metnet.fidonet.org
   2 mcnc.org
   2 marob.masa.com
   2 maine.bitnet
   2 madnix.uucp
   2 lll-lcc.uucp
   2 lerami.uucp
   2 laic.uucp
   2 insight
   2 imspw6.uucp
   2 imagen.uucp
   2 ihlpb.att.com
   2 iesd.dk
   2 hpcvlx.hp.com
   2 hpctdke.hp.com
   2 hound.uucp
   2 holin.att.com
   2 gwusun.gwu.edu
   2 godot.radonc.unc.edu
   2 ginosko.samsung.com
   2 garcon.cso.uiuc.edu
   2 eva.slu.se
   2 enea.se
   2 druco.att.com
   2 dowjone.uucp
   2 dayton.uucp
   2 cybaswan.uucp
   2 cunyvm.bitnet
   2 csuf3b.uucp
   2 cit-vax.caltech.edu
   2 cg-atla.uucp
   2 carroll1.uucp
   2 buengc.bu.edu
   2 boulder.colorado.edu
   2 blake.acs.washington.edu
   2 bigtime.fidonet.org
   2 berlioz
   2 axion.bt.co.uk
   2 ardec.arpa
   2 aipna.ed.ac.uk
   2 ai.utoronto.ca
   2 adobe.com
   1 zeus.unl.edu
   1 zehntel.uucp
   1 zaphod.ncsa.uiuc.edu
   1 yamnet.uucp
   1 xyzzy.uucp
   1 xn.ll.mit.edu
   1 wooglin.scc.com
   1 wiley.uucp
   1 watvlsi.waterloo.edu
   1 watrose.uucp
   1 water.waterloo.edu
   1 watarts.waterloo.edu
   1 wasatch.utah.edu
   1 warwick.uucp
   1 warwick.ac.uk
   1 wacsvax.oz
   1 venera.isi.edu
   1 vax5.cit.cornell.edu
   1 vax1.cc.lehigh.edu
   1 uxh.cso.uiuc.edu
   1 uxg.cso.uiuc.edu
   1 utstat.uucp
   1 utastro.uucp
   1 urchin.fidonet.org
   1 unmvax.unm.edu
   1 unisoft.uucp
   1 unify.uucp
   1 umn-d-ub.d.umn.edu
   1 umn-cs.cs.umn.edu
   1 umbc3.umbc.edu
   1 ultb.uucp
   1 uhnix1.uh.edu
   1 uhccux.uhcc.hawaii.edu
   1 ubvax.uucp
   1 ubbs-nh.mv.com
   1 tut.fi
   1 trsvax.uucp
   1 thor.acc.stolaf.edu
   1 tellab5.tellabs.chi.il.us
   1 teksce.sce.tek.com
   1 tekno.chalmers.se
   1 teklds.cae.tek.com
   1 td2cad.intel.com
   1 tc.fluke.com
   1 tallis.dec.com
   1 sybase.sybase.com
   1 sunkisd.cs.concordia.ca
   1 sun.uucp
   1 stellar.uucp
   1 star.dec.com
   1 sq.com
   1 soleil.uucp
   1 smoke.brl.mil
   1 sjs.sj.ate.slb.com
   1 silver.bacs.indiana.edu
   1 siemens.siemens.com
   1 sas.uucp
   1 sarek.uucp
   1 sagpd1.uucp
   1 s.cs.uiuc.edu
   1 rtmvax.uucp
   1 robotics.jpl.nasa.gov
   1 reed.uucp
   1 redsox.bsw.com
   1 recondo.uucp
   1 randvax.uucp
   1 quintro.uucp
   1 qucis.queensu.ca
   1 pyr.gatech.edu
   1 pur-phy
   1 polyslo.calpoly.edu
   1 polyof.uucp
   1 pie1.mach.cs.cmu.edu
   1 philabs.philips.com
   1 perle.uucp
   1 pcsbst.uucp
   1 panda.uucp
   1 p.cs.uiuc.edu
   1 oucsace.cs.ohiou.edu
   1 osiris.cso.uiuc.edu
   1 orioneb.fidonet.org
   1 oresoft.uu.net
   1 oregon.uoregon.edu
   1 ohstpy.mps.ohio-state.edu
   1 nugipsy.uucp
   1 nlm-mcs.arpa
   1 netmbx.uucp
   1 neoucom.uucp
   1 neabbs.uucp
   1 ndsuvax.uucp
   1 ncr-sd.sandiego.ncr.com
   1 ncis.tis.llnl.gov
   1 ncelvax.uucp
   1 naucse.uucp
   1 nancy.uucp
   1 mtunf.att.com
   1 mtunb.att.com
   1 mtgzz.att.com
   1 mitvma.mit.edu
   1 mit-caf.mit.edu
   1 mipos3.intel.com
   1 metapsy.uucp
   1 mentor.cc.purdue.edu
   1 medsoft.uucp
   1 mcdurb.urbana.gould.com
   1 masscomp.uucp
   1 m2xenix.uucp
   1 lzss.att.com
   1 logico.uucp
   1 lindenthal.cae.ri.cmu.edu
   1 lgnp1.ls.com
   1 lakeoz.uucp
   1 kcdev.uucp
   1 julian.uwo.ca
   1 jpl-devvax.jpl.nasa.gov
   1 jjmhome.uucp
   1 jhunix.hcf.jhu.edu
   1 jato.jpl.nasa.gov
   1 itivax.iti.org
   1 ism780c.isc.com
   1 inria.inria.fr
   1 inmet
   1 infmx.uucp
   1 imagine.pawl.rpi.edu
   1 ima.ima.isc.com
   1 iisat.uucp
   1 ihlpm.att.com
   1 ihlpl.att.com
   1 ihlpe.att.com
   1 idca.tds.philips.nl
   1 i.cc.purdue.edu
   1 husc6.harvard.edu
   1 hrc63.co.uk
   1 hplsla.hp.com
   1 hpctdls.hp.com
   1 hpcilzb.hp.com
   1 hounx.att.com
   1 hou2d.att.com
   1 hitchrack.stanford.edu
   1 helios.ee.lbl.gov
   1 hcx1.uucp
   1 haddock.ima.isc.com
   1 gt-eedsp.uucp
   1 granite.dec.com
   1 gmu90x.uucp
   1 glacier.stanford.edu
   1 geovision.uucp
   1 genrad.uucp
   1 geac.uucp
   1 galaxy.rutgers.edu
   1 fungus.dec.com
   1 fredonia.uucp
   1 frame.uucp
   1 foil.uucp
   1 flatline.uucp
   1 felix.uucp
   1 f.gp.cs.cmu.edu
   1 etive.ed.ac.uk
   1 ethz.uucp
   1 esquire.uucp
   1 elroy.jpl.nasa.gov
   1 edsel.uucp
   1 ecf.toronto.edu
   1 drutx.att.com
   1 drd.uucp
   1 dialog.uucp
   1 dgbt.uucp
   1 dftsrv.gsfc.nasa.gov
   1 devildog.uucp
   1 deming.dec.com
   1 deimos.cis.ksu.edu
   1 decwrl.dec.com
   1 dcdwest.uucp
   1 data-io.com
   1 daimi.dk
   1 cwjcc.cwru.edu
   1 cvman.uucp
   1 cup.portal.com
   1 ctycal.com
   1 csusac.uucp
   1 cs.odu.edu
   1 cs.helsinki.fi
   1 cs.glasgow.ac.uk
   1 cosmo.uucp
   1 cornell.uucp
   1 copper.mdp.tek.com
   1 convex.uucp
   1 columbia.edu
   1 cmx.npac.syr.edu
   1 charon.unm.edu
   1 cfa.harvard.edu
   1 ccnysci.uucp
   1 cci632.uucp
   1 cbnewsh.att.com
   1 cbmvax.uucp
   1 cb.ecn.purdue.edu
   1 cattell.psych.upenn.edu
   1 calgary.uucp
   1 cacilj.uucp
   1 bucket.uucp
   1 bu-cs.bu.edu
   1 bsu-cs.bsu.edu
   1 brl.mil
   1 briarpatch.uucp
   1 bourbon.ee.tulane.edu
   1 bongo.uucp
   1 bgsuvax.uucp
   1 beta.lanl.gov
   1 bentley.uucp
   1 belltec.uucp
   1 bdt.uucp
   1 bcm.tmc.edu
   1 batcave.uucp
   1 bally.bally.com
   1 babbage.acc.virginia.edu
   1 auvax.uucp
   1 aucis.uucp
   1 astroatc.uucp
   1 aramis.rutgers.edu
   1 applga.uucp
   1 amsoft.uucp
   1 americ.uucp
   1 alux2.att.com
   1 alliant.com
   1 alembic.uucp
   1 agora.uucp
   1 aerospace.aero.org
   1 accuvax.nwu.edu.nwu.edu
   1 a.lanl.gov
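
A tally like the one above could be produced along these lines.  This is a
sketch, not the program that generated the list: it assumes the bad articles
are already collected as raw header text, takes the site to be whatever
follows the "@" in the chosen header, and uses the hypothetical names
site_of and tally_sites.  Addresses with no "@" are lumped under "?".

```python
from collections import Counter
from email.parser import HeaderParser
from email.utils import parseaddr


def site_of(header_value: str) -> str:
    """Lower-cased host part of an address, or "?" when there is none."""
    _name, addr = parseaddr(header_value)
    _local, at, host = addr.rpartition("@")
    return host.lower() if at and host else "?"


def tally_sites(bad_articles, header="Sender"):
    """Count bad articles per site, most frequent first."""
    counts = Counter()
    for raw in bad_articles:
        value = HeaderParser().parsestr(raw, headersonly=True).get(header) or ""
        counts[site_of(value)] += 1
    return counts.most_common()
```

Calling tally_sites(articles, header="From") would give the per-poster view
instead of the per-sender one.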

Here are the offending sites from 'From:' lines:  Only the worst offenders
are listed, as there are over 1000 sites in this list.   If I get a lot
of requests, I will post the whole list that relates Senders to From.

 185 andrew.cmu.edu
 124 uunet.uu.net
  70 pnet51.cts.com
  53 vms.macc.wisc.edu
  50 sun.com
  45 forsythe.stanford.edu
  44 trantor.harris-atd.com
  43 canremote.uucp
  40 gsbacd.uchicago.edu
  38 gryphon.com
  33 krypton.arc.nasa.gov
  32 watmath.waterloo.edu
  30 bbn.com
  27 pnet01.cts.com
  26 venus.ycc.yale.edu
  25 sei.cmu.edu
  24 xerox.com
  24 will johnson
  23 nss.cs.ucl.ac.uk
  23 gaffa.mit.edu
  22 ulkyvx.bitnet
  21 cunyvm.cuny.edu
  21 cobra.mitre.org
  21 athena1.sun.com
  20 ames.arc.nasa.gov
  19 ucbvax.berkeley.edu
  19 iuvax.cs.indiana.edu
  17 lpami.wimsey.bc.ca
  17 cis.ohio-state.edu
  17 castor.ucdavis.edu
  16 helios.tn.cornell.edu
  15 ruby.dec.com
  15 pnet02.cts.com
  15 cs.utexas.edu
  15 arizona.edu
  15 aber-cs.uucp
  14 lanl.gov
  14 jumbo.dec.com
  14 grand.waterloo.edu
  14 eos.uucp
  13 stiatl.uucp
  13 h.gp.cs.cmu.edu
  12 wooglin.scc.com
  12 sri-nic.arpa
  12 central.sun.com
  11 watdragon.waterloo.edu
  11 shell.uucp
  11 psuvm.bitnet
  11 postgres.berkeley.edu
  11 bu-it.bu.edu
  11 ai.toronto.edu
  10 wpi.wpi.edu
  10 pacbell.com
  10 mcc.com
  10 heimat.uucp
  10 gateway.mitre.org
  10 gatech.edu
  10 egvideo.uucp
  10 beowulf.ucsd.edu
  10 b.gp.cs.cmu.edu
  10 apollo.com
   9 yvax.byu.edu
   9 vmsa.cf.uci.edu
   9 utorphys.bitnet
   9 uhura.cc.rochester.edu
   9 rust.dec.com
   9 richter.mit.edu
   9 pyrps5
   9 oxy.edu
   9 lnic1.hprc.uh.edu
   9 dockmaster.ncsc.mil
   9 apple.com
   9 alice.uucp
   8 vuse.vanderbilt.edu
   8 tekgvs.labs.tek.com
   8 pyrtech
   8 psuarlc.bitnet
   8 mudos.ann-arbor.mi.us
   8 m.cs.uiuc.edu
   8 lafcol.uucp
   8 garth.uucp
   8 f555.n161.z1.fidonet.org
   8 cbnews.att.com
   8 brl.mil
   8 att.att.com

Get with it folks!
-- 
Brad Templeton, Looking Glass Software Ltd.  --  Waterloo, Ontario 519/884-7473

lear@NET.BIO.NET (Eliot Lear) (05/10/89)

Unfortunately, your statistics are probably flagging many messages
that are gatewayed between mail and news.  Mail has no concept of a
references line.  Maybe we should consider incorporating the mechanism
into mail.
-- 
Eliot Lear
[lear@net.bio.net]

wisner@terminator.cc.umich.edu (Bill Wisner) (05/10/89)

I have noticed that some mail/news gateway software makes at least
a small effort to build a valid References: line, in that if it finds
a valid-looking message-ID in an In-Reply-To: line it puts that ID
into a References: line.

Of course, this may cause a new problem, if the message cited was
never gatewayed into USENET. But I doubt it.
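
A rough sketch of that idea in Python.  The pattern below is a loose guess
at what a "valid-looking" message-ID is, not the full RFC 822 grammar, and
references_from_in_reply_to is a hypothetical name rather than a function
from any real gateway.

```python
import re
from typing import Optional

# Loose shape check: <something@something>, with no spaces or nested brackets.
MSG_ID = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")


def references_from_in_reply_to(in_reply_to: str) -> Optional[str]:
    """Build a References: value from an In-Reply-To: value, or return None."""
    ids = MSG_ID.findall(in_reply_to)
    return " ".join(ids) if ids else None
```

When no ID is found the gateway has nothing to go on, and the article would
have to be posted without a References: line.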

jwl@ernie.Berkeley.EDU (James Wilbur Lewis) (05/10/89)

In article <3222@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
-Recently I wrote some software to make use of the References line in news
-articles.  Much to my dismay, I found that most articles don't have a valid
-References: line!   The reason is a small number of sites with broken posting
-programs that don't include a References: line on followups.  I scanned
-65,000 news articles, and 3700 of them matched this expression:
-
-	if( !followup && subject has "^re:" )
-		-- bad article --
-(The subject starts with re: but there is no References: line)
-
-Now only 5%, that's not so bad, right?  Wrong.  Every followup to one of
-these bad articles ALSO has a broken reference chain, and is not linked to
-the parent.   All it takes is just a few sites to make the References: line
-useless.
-
-(If your count is very small compared to your news output, it may be
-a local bug for some types of posting, or it may be users who simply
-typed in 're:' in a manual subject line.)
-
-65450 valid articles, 3696 invalid articles

In the #2 spot:

- 513 ucbvax.berkeley.edu

ucbvax is a news server for a bunch of heavily-used local machines, and
a lot of news gets posted from there.  We use rn and Pnews.  I think we
can safely assume that Erik Fair is a competent (!) netnews administrator.
I think there might be a couple of things going on that you may not have
considered:

Rather than people manually inserting a "re:" in the subject lines of
articles which are really basenotes, it is possible that they are
deleting the References: lines of followups, in order to avoid the
dreaded "interp buffer overflow" from rn.  

If this is really what is going on, I suggest one of the following
patches to the software:  (1) modify rn to allow arbitrarily long
References: lines, or truncate them automatically when they get
unwieldy, or (2) only keep a reference to the article's immediate
parent, since the remainder of the current References: line could
be reconstructed from that.

Another explanation might be in your counting program -- what does it do
for malformed reference lines?  Many people incorrectly do a global
replacement of ">" for some alternate character, to defeat the
inews 50% included-text rule.  This messes up References: lines,
which, while marginally less annoying than extraneous "inews fodder",
is still a bother.  

-- Jim Lewis
   U.C. Berkeley

karl@ddsw1.MCS.COM (Karl Denninger) (05/10/89)

In article <3222@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
>Recently I wrote some software to make use of the References line in news
>articles.  Much to my dismay, I found that most articles don't have a valid
>References: line!   The reason is a small number of sites with broken posting
>programs that don't include a References: line on followups.  

I hear this.  Our software gateways Usenet to our local BBS format
(AKCSNet), and it uses the references line to figure out where to put the
responses to an article.

Of course, where there is no references line, it assumes that there is a new
item there, and starts a new thread.  Very annoying.

What's just as bad is the people who do a "g/>/s//$/g" on an article in order
to remove the quoting and bypass the "more new text than old text" item.
That command should read:
	":g/^>/s//$/g" or something similar
If you use the first form, you get ALL the ">"s, including those bracketing
the references.  That, of course, makes for an invalid references line.
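
The difference can be seen with a small Python stand-in for the two editor
commands.  The sample article is made up; str.replace plays the part of the
unanchored substitution and re.sub with a "^" anchor plays the part of the
anchored one.

```python
import re

article = (
    "References: <3222@looking.UUCP>\n"
    "\n"
    ">quoted line\n"
    "new text\n"
)

# Unanchored: every ">" is replaced, including the one closing the message-ID.
unanchored = article.replace(">", "$")

# Anchored to the start of each line: only the quote markers change.
anchored = re.sub(r"^>", "$", article, flags=re.MULTILINE)

print(unanchored.splitlines()[0])
print(anchored.splitlines()[0])
```

Only the anchored version leaves the References: header intact.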

Secondly, is there a place that I can uucp or get emailed to me a copy of
RFC822 and RFC1036?  I'd like to have both on file here for our work; my
working knowledge is gleaned from sources at this point.

Thanks in advance.

--
Karl Denninger (karl@ddsw1.MCS.COM, <well-connected>!ddsw1!karl)
Public Access Data Line: [+1 312 566-8911], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.		"Quality Solutions at a Fair Price"

rsalz@bbn.com (Rich Salz) (05/10/89)

The Subject line of this thread is wrong.

As others have pointed out, most of the "broken" messages are probably the
result of news/mail software gatewaying.  In particular, the high number
for Rice is the sun-spots group.  Ohio-state gateways the GNU newsgroups.
BBN does the GNU/Emacs mailing list.  Would you give up those groups, or
annoy those moderators with automated postings?  I wouldn't.

Also, I sometimes manually delete References: headers.  If the header is
something you care about, then don't put it under user control, or do what
Inews does when someone tries to forge a From: line.

	/Rich $alz, keeper of the news/mail gateway at BBN
	and maintainer of the news-n-mail mailing list
-- 
Please send comp.sources.unix-related mail to rsalz@uunet.uu.net.

hubcap@hubcap.clemson.edu (Mike Marshall) (05/10/89)

Here at hubcap.clemson.edu (one of the offenders) our news posting software
is Rick Adams's postnews V1.36.

I don't know what the headers of this message will look like when this posting
hits the net, but this is what they look like right now:

 * Subject: Re: List of sites with broken Followup (No References) Software
 * Newsgroups: news.admin
 * References: <3222@looking.UUCP>

It doesn't look to me like my news software is broken. Maybe my mail 
software is broken. Maybe NNTP is doing it. Maybe someone down the line
is really the culprit. Maybe a monster ate it. 

I don't want my site to continue to spew out spooge onto the net, but
obviously I'm not sure what to fix or even if it is my problem.

I'll continue to follow this thread of conversation. If this is some kind
of net wide problem, news.admin is the best place to solve it.

-Mike Marshall      hubcap@hubcap.clemson.edu

sloane@kuhub.cc.ukans.edu (Bob Sloane) (05/10/89)

In article <3222@looking.UUCP>, brad@looking.UUCP (Brad Templeton) writes:
> Recently I wrote some software to make use of the References line in news
> articles.  Much to my dismay, I found that most articles don't have a valid
> References: line!   The reason is a small number of sites with broken posting
> programs that don't include a References: line on followups.
    [ much deleted ]
>    3 kuhub.cc.ukans.edu

I take care of the News software here at kuhub, and I have tried real hard to
convince News to followup an article and not put in a references line.  So far
I haven't been able to do it.  Could you send me some more information about
the 3 articles?  Were they perhaps from inet groups, and were munged by the
mailing list interface software?  For example comp.os.vms is also the INFO-VAX
mailing list.  I am not sure how the interface between the mailing list and
usenet news could tell that any particular article was a followup or what
should be in the references line.  I suspect that you may have to just live
with the problem in the case of mailing lists gated into news.
+-------------------+-------------------------------------+------------------+
|  Bob Sloane        \Internet: SLOANE@KUHUB.CC.UKANS.EDU/Anything I said is |
|  Computer Center    \ BITNET: SLOANE@UKANVAX.BITNET   / my opinion, not my |
|  University of Kansas\  AT&T: (913) 864-0444         /  employer's.        |
+-----------------------+-----------------------------+----------------------+

brad@looking.UUCP (Brad Templeton) (05/10/89)

Many people have responded to tell me the reasons for bad References lines.
Such reasons include:
	Gatewayed mailing lists
	mail aliases that post news
	Manual deletion of the references line by the user

I understand and already knew about these reasons.  They are not the point.
We either have to fix almost all the problems, or the References line is
useless and might as well be scrapped.

We are spending many megabytes storing and shipping these lines after all.

Gateways of mailing lists & moderated groups can be tolerated to a degree,
because they limit the problem to a single group.  Other problems can't
be tolerated.  If that means shutting off mail->news gateways, that's what
it means.

Right now, and until the problem is fixed at 99.5% or more of sites, you
can't follow a usenet discussion by references thread, the way you would like
to.  You have to rely on the subject staying the same.  That's stupid, because
often the topic wanders and the subject should change with it.

Worse, you can't kill a followup tree, because if you kill on an article
with a broken chain, you only kill the broken chain, and even if you kill
on the parent article, the broken chains keep showing up again, and again,
and again.

Any plans for a hypertext-like system have to be scrapped.  Such plans were
the motive for News 3.0, for example.

So I think we may have to do the following:

	a) No mail->news gateways.  You wanna post, get some posting
	   software on your machine.  It's only fair if you're going to
	   post to 10,000 machines.
	b) Otherwise get some very smart gateway software that can figure
	   out a references line
	c) Posting software should detect when the References line has been
	   edited out of a posting, and re-insert a shortened line that at
	   least includes the ultimate parent (original article) and the
	   immediate parent.
	d) Non-unix sites running new software must be required to adhere to
	   the standard.

Or give up.  Which would be sad.
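
Item (c) could be sketched as follows, under the assumption that the
posting software still has the parent article's Message-ID and References:
line on hand when the user's edited copy comes back.  The function name
shortened_references is hypothetical, and "ultimate parent" is taken to
mean the first ID on the parent's References: line.

```python
def shortened_references(parent_references: str, parent_message_id: str) -> str:
    """References: value holding the first known ancestor and the parent."""
    ancestors = parent_references.split()
    # If the parent had no References: line, it is itself the oldest ancestor.
    root = ancestors[0] if ancestors else parent_message_id
    if root == parent_message_id:
        return parent_message_id
    return f"{root} {parent_message_id}"
```

If the parent's own chain was already broken, the "ultimate parent" found
this way is only the oldest ancestor still recorded, not the true original.
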
-- 
Brad Templeton, Looking Glass Software Ltd.  --  Waterloo, Ontario 519/884-7473

jbuck@epimass.EPI.COM (Joe Buck) (05/11/89)

In article <3222@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
>Recently I wrote some software to make use of the References line in news
>articles.  Much to my dismay, I found that most articles don't have a valid
>References: line!   The reason is a small number of sites with broken posting
>programs that don't include a References: line on followups.  I scanned
>65,000 news articles, and 3700 of them matched this expression:
>
>	if( !followup && subject has "^re:" )
>		-- bad article --

Brad, you're jumping to conclusions based on a couple of incorrect assumptions.
First, an article doesn't have to be a followup if its subject matches "^re:".
Secondly, many articles enter the network via mail-to-news gateways.  For
example, the #2 spot on your list, Berkeley, is the major gateway for
converting the Arpa mailing lists to and from newsgroups.  Obviously
they can't magically concoct reference lines in most cases.

>If we can't fix this problem, we will just have to abandon the References:
>line and all the useful things that can be done with it.  (Like USENET
>hypertext threads etc.)

We do not.  Just treat articles with no references line as basenotes.
Gateways can also make a better attempt to generate References: headers
from In-Reply-To headers where they exist.

Obviously, no one has deliberately broken postnews and Pnews.
-- 
-- Joe Buck	jbuck@epimass.epi.com, uunet!epimass.epi.com!jbuck

cowan@marob.MASA.COM (John Cowan) (05/11/89)

In article <3222@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
>I have the full details on sender/from pairs for these bad articles.  That's
>too much to post, so I will just list the offending Sender: and From: line
>sites below, with the counts of bad articles.   One clear problem is
>messages from Fidonet and some BITNET relays.   For Unix sites, working
>postnews software is available now, for free.

I am currently lobbying the Fidonet Technical Standards Association and the
author of the Fidonet<->Usenet/uucp gateway to handle Message-Id: and
In-Reply-To:/References: lines "correctly" by maintaining this information
internally to Fidonet messages.  Don't expect too much too soon, though;
still, there should soon at least >exist< a way of handling this
problem within Fidonet, even if it will be a long time before Fidonet
message-generators actually generate correct information.

A probable workaround is to strip "Re:" from subject lines, and not to
believe that an article is a followup unless a "References:" line exists.
Otherwise, use some kind of default followup algorithm (the one I use says
"This message is a followup to the last non-locally-written message with
the same subject" -- not perfect, but workable).
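
That default could be sketched like this in Python.  The tuple layout, the
case-insensitive subject comparison, and the names normalize_subject and
guess_parent are assumptions for illustration, not taken from any actual
gateway code.

```python
import re

# One or more leading "Re:" prefixes, in any letter case.
RE_PREFIX = re.compile(r"^\s*(re:\s*)+", re.IGNORECASE)


def normalize_subject(subject: str) -> str:
    """Strip leading "Re:" prefixes and surrounding blanks; fold case."""
    return RE_PREFIX.sub("", subject).strip().lower()


def guess_parent(subject, earlier, is_local):
    """Message-ID of the latest non-local article with the same subject.

    `earlier` is a list of (message_id, subject, from_address) tuples in
    arrival order; `is_local` is a caller-supplied test on the address.
    Returns None when nothing matches.
    """
    wanted = normalize_subject(subject)
    for message_id, subj, sender in reversed(earlier):
        if normalize_subject(subj) == wanted and not is_local(sender):
            return message_id
    return None
```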

bob@tinman.cis.ohio-state.edu (Bob Sutterfield) (05/11/89)

In article <3229@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
   So I think we may have to do the following:

   d) Non-unix sites running new software must be required to adhere to
      the standard.

I think this points out the real problem: coexistence in the real
world of the rest of the net, and ability to live with all the other
olden stuff and archaic practices that will persist over all our
objections.  We can't "require" anything, because the Usenet is a
complete anarchy, governed by shouting rather than by enforceable
regulations.  Gateways exist, and are likely to still be around for a
while.

Alas, this forces us all to discover much more clever methods to do
what we really want to do.  But the rest of the world will eventually
evolve and follow, sheep-like, along behind the bleeding edgers.  By
that time, the visionaries will have moved on to something else, still
trying to drag the rest of the net along behind.

It's a matter of practical evolution, not sudden revolution.

welty@algol.steinmetz (richard welty) (05/11/89)

In article <29126@ucbvax.BERKELEY.EDU> jwl@ernie.Berkeley.EDU.UUCP (James Wilbur Lewis) writes:
*Rather than people manually inserting a "re:" in the subject lines of
*articles which are really basenotes, it is possible that they are
*deleting the References: lines of followups, in order to avoid the
*dreaded "interp buffer overflow" from rn.  

*If this is really what is going on, I suggest one of the following
*patches to the software:  (1) modify rn to allow arbitrarily long
*References: lines, or truncate them automatically when they get
*unwieldy, or (2) only keep a reference to the article's immediate
*parent, since the remainder of the current References: line could
*be reconstructed from that.

the symbolics newsreader (which will be released as soon as i
find some time to fix the remaining known bugs) implements this
with a variable that allows the user to specify how many references
be retained.  if the variable is nil, then behaviour is the same
as rn (except that lisp machines don't do "interp buffer overflow".)
this seems to me to be a very reasonable solution; i'm currently
defaulting this variable to 4.  if you need earlier ones, you can
look up the articles whose article-ids are available.
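
In Python rather than Lisp, such a setting might behave like the sketch
below.  Which references are retained is not stated above; the sketch
assumes the most recent ones are kept, and trim_references is a
hypothetical name.  None stands in for nil.

```python
from typing import List, Optional


def trim_references(ids: List[str], keep: Optional[int] = 4) -> List[str]:
    """Keep the last `keep` message-IDs; None keeps the whole chain."""
    if keep is None:
        return list(ids)
    # Guard against keep == 0, since ids[-0:] would return everything.
    return ids[-keep:] if keep > 0 else []
```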

*Another explanation might be in your counting program -- what does it do
*for malformed reference lines?  Many people incorrectly do a global
*replacement of ">" for some alternate character, to defeat the
*inews 50% included-text rule.  This messes up References: lines,
*which, while marginally less annoying than extraneous "inews fodder",
*is still a bother.  

basically, i think it is unreasonable to expect references: lines
be well-formed; i am planning on putting some hypertext like
stuff with article-ids in a future version of the symbolics newsreader,
and had already thought about this.  when they're well-formed, i'll
use 'em, otherwise i'll ignore them.  i don't think that any more
can reasonably be done -- we all know how hard it is to fix the other
guy's software.

richard
-- 
richard welty               welty@algol.crd.ge.com
518-387-6346, GE R&D, K1-5C39, Niskayuna, New York
    ``Every time I see an Alfa Romeo pass by,
         I raise my hat'' -- Henry Ford

tytso@athena.mit.edu (Theodore Y. Ts'o) (05/11/89)

You seem to be suggesting mandatory standards for what software people
run on machines.  And traditionally, that has gone over as well as
cold fusion within the MIT Physics department.  :-)

Now then, exactly how much do we lose if a references line is broken?
Well, that one article gets treated as a base note, and thus must
be killed off separately if the object is to remove all articles on a thread.

If the object is to follow a line of discussion, all a single broken
reference chain does is to break the chain into two parts.  OK, this
is annoying.  But that's all.  What's this about broken chains being
so horrible?
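
A small sketch of that effect, assuming the reader links each article to
the last ID on its References: line and treats an article with no
References: as a basenote.  The dictionary layout and the name thread_roots
are made up for illustration.

```python
def thread_roots(articles):
    """Map each Message-ID to the root reached by following parent links.

    `articles` maps a Message-ID to its list of References IDs, oldest first.
    """
    def root_of(msg_id, seen=()):
        refs = articles.get(msg_id, [])
        # Stop at a basenote, an unknown parent, or a reference loop.
        if not refs or msg_id in seen:
            return msg_id
        return root_of(refs[-1], seen + (msg_id,))

    return {msg_id: root_of(msg_id) for msg_id in articles}


# <c> was really a followup to <b>, but arrived without a References: line.
chain = {"<a>": [], "<b>": ["<a>"], "<c>": [], "<d>": ["<c>"]}
print(thread_roots(chain))
```

The one missing line leaves two threads, rooted at <a> and at <c>.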

Once the inet mailing list <--> news gateways are discounted, it
would seem that the number of non-compliant sites isn't that high.  I 
don't see this as such a horrible problem.

Then again, there's the problem that it is impossible (as in halting
problem impossible) to determine whether an article is ``supposed'' to
be a reply to another article or not.  That is, there is no real
difference between a reply to an article and a basenote which can be
detected by the News software.  In fact, someone might claim that
their article "deserved" to be considered a basenote instead of a
followup to this huge unmanageable chain, so it might be debatable even
using Human arbiters.

For example, if you implement and deploy your "enforcer" to send
horrible flaming mail to everyone, I predict it won't be long before
people who want to post articles via mail, or who want to protect
people's RN from blowing up in their faces, change "re:" to
"F*ck_Brad:"......  and thus a new outlet for people's imaginations
would be developed, much like the line-eater fodders.  :-)

>We are spending many megabytes storing and shipping these lines after all.

Let's see.... we're currently keeping 27,088 articles (expire time =
14 days), and at an average of 80 characters of reference information
per article, this works out to 2 meg of information, which might seem
like a lot, until you realize that the two weeks of news consumes 144
meg of disk space.  That is to say the references consume all of 1.38%
of my News storage.  Oh... pain.... agony..... :-)  Personally, I'm not
going to worry about it. 
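
A minimal sketch of the arithmetic above, in Python.  The article count,
the 80-character average and the 144 meg spool size are the figures quoted
in the post; treating a "meg" as 1,000,000 bytes is an assumption, and the
function name is made up for illustration.

```python
# Figures quoted in the post; MEG is an assumption (decimal megabyte).
ARTICLES = 27_088
AVG_REFERENCES_BYTES = 80
SPOOL_MEG = 144
MEG = 1_000_000


def references_overhead(articles, avg_bytes, spool_bytes):
    """Return (bytes of reference information, percent of the spool)."""
    ref_bytes = articles * avg_bytes
    return ref_bytes, 100.0 * ref_bytes / spool_bytes


ref_bytes, pct = references_overhead(ARTICLES, AVG_REFERENCES_BYTES, SPOOL_MEG * MEG)
print(f"{ref_bytes} bytes of references, {pct:.2f}% of the spool")
```

Depending on how the "2 meg" is rounded and what a meg is taken to be, the
result lands somewhere between roughly 1.4% and 1.5%, the same ballpark as
the 1.38% quoted.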

>	c) Posting software should detect when the References line has been
>	   edited out of a posting, and re-insert a shortened line that at
>	   least includes the ultimate parent (original article) and the
>	   immediate parent.

Define: ultimate parent --- I know what you mean, but define it in a
way that the News software can know about it.  *I'd* have a hell of a
time finding the ID of the ultimate parent of any given subject line.

One thing I'd like to know:  are you currently using a hypertext
reader?  If you aren't, how can you be sure it would be so horrible?
If you are, I suspect that some of your problems may be stemming from
a deficiency of the reader instead of inherent problems with the
status quo.
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Theodore Ts'o				bloom-beacon!mit-athena!tytso
3 Ames St., Cambridge, MA 02139		tytso@athena.mit.edu
   Everybody's playing the game, but nobody's rules are the same!

brad@looking.UUCP (Brad Templeton) (05/11/89)

I think most of us understand the reasons for these messages; however, it is
the results that are important.

I came upon this because I wrote a filter that could kill based on message-id.
The idea was, when you want to kill a topic, you record the Message-id at the
start of the References line.  This is supposed to be the ultimate parent
or original article.

Then I started reading.  A boring topic would come up.  I killed it in this
way.  Lo and behold, it would come up again.  I would kill it.  It came up
again, I would kill it again, and again, and again, and again, until I got
tired and went back to killing by the subject string.
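
The kill scheme described above can be sketched roughly as follows.  This
is not the actual filter, only an illustration under stated assumptions:
articles are plain dicts of header fields, and the function names and
Message-IDs are hypothetical.

```python
def root_message_id(headers):
    """First Message-ID on the References line, else the article's own ID."""
    refs = headers.get("References", "").split()
    return refs[0] if refs else headers["Message-ID"]


def filter_killed(articles, killed_roots):
    """Drop every article whose thread root has been killed."""
    return [a for a in articles if root_message_id(a) not in killed_roots]


# Hypothetical articles: a base note, a proper followup, and a followup
# posted without a References line.
articles = [
    {"Message-ID": "<1@a>", "Subject": "Boring topic"},
    {"Message-ID": "<2@b>", "Subject": "Re: Boring topic", "References": "<1@a>"},
    {"Message-ID": "<3@c>", "Subject": "Re: Boring topic"},
]

survivors = filter_killed(articles, killed_roots={"<1@a>"})
print([a["Message-ID"] for a in survivors])
```

The third article looks like a base note to the filter, so the topic comes
back, and every correct followup to it comes back with it.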

We can't treat these as base notes, and as long as even a tiny minority
of posters post followups in this way, any thread-follow or topic kill
scheme simply doesn't work.

We will have to rely forever on Subject lines, leaving us forever with the
curse of completely meaningless subject lines.   Ideally, the References
line should do all the work.   Each article should have a different subject
that says what that article is actually about.

Perhaps I was being a bit nasty with the word 'broken.'  Many sites
in my list normally follow-up correctly, but miss from time to time.  Many
problems are well known things like mail gateways.  But the reasons and
motives don't matter here.
-- 
Brad Templeton, Looking Glass Software Ltd.  --  Waterloo, Ontario 519/884-7473

chip@vector.Dallas.TX.US (Chip Rosenthal) (05/11/89)

In article <3229@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
>So I think we may have to do the following: ...

How about (e).  Substitute the title "orphaned response". :-(
-- 
Chip Rosenthal / chip@vector.Dallas.TX.US / Dallas Semiconductor / 214-450-5337

peter@ficc.uu.net (Peter da Silva) (05/11/89)

In article <3229@looking.UUCP>, brad@looking.UUCP (Brad Templeton) writes:
> Right now, and until the problem is fixed at 99.5% or more of sites, you
> can't follow a usenet discussion by references thread, the way you would like
> to.

This is objective nonsense. The vast majority of messages do have valid
references lines. You're throwing the baby out with the bathwater.
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.

Business: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Personal: ...!texbell!sugar!peter, peter@sugar.hackercorp.com.

karl@ddsw1.MCS.COM (Karl Denninger) (05/12/89)

In article <3229@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
>Many people have responded to tell me the reasons for bad References lines.
>Such reasons include:
>	Gatewayed mailing lists
>	mail aliases that post news
>	Manual deletion of the references line by the user
>
>I understand and already knew about these reasons.  They are not the point.
>We either have to fix almost all the problems, or the References line is
>useless and might as well be scrapped.

I disagree with this.  The References line IS very useful.  No, it's not
perfect, but it is useful.  Without it AKCSNet's linkage (threading)
wouldn't work at all, and neither would any "hypernews" project.

>We are spending many megabytes storing and shipping these lines after all.

As we do shipping your and my signature files.... which contain NO 
information that is necessary to the content of the discussion at hand......

That's a rhetorical and sarcastic response, yes, but the reason should be
obvious.  We can't even stop people from using 10-line signatures......

>Gateways of mailing lists & moderated groups can be tolerated to a degree,
>because they limit the problem to a single group.  Other problems can't
>be tolerated.  If that means shutting off mail->news gateways, that's what
>it means.

You'll never succeed in this...

......
>Any plans for a hypertext-like system have to be scrapped.  Such plans were
>the motive for News 3.0, for example.

They don't have to be scrapped, but people do have to realize that it's not
perfect.  We had a nice education job to do here when we implemented AKCSNet
linkage with Usenet, since we often end up with split discussions for just this
reason.  People didn't understand why threads were getting split up in the
gated groups.  We had to explain the reasons.  Now the users simply accept
that it isn't perfect, and CAN'T BE given the wide variety of the network
sites, imperfect software, and the anarchy that Usenet is.

They deal with it.  I deal with it.  I read many newsgroups in AKCSNet,
because it does attempt to organize things, and does a darn sight better job
of it than the Usenet software does.

So what if it's not perfect?  That's the nature of the beast!

>So I think we may have to do the following:

Define "we".  Certainly you don't think you're really going to impose this
on 10,000 administrators, do you?

>	a) No mail->news gateways.  You wanna post, get some posting
>	   software on your machine.  It's only fair if you're going to
>	   post to 10,000 machines.

Why not?  That immediately invalidates 100% of the mailing lists that end up
as newsgroups.

>	b) Otherwise get some very smart gateway software that can figure
>	   out a references line

That's going to be quite a trick from the mail gates that I have seen.  Now
they have to keep message id <> article ID pairs around ad nauseam.

>	c) Posting software should detect when the References line has been
>	   edited out of a posting, and re-insert a shortened line that at
>	   least includes the ultimate parent (original article) and the
>	   immediate parent.

This one I have a problem with.  If your hypernews software is so smart, it
should be able to figure out what's going on with just the ultimate parent.
After all, that is all that is REALLY required, as the rest is quite
irrelevant.  You see, you can't really rethread things accurately at a
distant site regardless of how hard you try -- the multiple paths that news
takes ensures that this is impossible.  (Ie: you can't guarantee that your
idea of the "order" of responses is the "right" one, in fact, there IS no
"right" ordering.  It's all a matter of perspective, and that changes
site-by-site.)

Besides, if the header has been edited out, how do you propose that the
posting software reconstruct it?  The very information it needs to perform
this task is _gone_.

>	d) Non-unix sites running new software must be required to adhere to
>	   the standard.

How do you propose to do this?  I doubt you'll get people to cut their 
neighbors for this kind of reason, and I doubt that you'll get most people 
to "fix" their software if it is broken either.  You see, all this costs 
money (ie: investment of some kind), and most of the net runs on volunteer 
time.  Thus, you and I and everyone else HAS to put up with the imperfect 
nature of this beast we call Usenet.

>Or give up.  Which would be sad.

On forcing people to "conform"?  That's reality in this format.  You can't 
force people to comply when there is no policing power, or any organizational 
standard.

Give it up.  You can apply diplomatic pressure, but I bet many people will
tell you to "stuff" when it comes down to brass tacks.  Usenet simply isn't
an entity you or anyone else can police.

--
Karl Denninger (karl@ddsw1.MCS.COM, <well-connected>!ddsw1!karl)
Public Access Data Line: [+1 312 566-8911], Voice: [+1 312 566-8910]
Macro Computer Solutions, Inc.		"Quality Solutions at a Fair Price"

lyndon@cs.AthabascaU.CA (Lyndon Nerenberg) (05/12/89)

In article <3229@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:

>We are spending many megabytes storing and shipping these lines after all.

This caught my interest. Just how many megabytes *are* we shipping in
References headers? Well, two hours (real, not CPU :-) later I have
some numbers from the news directory tree on atha:

	Total bytes in References: headers:	1,175,870
	Total bytes in /usr/spool/news:	       80,886,000 *
	% of disk used by References: headers:	     1.45

[ Hmm, I wonder what percentage of the email byte count is devoted
  to Received: headers :-) ]

Given these numbers I don't consider the References: line to be
a significant overhead, especially when you consider the functionality
it adds.

I do think it's naive to expect BITNET (and other non-UUCP/Internet)
sites to start conforming to the RFC (no smiley).
-- 
Lyndon Nerenberg   Computing Services   Athabasca University
{alberta,attvcr,ncc}!atha!lyndon  ||  lyndon@nexus.ca

lyndon@cs.AthabascaU.CA (Lyndon Nerenberg) (05/12/89)

In article <571@aurora.AthabascaU.CA> lyndon@auvax.UUCP (Lyndon Nerenberg) writes:
>	Total bytes in References: headers:	1,175,870
>	Total bytes in /usr/spool/news:	       80,886,000 *
>	% of disk used by References: headers:	     1.45

and then forgets to explain the '*' ...

The byte count for References was obtained by collecting the headers
from each article, and should be exact unless the script picked up some
stray tra* or .in* files in /usr/spool/news. The total online value
was determined by running a du -s against /usr/spool/news, which returned
80886 (1K) blocks. This number will be a bit on the high side, however
I consider both values to be more than accurate enough for the purposes
of the discussion.
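
A measurement along these lines could be sketched in Python as below.  It
is not the script that produced the numbers above; the function name is
hypothetical, and it sums file sizes rather than du blocks, so the total
comes out somewhat lower than what du -s reports.

```python
import os
import sys


def references_usage(spool_dir):
    """Return (bytes in References: headers, total bytes of all files)."""
    ref_bytes = 0
    total_bytes = 0
    for dirpath, _dirs, files in os.walk(spool_dir):
        for name in files:
            path = os.path.join(dirpath, name)
            total_bytes += os.path.getsize(path)
            with open(path, "rb") as f:
                in_refs = False
                for line in f:
                    if line in (b"\n", b"\r\n"):
                        break  # a blank line ends the header block
                    if line[:1] in (b" ", b"\t"):
                        # folded continuation of the previous header
                        if in_refs:
                            ref_bytes += len(line)
                        continue
                    in_refs = line.lower().startswith(b"references:")
                    if in_refs:
                        ref_bytes += len(line)
    return ref_bytes, total_bytes


if __name__ == "__main__" and len(sys.argv) > 1:
    refs, total = references_usage(sys.argv[1])
    if total:
        print(f"{refs} of {total} bytes ({100.0 * refs / total:.2f}%)")
```

Run with the spool directory as the only argument.  Like the original
count, it will also pick up any stray non-article files under the spool.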

-- 
Lyndon Nerenberg   Computing Services   Athabasca University
{alberta,attvcr,ncc}!atha!lyndon  ||  lyndon@nexus.ca

wcf@psuhcx.psu.edu (Bill Fenner) (05/12/89)

In article <649@marob.MASA.COM> cowan@marob.masa.com (John Cowan) writes:
|I am currently lobbying the Fidonet Technical Standards Association and the
|author of the Fidonet<->Usenet/uucp gateway to handle Message-Id: and
|In-Reply-To:/References: lines "correctly" by maintaining this information
|internally to Fidonet messages.  Don't expect too much too soon, though;
|still, at least there should soon at least >exist< a way of handling this
|problem within Fidonet, even if it will be a long time before Fidonet
|message-generators actually generate correct information.

There's no way you're going to get all of FidoNet to change to a USENET
capable BBS system.  I'm writing one (very slowly), but not everyone will
like it.  Not every software author will want to make the changes necessary
in their software, and not every sysop will want to run the new software.
It's possible, I suppose, to make USENET-capable a requirement before passing
the groups along; however, that's definitely not going to happen soon.

  Bill
-- 
   Bitnet: wcf@psuhcx.bitnet     Bill Fenner       | "Yesterday starts
  Internet: wcf@hcx.psu.edu                        |  tomorrow; tomorrow
 UUCP: {gatech,rutgers}!psuvax1!psuhcx!wcf         |  starts today"
Fido: Sysop at 1:129/87 (814/238 9633) \hogbbs!wcf |       -- Marillion

doug@xdos.UUCP (Doug Merritt) (05/12/89)

In article <May.9.14.38.03.1989.3044@NET.BIO.NET> lear@NET.BIO.NET (Eliot Lear) writes:
>Unfortunately, your statistics are probably flagging many messages
>that are gatewayed between mail and news.

Just what I was thinking. "ucbvax", for instance, has essentially
no users, yet was high on the list. But it is often used to gateway mail
into news (I've done that myself).

>Mail has no concept of a references line.  Maybe we should consider
>incorporating the mechanism into mail.

That would be nice, but there sure are a lot of mailers out there.
It would also help if, when the gatewaying takes place, and there's
no References: line, a system like ucbvax could arbitrarily add
one, picking some appropriate prior message based on matching
"Subject:  Re: " strings.

A total kludge, but it would maintain the desired consistency for
e.g. hypertext until all the mailers *do* add this line themselves.
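
A minimal sketch of that kludge, in Python.  Nothing here is existing
gateway code: the class, the dict-of-headers representation and the
Message-IDs are assumptions made for illustration.

```python
import re

_RE_PREFIX = re.compile(r"^\s*re:\s*", re.IGNORECASE)


def normalize_subject(subject):
    """Strip leading "Re:" prefixes, then fold case and whitespace."""
    prev = None
    while prev != subject:
        prev = subject
        subject = _RE_PREFIX.sub("", subject)
    return " ".join(subject.lower().split())


class SubjectIndex:
    """Remember the most recent Message-ID seen for each normalized subject."""

    def __init__(self):
        self._last_id = {}

    def add_references(self, headers):
        """Return a copy of headers, with a guessed References line if missing."""
        out = dict(headers)
        subject = headers.get("Subject", "")
        key = normalize_subject(subject)
        is_followup = _RE_PREFIX.match(subject) is not None
        if is_followup and "References" not in out and key in self._last_id:
            out["References"] = self._last_id[key]
        self._last_id[key] = out["Message-ID"]
        return out


idx = SubjectIndex()
first = idx.add_references({"Message-ID": "<1@list>", "Subject": "Zmodem vs g"})
reply = idx.add_references({"Message-ID": "<2@list>", "Subject": "Re: Zmodem vs g"})
print(reply["References"])
```

The guess is only as good as the Subject line: two unrelated discussions
with the same subject would be chained together.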
	Doug
-- 
Doug Merritt		{pyramid,apple}!xdos!doug	doug@xdos.com
Member, Crusaders for a Better Tomorrow		Professional Wildeyed Visionary

"Of course, I'm no rocket scientist" -- Randell Jesup, Capt. Boinger Corps

eric@vu-vlsi.Villanova.EDU (Eric Raymond) (05/13/89)

The conversation-following code in TMNN relies on the References line. So
these losing sites are making life harder for everybody using it. *Please*,
people, get up to date! Join my beta list or get a recent 2.11 or *something*.

Not passing References headers is anti-social. As TMNN becomes more widespread
(and particularly as the hypertext features get folded in) it will become much
more so. This is especially true for the big gateway sites -- ucbvax, are you
listening?

jbuck@epimass.EPI.COM (Joe Buck) (05/15/89)

In article <3229@looking.UUCP> brad@looking.UUCP (Brad Templeton) writes:
>Many people have responded to tell me the reasons for bad References lines.

>Gateways of mailing lists & moderated groups can be tolerated to a degree,
>because they limit the problem to a single group.  Other problems can't
>be tolerated.  If that means shutting off mail->news gateways, that's what
>it means.

Brad, have you gotten any feedback from the "notes" gang?  After all, notes
relies on the References: lines quite heavily at the news-notes gateway
sites to convert into the internal structures they use.  They seem to survive
quite well, and I don't hear them screaming for the news-notes gateways
to be shut down.

Obviously the "notes" system cannot exist, and all plans for it must be
scrapped. :-)

>So I think we may have to do the following:
>
>	a) No mail->news gateways.  You wanna post, get some posting
>	   software on your machine.  It's only fair if you're going to
>	   post to 10,000 machines.

Sorry, won't happen.

>	b) Otherwise get some very smart gateway software that can figure
>	   out a references line

A slight modification to the existing "recnews" program can help.  Recognize
the "In-Reply-To" header, and if a message-ID-like object appears on
the line, use it to generate a References header.
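
Roughly, in Python rather than as a patch to recnews (whose internals are
not shown here): the function name, the dict-of-headers representation and
the regular expression are all assumptions, and the sample header is made
up.

```python
import re

# Something of the form <local@domain>, with no spaces or brackets inside.
_MSGID = re.compile(r"<[^<>\s@]+@[^<>\s@]+>")


def references_from_in_reply_to(headers):
    """Return a copy of headers with References derived from In-Reply-To."""
    out = dict(headers)
    if "References" not in out:
        ids = _MSGID.findall(out.get("In-Reply-To", ""))
        if ids:
            out["References"] = " ".join(ids)
    return out


mail = {
    "Subject": "Re: broken followups",
    "In-Reply-To": "Your message of Tue, 9 May 89 14:38:03 PDT <1234@example.UUCP>",
}
print(references_from_in_reply_to(mail)["References"])
```

A mail address in angle brackets on the same line would also match the
pattern, so a real gateway would want to be more careful than this.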

>	c) Posting software should detect when the References line has been
>	   edited out of a posting, and re-insert a shortened line that at
>	   least includes the ultimate parent (original article) and the
>	   immediate parent.

There isn't a way to do that.  There isn't enough information present.

>	d) Non-unix sites running new software must be required to adhere to
>	   the standard.

It doesn't have anything to do with Unix or non-Unix.

>Or give up.  Which would be sad.

Or simply design any system with enough robustness to cope with the
inevitable errors that people will make and garbage that will be sent
as input.

-- 
-- Joe Buck	jbuck@epimass.epi.com, uunet!epimass.epi.com!jbuck

gore@eecs.nwu.edu (Jacob Gore) (05/16/89)

/ news.admin / jbuck@epimass.EPI.COM (Joe Buck) / May 15, 1989 /

>Brad, have you gotten any feedback from the "notes" gang?  After all, notes
>relies on the References: lines quite heavily at the news-notes gateway
>sites to convert into the internal structures they use.

I'm not aware of a notes implementation that uses References.  All the ones
I know use the "Subject:" line to build threads.

Jacob Gore				Gore@EECS.NWU.Edu
Northwestern Univ., EECS Dept.		{oddjob,chinet,att}!nucsrl!gore

wtm@bunker.UUCP (Bill McGarry) (05/16/89)

The high number for "bunker" and many of the fidonet sites listed
are the result of the misc.handicap newsgroup (the Handicap News).
The majority of the articles in this newsgroup originate from other
networks, such as the BITNET mailing list L-HCAP, the Fidonet
ABLED conference and 5 or 6 other Fidonet conferences, and a few
other small networks that are linked in manually.  It is just
not feasible to put "References" lines in these articles.

I would think that any software that is going to use the References
line would also require some sort of alternate method such as the
one that "rn" uses where the message thread is based upon the
Subject line.
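
One way such a fallback might look, as a Python sketch.  This is not how
"rn" does it; the function names, the dict-of-headers representation and
the Message-IDs are assumptions.  References is used when present, and the
Subject line only when it is missing.

```python
import re

_RE_PREFIX = re.compile(r"^(\s*re:\s*)+", re.IGNORECASE)


def normalize(subject):
    """Strip leading "Re:" prefixes, then fold case and whitespace."""
    return " ".join(_RE_PREFIX.sub("", subject).lower().split())


def assign_threads(articles):
    """Map each Message-ID to a thread root; articles are in arrival order."""
    by_subject = {}
    thread_of = {}
    for art in articles:
        subject = art.get("Subject", "")
        key = normalize(subject)
        refs = art.get("References", "").split()
        if refs:
            root = refs[0]              # trust the References line
        elif _RE_PREFIX.match(subject) and key in by_subject:
            root = by_subject[key]      # fall back to the Subject line
        else:
            root = art["Message-ID"]    # treat as a new base note
        by_subject.setdefault(key, root)
        thread_of[art["Message-ID"]] = root
    return thread_of


threads = assign_threads([
    {"Message-ID": "<1@a>", "Subject": "Handicap News"},
    {"Message-ID": "<2@b>", "Subject": "Re: Handicap News", "References": "<1@a>"},
    {"Message-ID": "<3@fido>", "Subject": "Re: Handicap News"},
])
print(threads)
```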


				Bill McGarry
				moderator of misc.handicap
				(203) 337-1518

     PATH:  {oliveb, philabs, decvax, yale}!bunker!wtm
     wtm@bunker.uucp    l-hcap@vm1.nodak.edu
     Handicap News BBS (141/420)  1-203-337-1607

chris@ciprico.mn.org (Chris Johnson) (05/16/89)

In article <1220@psuhcx.psu.edu> wcf@psuhcx (Bill Fenner) writes:
>There's no way you're going to get all of FidoNet to change to a USENET
>capable BBS system.  I'm writing one (very slowly), but not everyone will
>
>  Bill

To say nothing of the fact that USENET is a clumsy dinosaur of a network, too.
Many such people that run FidoNet (not to imply FidoNet is better or worse
than USENET) or other computer networks are likely to be of the mind that to
make such a change would be a big step backwards in technology and protocol.
The one big advantage that USENET has is sheer size.

-- 
Chris Johnson    chris@ciprico.mn.org	    ..uunet!rosevax!cipric!chris
 Ciprico, Inc., 2955 Xenium Ln., Plymouth, MN 55441  USA    612.559.2034

phil@ux1.cso.uiuc.edu (05/17/89)

> >Brad, have you gotten any feedback from the "notes" gang?  After all, notes
> >relies on the References: lines quite heavily at the news-notes gateway
> >sites to convert into the internal structures they use.
> 
> I'm not aware of a notes implementation that uses References.  All the ones
> I know use the "Subject:" line to build threads.

I guess that makes notes the more robust program in terms of being able to
handle what it gets.

--phil howard-- <phil@ux1.cso.uiuc.edu> (ux1 is a guilty host, too)

jeffery@jsheese.FIDONET.ORG (Jeff Sheese) (05/18/89)

In an article of <16 May 89 15:23:42 GMT>, chris@ciprico.mn.org (Chris  
Johnson) writes:

 >Many such people that run FidoNet (not to imply FidoNet is better or 
 >worse than USENET) or other computer networks are likely to be of the mind
 >that to make such a change would be a big step backwards in technology
 >and protocol.
 >The one big advantage that USENET has is sheer size.

Well, almost.  Most Fidonet mail transfer programs are able to use Zmodem file  
transfer which beats the heck out of G protocol.  Other than that the sheer
size and technical content of Usenet is VERY inviting.  I'm sure as interest  
grows the connectivity will grow as well.

--  
Jeff Sheese - via FidoNet node 1:109/116
UUCP: ...!netsys!jsheese!jeffery
ARPA: jeffery@jsheese.FIDONET.ORG
(I am sole owner.  My opinions represent my company.)
(Send all flames to null@jsheese.Fidonet.ORG)

doug@xdos.UUCP (Doug Merritt) (05/19/89)

In article <89.247289F5@jsheese.FIDONET.ORG> jeffery@jsheese.Fidonet.ORG writes:
>Well, almost.  Most Fidonet mail transfer programs are able to use Zmodem
>file transfer which beats the heck out of G protocol.

Why is it better?
	Doug
-- 
Doug Merritt		{pyramid,apple}!xdos!doug
Member, Crusaders for a Better Tomorrow		Professional Wildeyed Visionary

mju@mudos.ann-arbor.mi.us (Marc Unangst) (05/20/89)

In article <318@xdos.UUCP>, doug@xdos.UUCP (Doug Merritt) writes:
 >In article <89.247289F5@jsheese.FIDONET.ORG> jeffery@jsheese.Fidonet.ORG 
 >writes:
 >>Well, almost.  Most Fidonet mail transfer programs are able to use Zmodem
 >>file transfer which beats the heck out of G protocol.
 >
 >Why is it better?

1. It is a streaming protocol, so it doesn't wait for ACKs before sending
the next block.
2. It is an ACK-less protocol, so you don't waste your time confirming
to the sending machine that yes, we did receive that last block okay,
and that it can please send the next one.
3. It has dynamic block size adjustment.  If it's a clean line (few or no
errors), the block size can go up as far as 8K on a 9600 bps or higher
link.  If it's a dirty line, Zmodem will gradually decrease the block
size as it gets more and more errors, so there is less to retransmit in
the case of a line hit -- All the way down to 64 bytes on a very dirty
line.
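
The block size adjustment in point 3 can be illustrated with a toy Python
model.  This is not the actual Zmodem algorithm: the halve-on-error and
double-after-a-clean-run rule, the grow_after threshold and the function
names are assumptions.  Only the 64 byte floor and 8K ceiling come from the
description above.

```python
MIN_BLOCK = 64      # floor mentioned above
MAX_BLOCK = 8192    # ceiling mentioned above


def next_block_size(current, had_error, clean_streak, grow_after=4):
    """Return (new_size, new_clean_streak) after one block is sent."""
    if had_error:
        return max(MIN_BLOCK, current // 2), 0
    clean_streak += 1
    if clean_streak >= grow_after:
        return min(MAX_BLOCK, current * 2), 0
    return current, clean_streak


def simulate(events, start=1024):
    """events is a sequence of booleans, True meaning a line hit."""
    size, streak = start, 0
    sizes = []
    for had_error in events:
        size, streak = next_block_size(size, had_error, streak)
        sizes.append(size)
    return sizes


print(simulate([False] * 8))   # clean line: size grows
print(simulate([True] * 6))    # dirty line: size shrinks toward the floor
```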

Because 'g' always uses a fixed block size (64 bytes? 128 bytes?), it
wastes a lot of time with ACKs on a clean line.  Yes, you can increase
the block size, but that means that you have more to transmit when you
get a line hit.

'g' was a good protocol 10 years ago, when it was first introduced.  It's
getting too old, and too slow in today's world of Trailblazer modems.
Who'll be the first to support Zmodem in their uucico?

 >        Doug
 >-- 
 >Doug Merritt            {pyramid,apple}!xdos!doug
 >Member, Crusaders for a Better Tomorrow         Professional Wildeyed 
 >Visionary
--  
Marc Unangst
UUCP smart    : mju@mudos.ann-arbor.mi.us
UUCP dumb     : ...!uunet!sharkey!mudos!mju
UUCP dumb alt.: ...!{ames,rutgers}!mailrus!clip!mudos!mju
Internet      : mju@mudos.ann-arbor.mi.us

jeffery@jsheese.FIDONET.ORG (Jeff Sheese) (05/20/89)

In an article of <19 May 89 14:34:37 GMT>, doug@xdos.UUCP (Doug Merritt)  
writes:

 >In article <89.247289F5@jsheese.FIDONET.ORG> jeffery@jsheese.Fidonet.ORG 
 >writes:
 >>Well, almost.  Most Fidonet mail transfer programs are able to use
 >>Zmodem file transfer which beats the heck out of G protocol.
 >
 >Why is it better?
 >        Doug

Maybe I'm mis-judging it from the implementation that I'm using, but at 1200  
baud I average 92 cps and at 2400 baud I average 190cps.  This is whether the  
connections are local or via PC Pursuit.  Of course I can't expect much more  
using a packet size of 64 bytes.

It appears that with G, each packet requires an ACK regardless of the window  
size.  With zmodem the transmission of data from origin to recipient is  
continuous, where the recipient tells the sender if any problems occur in  
transmission.  Packet size starts at 1024 bytes at 1200 baud (2048 at 2400  
baud) and adjust themselves according to line conditions.  It even has error  
recovery on partial transfers.

Now I'm not saying one technology is better than another.  Since Fidonet  
software is mainly restricted to one architecture (MSDOS and IBM clones) it  
has been able to progress to a very secure and reliable technology over the  
years.  The code is not designed to be as portable among different hardware  
and software systems as Usenet.  Maybe that makes it an unfair comparison.

--  
Jeff Sheese - via FidoNet node 1:109/116
UUCP: ...!netsys!jsheese!jeffery
ARPA: jeffery@jsheese.FIDONET.ORG
(I am sole owner.  My opinions represent my company.)
(Send all flames to null@jsheese.Fidonet.ORG)

scott@clmqt.UUCP (Scott Reynolds) (05/20/89)

From article <99.2474C600@jsheese.FIDONET.ORG>, by jeffery@jsheese.FIDONET.ORG (Jeff Sheese):
>Maybe I'm mis-judging it from the implementation that I'm using, but at 1200
>baud I average 92 cps and at 2400 baud I average 190cps.  This is whether the
>connections are local or via PC Pursuit.  Of course I can't expect much more
>using a packet size of 64 bytes.
>
>It appears that with G, each packet requires an ACK regardless of the window
>size.  With zmodem the transmission of data from origin to recipient is
>continuous, where the recipient tells the sender if any problems occur in

My xferstats file shows some interesting figures.  On a direct dial 2400
baud connection, average transfer rate (for files of 5K or bigger)
is 193 cps.  On a clean line, G will actually do up to about 222 cps,
believe it or not!  Compare this to my best ZModem times of 231 cps and not
one character higher, ever.  I average 226 cps on ZModem transfers.

On a 1200 baud connection through a packet switched network to an Internet
gateway to the host, I get 93 cps on average.  Best conditions yield up to
about 109 cps, but that happens extremely rarely.
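
To put these cps figures in perspective, here is a small Python sketch that
turns them into a fraction of the raw line rate.  It assumes 10 bits on the
wire per character (8 data bits plus start and stop bits, no parity) and
takes "baud" to mean bits per second; the labels are one reading of which
protocol each figure belongs to.

```python
BITS_PER_CHAR = 10  # assumption: 8 data bits plus start and stop bits, no parity


def efficiency(cps, bps, bits_per_char=BITS_PER_CHAR):
    """Percent of the raw line rate delivered as payload characters."""
    return 100.0 * cps * bits_per_char / bps


# (label, line speed in bits/s, measured characters/s), figures from the posts
MEASUREMENTS = [
    ("uucp g, 2400 average", 2400, 193),
    ("uucp g, 2400 best", 2400, 222),
    ("ZModem, 2400 average", 2400, 226),
    ("ZModem, 2400 best", 2400, 231),
    ("uucp, 1200 via packet net, average", 1200, 93),
    ("uucp, 1200 via packet net, best", 1200, 109),
]

for label, bps, cps in MEASUREMENTS:
    print(f"{label:36s} {cps:4d} cps  {efficiency(cps, bps):5.1f}% of line rate")
```

Under that assumption a 2400 bps line tops out at 240 cps, so the best
ZModem figure is already within a few percent of the ceiling.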

>transmission.  Packet size starts at 1024 bytes at 1200 baud (2048 at 2400
>baud) and adjust themselves according to line conditions.  It even has error
>recovery on partial transfers.

Not sure what version of ZModem you are using -- DSZ for MS-DOS?  My
implementation, the rz/sz programs, starts at 512 and 1024 for 1200 and 2400
baud, respectively.  However, as long as the transfer runs clean, the block
size adjusts, and it's not hard to achieve a 4K block with a fair sized
file.  Block size adjusts down to 16 (!) bytes on my implementation in bad
conditions.

>Now I'm not saying one technology is better than another.

I would.  ZModem uses 32-bit CRC checking, has error recovery (as mentioned
already), exceptionally quick error handling, and myriads of other little
features that really make it a solid protocol.  There are a few protocols
that edge it out in speed, such as SEAlink, but they are in general far
less reliable, at least in theory.

G is indeed streaming (watch it go when you are connected through a packet
network if you don't believe me) but the one thing that really slows it
down is poor error recovery.  If in mid transfer a block is short a few
characters, the receiver will time out after a fairly long period of time.
I say fairly long because I have watched the TD/RD indicators on the modem 
stay unlit for periods of 10-20 seconds; I would think that in a direct
dial connection a more reasonable timeout would be 3-5 seconds.  Then
again I'm not a protocol designer so I shouldn't second guess, should I?
:-)

>...  Maybe that makes it an unfair comparison.

It's a fair comparison -- rz/sz run on most USG/Sys V, BSD, XENIX, and
other mostly compatible *NIX systems, as well as versions for VMS.  If
your site doesn't have the programs you hardly know what you're missing.
There are even hints that you can tie rz into news in the documentation,
but as of yet I haven't attempted it.

I believe the package is archived, but if you're having trouble finding it
and don't mind paying a little bit of long distance, the number you can
find it at is  +1 503 621 3746  2400 baud 8,1,n.
-- 
Scott Reynolds      scott@clmqt.UUCP    ..rutgers!sharkey!clmqt!scott
          Enterprise Information System   Marquette, MI