[news.admin] Recently observed nonconforming Message-IDs

eggert@twinsun.com (Paul Eggert) (06/12/91)

Commonly used news transport software accepts many Message-IDs that do
not conform to the Internet RFCs (1036, 822, and 1123).  Of course,
news software need not, and probably should not, reject all articles
containing nonconforming Message-IDs.  Still, such IDs are signs that
something is misconfigured or otherwise confused, and some of the
problems may be serious enough to warrant concern.  Typically the
problem lies at the originating site.

I looked at a sample news history file containing 63381 recent Usenet
Message-IDs, and found 388 nonconforming Message-IDs.  Here is a table
of reasons for lack of conformance, together with the number of
corresponding Message-IDs.

#articles   conformance problem, and example nonconforming Message-ID

    217   The local-part may not contain an unquoted `:'.
          <17169:Jun1122:04:5791@kramden.acf.nyu.edu>

     47   A domain may not end with `.'.
          <25590171@hpcvra.cv.hp.com.>

     42   A domain may not begin with `.'.
          <9106052317.AA01532@.devon.prepnet.com>

     36   A Message-ID cannot contain two unquoted `@'s.
          <1991Jun6.165406.70@%boot.decnet@edwards-tems.af.mil>

     32   The local-part may not end with `.'.
          <k#mh7z.@rpi.edu>

     22   The local-part may not begin with `.'.
          <.+9+WD_@cs.widener.edu>

     17   Two adjacent unquoted `.'s may not appear in a Message-ID.
          <9105211421.AA10695@mailserv.zdv.uni-tuebingen..de>

     12   The domain may not be empty.
          <1991Jun5.192745.1945@>

      2   The local-part may not contain an unquoted `]'.
          <]ocdj2.cj7@cat.de>

      2   Quotes must match.
          <"<9105141856.AA26881@cnmus.cnm.us.es>

The article counts sum to 429 rather than 388, because some Message-IDs
fail for more than one reason.  The list of reasons may not be exhaustive,
because I stopped looking for reasons once I discovered a reason for
every nonconforming Message-ID.  The news history file examined was
twinsun.com's news history file as of 1991/06/12 03:15 GMT.  This host
runs C News 24-Mar-1991, subscribes to just the technical Usenet
newsgroups, and expires history after 30 days.
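
The overlap between reasons can be illustrated with a rough sketch that
tests each ID against a few of the simpler checks independently.  The
three patterns are simplified approximations (they ignore quoted strings)
and are not the ones used to build the table; the function name
count_reason and the LC_ALL=C setting are assumptions made for the
example.  The third sample ID is taken from the companion listing.

```shell
#!/bin/sh
# Sketch: count how many sample IDs trip each of a few simple checks.
# An ID can match more than one pattern, so the counts can add up to
# more than the number of IDs.  The patterns ignore quoted strings,
# an assumption that holds for this sample but not for arbitrary input.
count_reason() {
	# $1 = extended regular expression; IDs are read from standard input.
	LC_ALL=C grep -E -c -e "$1"
}

ids='<9105211421.AA10695@mailserv.zdv.uni-tuebingen..de>
<25590171@hpcvra.cv.hp.com.>
<9105242334.AA02258@.nextserver.cs.stthomas.edu.cs.stthomas.edu..>'

# Domain begins with `.', domain ends with `.', two adjacent `.'s.
for re in '@\.' '\.>' '\.\.'
do
	printf '%s\t%s\n' "$(printf '%s\n' "$ids" | count_reason "$re")" "$re"
done
```

On this sample the three checks report 1, 2, and 2 matches, five in all
for only three IDs.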

You can look for nonconforming Message-IDs on your host by running the
following shell script with your news history file as standard input;
it will copy nonconforming lines to standard output.  It's been tested
only under C News.  Please make sure your egrep is up to the task; I
used GNU egrep.


	#!/bin/sh

	# These definitions are taken from RFC 822, except (as per RFC 1036)
	# white space and nonprinting characters are excluded.
	dtext='[!-Z^-~]'
	qtext='[]-~!#-[]'
	quoted_pair='\\[!-~]'
	quoted_string="\"($qtext|$quoted_pair)*\""
	atom="[-!#-'*-+/-9=?A-Z^-~]+"
	word="($atom|$quoted_string)"
	domain_literal="\\[($dtext|$quoted_pair)*\\]"
	domain_ref="$atom"
	sub_domain="($domain_ref|$domain_literal)"
	domain="$sub_domain(\\.$sub_domain)*"
	local_part="$word(\\.$word)*"
	addr_spec="$local_part@$domain"
	msg_id="<$addr_spec>"

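	# Print lines that do not start with a conforming Message-ID.  The pattern
	# is anchored only at the start because other history fields follow the ID.
	egrep -v "^$msg_id"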
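
As a quick sanity check, the same patterns can be tried on a handful of
IDs before running them over a whole history file.  This is only a
sketch: the function name nonconforming is made up for illustration,
`grep -E' stands in for egrep, and LC_ALL=C is assumed so that bracket
ranges are taken in ASCII order.

```shell
#!/bin/sh
# Sketch only: the patterns from the script above, wrapped in a function.
# `nonconforming' is an illustrative name; LC_ALL=C is an assumption that
# keeps bracket ranges such as [!-Z] in plain ASCII order.
nonconforming() {
	dtext='[!-Z^-~]'
	qtext='[]-~!#-[]'
	quoted_pair='\\[!-~]'
	quoted_string="\"($qtext|$quoted_pair)*\""
	atom="[-!#-'*-+/-9=?A-Z^-~]+"
	word="($atom|$quoted_string)"
	domain_literal="\\[($dtext|$quoted_pair)*\\]"
	domain_ref="$atom"
	sub_domain="($domain_ref|$domain_literal)"
	domain="$sub_domain(\\.$sub_domain)*"
	local_part="$word(\\.$word)*"
	addr_spec="$local_part@$domain"
	msg_id="<$addr_spec>"
	# Copy lines that do NOT begin with a conforming Message-ID.
	LC_ALL=C grep -E -v "^$msg_id"
}

# One conforming ID followed by two nonconforming ones from the table.
printf '%s\n' \
	'<1991Jun12.071111.29652@twinsun.com>' \
	'<25590171@hpcvra.cv.hp.com.>' \
	'<1991Jun5.192745.1945@>' |
nonconforming
```

Only the second and third IDs should come out, since the first one
conforms.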

eggert@twinsun.com (Paul Eggert) (06/12/91)

Here is a list of recently observed nonconforming Usenet Message-IDs.
A companion posting describes how this list was derived.  The list
starts with a list of domains that posted nonconforming IDs, together
with a count of nonconforming postings from each domain.  At the end of
this message is a complete listing of host names and nonconforming
Message-IDs.

#posts	domain
----	--------
  86	hpcvbbs.UUCP
  48	kramden.acf.nyu.edu
  33	ab20.larc.nasa.gov
  30	lehigh.bitnet
  20	rpi.edu
  17	hpcvra.cv.hp.com.
  12	...!asuvax!gtephx
  12	(empty)
   8	scrumpy@.bnr.ca
   8	cluster@ukc.ac.uk
   7	cs.widener.edu
   7	%boot.decnet@edwards-tems.af.mil
   5	braille.uwo.ca
   5	.next1.osf.org.
   4	xds13.ferranti.com
   4	news%jho@iex.com
   4	feki.toppoint.de
   4	MHS
   4	.vaxkiller.agi.org.
   3	uunet.UU.NET
   3	idunno.Princeton.EDU
   3	dutepp13
   3	cymbal.reasoning.com.
   3	cognos.uucp@uunet.uu.net
   3	UK.AC.SALF.C
   3	EMAS-A
   3	.nextoid.
   3	.econ.vu.nl
   3	.devon.prepnet.com
   2	jev.
   2	engin.umich.edu
   2	dutepp1
   2	dutepp0
   2	cnmus.cnm.us.es
   2	cat.de
   1	vax1.rz.uni-regensburg.dbp.de
   1	vax1.informatik.fh-regensburg.dbp.de
   1	usenet@kadsma
   1	ukulele.reasoning.com.
   1	train@bcm.tmc.edu
   1	splash.
   1	soleil.iarc.mco.edu.
   1	seneca.Sed.Novell.COM.
   1	sct60a.sunyct.edu
   1	pn9050.cr.usgs.gov.
   1	obelix.
   1	nfhsn1.rus.uni-stuttgart.de.
   1	mute.ruhr.de
   1	mandata@uunet.uu.net
   1	mailserv.zdv.uni-tuebingen..de
   1	goofy.llnl.gov.
   1	elsinore.
   1	eana.f3.gmd.dbp.de
   1	e8@e8ha.eas.asu.edu
   1	domina.cs.tu-berlin.de
   1	apollo.
   1	UK.AC.MCC.CMS
   1	NEFX4.91Jun7101949@nefx4.ncsuvx.ncsu.edu
   1	.uucp
   1	.share.
   1	.nextserver.cs.stthomas.edu.cs.stthomas.edu..
   1	.halfdome.uucp.
   1	%rtsmv1.decnet@edwards-vax.af.mil
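
A tally like the one above can be produced roughly as follows.  This is
a sketch under stated assumptions, not necessarily how the table was
made: each input line is assumed to begin with a Message-ID in angle
brackets, everything after the first @ is treated as the domain, and the
function name domain_counts is hypothetical.

```shell
#!/bin/sh
# Sketch: tally nonconforming Message-IDs by domain.
# Assumes each line starts with <local@domain> and that the domain is
# whatever follows the first @, which matches the convention of this
# report but is a simplification (quoted local-parts are not handled).
domain_counts() {
	sed -e 's/^<[^@>]*@\([^>]*\)>.*$/\1/' -e 's/^$/(empty)/' |
	LC_ALL=C sort | uniq -c | LC_ALL=C sort -rn |
	awk '{ printf "%4d\t%s\n", $1, $2 }'
}

printf '%s\n' \
	'<25590170@hpcvra.cv.hp.com.>' \
	'<25590171@hpcvra.cv.hp.com.>' \
	'<1991Jun5.192745.1945@>' |
domain_counts
```

For the three sample lines this reports 2 for hpcvra.cv.hp.com. and 1 for
(empty).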



Here is the complete list of nonconforming Message-IDs, sorted by
domain and then by local-part.  To save space, the list's entries are
of the form

	domain
		local1
		local2
		...

which stand for the Message-IDs <local1@domain>, <local2@domain>, ...
The @ that separates local-part from domain is elided in this brief
report, so any domain entry below that itself contains an @ stands for
one or more nonconforming Message-IDs containing two @s.  For example,

	%boot.decnet@edwards-tems.af.mil
		1991May17.121427.61

stands for <1991May17.121427.61@%boot.decnet@edwards-tems.af.mil>,
which has two @s and therefore does not conform.
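
The abbreviated form can be expanded back into full Message-IDs
mechanically.  The following is a minimal sketch; it assumes that
local-parts are indented with a tab, that domains start in column one,
and that the placeholder (empty) stands for an empty domain.  The
function name expand_ids is made up for the example.

```shell
#!/bin/sh
# Sketch: turn "domain / <tab>local" entries back into <local@domain>.
# Assumptions: locals are tab-indented, domains start in column one,
# and "(empty)" marks an empty domain.
expand_ids() {
	awk '
		/^[ \t]*$/ { next }                       # skip blank lines
		/^\t/      { sub(/^\t+/, "")              # strip the indent
		             printf "<%s@%s>\n", $0, domain
		             next }
		           { domain = ($0 == "(empty)") ? "" : $0 }
	'
}

printf '%s\n\t%s\n\t%s\n' \
	'%boot.decnet@edwards-tems.af.mil' \
	'1991May17.121427.61' \
	'1991Jun6.165406.70' |
expand_ids
```

This prints <1991May17.121427.61@%boot.decnet@edwards-tems.af.mil> and
<1991Jun6.165406.70@%boot.decnet@edwards-tems.af.mil>, one per line.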


(empty)
	9105100407.AA01243
	9105140937.AA22725
	9105150227.AA00828
	9105181320.AA25591
	1991May17.155656.367
	9105201309.AA14790
	9105220715.AA22652
	9105241604.AA08437
	SHAWN.91May28142234
	9105280148.AA13156
	1991Jun5.192745.1945
	9106062125.AA12728

%boot.decnet@edwards-tems.af.mil
	1991May17.121427.61
	1991May31.092959.64
	1991May31.122359.65
	1991Jun5.141704.67
	1991Jun6.165220.69
	1991Jun6.165848.71
	1991Jun6.165406.70

%rtsmv1.decnet@edwards-vax.af.mil
	1991Jun11.123907.1

...!asuvax!gtephx
	1991May14.183620.7561
	1991May15.162125.6532
	1991May15.162310.6587
	1991May15.200222.12702
	1991May20.200345.26003
	1991May24.172753.14239
	1991May28.202713.1040
	1991May28.202814.1095
	1991May29.195754.25342
	1991Jun7.225601.27271
	1991Jun7.232904.804
	1991Jun10.234526.335

.devon.prepnet.com
	9105171830.AA06830
	9105241623.AA11642
	9106052317.AA01532

.econ.vu.nl
	769
	770
	771

.halfdome.uucp.
	9106011825.AA01359

.next1.osf.org.
	9105282023.AA06828
	9105300227.AA11055
	9106031655.AA03418
	9106052145.AA11115
	9106061843.AA14156

.nextoid.
	9105220158.AA01321
	9105231413.AA01980
	9105251432.AA00333

.nextserver.cs.stthomas.edu.cs.stthomas.edu..
	9105242334.AA02258

.share.
	9105301224.AA29978

.uucp
	1991Jun2.002543.22989

.vaxkiller.agi.org.
	9105142307.AA18397
	9105171735.AA06835
	9105241658.AA00346
	9105301825.AA00271

EMAS-A
	16.May.91..08:57:28.bst..060957
	28.May.91..17:32:28.bst..060196
	31.May.91..11:23:12.bst..060024

MHS
	22*.S=rohrer.OU=cvax.O=psi.PRMD=SWITCH.ADMD=ARCOM.C=CH.
	1227*.G=Geir.S=Thorud.OU=forskning.O=teledir.PRMD=tele.ADMD=telemax.C=no.
	169*.S=zogg.OU=prz.O=ntb.PRMD=SWITCH.ADMD=ARCOM.C=CH.
	910529114115*.G=Allan.S=Cargille.OU=cs.O=uw-madison.PRMD=xnren.C=us.

NEFX4.91Jun7101949@nefx4.ncsuvx.ncsu.edu
	HOLMES

UK.AC.MCC.CMS
	05.Jun.91.12:26:12.BST.MBBGPBA

UK.AC.SALF.C
	29.May.91.20:25:16.A106EC
	29.May.91.20:49:08.A106FA
	31.May.91.16:46:40.A1018B

ab20.larc.nasa.gov
	comp.binaries.amiga:v91i154
	comp.binaries.amiga:v91i155
	comp.binaries.amiga:v91i156
	comp.binaries.amiga:v91i157
	comp.binaries.amiga:v91i158
	comp.binaries.amiga:v91i159
	comp.binaries.amiga:v91i160
	comp.binaries.amiga:v91i161
	comp.binaries.amiga:v91i162
	comp.sources.amiga:v91i104
	comp.sources.amiga:v91i105
	comp.sources.amiga:v91i106
	comp.sources.amiga:v91i107
	comp.sources.amiga:v91i114
	comp.sources.amiga:v91i108
	comp.sources.amiga:v91i109
	comp.sources.amiga:v91i110
	comp.sources.amiga:v91i111
	comp.sources.amiga:v91i112
	comp.sources.amiga:v91i113
	comp.binaries.amiga:v91i163
	comp.binaries.amiga:v91i164
	comp.sources.amiga:v91i115
	comp.sources.amiga:v91i116
	comp.sources.amiga:v91i117
	comp.binaries.amiga:v91i165
	comp.binaries.amiga:v91i166
	comp.binaries.amiga:v91i167
	comp.binaries.amiga:v91i168
	comp.binaries.amiga:v91i169
	comp.binaries.amiga:v91i170
	comp.binaries.amiga:v91i171
	comp.binaries.amiga:v91i172

apollo.
	9106041014.AA15603

braille.uwo.ca
	.674593550
	.675091650
	.675501501
	.676306327
	.676650084

cat.de
	$7]7j2.i&
	]ocdj2.cj7

cluster@ukc.ac.uk
	22448.2837941c
	22449.283795eb
	22454.28383c19
	22458.2838e31f
	22466.283a426b
	22484.28422990
	22494.2843795b
	22552.284f8019

cnmus.cnm.us.es
	"<9105132259.AA23725
	"<9105141856.AA26881

cognos.uucp@uunet.uu.net
	GARYM.91May30115448
	GARYM.91May31123148
	GARYM.91Jun6093906

cs.widener.edu
	RWS+J5.
	N!Y+N4.
	2!Y+66.
	4!Y+P8.
	.P1+4M.
	AV6+T+.
	.+9+WD_

cymbal.reasoning.com.
	9105232056.AA25432
	9105302308.AA18221
	9106020028.AA07159

domina.cs.tu-berlin.de
	336:

dutepp0
	.674293522
	.675935210

dutepp13
	.674298839
	.674326830
	.674999288

dutepp1
	.675678144
	.676108875

e8@e8ha.eas.asu.edu
	3478

eana.f3.gmd.dbp.de
	118:licht

elsinore.
	9105171259.AA16646

engin.umich.edu
	RL0+Y0.
	CW!-KM.

feki.toppoint.de
	91.130.08:21:20
	91.142.20:51:31
	91.146.12:14:27
	91.152.16:50:49

goofy.llnl.gov.
	9105242042.AA13310

hpcvbbs.UUCP
	282d2c85:3069.1comp.sys.handhelds;1
	282da161:3053.3comp.sys.handhelds;1
	282da1db:3074comp.sys.handhelds
	282daa63:3040.2comp.sys.handhelds;1
	282dad1a:3045.11comp.sys.handhelds;1
	282dbbf8:3077comp.sys.handhelds
	282eb2d5:3069.2comp.sys.handhelds;1
	282ec8d2:3076.2comp.sys.handhelds;1
	282ed46c:3079comp.sys.handhelds
	282efa57:3081comp.sys.handhelds
	282ffadd:3091comp.sys.handhelds
	283005ca:2766.1comp.sys.handhelds;1
	2830416e:3095comp.sys.handhelds
	28305a8d:3076.5comp.sys.handhelds;1
	2830a2ff:3095.1comp.sys.handhelds;1
	2830a398:3102comp.sys.handhelds
	283142be:3101.1comp.sys.handhelds;1
	28319bd0:3079.2comp.sys.handhelds;1
	283191f3:3106comp.sys.handhelds
	28319e55:3105.1comp.sys.handhelds;1
	2831a475:3101.2comp.sys.handhelds;1
	2831cb27:3101.3comp.sys.handhelds;1
	28322248:3095.2comp.sys.handhelds;1
	2832b42c:3121comp.sys.handhelds
	2832f360:3101.9comp.sys.handhelds;1
	2833207d:2897.1comp.sys.handhelds;1
	2833223b:2811.1comp.sys.handhelds;1
	283322a6:2812.1comp.sys.handhelds;1
	28334a11:3143.1comp.sys.handhelds;1
	28336e38:2668.1comp.sys.handhelds;1
	2833e7cc:3095.3comp.sys.handhelds;1
	283607b6:3027.4comp.sys.handhelds;1
	28342b60:3149comp.sys.handhelds
	283483b7:3076.7comp.sys.handhelds;1
	2835a558:3102.1comp.sys.handhelds;1
	2836056d:3004.1comp.sys.handhelds;1
	2837572e:3150comp.sys.handhelds
	28375d18:3102.2comp.sys.handhelds;1
	28386d51:2423.5comp.sys.handhelds;1
	2838d9fc:3163.1comp.sys.handhelds;1
	2839891d:3015.1comp.sys.handhelds;1
	283a0232:3195.1comp.sys.handhelds;1
	283a319e:3220comp.sys.handhelds
	283a3242:3220.1comp.sys.handhelds;1
	283a7f78:3220.2comp.sys.handhelds;1
	283ab4b6:3187.1comp.sys.handhelds;1
	283b27b5:2972.1comp.sys.handhelds;1
	283b6022:3227.1comp.sys.handhelds;1
	283b60a9:3214.1comp.sys.handhelds;1
	283bcf39:3221.1comp.sys.handhelds;1
	283c3634:3224.7comp.sys.handhelds;1
	283ca390:3248comp.sys.handhelds
	283f4202:3273comp.sys.handhelds
	2841d19f:3150.1comp.sys.handhelds;1
	2841e299:3283comp.sys.handhelds
	2842f4fc:3286.1comp.sys.handhelds;1
	284354f9:3187.2comp.sys.handhelds;1
	28449127:3255.2comp.sys.handhelds;1
	2844927c:3306comp.sys.handhelds
	2846746b:3319.1comp.sys.handhelds;1
	2846731a:3313.1comp.sys.handhelds;1
	2846d68f:3233.1comp.sys.handhelds;1
	28473da6:3329comp.sys.handhelds
	2847cff1:3330comp.sys.handhelds
	2847d178:3331comp.sys.handhelds
	2847ddad:3325.1comp.sys.handhelds;1
	284813da:3337comp.sys.handhelds
	28486743:3340comp.sys.handhelds
	28497f99:3342comp.sys.handhelds
	2849ea48:3329.2comp.sys.handhelds;1
	2849ee27:3343comp.sys.handhelds
	284aa075:3342.1comp.sys.handhelds;1
	284b2af4:3349.1comp.sys.handhelds;1
	284bb02a:3356comp.sys.handhelds
	284c8120:3354.1comp.sys.handhelds;1
	284c81c6:3361comp.sys.handhelds
	284cf5aa:3360.1comp.sys.handhelds;1
	284d4e10:3371.1comp.sys.handhelds;1
	284d4cd2:3368.1comp.sys.handhelds;1
	284dc9fc:3373comp.sys.handhelds
	284dca48:3374comp.sys.handhelds
	284dca79:3375comp.sys.handhelds
	284f1f60:3374.2comp.sys.handhelds;1
	2850bd41:3374.4comp.sys.handhelds;1
	285159d4:3003.1comp.sys.handhelds;1
	285314ec:3374.6comp.sys.handhelds;1

hpcvra.cv.hp.com.
	25590162
	25590163
	40790009
	25590164
	25590165
	25690005
	31600018
	25590166
	25590167
	25590168
	25590169
	31570005
	40730001
	28000001
	49550003
	25590170
	25590171

idunno.Princeton.EDU
	azhMklLK7RHF.
	az.gM8fzw6Wm.
	azPmJYC3NWq1.

jev.
	9105141929.AA02157
	9106101844.AA00906

kramden.acf.nyu.edu
	10581:May1315:01:2891
	10984:May1315:52:0591
	14077:May1321:14:2691
	25239:May1416:21:3591
	25361:May1416:31:0791
	25833:May1416:43:4291
	26085:May1416:52:3491
	26227:May1416:59:4091
	26899:May1417:39:1691
	1775:May1420:06:1291
	1905:May1420:11:4991
	7491:May1502:05:3291
	8834:May1504:18:2391
	9150:May1504:50:0591
	11900:May1518:21:5491
	12130:May1518:43:0491
	14021:May1521:56:2291
	14213:May1522:13:2291
	14365:May1522:32:5891
	14499:May1522:45:0491
	13550:May1819:18:3891
	13658:May1819:26:2491
	23781:May1901:08:3591
	23893:May1901:19:2191
	23997:May1901:27:0891
	25666:May1904:50:0491
	29167:May1918:13:2991
	3690:May1921:22:5191
	4703:May2000:47:5291
	10429:May2020:10:1291
	10670:May2020:25:2891
	18457:May2119:19:3891
	19060:May2120:58:0591
	19233:May2121:28:2291
	19427:May2121:58:5991
	19515:May2122:03:3191
	19659:May2122:17:2991
	4547:May2223:16:3391
	11942:May2521:33:5791
	12098:May2521:41:2791
	12177:May2521:45:5991
	24071:Jun214:03:1191
	24305:Jun214:50:4891
	1965:Jun309:17:2191
	2088:Jun309:46:2191
	4172:Jun815:28:1691
	14192:Jun923:16:0791
	17169:Jun1122:04:5791

lehigh.bitnet
	11059119:51:38RAD2
	11059119:37:08RAD2
	11059119:41:08RAD2
	11059119:48:24RAD2
	12059120:31:44HL00
	14059105:30:40A039
	14059108:50:50JV04
	14059121:42:23HL00
	14059121:55:36HL00
	16059110:40:10JV04
	16059111:59:05JV04
	17059115:10:52JV04
	19059110:01:46RP04
	20059109:05:49JV04
	20059118:41:50GB03
	21059115:29:49JV04
	22059105:31:48A039
	22059111:18:28JV04
	23059109:28:37JV04
	23059115:30:02JV04
	23059117:28:22EJK0
	24059119:34:24DEB1
	03069121:01:21BG00
	04069111:25:32JV04
	05069109:42:52JV04
	05069121:02:14AWY0
	05069122:17:50GB03
	07069113:00:09EJK0
	10069111:09:40LAL5
	11069117:20:17EJK0

mailserv.zdv.uni-tuebingen..de
	9105211421.AA10695

mandata@uunet.uu.net
	1991May30.134639.586

mute.ruhr.de
	91.131.19:21:14

news%jho@iex.com
	1991May21.041224.3996
	1991May23.145916.5268
	1991May30.144855.27644
	1991Jun1.060506.20532

nfhsn1.rus.uni-stuttgart.de.
	9105211518.AA06420

obelix.
	9105291547.AA01284

pn9050.cr.usgs.gov.
	9105151237.AA28291

rpi.edu
	=+_hsq.
	2-_hj1.
	7p=h!!.
	lt=h8x.
	tjdhrt.
	m=fh4+.
	=dghjq.
	adgh+v.
	adgh-s.
	adgh8w.
	rdghcz.
	rfgh47.
	.3ghas-
	h9lhcf.
	k#mh7z.
	k#mhay.
	.snh5ag
	.xrh0q=
	.xrhfp=
	.pth#.n

scrumpy@.bnr.ca
	1991May13.220553.28817
	1991May22.131136.25113
	1991May22.133053.25258
	1991May22.163101.26179
	1991May30.153824.20151
	1991Jun3.171023.29648
	1991Jun6.175342.28998
	1991Jun6.181622.29296

sct60a.sunyct.edu
	5180.on.Tue,.14.May.91.07:39:56.EDT.

seneca.Sed.Novell.COM.
	JKT.91Jun1015643

soleil.iarc.mco.edu.
	9105231338.AA03611

splash.
	9105171452.AA09476

train@bcm.tmc.edu
	1991May20.080124

ukulele.reasoning.com.
	9106061838.AA00554

usenet@kadsma
	1991May23.155810.1904

uunet.UU.NET
	csx-12i101:xcal
	csx-13i006:xdtm
	csx-13i017:imagemagic

vax1.informatik.fh-regensburg.dbp.de
	60:windl

vax1.rz.uni-regensburg.dbp.de
	539:irtel

xds13.ferranti.com
	KVFBE.
	.1GBQHF
	.RGBHK3
	.QHBBXB

henry@zoo.toronto.edu (Henry Spencer) (06/13/91)

In article <1991Jun12.071111.29652@twinsun.com> eggert@twinsun.com (Paul Eggert) writes:
>Commonly used news transport software accepts many Message-IDs that do
>not conform to the Internet RFCs (1036, 822, and 1123)....

Uh, do bear in mind that RFC1036 message IDs are much more leniently
defined than 822/1123 message IDs.  While it may be unwise to use news
message IDs that are not legal mail message IDs, it is not disastrous.
-- 
"We're thinking about upgrading from    | Henry Spencer @ U of Toronto Zoology
SunOS 4.1.1 to SunOS 3.5."              |  henry@zoo.toronto.edu  utzoo!henry

eggert@twinsun.com (Paul Eggert) (06/13/91)

henry@zoo.toronto.edu (Henry Spencer) writes:

>Uh, do bear in mind that RFC1036 message IDs are much more leniently
>defined than 822/1123 message IDs.  While it may be unwise to use news
>message IDs that are not legal mail message IDs, it is not disastrous.

Practically speaking, this is correct: after all, I wouldn't have
observed the nonconforming Message-IDs if they hadn't propagated
through part of Usenet successfully!  But RFC-1036 says ``all USENET
news messages must be formatted as valid Internet mail messages,
according to the Internet standard RFC-822.''  RFC-1036 places its own
constraints on message format, but it doesn't remove any RFC-822
constraints.

Of the 388 nonconforming Message-IDs that I found, only 10% had C News
format.  This is a good record, since 40% of the conforming IDs had C
News format.  Among the C News format IDs there were two problems:
nonconforming domain names, probably due to a C News installer answering
conf/build's questions incorrectly, and multiple `@'s, probably due to
NNTP mishaps.
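
For anyone who wants to repeat the C News comparison, here is a rough
sketch of how IDs in that format might be picked out.  The pattern is
inferred from IDs quoted in this thread (yyyyMondd.hhmmss.pid, as in
<1991Jun12.071111.29652@twinsun.com>), not from the C News sources, so
treat it as an assumption; the function name cnews_format is likewise
hypothetical.

```shell
#!/bin/sh
# Sketch: select Message-IDs whose local-part looks like C News output,
# i.e. yyyyMondd.hhmmss.pid followed by @.  The pattern is an assumption
# based on examples in this thread, not taken from the C News sources.
cnews_format() {
	LC_ALL=C grep -E '^<[0-9]{4}[A-Z][a-z]{2}[0-9]{1,2}\.[0-9]{6}\.[0-9]+@'
}

printf '%s\n' \
	'<1991Jun12.071111.29652@twinsun.com>' \
	'<1991Jun5.192745.1945@>' \
	'<17169:Jun1122:04:5791@kramden.acf.nyu.edu>' |
cnews_format
```

The first two sample IDs have the C News shape and are printed; the
third does not and is dropped.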