[net.news] Propagation Lengths

rsk@pucc-h (Rich Kulawiec) (09/08/84)

	Perhaps the propagation delays can be (partially) explained by
the rather lengthy path some articles take in getting from *there* to
*here*.  I wrote a quick shell script and analyzed the approximately
4800 articles sitting in /usr/spool/news here at the moment, and got the
following interesting, but probably not surprising, results:

Hops	# of Articles		Hops	# of Articles
1	32			13	462
2	139			14	249
3	16			15	238
4	20			16	260
5	68			17	210
6	137			18	198
7	554			19	179
8	303			20	150
9	295			21	88
10	250			22	39
11	449			23	18
12	452			24	0

	Note: I did not count transfers around Purdue, from ECN to CS to
PUCC, nor did I count local articles *at all*.  The effect of the former
is to knock from 1 to 3 local hops off everything, which I felt would make
the results more generally applicable, especially to single-machine sites,
and the effect of the latter is to ignore about 200 articles.

	I wonder if our case is typical; if so, it would indicate that
the network topology is decidedly non-optimal.  (I'm sure this is news to
nobody.)  Corporate, national, and geographical constraints notwithstanding,
surely we can improve on this!
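
	For comparing one site's figures against another's, a couple of
summary numbers are handier than the whole table.  The following is only a
sketch, not part of the script below: "hop_stats" is a made-up name, and it
assumes input lines of the form "hops count", sorted by hop count, which is
what the script's summary file contains.

```shell
#!/bin/sh
# hop_stats (hypothetical helper): read "hops count" lines, sorted by hops,
# and print the article total, the mean hop count, and the median hop count.
hop_stats() {
	awk '{ hops[NR] = $1; n[NR] = $2 + 0; total += n[NR]; sum += $1 * n[NR] }
	END {
		if (total == 0) { print "no articles"; exit }
		# median: the first hop count at which the running total
		# reaches half of all articles
		for (i = 1; i <= NR; i++) {
			cum += n[i]
			if (cum * 2 >= total) { median = hops[i]; break }
		}
		printf "articles=%d mean=%.2f median=%d\n", total, sum / total, median
	}'
}

# Example use, once the summary file exists:
#	hop_stats < summary
```

	Fed the 24 rows of the table above, this should report 4806 articles,
a mean of about 11.9 hops, and a median of 12.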

	For anyone wanting to try this at home, here's the script:
----------
#!/bin/sh
find /usr/spool/news -type f -exec egrep "^Path: " {} ";" > newspaths 
cat newspaths \
	| sed -e "s/^Path: //" -e "s/@.*//" \
	| rev \
	| sed -e "s/^[^!:]*[!:]/!/" \
	| rev \
	| sed -e "s/^Pucc-[CDEHIJKcdehijk][!:]//" \
		-e "s/^Stat-L[!:]//" \
		-e "s/^CS-Mordred[!:]//" \
		-e "s/^Physics[!:]//" \
		-e "s/^pur-ee[!:]//" \
		-e "/^\./d" \
		-e "/^\$/d" \
	| tr -d "[A-Za-z0-9\-]" \
	| awk "{x[length]++} END {for(i=1;i<=24;i++)print i,x[i]+0}" > summary
----------

	Note that the first sed invocation strips out the "Path: " at the
beginning of every line, along with anything from an "@" onward, which
means ignoring Arpanet, Csnet, and other network strangeness tacked onto
the end of the path; the second (between the two "rev"s) replaces the
trailing user name with a "!", so that every path ends in a "!"; the first
5 expressions of the third take care of ignoring on-campus article
forwarding; and its last two expressions take care of any lines that start
with a ".", or any blank lines.  (Some folks use the notation
"Path: ..foobar!me" in their signatures.)
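
	The stages described above can be tried on a single line before
turning them loose on the whole spool.  This is a sketch only: the function
names and the site names "sitea", "siteb", "sitec", and "host.arpa" are made
up, and just one of the on-campus prefixes (pur-ee) is stripped to keep it
short.

```shell
#!/bin/sh
# strip_path (hypothetical name): apply the per-line sed/rev stages to stdin.
strip_path() {
	sed -e "s/^Path: //" -e "s/@.*//" \
	| rev \
	| sed -e "s/^[^!:]*[!:]/!/" \
	| rev \
	| sed -e "s/^pur-ee[!:]//" -e "/^\./d" -e "/^\$/d"
}

# count_hops (hypothetical name): reduce each path to its "!"s and
# print how many are left, one number per input line.
count_hops() {
	strip_path | tr -d 'A-Za-z0-9-' | awk '{print length}'
}

echo 'Path: pur-ee!sitea!siteb!sitec!user' | strip_path	# sitea!siteb!sitec!
echo 'Path: pur-ee!sitea!siteb!sitec!user' | count_hops	# 3
echo 'Path: pur-ee!sitea!user@host.arpa' | count_hops	# 1
```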

	The "tr" collapses each line to a string of "!"s, and the
awk script just counts the number of occurrences of each line length.
Note also that this gives you a line by line version; I simply formatted
it to get the two-column display shown above.
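
	One way to get a two-column layout mechanically is to hold the lines
in awk and print the first half beside the second half.  This is a sketch
under that assumption, and not necessarily how the table above was
produced; "two_columns" is a made-up name.

```shell
#!/bin/sh
# two_columns (hypothetical helper): print the first half of the input
# lines side by side with the second half, separated by two tabs.
two_columns() {
	awk '{ line[NR] = $0 }
	END {
		half = int((NR + 1) / 2)
		for (i = 1; i <= half; i++)
			printf "%s\t\t%s\n", line[i], line[i + half]
	}'
}

# Example use, once the summary file exists:
#	two_columns < summary
```

	With the 24-line summary this pairs line 1 with line 13, line 2 with
line 14, and so on.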

	I'm sure that this script can *also* be improved on; but it worked.
-- 
---Rsk

UUCP: { decvax, icalqa, ihnp4, inuxc, sequent, uiucdcs  } !pur-ee!rsk
      { decwrl, hplabs, icase, psuvax1, siemens, ucbvax } !purdue!rsk

And the thing that you're hearing is only the sound
Of the low spark of high-heeled boys...