[comp.protocols.tcp-ip.domains] problems with nsfnet-relay.ac.uk

dsmith@xylem.uoregon.edu (Dale Smith) (02/23/91)

Grrrrrrr!!   Is anyone else having big trouble trying to deliver
mail to the UK???  I have hundreds of messages queued up for
delivery, but I can't connect to nsfnet-relay.ac.uk via smtp.
It seems like it has to be a broken mailer on their end of
things, since I can telnet to it and pings average 1/2 second
round trip.  A telnet session opens pretty quick, but telnetting
to the smtp port results in a "connection timed out" message.
This has been happening for 3 days now.

It seems outrageous that all mail to the UK is shoved through one
system.

Does anyone out there know what is going on and when it might be
fixed.

Dale Smith, Assistant Director of Network Services
University of Oregon		Internet: dsmith@ns.uoregon.edu
Computing Center		BITNET: dsmith@oregon.bitnet
Eugene, OR  97403-1212		Voice:503-346-4394  FAX:503-346-4397

john@waikato.ac.nz (John Houlker) (02/24/91)

In article <1991Feb22.181958.14608@ns.uoregon.edu>,
dsmith@xylem.uoregon.edu (Dale Smith) writes:
> Grrrrrrr!!   Is anyone else having big trouble trying to deliver
> mail to the UK???  I have hundreds of messages queued up for
> delivery, but I can't connect to nsfnet-relay.ac.uk via smtp.
> It seems like it has to be a broken mailer on their end of
> things, since I can telnet to it and pings average 1/2 second
> round trip.  A telnet session opens pretty quick, but telnetting
> to the smtp port results in a "connection timed out" message.
> This has been happening for 3 days now.

Yes, we see this and also have intervals of no IP connectivity
to nsfnet-relay.ac.uk (like now).
> 
> It seems outrageous that all mail to the UK is shoved through one
> system.

Some UK mail goes via other gates but nsfnet-relay.ac.uk sure
handles a lot, a backup would be much appreciated.

John Houlker

a20@nikhefh.nikhef.nl (Marten Terpstra) (02/25/91)

In article <1991Feb22.181958.14608@ns.uoregon.edu> dsmith@xylem.uoregon.edu (Dale Smith) writes:
>Grrrrrrr!!   Is anyone else having big trouble trying to deliver
>mail to the UK???  I have hundreds of messages queued up for
>delivery, but I can't connect to nsfnet-relay.ac.uk via smtp.
>It seems like it has to be a broken mailer on their end of
>things, since I can telnet to it and pings average 1/2 second
>round trip.  A telnet session opens pretty quick, but telnetting
>to the smtp port results in a "connection timed out" message.
>This has been happening for 3 days now.

As you may know you can reach the UK via the link from the US to ULCC in
London, where also the nsfnet-relay.ac.uk is. They seem to have a great deal
of problems with their link. We (in Europe) do not have the problem you have
because there's a link from Amsterdam to ULCC we use.

>
>It seems outrageous that all mail to the UK is shoved through one
>system.

Maybe it is. As you may know IP is only getting started in the UK so please
have some patience with them. There is another way of getting mail to the UK,
but that one is not advertised via the DNS. It can be delivered via EUnet.
This goes via mcsun.eu.net and then to the UK EUnet backbone in Kent. This
however does not help you with your queues.

>
>Does anyone out there know what is going on and when it might be
>fixed.

Don't know. I do know that there are discussions on backup arrangements for
their link to the US. We hope to reach such an agreement somewhere this
month.

If you see a lot of these problems you may try to reach tony@ean-relay.ac.uk,
or via EUnet tony%ean-relay.ac.uk@mcsun.eu.net.

Marten
--
Marten Terpstra                                  National Institute for Nuclear
Internet : terpstra@nikhef.nl 		                and High Energy Physics
Oldie-net: {....}mcsun!nikhefh!terpstra	      (NIKHEF-H), PO Box 41882, 1009 DB
Phone    : +31 20 592 5102                           Amsterdam, The Netherlands

kwe@bu-it.bu.edu (Kent England) (02/26/91)

> From: dsmith@xylem.uoregon.edu (Dale Smith)
> Newsgroups: comp.protocols.tcp-ip.domains
> Subject: problems with nsfnet-relay.ac.uk
> Date: 22 Feb 91 18:19:58 GMT
> 
> Grrrrrrr!!   Is anyone else having big trouble trying to deliver
> mail to the UK???  I have hundreds of messages queued up for
> delivery, but I can't connect to nsfnet-relay.ac.uk via smtp.

	A problem was discovered with the US-UK connection via the
DARPA Wideband Net about 19 Feb.  It was resolved about 22 Feb.

	As I understand it, there was trouble in the transatlantic
link, which is beyond the NSFnet connection.

	--Kent

dave@ecrc.de (Dave Morton) (02/26/91)

In article <75539@bu.edu.bu.edu>, kwe@bu-it.bu.edu (Kent England) writes:
|> 	As I understand it, there was trouble in the transatlantic
|> link, which is beyond the NSFnet connection.
|> 
Sigh - I know, but why are we still going over the pond to get to the UK ?
Is there a reason why we cannot get the root name server in Europe operational
and simply go over mcsun or whatever. Hey - it's not just the UK...... 
Am I missing something like the politics........

Dave Morton,
European Computer Research Centre		Tel. + (49) 89-92699-139
Arabellastr 17, 8000 Munich 81. Germany.	Fax. + (49) 89-92699-170
E-mail:	dave@ecrc.de

daniel@nstn.ns.ca (Daniel MacKay) (02/27/91)

In article <75539@bu.edu.bu.edu>, kwe@bu-it.bu.edu (Kent England) writes:
> 	As I understand it, there was trouble in the transatlantic
> link, which is beyond the NSFnet connection.
 
I have a little problem with the above: 

	owl% telnet nsfnet-relay.ac.uk
	Trying 128.86.8.6 ...
	Connected to nsfnet-relay.ac.uk.
	Escape character is '^]'.
	nsfnet-relay.ac.uk Login:
	
however:

	owl% telnet nsfnet-relay.ac.uk 25
	Trying 128.86.8.6 ...
	telnet: connect to address 128.86.8.6: Connection timed out
	owl%

There's no one listening on the SMTP port!  No, it's not a connectivity 
problem!
--
	Daniel MacKay				daniel@nstn.ns.ca
	NOC Manager, NSTN Operations Centre
	Communications Services			902-494-NSTN
	Dalhousie University
	Halifax, N.S.  CANADA
	B3H 4H8					uunet!nstn.ns.ca!daniel

system@jach.Hawaii.Edu (Henry Stilmack - JAC Hawaii SysMgr) (02/27/91)

In article <1991Feb22.181958.14608@ns.uoregon.edu>, dsmith@xylem.uoregon.edu (Dale Smith) writes:
>Grrrrrrr!!   Is anyone else having big trouble trying to deliver
>mail to the UK???  I have hundreds of messages queued up for
>delivery, but I can't connect to nsfnet-relay.ac.uk via smtp.
>It seems like it has to be a broken mailer on their end of
>things, since I can telnet to it and pings average 1/2 second
>round trip.  A telnet session opens pretty quick, but telnetting
>to the smtp port results in a "connection timed out" message.
>This has been happening for 3 days now.
>
>It seems outrageous that all mail to the UK is shoved through one
>system.
>
>Does anyone out there know what is going on and when it might be
>fixed.

I have found the same problem. Since we are in fact an "appendage" of the 
Royal Observatory at Edinburgh, this has been particularly annoying. As an 
interim measure, you might try using NASA's SPAN gateway. You will need the 
full JANET address of the target site, which is usually the same as the 
Internet address, except in reverse domain order (naturally! We all know 
the Brits drive on the wrong side of the road!).

For example, to mail to the Royal Observatory at Edinburgh, the syntax is:

  To: "cbs%uk.ac.roe.starlink::user"%rlesis@star.stanford.edu

Unfortunately, this does not generate a REPLYable header on some (VMS) 
machines.
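
The reversal of domain order mentioned above can be sketched in a few lines of Python. This is only an illustration: the helper name `to_janet_order` is hypothetical, and, as noted above, the JANET form is only "usually" the simple reverse of the Internet form, so real addresses may need checking by hand.

```python
def to_janet_order(domain):
    """Reverse the dot-separated labels of a domain name.

    Assumption: the JANET form is exactly the Internet form with its
    labels reversed, which the text above says is usually, not always,
    the case.
    """
    return ".".join(reversed(domain.split(".")))
```

For example, `to_janet_order("starlink.roe.ac.uk")` gives `uk.ac.roe.starlink`, the form used in the address above.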

Good luck!

______________________________________________________________________________
Henry Stilmack                               )  "Everybody's colored, or 
Computing Services Manager                   )   else you wouldn't be 
UK/Netherlands/Canada Joint Astronomy Centre )   able to see them."
hps@jach.Hawaii.Edu                          )    
------------------------------------------------------------------------------

pmoore@hemel.bull.co.uk (Paul Moore) (02/27/91)

I reckon the problem is a policy change in nsfnet.
Doing telnet nsfnet...... gets in OK
Doing telnet nsfnet... 25 times out. This does not mean nobody is listening
on port 25. If this was the case the connect would be rejected straight away.
(try doing telnet xxxx 1234 , or 25 on a machine with no smtp daemon). The
fact that the connect to a specific port times out shows that some filtering
is being done on the basis of dest port number, probably by a router in
front of nsfnet-relay.
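
The difference between a refused connection and one that times out can be checked with a short Python sketch. It is not from the thread: the function name `probe_port` and the 5 second default timeout are assumptions. Note that a timeout only shows that the SYN went unanswered; on its own it cannot tell a filtering router from a host that is dropping SYNs for some other reason.

```python
import socket


def probe_port(host, port, timeout=5.0):
    """Classify a TCP connection attempt as 'open', 'refused' or 'timeout'."""
    try:
        sock = socket.create_connection((host, port), timeout=timeout)
    except ConnectionRefusedError:
        # The remote end answered the SYN with a RST: the host is
        # reachable but no process is listening on that port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # Nothing came back before the deadline: the SYN was dropped
        # somewhere, by a filter or by the host itself.
        return "timeout"
    sock.close()
    return "open"
```

Calling `probe_port(host, 23)` and `probe_port(host, 25)` against the same host reproduces the telnet experiment described above.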

Plus , the MX records for *.ac.uk say go to nsfnet-relay but the MX records
for *.co.uk say go to mcsun.

The problem is that of .com (or .co.uk on the Internet) trying to send to
.ac.uk sites I guess.

There was some flaming inside UKnet about people routing mail via nsfnet
saying "this is an academic only net - commercial orgs must not use it". But
when somebody pointed out that their MX records routed mail to nsfnet I 
thought that would be the end of it - I guess not.

-- 
!---------------------------------------------------------------!
! Paul Moore, Bull HN UK, Maxted Rd,Hemel Hempstead, HP2 7DZ.   !
! Phone:(44) 442 232222  Fax:(44) 442 234084			!	
! pmoore@hemel.bull.co.uk     "a smile, a song and a core dump" !
!---------------------------------------------------------------!

rickert@mp.cs.niu.edu (Neil Rickert) (02/27/91)

In article <1991Feb27.122824.28436@hemel.bull.co.uk> pmoore@hemel.bull.co.uk (Paul Moore) writes:
>I reckon the problem is a policy change in nsfnet.
>Doing telnet nsfnet...... gets in OK
>Doing telnet nsfnet... 25 times out. This does not mean nobody is listening

 It is really enjoyable reading all of these sinister theories.

 Did it occur to anyone that perhaps, in the aftermath of the link having
been down, the machine is struggling to handle the backlog and can't always
keep up with the volume of incoming SMTP connections.

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115                                   +1-815-753-6940

ronald@robobar.co.uk (Ronald S H Khoo) (02/28/91)

a20@nikhefh.nikhef.nl (Marten Terpstra) writes:

> There is another way of getting mail to the UK,
> but that one is not advertised via the DNS. It can be delivered via EUnet.
> This goes via mcsun.eu.net and then to the UK EUnet backbone in Kent. This
> however does not help you with your queues.

And it does not help him at all if the destination site is not an EUNet
customer because the mail will be bounced (probably at Kent).
Yes, Virginia, there is a use for MMDF authorisation code.  However,
if you know a UK site takes news, you can safely assume that they will be
authorised to take mail in %ified through mcsun.eu.net even though their
postmaster may get annoyed with you -- because that mail will probably
be charged to them.  Odd how direct $$ charging wakes people up :-)
But in general, the info in the DNS is correct.  There is no MX pointing
that way because they aren't allowed to get mail that way.

There may well be two routes into the UK, but they aren't useable as
backups for each other, because of the usage restrictions.  A small
number of sites can use both links, but I think they only advertise
MX via NSFnet-relay.ac.uk.  Similarly, my MX is mcsun.EU.net, and I can't
get mail via NSFnet-relay.ac.uk, or at least I'm not allowed to.

It's all very sad that business, politics, etc interfere with the common
sense approach that it would be better for the users if the potential
for dual-redundancy were to be exploited.  But it aint politically acceptable.
Sigh.  Never mind that an entire community's mail gets stuck for half
a week while they locate a shark stuck in an undersea cable somewhere.
Pah.  Politics.
-- 
Ronald Khoo <ronald@robobar.co.uk> +44 81 991 1142 (O) +44 71 229 7741 (H)

thorinn@RIMFAXE.DIKU.DK (Lars Henrik Mathiesen) (02/28/91)

   Date: 27 Feb 91 14:26:00 GMT
   From: rickert@mp.cs.niu.edu  (Neil Rickert)

    It is really enjoyable reading all of these sinister theories.

    Did it occur to anyone that perhaps, in the aftermath of the link having
   been down, the machine is struggling to handle the backlog and can't always
   keep up with the volume of incoming SMTP connections.

When I saw this, I didn't think that that could be the explanation, so
I looked in our kernel. I was rather surprised to find that if there
are more than the allowed number of new connections waiting to be
accept()ed, the Berkeley TCP will summarily drop incoming SYNs (no
ICMP whatever, no RST). I suppose the idea is that the congestion is
temporary and the SYN will be retransmitted. Maybe a source quench
would be appropriate?
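
The behaviour described above can be pictured with a toy model in Python. This is not the Berkeley kernel code: the class name `ListenQueue` and its `limit` parameter are hypothetical, and how the real kernel derives its limit from the listen() backlog is not modelled here.

```python
from collections import deque


class ListenQueue:
    """Toy model of a listening socket's queue of un-accept()ed connections."""

    def __init__(self, limit):
        self.limit = limit      # assumed maximum number of waiting connections
        self.pending = deque()  # connections waiting for accept()

    def on_syn(self, peer):
        """Handle an incoming SYN; return what the sender would observe."""
        if len(self.pending) >= self.limit:
            # Queue full: the SYN is discarded silently (no RST, no ICMP),
            # so the sender sees nothing and retransmits or times out.
            return "dropped"
        self.pending.append(peer)
        return "queued"

    def accept(self):
        """The server takes the oldest waiting connection, if there is one."""
        return self.pending.popleft() if self.pending else None
```

Once the server calls `accept()` and frees a slot, a retransmitted SYN from the same peer is queued normally, which is why a short burst of congestion is invisible to the sender while a long one shows up as "connection timed out".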

Anyway, there's a sun.nsfnet-relay.ac.uk that advertises SMTP in its
WKS record:

    sun.nsfnet-relay.ac.uk. 360000  A       128.86.8.7
    sun.nsfnet-relay.ac.uk. 360000  MX      13 nsfnet-relay.ac.uk.
    sun.nsfnet-relay.ac.uk. 360000  HINFO   SUN-4 SUNOS
    sun.nsfnet-relay.ac.uk. 360000  WKS      128.86.8.7 tcp (
			     ftp smtp )

I can connect to SMTP on that host, so it doesn't seem to be filtered.
Interestingly, from Europe this is routed through 192.16.192.185:

    ;; ANSWERS:
    185.192.16.192.in-addr.arpa.    15594   PTR     sun.nsfnet-relay.ac.uk.

    ;; AUTHORITY RECORDS:
    192.16.192.in-addr.arpa.        447592  NS      MCSUN.EU.NET.
    192.16.192.in-addr.arpa.        447592  NS      SERING.CWI.NL.

but it doesn't seem to be the same machine (doesn't do telnet, for
instance).

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn@diku.dk

a20@nikhefh.nikhef.nl (Marten Terpstra) (02/28/91)

In article <9102281006.AA28115@rimfaxe.diku.dk> thorinn@RIMFAXE.DIKU.DK (Lars Henrik Mathiesen) writes:

[stuff deleted]

>I can connect to SMTP on that host, so it doesn't seem to be filtered.
>Interestingly, from Europe this is routed through 192.16.192.185:
>
>    ;; ANSWERS:
>    185.192.16.192.in-addr.arpa.    15594   PTR     sun.nsfnet-relay.ac.uk.
>
>    ;; AUTHORITY RECORDS:
>    192.16.192.in-addr.arpa.        447592  NS      MCSUN.EU.NET.
>    192.16.192.in-addr.arpa.        447592  NS      SERING.CWI.NL.
>
>but it doesn't seem to be the same machine (doesn't do telnet, for
>instance).

Sorry about this. This is due to a wrong inverse mapping of 192.16.192.185.
This machine used to be sun.nsfnet-relay.ac.uk, but has been switched to a
cisco a few months ago. The reverse mapping should say cisco.ulcc.ac.uk. I'll
see to it that it will be changed in the reverse mapping of 192.16.192.

The network of 192.16.192 is used within Europe for IP over X.25, mainly for
IP over IXI. This may cause unreachability of these numbers, because they are
spread all over Europe, not necessarily connected. But since these are only
routers, you would not have the need to get in. For those managing these
ciscos (and suns) on 192.16.192 there is of course another way to get in ;-)

The link that people in Europe use to get to the UK is via NIKHEF in
Amsterdam, then via IP over IXI to ULCC in London where the
*.nsfnet-relay.ac.uk machines are located.

I also noticed that today I cannot reach nsfnet-relay on port 25. The guy
managing this machine is here, so I will ask him.

Hope this is solved soon.

Marten
--
Marten Terpstra                                  National Institute for Nuclear
Internet : terpstra@nikhef.nl 		                and High Energy Physics
Oldie-net: {....}mcsun!nikhefh!terpstra	      (NIKHEF-H), PO Box 41882, 1009 DB
Phone    : +31 20 592 5102                           Amsterdam, The Netherlands

cliff@demon.co.uk (Cliff Stanford) (03/02/91)

In article <1991Feb22.181958.14608@ns.uoregon.edu> dsmith@xylem.uoregon.edu (Dale Smith) writes:
>It seems outrageous that all mail to the UK is shoved through one
>system.
>

	How do you think it feels for *us*?

-- 
Cliff Stanford				Email:	cliff@demon.co.uk (Work)
Demon Systems Limited				cms@demon.co.uk   (Home)
42 Hendon Lane				Phone:	081-349 0063	  (Office)
London	N3 1TT	England				0860 375870	  (Mobile)

thorinn@RIMFAXE.DIKU.DK (Lars Henrik Mathiesen) (03/02/91)

   From: Julian Onions <j.onions@xtel.co.uk>

   Spot on. The nsfnet relay machine ... is configured to
   allow 10 simultaneous connections - and there is nearly always that
   number connected.  No need for sinister theories, 1 days outage is
   enough to build up a huge queue of messages inbound and outbound.

As I read that, it means that the SMTP server will stop accepting new
connections once the maximum number is reached, even though it could
do so. That is *wrong*, it should accept the connection, send a 421
reply code (service unavailable), and close the connection. (It
wouldn't even have to create an extra process to do that, it could be
done in the main loop.) This would 1) let the sender defer the letter
at once, 2) save some retransmitted SYN packets, 3) save some kernel
memory on nsfnet-relay for waiting TCP connections, and 4) be nice.
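
A minimal Python sketch of the accept-then-421 behaviour proposed above, done in the main loop without creating a new process. The names `reject_busy` and `handle_new_connection` and the host name `relay.example` are hypothetical; the limit of 10 is the figure quoted above, and the reply text is the standard wording for code 421.

```python
import socket

MAX_SESSIONS = 10  # the limit quoted above for the relay machine


def reject_busy(conn, hostname="relay.example"):
    """Send a 421 reply on an already accepted connection, then close it."""
    reply = f"421 {hostname} Service not available, closing transmission channel\r\n"
    try:
        conn.sendall(reply.encode("ascii"))
    finally:
        conn.close()


def handle_new_connection(conn, active_sessions):
    """Return True if the caller should start an SMTP session on conn.

    When the server is already at its limit, the connection is answered
    with 421 and closed right here in the main loop.
    """
    if active_sessions >= MAX_SESSIONS:
        reject_busy(conn)
        return False
    return True
```

In a real server the main loop would call `handle_new_connection(conn, n)` right after `accept()` and only hand `conn` to a session handler when it returns True.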

(Also, now there will always be about 8 established connections in a
kernel queue, waiting for the SMTP server to accept them. That's not
nice either.)

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn@diku.dk

medin@cincsac.arc.nasa.gov (Milo S. Medin) (03/03/91)

In article <1991Feb26.092928.954@ecrc.de>, dave@ecrc.de (Dave Morton) writes:
|> 
|> In article <75539@bu.edu.bu.edu>, kwe@bu-it.bu.edu (Kent England) writes:
|> |> 	As I understand it, there was trouble in the transatlantic
|> |> link, which is beyond the NSFnet connection.
|> |> 
|> Sigh - I know, but why are we still going over the pond to get to the UK ?
|> Is there a reason why we cannot get the root name server in Europe operational
|> and simply go over mcsun or whatever. Hey - it's not just the UK...... 
|> Am I missing something like the politics........
|> 

The answer is not politics.  Adding a root server given the way the DNS is
currently built is serious business.  The root of the problem is that the
RTT estimation code in many servers is broken.  What this means is that 
all the root servers are normally pounded upon by DNS servers both near
and far.  In order for a new root nameserver in Europe to be used, you can't
just stick it in the named.ca file.  This file is only used on start up, and
this info is thrown away when a categorical list of root servers is 
retrieved.  Thus, a European root would have to be added to the master list
of root nameservers, and this means it would be known by all the servers in
the Internet.

Given the broken RTT estimators in most versions of BIND, deploying such a 
server in Europe would result in queries from all over the world pummelling
the transatlantic link to the European root.  This is hardly the result people
expect by deploying a "local" root server.  
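
The load argument above can be illustrated with a small Python simulation. Everything in it is an assumption made for illustration: the function name `simulate_root_choice`, the server names, the RTT figures, and in particular the choice to model a "broken" estimator as plain round-robin, since the thread does not say how the estimators are broken.

```python
def simulate_root_choice(rtts, n_queries, estimator_works=True, alpha=0.875):
    """Count how many queries one resolver sends to each root server.

    rtts maps a server name to its (constant) round-trip time in ms.
    A working resolver tries each server once, then prefers the lowest
    smoothed RTT. A broken one is modelled as round-robin (assumption).
    """
    names = list(rtts)
    srtt = {}                        # smoothed RTT per server tried so far
    counts = {name: 0 for name in names}
    for i in range(n_queries):
        untried = [n for n in names if n not in srtt]
        if not estimator_works:
            choice = names[i % len(names)]
        elif untried:
            choice = untried[0]      # measure every server at least once
        else:
            choice = min(names, key=lambda n: srtt[n])
        counts[choice] += 1
        sample = rtts[choice]
        if choice in srtt:
            srtt[choice] = alpha * srtt[choice] + (1 - alpha) * sample
        else:
            srtt[choice] = sample
    return counts
```

With `rtts = {"us-root": 40.0, "eu-root": 300.0}` and 100 queries, the working resolver sends 99 queries to `us-root` and 1 to `eu-root`, while the round-robin one sends 50 to each, which is the kind of transatlantic load described above.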

But, none the less, we who worry about International topology and engineering
are trying to figure out exactly what the load is, and the possibility
of putting one in a place in Europe that would be most advantageous.  One 
problem is that much of Europe still talks to itself via the US.  Thus adding
a European root doesn't help much.  In the Pacific, networking is in a much
"cleaner" configuration.  Almost all the Pacific connectivity comes through
PACCOM, and all their links eventually wind up going through NASA Ames, where
the Western FIX is located.  I run a root nameserver which is directly attached
to the FIX, thus they have very good access to the root, even though it's
not local.   But the links to Europe come into the U.S. all over the place,
and thus European access is problematic.  

I know many people would rather believe that problems are political in nature
and not technical (and some certainly are, much to all our grief!), most
people in the Internet community really try and do the right thing, and 
re-engineering the root nameserver system is one of many things we are
trying to do to improve life in the worldwide Internet.

						Thanks,
						    Milo

PS Usual disclaimers apply!

john@waikato.ac.nz (John Houlker) (03/03/91)

In article <1991Mar3.070813.29410@riacs.edu>, medin@cincsac.arc.nasa.gov
 (Milo S. Medin) writes:
> 
> The answer is not politics.  Adding a root server given the way the DNS is
> currently built is serious business.  The root of the problem is that the
> RTT estimation code in many servers is broken.  What this means is that 
> all the root servers are normally pounded upon by DNS servers both near
> and far.  
.
> Given the broken RTT estimators in most versions of BIND, deploying such a 
> server in Europe would result in queries from all over the world pummelling
> the transatlantic link to the European root.  
.
> re-engineering the root nameserver system is one of many things we are
> trying to do to improve life in the worldwide Internet.

Milo, given the inertia of the large number of "broken" servers, what
about a "hack" of applying filters (at international gateways) on domain
UDP to root servers that are too "distant".  That would improve the
RTT estimation :-).  Is root-root server traffic all TCP?

John

kre@cs.mu.OZ.AU (Robert Elz) (03/04/91)

thorinn@RIMFAXE.DIKU.DK (Lars Henrik Mathiesen) writes:

>That is *wrong*, it should accept the connection, send a 421
>reply code (service unavailable), and close the connection.

This sounds like a good thing to do, but in practice, it can
be a disaster.  It takes some cpu time to do this, which isn't
something you have available in these circumstances (and remember,
it's not just one of these things, it's possibly dozens, all at once).

But worse - the characteristics are wrong, on receiving a 421,
servers typically queue the message and attempt to send it again
later.  If lots of 421's are sent at the same time, then it's likely
that you will get lots of mail being retransmitted an hour later,
overloading the recipient system yet again, and so on.

But, if the SYN is simply not answered, or probably better a RST
is sent, then the sending mailer will typically try a secondary
MX, and leave the mail there - that mailer can then forward the
mail to the primary MX (or handle it otherwise) - which it will
generally do by serialising the messages and sending one at a time,
vastly decreasing the instantaneous load on the receiving server.

Of course, this only works if there is a secondary MX, which isn't
the case for messages MX'd to nsfnet-relay.ac.uk - but that could
be fixed.

The secondary MX also has the effect of preventing most of the burst
load after a link has been down in any case, by collecting all of
the messages while the link was down, and then delivering them serially
when it returns, imposing a nice steady load.
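
The fallback to a secondary MX described above can be sketched in Python as follows. The function name `deliver_via_mx`, the `try_deliver` callback and the host names in the usage note are hypothetical, and the sketch ignores details of real mailers such as randomising among hosts with equal preference.

```python
def deliver_via_mx(mx_records, try_deliver):
    """Try each mail exchanger in order of preference.

    mx_records is a list of (preference, host) pairs, lower preference
    first. try_deliver(host) returns True if that host took the message.
    Returns the host that accepted the mail, or None if every host
    failed and the message has to stay in the local queue.
    """
    for _preference, host in sorted(mx_records):
        if try_deliver(host):
            return host
    return None
```

With records `[(10, "primary.example"), (20, "backup.example")]` and a primary that does not answer, the message ends up at `backup.example`, which can then feed it to the primary one message at a time.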

kre

kre@cs.mu.OZ.AU (Robert Elz) (03/04/91)

medin@cincsac.arc.nasa.gov (Milo S. Medin) writes:

>In article <1991Feb26.092928.954@ecrc.de>, dave@ecrc.de (Dave Morton) writes:
>|> 
>|> Sigh - I know, but why are we still going over the pond to get to the UK ?
>|> Is there a reason why we cannot get the root name server in Europe operational
>|> and simply go over mcsun or whatever.

>The answer is not politics.

The answer is that the routing question (how to route from DE to UK,
etc), has nothing whatever to do with root nameserver locations,
but with what wires are installed where, and how the routing is
arranged after that.  Where the root nameservers are changes nothing.

kre

medin@cincsac.arc.nasa.gov (Milo S. Medin) (03/04/91)

In article <1991Mar3.234002.3035@waikato.ac.nz>, john@waikato.ac.nz (John Houlker) writes:
.
.
.
|> 
|> Milo, given the inertia of the large number of "broken" servers, what
|> about a "hack" of applying filters (at international gateways) on domain
|> UDP to root servers that are too "distant".  That would improve the
|> RTT estimation :-).  Is root-root server traffic all TCP?
|> 
|> John

John, this is a gross hack.  And it just means that the transatlantic link
might not get the requests, but all the packets would still get to the
U.S. side of the router.  Not to mention that some people might need to
talk to the root legitimately for debugging, and this all assumes that 
the root server in Europe isn't primary for anything else.  All the root-root
is normally TCP if they are zone xferring, but many roots in the U.S. do
act as primaries for other things.  

And none of this fixes the problem of European transit traffic via the U.S.
The point of my post was simply to indicate that the problem is complicated,
and people are working on it, but that progress is slow...  But not as slow
as getting a really well working BIND implementation!

							Thanks,
							   Milo

J.Crowcroft@cs.ucl.ac.uk (Jon Crowcroft) (03/04/91)

 >>It seems outrageous that all mail to the UK is shoved through one
 >>system.

it aint just the system - there is currently one main link for
academic traffic - if someone wants to fund another one they are
welcome - its fairly expensive:-)

soon, we (hope we) will have fallback routes via RIPE/IXI and via other ICBNET
bits of routes...

 >>Does anyone out there know what is going on and when it might be
 >>fixed.

this probably isnt the right list to discuss network operations, but
have you tried mailing gateway-outage at BBN who manage the
butterflies either end of the UK-US link...a little pressure on them
might help fix things like routing table size limits sooner
(actually now fixed...but not for ever since they seem to be compiled 
in limits).

also, i recommend mailing the people at ULCC (john seymour
and/or tony bates)
who actually run the service there
since they can answer technically better than anyone else...

and are responsible for the end systems you refer to...

jon

piet@cwi.nl (Piet Beertema) (03/05/91)

		That is *wrong*, it should accept the connection, send a 421
		reply code (service unavailable), and close the connection.
	This sounds like a good thing to do, but in practice,
	it can be a disaster. It takes some cpu time to do
	this, which isn't something you have available in these
	circumstances (and remember, it's not just one of these
	things, it's possibly dozens, all at once).
	But worse - the characteristics are wrong, on receiving
	a 421, servers typically queue the message and attempt
	to send it again later.
Nevertheless it can be a good idea:
Most sendmail daemons enter an infinite sleep() loop as
long as the load is too high, so the sending host times
out and (re)queues the message for later attempts anyway.
Hosts (major mail/relay hosts) with very long queues do
suffer from this, since all the timeouts severely hold
up the processing of the queue. If the recipient host
would send a 421 reply code and exit, the queue would be
processed much faster and mails that can be delivered in
a particular queue run thus delivered much faster.
In sendmail-5.61/IDA the #define BUSYEXIT allows you to
choose the preferred action (if you want the 421 reply
to be given, you should fix srvrsmtp.c around line 143 to
do so in the !batched case instead of the batched case).


-- 
	Piet Beertema, CWI, Amsterdam	(piet@cwi.nl)

dfk@NIC.EU.net (Daniel Karrenberg) (03/06/91)

medin@cincsac.arc.nasa.gov (Milo S. Medin) writes:

 > .... One 
 >problem is that much of Europe still talks to itself via the US.  Thus adding
 >a European root doesn't help much.  

This may not be the right newsgroup but Milo brought it up here, so i'll
continue it here. I have no intention to flame or criticise Milo, just
to shed light on another angle of the story.

There are indeed some Europe-Europe paths still going via the US but I
object to the "much of Europe" phrasing.  To my knowledge this is not
happening on a major scale.  The EUropean Internet community (RIPE) is
doing its best to eradicate these paths!  At the last RIPE meeting we
have passed a very strong recommendation to get the problem solved
quickly.  There were quite a few people who advocated drastic measures
like filtering traffic on transatlantic links. 

We appreciate the help we get from the US Internet community while we
solve this problem and we are working on solving it. 

 > ... But the links to Europe come into the U.S. all over the place,
 >and thus European access is problematic.  

There are efforts underway to re-engineer this and some steps have
already been taken, like consolidating the management of the US end of
lines to France and Scandinavia. 

 >I know many people would rather believe that problems are political in nature
 >and not technical (and some certainly are, much to all our grief!), 

In this case the European end of the problem is political and not at all
technical.  We have no centrally funded and managed backbone networks
which cover all of Europe.  Everything we do here has to be done by
cooperation rather than by central management and central
responsibility.  Imagine that all regional networks in the US just had
bilateral connections with different usage policies on each of them! 
That would be *hard* to FIX (pun intended).

The fact that "Internet Protocol Suite" was a dirty word here until very
recently -and still is in some parts, especially those with money-
doesn't help us either. 

 >most
 >people in the Internet community really try and do the right thing, and 
 >re-engineering the root nameserver system is one of many things we are
 >trying to do to improve life in the worldwide Internet.

(Re)-engineering the European Internet as a whole works just like that. 
The Internet community (coordinated by RIPE) is very pragmatic and
tries very hard to keep a reasonable infrastructure going.  We have to
live with our political and funding constraints however. 

Cheers

Daniel

(Deputy Chairman of RIPE)
-- 
Daniel Karrenberg                    Future Net:  <dfk@cwi.nl>
CWI, Amsterdam                        Oldie Net:  mcsun!dfk
The Netherlands          Because It's There Net:  DFK@MCVAX

dfk@NIC.EU.net (Daniel Karrenberg) (03/06/91)

 >ronald@robobar.co.uk (Ronald S H Khoo) writes:

 >>a20@nikhefh.nikhef.nl (Marten Terpstra) writes:
 >> There is another way of getting mail to the UK,
 >> but that one is not advertised via the DNS. It can be delivered via EUnet.
 >> This goes via mcsun.eu.net and then to the UK EUnet backbone in Kent. This
 >> however does not help you with your queues.

 >And it does not help him at all if the destination site is not an EUNet
 >customer because the mail will be bounced (probably at Kent).

This is -very fortunately- no longer true! So this route works. It
is not indicated by MX RRs in the DNS however.

Even better: The EUnet backbone at Kent will be on the Internet soon.
So the only thing that is missing is some agreement between the two UK nets
to help each other out in case of outages. As far as I know this has been
suggested and the mills are grinding.....

Daniel
-- 
Daniel Karrenberg                    Future Net:  <dfk@cwi.nl>
CWI, Amsterdam                        Oldie Net:  mcsun!dfk
The Netherlands          Because It's There Net:  DFK@MCVAX

brian@ucsd.Edu (Brian Kantor) (03/06/91)

If your sendmail doesn't accept the connection, so that my sendmail
times out trying, my sendmail will mark your host down and skip all
other delivery attempts to it in this queue run.

If you accept the connection then 421 it, our sendmail will attempt to
separately deliver every damn one of the messages we have queued for
your site and will get a 421 for each one of them, which means that
both you and I have just spawned one process for each of the messages
just to get a "go away" message.

Consider for a moment what would be happening to nsfnet-relay.ac.uk if
it were answering 421 instead of just letting connections time out?
UCSD is only one site, and we have over 150 messages queued for them
right now.  Multiply that by the number of other sites that have stuff
in the queue for them, and you have a real mess.
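
The arithmetic behind this argument can be put into a toy Python model of a single queue run. It is not sendmail code: the function name `queue_run_attempts` and the two behaviour labels are hypothetical, and the model only encodes the two rules stated above (a timeout marks the host down for the rest of the run, a 421 does not).

```python
def queue_run_attempts(n_messages, busy_behaviour):
    """Count the connection attempts one sender makes in one queue run.

    busy_behaviour is "timeout" (the receiver never answers, so the
    sender marks the host down after the first try) or "421" (the
    receiver accepts each connection and replies 421).
    """
    if busy_behaviour not in ("timeout", "421"):
        raise ValueError("busy_behaviour must be 'timeout' or '421'")
    attempts = 0
    host_marked_down = False
    for _ in range(n_messages):
        if host_marked_down:
            continue                  # skip remaining mail for this host
        attempts += 1
        if busy_behaviour == "timeout":
            host_marked_down = True   # no more tries in this queue run
    return attempts
```

For the 150 queued messages mentioned above, `queue_run_attempts(150, "timeout")` is 1 and `queue_run_attempts(150, "421")` is 150, and that is for one sending site only.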

Don't do the 421 thing, please.  The AT&T gateway did and it was a
disaster.  They now time out connections when they're busy.
	- Brian

dave@ecrc.de (Dave Morton) (03/06/91)

In article <kre.668024207@mundamutti.cs.mu.OZ.AU> you write:
|> medin@cincsac.arc.nasa.gov (Milo S. Medin) writes:
|> 
|> >In article <1991Feb26.092928.954@ecrc.de>, dave@ecrc.de (Dave Morton) writes:
|> >|> 
|> >|> Sigh - I know, but why are we still going over the pond to get to the UK ?
|> >|> Is there a reason why we cannot get the root name server in Europe operational
|> >|> and simply go over mcsun or whatever.
|> 
|> >The answer is not politics.
|> 
|> The answer is that the routing question (how to route from DE to UK,
|> etc), has nothing whatever to do with root nameserver locations,
|> but with what wires are installed where, and how the routing is
|> arranged after that.  Where the root nameservers are changes nothing.
|> 
Yes indeed, Robert is correct: the wires exist, the routing sometimes doesn't.
I was talking about all the DNS queries that end up going to the root
servers in the US. What I guess I should have said was: can we avoid sending
nameserver queries across the ocean by installing a root name server here
in Europe and *also* fix the routing so that we don't have MX records etc.
in New York for a site that's just down the road? As Daniel Karrenberg has
pointed out, there's some hope yet.
  
Dave Morton,
European Computer Research Centre		Tel. + (49) 89-92699-139
Arabellastr 17, 8000 Munich 81. Germany.	Fax. + (49) 89-92699-170
E-mail:	dave@ecrc.de

lyndon@cs.athabascau.ca (Lyndon Nerenberg) (03/08/91)

daniel@nstn.ns.ca (Daniel MacKay) writes:

>I have a little problem with the above: 

>	owl% telnet nsfnet-relay.ac.uk 25
>	Trying 128.86.8.6 ...
>	telnet: connect to address 128.86.8.6: Connection timed out

>There's no one listening on the SMTP port!  No, it's not a connectivity 
>problem!

Wrong. If there is no process listening on the port, telnet would say
"connection refused," not "connection timed out." Example:

% telnet annex01 smtp
Trying 131.232.3.5 ...
telnet: connect: Connection refused
telnet> 

Needless to say, our terminal servers don't speak smtp :-)

If you try to connect to a port that has no listener, you will get back a
TCP reset (RST) from the host. This generates the message above. The
"timed out" message indicates no response whatsoever was received. It could
be that the reset did not make it back, generating the "timed out"
message, but that's highly unlikely since you were able to telnet successfully
to the telnet port.
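
The same distinction can be checked from a program rather than by reading
telnet's messages. The following is a minimal sketch using Python's
standard socket module; the function name probe and the 5 second default
timeout are choices made for this example only.

```python
import socket


def probe(host, port, timeout=5.0):
    """Classify a TCP connect attempt as 'open', 'refused' or 'timed out'."""
    try:
        # create_connection performs the TCP handshake; the socket is
        # closed again as soon as the with-block exits.
        with socket.create_connection((host, port), timeout=timeout):
            return "open"
    except ConnectionRefusedError:
        # The host answered, but nothing is listening on that port.
        return "refused"
    except (socket.timeout, TimeoutError):
        # No answer at all arrived within the timeout.
        return "timed out"
```

Calling probe("nsfnet-relay.ac.uk", 25) would be the equivalent of the
telnet test quoted above. Other failures (name lookup errors, unreachable
networks) are deliberately left to propagate as exceptions.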

-- 
    Lyndon Nerenberg  VE6BBM / Computing Services / Athabasca University
           atha!cs.athabascau.ca!lyndon || lyndon@cs.athabascau.ca
                    Packet: ve6bbm@ve6bbm.ab.can.noam
      The only thing open about OSF is their mouth.  --Chuck Musciano

kre@cs.mu.OZ.AU (Robert Elz) (03/09/91)

piet@cwi.nl (Piet Beertema) writes:

>Most sendmail daemons enter an infinite sleep() loop as
>long as the load is too high, so the sending host times
>out and (re)queues the message for later attempts anyway.

Only if there is no secondary MX - if there's an alternate
destination available, upon noticing a down or non-responding
host sendmail will send to the backup.

>Hosts (major mail/relay hosts) with very long queues do
>suffer from this, since all the timeouts severely hold
>up the processing of the queue.

Hosts that process large queues (and probably all hosts) should be
remembering that a host is down, and not even bothering to attempt
other connections to there.  Unfortunately, in sendmail at least,
a 421 response doesn't count as "down", so neither the secondary
MX, nor the queue skipping work in that case.
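
As a rough sketch of why the secondary MX is never tried in the 421 case:
if only a failed connection counts as "down", a host that accepts and
then answers 421 is still picked as the delivery target. This is a toy
model, not sendmail's logic; the function name, the status labels and the
example.org host names are hypothetical.

```python
def pick_target(mx_records, status):
    """Return the first MX host, in preference order, not treated as down.

    mx_records: list of (preference, host); lower preference is tried first.
    status: host -> "ok" | "timeout" | "421" (hypothetical labels).
    Only "timeout" is treated as down, as in the behaviour described above.
    """
    for _, host in sorted(mx_records):
        if status.get(host) != "timeout":
            return host
    return None   # every listed host is down


mx = [(20, "backup.example.org"), (10, "primary.example.org")]
print(pick_target(mx, {"primary.example.org": "timeout"}))
print(pick_target(mx, {"primary.example.org": "421"}))
```

In the first call the backup is chosen; in the second the primary is
chosen again, even though it will only answer 421.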

>If the recipient host
>would send a 421 reply code and exit, the queue would be
>processed much faster and mails that can be delivered in
>a particular queue run thus delivered much faster.

In practice this only helps mis-configured hosts.

>In sendmail-5.61/IDA the #define BUSYEXIT allows you to
>choose the preferred action (if you want the 421 reply
>to be given, you should fix srvrsmtp.c around line 143 to
>do so in the !batched case instead of the batched case).

Yes - that code should also be inside "#ifndef BUSYEXIT / #endif"
so that it can be completely excluded (and should be).

kre

kre@cs.mu.OZ.AU (Robert Elz) (03/09/91)

dave@ecrc.de (Dave Morton) writes:

>I was talking about all the DNS queries that end up going to the root
>servers in US. What I guess I should have said was: can we avoid sending
>nameserver queries across the ocean by installing a root name server here
>in Europe

In Australia, we're attempting to solve this (and avoid the problems
that a root nameserver would bring) by having a very well known
forwarder nameserver - that everyone (just about) uses.  That nameserver
still needs to make queries to the root nameservers across the ocean,
but it only needs to make them once (per TTL).  Everyone else can pick up
the answer from that nameserver's cache.
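
The "once per TTL" effect can be illustrated with a small cache. This is
only a sketch of the idea, not a real nameserver: the class name, the
upstream callable (standing in for a query sent across the ocean) and the
injectable clock are assumptions made for the example.

```python
import time


class CachingForwarder:
    """Toy forwarder: answers from cache until the record's TTL expires."""

    def __init__(self, upstream, clock=time.monotonic):
        self.upstream = upstream     # callable: name -> (answer, ttl_seconds)
        self.clock = clock           # injectable so expiry can be simulated
        self.cache = {}              # name -> (answer, expiry time)
        self.upstream_queries = 0    # how often we had to ask upstream

    def resolve(self, name):
        now = self.clock()
        hit = self.cache.get(name)
        if hit is not None and hit[1] > now:
            return hit[0]            # still fresh: no upstream query needed
        self.upstream_queries += 1
        answer, ttl = self.upstream(name)
        self.cache[name] = (answer, now + ttl)
        return answer
```

However many clients ask for the same name within one TTL, the counter
upstream_queries goes up by one; it goes up again only after the cached
entry has expired.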

kre