[comp.protocols.tcp-ip] Small TTL values evaporate in large networks

spurgeon@SIRIUS.CC.UTEXAS.EDU (Charles Spurgeon) (05/25/89)

Here's one for the ``Netstoppers Notebook.''

We've recently resolved a number of mail failures at UTexas by
noticing that the TCP in older UNIX boxes (BSD 4.2, SunOS 3.2) was
setting the TTL to 15 in outgoing packets.

The network path to some sites at MIT consists of 17 hops from Austin
and the TTL was expiring before the packet got to the destination
address!  This kept the mail from getting through.  If you notice
consistent mail failures to a distant address you might want to check
on the outgoing TTL.  It appears that BSD 4.3 sets the TTL to 30 in
outgoing TCP.

While cognizant of the need to keep TCP segment lifetimes on a short
chain, one does wonder what a good value for the TCP TTL might be.
With hop counts of around 20 already being seen, the new ``standard''
of 30 doesn't inspire confidence...
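
As a rough illustration of the arithmetic above, the sketch below models
a packet crossing a chain of gateways.  It assumes each gateway subtracts
1 from the TTL and discards the packet instead of forwarding it once the
TTL is used up, and it treats the quoted hop counts as gateway counts.
The function name and that interpretation are illustrative assumptions,
not code from any particular kernel.

```python
def survives(initial_ttl, gateways, decrement=1):
    """Return True if a packet sent with initial_ttl gets past
    `gateways` forwarding hops (simplified model, see assumptions above)."""
    ttl = initial_ttl
    for _ in range(gateways):
        # A gateway that cannot decrement without exhausting the TTL
        # drops the packet instead of forwarding it.
        if ttl <= decrement:
            return False
        ttl -= decrement
    return True


# TTL values mentioned in the thread against the path lengths mentioned.
for ttl in (15, 30, 60):
    for hops in (17, 20):
        outcome = "delivered" if survives(ttl, hops) else "expired in transit"
        print(f"TTL {ttl:2d}, {hops} hops: {outcome}")
```

Under this model a TTL of 15 never completes a 17-hop path, while 30
leaves only about ten hops of headroom over a 20-hop path.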

ahill@BBN.COM ("Alan R. Hill") (05/25/89)

Charles,
	Was there any evidence that the hop that dropped the packet
after its TTL expired informed the source of its action?

Alan

jas@proteon.com (John A. Shriver) (05/25/89)

Neither value is correct.  RFC 793 (TCP) clearly specifies a value of
sixty (60).  To quote:

  TCP/Lower-Level Interface

    The TCP calls on a lower level protocol module to actually send and
    receive information over a network.  One case is that of the ARPA
    internetwork system where the lower level module is the Internet
    Protocol (IP) [2].

    If the lower level protocol is IP it provides arguments for a type
    of service and for a time to live.  TCP uses the following settings
    for these parameters:

      Type of Service = Precedence: routine, Delay: normal, Throughput:
      normal, Reliability: normal; or 00000000.

      Time to Live    = one minute, or 00111100.

        Note that the assumed maximum segment lifetime is two minutes.
        Here we explicitly ask that a segment be destroyed if it cannot
        be delivered by the internet system within one minute.

Thus, neither 4.2bsd (15) nor 4.3bsd (30) is correct.  (Why fix a bug
with another bug?)
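
For readers checking the quoted bit patterns, a minimal sketch that
decodes them.  The variable names are made up for illustration; only the
two bit strings come from the RFC text quoted above.

```python
# Bit patterns exactly as quoted from RFC 793.
RFC793_TOS_BITS = "00000000"
RFC793_TTL_BITS = "00111100"

# Interpret each 8-bit field as an unsigned integer.
rfc793_tos = int(RFC793_TOS_BITS, 2)
rfc793_ttl = int(RFC793_TTL_BITS, 2)

print("Type of Service:", rfc793_tos)
print("Time to Live:   ", rfc793_ttl)

# Compare against the values the thread reports for the BSD kernels.
for name, ttl in (("4.2bsd", 15), ("4.3bsd", 30)):
    print(f"{name} sends {ttl}, which is {rfc793_ttl - ttl} short of RFC 793")
```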

You can do a quick check by examining the value of TCP_TTL in
/usr/sys/netinet/tcp_timer.h, but there is no guarantee that your
vendor compiled the kernel using this file.
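
Where rebuilding the kernel is not an option, a per-socket override may
be possible.  The sketch below assumes a system that offers the IP_TTL
socket option, which the kernels discussed in this thread may well not
provide.  The helper name tcp_socket_with_ttl is hypothetical, and 60 is
simply the RFC 793 value quoted above.

```python
import socket


def tcp_socket_with_ttl(ttl=60):
    """Create an unconnected TCP socket whose outgoing packets use `ttl`.

    Assumes the platform supports the IP_TTL socket option.
    """
    s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    s.setsockopt(socket.IPPROTO_IP, socket.IP_TTL, ttl)
    return s


# Report the system default first, then the overridden value.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as probe:
    default_ttl = probe.getsockopt(socket.IPPROTO_IP, socket.IP_TTL)
print("system default TTL for TCP sockets:", default_ttl)

with tcp_socket_with_ttl(60) as s:
    print("TTL after override:", s.getsockopt(socket.IPPROTO_IP, socket.IP_TTL))
```

This only affects sockets the application opens itself; a mailer that
cannot be modified still needs the kernel default fixed.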

In SunOS 3.5.2, the value has been increased to 30, which is not so
bad a bug.

The TTL was still 15 on Ultrix-32 V2.2.  I reported it in SPR
487999/ICA-16612.  I got patches for three kernel files.  I presume
that it is fixed in V3.0.

spurgeon@SIRIUS.CC.UTEXAS.EDU (Charles Spurgeon) (05/25/89)

>	Was there any evidence that the hop that dropped the packet
>after its TTL expired informed the source of its action?
The folks who originally saw the problem reported ICMP messages, so
I guess the answer is ``yes.''

Running ``traceroute'' gave us a fairly good set of replies as well,
although a trace of the route to ``life.ai.mit.edu'' showed 15 hops to
the last gateway seen, but only 13 hops back.  

jsm@phoenix.Princeton.EDU (John Scott McCauley Jr.) (05/30/89)

A similar problem: at least one BSDish implementation of TCP I've seen
had IPTTLDEC set to 5, not 1.  (IPTTLDEC apparently is the number that gets
subtracted from a packet's TTL when the machine forwards the packet -- I
think it also serves as the threshold: a packet arriving with a TTL at or
below it gets an ICMP TTL_EXCEEDED message instead of being forwarded.)

This wasn't too much of a problem as the machine wasn't a gateway. However,
another machine on the local net was sending out broadcast packets with a
TTL of 3 (don't know why), so the first machine was the only machine on
the net sending out ICMP TTL_EXCEEDED messages.

I did a binary patch to change IPTTLDEC to 1 (in two places) and the
problem went away.
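
A small sketch of the behaviour described above, under the poster's
reading of IPTTLDEC: a packet whose TTL is at or below IPTTLDEC draws an
ICMP TTL_EXCEEDED reply, and otherwise IPTTLDEC is subtracted and the
packet is forwarded.  The function names are hypothetical and this is a
model of the description, not the kernel source.

```python
def forward_decision(ttl, ttl_dec=1):
    """Return ("forward", new_ttl) or ("ttl_exceeded", None)."""
    if ttl <= ttl_dec:
        return ("ttl_exceeded", None)
    return ("forward", ttl - ttl_dec)


def gateways_crossed(ttl, ttl_dec=1):
    """Count how many consecutive gateways would forward the packet."""
    crossed = 0
    while ttl > ttl_dec:
        ttl -= ttl_dec
        crossed += 1
    return crossed


# The broadcast packets with a TTL of 3: only the machine with
# IPTTLDEC set to 5 complains about them.
print("IPTTLDEC=5:", forward_decision(3, ttl_dec=5))
print("IPTTLDEC=1:", forward_decision(3, ttl_dec=1))

# If such machines were gateways, a TTL of 15 would not go far.
print("gateways crossed, IPTTLDEC=5:", gateways_crossed(15, ttl_dec=5))
print("gateways crossed, IPTTLDEC=1:", gateways_crossed(15, ttl_dec=1))
```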

	Scott

-- 
Scott McCauley, jsm@phoenix.princeton.edu (INTERNET)
Home:	(609) 683-9065
Office: (609) 243-3312   (FTS 340-3312)
Fax:	(609) 243-2160   (FTS 340-2160)

brian@ucsd.EDU (Brian Kantor) (05/30/89)

The new Wollongong/AT&T WIN TCP/IP version 3.0 for the 3B series of
computers arrived with its TTL set to 0xf - 15, in other words.  Luckily
it's a parameter you can set (and the documentation tells you how to
do it).

Hmm. I just noticed that our copy sat on the AT&T loading dock for 15 days -
I wonder if that TTL value also applies to delivery time?  Maybe I'd
better NOT set it to 60 or I'll have to wait two months for the next
one....			:-)
	- Brian

casey@lll-crg.llnl.gov (Casey Leedom) (06/02/89)

  Just as a note, it turns out that Stellar's Stellix version 1.6 also has
this problem.  It sets the TTL field to 10 for ICMP and TCP packets.  I'll
attempt to get a patched kernel and will report on a fix when it's
available.

Casey