[comp.unix.ultrix] Ultrix 2.x ADB patch to TCP/IP TTL

wyatt@cfa.HARVARD.EDU (Bill Wyatt) (02/03/90)

From article <276@jove.dec.com>, by mogul@decwrl.dec.com (Jeffrey Mogul):
> In article <6227@cps3xx.UUCP> nanda@cpsvax.cps.msu.edu (Arun Nanda {manager}) writes:
>>We are a VAX 8600 site running Ultrix 2.2.  Of late, I have observed a
>>strange symptom that I know, for sure, was not existing before.  The
>>problem is as follows:

>>While trying to establish connections to certain remote sites through
>>"telnet", "rlogin" or "ftp", the command always fails with the
>>connection timing out. 
[...]

> This sounds to me like your TCP datagrams are being sent with a TTL
> (Time to live in the IP header) that is too small.  The problem may
> have arisen recently because your route to these "certain hosts"
> has changed, from one with a short path to one with a longer path.
> 
> Ultrix 2.2 inherited from 4.2BSD an initial TTL of 15 for TCP packets.
[...]
> In Ultrix 3.0, the TCP TTL is set to 60, which is probably large enough.
> If you are able to upgrade to the latest release of Ultrix, you will
> probably solve this particular problem, and get better TCP performance
> in general.
> 
> Otherwise, if you have an Ultrix source license, you can change the
> value of TCP_TTL in netinet/tcp_timer.h, and recompile.  If you don't
> have sources, in theory it should be possible to patch your kernel to use
> a larger TTL.  I would rather not guess the addresses to patch, though!

Well, if this is the original poster's problem, we also had the same
problem and unfortunately, we're still on Ultrix 2.x for yet a little
while longer (don't ask). A few months ago we stopped being able to
reach some remote sites, with the symptoms described.  We do have
source, so I grovelled around for a while, looking at assembler output
and the like, and found the location of the TTL parameter to fix. I
hereby post it with no guarantees (except that we've used it for
several months with no problem). 

You can patch /usr/sys/BINARY.vax/tcp_output.o and relink a kernel,
or adb the /vmunix and reboot.

For Ultrix 2.2, the offset from tcp_output is 0x5a0; for Ultrix
2.0, the offset is 0x5d4. The default contents of that word are `ff6',
which is the assembly instruction `cvtlb $f,8(r7)', which sets
ip_ttl in a structure to 0xf (15 decimal). Check via adb:

adb /vmunix

tcp_output+0x5a0 ? x
xxxxxx:      ff6      (where xxxxxx is whatever the address is)

and it's the high-order byte of that word (the leading `f') that is the
TTL count; the low byte, f6, is the cvtlb opcode. Change the TTL to,
say, 47 (0x2f):

adb -w /vmunix
tcp_output+0x5a0 ? w 0x2ff6
xxxxxx:      ff6     =   2ff6

Remember - the offset used above is for 2.2; for other OS versions
the offset is different (0x5d4 for 2.0)!
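
For anyone who wants a TTL other than 47, the arithmetic behind the
patched word can be sketched as below. This is an illustration only and
not part of the fix itself: the function names are made up, and it
assumes the word adb prints has the cvtlb opcode (0xf6) in the low byte
and the TTL as a VAX short literal in the high byte, which limits the
TTL to the range 0 to 63.

```python
# Sketch only: hypothetical helpers for working out the word to give adb.
# Assumes the layout described in the post: low byte = cvtlb opcode,
# high byte = short-literal operand holding the TTL.

CVTLB_OPCODE = 0xF6        # VAX opcode for cvtlb
MAX_SHORT_LITERAL = 0x3F   # short literals are 6 bits wide: 0..63


def decode_word(word):
    """Split the 16-bit word adb shows into (opcode, ttl)."""
    opcode = word & 0xFF
    ttl = (word >> 8) & 0xFF
    return opcode, ttl


def patched_word(ttl):
    """Return the 16-bit word to write for the requested TTL."""
    if not 0 <= ttl <= MAX_SHORT_LITERAL:
        raise ValueError("TTL must fit in a short literal (0..63)")
    return (ttl << 8) | CVTLB_OPCODE


if __name__ == "__main__":
    opcode, ttl = decode_word(0x0FF6)
    print("current: opcode=%#x ttl=%d" % (opcode, ttl))
    print("word for TTL 47: %#x" % patched_word(47))
```

With the stock value this decodes to a TTL of 15, and asking for 47
gives back the 0x2ff6 used in the adb command above. A value over 63
would no longer be a short literal, so the sketch refuses it rather
than produce a word that changes the instruction.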

Bill Wyatt, Smithsonian Astrophysical Observatory  (Cambridge, MA, USA)
    UUCP :  {husc6,cmcl2,mit-eddie}!harvard!cfa!wyatt
 Internet:   wyatt@cfa.harvard.edu
     SPAN:   cfa::wyatt                 BITNET: wyatt@cfa

mogul@decwrl.dec.com (Jeffrey Mogul) (02/06/90)

In article <303@cfa.HARVARD.EDU> wyatt@cfa.HARVARD.EDU (Bill Wyatt) writes:
>Well, if this is the original poster's problem, we also had the same
>problem and unfortunately, we're still on Ultrix 2.x for yet a little
>while longer (don't ask). A few months ago we stopped being able to
>reach some remote sites, with the symptoms described.  We do have
>source, so I grovelled around for a while, looking at assembler output
>and the like, and found the location of the TTL parameter to fix. I
>hereby post it with no guarantees (except that we've used it for
>several months with no problem). 

I can't comment directly on the correctness of your fix, since I
don't have an Ultrix 2.2 system to test it on, but I should point
out that the TTL parameter also occurs in the code in the tcp_respond()
function in tcp_subr.c.  I suspect that what this means is that if
you have the wrong TTL here, some subset of your ACK packets will
get dropped.  If things are working for you, either it's because I'm
wrong about this, or it's because almost all TCP connections have data
flowing in both directions (and your fix should set the TTL right for
any segment carrying data).

-Jeff