wyatt@cfa.HARVARD.EDU (Bill Wyatt) (02/03/90)
From article <276@jove.dec.com>, by mogul@decwrl.dec.com (Jeffrey Mogul):
> In article <6227@cps3xx.UUCP> nanda@cpsvax.cps.msu.edu (Arun Nanda {manager}) writes:
>>We are a VAX 8600 site running Ultrix 2.2. Of late, I have observed a
>>strange symptom that I know, for sure, was not existing before. The
>>problem is as follows:
>>While trying to establish connections to certain remote sites through
>>"telnet", "rlogin" or "ftp", the command always fails with the
>>connection timing out. [...]
> This sounds to me like your TCP datagrams are being sent with a TTL
> (Time to live in the IP header) that is too small. The problem may
> have arisen recently because your route to these "certain hosts"
> has changed, from one with a short path to one with a longer path.
>
> Ultrix 2.2 inherited from 4.2BSD an initial TTL of 15 for TCP packets. [...]
> In Ultrix 3.0, the TCP TTL is set to 60, which is probably large enough.
> If you are able to upgrade to the latest release of Ultrix, you will
> probably solve this particular problem, and get better TCP performance
> in general.
>
> Otherwise, if you have an Ultrix source license, you can change the
> value of TCP_TTL in netinet/tcp_timer.h, and recompile. If you don't
> have sources, in theory it should be possible to patch your kernel to use
> a larger TTL. I would rather not guess the addresses to patch, though!

Well, if this is the original poster's problem, we also had the same
problem and unfortunately, we're still on Ultrix 2.x for yet a little
while longer (don't ask). A few months ago we stopped being able to
reach some remote sites, with the symptoms described. We do have
source, so I grovelled around for a while, looking at assembler output
and the like, and found the location of the TTL parameter to fix. I
hereby post it with no guarantees (except that we've used it for
several months with no problem).

You can patch /usr/sys/BINARY.vax/tcp_output.o and relink a kernel,
or adb the /vmunix and reboot.

For Ultrix 2.2, the offset from tcp_output is 0x5a0; for Ultrix 2.0, the
offset is 0x5d4. The default contents of that word are `ff6', which is
the assembly instruction `cvtlb $f,8(r7)'; it sets ip_ttl in a
structure to 0xf (15). Check via adb:

	adb /vmunix
	tcp_output+0x5a0 ? x
	xxxxxx:	ff6

(where xxxxxx is whatever the address is). The TTL count is the leading
`f' of the displayed word, i.e. its high-order byte, 0x0f. Change it
to, say, 47 decimal (0x2f):

	adb -w /vmunix
	tcp_output+0x5a0 ? w 0x2ff6
	xxxxxx:	ff6	=	2ff6

Remember - for OS versions other than 2.2, the offset is different!

Bill Wyatt, Smithsonian Astrophysical Observatory  (Cambridge, MA, USA)
    UUCP :  {husc6,cmcl2,mit-eddie}!harvard!cfa!wyatt
 Internet:  wyatt@cfa.harvard.edu
     SPAN:  cfa::wyatt
   BITNET:  wyatt@cfa
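
The arithmetic behind the patched word can be checked with a small
sketch. This is only an illustration, not part of the original post:
the function names are hypothetical, and the upper bound of 63 is an
assumption based on the VAX short-literal operand encoding (a 6-bit
value), which is how the `$f' operand appears to be stored here.

```python
# Sketch: derive the 16-bit word to write with adb for a chosen TTL.
# Names below are hypothetical; only the values ff6 / 2ff6 come from the post.

CVTLB_OPCODE = 0xF6        # low-order byte of the word adb displays as `ff6'
MAX_SHORT_LITERAL = 0x3F   # ASSUMPTION: operand is a 6-bit VAX short literal


def ttl_of(word: int) -> int:
    """Return the TTL stored in the high-order byte of the displayed word."""
    return (word >> 8) & 0xFF


def patched_word(old_word: int, new_ttl: int) -> int:
    """Return the word with the TTL byte replaced and the opcode byte kept."""
    if old_word & 0xFF != CVTLB_OPCODE:
        # A different low byte suggests the offset is wrong for this kernel.
        raise ValueError("low byte is not 0xf6; check the offset for this OS version")
    if not 0 < new_ttl <= MAX_SHORT_LITERAL:
        raise ValueError("TTL must fit in a 6-bit literal (1..63)")
    return (new_ttl << 8) | (old_word & 0xFF)


if __name__ == "__main__":
    old = 0x0FF6
    new = patched_word(old, 47)
    print(f"old ttl={ttl_of(old)}  new word={new:x}  new ttl={ttl_of(new)}")
```

Checking the low byte before patching is a cheap guard against using
the 2.2 offset on a different release, where the word at that address
would most likely not be this instruction.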
mogul@decwrl.dec.com (Jeffrey Mogul) (02/06/90)
In article <303@cfa.HARVARD.EDU> wyatt@cfa.HARVARD.EDU (Bill Wyatt) writes:
>Well, if this is the original poster's problem, we also had the same
>problem and unfortunately, we're still on Ultrix 2.x for yet a little
>while longer (don't ask). A few months ago we stopped being able to
>reach some remote sites, with the symptoms described. We do have
>source, so I grovelled around for a while, looking at assembler output
>and the like, and found the location of the TTL parameter to fix. I
>hereby post it with no guarantees (except that we've used it for
>several months with no problem).

I can't comment directly on the correctness of your fix, since I don't
have an Ultrix 2.2 system to test it on, but I should point out that
the TTL parameter also occurs in the tcp_respond() function in
tcp_subr.c. I suspect this means that if you have the wrong TTL there,
some subset of your ACK packets will get dropped.

If things are working for you, either it's because I'm wrong about
this, or it's because almost all TCP connections have data flowing in
both directions (and your fix should set the TTL right for any segment
carrying data).

-Jeff
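
The concern about a second, unpatched TTL can be illustrated with a toy
model. This is a sketch under simplifying assumptions, not a
description of the Ultrix code: each router is assumed to decrement
the TTL by exactly one and discard the datagram when it reaches zero,
and the 20-router path is a made-up figure chosen to be longer than 15.

```python
# Toy model: which segments survive a long path when only one of the two
# TTL settings has been patched. Hop count and names are hypothetical.


def survives(initial_ttl: int, router_hops: int) -> bool:
    """True if a datagram sent with initial_ttl gets past router_hops routers."""
    ttl = initial_ttl
    for _ in range(router_hops):
        ttl -= 1              # ASSUMPTION: each router decrements by one
        if ttl <= 0:
            return False      # router discards the datagram
    return True


# TTL used by each kernel routine after patching only tcp_output.
TTL_BY_ROUTINE = {
    "tcp_output": 47,    # patched value from the earlier post
    "tcp_respond": 15,   # still the 4.2BSD-derived default
}

if __name__ == "__main__":
    hops = 20  # hypothetical path length
    for routine, ttl in TTL_BY_ROUTINE.items():
        outcome = "delivered" if survives(ttl, hops) else "dropped"
        print(f"{routine}: ttl={ttl} over {hops} routers: {outcome}")
```

In this model, segments sent with the patched TTL get through while
anything still sent with a TTL of 15 is dropped on the same path, which
is the partial failure described above.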