rbd@lamont.UUCP (02/24/87)
Our site just upgraded all of our Suns from Sun 3.0 UNIX to Sun 3.2 UNIX. This upgrade has broken the rlogin on our PDP 11/70 running 2.9 BSD. (Prior to this upgrade, we never experienced any difficulty with rlogin between these machines.) Symptoms are:

i) When rlogging in from a Sun to the PDP, the PDP accepts the connection, prints out /etc/motd, and then immediately logs you out with a "Connection closed." message.

ii) Attempting to rlogin from the PDP to a Sun crashes the PDP about 75% of the time with a "panic: dtom" message, and succeeds the other 25% of the time.

Telnet, rsh and rcp have all been working normally. The Sun 3.0 rlogin and in.rlogind also worked normally with the PDP when they were restored from a backup tape. These facts, together with Sun's admission that there were major alterations made to rlogin and in.rlogind between 3.0 and 3.2 (mostly to pass along window size information), seem to point the finger directly at an incompatibility between the Sun 3.2 and PDP 2.9 rlogins.

Is this due to a lack of robustness in the 2.9 networking (possible bug fix in 2.10?) or is Sun doing something to rlogin that is incompatible with 2.9 and maybe other systems as well?

We would greatly appreciate hearing from other sites experiencing similar problems. Does anyone have a fix for this?

Roger Davis
Lamont-Doherty Geological Observatory
Palisades, NY 10964
(914) 359-2900 x547
{seismo,decvax,ihnp4}!philabs!lamont!rbd
abe@j.cc.purdue.edu.UUCP (02/27/87)
In article <162@lamont.UUCP>, rbd@lamont.UUCP (Roger Davis) writes:
>
> ii) Attempting to rlogin from the PDP to a Sun
> crashes the PDP about 75% of the time with a
> "panic: dtom" message, and
> succeeds the other 25% of the time.

There is a simple fix to sohasoutofband() in sys/socket.c that will prevent one kind of dtom panic:

        sohasoutofband(so)
                struct socket *so;
        {
                mapinfo map;

wrong!          savemap(map);
                if (so->so_pgrp == 0)
                        return;
right!          savemap(map);

Remove the savemap(map) call before the if/return and place it after them. That removes the possibility of an unbalanced savemap/restoremap pair.

Vic Abell, abe@j.cc.purdue.edu
j.cc.purdue.edu!abe
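The effect of the reordering described above can be illustrated outside the kernel. The following is only a sketch: savemap() and restoremap() are replaced by hypothetical stand-ins that count nesting depth, mapinfo and struct socket are placeholders, and the restoremap() at the end of each function is assumed from the phrase "savemap/restoremap pair" rather than taken from the 2.9BSD source.

```c
/* Hypothetical stand-ins for the 2.9BSD savemap()/restoremap() macros.
 * They only count nesting depth; the real macros save and restore
 * memory-mapping state. */
static int map_depth = 0;
#define savemap(m)    ((void)(m), map_depth++)
#define restoremap(m) ((void)(m), map_depth--)

typedef int mapinfo;             /* placeholder type, not the kernel's */
struct socket { int so_pgrp; };  /* only the field the check needs */

/* Ordering marked "wrong!": the early return skips restoremap(). */
static void oob_save_first(struct socket *so)
{
    mapinfo map = 0;

    savemap(map);
    if (so->so_pgrp == 0)
        return;                  /* map stays saved: unbalanced */
    /* ... signal the process group (omitted) ... */
    restoremap(map);             /* assumed to close the pair */
}

/* Ordering marked "right!": nothing is saved before the early return. */
static void oob_check_first(struct socket *so)
{
    mapinfo map = 0;

    if (so->so_pgrp == 0)
        return;
    savemap(map);
    /* ... signal the process group (omitted) ... */
    restoremap(map);             /* assumed to close the pair */
}

/* Returns the leftover save depth after one call with so_pgrp == 0. */
static int leftover_depth(void (*fn)(struct socket *))
{
    struct socket so = { 0 };

    map_depth = 0;
    fn(&so);
    return map_depth;
}
```

Under these assumptions, leftover_depth(oob_save_first) reports one save that is never restored, while leftover_depth(oob_check_first) reports none.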
guy@gorodish.UUCP (02/27/87)
>These facts, together with Sun's admission that there were major
>alterations made to rlogin and in.rlogind between 3.0 and 3.2 (mostly
>to pass along window size information),

"Admission" is hardly the appropriate word here. The major alterations were made by Berkeley for 4.3BSD; we picked them up for 3.2 because they are a major win, especially when doing "rlogin"s from a Sun where windows are quite often *not* the canonical 34x80 Sun window and often change size during a session.

>Is this due to a lack of robustness in the 2.9 networking (possible
>bug fix in 2.10?) or is Sun doing something to rlogin that is
>incompatible with 2.9 and maybe other systems as well?

The people at Berkeley's CSRG were the ones who did something to "rlogin", not us. The only differences between our "rlogin" and the 4.3BSD one are that ours supports the Sun-style "ioctl" to get the window size and that ours actually checks the return code from various network and pseudo-tty I/O operations. The only differences between our "in.rlogind" and the 4.3BSD "rlogind" are that ours supports the Sun-style "ioctl" and has otherwise been modified to run under a system closer to 4.2BSD than to 4.3BSD, and that it uses a different strategy for dealing with "write"s that return EWOULDBLOCK.

My suspicion is that 2.9 is at fault here (sorry, Keith). The fact that the 2.9 machine crashes indicates that there is certainly at least one 2.9BSD bug - no machine should crash just because some packet it doesn't understand comes over the wire. If you try to log in to a system running the 4.3BSD "rlogin" daemon, such as a Sun running SunOS 3.2, the "rlogind" daemon will send an out-of-band message telling the "rlogin" client that it's a new-style daemon and that it should send the daemon updates when the window size changes. The chances are that the 2.9BSD code is somehow not handling this properly.
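For readers unfamiliar with the window-size updates mentioned above, here is a sketch of the message a new-style client sends once the daemon has asked for them. The layout follows the rlogin protocol as later written up in RFC 1282, not anything stated in this thread: two 0xFF bytes, two ASCII 's' bytes, then rows, columns, and the X and Y pixel sizes as 16-bit values in network byte order. The names encode_winsize, put_u16 and WINSIZE_MSG_LEN are invented for this illustration.

```c
#include <stddef.h>
#include <stdint.h>

#define WINSIZE_MSG_LEN 12      /* 4-byte header + four 16-bit fields */

/* Store a 16-bit value in network (big-endian) byte order. */
static void put_u16(unsigned char *p, uint16_t v)
{
    p[0] = (unsigned char)(v >> 8);
    p[1] = (unsigned char)(v & 0xff);
}

/* Fill buf with a window-size message; returns the bytes written. */
static size_t encode_winsize(unsigned char buf[WINSIZE_MSG_LEN],
                             uint16_t rows, uint16_t cols,
                             uint16_t xpixels, uint16_t ypixels)
{
    buf[0] = 0xff;              /* magic cookie: two 0xFF bytes */
    buf[1] = 0xff;
    buf[2] = 's';               /* followed by two lower-case 's' */
    buf[3] = 's';
    put_u16(buf + 4, rows);     /* character rows */
    put_u16(buf + 6, cols);     /* characters per row */
    put_u16(buf + 8, xpixels);  /* width in pixels */
    put_u16(buf + 10, ypixels); /* height in pixels */
    return WINSIZE_MSG_LEN;
}
```

A client would write these 12 bytes to the connection each time the window changes size. An old-style client that never receives or never understands the daemon's out-of-band request simply sends nothing of the kind.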