niklas@appli.se (Niklas Hallqvist) (09/18/90)
Hello, out there.

This is a repost of an article posted to comp.unix.i386 at the time of the comp.unix.* reorganisation. I didn't get any answer then, so this time I'll broaden the distribution to include comp.protocols.{nfs,tcp-ip}, on top of using the new, correct newsgroup: comp.unix.sysv386!

We have an Ethernet network with three nodes, all of them running NFS. One of the most useful commands is on(1), which runs commands on another node but retains the environment (including the current directory). Very neat! My problem is: I can't use this facility to run programs on our 386/ix (2.0.2 core, 1.1.2 TCP/IP, 2.0 NFS). Sometimes I get this error message:

    on: af clnt_call..RPC: Unable to receive

and sometimes I don't even get an error message! The logfile /tmp/rexd.log looks something like this:

    Sep 6 09:00 (Rpchild/10444): Child #10444 processing RPC for request
    REXD INFO: errno=22, msg="Invalid argument"
    Sep 6 09:00 (Rpchild/10444): About to fork execution child; cmd='ls'
    REXD INFO: errno=9, msg="Bad file number"
    Sep 6 09:00 (Rpchild/10444): [RPC Child: svc_fds == 0, shutting down]
    REXD INFO: errno=9, msg="Bad file number"

or like this:

    Sep 6 09:02 (Rpchild/10446): Child #10446 processing RPC for request
    REXD INFO: errno=22, msg="Invalid argument"
    Sep 6 09:02 (Rpchild/10446): About to fork execution child; cmd='ls'
    REXD INFO: errno=9, msg="Bad file number"
    Sep 6 09:02 (Rpchild/10446): [RPC Child: svc_fds == 0, shutting down]
    REXD INFO: errno=4, msg="Interrupted system call"

In the other direction everything works as expected (e.g. running a command on our NCR Tower using the 386/ix on(1) command). Even local use, like "on localhost ls", fails! What have I done wrong? Is there a magical kernel parameter which is wrongly set? Please help!

And then there's this "remote shell" handled by /etc/rshd on the 386/ix. Very often (not always, though) my client "remsh" on another node hangs after sending the standard input to the foreign shell. Very annoying indeed! After I kill the client, the daemon continues as if nothing has happened. It seems like the EOF gets lost on the way, but reappears if I kill the client.

Another possibly related weirdness of our 386/ix system is the presence of all these strange TIME_WAIT, CLOSE_WAIT, FIN_WAIT_2 & CLOSED IP sessions that never go away from our netstat:

    Active Internet connections
    Proto Recv-Q Send-Q  Local Address   Foreign Address   (state)
    tcp        0      0  ix.1224         ix.111            TIME_WAIT
    tcp        0      0  ix.1182         nix.1181          CLOSE_WAIT
    tcp        0      0  ix.1181         nix.1039          CLOSE_WAIT
    tcp        0      0  ix.shell        appli.1023        CLOSED

The corresponding lines from our node "nix" (only the ones concerning "ix"):

    tcp        0      0  nix.1181        ix.1182           FIN_WAIT_2
    tcp        0      0  nix.1039        ix.1181           FIN_WAIT_2

There is no corresponding line for the CLOSED connection to node "appli" in the output from netstat on that node. What's going on here? Most things do work, like X11, NFS, rlogin, rcp etc. It's just "rexd" & "rshd" that fail! Any ideas?

Niklas
---
Niklas Hallqvist        Phone: +46-(0)31-19 14 85
Applitron Datasystem    Fax:   +46-(0)31-19 80 89
N. Gubberogatan 30      Email: niklas@appli.se
S-416 63 GOTEBORG       sunic!chalmers!appli!niklas
Sweden
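
The two netstat listings in the post above can be read as a pair. In the TCP state machine, CLOSE_WAIT on one end together with FIN_WAIT_2 on the other normally means that the FIN_WAIT_2 side has closed its half of the connection and the CLOSE_WAIT side has acknowledged that, but the application there has not yet closed its own socket. The sketch below is only an illustration of that reading, not a tool from the thread: it takes the netstat lines quoted in the post, matches each connection on "ix" with its mirror image on "nix", and labels that one combination. The function names (`parse`, `pair_states`, `diagnose`) are hypothetical.

```python
# Netstat lines copied from the post; header lines omitted.
IX_NETSTAT = """\
tcp 0 0 ix.1224 ix.111 TIME_WAIT
tcp 0 0 ix.1182 nix.1181 CLOSE_WAIT
tcp 0 0 ix.1181 nix.1039 CLOSE_WAIT
tcp 0 0 ix.shell appli.1023 CLOSED
"""

NIX_NETSTAT = """\
tcp 0 0 nix.1181 ix.1182 FIN_WAIT_2
tcp 0 0 nix.1039 ix.1181 FIN_WAIT_2
"""


def parse(text):
    """Map (local address, foreign address) -> TCP state."""
    conns = {}
    for line in text.splitlines():
        fields = line.split()
        if len(fields) != 6 or fields[0] != "tcp":
            continue  # skip headers and anything that is not a tcp row
        _proto, _recvq, _sendq, local, foreign, state = fields
        conns[(local, foreign)] = state
    return conns


def pair_states(here, there):
    """For each connection in `here`, look up the peer's view in `there`."""
    peer_view = parse(there)
    pairs = []
    for (local, foreign), state in parse(here).items():
        # The peer lists the same connection with the addresses swapped.
        pairs.append((local, foreign, state, peer_view.get((foreign, local))))
    return pairs


def diagnose(state, peer_state):
    """Label only the one combination discussed above."""
    if state == "CLOSE_WAIT" and peer_state == "FIN_WAIT_2":
        return "peer closed its side; local application has not closed its socket"
    return "no paired diagnosis"


if __name__ == "__main__":
    for local, foreign, state, peer_state in pair_states(IX_NETSTAT, NIX_NETSTAT):
        print(local, foreign, state, peer_state, "->", diagnose(state, peer_state))
```

Read this way, both ix/nix pairs suggest a process on the 386/ix side that is still holding its socket open, which would be consistent with a daemon that never saw (or never acted on) the end of the stream.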
als@bohra.cpg.oz (Anthony Shipman) (09/19/90)
In article <1112@appli.se>, niklas@appli.se (Niklas Hallqvist) writes:
>
> Hello, out there.
.......
> And then there's this "remote shell" handled by /etc/rshd on the
> 386/ix. Very often (not always, though) my client "remsh" on another node
> gets hung after sending the standard input to the foreign shell. Very
> annoying indeed! After I kill the client the daemon continues as if nothing
> has happened. It seems like the EOF gets lost on the way, but reappears if
> I kill the client.

This used to happen to me when trying to pipe a file from a non-386/ix system to the 386/ix 2.0.1 rshd. It would happen 100% of the time. However, when the sender was killed, the missing EOF would go through and the command would complete properly. I count this as a bug in 386/ix TCP/IP.

Some other programs would not talk to foreign TCP/IP nodes properly, sometimes coming, sometimes going. There was even incompatibility between different versions of 386/ix. It appears that 386/ix was only tested with itself (at least at Rev 2.0.?).

--
Anthony Shipman                 ACSnet: als@bohra.cpg.oz.au
Computer Power Group
9th Flr, 616 St. Kilda Rd.,
St. Kilda, Melbourne, Australia
D
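
For readers unfamiliar with how an EOF travels over TCP, here is a minimal sketch of the mechanism both posts describe. It assumes the remsh client behaves like the BSD rsh client, which signals end of standard input by shutting down the write half of its socket while keeping the read half open for the command's output. The receiver then sees a zero-length read. The function name `run_half_close_demo` and the upper-casing "command" are made up for illustration; this runs over the loopback interface and does not reproduce the 386/ix fault itself.

```python
import socket


def run_half_close_demo(payload: bytes) -> bytes:
    """Send `payload`, half-close, and return what the server sends back."""
    listener = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    listener.bind(("127.0.0.1", 0))  # port 0: let the kernel pick a free port
    listener.listen(1)
    client = socket.create_connection(listener.getsockname())
    server, _addr = listener.accept()
    listener.close()
    try:
        # Client side: write the "standard input", then half-close.
        client.sendall(payload)
        client.shutdown(socket.SHUT_WR)  # sends a FIN: "no more input"

        # Server side: read until recv() returns b"", which is the EOF.
        received = bytearray()
        while True:
            chunk = server.recv(4096)
            if not chunk:
                break
            received += chunk

        # Stand-in for the remote command: upper-case the input and exit.
        server.sendall(bytes(received).upper())
        server.close()

        # Client side: the read half is still open, so the reply arrives.
        reply = bytearray()
        while True:
            chunk = client.recv(4096)
            if not chunk:
                break
            reply += chunk
        return bytes(reply)
    finally:
        client.close()
        server.close()


if __name__ == "__main__":
    print(run_half_close_demo(b"hello from remsh\n"))
```

If the FIN produced by the half-close were never delivered to the server process, the first read loop would block indefinitely, and killing the client (which closes the socket outright) would be what finally ends it. That matches the symptom reported above, though the sketch cannot say where in the 386/ix stack the EOF went missing.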