robert@spam.istc.sri.com (04/15/88)
When last you all heard from me I was bemoaning the loss of some /usr/include/netinet header files. Thanks to your help I've obtained many of the files I need and am continuing on my quest to port code to the HP9000/300 from the Suns we've been using. In the course of that I've encountered some more serious problems, which I would like to mention to any HP people watching the group, as well as to any users who may have encountered the same problems I have. Thanks!

+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++

New problems
------------

I've been trying to port both applications which use sockets for IPC, and the gated router. The latest set of problems relates to application porting and development, as I have not gotten far enough into the gated port yet.

I've encountered problems with using sockets on the HP9000/300 running OS version 6.0. I've ported some software I developed which starts a Server listening for Client connections, and then starts about 32 Client processes asynchronously. Although most of the Clients will connect to the Server, there is a bug which causes the kernel to use up all of the mbufs (kernel memory buffers) and wedges some portion of the networking kernel. When this happens one can no longer telnet into the HP over the ethernet, although one may still log onto the console of the HP and telnet out. A side effect of this is that NFS on the HP gets hung, so you cannot log in if you have your account there (it wedges on login until CNTL-C).

There is a message from the kernel which says something about "out of mbufs" and "read error from network", but since the rlogin window on my Sun to the HP exits immediately I haven't been able to collect the exact error message. Since people other than I use the HP I can't do the test at will, as we have to reboot the HP each time it happens.
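For readers who want to try something similar, here is a minimal sketch of the kind of test described above. It is not the original software (which is not shown in the post and was presumably written in C); it is a Python approximation under stated assumptions: one process, non-blocking sockets standing in for the separate Client processes, and the loopback interface only. The names run_torture_test, n_clients and timeout are hypothetical.

```python
import select
import socket


def run_torture_test(n_clients=32, timeout=5.0):
    """Open n_clients loopback TCP connections to one listening server.

    Returns the number of connections the server managed to accept.
    Assumption: non-blocking sockets in one process stand in for the
    separate Client processes described in the post.
    """
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    server.bind(("127.0.0.1", 0))   # port 0: let the kernel pick a free port
    server.listen(n_clients)
    server.setblocking(False)
    addr = server.getsockname()

    clients, accepted = [], []
    try:
        # Start every client connect without waiting for the server.
        for _ in range(n_clients):
            c = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
            c.setblocking(False)
            c.connect_ex(addr)      # returns an errno value instead of raising
            clients.append(c)

        # Accept until every client is in, or select() times out.
        while len(accepted) < n_clients:
            readable, _, _ = select.select([server], [], [], timeout)
            if not readable:
                break               # nothing pending: give up and report
            try:
                conn, _peer = server.accept()
            except BlockingIOError:
                continue
            accepted.append(conn)
        return len(accepted)
    finally:
        for s in clients + accepted + [server]:
            s.close()
```

Calling run_torture_test(32) and comparing the result with 32 gives a quick pass/fail check; a lower number means some connections never reached the server within the timeout.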
Based on the error messages and the results of the problem, my guess is that the networking portion of the kernel is wedging when it runs out (or thinks it runs out) of buffers. Testing on my part is incomplete (we have our own HP on order so I can crash it at will with impunity), but the bug only seems to occur when I open more than 30 connections to the server. HP/UX does not offer getdtablesize() (Sun 3.4 does), which returns the max number of bits usable for the select() 'nfds' parameter, so I was assuming 32. The fact that you can accept more fds than you can select is a new feature which my software will have to deal with, but in the meantime I'm crashing the system.

By the way, my Client and Server processes are definitely not sending too much data (thus causing buffer loss), as they have worked before on VAXen, Suns, and sort of on A/UX (Mac II).

In case you're wondering why I'm opening up so many Client connections, it's a torture test of the socket/file-descriptor interface. In addition I'm testing for a bug under 4.2 where TCP peers can become ESTABLISHED without the Client and Server becoming established.

This is a pretty serious bug, and it is impinging upon my development effort, so I'm very open to input from anyone about how I can resolve my problems (aside from not opening so many connections).

One last thing. Does anyone know why the netstat program is so slow? It produces one line at a time on our 9000/300, almost as if it is starting at the top of kmem and searching through a byte at a time. Is this normal for the 300, or the 800?

Thanks again.

----------------------------------------------------------------
Robert Allen, robert@spam.istc.sri.com    "It's the last of the
415-859-2143 (work phone, days)            V8 Interceptors Max!"
----------------------------------------------------------------
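On the getdtablesize() point in the post above, a rough sketch of one way software might cope when it can accept more descriptors than it can select on. This is an illustration in Python, not anything from HP-UX or from the poster's code: resource.getrlimit(RLIMIT_NOFILE) stands in for getdtablesize(), the default of 32 is the poster's own assumption rather than a documented limit, and the function names are hypothetical.

```python
import resource


def descriptor_limit(fallback=32):
    """Per-process open-file limit, or `fallback` if it cannot be determined.

    Assumption: the soft RLIMIT_NOFILE value plays the role that
    getdtablesize() plays on systems that provide it.
    """
    try:
        soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    except (ValueError, OSError):
        return fallback
    if soft == resource.RLIM_INFINITY or soft <= 0:
        return fallback
    return soft


def split_by_select_limit(fds, select_limit=32):
    """Separate descriptors select() can watch from those numbered too high.

    A descriptor is watchable only if its number is below select_limit,
    because select() examines descriptors 0 .. nfds-1.
    """
    watchable = [fd for fd in fds if 0 <= fd < select_limit]
    too_high = [fd for fd in fds if fd >= select_limit]
    return watchable, too_high
```

A server could call split_by_select_limit() on each newly accepted descriptor and close (or hand off) anything in the second list instead of passing it to select().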
wunder@hpcea.CE.HP.COM (Walter Underwood) (04/16/88)
> One last thing. Does anyone know why the netstat program is so slow?

It takes a while to look up each hostname. "netstat -n" just prints the IP addresses ("numbers"), and runs much faster.

wunder
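The difference described in the reply above can be sketched as follows. This is not netstat's source, only an illustration of why a per-line reverse lookup is slow and a numeric mode is not; format_endpoint and its parameters are hypothetical names, and the "host:port" layout is chosen for the example rather than copied from netstat's output.

```python
import socket


def format_endpoint(ip, port, numeric=False, resolver=socket.gethostbyaddr):
    """Render one connection endpoint as 'host:port'.

    With numeric=True the address is printed as-is, with no lookup at all.
    Otherwise one reverse lookup is made per call, which is the step that
    can take a while when the name service is slow or unreachable.
    """
    if numeric:
        return f"{ip}:{port}"
    try:
        hostname, _aliases, _addrs = resolver(ip)
    except OSError:
        hostname = ip               # lookup failed: fall back to the number
    return f"{hostname}:{port}"
```

Passing a stub as resolver makes it easy to see the two paths without touching a real name server.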