robert@spam.istc.sri.com (04/15/88)
When y'all last heard from me I was bemoaning the loss of some
/usr/include/netinet header files. Thanks to your help I've obtained
many of the files I need and am continuing on my quest to port
code to the HP9000/300 from the Suns we've been using. In
the course of that I've encountered some more serious problems,
which I would like to mention to any HP people watching the group,
as well as to any users who may have encountered the same problems
I have. Thanks!
+++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
New problems
------------
I've been trying to port both applications which use sockets
for IPC and the gated router. The latest set of problems relates
to application porting and development, as I have not gotten far
enough into the gated port yet.
I've encountered problems with using sockets on the HP9000/300
running OS version 6.0. I've ported some software I developed which
starts a Server listening for Client connections, and about 32 Client
processes asynchronously. Although most of the Clients will connect
to the Server, there is a bug which causes the kernel to use up all of
the mbufs (kernel memory buffers) and wedges some portion of the
networking kernel. When this happens one can no longer telnet into the HP
over the ethernet, although one may still log onto the console of the HP
and telnet out. A side effect of this is that the NFS running on the
HP gets hung, so you cannot log in and get to your account there (login
wedges until you type CTRL-C). There is a message from the kernel
which says something about "out of mbufs" and "read error from network",
but since the rlogin window on my Sun to the HP exits immediately
I haven't been able to collect the exact error message. Since
people other than I use the HP I can't do the test at will, as we have
to reboot the HP each time it happens. Based on the error messages
and the results of the problem my guess is that the networking portion
of the kernel is wedging when it runs out (or thinks it runs out) of
buffers. Testing on my part is incomplete (we have our own HP on order
so I can crash it at will with impunity), but the bug only seems to
occur when I open more than 30 connections to the server. HP/UX does
not offer getdtablesize() (Sun 3.4 does), which returns the max number
of bits usable for the select() 'nfds' parameter, so I was assuming
32. The fact that you can accept more fds than you can select on is
a new feature which my software will have to deal with, but in the
meantime I'm crashing the system. By the way, my Client and Server
processes are definitely not sending too much data (thus causing
buffer loss), as they have worked before on VAXen, Suns, and sort of
on A/UX (Mac II). In case you're wondering why I'm opening up so
many Client connections, it's a torture test of the socket/file-descriptor
interface. In addition I'm testing for a bug under 4.2 where TCP
peers can become ESTABLISHED without the Client and Server becoming
established. This is a pretty serious bug, and it is impinging upon
my development effort, so I'm very open to input from anyone about
how I can resolve my problems (aside from not opening so many connections).
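One way to stop assuming 32 for 'nfds' is to track the highest descriptor
actually in use and pass that plus one to select(). The following is only
a sketch under assumptions: it presumes an fd_set-style select() interface
with FD_SETSIZE, which HP/UX 6.0 may or may not provide, and
build_read_set() is a hypothetical helper name, not part of any library.

```c
#include <sys/select.h>

/* Hypothetical helper (an assumption for illustration, not a library call).
 * Fills *set with the descriptors in fds[0..count-1] and returns the value
 * to pass as select()'s nfds argument (highest descriptor + 1).
 * Descriptors that are negative or >= FD_SETSIZE cannot be represented in
 * an fd_set, so they are left out and counted in *skipped. */
static int build_read_set(const int *fds, int count, fd_set *set, int *skipped)
{
    int i;
    int maxfd = -1;

    FD_ZERO(set);
    *skipped = 0;
    for (i = 0; i < count; i++) {
        if (fds[i] < 0 || fds[i] >= FD_SETSIZE) {
            (*skipped)++;          /* accepted, but not selectable */
            continue;
        }
        FD_SET(fds[i], set);
        if (fds[i] > maxfd)
            maxfd = fds[i];
    }
    return maxfd + 1;              /* 0 if nothing could be added */
}
```

A caller would pass the returned value as the first argument to select()
and would have to deal with any skipped descriptors some other way, for
example by refusing the connection.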
One last thing. Does anyone know why the netstat program is so slow?
It produces one line at a time on our 9000/300, almost as if it is
starting at the top of kmem and searching through a byte at a time.
Is this normal for the 300, or the 800?
Thanks again.
----------------------------------------------------------------
Robert Allen, robert@spam.istc.sri.com "It's the last of the
415-859-2143 (work phone, days) V8 Interceptors Max!"
----------------------------------------------------------------
wunder@hpcea.CE.HP.COM (Walter Underwood) (04/16/88)
One last thing. Does anyone know why the netstat program is so slow?
It takes a while to look up each hostname. "netstat -n" just prints
the IP addresses ("numbers"), and runs much faster.
wunder