[comp.unix.xenix] Xenix TCP/IP

wrp@biochsn.acc.Virginia.EDU (William R. Pearson) (10/26/89)

	I was unable to reply to the individual who asked for reports
about TCP/IP for Xenix, and this review may be of general interest.


	I am using Streamlined Networks TCP/IP for Xenix386 2.3.2.
It provides ftp, telnet, rcp, rlogin, rsh, and works with the WD8003e
board. Cost: $495.

	It works, but it isn't perfect.  I can rcp out to any machine,
and I can rsh into my machine, and I can ftp out, but I cannot ftp
in.  I can rcp into my machine while logged into a remote machine, e.g.

	sun> rcp file xenix:

But I cannot do:

	xenix> rcp sun:file .

(I can do:)

	xenix> rcp file sun:

I can rlogin out but not in.  From xenix I can ftp files in both
directions, but I cannot ftp into xenix.  (It says it only supports some
sort of anonymous login, but that doesn't even work.)

	One last problem: when I am rlogin-ed (or telnet-ed) from xenix
to a remote machine, the connection periodically seems to die for 15 to
60 seconds, and then it comes back.  Perhaps some timing parameter is
not set up properly.

	There was very little documentation about how to set it up
for my location, which has subnets and routers, but after I asked
our network guru, it was working quickly.  I also had some problems
figuring out how to get the WD8003e working on an inboard 386 computer,
but that wasn't their fault.

	Technical support is poor.  I have mentioned all of these problems
to them, but nothing has happened.  I don't think they like supporting
Xenix very much, as opposed to Unix 3.2.

	But it works, and I use it every day.  It cost less than SCO, but
I'm thinking of switching.

Bill Pearson

cpm@dlcq15.datlog.co.uk (Paul Merriman) (02/13/90)

I sent this a couple of weeks ago and no one has replied ;-(
If anyone even sees it please mail me - I have a feeling it didn't get out
to the Net last time. Eagerly awaiting your reply...


Hi, 
	I have encountered a couple of problems with Xenix 2.3.2 and Xenix
TCP/IP with Western Digital or 3Com cards and was wondering if anyone else out
there has had similar problems.

Background Information
----------------------

We have several sites using Unisys PW800s (386 PCs) and Unisys Xenix 2.3.2,
which is a Unisys licenced version of SCO Xenix 2.3.2, with Western Digital
network cards.

Problem 1)
---------

Occasionally we get a kernel panic as follows:-


TRAP 0000000E in SYSTEM, error code 06000000
eax=FF030202 ebx=00000000 ecx=4A000001 edx=00000030
esi=0008A204 edi=4A000001 ebp=06000620 fl=00010282
udc=00030018 es=00000018 fs=0003003F gs=0000003F
tr=00000100 pc=0090020:0001A12b ksp=060005B8

kernel: PANIC: non-recoverable kernel page fault



The machine has the following hardware

ram : 1Mbyte + 4Mbyte ram card
disk : 110Mbyte
network card : Western Digital 
serial I/O : Anvil Stallion card
O/S : SCO XENIX 2.3.2 beta release
Machine model : Unisys PW800/20


On other occasions a machine will just "die", with no accompanying panic 
message. We mentioned the problem to someone at SCO some time ago and they
came back with "it's a hardware error, reseat the memory boards". It has 
happened on several other machines since then so I think we can rule out
hardware, unless it's a real incompatibility ;-( !!

We haven't seen this panic on any other machines (e.g. Compaq), yet...

The problem "seems" to be network related - i.e. we were doing something 
intensive on the network at the time (e.g. a large rcp) though this may be
coincidence, or just not true!

Problem 2)
----------

This has been seen on the above Unisys machines with Western Digital network
card and a Compaq with 3Com card.

A number of processes which have socket connections to other machines break
their connections. It should be mentioned here that these processes use 
non-blocking writes and an alarm call to determine when to "give up" on the
write and break the connection. In one case you could not then connect to
the machine across the network (telnet, rlogin), though the machine is 
running and can be accessed from the console. In some cases the connections 
have managed to re-establish themselves some time later. 
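
The "non-blocking write plus alarm" pattern described above can be
sketched roughly as follows.  This is only an illustration in modern
POSIX C, not the actual application code: the function name
write_with_deadline, the 100 ms poll interval, and the use of sigaction()
and poll() are all assumptions, and a Xenix 2.3.2 program of the time
would more likely have used signal() and different headers.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <poll.h>
#include <signal.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/* Set by the SIGALRM handler when the deadline passes. */
static volatile sig_atomic_t gave_up;

static void on_alarm(int signo)
{
    (void)signo;
    gave_up = 1;
}

/*
 * Hypothetical helper: try to write len bytes to fd, giving up after
 * `seconds` (must be > 0).  Returns the number of bytes written, or -1
 * if setup failed.  A short count means the alarm fired or a hard error
 * occurred, and the caller would then break the connection.
 */
ssize_t write_with_deadline(int fd, const char *buf, size_t len,
                            unsigned seconds)
{
    struct sigaction sa, old_sa;
    struct pollfd pfd;
    size_t done = 0;
    int flags;

    assert(buf != NULL || len == 0);

    sa.sa_handler = on_alarm;
    sigemptyset(&sa.sa_mask);
    sa.sa_flags = 0;
    if (sigaction(SIGALRM, &sa, &old_sa) == -1)
        return -1;

    /* Switch the descriptor to non-blocking mode for the duration. */
    flags = fcntl(fd, F_GETFL, 0);
    if (flags == -1 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) == -1) {
        sigaction(SIGALRM, &old_sa, NULL);
        return -1;
    }

    gave_up = 0;
    alarm(seconds);

    pfd.fd = fd;
    pfd.events = POLLOUT;

    while (done < len && !gave_up) {
        ssize_t n = write(fd, buf + done, len - done);
        if (n > 0) {
            done += (size_t)n;
        } else if (n == -1 && (errno == EAGAIN || errno == EWOULDBLOCK ||
                               errno == EINTR)) {
            /* No room yet.  Wait briefly; the short timeout means
             * gave_up is re-checked even if the alarm fired just
             * before poll() was entered. */
            (void)poll(&pfd, 1, 100);
        } else {
            break;              /* hard error such as ECONNRESET */
        }
    }

    /* Cancel the alarm and restore the previous handler and flags. */
    alarm(0);
    sigaction(SIGALRM, &old_sa, NULL);
    fcntl(fd, F_SETFL, flags);
    return (ssize_t)done;
}
```

With a scheme like this, a peer or a local network stack that stops
draining data looks the same to the writer as a dead link: the alarm
fires, the count comes back short, and the process drops the connection.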

Unfortunately because of client pressure to get the systems up and running again
we have been unable to examine the problem "in situ" (or even in Halifax :-)!)
and have had to restart the machines (which clears the problem).

The SCO TCP/IP manual mentions an "attrition of resources" problem which they
have had reported but cannot reproduce - maybe this is it.

Some investigation using "rsh" showed that if you killed an rsh daemon on the
PC then subsequent telnet sessions would just hang - as if waiting to write
across the network. This was most noticeable if you tried to "cat" a long file
whilst telnetting to the PC (it would just hang half way through, but did
respond to the break key). This would tie in with "attrition of resources" -
presumably the telnet session would be waiting for resources (buffers?)
to become free.

We are currently trying to reproduce this sort of problem in a reliable manner
so that we can present this information to SCO; however until that time does
anyone else using Xenix TCP/IP have similar experiences to recount?
-- 
C. Paul Merriman        <cpm@datlog.co.uk> or < {backbone}!ukc!datlog!cpm >
                       Voice:  +44 1 863 0383 (x2153)

adnan@sgtech.UUCP (Adnan Yaqub) (02/16/90)

In article <1990Feb13.131255.3683@dlcq15.datlog.co.uk> cpm@dlcq15.datlog.co.uk (Paul Merriman) writes:

   Problem 1)
   ---------

   Occasionally we get a kernel panic as follows:-


   TRAP 0000000E in SYSTEM, error code 06000000
   eax=FF030202 ebx=00000000 ecx=4A000001 edx=00000030
   esi=0008A204 edi=4A000001 ebp=06000620 fl=00010282
   udc=00030018 es=00000018 fs=0003003F gs=0000003F
   tr=00000100 pc=0090020:0001A12b ksp=060005B8

   kernel: PANIC: non-recoverable kernel page fault

I have seen this also.  I assume you have tcp/ip 1.0.1d.  We were
trying to get things going over StarLAN and the WD driver was buggy.
We contacted WD and got a new driver.  We still get the panics, and
sometimes a message which says: "qenable would have been called with
NULL in wdsched() for XWAIT" and then a panic.  What module (use nm)
is at the pc above?  SCO told us they have an even more recent WD
driver than the one we got from WD.  They said they just fixed a bug on
Friday, February 9, 1990!
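
To make the "use nm" suggestion concrete, here is a rough sketch of the
lookup: feed it each line of an nm listing of the kernel and it keeps the
text symbol with the highest address that is still at or below the
faulting pc.  It assumes BSD-style nm lines of the form
"<hex address> <type> <name>", which may not match what Xenix nm prints,
and the names sym_match and consider_nm_line are made up for the example.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Best candidate seen so far: the closest text symbol at or below pc. */
struct sym_match {
    unsigned long addr;   /* start address of that symbol */
    char name[64];        /* symbol name as printed by nm */
    int found;            /* nonzero once any candidate has been seen */
};

/*
 * Hypothetical helper: examine one line of nm output and update *best
 * if the line names a text symbol that starts at or below pc and is
 * closer to pc than the current candidate.
 */
void consider_nm_line(struct sym_match *best, const char *line,
                      unsigned long pc)
{
    unsigned long addr;
    char type;
    char name[64];

    assert(best != NULL && line != NULL);

    /* Lines without an address (e.g. undefined symbols) do not parse. */
    if (sscanf(line, "%lx %c %63s", &addr, &type, name) != 3)
        return;
    if (type != 'T' && type != 't')
        return;                     /* only code symbols matter for a pc */
    if (addr > pc)
        return;                     /* symbol starts after the fault */
    if (best->found && addr < best->addr)
        return;                     /* already have a closer one */

    best->addr = addr;
    strcpy(best->name, name);       /* fits: both buffers are 64 bytes */
    best->found = 1;
}
```

Called once per line with pc set to the offset part of the trap's pc
value (0x0001A12b in the dump above), the struct ends up holding the
routine that contains the fault, and pc minus addr gives the offset into
it.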

   Problem 2)
   ----------

   This has been seen on the above Unisys machines with Western Digital network
   card and a Compaq with 3Com card.

   A number of processes which have socket connections to other machines break
   their connections. It should be mentioned here that these processes use 
   non-blocking writes and an alarm call to determine when to "give up" on the
   write and break the connection. In one case you could not then connect to
   the machine across the network (telnet, rlogin), though the machine is 
   running and can be accessed from the console. In some cases the connections 
   have managed to re-establish themselves some time later. 

We have a similar problem where our main host on the network goes
deaf (can send out packets but not receive them).  It seems to be load
related, i.e., it occurs when we have lots of activity into the
machine (4 or more telnet sessions).  I used the streams watch
utility, sw, but couldn't see anything unusual.

We have another problem here with SCO TCP/IP: one host, the main one,
spits out "Note: tcp sum: source <ip-address> sum <hex number>" every
now and again.  I assume that these are warnings that a packet has
been received with a TCP checksum error.  The scary thing is that the
network is very clean and the IP address of the source is sometimes
the IP address of another Xenix box on the network.
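
For reference, if the message really is a checksum complaint (that is
only my assumption), the arithmetic involved is the standard Internet
checksum: a one's-complement sum of 16-bit words, as described in RFC
1071.  The sketch below shows just that arithmetic.  It is not SCO's
code, the function names are invented for the example, and a real TCP
check also sums a pseudo-header built from the IP addresses, which is
left out here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * One's-complement Internet checksum over a byte buffer.  The 32-bit
 * accumulator is ample for anything the size of a TCP segment.
 */
uint16_t inet_checksum(const unsigned char *data, size_t len)
{
    uint32_t sum = 0;
    size_t i;

    assert(data != NULL || len == 0);

    /* Add the data as big-endian 16-bit words. */
    for (i = 0; i + 1 < len; i += 2)
        sum += ((uint32_t)data[i] << 8) | data[i + 1];

    /* An odd trailing byte is treated as the high byte of a last word. */
    if (len & 1)
        sum += (uint32_t)data[len - 1] << 8;

    /* Fold the carries back into the low 16 bits. */
    while (sum >> 16)
        sum = (sum & 0xFFFFu) + (sum >> 16);

    return (uint16_t)~sum;
}

/*
 * Receiver-side check: summing the data together with the transmitted
 * checksum field gives 0 from inet_checksum() when nothing was damaged.
 */
int segment_is_intact(const unsigned char *seg, size_t len)
{
    return inet_checksum(seg, len) == 0;
}
```

A single flipped bit anywhere in the data makes segment_is_intact()
return 0, so errors attributed to another Xenix box on a clean network
would point at the sending or receiving machine corrupting the data,
rather than at the wire.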

We have been told that the new TCP/IP code is in QA at SCO right now.
Our plan of attack is to try and get a copy of the new (newer :-) WD
driver and see if that helps things.  We have not tried 3com boards.
Maybe we should.  Also, it was suggested that we try doing some
telnets to ourselves (which uses the loopback driver) to see if the
problem is driver related or socket related.  (If it just weren't so
intermittent...)
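
The loopback idea can also be tried without telnet.  The sketch below
opens a TCP connection to 127.0.0.1 inside one process and pushes a few
bytes through it, so the socket and TCP layers are exercised while the
network card driver is not involved.  It is written against the modern
BSD socket API as an illustration only: the function name
loopback_roundtrip and the 256-byte limit are assumptions, and the
headers and calls available with Xenix TCP/IP 1.0.1d may differ.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <arpa/inet.h>
#include <netinet/in.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: send len bytes (at most 256) over a TCP
 * connection to 127.0.0.1 and read them back on the accepting side.
 * Returns 0 if every byte arrived intact, -1 otherwise.
 */
int loopback_roundtrip(const char *msg, size_t len)
{
    struct sockaddr_in addr;
    socklen_t alen = sizeof addr;
    int lfd = -1, cfd = -1, sfd = -1;
    char buf[256];
    size_t got = 0;
    ssize_t n;
    int result = -1;

    assert(msg != NULL && len <= sizeof buf);

    lfd = socket(AF_INET, SOCK_STREAM, 0);
    if (lfd == -1)
        return -1;

    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = 0;              /* let the kernel pick a free port */

    if (bind(lfd, (struct sockaddr *)&addr, sizeof addr) == -1 ||
        listen(lfd, 1) == -1 ||
        getsockname(lfd, (struct sockaddr *)&addr, &alen) == -1)
        goto out;

    /* The connect completes against the listen backlog, so a single
     * process can play both client and server. */
    cfd = socket(AF_INET, SOCK_STREAM, 0);
    if (cfd == -1 ||
        connect(cfd, (struct sockaddr *)&addr, sizeof addr) == -1)
        goto out;
    sfd = accept(lfd, NULL, NULL);
    if (sfd == -1)
        goto out;

    if (write(cfd, msg, len) != (ssize_t)len)
        goto out;
    shutdown(cfd, SHUT_WR);         /* reader sees EOF after the data */

    /* Read until EOF or until the buffer is full. */
    while ((n = read(sfd, buf + got, sizeof buf - got)) > 0)
        got += (size_t)n;

    if (n == 0 && got == len && memcmp(buf, msg, len) == 0)
        result = 0;
out:
    if (sfd != -1) close(sfd);
    if (cfd != -1) close(cfd);
    if (lfd != -1) close(lfd);
    return result;
}
```

If something like this, run in a loop under load, still shows hangs or
lost data, the driver for the card is an unlikely culprit; if it runs
cleanly while real network sessions die, suspicion shifts back to the
WD driver.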

I hope this rambling helps.  You have my sympathy.
--
Adnan Yaqub
Star Gate Technologies, 29300 Aurora Rd, Solon, OH, 44139, USA, +1 216 349 1860
[...cwjcc!ncoast ...uunet!abvax ...ism780c ...sco ...mstar]!sgtech!adnan