[comp.unix.xenix] 386 Xenix 2.3.1/vpix/Compaq 20 Problem. Help!

clewis@ecicrl.UUCP (Chris Lewis) (04/06/89)

One of our clients is having a severe system problem and maybe someone
out there can help.  SCO/Computone sure can't.

System:
	Compaq 20 with 3Mb.
	386 Xenix 2.3.1
	1.1 (I think) vpix & DOS Symphony
	Word Perfect 4.2 (Xenix version)
	Cyma (some sort of RMCOBOL accounting package)
	Wangtek 60Mb tape drive.
	Advantage-8 (Computone/Intellicom/Intelliport etc.) 8 port
		serial card.
	Several Wyse 60's/NEC 890 and some sort of Epson (all serial)
	monochrome adapter of some sort.

Problem: 

The severe problem is a complete system freezeup, ranging from 
once or twice a week to 5 or 6 times a day.  The disk is apparently
not frozen, because it flashes a bit afterwards and the reboot's fsck
doesn't complain much (if at all), but the main console and all Wyses are
hung.  Not even caps lock on the main console toggles the indicator.  I've
tried (as suggested on the net) to disconnect and reconnect the keyboard
but no dice.  Activity at the time: not much....  System will hang with
as few as two users, one on console, one on a Wyse.  This *may* be correlated with
vpix/Symphony usage, but it's difficult to get a clear picture of what's
going on.  [Our original contact-person at the client's site has left,
and the new one isn't as knowledgeable or helpful, and is considerably
less patient... ;-(]   The one thing that does seem to be common is that
someone's on one of the Computone ports (mind you it's hard to get two
people on without at least using one!)

I suspect that this is either a vpix problem, or there isn't enough memory
for buffering.  Strangely enough, there doesn't appear to be all 
that much disk activity when it hangs.  Does anybody have enough experience
with vpix on Xenix to know whether complete system hangs are common?  Are
freezeups occurring on Xenix without vpix installed?  How does the system
behave when there really isn't enough memory?  You'd expect the disk to
go wild, but that doesn't seem to be the problem here.  Unfortunately, this
version of Xenix doesn't appear to have sar.
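
Without sar, one rough way to get a clearer picture of when the hangs
occur is a heartbeat log: have cron append a timestamp to a file every
minute, then look for gaps after each reboot.  The sketch below shows
only the gap-finding step.  It is an illustration under assumptions,
not something tried on this system: the function name, the one-minute
interval and the slack value are all hypothetical, and Python is used
purely for readability.

```python
def find_gaps(timestamps, interval=60, slack=30):
    """Return (last_seen, next_seen) pairs where heartbeats went missing.

    timestamps: heartbeat times in seconds (e.g. Unix epoch values).
    interval:   assumed spacing between heartbeats, in seconds.
    slack:      assumed tolerance for a late heartbeat, in seconds.
    """
    gaps = []
    ordered = sorted(timestamps)
    # Compare each heartbeat with the one that follows it.
    for prev, cur in zip(ordered, ordered[1:]):
        if cur - prev > interval + slack:
            gaps.append((prev, cur))
    return gaps


if __name__ == "__main__":
    # Heartbeats every 60 seconds, with one long silence after t=120.
    for last_seen, next_seen in find_gaps([0, 60, 120, 600, 660]):
        print("no heartbeat between", last_seen, "and", next_seen)
```

The first value of each pair approximates when the system froze, which
could then be compared against who was logged in and what was running.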

We had suspected that it was something to do with the Computone, for when
I set "compatible: main" (disable line discipline bypass) on all serial
ports the hang stopped happening for a couple of weeks.  Now it's back with 
a vengeance.   Computone said "increase clists".  Right - the hang was
happening with *2* users.  We've heard unsubstantiated rumors that 2.3.1
with Computones hangs on an AST Premium, and that switching to an Intellicon
(not related to Computone) did *not* help - don't know whether that system
had vpix or not.  SCO didn't even know that Computones needed to have new
drivers installed when going from 2.2.1 to 2.3.1 - or that there was
a different Compaq-20 driver.  Or, *maybe* there's more than one type of
hang?

There is a marginal possibility that this hang is related to the Wangtek
problem - Wangtek-60's do cause system hangs when lots of disk activity
is occurring - we've proven that on several machines, but in this case
the Wangtek isn't being used at all!

Our other less pressing problem is: occasionally, averaging a couple of 
times per month, the system panics with "page fault in kernel".  Is this
a known problem?

Help!

Thanks,
-- 
Chris Lewis, Markham, Ontario, Canada
{uunet!attcan,utgpu,yunexus,utzoo}!lsuc!ecicrl!clewis
Ferret Mailing list: ...!lsuc!gate!eci386!ferret-request
(or lsuc!gate!eci386!clewis or lsuc!clewis)

kessler%cons.utah.edu@wasatch.utah.edu (Robert R. Kessler) (04/07/89)

In article <226@ecicrl.UUCP> clewis@ecicrl.UUCP (Chris Lewis) writes:
>One of our clients is having a severe system problem and maybe someone
>out there can help.  SCO/Computone sure can't.
>
> [Description of system]
>
>The severe problem is a complete system freezeup, ranging from 
>once or twice a week to 5 or 6 times a day.  The disk is apparently
> [ Further detailed description]

I don't know if this will help or not, but we had a similar complete
system lockup a few months back.  They have a PS/2 and ARNET board, so
I'm not sure that the problems are the same (plus they didn't and
still don't have 2.3 yet).  The way we had implemented the system,
all users logged in as the same user.
Once they got in, our own software took over and we did per login
verification.  This caused random hangups, as often as a few times per
day.  It didn't seem to be related to the number of users on the
system either.  Well, it turns out that the problem was the same
login.  Changing our system to use the gettydefs login, where the
default login is the terminal id AND making all "terminal id" users
have groups that were not root, solved the problem.  We did have to
make both changes, just making new logins didn't help since they were
all root.  (We realized the implications of everyone being root, but
it doesn't matter for our application, since our software was
protecting everyone from messing with the system).  Anyway, this
solved the problem and they no longer hang.

We are still having problems with certain programs crashing with some
kind of memory error which causes a core dump at high utilization
times.  However, this only crashes one program, not all users.  They
are getting more memory for the machine, which might solve the
problem. 

Hope this helps.

B.

craigp@summus.UUCP (craigp) (04/07/89)

In article <226@ecicrl.UUCP>, clewis@ecicrl.UUCP (Chris Lewis) writes:
> One of our clients is having a severe system problem and maybe someone
> out there can help.  SCO/Computone sure can't.

> Problem: 
> 
> The severe problem is a complete system freezeup, ranging from 
> once or twice a week to 5 or 6 times a day.  The disk is apparently

> The one thing that does seem to be common is that
> someone's on one of the Computone ports (mind you it's hard to get two
> people on without at least using one!)
> 

We have a client which was having similar problems.  I believe he
had an AST multi-port card.  The entire system would lock up
(at least serial I/O would, though I believe that the system was
in fact still running) at times.

It only happened when someone was using an AST port.  The cables
between the terminals and PC's were quite long, so we shortened them
and the problem hasn't recurred since.

If your terminals are some distance from the PC, try moving them
closer. (I don't know what the distance limitations on serial cables
are but I believe it's less than 75 feet).
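
As a rough check on that distance figure: the RS-232-C standard
recommended a maximum cable length of 50 feet, and later revisions
express the limit as a maximum load capacitance of 2500 pF instead.
The sketch below converts a cable's capacitance per foot into an
approximate maximum run.  The function name and the 50 pF/ft example
value are assumptions for illustration, not measurements from any of
the installations discussed here.

```python
RS232_MAX_LOAD_PF = 2500.0  # maximum load capacitance in later RS-232 revisions


def max_cable_length_ft(cable_pf_per_ft, limit_pf=RS232_MAX_LOAD_PF):
    """Approximate the longest cable run that stays within the load limit.

    cable_pf_per_ft: capacitance of the cable in picofarads per foot
                     (take this from the cable's data sheet).
    Receiver input capacitance is ignored here, so the real limit is
    somewhat shorter than the value returned.
    """
    if cable_pf_per_ft <= 0:
        raise ValueError("capacitance per foot must be positive")
    return limit_pf / cable_pf_per_ft


if __name__ == "__main__":
    # Assumed example value: a cable rated at 50 pF per foot.
    print(max_cable_length_ft(50.0), "feet")
```

Lower-capacitance cable and lower baud rates generally tolerate longer
runs, so the result is a guideline rather than a hard cutoff.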

Good Luck!