[comp.unix.xenix] Xenix 2.1.3 slowdown on 286 clone

compata@cup.portal.com (05/05/88)

Several of my customers are running Xenix 2.1.3 on a 1 MB AT clone.  They have
complained that the system seems to begin running more slowly after it has
been up for several days.  Even to the point where response to a haltsys
requires ten minutes.  The problem can be easily cured by rebooting.

I would consider suggesting that they upgrade to 2.2 if I can be confident
that it will fix the problem.  Is this a known problem with 2.1.3?  Is it
fixed in 2.2?  Is it perhaps caused by some data structure accumulating,
eventually filling memory?

I know that an upgrade would require that they also purchase more memory,
at least 1 more MB.  They are running a MS-COBOL application from 3-5
concurrent terminals, so the problem could also be in COBOL.  It certainly
does have problems.  Does anyone have any suggestions?

Dave Close, Compata, Arlington, Texas
email to compata@cup.portal.com or sun!cup.portal.com!compata
telex to 6295-5830

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (05/11/88)

In article <5114@cup.portal.com> compata@cup.portal.com writes:
| Several of my customers are running Xenix 2.1.3 on a 1 MB AT clone.  They have
| complained that the system seems to begin running more slowly after it has
| been up for several days.  Even to the point where response to a haltsys
| requires ten minutes.  The problem can be easily cured by rebooting.

I run as long as 20-30 days between boots on both 2.1.3 (286) and 2.2.1
(386) systems. I see no slowdowns. I think you have something else going
on.

-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

stan@sdba.UUCP (Stan Brown) (05/13/88)

> In article <5114@cup.portal.com> compata@cup.portal.com writes:
> | Several of my customers are running Xenix 2.1.3 on a 1 MB AT clone.  They have
> | complained that the system seems to begin running more slowly after it has
> | been up for several days.  Even to the point where response to a haltsys
> | requires ten minutes.  The problem can be easily cured by rebooting.
> 
> I run as long as 20-30 days between boots on both 2.1.3 (286) and 2.2.1
> (386) systems. I see no slowdowns. I think you have something else going
> on.
> 
> -- 
> 	bill davidsen		(wedu@ge-crd.arpa)
>   {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
> "Stupidity, like virtue, is its own reward" -me

	You might look for a loose cable to a terminal.  Historically
	on UNIX(tm) machines a loose wire here (which can send random noise 
	in the port)  will drastically slow down the system as each character 
	(or what the system thinks are characters) will have to be looked
	at  by the kernel.

	Again just a thought.


-- 
Stan Brown	S. D. Brown & Associates	404-292-9497
gatech!sdba!stan
	"vi forever"

Dave_H_Close@cup.portal.com (05/13/88)

The following is a summary of replies received in response to my article.

===============================================================================
===============================================================================

Original article: <5114@cup.portal.com>

Several of my customers are running Xenix 2.1.3 on a 1 MB AT clone.  They
have complained that the system seems to begin running more slowly after it
has been up for several days.  Even to the point where response to a haltsys
requires ten minutes.  The problem can be easily cured by rebooting.

I would consider suggesting that they upgrade to 2.2 if I can be confident
that it will fix the problem.  Is this a known problem with 2.1.3?  Is it
fixed in 2.2?  Is it perhaps caused by some data structure accumulating,
eventually filling memory?

I know that an upgrade would require that they also purchase more memory,
at least 1 more MB.  They are running a MS-COBOL application from 3-5
concurrent terminals, so the problem could also be in COBOL.  It certainly
does have problems.  So what would you suggest?

===============================================================================

I ran SCO Xenix 2.1.3 for an extended period of time before upgrading
to 2.2.1 and never saw such a phenomenon, so I would guess that it is
your COBOL system. Does it have a driver built in? Does it leave something
running all the time that might have a core leak? If so, then every time
it gets swapped out, approximately 700K is being written to then read
back from the disk. Try this: The next time the system starts slowing
down, do a "ps -elf" and find processes that are taking lots of memory
and have been running for a long time, and ask the owning users to
exit them, then restart them. If that does the trick, then talk to
the people who supplied the software and ask them to fix the bugs...
if that does not help, then there is perhaps a bug in the xenix. But
I expect that you'll find a memory losing long running program to be
at fault.
Good luck!

Jay Libove
Arpa:   Jay.Libove@andrew.cmu.edu	Bitnet: Jay.Libove@drycas.bitnet
UUCP:   ...!{uunet, ucbvax, harvard}!andrew.cmu.edu!Jay.Libove
UUCP:   ...!{pitt | bellcore} !darth!libove!libove
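
The "ps -elf" check suggested above can be sketched roughly as follows.
This is only an illustration, not a tested Xenix tool: it assumes the
listing has a header line naming SZ and TIME columns and that no field
except the trailing CMD contains spaces.  The sample listing, the user
names and the "runcbl" command are all made up, and the units of SZ
differ between systems.  TIME is accumulated CPU time, used here only as
a rough stand-in for "has been running for a long time".

```python
# Hypothetical listing in the general shape of "ps -elf" output.
SAMPLE = """
 F S UID PID PPID C PRI NI ADDR SZ WCHAN STIME TTY TIME CMD
 1 S dave 212 1 0 30 20 3a1 48 f1c2 08:01:12 1a 0:03 -sh
 1 S ann 340 212 0 30 20 4b0 690 f2d4 08:05:40 2a 41:17 runcbl payroll
 1 S bob 355 1 0 30 20 5c2 120 f3e0 09:12:03 3a 0:45 runcbl payroll
"""


def parse_ps(text):
    """Turn header-plus-rows ps text into a list of dicts keyed by column name."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    header = lines[0].split()
    rows = []
    for ln in lines[1:]:
        # CMD is the last column and may contain spaces, so limit the split.
        fields = ln.split(None, len(header) - 1)
        if len(fields) == len(header):
            rows.append(dict(zip(header, fields)))
    return rows


def cpu_seconds(time_field):
    """Convert a TIME field such as '41:17' or '1:02:03' to seconds."""
    secs = 0
    for part in time_field.split(":"):
        secs = secs * 60 + int(part)
    return secs


def suspects(rows, min_size, min_cpu_seconds):
    """Rows that are both large (SZ) and long-running (TIME), largest first."""
    hits = [
        r for r in rows
        if int(r["SZ"]) >= min_size and cpu_seconds(r["TIME"]) >= min_cpu_seconds
    ]
    return sorted(hits, key=lambda r: int(r["SZ"]), reverse=True)


for row in suspects(parse_ps(SAMPLE), min_size=500, min_cpu_seconds=600):
    print(row["PID"], row["UID"], row["SZ"], row["TIME"], row["CMD"])
```

The thresholds are arbitrary.  In practice one would compare listings
taken shortly after a reboot and again once the slowdown has set in, and
look for a process whose size keeps growing.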

===============================================================================

Dave,
	You have two problems.  One is the lack of main memory.  For my
customers I make sure that there is at least 1/2 meg of memory for each
user, plus 1 meg for the system.

	The other problem is a little more difficult to fix.  I suspect
that you did not allocate enough swap space on the disk.  As a rule of
thumb the swap space should be at least 2 times the amount of memory
the computer has.  The reason the haltsys is taking so long is because
the swap space is getting extremely fragmented.

	The computer itself should be able to handle the load you
are putting on it.  You do not have to upgrade to 2.2, but it would be a
good idea.  2.2 solves some problems and is also a little faster (?).

	I mailed this to you because I didn't want to clutter up the
net.  Please summarize all the responses you get and post them.  I would
be interested in hearing about any other possibilities.

Jonathan Bayer
Intelligent Software Products, Inc.
19 Virginia Ave.
Rockville Centre, NY   11570
...uunet!ispi!jbayer
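
The two rules of thumb in the reply above (1/2 MB per user plus 1 MB
for the system, and swap of at least twice main memory) reduce to simple
arithmetic.  A minimal sketch, with function and parameter names chosen
here purely for illustration:

```python
def recommended_memory_mb(users, per_user_mb=0.5, system_mb=1.0):
    """Main memory by the rule of thumb: system share plus a share per user."""
    return system_mb + users * per_user_mb


def recommended_swap_mb(memory_mb, factor=2):
    """Minimum swap by the rule of thumb: a multiple of main memory."""
    return factor * memory_mb


for users in (3, 5):
    mem = recommended_memory_mb(users)
    print(users, "users:", mem, "MB memory,", recommended_swap_mb(mem), "MB swap")
```

By this rule, 3 to 5 users call for 2.5 to 3.5 MB of memory, well above
the 1 MB in the machines described, and even the existing 1 MB would
call for at least 2 MB of swap.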

===============================================================================

From: sun!uunet.UU.NET!idsnh!ben

Some applications create many temporary files in /usr/tmp.
When the system reboots, these files are deleted through the action of
the system running the script in /etc/rc.  The application should
remove these temporary files as a matter of course, but there may be
some permission problems because of the way the application is invoked.

Next time look in the /usr/tmp directory and see who owns the files. 
Maybe you can track it from there.
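
One way to do the check suggested above is to tally the files in the
directory by owner.  The sketch below is an assumption-laden example
rather than anything from the original reply: it looks only at regular
files directly inside the given directory, reports numeric user IDs, and
takes the directory as a parameter instead of hard-coding /usr/tmp.

```python
import os
import stat
from collections import Counter


def files_by_owner(directory):
    """Count regular files and total bytes per owning uid (not recursive)."""
    counts = Counter()
    sizes = Counter()
    for name in os.listdir(directory):
        # lstat so that symbolic links are not followed.
        st = os.lstat(os.path.join(directory, name))
        if stat.S_ISREG(st.st_mode):
            counts[st.st_uid] += 1
            sizes[st.st_uid] += st.st_size
    return counts, sizes


def report(directory):
    """Print one line per owner, largest total size first."""
    counts, sizes = files_by_owner(directory)
    for uid, total in sizes.most_common():
        print("uid", uid, "files", counts[uid], "bytes", total)
```

Calling report("/usr/tmp") once the slowdown appears would show whether
one user, or the account the application runs under, owns most of the
leftover files.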

===============================================================================

From: uunet!mcl!stacy (Stacy L. Millions)

I don't know about the particular problem that you are
describing, but for 3 to 5 simultaneous users you need
more than 1MB of RAM.  Add at least 1 more MB; I would
recommend 3 more if you are using something like COBOL.

---
"IBM Personal System/2. It's like having 256,000 crayons in one box."
    For those of you who are still doing your business reports with crayons!

S. L. Millions                                            ..!uunet!mcl!stacy 

===============================================================================

I run as long as 20-30 days between boots on both 2.1.3 (286) and 2.2.1
(386) systems. I see no slowdowns. I think you have something else going
on.

-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

===============================================================================
===============================================================================

Thanks to all who replied.  It seems clear that the problem is not Xenix.
It may be MS-COBOL or it could be the swap space allocation (installation
default).  BTW, 1 MB does not generally seem to be a problem in this case.
I understand that 3-5 users running vi, cc, etc., would need much more.  But
these users are all running the same program, consisting of a shared COBOL
runtime interpreter and small pseudo-code files.  Of course, if there is
some sort of "core leak", then more memory might postpone the appearance of
the slowdown, but not necessarily eliminate it.  SCO recommended 1 MB as
enough for this application.

Dave Close, Compata, Arlington, Texas
email:  compata@cup.portal.com or sun!cup.portal.com!compata
telex:  6295-5830

abcscnge@csuna.UUCP (Scott "The Pseudo-Hacker" Neugroschl) (05/18/88)

In article <235@sdba.UUCP> stan@sdba.UUCP (Stan Brown) writes:
:> In article <5114@cup.portal.com> compata@cup.portal.com writes:
:> | 
:> | [stuff about slow system]
:> |

:
: [stuff about loose cables]
:

A good way to check for this (if the serial ports are running getty) is
to check for an unusually large wtmp file.


-- 
Scott "The Pseudo-Hacker" Neugroschl
UUCP:  ...!ihnp4!csun!csuna!abcscnge
-- "They also surf who stand on waves"
-- Disclaimers?  We don't need no stinking disclaimers!!!
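
The wtmp check suggested above amounts to looking at the file's size.
The idea, as I read it, is that a noisy line makes getty restart over
and over, and each restart adds a record to wtmp.  A minimal sketch: the
path and the limit are parameters because the location of wtmp differs
between systems and what counts as "unusually large" depends on how
often the file is trimmed.  Both are assumptions to be filled in
locally.

```python
import os


def check_wtmp(path, limit_bytes):
    """Return (size_in_bytes, True if the file is larger than limit_bytes)."""
    size = os.path.getsize(path)
    return size, size > limit_bytes


def growth(path, previous_size):
    """Bytes added since an earlier measurement of the same file."""
    return os.path.getsize(path) - previous_size
```

Recording the size once a day and watching growth() is probably more
telling than any single reading: a file that grows quickly while nobody
is logging in or out points at a port that keeps respawning getty.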

terry@wsccs.UUCP (Every system needs one) (05/28/88)

In article <235@sdba.UUCP>, stan@sdba.UUCP (Stan Brown) writes:
> > In article <5114@cup.portal.com> compata@cup.portal.com writes:
> > | Several of my customers are running Xenix 2.1.3 on a 1 MB AT clone.  They have
> > | complained that the system seems to begin running more slowly after it has
> > | been up for several days.  Even to the point where response to a haltsys
> > | requires ten minutes.  The problem can be easily cured by rebooting.

There are a number of possible problems, the most likely of which is
probably the creation of process ID's.  We have the same problem on
the Ultrix system I'm writing this from.  After a while, system performance
goes way down and anyone trying to log in via a LAT terminal gets the
message "Insufficient node resources".  Finally tracked this down to
having PID's on interactive sessions "too far apart".  The solution?
kill all the getty's and let init restart them.  Problem goes away and
LAT logins are allowed again.  Sheesh!  I can live with the stupid parity
and virtual device bugs, but this is ridiculous!

> > 
> > I run as long as 20-30 days between boots on both 2.1.3 (286) and 2.2.1
> > (386) systems. I see no slowdowns. I think you have something else going
> > on.

If you are not running a lot of users, the explanation above explains the
discrepancy.

Another possibility is:  Are your customers using "Smart" ports? These are
multiport cards which offload serial I/O processing to "speed up" the
CPU.  In someone's infinite wisdom, the size of the clist structs went
and changed from SCO 2.1.3 to 2.2.x (the modern version of the OS :-).
If you are running 2.2.x drivers with 2.1.3, you may have some kernel
tromping going on.

> 	You might look for a loose cable to a terminal.  Historically
> 	on UNIX(tm) machines a loose wire here (which can send random noise 
> 	in the port)  will drastically slow down the system as each character 
> 	(or what the system thinks are characters) will have to be looked
> 	at  by the kernel.

Nice try, but SCO put out the message "line noise on tty1A, shutting down
port", at least on my system.  Besides, if that were the trouble, it would
happen immediately, not gradually build up.  Even worse, if they are using
correct smartports and drivers, each garbage character in would NOT cause
an interrupt.

| Terry Lambert           UUCP: ...{ decvax, ihnp4 } ...utah-cs!century!terry |
| @ Century Software        OR: ...utah-cs!uplherc!sp7040!obie!wsccs!terry    |
| SLC, Utah                                                                   |
|                   These opinions are not my company's, but if you find them |
|                   useful, send a $20.00 donation to Brisbane Australia...   |
| 'Admit it!  You're just harassing me because of the quote in my signature!' |