[comp.unix.ultrix] Paging problem on Decstaion 3100

nakamoto@joplin.mpr.ca (Alan Nakamoto) (03/29/90)

I am currently trying a Beta version of some software for the Decstation
3100. On the instructions, it tells about a problem in the paging
algorithm in Ultrix which causes virtual memory accesses that will put
the machine into some kind of paging loop. When I tried the software,
some of the things that I tried which would require virtual memory
access did cause this problem.

Looking at the monitor program revealed many page ins but few page outs.
The disk was spinning away with little (0.4%) CPU activity being
recorded on the DECstation.
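
The symptom described above (many page-ins, few page-outs, a nearly idle CPU) can be turned into a rough check on sampled counters. The sketch below is only an illustration: the function name, the thresholds, and the sample counts are assumptions; only the 0.4% CPU figure comes from the post.

```python
def looks_like_paging_loop(page_ins, page_outs, cpu_pct,
                           ratio_threshold=10.0, cpu_threshold=5.0):
    """Return True when page-ins dwarf page-outs while the CPU sits nearly idle.

    Both thresholds are illustrative assumptions, not values from Ultrix.
    """
    if page_ins <= 0:
        return False  # no paging activity at all, so no paging loop
    # Guard against division by zero when nothing was paged out.
    ratio = page_ins / max(page_outs, 1)
    return ratio >= ratio_threshold and cpu_pct <= cpu_threshold


# Hypothetical sample: counts are made up; 0.4 is the CPU figure quoted above.
suspicious = looks_like_paging_loop(page_ins=500, page_outs=3, cpu_pct=0.4)
```

A real monitor would sample these counters repeatedly and only raise the flag when the pattern persists for several intervals.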

My question is: has anybody experienced this problem, heard of it, or
heard of a solution?  I vaguely recall that there was some discussion of
a similar problem in comp.sys.dec or comp.unix.ultrix but don't recall
what became of it. If someone has a record of this discussion or any
other info, it would be greatly appreciated if they could e-mail me.

Thanks,

Alan Nakamoto
Pacific Microelectronics Centre
Burnaby BC. (604)-293-6052.  ...!uunet!ubc-cs!mpre!nakamoto

grr@cbmvax.commodore.com (George Robbins) (03/29/90)

In article <2112@kiwi.mpr.ca> nakamoto@joplin.mpr.ca (Alan Nakamoto) writes:
> 
> 
> I am currently trying a Beta version of some software for the Decstation
> 3100. On the instructions, it tells about a problem in the paging
> algorithm in Ultrix which causes virtual memory accesses that will put
> the machine into some kind of paging loop. When I tried the software,
> some of the things that I tried which would require virtual memory
> access did cause this problem.

There are allegedly some problems with the Ultrix paging mechanism such
that it doesn't page effectively when there isn't enough memory.  It's not
clear exactly what the DEC position on the problem is, but the only effective
workaround I've heard of is to add more memory.  This would typically mean
expansion to 24MB for a windowed environment, perhaps 16MB otherwise.

It might be that the problem is addressed in Ultrix 3.1C which runs on the
3100 but isn't "supported" in the 3100/UWS environment.  On the other hand,
it might not make a bit of difference.

As far as I know the only effective option is to add memory, though if you
have software support, you might attempt to log it as a critical problem
(software problem makes it impossible to effectively utilize system) and see
if you get any satisfaction.

-- 
George Robbins - now working for,     uucp:   {uunet|pyramid|rutgers}!cbmvax!grr
but no way officially representing:   domain: grr@cbmvax.commodore.com
Commodore, Engineering Department     phone:  215-431-9349 (only by moonlite)

avolio@decuac.dec.com (Frederick M. Avolio) (04/07/90)

[ I really wanted to fix the spelling of DECstation in the Subject but then
rn subject threads would break.... :-) ]
Look, it's probably not smart to get back in this, but, simply put, the problem
is being worked on.  Sorry for sounding high-handed before.  Just don't
like being personally insulted over a comm problem.  (I did try to cancel
the article, but through the magic of nntp....)  Anyway, the problem which
some very small number of customers have seen is being worked on, has been
worked on, and will continue to be worked on until it is fixed.  
Problems that only a very very few people see are often very hard to
fix.  

By the way, if anyone in customer support does give an unsatisfactory
answer or if you think it is flip or uncaring, you can always ask to speak
to the manager on duty.  

Fred Avolio (or whatever)
going back to my real job...

hagan@DCCS.UPENN.EDU (John Dotts Hagan) (04/11/90)

% I'm occasionally seeing what might be similar paging problems on my DS5800,
% which has 32MB of memory.  What you see is the load average starts to spike
% over the course of several minutes and eventually everything hangs.  Sometimes
% it seems to break loose again, sometimes not.
% 

That is the symptom we see, as well, and that is what I reported as a
"paging problem", due to the looks of the ps awwxl output.

> I don't know if this is really the "paging problem" or something else.  We had
> some similar problems, apparently caused by use of the 3.1C csh where csh
> bugs would make it try to allocate all available memory.  The 3.1C mandatory
> patches replace this shell as the default 'csh', but I still have to make
> it available to users who have been accustomed to having some kind of command
> completion shell available for ultrix.
> 
> I'll have to try and get these patches and see if they offer some improvement.
> 

Interesting.  I have seen some tasks seem to dump core when started, like:
% emacs
Segmentation Fault
(core dumped)
%
but the core file is really from /bin/csh, not emacs!


> In the meantime, does anybody know how to reliably force a dump on a 5800?
> 
> I've tried a couple of times, but been screwed by the non-fixed location of
> "doadump" on the RISC stuff, and now that I finally have the number taped to
> my console, I find that saying "go 0x...." doesn't seem to work!  Perhaps mips
> code expects more context out of the caller?  If so, it should still be
> possible to have some (hopefully) fixed address in the assembly code section of
> the kernel that you can "go to" that tries to fix up the context enough to
> call doadump().
> 

I mentioned this problem in my previous posting as well.  In about 1 out
of 5 lockups, when we try the go <doadump> thing, it works.

--Kid.

paul@speedmetal.engin.umich.edu (Paul Killey) (04/12/90)

In article <23066@netnews.upenn.edu> hagan@DCCS.UPENN.EDU (John Dotts Hagan) writes:
>
>Interesting.  I have seen some tasks seem to dump core when started, like:
>% emacs
>Segmentation Fault
>(core dumped)
>%
>but the core file is really from /bin/csh, not emacs!

If you say (emacs) or emacs & or run emacs from /bin/sh you're all
set.  I think because then you get forked, and not vforked.

We get the segmentation fault when trying to run emacs all the time,
but no core file.  Our fix is to have people invoke it one of the ways
indicated above.
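
For readers unfamiliar with the distinction guessed at above: fork() gives the child its own copy of the parent's address space, while vfork() lets the child borrow the parent's memory until it calls exec or exits. The sketch below shows only the plain fork-then-exec pattern, in Python for brevity. Python exposes no vfork, `run_forked` is a hypothetical helper, and nothing here is taken from the Ultrix csh source.

```python
import os


def run_forked(argv):
    """Run argv in a child created with fork(), then wait for it.

    Returns the child's exit status, or -1 if it was killed by a signal.
    """
    pid = os.fork()  # the child gets its own copy of the address space
    if pid == 0:
        try:
            os.execvp(argv[0], argv)  # replace the child with the program
        except OSError:
            os._exit(127)  # exec failed; exit without running cleanup handlers
    _, status = os.waitpid(pid, 0)  # parent blocks until the child finishes
    if os.WIFEXITED(status):
        return os.WEXITSTATUS(status)
    return -1
```

Calling `run_forked(["emacs"])` would start the editor in a forked child and return its exit status once it quits.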

We also have the creeping load average till the system hangs at least
once a day on at least one of our machines.

--paul