[comp.unix.wizards] Crashes with 'vrelvm rss' on 4.2 vax with SUN NFS

dap@aber-cs.UUCP (Dave Price) (04/22/87)

I am suddenly having lots of crashes with
	panic vrelvm rss
on a 4.2 BSD with SUN NFS 3.0.1 kernel that has been
running for several months previously without crashes.
I have source and I've found the line of code that
generates the panic. It's obviously to do with
a process freeing less (or more?) memory than it tried to
when exiting. Any suggestions, please?
The machine is (as I said before) running a kernel that
has been stable for many months. We are, however, probably
now more dependent on YELLOW PAGES than before, and
we may have more filesystems mounted.
Dave Price

UUCP : { ENGLAND or WALES }!ukc!aber-cs!dap
JANET: dap@uk.ac.aber.cs           PHONE:    +44 970 3111 x 3267	
Post: University College of Wales, Penglais, Aberystwyth, UK, SY23 3BZ.


jim@cs.strath.ac.uk (Jim Reid) (04/24/87)

In article <19@aber-cs.UUCP> dap@aber-cs.UUCP () writes:
>I am suddenly having lots of crashes with
>	panic vrelvm rss
>on a 4.2 BSD with SUN NFS 3.0.1 kernel that has been
>running for several months previously without crashes.
>I have source and I've found the line of code that
>generates the panic. It's obviously to do with
>a process freeing less (or more?) memory than it tried to
>when exiting. Any suggestions, please?

A likely answer: Tony Begg at Brunel had this problem last year. I didn't
hear any more after he told me his fix, so I suppose he fixed it for
good. (We run NFS 2.0, from Sun rather than from the Instruction Set, and
we've not had any vrelvm rss crashes.)

Here's what he had to say:


Received: from uk.ac.rdg.onion by stracs.cs.strath.ac.uk; Mon, 25 Aug 86 16:31:37 +0100
Date-Received: Mon, 25 Aug 86 16:31:37 +0100
Received: from brueer.uucp by Onion.Cs.Reading.AC.UK with UUCP (Reading Mail System 3.11/3.22)
	id AA19048; Mon, 25 Aug 86 16:29:39 bst
From: Tony Begg <tony@uk.ac.brunel.ee>
Date: Mon, 25 Aug 86 14:28:37 GMT
Message-Id: <498.8608251428@Mars.ee.brunel.ac.uk>
To: E.M.Weston@UK.AC.rdg.am.uts, jim@uk.ac.strath.cs
Subject: Paging bug in NFS 3.0 on the VAX

Dear Jim and Elaine

I think I have traced the bug that is causing our NFS Vaxes to crash with
a "vrelvm rss" panic. Sun make more explicit use of the memory management
registers p1br/p1lr and introduce an item p_p1br in the proc structure to
allow this. The macro "sptopte", which takes a proc pointer and produces the
next stack pte, has been changed to calculate relative to p_p1br rather
than p_p0br. I believe the old way of calculating it should still be valid.
The two calculations produce results that differ by CLSIZE (2), and I believe
the Sun version is wrong. When ptes are scarce, the Sun calculation can result
in the last data pte being the same as the first stack pte. This was happening
for us with "lpd" and causing the crash: vrelvm tries to release the same pte
twice, does not do it the second time, and ends up with the wrong value for
the residual p_rssize, which should have been decremented to 0.
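
The failure mode described above can be illustrated with a toy model in C.
This is only a sketch and not the kernel code: toy_proc, toy_release,
toy_exit_residual, NPTE and the skew parameter are all hypothetical names
invented for the illustration, and the page table is reduced to an array of
valid flags. The only value taken from the message is CLSIZE (2). The model
assumes that a release pass skips slots that are already free, so a slot
shared by the data and stack regions is counted twice on the way in but
released only once on the way out.

```c
#include <assert.h>

#define CLSIZE 2    /* pages per cluster, the value quoted in the message */
#define NPTE   16   /* hypothetical size of the toy page table */

struct toy_proc {
    int valid[NPTE];    /* 1 if the slot currently maps a resident page */
    int rssize;         /* resident page count; expected to be 0 after exit */
};

/* Release the slots in [first, first + count), skipping slots already free. */
static void toy_release(struct toy_proc *p, int first, int count)
{
    int i;

    for (i = first; i < first + count; i++) {
        if (!p->valid[i])
            continue;       /* already released: nothing is decremented */
        p->valid[i] = 0;
        p->rssize--;
    }
}

/*
 * Build a process with dsize data pages (from slot 0 upward) and ssize
 * stack pages (ending skew slots below the top of the table), tear it
 * down, and return the leftover rssize. The skew stands in for the
 * assumed CLSIZE error in the stack pte calculation.
 */
static int toy_exit_residual(int dsize, int ssize, int skew)
{
    struct toy_proc p = { {0}, 0 };
    int sbase = NPTE - ssize - skew;    /* first stack slot */
    int i;

    assert(dsize >= 0 && dsize <= NPTE);
    assert(ssize >= 0 && skew >= 0 && sbase >= 0);

    /* Each page bumps rssize, even if its slot is already in use. */
    for (i = 0; i < dsize; i++) { p.valid[i] = 1; p.rssize++; }
    for (i = sbase; i < sbase + ssize; i++) { p.valid[i] = 1; p.rssize++; }

    toy_release(&p, 0, dsize);          /* data region */
    toy_release(&p, sbase, ssize);      /* stack region */
    return p.rssize;
}
```

With a full table, toy_exit_residual(12, 4, 0) returns 0, while
toy_exit_residual(12, 4, CLSIZE) returns 2 because two slots are shared.
With spare slots, toy_exit_residual(8, 4, CLSIZE) returns 0 again, which
matches the observation that the problem only shows when ptes are scarce.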

I have set HIGHPAGES (in /sys/vax/vmparam.h) to UPAGES (as with BSD) rather 
than (UPAGES+CLSIZE) as Sun have it. So far so good - I found by running a
paging benchmark and printing stuff using lpr that I could crash the Vax in
one or two attempts - with the new value of HIGHPAGES I have tried about 20
times and it's still up. Early days yet to say whether I have messed other 
stuff up, but I think I have found it.
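
As a rough sketch of why the HIGHPAGES change moves the stack ptes by
CLSIZE, assume the stack ptes are indexed downward from just below the
HIGHPAGES entries at the top of the page table. The function toy_stack_pte,
the two HIGHPAGES_* names and the value given to UPAGES here are
assumptions made for the illustration; they are not taken from vmparam.h
or from the Sun sources.

```c
#include <assert.h>

#define CLSIZE 2                        /* value quoted in the message */
#define UPAGES 8                        /* assumed value, illustration only */

#define HIGHPAGES_BSD UPAGES            /* setting described as the BSD one */
#define HIGHPAGES_SUN (UPAGES + CLSIZE) /* setting described as Sun's */

/*
 * Hypothetical index of the pte for stack page i in a table of npte
 * entries, counting down from just below the top highpages entries.
 */
static int toy_stack_pte(int npte, int highpages, int i)
{
    int slot = npte - highpages - 1 - i;

    assert(i >= 0 && slot >= 0);
    return slot;
}
```

Under these assumptions, toy_stack_pte(64, HIGHPAGES_BSD, 0) is 55 and
toy_stack_pte(64, HIGHPAGES_SUN, 0) is 53, so every stack pte sits CLSIZE
slots lower with the Sun setting.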
  
I hope this does the trick. If it does, thank Tony, not me. It was his fix.

		Jim

ARPA:	jim%cs.strath.ac.uk@ucl-cs.arpa, jim@cs.strath.ac.uk
UUCP:	jim@strath-cs.uucp, ...!seismo!mcvax!ukc!strath-cs!jim
JANET:	jim@uk.ac.strath.cs

"JANET domain ordering is swapped around so's there'd be some use for rev(1)!"