dap@aber-cs.UUCP (Dave Price) (04/22/87)
I am suddenly having lots of crashes with

	panic vrelvm rss

on a 4.2 BSD with SUN NFS 3.0.1 kernel that has been running for several months previously without crashes...

Any suggestions.... I have source and I've found the line of code that generates the panic. It's obviously to do with a process freeing less (or more?) memory than it tried to when exiting.... Any suggestions please....

The machine is (as I said before) running a kernel that has been stable for many months... We are probably however now more dependent on YELLOW PAGES than before and we may have more filesystems mounted....

Dave Price
--
UUCP : { ENGLAND or WALES }!ukc!aber-cs!dap
JANET: dap@uk.ac.aber.cs
PHONE: +44 970 3111 x 3267
Post: University College of Wales, Penglais, Aberystwyth, UK, SY23 3BZ.
jim@cs.strath.ac.uk (Jim Reid) (04/24/87)
In article <19@aber-cs.UUCP> dap@aber-cs.UUCP () writes:
>I am suddenly having lots of crashes with
>	panic vrelvm rss
>on a 4.2 BSD with SUN NFS 3.0.1 kernel that has been
>running for several months previously without crashes...
>Any suggestions.... I have source and I've found the line
>of code that generates the panic. It's obviously to do with
>a process freeing less (or more?) memory than it tried to
>when exiting.... Any suggestions please....

The answer. Tony Begg at Brunel had this problem last year. I didn't hear any more after he told me his fix, so I suppose he fixed it for good. (We run NFS2.0 - from Sun, not the Instruction Set - and we've not had any vrelvm rss crashes.) Here's what he had to say:

Received: from uk.ac.rdg.onion by stracs.cs.strath.ac.uk; Mon, 25 Aug 86 16:31:37 +0100
Date-Received: Mon, 25 Aug 86 16:31:37 +0100
Received: from brueer.uucp by Onion.Cs.Reading.AC.UK with UUCP (Reading Mail System 3.11/3.22) id AA19048; Mon, 25 Aug 86 16:29:39 bst
From: Tony Begg <tony@uk.ac.brunel.ee>
Date: Mon, 25 Aug 86 14:28:37 GMT
Message-Id: <498.8608251428@Mars.ee.brunel.ac.uk>
To: E.M.Weston@UK.AC.rdg.am.uts, jim@uk.ac.strath.cs
Subject: Paging bug in NFS 3.0 on the VAX

Dear Jim and Elaine

I think I have traced the bug that is causing our NFS Vaxes to crash with a "vrelvm rss" panic. Sun make more explicit use of the memory management registers p1br/p1lr and introduce an item p_p1br in the proc structure to allow this. The macro "sptopte", which takes a proc pointer and produces the next stack pte, has been changed to calculate relative to p_p1br rather than p_p0br. I believe the old way of calculating it should still be valid. The two calculations produce results that differ by CLSIZE (2), and I believe the Sun version is wrong.
When pte's are scarce, the Sun calculation can result in the last data pte being the same as the first stack pte. This was happening for us with "lpd": vrelvm tries to release the same pte twice, does not do it the second time, and ends up with the wrong value for the residual p_rssize, which should have been decremented to 0.

I have set HIGHPAGES (in /sys/vax/vmparam.h) to UPAGES (as with BSD) rather than (UPAGES+CLSIZE) as Sun have it. So far so good - I found by running a paging benchmark and printing stuff using lpr that I could crash the Vax in one or two attempts - with the new value of HIGHPAGES I have tried about 20 times and it's still up. Early days yet to say whether I have messed other stuff up, but I think I have found it.

I hope this does the trick. If it does, thank Tony, not me. It was his fix.

		Jim

ARPA: jim%cs.strath.ac.uk@ucl-cs.arpa, jim@cs.strath.ac.uk
UUCP: jim@strath-cs.uucp, ...!seismo!mcvax!ukc!strath-cs!jim
JANET: jim@uk.ac.strath.cs

"JANET domain ordering is swapped around so's there'd be some use for rev(1)!"
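
For anyone trying to picture the failure Tony describes, here is a toy model in C. It is not the kernel code, only a sketch under stated assumptions. NPTE, the value of UPAGES, the segment sizes, and the functions stack_pte_index(), release_range() and residual_rss() are all made up for illustration. The real sptopte macro and vrelvm routine are not reproduced here. The model assumes pages become resident at the "true" stack pte positions (HIGHPAGES = UPAGES), while the release path may compute stack pte positions with a different HIGHPAGES value. If the two disagree by CLSIZE, the release walk overlaps the data ptes at one end and misses stack ptes at the other, so the resident-set counter never reaches 0.

```c
#include <assert.h>
#include <string.h>

#define CLSIZE 2    /* pages per cluster; the post gives CLSIZE as 2 */
#define UPAGES 8    /* assumed value, for illustration only */
#define NPTE   32   /* hypothetical page-table size */

/* Toy index of the pte for stack page i: the stack grows down from the
 * top of the table, below `highpages` reserved entries. */
int stack_pte_index(int highpages, int i)
{
    return NPTE - highpages - 1 - i;
}

/* Release n ptes starting at index `first`; return how many were
 * actually resident (a pte that was already released counts as 0). */
int release_range(char *resident, int first, int n)
{
    int freed = 0;
    assert(first >= 0 && first + n <= NPTE);
    for (int k = first; k < first + n; k++) {
        if (resident[k]) {
            resident[k] = 0;
            freed++;
        }
    }
    return freed;
}

/* Fault in every data and stack page, then release both segments using
 * `macro_highpages` to locate the stack ptes. Returns the leftover
 * resident-set count, which should be 0 on a clean exit. */
int residual_rss(int macro_highpages)
{
    char resident[NPTE];
    int ssize = 6;                      /* hypothetical stack size in pages */
    int dsize = NPTE - UPAGES - ssize;  /* table is full: ptes are scarce */
    int rss = 0;

    memset(resident, 0, sizeof resident);
    for (int k = 0; k < dsize; k++) {   /* data ptes at 0..dsize-1 */
        resident[k] = 1;
        rss++;
    }
    for (int i = 0; i < ssize; i++) {   /* stack ptes at their true positions */
        resident[stack_pte_index(UPAGES, i)] = 1;
        rss++;
    }

    rss -= release_range(resident, 0, dsize);
    rss -= release_range(resident,
                         stack_pte_index(macro_highpages, ssize - 1), ssize);
    return rss;
}
```

In this model residual_rss(UPAGES) comes out as 0, while residual_rss(UPAGES + CLSIZE) leaves CLSIZE pages unaccounted for, which is the kind of non-zero residue that would trip a "should be 0" check on exit. Whether the real kernel goes wrong in exactly this way is an assumption of the sketch, not something the posts above confirm.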