[comp.unix.xenix] System V kernel panic not syncing

amull@Morgan.COM (Andrew P. Mullhaupt) (03/12/90)

Anyone know how to get the SCO UNIX System V/386 r3.2 kernel
to do sync when it panics so fsck is happy when you start up again?

I'm told that BSD can do this, and I have a use for this while
running this OS on a 'beta' CPU which will cause a panic under
certain circumstances.

Thanks in Advance,
Andrew Mullhaupt

neal@mnopltd.UUCP (03/13/90)

->Anyone know how to get the SCO UNIX System V/386 r3.2 kernel
->to do sync when it panics so fsck is happy when you start up again?
->
->I'm told that BSD can do this, and I have a use for this while
->running this OS on a 'beta' CPU which will cause a panic under
->certain circumstances.

Wait a second... This doesn't make sense.  Panics are frequently situations
where the kernel code is not confident it can continue processing.  Doing a
sync in this situation is like painting over rust.  For any degree of system
stability you need to fsck and "take your medicine"...

------------------------------------------------------------------------------
Neal Rhodes                       MNOP Ltd                     (404)- 972-5430
President                Lilburn (atlanta) GA 30247             Fax:  978-4741
       uunet!emory!jdyx!mnopltd!neal Or uunet!gatech!stiatl!mnopltd!neal
------------------------------------------------------------------------------

davidsen@sixhub.UUCP (Wm E. Davidsen Jr) (03/14/90)

In article <779@s5.Morgan.COM> amull@Morgan.COM (Andrew P. Mullhaupt) writes:
| Anyone know how to get the SCO UNIX System V/386 r3.2 kernel
| to do sync when it panics so fsck is happy when you start up again?

  I believe that an attempt is made to determine if a sync can be done
or not. If the kernel thinks the in-memory info is bad it should NOT try
to sync, since this may keep fsck from recovering anything. I seem to
recall installing a patch in SysIII to prevent the sync.

  Or you could use a version which doesn't panic ;-(
-- 
bill davidsen - davidsen@sixhub.uucp (uunet!crdgw1!sixhub!davidsen)
    sysop *IX BBS and Public Access UNIX
    moderator of comp.binaries.ibm.pc
"Getting old is bad, but it beats the hell out of the alternative" -anon

amull@Morgan.COM (Andrew P. Mullhaupt) (03/15/90)

In article <167@mnopltd.UUCP>, neal@mnopltd.UUCP writes:
> 
> ->Anyone know how to get the SCO UNIX System V/386 r3.2 kernel
> ->to do sync when it panics so fsck is happy when you start up again?
> ->
> 
> Wait a second... This doesn't make sense.  Panics are frequently situations
> where the kernel code is not confident it can continue processing.  Doing a
> sync in this situation is like painting over rust.  For any degree of system
> stability you need to fsck and "take your medicine"...
> 

Not really. The problem is that a hardware workaround for a bug in the
CPU chip is detecting a potentially incorrect condition (relating
only to floating point exceptions) and translating the error into
a parity error so that it won't go unnoticed. Needless to say, UNIX
notices the fake parity error and panics. Now I know it's a bogus
parity error because the memory on the machine is brand new and rated
faster than it's running, and the error is completely repeatable. (Until I
put in a software workaround for the hardware workaround, it was
VERY repeatable.)

Now I did say this was 'beta' hardware - (and aside from this one
fault it seems solid as a rock...) - so nothing really important
is at risk. It's just that when that particular panic occurs, I
don't need to go looking too hard to tell if the situation is as I
have outlined above. (So far every one of the panics has been of this
kind...) Since there is quite a bit of mass storage on this machine,
it takes some time to fsck on the reboot. I was just looking for a
simple way to avoid this unnecessary delay. 

I have received quite a few responses admonishing me not to force a
sync on the grounds of file system safety. Well, I just thought I'd
put everybody in the picture so they could understand why I wanted
to force the sync.

The software workaround involves disabling all floating point
exceptions (and ensuring the correct status of the sticky bits) - 
so it can be criticized from the point of view of numerical analysis.

While we're on the subject, it occurs to me that an even more 
elegant solution would be to ignore parity errors which come from 
one specific location. (The fake parity error is always reported
from the same location...) Anyone know how to do this? Note: I do
not have access to kernel source, to my knowledge.
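
The filtering logic being asked about can be sketched as follows. This
is not an SCO kernel interface, and without kernel source there may be
no place to hook it in. It assumes "location" means an address reported
with the parity error. The names parity_action and
classify_parity_error are hypothetical.

```c
#include <stdint.h>

/* What a parity handler would do with a reported error. */
typedef enum {
    PARITY_IGNORE,  /* known-bogus report: log it and carry on */
    PARITY_PANIC    /* anything else: treat as a real memory fault */
} parity_action;

/*
 * Hypothetical filter: ignore a parity error only when it is reported
 * from the single address known to produce the fake error. Every
 * other report still panics, so real memory faults are not hidden.
 */
parity_action classify_parity_error(uintptr_t reported_addr,
                                    uintptr_t known_bogus_addr)
{
    if (reported_addr == known_bogus_addr)
        return PARITY_IGNORE;
    return PARITY_PANIC;
}
```

The comparison is deliberately an exact match on one address, so a
genuine parity error anywhere else is handled as before.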

Thanks in Advance,
Andrew Mullhaupt

guy@auspex.auspex.com (Guy Harris) (03/16/90)

>Wait a second... This doesn't make sense.

It may not make sense, but nevertheless it has worked fine, in my
experience, for the past 6 years or so on various systems.  At least
in the panics I've seen, the file system data structures in core, etc.
aren't what's screwed up, so syncing the file system caused no problems.

>For any degree of system stability you need to fsck and "take your
>medicine"...

Yes, the "sync" isn't intended as something to make "fsck"s unnecessary,
it's just intended to limit the lossage from a crash.