frobinso@cirm.northrop.COM (Fletcher Robinson) (06/15/88)
I have come up against two problems on our new 4D 70G.

One machine with very little user activity (waiting for software), after sitting idle for several days, will begin to display the following error:

    error in kernel #42 severity=2 , etc.

This locks me out of the system, and the only way to recover is to power down. After power is restored, it functions as before for several days until it displays the same error again.

The other problem is with using tar to back up files. There is a profusion of the following error message:

    1ps0d0s6: error csr=0x4000 bn=?????? statcode=83 recovered, errcode=0x1

accompanied sparingly by:

    tp7: (18) correctable data error

Does anyone have any insight into these errors? Are they avoidable?
perry@PHOENIX.PRINCETON.EDU ("Kevin R. Perry") (06/16/88)
>From info-iris-request@brl-vmb.arpa Thu Jun 16 10:03:04 1988
>Date: Wed, 15 Jun 88 9:40:43 PDT
>From: Fletcher Robinson <frobinso@cirm.northrop.com>
>To: info-iris@BRL.ARPA
>Subject: kernel and tar errors
>Message-Id: <8806151244.aa15986@SMOKE.BRL.ARPA>
>
>I have come up against two problems on our new 4D 70G. One machine with
>very little user activity(..waiting for software..), after sitting idle
>for several days will begin to display the following error :
>    error in kernel #42 severity=2 , etc.

Haven't seen this one, sorry.

>
>Another problem is using tar to backup files. There is a profusion of
>the following error message:
>    1ps0d0s6: error csr=0x4000 bn=?????? statcode=83 recovered, errcode=0x1
>accompanied sparingly by :
>    tp7: (18) correctable data error

We have experienced this problem on our 4D's. I recently asked an SGI field service person about it. He claims the problem is known to SGI, and they're working on it. It's in the hardware, and they supposedly understand what is wrong: something about impedance-matching on something in the tape-drive unit. He says it's perfectly safe to ignore these messages, and maybe someday they'll have a fix.

Hey, at least it gives you something to watch while you're doing backups! :-)

K.Perry
perry%phoenix@princeton.edu
Sys Prog
Computing & Info. Technology
Princeton Univ.
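For anyone who wants to see how often each message turns up during a backup, here is a minimal present-day sketch in Python (not something from the original thread). It assumes the console output has been captured to text, one message per line, and that every message has the same layout as the two examples quoted above. The patterns and the function name `tally` are illustrative guesses based on those two lines only, not a documented message format.

```python
import re
from collections import Counter

# Patterns modelled on the two messages quoted in this thread. The field
# layout is an assumption drawn from those single examples.
DISK_RE = re.compile(
    r"^(?P<dev>\S+): error csr=(?P<csr>0x[0-9a-fA-F]+) bn=(?P<bn>\S+) "
    r"statcode=(?P<stat>\d+)(?P<rec> recovered)?, errcode=(?P<err>0x[0-9a-fA-F]+)"
)
TAPE_RE = re.compile(r"^(?P<dev>tp\d+): \((?P<code>\d+)\) (?P<text>.+)$")


def tally(lines):
    """Count messages per (device, kind); lines that match neither pattern are skipped."""
    counts = Counter()
    for line in lines:
        line = line.strip()
        m = DISK_RE.match(line)
        if m:
            # A disk message counts as "recovered" only if that word is present.
            kind = "recovered" if m.group("rec") else "unrecovered"
            counts[(m.group("dev"), kind)] += 1
            continue
        m = TAPE_RE.match(line)
        if m:
            counts[(m.group("dev"), m.group("text"))] += 1
    return counts
```

Calling `tally(open("backup.log"))` (the file name is hypothetical) returns a `Counter` keyed by device and message kind, so any disk message that lacks the word "recovered" shows up under its own key.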
rpaul@dasys1.UUCP (Rodian Paul) (06/17/88)
> Another problem is using tar to backup files. There is a profusion of
> the following error message:
>    1ps0d0s6: error csr=0x4000 bn=?????? statcode=83 recovered, errcode=0x1
> accompanied sparingly by :
>    tp7: (18) correctable data error

You are probably running a 380 meg Hitachi drive on the 4D in question? SGI will be replacing cables for these drives pretty soon. It seems the controller can't keep up with the drive sometimes (so I've been told), hence the drive error messages. Apparently there is nothing to worry about so long as 'recovered' comes up.

The tp7: (18) has been a real bitch for me. I get the message on about 7 out of 10 'tar cv's. The message is specific to the tape drive. Several weeks ago I had some corrupted data on a couple of tapes: 'tar' read them off fine, but the data was trashed. When I chatted with SGI, someone said that they'd had the same problem at SGI (later on I was told by many people there that this could never happen). Anyway, a field engineer came out and we ran a whole lot of tests. We still got lots of 'correctable data error' messages, but no trashed data. Since then things seem fine. I don't know what trashed my data before, or whether it's related to the flaky controller/disk cables. Only time will tell...
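Since 'tar' can read a tape back without complaint and still hand over trashed data, one way to catch that is to checksum the files before the backup and again after restoring them to a scratch directory. What follows is a minimal present-day sketch in Python, not something from the thread. The helper names (`file_digest`, `build_manifest`, `compare`) are made up for illustration, and SHA-256 is simply a convenient choice of checksum.

```python
import hashlib
import os


def file_digest(path, chunk_size=65536):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def build_manifest(root):
    """Map each file's path (relative to root) to its digest."""
    manifest = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            manifest[os.path.relpath(full, root)] = file_digest(full)
    return manifest


def compare(before, after):
    """Return (missing, changed): paths absent from 'after', and paths whose digest differs."""
    missing = sorted(set(before) - set(after))
    changed = sorted(p for p in before if p in after and before[p] != after[p])
    return missing, changed
```

Build one manifest from the source tree before running 'tar cv', restore the tape into an empty directory, build a second manifest there, and pass both to `compare`. Two empty lists mean every file came back with the same contents. Anything listed as changed came off the tape differently from how it went on, whatever the drive reported.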