Hampton@dockmaster.arpa (David R. Hampton) (12/17/88)
I am running BSD 4.2 on a VAX 11/785, using the University of Maryland uda driver. The modification time on my driver is 19 December 1985. I just started seeing numerous uda errors today on one of our three drives. These errors are occurring on an RA60 disk pack that is being used as one large partition. Most of the errors are soft errors, but there is an occasional hard error thrown in. Could someone please explain to me what they mean? Thanks much.

P.S. DEC was just here yesterday for our monthly maintenance visit. I had not seen any errors before then. Could they have changed something that would cause this?

--------------------------------------------------
uda0: soft error, disk transfer error, unit 1, grp 0x0, hdr 0x3f038, event 0313
   100040 80239df8 1 cb4102 96e8 1060000 10005 381b 2040000 103 66f55acb 3f038 60100a8 0 0 0
uda0: soft error, SDI error, unit 1, event 0353, hdr 0x0
   100040 80239df8 1 eb4103 96e8 1060000 10005 381b 2040000 103 66f55acb 0 c0014b 2051700 2f000401 0
uda0: soft error, disk transfer error, unit 1, grp 0x0, hdr 0x4cce8, event 0313
   100040 8023aa70 1 cb4102 96e8 1060000 10005 381b 2040000 103 66f55acb 4cce8 40102fd 0 0 0
uda0: soft error, SDI error, unit 1, event 0353, hdr 0x0
   100040 8023aa70 1 eb4103 96e8 1060000 10005 381b 2040000 103 66f55acb 0 c0041b 504c800 2f0004e2 0
uda0: soft error, SDI error, unit 1, event 0353, hdr 0x0
   100040 8023a4f8 1 2b4103 96e8 1060000 10005 381b 2040000 103 66f55acb 0 c0041b 504c800 2f0004e2 0
uda0: hard error, disk transfer error, unit 1, grp 0x0, hdr 0x4cce8, event 0313
   100040 8023b138 1 cb0102 96e8 1060000 10005 381b 2040000 103 66f55acb 4cce8 c0041b 2045500 2f000462 0
ra1c: hard error sn314600 status 313
etc, etc.

David R. Hampton
Hampton @ Dockmaster.Arpa
301/859-4537
chris@mimsy.UUCP (Chris Torek) (12/21/88)
In article <17846@adm.BRL.MIL> Hampton@dockmaster.arpa (David R. Hampton) writes:
>I am running BSD 4.2 on a VAX 11/785, using the University of Maryland
>uda driver. The modification time on my driver is 19 December 1985.

You have an ancient edition (but then, you have 4.2BSD...). It probably mostly works. I have nothing decent for 4.2BSD though.

>uda0: soft error, disk transfer error, unit 1, grp 0x0, hdr 0x3f038,
> event 0313

The important number is the `event'. 0313 translates to `lost receiver ready drive error'. `hdr 0x3f038' here says that the drive was working on block 258104---probably irrelevant in this case. The receiver probably refers to the UART receiver for the serial cable between the controller and the drive. (This is a WAG.)

>uda0: soft error, SDI error, unit 1, event 0353, hdr 0x0
>uda0: hard error, disk transfer error, unit 1, grp 0x0, hdr 0x4cce8,
> event 0313

I have never figured out what makes an error an `SDI error', but 0353 is `drive detected error drive error'---not very informative, other than that the drive's error checking code thinks something is wrong. Here you got one lost-receiver-ready, so it retried and got a drive-detected-error, and gave up.

>P.S. DEC was just here yesterday for our monthly maintenance visit. I
>had not seen any errors before then. Could they have changed something
>that would cause this?

Check the cables. One is probably loose, or bent.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
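
The decoding described in the reply above can be sketched in a few lines of Python. This is only an illustration, not part of the uda driver: `parse_uda` and `EVENTS` are hypothetical names, the table holds only the two event codes named in that reply, and the sketch assumes the leading zero on `event` means octal (as it would in C) while `hdr` is hexadecimal.

```python
import re

# Only the event codes translated in the reply above; not a full table.
EVENTS = {
    0o313: "lost receiver ready drive error",
    0o353: "drive detected error drive error",
}

def parse_uda(line):
    """Return (unit, event, block) from one uda error line.

    event is read as octal and hdr as hex (both assumptions);
    block is None if the line carries no hdr field.
    """
    unit = int(re.search(r"unit (\d+)", line).group(1))
    event = int(re.search(r"event (\d+)", line).group(1), 8)
    hdr = re.search(r"hdr 0x([0-9a-fA-F]+)", line)
    block = int(hdr.group(1), 16) if hdr else None
    return unit, event, block

line = ("uda0: soft error, disk transfer error, unit 1, "
        "grp 0x0, hdr 0x3f038, event 0313")
unit, event, block = parse_uda(line)
print(unit, block, EVENTS.get(event, "unknown event"))
```

For the first message in the original post this gives unit 1 and block 258104, matching the figure quoted in the reply.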
eap@bu-cs.BU.EDU (Eric Pearce) (12/28/88)
We had some similar errors on a RA81 on a VAX 750 running 4.3 BSD:

Dec 20 17:14:32 bucsb vmunix: uda0: soft error, disk transfer error, unit 2, grp 0x0, hdr 0xd4c27, event 0650
Dec 20 17:14:32 bucsb vmunix: uda0: soft error, disk transfer error, unit 2, grp 0x0, hdr 0xd4ef1, event 0650

I did a level 0 dump of the entire disk. DEC ran 'EVRLB' from the diag supervisor and then I restored the disk. So far, no errors.

Was this the "correct" thing to do? DEC said the drive was ok as far as they could tell, but they would replace the HDA if I got errors after doing the format.

I would be interested in hearing about tools I could use to map out bad blocks or sectors (I have done this without too much trouble on the Sun, Encore and Celerity products).

-e
--
-------------------------------------------------------------------------------
Eric Pearce                                ARPANET eap@bu-it.bu.edu
Boston University Information Technology   CSNET   eap%bu-it@bu-cs
111 Cummington Street                      JNET    jnet%"ep@buenga"
Boston MA 02215                            UUCP    !harvard!bu-cs!bu-it!eap
617-353-2780 voice 617-353-6260 fax        BITNET  ep@buenga
chris@mimsy.UUCP (Chris Torek) (12/29/88)
In article <26927@bu-cs.BU.EDU> eap@bu-cs.BU.EDU (Eric Pearce) writes:
>Dec 20 17:14:32 bucsb vmunix: uda0: soft error, disk transfer error,
>unit 2, grp 0x0, hdr 0xd4c27, event 0650

0650 is a 6-symbol ecc error (therefore correctable and corrected). The 4.3BSD-tahoe driver decodes these things, printing something like

    uda0: soft error, disk transfer error: unit 2, lbn 871463:
        6 symbol ecc error (code A, subcode B)

The code and subcode are there in case DEC suddenly define new error codes.

>I did a level 0 dump of the entire disk. DEC ran 'EVRLB' from the
>diag supervisor and then I restored the disk. So far, no errors.

I cannot keep them straight, but presumably EVRLB forwards any marginal sectors it finds.

>Was this the "correct" thing to do?

DEC have several `formatters' for forwarding bad sectors. None of them are capable of restoring an HDA to a virgin state, but at least one of them (once called `rabads') can forward a sector by LBN.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@mimsy.umd.edu Path: uunet!mimsy!chris
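
A rough Python sketch of the kind of decoded message described above. The thread does not say how the code and subcode are derived; this sketch assumes the event value splits into a 5-bit major code in the low bits with the subcode above it, which is an assumption here and should be checked against the driver source. `split_event` and `describe` are hypothetical names, and the output layout only imitates the sample message.

```python
def split_event(event):
    """Split an event value into (code, subcode).

    Assumes a 5-bit major code in the low bits (not stated in the thread).
    """
    return event & 0x1F, event >> 5

def describe(ctlr, unit, hdr, event, text):
    """Format a message resembling the decoded sample in the reply above."""
    code, subcode = split_event(event)
    return (f"{ctlr}: soft error, disk transfer error: unit {unit}, "
            f"lbn {hdr}: {text} (code {code}, subcode {subcode})")

# hdr 0xd4c27 and event 0650 (octal) come from the quoted log line.
msg = describe("uda0", 2, 0xd4c27, 0o650, "6 symbol ecc error")
print(msg)
```

Under that assumption, event 0650 splits into code 8 and subcode 13, and hdr 0xd4c27 prints as lbn 871463, the same block number shown in the sample message.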