news@bungi.com.mu.edu (05/13/91)
Wow, now that I have Minix 1.3 on the machine (THANKS HEAPS Bruce!) I have
noticed quite a lot of:

    SCSI ok with recovery. code 0x17, logical address 0x<some address, varies>

and a few

    SCSI failure, key 0x3, code 0x11, log adr 0x31e, sense buf 0xc462

messages. The first one seems to occur after a few hours of running the drive
and often appears up to 10 times in a row. The second message occurs five to
ten minutes after the first type of message starts and can lead to a kernel
panic.

Is it just the driver (remember it's Minix 1.3) or could there be a problem
with the drive (a Mini-Scribe)? I have held off compiling the 1.5h version of
the OS until I can be sure of where the problem lies.

Thanks for any help,
marcb
------------------------------------------------------------------------------
Marc A. Boschma                 marcb@img.uu.oz.AU
Systems Development
img Consultants
GPO Box 3304GG
Melbourne, Victoria 3001
Australia
culberts@hplwbc.hpl.hp.com (Bruce Culbertson) (05/16/91)
> From: daver!uunet!munnari!eyrie.img.uu.oz.au!marcb@mips.com (Marc A. Boschma)
>
> Wow, now that I have Minix 1.3 on the machine (THANKS HEAPS
> Bruce!) I have noticed quite a lot of:
> SCSI ok with recovery. code 0x17, logical address 0x<some address, varies>
>
> and a few
>
> SCSI failure, key 0x3, code 0x11, log adr 0x31e, sense buf 0xc462

"Ok with recovery" is a "soft error" -- some random noise or a power
glitch caused a disk operation on a healthy block to fail. Both Minix
and most SCSI disks retry operations which fail. This message means
Minix eventually succeeded in performing a disk operation which
initially failed. Minix is trying to say that it is happy and its
file system is intact, but something funny happened with your disk
which you might want to know about.

It is normal and expected that you will see soft errors occasionally,
but if you are seeing several a day, you have a problem. A typical
cause is a defect in the disk surface which makes reading the block
unreliable. The standard Minix distribution includes a tool for testing
all the blocks on a disk. Another tool builds a file of all the bad
blocks so that the blocks will not be allocated to files you care about.
If you get frequent retry messages and the block numbers are truly
random, then you have a problem in the drive electronics, its power
supply, or your pc532. Debugging it might require some creativity.

"SCSI failure" means Minix cannot talk to your disk. This usually
results in a panic. If Minix has been successfully talking to your
disk and then suddenly gets a "SCSI failure", then your file system
is likely to be corrupted. Cross your fingers and run fsck after you
debug and correct the problem. If your file system is really in bad
shape but you are desperate to save your data, you might have some
success with the disk editor "de".

> Is it just the driver (remember it's Minix 1.3) or could there be a problem
> with the drive (a Mini-Scribe)?
1.3 has a pretty good SCSI driver, though not perfect. Many people
have used it with Mini-Scribe drives. I do not think the 1.5h driver
is substantially different from the 1.3 driver.

Bruce Culbertson
s861298@minyos.xx.rmit.oz.au (Marc A. Boschma) (05/18/91)
culberts@hplwbc.hpl.hp.com (Bruce Culbertson) writes:

>> From: daver!uunet!munnari!eyrie.img.uu.oz.au!marcb@mips.com (Marc A. Boschma)
>>
>> Wow, now that I have Minix 1.3 on the machine (THANKS HEAPS
>> Bruce!) I have noticed quite a lot of:
>> SCSI ok with recovery. code 0x17, logical address 0x<some address, varies>
>>
>> and a few
>>
>> SCSI failure, key 0x3, code 0x11, log adr 0x31e, sense buf 0xc462

>"Ok with recovery" is a "soft error" -- some random noise or a power
>glitch caused a disk operation on a healthy block to fail. Both Minix
>and most SCSI disks retry operations which fail. This message means
>Minix eventually succeeded in performing a disk operation which
>initially failed. Minix is trying to say that it is happy and its
>file system is intact, but something funny happened with your disk
>which you might want to know about.

>It is normal and expected that you will see soft errors occasionally,
>but if you are seeing several a day, you have a problem. A typical
>cause is a defect in the disk surface which makes reading the block
>unreliable. The standard Minix distribution includes a tool for testing
>all the blocks on a disk. Another tool builds a file of all the bad
>blocks so that the blocks will not be allocated to files you care about.
>If you get frequent retry messages and the block numbers are truly
>random, then you have a problem in the drive electronics, its power
>supply, or your pc532. Debugging it might require some creativity.

The soft errors only occur once or twice for any given block, so I hope it
is only some noise on the SCSI bus. I'm thinking of doing a low-level
format and trying again. These problems occurred after the machine had been
on for about a day. Maybe better cooling is needed.

>"SCSI failure" means Minix cannot talk to your disk. This usually
>results in a panic. If Minix has been successfully talking to your
>disk and then suddenly gets a "SCSI failure", then your file system
>is likely to be corrupted. Cross your fingers and run fsck after you
>debug and correct the problem. If your file system is really in bad
>shape but you are desperate to save your data, you might have some
>success with the disk editor "de".

fsck has managed to clean it twice now, though I lost 6 blocks somewhere.

>> Is it just the driver (remember it's Minix 1.3) or could there be a problem
>> with the drive (a Mini-Scribe)?

>1.3 has a pretty good SCSI driver, though not perfect. Many people
>have used it with Mini-Scribe drives. I do not think the 1.5h driver
>is substantially different from the 1.3 driver.

OK, so I'll start debugging the hardware if the drive doesn't work after the
format.

>Bruce Culbertson