kevin@cfctech.UUCP (Kevin Darcy) (12/08/89)
For all you UNIXPC hardware-hackers out there sick of seeing "HELP ME!" articles and mail, I apologize (I hate seeing it too, but I've been struggling with this stuff for 3 weeks now and I am so frustrated I have to do something), but...

My poor old 2-year-old 3B1 is sick. Specifically, its disk is sick. The symptoms are system panics of the type "I/O error in push", with accompanying hard disk (controller?) info of the form

HDERR ST:51 EF:10 CL:FF65 CH:FF02 SN:FF0B SC:FF01 SDH:FF26 DMACNT:FFFF DCRREG:96 MCRREG:8500 Thu Dec 7 22:07:38 1989
HDERR ST:51 EF:10 CL:FF65 CH:FF02 SN:FF0B SC:FF01 SDH:FF26 DMACNT:FFFF DCRREG:96 MCRREG:8500 Thu Dec 7 22:07:39 1989
HDERR ST:51 EF:10 CL:FF65 CH:FF02 SN:FF0B SC:FF01 SDH:FF26 DMACNT:FFFF DCRREG:96 MCRREG:8500 Thu Dec 7 22:07:39 1989
HDERR ST:51 EF:10 CL:FF65 CH:FF02 SN:FF0B SC:FF01 SDH:FF26 DMACNT:FFFF DCRREG:96 MCRREG:8500 Thu Dec 7 22:07:39 1989

which, of course, is also appearing in my /usr/adm/unix.log.

When I first started getting these panics, I booted the diagnostic disk, and ran hard disk tests. During the "recal" phase, many blocks showed up as unreadable. Multiple passes of the recal phase would always show up bad blocks, but not always the SAME bad blocks, although some would show up much more than others. The "surface test" would rarely find any more bad blocks than the "recal" would.

After backing up my system, I went in and mapped a whole bunch of the worst offending blocks, and everything seemed to be working just fine (with the exception of the occasional buzzing of the drive which accompanies the read errors). I also opened up the machine, cleaned out the dust bunnies, checked everything visually, and disconnected and reconnected the disk drive cables.

Now, two weeks later, the machine is starting to panic again. Last night I had init croak because of a disk error.
I post this because, not being much of a hardware hacker, there are a lot of things that I do not understand about this whole situation:

1) If a disk block is "unreadable" on one pass of the "recal" phase, and it is indeed a media error, how can it pass on the next?

2) What could cause so many errors at once (I've mapped about 20 bad blocks in the last 3 weeks; in the previous two years I've had the machine, only 3 had to be mapped, and the drive ran happy as a clam)?

3) Why does the "surface test" appear to be no more rigorous than the "recal" phase?

4) How would I tell the difference between a controller problem and a bona fide disk problem on the 3B1?

5) Is reformatting my next step?

I also have some related quasi-technical questions:

1) If the hard disk is in need of replacement, am I limited to the same disk drive (I know that *second* disk drives can vary, but I haven't been around the UNIXPC block enough to know whether its bootstrap stuff expects a certain hardware configuration on the primary boot device)? I would love to use something bigger...

2) If I have to use a Miniscribe 6085, where is a good source for them (I expect AT&T wants a ridiculous amount)? I wouldn't mind buying a whole 3B1 and using it completely for parts, either.

3) I'm also looking for a good source for tape drives. This after I sat down over the long weekend and backed up my machine on 147 floppies. After that experience, I will never in my life buy a computer with insufficient backup capability.

4) Why is floppy disk I/O on the UNIXPC so hacked up? From what I can tell, there are 2 /dev entries for the built-in floppy drive, one of which (/dev/rfp020) gives me nothing but errors when I try to cpio to it, and the other (/dev/rfp021) appears to work fine when writing, except that whenever I try to do a cpio file listing, it always gives the error "Out of phase - get help" and stops on the second diskette of the series.
I realize that the floppy on the 3B1 is really "meant" to be used from the ua menus, but, in the absence of being able to see what is on my diskettes, I'm really scared that one or another of the diskettes in the backup I made from the ua menu could turn up bad, and I will have no way of getting at the data beyond that point without some serious hacking.

If it matters, the machine is a vanilla 2-Meg RAM, 67 Mb hard disk fire-sale 3B1. OS=3.51.

I would be greatly appreciative of replies via article or e-mail. Please no "RTFM" replies unless you can cite specific references from the manuals which come with the machine.

------------------------------------------------------------------------------
kevin@cfctech.UUCP                  | Kevin Darcy, Asst. Unix Systems Admin.
...[mailrus!]sharkey!cfctech!kevin  | MIS, Technical Services
Voice: (313) 948-4863               | Chrysler Financial Corp.
             948-4975               | 27777 Franklin, Southfield, MI 48034
------------------------------------------------------------------------------
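The post above notes that the failing blocks are "not always the SAME bad blocks, although some would show up much more than others." One way to see which addresses recur is to tally the HDERR lines in /usr/adm/unix.log by the disk address they report. The sketch below is only an illustration under stated assumptions: the function name `tally_hderr` is hypothetical; it assumes one HDERR entry per log line; it assumes CL, CH, SN and SDH are cylinder-low, cylinder-high, sector-number and size/drive/head registers in the style of a WD1010-type controller, with only the low byte of each value meaningful and the head number in the low three bits of SDH. None of that is confirmed by the thread. It also needs an awk that supports user-defined functions, which the awk shipped with the 3B1 may not.

```shell
#!/bin/sh
# Tally HDERR log lines by reported disk address so repeat offenders stand out.
# ASSUMPTIONS (not confirmed by the thread):
#   - one HDERR entry per line
#   - CL/CH = cylinder low/high, SN = sector number, SDH = size/drive/head
#   - only the low byte of each value matters; head = low 3 bits of SDH
tally_hderr() {
    awk '
    # Convert the last two hex digits of a string to a number.
    function lowbyte(s,    d, hi, lo) {
        d = "0123456789ABCDEF"
        s = toupper(s)
        hi = index(d, substr(s, length(s) - 1, 1)) - 1
        lo = index(d, substr(s, length(s), 1)) - 1
        return hi * 16 + lo
    }
    /HDERR/ {
        cl = ch = sn = sdh = 0
        for (i = 1; i <= NF; i++) {
            # Only NAME:VALUE tokens are of interest; timestamps split in three.
            if (split($i, kv, ":") != 2) continue
            if (kv[1] == "CL")  cl  = lowbyte(kv[2])
            if (kv[1] == "CH")  ch  = lowbyte(kv[2])
            if (kv[1] == "SN")  sn  = lowbyte(kv[2])
            if (kv[1] == "SDH") sdh = lowbyte(kv[2])
        }
        key = sprintf("cyl=%d head=%d sector=%d", ch * 256 + cl, sdh % 8, sn)
        count[key]++
    }
    END { for (k in count) print count[k], k }
    ' "$@" | sort -rn
}

# Example use (path taken from the post):
#   tally_hderr /usr/adm/unix.log
```

Under those assumptions, the four entries quoted in the post all decode to the same address (cylinder 613, head 6, sector 11). An address that keeps coming back is a better candidate for mapping out than one that appears once and never again.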
thad@cup.portal.com (Thad P Floryan) (12/15/89)
Re: Kevin Darcy's comments and questions, enclosed is a copy of the "Floppy Cheat Sheet" I pass out at our UNIX Users' Group meetings. Many people purchased their systems sans docs, and this one (of many) "sheet" has made life easier for them. Enjoy!

Thad Floryan [ thad@cup.portal.com (OR) ..!sun!portal!cup.portal.com!thad ]

==============================================================================
For a floppy filesystem (floppy already formatted and mkfs'd):

        $ mount /dev/fp021 /mnt [ -r ]      -r if read only; VERY important if
                                            write protect tab is on disk
        $ umount /dev/fp021
==============================================================================
Reports present formatting of floppy to determine, among other things, the
numbers of sectors for when making copies.  The floppy must NOT be mounted.

        $ iv -t /dev/rfp020
==============================================================================
For doing floppy transfers (floppy already formatted; these steps OVERWRITE).
Floppy MUST be formatted but NOT mounted with `mount'

store:  $ find . -cpio /dev/rfp021              Writes to as many floppies as
                                                required, and prompts for each.
        $ find filepath -cpio /dev/rfp021       same as above
        $ find filename(s) -cpio /dev/rfp021    same as above

dir:    $ cpio -itBv < /dev/rfp021
        $ cpio -itBv < /dev/rfp021 > f.dir      writes directory into f.dir

restor: $ cpio -iBm < /dev/rfp021               preserves original dates
        $ cpio -iBm [ patterns ] </dev/rfp021
------------------------------------------------------------------------------
NOTES:  remove "B" if receive "End of volume; errno: 25, Can't read input"
        add "c" if receive "Out of phase -- get help"
        add "d" to a restore to create directories as needed
==============================================================================
Processing UNIXPC installable disks and files ("file+IN"):

Directory of a file+IN:         $ cpio -ictBv < file+IN

Restore a file+IN or a          $ cd {dir in which the file will be unpacked}
cpio-file.  (Note: the "m"
option to cpio preserves        $ cpio -icBdum < file+IN
the original dates)

Make a file+IN and/or           $ cat Files | cpio -ocBv > file+IN
write to floppy in install
format:                         $ cat Files | cpio -ocBv > /dev/rfp021
                                $ find . -print | cpio -ocB > /dev/rfp021
==============================================================================
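The note about adding "c" reflects a general cpio behavior: an archive written with ASCII headers (the `c` option) has to be read with `c` as well, and cpio complains "Out of phase" when the header it finds does not start with the magic number it expects. The sketch below is a rough way to check which header style an archive uses by looking at its first bytes. The function name `cpio_header_kind` is hypothetical. It assumes the traditional formats, where an ASCII header starts with the characters 070707 and a binary header starts with the 16-bit value 070707 octal in either byte order. Whether the first diskette of a ua-menu backup starts with a plain cpio header is not established in this thread, and reading a raw floppy device this way may need a whole-sector read (for example through dd) rather than a six-byte one.

```shell
#!/bin/sh
# Report whether a cpio archive begins with an ASCII (-c) or binary header.
# ASSUMPTION: traditional cpio formats only; the magic number is 070707,
# stored either as six ASCII characters or as one 16-bit binary value.
cpio_header_kind() {
    # Dump the first six bytes as a continuous lowercase hex string.
    hex=$(od -An -tx1 -N6 "$1" | tr -d ' \n')
    case "$hex" in
        303730373037)  echo ascii ;;    # the characters "070707"
        71c7*|c771*)   echo binary ;;   # 0x71C7 in either byte order
        *)             echo unknown ;;
    esac
}

# Example use on an archive file (file name is illustrative):
#   cpio_header_kind file+IN
```

If this prints `ascii`, the listing and restore commands need the `c` option (as in `cpio -ictBv`); if it prints `binary`, they should be run without it.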