liu@trlsasb.oz (01/06/88)
We have an Eagle disk with an SI 9900 controller which has had its index file corrupted 4 times in the last 4 months. Has anybody else experienced these problems? Is it a software or hardware problem?

The error message I get when I try to mount the disk is

    BITMAPERR- I/O error on storage bitmap; volume locked

Now the system error message manual implies you can repair this by using the verify utility, but the catch-22 is that you have to mount the volume Files-11 ... and the mount won't let you do this. Is there a way of repairing a disk with a bad index file?
carl@CITHEX.CALTECH.EDU (Carl J Lydick) (01/09/88)
> We have an Eagle disk with an SI 9900 controller which has had its index file
> corrupted 4 times in the last 4 months. Has anybody else experienced these
> problems? Is it a software or hardware problem?

We've had similar problems. In both cases, it appears to have been a hardware problem. On one system, we had a crash resulting from a machine check, and when the system came up again, the disk was corrupted. This is not the problem you're having, presumably.

> The error message I get when I try to mount the disk is
> BITMAPERR- I/O error on storage bitmap; volume locked

On the other system, one of the servo boards for the Eagle in question failed. The result was that we could read the home block on the volume (you can check this by doing a MOUNT/FOREIGN: if it works, you can read the home block), but we couldn't read the BITMAP or the bad block track (you obviously can't read the BITMAP; to see if you can read the bad block track, mount the disk/foreign, and try doing a physical backup).

> Now the system error message manual implies you can repair this by
> using the verify utility, but the catch-22 is that you have to mount
> the volume Files-11 ... and the mount won't let you do this.
> Is there a way of repairing a disk with a bad index file?

This depends on just how bad the I/O error is. If you just have trouble with the bitmap, mount will succeed, but with the volume locked against extending any files. You can then run verify and fix the problem. The exception is an actual bad block in the bitmap: in that case, do an image backup, run EXOR on the disk to reformat it and find the bad blocks, then restore the saveset. (You should reformat anyway if the disk hasn't been formatted in the last few years; if you can't remember when it was formatted, it's due for reformatting.)
On the other hand, if you can't read anything except the track with the home block (and it sounds like this is your problem), you should have the hardware problem repaired (as I said above, this is probably a problem with the servo), and then run verify on the disk.
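The triage in the post above can be summarized as a small decision function. This is only an illustrative sketch in Python, not anything posted in the thread: the function name `triage` and its parameters are hypothetical, and the returned steps simply restate the advice above. The case where even the home block is unreadable is not discussed in the post, so the sketch says so rather than guessing.

```python
def triage(home_block_readable, rest_of_disk_readable, bad_block_in_bitmap=False):
    """Return the repair steps suggested in the post for a given set of symptoms.

    home_block_readable   -- MOUNT/FOREIGN succeeds
    rest_of_disk_readable -- a physical backup of the foreign-mounted disk works
    bad_block_in_bitmap   -- the bitmap itself sits on a bad block
    (All three parameter names are hypothetical labels for this sketch.)
    """
    if not home_block_readable:
        # The post does not address this situation.
        return ["case not covered in the post"]
    if not rest_of_disk_readable:
        # Only the home block track is readable: suspected servo fault.
        return ["have the hardware repaired", "run verify on the disk"]
    if bad_block_in_bitmap:
        return [
            "do an image backup",
            "reformat the disk and find its bad blocks",
            "restore the saveset",
        ]
    # Plain bitmap trouble: a normal mount succeeds with the volume locked.
    return ["mount the volume (it comes up locked)", "run verify to fix the bitmap"]


if __name__ == "__main__":
    for step in triage(home_block_readable=True, rest_of_disk_readable=False):
        print(step)
```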
ESMP09@SLACTWGM.BITNET (Ed Miller SLAC x3291 or [415]854-1055) (01/13/88)
> We have an Eagle disk with an SI 9900 controller which has had its index file
> corrupted 4 times in the last 4 months. Has anybody else experienced these
> problems? Is it a software or hardware problem?
> The error message I get when I try to mount the disk is
> BITMAPERR- I/O error on storage bitmap; volume locked
> Now the system error message manual implies you can repair this by
> using the verify utility, but the catch-22 is that you have to mount
> the volume Files-11 ... and the mount won't let you do this.
> Is there a way of repairing a disk with a bad index file?

We had a similar problem several months ago with disks on an SI9900 controller. The problem happened twice within a few days (once on a 9751 disk, once on a 9798 disk), but has not recurred since. There was no evidence that it was software related (we'd been running VMS 4.5 for months before, and still are running it). There was no obvious connection to hardware changes, but there may have been some upgrades of CPU (not disk) interfaces in the controller since the two occurrences.

Our problem took the following form: when we tried to MOUNT a disk, MOUNT complained that it was the member of a shadow set, and proceeded to mount it, but with a software writelock. (We don't use shadow volumes, so that was the first puzzle.) It turns out that the indication that a disk is a member of a shadow set is stored in the first block of BITMAP.SYS. When we dumped that file it was obvious that it had been overwritten with irrelevant data: not only the first block, but the first few blocks.

For our situation, the fix was easy:

    MOUNT/OVERRIDE=SHADOW
    ANAL/DISK/REPAIR

(There was a lot of repair to be done, since the first few blocks of BITMAP.SYS needed to be reconstructed, but there was no damage that could not be repaired.)

If your problem is similar, you might be able to make the same kind of fix with

    MOUNT/OVERRIDE=LOCK
    ANAL/DISK/REPAIR

Ed Miller ESMP09@SLACTWGM.BITNET Stanford Linear Accelerator Center
rde@eagle.ukc.ac.uk (R.D.Eager) (01/15/88)
I believe 4.6 has a fix to allow volumes with trashed bitmaps to be mounted.
--
Bob Eager rde@ukc.UUCP ...!mcvax!ukc!rde Phone: +44 227 764000 ext 7589
scott@stl.stc.co.uk (Mike Scott) (01/20/88)
In article <880109071621.025@CitHex.Caltech.Edu> carl@CITHEX.CALTECH.EDU (Carl J Lydick) writes:
>
> > We have an Eagle disk with an SI 9900 controller which has had its index file
> > corrupted 4 times in the last 4 months. Has anybody else experienced these
> > problems? Is it a software or hardware problem?
>
>We've had similar problems. In both cases, it appears to have been a hardware

We've also had some nasty problems with a supereagle and QD32 controller (on a uVAX-II/VMS4.5). We were getting corrupted data without any warning apart from the disk write-locking itself. After reformatting the disk and restoring from a backup tape which had the corrupted disk data on it, there were a number of files apparently entered in two directories, one correctly, one wrongly. The symptoms were consistent with the bad block replacement algorithm failing by revectoring a supposed bad block in the index file, then forgetting it had done this. It makes me suspicious of the very idea of automatic bad block replacement, if this sort of thing happens with no warning: at least the old badblk.sys was pretty foolproof.

The killer is, I don't even think it was a media problem: I suspect a head amplifier. The reformatting program carefully prints out all the replaced block numbers, and hides the fact that they are all on the same disk head!
--
Regards. Mike Scott (scott@stl.stc.co.uk <or> ...uunet!mcvax!ukc!stl!scott) phone +44-279-29531 xtn 3133.
ted@blia.BLI.COM (Ted Marshall) (01/23/88)
In article <613@acer.stl.stc.co.uk>, scott@stl.stc.co.uk (Mike Scott) writes:
> We've also had some nasty problems with a supereagle and QD32 controller (on a
> uVAX-II/VMS4.5). We were getting corrupted data without any warning apart from
> the disk write-locking itself. After reformatting the disk and restoring from a
> backup tape which had the corrupted disk data on it, there were a number of
> files apparently entered in two directories, one correctly, one wrongly. The
> symptoms were consistent with the bad block replacement algorithm failing by
> revectoring a supposed bad block in the index file, then forgetting it had done
> this. It makes me suspicious of the very idea of automatic bad block
> replacement, if this sort of thing happens with no warning: at least the old
> badblk.sys was pretty foolproof.

I had a similar problem on a DEC RA-80 on a massbus controller on a 750. I found that reading certain blocks yielded garbage with no warning, except that maybe 1 in 30 reads of that block would yield the correct data! Again, all of this occurred with no indication of errors from the driver!

One point on your seeing files in two directories. The backup of this disk was made on a semi-live system (i.e. I was on, doing work that created files). When that was restored to the new disk and the system brought up, I noticed that while all of the directory entries for those files existed, several of the files themselves didn't. In addition, some of the other directory entries were linked to files that other people had created since the restore! It appears that although all of the directory entries were caught in the backup, some of the INDEXF.SYS entries weren't! Then, since the directory entries specified FIDs with last sequence number + 1, these new files also got the same FID.

The bottom line is that the double-entry files you saw may not have had anything to do with failures of bad-block replacement.
-- Ted Marshall ...!ucbvax!mtxinu!blia!ted <or> mtxinu!blia!ted@Berkeley.EDU Britton Lee, Inc., 14600 Winchester Blvd, Los Gatos, Ca 95030 (408)378-7000 The opinions expressed above are those of the poster and not his employer.
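The FID collision described in the post above can be reproduced with a toy model. The Python sketch below is an illustration only and rests on loudly stated assumptions: a FID is reduced to a (file number, sequence number) pair, a header slot's sequence number is bumped by one each time the slot is reused, and the backup is assumed to have copied the index file before the directories while a user created a file in between. The class and function names (`Header`, `Volume`, `simulate`) and the file names are hypothetical, and none of this is the real Files-11 or BACKUP implementation.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Header:
    seq: int = 0          # bumped each time the header slot is (re)used
    in_use: bool = False
    data: str = ""


@dataclass
class Volume:
    headers: dict = field(default_factory=dict)    # file number -> Header
    directory: dict = field(default_factory=dict)  # (dir, name) -> (file number, seq)

    def create(self, dirname, name, data):
        # Reuse the lowest free header slot, or open a new one.
        free = sorted(n for n, h in self.headers.items() if not h.in_use)
        num = free[0] if free else max(self.headers, default=0) + 1
        hdr = self.headers.setdefault(num, Header())
        hdr.seq += 1
        hdr.in_use = True
        hdr.data = data
        self.directory[(dirname, name)] = (num, hdr.seq)
        return (num, hdr.seq)

    def delete(self, dirname, name):
        num, _ = self.directory.pop((dirname, name))
        self.headers[num].in_use = False

    def lookup(self, dirname, name):
        # A directory entry resolves only if the header is live and seq matches.
        num, seq = self.directory[(dirname, name)]
        hdr = self.headers.get(num)
        if hdr is None or not hdr.in_use or hdr.seq != seq:
            return None
        return hdr.data


def simulate():
    live = Volume()
    live.create("[TED]", "OLD.TMP", "scratch")
    live.delete("[TED]", "OLD.TMP")                  # slot 1 is now free, seq 1

    saved_headers = copy.deepcopy(live.headers)      # backup copies the index file...
    live.create("[TED]", "NOTES.TXT", "ted's notes") # ...a user creates a file (FID 1,2)...
    saved_directory = copy.deepcopy(live.directory)  # ...then the directories are copied

    restored = Volume(saved_headers, saved_directory)
    before = restored.lookup("[TED]", "NOTES.TXT")   # entry exists, file does not
    restored.create("[OTHER]", "REPORT.DAT", "someone else's report")
    after = restored.lookup("[TED]", "NOTES.TXT")    # stale entry now hits the new file
    return before, after


if __name__ == "__main__":
    print(simulate())
```

In this model the restored volume first shows a directory entry with no file behind it, and once another user creates a file the stale entry resolves to that new file, so one file appears under two directory entries without any bad-block replacement being involved.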
scott@stl.stc.co.uk (Mike Scott) (02/02/88)
In article <3968@blia.BLI.COM> ted@blia.BLI.COM (Ted Marshall) writes:
>In article <613@acer.stl.stc.co.uk>, scott@stl.stc.co.uk (Mike Scott) writes:
>> We've also had some nasty problems with a supereagle and QD32 controller (on a
.......
>> files apparently entered in two directories, one correctly, one wrongly. The
>> symptoms were consistent with the bad block replacement algorithm failing by
.......
>One point on your seeing files in two directories. The backup of this disk
>was made on a semi-live system (i.e. I was on, doing work that created files).
>When that was restored to the new disk and the system brought up, I noticed
>that while all of the directory entries for those files existed, several of
>the files themselves didn't. In addition, some of the other directory entries
>were linked to files that other people had created since the restore! It
.......
>The bottom line is that the double-entry files you saw may not have had
>anything to do with failures of bad-block replacement.

I'm afraid I was rather misleading in my article. Certainly, the restored disk had the problems I noted. But I know that at least one of the files was affected before I did the backup/reformat/restore. It was one of mine, and was why I realised we had a major problem! It was only after the restore that I carried out a post-mortem.

I haven't noticed any problems doing backups on live systems, but we very rarely need to do a restore, so probably wouldn't notice :-(
--
Regards. Mike Scott (scott@stl.stc.co.uk <or> ...uunet!mcvax!ukc!stl!scott) phone +44-279-29531 xtn 3133.