sysruth@helios.toronto.edu (Ruth Milner) (07/25/89)
I am interested in hearing from those of you who have installed a combination of Xylogics 754 SMD controller/CDC Sabre 9720-1230 (1.23GB) disk(s) on your Iris systems, particularly the 4D/240 or 120.

We have encountered numerous odd problems with ours, some of which have been solved. Some missing functionality for SMD drives was provided by upgrading to 3.1D. We also know that to eliminate intermittent hanging under heavy disk-loading conditions, we must reformat with a new 754, rev. 2.4 or higher (which we have installed, although we can't reformat until our Exabyte is working); Xylogics has modified the sector header layout in a way which cures this bug.

My query concerns problems with flagging bad blocks on the drive using fx. Currently on each drive we have a bad block (actually a bad track, since fx maps out whole tracks at a time) which cannot be marked as bad. Neither one is among the special few which by definition cannot be mapped (see the 4D System Administrator's Guide, p. 4-10). From the hotline I gather that it is most likely due to the location of the defect coinciding with the bit in the sector header which is set to show that the block is mapped (I will restrain myself from commenting :-) on the wisdom of relying on a bit in a bad block to mark the block as bad, since there may be no other simple way to do it under the IRIX filesystem layout).

What we saw during the initial fx run was the track mapped out as part of the manufacturer's defect list, and then mapped again during the badblock testing run which followed. As a result there are two entries for each of these tracks in the badblock list, both pointing to the same map track. However the system still doesn't recognize them as bad, and using fx online I can only remove one of the entries from the table.

One of these tracks was in a second swap area, which we could remove from active use, but the other is in the middle of the partition used for everyone's home directories.
Occasionally someone runs into it, and this has sometimes caused system crashes. There is some chance that the reformat will shift things around in the sector header enough that the "bad" bit will be readable. However, I'm not counting on it.

What I would like to know is whether other people have seen this on their CDC drives. If we replace these two drives, are we likely (statistically speaking) to see the same problem recur on the new ones? Did you buy your controller and/or disk(s) from SGI or from a third party? What did you have to do in order to fix it? Does anybody from SGI who knows a lot about SMD drives know any more about this problem?

BTW, can I read in the badblock list before beginning the reformat (which keeps the sector header the same length), do the format, and write the list back out afterwards? I will be keeping the same partition table, so the mapping area will be in the same place. I strongly suspect that a simple format will lose the existing information, and I do not trust the testing program to do a *really* thorough job of finding bad spots. Nor do I particularly wish to type in the MDL by hand :-).

Please reply directly to me, and I will summarize to the net if there is new information. (I'm also interested in finding out just how many people out there are using Sabres on their Irises; I know of only one other site in Canada.)

Thanks in advance.
-- 
 Ruth Milner          UUCP - {uunet,pyramid}!utai!helios.physics!sysruth
 Systems Manager      BITNET - sysruth@utorphys
 U. of Toronto        INTERNET - sysruth@helios.physics.toronto.edu
  Physics/Astronomy/CITA Computing Consortium