[comp.sys.sgi] CDC Sabres on Iris systems

sysruth@helios.toronto.edu (Ruth Milner) (07/25/89)

I am interested in hearing from those of you who have installed a combination
of Xylogics 754 SMD controller/CDC Sabre 9720-1230 (1.23GB) disk(s) on your
Iris systems, particularly the 4D/240 or 120.

We have encountered numerous odd problems with ours, some of which have been
solved. Some missing functionality for SMD drives was provided by upgrading
to 3.1D. We also know that to eliminate intermittent hanging under heavy 
disk-loading conditions, we must reformat with a new 754, rev. 2.4 or higher 
(which we have installed, although we can't reformat until our Exabyte is 
working); Xylogics has modified the sector header layout in a way which cures
this bug. 

My query concerns problems with flagging bad blocks on the drive using fx.
Currently on each drive we have a bad block (actually a bad track, since fx
maps out whole tracks at a time) which cannot be marked as bad. Neither one is
among the special few which by definition cannot be mapped (see the 4D System
Administrator's Guide, p. 4-10). From the hotline I gather that it is most 
likely due to the location of the defect coinciding with the bit in the sector 
header which is set to show that the block is mapped (I will restrain myself 
from commenting :-) on the wisdom of relying on a bit in a bad block to mark 
the block as bad, since there may be no other simple way to do it under the 
IRIX filesystem layout). What we saw during the initial fx run was the track 
mapped out as part of the manufacturer's defect list, and then mapped again 
during the badblock testing run which followed. As a result there are two 
entries for each of these tracks in the badblock list, both pointing to the
same map track. However, the system still doesn't recognize them as bad, and 
using fx online I can only remove one of the entries from the table. One of 
these tracks was in a second swap area, which we could remove from active use, 
but the other is in the middle of the partition used for everyone's home 
directories. Occasionally someone runs into it, and this has sometimes caused 
system crashes.
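
To make the duplicate-entry situation above concrete, here is a minimal Python sketch of how one might spot tracks that appear more than once in a badblock list. The (cylinder, head, map track) tuple layout and the sample values are hypothetical, chosen only for illustration; this is not fx's actual table format, and the sketch does not talk to fx or the drive.

```python
from collections import defaultdict

def find_duplicate_mappings(badblock_list):
    """Return bad tracks that appear more than once in the list.

    Each entry is a hypothetical (cylinder, head, map_track) tuple.
    The result maps (cylinder, head) to a list of (index, map_track)
    pairs for every track listed two or more times.
    """
    seen = defaultdict(list)
    for index, (cyl, head, map_track) in enumerate(badblock_list):
        seen[(cyl, head)].append((index, map_track))
    return {track: hits for track, hits in seen.items() if len(hits) > 1}

# Hypothetical sample data: one track mapped twice to the same map track,
# as happens when it is mapped from the manufacturer's defect list and
# then again by the badblock test.
entries = [
    (100, 3, 5000),
    (412, 7, 5001),
    (412, 7, 5001),
]
dups = find_duplicate_mappings(entries)
for (cyl, head), hits in dups.items():
    print("cyl", cyl, "head", head, "listed", len(hits), "times:", hits)
```

A check like this only identifies the redundant entries; it says nothing about whether the controller can actually read the "mapped" bit in the affected sector headers.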

There is some chance that the reformat will shift things around in the
sector header enough that the "bad" bit will be readable. However, I'm not
counting on it. What I would like to know is whether other people have seen
this on their CDC drives. If we replace these two drives, are we likely
(statistically speaking) to see the same problem recur on the new ones? 
Did you buy your controller and/or disk(s) from SGI or from a third party?
What did you have to do in order to fix it? Does anybody from SGI who knows a 
lot about SMD drives know any more about this problem?

BTW, can I read in the badblock list before beginning the reformat (which keeps 
the sector header the same length), do the format, and write the list back out 
afterwards? I will be keeping the same partition table, so the mapping area
will be in the same place. I strongly suspect that a simple format will lose 
the existing information, and I do not trust the testing program to do a 
*really* thorough job of finding bad spots. Nor do I particularly wish to
type in the MDL by hand :-).
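
The save-and-restore idea can be sketched as follows, assuming the list could somehow be obtained as (cylinder, head, map track) tuples. The text file layout and the function names save_defect_list and load_defect_list are hypothetical; whether fx can export or import its list this way is exactly the open question, so this only shows the round trip through a file.

```python
import os
import tempfile

def save_defect_list(entries, path):
    """Write hypothetical (cylinder, head, map_track) entries, one per line."""
    with open(path, "w") as f:
        for cyl, head, map_track in entries:
            f.write(f"{cyl} {head} {map_track}\n")

def load_defect_list(path):
    """Read the entries back, skipping blank lines."""
    entries = []
    with open(path) as f:
        for line in f:
            line = line.strip()
            if not line:
                continue
            cyl, head, map_track = (int(field) for field in line.split())
            entries.append((cyl, head, map_track))
    return entries

# Hypothetical sample list, saved before a reformat and reloaded afterwards.
original = [(100, 3, 5000), (412, 7, 5001)]
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "defects.txt")
    save_defect_list(original, path)
    restored = load_defect_list(path)

# Confirm nothing was lost or reordered in the round trip.
assert restored == original
```

The copy would of course have to live somewhere other than the disk being reformatted.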

Please reply directly to me, and I will summarize to the net if there is new
information. (I'm also interested in finding out just how many people out there
are using Sabres on their Irises; I know of only one other site in Canada).

Thanks in advance.

-- 
 Ruth Milner          UUCP - {uunet,pyramid}!utai!helios.physics!sysruth
 Systems Manager      BITNET - sysruth@utorphys
 U. of Toronto        INTERNET - sysruth@helios.physics.toronto.edu
  Physics/Astronomy/CITA Computing Consortium