[comp.sys.sun] Disk sequencer errors

viosca@umn-cs.arpa (R. Randall Viosca (Randy)) (01/24/89)

in regards to:
>> xy0a: read retry (disk sequencer error) -- blk #495, abs blk #495
>>Is this a serious disk problem that I should worry about?...
>
>We have received these errors on several different systems (several
>different controller/disk combinations.) Sometimes the disk stops working
>and sometimes we get a couple of the error messages and then things
>proceed normally....

If my memory serves me correctly, we were having the same errors as well.
Problems went away when I changed the configuration on the vme-mbio
adaptor and on the controller.  I believe that both should be configured
to 24 bit addresses.  From what I have been able to gather from trial and
error:

	Disk sequencer errors seem to occur in 20 bit operation.

	Memory address errors seem to occur when the adaptor is in 20 bit
	mode and the controller is in 24 bit (or vice versa).

Please notice that I said `seem'.  It would be most helpful to have access
to a list of controller errors and what they mean (in English).  Without
them, disk maintenance is like a black art.
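
A rough illustration of why a 20 bit / 24 bit mismatch could plausibly
show up as an address error.  This is a sketch under an assumption, not
something confirmed by controller documentation: it assumes the narrower
side simply drops the upper address bits.  The function name and the
sample address are hypothetical.

```python
def truncate_address(addr, bits):
    """Keep only the low `bits` bits of addr, as a narrower address path might."""
    return addr & ((1 << bits) - 1)


# Hypothetical 24-bit address that does not fit in 20 bits.
addr_24 = 0x123456

# What the 20-bit side would see if the top four bits were dropped.
addr_20 = truncate_address(addr_24, 20)

print(hex(addr_24), "->", hex(addr_20))
print("same location" if addr_24 == addr_20 else "different location")
```

Any address below 2**20 passes through unchanged, which would fit with
the errors being intermittent rather than constant.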

For instance, what do you do with a slew of these in adm/messages?

	xy2e: write restore (cylinder & head header error) - ... a blk no.

RTFM doesn't work.  Slipping the offending sector doesn't work.  Mapping
works, but that's ugly.  You end up hoping that the bad sectors are in a
swap partition, because you will most likely have to fix the entire track.
Now, here's the oddity: if you do a rhdr (read header) before and after
the fix, the results are the same, i.e. nothing appears to have been
slipped or mapped.

Clearly, it would be nice to know what is happening here!
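
Short of a proper error list, one way to get a handle on a pile of these
messages is to tally them by partition and error text.  A minimal sketch
follows.  The pattern is inferred only from the two sample messages quoted
in this thread, so other driver messages may not match, and the function
name is made up for illustration.

```python
import re
from collections import Counter

# Inferred from the two sample lines in this thread, e.g.
#   xy0a: read retry (disk sequencer error) -- blk #495, abs blk #495
# This is an assumption about the format, not a documented specification.
XY_ERROR = re.compile(r"(xy\d+[a-h]): (\w+) (\w+) \(([^)]+)\)")


def tally_xy_errors(lines):
    """Count (partition, error text) pairs found in an iterable of log lines."""
    counts = Counter()
    for line in lines:
        m = XY_ERROR.search(line)  # search, so a syslog prefix is tolerated
        if m:
            partition, _operation, _action, error = m.groups()
            counts[(partition, error)] += 1
    return counts


if __name__ == "__main__":
    sample = [
        "xy0a: read retry (disk sequencer error) -- blk #495, abs blk #495",
        "xy0a: read retry (disk sequencer error) -- blk #495, abs blk #495",
        "xy2e: write restore (cylinder & head header error) - ...",
    ]
    for (partition, error), n in tally_xy_errors(sample).most_common():
        print(f"{n:4d}  {partition}  {error}")
```

To run it against a real log, pass an open file object to
tally_xy_errors() in place of the sample list.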

	Randy Viosca
	University of Minnesota Computer Science Laboratories

sjh@helicon.math.purdue.edu (S. Holmes [Consulting Detective]) (05/03/89)

"R. Randall Viosca (Randy)" <viosca@umn-cs.arpa>:
  >> in regards to:
  >> >> xy0a: read retry (disk sequencer error) -- blk #495, abs blk #495
  >> >>Is this a serious disk problem that I should worry about?...
  >> >
  >> >We have received these errors on several different systems (several
  >> >different controller/disk combinations.) Sometimes the disk stops working
  >> >and sometimes we get a couple of the error messages and then things
  >> >proceed normally....

We had a rash of these problems and checked all of the recommended
parameters etc.  We ended up getting a new disk and that fixed it.  It
appears that the HDA (on the Eagle) was going out.

Steve Holmes				purdue!sjh
Systems Administrator			sjh@math.purdue.edu
Dept. of Mathematics			(317) 494-6055
Purdue University
W. Lafayette, Indiana 47907