[comp.sys.sun] Tape wearing out on Exabyte?

grant@saturn.cs.swin.oz.au (Grant Collins) (08/15/90)

I have a brand new Exabyte with a brand new tape in it, and every time I do
a dump over the network to it, it gives a message like:

st1: warning, the tape may be wearing out or the head may need cleaning.
st1: write retries= 13936, file= 0, block= 3448

I have seen this discussed a while ago but I can't recall the
cause/solution to the problem.  Any ideas?  When I restore the contents
to an unused partition all APPEARS OK.  Is there any way to verify that
nothing has been corrupted, or do I just cross my fingers and hope for the
best?  Any help much appreciated.

leadley@uhura.cc.rochester.edu (Scott Leadley) (09/02/90)

In article <1990Aug15.235722.7639@rice.edu> grant@saturn.cs.swin.oz.au (Grant Collins) writes:

>I have a brand new Exabyte with a brand new tape in it, and every time I do
>a dump over the network to it, it gives a message like:
>
>st1: warning, the tape may be wearing out or the head may need cleaning.
>st1: write retries= 13936, file= 0, block= 3448

Cleaning the heads every other day, approximately every 10-15 hours of
use, works for us.  We use Sony 8mm cleaning cartridges 'cause they are a
lot cheaper and easier to get than Exabyte cleaning cartridges.

Scott Leadley - leadley@cc.rochester.edu

dave@imax.com (Dave Martindale) (09/13/90)

In article <1990Aug15.235722.7639@rice.edu> grant@saturn.cs.swin.oz.au (Grant Collins) writes:
>I have a brand new Exabyte with a brand new tape in it, and every time I do
>a dump over the network to it, it gives a message like:
>
>st1: warning, the tape may be wearing out or the head may need cleaning.
>st1: write retries= 13936, file= 0, block= 3448
>
>I have seen this discussed a while ago but I can't recall the
>cause/solution to the problem.  Any ideas?  When I restore the contents
>to an unused partition all APPEARS OK.  Is there any way to verify that
>nothing has been corrupted, or do I just cross my fingers and hope for the
>best?  Any help much appreciated.

Exabytes normally get a certain number of retries when writing or reading
a tape.  If you want to know how many have occurred on the current tape at
any point, do a "mt status" on the drive.

The number of retries you get depends on the tape, how clean the heads
are, and how much of the tape you have used so far.  When writing
full-length 2 GB tapes with an apparently-clean drive using new Sony
tapes, I've seen anywhere from 200 to 20,000 retries.  I can only assume
that this is mostly variation in the quality of the coating on the
individual tapes.

To complicate things further, the Sun SCSI tape driver seems to report the
number of retries only when the device is closed, *and* only if the number
is greater than about 5000.  So you can go along for years without ever
seeing that message if you typically write only a few hundred MB to a tape
(e.g. for backups).  Then, when you start writing full-length tapes, a
certain percentage of your tapes (10-20% of mine do this) get enough
retries for the errors to get reported.
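
To keep track of retry counts across tapes, the driver's console line can
be parsed and compared against that rough threshold.  What follows is only
a minimal Python sketch: the message layout is taken from the example
quoted in this thread, the function names are made up for illustration, and
the 5000 figure is the approximate value mentioned above, not a documented
constant of the st driver.

```python
import re

# Layout copied from the message quoted in this thread, e.g.
#   st1: write retries= 13936, file= 0, block= 3448
# Only "write" appears in the thread; accepting any word there is an assumption.
RETRY_LINE = re.compile(
    r"^(?P<unit>st\d+): (?P<op>\w+) retries=\s*(?P<retries>\d+),"
    r"\s*file=\s*(?P<file>\d+),\s*block=\s*(?P<block>\d+)$"
)

# Assumption: "about 5000" from the post above, not an official driver value.
REPORT_THRESHOLD = 5000


def parse_retry_line(line):
    """Return the fields of a retry line as a dict, or None if it does not match."""
    match = RETRY_LINE.match(line.strip())
    if match is None:
        return None
    return {
        "unit": match.group("unit"),
        "op": match.group("op"),
        "retries": int(match.group("retries")),
        "file": int(match.group("file")),
        "block": int(match.group("block")),
    }


def would_be_reported(record, threshold=REPORT_THRESHOLD):
    """True if the retry count exceeds the (assumed) reporting threshold."""
    return record["retries"] > threshold


if __name__ == "__main__":
    rec = parse_retry_line("st1: write retries= 13936, file= 0, block= 3448")
    print(rec["retries"], would_be_reported(rec))
```

Logging the parsed counts per tape label would show whether high counts
follow particular tapes or a particular drive.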

In general, you don't need to worry about it.  The drive *has* rewritten
those data blocks and should be able to re-read the tape without problem.
If it could not write the data successfully, it should have given you an
I/O error.  It does take extra space on the tape to do these rewrites, but
the drive does not normally use the full length of the tape anyway, so
there is built-in capacity for a certain error rate.
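
For anyone who would rather check than trust, one way to verify a restore
is to compare checksums of the original files against the copy restored to
a scratch partition.  This is a minimal Python sketch under stated
assumptions: both trees are mounted and readable, the helper names are
hypothetical, and SHA-256 is simply a convenient choice of digest.  On a
live filesystem, files that changed after the dump will show up as
differences even though the tape is fine.

```python
import hashlib
import os


def file_digest(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of one file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def tree_digests(root):
    """Map each regular file's path (relative to root) to its digest."""
    digests = {}
    for dirpath, _dirnames, filenames in os.walk(root):
        for name in filenames:
            full = os.path.join(dirpath, name)
            # Skip symlinks, devices, sockets and other non-regular files.
            if os.path.islink(full) or not os.path.isfile(full):
                continue
            digests[os.path.relpath(full, root)] = file_digest(full)
    return digests


def compare_trees(original, restored):
    """Return (missing, extra, changed) relative paths, each sorted."""
    before = tree_digests(original)
    after = tree_digests(restored)
    missing = sorted(set(before) - set(after))
    extra = sorted(set(after) - set(before))
    changed = sorted(p for p in set(before) & set(after) if before[p] != after[p])
    return missing, extra, changed
```

Three empty lists from compare_trees mean every regular file in the
original tree came back with identical contents.  Ownership, permissions
and special files are not checked by this sketch.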

You might want to try cleaning the drive when you see the message, just in
case the head is dirty.  But in my experience, the errors seem to be
mostly associated with a particular tape.  I've had one tape get over
10,000 retries, then the next tape in the same drive would get only 2,000,
without doing anything at all to clean the drive.

stevenr@relay.eu.net (Steven Roth) (10/08/90)

In article <1990Sep4.232322.16921@rice.edu> leadley@uhura.cc.rochester.edu (Scott Leadley) writes:
>
>Cleaning the heads every other day, approximately every 10-15 hours of
>use, works for us.  We use Sony 8mm cleaning cartridges 'cause they are a
>lot cheaper and easier to get than Exabyte cleaning cartridges.

So how do you actually use the cleaning tapes?  I have a standard TDK
cleaning tape, and the Exabyte spits it out after unsuccessfully trying to
determine BOT.  How do you exercise the drive to make the cleaning tape go
past the heads?

Steven M. Roth,   International Institute for Applied Systems Analysis (IIASA)
A-2361 Laxenburg, Austria, Europe    UUCP: iiasa!stevenr
INTERNET: stevenr@iiasa.eu.net      BITNET: tuvie!iiasa!stevenr@cernvax.BITNET

cgh018%olympus@rti.rti.org (Calvin Hayden x2254) (10/08/90)

leadley@uhura.cc.rochester.edu (Scott Leadley):
> In article <1990Aug15.235722.7639@rice.edu> grant@saturn.cs.swin.oz.au (Grant Collins) writes:
> 
>>I have a brand new Exabyte with a brand new tape in it, and every time I do
>>a dump over the network to it, it gives a message like:
>>
>>st1: warning, the tape may be wearing out or the head may need cleaning.
>>st1: write retries= 13936, file= 0, block= 3448

Just a note here.  We have 8mm units (Exabyte transport units) sold by
other companies (workstation solutions, Microtechnology, etc).  Most of
the vendors have told us that their hardware maintenance (replacement
within 24 hours) is invalidated if we have been using anything other than
the Exabyte cleaning cartridges - the Sonys are supposedly too abrasive,
and cause excessive wear on the head(s).  Don't know how true this is, but
thought I'd mention it.

Calvin Hayden
TI
...mcnc!rti!{tijc02|olympus}!root

mitt@haze.mitre.org (Jeff Mittelman) (10/08/90)

This is not normal.  This is a known bug in the SCSI driver.  The bug
number is 1042822 and a patch is supposed to be available in the next
couple of weeks.

csb@gdwb.oz.au (Craig Bishop) (10/08/90)

dave@imax.com (Dave Martindale) writes:

>In article <1990Aug15.235722.7639@rice.edu> grant@saturn.cs.swin.oz.au (Grant Collins) writes:
>>I have a brand new Exabyte with a brand new tape in it, and every time I do
>>a dump over the network to it, it gives a message like:
>>
>>st1: warning, the tape may be wearing out or the head may need cleaning.
>>st1: write retries= 13936, file= 0, block= 3448
>>
>>I have seen this discussed a while ago but I can't recall the
>>cause/solution to the problem.  Any ideas?  When I restore the contents
>>to an unused partition all APPEARS OK.  Is there any way to verify that
>>nothing has been corrupted, or do I just cross my fingers and hope for the
>>best?  Any help much appreciated.

>Exabytes normally get a certain number of retries when writing or reading
>a tape.  If you want to know how many have occurred on the current tape at
>any point, do a "mt status" on the drive.

In SunOS 4.1 the st driver has changed, and these errors are occurring more
frequently and erroneously.

We have a Sun-supplied Exabyte, and as well as using it for nightly backups
we do backups of our PCs to it during the day using PCNFS Lifeline and
some control software we wrote.

Since SunOS 4.1 we have been getting errors on our nightly saves which we
never got before, and the PC backups don't work at all; PC Lifeline just
craps out.

What we found is that the st driver no longer positions the tape
correctly when the tape is loaded or when it is rewound.  The UNIX
tools continue to work, but the errors above (we believe) are caused by
this problem.

The first PC to try to back up after the tape has been loaded fails in a
different place than the first PC save after a rewind.  PC Lifeline times
out 4 times (hard-coded limit) and then gives up on the dd process running
on the Sun.  The dd is not responding because it is waiting for the st
driver, which is playing with the tape trying to get ready to write.

Sun Australia has reported this as a bug to Sun US and we are waiting for
a fix.  We believe the fix will stop the majority of the overnight errors.

Craig Bishop			Geelong & District Water Board
Phone: +61 52 262506		61-67 Ryrie St Geelong
Fax:   +61 52 218236		Victoria 3220 Australia