[comp.sys.att] Hard disk problems

lemke@Apple.COM (Steve Lemke) (08/10/88)

Recently, I've been getting the following in my /usr/adm/unix.log:

HDERR ST:51 EF:10 CL:FF80 CH:FF01 SN:FF00 SC:FF02 SDH:FF25 DMACNT:FFFF DCRREG:95 MCRREG:9100 Tue Aug  2 02:00:38 1988

drv:0 part:2 blk:19552 rpts:1 Tue Aug  2 02:00:39 1988

HDERR ST:51 EF:10 CL:FF80 CH:FF01 SN:FF00 SC:FF02 SDH:FF25 DMACNT:FFFF DCRREG:95 MCRREG:9B00 Tue Aug  9 02:01:22 1988

drv:0 part:2 blk:19552 rpts:1 Tue Aug  9 02:01:22 1988

Now, the special thing about 2:00 am is that's when my uucico cron runs
to call another local system.  However, when I sometimes force uucico to
run from the keyboard at other times during the day, I get no unix.log
entries.  I don't know why they both happened on Tuesday, as my uucico
cron runs daily.

Can anyone tell me what causes these messages to happen?  They don't 
appear to be "hard" errors (non-recoverable).  Should I add this block
to my bad block table?  How do I go about doing that?

Thanks in advance.

-Todd-

gene@zeno.MN.ORG (Gene H. Olson) (08/11/88)

In article <15324@apple.Apple.COM> comdesign!ivucsb!todd@pyramid.com writes:
>Recently, I've been getting the following in my /usr/adm/unix.log:
>
>HDERR ST:51 EF:10 CL:FF80 CH:FF01 SN:FF00 SC:FF02 SDH:FF25 DMACNT:FFFF DCRREG:95 MCRREG:9100 Tue Aug  2 02:00:38 1988
>
>drv:0 part:2 blk:19552 rpts:1 Tue Aug  2 02:00:39 1988
>
>HDERR ST:51 EF:10 CL:FF80 CH:FF01 SN:FF00 SC:FF02 SDH:FF25 DMACNT:FFFF DCRREG:95 MCRREG:9B00 Tue Aug  9 02:01:22 1988
>
>drv:0 part:2 blk:19552 rpts:1 Tue Aug  9 02:01:22 1988

You have a `bad block' there.  For some reason the diagnostics do not
find all of them.  On my system I had two of them that I had to fix
using the technique below:

1)	Shut down your system (init s) and boot up the diagnostic.

2)	When the diagnostic menu comes up, type `s4test'.   The system
	will respond `expert>'.   Then type `6,12'.   The system will
	respond giving you the sizes of partitions 0, 1, 2.   On my
	system (multiuser option) these sizes are 64, 5000, rest-of-disk.
	Also note the number of heads on your disk; mine has 8.  Now
	type `U' to return to the regular menu.

3)	Computation time.  The messages in unix.log reference the same
	block, and if you are smart, you will check it both ways to make
	sure you have selected the right block before attempting to spare it.

	First look at the Hex messages above:

	CL: FF80 CH: FF01 SN: FF00 SDH: FF25

	In all these numbers, only the lower byte is significant.  CL is
	cylinder low (80 hex = 128 decimal) and CH is cylinder high
	(01 hex = 1 decimal).  That makes the cylinder number 256*1 + 128
	= 384.  SDH gives the head (only the lower nibble counts: 5 hex =
	5 decimal).  The sector number is SN = 0.

	I find it works best to spare blocks using the partition 0 block
	offset.  On standard ST506 drives (UNIX-PC) there are 17 sectors
	per track.  Each sector is 512 bytes.  16 sectors are used as
	8 1K blocks in the filesystem.  The 17th sector is reserved to
	spare out bad blocks.  (A nice arrangement since it usually avoids
	a seek when accessing bad blocks.)  So there are 8 blocks per
	track (this is a magic clue!).   Now we can figure out the
	partition 0 block number as follows:

		CH * 256 * (# heads = 8) * (# blocks/track = 8) =    16384
		CL *   1 * (# heads = 8) * (# blocks/track = 8) =     8192
		SDH                      * (# blocks/track = 8) =       40
		SN  / (# sectors/block = 2)                     =        0
		----------------------------------------------------------
		Partition 0 block number                        =    24616
	
	This is the block you want to spare out.  You should get the same
	answer using:

		Size of partition 0 (first cylinder)    =    64
		Size of partition 1 (swap area)         =  5000
		Reported (partition #2) block number    = 19552
		-----------------------------------------------
		Partition 0 block number                = 24616

4)	Select the menu item that spares bad blocks.  You will be allowed
	to spare out blocks 3 ways: by sector, by bit on track, and by block.
	Select the `block' option.  Spare block #24616.   You can spare
	just the first, or both the first and second sectors in that
	block.  You might as well spare both of them.
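
The arithmetic in step 3 can be sketched in a few lines of Python, as a
cross-check before sparing anything.  This is only an illustration: the
function names are made up for this example, and the geometry (8 heads,
8 blocks per track, partitions 0 and 1 of 64 and 5000 blocks) is taken
from the figures quoted above, so substitute whatever the diagnostic
reports for your own disk.

```python
import re

# One of the HDERR lines from the log above (timestamp omitted).
LOG_LINE = ("HDERR ST:51 EF:10 CL:FF80 CH:FF01 SN:FF00 SC:FF02 "
            "SDH:FF25 DMACNT:FFFF DCRREG:95 MCRREG:9100")


def parse_hderr(line):
    """Hypothetical helper: map each NAME:HEX field to its integer value."""
    return {name: int(value, 16)
            for name, value in re.findall(r"\b([A-Z]+):([0-9A-F]+)", line)}


def block_from_registers(regs, heads=8, blocks_per_track=8,
                         sectors_per_block=2):
    """Partition 0 block number from the controller registers.

    Assumes the geometry described in the post: only the low byte of
    each register matters, and the head is the low nibble of SDH.
    """
    cylinder = (regs["CH"] & 0xFF) * 256 + (regs["CL"] & 0xFF)
    head = regs["SDH"] & 0x0F
    sector = regs["SN"] & 0xFF
    return ((cylinder * heads + head) * blocks_per_track
            + sector // sectors_per_block)


def block_from_partition(reported_block, earlier_partition_sizes):
    """Partition 0 block number from the drv/part/blk log line."""
    return sum(earlier_partition_sizes) + reported_block


by_registers = block_from_registers(parse_hderr(LOG_LINE))
by_partition = block_from_partition(19552, [64, 5000])
print(by_registers, by_partition)
```

With the values from the log both methods give 24616.  If the two
numbers disagree on your system, recheck the head count and partition
sizes before sparing.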


Gene H. Olson
amdahl!bungia!gene

Kdavid@gizzmo.UUCP (David Solan) (08/15/88)

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

Many thanks to Gene H. Olson for his recent meticulous description of
how to remove a bad block on the hard disk of the UNIX PC using the
Diagnostic Floppy.

Objective Utilities (TM), a software product of Objective Programming
Inc., (Copyright (C) 1988 Objective Programming Incorporated, All
Rights Reserved) can also accomplish this task on the UNIX PC (ALONG
WITH MANY OTHERS), but in quite a different manner.  Mr. Olson's
method involves shutting down your system in order to do this bad
block removal.  It also involves doing complicated arithmetic.
Objective Utilities involves neither of these.

With Objective Utilities, the bad block (which, using a certain
feature of Objective Utilities, you can readily map to the file it is
in) is removed dynamically, while your system is up and running.
After this removal, it is actually feasible to have a more
sequentially organized file than you had previously with the bad
block, and therefore a file that can be accessed more quickly by your
hard disk.  With the Diagnostic Floppy bad block removal method, by
contrast, seek times allegedly are not degraded after the removal (BUT
SEE BELOW!), but latency times certainly are.

Actually, the /usr/adm/unix.log output only gives you information on a
1024 byte bad "block".  If you remove both the 512 byte sectors
comprising this bad block using Mr. Olson's method, as indeed he
suggests (because in this situation you really have no other choice),
seek times might indeed be degraded, since the second "bad" sector
will have to be placed on a different track than the first "bad"
sector (there's only room for one bad sector per track -- see below).

Also, there is something I don't understand here and perhaps Mr. Olson
could be so kind as to respond.  Isn't it true that when you place a
sector on the bad block table, via the Diagnostic Floppy for the UNIX
PC, the actual data contents of the "bad" sector are NOT moved to the
17th sector of the track it is mapped to, so that data on that sector
is lost forever if you do this?  I was sure this was the case.  Was I
wrong?  Can anyone answer this question absolutely, positively, for
sure (no guesses here please!)?

One last point in response to some earlier ruminations on USENET.  The
WD1010 method of sector verification when reading the hard disk
depends on a certain layout for each track of the hard disk.  The
first 26 bytes are verification info.  Then each of the 17 sectors
has 79 more bytes of verification info followed by 512 bytes of
actual disk data.  This comes to a total of 10073
bytes for 17 512-byte sectors, 16 of which are actually used in the
UNIX file system or elsewhere for data (that is, 8192 bytes of actual
data per track).  Since the ST-506 drives used in this machine have
10416 raw bytes per track, this leaves an extra 343 bytes per track --
not enough to do anything in particular.  Therefore, you CANNOT fit
more than 17 sectors per track, at least not with the current STRICT
limitations imposed on much of the hardware/software of the UNIX PC.
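
The byte budget above is easy to re-derive.  The following Python
sketch uses only the figures given in this posting (26 bytes of track
preamble, 79 bytes of overhead and 512 bytes of data per sector, 10416
raw bytes per track); the constant and function names are made up for
the illustration.

```python
TRACK_PREAMBLE = 26          # verification bytes at the start of a track
SECTOR_OVERHEAD = 79         # verification bytes before each sector
SECTOR_DATA = 512            # data bytes per sector
RAW_BYTES_PER_TRACK = 10416  # raw track capacity quoted in the post


def formatted_bytes(sectors):
    """Bytes consumed by a track formatted with the given sector count."""
    return TRACK_PREAMBLE + sectors * (SECTOR_OVERHEAD + SECTOR_DATA)


used = formatted_bytes(17)
left_over = RAW_BYTES_PER_TRACK - used
max_sectors = (RAW_BYTES_PER_TRACK - TRACK_PREAMBLE) // (
    SECTOR_OVERHEAD + SECTOR_DATA)

print(used, left_over, max_sectors)
```

An 18th sector would need another 591 bytes, and only 343 are left
over, which is why 17 is the limit under this layout.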

If anyone on USENET wishes more information on Objective Utilities, or
our other product for the UNIX PC, ACCUCLOCK, please send your FULL
and EXACT USPS ADDRESS to the address below and brochures will be in
the mail.  You could also contact me by voice phone, as listed below,
at any reasonable time, weekends included.

P.S. While I am on the topic, let me get something publicly off my
chest that has been gnawing away at me for some time.  Objective
Utilities is a fully copyrighted and trademarked product of Objective
Programming Incorporated.  We spent many long hours into the night to
develop it.  It belongs EXCLUSIVELY to us -- period.  This is to
notify anyone on USENET who might be using an illegal copy of this
product, and unfortunately we have clear evidence that such activity
was at least attempted by at least one person on USENET, that you are
violating our rights if you are using it without a license to do so.
Please IMMEDIATELY destroy ALL copies you have and preferably send us
a personal notice that you are doing this.  Of course, the product is
available for a price to all who wish to pay for it.  Multi-CPU
licenses are also available.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

		David  Solan
		Post Office Box 123
		Norwalk,  CT  06856
		Voice: (203) 866-6900
		attmail: <!dsolan>

-- 
16 million Americans will die in the next 10 years without their seat belts on.  Buckle up now!
                                       {codas,u1100a}-----\
David Solan                   rutgers!rochester!pcid!kodak!gizzmo!kdavid
                                 {lazlo,ethos,fthood}-----/

Kdavid@gizzmo.UUCP (David Solan) (08/18/88)

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

As Emily Litella would say: neevvver minnnnnnnd.  I bit the bullet and
sacrificed 512 bytes of my hard disk being read in sequentially for
the sake of suffering UNIX humanity.  As you might recall, I asked
the question in a former posting:

>                                 Isn't it true that when you place a
>sector on the bad block table, via the Diagnostic Floppy for the UNIX
>PC, the actual data contents of the "bad" sector are NOT moved to the
>17th sector of the track it is mapped to, so that data on that sector
>is lost forever if you do this?  I was sure this was the case.  Was I
>wrong?

I tested it out.  I was NOT wrong!  You most assuredly DO lose 512
bytes for every sector you put on the bad block table (at least via
the Diagnostic Floppy program that I am using).  What appears in the
place of what was once beautiful data is just a string of 512
hexadecimal ff's.  So think Objective Utilities!
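
For anyone who wants to look for sectors that ended up this way, a
check along the following lines might do.  It is a rough Python sketch
with a made-up function name, and it assumes the all-ff fill seen with
this particular Diagnostic Floppy; a sector of genuine data could in
principle be all ff's too, so treat a match as a hint only.

```python
SECTOR_SIZE = 512  # bytes per sector on the drives discussed here


def looks_spared(sector):
    """True if the buffer is one full sector holding only 0xFF bytes."""
    return len(sector) == SECTOR_SIZE and sector == b"\xff" * SECTOR_SIZE


blank = b"\xff" * SECTOR_SIZE
mostly_blank = b"\xff" * (SECTOR_SIZE - 1) + b"\x00"
print(looks_spared(blank), looks_spared(mostly_blank))
```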

P.S. Sorry, but my USENET .signature was wrong in my last posting, both
factually and in getting 2 nodes mixed up.  NOW it is correct, both ways.

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-

		David  Solan
		Post Office Box 123
		Norwalk,  CT  06856
		Voice: (203) 866-6900
		attmail: <!dsolan>

-- 
24 million Americans will die in the next 10 years without their seat belts on.  Buckle up now!
                                       {codas,u1100a}-----\
David Solan                   rutgers!rochester!kodak!pcid!gizzmo!kdavid
                                 {lazlo,ethos,fthood}-----/

mikebe@i88.isc.com (Michael G. Beirne) (11/19/90)

Hello, I am having problems formatting a MiniScribe 6085 Hard drive.

With the enhanced diagnostics it does seem to format the drive (it clicks
for each cylinder and the numbers count down from 1024), but
at the end it complains with the message:
Test	: Hard disk test( Drive 0 )
Subtest	: Format.
Error	: Winchester: Can't Write the new VHB:Response = 10
	Enter y [Y] to Abort, Return to continue:

If I just hit return it says that 
VHB write failed. Disk need to be re-initialized.

The Disk tests complain about not being able to recalibrate the drive.

I have tried these with two different hard drives and used the MiniScribe
6085 in another system with no problems, so it does not seem to be the drive. 

Should I just stretch my budget and buy someone's dead/used system and try
to cobble something together?

Or does someone know from this description where I should look for bad
signals/chips?  I am good with a soldering iron and oscilloscope so this
is not too daunting a task.
--
mikebe@i88.isc.com or beirne@chinet.chi.il.us