[unix-pc.general] WD2010 / No ECC

jcm@mtunb.ATT.COM (was-John McMillan) (08/19/89)

In article <15462@rphroy.UUCP> tkacik@rphroy.UUCP (Tom Tkacik) writes:
:
>
>I installed the WD2010 into a standard 7300, and can verify that it makes
>the disk drive work better.  After installing Lenny's errnotify(1) command,
>I have been seeing at least 3 or 4 disk errors a day.  If I was doing
>anything, it would go up.  Compiling gcc could generate about 20 or 30
>errors.
:
>It must be the error correction circuitry.  I recommend the change
>even if you are not using a big disk.  (Who knows, maybe someday you will. :-))
:

Several people have made assertions about installing the WD2010:
	1) It has (a) reduced errors, or (b) recovered 'lost' disks;
	2) It must be the Error Correcting Code [ECC] that does this.

I don't dispute (1) and am happy this chip helps.  We all need
	a break once in a while ];-)

However, these references to ECC are fanciful or an incorrect
	use of the term.

THERE IS an ECC generator/checker on the WD2010.  (Its use is
	indicated/triggered by SDH reg bit7=1.) Since the kernel
	was designed to support the WD1010 chip -- which lacks ECC --
	there is NO kernel support of ECC.  

The WD1010/WD2010's CRC mode appends a 2 byte field to the end of
	each data field.  The ECC mode appends a 4 byte field.
	I haven't tried it, but the two disk formats seem incompatible,
	requiring a re-format when changing between CRC and ECC.
	It would be nice were the ECC supported -- but I've never
	even identified a way to figure out WHICH chip is plugged in:
		a) W.D. technical support said THEY had no idea of
			how to figure this out.
		b) Perhaps some specific ECC associated command to
			a WD1010 would fail identifiably, but it
			wasn't obvious at a glance.
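
To make the CRC-versus-ECC format difference concrete, here is a minimal sketch in C. It models only what is stated above: SDH bit 7 selects ECC, CRC appends 2 bytes and ECC appends 4. The function names are hypothetical, the other SDH bits are not modeled, and the 512-byte sector size in the usage note is an assumption.

```c
#include <assert.h>
#include <stdint.h>

/* SDH register bit 7: 1 selects ECC, 0 selects CRC (as stated in the post). */
#define SDH_ECC_BIT 0x80u

/* Length in bytes of the check field appended to each data field. */
unsigned check_field_bytes(uint8_t sdh)
{
    return (sdh & SDH_ECC_BIT) ? 4u : 2u;
}

/* Bytes taken on disk by one data field plus its check field.
 * sector_bytes is the payload size; the caller supplies it. */
unsigned data_field_bytes(uint8_t sdh, unsigned sector_bytes)
{
    assert(sector_bytes > 0);
    return sector_bytes + check_field_bytes(sdh);
}
```

With an assumed 512-byte sector, data_field_bytes(0x00, 512) is 514 and data_field_bytes(0x80, 512) is 516. Every data field grows by two bytes in ECC mode, which is why a disk formatted for one mode would not read back in the other.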
 
Since IT AIN'T THE ECC, wherein lies the magic of the WD2010?
	Without getting into the theory of how a Phase Lock Loop [PLL]
	works -- which would be ridiculous for ME to try 8) -- let's
	just assert a smarter PLL circuit makes fewer errors in
	assessing marginal signals.  I.e., as the waveforms vary
	from an "ideal" model, the poorer circuit will begin to
	mis-assign transitions and mis-track the signal.

PLL circuits are NOT ECC, they are just a tracking mechanism meant
	to deal with normal perturbations and PREVENT errors by
	constant small adjustments to timing calculations.

So... ECC's OUT, the PLL's a guess, and why in honk am I here today?

john mcmillan	-- att!mtunb!jcm

dold@mitisft.Convergent.COM (Clarence Dold) (08/20/89)

in article <1624@mtunb.ATT.COM>, jcm@mtunb.ATT.COM (was-John McMillan) says:

> Several people have made assertions about installing the WD2010:
> 	1) It has (a) reduced errors, or (b) recovered 'lost' disks;
> 	2) It must be the Error Correcting Code [ECC] that does this.

> I don't dispute (1) and am happy this chip helps.  We all need
> 	a break once in a while ];-)

> However, these references to ECC are fanciful or an incorrect
> 	use of the term.
> 	there is NO kernel support of ECC.  

> 	It would be nice were the ECC supported -- but I've never
> 	even identified a way to figure out WHICH chip is plugged in:
> 		a) W.D. technical support said THEY had no idea of
> 			how to figure this out.

Indeed, there is no ECC support in the kernel, and it would require a 
reformat of the disk if it did.
The chip merely informs the system of an ECC error.  Some systems leave
a 'Track Buffer' intact, for the 2010 to make the change, others must
locate the data, wherever it has been transferred, and make the change there.

I don't recall the detail, but detecting the presence of a 2010 vs a 1010
on a machine that could have either was done by setting a cylinder register
to 1200, then reading it back.  A WD1010 would modulo 1024 the register,
a WD2010 would return 1200.
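
The read-back test just described can be sketched as a small C simulation. This does not touch hardware; it only models the arithmetic. The 10-bit width for the WD1010 follows from the "modulo 1024" remark above. The 11-bit width for the WD2010 is an assumption (any width of 11 bits or more returns 1200 unchanged), and all names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

enum chip { WD1010, WD2010 };

/* Mask of implemented cylinder bits.
 * WD1010: 10 bits (modulo 1024, per the post).
 * WD2010: 11 bits is an ASSUMPTION made for this sketch. */
uint16_t cyl_mask(enum chip c)
{
    return (c == WD1010) ? 0x03FFu : 0x07FFu;
}

/* Model of "write the cylinder register, then read it back". */
uint16_t cyl_write_read(enum chip c, uint16_t value)
{
    return (uint16_t)(value & cyl_mask(c));
}

/* Classify the chip from the value read back after writing 1200. */
enum chip identify(uint16_t readback)
{
    assert(readback == 1200u || readback == (1200u % 1024u));
    return (readback == 1200u) ? WD2010 : WD1010;
}
```

In this model, writing 1200 reads back as 176 (1200 - 1024) on the WD1010 and as 1200 on the WD2010, so identify() can tell them apart from a single read.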

> Since IT AIN'T THE ECC, wherein lies the magic of the WD2010?
> 	Without getting into the theory of how a Phase Lock Loop [PLL]
> 	works -- which would be ridiculous for ME to try 8) -- let's
> 	just assert a smarter PLL circuit makes fewer errors in

The PLL isn't on the chip, it's external.
The chip itself might be better...

Try setting the Step Rate in an iv.desc file to 14 instead of 0, 
then iv -u the disk.  No loss of data, just a 20% increase in seek performance
on 28mSec disks.

Ignore any significance in my signature.
I don know nuddin about the Unix PC.
The iv part, in particular, is untried, untested, and subject to failure.
(but somebody that is well backed up should tell us if it works.)

-- 
---
Clarence A Dold - dold@tsmiti.Convergent.COM		(408) 434-5293
		...pyramid!ctnews!tsmiti!dold
		P.O.Box 6685, San Jose, CA 95150-6685	MS#10-007

jcm@mtunb.ATT.COM (was-John McMillan) (08/21/89)

In article <1182@mitisft.Convergent.COM> dold@mitisft.Convergent.COM (Clarence Dold) writes:
:
>Indeed, there is no ECC support in the kernel, and it would require a 
>reformat of the disk if it did.
>The chip merely informs the system of an ECC error.  Some systems leave
>a 'Track Buffer' intact, for the 2010 to make the change, others must
>locate the data, wherever it has been transferred, and make the change there.

	The above description follows what the WD2010 documents
	indicate IFF the chip is running in ECC mode -- ie., with
	the appropriately re-formatted disk and its 4 byte ECC
	checksums instead of the 2 byte CRC checksums.  The
	chip, as noted in the first sentence, is NOT put in this
	mode.

>I don't recall the detail, but detecting the presence of a 2010 vs a 1010
>on a machine that could have either was done by setting a cylinder register
>to 1200, then reading it back.  A WD1010 would modulo 1024 the register,
>a WD2010 would return 1200.

	I tried this 6 months ago: apparently my test was wrong --
	mea culpa ];-(

	The Kernel Debugger, after loading 0xff into the HICYL register
	reads back 0x03.  Thanks for getting me back on track.  (This
	may also slay an otherwise healthy system, so reset it to its
	original values before departing, please!)
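
	The save/probe/restore sequence can be sketched in C against a simulated register. This is not kernel or debugger code: the struct, the function names, and the valid_mask field are hypothetical stand-ins for the real HICYL register, chosen only to show why the original value should be put back.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated high-cylinder register: only the bits in valid_mask
 * are implemented; the rest always read back as zero. */
struct hicyl_reg {
    uint8_t value;
    uint8_t valid_mask;
};

void hicyl_write(struct hicyl_reg *r, uint8_t v)
{
    r->value = (uint8_t)(v & r->valid_mask);
}

uint8_t hicyl_read(const struct hicyl_reg *r)
{
    return r->value;
}

/* Write all ones, note which bits stick, then restore the old value
 * so the controller is left as it was found. */
uint8_t hicyl_probe(struct hicyl_reg *r)
{
    uint8_t saved = hicyl_read(r);
    hicyl_write(r, 0xFF);
    uint8_t implemented = hicyl_read(r);
    hicyl_write(r, saved);
    assert(hicyl_read(r) == saved);
    return implemented;
}
```

	A register modeled with valid_mask 0x03 returns 0x03 from hicyl_probe(), matching the read-back reported above, and still holds its original value afterwards.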

:

>The PLL isn't on the chip, it's external.
>The chip itself might be better...
	
	Agreed: the Data Separator PLL is off-chip.  I had wrongly
	presumed the MFM decoder also used a PLL, but I now see the
	chip's notes say the decoder's data is clocked in bit-by-bit.

	Since it ain't the ECC and it ain't the PLL, WHAT IS the
	difference between the WD1010 & WD2010's data error rates?
	Is it just chip-to-chip differences -- ie., might changing
	to a new WD1010 "fix" problems?  Is there a speed/sensitivity
	difference between the two chips?  Is the data separator's
	output marginal for a 1010, but fine for the 2010?  I doubt
	this can be cleared up!
	
>Try setting the Step Rate in an iv.desc file to 14 instead of 0, 
>then iv -u the disk.  No loss of data, just a 20% increase in seek performance
>on 28mSec disks.

	Others have suggested this:  Personally, I just can't face
	the risk....   Gutless Me!  (Except in girth.)

john mcmillan	-- att!mtunb!jcm