[net.unix-wizards] 4.1BSD and bad sectoring

ajg%ll-vlsi@sri-unix.UUCP (06/13/83)

Hi,
  We have one VAX 11/780 "ll-vlsi" running 4.1BSD (tape release 7/19/81)
(error free disk packs) working just fine.


  However, we have just attempted to bring up a second
VAX 11/780 with 4.1BSD (tape release date 7/19/81), and
are having problems with bad sectoring.
This VAX has:
     1 SI 9400 controller (massbus)
     2 SI 9775 disk drives (675 MB unformatted)
               each emulating 2 RM05's
     1 TU77 tape drive

These disk packs are NOT error free.

We used the following procedure bringing up the system:
   
     1) We used the DEC EVRAC diagnostic
        to enter bad sectors

     2) Booted UNIX from tape

Now I use the dump utility to test out some areas of the disk,
e.g.  "/etc/dump 0f /3h/dummy /usr "

I get hard error console messages corresponding to sectors that have
been entered in the bad sector file!!!

Anybody got any ideas what the problem is ??

Note: we have already made the patch (News-item-number: 25)
to line 112 of hpreg.

smk@linus.UUCP (Steven M. Kramer) (06/16/83)

We have the same problem.  I thought the driver handled that, but it
never has for us.  We have a 4.1 on a 780 with RM05's.
-- 
--steve kramer
	{allegra,genrad,ihnp4,utzoo,philabs,uw-beaver}!linus!smk	(UUCP)
	linus!smk@mitre-bedford						(ARPA)

dmmartindale@watcgl.UUCP (Dave Martindale) (06/19/83)

Your problem is probably that the bootloader ("boot" on the floppy)
has a disk driver which cannot handle bad sector forwarding, and your
"vmunix" has a bad sector in it.  You lose.  It should be possible to
add the bad sector forwarding to the standalone driver to fix this.

Once you get the system up, you probably have yet another problem.
(We have an SI 9800 controller, which I assume is at least as good as
the 9400 for bug fixes.)  When you do a write of many blocks to the disk
and one of the sectors being written to is bad, the controller aborts
the transfer showing BSE just as it's supposed to.  Unfortunately,
the driver determines which block got the error by looking at the RH780's
word count register to see how far along the transfer got, and the 9800
buffers up to 4 sectors of data in the controller.  Thus the calculated
bad sector number may be anywhere from 1 to 4 sectors too high, and the forwarding
calculation fails.  Does anyone have a fix for this?
So far this hasn't bothered us since we have flag-free packs, but SOMEDAY
it's going to get us....
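
The off-by-up-to-4 effect described above can be sketched roughly as
follows.  This is only a toy model, not the actual 4.1BSD driver code:
the 512-byte sector size, the function names, and the sector numbers
are all assumptions made for illustration.

```python
SECTOR_BYTES = 512  # assumed sector size


def reported_bytes(start_sector, bad_sector, buffered_sectors):
    """Bytes the bus adapter has handed over when the transfer aborts.

    Hypothetical model: the controller stops at bad_sector, but it has
    already accepted `buffered_sectors` further sectors of data.
    """
    return (bad_sector - start_sector + buffered_sectors) * SECTOR_BYTES


def sector_from_byte_count(start_sector, bytes_moved):
    """Sector a driver would blame if it trusts the adapter's count."""
    return start_sector + bytes_moved // SECTOR_BYTES


start, bad = 1000, 1010      # made-up transfer start and flagged sector
bad_table = {bad}            # sectors the forwarding table knows about

# One guess per possible amount of controller buffering (0 to 4 sectors).
guesses = [
    sector_from_byte_count(start, reported_bytes(start, bad, buffered))
    for buffered in range(5)
]

for buffered, guess in enumerate(guesses):
    found = guess in bad_table
    print(f"buffered={buffered} guessed={guess} in_table={found}")
```

In this model only the unbuffered case points at the flagged sector;
every buffered case names a sector that is not in the table, so the
lookup fails even though the table itself is correct.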

	Dave Martindale, watcgl!dmmartindale

guy@rlgvax.UUCP (06/20/83)

We've had a similar problem with DEC RM05's and a DEC controller; we have
several bad blocks which get BSE errors, but when we used "bad144" to put
their block numbers in the bad block table it didn't seem to do the
mapping.  I've heard the vanilla "hp" driver doesn't handle bad blocks right,
and that there is a driver from Purdue which does.

		Guy Harris
		RLG Corporation
		{seismo,mcnc,we13,brl-bmd,allegra}!rlgvax!guy

dmmartindale@watcgl.UUCP (Dave Martindale) (06/22/83)

	To get bad sectoring to work, it isn't enough to add the sector
to the bad block table via bad144.  The actual header on that disk sector
must have the "bad sector" bit set in order to get the controller to set the
BSE bit.  Only when an I/O aborts because of a BSE error does the driver
go off and look the sector up in the bad sector table.
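
	A rough sketch of that two-step dependency, as a toy model
rather than the real driver: the RM05 geometry (823 cylinders, 19
tracks, 32 sectors), the rule that replacement sectors are taken
backwards from just before the last track, and every name below are
assumptions for illustration.

```python
NCYL, NTRK, NSECT = 823, 19, 32   # assumed RM05 geometry


def replacement_sector(index):
    """Replacement for the table entry at `index`.

    Assumes replacements are handed out backwards, starting with the
    sector just before the last track (which holds the table itself).
    """
    return NCYL * NTRK * NSECT - NSECT - 1 - index


def resolve(sector, flagged, bad_table):
    """Return the sector an I/O request would actually end up using.

    flagged   -- sectors whose on-disk header has the bad-sector bit set
    bad_table -- ordered list of sectors entered in the bad sector table
    """
    if sector not in flagged:
        # No BSE from the controller, so the table is never consulted.
        return sector
    if sector not in bad_table:
        raise IOError("BSE on a sector missing from the table")
    return replacement_sector(bad_table.index(sector))


bad_table = [1010, 2048]   # made-up entries in the table
flagged = {1010}           # only 1010 was flagged in its header

print(resolve(1010, flagged, bad_table))   # flagged and listed: forwarded
print(resolve(2048, flagged, bad_table))   # listed only: not forwarded
```

Sector 2048 is in the table but its header was never flagged, so in
this model the request still goes to the original (bad) sector.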

	One way of doing this (although tedious) is to reformat the
pack.  I suspect that the DEC formatter believes the bad sector file on
disk, so if you add the sector to the file via bad144 and then run the
formatter (I think its name is EVRAC) it will flag it bad for you.
I believe there is also a section of the formatter which will add
sectors to the bad sector file manually.

	Dave Martindale