[comp.sys.amiga] Self-repairing SCSI disks!

mcohen@nrtc.northrop.com (Marty Cohen) (09/28/88)

Why are (at least some) SCSI drives self-repairing?

Configuration: Amiga 1000, 1meg insider, C-Ltd SCSI controller,
50 meg drive, Live!

Problem: While duplicating EMPTY (to create a sub-directory via
moving (there ought to be an easier way - the mac is much nicer
here)), the system freezes, the screen goes blank, and the guru
appears. Upon reboot, the system tries to access the hard disk
and seemingly fails, with the light on the hard disk flickering
periodically.

Solution (courtesy of C-Ltd): Just let it run!! According to
C-Ltd, the disk is being repaired. Eventually (5 min to a few
hours), the disk will be ok. 

Amazingly, this works. This has happened twice, and both times
the disk has repaired itself. The only problem is that the
copy of empty is useless - no files can be copied to it. So I just
discard it, create a new copy of empty, and everything seems ok.

Questions:

1. What causes this?

2. How is the disk repaired and what kind of repairs are done?

3. Could this cause something else to go wrong later?

4. How will 1.3 affect this?

Thanks,

Marty Cohen (mcohen@nrtc.northrop.com - 128.99.0.1)

ejkst@cisunx.UUCP (Eric J. Kennedy) (09/29/88)

In article <4275@louie.udel.EDU> mcohen@nrtc.northrop.com (Marty Cohen) writes:
>Why are (at least some) SCSI drives self-repairing?
>
>Configuration: Amiga 1000, 1meg insider, C-Ltd SCSI controller,
>50 meg drive, Live!
>
>Problem: While duplicating EMPTY (to create a sub-directory via
>moving (there ought to be an easier way - the mac is much nicer
>here)), the system freezes, the screen goes blank, and the guru
>appears. Upon reboot, the system tries to access the hard disk

Yeah, why is it that duplicating EMPTY from the workbench is so flakey?
It doesn't crash my system (with 33 Meg drive) but the system freezes
for about 10 seconds and the drive light comes on for a long time.  It
can't take that long to just make a new directory and copy the icon.

>and seemingly fails, with the light on the hard disk flickering
>periodically.
>
>Solution (courtesy of C-Ltd): Just let it run!! According to
>C-Ltd, the disk is being repaired. Eventually (5 min to a few
>hours), the disk will be ok. 

This is not the disk repairing itself, but rather is the AmigaDOS disk
validator repairing the disk.  (You know, it's that file in L: called
disk-validator.  "Ohhh, *That's* what that does..." :-)  

When you crashed during a write to the disk, the hard disk got slightly
munged.   The validator was just trying to repair the damage.  In my
experience, the only thing that doesn't make it is the file being
written to.  In your case, the directory Empty.


This brings up another question, though.  How good is the
disk-validator?  I've had it fix my hard drive about 4 or 5 times now,
and never noticed any ill effects, (except for the file that was
originally clobbered) but just how reliable and trustworthy is it?


-- 
------------
Eric Kennedy
ejkst@cisunx.UUCP

jesup@cbmvax.UUCP (Randell Jesup) (09/29/88)

In article <4275@louie.udel.EDU> mcohen@nrtc.northrop.com (Marty Cohen) writes:
>Why are (at least some) SCSI drives self-repairing?
...
>appears. Upon reboot, the system tries to access the hard disk
>and seemingly fails, with the light on the hard disk flickering
>periodically.
>
>Solution (courtesy of C-Ltd): Just let it run!! According to
>C-Ltd, the disk is being repaired. Eventually (5 min to a few
>hours), the disk will be ok. 
...
>1. What causes this?

	The validator process of the file system is running.

>2. How is the disk repaired and what kind of repairs are done?

	It knows that the filesystem was in an inconsistent state when
the system went down, and is going through every file and directory
looking for trouble, and figuring out what blocks of the disk are free
for use.
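
	To make the idea concrete, here is a minimal sketch of such a pass
in Python. It is a toy model, not the real AmigaDOS on-disk layout or the
actual disk-validator code: the Block type, its fields, and
rebuild_free_map() are all hypothetical names invented for illustration.
It walks every directory and file reachable from the root, marks the blocks
they use, and treats everything else as free.

```python
# Toy model of a validation pass: NOT the real AmigaDOS on-disk layout.
# Every name here (Block, rebuild_free_map, ...) is hypothetical.
from dataclasses import dataclass, field


@dataclass
class Block:
    kind: str                                      # "dir" or "file"
    children: list = field(default_factory=list)   # dir: block numbers of its entries
    data: list = field(default_factory=list)       # file: block numbers of its data


def rebuild_free_map(blocks, root, total_blocks):
    """Walk every directory and file reachable from root; return the free set."""
    used = set()
    stack = [root]
    while stack:
        num = stack.pop()
        if num in used:
            continue            # already visited; guards against loops
        used.add(num)
        blk = blocks[num]
        if blk.kind == "dir":
            stack.extend(blk.children)   # descend into the directory's entries
        else:
            used.update(blk.data)        # a file header claims its data blocks
    return set(range(total_blocks)) - used


# Root dir (block 0) holds one file header (block 1) with data in blocks 2 and 3.
# Blocks 4-7 are not reachable from the root, so the pass reports them as free,
# even if an interrupted write had grabbed one of them before the crash.
disk = {
    0: Block("dir", children=[1]),
    1: Block("file", data=[2, 3]),
}
free = rebuild_free_map(disk, root=0, total_blocks=8)
print(sorted(free))
```

	In this model a block that was claimed by an interrupted write but
never linked into a directory simply ends up back in the free set, which is
consistent with only the file being written at crash time going missing.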

>3. Could this cause something else to go wrong later?

	No.

>4. How will 1.3 affect this?

	If you use FFS, it will happen much faster.

-- 
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup

dillon@CORY.BERKELEY.EDU (Matt Dillon) (09/30/88)

>>1. What causes this?
>
>	The validator process of the file system is running.

	Speaking of which, before I got my hard disk I was worried that
I would destroy it about as often as I screw up floppies.  Needless to say,
this didn't just have me worried, it had me *very* worried.

	Fortunately, after getting my HD, I have had no problems at all.  The
major reason is that, going through an external (SCSI) interface, it is
nearly impossible to destroy more than one sector in the worst of all
practical cases in a crash situation.  In fact, with SCSI, you never get
half-written-out sectors since incomplete SCSI commands produce error
codes.

	The main problem with the floppies, as we found out years ago (the
Delay(0) -> track <usually 40> destroyed problem), is that entire tracks can
be destroyed if the trackdisk.device goes bonkers in the latter situation
due to the fact that the MFM encoding is being done by the Amiga.  Also, if 
your Amiga crashes in the middle of writing a track out to floppy, you can
lose an entire track.

						-Matt

jesup@cbmvax.UUCP (Randell Jesup) (10/01/88)

In article <12840@cisunx.UUCP> ejkst@unix.cis.pittsburgh.edu (Eric J. Kennedy) writes:
>Yeah, why is it that duplicating EMPTY from the workbench is so flakey?
>It doesn't crash my system (with 33 Meg drive) but the system freezes
>for about 10 seconds and the drive light comes on for a long time.  It
>can't take that long to just make a new directory and copy the icon.

	Known "bug".  WB is allocating most of the system memory in small
chunks, then releasing it.  Does not cause a crash under normal conditions,
just takes a while if you have a lot of memory.

-- 
Randell Jesup, Commodore Engineering {uunet|rutgers|allegra}!cbmvax!jesup

gmg@hcx.uucp (Greg M. Garner) (10/02/88)

While we are on the subject of scsi....

How does everyone like the microbotics scsi controller? (Matt, I know
you have one, do you like it?). I want to set up a hard drive on my
1000, and I want to use an embedded scsi controller drive like the 
st225n or st157n. Will the microbotics scsi (the one that goes in 
the 2 meg expansion box) interface ok with these drives? What about
the C ltd scsi interface?  And finally, is there a DMA hard disk interface
for the 1000 without buying an expansion chassis? Thanks for the answers! 

   Greg Garner
   gmg@hcx.uucp
   501-442-4847

dillon@CORY.BERKELEY.EDU (Matt Dillon) (10/02/88)

Greg Garner gmg@hcx.uucp Writes:
:While we are on the subject of scsi....
:
:How does everyone like the microbotics scsi controller? (Matt, I know
:you have one, do you like it?). I want to set up a hard drive on my
:1000, and I want to use an embedded scsi controller drive like the 
:st225n or st157n. Will the microbotics scsi (the one that goes in 
:the 2 meg expansion box) interface ok with these drives? What about
:the C ltd scsi interface?  And finally, is there a DMA hard disk interface
:for the 1000 without buying an expansion chassis? Thanks for the answers! 

	The only reason I have the microbotics SCSI controller is 
because I already had a StarBoard II when I started getting interested in
buying one.  The microbotics is hardly the cream of the crop, it *isn't*
DMA (note: I have an A1000).

	Judging by use and disassembling the code, the driver itself is
sound, but the support programs are very, very bad.  The main support program,
called MDFixer, is used to low-level format and verify the HD.  It crashes more
often than not.  The CLI-based, text-menu program they provide on their
support disk is (1) extremely slow, (2) time consuming, (3) badly written,
and (4) provides very little help, instructions, or documentation.

	Fortunately, I didn't have to use that stuff much.  Not owning any
other SCSI controllers I can't give a comparison, but once I got past the
initial junk I have had no problems using my HD.

	As I said at the beginning, the actual driver is sound once
you get past the support crap.  However, since the driver is based solely
on synchronous commands internally (i.e. asynchronous IO requests are done
synchronously), and since it must busy wait on the SCSI controller (not 
being DMA), I strongly suggest people set the Priority of the filesystems
in the Mountlist to 0 rather than 5.  The microbotics driver does not have
any critical timing requirements and thus they do not Forbid() or Disable()
inside it all that much, which means it essentially busy waits in the context
of the task which made the device call, which is usually the FileSystem.

	The way Amiga priorities work, the CPU is shared round-robin
fashion only among the highest priority tasks (all running at the same
priority).  Setting the filesystem priority to 0 means that all the other USER
tasks running in the system, such as your favorite editor or terminal program,
still get CPU even when the microbotics driver busy waits.
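
	A minimal sketch of that scheduling rule, in Python.  This is a
simplification for illustration, not Exec's actual implementation: the
run() function and the task names are made up, and every task is assumed
to be ready to run all the time (which is what a busy-waiting driver looks
like to the scheduler).

```python
# Toy scheduler: only the highest-priority ready tasks share the CPU, round-robin.
# A simplification for illustration; not Exec's actual implementation.
from collections import Counter


def run(tasks, slices):
    """tasks maps name -> priority; every task is assumed always ready to run
    (a busy-waiting driver never sleeps). Returns time slices used per task."""
    top = max(tasks.values())
    runnable = [name for name, pri in tasks.items() if pri == top]
    usage = Counter({name: 0 for name in tasks})
    for tick in range(slices):
        # Rotate through the top-priority tasks only; lower ones never run.
        usage[runnable[tick % len(runnable)]] += 1
    return dict(usage)


# Filesystem at priority 5: while the driver busy waits, user tasks are starved.
print(run({"filesystem": 5, "editor": 0, "terminal": 0}, 90))
# Filesystem at priority 0: it shares the CPU with the user tasks.
print(run({"filesystem": 0, "editor": 0, "terminal": 0}, 90))
```

	With the filesystem at 5 it takes all 90 slices; at 0, each of the
three tasks gets 30.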

				-Matt

chad@cup.portal.com (10/06/88)

In a previous article, (Matt Dillon) writes:

)	Judging by use and disassembling the code, the driver itself is
)sound...

	With respect to this comment, I must disagree... (I do agree with the
rest of what you said, for the most part.)

	I got a Quantum Q250 (embedded SCSI) Hard disk, along with the
StarDrive SCSI.  I like the hardware, as it makes things VERY
convenient, but the driver simply DOES NOT support this type of hard
disk...  It is not the fault of the Quantum, which conforms very
closely to the SCSI spec...  Perhaps too close...  In any case,
Microbotics has been absolutely NO help at all, and I have not been
able to contact Joanne Dow to ask for (or demand) some help.
Unfortunately, the problem seems to have occurred with other brands of
hard disks, also... 

	Thus, my suggestion to people is to try their specific drive type out
with the Stardrive before you buy this controller, to make sure it
works with it...  In general, most Seagate SCSI's and Miniscribes
should work, but Conners, Maxtors, etc. may be questionable. 

	My system's been in a mess for over a month because of this problem,
and it is EXTREMELY aggravating, because all it would take to correct is a
few software mods... (Nothing like wasting $900 on a hard disk
system that can't work, is there :-)
~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
				Chad 'The_Walrus' Netzer -> AmigaManiac++
"Chess players mate better!"