[comp.sys.amiga.tech] 2091 problems

stevem@sauron.Columbia.NCR.COM (Steve McClure) (07/24/90)

In article <6611@tekig5.PEN.TEK.COM> phils@tekig5.PEN.TEK.COM (Philip E Staub) writes:
+In article <13242@mcdphx.phx.mcd.mot.com> stan@teroach.UUCP (Stan Fisher) writes:
+>My problem, and one I've heard of elsewhere, which doesn't seem to be being
+>addressed by Commodore is when you've got multiple fast SCSI devices (drives)
+>and do copies between the two drives (physical drives, not just partitions)
+>the whole system hangs with the drive access LED stuck on.  The only way out
+>is to re-boot and then wait for the validator to do its thing.
+>There seems to be a REAL problem with the SCSI bus arbitration on the 2091!
+>Can't someone at Commodore comment on this?
+>I already posted this problem to c.s.a.t and heard nothing back.
+>
+
+After looking at several device drivers, I've suspected this to be a problem 
+for a while now, and it's not limited to the A2091.
+
+I suspect that the problem has its roots in some sample device driver code
+which Commodore has been distributing (and updating, from time to time). The
+code implements a sample ramdisk driver, but it has shown up in various
+incarnations, including as the basis for a hard disk driver by at least one
+other manufacturer.
+
+The gist of it is something like this:
+
+The driver starts up a separate task for each physical drive. Each task
+has a "busy" flag to indicate that the drive is in use, to prevent 
+multiple processes from attempting to access the same drive at the same
+time. This works just fine, as long as you only have one drive connected,
+or if you're only accessing one drive at a time.
+
+Unfortunately, access to multiple drives must pass through the same
+SCSI controller chip on the interface board, and there isn't (last time
+I looked) any mechanism to prevent an attempt by these multiple tasks
+to access the controller at the same time. This could happen if you have
+two disk-intensive programs running at the same time, or if you have a
+single program which does asynchronous I/O to the hard disk and happens
+to access files on both drives.
+
[ lots deleted ]
+
+I can think of at least two ways to handle this. The first would be to
+implement a semaphore-like mechanism to control access to the controller
+chip itself. This can be a bit tricky, though, if future support of
+disconnect/reconnect-capable drives is anticipated. You have to acquire the
+semaphore for the initial arbitration, release it upon disconnect, and
+re-acquire it upon reconnect.
+
+The second way (which may not work) would be to check the chip status
+prior to beginning the sequence of states which constitutes a bus transfer.
+The reason I say it may not work is that there is a potential race condition
+between checking controller status and beginning the transfer. At least a
+Forbid()/Permit() pair would be necessary here.
+
+Perhaps the use of exec semaphores will ultimately provide the cleanest
+solution. But from some red flags I see in the AutoDocs on semaphores, I'd 
+suspect it wouldn't be totally safe to use them on 1.3 or earlier. Perhaps
+some comments from C= on the conditions under which 1.3 semaphores fail
+might be appropriate here.
+
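
A minimal sketch of the first approach quoted above (one lock per controller
chip rather than per drive, released on disconnect and re-acquired on
reconnect). This is not taken from any Commodore driver: it is portable C in
which a pthread mutex stands in for an exec SignalSemaphore, and every name in
it (Controller, DriveTask, bus_acquire, and so on) is hypothetical. On the
Amiga the corresponding calls would presumably be exec's ObtainSemaphore() and
ReleaseSemaphore(), subject to the 1.3 caveats raised above.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical model of one SCSI controller chip shared by all drives. */
typedef struct {
    pthread_mutex_t lock;   /* stand-in for an exec SignalSemaphore */
    int users;              /* tasks currently driving the chip */
    int max_users;          /* high-water mark; must never exceed 1 */
    long transfers;         /* bus phases completed so far */
} Controller;

/* Hypothetical per-drive task state. */
typedef struct {
    Controller *ctrl;
    int unit;               /* physical drive number */
    int commands;           /* commands this task will issue */
} DriveTask;

static void bus_acquire(Controller *c) {
    pthread_mutex_lock(&c->lock);
    c->users++;
    if (c->users > c->max_users) c->max_users = c->users;
}

static void bus_release(Controller *c) {
    c->users--;
    pthread_mutex_unlock(&c->lock);
}

/* One bus phase; only legal while this task holds the controller lock. */
static void bus_phase(Controller *c) {
    assert(c->users == 1);
    c->transfers++;
}

static void *drive_task(void *arg) {
    DriveTask *t = arg;
    for (int i = 0; i < t->commands; i++) {
        bus_acquire(t->ctrl);   /* initial arbitration, selection, command */
        bus_phase(t->ctrl);
        bus_release(t->ctrl);   /* target disconnects: free the chip */
        bus_acquire(t->ctrl);   /* target reconnects: data and status */
        bus_phase(t->ctrl);
        bus_release(t->ctrl);
    }
    return NULL;
}

/* Runs two drive tasks against one controller.
   Returns the largest number of tasks ever on the chip at once. */
int run_two_drives(int commands, long *transfers_out) {
    Controller c;
    pthread_mutex_init(&c.lock, NULL);
    c.users = 0; c.max_users = 0; c.transfers = 0;

    DriveTask a = { &c, 0, commands }, b = { &c, 1, commands };
    pthread_t ta, tb;
    pthread_create(&ta, NULL, drive_task, &a);
    pthread_create(&tb, NULL, drive_task, &b);
    pthread_join(ta, NULL);
    pthread_join(tb, NULL);

    *transfers_out = c.transfers;
    int max = c.max_users;
    pthread_mutex_destroy(&c.lock);
    return max;
}
```

Calling run_two_drives() with any positive command count should return 1,
because the users counter is only ever changed while the lock is held. The
per-drive "busy" flag from the sample driver is deliberately absent here: it
protects a drive, not the chip both drives share.
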
I would think that following the SCSI spec of watching the bus for the 
Bus Free Delay time of 800 nanoseconds would assure you that you can now
assert the BSY signal and go to arbitration.  But I have seen "SCSI" device
drivers that didn't even handshake REQ/ACK properly, so you never know.
 
-- 
----------------------------------------------------------------------
Steve		email: Steve.McClure@Columbia.NCR.COM	803-791-7054
The above are my opinions, which NCR doesn't really care about anyway!
CAUSER's Amiga BBS! | 803-796-3127 | 8pm-8am 8n1 | 300/1200/2400

phils@tekig5.PEN.TEK.COM (Philip E Staub) (07/25/90)

In article <2243@sauron.Columbia.NCR.COM> stevem@sauron.UUCP (Steve McClure) writes:
>I would think that following the SCSI spec of watching the bus for the 
>Bus Free Delay time of 800 nanoseconds would assure you that you can now
>assert the BSY signal and go to arbitration.  But I have seen "SCSI" device
>drivers that didn't even handshake REQ/ACK properly, so you never know.
> 
This is true for multiple initiators on the bus. 

I'd say things are different when your two bus requests are channeled
through two processes using the same initiator hardware.
Merely waiting for the bus to be free for some length of time is no guarantee
that another process won't come along and steal the controller chip out
from under you between the time the chip says the bus is free (notice it
has said the same thing to the other processes on the box, too) and the time
you go to selection phase, unless you're protected by a semaphore or by
Forbid().

Besides, I don't know of many Amigae which can do an accurately timed 800ns 
wait loop 8-).
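
The check-then-select race described above can be replayed by hand. The sketch
below is a hypothetical, single-threaded C model, not driver code: bus_free
stands in for whatever status bit the real chip exposes, and a C11 atomic_flag
plays the role that a semaphore or a Forbid()/Permit() pair would play on the
Amiga, making "test" and "claim" one indivisible step.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical model of the shared controller chip. */
typedef struct {
    bool bus_free;          /* stand-in for the chip's "bus free" status */
    atomic_flag claimed;    /* set while one task owns the chip */
} Chip;

/* Racy version, step 1: read the status and nothing else. */
bool status_says_free(const Chip *c) { return c->bus_free; }

/* Racy version, step 2: start selection some time later. */
void begin_selection(Chip *c) { c->bus_free = false; }

/* Safe version: test and claim in a single atomic operation. */
bool try_claim(Chip *c) { return !atomic_flag_test_and_set(&c->claimed); }
void release_claim(Chip *c) { atomic_flag_clear(&c->claimed); }

/* Replays the bad interleaving: task B is scheduled between task A's
   status check and A's selection. Returns how many tasks believe they
   own the chip. */
int racy_owners(void) {
    Chip c = { true, ATOMIC_FLAG_INIT };
    bool a_ok = status_says_free(&c);   /* task A checks */
    bool b_ok = status_says_free(&c);   /* task B runs and checks too */
    int owners = 0;
    if (a_ok) { begin_selection(&c); owners++; }
    if (b_ok) { begin_selection(&c); owners++; }
    return owners;
}

/* Same two tasks, but each must win the atomic claim first. */
int claimed_owners(void) {
    Chip c = { true, ATOMIC_FLAG_INIT };
    int owners = 0;
    if (try_claim(&c)) { begin_selection(&c); owners++; }
    if (try_claim(&c)) { begin_selection(&c); owners++; }
    return owners;
}
```

In this model racy_owners() ends with two tasks both believing they hold the
chip, while claimed_owners() ends with one; the loser would have to wait for
release_claim() and try again.
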


-- 
------------------------------------------------------------------------------
Phil Staub, phils@tekig5.PEN.TEK.COM
Definition: BUG: A feature (present or absent) which is (at best) inconvenient.