[comp.sys.amiga.hardware] 32bit DMA lockups

ajbrouw@neabbs.UUCP (ALBERT-JAN BROUWER) (02/02/91)

Hey CBM, this message is worth a CAT-Scan or two :-). If you know about
the problem I'm about to describe please comment. I would like to know
whether the problem is actually being addressed.

Gregory Travis wrote:
> Here is some more information for Commodore/Randell.  As I posted before
> and as Randell suggested, turning off reselection on the A2091 with
> 5.92 ROMS and the WD "A" chip is NOT sufficient to avoid the lockup
> problem.  First of all, my configuration:
>
>	1 A2500/30 with 4 Meg of 32-Bit RAM
>	1 A2091 with ROM V5.92 and the WD "A" chip.  2 Meg of 16-bit RAM
>	  on the board.
> etc. etc.

I think you might be suffering from two separate problems here Gregory.
The 2091 WD reselection thing has been highlighted sufficiently I think.

However, there is a problem with _SOME_ 2630 cards while doing DMA to/from
32bit memory. I've determined this through experimentation, and by hearing
from other people having exactly the same problem (probably including you).

The problem tends to occur when the harddrive is in full swing DMA transfer,
unfortunately mostly during writes. A good way to evoke it is by running
DiskSpeed or some other write intensive program. When it occurs, the system
simply locks up, or barely makes it to the guru alert.

I've been able to reproduce it with both a HardFrame and a 2090a. A friend
of mine has the same trouble with a 2090a and a 4.x motherboard. I've
had my motherboard upgraded from 6.0 to 6.2 but the problem persisted.
Perhaps redundantly; we both have 2630 cards.

In a recent query to this group John Veldthuis reported exactly the same
problem, but amazingly this was with a 2620 and a HardFrame.

Note that the failure frequency varies a lot when swapping controllers
or slot positions; under one configuration it only occurred about once
every four days of average use. Still unacceptable, of course.


A fix? There is none. For the time being you may want to restrict DMA
access to non-32-bit RAM. The rudest way is to pull the plug (jumper)
on your 2630's 32-bit RAM. An alternative is to set your DMA mask.
The mask isn't sufficiently specific, though: your 16-bit fast RAM is
probably at $600000-$800000, your 32-bit RAM is at $200000-$600000 and
chip RAM is below $100000. Since the mask is a plain bit mask, any value
that admits the 16-bit range also admits the 32-bit range below it, so
you can only set the mask to limit DMA to chip memory (mask = $FFFFE).
This will cause harddrive access to slow down a _LOT_ though.
Also set the cache buffers to be allocated in chip memory, or else
you'll still lock up.
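
The limitation of the mask can be illustrated with a small sketch. It
assumes the usual reading of the mask, namely that an address is usable
for DMA only if it has no bits set outside the mask. The function name
and constants are made up for illustration and are not part of any Amiga
API; the addresses are the ones guessed at above.

```python
# Hypothetical model of a DMA address mask; all names are illustrative.
# Assumption: an address may be used for DMA only when every bit set in
# the address is also set in the mask.

CHIP_ONLY_MASK = 0x0FFFFE  # the "ffffe" mask from the post
WIDER_MASK = 0x7FFFFE      # smallest mask admitting all of $600000-$7FFFFE


def dma_allowed(address: int, mask: int) -> bool:
    """Return True if the address has no bits set outside the mask."""
    return (address & mask) == address


if __name__ == "__main__":
    samples = [
        ("chip RAM", 0x0FFFFE),
        ("32-bit RAM", 0x200000),
        ("16-bit RAM", 0x600000),
    ]
    for name, address in samples:
        print(
            f"{name:10s} ${address:06X}",
            "chip-only mask:", dma_allowed(address, CHIP_ONLY_MASK),
            "wider mask:", dma_allowed(address, WIDER_MASK),
        )
```

Under this model the wider mask lets the 16-bit address through, but it
lets the 32-bit address through as well, which is why only the chip-only
setting keeps DMA away from the 32-bit RAM.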

Another point is to avoid using disk optimizers and such; these will
cause DMA to 32-bit memory because they bypass the filing system.
(The filing system is responsible for applying the DMA mask).

Medium term solution; modify addbuffers to allocate lots of 16 bit
mem buffers. Perhaps hack up your harddisk.device.

Sigh,

-Albert  (hp4nl!neabbs!ajbrouw)

"After 5 days of debugging efforts, I decided to
 let it win a core-wars competition instead."

chem194@csc.canterbury.ac.nz (John Davis) (02/04/91)

In article <555376@neabbs.UUCP>, ajbrouw@neabbs.UUCP (ALBERT-JAN BROUWER) writes:
> However, there is a problem with _SOME_ 2630 cards while doing DMA to/from
> 32bit memory. I've determined this through experimentation, and by hearing
> from other people having exactly the same problem (probably including you).
> 
> The problem tends to occur when the harddrive is in full swing DMA transfer,
> unfortunately mostly during writes...
> 
> I've been able to reproduce it with both a HardFrame and a 2090a. A friend
> of mine has the same trouble with a 2090a and a 4.x motherboard. I've
> had my motherboard upgraded from 6.0 to 6.2 but the problem persisted.
> Perhaps redundantly; we both have 2630 cards.
> 
> In a recent query to this group John Veldthuis reported exactly the same
> problem, but amazingly this was with a 2620 and a HardFrame.
>...
> A fix? There is none....

I had exactly the same problem on fitting a 2630 to my rev4.3 b2000 using a 
2090a. A call to CBM revealed it was a known problem, and could be fixed by 
adding a pullup resistor across one IC on the motherboard. It cured the 
problem totally for me (I left it copying a 1.5mb file around the hd for
a few hours to test - no problem whatsoever). As far as I know the
fix _ISN'T_ needed on rev6 boards, but it could well be worth 
checking with CBM anyway (and definitely worth checking on your friend's
rev4.x machine).

-----------------------------------------------------------
| o  John Davis - CHEM194@canterbury.ac.nz               o |
| o  (Depart)mental Programmer,Chemistry Department      o |
| o  University of Canterbury, Christchurch, New Zealand o | 
| o                                                      o |
| o  co-sysop AmigaINFO BBS,1200/2400 baud CCITT,        o |
| o           24 hours a day, ph NZ +3-3371-531          o |

ajbrouw@neabbs.UUCP (ALBERT-JAN BROUWER) (02/05/91)

Gregory Travis wrote:
> 	Now, I figure that, slow as it may be, access to the 16-bit
> memory has got to be faster than going out to the disk drive.  So, I
> would like to have a system in which all "addbuffer" buffers and
> rez/resident images get loaded into 16-bit memory, leaving the 32-bit
> memory virgin for the serious computational tasks of the 68030.

OK, here's a small hack that should allow you to AddBuffer
16 bit memory instead of wasting 32 bit. It isn't the
cleanest thing on earth so read the ReadMe file first.

begin 777 AddSlowFast.LZH
M(3<M;&@Q+: !   $ @  "+U$%@  "T%D9%-L;W=&87-TZ>+&8N/?\,> ,;( X
M>-!8)&!=UO0/[,MUHC#9FRRN%.HYAE7N+RK(<^&JZ?\V?:.HEDMYWRH[L&[#X
M !?ST5=S)=%AU#753>8YM._33F4 M1$!ZYW.W, ?A^__NOQ]?=_O;  K'Z&BX
M-9^OBKP[Z_MJ7=!12[Q;_B[>X92_8A]^BC%"=1-?EJ(C04R9MPC23K7"OO'JX
MKF0R2USOFE-&<$>]LEL(-VC6:&1+H=%4T4]00D,9M" G<][9DZRQL/;QHD=:X
MQFW_G*0KW ?!+1OKO;XC)]]-73/#K]HKUP4D^1Y#SL,M3L[;$S.W7^"0NI")X
MO.$OZ4[#"=>F >Q/90D@/_<@=WFIS*DT\@%R(X[@*.<K)>X:!_.P)U/85;?FX
MCN"=%WST_@8(!'0M2'X8*VKZ[7CI@G1LCEFQV!9*1##&P7]3CGT7>4!SU_OFX
MGJ5]RA'!+8A^_;%Y]U=^.?TIW^6EAM4=G[XCOJ$^'+KND)4?5PI']NM40[HYX
M)36;G-CI&43CSMP9/.@K_4)+T0:G8,>@)J-RCZCSR9?<TNV-=UY')L=LMJ"DX
M$!P)+6QH,2UR P  O04  .8"118   9296%D3648Y/!]/K_]9& ( Q/C@\WOX
M=R[>_"XYP#\J^^_V%=]_[_-+[?U3G \._I3V%-+\3?G3F53+*O1F/V<N?)'KX
M=UENK Z&)._DNTF0(J$%<Q'2@M0(,F)#?21FIBKX R57=0@5LMT#$9K1>>6;X
M[G4GJ6G@22%"[>V=E%7GV-&'$,M)+,8ULY&I\) 4' &QS\61[?:9?96KSHK"X
MV0&=16](?&O<4*^?!HBNI4B7A8!CNA_2'0.VXP!V[YA'F\W?ZA[]"(08+QZ.X
M8@H\HEWWO '-/S41!NU7SW_:G.-[J<7/Y/P/=^H2/7M8),E[=X-@O@.27/N<X
M 6F^<WCI ;!E:D0V0SI,%#Q8+1L?&'PCZA=_?B_<?;_0H^+.R37$SN8.V!T,X
M=SUY2)3ZS"8]I'_"] P;T5;FUJCJZGWV\]G9E_Q8[(_XJI%;9O28Y>(KCWL3X
M.]! YC5_(9$^V0>J,O5XE+2P17Q(&(D839G#M[,ZHFR2FF#-QF-8X>;#2*%<X
M"*RV6#M%II':D+?>Z:*^C<&TZ%]CL62/C4QANY$_7E<B[1)DR#J8(Y,?+"#XX
MTME,:MO4@@_-A)I6>K*-W3&R*%ULH[DZI+C.,5.;G)=2:/(Q(YK?N.#N3C(\X
MXH5X9,4X]^&A!955MQIZE*MJ%C6%7KBXV/R:/*2,1S(;B=1M(AI!:&V;,?LCX
M[2LX>2-75!T*,M6C59Q:/7J'.696>DP*U%*D' 2KB\L05*9!PCE4<0<Y;A)DX
MT0Q-7I@I9H0S!4FCZ  E0,ML@V&1&R4U]I[:;1\,_UX4M#4D%0IH:LIJ+.SFX
M]G@0R9)=]#&][::6'KG-U$\%WCI(OIT>A[,\00S9+UF90G\Z)">;ALSI;!1'X
M\_^.6IX?T,V\FL,>_AX:OO+-#=H)6?83BI#3D;(*#DLLF'B4B39',X$9_]"WX
M[WX%I5>GWH^V(_B.'P542*QYV=5QK!.BKTIMG>R[G?UWFH4#93DI#TX[*:;VX
MQ?HS7_>!SWT!,5U6:@"7F2#KH9GH4Z^?Z-&U<;&LBPZ-D0G>LO [@C=VH0Z^X
MSI@%;EWIYF[+KL\TG\>"SGR8D%R4N7)#[9#(,:LGY-9NGN\C\%DAQAIAITC'X
M4MJ_+V)+R.%%7BRWTD1$>!Z/.2,]GEI3#$:RXFM/4.-4N.IS[YZ*M,."UIK[X
.J4EU7#SQN*']7/HR:P!$X
 X
end

-Albert  (hp4nl!neabbs!ajbrouw)

"After 5 days of debugging efforts, I decided to
 let it win a core-wars competition instead."

chem194@csc.canterbury.ac.nz (John Davis) (02/07/91)

> I had exactly the same problem on fitting a 2630 to my rev4.3 b2000 using a 
> 2090a. A call to CBM revealed it was a known problem, and could be fixed by 
> adding a pullup resistor across one IC on the motherboard. It cured the 
> problem totally for me (I left it copying a 1.5mb file around the hd for
> a few hours to test - no problem whatsoever). As far as I know the
> fix _ISN'T_ needed on rev6 boards, but it could well be worth 
> checking with CBM anyway (and definitely worth checking on your friends
> rev4.x machine)

Sorry about not posting the actual details of the fix initially, I couldn't
find my notes on it. Anyway, here's the info ....
 
fix for DMA probs with rev 4.x m/boards and accelerator cards
-------------------------------------------------------------

Fit a 3k3 (3.3 kilohm) resistor across pins 11 and 20 of chip U605 (the
top right and bottom right pins of the chip, when looking at the
motherboard from the front of the machine). U605 is one of a group of 6
chips to the left of the 68000, again looking from the front of the
machine.
 
Worked like a charm for me - before putting the resistor in any write
of >100k crashed the machine, now I can write 5mb files (with a single
write() call) 'til I'm blue in the face, without a hiccup.
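
A test along those lines might look like the sketch below. It is not
the program used here, only a modern illustration of the idea: write a
large buffer with a single write call, read it back, and compare. The
file name and function names are made up, and a read-back served from
an OS cache would not exercise the disk at all.

```python
import os
import tempfile


def write_and_verify(path: str, size: int) -> bool:
    """Write `size` random bytes with one write call, then read back."""
    data = os.urandom(size)
    with open(path, "wb", buffering=0) as f:
        written = f.write(data)  # single unbuffered write
    with open(path, "rb") as f:
        return written == size and f.read() == data


def stress(directory: str, size: int = 5 * 1024 * 1024, passes: int = 10) -> int:
    """Repeat the write/verify cycle; return the number of failed passes."""
    path = os.path.join(directory, "dma_stress.tmp")  # hypothetical name
    failures = 0
    try:
        for _ in range(passes):
            if not write_and_verify(path, size):
                failures += 1
    finally:
        if os.path.exists(path):
            os.remove(path)
    return failures


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        print("failed passes:", stress(tmp, passes=3))
```

A hard lockup like the one described in this thread would show up as
the script never finishing rather than as a failed pass.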


gsarff@meph.UUCP (Gary Sarff) (02/13/91)

In article <555376@neabbs.UUCP>, ajbrouw@neabbs.UUCP (ALBERT-JAN BROUWER) writes:
>
>Hey CBM, this message is worth a CAT-Scan or two :-). If you know about
>the problem I'm about to describe please comment. I would like know
>whether the problem is actually being addressed.
>
>Gregory Travis wrote:
>> Here is some more information for Commodore/Randell.  As I posted before
>> and as Randell suggested, turning off reselection on the A2091 with
>> 5.92 ROMS and the WD "A" chip is NOT sufficient to avoid the lockup
>> problem.  First of all, my configuration:
>>
>>	1 A2500/30 with 4 Meg of 32-Bit RAM
>>	1 A2091 with ROM V5.92 and the WD "A" chip.  2 Meg of 16-bit RAM
>>	  on the board.
>> etc. etc.
>
>I think you might be suffering from two separate problems here Gregory.
>The 2091 WD reselection thing has been highlighted sufficiently I think.

I have been seeing messages about this and am curious.  Am I correct in
assuming that the WD "A" chip is the WD3393A?  The reason I am asking is
because I have been in the process of modifying our scsi driver, that
previously worked with NCR boards, Adaptec boards, and our own controller
based on the WD3393 chip.  I did have to do some modifications to our
driver, though they were fairly minor as such things go.  I am wondering if
it is the consensus here, among users, developers, or CATS people, or any
combination thereof, that the "reselection" problem is the fault of the
WD3393A chip or is some problem with the 2091 controller board itself
and/or its interaction with the WD3393A chip.  Our driver does do
disconnect/reconnect selection and seems to be working fine so far.  I have
just finished testing it with 1 Archive Viper 150 Meg SCSI tape, 2 Wren VII
1.2 Gig SCSI, 2 Wren VI 702 Meg, a Wren IV (I think) 182 Meg, and a
Seagate ST251, all being read from and written to simultaneously, and have
not noticed any problems occurring.  BUT, I am not doing this on an Amiga,
but on our own machines that we make, Multibus Moto-680x0 based machines.
I do have an Amiga 2000 though, and thought I would post since I did not
like the sound of some of the things I have been hearing about people's
problems re: the new A chip.  Thanks.


---------------------------------------------------------------------------
                          I _don't_ live for the Leap!
     ..uplherc!wicat!sarek!gsarff

kdarling@hobbes.ncsu.edu (Kevin Darling) (02/19/91)

gsarff@meph.UUCP (Gary Sarff) writes:
>>The 2091 WD reselection thing has been highlighted sufficiently I think.
>
>I have been seeing messages about this and am curious. 

We also use the WD33C93A in our 68K computer (not an Amiga).  I don't
know what the diff is between the A version and previous ones, but
at least one reselection bug was documented as far back as 1987... ROM
revisions 8 and below for the 3393 part.  Could even be that CBM found it.

Look under the chip, and there should be a number like xx3393xxD06xP
or similar... the 06 is the ROM revision number.  Rev 09 supposedly
fixed the selection bug, altho there could be revs past that one.
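
As a rough illustration of reading that marking, here is a sketch that
pulls the two-digit revision out of a string shaped like the example
above. The layout is an assumption taken literally from that example
("3393", two characters, "D", two digits); real markings may differ,
and the optional "C" is only a guess to cover a 33C93 spelling. The
function names are made up.

```python
import re
from typing import Optional

# Assumed layout, taken literally from the example "xx3393xxD06xP".
# The optional "C" is a guess so that "33C93" would also match.
MARKING = re.compile(r"33C?93..D(\d{2})")

# "Rev 09 supposedly fixed the selection bug" (revisions 8 and below
# are the ones reported as affected).
FIXED_REVISION = 9


def rom_revision(marking: str) -> Optional[int]:
    """Return the two-digit ROM revision, or None if it is not found."""
    match = MARKING.search(marking)
    return int(match.group(1)) if match else None


def may_have_reselection_bug(marking: str) -> Optional[bool]:
    """True if the revision predates the reported fix, None if unknown."""
    revision = rom_revision(marking)
    return None if revision is None else revision < FIXED_REVISION


if __name__ == "__main__":
    print(rom_revision("xx3393xxD06xP"))
    print(may_have_reselection_bug("xx3393xxD06xP"))
```

A result of None just means the string did not fit the assumed layout,
not that the chip is fine.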

Again, I may be mixing apples and oranges here (A and not-A)... I didn't
work on the SCSI section except for converting a driver to be interrupt
driven.  But I seem to recall that the guy who did the hardware, mentioned
a known bug.  Not much help; sorry.  I'll ask him.

I know of one company which gave up on the prototype 3393's, but we're
happy with the chip so far.  best - kev  <kdarling@catt.ncsu.edu>

jesup@cbmvax.commodore.com (Randell Jesup) (02/20/91)

In article <00077@meph.UUCP> gsarff@meph.UUCP writes:
>I have been seeing messages about this and am curious.  Am I correct in
>assuming that the WD "A" chip is the WD3393A?  The reason I am asking is

	Yes.

>because I have been in the process of modifying our scsi driver, that
>previously worked with NCR boards, Adaptec boards, and our own controller
>based on the WD3393 chip.  I did have to do some modifications to our
>driver, though they were fairly minor as such things go.  I am wondering if
>it is the consensus here, among users, developers, or CATS people, or any
>combination thereof, that the "reselection" problem is the fault of the
>WD3393A chip or is some problem with the 2091 controller board itself
>and/or its interaction with the WD3393A chip.

	The problem is based on the fact that WD changed what the chip did
when it's reselected while you're trying to select a different drive.  It
now requires us to "sniff" the interrupt queue or some such to figure out what
happened.  (I'm not the person writing the scsi stuff.)

	We used the low-level portions of the chip due to bugs in the high-
level portions in the pre-A versions (perhaps with reselect, I don't know).

-- 
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup  
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming")  ;-)