[comp.sys.nsc.32k] SCSI Woes, revisited

rhyde@ucrmath.ucr.edu (randy hyde) (05/02/91)

I found a problem with my SCSI driver.  It seems the assembler I'm
using (Solutionware's NSX16 cross software package for CP/M [running
on a Z-80 card on my PC]) generates bad code.

I hand compiled Bruce's C code and managed to get it to work.  As I
massaged it towards my code ("keep it working"), I discovered that I
could break the code by inserting NOPs at certain points.  Further
investigation showed the assembler producing bad offsets under certain
circumstances.  Sigh.  Minix is on its way.  Guess I'll have to wait
for Bruce's assembler before I continue with my explorations.

I have a couple of quick questions on the subject.

The manual states that you shouldn't use interruptible instructions for
pseudo-DMA (I assume this means MOVSi).  What about MOVMD?  Is this
instruction kosher?  What is the problem with an interruptible instr.
like MOVS?

Also, since the cache is write-through, I assume that I do not need to
increment the destination pointer when writing to the pseudo-DMA area
(except one bump at the end of the operation to prevent a read from
that same location later on causing a problem).  Is this correct?

While on the subject, what's wrong with simply disabling the data cache
while accessing the pseudo-DMA area?  Can't you disable the data cache and
leave the instr cache going?  This would seem to produce the best performance
anyway since constantly reading from (or writing to) the pseudo-DMA area would 
tend to invalidate a lot of the cache.  Is this a correct assumption?

*** Randy Hyde

gs@vw25.chips.com (George Scolaro) (05/09/91)

[In the message entitled "SCSI Woes, revisited" on May  1, 23:03, randy hyde writes:]
> 
> I have a couple of quick questions on the subject.
> 
> The manual states that you shouldn't use interruptible instructions for
> pseudo-DMA (I assume this means MOVSi).  What about MOVMD?  Is this
> instruction kosher?  What is the problem with an interruptible instr.
> like MOVS?

An interruptible instruction may, on returning from the trap handler,
re-transfer part of the data that was already transferred prior to the
interrupt. This will cause extra data to be read or written to the peripheral
device - not a good idea! I don't know if MOVMD is an interruptible
instruction; maybe someone from NS can supply that information. Anyhow, more
importantly, on the 32532 instructions like MOVD are just as fast as the more
complex MOVSi or MOVMi instructions, as long as they are in the instruction
cache - which would be the case.
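
To make the re-transfer hazard concrete, here is a toy C model. It is only a
sketch under stated assumptions: fifo_dev, port_read and
string_move_from_port are hypothetical names, and the "interrupt" is
modelled simply as one element being transferred a second time on resume. It
does not claim to reproduce what the 32532 actually does internally.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical device: every read of the data port pops the next byte. */
typedef struct {
    const uint8_t *data;
    size_t len;
    size_t next;   /* index of the next byte the device will hand out */
} fifo_dev;

static uint8_t port_read(fifo_dev *dev)
{
    assert(dev->next < dev->len);
    return dev->data[dev->next++];
}

/*
 * Toy model of a restartable string move that reads from the port.
 * If interrupt_at == i, element i is read once, the "interrupt" hits before
 * the result is committed, and the element is transferred again on resume.
 * Pass interrupt_at >= count for an undisturbed transfer.
 * Returns the number of port reads performed.
 */
static size_t string_move_from_port(fifo_dev *dev, uint8_t *dst,
                                    size_t count, size_t interrupt_at)
{
    size_t reads = 0;
    for (size_t i = 0; i < count; i++) {
        uint8_t v = port_read(dev);
        reads++;
        if (i == interrupt_at) {
            /* Resume: the same element is transferred a second time. */
            v = port_read(dev);
            reads++;
        }
        dst[i] = v;
    }
    return reads;
}
```

With ordinary memory as the source the repeated read would be harmless,
because the same address returns the same data. With a port that hands out a
new byte on every read, the repeat consumes one byte too many and everything
after it in the buffer is shifted.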

> Also, since the cache is write-through, I assume that I do not need to
> increment the destination pointer when writing to the pseudo-DMA area
> (except one bump at the end of the operation to prevent a read from
> that same location later on causing a problem).  Is this correct?

Sounds right.
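
A minimal C sketch of the fixed-destination write being discussed. The
function name pdma_write_fixed and the port argument are assumptions made
for illustration; on real hardware the port would be an address inside the
pseudo-DMA window, which is not given here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/*
 * Write nwords 32-bit words to a single device address.
 * 'port' is a hypothetical stand-in for an address in the pseudo-DMA area;
 * volatile keeps the compiler from merging or dropping the repeated stores.
 */
static void pdma_write_fixed(volatile uint32_t *port,
                             const uint32_t *src, size_t nwords)
{
    assert(port != NULL && src != NULL);
    for (size_t i = 0; i < nwords; i++)
        *port = src[i];   /* destination pointer is never incremented */
}
```

The sketch covers only the store side. The single bump of the pointer at the
end of the operation, mentioned in the question, is not modelled.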

> While on the subject, what's wrong with simply disabling the data cache
> while accessing the pseudo-DMA area?  Can't you disable the data cache and
> leave the instr cache going?  This would seem to produce the best performance
> anyway since constantly reading from (or writing to) the pseudo-DMA area would 
> tend to invalidate a lot of the cache.  Is this a correct assumption?

Not necessarily, since disabling the data cache would also cause the memory
side of the transfer to not be cached. This would reduce the transfer
performance - we rely on the cache doing a line fill on a read - 16 byte
transfer (more important on read from memory and write to peripheral).
Also, if you just disable the cache and then re-enable it - it will be
flushed, to ensure coherency - locking it might be better (though I'm not
sure if you can lock the data cache - you certainly can for the instruction
cache). The bottom line is try and see which is fastest. Also, make sure
you have a 32532 data booklet on hand and read the fine print very closely!
It always amazed me to see the way NS did their 32k data sheets - the most
important information was always printed in the smallest type!

best regards,

-- 
George Scolaro (gs@vw25.chips.com)	Chips & Technologies
(408) 434-0600				3050 Zanker Road
					San Jose, CA  95134

rhyde@ucrmath.ucr.edu (randy hyde) (05/10/91)

I've been playing around with the cache vs. pseudo-DMA transfer tradeoffs
and I've come up with a couple of theories I'd like to bounce around and
see if they're realistic--

MOVD vs. MOVMD (My understanding is that MOVMD is *not* interruptible, that's
why it was limited to 16 bytes).  Perhaps the execution times of four MOVD
instrs and one MOVMD instruction are the same.  However, the MOVMD instruction
is more compact so it impacts the instr cache less.  MOVMD would seem to be
a better choice in this regard.

To transfer a 512 byte block, I've seen it implied here and in the documentation
that one should execute 128 MOVD instrs in a row.  This is probably great if
all those instrs are in the cache.  If they're not, I would imagine that
executing some smaller number (say 32) in a loop would produce better
results since you would pay the price of filling the cache for only 35 (or so)
instrs and then you could run the remaining 96 MOVDs out of cache.  Would this
produce better results?
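
The loop structure in question can be sketched in C as below. This is an
illustration only: copy_unrolled32 and the MOVE macros are hypothetical
names, a C compiler is free to re-roll or reorder the stores, and a real
driver would spell this out as MOVD instructions in assembly.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Straight-line groups of 32-bit moves, built up to 32 per pass. */
#define MOVE4(d, s, i)  do { (d)[(i)]     = (s)[(i)];     \
                             (d)[(i) + 1] = (s)[(i) + 1]; \
                             (d)[(i) + 2] = (s)[(i) + 2]; \
                             (d)[(i) + 3] = (s)[(i) + 3]; } while (0)
#define MOVE16(d, s, i) do { MOVE4(d, s, (i));     MOVE4(d, s, (i) + 4); \
                             MOVE4(d, s, (i) + 8); MOVE4(d, s, (i) + 12); } while (0)
#define MOVE32(d, s)    do { MOVE16(d, s, 0); MOVE16(d, s, 16); } while (0)

/*
 * Copy nwords 32-bit words, 32 straight-line moves per loop pass.
 * For a 512-byte block, nwords is 128 and the loop body runs 4 times.
 */
static void copy_unrolled32(uint32_t *dst, const uint32_t *src, size_t nwords)
{
    assert(nwords % 32 == 0);
    for (size_t done = 0; done < nwords; done += 32) {
        MOVE32(dst, src);
        dst += 32;
        src += 32;
    }
}
```

Only the first pass through the loop body has to be fetched from memory;
the later passes can run from the instruction cache, at the cost of the
loop's own branch overhead on each pass.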

Alas, I've been unable to test these two theories out, it seems my drive
tops out somewhere between 700K-1Mbyte/sec transfer rate (when transferring
64K blocks).  Every trick I try to speed it up fails.  The disk transfer
rate seems to be the limiting factor.  However, I don't want to use this
as an excuse to stop trying, I may buy a faster SCSI-II drive some day.
*** Randy Hyde

dlr@daver.bungi.com (Dave Rand) (05/12/91)

[In the message entitled "Re: SCSI Woes, revisited" on May  9, 18:00, randy hyde writes:]
> MOVD vs. MOVMD (My understanding is that MOVMD is *not* interruptible, that's
> why it was limited to 16 bytes).  Perhaps the execution times of four MOVD
> instrs and one MOVMD instruction are the same.  However, the MOVMD instruction
> is more compact so it impacts the instr cache less.  MOVMD would seem to be
> a better choice in this regard.

Nope. MOVMD is much slower than MOVD for a number of subtle (and not so
subtle) reasons. MOVSD is also slower than MOVD (modulo the cache effects).
I ran quite a few simulations before the design of the PC532 to figure out
what was best, and why. The effects of the cache, *IOINH, DRAM, and SCSI
are complex (at the best of times, and undocumented in the worst).

Required reading is the NS32532 Data sheet (with instruction timing, or
the NS32GX32 data sheet if you can't find a 532 sheet with timing);
an application note "Using Memory-Mapped I/O with the NS32GX32 or NS32532";
an application note "Series 32000 Instruction Execution Times"; and the
PC532 theory of operation.

This still won't tell you everything (such as *IOINH invalidating the 
instruction cache - nothing is *ever* mentioned about this!). But you will
have better information than a gut feeling... And once you think you know
everything there is to know, hook the whole thing up to a logic analyzer
and find out what it REALLY does (bet you get as many surprises as we
did!).

> 
> Alas, I've been unable to test these two theories out, it seems my drive
> tops out somewhere between 700K-1Mbyte/sec transfer rate (when transferring
> 64K blocks).  Every trick I try to speed it up fails.  The disk transfer
> rate seems to be the limiting factor.  However, I don't want to use this
> as an excuse to stop trying, I may buy a faster SCSI-II drive some day.
> *** Randy Hyde

The current scheme will drive the SCSI bus to 3.8-3.9 megabytes per second
in ASYNC mode (assuming the SCSI device is fast enough - so far only another
PC532 has been fast enough to sustain this rate). In SYNC mode, it should
go to nearly 5 megabytes per second, sustained. That is as fast as the SCSI
chips can support - there is no point trying to go faster.
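
For a feel of what these rates mean per 512-byte block, a small C helper
can do the arithmetic. The name transfer_us is hypothetical, and the
calculation assumes 1 megabyte = 1,000,000 bytes, which the posts above do
not specify.

```c
#include <assert.h>
#include <stdint.h>

/* Microseconds needed to move 'bytes' at 'bytes_per_sec', rounded down. */
static uint64_t transfer_us(uint64_t bytes, uint64_t bytes_per_sec)
{
    assert(bytes_per_sec > 0);
    return bytes * 1000000u / bytes_per_sec;
}
```

Under that assumption, transfer_us(512, 3800000) gives 134 microseconds per
block and transfer_us(512, 5000000) gives 102, against 512 microseconds at
1 megabyte per second.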



-- 
Dave Rand
{pyramid|mips|bct|vsi1}!daver!dlr	Internet: dlr@daver.bungi.com