[net.puzzle] Z80 Code Puzzle

wasser_1@viking.DEC (John A. Wasser) (03/12/85)

In the January 1985 issue of Dr. Dobb's Journal, on page 17 the following
article appeared:

	David S. Tilton sent us [DDJ] the following sequence of Z80 
	code.  It's an absolutely astonishing implementation of... 
	but you can figure it out.  Unfortunately, it requires such 
	narrowly defined conditions that it's probably useless, 
	despite being the fastest one of its kind. What does it
	do, and what are its narrow requirements for utility?

[Comments added by me -JW]

-------------------------------------------------------------------------
; A is an 8 bit register
; B is an 8 bit register
; HL is a 16 bit register made up of H (the high byte) and L (the low byte)
;
LOOP:
	CP	(HL)	; Compare A to memory at HL
	RET	Z	; Return if equal.

	RL	L	; Rotate carry and L register (bottom 8 bits of
			; HL pair) left (Carry goes into LSB of L)
	DJNZ	LOOP	; B--; if (B != 0) GOTO LOOP

	LD	L,0	; Load L with the value Zero.
	RET		; Return
-------------------------------------------------------------------------

	I have yet to hear a reasonable answer.  If you get an answer
	from DDJ, please don't post it for a couple of weeks... that
	would spoil the fun!

		-John A. Wasser

Work address:
ARPAnet:	WASSER%VIKING.DEC@decwrl.ARPA
Usenet:		{allegra,Shasta,decvax}!decwrl!dec-rhea!dec-viking!wasser
USPS:		Digital Equipment Corp.
		Mail stop: LJO2/E4
		30 Porter Rd
		Littleton, MA  01460

chuck@dartvax.UUCP (Chuck Simmons) (03/12/85)

<Another puzzle appears below.>

> LOOP:
> 	CP	(HL)	; Compare A to memory at HL
> 	RET	Z	; Return if equal.
> 
> 	RL	L	; Rotate carry and L register (bottom 8 bits of
> 			; HL pair) left (Carry goes into LSB of L)
> 	DJNZ	LOOP	; B--; if (B != 0) GOTO LOOP
> 
> 	LD	L,0	; Load L with the value Zero.
> 	RET		; Return

This looks like a binary search.  Suppose A is a value that we want to
lookup in a table and that the address of the table is equivalent to 1
mod 64k.  Further we suppose that the number of elements in the table 
is 2**B-1.  Finally the elements in the table have been stored in a
rather special order.  In particular, we have 

    table(2i) < table(i) < table(2i+1)

for all i in 1..2**(B-1)-1.  (Actually, we can violate this constraint
for the last few values if we don't have enough elements to fill up
the table.  A reasonable thing to do would be to use table(1) as a
filler element.)

Example:  A = 5; B = 3; table = 4, 2, 6, 1, 3, 5, 7; L = 1

5>4 so set carry bit
L <- 3
B = 2
5<6 so clear carry bit
L <- 6
B = 1
5=5 so return


My puzzle is written in 68000 code, and it does not do anything nearly
as common as perform a binary search.  (This puzzle is due to Mike Morton,
the author of the reaganagrams published in a recent Scientific American.)
d0, d1, and d2 are 32 bit registers.

        move.w #32,d2   ; d2 = 32
        bra.s start     ; jump into our loop
lp:     or.l d1,d0      ; d0 |= d1
start:  move.l d0,d1    ; d1 = d0
        add.l #1,d0     ; d0 = d0 + 1
        dbcs d2,lp      ; if carry is clear, then decrement d2,
                        ; and if d2 != 0 goto lp.
                        ; (Thus if the carry is set or if d2 becomes 0,
                        ; we will leave the loop.)

chuck_simmons%d1@dartvax

wasser_1@viking.DEC (John A. Wasser) (03/20/85)

	Here are the answers I've gotten for the Z80 code puzzle.
	After each answer are my comments.
-------------------------------------------------------------------------------
> From:	RHEA::DECWRL::"allegra!watmath!watdaisy!ndiamond" 
> Regarding the Z80 code puzzle you posted:  is it a binary search?
> (Please be kind enough to answer, and I won't post it.  Thank you.)
>    Norman Diamond
> UUCP:  {decvax|utzoo|ihnp4|allegra}!watmath!watdaisy!ndiamond
> CSNET: ndiamond%watdaisy@waterloo.csnet
> ARPA:  ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa

	I don't know what it is... that's why I posted it.  There does
	seem to be a binary search going on, but...  what is being
	searched for?  If there is a resulting L for each A, why not
	use A as an index into a table?
-------------------------------------------------------------------------------
> From:	RHEA::DECWRL::"cc1@UCLA-LOCUS.ARPA" 
> It isn't that hard to figgure out. The code is checking up to 8 memory
> locations in the same page of memory for a particular byte. Code very
> similar to this is used by my model 1 to scan the keyboard (which works
> by shorting address lines to data lines, and software decoding), with
> the compare being for any non-zero location => key pressed.
> 				Michael Gersten
	Which 8 memory locations?  Why L shifted left and why is
	the carry shifted in?  Your answer would be more plausible
	if you could answer these questions.
-------------------------------------------------------------------------------
> From:	RHEA::DECWRL::"ihnp4!alberta!zaphod!bobd"
> This is simply marvelous!  As far as I can tell from my Z80 instruction
> manual, what we have is a basic heap search.  Here are my assumptions
> about the registers and data structures:
>
>	a) HL points to a table on a 256 byte boundary.
>	b) B has log(2) N of the table size (rounded up).
>	c) The table is maximum 255 bytes long
>	d) The first entry is empty (irrelevant)
>	e) HL is initialized to baseOfTable + 1 (L=1, H=??)
>	f) The table entries are one byte values ordered in heap
>	   formation.  I.e., the first item is the midpoint value of the
>	   table.  The entries at 2 and 3 are the midpoints of the upper
>	   and lower ranges.
>
> The carry bit from the CP instruction is used to decide whether to use
> the upper or lower half of the table.  To do a Knuth style analysis of
> the performance, let N be the number of entries in the table, let k be
> the ceiling of log2 N,
>
> ; A is an 8 bit register (the comparand)
> ; B is an 8 bit register ( = k )
> ; HL is a 16 bit register made up of H (the high byte) and L (the low byte)
> ;  HL = ??01
> LOOP:
>	CP	(HL)	; 7 cycles
>	RET	Z	; 5 cycles or 11 cycles if taken
>	RL	L	; 8 cycles
>	DJNZ	LOOP	; 8 cycles or 13 cycles if taken
>	LD	L,0	; 7 cycles
>	RET		; 10 cycles
>
> For the value in A not in the table, the loop time is 28*k + 5*(k-1) +
> 17 cycles.  Since k is 8 or less, the search will fail in a maximum of 276
> cycles.
>
> I do not agree that this is from a narrowly defined constraints;
> obviously the constraints must be well defined.  I still the algorithm
> is a marvelous implementation of a good data structure algorithm.
>
> Please let me know if there are problems with my analysis, or if there
> are other points that people brought up.

	This is the most thorough guess I've seen.  I'm not sure what
	such a search is good for.  I also don't see why it wouldn't
	be faster to set the table up so it can be indexed directly
	by A. Move A to L, move (HL) to L and you're done.  Is the
	resulting value of B important?
-------------------------------------------------------------------------------

	I think the last answer is best so far... but I still don't
	feel we have the final answer.  If anyone KNOWS the answer
	from previous experience (Dr. Dobb is not talking) please
	post the answer now.  Any explanation must specify the
	contents of the table, the initial contents of A, L and B and
	the expected results in L and B.  Unless the algorithm produces
	different (useful) results for different initial values of
	L or B, or produces useful results in BOTH L and B, it will
	not fit the criteria of being fast because a simple table
	lookup would be faster.

	I hope we get an answer soon!


		-John A. Wasser

Work address:
ARPAnet:	WASSER%VIKING.DEC@decwrl.ARPA
Usenet:		{allegra,Shasta,decvax}!decwrl!dec-rhea!dec-viking!wasser
USPS:		Digital Equipment Corp.
		Mail stop: LJO2/E4
		30 Porter Rd
		Littleton, MA  01460