wasser_1@viking.DEC (John A. Wasser) (03/12/85)
In the January 1985 issue of Dr. Dobb's Journal, on page 17 the following article appeared: David S. Tilton sent us [DDJ] the following sequence of Z80 code. It's an absolutely astonishing implementation of... but you can figure it out. Unfortunately, it requires such narrowly defined conditions that it's probably useless, despite being the fastest one of its kind. What does it do, and what are its narrow requirements for utility? [Comments added by me -JW] ------------------------------------------------------------------------- ; A is an 8 bit register ; B is an 8 bit register ; HL is a 16 bit register made up of H (the high byte) and L (the low byte) ; LOOP: CP (HL) ; Compare A to memory at HL RET Z ; Return if equal. RL L ; Rotate carry and L register (bottom 8 bits of ; HL pair) left (Carry goes into LSB of L) DJNZ LOOP ; B--; if (B != 0) GOTO LOOP LD L,0 ; Load L with the value Zero. RET ; Return ------------------------------------------------------------------------- I have yet to hear a reasonable answer. If you get an answer from DDJ, please don't post it for a couple of weeks... that would spoil the fun! -John A. Wasser Work address: ARPAnet: WASSER%VIKING.DEC@decwrl.ARPA Usenet: {allegra,Shasta,decvax}!decwrl!dec-rhea!dec-viking!wasser USPS: Digital Equipment Corp. Mail stop: LJO2/E4 30 Porter Rd Littleton, MA 01460
chuck@dartvax.UUCP (Chuck Simmons) (03/12/85)
<Another puzzle appears below.> > LOOP: > CP (HL) ; Compare A to memory at HL > RET Z ; Return if equal. > > RL L ; Rotate carry and L register (bottom 8 bits of > ; HL pair) left (Carry goes into LSB of L) > DJNZ LOOP ; B--; if (B != 0) GOTO LOOP > > LD L,0 ; Load L with the value Zero. > RET ; Return This looks like a binary search. Suppose A is a value that we want to lookup in a table and that the address of the table is equivalent to 1 mod 64k. Further we suppose that the number of elements in the table is 2**B-1. Finally the elements in the table have been stored in a rather special order. In particular, we have table(2i) < table(i) < table(2i+1) for all i in 1..2**(B-1)-1. (Actually, we can violate this constraint for the last few values if we don't have enough elements to fill up the table. A reasonable thing to do would be to use table(1) as a filler element.) Example: A = 5; B = 3; table = 4, 2, 6, 1, 3, 5, 7; L = 1 5>4 so set carry bit L <- 3 B = 2 5<6 so clear carry bit L <- 6 B = 1 5=5 so return My puzzle is written in 68000 code, and it does not do anything nearly as common as perform a binary search. (This puzzle is due to Mike Morton, the author of the reaganagrams published in a recent Scientific American.) d0, d1, and d2 are 32 bit registers. move.w #32,d2 ; d2 = 32 bra.s start ; jump into our loop lp: or.l d1,d0 ; d0 |= d1 start: move.l d0,d1 ; d1 = d0 add.l #1,d0 ; d0 = d0 + 1 dbcs d2,lp ; if carry is clear, then decrement d2, ; and if d2 != 0 goto lp. ; (Thus if the carry is set or if d2 becomes 0, ; we will leave the loop.) chuck_simmons%d1@dartvax
wasser_1@viking.DEC (John A. Wasser) (03/20/85)
Here are the answers I've gotten for the Z80 code puzzle. After each answer are my comments. ------------------------------------------------------------------------------- > From: RHEA::DECWRL::"allegra!watmath!watdaisy!ndiamond" > Regarding the Z80 code puzzle you posted: is it a binary search? > (Please be kind enough to answer, and I won't post it. Thank you.) > Norman Diamond > UUCP: {decvax|utzoo|ihnp4|allegra}!watmath!watdaisy!ndiamond > CSNET: ndiamond%watdaisy@waterloo.csnet > ARPA: ndiamond%watdaisy%waterloo.csnet@csnet-relay.arpa I don't know what it is... that's why I posted it. There does seem to be a binary search going on, but... what is being searched for? If there is a resulting L for each A, why not use A as an index into a table? ------------------------------------------------------------------------------- > From: RHEA::DECWRL::"cc1@UCLA-LOCUS.ARPA" > It isn't that hard to figgure out. The code is checking up to 8 memory > locations in the same page of memory for a particular byte. Code very > similar to this is used by my model 1 to scan the keyboard (which works > by shorting address lines to data lines, and software decoding), with > the compare being for any non-zero location => key pressed. > Michael Gersten Which 8 memory locations? Why L shifted left and why is the carry shifted in? Your answer would be more plausible if you could answer these questions. ------------------------------------------------------------------------------- > From: RHEA::DECWRL::"ihnp4!alberta!zaphod!bobd" > This is simply marvelous! As far as I can tell from my Z80 instruction > manual, what we have is a basic heap search. Here are my assumptions > about the registers and data structures: > > a) HL points to a table on a 256 byte boundary. > b) B has log(2) N of the table size (rounded up). > c) The table is maximum 255 bytes long > d) The first entry is empty (irrelevant) > e) HL is initialized to baseOfTable + 1 (L=1, H=??) > f) The table entries are one byte values ordered in heap > formation. I.e., the first item is the midpoint value of the > table. The entries at 2 and 3 are the midpoints of the upper > and lower ranges. > > The carry bit from the CP instruction is used to decide whether to use > the upper or lower half of the table. To do a Knuth style analysis of > the performance, let N be the number of entries in the table, let k be > the ceiling of log2 N, > > ; A is an 8 bit register (the comparand) > ; B is an 8 bit register ( = k ) > ; HL is a 16 bit register made up of H (the high byte) and L (the low byte) > ; HL = ??01 > LOOP: > CP (HL) ; 7 cycles > RET Z ; 5 cycles or 11 cycles if taken > RL L ; 8 cycles > DJNZ LOOP ; 8 cycles or 13 cycles if taken > LD L,0 ; 7 cycles > RET ; 10 cycles > > For the value in A not in the table, the loop time is 28*k + 5*(k-1) + > 17 cycles. Since k is 8 or less, the search will fail in a maximum of 276 > cycles. > > I do not agree that this is from a narrowly defined constraints; > obviously the constraints must be well defined. I still the algorithm > is a marvelous implementation of a good data structure algorithm. > > Please let me know if there are problems with my analysis, or if there > are other points that people brought up. This is the most thorough guess I've seen. I'm not sure what such a search is good for. I also don't see why it wouldn't be faster to set the table up so it can be indexed directly by A. Move A to L, move (HL) to L and you're done. Is the resulting value of B important? ------------------------------------------------------------------------------- I think the last answer is best so far... but I still don't feel we have the final answer. If anyone KNOWS the answer from previous experience (Dr. Dobb is not talking) please post the answer now. Any explanation must specify the contents of the table, the initial contents of A, L and B and the expected results in L and B. Unless the algorithm produces different (useful) results for different initial values of L or B, or produces useful results in BOTH L and B, it will not fit the criteria of being fast because a simple table lookup would be faster. I hope we get an answer soon! -John A. Wasser Work address: ARPAnet: WASSER%VIKING.DEC@decwrl.ARPA Usenet: {allegra,Shasta,decvax}!decwrl!dec-rhea!dec-viking!wasser USPS: Digital Equipment Corp. Mail stop: LJO2/E4 30 Porter Rd Littleton, MA 01460