wittig@gmdzi.UUCP (Georg Wittig) (10/03/89)
The Z80 processor is known to be one of the cheapest processors in the world, but alas it is also known to be NOT one of the fastest. The more I am interested in fast assembler coded algorithms to avoid unnecessary waiting periods. The algorithms I'm interested in are relatively elementary: 16 bit multiplication, 16 bit division, 32 bit arithmetics, filling an arbitraray amount of storage with a given bit pattern, conversion of ascii encoded numbers to binary and vice versa, conversion from binary to hexadecimal notation, etc. Well, I do have such algorithms; but I wonder if they can be made faster. Does there exist a collection of such algorithms I can access? (I don't have access to anonymous ftp.) I'm watching this newsgroup since last winter, and I haven't seen one article on this subject. Or do I watch the wrong newsgroup? :-| If there is someone who wants to send me some of those algorithms, please don't mail them, please post them: I could bet there are some people out there in netland who are interested, too. :-) Waiting patiently ... Thanks in advance, -- Georg Wittig GMD-Z1.BI P.O. Box 1240 D-5205 St. Augustin 1 (West Germany) email: wittig@gmdzi.uucp phone: (+49 2241) 14-2294 ------------------------------------------------------------------------------- "Freedom's just another word for nothing left to lose" (Kris Kristofferson)
rzh@lll-lcc.UUCP (Roger Hanscom) (10/05/89)
In <1306@gmdzi.UUCP>, Georg Wittig (wittig@gmdzi.UUCP) requests Z80 algorithms for 16- and 32-bit arithmetic, code conversion, etc. Georg- There's a pretty good book on the subject called "Z80 Assembly Language Routines" by Lance Leventhal and Winthrop Saville (Osborne/McGraw-Hill, Berkeley, 1983). I don't know if it's still in print (soft-bound), but perhaps you could find it in a good library?? It presents assembly language fragments with descriptive text in chapters with titles such as: "code conversion", "string manipulation", "arithmetic", "bit manipulation and shifts", etc. I can't vouch for the relative speed of these algorithms, but it certainly could be a good place to start! If you are interested in floating point arithmetic, there's not much here. You might then try looking at a dis-assembled BASIC or f.p. math libraries from a Z80 "C" compiler?? Any help with f.p. out there???? roger rzh@lll-lcc.llnl.gov
jurjen@cwi.nl (Jurjen N.E. Bos) (10/10/89)
Maybe somebody is interested in this: I can multiply to unsigned 8-bit numbers in 158 clockcycles. If you can beat me, please let me know. If you are interested, I'll publish the algorithm. To make you anxious, I will not tell you the trick right now. (By the way: it costs less than 1K.) -- | | "Never imagine yourself not to be otherwise than what | | Jurjen N.E. Bos | it might appear to others that what you were or might | | | have been was not otherwise than what you had been | | jurjen@cwi.nl | would have appeared to them to be otherwise." |
black@ROM.ECSE.RPI.EDU (Jerry Glomph Black) (10/14/89)
(Jurjen N.E. Bos) writes: >Maybe somebody is interested in this: I can multiply to unsigned 8-bit numbers >in 158 clockcycles. If you can beat me, please let me know. If you are >interested, I'll publish the algorithm. >To make you anxious, I will not tell you the trick right now. >(By the way: it costs less than 1K.) Big deal! Use a 64180 (aka Z180) chip, has the MLT instruction, takes 17 clock cycles, or 1.85 micro seconds. (By the way: it's a 2-byte instruction) Jerry Glomph Black, 8-bit terrorist
mwilson@crash.cts.com (Marc Wilson) (10/15/89)
In article <8910132000.AA03993@rom.ecse.rpi.edu> black@ROM.ECSE.RPI.EDU (Jerry Glomph Black) writes: > >Big deal! Use a 64180 (aka Z180) chip, has the MLT instruction, takes 17 >clock cycles, or 1.85 micro seconds. (By the way: it's a 2-byte instruction) Since when is the 64180 made by Zilog? The 64180 is a Hitachi Z80 clone with some additional instructions. -- ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ Marc Wilson ARPA: ...!crash!mwilson@nosc.mil ...!crash!pnet01!pro-sol!mwilson@nosc.mil UUCP: [ cbosgd | hp-sdd!hplabs | sdcsvax | nosc ]!crash!mwilson INET: mwilson@crash.CTS.COM ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
black@ROM.ECSE.RPI.EDU (Jerry Glomph Black) (10/16/89)
>Big deal! Use a 64180 (aka Z180) chip, has the MLT instruction, takes 17 >clock cycles, or 1.85 micro seconds. (By the way: it's a 2-byte instruction) >> Since when is the 64180 made by Zilog? The 64180 is a Hitachi Z80 clone >>with some additional instructions. Zilog, having flubbed the Z280 project for so many years, was mortally embarrassed by Hitachi's excellent 64180 chip (and even more excellent 1-chip version, the 647180). Thus they made a deal to market the 64180 under the pseudonym Z180. The 64180 is far more than a 'Z80 clone'. It does support the Z80 instruction set, but has 2 asynch and 1 synch serial ports, 1 Meg address space, 2 16-bit timers, and 4 external interrupt lines. Jerry Glomph Black, 8-bit terrorist
dbraun@cadev5.intel.com (Doug Braun ~) (10/16/89)
In article <8910132000.AA03993@rom.ecse.rpi.edu> black@ROM.ECSE.RPI.EDU (Jerry Glomph Black) writes: > (Jurjen N.E. Bos) writes: >>Maybe somebody is interested in this: I can multiply to unsigned 8-bit numbers >>in 158 clockcycles. . . . > >Big deal! Use a 64180 (aka Z180) chip, has the MLT instruction, takes 17 >clock cycles, or 1.85 micro seconds. (By the way: it's a 2-byte instruction) > >Jerry Glomph Black, 8-bit terrorist Bigger deal!! Use a Z280 and do *16* bit multiplies (signed AND unsigned) in 25 or so clcok cycles. Also divide! (32 bits / 16 bits). Also 2-byte instructions. Unless you want to do something like: HL := DEHL DIV (IX+2345H) Doug Braun Intel Corp CAD 408 765-4279 / decwrl \ | hplabs | -| oliveb |- !intelca!mipos3!cadev4!dbraun | amd | \ qantel / or: dbraun@cadev4.intel.com
josef@peun11.uucp (Moellers) (10/17/89)
mwilson@crash.cts.com (Marc Wilson) writes: > Since when is the 64180 made by Zilog? The 64180 is a Hitachi Z80 clone >with some additional instructions. It was not originally made by Zilog (i.e. developped) but Zilog probably found it a neat chip (which it is) and it fitted nicely in their range, so they second-sourced it and named it the Z180. That's all folks. Josef Moellers paper mail: e-mail: c/o Nixdorf Computer AG USA: uunet!philabs!linus!nixbur!mollers.pad Abt. DX-PC !USA: mcvax!unido!nixpbe!mollers.pad Pontanusstrasse Phone: D-4790 Paderborn (+49) 5251 146245 +-----------------------------------------------------------------------+ | "Many that live deserve death. And some that die deserve life. | | Can You give it to them? Then do not be too eager to deal out | | death in judgement" | | Gandalf to Frodo in "The Fellowship of the Ring"| +-----------------------------------------------------------------------+ Josef Moellers paper mail: e-mail: c/o Nixdorf Computer AG USA: uunet!philabs!linus!nixbur!mollers.pad
wittig@gmdzi.UUCP (Georg Wittig) (10/17/89)
Someone mailed to me the following memory fill algorithm: > > ld hl,buffer ; point at buffer > ld de,buffer + 1 ; point at next byte > ld bc,count - 1 ; number of bytes minus one > ld (hl),xxx ; save the first byte > ldir ; replicate through rest of buffer > >is the fastest buffer fill I know on the Z80. There exists a much faster faster algorithm for that. Let's see: The central statement in your solution is `ldir'. It takes 21 T states per byte. For 16 bytes this is 336 T states. Using the push statement is much faster: Set D = E = the byte to be filled in; let SP point 1 byte after the end of the area to be filled; B contains the number of 16 byte blocks to be filled. Then use "push DE"s, and you're finished very quickly. DI ; CP/M must not interrupt, because SP will be ; misused LD (sp_save),SP ; save current SP value LD SP,HL ; assuming HL points to <end+1> L: PUSH DE ; 8 times, so 16 bytes are filled PUSH DE PUSH DE PUSH DE PUSH DE PUSH DE PUSH DE PUSH DE DJNZ L ; a 16 byte portion has been processed. LD SP,(sp_save) ; restore the SP EI ; done This way you can fill up to 4096 (256*16) bytes. If more bytes are to be filled, build a loop around it. If the number of the bytes to be filled isn't a multiple of 16, the bytes 1 to 15 can be filled straight forward with a traditional algorithm. The timing of that algorithm: The central loop starts at "L:" and ends with "DJNZ". "push DE" needs 11 T states; DJNZ needs 13 ones. So for 16 bytes to be filled you get: 8 * 11 + 13 = 101 101 / 336 = 30 % Relatively fast, isn't it? And -- it works! PS: The idea isn't mine, I found it some time ago in a journal. I'm sorry I don't remember which one it was. -- Georg Wittig GMD-Z1.BI P.O. Box 1240 D-5205 St. Augustin 1 (West Germany) email: wittig@gmdzi.uucp phone: (+49 2241) 14-2294 - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - "Freedom's just another word for nothing left to lose" (Kris Kristofferson)
nobody@wrgate.WR.TEK.COM (Unpriveleged user) (10/18/89)
>Someone mailed to me the following memory fill algorithm: > >> >> ld hl,buffer ; point at buffer >> ld de,buffer + 1 ; point at next byte >> ld bc,count - 1 ; number of bytes minus one >> ld (hl),xxx ; save the first byte >> ldir ; replicate through rest of buffer >> >>is the fastest buffer fill I know on the Z80. > >There exists a much faster faster algorithm for that. Let's see: > >The central statement in your solution is `ldir'. It takes 21 T states per >byte. For 16 bytes this is 336 T states. > >Using the push statement is much faster: > Set D = E = the byte to be filled in; > let SP point 1 byte after the end of the area to be filled; > B contains the number of 16 byte blocks to be filled. >Then use "push DE"s, and you're finished very quickly. > > DI ; CP/M must not interrupt, because SP will be > ; misused {rest of clever PUSHing algorithm deleted} From: michaelk@copper.WR.TEK.COM (Michael D. Kersenbrock) Path: copper!michaelk The push-algorithm is good for memory fill patterns when the pattern itself is of a particular variety and is of a fixed length. The LDIR algorithm works for any pattern of length. You load the pattern once, with: hl-> beginning of memory area de-> beginning of memory area + pattern_length bc == fill_length - pattern length You then load your pattern starting at (HL) ONCE with a sequence that gets it from the source (wherever it comes from). Then you LDIR it. If the fill length is less than pattern length, this can be tested when being loaded manually the "first (and only) time". This generalization of the 1-byte case (presented at the top of this article) is a fast way of loading variable-length patterns, AND it can be interrupted. Byte count is small too. Just depends on your needs :-). Mike Kersenbrock Tektronix Microprocessor Development Products michaelk@copper.WR.TEK.COM Aloha, Oregon
josef@peun11.uucp (Moellers) (10/18/89)
dbraun@cadev5.intel.com (Doug Braun ~) writes: >Bigger deal!! Use a Z280 and do *16* bit multiplies (signed AND unsigned) >in 25 or so clcok cycles. Also divide! (32 bits / 16 bits). >Also 2-byte instructions. Unless you want to do something like: > HL := DEHL DIV (IX+2345H) >Doug Braun Intel Corp CAD > 408 765-4279 OK, but then You'll loose Z80 compatibility (unless I'm wrong)! You might as well take the NS32K, that gives You multiply, divide, remainder in all sorts of data types (signed, unsigned, float) and sized (byte, word, double, float, long) Josef Moellers paper mail: e-mail: c/o Nixdorf Computer AG USA: uunet!philabs!linus!nixbur!mollers.pad Abt. DX-PC !USA: mcvax!unido!nixpbe!mollers.pad Pontanusstrasse Phone: D-4790 Paderborn (+49) 5251 146245 +-----------------------------------------------------------------------+ | "Many that live deserve death. And some that die deserve life. | | Can You give it to them? Then do not be too eager to deal out | | death in judgement" | | Gandalf to Frodo in "The Fellowship of the Ring"| +-----------------------------------------------------------------------+ Josef Moellers paper mail: e-mail: c/o Nixdorf Computer AG USA: uunet!philabs!linus!nixbur!mollers.pad
dg@lakart.UUCP (David Goodenough) (10/18/89)
dbraun@cadev5.intel.com (Doug Braun ~) sez: ] black@ROM.ECSE.RPI.EDU (Jerry Glomph Black) writes: ] ]> (Jurjen N.E. Bos) writes: ]>>Maybe somebody is interested in this: I can multiply to unsigned 8-bit numbers ]>>in 158 clockcycles. . . . ]> ]>Big deal! Use a 64180 (aka Z180) chip, has the MLT instruction, takes 17 ]>clock cycles, or 1.85 micro seconds. (By the way: it's a 2-byte instruction) ]> ]>Jerry Glomph Black, 8-bit terrorist ] ] ] Bigger deal!! Use a Z280 and do *16* bit multiplies (signed AND unsigned) ] in 25 or so clock cycles. Yes, but are they plug in replacements for the Z80?? No, I didn't think so. -- dg@lakart.UUCP - David Goodenough +---+ IHS | +-+-+ ....... !harvard!xait!lakart!dg +-+-+ | AKA: dg%lakart.uucp@xait.xerox.com +---+
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/19/89)
In article <8910161517.AA06146@rom.ecse.rpi.edu>, black@ROM.ECSE.RPI.EDU (Jerry Glomph Black) writes: | pseudonym Z180. The 64180 is far more than a 'Z80 clone'. It does support | the Z80 instruction set, but has 2 asynch and 1 synch serial ports, 1 Meg | address space, 2 16-bit timers, and 4 external interrupt lines. So can I get a nice S100 board with one? Maybe even a nice single board CP/M system to replace the collection of relics I currently run? Either that or a board which gives me an AT on a single A100 card ;-) -- bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen) "The world is filled with fools. They blindly follow their so-called 'reason' in the face of the church and common sense. Any fool can see that the world is flat!" - anon
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/19/89)
In article <1335@gmdzi.UUCP>, wittig@gmdzi.UUCP (Georg Wittig) writes: | This way you can fill up to 4096 (256*16) bytes. If more bytes are to be | filled, build a loop around it. If the number of the bytes to be filled isn't a | multiple of 16, the bytes 1 to 15 can be filled straight forward with a | traditional algorithm. No, you and the count with 17(8) and then jump into the loop so you only do the excess modulo 16 the first time. If the excess is non-zero you have to loop one more time, though. -- bill davidsen (davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen) "The world is filled with fools. They blindly follow their so-called 'reason' in the face of the church and common sense. Any fool can see that the world is flat!" - anon
bobc@attctc.Dallas.TX.US (Bob Calbridge) (10/19/89)
In article <1238@crdos1.crd.ge.COM>, davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes: ` In article <8910161517.AA06146@rom.ecse.rpi.edu>, black@ROM.ECSE.RPI.EDU (Jerry Glomph Black) writes: ` | pseudonym Z180. The 64180 is far more than a 'Z80 clone'. It does support ` | the Z80 instruction set, but has 2 asynch and 1 synch serial ports, 1 Meg ` | address space, 2 16-bit timers, and 4 external interrupt lines. ` ` So can I get a nice S100 board with one? Maybe even a nice single ` board CP/M system to replace the collection of relics I currently run? ` Either that or a board which gives me an AT on a single A100 card ;-) What you need is a nice S180 board with one. Actually, I recall that the Micro Mint used to have one. Pretty cheap too. I don't know if they're still in business. Used to be you could find their ads in Byte. I haven't read it in so long I can't guarantee it. It was a single board system and came, I seem to recall, with either CP/M or a CP/M clone. -- -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- = I know it's petty.......... = - But I have to justify my salary! - =-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
josef@peun11.uucp (Moellers) (10/20/89)
davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes: > So can I get a nice S100 board with one? Maybe even a nice single >board CP/M system to replace the collection of relics I currently run? >Either that or a board which gives me an AT on a single A100 card ;-) You CAN get a nice system with the 64180 on it: The Micromint/Ciarcia's Circuit Cellar SB180(FX) It features: - HD64180 (with the interfaces: 2xserial, timers, dma, mmu) - 256 kB memory (FX has 512 kB, externally extendable to 2Megs) - floopy controller for 8", 5.25" (and 3.5") - parallel interfaces (CENTRONICS AND a 8255) - SCSI chip (FX only) - EPROM with monitor You can oder it with the Zsystem (CPM2.2 compatible but A LOT NICER) costs about $500. Add to this the GT180: - piggy backs onto SB180(FX) - high resolution: (standard is 640x480 but I ran it on a Mutlisync II) - Uses HD63484 ACRTC -> LINE, RECTANGLE, CIRCLE, ELLIPSE, ... - RGBI/TTL or 16 out of 4096 colors analog Makes You another $500 poorer, but ITS GREAT!!! Give Micromint a ring: 1-(800)635-3355 or 1-(203)-871-6170 NOTE: I AM IN NO WAY CONNECTED TO MICROMINT APART FROM BEING VERY HAPPY WITH THE SB180FX/GT180 Josef Moellers paper mail: e-mail: c/o Nixdorf Computer AG USA: uunet!philabs!linus!nixbur!mollers.pad Abt. DX-PC !USA: mcvax!unido!nixpbe!mollers.pad Pontanusstrasse Phone: D-4790 Paderborn (+49) 5251 146245 +-----------------------------------------------------------------------+ | "Many that live deserve death. And some that die deserve life. | | Can You give it to them? Then do not be too eager to deal out | | death in judgement" | | Gandalf to Frodo in "The Fellowship of the Ring"| +-----------------------------------------------------------------------+ Josef Moellers paper mail: e-mail: c/o Nixdorf Computer AG USA: uunet!philabs!linus!nixbur!mollers.pad
fifi@cosmo.UUCP (A.F.Zinser) (10/25/89)
In article <577@nixpbe.UUCP> josef@peun11.uucp (Moellers) writes: > > [...] > >Bigger deal!! Use a Z280 and do *16* bit multiplies (signed AND unsigned) > >in 25 or so clcok cycles. Also divide! (32 bits / 16 bits). > >Also 2-byte instructions. Unless you want to do something like: > > HL := DEHL DIV (IX+2345H) > OK, but then You'll loose Z80 compatibility (unless I'm wrong)! > You might as well take the NS32K, that gives You > multiply, divide, remainder in all sorts of data types (signed, > unsigned, float) and sized (byte, word, double, float, long) Sorry, the Z280 (Z8000) instruction set is a superset of the Z80 instruction set; it's the same sitiuation as using the HD64180 (Z180)! As far as I know the NS32K never understands any Z80 instruction :-). Axel Zinser --***%%%%***-- Axel F. Zinser ...uunet!mcvax!unido!cosmo!fifi fifi@cosmo.uucp
josef@peun11.uucp (Moellers) (10/27/89)
fifi@cosmo.UUCP (A.F.Zinser) writes: >In article <577@nixpbe.UUCP> josef@peun11.uucp (Moellers) writes: >> [quote of my reply stating that if You wanted mul/div, You could just as well take a 32K machine] >Sorry, the Z280 (Z8000) instruction set is a superset of the Z80 >instruction set; it's the same sitiuation as using the HD64180 (Z180)! >As far as I know the NS32K never understands any Z80 instruction :-). Sorry again, but the Z280 is neither a Z8000 nor a superset of the Z80 (as far as I know, I have a reference man at home and now I'm at work). The Z80, Z280 and Z8000 are three completely different CPUs. So, if You're going to use an incompatible CPU, take a good one (NS32K)! Josef Moellers paper mail: e-mail: c/o Nixdorf Computer AG USA: uunet!philabs!linus!nixbur!mollers.pad Abt. DX-PC !USA: mcvax!unido!nixpbe!mollers.pad Pontanusstrasse Phone: D-4790 Paderborn (+49) 5251 146245 +-----------------------------------------------------------------------+ | "Many that live deserve death. And some that die deserve life. | | Can You give it to them? Then do not be too eager to deal out | | death in judgement" | | Gandalf to Frodo in "The Fellowship of the Ring"| +-----------------------------------------------------------------------+