rar@unix.cis.pitt.edu (Richard A Rubin) (07/31/90)
In article <37512@shemp.CS.UCLA.EDU> wales@CS.UCLA.EDU (Rich Wales) writes:
>Of the above, I'm familiar only with "The Zen of Assembly Language". I
>recommend this book highly. It's especially good at helping one
>understand the design of the PC and the impact that design has on the
>way a program works.

Could you please post the author, publisher, etc. for this book?

Rich Rubin
University of Pittsburgh
Dept. of Radiology
wales@valeria.cs.ucla.edu (Rich Wales) (07/31/90)
In article <37512@shemp.CS.UCLA.EDU> I wrote:

>Of the above, I'm familiar only with "The Zen of Assembly Language". I
>recommend this book highly. It's especially good at helping one
>understand the design of the PC and the impact that design has on the
>way a program works.

In article <26363@unix.cis.pitt.edu> rar@unix.cis.pitt.edu (Richard A Rubin) replied:

>Could you please post the author, publisher, etc. for this book?

OK, here it is.

    Abrash, Michael. Zen of Assembly Language: Volume I, Knowledge.
    Scott, Foresman & Co., 1990. ISBN 0-673-38602-3.
    List price: US $29.95

--
Rich Wales <wales@CS.UCLA.EDU> // UCLA Computer Science Department
3531 Boelter Hall // Los Angeles, CA 90024-1596 // +1 (213) 825-5683
"Indeed! Twenty-four is the gateway to heroic salvation."
dmt@pegasus.ATT.COM (Dave Tutelman) (08/02/90)
In article <1068@ashton.UUCP> tomr@ashton.UUCP (Tom Rombouts) writes:
>I am a roughly intermediate C coder...
>I now seek references on 8086 assembly language that are more
>than just teaching of opcodes.

I agree with all of Rich Wales' well-written response. My response
below is similar in substance, but maybe with slightly different
emphasis.

>What are the main principles in optimizing .ASM code? (Down to the
>clock ticks level)

Assuming you NEED to optimize clock ticks (see below for that decision),
you will need a manual for the CPU you're using, with a cycle count for
each instruction. However, there are a few things that make it more
complicated than just adding up the cycles burned:

  - The various 80x8x chips have DIFFERENT instruction sets. If you
    really need to optimize, you may need different code for the 8086,
    80286, and 80386. (I certainly have examples of 8086 and 80286
    differences in optimal code for a routine.)

  - The different chips also have a different number of cycles for the
    same instruction. Example of this effect: one of my fast assembly
    programs had to multiply an integer by a constant integer. The
    80286 and 80386 do integer multiplication pretty fast, but the 8086
    did it faster as a string of shift and add instructions.

  - Which brings us to... These CPUs have a prefetch queue. A string of
    very fast (low-cycle-count) instructions won't run as fast as a
    sum-of-cycles calculation would lead you to believe, since the
    instructions and data have to be fetched from memory.

The whole thing is sort of an intuitive tradeoff. One of the best rules
I use for intuition is to optimize for the SLOWEST CPU I expect the
program to run on, since the faster ones will be "pretty quick" anyway.
(E.g., if you can meet your performance requirement on a 4.7 MHz 8088,
the same code is "fast enough" on an 8 MHz 286.)

>And the most important (IMHO) questions of all:
>WHEN is it practical to go to assembly rather than a more portable
>language such as C?
>How can assembly be developed to make it
>re-usable and understandable by others?

I agree that's the most important one. (Or two, actually.) First the
"when", then the "how":

WHEN:

  - You need the speed. This is one everyone admits, but it's too often
    abused. I prefer to code the WHOLE program in C, then measure it to
    see where my performance problems (if any) are. If there are
    performance problems in an area that ASM could help, I recode THAT
    portion. A few words of experience here:
      . This approach yields far less assembly coding than the approach
        of the "hackers" who start off with assumptions about speed and
        resort to assembler immediately.
      . Most performance problems found this way yield more to
        algorithm improvement than to assembly code.
      . Assembly sections added this way are usually very small and
        localized. If yours aren't, there is probably something wrong
        with your software architecture in the first place. Revisit
        the decomposition of your program.

  - You need the space. This one is little-recognized, but important.
    Some examples where this yields big gains (Freudian slip - I
    originally typed that as "bug gains"):
      . You need a small stand-alone program (.EXE or .COM). If you
        code in C, you'll get several kilobytes of essential library
        and startup code. Assembler can cut this to a small fraction
        of that if the function to be performed is simple. (E.g.,
        "Hello world" in C will probably result in a 5K-10K .EXE; in
        assembler you can probably code it in about 100 bytes.) (My
        favorite example is a little .COM file that turns the NumLock
        key off. If I recall correctly, it's under 20 bytes; try doing
        that in C.)
      . You need a specialized version of a "library" routine that is
        much more general (and thus much bigger) in the standard
        library. However, you can frequently gain much of the
        advantage by just recoding the routine in C.
    Of course, these gains only matter if program size is important.
    A few cases where size does matter include: (1) TSR programs that
    take up precious RAM all the time, and (2) squeezing a bunch of
    files onto the minimum number of floppies for distribution.

  - There's no other way. This is decreasingly a valid excuse. When I
    started PC hacking, you had to do a lot yourself that's now part of
    the major (MSC, Turbo C) commercial C products. For instance:
      . Interrupt handlers.
      . BIOS calls.
      . Efficient jump tables (e.g., for BIOS-like function
        dispatchers).
      . Interrupt-vector manipulation.
    I did all of these in assembler years ago. I'd use the C
    facilities today.

  - When dealing with hardware that has timing or register dependencies
    (generally special I/O). My favorite example here is a speed
    optimization of an existing program, where the bottleneck was
    screen I/O. I decided that, for the slower processors, I'd have to
    write directly to display memory (a decision frequently made by
    application developers). It was easy enough to code this in C, but
    I wound up with snow on the screen. "De-snowing" the screen made
    the timing so critical that I had to write the code in assembler.
    (I posted a tutorial on the subject a couple of years ago; it's on
    several bulletin boards, probably under the name "snowfree", or
    something like that.)

HOW:

  - Identify whole functions or families of functions that can be
    recoded in assembler. If your code has an "object" flavor, then
    the methods associated with a class of objects become a candidate
    for recoding. If you can't break it apart in this way, consider
    restructuring before recoding.

  - Recode the LOWEST possible levels of the function chain. (I think
    Rich mentioned this, too.) This minimizes the work needed to port
    the program. (The older I get, the more I consider portability
    early.)

  - Remember those programming "standards" that you routinely ignore in
    your C coding? Well, follow them RELIGIOUSLY in assembler; they're
    even more important here. Especially:
      . Break up the code into small, modular functions.
        Be more ruthless about this than even the best C code, since
        some C "one-liners" may be a half page of ASM.
      . Include a FULL preamble for EVERY function. Include input
        (arguments, register state at call time, etc.), output, what's
        done, which registers aren't preserved, etc.
      . All the rules Rich noted about commenting lines.
      . Choose meaningful names for variables, functions, and even
        labels (which you usually don't worry about in C).

I'd like to nod some praise in the direction of a former colleague,
Eric Bauer. We shared an all-assembler project (a multi-tasking
MS-DOS), and had to read one another's code to get anything done.
Eric's adherence to good coding practice made it easier to follow his
code than even most C code (for instance, the C programs I currently
maintain). So it IS possible to write non-obscure assembler, even if
you're doing something inherently obscure.

>(I personally think assembly language is becoming a lost art, and
>that it can produce significant payoffs in time critical applications
>such as database processing or graphics display.)

Graphics display - ASM absolutely essential.

Database processing - I'd guess that algorithm and structure
optimization will yield bigger payoffs. Maybe, just maybe, ASM would
pay off in the disk handling (on the assumption that you believe your
DBMS wants to bypass the OS's file system, a questionable conclusion at
best)....

>Tom Rombouts Torrance Techie tomr@ashtate.A-T.com V:(213) 538-7108

Well, this has been fun for me. Hope it's helpful to you.

Dave

+---------------------------------------------------------------+
|    Dave Tutelman                                              |
|    Physical - AT&T Bell Labs  -  Lincroft, NJ                 |
|    Logical  - ...att!pegasus!dmt  ==  dmt@pegasus.att.com     |
|    Audible  - (201) 576 2194                                  |
+---------------------------------------------------------------+
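
Two points from the post above, the multiply-by-a-constant example and
the "FULL preamble for EVERY function" rule, can be sketched together in
C. This is only an illustration under stated assumptions: the post does
not say which constant was involved, so 10 is assumed here, and the
names times10_shift_add and times10_self_check are hypothetical. In
assembler the same idea would be a couple of shifts and an add; whether
that actually beats a multiply instruction has to be checked against
the cycle counts of the slowest CPU you target.

```c
#include <assert.h>

/*
 * times10_shift_add   (hypothetical name; the constant 10 is an
 *                      assumption, not taken from the original post)
 *
 * Input:    value - unsigned integer to be scaled
 * Output:   value * 10, reduced modulo the width of unsigned int
 * Method:   10 = 8 + 2, so value * 10 = (value << 3) + (value << 1);
 *           no multiply operator is used
 * Effects:  none; touches no globals and performs no I/O
 */
unsigned int times10_shift_add(unsigned int value)
{
    unsigned int times_eight = value << 3;   /* value * 8 */
    unsigned int times_two   = value << 1;   /* value * 2 */

    return times_eight + times_two;          /* value * 10 */
}

/*
 * times10_self_check   (hypothetical helper)
 *
 * Input:    none
 * Output:   none; aborts via assert() on the first mismatch
 * Method:   compares the shift-and-add result with an ordinary
 *           multiplication for every value from 0 through 9999
 */
void times10_self_check(void)
{
    unsigned int value;

    for (value = 0; value < 10000u; value++) {
        assert(times10_shift_add(value) == value * 10u);
    }
}
```

Keeping the routine this small, with its preamble and a self-check next
to it, follows the "measure first, recode only the lowest level" advice:
if the shift-and-add version later moves into assembler, the C version
stays behind as the portable reference to test against.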