johnl@ima.UUCP (John R. Levine) (07/02/86)
In article <1739@milano.UUCP> janssen@milano.UUCP writes:
>I was talking to some VMS FORTRAN developers at DECUS last year, and
>they were griping a bit about having to redo the code generation for
>the 8600 series VAX. It seems that the microcode was re-written, and
>now the fastest code sequences for something like DO-loops were
>actually in terms of the "high-level" machine instructions, whereas
>before the FORTRAN compiler avoided those instructions because it was
>faster to do the operation in terms of a number of simpler
>instructions.

This sort of thing has been happening forever. Back in 1969, when I was using a 360/91, there was a particularly egregious problem in that the '91 didn't implement the 360's decimal instructions in hardware, and the software simulation was very slow. This meant that some otherwise plausible code from PL/I programs, which used decimal instructions in I/O formatting (or something like that -- it's been a while) ground to a halt.

On the 91, the fastest way to do a block move was, as I recall, with a lot of double precision floating point loads and stores rather than with a block move. And the load multiple instruction, which loads registers from memory, was only faster than a series of regular load instructions if you were loading at least 4 registers. You get the idea.

More recently, the timings among the Intel 8086, 186, and 286 vary quite a lot. For example, on an 8086 it's almost always faster to simulate a multiplication by a constant by a series of shifts and adds, while on the 286 the multiply wins.

I suppose this means that you're better off optimizing for a RISC since you can hope that there's no level of confusing microcode between you and the hardware, but then you start thinking about levels of pipelining and cache architectures (the 91 ran loops much faster if they were less than 64 bytes long and were word aligned) and I get confused again.

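For anyone who hasn't seen the shift-and-add trick mentioned above for the 8086, here is a rough sketch in C. The function names are made up for illustration, and the loop version only shows how a constant decomposes into shifts and adds; a real code generator would presumably emit the unrolled sequence at compile time rather than loop at run time.

```c
#include <stdint.h>

/* Multiply by the constant 10 without a multiply instruction:
 * x * 10 = x * 8 + x * 2 = (x << 3) + (x << 1).
 * Unsigned arithmetic is used so that overflow wraps predictably. */
static uint32_t times10(uint32_t x)
{
    return (x << 3) + (x << 1);
}

/* General form (hypothetical helper): add one shifted copy of x for
 * each set bit of the constant c. For c = 10 (binary 1010) this adds
 * x << 1 and x << 3, the same sequence as times10() above. */
static uint32_t mul_shift_add(uint32_t x, uint32_t c)
{
    uint32_t result = 0;

    while (c != 0) {
        if (c & 1u)
            result += x;   /* this bit of c contributes the current shifted x */
        x <<= 1;           /* next bit of c is worth twice as much */
        c >>= 1;
    }
    return result;
}
```

Both times10(7) and mul_shift_add(7, 10) give 70. Whether a sequence like this beats the hardware multiply depends entirely on the instruction timings of the particular chip, which is the point.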
A while ago, some friends of mine at Yale were thinking about a compiler which would give you both the object code for your program and an optimized machine architecture to run it on. Maybe they had the right idea after all.

--
John R. Levine, Javelin Software Corp., Cambridge MA +1 617 494 1400
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.EDU
The opinions expressed herein are solely those of a 12-year-old hacker
who has broken into my account and not those of any person or organization.