forsyth@minster.york.ac.uk (06/29/90)
during a discussion of self-modifying code and compiling bitblt, i described the technique we use to avoid trouble with the instruction cache on machines like the 68020. this provoked a response from gillies@m.cs.uiuc.edu:

> Horrors, your code is not only machine-dependent, it's cache
> dependent. Who volunteers to port your code? Some naive grad
> student?

    Lord FINCHLEY tried to mend the Electric Light
    Himself. It struck him dead: And serve him right!
    It is the business of the wealthy man
    To give employment to the artisan.
                                        (H Belloc)

i think that those of our research students who actually write code, armed with a copy of Locanthi's article, are competent to work out this small problem. indeed, it was a research student who wrote the implementation in the first place!

i have several serious comments. first, there is nothing to stop us using a portable version of bitblt on a new architecture until we can adapt the more machine-dependent version. now, bear in mind its size:

    $ wc -l *.c
         313 mem_rop.c
         193 templates.c
         506 total

most of the 313 lines do clipping, case analysis, and copying instructions from template sets into the code buffer; only a few lines deal with the cache (or the 68000). by comparison, just the SunOS 3.2 `.h' files are:

    $ wc -l mem_rop*.h
         260 mem_rop_impl_ops.h
         106 mem_rop_impl_util.h
         366 total

(ours is currently monochrome only, but it happens that all parts of the Sun implementation mentioned here do not include things specific to colour machines. Sun's code for mem_rop itself is much larger, of course, owing in part to their need to handle 1 to n-bit and n-bit to 1 pixel conversions. nothing here is meant as criticism of their code.)

Sun's code (i gather) originally contained `portable' constructions such as

    register short x;
    do {
        ...
    } while (--x != -1);

to cause 68k dbra's to be generated.
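A minimal sketch of the count-down idiom quoted above, filled out into a complete function. The name fill_words and its parameters are hypothetical and are not taken from Sun's source; the point is only the loop shape, where the counter starts at count-1 and the loop ends when the decrement reaches -1, the same test a 68k dbra performs. Whether a given compiler actually emits dbra for it is not guaranteed.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Store `value` into `count` consecutive 16-bit words starting at dst.
 * fill_words is a hypothetical name used only for this illustration.
 */
void fill_words(unsigned short *dst, unsigned short value, short count)
{
    register short x;

    assert(count >= 0);
    if (count == 0)
        return;             /* a bare do/while would still run once */

    x = count - 1;          /* dbra-style loops count from n-1 down to -1 */
    do {
        *dst++ = value;
    } while (--x != -1);    /* exits when the counter wraps past zero */
}
```

On a machine without such an instruction the loop still runs correctly; it simply gains nothing from the unusual test.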
with the addition of sparc & 386i they have apparently removed this sort of thing from the body of the code and put it into #defines selected by machine type (eg, in pr_impl_util.h):

    /* loop macros */
    PR_LOOPVP(var, op)
    PR_LOOPV(var, op)
    PR_LOOPP(count, op)
    PR_LOOP(count, op)

these definitions and others include embedded calls to #defines _STMT, IFLINT, PTR_ADD, PTR_INCR, LOOP_DECR, and so on. this is a reasonable approach, but however compact or adaptable this makes the C code, the result is a kind of private language (less clear than Bourne's algol68 for his shell) which must be understood precisely if one is to work out whether the code will indeed port to a new architecture. while it is true that the `register short' code will execute (suboptimally) on other machines, one must still know the trick to realise why the code is so oddly written.

in other words, in either case (compiling or `portable') there is in practice a modest amount of work involved in understanding the code well enough to check that it will work correctly on a new machine.

but if it is portable C (perhaps configured by #define/#ifdef) will it not work immediately on a new machine, without further examination? possibly not. there are many potential pitfalls: the frame buffer and memory might have different layouts (eg, byte orders, interleaving); registers and memory might use different bit orders; and so on. the C code as written might not yet be adaptable enough to cope with some bizarre new (no doubt patented) invention of the hardware developer.

as to the cache dependence: of course! on the other hand, by not using the special SunOS trap, our code is actually more portable: it will work on any 68020, not just Suns! i shall have to recheck it for later 68k machines, and it is possible the current scheme will not work; we do not change machine architectures often enough, much as Sun would like us to, for that checking to be onerous.
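The machine-selected loop macros described earlier in this post might look roughly like the sketch below. EX_LOOP, EX_TARGET_68K and invert_row are hypothetical names invented for illustration; Sun's actual PR_LOOP definitions are not reproduced here and may well differ. The sketch only shows how one loop body can be wrapped in a macro whose expansion depends on the target.

```c
#include <assert.h>

/*
 * EX_LOOP(count, op) evaluates `op` exactly `count` times (count >= 1).
 * All EX_ names are hypothetical; this is not Sun's code.
 */
#ifdef EX_TARGET_68K
/* count-down form shaped for dbra; count must fit in a short */
#define EX_LOOP(count, op) \
    do { register short ex_n = (short)((count) - 1); \
         do { op; } while (--ex_n != -1); } while (0)
#else
/* plain form for any other machine */
#define EX_LOOP(count, op) \
    do { register int ex_n = (count); \
         do { op; } while (--ex_n != 0); } while (0)
#endif

/* Invert `nwords` 16-bit words of one scan line in place (nwords >= 1). */
void invert_row(unsigned short *p, int nwords)
{
    assert(nwords >= 1);
    EX_LOOP(nwords, (*p = (unsigned short)~*p, p++));
}
```

Even in this small form, a reader has to expand the macro mentally to see what the loop in invert_row really does, which is the cost the post is pointing at.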
much more of my time is spent working out what tinkering has been done by the operating system suppliers between releases (working on bitblt is more interesting, i assure you).

the main purpose of our Suns is to provide a good, responsive, b&w bitmapped graphics environment (on 68020s with 4 Mbytes!). bitblt is therefore a critical primitive, and it is worthwhile squeezing as much performance as we can out of it. yet for maintenance we know from experience that the C+cpp+asm approach is more readily understood than a `pure' assembly language version.

to judge from the speed of one of our colour SS1s compared to our colour 3/60, the SPARC might benefit from compiling bitblt too... but then, workstation suppliers want to sell whichever flavour of graphics hardware assist they were supporting last week!

markha@microsoft.UUCP (Mark HAHN) (07/02/90)
there are plenty of applications where it would be great to generate code on the fly. the only thing that really distinguishes bitblt is that users can easily see when it's too slow. but what about all the other tidy abstractions like hash tables?

the Synthesis Kernel seems to be a variant of this cool idea, though it apparently only does it for system-level stuff. they seem to think this kind of 'customization' is something that only an oracle-like metaclass would want to do. their example is a filesystem that provides thunks for your particular FILE*. any of the code-is-data languages should have it easy, since they can just provide a compiler in the runtime.

complaining that it's unportable is pretty weak, since all performance tuning (rather than redesign) is unportable. what's needed is some adequately expressive portable language, which should also probably be terse and easy to optimize. the portability necessary couldn't be achieved in the past, but most major architectures today are just minor variations. (exercise: name a machine (please DON'T post it!) that has other than 8, 16, 32 bit data types on natural boundaries, two's complement math with 32bit addresses and IEEE fp. do not consider historic blemishes such as mainframes or the 286.)

besides compiling the language, the OS needs to provide fixups. doesn't seem much to ask, does it? then, of course, you've got to trust the builtin compiler, and it won't help you if you need/prefer the local weirdnesses...

regards,
--
Mark Hahn    microsoft!markha@uunet.uu.net    uunet!microsoft!markha
YES, Bill Gates IS my personal savior, and I CHANNEL for him in CLEAR WEATHER.