alexande@drivax.UUCP (10/01/87)
Last week I reported on the troubles I was having with Datalight Optimum-C version 3.08. Three days after I called Datalight about these problems, I received the latest version of the compiler (3.11). This is impressive service.

There is good news and bad news about 3.11. First, the compiler bugs have been pretty nearly eradicated, and Optimum-C can now compile MicroEMACS without a hitch. The bad news is that the generated code is quite large, and does not compare favorably with the other compilers I tested. To its credit, Optimum-C does seem to generate very good (i.e. fast) code for small benchmarks like the Sieve. But since I spend most of my time running MicroEMACS, not benchmarks, I'm not really interested in Sieve performance.

Below is a table of the code sizes for some of the modules in MicroEMACS compiled with various compilers. In all cases, the small code/large data model was used, and only 8086 code was generated (no 80186 code). The Datalight compiler was tested for space and time optimizations separately, and with no optimization. The other compilers used were Lattice C 2.14 (LC), Turbo C 1.0 (TC), and Metaware High C 1.3 (HC). Note that the Lattice and Metaware compilers are not the current versions.

module      DLC     DLC     DLC     LC      HC      TC
name        none    time    space                   -Z -O
----        ----    ----    ----    ----    ----    ----
basic.c     2261    2509    2297    2171    1838    1913
buffer.c    2979    3121    3025    3035    2683    2714
display.c   2371    2685    2536    2465    2095    2064
echo.c      2351    2601    2416    2657    2201    2098
extend.c     859     892     879     862     779     805
file.c      3145    3257    3140    3085    2820    2914
kbd.c       1948    2080    2023    2130    1825    1826
line.c      4430    4354    4293    4168    3886    3909
main.c      1804    1859    1803    1903    1718    1699
random.c    2314    2527    2289    2357    2021    1955
region.c    1249    1261    1239    1263    1086    1160
search.c    4169    4191    4142    4436    3789    3819
symbol.c    1271    1345    1320    1233    1111    1170
window.c    2968    3501    2989    2975    2616    2629

These results were quite unexpected.
The Datalight optimizer produced bigger code than any other compiler, including Datalight's own non-optimizing compiler. I looked more closely at the generated code to find out why this might be.

There were several areas where Datalight suffered. First, the optimizer generates extra code to keep registers filled up with variables, even when those variables are used only once, or when the register contents are immediately destroyed. Secondly, the optimizer generates an entire stack restoration and return sequence for each "return" statement in a function. This is faster than a jump to a common piece of code at the end of the function, but is wasteful of space. It would make more sense for the optimizer to generate jumps for space optimization, and inline return sequences for time optimization. Overall, Datalight appears to favor time over space.

Here is a sample piece of code that shows some differences between Turbo C and Datalight Optimum-C. The assembly output from the two compilers is shown side-by-side. Note that Turbo achieves smaller code by using common return code, and by using the "les" instruction instead of Datalight's bigger sequence of "mov" instructions.
----------------------------------------------------------------------
struct s {
    struct s *s_next;
    struct p *s_pptr;
    int s_value;
};

struct p {
    struct p *p_next;
    struct s *p_sptr;
    int p_value;
};

struct s *sp;
struct p *pp;

func()
{
    register struct p *lpp;

    if (sp->s_value == 0) {
        if ((lpp = sp->s_pptr->p_next) == pp)
            return (0);
        sp->s_pptr = lpp;
    }
}
----------------------------------------+-----------------------------
tcc -Z -O -mc                           | dlc -o -md
----------------------------------------+-----------------------------
        public  sp,pp,func              |        public  _func,_pp,_sp
func:                                   |_func:
        push    BP                      |        push    BP
        push    SI                      |        mov     BP,SP
        sub     SP,4                    |        sub     SP,4
        mov     BP,SP                   |        les     BX,[00h]
        les     BX,[00h]                |        seg     ES
        seg     ES                      |        cmp     8[BX],0
        cmp     8[BX],0                 |        jne     L44
        jne     L53                     |        seg     ES
        seg     ES                      |        les     BX,4[BX]
        mov     AX,6[BX]                |        seg     ES
        seg     ES                      |        mov     DX,2[BX]
        mov     BX,4[BX]                |        seg     ES
        mov     ES,AX                   |        mov     AX,[BX]
        seg     ES                      |        mov     0FFFEh[BP],DX
        mov     AX,2[BX]                |        mov     0FFFCh[BP],AX
        seg     ES                      |        cmp     DX,[06h]
        mov     BX,[BX]                 |        jne     L32
        mov     2[BP],AX                |        cmp     AX,[04h]
        mov     0[BP],BX                |        jne     L32
        mov     CX,[06h]                |        xor     AX,AX
        mov     SI,[04h]                |        jmps    L44
        cmp     AX,CX                   |L32:    mov     DX,0FFFEh[BP]
        jne     L37                     |        mov     AX,0FFFCh[BP]
        cmp     BX,SI                   |        les     BX,_func
L37:    jne     L41                     |        seg     ES
        xor     AX,AX                   |        mov     6[BX],DX
        add     SP,4                    |        seg     ES
        pop     SI                      |        mov     4[BX],AX
        pop     BP                      |L44:    mov     SP,BP
        ret                             |        pop     BP
L41:    mov     AX,2[BP]                |        ret
        mov     BX,0[BP]                |
        les     SI,[00h]                |
        seg     ES                      |
        mov     6[SI],AX                |
        seg     ES                      |
        mov     4[SI],BX                |
L53:    add     SP,4                    |
        pop     SI                      |
        pop     BP                      |
        ret                             |
--
Mark Alexander ...{hplabs,seismo,sun,ihnp4}!amdahl!drivax!alexande
"Bob-ism: the Faith that changes to meet YOUR needs." -- Bob
alexande@drivax.UUCP (Mark Alexander) (10/07/87)
There is an error in the listing that compared assembly output from Turbo C and Datalight C. The column on the left was generated by Datalight, while the column on the right was generated by Turbo C. Sorry for the confusion.

I should also point out that the code size table compared actual code size, not object file size.
--
Mark Alexander ...{hplabs,seismo,sun,ihnp4}!amdahl!drivax!alexande
"Bob-ism: the Faith that changes to meet YOUR needs." -- Bob