alexande@drivax.UUCP (10/01/87)
Last week I reported on the troubles I was having with Datalight
Optimum-C version 3.08. Three days after I called Datalight about
these problems, I received the latest version of the compiler (3.11).
This is impressive service.
There is good news and bad news about 3.11. First, the compiler bugs
have been pretty nearly eradicated, and Optimum-C can now compile
MicroEMACS without a hitch. The bad news is that the generated
code is quite large, and does not compare favorably with the
other compilers I tested. To its credit, Optimum-C does seem to
generate very good (i.e. fast) code for small benchmarks like
the Sieve. But since I spend most of my time running MicroEMACS,
not benchmarks, I'm not really interested in Sieve performance.
Below is a table of the code sizes for some of the modules
in MicroEMACS compiled with various compilers. In all cases,
the small code/large data model was used, and only 8086 code
was generated (no 80186 code). The Datalight compiler was
tested for space and time optimizations separately,
and with no optimization.
The other compilers used were Lattice C 2.14 (LC), Turbo C 1.0 (TC), and
Metaware High C 1.3 (HC). Note that the Lattice and Metaware compilers
are not the current versions.
module      DLC     DLC     DLC     LC      HC      TC
name        none    time    space                   -Z -O
----        ----    ----    ----    ----    ----    ----
basic.c     2261    2509    2297    2171    1838    1913
buffer.c    2979    3121    3025    3035    2683    2714
display.c   2371    2685    2536    2465    2095    2064
echo.c      2351    2601    2416    2657    2201    2098
extend.c     859     892     879     862     779     805
file.c      3145    3257    3140    3085    2820    2914
kbd.c       1948    2080    2023    2130    1825    1826
line.c      4430    4354    4293    4168    3886    3909
main.c      1804    1859    1803    1903    1718    1699
random.c    2314    2527    2289    2357    2021    1955
region.c    1249    1261    1239    1263    1086    1160
search.c    4169    4191    4142    4436    3789    3819
symbol.c    1271    1345    1320    1233    1111    1170
window.c    2968    3501    2989    2975    2616    2629
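To compare the compilers as a whole rather than module by module, the columns can be totalled. The following is only a sketch: the figures are copied from the table above, while the enum names and the function column_total() are made up for this illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Column order follows the table; these names are hypothetical. */
enum { DLC_NONE, DLC_TIME, DLC_SPACE, LC, HC, TC, NCOLS };
enum { NMODULES = 14 };

/* Code sizes copied from the table above, one row per module. */
static const unsigned sizes[NMODULES][NCOLS] = {
    {2261, 2509, 2297, 2171, 1838, 1913}, /* basic.c   */
    {2979, 3121, 3025, 3035, 2683, 2714}, /* buffer.c  */
    {2371, 2685, 2536, 2465, 2095, 2064}, /* display.c */
    {2351, 2601, 2416, 2657, 2201, 2098}, /* echo.c    */
    { 859,  892,  879,  862,  779,  805}, /* extend.c  */
    {3145, 3257, 3140, 3085, 2820, 2914}, /* file.c    */
    {1948, 2080, 2023, 2130, 1825, 1826}, /* kbd.c     */
    {4430, 4354, 4293, 4168, 3886, 3909}, /* line.c    */
    {1804, 1859, 1803, 1903, 1718, 1699}, /* main.c    */
    {2314, 2527, 2289, 2357, 2021, 1955}, /* random.c  */
    {1249, 1261, 1239, 1263, 1086, 1160}, /* region.c  */
    {4169, 4191, 4142, 4436, 3789, 3819}, /* search.c  */
    {1271, 1345, 1320, 1233, 1111, 1170}, /* symbol.c  */
    {2968, 3501, 2989, 2975, 2616, 2629}  /* window.c  */
};

/* Sum one compiler's column over all fourteen modules. */
unsigned long column_total(int col)
{
    unsigned long total = 0;
    size_t i;

    assert(col >= 0 && col < NCOLS);
    for (i = 0; i < NMODULES; i++)
        total += sizes[i][col];
    return total;
}
```

For these fourteen modules the totals come to 34119 (DLC none), 36183 (DLC time), 34391 (DLC space), 34740 (LC), 30468 (HC) and 30675 (TC).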
These results were quite unexpected. The Datalight optimizer,
especially when optimizing for time, generally produced bigger code
than the other compilers, and usually bigger than Datalight's
own non-optimizing output. I looked more closely at the generated
code to find out why this might be. There were several areas
where Datalight suffered.
First, the optimizer generates extra code to keep registers
filled up with variables, even when those variables are used
only once, or when the register contents are immediately
destroyed.
Secondly, the optimizer generates an entire stack restoration and
return sequence for each "return" statement in a function. This
is faster than a jump to a common piece of code at the end of
the function, but is wasteful of space. It would make more sense
for the optimizer to generate jumps for space optimization, and
inline return sequences for time optimization. Overall,
Datalight appears to favor time over space.
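The two strategies can be pictured at the source level. This is only an illustration with made-up function names, not output from any of the compilers tested: the first function has several "return" statements, each of which a compiler may expand into its own stack restoration and return sequence, while the second sends every path to a single exit, which is the source-level analogue of jumping to common return code.

```c
#include <assert.h>
#include <stddef.h>

/* Several "return" statements: each one may get its own copy
   of the stack restoration and return sequence. */
int classify_inline(const int *p)
{
    if (p == NULL)
        return -1;
    if (*p == 0)
        return 0;
    return 1;
}

/* Single exit: every path jumps to one shared return, trading
   an extra jump for a smaller function body. */
int classify_shared(const int *p)
{
    int result;

    if (p == NULL) {
        result = -1;
        goto done;
    }
    if (*p == 0) {
        result = 0;
        goto done;
    }
    result = 1;
done:
    return result;
}

/* Both versions must agree for any input. */
void check_equivalent(const int *p)
{
    assert(classify_inline(p) == classify_shared(p));
}
```

Whether a given compiler actually emits different code for these two forms depends on its optimizer; a good one may well produce the same output for both.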
Here is a sample piece of code that shows some differences
between Turbo C and Datalight Optimum-C. The assembly output
from the two compilers is shown side-by-side. Note that Turbo
achieves smaller code by using common return code, and by using
the "les" instruction instead of Datalight's bigger sequence of
"mov" instructions.
----------------------------------------------------------------------
struct s {
    struct s *s_next;
    struct p *s_pptr;
    int s_value;
};
struct p {
    struct p *p_next;
    struct s *p_sptr;
    int p_value;
};
struct s *sp;
struct p *pp;
func()
{
    register struct p *lpp;
    if (sp->s_value == 0)
    {
        if ((lpp = sp->s_pptr->p_next) == pp)
            return (0);
        sp->s_pptr = lpp;
    }
}
----------------------------------------+-----------------------------
        tcc -Z -O -mc                   |         dlc -o -md
----------------------------------------+-----------------------------
        public sp,pp,func               |         public _func,_pp,_sp
func:                                   | _func:
        push   BP                       |         push   BP
        push   SI                       |         mov    BP,SP
        sub    SP,4                     |         sub    SP,4
        mov    BP,SP                    |         les    BX,[00h]
        les    BX,[00h]                 |         seg    ES
        seg    ES                       |         cmp    8[BX],0
        cmp    8[BX],0                  |         jne    L44
        jne    L53                      |         seg    ES
        seg    ES                       |         les    BX,4[BX]
        mov    AX,6[BX]                 |         seg    ES
        seg    ES                       |         mov    DX,2[BX]
        mov    BX,4[BX]                 |         seg    ES
        mov    ES,AX                    |         mov    AX,[BX]
        seg    ES                       |         mov    0FFFEh[BP],DX
        mov    AX,2[BX]                 |         mov    0FFFCh[BP],AX
        seg    ES                       |         cmp    DX,[06h]
        mov    BX,[BX]                  |         jne    L32
        mov    2[BP],AX                 |         cmp    AX,[04h]
        mov    0[BP],BX                 |         jne    L32
        mov    CX,[06h]                 |         xor    AX,AX
        mov    SI,[04h]                 |         jmps   L44
        cmp    AX,CX                    | L32:    mov    DX,0FFFEh[BP]
        jne    L37                      |         mov    AX,0FFFCh[BP]
        cmp    BX,SI                    |         les    BX,_func
L37:    jne    L41                      |         seg    ES
        xor    AX,AX                    |         mov    6[BX],DX
        add    SP,4                     |         seg    ES
        pop    SI                       |         mov    4[BX],AX
        pop    BP                       | L44:    mov    SP,BP
        ret                             |         pop    BP
L41:    mov    AX,2[BP]                 |         ret
        mov    BX,0[BP]                 |
        les    SI,[00h]                 |
        seg    ES                       |
        mov    6[SI],AX                 |
        seg    ES                       |
        mov    4[SI],BX                 |
L53:    add    SP,4                     |
        pop    SI                       |
        pop    BP                       |
        ret                             |
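For readers less familiar with the large data model: every data pointer is a 4-byte segment:offset pair, which is why s_pptr sits at offset 4 and s_value at offset 8 in the listing. The "les" instruction loads such a pair in one step, putting the offset in a general register and the segment in ES. The sketch below models that in portable C; the struct and function names are invented for illustration and do not come from either compiler.

```c
#include <assert.h>
#include <stdint.h>

/* A far pointer as stored in memory: offset word first, then segment. */
struct far_ptr {
    uint16_t off;
    uint16_t seg;
};

/* Model of "les reg,mem": read four little-endian bytes, giving the
   offset (low word) and the segment (high word) in a single step. */
struct far_ptr load_far(const uint8_t *mem)
{
    struct far_ptr p;

    assert(mem != 0);
    p.off = (uint16_t)(mem[0] | (mem[1] << 8));
    p.seg = (uint16_t)(mem[2] | (mem[3] << 8));
    return p;
}

/* 8086 physical address: segment * 16 + offset, wrapped to 20 bits. */
uint32_t linear_address(struct far_ptr p)
{
    return ((((uint32_t)p.seg) << 4) + p.off) & 0xFFFFFu;
}
```

Without "les", the same load takes separate moves for the offset and the segment plus a move into ES, which is where the extra code in the longer sequence comes from.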
--
Mark Alexander ...{hplabs,seismo,sun,ihnp4}!amdahl!drivax!alexande
"Bob-ism: the Faith that changes to meet YOUR needs." -- Bob
alexande@drivax.UUCP (Mark Alexander) (10/07/87)
There is an error in the listing that compared assembly output from
Turbo C and Datalight C. The column on the left was generated
by Datalight, while the column on the right was generated by Turbo C.
Sorry for the confusion.
I should also point out that the code size table compared actual code size,
not object file size.
--
Mark Alexander ...{hplabs,seismo,sun,ihnp4}!amdahl!drivax!alexande
"Bob-ism: the Faith that changes to meet YOUR needs." -- Bob