[net.works] assembly vs. HOL

@RUTGERS.ARPA:BRYAN@SU-SIERRA.ARPA (04/16/85)

From: Doug Bryan <BRYAN@SU-SIERRA.ARPA>


The recent traffic concerning assembly vs. a HOL illustrates one of the
major causes of the software crisis as we know it today (in a phrase, the
software crisis is: the cost of hardware is steadily decreasing while
the cost of software is steadily increasing).  Namely, the problem is
that many competent programmers still consider code size and execution
speed to be the major qualities of "good code".

Granted, there are still a number of solution domains where size and
speed are very important, but these are becoming fewer and fewer by the
day thanks to our friends in EE (256k and 1m memory chips, MC68020 size
micros...).  The majority of the cost of software lies in maintenance.
Thus, in the industry as a whole, readability and understandability become
the major qualities of "good code".  If you cannot maintain it, it is 
going to cost a great deal.  Laura Creighton made a similar point in her
recent message.

I have no doubt that Doug Pardee is an excellent assembly programmer but
we need to remember that he is of an endangered species.  The vast
majority of working programmers today don't even fully understand
concepts like pointer values and parallel processing.  How many other
programmers, even in your own shop Doug, can read and understand your
tight, hand-optimized code?  

Programs written in a HOL are more readable and understandable to the
masses; therefore they cost less; therefore they are better programs.

I hope I haven't overreacted to an interesting discussion on code
generation optimization.

doug bryan
-------

doug@terak.UUCP (Doug Pardee) (04/18/85)

> I have no doubt that Doug Pardee is an excellent assembly programmer but
> we need to remember that he is of an endangered species.  The vast
> majority of working programmers today don't even fully understand
> concepts like pointer values and parallel processing.

Endangered?  Gee, I hope not.  Is somebody coming up behind me with a
shotgun?   :-)

I certainly agree that most current programmers are indeed unfamiliar
with such concepts.  The main reason is that our universities believe
that such concepts are worthless and don't teach them.  And since so
few people understand the concepts, they must not be valuable, right?

Oh, I don't believe that "most" or even "many" programs should be coded
in assembler.  Which is mighty fortunate, given the attitude of the
universities.  But there still are, and always will be, programs which
should be assembly coded.  Since there aren't many, there don't need to
be many assembler programmers.  And that works out, too.
-- 
Doug Pardee -- Terak Corp. -- !{hao,ihnp4,decvax}!noao!terak!doug

doug@terak.UUCP (Doug Pardee) (04/18/85)

[the trap snaps shut...]

> software crisis is: the cost of hardware is steadily decreasing while
> the cost of software is steadily increasing).
> 
> Granted there are still a number of solution domains where size and
> speed are very important but these are becoming fewer and fewer by the
> day thanks to our friends in EE (256k and 1m memory chips, MC68020 size
> micros...).

I was wondering when someone was going to point out how today's faster
CPUs make up for the overhead of using an HLL.

I took that ol' Ackermann function and ran some benchmarks.  I compiled
it using VAX/UNIX 4.2BSD "cc -O" and ran it on our VAX 11/750.  To
compute "acker(3,6)" ten times required 62.2 seconds.  I then rewrote
it into Z-80 assembler.  The same computation takes 18.8 seconds on a
4 MHz Z-80A.  (Yes, it got the same answer: acker(3,6)=509).

So, moving up from a Z-80A to a VAX 11/750, the program only takes 3.3
times as long when coded in C.  Or looked at another way, the overhead
of using C code can bring a VAX down below the performance level of a
Timex/Sinclair 1000.
-- 
Doug Pardee -- Terak Corp. -- !{hao,ihnp4,decvax}!noao!terak!doug
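
For anyone who wants to try the benchmark described in the post above,
here is a minimal sketch of the recursive Ackermann function with a
timing loop.  It is written in Python, not the C or Z-80 assembler used
in the thread.  The name time_acker and the repeat-count parameter are
assumptions made for illustration; only the call acker(3,6), its result
509, and the "ten times" loop come from the post.  Timings from this
sketch say nothing about the 1985 figures.

```python
import sys
import time

# The recursion for acker(3, 6) goes several hundred frames deep,
# so raise Python's default limit to leave some headroom.
sys.setrecursionlimit(5000)


def acker(m, n):
    """Two-argument Ackermann function, written recursively."""
    if m == 0:
        return n + 1
    if n == 0:
        return acker(m - 1, 1)
    return acker(m - 1, acker(m, n - 1))


def time_acker(m, n, repeats=10):
    """Hypothetical helper: run acker(m, n) `repeats` times.

    Returns (elapsed_seconds, last_result).
    """
    start = time.perf_counter()
    result = None
    for _ in range(repeats):
        result = acker(m, n)
    return time.perf_counter() - start, result


if __name__ == "__main__":
    elapsed, result = time_acker(3, 6)
    print(f"acker(3,6) = {result}, ten runs took {elapsed:.2f} s")
```

Nearly all of the work here is function calls and returns, which is
worth keeping in mind when reading the numbers.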

@RUTGERS.ARPA:TLI@USC-ECLB.ARPA (04/23/85)

From: Tony Li <Tli@Usc-Eclb>

    From: terak!doug at topaz.arpa (Doug Pardee)

    I was wondering when someone was going to point out how today's
    faster CPUs make up for the overhead of using an HLL.

    I took that ol' Ackermann function and ran some benchmarks.  I
    compiled it using VAX/UNIX 4.2BSD "cc -O" and ran it on our VAX
    11/750.  To compute "acker(3,6)" ten times required 62.2 seconds.  I
    then rewrote it into Z-80 assembler.  The same computation takes 18.8
    seconds on a 4 MHz Z-80A.  (Yes, it got the same answer:
    acker(3,6)=509).

    So, moving up from a Z-80A to a VAX 11/750, the program only takes
    3.3 times as long when coded in C.  Or looked at another way, the
    overhead of using C code can bring a VAX down below the performance
    level of a Timex/Sinclair 1000.

Or, drawing another conclusion, "cc -O" under 4.2 is only a poor to
mediocre compiler.

Cheers, 
Tony ;-)

mat@amdahl.UUCP (Mike Taylor) (04/24/85)

> 
> So, moving up from a Z-80A to a VAX 11/750, the program only takes 3.3
> times as long when coded in C.  Or looked at another way, the overhead
> of using C code can bring a VAX down below the performance level of a
> Timex/Sinclair 1000.
> -- 
> Doug Pardee -- Terak Corp. -- !{hao,ihnp4,decvax}!noao!terak!doug

This, of course, presumes that the VAX is basically faster than a Z-80A
for short integer arithmetic.  Why not compare apples to apples?
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]

doug@terak.UUCP (Doug Pardee) (04/25/85)

Are my eyes deceiving me?  Is someone really claiming that the reason
that computing the Ackermann function in C on a VAX 11/750 takes 3.3
times as long as doing it in assembler on a Z-80A is because a VAX is
just plain slower than a Timex/Sinclair??
 
> This, of course, presumes that the VAX is basically faster than a Z-80A
> for short integer arithmetic.  Why not compare apples to apples ?

Other comments I've gotten by mail indicate that some folks also
question the VAX's "call" & "return" speed compared with a TS-1000.

If a VAX can't do integer arithmetic nearly as fast as a TS-1000, and
a VAX can't do branching nearly as fast as a TS-1000, then why should
one buy a VAX for 1000 times the price of a TS-1000??  (Other than
because you can't get TS-1000's any more  :-)
-- 
Doug Pardee -- Terak Corp. -- !{hao,ihnp4,decvax}!noao!terak!doug

jdb@mordor.UUCP (John Bruner) (04/26/85)

Ackermann's function is very recursive, and the VAX calling sequence
is VERY slow.  The execution time is dominated by the time it takes
to call the recursive function.  Thus, this 750/Z80 comparison is
really comparing one aspect of the architectures of the two machines,
not the difference between assembly and C.  (Does someone want to
volunteer to rewrite Ackermann in MACRO-32?)

By recoding Ackermann as a non-recursive function (in C), I was able
to make my 11/750 run faster than a 4MHz Z80 :-).  My non-recursive
version computes ack(3,6) ten times in 15.4 seconds (user time). 

I am wary of using one program as a benchmark.  For instance, the
VAX-11/780 is faster (overall) than the PDP-11/70; however, my
non-recursive Ackermann takes 7.1 seconds (user) on an 11/780 but
only 6.5 seconds on an 11/70.
-- 
  John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
  MILNET: jdb@mordor.ARPA [jdb@s1-c]	(415) 422-0758
  UUCP: ...!ucbvax!dual!mordor!jdb 	...!decvax!decwrl!mordor!jdb
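
The non-recursive recoding mentioned in the post above is not shown in
the thread, so the following is only one plausible way to do it, again
sketched in Python, not C.  It replaces the call stack with an explicit
list of pending m values.  The name ack_iterative is an assumption, and
this is not claimed to be the version that produced the 15.4 second
figure.

```python
def ack_iterative(m, n):
    """Ackermann function using an explicit stack instead of recursion.

    The stack holds the m values of calls still waiting to run;
    n always carries the most recently computed inner result.
    """
    stack = [m]
    while stack:
        m = stack.pop()
        if m == 0:
            # A(0, n) = n + 1
            n += 1
        elif n == 0:
            # A(m, 0) = A(m - 1, 1)
            stack.append(m - 1)
            n = 1
        else:
            # A(m, n) = A(m - 1, A(m, n - 1)):
            # queue the outer call, then evaluate the inner one first.
            stack.append(m - 1)
            stack.append(m)
            n -= 1
    return n


if __name__ == "__main__":
    print(ack_iterative(3, 6))
```

The arithmetic is unchanged; only the subroutine call and return
overhead is gone, and that overhead is what the post identifies as
dominating the recursive version.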

sid@linus.UUCP (Sid Stuart) (04/30/85)

>Other comments I've gotten by mail indicate that some folks also
>question the VAX's "call" & "return" speed compared with a TS-1000.

	Doug, I have seen benchmarks that show a Motorola 68010 running
at 10 megahertz doing Ackermann's function at about twice the speed of
a 780.  Programs on both machines were written in C.  I would still prefer
to have a 780 over a 68010 or even a Z80 if all other things (cost) were equal.
I do not believe Ackermann's function to be a reasonable benchmark for testing
anything other than the timing of a jump to a subroutine.

						sid stuart

mat@amdahl.UUCP (Mike Taylor) (05/01/85)

> Are my eyes deceiving me?  Is someone really claiming that the reason
> that computing the Ackerman function in C on a VAX 11/750 takes 3.3
> times as long as doing it in assembler on a Z-80A is because a VAX is
> just plain slower than a Timex/Sinclair??
>  
> > This, of course, presumes that the VAX is basically faster than a Z-80A
> > for short integer arithmetic.  Why not compare apples to apples ?
> 
> Other comments I've gotten by mail indicate that some folks also
> question the VAX's "call" & "return" speed compared with a TS-1000.
> 
> If a VAX can't do integer arithmetic nearly as fast as a TS-1000, and
> a VAX can't do branching nearly as fast as a TS-1000, then why should
> one buy a VAX for 1000 times the price of a TS-1000??  (Other than
> because you can't get TS-1000's any more  :-)
> -- 
> Doug Pardee -- Terak Corp. -- !{hao,ihnp4,decvax}!noao!terak!doug

A VAX11/780, a "big brother" of the VAX11/750, has a cycle time of
200 ns.  That is, a 5 MHz clock in microprocessor terms.  Simple
things, like for example adding two registers, probably take one cycle
on both the Z80A and the VAX.  Therefore, for at least some trivial
applications, the Z80A and the VAX are comparable. A VAX11/750 is
slower than a VAX11/780 (I believe).  If this is due to, for example, a
reduced clock rate, then it is conceivable that some operations are
faster on the Z80A than on the VAX. I merely point out that the
performance of the two machines depends on the application, and
this uncertainty weakens the argument comparing the languages.
 
An apples-to-apples comparison would be more convincing. As to
why one should buy a VAX, I'm sure I don't know. I've never bought
one, myself.
-- 
Mike Taylor                        ...!{ihnp4,hplabs,amd,sun}!amdahl!mat

[ This may not reflect my opinion, let alone anyone else's.  ]

mann@LaBrea.ARPA (05/01/85)

> Simple
> things, like for example adding two registers, probably take one cycle
> on both the Z80A and the VAX.

Sorry.  Adding two (8-bit) registers takes 4 cycles on the Z-80A.  In fact,
none of the Z-80 instructions takes less than 4 cycles -- not even a NOP.
This is why a 1 MHz 6502 can be comparable in speed to a 4 MHz Z-80.

	--Tim
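
The clock arithmetic traded in the last few posts can be checked with a
few lines of Python.  This is only a sketch.  The inputs are the figures
quoted in the thread (200 ns cycle for the 11/780, 4 MHz Z-80A, 4 cycles
minimum per Z-80 instruction, 62.2 s versus 18.8 s), and the helper and
constant names are assumptions made for illustration.

```python
def cycle_time_ns(clock_hz):
    """Length of one clock cycle in nanoseconds."""
    return 1e9 / clock_hz


def instruction_time_ns(clock_hz, cycles):
    """Time for an instruction that takes `cycles` clock cycles."""
    return cycles * cycle_time_ns(clock_hz)


# Figures quoted in the thread.
Z80A_CLOCK_HZ = 4_000_000    # 4 MHz Z-80A
Z80_MIN_CYCLES = 4           # no Z-80 instruction takes fewer than 4 cycles
VAX_780_CYCLE_NS = 200.0     # VAX 11/780 cycle time

# Fastest possible Z-80A instruction: 4 cycles of 250 ns each.
z80_fastest_ns = instruction_time_ns(Z80A_CLOCK_HZ, Z80_MIN_CYCLES)

# A 200 ns cycle expressed as a clock rate in MHz.
vax_780_clock_mhz = 1e3 / VAX_780_CYCLE_NS

# VAX/C run time over Z-80A/assembler run time from the benchmark post.
slowdown = 62.2 / 18.8

if __name__ == "__main__":
    print(f"fastest Z-80A instruction: {z80_fastest_ns:.0f} ns")
    print(f"VAX 11/780 clock: {vax_780_clock_mhz:.0f} MHz")
    print(f"VAX/C vs Z-80A/assembler: {slowdown:.1f}x")
```

On these figures the shortest Z-80A instruction takes five 11/780
cycles, so the "one cycle on both" estimate does not hold for the Z-80A.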