[comp.sys.hp] HP9000 performance

terry@venus.sunquest.com (Terry R. Friedrichsen) (01/30/91)

I have been tasked with comparing HP9000s with Digital's VAXen.  While I
have lots of VAXen available to beat on, I've never even SEEN an HP9000.

Some standard benchmark results would help.  I'm particularly interested
in the 800-series machines.

So if someone has some benchmark results (SPEC results would be great),
I'd love to see them.  Also I'm interested in any remarks you might have
about the I/O throughput of these machines.

What is the consensus regarding HP-UX?  Good, bad, ugly?  How's
the TCP/IP implementation?  Does X11R4 run as distributed from MIT?  How's
the C compiler?  Do they really feel as fast as their MIPS ratings?

I'm obviously inviting a wide-open spectrum of responses here, and that's
on purpose.  I promise to read every one of them, and I apologize in advance
if you don't get individual replies.

AdTHANKSvance for any information.

Terry R. Friedrichsen

terry@venus.sunquest.com  (Internet)
uunet!sunquest!terry	  (Usenet)
terry@sds.sdsc.edu        (alternate address; I live in Tucson)

Quote:  "Do, or do not.  There is no 'try'." - Yoda, The Empire Strikes Back

bb@reef.cis.ufl.edu (Brian Bartholomew) (01/31/91)

Key to authorship:

	">"  =  terry@venus.sunquest.com (Terry R. Friedrichsen)
	"&"  =  thoth@reef.cis.ufl.edu (Rob Forsman)
	"$"  =  bb@math.ufl.edu (Brian Bartholomew)


> Some standard benchmark results would help.  I'm particularly interested
> in the 800-series machines.

$ Our site has the most experience with 300 and 400 series, which are
$ Motorola 68K-based machines.  Note that the 200, 300, and 400 series
$ are all binary compatible, which tells me that the HP code is not
$ taking advantage of the opcode improvements of the later 68Ks.  Note
$ also that there are significant differences between the 400 and the
$ 800 operating systems; the two OSes are subtly different, and appear
$ to be generated by two teams in parallel instead of by a
$ recompile (a la Sun 3 and 4).  This leaves landmines for people
$ switching machines to find.  I have grown to think that the 800's were
$ designed to be servers, and the 300's clients.  There are a few key
$ things missing from the 400's, like disk partitioning, specifying an
$ arbitrary drive and partition to autoboot from, and so on.

> So if someone has some benchmark results (SPEC results would be great),
> I'd love to see them.  Also I'm interested in any remarks you might have
> about the I/O throughput of these machines.

> What is the consensus of opinion regarding HP-UX?  Good, bad, ugly?  How's
> the TCP/IP implementation?  Does X11R4 run as distributed from MIT?  How's
> the C compiler?  Do they really feel as fast as their MIPS ratings?

& HP-UX SUX.  Ugly.  Painful SysVR2 (not 3, not 4) variant.

> TCP/IP

& This has a problem with non-blocking output to sockets; this gives at
& least one large program (empire, a game) fits.
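
$ (For reference, non-blocking output on a socket is normally arranged
$ with fcntl() and the writer then handles EWOULDBLOCK itself.  A
$ minimal sketch of that usual pattern follows; it is not the empire
$ code, just an illustration of the kind of I/O the HP-UX
$ implementation reportedly has trouble with.)

	#include <fcntl.h>
	#include <errno.h>
	#include <unistd.h>

	/* Mark fd non-blocking, then push out a buffer, retrying whenever
	 * the kernel reports that the write would block.  Sketch only. */
	int write_nonblocking(int fd, const char *buf, int len)
	{
	    int flags = fcntl(fd, F_GETFL, 0);

	    if (flags < 0 || fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
	        return -1;

	    while (len > 0) {
	        int n = write(fd, buf, len);

	        if (n < 0) {
	            if (errno == EWOULDBLOCK || errno == EAGAIN)
	                continue;    /* would block: retry (or select() first) */
	            return -1;       /* real error */
	        }
	        buf += n;
	        len -= n;
	    }
	    return 0;
	}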

> C

& Screwy, arcane compiler flags to adjust STATIC tables in the various
& compiler passes.  Many programs (perl, X distribution, etc.) will not
& compile without fiddling with these switches.

$ We have played with compiling BSD source (e.g., "ls") under the
$ HP-supplied compiler, and the resulting binaries turn out 30 or 40%
$ bigger than HP's own executables.  I suspect that the supplied compiler is a version of the
$ old AT&T portable C compiler; you might have much better luck with the
$ spiffy (and non-bundled) HP "fancy C compiler" product.  Better yet,
$ I'd suggest that you get GNU's gcc - it's free, and you get full
$ source.  However, it is tough to build with the clunky HP C compiler.

> X11R4 from MIT

& "Cannot perform realloc".  We don't know where this is coming from, we
& are still working on it.  This basically disables all X toolkit
& programs; and is especially painful because (a) HP didn't give us dbx,
& and (b) gdb can't find any symbols, even though we compile it -g.

> MIPS

& Fast.  Especially the HP-supplied X server.  However, it is a bit
& buggy (xtank generates spurious pixels [XCopyArea problem?] and long
& XDrawText requests will cause the fonts to effectively "greek").

$ This bug usually hits me with long lines in xmh.


> Conclusion:

&$ Get Suns!!!


--
"Any sufficiently advanced technology is indistinguishable from a rigged demo."
-------------------------------------------------------------------------------
Brian Bartholomew	UUCP:       ...gatech!uflorida!mathlab.math.ufl.edu!bb
University of Florida	Internet:   bb@math.ufl.edu

hardy@golem.ps.uci.edu (Meinhard E. Mayer (Hardy)) (01/31/91)

I beg to differ:
HP-UX is easy to use; upgrades are easy for nonexperts like me.
X works fine.
Never had trouble with TCP/IP.
All the GNU world works fine.
Get HPs -- particularly the newer ones rumored two postings up!

Hardy Mayer
----****----
Professor Meinhard E. Mayer
Department of Physics
University of California
Irvine, CA, 92717
USA

mjs@hpfcso.HP.COM (Marc Sabatella) (02/01/91)

>    	">"  =  terry@venus.sunquest.com (Terry R. Friedrichsen)
>    	"&"  =  thoth@reef.cis.ufl.edu (Rob Forsman)
>    	"$"  =  bb@math.ufl.edu (Brian Bartholomew)

>    $ Our site has the most experience with 300 and 400 series, which are
>    $ Motorola 68K-based machines.  Note that the 200, 300, and 400 series
>    $ are all binary compatible, which tells me that the HP code is not
>    $ taking advantage of the opcode improvements of the later 68Ks.

This is not true.  They are "upwards compatible" only.  For several releases
now the 300 compilers generate 68020/68030 specific code.  The opcode
improvements on the 68040 are minimal (move16 primarily) so we do not currently
emit 68040-specific code.  However, we have tuned the compilers to generate
code to best take advantage of the 68040 - instruction scheduling based on the
68040 pipeline architecture, avoiding address modes that are unduly expensive
on the 68040, etc.

Bottom line: current (ie, 7.40) compilers generate code to take full advantage
of the 68020/68030 opcodes, and are tuned to emit the best code possible for
the 68040.

    $ also that there are significant differences between the 400 and the
    $ 800 operating systems; the two OSes are subtly different, and appear
    $ to be generated by two teams in parallel instead of by a
    $ recompile (a la Sun 3 and 4).

Starting with 7.0, most of the code is indeed just a recompile, with obvious
exceptions like code generators and some kernel code.

>    & HP-UX SUX.  Ugly.  Painful SysVR2 (not 3, not 4) variant.

A more diplomatic response: it is not BSD, so you may find switching awkward at
first.  If you are more used to System V, you should find it quite natural.

>    > C
>
>    & Screwy, arcane compiler flags to adjust STATIC tables in the various
>    & compiler passes.  Many programs (perl, X distribution, etc.) will not
>    & compile without fiddling with these switches.

These switches are gone.  The tables size themselves automatically now.

>    $ I suspect that the compiler supplied is a version of the
>    $ old AT&T portable C compiler; you might have much better luck with the
>    $ spiffy (and non-bundled) HP "fancy C compiler" product.  Better yet,
>    $ I'd suggest that you get GNU's gcc - it's free, and you get full
>    $ source.

gcc is indeed free, and you are welcome to use it.  Measurements show that the
HP compilers generate better code on most benchmarks we've tried on the 300,
and MUCH better on the 800.  HP's 300 compiler is based on pcc, but has been
worked on extensively - besides fixing bugs, we added ANSI C support, two
distinct optimizers, support for later Motorola processors, etc.  The 800
compiler is not pcc based.

>    > X11R4 from MIT
>
>    & "Cannot perform realloc".  We don't know where this is coming from, we
>    & are still working on it.  This basically disables all X toolkit
>    & programs; and is especially painful because (a) HP didn't give us dbx,
>    & and (b) gdb can't find any symbols, even though we compile it -g.

Try defining "MALLOC_0_RETURNS_NULL" when building X11R4.  MIT assumes
malloc(0) returns the current breakpoint, but not all malloc's behave this way,
so an #ifdef was added.
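
The #ifdef boils down to something like the following sketch (the
Xmalloc/Xrealloc wrapper names follow the MIT convention, but this is an
illustration, not the exact MIT source):

	/* With MALLOC_0_RETURNS_NULL defined, zero-byte requests are bumped
	 * to one byte so callers always get back a usable pointer. */
	#ifdef MALLOC_0_RETURNS_NULL
	#define Xmalloc(size)        malloc(((size) == 0) ? 1 : (size))
	#define Xrealloc(ptr, size)  realloc((ptr), ((size) == 0) ? 1 : (size))
	#else
	#define Xmalloc(size)        malloc((size))
	#define Xrealloc(ptr, size)  realloc((ptr), (size))
	#endif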

You would, of course, have no trouble debugging if you used HP's C compiler and
the supplied debugger xdb.

--------------
Marc Sabatella (marc@hpmonk.fc.hp.com)
Disclaimers:
	2 + 2 = 3, for suitably small values of 2
	Bill and Dave may not always agree with me

bailey@math-cs.kent.edu (Jeff Bailey) (02/01/91)

In article <BB.91Jan31004625@reef.cis.ufl.edu>, bb@reef.cis.ufl.edu
(Brian Bartholomew) writes:
> 
> > TCP/IP
> 
> & This has a problem with non-blocking output to sockets; this gives at
> & least one large program (empire, a game) fits.

Rumored to be fixed in 8.0

> 
> > C
> 
> & Screwy, arcane compiler flags to adjust STATIC tables in the various
> & compiler passes.  Many programs (perl, X distribution, etc.) will not
> & compile without fiddling with these switches.

Again, fixed in 8.0

> > X11R4 from MIT
> 
> & "Cannot perform realloc".  We don't know where this is coming from, we
> & are still working on it.  This basically disables all X toolkit
> & programs; and is especially painful because (a) HP didn't give us dbx,
> & and (b) gdb can't find any symbols, even though we compile it -g.

We've been running X11R4 from MIT since the *DAY* it was released on expo.
I've *never* seen this problem.  You did define MALLOC_0_RETURNS_NULL
like the docs say, didn't you?

> 
> > Conclusion:
> 
> &$ Get Suns!!!

Bull Sh*t!! The four Sun IPCs we have won't even keep running long enough
for them to be useful. We have 4.1.1 on order but 4.1 will *crash* several
times in one evening and has an annoying habit of killing processes at
random. *Any* process is susceptible to this (X, ls, csh,...). They are
totally useless as is. I was thrown clear off the system because of this
at least 15 times last night and had to reboot the machine at least 4 or 5
times, not to mention the two times it simply panicked and died.

---------------------------------------------------------------------
Jeff Bailey (JRB71) (System Administrator)      <bailey@mcs.kent.edu>
Department of Mathematics and Computer Science  |  The first academic
Kent State University                           |  institution with a
Kent - OH 44242                                 |   "WaveTracer DTC"

harry@hpcvlx.cv.hp.com (Harry Phinney) (02/02/91)

Brian Bartholomew writes:

> & Many programs (perl, X distribution, etc.) will not
> & compile without fiddling with these switches.

The distributions from the MIT X Consortium contain configuration files
for each of the supported platforms.  I firmly believe the supplied
configuration files will allow the X distributions to compile unchanged
on the supported releases of HP-UX.  If you know of a specific problem,
we would appreciate hearing about it so that we can avoid it for future
distributions.

> > X11R4 from MIT

> & "Cannot perform realloc".  We don't know where this is coming from, we
> & are still working on it.

This error is caused by linking your programs with libmalloc.a, but not
specifying MALLOC_0_RETURNS_NULL in the StandardDefines list within the
hp.cf file of the X distribution.  You can either define this flag when
you build the library (probably the best solution), or use the malloc in
libc.a.
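
A quick way to see which behavior a given link picks up is to test
malloc(0) directly; this is a throwaway check, not part of the X
distribution.  Build it once plain and once with -lmalloc:

	#include <stdio.h>
	#include <stdlib.h>

	/* Prints whether malloc(0) returns NULL (what linking against
	 * libmalloc.a apparently gives you) or a usable pointer (what the
	 * MIT code assumes unless MALLOC_0_RETURNS_NULL is defined). */
	int main(void)
	{
	    char *p = malloc(0);

	    if (p == NULL)
	        printf("malloc(0) is NULL - define MALLOC_0_RETURNS_NULL\n");
	    else
	        printf("malloc(0) is non-NULL - the MIT default assumption holds\n");
	    return 0;
	}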

> & Fast.  Especially the HP-supplied X server.  However, it is a bit
> & buggy (xtank generates spurious pixels [XCopyArea problem?]

If you could supply more details on your system configuration (i.e., what
display card you are using and what version of the server), it would help
us prevent such bugs in the future.

> & and long
> & XDrawText requests will cause the fonts to effectively "greek").

This is a known bug involving clipping of certain types of text lines,
and has been fixed for future releases.

Harry Phinney   harry@hp-pcd.cv.hp.com

bb@reef.cis.ufl.edu (Brian Bartholomew) (02/05/91)

	">"  =  (marc@hpmonk.fc.hp.com) Marc Sabatella
	"%"  =  terry@venus.sunquest.com (Terry R. Friedrichsen)
	"&"  =  thoth@reef.cis.ufl.edu (Rob Forsman)
	"$"  =  bb@math.ufl.edu (Brian Bartholomew)

-----

$ Our site has the most experience with 300 and 400 series, which are
$ Motorola 68K-based machines.  Note that the 200, 300, and 400 series
$ are all binary compatible, which tells me that the HP code is not
$ taking advantage of the opcode improvements of the later 68Ks.

> This is not true.  They are "upwards compatible" only.  For several
> releases now the 300 compilers generate 68020/68030 specific code.
> The opcode improvements on the 68040 are minimal (move16 primarily) so
> we do not currently emit 68040-specific code.  However, we have tuned
> the compilers to generate code to best take advantage of the 68040 -
> instruction scheduling based on the 68040 pipeline architecture,
> avoiding address modes that are unduly expensive on the 68040, etc.

> Bottom line: current (ie, 7.40) compilers generate code to take full
> advantage of the 68020/68030 opcodes, and are tuned to emit the best
> code possible for the 68040.

It sounds like you are agreeing with my statement.  You aren't using
'040 or '030 opcode enhancements.  You are placing opcodes in a clever
way that cooperates with the '040 cache, and that hopefully doesn't
cost too much on '020s and '030s.  You are "avoiding address modes
that are unduly expensive on the 68040" by virtue of not using any of
the '040 opcode features at all.  Whether this approach is
significantly less efficient than one that eschews binary
compatibility in favor of optimization is a matter for benchmarks.  I
made the comment in the first place because my intuition suggests
there is a significant performance hit from this approach.

The above statement, that all 200, 300, and 400 binaries are
identical, is corroborated by an invocation of "file /bin/*", which
describes all the executables as written for the "s200 ..."
architecture.  I ran this on a 300-series machine, which returns "HP-UX
kzin 7.0 B 9000/375 kzin" for "uname -a".

-----

$ also that there are significant differences between the 400 and the
$ 800 operating systems; the two OSes are subtly different, and appear
$ to be generated by two teams in parallel instead of by a
$ recompile (a la Sun 3 and 4).

> Starting with 7.0, most of the code is indeed just a recompile, with
> obvious exception like code generators and some kernel code.

Then how come things like the disk partitioning work quite
differently?  I have gotten the impression that there are a lot more
differences between the 400s and 800s than I know of.

-----

> A more diplomatic response: it is not BSD, so you may find switching
> awkward at first.  If you are more used to System V, you should find
> it quite natural.

Fair enough.  In our environment, a close compatibility with the
machines that the writers of free software are using is a very big
plus.

-----

> TCP/IP

One major irritant is that the TCP/IP implementation is based upon the
HP Network Services code, NS.  This ties you to a lot of NS
limitations, like 8-character hostnames, and the requirement to run NS
daemons if you want to run TCP/IP.
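
For example, the 8-character limit is easy to trip over without
noticing.  A trivial check (gethostname() here is the usual BSD call;
the "8" is just the NS limit mentioned above):

	#include <stdio.h>
	#include <string.h>
	#include <unistd.h>

	/* Warn if the local hostname exceeds the 8-character NS limit. */
	int main(void)
	{
	    char name[256];

	    if (gethostname(name, sizeof(name)) != 0) {
	        perror("gethostname");
	        return 1;
	    }
	    if (strlen(name) > 8)
	        printf("\"%s\" is longer than 8 characters; NS may not accept it\n",
	               name);
	    else
	        printf("\"%s\" fits within the NS limit\n", name);
	    return 0;
	}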

-----

% X11R4 from MIT

> Try defining "MALLOC_0_RETURNS_NULL" when building X11R4.  MIT assumes
> malloc(0) returns a valid pointer (the current break), but not all mallocs
> behave this way, so an #ifdef was added.

Oops.  Most likely this is strictly our mistake.

-----

> You would, of course, have no trouble debugging if you used HP's C
> compiler and the supplied debugger xdb.

Again, this is more a question of the environment we are used to.  I
haven't used xdb, so I won't speak ill of it.


--
"Any sufficiently advanced technology is indistinguishable from a rigged demo."
-------------------------------------------------------------------------------
Brian Bartholomew	UUCP:       ...gatech!uflorida!mathlab.math.ufl.edu!bb
University of Florida	Internet:   bb@math.ufl.edu

mike@hpfcso.HP.COM (Mike McNelly) (02/06/91)

	@">"  =  (marc@hpmonk.fc.hp.com) Marc Sabatella
	@"%"  =  terry@venus.sunquest.com (Terry R. Friedrichsen)
	@"&"  =  thoth@reef.cis.ufl.edu (Rob Forsman)
	@"$"  =  bb@math.ufl.edu (Brian Bartholomew)
@
@-----
@
@$ Our site has the most experience with 300 and 400 series, which are
@$ Motorola 68K-based machines.  Note that the 200, 300, and 400 series
@$ are all binary compatible, which tells me that the HP code is not
@$ taking advantage of the opcode improvements of the later 68Ks.
@
@> This is not true.  They are "upwards compatible" only.  For several
@> releases now the 300 compilers generate 68020/68030 specific code.
@> The opcode improvements on the 68040 are minimal (move16 primarily) so
@> we do not currently emit 68040-specific code.  However, we have tuned
@> the compilers to generate code to best take advantage of the 68040 -
@> instruction scheduling based on the 68040 pipeline architecture,
@> avoiding address modes that are unduly expensive on the 68040, etc.
@
@> Bottom line: current (ie, 7.40) compilers generate code to take full
@> advantage of the 68020/68030 opcodes, and are tuned to emit the best
@> code possible for the 68040.
@
@It sounds like you are agreeing with my statement.  You aren't using
@'040 or '030 opcode enhancements.  You are placing opcodes in a clever
@way that cooperates with the '040 cache, and that hopefully doesn't
@cost too much on '020s and '030s.  You are "avoiding address modes
@that are unduly expensive on the 68040" by virtue of not using any of
@the '040 opcode features at all.  Whether this approach is
@significantly less efficient than one that eschews binary
@compatibility in favor of optimization is a matter for benchmarks.  I
@made the comment in the first place because my intuition suggests
@there is a significant performance hit from this approach.

Your intuition is wrong.  We've measured performance in as many
different ways as we could and from our tests we've found that
addressing mode balance and pipeline scheduling are far more important
than any new opcodes for overall performance.  Early in the development
cycle for the 68040 products we conducted tests which used the new
opcodes to see if they gave us significant performance improvement.
Nope.  Given that binary compatibility is very desirable across the
Series 300/400 line we decided to put our development efforts instead
into the areas which maintained compatibility and which provided much
greater performance gains on the 68040 architecture.

No compiler I'm aware of uses the full vocabulary of opcodes for the
68040/68030/68020.  While there are specific cases where a particular
opcode can be hand-coded to advantage, far more often the
compiler-generated preparatory code for a whizzy new opcode
slows down the program as a whole.

The Series 300/400 compilers do not generate optimal code.  Nor does any
compiler I've ever seen on any real machine.  We did put a lot of effort
into improving those areas where a payoff was evident.

Mike McNelly
mike@fc.hp.com

steve-t@hpfcso.HP.COM (Steve Taylor) (02/06/91)

//bb@reef.cis.ufl.edu (Brian Bartholomew)//
| The above statement, that all 200, 300, and 400 binaries are identical, is
| corroborated by an invocation of "file /bin/*", which describes all the
| executables as written for the "s200 ..." architecture.
-----

"file" is telling you a little white lie, because it's afraid that if it says
s300 some script of yours will break.  It seems unlikely that many of your 7.0
executables would run on an S200 or a Model 310, due to 68020 opcodes which
wouldn't work on those 68010 machines.
						Regards, Steve Taylor

NOT A STATEMENT, OFFICIAL OR OTHERWISE, OF THE HEWLETT-PACKARD COMPANY.

jewett@hpl-opus.hpl.hp.com (Bob Jewett) (02/06/91)

> The above statement, that all 200, 300, and 400 binaries are
> identical, is corroborated by an invocation of "file /bin/*", which
> describes all the executables as written for the "s200 ..."
> architecture.  I ran this on a 300 series machine, that returns "HP-UX
> kzin 7.0 B 9000/375 kzin" for "uname -a".

This is due to a deficiency in /usr/bin/file and/or /etc/magic.  It
looks like all occurrences of "s200" in /etc/magic ought to be
"s300/s400" now.  In addition, /usr/bin/file seems not to notice that a
file has been compiled with the -ffpa flag and requires an FPA card to
run (on a 9000/350, for example).

Bob
[Not an official, etc.]

bb@dolphin.cis.ufl.edu (Brian Bartholomew) (02/06/91)

In article <7370304@hpfcso.HP.COM> mike@hpfcso.HP.COM (Mike McNelly)
writes:

> Your intuition is wrong.  We've measured performance in as many
> different ways as we could and from our tests we've found that
> addressing mode balance and pipeline scheduling are far more important
> than any new opcodes for overall performance.  Early in the
> development cycle for the 68040 products we conducted tests which used
> the new opcodes to see if they gave us significant performance
> improvement.  Nope.  Given that binary compatibility is very desirable
> across the Series 300/400 line we decided to put our development
> efforts instead into the areas which maintained compatibility and
> which provided much greater performance gains on the 68040
> architecture.

Neat! - and thanks for taking the time to follow this up with me.  I
really appreciate it when a vendor (or their off-duty representative)
takes the time to truly answer my question - and it turns out that the
vendor has done the Right Thing(tm).

--
"Any sufficiently advanced technology is indistinguishable from a rigged demo."
-------------------------------------------------------------------------------
Brian Bartholomew	UUCP:       ...gatech!uflorida!mathlab.math.ufl.edu!bb
University of Florida	Internet:   bb@math.ufl.edu

bb@dolphin.cis.ufl.edu (Brian Bartholomew) (02/06/91)

bb@math.ufl.edu (Brian Bartholomew) writes:

| The ... statement, that all 200, 300, and 400 binaries are
| identical, is corroborated by an invocation of "file /bin/*", which
| describes all the executables as written for the "s200 ..."
| architecture.

steve-t@hpfcso.HP.COM (Steve Taylor) writes:

> "file" is telling you a little white lie, because it's afraid that if
> it says s300 some script of yours will break.  It seems unlikely that
> many of your 7.0 executables would run on an S200 or a Model 310, due
> to 68020 opcodes which wouldn't work on those 68010 machines.

I could see someone thinking of this, as an aid to portability, as
long as there was some good way to distinguish between machines so
that I didn't try to use "7.0 executables" on an "S200 or Model 310".
I expected that the "good way" was with the series of
/bin/hp9000s[2358]00 programs.  However, this is not the result I get:

	1:/users/bb> hp9000s200 ; echo $status
	0
	2:/users/bb> ^200^300
	hp9000s300 ; echo $status 
	0
	3:/users/bb> ^300^500
	hp9000s500 ; echo $status 
	1
	4:/users/bb> ^500^800
	hp9000s800 ; echo $status 
	1

Does this mean that programs are binary-compatible to the point of not
using features of a math coprocessor that weren't present in the
68010-generation version?  Exactly what is going on here?

--
"Any sufficiently advanced technology is indistinguishable from a rigged demo."
-------------------------------------------------------------------------------
Brian Bartholomew	UUCP:       ...gatech!uflorida!mathlab.math.ufl.edu!bb
University of Florida	Internet:   bb@math.ufl.edu

mjs@hpfcso.HP.COM (Marc Sabatella) (02/07/91)

>    I could see someone thinking of this, as an aid to portability, as
>    long as there was some good way to distinguish between machines so
>    that I didn't try to use "7.0 executables" on an "S200 or Model 310".
>    I expected that the "good way" was with the series of
>    /bin/hp9000s[2358]00 programs
>    ...
>    Does this mean that programs are binary-compatible to the point of not
>    using features of a math coprocessor that weren't present in the
>    68010-generation version?  Exactly what is going on here?

The /bin/hp9000* programs are not all that useful for your purposes, as you
have noticed.  Programs are "upward compatible" in that s200 programs should
work on a 300, 68010 programs should work on a 68020/30, etc.  But not
"backwards compatible" - a program compiled for a 68020/30 will not be likely
to run on a 68010, since it may use addressing modes and math functions not
supported on that processor.  As explained below, we do take (almost) full
advantage of the 68040, but since there are no new addressing modes, and only
one useful new instruction, 7.40 and later code is optimized for the 68040,
but will still run on 68020/68030.

The "version" field also reported by "file" on a 200/300/400 can be of some
help here.  "version 0" means compiled before 6.5, but unfortunately there is
no way to tell if it uses any 68020-isms or 68881-isms that won't work on a
68010.

Starting at 6.5, version numbers were assigned meaning: 1 means requires only
a 68010, 2 means requires a 68020/68030, and 3 means, well, that the program
depends on the >= 6.5 floating point save/restore conventions.  Prior to 6.5,
all floating point registers were considered scratch; at 6.5, when we added the
global optimizer, routines were expected to save and restore all but %fp0 and
%fp1.  "version 3" then has meaning for an object file, mainly - it tells the
linker that the code within the module expects all routines it calls to save
and restore floating point registers, which all compiled routines starting at
6.5 do.  But the linker can then warn if you try to mix "version 3" with
"version 0", which don't save and restore any floating point registers.

Starting at 7.0, the compilers no longer generate code that will run on a
68010, since 7.0 itself is not supported on 310's.  Thus everything now is
either version 2 or 3.

What all this means: anything with a non-zero version will not run on a plain
68000.  Anything with version >1 will not run on a 68010.  There is no way to
tell with a program with a version of zero; it must have been compiled before
6.5, so it is at least possible it will run on a 68010.
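
Condensing that into one place (purely a recap of the rules above, not
an HP library routine):

	/* Maps the "version" field that file(1) reports for a Series
	 * 200/300/400 executable to what it implies, per the rules above. */
	const char *version_requirement(int version)
	{
	    switch (version) {
	    case 0:  return "compiled before 6.5; may or may not need a 68020/68881";
	    case 1:  return "requires only a 68010";
	    case 2:  return "requires a 68020 or 68030";
	    case 3:  return "uses the >= 6.5 floating point save/restore conventions";
	    default: return "unknown version";
	    }
	}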

By the way, as Mike McNelly points out, there are almost no worthwhile opcode
improvements from 68000 to 68040; most improvements were addressing modes (like
the indexed modes on the 68020/68030).  For the 68040, those particular
addressing modes are slow.  The 68040 is very RISC-like if you do RISC-like
things, but complex addressing modes and instructions delay the pipeline.  So
it turns out that the best code for the 68040 is often code that will run on a
68010.  The one exception is the new move16 instruction, which is good, but not
so good that we wanted to have to go back to supporting two compile paths (one
for the 68040, one for 68020/68030).  However, several library routines have
been handcoded in assembly language and use the instruction after checking to
see if it is running on an '040.

Regarding the FPA: there is no way to tell if a program requires it.  Good
idea though; I'll submit this as an enhancement request.

--------------
Marc Sabatella
HP Colorado Language Lab (CoLL)
marc@hpmonk