[comp.arch] Standard Un*x H/W architecture

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (07/15/88)

In article <607@riddle.UUCP> domo@riddle.UUCP (Dominic Dunlop) writes:

>Me, I'd love to see MacSPARC, as it would add more momentum to the
>bandwagon that's promoting SPARC as a standard UN*X hardware architecture.
>If that bandwagon doesn't roll, we'll have another five years of having to
>accommodate multiple architectures for no good reason, and, boy, am I
>tired of doing that after the last ten years.  But I fear it won't happen.

A standard Un*x h/w architecture is a pipe dream.  There is still far too
much controversy, new technology, and progress being made.  It is also
unnecessary - good portable code runs on most Un*x systems already with
few changes required - one of the reasons for Un*x existing in the first
place.  The ABI standards are supposed to at least standardize Un*x 
within H/W families.  Personally, I would be satisfied if all vendors
agreed on a standard data representation - specifically, IEEE floating
point 32 bit and 64 bit, ASCII, and two's comp. integers, so that DATA
files could be moved between machines, a requirement in our environment.
Standard executables on all machines may have to wait another 20 years
at least.
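[A latter-day illustration, in Python, which of course postdates this thread: the interchange convention LaMaster asks for (IEEE 754 floats, two's-complement integers, one fixed byte order) is what the standard struct module's big-endian format codes provide.  The record layout here is invented for illustration.]

```python
import struct

# A record in the proposed interchange format: a two's-complement
# 32-bit integer plus an IEEE 754 64-bit float, both written MSB-first
# ('>') so every machine lays out the same 12 bytes the same way.
record = struct.pack('>id', -42, 3.141592653589793)

# A reader on any architecture recovers identical values; only the
# pack/unpack step knows the host's native representation.
n, x = struct.unpack('>id', record)
```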

-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

linimon@killer.UUCP (Mark Linimon) (07/15/88)

In article <11783@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:

> The ABI standards are supposed to at least standardize Un*x 
> within H/W families.  Personally, I would be satisfied if all vendors
> agreed on a standard data representation - specifically, IEEE floating
> point 32 bit and 64 bit, ASCII, and two's comp. integers, so that DATA
> files could be moved between machines, a requirement in our environment.

I agree with your statements that a standard data format is a Good Thing,
and that binary compatibility between machines with the same architecture
is a Good Thing.  Everyone of course agrees that source-code compatibility
between machines of disparate architectures is a Good Thing.

> Standard executables on all machines may have to wait another 20 years
> at least.

My disagreement is with your (implicit - I realize that I'm running a danger
by putting words in your mouth) assumption that a standard architecture would
ever necessarily be a Good Thing per se.  One, I think that this would tend
to stifle architectural experimentation - e.g. the advent of RISC, perhaps
even more so the "exotic" things like Multiflow.  Two, I think that different
architectures may well perform better for different applications: even Un*x
systems are used for such disparate things as general software development,
heavy number crunching, and database applications.  I doubt that there is
one "best" underlying architecture that serves equally well for all of these,
even given that they are running the same operating system.  Consider in
support of this point the number of systems that Convex, Teradata, et al.,
sell.

Mark Linimon
Mizar Digital Systems
uucp: sun!texsun!mizarvme!linimon

dave@sdeggo.UUCP (David L. Smith) (07/15/88)

In article <11783@ames.arc.nasa.gov>, lamaster@ames.arc.nasa.gov (Hugh LaMaster) writes:
> The ABI standards are supposed to at least standardize Un*x 
> within H/W families.  

How is an ABI supposed to cope with differences in semantics across different
Unixes?  For example, a version that has enforced locking for lockf and
one that does not?  Sounds as if an ABI is only good for reasonably identical
implementations.
-- 
David L. Smith
{sdcsvax!jack,ihnp4!jack, hp-sdd!crash, pyramid, uport}!sdeggo!dave
sdeggo!dave@amos.ling.edu 
Sinners can repent but stupid is forever.

alanf%smile@Sun.COM (Alan Fargusson) (07/16/88)

In article <223@sdeggo.UUCP>, dave@sdeggo.UUCP (David L. Smith) writes:
> How is an ABI supposed to cope with differences in semantics across different
> Unixes?  For example, a version that has enforced locking for lockf and
> one that does not?  Sounds as if an ABI is only good for reasonably identical
> implementations.

Right!  I want to have only one version of the UNIX system.  Isn't that what
everyone wants?  The idea of the ABI is to allow a conforming program to be
distributed in binary so the vendor doesn't need to have one version for each
of the 99 machines based on the 68000.
- - - - - - - - - - - - - - - - - - - - -
Alan Fargusson		Sun Microsystems
alanf@sun.com		..!sun!alanf

smryan@garth.UUCP (Steven Ryan) (07/16/88)

>                      Personally, I would be satisfied if all vendors
>agreed on a standard data representation - specifically, IEEE floating
>point 32 bit and 64 bit, ASCII, and two's comp. integers, so that DATA
>files could be moved between machines, a requirement in our environment.

Not wishing to take sides, but some people like one's complement, and some
people like different floating point formats. By the way, where is the IEEE
standard for 128 bit floating point for Cray, ETA, and 180s? Will we one day
have 256 bit cpus? What happens if optical computers make it big and it
becomes cheaper to use radix 6 (red/.../purple) integers?

Be careful about standardising things--they lock you into the problems of
the past and out of the solutions of the future.

another fine specification from
   s m ryan

Be wary of general statements--the nitpickers will eat you up.

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (07/18/88)

In article <980@garth.UUCP> smryan@garth.UUCP (Steven Ryan) writes:

>Not wishing to take sides, but some people like one's complement, and some
>people like different floating point formats. By the way, where is the IEEE
>standard for 128 bit floating point for Cray, ETA, and 180s? Will we one day

>Be careful about standardising things--they lock you into the problems of
>the past and out of the solutions of the future.

Glad you asked.  The reason that you need a standard is because "The network 
is the computer."  IF your network really functions that way, and, IF you
have large amounts of binary data that moves between machines (HINTS:
graphics, flow fields, etc.), THEN you have to have a binary data standard.
The inconvenience of the standard being "wrong" in some cases is greatly
outweighed by all the conveniences and efficiencies.  

One may just as well ask:  Why only 50 Hz and 60 Hz power (and wouldn't one
standard be even better?)?  Why not have every appliance use its optimal
frequency - maybe 47 Hz is ideal for drills, 53 Hz for washing machines.

-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

gillies@p.cs.uiuc.edu (07/19/88)

I once heard an expert on floating-point arithmetic state that CDC's
1's complement arithmetic was used JUST BECAUSE IT RUNS FASTER.  In
fact, the engineer estimated their 1's complement could always be
implemented to run 10% faster than IEEE arithmetic.  Since
speed-at-all-costs is the main selling-point of supercomputers, CDC
was very reluctant to abandon this advantage.

glennw@nsc.nsc.com (Glenn Weinberg) (07/19/88)

In article <11956@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes:
>One may just as well ask:  Why only 50 Hz and 60 Hz power (and wouldn't one
>standard be even better?)?  Why not have every appliance use its optimal
>frequency - maybe 47 Hz is ideal for drills, 53 Hz for washing machines.

Yes, all drills and washing machines (in the U.S.--let's leave the
international voltage disputes for another day!) run on 115V 60Hz
(approximately).  But does that mean that they all must have the same
motor, or even the same kind of motor?  As long as the motor runs on
standard voltage (and a manufacturer could, of course, add a transformer
if so desired) and drives the appliance, who cares about the motor's
architecture?  For that matter, should a 1/4" "handyman" drill have
the same type of motor as a 1/2" commercial-duty drill?  Does a drill
have the same type of motor as a washing machine, or a nuclear power
plant circulating pump?

The point is that, generally, standardization makes sense only within a
narrowly defined range of function and performance.  If you try to
standardize over too wide a range, you end up with something that satisfies
no one.  Bringing the analogy back to Un*x, one of the attractions of Un*x
is that it runs on a wide range of machines.  If you think of the PC as
a drill, a mini as a washing machine, and a Cray as a circulating pump,
you can also see why it may not make sense for all these different
machines to have a single architecture.

Now I realize that drills, washing machines and nuclear power plants don't
have the same intercommunication needs as Un*x boxes, but I think that's
just a complicating issue, not an overriding one.
-- 
Glenn Weinberg					Email: glennw@nsc.nsc.com
National Semiconductor Corporation		Phone: (408) 721-8102
(My opinions are strictly my own, but you can borrow them if you want.)

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (07/19/88)

In article <5230@nsc.nsc.com> glennw@nsc.UUCP (Glenn Weinberg) writes:

>international voltage disputes for another day!) run on 115V 60Hz
>(approximately).  But does that mean that they all must have the same
>motor, or even the same kind of motor?  As long as the motor runs on

>a drill, a mini as a washing machine, and a Cray as a circulating pump,
>you can also see why it may not make sense for all these different
>machines to have a single architecture.

I suppose this is beating a dead horse for the third time, but it seems
that what I posted may have been confusing.

I think that the standardization of computer architectures, as desired
by a previous poster, and as touted by some people who ought to know
better in the trade press, is premature AT BEST, and could not happen
sooner than, say, twenty years, at the VERY EARLIEST.

However, the standardization of (some) data formats is not premature,
is in fact a GOOD THING, is feasible today, and is, in fact, a
REQUIREMENT for people who have large amounts of binary data to move
between machines of different types.   People who have only ASCII
source to move between their machines, or people who have only one
type of computer in their network, don't have to worry about it, however.

I hope this clarifies what I personally think.

The power frequency analogy still strikes me as a reasonable one for
this situation.  It is expensive to convert, in bulk, power of
different frequencies.  60 Hz is efficient to generate and transmit.
Other frequencies might be optimal for different things, but in most
cases it is cheaper to use 60 Hz than pay the price of conversion.

The reason I cared to post on this subject is that it is important
to distinguish between the unreasonable standardization of
"All computers should be the same architecture", and the
reasonable standardization of
"Most computers should share some of the same basic data formats."
Sharing data formats, such as the IEEE floating point standard, is
not standardization for its own sake, but meets a very real need to
transmit binary data between machines of DIFFERENT architectures.
It isn't necessary, obviously, except when you do have such a
collection of machines; if they were all the same, they would already
be standardized for you.

-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

guy@gorodish.Sun.COM (Guy Harris) (07/19/88)

> However, the standardization of (some) data formats is not premature,
> is in fact a GOOD THING, is feasible today,

To what extent do you view this as "feasible"?  Would the IBM 370, the 68K,
etc. become little-endian or would the Intel 80x86 family, the VAX, etc.
become big-endian?  Or do you mean that data formats on disk files would be
standardized, with programs writing disk files translating from their native
format to the standard on-disk format?
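[The third option Guy lists, keeping each machine's in-memory format native and translating only at the file boundary, can be sketched in a few lines of modern Python; the variable names are invented for illustration.]

```python
import struct
import sys

value = 0x01020304

native = struct.pack('=I', value)   # the bytes as this host holds them
on_disk = struct.pack('>I', value)  # a standardized big-endian file format

# On a little-endian host (80x86, VAX) the two byte strings differ;
# on a big-endian host (IBM 370, 68K) they are identical.  Either way,
# every reader of the file sees the same four bytes in the same order.
```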

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (07/19/88)

In article <11956@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes:

| Glad you asked.  The reason that you need a standard is because "The network 
| is the computer."  IF your network really functions that way, and, IF you
| have large amounts of binary data that moves between machines (HINTS:
| graphics, flow fields, etc.), THEN you have to have a binary data standard.
| The inconvenience of the standard being "wrong" in some cases is greatly
| outweighed by all the conveniences and efficiencies.  

  I faced this when I was calculating some data on a variety of machines
and reading them all on one machine. I decided that 32 bit 2's
complement was the most common, so I devised a simple set of routines
which generate and read that format on any machine.

  For the sake of efficiency I used compile flags for TWOCOMP (false for
1's comp machines), BITS (assumed 32, set if greater), and LSB (set if
output must be forced to/from LSB order).

  Since the computation took about 5000 times as long as the conversion,
the overhead was lost in the noise.
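[Routines of the kind davidsen describes can also be written with no compile-time flags at all, if the encoding is done arithmetically on values rather than by reinterpreting memory.  A Python sketch of the idea; the function names are invented.]

```python
def put_i32(n):
    # Reduce to two's complement (a no-op on a two's-complement host,
    # the real work on a one's-complement machine) and emit MSB first.
    u = n & 0xFFFFFFFF
    return bytes(((u >> s) & 0xFF) for s in (24, 16, 8, 0))

def get_i32(b):
    # Reassemble the MSB-first bytes and undo the two's-complement bias.
    u = (b[0] << 24) | (b[1] << 16) | (b[2] << 8) | b[3]
    return u - 0x100000000 if u & 0x80000000 else u
```

Because the shifts and masks operate on values, not on storage layout, the same source works whatever the host's integer format, which is the effect the TWOCOMP/BITS/LSB flags achieved at compile time.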
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs | seismo}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (07/19/88)

In article <76700037@p.cs.uiuc.edu> gillies@p.cs.uiuc.edu writes:
>I once heard an expert on floating-point arithmetic state that CDC's
>1's complement arithmetic was used JUST BECAUSE IT RUNS FASTER.  In
>fact, the engineer estimated their 1's complement could always be
>implemented to run 10% faster than IEEE arithmetic.  Since

CDC/ETA uses 2's complement on the Cyber 200/ETA-10 series of machines,
although they do not use IEEE.  So, you can teach an old dog new tricks,
sometimes.

There is a second argument that has come up frequently enough to
warrant a discussion:  

IF a particular program gets correct answers
using the IEEE standard, and incorrect answers using a less robust
format such as Cray's current format, is it better to get the
wrong answer 10% faster?  This is, in fact, the problem that
Kahan has been trying to address for the last decade.  More
realistically, it would be interesting to compare the rate of 
convergence of common iterative algorithms using IEEE, VAX, Cray,
ETA, and IBM arithmetic (to name some common formats) and see if
IEEE is significantly better.  Naturally, it would be allowable 
to rewrite the code completely in each case, so as to deal with
the error detection and recovery features (or lack thereof)
of each arithmetic.

I did not bring this up in my previous postings, but, eventually,
when supercomputers are using IEEE also, it will be an added
benefit that you will get the same behavior on all systems on
the same program.  I don't know how many people reading this
have run into this problem, but, I have seen many programmer
hours wasted trying to figure out why a particular algorithm
converged on one machine and diverged on another.

Getting back to the original argument, there are plenty of
cases involving graphics where it is more expensive to convert
the data between different formats than to have accepted
slightly lower performance generating the data and not have
to pay the price of conversion.

Finally, I understand that handling the IEEE gradual underflow
behavior can add an extra cycle of latency.  I also have
observed that the MIPS R2010 FPA (and maybe the new R3010 also)
can do a floating add in 2 (!) clock cycles.  How did they do
that?

-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

tim@amdcad.AMD.COM (Tim Olson) (07/20/88)

In article <12005@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes:
| Finally, I understand that handling the IEEE gradual underflow
| behavior can add an extra cycle of latency.  I also have
| observed that the MIPS R2010 FPA (and maybe the new R3010 also)
| can do a floating add in 2 (!) clock cycles.  How did they do
| that?

By handling denormalized numbers with software traps, and by throwing a
lot of hardware at it!
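[What gradual underflow buys, and what a trap-and-emulate scheme must preserve, is easy to see directly; a quick check in Python, whose floats are IEEE 754 doubles.]

```python
import sys

tiny = sys.float_info.min   # smallest positive *normalized* double, ~2.2e-308
sub = tiny / 2.0            # gradual underflow: a denormalized number

# Without denormals this quotient would flush to zero, and small
# differences (a - b) between nearly equal tiny a and b would vanish;
# with them, the halving is exact and nothing is lost.
```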

-- 
	-- Tim Olson
	Advanced Micro Devices
	(tim@delirun.amd.com)

brooks@maddog.llnl.gov (Eugene D. Brooks III) (07/20/88)

In article <12005@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes:
>IF a particular program gets correct answers
>using the IEEE standard, and incorrect answers using a less robust
>format such as Cray's current format, is it better to get the
>wrong answer 10% faster?  This is, in fact, the problem that
I have a lot of experience dealing with porting codes which run
on 64 bit IEEE UNIX boxes, 64 bit "claimed to be IEEE" UNIX boxes
(don't ask me which ones these are; I don't want a confrontation
with their manufacturers), and 64 bit Cray format.
In general it is nice to have a code run and produce "exactly the
same" bitwise results everywhere, and I have seen plenty of
sensitivities to the differences in the floating point surface.
Any time a sensitivity to the arithmetic surfaced I was usually
glad that it did, as it meant that the fact that I was "on the
ragged edge" was being detected by the machine with the "bad
floating point".  This afforded me the chance to move away from
the ragged edge by fixing the code.  Any machine with IEEE
arithmetic ought to have a "round inward an extra couple of bits"
option for "ragged edge" detection.  If it happens to result in
10% faster execution time, so much the better.

lamaster@ames.arc.nasa.gov (Hugh LaMaster) (07/20/88)

In article <10298@lll-winken.llnl.gov> brooks@maddog.UUCP (Eugene D. Brooks III) writes:
>fixing the code.  Any machine with IEEE arithmetic ought to have
>a "round inward an extra couple of bits" option for "ragged edge"
>detection.  If it happens to result in 10% faster execution time

Excellent point.  I agree.  One of the nice features of the Cray CFT
Fortran compiler is a compiler switch which generates an extra
truncation instruction with each assignment.  You can truncate to the
desired number of bits and see just how sensitive your code is to small
truncation errors.  I wish every compiler had such a switch.  It is nice,
however, to be able to turn it off :-)
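[The CFT switch can be imitated on any IEEE machine by masking low-order significand bits after each assignment.  A Python sketch, with invented names, applied here to a Newton iteration for a square root.]

```python
import struct

def truncate(x, bits):
    # Clear the low `bits` bits of the IEEE 754 double significand,
    # imitating CFT's per-assignment truncation instruction.
    (u,) = struct.unpack('>Q', struct.pack('>d', x))
    u &= ~((1 << bits) - 1)
    return struct.unpack('>d', struct.pack('>Q', u))[0]

def newton_sqrt(a, bits=0):
    x = a
    for _ in range(60):
        x = 0.5 * (x + a / x)
        if bits:
            x = truncate(x, bits)   # the "extra truncation" per assignment
    return x
```

Running with, say, bits=20 keeps only 32 significant bits of the 52-bit significand; comparing the truncated and untruncated results shows how sensitive the computation is to low-order error.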



-- 
  Hugh LaMaster, m/s 233-9,  UUCP ames!lamaster
  NASA Ames Research Center  ARPA lamaster@ames.arc.nasa.gov
  Moffett Field, CA 94035     
  Phone:  (415)694-6117       

chris@mimsy.UUCP (Chris Torek) (07/22/88)

In article <60140@sun.uucp> alanf%smile@Sun.COM (Alan Fargusson) writes:
>Right!  I want to have only one version of the UNIX system.  Isn't that what
>everyone wants?

Only if it is *my* version.  Implication: no, not everyone wants only one
version of Unix.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

henry@utzoo.uucp (Henry Spencer) (07/22/88)

In article <12055@ames.arc.nasa.gov> lamaster@ames.arc.nasa.gov.UUCP (Hugh LaMaster) writes:
>... One of the nice features of the Cray CFT
>Fortran compiler is a compiler switch which generates an extra
>truncation instruction with each assignment.  You can truncate to the
>desired number of bits and see just how sensitive your code is to small
>truncation errors...

Let us not forget that IBM's Stretch machine (the 7030, late 50s) had a bit
which told the floating-point processor whether to round correctly or
randomly.  Same underlying idea:  run your program twice, with the bit set
differently, and see if the answers differ.  Note that this added zero
overhead (apart from running things twice!), since the floating-point
hardware was just as fast/slow either way.
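[The run-it-twice experiment is easy to reproduce today with any arithmetic whose rounding rule can be switched; a Python sketch using the standard decimal module, whereas the Stretch, of course, toggled a hardware bit.]

```python
from decimal import Decimal, ROUND_HALF_EVEN, ROUND_FLOOR, localcontext

def divide(rounding):
    # The same computation under two rounding rules.  If the answers
    # differ in digits you care about, the algorithm is sensitive to
    # rounding, which is what the Stretch's bit was meant to expose.
    with localcontext() as ctx:
        ctx.prec = 8
        ctx.rounding = rounding
        return Decimal(2) / Decimal(3)

a = divide(ROUND_HALF_EVEN)   # 0.66666667
b = divide(ROUND_FLOOR)       # 0.66666666
```

Here the disagreement is confined to the last digit; in an iterative code the two runs can drift much further apart, flagging the sensitivity.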
-- 
Anyone who buys Wisconsin cheese is|  Henry Spencer at U of Toronto Zoology
a traitor to mankind.  --Pournelle |uunet!mnetor!utzoo! henry @zoo.toronto.edu