[comp.arch] Killer Micro II

brooks@physics.llnl.gov (Eugene D. Brooks III) (08/25/90)

Just when you were beginning to think that it was safe to enter
the computer market again, meaner critters come crawling out
at the IEEE sponsored Hot Chips Symposium, in Santa Clara.

As is well known to the readers of this group, there has been
a lot of discussion with regard to whether or not the Killer Micros
will supplant supercomputers.  The Killer Micros have been posting
incredible performances for scalar codes, clearly documented in the
LANL report comparing the IBM RS/6000 running at 30 MHz to the YMP.
Many participants in this group, highlighted by one recent conversion
to killer micros on the basis of productivity issues, have argued that the
Killer Micros will not supplant traditional vector processing supercomputers
until they provide equally high vector performance.  Just in case the
argument is valid, the Killer Micros are dealing with the issue...

Meet Killer Micro II, described by Bipolar Integrated Technology of Portland,
Oregon, at the Hot Chips Symposium held in Santa Clara this month:

-> 200K transistors on a single ECL chip which dissipates 28 watts

-> Clocked at 100 MHz (80 MHz Cray 1, 117 MHz XMP, 154 MHz YMP)

-> Two 64 bit read ports, one 64 bit write port, concurrent transfers

-> Capable of one 64 bit ADD and one 64 bit MULT each clock
	IEEE DIVIDE, SQRT tossed in for free

-> Full Integer ALU operations

-> 200 MFLOPS, sustainable, peak performance

Read, Read, FLOP, FLOP, Write: each and every clock!


Anyone got some ``vector register'' chips, and decent memory chips,
to keep this beast fed???

usenet@nlm.nih.gov (usenet news poster) (08/27/90)

brooks@physics.llnl.gov (Eugene D. Brooks III) writes:
> Meet Killer Micro II, described by Bipolar Integrated Technology, of Portland
> Oregon, at the Hot Chips Symposium which was held in Santa Clara this month:
> -> 200K transistors on a single ECL chip which dissipates 28 watts
> -> Clocked at 100 MHz (80 MHz Cray 1, 117 MHz XMP, 154 MHz YMP)
> -> Two 64 bit read ports, one 64 bit write port, concurrent transfers
> -> Capable of one 64 bit ADD and one 64 bit MULT each clock
> 	IEEE DIVIDE, SQRT tossed in for free
> -> Full Integer ALU operations
> -> 200 MFLOPS, sustainable, peak performance
> Read, Read, FLOP, FLOP, Write: each and every clock!
> 
> Anyone got some ``vector register'' chips, and decent memory chips,
> to keep this beast fed???

If you have a specific algorithm to implement and you are willing to
build a dedicated processor, systolic arrays of a chip like this could
give real bang for the buck.  Of course, even if the calculation only
needs a single input and writes a single output per cycle, you are still
talking sustained simultaneous read and write rates of 800 MB/sec at
the ends of the pipe.

Using temporally interleaved operations on a physically reentrant
systolic pipe you could use one chip several places in the calculation
and scale the overall processing rate and I/O rates back, but then you
need some flexibility in ports to the chip.  Assuming this is a board
level product to be integrated into an existing WS and you are willing
to incorporate a few chips/board in the systolic array, the individual
chip performances don't need to be nearly as aggressive.  Anyone know
of a CMOS add + mul FP chip with say 3 read and 2 write ports and an
internal crossbar switch that would run at a modest 50 MFLOPS per chip?

David States

colin@array.UUCP (Colin Plumb) (08/28/90)

Maybe it's a new chip, but BIT announced 100 MFLOPS chips early
this year.  I heard about separate FP add and multiply units, with 192
bits (128 read, 64 write, from their point of view) of 100 MHz bus
each.  2 cycle (20 ns) latency, so they do 50 MFLOPS flow-through.

Now if you can just supply the 2.4 GBytes/second of data these babies
need...  It's still not going to be cheap.  Or easy to cool (28 watts
out of one chip I can almost read by).  But it's going to kill multi-board
ALUs.
-- 
	-Colin

cik@l.cc.purdue.edu (Herman Rubin) (08/28/90)

In article <603@array.UUCP>, colin@array.UUCP (Colin Plumb) writes:
> Maybe it's a new chip, but BIT announced 100 MFLOPS chips early
> this year.  I heard about separate FP add and multiply units, with 192
> bits (128 read, 64 write, from their point of view) of 100 MHz bus
> each.  2 cycle (20 ns) latency, so they do 50 MFLOPS flow-through.
 
> Now if you can just supply the 2.4 GBytes/second of data these babies
> need...  It's still not going to be cheap.  Or easy to cool (28 watts
> out of one chip I can almost read by).  But it's going to kill multi-board
> ALUs.

There are plenty of mathematical calculations which need lots of computing,
but use little data.  I doubt if these vaunted machines will be much good
at a three-dimensional numerical integral, for example.  And how good is
their integer arithmetic?  If accurate calculation is needed, and this is
not all that unusual, floating point is essentially useless.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

mccalpin@perelandra.cms.udel.edu (John D. McCalpin) (08/28/90)

>>>>> On 28 Aug 90 00:53:00 GMT, cik@l.cc.purdue.edu (Herman Rubin) said:

> There are plenty of mathematical calculations which need lots of computing,
> but use little data.  I doubt if these vaunted machines will be much good
> at a three-dimensional numerical integral, for example.  And how good is
> their integer arithmetic?  If accurate calculation is needed, and this is
> not all that unusual, floating point is essentially useless.
> -- 
> Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907

Yes, it is a terrible shame that 64-bit floating-point arithmetic is
hopelessly inaccurate and "essentially useless".  It is very sad that
engineers saddled with this archaic technology have been utterly
unable to design an airplane that flies, or a bridge that doesn't
collapse, or a spacecraft that could explore the solar system....

.........

More seriously, I keep on hearing rumours from IBM that they want to
know if us users want 128-bit floating-point support on the RISC
system/6000 machines.  Apparently the combined adder/multiplier makes
128-bit add/subtract/multiply operations only about 4 times as costly
as 64-bit operations.  I would definitely like to play with it, but
I do not consider it a high-priority item....
--
John D. McCalpin			mccalpin@perelandra.cms.udel.edu
Assistant Professor			mccalpin@vax1.udel.edu
College of Marine Studies, U. Del.	J.MCCALPIN/OMNET

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (08/29/90)

In article <MCCALPIN.90Aug28121912@pereland.cms.udel.edu> mccalpin@perelandra.cms.udel.edu (John D. McCalpin) writes:

| More seriously, I keep on hearing rumours from IBM that they want to
| know if us users want 128-bit floating-point support on the RISC
| system/6000 machines.  Apparently the combined adder/multiplier makes
| 128-bit add/subtract/multiply operations only about 4 times as costly
| as 64-bit operations.  I would definitely like to play with it, but
| I do not consider it a high-priority item....

  I hear from the applications support people that few of our Cray users
are doing 128 bit. It does get used, but more as a reflection of
problems with the N.A. than because the input or output data are
significant to that extent.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

colin@array.UUCP (Colin Plumb) (08/29/90)

In article <2482@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> There are plenty of mathematical calculations which need lots of computing,
> but use little data.  I doubt if these vaunted machines will be much good
> at a three-dimensional numerical integral, for example.  And how good is
> their integer arithmetic?  If accurate calculation is needed, and this is
> not all that unusual, floating point is essentially useless.

They also do integer ops at the same 100 MHz rate.  Why would they be no
good at 3-d numerical integrals?  They work at Cray speeds and have
shorter pipelines.  50 MFLOPS *scalar*.  It impresses the hell out of me.

Besides which, it never hurts to use FP - I can always ignore the FP exponent
field and find myself dealing with 52-bit integers.  FP inexactness only
happens if you do things that wouldn't work exactly with integers, either.
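
For what it's worth, a tiny C check of that claim (illustrative only):
integer arithmetic in doubles stays exact as long as every value fits in
the 52/53-bit significand.

#include <stdio.h>

int main(void)
{
    /* doubles as exact integers: everything here fits in the significand */
    double a = 4503599627370495.0;               /* 2^52 - 1 */
    double b = a + 1.0;                          /* 2^52, still exact */
    printf("%.0f\n%.0f\n%.0f\n", a, b, b - a);   /* last line prints 1 */
    return 0;
}
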
-- 
	-Colin

khb@chiba.Eng.Sun.COM (Keith Bierman - SPD Advanced Languages) (08/29/90)

In article <2471@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) writes:

     I hear from the applications support people that few of our Cray users
   are doing 128 bit. It does get used, but more as a reflection of...

People are quite cost sensitive. If 128-bit arithmetic is as cheap as
64-bit, many will use it. If it is a factor of 2 slower, a
significant number will still use it. When it gets to be 100x slower, only those
with serious desires employ it.
--
----------------------------------------------------------------
Keith H. Bierman    kbierman@Eng.Sun.COM | khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33			 | (415 336 2648)   
    Mountain View, CA 94043

urjlew@uncecs.edu (Rostyk Lewyckyj) (08/29/90)

One detail that should not be overlooked in this discussion of fp
precision is that the 64 bits used to represent your number are
subdivided into sign + exponent + fraction. So a 64 bit fp number
gives you only between 48 and 56 bits of fraction. (56 bits for
the IBM 360 architecture, and I believe 48 for a CRAY and most
other base 2 machines). IEEE is what? 80 bits divided up into
1+15+64 ? So it really takes 80 bit fp for 64 bits of precision.
    Of course you all were already taking this into account.
-----------------------------------------------
  Reply-To:  Rostyslaw Jarema Lewyckyj
             urjlew@ecsvax.UUCP ,  urjlew@unc.bitnet
       or    urjlew@uncvm1.acs.unc.edu    (ARPA,SURA,NSF etc. internet)
       tel.  (919)-962-6501

cik@l.cc.purdue.edu (Herman Rubin) (08/29/90)

In article <632@array.UUCP>, colin@array.UUCP (Colin Plumb) writes:
> In article <2482@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> > There are plenty of mathematical calculations which need lots of computing,
> > but use little data.  I doubt if these vaunted machines will be much good
> > at a three-dimensional numerical integral, for example.  And how good is
> > their integer arithmetic?  If accurate calculation is needed, and this is
> > not all that unusual, floating point is essentially useless.
> 
> They also do integer ops at the same 100 MHz rate.  Why would they be no
> good at a 3-d numerical integrals?  They work at Cray speeds and have
> shorter pipelines.  50 MFLOPS *scalar*.  It impresses the hell out of me.
> 
> Besides which, it never hurts to use FP - I can always ignore the FP exponent
> field and find myself dealing with 52-bit integers.  FP inexactness only
> happens if you do things that wouldn't work exactly with integers, either.

There seems to be a sufficient ignorance of the problems on this newsgroup
to require clarification.  For a very long time, arbitrary precision 
arithmetic has been done by electro-chemical computers (people) using, 
in effect, one-digit arithmetic.  I am assuming that the readers of this
group are not totally unfamiliar with what at least used to be taught in
elementary school arithmetic :-)

There was a previous discussion about the problems with the SPARC because
integer multiplication was 32x32 -> 32.  This means that to do multiple
precision arithmetic, the number must be broken into 16 bit blocks, so that
the product can be exactly obtained.  With 52 bit integers, the blocks
would be 26 bits.  Also, if the multiplication is done in the floating point
units, the results would have to be converted into integers, or some other
device used to separate the most and least significant parts of the 52 bit
result for further purposes.  This requires extra operations, and may very well
negate the larger precision of the floating point units.
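
For concreteness, a small C sketch of the 26-bit blocking idea (a sketch
only, not anyone's production code): each block product fits in 52 bits,
so it is exact in a double.

#include <math.h>
#include <stdio.h>

/* split an integer-valued double (< 2^52) into two 26-bit blocks so that
   every partial product of a schoolbook multiply is itself exact */
static void split26(double x, double *hi, double *lo)
{
    *hi = floor(x / 67108864.0);       /* top block; 2^26 = 67108864 */
    *lo = x - *hi * 67108864.0;        /* bottom 26 bits */
}

int main(void)
{
    double x = 3456789012345678.0, y = 1234567890123456.0;   /* ~52-bit values */
    double xh, xl, yh, yl;
    split26(x, &xh, &xl);
    split26(y, &yh, &yl);
    /* four exact partial products, weighted 2^52, 2^26, 2^26 and 1; the
       carry propagation that assembles the 104-bit result is not shown */
    printf("%.0f %.0f %.0f %.0f\n", xh * yh, xh * yl, xl * yh, xl * yl);
    return 0;
}
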

I have done calculations where I used both single (48 bit) and double (96 bit)
accuracy to get an idea of the accuracy of the results, and this did not
always get enough terms.  If I needed more accuracy, I could only get it
by using integer arithmetic to simulate the floating-point operations.

An unrelated point is the remark about 3-d numerical integration.  This 
requires lots of computation to be done in a straightforward manner, and
the problems get worse with the dimension.  There are approximation methods
which can be used in higher dimensions, but rarely can get much accuracy.


-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

patrick@convex.COM (Patrick F. McGehearty) (08/29/90)

In article <1990Aug29.005329.13598@uncecs.edu> urjlew@uncecs.edu (Rostyk Lewyckyj) writes:
>
>One detail that should not be overlooked in this discussion of fp.
>precision, is that the 64 bits used to represent your number is
>subdivided into sign + exponent + fraction. So a 64 bit fp number
>gives you only between 48 and 56 bits of fraction. (56 bits for
>the IBM 360 architecture, and I believe 48 for a CRAY and most
>other base 2 machines). IEEE is what? 80 bits divided up into
>1+15+64 ? So it really takes 80 bit fp for 64 bits of precision.
Actually, there are several IEEE extended precision specifications
for different numbers of bits.

For IEEE, the 32 bit representation includes 23 represented bits and an
implicit 1 bit in the 24th position for the mantissa.  The exponent is
represented by 8 bits (10**-38 to 10**+38) and a sign bit.
The 64 bit representation has 52+1 bits for the mantissa and 11 bits for
the exponent, for a range of 10**-308 to 10**+308.
I don't have the full spec; does anyone know the other IEEE representation
patterns?

For those of you not into Numerical Analysis, there are series of
computations which will give radically different results with only
single bit changes in the double precision input data.  Computations
of this sort are called numerically unstable.
For example,
   d=a*(b-c)
where b = c +/- epsilon for small epsilon; a tiny change in b, on the order
of epsilon, can change the sign of the result.
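
A tiny C illustration of the point (numbers invented): a change of roughly
one part in 10^16 in b flips the sign of d.

#include <stdio.h>

int main(void)
{
    double a = 1.0e8, c = 1.0;
    double b1 = c + 2.0e-16;              /* b = c + epsilon */
    double b2 = c - 2.0e-16;              /* b = c - epsilon */
    printf("d1 = %g\n", a * (b1 - c));    /* small positive */
    printf("d2 = %g\n", a * (b2 - c));    /* small negative */
    return 0;
}
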

henry@zoo.toronto.edu (Henry Spencer) (08/30/90)

In article <2471@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
>  I hear from the applications support people that few of our Cray users
>are doing 128 bit. It does get used, but more as a reflection of
>problems with the N.A. than because the input or output data are
>significant to that extent.

I would suspect that the awful properties of Cray arithmetic might also
have something to do with an occasional need for 128 bits.
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday|  henry@zoo.toronto.edu   utzoo!henry

dik@cwi.nl (Dik T. Winter) (08/30/90)

In article <1990Aug29.175933.28804@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
 > In article <2471@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen) writes:
 > >  I hear from the applications support people that few of our Cray users
 > >are doing 128 bit. It does get used, but more as a reflection of
 > >problems with the N.A. than because the input or output data are
 > >significant to that extent.
 > 
 > I would suspect that the awful properties of Cray arithmetic might also
 > have something to do with an occasional need for 128 bits.

Yes, they are awful.  But their awfulness is at least well documented.
(And if you look at Cray arithmetic as arithmetic with 46 bits plus
noise in the remainder, it works out quite well.)  I think Cray arithmetic
is useable (some will disagree with me, notably Kahan).

On the other hand DEC's D-float is extremely well behaved, but some very
well conditioned problems can be solved using standard methods on the
Cray, but not on the VAX.  The reason: insufficient exponent range.

But I digress.

The reason some people use 128 bit floating point (96 bit mantissa on the
Cray) is that they need it to get reasonable results.  And this is
independent of the precision of the input data.  There are enough processes
where the result is to a large extent independent of the input data.  In most
cases the problem is 'close' to a singular problem, but it is so 'far away'
that a loose perturbation of the inputs will not make it singular.
And they do it even when 128 bit floating point is much more costly than
64 bit floating point.  (What was it on the Cray?  10 times?  20 times?
Something like that at least; it is done in software.)

Some years ago I was involved in a project to separate the 'smallest'
1 billion zeros of the Riemann zeta function (actually we overshot and
got to 1.5 billion).  Also here 128 bit floating point was regularly
needed to get proper results, and this was based on an a priori analysis.
(This was done on a Cyber 205.  Now, if you talk about sloppy arithmetic,
here is your machine.  What other machine has it that a=b does not imply b=a?
What other machine has vector instructions that can overflow if there
are enough interrupts during execution?  Digressing again, I think.
But following another thread: timing.  On that machine there is an
instruction that cannot be timed because the timer overflows.  The
instruction takes some 4 seconds to complete.  Talking about CISC.)
--
dik t. winter, cwi, amsterdam, nederland
dik@cwi.nl

colin@array.UUCP (Colin Plumb) (08/30/90)

In article <1990Aug29.005329.13598@uncecs.edu> urjlew@uncecs.edu (Rostyk Lewyckyj) writes:
> IEEE is what? 80 bits divided up into
> 1+15+64 ? So it really takes 80 bit fp for 64 bits of precision.

IEEE is 1+8+23 (single) or 1+11+52 (double), counting sign, exponent, and
stored fraction bits.  The double-extended rules
specify only minimum mantissa and exponent ranges, not representations.
The committee members (I've talked to 'em about it) wanted to require
figures that would mean at least 96 bits, but Intel, who had already
built the 8087 to an early draft, successfully lobbied the requirements
down to what the 80-bit format they'd implemented could cover.
-- 
	-Colin

mmm@cup.portal.com (Mark Robert Thorson) (08/30/90)

[This is a reposting of comments which do not seem to have made it off
my machine.  My apologies if you see them twice.]

Note that to maintain the
200 MFLOPS rate, you must be doing one ALU and one multiplier result
per clock cycle (each unit is rated at 100 MFLOPS individually), and you
must feed one of the results back to the inputs, because there is only
one output port allowing only one result to be unloaded per clock cycle.
10 ns cycle time in pipelined mode, 2-stage pipeline;  20 ns in flow-thru.
 
It certainly is a powerful chip.  28.6 watts is a lot of power.
 
Other interesting facts:  it comes in a 395-pin PGA.  It costs
$1395 in 100-unit quantity.  A shoebox full of these things would
be worth about a million dollars!

jsweedle@mipos2.intel.com (Jonathan Sweedler) (08/31/90)

In article <MCCALPIN.90Aug28121912@pereland.cms.udel.edu> mccalpin@perelandra.cms.udel.edu (John D. McCalpin) writes:
>More seriously, I keep on hearing rumours from IBM that they want to
>know if us users want 128-bit floating-point support on the RISC
>system/6000 machines.  Apparently the combined adder/multiplier makes
>128-bit add/subtract/multiply operations only about 4 times as costly
>as 64-bit operations.  

This is probably coming from Prof. Kahan via IBM.  From personal talks with 
Prof. Kahan and from some postings to the numeric interests mailing list,
it seems that Prof. Kahan's next crusade is to convince people that IEEE
double precision won't be good enough for future software.  He feels that
problems tend to grow as systems grow (more memory, faster
processors).  As problems grow, more accuracy is needed.  As David Hough wrote,
in a letter to the numerics interest group, more precision is needed to:

	permit grossly unstable algorithms, that parallelize well,
	to work adequately most of the time, by pushing the roundoff
	level so far down that it doesn't have enough time to cause
	trouble for the size of problems likely in the next few years.

Both the x86 line and the 68000 line support extended precision.
Prof.  Kahan thinks this will be ok for the short term, but he feels
that in the end, more precision will be needed.  It would be
interesting to know if other companies are considering support for
precisions greater than IEEE double precision.  It is interesting to
note that at this point, the RISC machines are in a worse position than
the CISC machines in this regard.  But I probably shouldn't have said
this, as I have no desire to start a new RISC vs. CISC war.

===============================================================================
Jonathan Sweedler, Microprocessor Design, Intel Corp.
UUCP: {decwrl,hplabs,oliveb}!intelca!mipos3!mipos2!jsweedle
ARPA: jsweedle%mipos2.intel.com@relay.cs.net

aglew@dwarfs.crhc.uiuc.edu (Andy Glew) (08/31/90)

It sounds like Kahan is pushing for the 128 bit quad precision that
was dropped from the IEEE FP standard. Power to him!

--
Andy Glew, a-glew@uiuc.edu [get ph nameserver from uxc.cso.uiuc.edu:net/qi]

firth@sei.cmu.edu (Robert Firth) (08/31/90)

In article <AGLEW.90Aug30210629@dwarfs.crhc.uiuc.edu> aglew@dwarfs.crhc.uiuc.edu (Andy Glew) writes:
>It sounds like Kahan is pushing for the 128 bit quad precision that
>was dropped from the IEEE FP standard. Power to him!

With respect, I disagree.  In my opinion, there are already far too
many engineers who use rotten numerical algorithms and trust to
double precision and dumb luck; going to quadruple precision will
merely encourage more of the same.

What I think we need is hardware interval arithmetic.  When the
printout shows them beyond dispute that the choice is between
50 bits of noise and 100 bits of noise, perhaps they'll spend
more time on better algorithms and less time pushing for wrong
answers faster.
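
For readers who have not met it, a minimal software sketch of interval
arithmetic in modern C (C99 <fenv.h>; the type and function names are
invented, and a hardware implementation would do the directed roundings
for free):

#include <fenv.h>
#include <stdio.h>

typedef struct { double lo, hi; } interval;   /* the true value lies in [lo, hi] */

static interval iv_add(interval a, interval b)
{
    interval r;
    fesetround(FE_DOWNWARD); r.lo = a.lo + b.lo;   /* round toward -inf */
    fesetround(FE_UPWARD);   r.hi = a.hi + b.hi;   /* round toward +inf */
    fesetround(FE_TONEAREST);
    return r;
}

static interval iv_sub(interval a, interval b)
{
    interval r;
    fesetround(FE_DOWNWARD); r.lo = a.lo - b.hi;
    fesetround(FE_UPWARD);   r.hi = a.hi - b.lo;
    fesetround(FE_TONEAREST);
    return r;
}

int main(void)
{
    interval x;
    fesetround(FE_DOWNWARD); x.lo = 1.0 / 3.0;     /* enclose 1/3 from below */
    fesetround(FE_UPWARD);   x.hi = 1.0 / 3.0;     /* and from above */
    fesetround(FE_TONEAREST);
    interval y = iv_sub(iv_add(x, x), x);          /* (x + x) - x */
    printf("[%.17g, %.17g]\n", y.lo, y.hi);        /* the width is the noise, made visible */
    return 0;
}
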

cik@l.cc.purdue.edu (Herman Rubin) (08/31/90)

In article <AGLEW.90Aug30210629@dwarfs.crhc.uiuc.edu>, aglew@dwarfs.crhc.uiuc.edu (Andy Glew) writes:
> It sounds like Kahan is pushing for the 128 bit quad precision that
> was dropped from the IEEE FP standard. Power to him!

A reminder that for efficient really high precision arithmetic, INTEGER
arithmetic is needed.  Setting up the hardware so that exact integer
arithmetic can be easily programmed, preferably using units such as long,
longlong, etc., would simplify the handling of these problems.  There would
be little added cost in providing these capabilities as well.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

meissner@osf.org (Michael Meissner) (08/31/90)

In article <8442@fy.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth)
writes:

| In article <AGLEW.90Aug30210629@dwarfs.crhc.uiuc.edu> aglew@dwarfs.crhc.uiuc.edu (Andy Glew) writes:
| >It sounds like Kahan is pushing for the 128 bit quad precision that
| >was dropped from the IEEE FP standard. Power to him!
| 
| With respect, I disagree.  In my opinion, there are already far to
| many engineers who use rotten numerical algorithms and trust to
| double precision and dumb luck; going to quadruple precision will
| merely encourage more of the same.
| 
| What I think we need is hardware interval arithmetic.  When the
| printout shows them beyond dispute that the choice is between
| 50 bits of noise and 100 bits of noise, perhaps they'll spend
| more time on better algorithms and less time pushing for wrong
| answers faster.

Have we actually gotten to the point where we need that much precision
on a day to day basis?  I seem to recall that in my numerical analysis
course 12 years ago it was said that your average physical
measurement only had 3-5 digits of accuracy.  This means that any
answer received cannot be more accurate than the input.  Now in order
to avoid round off error, you certainly need more digits internally,
but IEEE double gives something like 15-16 digits.  One of the problems the
computer has introduced is too much exact numerical quantization (ie,
the often quoted statistic that the average family has 2.4 children).
It would seem to me that providing double the precision might not give
any more accurate answers.

There are probably groups that may need such extremes in precision,
but are they really enough to drive the market?

--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

vu0310@bingvaxu.cc.binghamton.edu (R. Kym Horsell) (08/31/90)

In article <MEISSNER.90Aug31101418@osf.osf.org> meissner@osf.org (Michael Meissner) writes:
\\\
>measurement only had 3-5 digits of accuracy.  This means that any
>answer received cannot be more accurate than the input.  Now in order
>to avoid round off error, you certainly need more digits internally,
>but IEEE double gives something like 15-16 digits.  One of the problems the

Unfortunately, ``round off'' error is not the real problem.
*Loss of significance* results when, for example, two positive FP numbers with 
similar magnitude are subtracted. The occurrence of same is not
always predictable and the ``quick fix'' is to use more precision.
Despite this, the cliche of failing to invert a 100 x 100 matrix
still holds fairly well, no matter what the precision (unless we
leave the realm of FP altogether for modular arithmetic, etc).

Another point: although many quantities derived from the real world are not
known to high accuracy, this is certainly not true of some common physical
constants, e.g. Planck's constant or the speed of light.

-Kym Horsell

bob@tera.com (Bob Alverson) (08/31/90)

In article <2868@inews.intel.com> jsweedle@mipos2.UUCP (Jonathan Sweedler) writes:
>This is probably coming from Prof. Kahan via IBM.  From personal talks with 
>Prof. Kahan and from some postings to the numeric interests mailing list,
>it seems that Prof. Kahan's next crusade is to convince people that IEEE
>double precision won't be good enough for future software.  He feels that
>problems tend to grow as systems tend to grow (more memory and become
>faster).  As problems grow, more accuracy is needed.  As David Hough wrote,
>in a letter to the numerics interest group, more precision is needed to:

One way to get "128" bit precision is with a pair of 64 bit FP numbers.
Kahan calls this "doubled" precision.  The IBM RS6000 has some support
for this, with their single-round multiply-add.  You can find the exact
product of two doubles a, b as P = round(a*b), p = a*b - P.  In their
technology book, they show how to do the product (A,a)*(B,b).
However, there is no discussion there of how to do "doubled" precision
adds.  You can do them without any special support functions, but
Kahan has advocated special hardware to make it faster.  One way is
to use a triple-add with only a single round (from Kahan):

doubled operator+(doubled a, doubled b) {
	doubled sum;
	double t1 = a.lo + b.lo;
	double t2 = add3(t1, a.hi, b.hi);
	double t3 = add3(a.hi, b.hi, -t2);
	double t4 = add3(t3, a.lo, b.lo);
	sum.hi = t2 + t4;
	sum.lo = add3(t2, t4, -sum.hi);
	return sum;
}
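
For the exact product mentioned above, a fused multiply-add (the RS/6000
has one; in modern C it is the fma() library function) gives a very short
version; a sketch, with the pair type mirroring the one used above:

#include <math.h>

typedef struct { double hi, lo; } doubled;

/* exact product of two doubles as an unevaluated pair; relies on fma()
   performing a*b - hi with a single rounding (and no over/underflow) */
static doubled two_prod(double a, double b)
{
    doubled r;
    r.hi = a * b;               /* P = round(a*b) */
    r.lo = fma(a, b, -r.hi);    /* p = a*b - P, which is exactly representable */
    return r;
}
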

I think there are other ways to make "doubled" adds go fast, but I don't
want to toot my own horn until I'm sure it's on pitch.

One nice thing about extending precision by using a pair of 64 bit
floats is that it extends infinitely to an arbitrary n-tuple of floats.

Bob

meissner@osf.org (Michael Meissner) (09/01/90)

In article <3922@bingvaxu.cc.binghamton.edu>
vu0310@bingvaxu.cc.binghamton.edu (R. Kym Horsell) writes:

| In article <MEISSNER.90Aug31101418@osf.osf.org> meissner@osf.org (Michael Meissner) writes:
| \\\
| >measurement only had 3-5 digits of accuracy.  This means that any
| >answer received cannot be more accurate than the input.  Now in order
| >to avoid round off error, you certainly need more digits internally,
| >but IEEE double gives something like 15-16 digits.  One of the problems the
| 
| Unfortunately, ``round off'' error is not the real problem.
| *Loss of significance* results when, for example, two positive FP numbers with 
| similar magnitude are subtracted. The occurence of same is not
| always predictable and the ``quick fix'' is to use more precision.
| Despite this, the cliche of failing to invert a 100 x 100 matrix
| still holds fairly well, no matter what the precision (unless we
| leave the realm of FP altogether for modular arithmetic, etc).

Using more precision still does not give you any more accuracy than
the original input.  GIGO.

| Another point, although many quantities derrived from the real world are not 
| known to high accuracy, this is certainly not true of some common physical
| constants, e.g. Plank's constant or the speed of light.

That's true, but my assertion is that the number of things that we know
to that accuracy is small compared to the number of things being
calculated.  We know pi to at least a million digits, but that doesn't
help much when multiplying a radius by 2*pi if we only know the radius
to 2 digits of accuracy.

Somebody in private email mentioned to me the problem of
handling money to 10 significant places (ie, financial transactions).  I
mentioned back that these types of calculations are (almost always)
required to be in decimal (or scaled integer), and not floating point.

Tying in with the other thread of discussion (ie, 64 bit ints), 64
bits is just barely enough bits to be able to handle COBOL's 18 digit
accuracy requirements if you are doing the calculations in integer
mode.
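
A quick check of that arithmetic, as a C fragment (variable names invented):

#include <stdio.h>

int main(void)
{
    long long cobol_max = 999999999999999999LL;   /* largest 18-digit value */
    long long int64_max = 9223372036854775807LL;  /* 2^63 - 1, a 19-digit value */
    printf("%lld\n%lld\n", cobol_max, int64_max); /* one decimal digit to spare */
    return 0;
}
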
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

khb@chiba.Eng.Sun.COM (Keith Bierman - SPD Advanced Languages) (09/01/90)

In article <MEISSNER.90Aug31101418@osf.osf.org> meissner@osf.org (Michael Meissner) writes:

   on a day to day basis?  I seem to recall that in my numerical analysis
   course 12 years ago, that it was said that your average physical
   measurement only had 3-5 digits of accuracy.  This means that any
   answer received cannot be more accurate than the input. 

No. This requires a study of estimation theory: with enough
_different_ measuring devices and a good model you can get much better
results than ANY of your inputs. There are many good textbooks and
monographs. You might try

	Factorization Methods For Discrete Sequential Estimation
	Academic Press, ISBN 0 12 097350 2

--
----------------------------------------------------------------
Keith H. Bierman    kbierman@Eng.Sun.COM | khb@chiba.Eng.Sun.COM
SMI 2550 Garcia 12-33			 | (415 336 2648)   
    Mountain View, CA 94043

cik@l.cc.purdue.edu (Herman Rubin) (09/01/90)

In article <MEISSNER.90Aug31101418@osf.osf.org>, meissner@osf.org (Michael Meissner) writes:
> In article <8442@fy.sei.cmu.edu> firth@sei.cmu.edu (Robert Firth)
> writes:
> 
> | In article <AGLEW.90Aug30210629@dwarfs.crhc.uiuc.edu> aglew@dwarfs.crhc.uiuc.edu (Andy Glew) writes:
> | >It sounds like Kahan is pushing for the 128 bit quad precision that
> | >was dropped from the IEEE FP standard. Power to him!
> | 
> | With respect, I disagree.  In my opinion, there are already far to
> | many engineers who use rotten numerical algorithms and trust to
> | double precision and dumb luck; going to quadruple precision will
> | merely encourage more of the same.
> | 
> | What I think we need is hardware interval arithmetic.  When the
> | printout shows them beyond dispute that the choice is between
> | 50 bits of noise and 100 bits of noise, perhaps they'll spend
> | more time on better algorithms and less time pushing for wrong
> | answers faster.
> 
> Have we actually gotten to the point where we need that much precision
> on a day to day basis?  I seem to recall that in my numerical analysis
> course 12 years ago, that it was said that your average physical
> measurement only had 3-5 digits of accuracy.  This means that any
> answer received cannot be more accurate than the input.  Now in order
> to avoid round off error, you certainly need more digits internally,
> but IEEE double gives something like 15-16 digits.  One of the problems the
> computer has introduced is too much exact numerical quantization (ie,
> the often quoted statistic that the average family has 2.4 children).
> It would seem to me that providing double the precision might not give
> any more accurate answers.

Several errors have been made here.  There are many situations where 
considerably more information can be obtained on output than is available
on input.  There are also other cases in which an inherently ill-conditioned
cheap method is available, and the alternatives are expensive.  This happens,
for example, in regression analysis where there is no choice of the "design"
matrix, or where only poor designs are possible.  In this case, there ARE
usually methods available, but they are much more costly.

This can occur in other situations.  It may be necessary to obtain an
integration procedure for a particular type of problem where one has
a probability distribution for which the moments are easily computed
from a few parameters, which may be inaccurate.  Nevertheless, this
does not invalidate the derived procedure, and in many cases 10-20
digits of accuracy can be lost in the computations.  This does not
mean that the final answer has lost any accuracy at all.

That the result has only a few digits of accuracy does not mean that
there is an easily available computational procedure of that type.

BTW, interval arithmetic is unlikely to help, unless the input data
are treated as exact.  Interval arithmetic exaggerates the error bounds far too much.

> There are probably groups that may need such extremes in precision,
> but are they really enough to drive the market?

Considering that the entire ALU typically costs only a small fraction
of the cost of the computer, is this a reasonable question?  Something
which adds less than $100 to the cost of a high-level PC, and only a
kilobuck to a university computing system, is not extravagant.  One
could integrate the fixed and floating point arithmetic units to save
costs, if this is a problem.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

cik@l.cc.purdue.edu (Herman Rubin) (09/01/90)

In article <1990Aug31.160357.19057@tera.com>, bob@tera.com (Bob Alverson) writes:
> In article <2868@inews.intel.com> jsweedle@mipos2.UUCP (Jonathan Sweedler) writes:

>			........................

This was given as an example of how to do certain multiple precision operations
in floating point:
 
> doubled operator+(doubled a, doubled b) {
> 	doubled sum;
> 	double t1 = a.lo + b.lo;
> 	double t2 = add3(t1, a.hi, b.hi);
> 	double t3 = add3(a.hi, b.hi, -t2);
> 	double t4 = add3(t3, a.lo, b.lo);
> 	sum.hi = t2 + t4;
> 	sum.lo = add3(t2, t4, -sum.hi);
> 	return sum;

It would be easier to do this in fixed point.  In addition, the problem of
exponent underflow, a sometimes serious one which usually gives no
indication, is completely avoided.

> I think there are other ways to make "doubled" adds go fast, but I don't
> want to toot my own horn until I'm sure its on pitch.
> 
> One nice thing about extending precision by using a pair of 64 bit
> floats is that it extends infinitely to an arbitrary n-tuple of floats.

When more precision is needed, the exponent range can easily become strained.
With 11-bit exponents and 53-bit mantissas, a 78-precision number automatically
does this.  Even in ordinary precision, exponent underflow, and sometimes
overflow, can be a serious problem.

The relatively efficient way, from the hardware standpoint, and I believe from
the software standpoint as well, to handle the problem is to provide good
integer arithmetic, such as having an exact integer product for a unit at least
as long as the longest floating point mantissa--in the usual case 64x64 -> 128.
An overflow counter would help.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

amull@Morgan.COM (Andrew P. Mullhaupt) (09/04/90)

In article <8442@fy.sei.cmu.edu>, firth@sei.cmu.edu (Robert Firth) writes:
> In article <AGLEW.90Aug30210629@dwarfs.crhc.uiuc.edu> aglew@dwarfs.crhc.uiuc.edu (Andy Glew) writes:
> With respect, I disagree.  In my opinion, there are already far to
> many engineers who use rotten numerical algorithms and trust to
> double precision and dumb luck; going to quadruple precision will
> merely encourage more of the same.

Well, people who compute without thinking usually get what they deserve,
but standards and well designed machines should not be an attempt at
idiot proofing. Going to quadruple precision _will_ allow certain
_fast_ algorithms to be used, such as using the overdetermined normal
equations to solve least squares problems to double precision accuracy
via accumulation of inner products in quad. (See Hanson and Lawson,
or Wilkinson for details.) This algorithm can be parallelized for
coarse grain multiprocessing, but the usual Householder QR is not so
simple. As someone who runs least squares problems which take hours
on multi-megaflop hardware, I have every sympathy for Kahan's proposed
high precision arithmetic.
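
A sketch of the accumulation trick in C (not Mullhaupt's code; long double
stands in here for a genuine quad format):

#include <stdio.h>

/* inner product for the normal equations, accumulated in wider precision
   and rounded back to double only once at the end */
static double dot(const double *x, const double *y, int n)
{
    long double acc = 0.0L;
    for (int i = 0; i < n; i++)
        acc += (long double)x[i] * y[i];
    return (double)acc;
}

On machines where long double is the 80-bit extended format this buys about
11 extra significand bits in the accumulator; a true quad format would buy
far more.
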
> 
> What I think we need is hardware interval arithmetic.  When the
> printout shows them beyond dispute that the choice is between
> 50 bits of noise and 100 bits of noise, perhaps they'll spend
> more time on better algorithms and less time pushing for wrong
> answers faster.

Ummm no. There are some non-obvious problems with interval arithmetic,
perhaps the best known being that Newton's method can converge in an
entirely tame way, yet the intervals blow up. (Any iteration which
has any unstable manifold is a threat to have this property. To
bring this closer to home, this would include the simplex algorithm
for linear programming, after Smale's analysis...). 


I think your problem is that you don't see those extra bits of
mantissa and exponent as memory. (What other kind of resource are they?)
This makes them available for the classical trade-off between memory
and speed. Sure, a lot of people who program computers don't know
how to write algorithms. That's no reason to make computers with
totally different arithmetic: the people who don't care to understand
today's floating point will also not care to understand tomorrow's.
On the other hand, they will usually be willing to hire someone who
does know and does care.

Later,
Andrew Mullhaupt

amull@Morgan.COM (Andrew P. Mullhaupt) (09/04/90)

In article <2494@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> A reminder that for efficient really high precision arithmetic, INTEGER
> arithmetic is needed.  

No it's not. Check out some of the new machines - i486, RS/6000, i860,
where floating point is fast enough to make integer arithmetic a
poor substitute. Also check out a SPARC machine where integer operations
for multiplication and division are so slow that even bad floating
point competes.

Later,
Andrew Mullhaupt

amull@Morgan.COM (Andrew P. Mullhaupt) (09/04/90)

In article <MEISSNER.90Aug31101418@osf.osf.org>, meissner@osf.org (Michael Meissner) writes:
> Have we actually gotten to the point where we need that much precision
> on a day to day basis?  I seem to recall that in my numerical analysis
> course 12 years ago, that it was said that your average physical
> measurement only had 3-5 digits of accuracy.  This means that any
Enough people will set this one straight for me not to have to
comment further...
> 
> There are probably groups that may need such extremes in precision,
> but are they really enough to drive the market?

Well I would suspect that the financial community will normally require
more than 3-5 digits, and I bet we buy more computers than everyone
else put together. Now the average financial measurement can live in
7-10 digits, but you have to be careful. If you are working in Yen
denominated, split adjusted equity prices (i.e. the Tokyo market), you had
better be in double precision. Do you really expect someone to want
to have to care about this stuff while he's supposed to be thinking
about finance? Well the answer is no, he doesn't want to care. So to
make everything easy you stay in double precision. And you make sure
that your tools (especially things like least squares regression) which
a lot of results will depend on, will work to that precision if at all
possible. 


Later,
Andrew Mullhaupt

cik@l.cc.purdue.edu (Herman Rubin) (09/04/90)

In article <1620@s6.Morgan.COM>, amull@Morgan.COM (Andrew P. Mullhaupt) writes:
> In article <2494@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> > A reminder that for efficient really high precision arithmetic, INTEGER
> > arithmetic is needed.  
> 
> No it's not. Check out some of the new machines - i486, RS/6000, i860,
> where floating point is fast enough to make integer arithmetic a
> poor substitute. Also check out a SPARC machine where integer operations
> for multiplication and division are so slow that even bad floating
> point competes.

You are confusing apples and oranges.  Poor integer facilities obviously
will not beat simulating integer operations in floating point.  But there
is no good reason why nxn -> 2n multiplication, for example, should not
be provided, where n is the largest word length available for floating
point.

Notice that it is simulation of integer arithmetic in floating point which
is used for high accuracy arithmetic.  It would be more efficient to provide
the integer arithmetic directly, even if it must be done in the "floating-point"
unit.  Integer arithmetic is not just for addressing and loop counting.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)	{purdue,pur-ee}!l.cc!cik(UUCP)

jgk@osc.COM (Joe Keane) (09/05/90)

I have to agree that 128-bit floating point isn't really such a hot idea.
When you get right down to it, floating point is a hack.  It's a very useful
hack; i won't argue with that.  We admit 64-bit floating point doesn't work,
so what do we do?  We provide more of the same.  Of course this is the
conservative thing to do, so i bet we'll see some big company come out with
128-bit floating point soon.

I don't care if you have 65536-bit floating point, it's still floating point.
That means underflow, overflow, round-off error, loss of precision, need i
continue?  Not to mention that it's now slug slow from all those bits.  You
can push the problems back but they don't go away.

There are a lot of machines out there that can do IEEE 64-bit floating point,
with all its precise rules and cases, but can't multiply two 32-bit integers
in a reasonable way.  What are we to make of this?  It's just dumb.  The
silicon is there, but there's no perceived demand for that feature.  I'd be so
happy to see reasonable support for multiple-precision integers in most
machines.
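
To make the complaint concrete: the full product can of course be
synthesised, it is just slower and uglier than letting the multiplier
return both halves.  A sketch in C (helper name invented):

#include <stdint.h>

/* full 64-bit product of two 32-bit integers on a machine whose multiply
   only returns the low 32 bits: schoolbook method on 16-bit halves */
static uint64_t mul32x32(uint32_t a, uint32_t b)
{
    uint32_t al = a & 0xffff, ah = a >> 16;
    uint32_t bl = b & 0xffff, bh = b >> 16;
    uint64_t mid = (uint64_t)(al * bh) + (ah * bl);   /* may carry past 16 bits */
    return ((uint64_t)(ah * bh) << 32) + (mid << 16) + (al * bl);
}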

Our current programming languages have a strong influence.  C has `float' and
`double' types, and most machines have single-precision and double-precision
floating point numbers.  Coincidence?  I think not.

Another interesting area is hardware support for arbitrary-precision real
numbers.  Of course that brings up the dreaded word `closure', at which point
most C programmers throw up their hands.  Some day i'll write a nifty tech
report about it.  Then maybe i can bribe a hardware guy to add a couple
instructions to some math co-processor.  Yeah right.

amull@Morgan.COM (Andrew P. Mullhaupt) (09/05/90)

In article <2510@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> In article <1620@s6.Morgan.COM>, amull@Morgan.COM (Andrew P. Mullhaupt) writes:
> > In article <2494@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> > > A reminder that for efficient really high precision arithmetic, INTEGER
> > > arithmetic is needed.  
> > 
> > No it's not. Check out some of the new machines - i486, RS/6000, i860,
> > where floating point is fast enough to make integer arithmetic a
> > poor substitute. Also check out a SPARC machine where integer operations
> > for multiplication and division are so slow that even bad floating
> > point competes.
> 
> You are confusing apples and oranges.  Poor integer facilities obviously
> will not beat simulating integer operations in floating point.  But thers
> is no good reason why nxn -> 2n multiplication, for example, should not
> be provided, where n is the largest word length available for floating
> point.
> 
> Notice that it is simulation of integer arithmetic in floating point which
> is used for high accuracy arithmetic.  It would be more efficient to provide
> the integer arithmetic directly, even if it must be done in the "floating-point"
> unit.  Integer arithmetic is not just for addressing and loop counting.

Yes, when you're up against it for high precision fixed point, you 
_can_ use floating point as a substitute. But for robustness and
accuracy,  I very often want floating point in the first place.

If you have several currencies, keeping accounts to the penny (or
Yen) means constant _relative_ precision, and this is almost a
specification for floating point. When you work to constant relative
precision (and most things are like this outside of number
theory), that double-width multiplication is wasteful unless it's
really fast. (This is a really good example for the point of view 
that extra precision is a memory resource to be traded off for
time: the extra bits save you the trouble of shifting your result
and adjusting the exponent.) You are not going to opt for this if
you really want constant relative precision, i.e. you don't really
care about most of those extra bits, unless there is a savings.
Floating point is also much more robust when you go beyond just
simple arithmetic operations; you have a much greater range over
which your intermediate results can vary before you have to lose
bits.

In some sense, floating point units were never really properly designed
until they were designed in a unified way (IEEE-754/854). Once it
became possible to think about the arithmetic without having to
worry about which machine's floating point arithmetic was underneath
(and this is one of the less heralded but quite important benefits of the
attack of the killer micros...) it was possible for the programmer
to begin to really understand and _rely_ on the available accuracy.
Until this point, you couldn't make an intelligent trade between
space (extra precision) and time. Now that you can, the real question
in architecture is "which one of these resources will be better used
by the programmer, and" (pace Herman) "his compiler". Let's face it;
the bulk of modern code is written via compilers and the bulk of modern
computing demands constant relative precision. Hardware floating point
is a good (effective and affordable) solution. It has the great
benefit that nearly interchangeable instructions are found across a
wide range of machines. I don't really object if good integer arithmetic
gets put onto deficient (e.g. SPARC) machines. But if I can get floating
point to pipeline at one clock per result then I'll almost always take
the floating point. 

Later,
Andrew Mullhaupt

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (09/06/90)

In article <3755@osc.COM> jgk@osc.COM (Joe Keane) writes:

| Our current programming languages have a strong influence.  C has `float' and
| `double' types, and most machines have single-precision and double-precision
| floating point numbers.  Coincidence?  I think not.

  I miss this. Single and double precision came around before C by about
a decade. Yes they went into C when it was designed. If you are implying
that C was the reason manufacturers have double in hardware, the
timeline runs the wrong way.

  And the last time I checked, Cray C didn't do hardware double for the
double type; float and double were identical. As were short, long, and
int. There is no ANSI max size, just min size, so this is a conforming
implementation on that point.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

meissner@osf.org (Michael Meissner) (09/07/90)

In article <2493@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM (Wm E
Davidsen Jr) writes:

| In article <3755@osc.COM> jgk@osc.COM (Joe Keane) writes:
| 
| | Our current programming languages have a strong influence.  C has `float' and
| | `double' types, and most machines have single-precision and double-precision
| | floating point numbers.  Coincidence?  I think not.
| 
|   I miss this. Single and double precision came around before C by about
| a decade. Yes they went into C when it was designed. If you are implying
| that C was the reason manufacturers have double in hardware, the
| timeline runs the wrong way.
| 
|   And the last time I checked, Cray C didn't do hardware double for the
| double type, float and double were identical. As were short, long, and
| int. There is no ANSI max size, just min size, so this is a conforming
| implementation on that point.

Note, C's promotion of floats into doubles is rather unusual.  Most
languages that I'm familiar with add two single precision floating
point numbers in single precision mode, rather than promoting both sides into
double precision and doing a double precision add.  This certainly
simplifies the compiler/library, in that you only have to support one
flavor of the math routines, and have fewer code patterns to deal with.

It's interesting to note that some of the PDP-11 series floating point
processors were actually faster doing double precision than doing
single precision!  On most other machines of that era, the reverse is
true, and there was a big speedup in doing single precision (which is
why most languages did not convert to double unless they needed to).
Of course some of today's floating point processors just do 80 bit
floating point internally, and take the same amount of time (possibly
being a little slower to convert a double precision value into
internal format -- I don't know for sure).
--
Michael Meissner	email: meissner@osf.org		phone: 617-621-8861
Open Software Foundation, 11 Cambridge Center, Cambridge, MA, 02142

Do apple growers tell their kids money doesn't grow on bushes?

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (09/07/90)

In article <MEISSNER.90Sep6145917@osf.osf.org> meissner@osf.org (Michael Meissner) writes:

| Note, C's promotions of floats into doubles is rather unique.  Most
| languages that I'm familar with add two single precision floating
| point in single precision mode, rather than promoting both sides into
| double precision and doing a double precision add.  This certainly
| simplifies the compiler/library, in that you only have to support one
| flavor of the math routines, and have less code patterns to deal with.

  ANSI C allows single precision. As long as the result is not
changed, the code may do what it likes. And many C compilers have done
this all along.

  Also ANSI allows float rather than double to be passed as an argument.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
    VMS is a text-only adventure game. If you win you can use unix.

vu0310@bingvaxu.cc.binghamton.edu (R. Kym Horsell) (09/07/90)

In article <3755@osc.COM> jgk@osc.COM (Joe Keane) writes:
\\\
>Our current programming languages have a strong influence.  C has `float' and
>`double' types, and most machines have single-precision and double-precision
>floating point numbers.  Coincidence?  I think not.

Eh? This is putting the cart before the horse somewhat... Fortran
predates C by a good interval and they only put DP into *Fortran*
because the h/w supported it (and they didn't want assembly-level
guys being able to get hold of it & not Fortran guys).

>Another interesting area is hardware support for arbitrary-precision real
>numbers.  Of course that brings up the dreaded word `closure', at which point
>most C programmers throw up their hands.  Some day i'll write a nifty tech
>report about it.  Then maybe i can bribe a hardware guy to add a couple
>instructions to some math co-processor.  Yeah right.

Eh? ``Closure'' may be used in a technical sense here (and, if so, has
a meaning I never came across) but _usually_ is a term for
``pure code + environment'' (c.f. Lisp & friends). What this has
to specifically do with FP and APA I _don't know_.  (I endorse
any handling of closures by hardware, however -- it's cheap in
terms of Si & well worth it in terms of performance improvements in
certain areas).

-Kym Horsell

stephen@estragon.uchicago.edu (Stephen P Spackman) (09/07/90)

In article <3945@bingvaxu.cc.binghamton.edu> vu0310@bingvaxu.cc.binghamton.edu (R. Kym Horsell) writes:
   >Another interesting area is hardware support for arbitrary-precision real
   >numbers.  Of course that brings up the dreaded word `closure', at which point
   >most C programmers throw up their hands.  Some day i'll write a nifty tech
   >report about it.  Then maybe i can bribe a hardware guy to add a couple
   >instructions to some math co-processor.  Yeah right.

   Eh? ``Closure'' may be used in a technical sense here (and, if so, has
   a meaning I never came across) but _usually_ is a term for
   ``pure code + environment'' (c.f. Lisp & friends). What this has
   to specifically do with FP and APA I _don't know_.  (I endorse
   any handling of closures by hardware, however -- it's cheap in
   terms of Si & well worth it in terms of performance improvements in
   certain areas).

No, you got it right. If you look at the *constructive* definition of
the real numbers (as opposed to the usual classical definition - in
general you will find that constructivism is the better choice for
computation anyway because the logic doesn't insist on the relevance
of the uncomputable), you'll see that they fit hand-in-glove with
computation. WITHOUT APPROXIMATION.

That's right, you heard it, the constructive formulation permits
EXACT computation with EXACT reals.

The place the closures come into it is this: that a constructive real
is (well, after a certain amount of normalisation) a FUNCTION from the
number of bits of precision you need to the bits themselves. You do
the computation; the result is a function; and you start printing.
When the user gets tired of reading digits, she hits ^C (well, I
suppose you could use format statements instead :-). Ta-dah!
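
A toy version in C, to make the idea concrete (the representation and
names are invented, plain function pointers stand in for closures, and a
real implementation would use bignums rather than 64-bit integers):

#include <stdint.h>
#include <stdio.h>

/* a "constructive real" here is a function: given n, it returns an
   integer m with |x - m/2^n| <= 1/2^n (n kept small to avoid overflow) */
typedef int64_t (*creal)(int n);

static int64_t one_third(int n) { return (INT64_C(1) << n) / 3; }

static int64_t sqrt_two(int n)             /* floor(sqrt(2) * 2^n) by bisection */
{
    int64_t target = INT64_C(2) << (2 * n), lo = 0, hi = INT64_C(2) << n;
    while (lo < hi) {
        int64_t mid = (lo + hi + 1) / 2;
        if (mid * mid <= target) lo = mid; else hi = mid - 1;
    }
    return lo;
}

int main(void)
{
    creal x = sqrt_two;
    for (int n = 10; n <= 30; n += 10)     /* ask for more bits only when wanted */
        printf("sqrt(2) to %2d bits: %.10f\n",
               n, (double)x(n) / (double)(INT64_C(1) << n));
    printf("1/3      to 30 bits: %.10f\n",
           (double)one_third(30) / (double)(INT64_C(1) << 30));
    return 0;
}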

They have two problems; the first is slowth, and the second is that
the damned things aren't (for technical reasons having to do with
decidability, roughly speaking) totally ordered (there're numbers with
no SIGN, for example, because they're unordered w.r.t 0 - but they're
all pretty damned small!).

But you haven't LIVED 'til you've coded up the One True Real Number
System on your friendly neighbourhood scheme....

Just another bit of too-little-known conceptual technology....

stephen p spackman  stephen@estragon.uchicago.edu  312.702.3982

jgk@osc.COM (Joe Keane) (09/11/90)

In article <3755@osc.COM> i write:
>Our current programming languages have a strong influence.  C has `float' and
>`double' types, and most machines have single-precision and double-precision
>floating point numbers.  Coincidence?  I think not.

In article <2493@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.com (bill davidsen)
writes:
>  I miss this. Single and double precision came around before C by about
>a decade. Yes they went into C when it was designed. If you are implying
>that C was the reason manufacturers have double in hardware, the
>timeline runs the wrong way.

In article <3945@bingvaxu.cc.binghamton.edu> vu0310@bingvaxu.cc.binghamton.edu
(R. Kym Horsell) writes:
>Eh? This is putting the cart before the horse somewhat... Fortran
>predates C by a good interval and they only put DP into *Fortran*
>because the h/w supported it (and they didn't want assembly-level
>guys being able to get hold of it & not Fortran guys).

Sorry, i'm not too clear when i'm being sarcastic.  I didn't mean to imply
that C was the source of machines having two floating-point formats, only that
they have common roots.  I guess that would be whatever machine Fortran was
first implemented on.

Suppose Fortran started out with three floating-point formats, and furthermore
suppose that C also had this.  Then i bet you'd see a lot of machines today
which directly support three different floating-point formats.  Of course you
can always make two of them the same, but most machines probably wouldn't.

Another example is that Common Lisp has multi-precision integers and short
floating-point types.  If you look at a Lisp machine, you'll find that these
types are supported directly by the microcode.  There's nothing mysterious
about this; it's quite natural.  Of course people don't build Lisp machines
any more, but that's another story...

jgk@osc.COM (Joe Keane) (09/11/90)

In article <STEPHEN.90Sep6205415@estragon.uchicago.edu>
stephen@estragon.uchicago.edu (Stephen P Spackman) writes:
>They have two problems; the first is slowth, and the second is that
>the damned things aren't (for technical reasons having to do with
>decidability, roughly speaking) totally ordered (there're numbers with
>no SIGN, for example, because they're unordered w.r.t 0 - but they're
>all pretty damned small!).

I like your post, but i'd like to point out that i don't think either of these
are really problems.

The matter of speed is due to current architectures.  Suppose we have an
architecture which has fast support for continuations, associative lookups,
all the things you want for a good system anyway.  Then we can design some
weird formats for partially-computed numbers.  If we have instructions which
work on these on do whatever amount of work you can get done in a couple
cycles, everything works out very well.  Unlike floating-point numbers, the
format only affects how fast the result is computed, not the value of the
result.  So 1/3*3 is 1 no matter what machine you're on.  In fact, i'd argue
exactly the opposite of the objection.  On-demand precision is inherently
faster because you always do exactly enough work to get the result you want,
and no more.

The second problem is really a theoretical limitation which applies to all
computations.  It says we can't always be sure whether two numbers are equal.
For example, suppose we compute pi by two different methods and then ask to
compare them.  The answer from a system with arbitrary-precision is something
like ``I've computed them both to 100 decimal digits, and they agree this far.
Do you want to extend the computation?''  In contrast, the floating-point
system just makes up an answer, either ``The first is bigger by exactly
2^-53.''  (wrong) or ``They're exactly equal.'' (how does it know?).  I don't
know about you guys, but i appreciate computer systems being honest.

Just another theoretical post from the desk of...
[I don't actually have a signature.]

ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (09/12/90)

In article <3785@osc.COM>, jgk@osc.COM (Joe Keane) writes:
> Suppose Fortran started out with three floating-point formats, and furthermore
> suppose that C also had this.  Then i bet you'd see a lot of machines today
> which directly support three different floating-point formats.

C *has* three floating-point types:  float, double, and long double.
Perhaps there aren't a lot of _architectures_ with three floating-point
types (although the VAX has four, and some IBM 370s have three), but a
lot of _machines_ have been shipped with IEEE single, double, and
extended formats (supported by a coprocessor chip).

> Another example is that Common Lisp has multi-precision integers and short
> floating-point types.  If you look at a Lisp machine, you'll find that these
> types are supported directly by the microcode.

Yes, but here it is a very clear case of the language driving the machine.
Bignums (integers which are whatever size they need to be, in a language
where your program doesn't have to _say_ what size they'll be) are there
because people were writing computer algebra programs and had a _use_ for
bignums.  On Xerox Lisp machines, bignums are actually handled in Lisp
code, _not_ in microcode.  Some models of the Xerox Lisp machines handled
floating-point in Lisp, some didn't.  It's an engineering tradeoff.

Many "stock" architectures (/370, 680x0, 80*86, 32?32, ... provide all
the primitives (add with carry, n*n -> 2n multiply, and so on) that are
needed to synthesise integer arithmetic of any size.  (Perhaps this is
a lingering relic of COBOL?)  I have never understood why Pascal compilers
don't exploit this to let you declare integers of whatever size you need.
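
A sketch of what those primitives buy you, in C (word size and names chosen
arbitrarily):

#include <stdint.h>

/* multi-word addition built on the add-with-carry idea: words are stored
   least significant first, and the carry out of each word feeds the next */
static void add_wide(uint32_t *sum, const uint32_t *a,
                     const uint32_t *b, int nwords)
{
    uint64_t carry = 0;
    for (int i = 0; i < nwords; i++) {
        uint64_t t = (uint64_t)a[i] + b[i] + carry;
        sum[i] = (uint32_t)t;         /* low 32 bits of the word sum */
        carry = t >> 32;              /* 0 or 1, carried into the next word */
    }
}

/* the n*n -> 2n primitive, expressed directly */
static uint64_t mul_wide(uint32_t a, uint32_t b)
{
    return (uint64_t)a * b;
}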

-- 
Heuer's Law:  Any feature is a bug unless it can be turned off.