[comp.arch] Fast conversions, another urban myth?

albaugh@dms.UUCP (Mike Albaugh) (09/16/89)

	Sorry for the cross-post, but I'm looking for data from
comp.lang.misc or algorithms from there or comp.arch. Besides,
this question stems from a recent bout on comp.arch.

	Anyway, the "conventional wisdom" on comp.arch was that:

	BCD math is inherently slow
	Cobol doesn't do much math anyway
  ergo: "modern" processors don't need BCD math, just convert to
	binary and do the math there, then back to BCD for output.

	A counter-example was given of the HPPA, which had decimal
correction instructions, but the general attitude seemed to be as
expressed above. This seems to imply that the conversion to binary
(and back) is free, or at least cheap, or that, contrary to the
conventional wisdom, COBOL _does_ do enough math to "buy back" the
cost of conversion. If the latter, I'm only curious and would welcome
supporting statistics. If the former, then someone, somewhere has some
really spiff BCD->binary->BCD routines, because the best I can come up
with have the following sort of "costs" (where a 32-bit binary add is 1)

	BCD add 	2-8
	BCD->Binary 	9-30
	Binary->BCD 	16(_very_ optimistic)-100

so to "pay back" the cost of conversion, an 8-digit BCD number would
need to participate in at least 3 additions (fastest convert, slowest
add), and this assumes that the 1-cycle multiply needed for the fastest
convert cannot be used in the BCD add. Realistic guesses are more like
10 additions to pay for conversion. When all those numbers like SSN and
part-number that are not really arithmetic at all get averaged in, it
starts looking pretty unlikely from where I sit. Yes, I know SSN _could_
be stored as a string, but consider that a lot of those COBOL programs
were written when 10Meg was a _big_ disk, so wanton waste of 4.5 bytes
on each of some large number of records was not condoned.
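
[For illustration, the straightforward digit-at-a-time routines behind
estimates like these might look like the C sketch below (8 packed digits
in a 32-bit word assumed; not Mike's actual code).  The per-digit
multiply and, above all, the per-digit divide are where the cost ranges
come from.]

        #include <stdint.h>

        uint32_t bcd_to_bin(uint32_t bcd)   /* one multiply-add per digit */
        {
            uint32_t bin = 0;
            for (int i = 28; i >= 0; i -= 4)
                bin = bin * 10 + ((bcd >> i) & 0xF);
            return bin;
        }

        uint32_t bin_to_bcd(uint32_t bin)   /* one divide + remainder per digit */
        {
            uint32_t bcd = 0;
            for (int i = 0; i < 32; i += 4) {
                bcd |= (bin % 10) << i;
                bin /= 10;
            }
            return bcd;
        }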

	I realize that not all (many?) of the numbers involved are
8 BCD digits, but all the faster algorithms I am aware of "scale"
within the range of my approximations, and all are hit in proportion
(i.e. conversion speeds up about as much as BCD add does). Actually
the "sawtooth" of time vs number of digits hits bin->BCD a bit harder
than the other two, when we exceed the word length.

	To forestall possible flames: I am _not_ contending that
"modern" processors need BCD in hardware. HP's mix may not be yours.
But BCD library routines are not really tough on most processors.

	So, where did I go wrong, or should I say, where do I find
that blazing Binary->BCD routine? :-)

				Mike
			(I'll take my answer off the air)

| Mike Albaugh (albaugh@dms.UUCP || {...decwrl!pyramid!}weitek!dms!albaugh)
| Atari Games Corp (Arcade Games, no relation to the makers of the ST)
| 675 Sycamore Dr. Milpitas, CA 95035		voice: (408)434-1709
| The opinions expressed are my own (Boy, are they ever)

ok@cs.mu.oz.au (Richard O'Keefe) (09/16/89)

In article <832@dms.UUCP>, albaugh@dms.UUCP (Mike Albaugh) writes:
[asking about BCD -vs- binary arithmetic]

Burroughs considered this when designing the B6700.  There was a paper
about it, but I'm afraid I've lost the reference.  Their decision was
to concentrate on making binary<->decimal conversion fast (love that
ScaleRightFinal opcode).  There are several points to note:
(1) COBOL requires that 18-digit arithmetic be supported (~60 bits).
    [B6700s could do arithmetic on 78-bit+sign binary integers.]
(2) COBOL doesn't just require arithmetic on "BCD" (alias "packed decimal").
    It requires that arithmetic on text (USAGE DISPLAY) (ASCII or EBCDIC
    or whatever) be supported as well.  So just having packed decimal
    arithmetic doesn't solve everything, legal COBOL programs _still_
    require conversion, so a compiler has to be ready to optimise
    text<->BCD conversion, and if it is, it has most of the machinery
    in place to optimise BCD<->binary conversion.
(3) It is often possible to amortize the cost of a conversion over
    several uses.  For example,
	77 STUPID-EXAMPLE PIC S9(12) USAGE DISPLAY.
	77 STUPID-RESULT PIC S9(12) USAGE DISPLAY.
	...
	MOVE ZEROS TO STUPID-RESULT.
	PERFORM ADD-EXAMPLE VARYING I FROM 1 TO 50.
	...
    ADD-EXAMPLE.
	ADD STUPID-EXAMPLE TO STUPID-RESULT.

    A smart compiler can treat this as

	long long int temp1, temp2;

	temp1 = text_to_binary(STUPID-EXAMPLE, S9_12);
	temp2 = 0;
	for (i = 1; i <= 50; i++) temp2 += temp1;
	binary_to_text(STUPID-RESULT, temp2, S9_12);

>[unit cost is 32-bit binary add = 1] 
> 	BCD add 	2-8
> 	BCD->Binary 	9-30
> 	Binary->BCD 	16(_very_ optimistic)-100

Don't forget your BCD multiplications and your BCD divisions!  If you're
not going to provide them, your compiler still has to hack BCD<->binary.
The conversion estimates here are FAR too high; 9 decimal <-> 30 binary
conversion can be done much faster than that with the aid of a few tables.
(You don't have to convert one digit at a time, you know!)
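
[A sketch of the table idea, for 8 packed digits in a 32-bit word; the
exact tables O'Keefe has in mind may differ, but the point is that the
work goes two digits (one byte) at a step rather than one digit at a
time.]

        #include <stdint.h>

        static uint8_t byte_to_val[256];    /* 0x47 -> 47                */
        static uint8_t val_to_byte[100];    /* 47 -> 0x47                */

        static void init_tables(void)       /* call once at start-up     */
        {
            for (int v = 0; v < 100; v++) {
                val_to_byte[v] = (uint8_t)(((v / 10) << 4) | (v % 10));
                byte_to_val[val_to_byte[v]] = (uint8_t)v;
            }
        }

        uint32_t bcd_to_bin_bytewise(uint32_t bcd)  /* 4 lookups, 3 constant multiplies */
        {
            return (uint32_t)byte_to_val[(bcd >> 24) & 0xFF] * 1000000u
                 + (uint32_t)byte_to_val[(bcd >> 16) & 0xFF] * 10000u
                 + (uint32_t)byte_to_val[(bcd >>  8) & 0xFF] * 100u
                 + (uint32_t)byte_to_val[ bcd        & 0xFF];
        }

        uint32_t bin_to_bcd_bytewise(uint32_t bin)  /* bin < 100000000 assumed */
        {
            uint32_t bcd = 0;
            for (int i = 0; i < 32; i += 8) {   /* two digits per divide by 100 */
                bcd |= (uint32_t)val_to_byte[bin % 100] << i;
                bin /= 100;
            }
            return bcd;
        }

Bigger tables (four digits at a step) trade table space for even fewer steps.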

> add), and this assumes that the 1-cycle multiply needed for the fastest
> convert cannot be used in the BCD add.
What multiply?  BCD<->binary conversions don't need any multiplications
or divisions!
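
[One way to see this -- a C sketch using only shifts, adds and compares.
The binary-to-BCD loop is the well-known "shift and add 3" (double
dabble) method, not necessarily what O'Keefe means; 8 packed digits in a
32-bit word are assumed.]

        #include <stdint.h>

        uint32_t bcd_to_bin_nomul(uint32_t bcd)   /* x*10 == (x<<3) + (x<<1) */
        {
            uint32_t bin = 0;
            for (int i = 28; i >= 0; i -= 4)
                bin = (bin << 3) + (bin << 1) + ((bcd >> i) & 0xF);
            return bin;
        }

        uint32_t bin_to_bcd_nodiv(uint32_t bin)   /* bin < 100000000 assumed */
        {
            uint32_t bcd = 0;
            for (int i = 0; i < 32; i++) {
                /* bump any digit >= 5 by 3 so the coming shift carries
                   it into the next decimal place */
                for (int j = 0; j < 32; j += 4)
                    if (((bcd >> j) & 0xF) >= 5)
                        bcd += 3u << j;
                bcd = (bcd << 1) | (bin >> 31);   /* shift in the next bit */
                bin <<= 1;
            }
            return bcd;
        }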

> 	So, where did I go wrong, or should I say, where do I find
> that blazing Binary->BCD routine? :-)

I've got some code somewhere, if I can dig it up I'll post it.

barry@PRC.Unisys.COM (Barry Traylor) (09/18/89)

In article <832@dms.UUCP> albaugh@dms.UUCP (Mike Albaugh) writes:
> ...
>	So, where did I go wrong, or should I say, where do I find
>that blazing Binary->BCD routine? :-)
>

I'm afraid I can't state specifics, but the large A Series stuff (A12/15/17)
does it with a fair amount of silicon and not very much microcode.  There
is a raging and recurring debate between the A Series types (including
myself) and the V Series types (V Series stuff is DECIMAL, including
addressing and peripherals, at least historically) about decimal arithmetic
vs convert in / binary ops / convert out.  All I can say is that with the
proper amount of silicon, OUR method beats THEIR method.  

Also keep in mind that with decimal/binary convert ops, all of the binary 
arithmetic ops, library routines, etc, etc, can be brought to bear.  The 
real trick, as has already been mentioned, is to make the convert ops as 
fast as possible.  Unfortunately this is not something I can discuss in 
such a forum.  We have to keep our engineering advantages, where they exist.

....

Adding fuel to a different fire, while our instruction set includes a lot
of ops, many are special ops for the operating system and many more are
string manipulation ops.  Nonetheless, we have 1 unprotected (vanilla)
load op, two unprotected (vanilla) store ops, 3 array reference generation
ops, 1 each of add, subtract, multiply, divide, remainder divide, integer
divide, integerize (rounded), integerize (truncated), 1 BCD convert in and
2 (I think) BCD convert out ops.  Pretty reduced, I'd say.  And this is an
instruction set that has been around in pretty much the same form for 20
years.

One other nice thing:  our integer data type is actually a subset of our
floating data type.  All of the arithmetic ops work against either floating
or integer, single or double precision data.

Barry Traylor
Unisys A Series Engineering
Paoli, Pa
barry@prc.unisys.com

scott@bbxsda.UUCP (Scott Amspoker) (09/18/89)

In article <832@dms.UUCP> albaugh@dms.UUCP (Mike Albaugh) writes:
>
>	Anyway, the "conventional wisdom" on comp.arch was that:
>
>	BCD math is inherently slow
>	Cobol doesn't do much math anyway
>  ergo: "modern" processors don't need BCD math, just convert to
>	binary and do the math there, then back to BCD for output.
>

This has come up a lot with us since we do business-oriented software.
I have come to discover that there are a few other considerations
when converting BCD to binary and back again:

1)  Rounding errors can and do creep in.

2)  Much business software requires rounding of results to a specified
    number of decimal places.  This cannot be done easily in binary.
    One solution is to do it after you return to BCD but you might
    end up rounding up and then you're back to messy BCD digit
    manipulation.

If decimal precision is required one can always consider a binary
format that has an implicit decimal point on the right rather than
the left.  There are other problems associated with that too.
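
[A sketch of that "implicit decimal point on the right" format in C:
amounts kept as binary integers scaled by a power of ten, with rounding
to a requested number of places done in integer arithmetic.  The scale
factor and the half-away-from-zero rule are illustrative choices, not
Scott's format.]

        #include <stdint.h>

        #define SCALE 4                     /* implied decimal places: 1.0 == 10000 */

        static const int64_t pow10_tab[] = { 1, 10, 100, 1000, 10000 };

        /* Round a scaled amount to `places` decimal places, half away from zero. */
        int64_t round_places(int64_t amt, int places)
        {
            int64_t q = pow10_tab[SCALE - places];      /* places=2 -> q=100 */
            return amt >= 0 ?  ((amt + q / 2) / q) * q
                            : -((-amt + q / 2) / q) * q;
        }

        /* Example: 7.5% interest on $1234.56, rounded to the cent:
           round_places(12345600 * 75 / 1000, 2) == 925900, i.e. $92.59  */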

mash@mips.COM (John Mashey) (09/19/89)

In article <11488@burdvax.PRC.Unisys.COM> barry@PRC.Unisys.COM (Barry Traylor) writes:

>I'm afraid I can't state specifics, but the large A Series stuff (A12/15/17)
>does it with a fair amount of silicon and not very much microcode.  There
>is a raging and recurring debate between the A Series types (including
>myself) and the V Series types (V Series stuff is DECIMAL, including
>addressing and peripherals, at least historically) about decimal arithmetic
>vs convert in / binary ops / convert out.  All I can say is that with the
>proper amount of silicon, OUR method beats THEIR method.  

It's good to see this topic discussed a bit more.  Can you share any
info, like frequencies of operations, i.e., info that isn't
implementation-secret?

>2 (I think) BCD convert out ops.  Pretty reduced, I'd say.  And this is an
>instruction set that has been around in pretty much the same form for 20
>years.

Although I think RISC is a much better tradeoff, for various reasons,
and the A-series is fairly far in the other direction (it's not the
number of operations, it's more the complexity of addressing modes,
and other issues), I've long been an admirer of the philosophy behind
the B-5000 (the long-ago ancestor of the A-series).  At least it was the
first major computer line I can think of that really showed a strong
vision of designing the hardware and software closely together.
Also, for the early 60s, it was amazingly far ahead of its time in many ways.
At least a brief look at this architecture is well worth the time
for anyone interested in computer architecture, even if it's not the
direction of most current machines.

As a related topic, wearing my SPEC hat, does anybody out there have
GOOD COBOL benchmarks that have a chance of following SPEC rules:
	1) >= 60 seconds on 8-mips machine, preferably more.
	2) Public-domain, or able to be widely distributable in source form.
	3) Reasonably portable, or able to be made portable.
	4) Correct execution can be verified mechanically in some way.
I REALLY want to get some such COBOL programs.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

ECULHAM@UALTAVM.BITNET (Earl Culham) (09/19/89)

In article <832@dms.UUCP>, albaugh@dms.UUCP (Mike Albaugh) writes:
 
>
>        Sorry for the cross-post, but I'm looking for data from
>comp.lang.misc or algorithms from there or comp.arch. Besides,
>this question stems from a recent bout on comp.arch.
>
>        Anyway, the "conventional wisdom" on comp.arch was that:
>
>        BCD math is inherently slow
>        Cobol doesn't do much math anyway
>  ergo: "modern" processors don't need BCD math, just convert to
>        binary and do the math there, then back to BCD for output.
>
>        ...
>
>with have the following sort of "costs" (where a 32-bit binary add is 1)
>
>        BCD add         2-8
>        BCD->Binary     9-30
>        Binary->BCD     16(_very_ optimistic)-100
>
>so to "pay back" the cost of conversion, an 8-digit BCD number would
>need to participate in at least 3 additions (fastest convert, slowest
>add), and this assumes that the 1-cycle multiply needed for the fastest
>convert cannot be used in the BCD add. Realistic guesses are more like
>10 additions to pay for conversion. When all those numbers like SSN and
>        ...
>        So, where did I go wrong, or should I say, where do I find
>that blazing Binary->BCD routine? :-)
 
This is such a classic RISC discussion that I couldn't resist putting
it in the news instead of in direct mail.
 
First, most machines that provide decimal hardware also provide the
hardware to convert to and from binary. This skews your numbers
slightly. (Numbers from an older Amdahl.)
 
    Binary add        1
    Decimal add       6.5-23.5  (high end is for 16 digit numbers)
    Binary to decimal 3.5-20.5
    Decimal to binary 21
 
This gives us a worst case of about 7 to 1. (Every add is first
converted to binary, done, then converted back. Like I say, this is
the *WORST* case. :-) )
 
Now, how much time would be spent doing decimal arithmetic (even on
a machine with fast decimal hardware)? Likely less than 1%, but certainly
less than 10%.
 
If we throw away the hardware, and do the decimal arithmetic in software,
we get a lot simpler machine to build. But, what do the performance
numbers look like?
 
Hardware ==>   .9 + 1(.1) = 1
Software ==>   .9 + 7(.1) = 1.6
 
I've clearly exaggerated the performance gain of the decimal hardware.
Yet even with that, all that extra hardware cannot even double the
performance of the system.
 
Clearly, decimal arithmetic is one of those high cost, low payback
extensions. We should direct our efforts elsewhere.

barry@PRC.Unisys.COM (Barry Traylor) (09/20/89)

In article <27768@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
>
>It's good to see this topic discussed a bit more.  Can you share any
>info, like frequencies of operations, i.e., info that isn't
>implementation-secret?

Argh!  I'm not quite sure what the frequencies of the ops are.  We have done
a fair amount of program analysis and backplane probing.  My impression,
talking to the analyzers and probers, is that arithmetic in COBOL is rare
compared to just about everything else.  There seems to be a lot of
character-oriented data movement.  As has already been mentioned in this
forum, a lot of COBOL computation is done with non-packed character
strings.  Some thought was given to this in our architecture and the result
is that the conversion from EBCDIC to binary is not much worse than from
BCD to binary.  The compiler generated code looks pretty good too.  Our
binary to BCD ops are actually decimal scaling ops.  This seems to satisfy
the rounding problems (although I'd be hard pressed to answer the "why"
question; after all, I'm an operating system designer, though I seem to
be spending increasing amounts of time with the hardware folks).

Keeping in mind that ALL of our I/O routines are in the operating system
and that there is virtually no penalty for the context switch into the OS,
we have found with our probes that most COBOL programs spend 80% or more of
their time doing I/O.  That interesting discovery led to some interesting
biasing of the instruction set performance to handling OS code, with very
gratifying overall results.  It also led to the offloading of process
switching and I/O processing code into special purpose processors, also
giving gratifying results.

Barry Traylor
Unisys A Series Engineering
barry@prc.unisys.com

sbf10@uts.amdahl.com (Samuel Fuller) (09/21/89)

In article <688@UALTAVM.BITNET> ECULHAM@UALTAVM.BITNET writes:
>If we throw away the hardware, and do the decimal arithmetic in software,
>we get a lot simpler machine to build. But, what do the performance
>numbers look like?
> 
>Hardware ==>   .9 + 1(.1) = 1
>Software ==>   .9 + 7(.1) = 1.6
> 
>I've clearly exaggerated the performance gain of the decimal hardware.
>Yet even with that, all that extra hardware cannot even double the
>performance of the system.
> 
>Clearly, decimal arithmetic is one of those high cost, low payback
>extensions. We should direct our efforts elsewhere.

If a few extra chips can give a machine 60% better performance
on the workloads that customers run, then it makes no sense to
not put the chips in.
A little extra complexity in the CPU will not increase the overall
system cost by 60%.  However, the 60% increase in performance over
competitors (who went the software route) will allow a 60% premium
in the price of my machines, and it's all gravy.

If you are building scientific workstations then decimal hardware will
be just wasted gates.

If you are building business machines, where COBOL is still the language
of choice, then to not do decimal arithmetic in hardware is taking a
very big chance.
-- 
---------------------------------------------------------------------------
Sam Fuller / Amdahl System Performance Architecture

I speak for myself, from the brown hills of San Jose.

UUCP: {ames,decwrl,uunet}!amdahl!sbf10 | USPS: 1250 E. Arques Ave (M/S 139)
INTERNET: sbf10@amdahl.com             |       P.O. Box 3470
PHONE: (408) 746-8927                  |       Sunnyvale, CA 94088-3470
---------------------------------------------------------------------------

mash@mips.COM (John Mashey) (09/21/89)

In article <9dAz02zs58y201@amdahl.uts.amdahl.com> sbf10@amdahl.uts.amdahl.com (Samuel Fuller) writes:
>In article <688@UALTAVM.BITNET> ECULHAM@UALTAVM.BITNET writes:
>>If we throw away the hardware, and do the decimal arithmetic in software,
>>we get a lot simpler machine to build. But, what do the performance
>>numbers look like?
>> 
>>Hardware ==>   .9 + 1(.1) = 1
>>Software ==>   .9 + 7(.1) = 1.6
 
>>I've clearly exaggerated the performance gain of the decimal hardware.
>>Yet even with that, all that extra hardware cannot even double the
>>performance of the system.
>> 
>>Clearly, decimal arithmetic is one of those high cost, low payback
>>extensions. We should direct our efforts elsewhere.
>
>If a few extra chips can give a machine 60% better performance
>on the workloads that customers run, then it makes no sense to
>not put the chips in.

If a modest amount of hardware makes this much difference, people
would certainly want to put it in.

The REAL question is: how much difference does it really make?
Everybody please remember this was a "what-if" analysis, not an
analysis that claimed that it really cost 1.6X in performance.
At least several vendors who have looked at this and did care about
COBOL didn't think that it was this much degradation to overall system
performance, or at least, in designing their RISCs, included a few
simple instructions to help the issue, and chose to stop there.
-- 
-john mashey	DISCLAIMER: <generic disclaimer, I speak for me only, etc>
UUCP: 	{ames,decwrl,prls,pyramid}!mips!mash  OR  mash@mips.com
DDD:  	408-991-0253 or 408-720-1700, x253
USPS: 	MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086

davecb@yunexus.UUCP (David Collier-Brown) (09/21/89)

mash@mips.COM (John Mashey) writes:
>The REAL question is: how much difference does it really make?
>Everybody please remember this was a "what-if" analysis, not an
>analysis that claimed that it really cost 1.6X in performance.
>At least several vendors who have looked at this and did care about
>COBOL didn't think that it was this much degradation to overall system
>performance, or at least, in designing their RISCs, included a few
>simple instructions to help the issue, and chose to stop there.

  I wonder if there is enough evidence about COBOL use to validate
a technique used both in elderly CISC machines and tiny, archaic ICL
micros: doing arithmetic on ascii strings directly.

  I once had to recode an ICL package to increase the maximum
length of string that could be added (:-(), and noted that the
inner loop amounted to a very small number of instructions.  This
raises the possibility that one could drag this old technique out
and apply it to modern machines, possibly with a tad of hardware
assist, much like the decimal normalize instructions discussed before.
  But the ICL did ascii (well, excess-3) arithmetic because it was
easy, not because it ran COBOL (it did not), and the CISC had both
conversion and ascii-arithmetic instructions...  Which means that
these techniques were not necessarily backed up by evidence.

  Who has evidence?  And how "easy" is, for example,
	word *add(word *,word*); /* A "word" is 2 ascii characters. */
on a given RISC machine?

--dave
-- 
David Collier-Brown,  | davecb@yunexus, ...!yunexus!davecb or
72 Abitibi Ave.,      | {toronto area...}lethe!dave 
Willowdale, Ontario,  | Joyce C-B:
CANADA. 416-223-8968  |    He's so smart he's dumb.

scott@bbxsda.UUCP (Scott Amspoker) (09/21/89)

In article <27935@winchester.mips.COM> mash@mips.COM (John Mashey) writes:
 >The REAL question is: how much difference does it really make?
 >Everybody please remember this was a "what-if" analysis, not an
 >analysis that claimed that it really cost 1.6X in performance.
 >At least several vendors who have looked at this and did care about
 >COBOL didn't think that it was this much degradation to overall system
 >performance, or at least, in designing their RISCs, included a few
 >simple instructions to help the issue, and chose to stop there.

If I may throw in my $.02 worth:  Decimal arithmetic is alive and
well and not merely an artifact of old COBOL implementations.  We
do everything in C and require decimal arithmetic because of the
business nature of our applications.  It is a constant disappointment
when the scientific-oriented hardware industry forgets that some of
us need to have 100 pennies == 1 dollar.  Interest calculations
carried out to 3 or 4 decimal places need to round correctly.
Converting to binary and back to decimal is not the answer (as
I explained in a previous posting).

Fortunately, business applications are usually I/O bound so the
decimal math can be simulated in software without a perceived
performance penalty (unless you bring in an unethical benchmark).

If decimal add and subtract are too costly in the hardware then
why not a simple decimal adjust instruction (like on the 80x86)?
That instruction alone is a great savings when you consider the
software alternatives.  The 3B2 has *nothing* to help decimal
arithmetic.

-- 
Scott Amspoker
Basis International, Albuquerque, NM
(505) 345-5232

preston@titan.rice.edu (Preston Briggs) (09/22/89)

In article <136@bbxsda.UUCP> scott@bbxsda.UUCP (Scott Amspoker) writes:

>Fortunately, business applications are usually I/O bound so the
>decimal math can be simulated in software without a perceived
>performance penalty (unless you bring in an unethical benchmark).

But isn't this what the chip-makers say?
If a function is so unimportant that it can be simulated in software
without performance penalty, then it seems better not
to spend hardware on it.

beyer@cbnewsh.ATT.COM (jean-david.beyer) (09/22/89)

In article <136@bbxsda.UUCP>, scott@bbxsda.UUCP (Scott Amspoker) writes:
> 
> If I may throw in my $.02 worth:  Decimal arithmetic is alive and
> well and not merely an artifact of old COBOL implementations.  We
> do everything in C and require decimal arithmetic because of the
> business nature of our applications.  It is a constant disappointment
> when the scientific-oriented hardware industry forgets that some of
> us need to have 100 pennies == 1 dollar.  Interest calculations

As a former scientific computer user, I hardly consider that the
computer hardware industry is scientific-oriented. Shrinking wordsize
from 36 bits to 32 bits, screwing up round-off algorithms for a long time,
and so on. Begrudgingly adding floating-point operations to obviously
business-oriented machines. I do not know who the machines are really
designed for. (I recognize that some manufacturers may make machines
suited to the needs of the user, but why do they seem to be the exceptions?)


I have never done business-oriented calculations, but  it seems to me,
as an outsider, that calculating everything in pennies solves the round-off
problems (but perhaps no others), doesn't it?

-- 
Jean-David Beyer
AT&T Bell Laboratories
Holmdel, New Jersey, 07733
attunix!beyer

scott@bbxsda.UUCP (Scott Amspoker) (09/22/89)

In article <4125@cbnewsh.ATT.COM> beyer@cbnewsh.ATT.COM (jean-david.beyer) writes:
>>                                  .....It is a constant disappointment
>> when the scientific-oriented hardware industry forgets that some of
>> us need to have 100 pennies == 1 dollar.
>
>As a former scientific computer user, I hardly consider that the
>computer hardware industry is scientific-oriented.

OK, they're pissing *everybody* off.  :-)

I haven't done any real number crunching since college.  Professionally
I deal with business applications.  It's not as boring as many people
think.  (and I know where my bread is buttered).  I've always considered
companies like DEC and HP to be more scientific whereas IBM was business
oriented.

>I have never done business-oriented calculations, but  it seems to me,
>as an outsider, that calculating everything in pennies solves the round-off
>problems (but perhaps no others), doesn't it?

Because sometimes you need 3, 4, or more decimal places for calculations,
especially in some interest calculations.  People have inventories where
the unit price of a small item is a fraction of a penny (such items
are bought in *large* quantities).  I've seen the decimal scaling
approach work before but it's not always practical in a general purpose
environment.


-- 
Scott Amspoker
Basis International, Albuquerque, NM
(505) 345-5232

henry@utzoo.uucp (Henry Spencer) (09/23/89)

In article <136@bbxsda.UUCP> scott@bbxsda.UUCP (Scott Amspoker) writes:
>...do everything in C and require decimal arithmetic because of the
>business nature of our applications.  It is a constant disappointment
>when the scientific-oriented hardware industry forgets that some of
>us need to have 100 pennies == 1 dollar.  Interest calculations
>carried out to 3 or 4 decimal places need to round correctly.
>Converting to binary and back to decimal is not the answer...

Can you explain why binary integer arithmetic is less accurate than
decimal integer arithmetic for your applications?  Sounds like you're
confusing the decimal/binary choice with the integer/floating-point
choice; the two are orthogonal and independent.  (Yes, there are
decimal floating-point boxes in the world, e.g. in your pocket
calculator.)
-- 
"Where is D.D. Harriman now,   |     Henry Spencer at U of Toronto Zoology
when we really *need* him?"    | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

pmontgom@sonia.math.ucla.edu (Peter Montgomery) (09/23/89)

In article <3902@yunexus.UUCP> davecb@yunexus.UUCP (David Collier-Brown) writes:
>mash@mips.COM (John Mashey) writes:
>  But the ICL did ascii (well, excess-3) arithmetic because it was
>easy, not because it ran COBOL (it did not), and the CISC had both
>conversion and ascii-arithmetic instructions...  Which means that
>these techniques were not necessarily backed up by evidence.
>
>  Who has evidence?  And how "easy" is, for example,
>	word *add(word *,word*); /* A "word" is 2 ascii characters. */
>on a given RISC machine?
>

	Let W1 and W2 be two words containing decimal numbers in ASCII.
I assume that W1, W2 contain only digits (no signs, decimal points, 
or blanks) and that the less significant ASCII digit
is stored in the less significant binary position.  Recall the ASCII
digits are 0x30 hex (= 48 decimal) to 0x39 (= 57).

	We want to form (cnew, Wsum) = W1 + W2 + cold where cold, cnew 
are binary carries (0 or 1).  A first algorithm could resemble:

	W3 = W1 + W2 + cold - 0x3030;
	if ((W3 & 0x00FF) > 0x0039)
		W3 = W3 + (0x0100 - 0x000A);  /* Carry ones to tens */
	if ((W3 & 0xFF00) > 0x3900) {
		W3 = W3 - 0x0A00;
		cnew = 1;		    /* Carry tens to hundreds */	
	} else { 
		cnew = 0;
	}

If no (decimal) carries during the ASCII add, then the first
estimate for W3 will be correct.  The above algorithm separately checks
each byte in W3 for validity, generating a carry if needed.
We can avoid the IFs by checking all the places at once,
but must be prepared for carry propagation
(e.g., 434 + 168 = 602; the carry from the units digits
affects the hundreds digit).  Here is a more sophisticated
algorithm, which adds two 4-byte ASCII strings (32-bits each), 
producing sum and carry:

	T = W1 + W2 + cold - 0x30303030;
	Tc = (0x39393939 - T) & 0x80808080;
			/* Identify where carries generated */
	W3 = T + (256 - 10)*(Tc >> 7);
	cnew = Tc >> 31;

This code uses 5 integer additions/subtractions, 2 shifts, 1 AND, 
1 integer multiply.  If your RISC architecture lacks an integer 
multiply (shame!), then use (note four adjacent bits will be set
in Tc wherever a carry is generated, and 256 - 10 = 16*15 + 16*3/8):

	T = W1 + W2 + cold - 0x30303030;
	Tc = (0x39393939 - T) & 0xF0F0F0F0;
	W3 = T + Tc + ((Tc & 0x30303030) >> 3);
	cnew = Tc >> 31;

	If the ASCII digits are stored in a string,
this algorithm requires you be able to convert a four-character
string to a 32-bit binary number and vice versa.
Just yesterday, I wanted this operation in order to use
the system hostname as part of the initial seed for a random number
generator, but I didn't know how to do it easily in C or FORTRAN 77
(the TRANSFER intrinsic function in the FORTRAN 8x draft may do it).
How many higher level languages make this primitive operation available?
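
[In C, at least, the four-character-string <-> 32-bit-word move is just
a copy; whether the first character lands in the most or least
significant byte depends on the machine's byte order, so a byte swap may
be needed to match the digit order assumed above.  A sketch, with names
of my own choosing:]

        #include <stdint.h>
        #include <string.h>

        uint32_t chars_to_word(const char s[4])
        {
            uint32_t w;
            memcpy(&w, s, 4);       /* well-defined, unlike a pointer cast */
            return w;
        }

        void word_to_chars(uint32_t w, char s[4])
        {
            memcpy(s, &w, 4);
        }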

--------
        Peter Montgomery
        pmontgom@MATH.UCLA.EDU 

dibble@cs.rochester.edu (Peter C. Dibble) (09/24/89)

In article <688@UALTAVM.BITNET> ECULHAM@UALTAVM.BITNET writes:
>In article <832@dms.UUCP>, albaugh@dms.UUCP (Mike Albaugh) writes:
> <<maybe hardware decimal support would go 10% faster than software>>

>I've clearly exaggerated the performance gain of the decimal hardware.
>Yet even with that, all that extra hardware cannot even double the
>performance of the system.
> 
>Clearly, decimal arithmetic is one of those high cost, low payback
>extensions. We should direct our efforts elsewhere.

How much would that extra hardware cost?  Would it double the cost of
a microprocessor?  Add 10 % to the cost of a mainframe's CPU?

Computers that run Cobol are often heavily loaded with expensive I/O (and
memory and ...).  The processor would have to go up in price a _lot_ to 
make as much difference as a 10 % improvement in decimal speed.

Peter Dibble

prc@erbe.se (Robert Claeson) (09/24/89)

In article <150@bbxsda.UUCP> scott@bbxsda.UUCP (Scott Amspoker) writes:

>>I have never done business-oriented calculations, but  it seems to me,
>>as an outsider, that calculating everything in pennies solves the round-off
>>problems (but perhaps no others), doesn't it?

>Because sometimes you need 3, 4, or more decimals places for calculations,
>especially in some interest calculations.  People have inventories where
>the unit price of a small item is a fraction of a penny (such items
>are bought in *large* quantities).

Or, because sometimes you need to count many billions of amounts (I didn't
say dollars because that's not what we count; however, the problem is the
same) with 3, 4 or even more decimal places.

Gimme a library of accurate, infinite-precision arithmetic routines, anytime.

-- 
          Robert Claeson      E-mail: rclaeson@erbe.se
	  ERBE DATA AB

scott@bbxsda.UUCP (Scott Amspoker) (09/25/89)

In article <1989Sep22.201906.10618@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Can you explain why binary integer arithmetic is less accurate than
>decimal integer arithmetic for your applications?

I was referring to decimal floating point as opposed to binary floating
point.  Integers never entered into the discussion at all.  I have no
idea why somebody would use decimal integers instead of binary integers.


-- 
Scott Amspoker
Basis International, Albuquerque, NM
(505) 345-5232

peter@ficc.uu.net (Peter da Silva) (09/25/89)

In article <832@dms.UUCP>, albaugh@dms.UUCP (Mike Albaugh) writes:
about blowing off BCD operations and just converting BCD to binary when you
need to do math on it (well, he said "when read in and written out", but
that's a little more restrictive than we need):

> Realistic guesses are more like
> 10 additions to pay for conversion.

I'll buy that.

> When all those numbers like SSN and
> part-number that are not really arithmetic at all get averaged in,

Non-arithmetic values don't need to be converted. They can be stored as
binary or manipulated as BCD. About the most you need to do on them is
sort them, and it just so happens that BCD values can be profitably
sorted with binary operations.
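
[Why this works: in packed BCD the more significant digit always
occupies the more significant nibble, so unsigned, equal-length,
zero-padded fields stored most significant digit first compare in
numeric order with an ordinary unsigned byte-string compare.  A sketch
for qsort(), assuming 5-byte (10-digit) unsigned keys with no trailing
sign nibble:]

        #include <string.h>

        int bcd_cmp(const void *a, const void *b)
        {
            return memcmp(a, b, 5);   /* byte order == digit order == numeric order */
        }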

> 	So, where did I go wrong, or should I say, where do I find
> that blazing Binary->BCD routine? :-)

Assuming that conversion always needs to be done. Now you just need that
ideal binary-value cache.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"That is not the Usenet tradition, but it's a solidly-entrenched            U
 delusion now." -- brian@ucsd.Edu (Brian Kantor)

henry@utzoo.uucp (Henry Spencer) (09/26/89)

In article <164@bbxsda.UUCP> scott@bbxsda.UUCP (Scott Amspoker) writes:
>>Can you explain why binary integer arithmetic is less accurate than
>>decimal integer arithmetic for your applications?
>
>I was refering to decimal floating point as opposed to binary floating
>point.  Integers never entered into the discussion at all...

Decimal *floating* point?  For money?  Are you sure you don't mean
decimal *fixed* point?  (That is, integers plus a bit of compile-time
bookkeeping for decimal-point positions.)  If you're thinking of COBOL
arithmetic, that's definitely fixed-point.
-- 
"Where is D.D. Harriman now,   |     Henry Spencer at U of Toronto Zoology
when we really *need* him?"    | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

news@ism780c.isc.com (News system) (09/26/89)

In article <4125@cbnewsh.ATT.COM> beyer@cbnewsh.ATT.COM (jean-david.beyer) writes:
>I have never done business-oriented calculations, but  it seems to me,
>as an outsider, that calculating everything in pennies solves the round-off
>problems (but perhaps no others), doesn't it?

That scheme works until one includes divide as an operation or until one
needs to convert from pennies to Yen.  The problem does not depend on the
base in which arithmetic is done.  The problem is due to the fact that, in
general, a rational number cannot be represented exactly as a finite radix
fraction.  The
only way to avoid round off error (with divides) is to do rational
arithmetic.  Thus 1 divided by 3 is represented by the couple (1,3) and round
off error does not occur.  But even this fails if one is required to compute
the area of the penny given its diameter.  Pi is not a rational number.

Moral: There is no way to avoid round off error.  So we have to learn to
live with it.
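
[A minimal sketch of that (numerator, denominator) representation in C,
ignoring overflow of the 64-bit components:]

        #include <stdint.h>

        typedef struct { int64_t num, den; } rat;   /* value = num/den, den > 0 */

        static int64_t gcd64(int64_t a, int64_t b)
        {
            if (a < 0) a = -a;
            if (b < 0) b = -b;
            while (b != 0) { int64_t t = a % b; a = b; b = t; }
            return a;
        }

        static rat rat_reduce(rat r)
        {
            int64_t g = gcd64(r.num, r.den);
            if (g > 1) { r.num /= g; r.den /= g; }
            if (r.den < 0) { r.num = -r.num; r.den = -r.den; }
            return r;
        }

        rat rat_div(rat x, rat y)       /* (1,1)/(3,1) stays exactly (1,3) */
        {
            rat r = { x.num * y.den, x.den * y.num };
            return rat_reduce(r);
        }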

     Marv Rubinstein

jvm@hpfcda.HP.COM (Jack McClurg) (09/26/89)

From Mash's earlier pointer I looked up a reference in the Hewlett-Packard
Journal.  On an IBM 370 running COBOL programs, 1.6% of the instructions
executed are decimal math, accounting for 5.9% of the time spent.  This is
from page 31 of the August 1986 issue.

From the January 1986 issue, here is the instruction sequence to perform a
packed add:
		r1 and r2 contain packed decimal operands
		r3 contains 0x99999999
	UADDCM	1,3,31		; prebias operand into r31
	ADD   	2,31,31		; perform binary add
	DCOR  	31,31		; correct result

an unpacked add:
		r1 and r2 contain unpacked decimal operands
		r3 contains 0x96969696
		r4 contains 0x0f0f0f0f
		r5 contains 0x30303030
	ADD	3,1,31		; prebias operand into r31
	ADD	31,2,31		; binary add into r31
	DCOR	31,31		; correct result
	AND	4,31,31		; mask result
	OR	5,31,31		; restore sum to unpacked decimal
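
[The same prebias-and-correct trick works in plain software on a machine
with none of these instructions.  A C sketch for two 8-digit packed
operands (my own formulation, not HP's):]

        #include <stdint.h>

        /* a, b: 8 packed BCD digits each; cin: 0 or 1; *cout: decimal carry out. */
        uint32_t bcd_add8(uint32_t a, uint32_t b, unsigned cin, unsigned *cout)
        {
            uint32_t bp  = b + cin;                      /* fold in the carry-in     */
            uint32_t t1  = a + 0x66666666u;              /* prebias every digit by 6 */
            uint32_t t2  = t1 + bp;                      /* one binary add           */
            uint32_t c   = (t1 ^ bp ^ t2) & 0x11111110u; /* carries between digits   */
            uint32_t fix = ((~c & 0x11111110u) >> 2)     /* 6 in each digit that     */
                         | ((~c & 0x11111110u) >> 3);    /* produced no carry        */
            *cout = (t2 < t1);                           /* carry out of digit 7     */
            if (!*cout)
                fix += 0x60000000u;                      /* top digit: no carry either */
            return t2 - fix;                             /* undo the unused bias     */
        }

(Per the comments above, the HPPA sequence folds the prebias into UADDCM
and the whole correction into DCOR.)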

Jack McClurg
jvm%hpfcda@hplabs.hp.com

aglew@urbana.mcd.mot.com (Andy-Krazy-Glew) (09/26/89)

Performing arithmetic in ASCII strings has the possible advantage that
you are already wasting space, so you might want to consider using
redundant arithmetic representations (eg. in base 10 allow your digits
to range from -16 to +16) using the free space that the ASCII encoding
provides (having paused to clear it out).
    If you are doing complicated calculations, this would mean that you
do not have to worry about carry propagation in intermediate results
(carries can only propagate one digit position in redundant arithmetic)
until you finally have to output the data.
    I'm not suggesting hardware, just taking advantage of the technique
in software.  Ditto for extended precision -- do the folks who do
really big extended precision (MACSYMA, MATHEMATICA, the guys who
crunch big primes) use carry-free or carry-propagate arithmetic?

Of course, business calculations are already fast enough...
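
[A sketch of the signed-digit flavour of this, with digits held in
-6..+6 rather than the -16..+16 Andy mentions; the transfer rule below
is one standard choice.  Note that a carry ("transfer") never moves more
than one position:]

        #define N 8                     /* digit positions, least significant first */

        /* z = x + y in signed-digit decimal; returns the transfer (-1, 0 or +1)
           out of the top position.  Digits of x, y and z stay within -6..+6. */
        int sd_add(const int x[N], const int y[N], int z[N])
        {
            int t[N + 1];               /* t[i] is the transfer INTO position i */
            int w[N];
            t[0] = 0;
            for (int i = 0; i < N; i++) {
                w[i] = x[i] + y[i];                          /* interim sum, -12..+12 */
                if      (w[i] >=  5) { t[i + 1] =  1; w[i] -= 10; }
                else if (w[i] <= -5) { t[i + 1] = -1; w[i] += 10; }
                else                 { t[i + 1] =  0; }
            }
            for (int i = 0; i < N; i++)
                z[i] = w[i] + t[i];     /* stays in -6..+6, so no further ripple */
            return t[N];
        }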

--
Andy "Krazy" Glew,  Motorola MCD,    	    	    aglew@urbana.mcd.mot.com
1101 E. University, Urbana, IL 61801, USA.          {uunet!,}uiucuxc!udc!aglew
   
My opinions are my own; I indicate my company only so that the reader
may account for any possible bias I may have towards our products.

scott@bbxsda.UUCP (Scott Amspoker) (09/26/89)

In article <1989Sep25.184425.20936@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Decimal *floating* point?  For money?  Are you sure you don't mean
>decimal *fixed* point?  (That is, integers plus a bit of compile-time
>bookkeeping for decimal-point positions.)  If you're thinking of COBOL
>arithmetic, that's definitely fixed-point.

Forgive me for not being more descriptive.  I live with this stuff every
day and forget that what I am doing is *not* very typical.  What I am
talking about is a decimal floating-point representation of numbers
that are normally used for business calculations which includes an
arbitrary amount of precision.  The user may specify a desired precision
at any time for any calculation.  Fixed-point is not as practical
under these circumstances.  Data items do not carry a precision attribute
with them.  Any data item must be prepared to take on any precision
(up to 14 digits) at any time.

I'm not trying to convert the whole world over to this manner of doing
math.  I merely suggested that a little help in the hardware (such as
access to the half-carry) would be of use to us and perhaps many other
developers who are in a similar situation.

-- 
Scott Amspoker
Basis International, Albuquerque, NM
(505) 345-5232

gwg@csadfa.oz (George Gerrity) (10/06/89)

   I have been following the correspondence on decimal conversion with
some interest, and it seems to me that a number of the points raised on
the use of decimal arithmetic in data processing, particularly when
using COBOL as the programming language, can only be clarified if placed 
together in context.

   The first point is that there is nothing special about accounting and
   banking computations that leads to a requirement for decimal arithmetic
   in order to maintain accuracy.  Since most architectures provide only
   integer decimal arithmetic, accuracy to the nearest cent or fraction
   thereof is usually obtained by having the compiler generate appropriate
   scaling computations.  But whatever operations are used, they are
   independent of the base of the underlying arithmetic system.

   Second, for current implementation technologies (with the possible
   exception of Integrated Injection Logic), radix 2 or a power thereof
   has a decided edge over decimal.  That is, you can get better
   performance with binary for a given cost, compared with decimal, or
   you can achieve a fixed performance for a lower cost.  Moreover,
   binary integers are considerably more compact than decimal codings,
   and this translates into higher I/O throughput and smaller data bases.

   Third, conversion between binary and display format need not be too
   expensive compared with conversion between BCD and display format, and
   in any case, it can be accomplished fast enough to keep up with all
   but the fastest I/O systems, keeping in mind that system throughput
   is considerably slower than raw I/O transfer rates.

   Finally, there is no real need for data conversion unless the target
   I/O transaction is a write to a display device such as a screen or a
   printer.

   Why then, does it pay to provide decimal arithmetic in terms of real,
measured, performance gains in real environments?  The usual argument
goes something like this.  Decimal arithmetic is slow compared to binary
arithmetic, but there is usually not much arithmetic in typical business
and accounting applications.  However, there is a lot of data movement,
and if binary arithmetic is used, then the numbers have to be converted
to decimal before each data movement. It is this conversion, which is
unnecessary if decimal arithmetic is used, which costs in performance.

   But why MUST the conversion be performed, except for the case of final
output to a screen or printer?  The answer is that it doesn't, but in
most cases, it is, and the reason can be found in the genesis of COBOL
and the bias built into the language.  The original COBOL -PRESUMED- that
the underlying arithmetic was decimal, and this presumption was cast in
concrete by making DISPLAY usage the default for numeric fields in the
RECORD DESCRIPTION clause.  Thus, most COBOL programmers as a matter of
practice will only use the USAGE IS COMPUTATIONAL option when a
particular computation is clearly a performance bottleneck.  If COMPUTATIONAL
were the default, and DISPLAY the option, then it would be used only
where required (for DISPLAYing on a screen or printer), and we would, I
believe, in many cases see a considerable performance improvement, due
mainly to the resulting decrease in I/O traffic, but also because of
the improvement in computational speed.  Moreover, such programs would
show NO improvement if hardware decimal arithmetic were provided, and
consequently there would be no incentive for making it available.

    In conclusion, decimal arithmetic is only an advantage because it
supports a bad data paradigm found in an obsolete, but still widely-
used programming language.

----
Dr. G. W. Gerrity                              Phone: +61 62 68 8180
Dept. of Computer Science                      Fax:   +61 62 68 8581
University College, UNSW                       Email: gwg@cs.adfa.oz.au
ADFA, Northcott Drive                          Telex: ADFADM AA62030
Canberra, ACT 2600            
AUSTRALIA

lindsay@MATHOM.GANDALF.CS.CMU.EDU (Donald Lindsay) (10/06/89)

In article <2261@csadfa.oz> gwg@csadfa.oz (George Gerrity) writes:
>
>   ...binary integers are
>   considerably more compact than decimal codings, and this will be trans-
>   lated into higher I/O throughput and smaller data bases.
>   [ .. other apparently well-reasoned stuff ... ]

When the IEEE FP standard was being formed, some consideration was
given to the idea of decimal encodings.

Normally, one stores decimal as BCD: one digit as 4 bits.  So, 3
digits takes 12 bits, whereas binary gets >1000 patterns in a mere 10
bits. Well, 20% is a penalty, true.

However, about the time of the IEEE standards effort, someone came up
with an encoding which put 1,000 patterns into 10 bits. Further, his
encoding could be transformed <=> BCD with straightforward
combinational hardware. (That is, the logic would have no carry
ripple, or other such nastiness.)

If this idea had been enshrined long enough ago, we would now store
our FP in what you might call "compact BCD". FP units would expand
numbers as they were loaded to registers, and compact them on stores.
Internal operation would be on BCD.

Well, it didn't happen, and it won't, and maybe that's for the best.
But it's an interesting thought.
-- 
Don		D.C.Lindsay 	Carnegie Mellon Computer Science

scott@bbxsda.UUCP (Scott Amspoker) (10/06/89)

In article <2261@csadfa.oz> gwg@csadfa.oz (George Gerrity) writes:
>    In conclusion, decimal arithmetic is only an advantage because it
>supports a bad data paradigm found in an obsolete, but still widely-
>used programming language.

I have to deal with decimal formats every day and I haven't been near
a COBOL compiler in 12 years.  There are many different languages that
are more specialized and less well-known than COBOL that use floating
decimal representations for whatever reason the original designers
chose.  Perhaps we can start up a "wipe-out-decimal-math-by-the-year-2000"
campaign (I intend to be wealthy and retired by then, so you can do what
you want).  

I'm not disagreeing that decimal math has its problems.  But you can't
just say, "this is bad so we will not address the issue".

The first thing we have to do with new programmers, fresh out of college,
is make them understand the difference between the way things should be
and the way they are.  The quickest and most humane way of doing that is
to let them answer the technical support lines for a week and see what
the *real world* is like.  Decimal math is here and it is not going away
any time soon.  I personally don't care.  The disadvantages of
decimal math presented so far are not big issues to me.  It works and
it works well.

-- 
Scott Amspoker
Basis International, Albuquerque, NM
(505) 345-5232

meissner@tiktok.dg.com (Michael Meissner) (10/08/89)

In article <2261@csadfa.oz> gwg@csadfa.oz (George Gerrity) writes:
| 
|    But why MUST the conversion be performed, except for the case of final
| output to a screen or printer?  The answer is that it doesn't, but in
| most cases, it is, and the reason can be found in the genesis of COBOL
| and the bias built into the language.  The original COBOL -PRESUMED- that
| the underlying arithmetic was decimal, and this presumption was cast in
| concrete by making DISPLAY usage the default for numeric fields in the
| RECORD DESCRIPTION clause.  Thus, most COBOL programmers as a matter of
| practice will only use the USAGE IS COMPUTATIONAL option when a particu-
| lar computation is clearly a performance bottleneck.  If COMPUTATIONAL
| were the default, and DISPLAY the option, then it would be used only
| where required (for DISPLAYing on a screen or printer), and we would, I
| believe, in many cases see a considerable performance improvement, due
| mainly to the resulting decrease in I/O traffic, but also because of
| the improvement in computational speed.  Moreover, such programs would
| show NO improvement if hardware decimal arithmetic were provided, and
| consequently there would be no incentive for making it available.

Two things.  I seem to recall that some/many COBOL programmers use
overlay schemes and such (level 88 declarations? -- it's been 15 years
since I last programmed in COBOL) that define two or more elements in
a structure sharing the same space.  One element might be numeric with
leading zeros changed into blanks, and another might be alphanumeric.
The programs would then store in one format, and read the other (sort
of the same abuse people make of unions in C, or equivalence in F77).
If fields are overlayed that way, the compiler is pretty much forced
to do the conversions at every step if they use binary.

The second point is, that I've heard about COBOL compilers that will
automatically figure out which variables arithmetic is done on, and if
there is no overlay like I mentioned above, will internally use binary
for the variable (providing the binary form can meet the accuracy
requirements from the declaration).  This is sort of like modern C
compilers which will automatically allocate stack locals to registers,
even if the programmer did not specify register directly.
--
Michael Meissner, Data General.
Uucp:		...!mcnc!rti!xyzzy!meissner		If compiles were much
Internet:	meissner@dg-rtp.DG.COM			faster, when would we
Old Internet:	meissner%dg-rtp.DG.COM@relay.cs.net	have time for netnews?

gwg@csadfa.oz (George Gerrity) (10/11/89)

    Scott Amspoker comments " ... I have to deal with decimal formats
every day and I haven't been near a COBOL compiler in 12 years ..."

    Two points: 1) perhaps Scott is confusing FORMATS with REPRESENTATION,
as the semantics of COBOL does.  That was my point -- the internal
representation is irrelevant unless format and representation are for
some reason coupled by the semantics of the language.  Moreover, you can
get around the problem by careful programming in COBOL.  2) Scott also
points out that there are lots of other languages using (requiring?)
floating decimal representations.  I guess so, but if COBOL (mis)usage
didn't generate a (spurious) need for hardware decimal arithmetic, then
those other languages would also learn to do without.

    Michael Meissner comments that COBOL uses overlayed structures,
similar to Pascal variant records, with the implication that it somehow
impacts on my argument.  It does not.  Overlays don't necessarily imply
data conversion.  I suspect that if they do, then it is an abuse of the
feature.  All it means is that stored overlayed fields require as much
space as the biggest overlayed field.

    Andrew Klossner gives a contrived (his words) example of a case where
decimal arithmetic is (supposedly) faster than binary.  If his RISC
machine doesn't have binary divide, then it is unlikely to have BCD
instructions either.  Moreover, a smart compiler dealing with the scaling
problem will delay division as late as possible, i.e., until it needs to
output a datum, and will keep the internal formats in their scaled form.
Eventually, of course, the division will have to be done.

								George
----
Dr. G. W. Gerrity                              Phone: +61 62 68 8180
Dept. of Computer Science                      Fax:   +61 62 68 8581
University College, UNSW                       Email: gwg@cs.adfa.oz.au
ADFA, Northcott Drive                          Telex: ADFADM AA62030
Canberra, ACT 2600            
AUSTRALIA

peter@ficc.uu.net (Peter da Silva) (10/12/89)

In article <2268@csadfa.oz> gwg@csadfa.oz (George Gerrity) writes:
>     Andrew Klossner gives a contrived (his words) example of a case where
> decimal arithmetic is (supposedly) faster than binary.

Not  only is it contrived, but it's unlikely to come up in financial
calculations, since one does not tend to multiply dollars by dollars.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
                                                                           'U`
Quote: Structured Programming is a discipline -- not a straitjacket.

sommar@enea.se (Erland Sommarskog) (10/16/89)

Peter da Silva (peter@ficc.uu.net) writes:
>Not  only is it contrived, but it's unlikely to come up in financial
>calculations, since one does not tend to multiply dollars by dollars.

No, but one tends to multiply whatever currency by interest rates.
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se
"My baby's a 26. On a scale from one to ten, my baby's a 26." - Chic