[net.lang.c] Why 'C' does not need BCD...

breuel@harvard.ARPA (Thomas M. Breuel) (11/25/84)

|A Fujitsu Eagle transfers data at about 1.8 megabytes per second.  Many
|mainframe disks are even faster.  If you are storing 6-byte (12-digit) BCD
|and are adding two streams, you are transferring 18 bytes per operation.
|Assuming you double-buffer but have only one controller, this is 100K
|operations per second.  You thus have to convert to binary, add, and
|convert back to BCD in approximately 10 microseconds.

This is irrelevant (sorry). You don't 'have to convert to binary, add,
and convert back to BCD in approximately 10 microseconds', since you
don't store your data in BCD on the disk in the first place if you are
working with binary numbers.

The only time when you have to convert to BCD is when you output data
in human readable form. Since output in human readable form is handled
by slow devices, like line printers, laser printers, or microfilm
printers, it hardly matters how quickly you can perform the
conversion.  (Indeed, for laser printers, most of the work is involved
in generating the bit image). In addition, the amount of data that you
present in human readable form is very likely to be only a tiny
fraction of the amount of data you are doing arithmetic with. (This is
due to the limited performance of human beings).

Apart from the above points, there is most likely an operating system
in between your nice, fast disk drives and your ('C') program. Indeed,
this operating system is likely to be UN*X. Since (if I recall
correctly) the calling sequence on the VAX takes more than 10us, and
since UN*X loves to call many functions and to do a lot of copying, it
again hardly matters whether your arithmetic takes 10us or 20us or
100us.

Even if you, as I once did on an IBM360, program your machine in
assembly language and access the disk drives directly (i.e. with
minimum overhead), you would still have to do seeks ('double buffering'
doesn't really help you), either every now and then (if you are lucky
enough to have three disk drives), or for every operation (if you are
not).

The slowness of disk I/O is not related to transfer rates (1.8 Mbyte/s
is almost as fast as dynamic ram chips), but to seek times and
scheduling problems (DMA, processor interrupt service, ...).

|                                               But I have never seen an
|algorithm for converting binary to decimal without doing divisions.

If you keep on reading, you'll find an algorithm that converts binary
to BCD without division.

|And it's actually worse than that, because a mainframe that powerful
|typically has several controllers so that the full 1.8 megabyte output data
|rate could be maintained if the CPU could keep up with it.

As mentioned above, disk transfer rate is not the limiting factor.

|[somebody else mentioned something like:] 5 digit signed BCD takes
|3 bytes, whereas for a binary representation you have to go from
|16 to 32 bits.

You can do your arithmetic with 32 bits and store data as 24 bits
on disk (that's what UN*X used to do on PDP-11's).
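That scheme (32-bit arithmetic in memory, 24 bits on disk) can be
sketched in modern C; 'pack24' and 'unpack24' are my own hypothetical
names for illustration, not the actual PDP-11 UN*X routines:

```c
#include <stdint.h>

/* Pack a signed 32-bit value into 3 little-endian bytes.  The caller
 * must ensure the value fits in 24 bits: -8388608 .. 8388607. */
void pack24(int32_t v, unsigned char out[3])
{
    out[0] = (unsigned char)(v & 0xff);
    out[1] = (unsigned char)((v >> 8) & 0xff);
    out[2] = (unsigned char)((v >> 16) & 0xff);
}

/* Unpack 3 bytes back into a sign-extended 32-bit value. */
int32_t unpack24(const unsigned char in[3])
{
    int32_t v = (int32_t)in[0]
              | ((int32_t)in[1] << 8)
              | ((int32_t)in[2] << 16);
    if (v & 0x800000)           /* sign bit of the 24-bit value set? */
        v -= (int32_t)1 << 24;  /* extend the sign into the top byte */
    return v;
}
```

The unpacking is a couple of shifts and a conditional subtract, which
is the point of the argument below: widening 24 to 32 bits is cheap.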

To summarise: binary arithmetic is faster than BCD arithmetic, but
binary->ASCII is slower than BCD->ASCII. Binary numbers are equally or
more compact than BCD numbers. Binary numbers are equally readable,
even for accountants, provided your debugger offers more than hex
dumps (BTW, have you ever tried to read BCD in an octal dump?).

WRT 'C', all of this is irrelevant. If you want to use BCD in 'C', write
your own assembly language subroutines, the same way that you write
your own string routines. You then even have the choice between packed
and unpacked BCD, so that your BCD routines can even *be* your string
routines :-). 'C' as the language to write UN*X and UN*X utilities in,
surely does not need BCD.

Sorry for this somewhat lengthy response. I hope it will be {my,the}
last one on the subject.

						Thomas.
						(breuel@harvard)

PS: if you are worried about efficiency in your BCD package, why don't
you just write a sed script that replaces all 'calls' to BCD
routines with in-line assembly language? That's how the kernel handles
'bcopy' and friends on the VAX.

PPS: the same sed script can even be applied to PDP-11 'C' compiler
output without harm, as the 2.9BSD distribution clearly demonstrates --
PDP-11 code does not contain 'calls'...

-----------------------------------------------------------------------

; As promised: routine to convert binary to BCD without division.

; You can write this routine for any machine with BCD arithmetic.
; I can write it only for the 6502, 68000, 8085, IBM360 (:-)), or
; VAX. Of those, I think it looks neatest on the 6502.

; There are likely to be some misspelled mnemonics in here --
; I haven't written 6502 assembly language in 2 years.

; Entry: binary number between 0 and 99 in accumulator
; Exit: BCD number in accumulator (unpack to get ASCII)

	php		; save previous processor status
	sed		; enable bcd arithmetic
	ldx #8		; #bits to convert
	ldy #0		; clear result
	sty r
loop:
	asl a		; shift msb into carry
	tay		; save binary number
	php		; save result of msb test
	lda r		; multiply bcd result by 2
	clc
	adc r
	plp		; retrieve the msb in the carry flag
	adc #0		; add one if msb was set
	sta r		; update bcd result
	tya		; restore binary number
	dex
	bne loop	; ('bra' does not exist on the 6502; dex sets Z)
	lda r		; return result in accumulator
	plp		; restore processor status
	rts
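; For readers without a 6502 handy, the same no-division idea can be
; sketched in portable C, using the shift-and-add-3 ("double dabble")
; adjustment in place of the hardware decimal mode.  'bin_to_bcd' is
; my own name for it; this is an illustration, not the posted routine.

```c
#include <stdint.h>

/* Convert a binary value 0..99 to packed BCD without division.
 * Before each left shift, add 3 to any BCD digit that is 5 or more,
 * so the shift carries correctly into the next decimal digit. */
uint8_t bin_to_bcd(uint8_t bin)
{
    uint8_t bcd = 0;
    for (int bit = 7; bit >= 0; bit--) {
        if ((bcd & 0x0f) >= 0x05) bcd += 0x03;  /* adjust low digit  */
        if ((bcd & 0xf0) >= 0x50) bcd += 0x30;  /* adjust high digit */
        bcd = (uint8_t)((bcd << 1) | ((bin >> bit) & 1));
    }
    return bcd;  /* e.g. 42 comes back as 0x42 */
}
```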

geoff@desint.UUCP (Geoff Kuenning) (11/27/84)

In article <166@harvard.ARPA> breuel@harvard.ARPA (Thomas M. Breuel) writes:

>You don't 'have to convert to binary, add,
>and convert back to BCD in approximately 10 microseconds', since you
>don't store your data in BCD on the disk in the first place if you are
>working with binary numbers.

You shot down this argument at the end of your article, see below.

>[A lengthy claim that human conversion is the important part, and that this
>is inherently slow because of the slowness of I/O devices, is omitted here.]

You are assuming that the job is running standalone.  Real computers have
some compute-bound jobs (even in business installations).  If another job
is eating CPU unnecessarily, that is CPU that is not available to others.

>[Besides, you're probably running UN*X.  Vax calls take a lot more than 10
>us, and UN*X makes lots of procedure calls and copies a lot.]

C does not run only on UN*X.  There are operating systems in the world that
are faster than UN*X, and there are machines that are faster than Vaxen.

>you would still have to do seeks ('double buffering'
>doesn't really help you), either every now-and-then

A seek once out of every 500 transfers (which a well-designed filesystem
can achieve) does not seriously impact performance.  And who says double
buffering doesn't help you?  Properly implemented double buffering most
assuredly does, and examples are easy to come by.

>The slowness of disk I/O is not related to transfer rates (1.8 Mbyte/s
>is almost as fast as dynamic ram chips), but to seek times and
>scheduling problems (DMA, processor interrupt service, ...).

So?  Ever try to swap out 1 MB on a 5-1/4" Winchester?  For that matter, ever
try to swap out 2 MB on an Eagle?  We are talking transfer times in SECONDS
here, compared to seek times in the 20-100ms range.

Further, many business programs do their I/O off of tape drives, which have
no seek times at all.

>|                                               But I have never seen an
>|algorithm for converting binary to decimal without doing divisions.
>
>If you keep on reading, you'll find an algorithm that converts binary
>to BCD without division.

The algorithm is indeed given at the end of the article;  I am hardly
impressed.  For the literalists out there, let me revise the above claim to
"I have never seen an EFFICIENT algorithm...".  The algorithm you gave is
obviously less efficient than using division, since it must do 1 BCD add per
bit, compared to one division per four bits (approx.) for a division
algorithm.
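For comparison, a division-based conversion looks roughly like this in
C (my own sketch, not code from either poster): one div/mod pair per
decimal digit produced, i.e. one division per ~3.3 bits of input.

```c
#include <stdint.h>

/* Convert an unsigned binary value to a decimal ASCII string by
 * repeated division by 10.  Returns the number of digits written. */
int bin_to_ascii(uint32_t v, char *buf)
{
    char tmp[11];           /* 2^32-1 has at most 10 digits */
    int n = 0, i;
    do {
        tmp[n++] = (char)('0' + v % 10);  /* least-significant first */
        v /= 10;
    } while (v != 0);
    for (i = 0; i < n; i++)               /* reverse into caller's buffer */
        buf[i] = tmp[n - 1 - i];
    buf[n] = '\0';
    return n;
}
```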

>You can do your arithmetic with 32 bits and store data as 24 bits
>on disk (that's what UN*X used to do on PDP-11's).

This is the "shooting down" of your argument that I mentioned above.  As
soon as you have to take 24-bit numbers, convert them to 32 bits, and
convert them back to 24 for output, you have lost any possible speed
advantage that binary might have given you.  6-digit BCD is much faster.

>[Summary:  binary is always faster, and the world is I/O limited in any case]

24-bit binary is not faster.  And if the world is I/O limited in any case,
why are you getting down on the business people for choosing a data
representation that is conceptually convenient?  How many UNIX programs
laboriously convert a number to ASCII only to pipe into a program that will
immediately crack it into binary?  (S package?  What S package?).  I am
getting awfully tired of this attitude that "my favorite data
representation is the only acceptable one".

There seems to be a common misconception among "serious" computer types
that 100% of business programmers are cretins.  Sorry, folks.  There are a
lot of extremely high-powered brains out there who have chosen to attack
the business world, and they are applying themselves to problems of a
magnitude (in terms of number of data items processed) that you and I
hardly ever have to deal with.  If they can come up with a better or faster
way to run the payroll, it produces a *big* savings for the company.  If
binary was really better, you can count on it that it would be in use.  (Is
in use, actually, because there are applications where it is better.)
-- 

	Geoff Kuenning
	...!ihnp4!trwrb!desint!geoff

dan@digi-g.UUCP (Dan Messinger) (12/04/84)

In article <247@desint.UUCP> geoff@desint.UUCP (Geoff Kuenning) writes:
>A seek once out of every 500 transfers (which a well-designed filesystem
>can achieve) does not seriously impact performance.

I find this incredible!  Do you have a disk drive for each application?
With many programs trying to access the same drive, I can't imagine how
you could get 500 transfers/seek.  How many drives have that many blocks
per cylinder?  How many times do you rewrite the same block?

>So?  Ever try to swap out 1 MB on a 5-1/4" Winchester?  For that matter, ever
>try to swap out 2 MB on an Eagle?  We are talking transfer times in SECONDS
>here, compared to seek times in the 20-100ms range.

So what happened to your well-designed filesystem?  A 5-1/4" winchester can
transfer ~1.3 MB/second, and an Eagle is ~1.8 MB/second.  A second, but not
secondS.

>>You can do your arithmetic with 32 bits and store data as 24 bits
>>on disk (that's what UN*X used to do on PDP-11's).
>
>This is the "shooting down" of your argument that I mentioned above.  As
>soon as you have to take 24-bit numbers, convert them to 32 bits, and
>convert them back to 24 for output, you have lost any possible speed
>advantage that binary might have given you.  6-digit BCD is much faster.

Are you serious?  You think that adding a 00 or FF byte to the beginning
of the binary number is going to take LONGER than doing BCD math?  Do
you know what BCD looks like in the computer?  Have you tried writing BCD
math routines?  Are you a programmer?

BCD versus binary boils down to two issues.  First, the well-known
space vs. time trade-off.  By using 4-byte binary instead of 3-byte BCD,
the program can run much faster.  If there are a lot of numbers involved,
this is no small issue.  By storing only 3 bytes on disk, the extra disk
space can be eliminated, with some loss in performance.  But I assure
you that converting a 24-bit int to a 32-bit int takes less time than
performing a single BCD add.  And how many BCD operations are going to
be performed on that number while it is in memory?
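To get a feel for what a software BCD add costs, here is a sketch in C
of adding two 6-digit packed-BCD numbers (my own illustration; a real
library would also handle signs and the final carry-out, which this
version silently drops, wrapping mod 10^6):

```c
#include <stdint.h>

/* Add two 3-byte (6-digit) packed-BCD numbers, stored most-significant
 * byte first.  Every byte costs masks, shifts, compares, and decimal
 * carry propagation, versus one machine add for a binary int. */
void bcd_add(const uint8_t a[3], const uint8_t b[3], uint8_t sum[3])
{
    unsigned carry = 0;
    for (int i = 2; i >= 0; i--) {
        unsigned lo = (a[i] & 0x0f) + (b[i] & 0x0f) + carry;
        carry = lo > 9;
        if (carry) lo -= 10;
        unsigned hi = (a[i] >> 4) + (b[i] >> 4) + carry;
        carry = hi > 9;
        if (carry) hi -= 10;
        sum[i] = (uint8_t)((hi << 4) | lo);
    }
}
```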

Second is portability.  A BCD math library would perform more consistently
across a wider variety of computers.  As long as a machine can address
8-bit bytes, the BCD math package should work.  32-bit ints may not be
available on some machines.

But BCD being faster than binary?  I suggest that you find a local software
engineer and ask him/her to explain it to you.  (I'm sure it will amuse him/her.)

I'm not saying that binary is always the better way.  I'm just saying
that your reasons are absolutely ludicrous.

Dan Messinger
ihnp4!umn-cs!digi-g!dan

g-rh@cca.UUCP (Richard Harter) (12/13/84)

	Mr. Messinger writes at length and with marvellous sarcasm
about how much slower BCD arithmetic is than binary arithmetic.  This
is all very true IF THE MACHINE IN QUESTION DOES NOT SUPPORT BCD
ARITHMETIC.  On IBM mainframes, however, BCD arithmetic is supported
in HARDWARE and is as fast as binary arithmetic.  Lest we forget,
IBM has greater revenues than all of the rest of the computer industry
combined.

	There are, in fact, very good reasons for using BCD in
certain types of applications.  The most obvious is that it is much
cheaper to convert BCD numbers to decimal character strings than it
is to convert binary numbers to decimal character strings.  In
business applications BCD is superior -- if it is supported in
hardware.
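	The cheapness of that conversion is easy to see in a C sketch
(my own illustration): each packed byte unpacks to two ASCII digits
with a shift, a mask, and two adds, and no division anywhere.

```c
#include <stdint.h>

/* Unpack a 3-byte (6-digit) packed-BCD number, most-significant byte
 * first, into a NUL-terminated ASCII decimal string. */
void bcd_to_ascii(const uint8_t bcd[3], char out[7])
{
    for (int i = 0; i < 3; i++) {
        out[2 * i]     = (char)('0' + (bcd[i] >> 4));    /* high digit */
        out[2 * i + 1] = (char)('0' + (bcd[i] & 0x0f));  /* low digit  */
    }
    out[6] = '\0';
}
```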

	Since C is not a particularly convenient language to use
in an IBM shop, and since almost nobody besides IBM supports hardware
BCD, there seems to be little need for BCD in C.

				Richard Harter (SMDS Inc.)
				cca!g-rh