[comp.unix.wizards] Vax 86xx FBOX decision

dpl@cisunx.UUCP (03/25/87)

Un*x/Ultrix Vax users:

	One of our vaxes (Vax 8650, Ultrix-32 V1.2) was ordered without
an F-BOX floating point accelerator, under the mistaken belief that
only our VMS users would be involved in number crunching.  Thus, I am
now trying to justify the acquisition ($19,600 + $665 installation charge =
$20,265.00) of a DEC 8650 F-BOX.

I asked DEC support (a biased source): 
	'What would an F-BOX do for a un*x user?'
DEC's response: 
	'all compiles would be sped up by 50%, because the link
	editor uses floating point to calculate virtual addresses.'

Can anyone verify this?

For non-wizards: 
	(1) ld (link editor) is invoked during compiles which
  complete without errors, in order to link in referenced routines, and
  create an executable file.
	(2) The FBOX of an 8600 is exactly the same as the FBOX of an 8650.

System workload characterization:
	This vax is in an academic situation, with about 2000 accounts,
70% of which are students learning pascal, lisp, C, ada, data structures,
simulation, etc.  The remaining 30% crunch numbers and process tapes.
The machine is quite often CPU constrained, running for hours at a time at
0% idle.  Therefore, even if DEC's figure is too high by a factor of two,
a mere 25% overall speedup in compiles might be worth the expense ...

	Is there a way to collect instruction frequency distribution
statistics?  Remember, even if a program has just 5 floating point
instructions, those five may comprise 10% of the instructions executed
by a utility (because they sit inside an important loop), so the
execution profile can be very different from the frequency distribution
of floating point instructions in the executable file.  Thus, there's
no point in writing a program to count the number of flt. pt. instructions
in all the most used utilities...

							-Dave
David P. Lithgow		Sr. Systems Analy./Pgmr., Univ. of Pittsburgh
USENET:  {allegra,bellcore,ihpn4!cadre,decvax!idis,psuvax1}!pitt!cisunx!dpl
CCnet(DECnet): CISVM{123}::DPL,CISVXO::DPL (I admit it: I'm a unix and VMS) 
Bitnet(Jnet):  psuvax1!dpl@pittvms.bitnet    ( systems programmer too)
ARPA: (via UUCP) pitt!cisunx!dpl@cadre.arpa
ARPA: (via Bitnet) DPL%PITTVMS.BITNET@WISCVM.WISC.EDU
CSNET: dpl%pitt@csnet-relay

guy@gorodish.UUCP (03/27/87)

> DEC's response: 
> 	'all compiles would be sped up by 50%, because the link
> 	editor uses floating point to calculate virtual addresses.'
> 
> Can anyone verify this?

Script started on Thu Mar 26 23:42:58 1987
gorodish$ egrep "float|double" /arch/4.3/usr/src/bin/ld.c
				if (t >= sizeof (double))
					rnd = sizeof (double);
gorodish$ 

script done on Thu Mar 26 23:43:06 1987

Well, I don't know what DEC did to "/bin/ld" in ULTRIX, but it must
have been pretty impressive to get it to use floating point....

It *does* use *integer* modulo to compute some hash indices when
doing relocation relative to external symbols.  The following claim
was made in "Comments on 'The Case for the Reduced Instruction Set
Computer'", Douglas W. Clark and William D. Strecker, ACM Computer
Architecture News, Vol. 8, No. 6, 15 October 1980, on page 38:

	The explanation of this anomaly is that the 780's Floating
	Point Accelerator speeds up the multiply in the
	multi-instruction implementation, but doesn't see INDEX at
	all.

So it is conceivable that the F-BOX on the 8650 may speed up the modulo
computation as well.  I'd still want to see this 50% speedup
first-hand before I believed it, though.

jhc@mtune.UUCP (03/28/87)

> DEC's response: 
> 	'all compiles would be sped up by 50%, because the link
> 	editor uses floating point to calculate virtual addresses.'

I can't verify that statement, and it sounds pretty damn unlikely to
me, but apparently the FBOX uses its spare time to do pre-calculations
of effective addresses in the instruction stream, and thus speeds up
the entire machine somewhat. Since this is at or under the microcode
level, it's hardly surprising that Guy didn't find too many references
to float or double in the source of ld. It may have been that the FBOX
is doing instruction stream pre-decode, I forget now, but either of
these sounds plausible.
-- 
Jonathan Clark
[NAC,attmail]!mtune!jhc

Albatross! Stormy petrel on a stick!

whm@arizona.UUCP (03/28/87)

I've heard from good authorities that the 8600 FPA does integer multiplications,
so that's a source of potential speedup.  They apparently did this after
finding that a lot of programs do a lot of integer multiplies, but it's
also good for sales(!): "We don't do much floating point, so we don't need
an FPA."  "Well if you do integer multiplies, our FPA will help you out!"

There was an issue of the Digital Technical Journal (?) that was devoted to
the 8600 and I think there was an article on the FBOX in there.  Your
friendly DEC salesman can probably get you a copy.

					Bill Mitchell
					whm@arizona.edu
					{allegra,cmcl2,ihnp4,noao}!arizona!whm

ecc@ihuxy.UUCP (04/01/87)

One good way to determine if the FBOX will help your situation,
and to determine if your DEC salesperson is "telling the truth",
is to ask him/her to let you use an FBOX for a trial period.
If it works as expected, you'll buy it, otherwise you won't.
You may end up eating the installation and deinstallation charges,
plus any maintenance charges for the period you have it,
if it doesn't work out as expected.
You can also approach it from the view that if the salesperson is
really confident that the FBOX will improve CC performance 50% (or whatever
the claims are),
that he/she should be willing to pay for the charges if the
claims are NOT true.

Last year we had a free 4-month trial of a VAX 8600,
based upon the agreement that if it met certain performance benchmarks,
we'd acquire it, otherwise it would go out the door.
Because of the length of the trial and the cost of the 8600,
we agreed to pay all installation and deinstallation charges.

The trial ended up to the advantage of BOTH sides -- we got a more
cost effective machine, and DEC got some sales.

Eric Claeys