dpl@cisunx.UUCP (Dave P. Lithgow) (05/21/87)
This is a summary of the pertinent information concerning the article
I posted a while ago about whether to recommend the acquisition of an
FP86 (FBOX) for a VAX 8650 which had been purchased without one.
This VAX is running DEC Ultrix-32 V1.2.

Thanks to several who greatly helped direct my investigation:
Jonathan Clark, Bill Mitchell, Guy Harris, and Rex Black.

[Guy Harris helps with:]

  Script started on Thu Mar 26 23:42:58 1987
  gorodish$ egrep "float|double" /arch/4.3/usr/src/bin/ld.c
                  if (t >= sizeof (double))
                          rnd = sizeof (double);
  gorodish$
  script done on Thu Mar 26 23:43:06 1987

Well, I don't know what DEC did to "/bin/ld" in ULTRIX, but it must
have been pretty impressive to get it to use floating point....

It *does* use *integer* modulo to compute some hash indices when doing
relocation relative to external symbols. The following claim was made
in "Comments on 'The Case for the Reduced Instruction Set Computer'",
Douglas W. Clark and William D. Strecker, ACM Computer Architecture
News, Vol. 8, No. 6, 15 October 1980, on page 38:

    The explanation of this anomaly is that the 780's Floating Point
    Accelerator speeds up the multiply in the multi-instruction
    implementation, but doesn't see INDEX at all.

So it is conceivable that the F-BOX on the 8650 may speed up the
modulo computation as well. I'd still want to see this 50% speedup
first-hand before I believed it, though.

[end of Guy's response]

[Bill Mitchell:]

I've heard from good authorities that the 8600 FPA does integer
multiplications, so that's a source of potential speedup. They
apparently did this after finding that a lot of programs do a lot of
integer multiplies, but it's also good for sales(!):

    "We don't do much floating point, so we don't need an FPA."
    "Well if you do integer multiplies, our FPA will help you out!"

There was an issue of the Digital Technical Journal (?) that was
devoted to the 8600 and I think there was an article on the FBOX in
there.
Your friendly DEC salesman can probably get you a copy.

[end of Bill Mitchell's reply]

--------------------------------------------------------------------------
[the research I promised:]

I also ran the same egrep on the Ultrix-32 (V1.2) ld.c, which I forgot
to mention, and got:

                  if (t >= sizeof (double))
                          rnd = sizeof (double);

(which is exactly the same as Guy Harris got with his egrep on the
BSD4.3 ld.c)

Further research:

[ quoted from the "VAX Hardware Handbook", Vol. 1-1986: pp.7-17 ]
-------------------------------------------------------------------------
Floating Point Accelerator Functions

The floating-point accelerator (Fbox) is used to decrease the time to
perform floating point instructions and some integer instructions.
                                                            [clue #1]

The Fbox logic contains the following functions:

  o Adder and Multiplier Logic
  o 16 General Purpose Registers

[I think the last sentence in this paragraph is the answer to the
puzzle -dpl]
Operands received from the adder are multiplied by the multiplier
logic and the product is returned to the adder for rounding and
normalization. During floating-point operations, the multiplier
operates on the fractions and the adder operates on the exponent.
During integer operations the entire two's complement number is
multiplied.

The adder logic performs addition, subtraction, and division and
handles the exponents for all the basic operations. The adder contains
the logic to unpack the floating-point numbers, to align fractions,
and to round and normalize the results.

The operands are received from the OP bus [operand bus -dpl] and
returned to the W bus [write bus -dpl]. The Fbox receives GPR [general
purpose register -dpl] updates and Ebox data, and accesses some Fbox
registers through the W bus.
---------------------------------------------------------------------------
[end of text quoted from VAX Hardware Handbook]

It would also seem that a straightforward simulation of the Fbox's
effect would require register-to-register clock tick timing info and
bus speed info ... I'm not expecting to receive that really soon :-)...

*** Further research:

  % cc -S ld.c
  % egrep "(mull2|mull3|mulw2|mulw3)" ld.s >> mulcount.ints
  % wc -l < mulcount.ints
  27
  % grep mull2 mulcount.ints | wc -l
  25
  % grep mull3 mulcount.ints | wc -l
  2

---> so there *are* places inside ld where an FBOX would help...
---> perhaps DEC wasn't so wrong after all...

Some history: VAX Hardware Handbook 1982-83

All the floating point boards for the 730, 750 & 780 perform all
floating point instructions, EMOD, and POLY, and also perform
conversion between floating point and integer data types. In addition:

  FP730         - executes "32-bit integer multiplication and division
                  instructions."
  FP750         - "The FP750 also executes the integer multiplication
                  instruction."
  FPA aka FP780 - "enhances the performance of the 32-bit integer
                  multiplication instructions."

The VAX Architecture Handbook Vol.1 1986, pp.4-18, documents 43
instructions handled by the FP750, including MULW{23} and MULL{23}.
The specific instructions handled by the 86xx FBOX are not documented
there, but I'd guess that at least one of the two pairs of integer
opcodes handled by the FP750 is also handled by the 86xx FBOX (aka
FP86).

*** further, probably definitive research:

At long last, DEC has come through with a page entitled "FBOX
instruction set", listing the instructions handled by the FBOX. Here
they are:

  ADDF*, ADDD*, ADDG*, SUBF*, SUBD*, SUBG*, DIVF*, DIVD*, DIVG*,
  MULF*, MULD*, MULG*, MULH*, POLYF, POLYD, POLYG, POLYH, MULL*,
  EMUL, EMODF, EMODD, EMODG, EMODH, INDEX.
(where * denotes both the 2-operand and 3-operand formats)

Notice that this set does intersect with the instructions used by ld.s
(in MULL*, since ld.s uses 25 MULL2 and 2 MULL3 instructions). ld.s
does not use any of the other instructions assisted or executed by the
FBOX.

The documentation shows that some opcodes are "executed" by the FBOX
and for others the FBOX merely "assists". Those which are "assisted"
by the FBOX are MULH*, POLYH, MULL*, EMUL, EMODF, EMODD, EMODG, EMODH,
& INDEX. The remainder are "executed" by the FBOX.

For FBOX fans (i.e., number crunchers), I include here all (2
paragraphs) of the (unintended) info that DEC sent along with the FBOX
instruction set table. Note again that the FP86 (i.e. FBOX) is
*identical* in both the 8600 and 8650.

-------
The VAX 8600 Fbox is composed of two modules.

  1. Fbox Adder, FBA, a L0212 located in slot AC8
  2. Fbox multiplier, FBM, a L0213 located in slot AC7

If the FBOX is not installed, the following modules are installed.

  1. Fbox Terminator Module, FTM, a L0223 in slot AC8
  2. Fbox Jumper Module, FJM, a L0218 in slot AC7

If an Fbox is installed into a system that previously did not contain
an Fbox, no changes to the power system are required. The FTM and FJM
modules are described in appendix D.

The FBM contains jumpers that the console can read with the visibility
paths. These jumpers inform the console of the Fbox revision. If
either Fbox module is changed, the FBM must be changed to update the
Fbox revision number. Appendix D lists the functions of the FTM and
FJM modules.
------- end of DEC-supplied FBOX info -------

-David

David P. Lithgow
Sr. Systems Analy./Pgmr., Univ. of Pittsburgh
USENET:         {allegra,bellcore,ihpn4!cadre,decvax!idis,psuvax1}!pitt!cisunx!dpl
CCnet(DECnet):  CISVM{123}::DPL,CISVXO::DPL     (I admit it: I'm a unix and VMS)
Bitnet(Jnet):   psuvax1!dpl@pittvms.bitnet      ( systems programmer too)
ARPA: (via UUCP)    pitt!cisunx!dpl@cadre.arpa
ARPA: (via Bitnet)  DPL%PITTVMS.BITNET@WISCVM.WISC.EDU
CSNET:          dpl%pitt@csnet-relay
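
P.S. For anyone who wants to repeat the ld.s check against some other
program, the comparison of compiler output with the FBOX instruction
set can be sketched roughly as below. This is only an illustration: the
function names (expand, count_mnemonics, fbox_usage) are made up for
the example, and the line format assumed for `cc -S` output (optional
label, then the mnemonic, with `#` starting a comment) is my
assumption, not anything DEC documents. The executed/assisted split is
the one given in the DEC page described above. The counts are static
occurrences in the assembly, not how often each instruction runs.

```python
import re
from collections import Counter

# FBOX instruction set as listed in the post; "*" stands for the
# 2-operand and 3-operand forms (e.g. MULL* -> MULL2, MULL3).
FBOX_EXECUTED = ["ADDF*", "ADDD*", "ADDG*", "SUBF*", "SUBD*", "SUBG*",
                 "DIVF*", "DIVD*", "DIVG*", "MULF*", "MULD*", "MULG*",
                 "POLYF", "POLYD", "POLYG"]
FBOX_ASSISTED = ["MULH*", "POLYH", "MULL*", "EMUL", "EMODF", "EMODD",
                 "EMODG", "EMODH", "INDEX"]

# Assumed label syntax: a name followed by ":" at the start of a line.
LABEL = re.compile(r"^\s*[\w.$]+:\s*")


def expand(patterns):
    """Expand names ending in "*" into their 2- and 3-operand forms."""
    names = set()
    for p in patterns:
        if p.endswith("*"):
            names.update({p[:-1] + "2", p[:-1] + "3"})
        else:
            names.add(p)
    return names


def count_mnemonics(asm_text):
    """Count instruction mnemonics in assembly text (hypothetical format)."""
    counts = Counter()
    for line in asm_text.splitlines():
        line = line.split("#", 1)[0]      # drop trailing comment
        line = LABEL.sub("", line)        # drop a leading label, if any
        tokens = line.split()
        if tokens and not tokens[0].startswith("."):  # skip directives
            counts[tokens[0].upper()] += 1
    return counts


def fbox_usage(counts):
    """Return {mnemonic: (count, "executed" | "assisted")} for FBOX opcodes."""
    executed = expand(FBOX_EXECUTED)
    assisted = expand(FBOX_ASSISTED)
    usage = {}
    for name, n in counts.items():
        if name in executed:
            usage[name] = (n, "executed")
        elif name in assisted:
            usage[name] = (n, "assisted")
    return usage


if __name__ == "__main__":
    # Made-up fragment for illustration; not taken from the real ld.s.
    sample = """
    L1:     mull2   $12,r0
            movl    r0,r1
            mull3   r1,r2,r3
            .align  1
            addl2   $4,sp
    """
    for name, (n, kind) in sorted(fbox_usage(count_mnemonics(sample)).items()):
        print(name, n, kind)
```

Fed the real ld.s, a script along these lines should report only MULL2
and MULL3, both in the "assisted" group, if the egrep counts above are
right.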