cik@l.cc.purdue.edu (Herman Rubin) (11/27/88)
In article <6529@june.cs.washington.edu>, kolding@june.cs.washington.edu (Eric Koldinger) writes: > In article <1961@crete.cs.glasgow.ac.uk> orr@cs.glasgow.ac.uk (Fraser Orr) writes: > >I don't agree that there is ever any necessity to code in assembler. We > >have languages that produce code just as good as hand crafted assembler > >(such as C), so why not use them for this sort of thing. > > Ah, but wouldn't that be nice. Optimizing compilers that could generate > code as good as we can generate by hand in all cases. Let me know when > someone writes one. I agree completely. Also, on the various machines I know, there are operations I want to use about which the compiler knows nothing. To SOME extent, SOME of these can be added to the HLL. But I have not seen a scheme to do this which will not mess up attempts at automatic optimization. Also, these interesting and useful instructions vary somewhat from machine to machine. I am appalled by what the language designers have left out, and also what has been relegated to subroutines. What can one think of a compiler designer who has relegated to a subroutine an operation whose inline code is shorter than the caller's code to use the subroutine? This is rampant. I recently had the occasion to produce a subroutine for a peculiar function. In principle, it should have been done in a HLL. I would prefer to do so. BUT, the following was needed: Take a double floating number, and extract the exponent and the mantissa. This involves having a double long type. Reverse the above operation. Addition, shifting, Boolean operations on double long. Checking an "overflow field." Not the usual overflow. If these are available, C would do a good job PROVIDED it put everything in registers, if possible. A few of the compilers I have seen do a fair job; the others would get a D--. (Since D-- = C in C, this might be why C is so bad :-)) But one of the most amazing things that I have seen in the workings of the designers is the assumption that the compiler has all the information necessary to produce optimized code! There is no provision for input as to frequency of branches. Should the common condition be the branch or the rare condition? Does it make a difference in the combination? Since I have examples where the branches are of comparable frequencies, examples where the ratio of the frequencies are from 10-200, where the ratio is a few thousand, and where one branch may never occur, I certainly feel that I have input. I think the compilers should be interactive, and discuss the various possibilities with the programmer. I can even give cases where the dictum to remove a calculation from within a loop is wrong. All of mankind does not know enough to produce a good language, or to produce a good implementation of a language. There are more operations appropriate to hardware than are dreamed of in all computer scientists' philosophies. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
gwyn@smoke.BRL.MIL (Doug Gwyn ) (11/28/88)
In article <1031@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >... on the various machines I know, there are operations >I want to use about which the compiler knows nothing. There are many of us who don't feel obligated to make use of every little whiz-bang feature that hardware designers have so thoughtfully provided. Life is too short to spend it in an effort to accommodate hardware quirks.
orr@cs.glasgow.ac.uk (Fraser Orr) (11/29/88)
In article <1031@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
<A few of the compilers I have seen do a fair job; the others would get
<a D--. (Since D-- = C in C, this might be why C is so bad :-)) But one
<of the most amazing things that I have seen in the workings of the
<designers is the assumption that the compiler has all the information
<necessary to produce optimized code! There is no provision for input
<as to frequency of branches. Should the common condition be the branch
<or the rare condition? Does it make a difference in the combination?
<Since I have examples where the branches are of comparable frequencies,
<examples where the ratio of the frequencies are from 10-200, where the
<ratio is a few thousand, and where one branch may never occur, I certainly
<feel that I have input. I think the compilers should be interactive, and
<discuss the various possibilities with the programmer. I can even give
<cases where the dictum to remove a calculation from within a loop is wrong.
I agree entirely (including with the cheap jibe against C:^>). I think
that there would be great improvements if compilers were interactive.
Not perhaps realised with low level languages like C and its ilk,
but certainly realised in much higher level programming languages.
Interactive isn't necessarily the best by the way, since it means
you have to repeat yourself, and the decisions are not documented
in the program. There is no reason why you can't have an "advice"
command, that advises the compiler the best implementation. (Doesn't
"advise" sound so much better than "pragma"?)
<Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
==Fraser Orr ( Dept C.S., Univ. Glasgow, Glasgow, G12 8QQ, UK)
UseNet: {uk}!cs.glasgow.ac.uk!orr JANET: orr@uk.ac.glasgow.cs
ARPANet(preferred xAtlantic): orr%cs.glasgow.ac.uk@nss.cs.ucl.ac.uk
henry@utzoo.uucp (Henry Spencer) (11/29/88)
In article <1031@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >... What can one think of a compiler >designer who has relegated to a subroutine an operation whose inline code >is shorter than the caller's code to use the subroutine? ... If the operation is infrequently used and not efficiency-critical, then what one can think is "this guy knows his tradeoffs". Adding an operation to a compiler is not free. >... But one >of the most amazing things that I have seen in the workings of the >designers is the assumption that the compiler has all the information >necessary to produce optimized code! There is no provision for input >as to frequency of branches... Not true; some C implementations will take advice from the profiler on this. It's practically a necessity for the VLIW folks. (Oh, you meant advice from the *programmer*? :-) The guy who's wrong 3/4 of the time about such things? Silly idea.) (No, I'm not kidding -- programmer intuition is vastly inferior to measured data in such matters. This has been known for many years.) -- SunOSish, adj: requiring | Henry Spencer at U of Toronto Zoology 32-bit bug numbers. | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
shawn@pnet51.cts.com (Shawn Stanley) (11/29/88)
cik@l.cc.purdue.edu (Herman Rubin) writes: >All of mankind does not know enough to produce a good language, or to >produce a good implementation of a language. There are more operations >appropriate to hardware than are dreamed of in all computer scientists' >philosophies. What might be more to the point is that there is no single language that is perfect for every application. The right tool for the job, they always say. UUCP: {rosevax, crash}!orbit!pnet51!shawn INET: shawn@pnet51.cts.com
cik@l.cc.purdue.edu (Herman Rubin) (11/29/88)
In article <8993@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn ) writes: > In article <1031@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >... on the various machines I know, there are operations > >I want to use about which the compiler knows nothing. > > There are many of us who don't feel obligated to make use of every > little whiz-bang feature that hardware designers have so thoughtfully > provided. Life is too short to spend it in an effort to accommodate > hardware quirks. But these are the natural operations (to me as the algorithm designer). Which of them are on a given machine varies. A machine has few operations, and if the information were provided for a mathematician as a reader, it would take not 1 day but 2-3 hours to understand a machine's instructions. I find that the designers of the various languages have not considered the type of operations which I would want to use WITHOUT KNOWLEDGE OF THE MACHINE. A trivial example is having a list of results to the left of the replacement operator. I do not mean a vector or a struct; the items may be of different types, and should not be stored in adjacent memory locations. Most of the time, they should end up in registers. I have not seen a language which is claimed to produce even reasonably efficient code with this property. Some of these operations are even hardware. Another effect of the HLL tyranny is that the operations which are beyond the ken of th HLL designers are disappearing from the machines. Other useful operations are not getting in. For example, suppose we want to divide a by b, obtaining an integer result i and a remainder c. I know of no machine with this instruction, and this is not that unusual an instruction to demand. It is cheap in hardware, and extremely expensive in software--at least 4 instructions. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
mash@mips.COM (John Mashey) (11/29/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: ... >Another effect of the HLL tyranny is that the operations which are beyond >the ken of th HLL designers are disappearing from the machines. Other >useful operations are not getting in. For example, suppose we want to >divide a by b, obtaining an integer result i and a remainder c. I know >of no machine with this instruction, and this is not that unusual an >instruction to demand. It is cheap in hardware, and extremely expensive >in software--at least 4 instructions. Although I don't necessarily subscribe to Herman's opinions, R2000 divides actually do this (leave both results in registers). Although I shouldn't have been, I was surprised to find that the following C code, compiled unoptimized (optimized, it essentially disappears, of course): main() { register i,j,k; i = j / 7; k = j % 7; } generates one divide instruction to get both of the results. -- -john mashey DISCLAIMER: <generic disclaimer, I speak for me only, etc> UUCP: {ames,decwrl,prls,pyramid}!mips!mash OR mash@mips.com DDD: 408-991-0253 or 408-720-1700, x253 USPS: MIPS Computer Systems, 930 E. Arques, Sunnyvale, CA 94086
ok@quintus.uucp (Richard A. O'Keefe) (11/29/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >A trivial example is having a list of results to the left of >the replacement operator. I do not mean a vector or a struct; the items >may be of different types, and should not be stored in adjacent memory >locations. If you mean things like being able to write x, i, s := x/(1.0,+x), i-2, s || "x" || s CPL had it, BCPL has it (in sequential form), MESA has it (you have to put record constructor functions around both sides, but that's merely notational), and it has appeared in several experimental languages. Common Lisp supports it under the name PSETQ: (psetq x (/ x (+ 1.0 x)) i (- i 2) s (however-you-concatenate-strings s "x" s)) I have never understood why it was omitted from Pascal, Modula, ADA: x, y := y, x is the clearest way to exchange two variables I know. I once put this construct into the compiler for an Algol-like language; it was easy, and if the compiler had done any optimisation it would still have been easy. I don't see this as a question of getting at the machine level, but as providing one of the constructs in Dijkstra's notation (:-).
cjosta@taux01.UUCP (Jonathan Sweedler) (11/29/88)
In article <8938@winchester.mips.COM> mash@mips.COM (John Mashey) writes: |In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: |... |> suppose we want to |>divide a by b, obtaining an integer result i and a remainder c. I know |>of no machine with this instruction, and this is not that unusual an |>instruction to demand. It is cheap in hardware, and extremely expensive |>in software--at least 4 instructions. | |Although I don't necessarily subscribe to Herman's opinions, R2000 divides |actually do this (leave both results in registers). The 32000 series has a DEI (Divide Extended Integer) instruction that also does this. -- Jonathan Sweedler === National Semiconductor Israel UUCP: ...!{amdahl,hplabs,decwrl}!nsc!taux01!cjosta Domain: cjosta@taux01.nsc.com
colwell@mfci.UUCP (Robert Colwell) (11/29/88)
In article <1988Nov28.205129.2216@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes: ]In article <1031@l.cc.purdue.edu] cik@l.cc.purdue.edu (Herman Rubin) writes: ]]... But one ]]of the most amazing things that I have seen in the workings of the ]]designers is the assumption that the compiler has all the information ]]necessary to produce optimized code! There is no provision for input ]]as to frequency of branches... ] ]Not true; some C implementations will take advice from the profiler on ]this. It's practically a necessity for the VLIW folks. (Oh, you meant ]advice from the *programmer*? :-) The guy who's wrong 3/4 of the time ]about such things? Silly idea.) (No, I'm not kidding -- programmer ]intuition is vastly inferior to measured data in such matters. This ]has been known for many years.) Programmers ARE often wrong. We've even seen cases where the programmer had put in assertions for his Cray code that were provably incorrect for certain interesting sets of input data. But for branching, I wouldn't say it's a necessity for us to have the ability for the profiler to tell the compiler where branches are going. We have that mechanism, but it isn't often used. The heuristics built in to the compiler for predicting directions work awfully well. And if the programmer provides branch assertions, and they're wrong, and the performance is adversely affected, presumably he or she will discover that and fix it. Bob Colwell ..!uunet!mfci!colwell Multiflow Computer or colwell@multiflow.com 175 N. Main St. Branford, CT 06405 203-488-6090
desnoyer@Apple.COM (Peter Desnoyers) (11/30/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > > For example, suppose we want to >divide a by b, obtaining an integer result i and a remainder c. I know >of no machine with this instruction, and this is not that unusual an >instruction to demand. I can't think of a machine that doesn't. The only machine language manuals on my desk (68000 and HPC16083) show this instruction. If anyone knows of a machine that doesn't leave the remainder in a register, I would be interested in knowing of it. Such a machine either throws the remainder away or uses some algorithm besides shift-and-subtract. (for instance hard-wired.) Peter Desnoyers
pardo@june.cs.washington.edu (David Keppel) (11/30/88)
>>cik@l.cc.purdue.edu (Herman Rubin) writes: >>>divide a by b, obtaining an integer result i and a remainder c. >>>I know of no machine with this instruction. It is cheap in hardware, >>>and extremely expensive in software--at least 4 instructions. >mash@mips.COM (John Mashey) writes: >>[R2000 has 1-instruction divide that does this ] cjosta@taux01.UUCP (Jonathan Sweedler) writes: >The 32000 series has a DEI (Divide Extended Integer) instruction that >also does this. The rather obscure but still-available VAX series of computers made by a small Nashua, New Hampshire company (Digital Equipment Corporation) introduced the EDIV instruction recently, about 1978. I heard a rumor that (at least in some early/small VAXen) it was faster to do two separate instructions than to use EDIV, although I suspect that this was an artifact of those implementations. Followups to comp.arch. ;-D on ( Now how about that `editpc' opcode? ) Pardo -- pardo@cs.washington.edu {rutgers,cornell,ucsd,ubc-cs,tektronix}!uw-beaver!june!pardo
consult@osiris.UUCP (Unix Consultation Mailbox ) (11/30/88)
I've trimmed all the language-specific groups out of the posting list for this followup and directed followups to just comp.lang.misc and comp.arch. (I feel the issues being discussed are too general to make it useful to post outside of comp.lang.misc, with the exception of comp.arch, where many of the points being raised will look very familiar.) In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >A trivial example [of useful operations unsupported in HLLs] is having a >list of results to the left of the replacement operator. I do not mean a >vector or a struct; the items may be of different types, and should not be >stored in adjacent memory locations. Most of the time, they should end up >in registers. I have not seen a language which is claimed to produce even >reasonably efficient code with this property. Some of these operations are >even hardware. I'll have to ask... what kind of things are you talking about here (beyond the really obvious one, see below)? (Which of them "are even hardware"?) Why "shouldn't" the results be stored in adjacent memory locations? And if you want to be able to take advantage of the knowledge that they're in registers, you'll probably have to use assembly anyway, so why is this a requirement for HLLs? The only "languages" I've seen where any of this stuff makes any sense are all assembly, and then only when catering to the specifics of the hardware, which can be different even between models of the "same" box. >Another effect of the HLL tyranny is that the operations which are beyond >the ken of th HLL designers are disappearing from the machines.... For >example, suppose we want to divide a by b, obtaining an integer result i >and a remainder c. I know of no machine with this instruction, and this >is not that unusual an instruction to demand. It is cheap in hardware, >and extremely expensive in software--at least 4 instructions. I know of at least one machine with this instruction (the LSI-11 with EIS for integer divide and multiply). There are probably many more. As for it being "cheap in hardware", that's not really true, although if you have the divide it's virtually free to offer divide with remainder. I tend to agree with Herman that it would be frequently useful to be able to do this in a HLL. However, frequently useful may not mean useful enough, as readers of comp.arch well know. HLL designers must justify features they include, lest their language be named Ada (TM Dept of Redundancy Dept :-). Also, there's that little problem Herman mentioned in his followup to Doug Gwyn, that different hardware may not implement desired features, or may implement them differently enough to make incorporating them into a standard HLL very very difficult. Successful HLLs tend to deal with common subsets of hardware features, which are increasingly small these days what with diverging architectures. The newfangled RISC chips and boards are helping somewhat although even their designers can't seem to agree on what constitutes an ideal instruction set (aside from the Z/OISC people who think even Turing machines are too complicated). Bear in mind that I am not an expert, merely someone with lots of experience with various machines and HLLs. In other words, I'm probably wrong, but probably not by much. I think that if Herman had been following comp.arch religiously for the last few years he may not have had to ask any of these questions. Maybe we need a list of answers to commonly asked questions... it might help avoid things like the recent "which is better, 80386 or 68030" posting... Phil Kos Information Systems ...!uunet!pyrdc!osiris!phil The Johns Hopkins Hospital Baltimore, MD
cik@l.cc.purdue.edu (Herman Rubin) (11/30/88)
In article <949@taux01.UUCP<, cjosta@taux01.UUCP (Jonathan Sweedler) writes: < In article <8938@winchester.mips.COM> mash@mips.COM (John Mashey) writes: > |In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > |... > |> suppose we want to > |>divide a by b, obtaining an integer result i and a remainder c. I know > |>of no machine with this instruction, and this is not that unusual an > |>instruction to demand. It is cheap in hardware, and extremely expensive > |>in software--at least 4 instructions. > | > |Although I don't necessarily subscribe to Herman's opinions, R2000 divides > |actually do this (leave both results in registers). < < The 32000 series has a DEI (Divide Extended Integer) instruction that < also does this. I do not know if I made it clear in my initial posting, but the problem arises if the types of a, b, and c are floating. Not that the quote from my paper specifically has i an integer. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
dik@cwi.nl (Dik T. Winter) (11/30/88)
In article <21390@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes: > In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > > > > For example, suppose we want to > >divide a by b, obtaining an integer result i and a remainder c. I know > >of no machine with this instruction, and this is not that unusual an > >instruction to demand. > > I can't think of a machine that doesn't. The only machine language > manuals on my desk (68000 and HPC16083) show this instruction. > If anyone knows of a machine that doesn't leave the remainder in > a register, I would be interested in knowing of it. Such a machine > either throws the remainder away or uses some algorithm besides > shift-and-subtract. (for instance hard-wired.) > Well, if I remember right, the we32000 does that (the thing in a 3b2 and a tty5620). It has a divide, quotient, remainder and modulus instruction like the ns32000, but unlike the ns32000 no instruction that gives you both. This is (I think) an example of a machine for which the instructions are dictated by a HLL (C in this case), which Herman Rubin so deplores. On the other hand, this need not be wrong if the operations can be pipelined. E.g. if i/j and i%j take 23 cycles each, but the second can start one cycle after the first you do not lose very much. In article <1034@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > I do not know if I made it clear in my initial posting, but the problem > arises if the types of a, b, and c are floating. Not that the quote from > my paper specifically has i an integer. Indeed you did not make that clear (seems evident :-)). I know no machines that gives you both quotient and remainder on a floating point division, stronger still, before the IEEE standard there was (I believe) no machine that would give you the remainder, only quotients allowed. IEEE changed that and many machines now have floating point division and floating point remainder, but the result is floating point for both. This explains also why not both are returned. To obtain the (exact) remainder may require a larger number of steps than the (inexact) quotient. I agree however that those machines could return the quotient when doing a remainder operation. Note that IEEE did *not* mandate operations that return two results. So no machine I know does return two results; even with sine and cosine, which is implemented on many (as an extension to IEEE?). The CORDIC algorithm will give you fast both sine and cosine, but is only efficient in hardware. But also here, pipelining may be a deciding point. -- dik t. winter, cwi, amsterdam, nederland INTERNET : dik@cwi.nl BITNET/EARN: dik@mcvax
news@ism780c.isc.com (News system) (11/30/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: ... > suppose we want to >divide a by b, obtaining an integer result i and a remainder c. I know >of no machine with this instruction, and this is not that unusual an >instruction to demand. It is cheap in hardware, and extremely expensive >in software--at least 4 instructions. Then he clarifies: >I do not know if I made it clear in my initial posting, but the problem >arises if the types of a, b, and c are floating. Not that the quote from >my paper specifically has i an integer. >-- >Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Ok here is one. Quoting from the IBM/709 Reference Manual (Circa 1959) FDH - Floating Divide or Halt THe c(ac) [means contents of ac register] are divided by c(y). The quotent appears in the mq and the remainder appears in the ac. ... I leave it to Herman to find out why more modern machines leave this out. Hint: Try profiling your aplication to see what percent of the total process time is used producing the remainder. Marv Rubinstein.
ok@quintus.uucp (Richard A. O'Keefe) (11/30/88)
In article <21390@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes: >In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >>For example, suppose we want to >>divide a by b, obtaining an integer result i and a remainder c. I know >>of no machine with this instruction, and this is not that unusual an >>instruction to demand. > >I can't think of a machine that doesn't. Note that Rubin was careful to identify *i* as integer, but NOT the other variables! Consider, for example, Burroughs Extended Algol, where x DIV y was defined to be SIGN(x/y)*ENTIER(ABS(x/y)) and x MOD y was defined to be x - (x DIV y)*y These definitions make perfect sense for any real x, y for which x/y is defined. And Burroughs Algol did in fact extend the definitions to real and double precision x and y as well as integers. But the B6700 did not have any instructions with two results that I can recall (discounting SWAP), so it only meets part of Rubin's requirements. In fact, if we define SIGN(x) = if x = 0 then 0 else x/ABS(x), you can extend DIV and MOD to the complex numbers as well, not that it's quite as useful... As for machines which lack an *integer* divide with those properties, the VAX has an EDIV instruction, but it is awkward to set up. The 4.1BSD C compiler would generate the equivalent of x-(x/y)*y for x%y. For those with SPARCs, a look at the code generated for x%y might be illuminating.
jmacdon@cg-atla.UUCP (Jeff MacDonald) (11/30/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: ... > suppose we want to >divide a by b, obtaining an integer result i and a remainder c. I know >of no machine with this instruction, and this is not that unusual an >instruction to demand. It is cheap in hardware, and extremely expensive >in software--at least 4 instructions. Both the 80x86 and 680x0 machines have integer divide instructions which leave quotient and remainder. So there. Think we've beat Herman up enough? -- Jeff MacDonald ((decvax||ulowell)!cg-atla!jmacdon) c/o Compugraphic Corporation 200-3-5F 200 Ballardvale, Wilmington, MA 01887 (508) 658-0200, extension 5406
barmar@think.COM (Barry Margolin) (12/01/88)
In article <1989@crete.cs.glasgow.ac.uk> orr@cs.glasgow.ac.uk (Fraser Orr) writes: >I think that there would be great improvements if compilers were interactive. ... >Interactive isn't necessarily the best by the way, since it means >you have to repeat yourself, and the decisions are not documented >in the program. How about if the compiler saved the answers in the program text as explicit ADVICE statements? The compiler becomes a simple form of Programmer's Apprentice, helping the programmer determine where potential optimizations can be specified. The decisions become documented, and they don't have to be repeated. Barry Margolin Thinking Machines Corp. barmar@think.com {uunet,harvard}!think!barmar
barmar@think.COM (Barry Margolin) (12/01/88)
In article <771@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes: >In fact, if we define SIGN(x) = if x = 0 then 0 else x/ABS(x), you can >extend DIV and MOD to the complex numbers as well, not that it's quite >as useful... Which is, in fact, precisely how Common Lisp (which has complex and rational numbers built in) defines SIGNUM. This definition computes the unit vector colinear with the vector in the complex plane that the original number specifies. Barry Margolin Thinking Machines Corp. barmar@think.com {uunet,harvard}!think!barmar
desnoyer@Apple.COM (Peter Desnoyers) (12/01/88)
In article <1034@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >I do not know if I made it clear in my initial posting, but the problem >arises if the types of a, b, and c are floating. Not that the quote from >my paper specifically has i an integer. This was only implicitly stated, at best, in the original posting by use of Fortran variable naming conventions. I read the article in comp.lang.c. Enough said. Now that I understand the original question, I have a new answer - There is no natural divide-type operation mapping (real,real) -> (real,integer). The operation mentioned was either poorly defined, or merely consisted of: convert a, b to int div a,b -> int quotient, remainder convert remainder to float If what you mean is that for any integer or floating-point operation op(i1,i2,...in,o1,o2...,om) with n inputs 'i' and m outputs 'o', the full set of 2^^n+m combinations of i1 : {int, float}, ... om : {int, float} should be supported as hardware instructions, then the idea is patently absurd. Peter Desnoyers
alverson@decwrl.dec.com (Robert Alverson) (12/01/88)
In fact, the Intel 8087 almost computes i, r := a/b, a%b; The floating point remainder function returns the reminader as the function result, and the least 3 bits of the integer quotient are stored in the condition codes. For some typical uses of the remainder function (range reduction), this is all you need. Bob
gandalf@csli.STANFORD.EDU (Juergen Wagner) (12/01/88)
In article <32354@think.UUCP> barmar@kulla.think.com.UUCP (Barry Margolin) writes: >... >How about if the compiler saved the answers in the program text as >explicit ADVICE statements? The compiler becomes a simple form of >Programmer's Apprentice, helping the programmer determine where >potential optimizations can be specified. The decisions become >documented, and they don't have to be repeated. If you have such a sophisticated compiler, that would be more a tool for automated program generation than a compiler in the classical sense. Such a tool could also be fed with a pure declarative specification of what you intend to do, and it should have knowledge about all optimizations for the architectures you're going to run that program on. Of course, if you learn something really nifty about optimizing code, you would like to tell that "compiler" to use the new technique with your code. I guess, this comes close to what some LISP environments offer, where you can modify the compiler if you don't like it the way it is. In such an environment, the programmer should not only have the opportunity to choose between alternatives the compiler already knows, but also to introduce new specialized strategies for certain applications. A major problem would then be consistency of the resulting system... If I had to use such a tool in the C environment, I would certainly ask for incremental compilation, too. It is a pain in the neck if you have to recompile and relink some of your modules every time you are changing a single character in a function. Hmm... I guess, I am shifting to the more general question of suitable software development environments for C or other languages... 'nuff said. -- Juergen Wagner gandalf@csli.stanford.edu wagner@arisia.xerox.com
smryan@garth.UUCP (Steven Ryan) (12/01/88)
>I can't think of a machine that doesn't. The only machine language 6600/Cyber 170, Cyber 205 don't return the remainder, nor is there a mod instruction. -- -- s m ryan +------------------------------------------------------+-----------------------+ | ...and they were all in exactly the same nightmarish | The nerve-agent | | state: their faces were wholly burned, their | causality...will die | | eyesockets where hollow, the fluid from their melted | of asphyxiation within| | eyes had run down their cheeks. | a few minutes.... | +------------------------------------------------------+-----------------------+
ech@poseidon.ATT.COM (Edward C Horvath) (12/01/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > suppose we want to >divide a by b, obtaining an integer result i and a remainder c. I know >of no machine with this instruction,... In article <8938@winchester.mips.COM> mash@mips.COM (John Mashey) writes: > Although I don't necessarily subscribe to Herman's opinions, R2000 divides > actually do this (leave both results in registers). From article <949@taux01.UUCP>, by cjosta@taux01.UUCP (Jonathan Sweedler): > The 32000 series has a DEI (Divide Extended Integer) instruction that > also does this. Hmm, add the MC68K, the PDP-11, and the IBM s/360 et fils. Put another way, does anyone have an example of a common processor that DOESN'T give you the remainder and quotient at the same time? I don't know the Intel chips, so perhaps the original author just knows that the *86 divide doesn't do this. It's interesting, though, how few languages provide such a "two-valued" functions (all right, I can feel the mathematicians cringing. So few languages provide functions with ranges like ZxZ, OK?). I've seen implementations of FORTH, by the way, where the expression a b /% for example, divides a by b, leaving a/b and a%b on the stack. Of course, if your favorite flavor of forth didn't provide the /% operator ("word") you'd just define it... =Ned=
ok@quintus.uucp (Richard A. O'Keefe) (12/01/88)
In article <21440@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes: >In article <1034@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >>[about the iq, r := x div y, x mod y instruction] >The operation mentioned was either poorly defined, or merely consisted of: > convert a, b to int > div a,b -> int quotient, remainder > convert remainder to float Wrong. The important thing is that the remainder is remainder = a - b*INT(a/b) I am sure that the IEEE floating-point committee would be interested to learn that it is not a "natural divide-type operation"; this is precisely the IEEE drem(a,b) function. Quoting the SunOS manual: drem(x, y) returns the remainder r := x - n*y where n is the integer nearest the exact value of x/y; moreover if |n-x/y|=1/2 then n is even. Consequently the remainder is computed exactly and |r| <= |y|/2. ... drem(x, 0) returns a NaN. This is obviously a range reduction operator. Oddly enough, on most present-day machines, there is a good excuse for _not_ returning the quotient (n) as well: with 64-bit floats and 32-bit integers there is no reason to expect n to be representable as a machine "integer". Both results would have to be floats. And in its use as a range reduction operation, you normally aren't interested in n.
cik@l.cc.purdue.edu (Herman Rubin) (12/01/88)
In article <2061@garth.UUCP>, smryan@garth.UUCP (Steven Ryan) writes: > >I can't think of a machine that doesn't. The only machine language > > 6600/Cyber 170, Cyber 205 don't return the remainder, nor is there a mod > instruction. On both of these machines, the division must be done in floating point and the quotient converted to integer, etc. On the CRAYs, division does not even exist--a reciprocal must be computed, and then a floating point quotient, etc. Since the machines are scoreboarded, a mod operation not tying up the hardware should be spread over some 50 instructions for the 6600 or CRAY and 75 instructions on the 205 to be efficient. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
seanf@sco.COM (Sean Fagan) (12/01/88)
In article <6547@june.cs.washington.edu> pardo@cs.washington.edu (David Keppel) writes: -->>>cik@l.cc.purdue.edu (Herman Rubin) writes: -->>>>divide a by b, obtaining an integer result i and a remainder c. -->>>>I know of no machine with this instruction. It is cheap in hardware, -->>>>and extremely expensive in software--at least 4 instructions. -->>mash@mips.COM (John Mashey) writes: -->cjosta@taux01.UUCP (Jonathan Sweedler) writes: -->The rather obscure but still-available VAX series of computers made by -->a small Nashua, New Hampshire company (Digital Equipment Corporation) -->introduced the EDIV instruction recently, about 1978. I heard a rumor -->that (at least in some early/small VAXen) it was faster to do two -->separate instructions than to use EDIV, although I suspect that this -->was an artifact of those implementations. What the hell. Here is a list of machines that I've worked on that had this type of instruction: PDP-11, 8086, 80286, 80386, VAX, NS32k, 68k, WE32100. I wouldn't swear to it, but I think the Elxsi might also have this. On reflection, that is *every* machine I've worked on (at least, in assembly language), except for the following: Z80, 6502, Cyber 170-state machines. The Z80 and 6502 don't have divide, so I think we can ignore them. This sequence, by the way, is *really* painful on a Cyber (and, I would expect, a Cray), because the Cyber has no integer divide. The mnemonic for an integer divide actually converts into a floating point divide, after trashing a couple of registers for you, and then packs the result back into an integer. As a result, when I wanted a modulus, I ended up writing *many* instructions (but they overlapped!), using a, *ahem*, very interesting idea of register allocation. -- Sean Eric Fagan | "Engineering without management is *ART*" seanf@sco.UUCP | Jeff Johnson (jeffj@sco) (408) 458-1422 | Any opinions expressed are my own, not my employers'.
cik@l.cc.purdue.edu (Herman Rubin) (12/01/88)
In article <2416@osiris.UUCP>, consult@osiris.UUCP (Unix Consultation Mailbox ) writes: > I've trimmed all the language-specific groups out of the posting list for > this followup and directed followups to just comp.lang.misc and comp.arch. > (I feel the issues being discussed are too general to make it useful to post > outside of comp.lang.misc, with the exception of comp.arch, where many of > the points being raised will look very familiar.) > > > In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >A trivial example [of useful operations unsupported in HLLs] is having a > >list of results to the left of the replacement operator. I do not mean a > >vector or a struct; the items may be of different types, and should not be > >stored in adjacent memory locations. Most of the time, they should end up > >in registers. I have not seen a language which is claimed to produce even > >reasonably efficient code with this property. Some of these operations are > >even hardware. > > I'll have to ask... what kind of things are you talking about here (beyond > the really obvious one, see below)? (Which of them "are even hardware"?) > Why "shouldn't" the results be stored in adjacent memory locations? And if > you want to be able to take advantage of the knowledge that they're in > registers, you'll probably have to use assembly anyway, so why is this a > requirement for HLLs? The only "languages" I've seen where any of this > stuff makes any sense are all assembly, and then only when catering to the > specifics of the hardware, which can be different even between models of the > "same" box. Besides the quotient and remainder, there are operations like unpacking floating point numbers into exponent and mantissa, which are hardware on some machines, and should be on others. I know of machines for which the hardware instuction to do this even puts the results in different TYPES of registers. A few HLL implementations attempt to make use of registers, although most of them do things badly. Besides having this situation for hardware operations, it should also be use for software. Take, for example, the C function frexp. The syntax is y = frexp(x,&n) instead of the more reasonable y,n = frexp(x) There is no reason why the compiler must store the integer n in some location. But even if this is to be done, the second notation is more appropriate, as it clearly specifies that the operation returns TWO values. A commonly used procedure in numerical mathematics is interpolation. If one is using fixed interval interpolation, one has the operation n,x = m*y where n is the integer part of the product and x the fraction. This is hardware on the VAX, and probably should be done in fixed-point arithmetic on most other machines. (What HLLs support fixed-point "real" arithmetic?) This is especially the case if m is a power of 2, in which case the multiply instruction should usually not be used. But in any case, the pair n,x is the result of the operation. Also, one frequently needs a function and its derivative. The usual bad practice is to use two calls. In most cases, a common calculation is more efficient. This would be much more common if the list on the left of = notation were in use. Mathematicians have been using functions returning a list for centuries. How come the language gurus do not understand this yet? > >Another effect of the HLL tyranny is that the operations which are beyond > >the ken of th HLL designers are disappearing from the machines.... For > >example, suppose we want to divide a by b, obtaining an integer result i > >and a remainder c. I know of no machine with this instruction, and this > >is not that unusual an instruction to demand. It is cheap in hardware, > >and extremely expensive in software--at least 4 instructions. > > I know of at least one machine with this instruction (the LSI-11 with EIS > for integer divide and multiply). There are probably many more. As for it > being "cheap in hardware", that's not really true, although if you have the > divide it's virtually free to offer divide with remainder. I do not know if I made this clear, but a, b, and c are floating. > I tend to agree with Herman that it would be frequently useful to be able to > do this in a HLL. However, frequently useful may not mean useful enough, as > readers of comp.arch well know. HLL designers must justify features they > include, lest their language be named Ada (TM Dept of Redundancy Dept :-). > Also, there's that little problem Herman mentioned in his followup to Doug > Gwyn, that different hardware may not implement desired features, or may > implement them differently enough to make incorporating them into a standard > HLL very very difficult. Successful HLLs tend to deal with common subsets > of hardware features, which are increasingly small these days what with > diverging architectures. The newfangled RISC chips and boards are helping > somewhat although even their designers can't seem to agree on what > constitutes an ideal instruction set (aside from the Z/OISC people who think > even Turing machines are too complicated). > > Bear in mind that I am not an expert, merely someone with lots of experience > with various machines and HLLs. In other words, I'm probably wrong, but > probably not by much. > > I think that if Herman had been following comp.arch religiously for the last > few years he may not have had to ask any of these questions. Maybe we need > a list of answers to commonly asked questions... it might help avoid things > like the recent "which is better, 80386 or 68030" posting... A check of comp.arch would show that I have been posting to comp.arch for the past few years. These questions have not been answered. I also have been posting to several of the comp.lang groups. The fact that operations are not available on all machines does not necessarily mean that they should not be in the languages. I recently had to use the unpack described above from double to long long on a VAX. It SHOULD have been possible to add this to the language as a macro without having to go through gobs of overhead, as well as the other operations used not present in any HLL. The algorithm is statable in an augmented HLL and the machine dependencies are handled by having a few machine parameters available in a #include file. I have a fair amount of experience with machine operations on different machines, and I find it easier to think in terms of them and macros than to use some of the extremely clumsy HLL constructs. But by a macro processor, I mean something like what a compiler goes through in translating x = y - z into machine code. The problem with assembler language is that the above would have to be written some-absurd-mnemonic some permutation of x,y,z in most assemblers. I do know of exceptions. Now in the early days, when it was hard for computers to handle character strings, the above was needed. What is a compiler but a typed, overloaded operator, macro converter with a few gadgets added? Some of these gadgets can be very useful, but augmenting the first part to allow more types and more overloading (C++ does this, but clumsily), user-defined operators, and more flexibility would be far more useful. Also, people who understand the use of hardware-type operations can suggest them. The quotient and remainder in floating-point should be based on simialar hardware to the integer version with an internal trap if the divisor has larger magnitude than the divident to avoid loss of accuracy. I have given other examples where hardware was much better than software. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
cik@l.cc.purdue.edu (Herman Rubin) (12/01/88)
In article <7740@boring.cwi.nl>, dik@cwi.nl (Dik T. Winter) writes: > In article <21390@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes: < < In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: < < > < < > For example, suppose we want to < < >divide a by b, obtaining an integer result i and a remainder c. I know < < >of no machine with this instruction, and this is not that unusual an < < >instruction to demand. < < < < I can't think of a machine that doesn't. The only machine language < < manuals on my desk (68000 and HPC16083) show this instruction. < < If anyone knows of a machine that doesn't leave the remainder in < < a register, I would be interested in knowing of it. Such a machine < < either throws the remainder away or uses some algorithm besides < < shift-and-subtract. (for instance hard-wired.) The CDC6000s and its CYBER successors and the CYBER 205 (a totally different architecture) do not even have integer division. The CRAYs do not even have division. Certainly the VAX and the PYRAMID only have the above properties if an EDIV rather than a DIV is used for division, and that had a few more problems than one would expect because they do not use sign-magnitude. It takes two instructions even to do the sign extension. > Well, if I remember right, the we32000 does that (the thing in a 3b2 and a > tty5620). It has a divide, quotient, remainder and modulus instruction > like the ns32000, but unlike the ns32000 no instruction that gives you > both. This is (I think) an example of a machine for which the instructions > are dictated by a HLL (C in this case), which Herman Rubin so deplores. > On the other hand, this need not be wrong if the operations can be > pipelined. E.g. if i/j and i%j take 23 cycles each, but the second can > start one cycle after the first you do not lose very much. Can division be pipelined? In scalar mode on the CYBER 205, division is not subject for pipelining. I believe that this is the case because too much of the dividend must be retained throughout. > In article <1034@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: < < I do not know if I made it clear in my initial posting, but the problem < < arises if the types of a, b, and c are floating. Not that the quote from < < my paper specifically has i an integer. > > Indeed you did not make that clear (seems evident :-)). > I know no machines that gives you both quotient and remainder on a floating > point division, stronger still, before the IEEE standard there was (I > believe) no machine that would give you the remainder, only quotients > allowed. IEEE changed that and many machines now have floating point > division and floating point remainder, but the result is floating point > for both. This explains also why not both are returned. To obtain the > (exact) remainder may require a larger number of steps than the > (inexact) quotient. I agree however that those machines could return the > quotient when doing a remainder operation. Note that IEEE did *not* > mandate operations that return two results. So no machine I know does > return two results; even with sine and cosine, which is implemented on > many (as an extension to IEEE?). The CORDIC algorithm will give you fast > both sine and cosine, but is only efficient in hardware. But also here, > pipelining may be a deciding point. > -- > dik t. winter, cwi, amsterdam, nederland > INTERNET : dik@cwi.nl > BITNET/EARN: dik@mcvax The implementation of quotient and remainder should be a simple modification of the fixed point algorithm. Those machines which have floating point remainder in hardware presumably do this. Even if CORDIC is not used, returning the sine and cosine would be faster in software than two function calls. The corresponding function is in some of the libraries I know, but I have not seen any other implementation than the two calls. This also holds for many other reasonable situations in software. In software, pipelining is certainly not possible except in vector situations. The lack of inclusion of a list to the left of the = in a replacement statement should have been obvious to all HLL designers from day 1; mathematicians have used functions returning a list for centuries. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
cik@l.cc.purdue.edu (Herman Rubin) (12/01/88)
In article <19848@ism780c.isc.com>, news@ism780c.isc.com (News system) writes: > In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > ... > Then he clarifies: > > >I do not know if I made it clear in my initial posting, but the problem > >arises if the types of a, b, and c are floating. Not that the quote from > >my paper specifically has i an integer. > Ok here is one. Quoting from the IBM/709 Reference Manual (Circa 1959) > > FDH - Floating Divide or Halt > THe c(ac) [means contents of ac register] are divided by c(y). The > quotent appears in the mq and the remainder appears in the ac. ... The format of this allows the multiple precison division of a single precision floating point number by a single precision number, and was probably used in computing double precision quotients, but it has nothing to do with the question I raised. The quotient is a truncated floating point quotient, and the remainder is the remainder from this operation. Even denormalization, which was possible on that machine, could not get the quotient as an integer. The easiest way to do it on the 709 was first to check if the quotient was 0, in which case the result is clear. Otherwise, do the appropriate unpacking and shifting, do an integer divide, and repack the integer remainder. This was more efficient on the 709 than on current machines, because division was relatively slower and comparisons and transfers were FAST instructions in those days. (No, they have not gotten slower; the others have gotten relatively much faster.) -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
khb%chiba@Sun.COM (Keith Bierman - Sun Tactical Engineering) (12/02/88)
In article <606@poseidon.ATT.COM> ech@poseidon.ATT.COM (Edward C Horvath) writes: > >It's interesting, though, how few languages provide such a "two-valued" >functions (all right, I can feel the mathematicians cringing. So few >languages provide functions with ranges like ZxZ, OK?). I've seen >implementations of FORTH, by the way, where the expression > a b /% >for example, divides a by b, leaving a/b and a%b on the stack. Of >course, if your favorite flavor of forth didn't provide the /% operator >("word") you'd just define it... > F88 does (define a new data type, define a function which returns it, possibly overload some operator). Keith H. Bierman It's Not My Fault ---- I Voted for Bill & Opus
ecphssrw@solaria.csun.edu (Stephen Walton) (12/02/88)
Can this discussion be ended, or at least moved to ONLY comp.lang.misc? This is the second extended language war in the last few months (the last one was C vs. Fortran), and these two have convinced me to auto-KILL all comp.lang.fortran mesages cross-posted to comp.lang.c. -- Stephen Walton, Dept. of Physics & Astronomy, Cal State Univ. Northridge RCKG01M@CALSTATE.BITNET ecphssrw@afws.csun.edu swalton@solar.stanford.edu ...!csun!afws.csun.edu!bcphssrw
khb%chiba@Sun.COM (Keith Bierman - Sun Tactical Engineering) (12/02/88)
In article <1039@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >Can division be pipelined? In scalar mode on the CYBER 205, division is not >subject for pipelining. I believe that this is the case because too much of >the dividend must be retained throughout. The cydra 5 had it pipelined, but the pipe was interlocked during operation. This was said to be due to a lack of real estate (viz could have been fully pipelined if (a) not for gradual underflow (the cydra 5 was a fully ieee compliant machine), (b) there was more space it could have been done anyway. > >Even if CORDIC is not used, returning the sine and cosine would be faster in >software than two function calls. The corresponding function is in some of >the libraries I know, but I have not seen any other implementation than the >two calls. DEC's famous math$sincos does this. SunFORTRAN takes two calls (in the user code) and turns them into one call to the sun sincos routine. Unless I am mistaken this is a very old and well known optimization (which is typically invisible to the user). On many machines the cost of both is only a couple of % more than the cost of one. Keith H. Bierman It's Not My Fault ---- I Voted for Bill & Opus
consult@osiris.UUCP (Unix Consultation Mailbox ) (12/02/88)
In article <1034@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >I do not know if I made it clear in my initial posting, but the problem >arises if the types of a, b, and c are floating. Not that the quote from >my paper specifically has i an integer. And I, ex-FORTRAN programmer that I am, should have picked up on this immediately... :-) Phil Kos Information Systems ...!uunet!pyrdc!osiris!phil The Johns Hopkins Hospital Baltimore, MD
ok@quintus.uucp (Richard A. O'Keefe) (12/02/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > suppose we want to >divide a by b, obtaining an integer result i and a remainder c. I know >of no machine with this instruction,... Various people have listed machines with integer quotient-and-remainder instructions. I seldom agree with Herman Rubin, but he is quite capable of reading an instruction set manual. At any rate, he is better at that than most of the repliers were at reading his message. He was asking for the equivalent of double a, b, r; long int i; r = drem(a, b); i = (int)( (a-r)/b ); Another poster says >I've implementations of FORTH, by the way, where the expression > a b /% >for example, divides a by b, leaving a/b and a%b on the stack. Pop-2 has a//b which does much the same thing. Common Lisp has (floor number divisor) plus ceiling, truncate, and round. All four functions return TWO values: the quotient and the remainder. E.g. (multiple-value-setq (Q R) (truncate N D)) In particular, Common Lisp's (round - -) function, if given floating- point arguments, is the function that Rubin wants a single instruction for. (Well, that's drem() anyway. He might want one of the other three.) I don't know, but I wouldn't be surprised if the Symbolics machines had an instruction for this.
ok@quintus.uucp (Richard A. O'Keefe) (12/02/88)
In article <1037@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >> In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >> >A trivial example [of useful operations unsupported in HLLs] is having a >> >list of results to the left of the replacement operator. I do not mean a >> >vector or a struct; the items may be of different types, and should not be >> >stored in adjacent memory locations. >Besides the quotient and remainder, there are operations like unpacking >floating point numbers into exponent and mantissa, which are hardware on >some machines, and should be on others. The problem of functions returning a single result has long been a thorn in the side of Lisp programmers (who didn't have VAR parameters either). The Lisp community finally fixed this. See "multiple-" in the index of "Common Lisp the Language". As a specific example of this: (multiple-value-setq (Significand Exponent Sign) (decode-float SomeFloatingPointExpression)) [see P218 of CLtL.] The fact that the result variables are in a list has no non-syntactic significance. A Common Lisp implementation can support multiple values any way the implementor pleases (a list or array of results *might* be built, or it might *not*) but no structure is visible to the programmer. (For example, if you just write (decode-float SomeFloatingPointExpression) you will not get a list of results, just the first one.) If the compiler knows how to generate inline code for this particular operation, it is quite at liberty to do so. Prolog does not distinguish in any way between "inputs" and "outputs". If a version of Prolog had this particular operation, it would be decode_float(ANumber, Significand, Exponent, Sign) and you could use this to test for zero the hard way: decode_float(ANumber, 0.0, _, _) All functions in MESA *look* as though they receive a single argument (a record) and return a single result (another record). It's a long time since I read a MESA manual, so I've probably got this wrong, but the function in question would look like function decode_float[x: real] returns [significand: real, exponent: integer, sign: real]; begin ... end; ... [m, e, s] := decode_float[27.3]; Although the syntax of a record is used, this has no more significance than the list of variables in Common Lisp. The implementation *might* construct a record, or it might not. (VAR parameters might be used instead.) I don't understand why ADA didn't pick this up from MESA. The fact that procedures and record constructors both support keyword association is automatic in MESA, but requires separate statement in ADA. And MESA multiple results also support keyword association.
mch@ukc.ac.uk (Martin Howe) (12/02/88)
In article <1039@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >In article <7740@boring.cwi.nl>, dik@cwi.nl (Dik T. Winter) writes: >> In article <21390@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes: >< < In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >> mandate operations that return two results. So no machine I know does >> return two results; even with sine and cosine, which is implemented on >> many (as an extension to IEEE?). What exactly, then is the 80387 FSINCOS instruction, which takes ST(0) and returns it's sine and cosine in ST(0) and ST(1), ready for division (tangent) or whatever else. In what way does it not return two results ? For that matter how are the flags set if both results return the same exception ? Id be interested to know, if anyone will post or mail ... - Martin -- Martin C. Howe (mch@ukc.ac.uk) | "See them coming, Atlantis will rise, LAN/MICRO Group, Computing Lab. | it's absolute lunacy, they seek to despise!" The University, CANTERBURY, UK. | Tel. (+44 227) 764000 x 7592 | - Jon Cyriss
cik@l.cc.purdue.edu (Herman Rubin) (12/02/88)
In article <787@quintus.UUCP>, ok@quintus.uucp (Richard A. O'Keefe) writes: > In article <21440@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes: > >In article <1034@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: ............................. > Wrong. The important thing is that the remainder is > remainder = a - b*INT(a/b) > I am sure that the IEEE floating-point committee would be interested to > learn that it is not a "natural divide-type operation"; this is > precisely the IEEE drem(a,b) function. Quoting the SunOS manual: > drem(x, y) returns the remainder r := x - n*y where n is the > integer nearest the exact value of x/y; moreover if |n-x/y|=1/2 > then n is even. Consequently the remainder is computed exactly > and |r| <= |y|/2. ... drem(x, 0) returns a NaN. > > This is obviously a range reduction operator. Oddly enough, on most > present-day machines, there is a good excuse for _not_ returning the > quotient (n) as well: with 64-bit floats and 32-bit integers there > is no reason to expect n to be representable as a machine "integer". > Both results would have to be floats. And in its use as a range > reduction operation, you normally aren't interested in n. I disagree. In the varied situations I have wanted this, one of the following happens. I know the result is small, so that sometimes even 8 bits is enough. I only care about the last few bits. I want an overflow trap if it is too large. BTW, I think there should be alternatives about the range of the remainder. There are situations in which I want the remainder forced positive. This is much cheaper in hardware than in software. Thus we have several instructions wanted, or one can look at it as one instruc- tion with a "tag field." Either way, it should be done. And the integer is wanted in many situations. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
einari@rhi.hi.is (Einar Indridason) (12/03/88)
In article <606@poseidon.ATT.COM> ech@poseidon.ATT.COM (Edward C Horvath) writes: > >Hmm, add the MC68K, the PDP-11, and the IBM s/360 et fils. Put another way, >does anyone have an example of a common processor that DOESN'T give you the >remainder and quotient at the same time? I don't know the Intel chips, so >perhaps the original author just knows that the *86 divide doesn't do this. Are we talking about the old 8-bit processors as well, or are we just talking about the "new" processors. (Define it for your self :-) If we are talking about those 8-bitters then the MOSTEK-6502 does not contain dividing or multiplication instructions. And if memory serves me right, neither did the Z-80. Now, one question: does the ARM chip in Acorn Archimedes include multiply and/or divide instructions. To quote Alfred E. Neuman: "What! Me worry????" Internet: einari@rhi.hi.is UUCP: ..!mcvax!hafro!rhi!einari -- To quote Alfred E. Neuman: "What! Me worry????" Internet: einari@rhi.hi.is UUCP: ..!mcvax!hafro!rhi!einari
consult@osiris.UUCP (Unix Consultation Mailbox ) (12/03/88)
In article <1037@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >[lots of stuff in reply to my reply to his reply to Doug Gwyn's reply to >his original posting] I'd like to thank Herman publicly for taking the time to gently correct my misreading of his postings. He's certainly asking for some unusual programming constructs, and I'm more and more convinced that he's got a good case. I'm also more and more convinced that he's describing a brand new HLL which would make mathematical programming a horse of a very different color for folks like him. (The usual comments about the inherent unsuitability of digital computers for complicated mathematical computation rear their ugly head but I'm going to ignore them for the nonce.) Maybe Herman should get together with an experience language designer or two and try to come up with something. I for one would be interested in hearing how it turned out. I guess my current job (providing tools to support the development of reliable, regular information systems) has really blinded me to the differing needs of other computer users.. I'll need to consider other angles the next time I put my foot into one of these discussions... Phil Kos Information Systems ...!uunet!pyrdc!osiris!phil The Johns Hopkins Hospital Baltimore, MD
dave@lethe.UUCP (David Collier-Brown) (12/03/88)
In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: | Another effect of the HLL tyranny is that the operations which are beyond | the ken of th HLL designers are disappearing from the machines. From article <8938@winchester.mips.COM>, by mash@mips.COM (John Mashey): | Although I don't necessarily subscribe to Herman's opinions, R2000 divides | actually do this (leave both results in registers). Although I shouldn't | have been, I was surprised to find that the following C code, compiled | unoptimized (optimized, it essentially disappears, of course): | main() { | register i,j,k; | i = j / 7; | k = j % 7; | } | generates one divide instruction to get both of the results. Actually its isn't too surprising: one of the primitives I want at code-generator-generation time is an n-tuple consisting of: instruction name input register(s) and content constraints (aka types) output register(s) ditto pattern to generate instruction size time (or other figure of merit) If I have these, I claim I can write a code generator that will do a "good" job [see note below]. If I preserve these, however, I also get the ability to do usefull, if simple, improvements in the code generated to do setup/teardown before and after the operation. If I plan on doing so, I get the ability to take a new instruction, specified by the programmer, and provide the same kind of results. The simple case is old hat [press n if you've seen this one before] Define a system call in Ada by defining the representation of a trap word, provide the types of input and output by declaring it in functional/procedural form and the register constraints by pragmas. Then look at the code generated (about 3 years ago on a small honeybun) and note that no redundant register-register moves were generated on either call or return. --dave (this also answers the question of rarely-used instructions being skimped on) c-b note: No, I **can't** write you a code-generator-generator. I'm not a good enough academic. But I've met people who have... I can only write code-generators themselves.
seanf@sco.COM (Sean Fagan) (12/04/88)
In article <1039@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >Can division be pipelined? In scalar mode on the CYBER 205, division is not >subject for pipelining. I believe that this is the case because too much of >the dividend must be retained throughout. I don't know how well it can do it theoretically, but I do know that, on a Cyber 170/760, multiplication is pipelined so that it can be doing a maximum of three mults (first step, intermediate steps, last step). Note that the bottleneck (on a five-cycle instruction, wow) is in the intermediate steps. This is why you try to multiply using registers that are otherwise unused at the time 8-). Division, on the other hand, is *not* pipelined. You can only do one division at a time (why he didn't put in more than one division unit is something I still don't understand), and it takes a *long* time (70 or more cycles). -- Sean Eric Fagan | "Engineering without management is *ART*" seanf@sco.UUCP | Jeff Johnson (jeffj@sco) (408) 458-1422 | Any opinions expressed are my own, not my employers'.
cik@l.cc.purdue.edu (Herman Rubin) (12/04/88)
In article <6089@eagle.ukc.ac.uk>, mch@ukc.ac.uk (Martin Howe) writes: > In article <1039@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >In article <7740@boring.cwi.nl>, dik@cwi.nl (Dik T. Winter) writes: > >> In article <21390@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes: > >< < In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > >> mandate operations that return two results. So no machine I know does > >> return two results; even with sine and cosine, which is implemented on > >> many (as an extension to IEEE?). > > What exactly, then is the 80387 FSINCOS instruction, which takes ST(0) and > returns it's sine and cosine in ST(0) and ST(1), ready for division (tangent) or > whatever else. In what way does it not return two results ? For that matter > how are the flags set if both results return the same exception ? I did not know of the 80387 instruction. But how will programmers use it? The languages still do not allow x, y = fsincos(z); This being comp.arch, I must also state that I consider it unfortunate that this type of operation requires that specific registers must be used; in other situations, this is certainly considered a bad thing. Now I do not always agree with the gurus, but this is one case where I do. Unless the hardware actually uses the registers for the computation, such as the accumulator and mq registers in the early von Neumann machines, the registers for a hardware operation should be flexible. -- Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907 Phone: (317)494-6054 hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)
csc21824@unsvax.UUCP (Jay) (12/05/88)
In article <622@krafla.rhi.hi.is> einari@krafla.UUCP (Einar Indridason) writes: >In article <606@poseidon.ATT.COM> ech@poseidon.ATT.COM (Edward C Horvath) writes: >>does anyone have an example of a common processor that DOESN'T give you the >>remainder and quotient at the same time? I don't know the Intel chips, so >>perhaps the original author just knows that the *86 divide doesn't do this. The 8086 divide doesn't? Then I'd like to know why my integer-to string conversion routine works? It repeatedly divides the number by 10, saving the remainder --------------------------------------------------------------------- This space for Rent Eric J. Schwertfeger CIS [72657,1166] or csc21824%unsvax.uns.edu Disclaimer:These are just the mad ramblings of a man forced to use vi once too often, and as such, should be ignored.
baum@Apple.COM (Allen J. Baum) (12/06/88)
[] >In article <1842@scolex> seanf@sco.COM (Sean Fagan) writes: >In article <1039@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: >> >>Can division be pipelined? In scalar mode on the CYBER 205, division is not >>subject for pipelining. I believe that this is the case because too much of >>the dividend must be retained throughout. Division can be pipelined, allthough I suspect that the reason it isn't is that you may as well just duplicate the divider (I'm talking integer division here). There is at least one example of a systolic array integer divider: see "Integer Division in Linear Time with Bounded Fan-in by C. Purdy and G. Purdy, IEEE Trans. on Computers, V. C-36#5, May 1987, pp640-644. Finally, there were some claims of a Newton iteration divide step operation on the RPM-40. Those folks are still on the net, but I guess decided they weren't allowed to talk about it, so no details emerged. -- baum@apple.com (408)974-3385 {decwrl,hplabs}!amdahl!apple!baum
guy@auspex.UUCP (Guy Harris) (12/06/88)
>I did not know of the 80387 instruction. But how will programmers use it? >The languages still do not allow > > x, y = fsincos(z); Yes, but they allow X = SIN(Z) Y = COS(Z) and I think there are FORTRAN compilers (hence the CAPITAL LETTERS :-)) that will make a single call to the "sin/cos" function, or generate a single "sin/cos" instruction, and store the two results into X and Y. This could conceivably be done with C as well; consider: math.h: #define sin(x) _builtin_sin(x) #define cos(x) _builtin_cos(x) and a compiler that knows about "_builtin_sin" and "_builtin_cos". There's no particular need, in general, for programming language constructs to directly match hardware constructs in order for the hardware constructs to be used....
jeffa@hpmwtd.HP.COM (12/07/88)
Mathematicians have been using cis(x) = cos(x) + i sin(x) for ages. Ergo, fsincos can be accessed as a complex function. (Or rotation matrix, or ...)
peter@stca77.stc.oz (Peter Jeremy) (12/07/88)
In article <1047@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: [ 80387 FSINCOS instruction returns results in ST(0) and ST(1) ] >This being comp.arch, I must also state that I consider it unfortunate that >this type of operation requires that specific registers must be used; in other >situations, this is certainly considered a bad thing. This instruction is typical of Intel processors - they all have highly non- orthogonal instruction sets, with specialised instructions containing implicit register references. Intel promote this as "a good thing" and have register names to indicate their uses. Intel processors in general are a real pain to use. Even when you go to the 80386 and finally escape from the 64k limit, you still can't nest loops, or use loops with shifts or string instructions without shuffling registers. (Not to mention having to remember all the side effects whilst writing code). -- Peter Jeremy (VK2PJ) peter@stca77.stc.oz Alcatel-STC Australia ...!uunet!stca77.stc.oz!peter 41 Mandible St peter%stca77.stc.oz@uunet.UU.NET ALEXANDRIA NSW 2015
rmarks@KSP.Unisys.COM (Richard Marks) (12/09/88)
In article <670001@hpmwjaa.HP.COM> jeffa@hpmwtd.HP.COM writes: >Mathematicians have been using cis(x) = cos(x) + i sin(x) for ages. Ergo, >fsincos can be accessed as a complex function. or Lisp can return an arbitrary number of multiple values.
aarons@cvaxa.sussex.ac.uk (Aaron Sloman) (12/18/88)
> From: ok@quintus.uucp (Richard A. O'Keefe) > Message-ID: <764@quintus.UUCP> > Date: 29 Nov 88 11:24:15 GMT > In article <1032@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes: > > A trivial example is having a list of results to the left of > > the replacement operator. I do not mean a vector or a struct; the items > > may be of different types, and should not be stored in adjacent memory > > locations. > If you mean things like being able to write > x, i, s := x/(1.0,+x), i-2, s || "x" || s > CPL had it, BCPL has it (in sequential form), MESA has it (you have to > put record constructor functions around both sides, but that's merely > notational), and it has appeared in several experimental languages. > Common Lisp supports it under the name PSETQ: > (psetq x (/ x (+ 1.0 x)) > i (- i 2) > s (however-you-concatenate-strings s "x" s)) > I have never understood why it was omitted from Pascal, Modula, ADA: > x, y := y, x > is the clearest way to exchange two variables I know. I once put > this construct into the compiler for an Algol-like language; it was > easy, and if the compiler had done any optimisation it would still > have been easy. ----------------------- I think Algol68 also has it ("collateral assignment"??). I have not read the discussion that preceded all this, but just for information, if you want to do multiple assignments in Pop-11 (where assignments go left to right) you can do things like [a list], 'a string', "word", 99 -> x1 -> x2 -> x3 -> x4; Because Pop-11 uses a stack this is equivalent to 99 -> x1; "word" -> x2; 'a string' -> x3; [a list] -> x4; etc. Swapping the values of two variables in Pop-11 is done thus: x, y -> x -> y I.e. you don't need the usual "temporary" variable - just stack the value of x, stack the value of y, then pop the top of the stack into x, then pop into y. I assume Forth can do this too. It would be trivial in Pop-11 to write a macro that did a multiple assignment in the "natural" order, i.e. not in the above stack-oriented last-in first-out order. Not quite so trivial, but also possible (and more efficient), is defining a new "syntax word" to do it. The difference is that a macro reads in text then rearranges the compiler input stream, whereas a syntax word reads in some of the input stream and then plants Poplog virtual machine instructions to be compiled. Also, defining syntax words helps the parser give more helpful error messages if you make syntactic mistakes. E.g. to define ">->" as a new Pop-11 syntax operator to do multiple assignment, with a precedence of 11, i.e. equal to that of the built-in assignment symbol "->", you could do something like the following: define syntax 11 >-> ; ;;; Define local recursive sub-procedure to read in comma-separated ;;; variables up to semi-colon and plant assignments to them ;;; in reverse order (on exit from recursion). define plant_assignments(); lvars varname; if nextitem() /== ";" then ;;; Read the next item in input stream itemread() -> varname; ;;; Do all the rest up to the semi-colon plant_assignments(); unless varname == "," then ;;; plant VM code to assign top of stack to variable sysPOP(varname); endunless; endif; enddefine; ;;; Now the main body of the procedure for ">->" ;;; First some mumbo jumbo to plant code for last Pop-11 operation pop_expr_inst(pop_expr_item); pop11_FLUSHED -> pop_expr_inst; ;;; Now read in variable names and plant assignments plant_assignments(); enddefine; We can test the above definition as follows ;;; Declare some variables vars a, b, c; ;;; Try a multiple assignment to a, b, c. 11, 22, 33 >-> a, b, c; ;;; Make a list containing values of a, b, and c, and check that ;;; the order is right when it is printed out [^a ^b ^c] => ** [11 22 33] Ideally the definition of ">->" would include extra instructions to do more syntax checking (e.g. complaining if a syntax word is read in before ";", or if a comma is missing). Aaron Sloman, School of Cognitive and Computing Sciences, Univ of Sussex, Brighton, BN1 9QN, England ARPANET : aarons%uk.ac.sussex.cvaxa@nss.cs.ucl.ac.uk JANET aarons@cvaxa.sussex.ac.uk BITNET: aarons%uk.ac.sussex.cvaxa@uk.ac UUCP: ...mcvax!ukc!cvaxa!aarons or aarons@cvaxa.uucp