sjc@key.COM (Steve Correll) (07/18/90)
In article <138349@sun.Eng.Sun.COM>, lm@snafu.Sun.COM (Larry McVoy) writes:
> I have before me "Programmers Guide to Fortran 90" which contains things,
> such as pointers, that make me wonder if this is going to make fortran an
> uninteresting language.

In article <2306@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> I understand that Fortran 90 is eliminating or restricting some of the
> useful, but not much used, features of Fortran considered bad by language
> gurus.  The use of an ASSIGNED goto, especially without list, can make
> life difficult for the automatic optimizer, but it can sure speed up a
> program.

Be comforted.  Fortran 90 does not eliminate or restrict any of the control
constructs of Fortran 77.  The advocates of modernization bowed to a
compromise in which those constructs are merely deprecated and printed in
small type.

The POINTER attribute is a more bizarre compromise which requires the
implementor to manipulate descriptors rather than mere addresses, but which
also requires the programmer to declare all target objects with a TARGET
attribute so as to limit the damage.  If you never use the TARGET attribute,
the challenge confronting the optimizer is no worse than that of Fortran 77.

Regarding the merits of assigned GOTO, consider an architecture where the
address of an instruction cannot in fact be stored in an INTEGER variable
(because, for example, an address has 64 bits whereas an integer has 32).
-- 
...{sun,pyramid}!pacbell!key!sjc   Steve Correll
cik@l.cc.purdue.edu (Herman Rubin) (07/19/90)
In article <1991@key.COM>, sjc@key.COM (Steve Correll) writes:
> In article <138349@sun.Eng.Sun.COM>, lm@snafu.Sun.COM (Larry McVoy) writes:
> > I have before me "Programmers Guide to Fortran 90" which contains
> > things, such as pointers, that make me wonder if this is going to make
> > fortran an uninteresting language.
>
> In article <2306@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
........................
> Regarding the merits of assigned GOTO, consider an architecture where the
> address of an instruction cannot in fact be stored in an INTEGER variable
> (because, for example, an address has 64 bits whereas an integer has 32).

Either put in longer integers, or use an integer array, or something
equivalent.  If you cannot do this, consider the implementation inadequate.
The purpose of a language should be to enable programmers to get things
done, not to disable them.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317) 494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!cik (UUCP)
peter@ficc.ferranti.com (Peter da Silva) (07/20/90)
> Regarding the merits of assigned GOTO, consider an architecture where the
> address of an instruction cannot in fact be stored in an INTEGER variable
> (because, for example, an address has 64 bits whereas an integer has 32).

Given the relatively small number of addresses involved, using a lookup
table and a small integer token for each numbered line is an easy solution
to this problem.
-- 
Peter da Silva.   `-_-'   +1 713 274 5180.   'U`   <peter@ficc.ferranti.com>
preston@titan.rice.edu (Preston Briggs) (07/21/90)
In article <ARS4N0F@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>> Regarding the merits of assigned GOTO, consider an architecture where the
>> address of an instruction cannot in fact be stored in an INTEGER variable
>> (because, for example, an address has 64 bits whereas an integer has 32).
>
>Given the relatively small number of addresses involved, using a lookup
>table and a small integer token for each numbered line is an easy solution
>to this problem.

Sure.  Another alternative would be to use relative offsets from the
beginning of the routine (not from the ASSIGN point!).  32 bits should
cover most routines :-)

I think the real problem with label variables is the tangle they make
of the flow graph.  The poor ol' optimizer sees a "GOTO lvar" and
says "Well, I guess he can branch to any label?"  We can do better by
noticing which labels participate in ASSIGN statements, but it's
still a mess.  Nice programmers (who'd like the optimizer to be
nice to them) will provide a list of alternatives on their assigned gotos.
-- 
Preston Briggs                           looking for the great leap forward
preston@titan.rice.edu
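Peter's token-plus-table scheme can be sketched in C: the "assigned"
variable holds only a small integer token, never an instruction address,
and a dispatch step maps tokens to code, so the 64-bit-address problem
disappears.  The labels L10/L20/L30 and their bodies are invented for
illustration:

```c
/* Small integer tokens standing in for Fortran statement labels.
   Only labels that appear in an ASSIGN statement need a token. */
enum { L10, L20, L30 };

/* "GOTO lvar" becomes a table lookup / switch on the token, so the
   integer variable never has to hold an instruction address. */
int dispatch(int lvar, int x) {
    switch (lvar) {
    case L10: return x + 1;   /* code at label 10 (invented) */
    case L20: return x * 2;   /* code at label 20 (invented) */
    case L30: return -x;      /* code at label 30 (invented) */
    default:  return 0;       /* lvar never ASSIGNed: undefined use */
    }
}
```

The switch also gives the optimizer exactly the "list of alternatives"
Preston asks for, since every reachable label is enumerated at the
dispatch point.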
hirchert@ux1.cso.uiuc.edu (Kurt Hirchert) (07/22/90)
In article <10070@brazos.Rice.edu> preston@titan.rice.edu (Preston Briggs) writes:
>In article <ARS4N0F@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
>>> Regarding the merits of assigned GOTO, consider an architecture where the
>>> address of an instruction cannot in fact be stored in an INTEGER variable
>>> (because, for example, an address has 64 bits whereas an integer has 32).
>>
>>Given the relatively small number of addresses involved, using a lookup
>>table and a small integer token for each numbered line is an easy solution
>>to this problem.
>
>Sure.  Another alternative would be to use relative offsets from the
>beginning of the routine (not from the ASSIGN point!).  32 bits should
>cover most routines :-)
>
>I think the real problem with label variables is the tangle they make
>of the flow graph.  The poor ol' optimizer sees a "GOTO lvar" and
>says "Well, I guess he can branch to any label?"  We can do better by
>noticing which labels participate in ASSIGN statements, but it's
>still a mess.  Nice programmers (who'd like the optimizer to be
>nice to them) will provide a list of alternatives on their assigned gotos.

If one is only concerned about handling standard-conforming programs, then
you can note that a variable that has been set by an ASSIGN statement is
undefined for normal use.  This means

o  In the 64 bit/32 bit problem, simply have _two_ variables with the
   given name: a 32 bit variable for normal integer use and a 64 bit
   variable for use in the ASSIGN and assigned GOTO statements.

o  If an optimizer sees "GOTO lvar", it can assume that the label branched
   to is one set by "ASSIGN label TO lvar" in the same program unit, where
   we are talking about the _same_ lvar.  In most cases, this is a pretty
   good match with the list the programmer would have provided.
-- 
Kurt W. Hirchert     hirchert@ncsa.uiuc.edu
National Center for Supercomputing Applications
cik@l.cc.purdue.edu (Herman Rubin) (07/22/90)
In article <10070@brazos.Rice.edu>, preston@titan.rice.edu (Preston Briggs) writes:
> In article <ARS4N0F@xds13.ferranti.com> peter@ficc.ferranti.com (Peter da Silva) writes:
> >Given the relatively small number of addresses involved, using a lookup
> >table and a small integer token for each numbered line is an easy solution
> >to this problem.

It is not just a matter of space but of time.  What is really wanted is
a *label to go with the GOTO.  Whether this is stored in an integer or
not is of little concern for efficient programming.  Even a small lookup
table instead of using a register variable costs.  This should have been
an automatic efficiency consideration for at least the past 25 years.

> Sure.  Another alternative would be to use relative offsets from the
> beginning of the routine (not from the ASSIGN point!).  32 bits should
> cover most routines :-)

If a machine handles offsets, fine.  Do what works best on the machine.
But if the language does not allow this, there should be some way of
overriding that aspect of the language.

> I think the real problem with label variables is the tangle they make
> of the flow graph.  The poor ol' optimizer sees a "GOTO lvar" and
> says "Well, I guess he can branch to any label?"  We can do better by
> noticing which labels participate in ASSIGN statements, but it's
> still a mess.  Nice programmers (who'd like the optimizer to be
> nice to them) will provide a list of alternatives on their assigned gotos.

Fine, as long as only the optimizer uses it.

BTW, besides the goto, there should be a branchto, where it is the
responsibility of the remote code to see that the relevant aspects of
the current configuration are preserved or restored.  This is, at the
present time, difficult to do automatically if there are space problems,
such as an inadequate number of registers.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317) 494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!cik (UUCP)
bengtl@maths.lth.se (Bengt Larsson) (07/28/90)
(sorry for commenting this late, we seem to have some problem with the
news system here)

In article <2384@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:

(about ASSIGNed GOTOs)

>> >Given the relatively small number of addresses involved, using a lookup
>> >table and a small integer token for each numbered line is an easy solution
>> >to this problem.
>
>It is not just a matter of space but of time.  What is really wanted is
>a *label to go with the GOTO.  Whether this is stored in an integer or
>not is of little concern for efficient programming.  Even a small lookup
>table instead of using a register variable costs.  This should have been
>an automatic efficiency consideration for at least the past 25 years.

Are you _really_ concerned about _one_ extra memory access?  (and one
likely to be in the cache at that?)  What sort of code do you write?

(And it would be more polite to put it like "What _I_ really want..."
instead of boldly stating "What is really wanted" all the time.  Your
wishes and desires may not be the whole world's.)

>Fine, as long as only the optimizer uses it.
>
>BTW, besides the goto, there should be a branchto, where it is the
>responsibility of the remote code to see that the relevant aspects of
>the current configuration are preserved or restored.  This is, at the
>present time, difficult to do automatically if there are space problems,
>such as an inadequate number of registers.

What do you mean by this?  "where it is the responsibility of the remote
code to see that the relevant aspects of the current configuration are
preserved or restored"?  This sounds like a subroutine call (with
callee-save)?
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se             SUNET: TYCHE::BENGT_L
cik@l.cc.purdue.edu (Herman Rubin) (07/28/90)
In article <1990Jul28.090036.374@lth.se>, bengtl@maths.lth.se (Bengt Larsson) writes:
> (sorry for commenting this late, we seem to have some problem with the
> news system here)
>
> In article <2384@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>
> (about ASSIGNed GOTOs)

[Discussion was about ASSIGNed GOTOs without list.]

> Are you _really_ concerned about _one_ extra memory access?  (and one
> likely to be in the cache at that?)  What sort of code do you write?

An ASSIGNed GOTO is unlikely to have only one memory access.  It may even
be that not all the GOTO locations are known at compile time, and even
that some of the locations may be in another program block.  Of course,
the compiler of that block needs the information that this might happen.

> (And it would be more polite to put it like "What _I_ really want..."
> instead of boldly stating "What is really wanted" all the time.  Your
> wishes and desires may not be the whole world's.)

Criticism accepted.

> >BTW, besides the goto, there should be a branchto, where it is the
> >responsibility of the remote code to see that the relevant aspects of
> >the current configuration are preserved or restored.  This is, at the
> >present time, difficult to do automatically if there are space problems,
> >such as an inadequate number of registers.
>
> What do you mean by this?  "where it is the responsibility of the remote
> code to see that the relevant aspects of the current configuration are
> preserved or restored"?  This sounds like a subroutine call (with
> callee-save)?

Sometimes this will do, but not always.  For one thing, the code being
branched to is local to the subprogram, and is not going to be called
from anywhere else, so there is no need for a call-return.  The code
will often be short.  Also, the code does not necessarily return to the
location from which it is called.  A common situation is the following:

comp:  some code;
       texp -= expr;
       if (texp < 0) branchto bad;
       ....

bad:   more code;
       unbranchto comp;

A global optimizer might handle this, but the point which has to be made
is that the branch target optimization is subordinate to the main
optimization.  The "structured" way of doing this is likely to insert the
branch block into the main code, and unless fall-through is no faster
than a taken branch and the instruction cache is large, this is likely
to lead to slower code.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317) 494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!cik (UUCP)
bengtl@maths.lth.se (Bengt Larsson) (07/31/90)
In article <2413@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> [Discussion was about ASSIGNed GOTOs without list.]
>
>> Are you _really_ concerned about _one_ extra memory access?  (and one
>> likely to be in the cache at that?)  What sort of code do you write?
>
>An ASSIGNed GOTO is unlikely to have only one memory access.

Why (if you have a fixed number of cases)?  In Vax assembler it would be
"jmp @jumptable[r0]".  On a RISC, it would be a load followed by a branch.
(and the VAX has it in a special instruction, CASE, which also takes care
of arrays not starting at 1 and checks boundary conditions).

>It may even be
>that not all the GOTO locations are known at compile time, and even that
>some of the locations may be in another program block.

So you essentially want to be able to jump to any label, from any ASSIGNed
GOTO?  Are you aware that this will screw up many optimizations which
otherwise could be done across labels?  (at least I think it will.  I'm no
expert on compiler writing, please correct me if I'm wrong).

>Of course, the
>compiler of that block needs the information that this might happen.

I think it will, yes.  And I would believe that it will suffer from it.

Another idea is that C, for example, allows pointers to functions.  Maybe
you could make use of that?  And _still_ another thing is that if you want
local subroutines ("which may not be called from outside"), that's exactly
what you get with nested scopes, as mentioned in another article (a local
procedure has access to all the surrounding procedures' local variables).

>Sometimes this will do, but not always.  For one thing, the code being
>branched to is local to the subprogram, and is not going to be called
>from anywhere else, so there is no need for a call-return.  The code
>will often be short.  Also, the code does not necessarily return to the
>location from which it is called.  A common situation is the following:
>
>comp:  some code;
>       texp -= expr;
>       if (texp < 0) branchto bad;
>       ....
>
>bad:   more code;
>       unbranchto comp;

Which, in a slightly more structured way, would be

comp:  some code;
       texp -= expr;
       if (texp >= 0) {
           more code;
           goto comp;
       }

wouldn't it?  Are you worried that "more code" will mostly not be
executed, and therefore you move it, reverse the test, branch and branch
back?  Is the execution time of your code that critical?

And what in the world is an "unbranchto"?  What sets it apart from a
"branchto" (+ an optimizer)?

>A global optimizer might handle this, but the point which has to be made is

"the point which _I_ want to make is..."

>that the branch target optimization is subordinate to the main optimization.
>The "structured" way of doing this is likely to insert the branch block into
>the main code, and unless fall-through is no faster than a taken branch and
>the instruction cache is large, this is likely to lead to slower code.

It may be slower.  But: do you value your own time?  How many hours does
it take you to implement an algorithm if you micro-manage it like that?
Is it going to be run ten billion times?  Is it going to be sold to ten
thousand customers?  Is it really worth it?

If you can achieve 80% of the speed in 20% of the programming time,
isn't it worth it?  Do you actually _get_ _anything_ _done_ when you
program in such detail?  How many programs of more than 100 lines have
you completed?  Of more than a thousand lines?  Did they port to faster
machines?

That's a lot of questions, but I think they have to be asked.  It may be
that you are spending lots of time just to get it "perfect".  Unless your
code is critical, it may (maybe) not be worth the effort.

---------------------------------------------------------------------------
Disclaimer: I'm NOT a compiler writer, if I have totally screwed up
somewhere, put it to me gently... :-)
---------------------------------------------------------------------------
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se             SUNET: TYCHE::BENGT_L
bengtl@maths.lth.se (Bengt Larsson) (07/31/90)
In article <1990Jul30.183426.26506@lth.se> I write:

>>comp:  some code;
>>       texp -= expr;
>>       if (texp < 0) branchto bad;
>>       ....
>>
>>bad:   more code;
>>       unbranchto comp;
>
>Which, in a slightly more structured way, would be
>
>comp:  some code;
>       texp -= expr;
>       if (texp >= 0) {
>           more code;
>           goto comp;
>       }
>
>wouldn't it?

No it wouldn't.  Silly me.  The test shouldn't be reversed.  Sorry (I
couldn't cancel the original article).
-- 
Bengt Larsson - Dep. of Math. Statistics, Lund University, Sweden
Internet: bengtl@maths.lth.se             SUNET: TYCHE::BENGT_L
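The two layouts being argued about — the cold block written inline in the
loop versus branched out of line and back — can both be expressed in
standard C.  A sketch, with a buffer counter standing in for the thread's
"some code"/"more code" (the cost and refill amounts are invented); both
forms compute exactly the same thing:

```c
/* texp and a refill counter from the thread's fragment, as globals
   so both versions share them. */
int buffer  = 0;   /* the thread's texp */
int refills = 0;

/* Structured form: the rare refill block sits inline in the loop. */
int draw_inline(int cost) {
    for (;;) {
        buffer -= cost;          /* texp -= expr */
        if (buffer < 0) {        /* rare case: buffer exhausted */
            buffer += 100;       /* "more code": refill (invented value) */
            refills++;
            continue;            /* unbranchto comp */
        }
        return buffer;           /* "...." continues on the hot path */
    }
}

/* Out-of-line form: the cold block is moved past the hot exit, as the
   branchto/unbranchto pair in the thread would arrange it. */
int draw_outofline(int cost) {
comp:
    buffer -= cost;
    if (buffer < 0) goto bad;    /* branchto bad */
    return buffer;               /* hot path falls straight through */
bad:
    buffer += 100;               /* "more code" */
    refills++;
    goto comp;                   /* unbranchto comp */
}
```

The difference Herman cares about is purely one of code layout: in the
second form the hot path contains only a not-taken conditional branch.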
cik@l.cc.purdue.edu (Herman Rubin) (07/31/90)
In article <1990Jul30.183426.26506@lth.se>, bengtl@maths.lth.se (Bengt Larsson) writes:
> In article <2413@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
.....................
> >comp:  some code;
> >       texp -= expr;
> >       if (texp < 0) branchto bad;
> >       ....
> >
> >bad:   more code;
> >       unbranchto comp;
>
> Which, in a slightly more structured way, would be
>
> comp:  some code;
>        texp -= expr;
>        if (texp < 0) {

This was corrected by Mr. Larsson in a subsequent post.

>            more code;
>            goto comp;
>        }
>
> wouldn't it?
>
> Are you worried that "more code" will mostly not be executed, and
> therefore you move it, reverse the test, branch and branch back?

This is definitely the case.

> Is the execution time of your code that critical?

This block of code is likely to be executed at least tens of thousands of
times, and quite likely tens of millions.  It is code for refilling a
buffer to be used in Monte Carlo calculations.

> And what in the world is an "unbranchto"?  What sets it apart from a
> "branchto" (+ an optimizer)?

Not much.  It is an instruction to the compiler that that particular form
of "optimization" is to be carried out, whether or not the compiler
recognizes it.  A possible HLL way to do this would be

comp:  some code;
       texp -= expr;
       if (texp < 0) remote {
           more code;
           goto comp;
       }

.........................

> It may be slower.  But: do you value your own time?  How many hours does
> it take you to implement an algorithm if you micro-manage it like that?
> Is it going to be run ten billion times?  Is it going to be sold to ten
> thousand customers?  Is it really worth it?

This part of the code ten million times, anyhow.

> If you can achieve 80% of the speed in 20% of the programming time,
> isn't it worth it?

More likely 30% of the speed in 120% of the programming time.  The HLL
syntax is that stilted compared to the machine design.  I suspect that if
people were taught the structure of the hardware, they would do these
things without effort.  At least some of them would, which is much better
than the present situation.  You are denying the artisan the tools.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317) 494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!cik (UUCP)
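Herman's proposed "remote { ... }" block is close to what some later
compilers expose as a branch-probability hint.  A sketch assuming GCC or
Clang, whose `__builtin_expect` extension tells the compiler which way a
branch usually goes so it can lay the cold block out of the hot path (the
refill constant and function name are invented for illustration):

```c
/* GCC/Clang extension: hint that the condition is usually false. */
#define unlikely(x) __builtin_expect(!!(x), 0)

/* One draw from the budget; the rarely-taken refill path is marked
   unlikely, so the compiler may move it out of line -- roughly the
   effect the thread's "remote" block asks for. */
int consume(int *texp, int expr) {
    *texp -= expr;
    if (unlikely(*texp < 0)) {
        *texp += 100;   /* cold path: hypothetical refill */
        return 1;       /* report that a refill happened */
    }
    return 0;           /* hot path falls through */
}
```

Unlike "remote", the hint changes only code layout and prediction, never
semantics, so the compiler is free to ignore it.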
ted@nmsu.edu (Ted Dunning) (08/01/90)
In article <2419@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:

   In article <1990Jul30.183426.26506@lth.se>, bengtl@maths.lth.se (Bengt Larsson) writes:
   > Are you worried that "more code" will mostly not be executed, and
   > therefore you move it, reverse the test, branch and branch back?

   This is definitely the case.

   > Is the execution time of your code that critical?

   This block of code is likely to be executed at least tens of thousands
   of times, and quite likely tens of millions.  It is code for refilling
   a buffer to be used in Monte Carlo calculations.

well, how much is it going to cost?  on a highly pipelined, heavily cached
RISC, both alternatives will be in the cache (and are likely to be in the
same page), so the only cost differential will be the cost of the jump
taken vs. not taken.  on many riscs, this will be a penalty of 1-2 cycles
because of bubbles in the pipeline, or because much of the pipeline will
have to be invalidated.

on the mips, the profiler and optimizer would notice that the branch is
unlikely to be taken, and it would rearrange the code for optimality.  on
the rs/6000, the jump will probably be executed early in the pipeline and
there will be no penalty.  in fact, the branch will appear to take 0
cycles.

just for handy reference, let us take one cycle as 50 ns.  this is a
relatively slow machine these days.  so how much will the penalty be?

   This part of the code ten million times, anyhow.

hmmm.... ten million times one cycle comes out to .... 500 milliseconds.
so, mister rubin is SOOOO worried about his monte carlo program that he
will spend all this effort to save half a second of run time on a program
that probably runs for at least minutes.

   > If you can achieve 80% of the speed in 20% of the programming time,
   > isn't it worth it?

   More likely 30% of the speed in 120% of the programming time.

and then he overestimates his own value and his own effectiveness by an
enormous factor.  i don't think that he has much of an argument when he
makes claims like this.
-- 
Offer void except where prohibited by law.
peter@ficc.ferranti.com (Peter da Silva) (08/01/90)
In article <2419@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> You are denying the artisan the tools.

Oh, garbage.  If you're writing non-portable code anyway, use assembly.
Or... have you considered Forth?
-- 
Peter da Silva.   `-_-'   +1 713 274 5180.   'U`   <peter@ficc.ferranti.com>
cik@l.cc.purdue.edu (Herman Rubin) (08/03/90)
In article <8E+4_83@ggpc2.ferranti.com>, peter@ficc.ferranti.com (Peter da Silva) writes:
> In article <2419@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
> > You are denying the artisan the tools.
>
> Oh, garbage.  If you're writing non-portable code anyway, use assembly.
> Or... have you considered Forth?

I have never objected to having semi-portable code.  Most of the code
called portable is really in this class, and making Fortran fully portable
would make it so slow as to be almost useless.  Which machine's floating
point representation are you going to use?  The several supercomputers I
know, as well as the VAXen, are different from IEEE.  In fact, their
hardware as actually implemented on several models predates the IEEE
floating-point standards.

The situation is not quite as bad for integer arithmetic, if the sizes of
the objects are the same, but it can be bad even then.  If one machine has
8 bit bytes and another 9, there are problems, and serious ones.  When
characters are read, if 8-bit characters are allowed, is there sign
extension or not?

Forth is also clumsy and inefficient.  An intelligently written assembler,
which allowed overloaded operators and weak typing, would be useful, as
would a versatile macro processor.  For those who claim they have one, the
following

	c{'z} ={tc} {-}{|}{ta}a{'x} OP{mod} {|}{tb}b{'y} {/\{~}w}

is the way I would write the general vector instruction on the CYBER 205,
with braces enclosing optional fields.

Portable constructs are one thing, but portable code is another.  Packing
and unpacking floating-point numbers are portable constructs, but I know
of no HLL which has them.  These should be operators, not function calls.

The idea of open subroutines, a cross between macros and inlining, existed
at the time of Fortran I.  Fortran VI on the CDC 3600 allowed 3 additional
types (to make a total of 8), and allowed the user to define the Fortran
operations on these types, but only by subroutine calls.  Why can we not
have additional operator SYMBOLS added to a HLL?  C and C++ allow
additional types, only they do not call them types.  I was informed by
someone at AT&T that the non-allowance of additional operator symbols in
C++ was an unfortunate oversight.

I have not run into anything which cannot be written in a semi-portable
manner.  But in choosing which of the extremely large number of algorithms
I can understand, I do take into account the machine.  Remember that all
of the present machines, and even pocket calculators if they were equipped
with a large enough memory, are capable of doing anything that any other
machine can do, although it may very well be very inefficient.  Any
language can simulate any other, although not always well.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317) 494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)   {purdue,pur-ee}!l.cc!cik (UUCP)
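As it happens, C's <math.h> does expose packing and unpacking of
floating-point numbers — though only as function calls, the very form
objected to above: frexp splits a double into a fraction and a binary
exponent, and ldexp packs them back.  A minimal sketch (the `unpacked`
struct and function names are invented for illustration):

```c
#include <math.h>

/* Two results from one unpack, carried in a struct since C has no
   multiple-value return. */
typedef struct { double frac; int exp; } unpacked;

/* x == frac * 2^exp, with 0.5 <= |frac| < 1 for nonzero x. */
unpacked unpack(double x) {
    unpacked u;
    u.frac = frexp(x, &u.exp);
    return u;
}

/* The inverse operation: pack fraction and exponent back together. */
double repack(unpacked u) {
    return ldexp(u.frac, u.exp);
}
```

Note that frexp delivers its second result through a pointer argument —
exactly the "pass an address and load" idiom Herman complains about in a
later post.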
gl8f@astsun9.astro.Virginia.EDU (Greg Lindahl) (08/03/90)
In article <2426@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>The idea of open subroutines, a cross between macros and inlining, existed
>at the time of Fortran I.

So what do open subroutines give you, performance-wise, that inlining
doesn't?  I believe several posters explained to you a few days ago that
many compiler/linkers provide inlining these days.  However, you never
seem to respond to anyone who has a simple, existing way to do what you
want.
-- 
"In fact you should not be involved in IRC." -- Phil Howard
ted@nmsu.edu (Ted Dunning) (08/04/90)
In article <2426@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:

   Which machine's floating point representation are you going to use?

whichever machine you are on.

   The several supercomputers I know, as well as the VAXen, are
   different from IEEE.  In fact, their hardware as actually
   implemented on several models predates the IEEE floating-point
   standards.

so?  that really doesn't cause much problem except insofar as accuracy
is concerned.  the actual format of the floating point numbers only
matters in exceedingly rare situations.

   c{'z} ={tc} {-}{|}{ta}a{'x} OP{mod} {|}{tb}b{'y} {/\{~}w}

   is the way I would write the general vector instruction on the
   CYBER 205.

care to justify this?  or is the fact that you would write it that way
sufficient to make everyone want to write it that way?

   Packing and unpacking floating-point numbers are portable
   constructs, but I know of no HLL which has them.

actually this isn't even true, since the sizes of and limits on the
components can vary tremendously from machine to machine.

   These should be operators, not function calls.

you don't ever seem to answer the point that you only need to do fast
what you do often.  do you really unpack as many floats as you
multiply?  why does this operation need to be an operator instead of a
function call?

is it that there is some deep psychological need to unpack things
quickly?

   The idea of open subroutines, a cross between macros and inlining,
   existed at the time of Fortran I.  Fortran VI on the CDC 3600
   allowed 3 additional types (to make a total of 8), and allowed the
   user to define the Fortran operations on these types, but only by
   subroutine calls.

these seem irrelevant.  unless, of course, you are still programming
in fortran I, or on the 3600.

   Why can we not have additional operator SYMBOLS added to a HLL?

you can.  note prolog.

   C and C++ allow additional types, only they do not call them types.

they do if you define them with typedef in c.  if you don't, then they
aren't really types since they are syntactically distinguished.

in c++, they call them structures or classes.  the fact that they are
types is never denied; this only recognizes the etymology of the
situation.

   I was informed by someone at AT&T that the non-allowance of
   additional operator symbols in C++ was an unfortunate oversight.

only partially.  it does preserve the ability of the language to be
parsed by a static parser.  if you allow new operators and new
precedences, then you get into the serious problem of needing to
extend the parser in non-trivial ways *at compile time*.  the lisp
community has been trying to get this just right for decades and is
still trying.  prolog newcomers still cause insiders to giggle by
asking for a context-free grammar for the language.

why don't you admit that the fact that *you* don't see any reason for
something does not mean that there isn't a reason.  you have provided
ample evidence that it only means that you don't see.
--
Offer void except where prohibited by law.
cik@l.cc.purdue.edu (Herman Rubin) (08/04/90)
In article <TED.90Aug3120910@kythera.nmsu.edu>, ted@nmsu.edu (Ted Dunning) writes:
> In article <2426@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>
>    Which machine's floating point representation are you going to use?
>
> whichever machine you are on.

Then you do not have full portability.  In using a Monte Carlo procedure,
you will get an acceptance on one machine rejected on another.  If many
random inputs go into an observation, all observations after this happens
are different.  This is a lack of portability, and is not merely a matter
of accuracy.

......................

>    c{'z} ={tc} {-}{|}{ta}a{'x} OP{mod} {|}{tb}b{'y} {/\{~}w}
>
>    is the way I would write the general vector instruction on the
>    CYBER 205.
>
> care to justify this?  or is the fact that you would write it that way
> sufficient to make everyone want to write it that way?

Only partially.  I do not expect someone else to come up with exactly the
same thing, but it would be very similar.  The vector c is the destination
vector, with z an optional offset; tc overrides the type (half or full) of
the operation; - means negation; | means absolute value; ta and tb can
override the vector or scalar aspect of the corresponding type, and if
vectors they can have an offset; if present, w is a bit vector determining
which values are substituted, and ~ means complementation.  OP is the
operator, and some of the operators have modifications, with defaults
depending also on the types.  Compare this with the customary assembler
type of statement for ease of understanding.

>    Packing and unpacking floating-point numbers are portable
>    constructs, but I know of no HLL which has them.
>
> actually this isn't even true, since the sizes of and limits on the
> components can vary tremendously from machine to machine.

So can the sizes of the various types in C or Fortran.  It is no less
portable than the arithmetic operators in the HLLs.

>    These should be operators, not function calls.
>
> you don't ever seem to answer the point that you only need to do fast
> what you do often.  do you really unpack as many floats as you
> multiply?  why does this operation need to be an operator instead of a
> function call?

Even if the function is totally inlined, so that it is as fast as an
operation, I see no more reason why =PACK (or something like that) should
be a function call than +.  There also is the problem, lacking in most
languages (I believe this can be done in LISP), of multiple returns from
an operation.  +UP (unpack) returns two values, as does division with
quotient and remainder.

> is it that there is some deep psychological need to unpack things
> quickly?

Why do some computer scientists seem to have some deep psychological need
to make things clumsy?  If I want to unpack things, I see no reason why I
should have to use inadequate notation because the language guru does not
see why someone would want to unpack.  Nor should I have to pass an
address and load because they forgot, more than 30 years ago, that natural
operations can return more than one result.  Why should I have to do
things slowly because of the antipathy of some computer scientists to the
idea of the user using the relatively simple structure of the machine?
Why do they want to leave out the natural hardware operations?  By a
natural hardware operation, I mean one which can be done easily by simple
use of computer circuits and at most nanocode.

>    The idea of open subroutines, a cross between macros and inlining,
>    existed at the time of Fortran I.  Fortran VI on the CDC 3600
>    allowed 3 additional types (to make a total of 8), and allowed the
>    user to define the Fortran operations on these types, but only by
>    subroutine calls.
>
> these seem irrelevant.  unless, of course, you are still programming
> in fortran I, or on the 3600.
>
>    Why can we not have additional operator SYMBOLS added to a HLL?
>
> you can.  note prolog.
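One of the multiple-result operations named above does in fact exist in
C's standard library, though again as a call rather than an operator:
div() returns quotient and remainder together in a single struct, from
one operation.  A minimal sketch (the wrapper name `split` is invented):

```c
#include <stdlib.h>

/* div() computes quotient and remainder at once and returns both in a
   div_t -- the closest standard C comes to a two-valued operation. */
div_t split(int n, int m) {
    return div(n, m);
}
```

Many machines do produce both results in one divide instruction, so a
compiler can implement this pair with a single operation.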
>
> C and C++ allow additional types, only they do not call them types.
>
> they do if you define them with typedef in c.  if you don't, then they
> aren't really types since they are syntactically distinguished.
>
> in c++, they call them structures or classes.  the fact that they are
> types is never denied, this only recognizes the etymology of the
> situation.

At least according to all the C manuals I have read, typedef does not
allow one to add types, only add aliases.  That there were other types
was explicitly recognized long before C and C++.

> I was informed by someone at ATT that the non-allowance of
> additional operator symbols in C++ was an unfortunate oversight.
>
> only partially.  it does preserve the ability of the language to be
> parsed by a static parser.  if you allow new operators and new
> precedences, then you get into the serious problem of needing to
> extend the parser in non-trivial ways *at compile time*.  the lisp
> community has been trying to get this just right for decades and is
> still trying.  prolog newcomers still cause insiders to giggle by
> asking for a context free grammar for the language.

This is true.  I would be willing to give up a heck of a lot on precedence
structure, however.  The precedences, except for the common arithmetic
operators, are fairly arbitrary, anyhow.  Why not allow the parser to ask
the programmer for the preference order, as well as evaluation order if it
makes a difference?  A lot can be helped by allowing explicitly temporary
variables to handle intermediate results.

There are ways around the problem of precedence.  Maybe we should get rid
of it completely, like APL.  There are parentheses.  The compiler can ask
the programmer these facts.  When coding myself, I have had only trouble
with these arbitrary precedences.  Which has higher precedence, << or +?

> why don't you admit that the fact that *you* don't see any reason for
> something does not mean that there isn't a reason.
> you have provided
> ample evidence that it only means that you don't see.

The only reasons I have seen advanced are that not too many people would
use them.  Suppose someone comes to me with a new type of statistical
problem.  After some work, an appropriate class of numerical procedures is
produced for that problem.  I see how a given machine works on the various
procedures, and it is definitely not the case that the same procedure
works well on all machines.  Now, I have to go into contortions to get the
procedure reasonably implemented.

One does not use a different part of statistics for problems in biology
and problems in physics.  One should not have to use a different
programming language for different types of problems.  If a mathematical
notation is clumsy, it usually does not last long.  Neither should clumsy
notation in Fortran or C or Lisp or assembler.
--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet)  {purdue,pur-ee}!l.cc!cik(UUCP)
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (08/04/90)
In article <2426@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> Portable constructs are one thing, but portable code is another.  Packing
> and unpacking floating-point numbers are portable constructs, but I know
> of no HLL which has them.

1. ANSI C.
2. Common Lisp.
3. The Ada Numerics Group have a very nice proposal for operations on
   floating point numbers (including, would you believe, math functions
   with guaranteed error bounds) which should be in Ada 9x.
4. Fortran 90.  In detail, if x = s * b**e * f, s = +1 or -1, 0 <= f < 1

    EXPONENT(x)      => e    (exponent(0.0) .eq. 0)
    FRACTION(x)      => f .eq. x * b**(-EXPONENT(x))
    RADIX(x)         => b
    SIGN(1.0, x)     => s    (sign(1.0, 0.0) .eq. 1.0)
    DIGITS(x)        => number of base-b digits in f
    NEAREST(x, s)    => if s > 0, the machine number _just_ .gt. x
                        if s < 0, the machine number _just_ .lt. x
    RRSPACING(x)     => ABS(SCALE(x,-EXPONENT(x))) * RADIX(x)**DIGITS(x)
    SPACING(x)       => RADIX(x)**(EXPONENT(x)-DIGITS(x))
    SCALE(x, i)      => x * RADIX(x)**i
    SETEXPONENT(x,i) => x * RADIX(x)**(i-EXPONENT(x))

There's a proposal for a standard for the floating-point part of
programming language standards.  It was published in an issue of SIGPlan
Notices this year.  Anyone interested in this matter should see that and
send in comments.

A language standard says what the operations _look_ like.  ANSI C, Fortran
90, and the proposal for Ada 9x, are all for languages with static typing,
so a compiler can figure out at compile time exactly which hardware
floating-point type is in use, and may generate the appropriate
instructions.  On a VAX, X = SETEXPONENT(X, 2) might well be compiled into
a single INSV instruction.  Why not?  Mathematically, these things _are_
functions, and they might as well _look_ like functions.

> Why can we not have additional operator SYMBOLS added to a HLL?

In Algol 68, they _can_.  When Algol 68 needed friends, where were you?
Haskell lets you declare new (overloaded) operators.
> I have not run into anything which can not be written in a semi-portable
> manner.  But in choosing which of the extremely large number of algorithms
> I can understand, I do take into account the machine.

I think that the answer here may be to think in terms of meta-programs.
I've often run into an irritating problem.  There is some special function
I would like to compute, and I'm not enough of a numerical analyst to work
out how to do it well myself, so I look around in journals and such for an
implementation I can copy.  _Far_ too often I'm presented with something
with around a dozen coefficients which are (a) presented to fewer decimal
places than I can get in IEEE 754 arithmetic, (b) designed for single
precision arithmetic on some antique machine in the first place, and (c)
-- this is the killer -- unaccompanied by any explanation of how to
compute suitable coefficients for some other machine.

What I would really like to see is meta-programs: tell them something
about the machine arithmetic and they generate the appropriate table of
coefficients for you.  (No, tables for a dozen different existing machines
are not enough.  I've yet to see the 80-bit numbers you get on a Mac in
such tables.)
--
Distinguishing between a work written in Hebrew and one written in Aramaic
when we have only a Latin version made from a Greek translation is not easy.
    (D.J.Harrington, discussing pseudo-Philo)
lgm@cbnewsc.att.com (lawrence.g.mayka) (08/05/90)
In article <TED.90Aug3120910@kythera.nmsu.edu> ted@nmsu.edu (Ted Dunning) writes:
>parsed by a static parser.  if you allow new operators and new
>precedences, then you get into the serious problem of needing to
>extend the parser in non-trivial ways *at compile time*.  the lisp
>community has been trying to get this just right for decades and is
>still trying.  prolog newcomers still cause insiders to giggle by

I think the Common Lisp community finds lexical extension (macro
characters) quite useful and syntactic extension (macros) absolutely
essential.  Your "still trying" barb seems to be an attack on the very
process of experimentation, growth, development, standardization, and
extension inherent in the introduction of new technology.

    Lawrence G. Mayka
    AT&T Bell Laboratories
    lgm@iexist.att.com

Standard disclaimer.
lgm@cbnewsc.att.com (lawrence.g.mayka) (08/05/90)
In article <2428@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>This is true.  I would be willing to give up a heck of a lot on precedence
>structure, however.  The precedences, except for the common arithmetic
>operators, are fairly arbitrary, anyhow.  Why not allow the parser to
>ask the programmer for the preference order, as well as evaluation order
>if it makes a difference?  A lot can be helped by allowing explicitly
>temporary variables to handle intermediate results.
>
>There are ways around the problem of precedence.  Maybe we should get rid
>of it completely, like APL.  There are parentheses.  The compiler can
>ask the programmer these facts.

As you say, precedence is unnecessary.  Parentheses are sufficient.
Application-oriented notations, such as the use of infix numerical
operators with standard mathematical precedence, can be an add-on feature,
operative within delimited textual regions (e.g., on the Symbolics
workstation, any program text surrounded by sharp-lozenge and lozenge
characters).  One can define a notation to one's own liking in the same
way.

    Lawrence G. Mayka
    AT&T Bell Laboratories
    lgm@iexist.att.com

Standard disclaimer.
peter@ficc.ferranti.com (Peter da Silva) (08/05/90)
Herman Rubin: "You're denying the artisan his tools".
Me: "Oh, garbage.  If you're writing non-portable code anyway use assembly.
Or... have you considered Forth?"

In article <2426@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin)
follows up with a complete non sequitur:
> I have never objected to having semi-portable code.

So why don't you write this code the same way everyone else does: by
implementing it in a high level language and then modifying the algorithm
and recoding parts of it in assembly.

> Forth is also clumsy and inefficient.  An intelligently written assembler,
> which allowed overloaded operators and weak typing, would be useful,

Forth *is* an intelligently-written assembler, for an ideal stack-based
computer.  Most production Forth code also includes high level assembly
code for the actual underlying hardware.  You're welcome to write your
whole program that way if it suits you.

[the following gibberish]
> c{'z} ={tc} {-}{|}{ta}a{'x} OP{mod} {|}{tb}b{'y} {/\{~}w}
> is the way I would write the general vector instruction on the CYBER 205.
> with braces enclosing optional fields.

If you explain what that means, I'll attempt to write it in Forth
assembler.  I'm sure it would be more readable.  You've actually come up
with something less readable than naive Forth.  Be proud.

> I have not run into anything which can not be written in a semi-portable
> manner.

So now who is it that is denying the artisan his tools?  Herman, I really
think you'd like Chuck Moore.
--
Peter da Silva.  `-_-'  +1 713 274 5180.  'U`  <peter@ficc.ferranti.com>
mikeb@ee.ubc.ca (Mike Bolotski) (08/05/90)
In article <2428@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
> [ ... ].  One should not have to use a different programming
> language for different types of problems.  [ ... ]

The most ridiculous statement I've seen on this newsgroup.  It rather
causes the reader to question the qualifications of the author on any
aspect of language design or implementation.

And I've written quite a few language parsers with lex/yacc, some of which
have syntax as horrid as the one Herman proposed.  It doesn't require much
effort to slop together a language for a temporary project.
--
Mike Bolotski           VLSI Laboratory, Department of Electrical Engineering
mikeb@salmon.ee.ubc.ca  University of British Columbia, Vancouver, Canada
gudeman@cs.arizona.edu (David Gudeman) (08/06/90)
In article <1352@fs1.ee.ubc.ca> mikeb@ee.ubc.ca (Mike Bolotski) writes:
]In article <2428@l.cc.purdue.edu>, cik@l.cc.purdue.edu (Herman Rubin) writes:
]> [ ... ].  One should not have to use a different programming
]> language for different types of problems.  [ ... ]
]
]The most ridiculous statement I've seen on this newsgroup.  It rather
]causes the reader to question the qualifications of the author on
]any aspect of language design or implementation.

Only ridiculous if you have a limited imagination.  It is perfectly
plausible that it is possible to design a ``universal'' language of some
sort, that makes it unnecessary to use other languages.  Such a language
will probably not be small, and it may be divided into many sublanguages,
each appropriate for different problems, but there could be a unifying
framework and user interface.  In fact a lot of current research is going
into the search for such a language, although the researchers don't
generally think of their work in this way.

Of course, it is also perfectly plausible that no such unification is
possible, and that the current state of affairs will continue
indefinitely, where it is necessary to pick an appropriate language for
your problem.  To suggest that the issue has been decided is a little
premature, to say the least.
--
David Gudeman
Department of Computer Science
The University of Arizona
gudeman@cs.arizona.edu
Tucson, AZ 85721
noao!arizona!gudeman
gudeman@cs.arizona.edu (David Gudeman) (08/06/90)
In article <2428@l.cc.purdue.edu> cik@l.cc.purdue.edu (Herman Rubin) writes:
>In article <TED.90Aug3120910@kythera.nmsu.edu>, ted@nmsu.edu (Ted Dunning) writes:
>
>Why do some computer scientists seem to have some deep psychological
>need to make things clumsy?

Computer scientists have a deep psychological need to do real research,
and not play around with trivialities such as syntax.  To this end, many
languages today are parsed with automatically generated parsers, and the
syntax is designed specifically so that it is easy to write a yacc grammar
for it.  I don't think you are going to find much sympathy for your
complaints that you want operator syntax instead of function syntax for
your pet operations.  You may as well complain that the language is named
C instead of a much more presentable letter such as D.

>> why don't you admit that the fact that *you* don't see any reason for
>> something does not mean that there isn't a reason.  you have provided
>> ample evidence that it only means that you don't see.
>
>The only reasons I have seen advanced are that not too many people would
>use them.  Suppose someone comes to me with a new type of statistical
>problem...

You aren't coming to us with a new type of problem.  You are coming to us
with a minor problem that was noted and satisfactorily answered thirty
years ago.  The problem is: there are a finite number of operator symbols
and an infinite number of operations.  The most common solution is: use
names for operations and have a special operator for applying names as
operators (function notation).  It's not the only solution, but it seems
adequate to almost everyone except you.

All of your other problems are either (1) equally trivial or (2) already
solved by easily available languages or compilers.  In particular, your
apparent need for using direct machine instructions is solved by gcc, a
_free_ C compiler.

>If a mathematical notation is
>clumsy, it usually does not last long.  Neither should clumsy notation in
>Fortran or C or Lisp or assembler.

In mathematics, there is no great investment involved in changing the
notation.  In programming languages, the investment is often huge.  And
in this case, the motivation is minuscule.
--
David Gudeman
Department of Computer Science
The University of Arizona
gudeman@cs.arizona.edu
Tucson, AZ 85721
noao!arizona!gudeman
ok@goanna.cs.rmit.oz.au (Richard A. O'Keefe) (08/06/90)
There's a rather interesting idea I picked up from Burroughs.
They had this really neat way of producing assembly code for the
Datacom Processor (a little parasite that sat inside a B6700 and
did network stuff). Your assembly code program was a program
written in Algol. Imagine something like an M680x0. You might do
movl(d0, offset(a0,20)); % [a0+20] := d0
Here the arguments to movl were expressions which constructed trees.
So you could also do things like
myvar := offset(a0,20); % myvar .equ [a0+20]
movl(myvar, d0); % d0 := [a0+20]
clrl(myvar); % [a0+20] := 0
The scheme was a two-pass one. The top level program had the form
begin
integer pass;
<declare all the "opcodes" and tree builders>
procedure user_program;
begin
<"macros" become ordinary procedures>
<instructions become procedure calls>
end;
pass := 1;
user_program;
pass := 2;
user_program;
save_virtual_store(outputfile);
end.
With a scheme like that, one might handle Rubin's example thus:
> > c{'z} ={tc} {-}{|}{ta}a{'x} OP{mod} {|}{tb}b{'y} {/\{~}w}
1. An operand is an address, perhaps with a type. So we might use
myvec for a simple address
SINGLE(myvec) for a typed address
2. A source operand is
<operand> use operand
or -<operand> use NEG(operand)
or |<operand> use ABS(operand)
or -|<operand> use NEG(ABS(operand))
3. I've forgotten what {mod} and {/\{~}w} are, I think w has to do
with masking. So let's assume
vecop(Dest, Src1, Op, Src2) -- mask omitted
vecmask(Dest, Src1, Op, Src2, Mask) -- mask required
e.g.
vecmask(INT(out), ABS(SINGLE(in1)), GTR, ABS(SINGLE(in2)), NOT(mask));
instead of
out int = |in1 single > |in2 single /\~mask
I've written "assembly code" this way using Algol, Pop, Lisp, and Prolog
as the actual source languages. Works like a charm, and you have *really*
nice languages for writing your "macros"... The syntax isn't assembly-like,
and it isn't all that much like what Herman Rubin wants, but a small dose
of Stage 2 will fix that. (Anyone remember Stage 2? Anyone remember Wisp?)
Doing a really thorough job of this kind of thing requires a sort of
macro facility where the macro language can ask about types and attributes
of the arguments. Something not altogether unlike Ada might actually make
a good basis for this.
--
Distinguishing between a work written in Hebrew and one written in Aramaic
when we have only a Latin version made from a Greek translation is not easy.
(D.J.Harrington, discussing pseudo-Philo)