jlg@lanl.gov (Jim Giles) (09/14/90)
From article <1990Sep13.185833.17455@cunixf.cc.columbia.edu>, by wp6@cunixa.cc.columbia.edu (Walter Pohl): > > What do you mean about C's sins of commission? > Do you mean the lack of type checking? [...] Actually, what you're asking is a tough question. There are so many problems with C that just listing the more obvious ones would take many pages. It is difficult to turn to _any_ page of the C draft standard without stumbling upon something with which I completely disagree. (By the way, it is difficult to turn to any page of the final C standard because I haven't seen any copies of it. Has it even been published yet? It was finalized in January/February.) Yes, type checking is a problem with C. To my mind, it is one of C's least egregious faults. For one thing, most violations _are_ illegal in C - just that most implementations don't bother checking. I make a careful distinction between a language and any particular implementation. The faults of C that I most object to are those which cannot be corrected because the language itself requires them. As I said, _most_ type violations are already illegal. Not all though. Unions are not discriminated. Pointer 'casts' are allowed (essentially between _any_ two pointer types - officially, casts can only be between 'void' pointers and others but cast first to void then to anything else is legal). This leads us to pointers. Just about everything about C pointers is bad. From the fact that pointers are hopelessly confused with arrays (which are completely separate conceptually) to the syntax of pointer use, C's pointers are a mess. In addition, many language design people now feel that pointers of _any_ kind are a bad idea. C.A.R. Hoare condemned them as long ago as the early 70's (about the time C was 'designed'). He pointed out that pointers are the data structuring element that corresponds to GOTOs in flow control - if the one is bad, so is the other. 
-----------------------------------------------------------------------------
Since this is comp.society.futures, I will discuss pointer replacements. Essentially, pointers only do three things for you: 1) recursive data structures (graphs, trees, etc....); 2) dynamic memory; and 3) run-time 'equivalence'. C pointer arithmetic only does what one dimensional array indexing already does (scaled address calculations): arrays are better for this - so it's _not_ counted as one of the features of pointers.

Recursive data structures are best implemented directly (to use a C/Fortran like declaration syntax with the type names on the left):

      Type Tree is record
         integer :: value
         tree :: left, right
      end type Tree

Note that the elements inside a tree-valued data type are not _pointers_ but are actually trees themselves. No more confusing pointers with what they point to - the pointers aren't explicitly visible. No more forgetting the dereference operator (or, conversely, putting it in incorrectly) - there isn't a dereferencing operator. To be sure, the compiler _may_ internally use pointers to do the implementation of these recursive structures (but then, it probably uses GOTOs to internally implement loops), but since they aren't explicitly visible to the user, his life is much easier.

Dynamic memory should also be implemented directly. Again, here is an example:

      Dynamic Integer :: a(:,:)   !-- declares two dimensional a
      ... use of a here is illegal - not allocated yet ...
      ALLOCATE a(50,100)          !-- allocates 5000 words of memory for a
      ... use of a here is legal ...

Of course, there would have to be an inquiry function to detect whether the object was allocated or not. Further, the decision would have to be made in the language design whether deallocation would be automatic (garbage collection, reference count, etc.) or whether the user would have to explicitly deallocate things. Either way, this is simpler, safer, and easier to code, use, and debug than pointer usage.
Further, the compiler can optimize uses of the dynamic object with the knowledge that it's not aliased to anything - a fact the compiler cannot deduce from malloc() calls (which as far as the compiler knows is just a function which might be returning just any old address it feels like).

Run-time equivalencing is a feature which some people (with a good deal of justification) claim shouldn't be allowed at all. I disagree. But there are still some distinctions to be made. First, equivalencing might be used just to reuse statically allocated space (although using dynamic memory is probably better). Equivalence might also be used to provide a form of array reshaping or slicing - here pointers are inadequate: try the ALIAS/IDENTIFY feature in the first draft Fortran 8X proposal. Equivalence might also be used for defeating type checking - but here I prefer to recommend the below:

      Type Float_internal is record
         bit.1 :: sign
         bit.8 :: exponent
         bit.23:: significand
      End type Float_internal

      Float :: x                  !-- x is a simple float variable
      Map x as Float_internal     !-- overlays record onto x

      x = 5.0                     !-- x used as usual
      x.sign = 1                  !-- negate x - use the mapping
      x.exponent=x.exponent+1     !-- multiply x by 2 - use the map
      ... etc ...

This makes the defeating of the type checking explicit and also makes the intended use clearer.

One of the problems with C pointers is that you can't locally tell if a pointer is supposed to be an array, a recursive structure, an allocated object, or some exotic run-time equivalence. Providing all these possible features with high-level syntax and separate functionality improves the clarity of the code. It usually even makes the code more succinct (shorter).

So, to make a long story short (too late), I haven't yet found any application which _needs_ explicit pointers either for speed or functionality. The above replacements either conceal or eliminate pointers and are as (or more) efficient and easier to use.
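For comparison, here is roughly what the Float_internal mapping above looks like when written in C today. This is only a sketch: it assumes that float is a 32-bit IEEE 754 single (1 sign bit, 8 exponent bits, 23 significand bits, the same split as the record above), and the function names are made up for illustration. It copies the bits with memcpy rather than casting pointers.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical helpers; assumes float is a 32-bit IEEE 754 single. */

/* Copy the bits of a float into an integer without any pointer cast. */
static uint32_t float_bits(float x)
{
    uint32_t bits;
    memcpy(&bits, &x, sizeof bits);
    return bits;
}

/* Copy an integer bit pattern back into a float. */
static float bits_float(uint32_t bits)
{
    float x;
    memcpy(&x, &bits, sizeof x);
    return x;
}

/* Counterpart of "x.sign = 1": set the 1-bit sign field. */
float set_sign(float x)
{
    return bits_float(float_bits(x) | 0x80000000u);
}

/* Counterpart of "x.exponent = x.exponent + 1": bump the 8-bit field. */
float bump_exponent(float x)
{
    uint32_t bits = float_bits(x);
    uint32_t exponent = (bits >> 23) & 0xFFu;

    exponent = (exponent + 1u) & 0xFFu;
    bits = (bits & ~(0xFFu << 23)) | (exponent << 23);
    return bits_float(bits);
}
```

Under that assumption, set_sign(5.0f) gives -5.0f and bump_exponent of that gives -10.0f, matching the "negate" and "multiply by 2" comments in the record example. The shifts and masks have to be spelled out by hand, which is the kind of noise the explicit Map declaration is meant to remove.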
-----------------------------------------------------------------------------
Now, back to C.

Related to type checking is mixed mode. I don't object to mixed mode; in fact, I support it. But C's rules for applying it are not reasonable. The _claim_ is that the rules are designed to allow speed. Actually, there is no rational reason for minus five divided by a thousand to _ever_ be positive or to _ever_ be larger than one in magnitude. The C rules sometimes require that (-5/1000U == some large machine dependent constant). The C type hierarchy needs considerable adjustment.

This brings us to mixed type operations (not just mixed mode). Since C has no 'logical' type, you are allowed to mix arithmetic with the results of conditionals with wild abandon. I have never seen any advantage to this - I HAVE seen a lot of people make a lot of costly and time consuming mistakes as a result. Further, the lack of a 'logical' data type means that they must provide more than one set of boolean operators (and, or, not, xor) in order to have bitwise and logical distinguished.

So, the next point is this bit about C's operators. There are too many operators and too many precedence levels. Some (like the logical vs. bitwise problem) would not be necessary if C had better intrinsic data types. Others perform functions which would probably be better done as function calls (intrinsics which could be inlined of course). Still others (like pointer dereferencing) should probably not exist at all. In spite of all these operators, character string concatenation, string comparison, and substring operations are _not_ operators. Even Fortran is better.

Data type declaration "operators" (or whatever you want to call the syntax elements) are particularly ugly, obscure, peculiar, difficult, and arcane. I'm told that this is because they wanted a declaration of a data type to look like a use of that type.
This leads us to: The use of complicated data types is particularly ugly, obscure, peculiar, difficult, and arcane. At least they met their goal: the syntax of using the variables is every bit as bad as that for declaring them.

Assignment operators are necessary in a procedural language. But these combinations of assignment with other operators are just useless syntactic sugar. Personally, I don't care if the language has them or not, but they do clutter up the syntax quite a bit. The main problem with assignment is not the operators, per se, but the fact that they are allowed _within_ an expression. There have been several well conducted experiments on the effect of such operators on user productivity - the conclusion has been that assignment should be a statement level operator and _not_ an expression level one - at least, if you want to maximize user productivity.

While we're on the subject of productivity experiments, here are a few other C features that have failed such tests:

Control structures which used 'compound statements' (ie. sequences bounded by BEGIN/END or {/} as C spells them). Better is the IF/ELSEIF/ELSE/ENDIF, WHILE/ENDWHILE, etc. style. Even better is allowing control constructs to be given unique labels and matching them up (ie. Ada and Fortran 90 have this feature).

End-of-line ignored within comments. Comments should be terminated by the end-of-line mark. C++ has the option of doing this. Unfortunately, it still retains the old wraparound version as well (the danger of developing a backward compatible language is the load of junk that you can't get rid of).

End-of-line ignored within statements. The experimenters decided that people just seem to regard the end-of-line as the same as the end-of-statement, they really do. Even C programmers intuitively know this. I examined 10,000+ lines of commercial C code and found only 12 lines which used the C ability to wrap statements across lines automatically.
Even so, forgotten semicolons almost _all_ occur at the end-of-line, and it is still a very common syntax error. I think the end-of-line mark should be a synonym for semicolon and should be escaped in the rare (12 out of 10,000) case that a continuation is needed.

Pointers - well, we've talked about them.

GOTOs. This is an interesting subject because there are actually conflicting results here. Spaghetti code clearly (and in the experiments, this was shown) causes massive productivity problems. However, in the test involving BEGIN/END control flow brackets, GOTOs were found to be one of the things which were better (by about a factor of 2) than 'compound statements'. Other experiments involving "disciplined" GOTO usage (with "disciplined" pretty much meaning what you'd expect) were compared with "Structured" GOTO-less programs and _no_ statistically significant difference in productivity was observed at all. Actually, in this one case, I think C has got it exactly right - leave unrestricted GOTO in the language _and_ provide all the "Structured" control flow constructs. One of the very few things that I think C did right.

There are several other experimental results - this is just a sampling. The only experiment that I've ever seen in which the losing feature wasn't in C was the one that showed that semicolon should be a terminator not a separator. C got this one right. C was on the wrong side of every other experiment I've ever seen.

Some non-experimental features which are widely regarded as bad ideas:

Case sensitive syntax. In a case insensitive language, code can be easily shared, teamwork is easier, and upper-case can be used for emphasis or other documentation purposes. In a case sensitive syntax, communication between sites (or even down the hall) is impeded by differing case conventions. People waste time ironing this out and not doing more useful work.

Nonintuitive syntax. This is very common in C.
If a concept has a widely developed and simple notation which is compatible with the keyboard and/or print devices available, the language _should_ make every effort to accommodate this common notation. I will give one specific example: what in the world possessed them to use a leading zero to distinguish octal from decimal???

Inconsistent syntax. Also common in C. An operator, keyword, or construct should have the same meaning (as nearly as possible) in every context in which it is allowed. A specific example is the keyword 'static', which means that the memory for the corresponding variable being declared is permanently associated with the variable for the entirety of run-time - except in the beginning of a file (outside any procedure), where 'static' suddenly means the same thing that other languages call 'private'. (All variables declared outside of procedures have permanently allocated memory anyway - so, 'static' should be regarded as redundant there.)

Well, as I predicted, even to touch on the small number of obvious problems is several pages long. I trust that you can see there are still others lurking in the language specification (like 'switch', which doesn't automatically put a 'break' between the cases - whoops - I can't stop once I'm on a roll).

J. Giles
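Three of the specific complaints in this post are easy to check with a few lines of C: the mixed-mode division -5/1000U, the leading-zero octal literal, and 'switch' falling through without a 'break'. The following is a small sketch; the function names are invented for illustration and are not from the post.

```c
#include <limits.h>

/* The int -5 is converted to unsigned before the division, so the
   quotient is a large positive number rather than 0. Its exact value
   depends on the width of unsigned int on the machine. */
unsigned mixed_mode_quotient(void)
{
    return -5 / 1000U;
}

/* A leading zero makes the literal octal: 010 is eight, not ten. */
int octal_literal(void)
{
    return 010;
}

/* Without a break, control runs from one case into the next.
   Returns how many case bodies were executed for the selector. */
int count_cases_run(int selector)
{
    int executed = 0;

    switch (selector) {
    case 1:
        executed++;     /* no break: falls through */
    case 2:
        executed++;     /* no break: falls through */
    case 3:
        executed++;
        break;
    default:
        break;
    }
    return executed;
}
```

Here mixed_mode_quotient() equals (UINT_MAX - 4U) / 1000U, which is the "large machine dependent constant" mentioned above, and count_cases_run(1) returns 3 because all three case bodies run.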
marick@m.cs.uiuc.edu (09/14/90)
/* Written 9:08 pm Sep 13, 1990 by jlg@lanl.gov in m.cs.uiuc.edu:comp.society.futures */
> The main problem
> with assignment is not the operators, per se, but the fact that they are
> allowed _within_ an expression. There have been several well conducted
> experiments on the effect of such operators on user productivity - the
> conclusion has been that assignment should be a statement level operator
> and _not_ an expression level one - at least, if you want to maximize
> user productivity.

By lost productivity, do you mean time spent discovering and correcting errors where people wrote

	if (a = b)

when they meant

	if (a == b)

Did they study languages where that kind of error is less likely? For example, I doubt many Lisp programmers write

	(if (setq a b)

when they meant

	(if (eq a b)

Were the studies of novice programmers? How were costs calculated?

It's important not to interpret experiments more broadly than the data allows. For example, were code reads used in the experiment? With a code read checklist, such errors are readily caught, at low additional cost. If they were not used, the experiment says only that value-returning assignment raises costs in the absence of code reads, not that it raises costs, period.

Because experiments require interpretation, I'd like to see citations. Thanks.

(BTW: I recommend this convention:

	if (5 == a)

instead of

	if (a == 5)

Cuts down on errors considerably. I saw this on the net somewhere, but I don't know whose idea it was originally.)
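To make the error under discussion concrete, here is a short sketch of what an assignment inside a condition actually does, next to the comparison that was meant and the constant-first convention. The function names are invented for illustration. The doubled parentheses only keep compilers from warning; the typo form 'if (*a = b)' means the same thing.

```c
/* Returns 1 if the branch is taken. The assignment copies b into *a,
   and the condition then tests the new value of *a, not equality. */
int branch_on_assignment(int *a, int b)
{
    if ((*a = b))       /* same meaning as the typo 'if (*a = b)' */
        return 1;
    return 0;
}

/* The comparison that was intended: nothing is modified. */
int branch_on_comparison(int a, int b)
{
    if (a == b)
        return 1;
    return 0;
}

/* Constant-first convention: mistyping this as 'if (5 = a)' would be
   rejected by the compiler, because a constant cannot be assigned to. */
int constant_first(int a)
{
    if (5 == a)
        return 1;
    return 0;
}
```

With a holding 1 and b equal to 7, branch_on_assignment takes the branch and leaves 7 in a, while branch_on_comparison does not take it. The constant-first form only helps when one side of the comparison is a constant.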
jcburt@ipsun.larc.nasa.gov (John Burton) (09/14/90)
>This leads us to pointers. Just about everything about C pointers is
>bad. From the fact that pointers are hopelessly confused with arrays
>(which are completely separate conceptually) to the syntax of pointer
>use, C's pointers are a mess.

Not a mess, just cryptic, as is most of C. C IS NOT a high level language, it was originally designed as a mid-level language, somewhere between Pascal and Assembler...it is NOT designed to be used by novices!

>In addition, many language design people
>now feel that pointers of _any_ kind are a bad idea. C.A.R. Hoare
>condemned them as long ago as the early 70's (about the time C was
>'designed'). He pointed out that pointers are the data structuring
>element that corresponds to GOTOs in flow control - if the one is
>bad, so is the other.

Again, what they are considering is a high-level language. C is a mid-level language. Assembler DOES NOT have arrays and the PRIMARY method of flow control IS the GOTO... BTW: since the early 70's it has also been shown that the total LACK of GOTO's is also bad. Selective use of GOTO is the key.

>Since this is comp.society.futures, I will discuss pointer replacements.
>Essentially, pointers only do three things for you: 1) recursive data
>structures (graphs, trees, etc....); 2) dynamic memory; and 3) run-time
>'equivalence'. C pointer arithmetic only does what one dimensional array
>indexing already does (scaled address calculations): arrays are better for
>this - so it's _not_ counted as one of the features of pointers.

Sorry, but it IS more expensive (execution time wise) to sequentially index through an array, than it is to simply increment a pointer... Array indexing is better for randomly accessing the array... just because random access is better for some things, should we totally do away with sequential access? come on, be fair...
>Recursive data structures are best implemented directly (to use a
>C/Fortran like declaration syntax with the type names on the left):
>
>      Type Tree is record
>         integer :: value
>         tree :: left, right
>      end type Tree
>
>Note that the elements inside a tree-valued data type are not _pointers_
>but are actually trees themselves. No more confusing pointers with
>what they point to - the pointers aren't explicitly visible. No more
>forgetting the dereference operator (or, conversely, putting it in
>incorrectly) - there isn't a dereferencing operator. To be sure, the
>compiler _may_ internally use pointers to do the implementation of
>these recursive structures (but then, it probably uses GOTOs to internally
>implement loops), but since they aren't explicitly visible to the user,
>his life is much easier.

when i think of a (data structure) tree, i tend to think in terms of nodes being linked together NOT immediately adjacent (draw a data structure tree...nodes are not directly attached...there is a line connecting the nodes...the line is a pointer...)

Not using pointers would make life easier for a novice programmer. It would severely limit the experienced programmer...

>
> [more stuff about the evils of C]
>
>J. Giles

One thing that has apparently slipped the mind here is that comparing C, Pascal and Fortran, is the same as comparing apples to oranges...

Fortran was designed to be a "high-level" scientific language, it was designed to "protect" the user from the machine (and protect the machine from the user) while still allowing him/her some freedom to do calculations. Protecting the user from himself is another story...I still have nightmares about debugging large Fortran systems that use COMMON & EQUIVALENCE extensively. Few errors (other than straight syntax) were caught by the compiler...most errors were debugged (or not) at the runtime stage.

Pascal was designed from the ground up as a "high-level" teaching language.
It was designed to enforce structured programming, and to detect as many errors as possible at the earliest possible stage (i.e. the compiler). Basically it provided a high level of insulation between the machine and the user, but at the expense of functionality. (Note: you can still do almost anything you want in Pascal, it just takes more work).

C on the other hand was designed as a "Mid-Level" systems language. It was to be used to write device drivers and low level routines. Essentially it filled the gap between Assembly Language (high functionality/flexibility, *low* user protection) and the high level languages (High user protection and lowered functionality/flexibility). It was designed to have freer access to the machine, but still provide some level of protection.

Basically what the above posting indicated is that the average programmer should not use C. Fine, the average programmer should also not do systems programming, and the average programmer should not use assembler...

Speed of execution is another aspect for comparison...Pascal and Fortran were designed to accomplish tasks safely (Pascal more so than Fortran). C was designed to accomplish tasks quickly. The often made comparison of indexing through a 1-dimensional array (linear representation of a multidimensional array) instead of incrementing pointers is not strictly valid. For every array access, there is a corresponding index calculation (usually a multiplication and an addition) to determine where to look for the data. Incrementing a pointer is faster (generally a register increment operation). The difference between the two methods is the same as the difference between a random access data structure (array) and a sequential access data structure (pointers). Neither is inherently better, they each have particular applications where they are the best choice.
The problem is that C, Pascal, and FORTRAN were designed for (different) specific purposes, and are currently being used as high level general purpose languages which is NOT what they were designed to do. C wasn't even designed as a high level language...Each is probably the best choice in its area, but not necessarily outside its area...

Ada is an example of a (government decreed) high level general purpose language...supposedly it can do everything C, Pascal, and FORTRAN can do, but i'm not sure it would be the language of choice in the specific areas that C, Pascal, and FORTRAN were designed for...

When asked the question, "what is the *best* programming language?" the answer should be "it depends on what you want to do..."

perhaps the future of computer languages shouldn't be trying to find a "best" general purpose language, but instead, develop transparent ways for modules written in one language to be used in programs written in other languages...this is already being done to some extent, but it should be taken further...

John Burton
(jcburt@ipsun.larc.nasa.gov)
(jcburt@cs.wm.edu)
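The two access styles being compared in this post can be written side by side. The sketch below (function names invented for illustration) sums an array once by index and once by stepping a pointer. It shows only that the two loops compute the same thing; which one runs faster is a property of the compiler and the machine, and the code makes no claim about that.

```c
#include <stddef.h>

/* Sequential access by array index. */
long sum_indexed(const int *values, size_t count)
{
    long total = 0;

    for (size_t i = 0; i < count; i++)
        total += values[i];
    return total;
}

/* Sequential access by incrementing a pointer. */
long sum_pointer(const int *values, size_t count)
{
    long total = 0;

    for (const int *p = values, *end = values + count; p < end; p++)
        total += *p;
    return total;
}
```

For any array and count the two functions return the same total, so a compiler is free to generate the same instructions for both if it can see that.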
bzs@WORLD.STD.COM (Barry Shein) (09/15/90)
>>This leads us to pointers. Just about everything about C pointers is >>bad. From the fact that pointers are hopelessly confused with arrays >>(which are completely separate conceptually) to the syntax of pointer >>use, C's pointers are a mess. > >Not a mess, just cryptic, as is most of C. C IS NOT a high level >language, it was originally designed as a mid-level language, somewhere >between Pascal and Assembler...it is NOT designed to be used by novice! I think we're working from axioms here that may not be self-evident. For example, HLLs are easier for novices than MLLs. I don't know if that's true or not. Novices make a lot of semantic errors. HLL's provide such a high level of abstraction and so many complicated rules that it's nearly impossible to explain why something is wrong, or in particular, figure it out for yourself. The PL/I manual set (an HLL if there ever was one) was something like 19 linear shelf-feet. The rules for arithmetic alone demanded an understanding of all sorts of issues. MLL's like C provide a very simple semantics (can be explained in a few dozen pages) but a reasonably high-level syntax. That is, "algebraic", analogous to what is learned in grade-school math rather than having to do the translation needed for assembler etc. Pascal is considered an HLL. I've taught several languages over the years to college students. Pascal had to be one of the worst, most kids couldn't get past where the semi-colons go. That sort of subtlety and worshipping of abstraction is typical of HLL's. Also, most students have problems with pointers because their teachers didn't understand them. No joke. Somewhere they became these bogey-men and teachers in intro courses would stand up there and give all these signals to the class that what s/he is about to teach is impossible to understand so here we go... 
I just told them funny stories about the confusions between the thing and the thing contained ("The White House said today...") and went on with it and never had any problems. A location is not really a hard problem, houses have addresses etc, pigeon-holes with box numbers and so forth. The hard problem was lousy teachers with worse attitudes. Self taught programmers almost never had any problem with pointers or thought they were particularly interesting/challenging/whatever. I think the future of programming, however, lies in moving the solution of the problem closer to the problem. Programming languages are for programmers, people trained in a specific skill. People who do not have that training should have applications packages and generators. Remember, it wasn't that long ago that when we wanted a simple graph we used to write a program. Today it would be rare to do that. The point being, that trying to make programmers out of everyone (typically by designing languages so easy to use even "your secretary" could program...that was absolutely beyond a doubt the typical sexist claim, I was there) was a strange, 1970's dream that by and large has become unnecessary. Programming is a skill, like driving a semi, most people shouldn't need that skill.
jcburt@ipsun.larc.nasa.gov (John Burton) (09/15/90)
>Also, most students have problems with pointers because their teachers >didn't understand them. No joke. Somewhere they became these bogey-men >and teachers in intro courses would stand up there and give all these >signals to the class that what s/he is about to teach is impossible to >understand so here we go... > >I just told them funny stories about the confusions between the thing >and the thing contained ("The White House said today...") and went on >with it and never had any problems. A location is not really a hard >problem, houses have addresses etc, pigeon-holes with box numbers and >so forth. The hard problem was lousy teachers with worse attitudes. > I couldn't agree more...I guess the point I was trying to make was that pointers are NOT difficult to understand AND they provide much needed flexibility...If a programmer does not understand them, most languages provide useful alternatives for them to use...but don't take them away from people who understand and can use them effectively... > >I think the future of programming, however, lies in moving the >solution of the problem closer to the problem. Programming languages >are for programmers, people trained in a specific skill. People who do >not have that training should have applications packages and >generators. > >[...stuff deleted...] > >Programming is a skill, like driving a semi, most people shouldn't >need that skill. Exactly!!! Programmers should help provide the tools for non-programmers to use...Programming languages SHOULD NOT be restricted to provide safety for novice/non-programmers at the expense of those that can benefit from the flexibility of "dangerous" attributes such as pointers... John (jcburt@cs.wm.edu) (jcburt@ipsun.larc.nasa.gov)
mst@vexpert.dbai.tuwien.ac.at (Markus Stumptner) (09/17/90)
From article <1990Sep14.160429.2732@abcfd20.larc.nasa.gov>, by jcburt@ipsun.larc.nasa.gov (John Burton):
>>This leads us to pointers. Just about everything about C pointers is
>>bad. From the fact that pointers are hopelessly confused with arrays
>>(which are completely separate conceptually) to the syntax of pointer
>>use, C's pointers are a mess.
>>Since this is comp.society.futures, I will discuss pointer replacements.
>>Essentially, pointers only do three things for you: 1) recursive data
>>structures (graphs, trees, etc....); 2) dynamic memory; and 3) run-time
>>'equivalence'. C pointer arithmetic only does what one dimensional array
>>indexing already does (scaled address calculations): arrays are better for
>>this - so it's _not_ counted as one of the features of pointers.
>
> Sorry, but it IS more expensive (execution time wise) to sequentially
> index through an array, than it is to simply increment a pointer...
> Array indexing is better for randomly accessing the array...
> just because random access is better for some things, should we
> totally do away with sequential access? come on, be fair...

I have a friend who works for CDC. The current series of graphics workstations sold by CDC (to my knowledge, very similar to Silicon Graphics machines) are based on the MIPS RISC chip family and use heavily optimizing compilers. One day, while leafing through a C compiler manual for the system (I don't know what manual it was exactly, have never seen it), he discovered to his amazement that the programming guidelines include the following rules:

- Do not use the increment and decrement operators (++ and --)

- Do not use pointer incrementing for sequential array access

This was a year ago, and my memory is fuzzy.
As far as I remember, according to the manual, in most cases using ordinary assignment/expression syntax and incrementing an array index will be MUCH FASTER since the compiler has more freedom in keeping values in registers instead of having to store them back in memory immediately. I confess I have been very amused by this. Can anyone support it? The MIPS architecture is used by DEC and lots of other manufacturers. Perhaps somebody else has stumbled on this. What about other RISC architectures? Perhaps there is still hope for high-level languages... Markus Stumptner mst@vexpert.at Technical University of Vienna vexpert!mst@uunet.uu.net Paniglg. 16, A-1040 Vienna, Austria ...mcsun!vexpert!mst
jlg@lanl.gov (Jim Giles) (09/18/90)
From article <1990Sep14.160429.2732@abcfd20.larc.nasa.gov>, by jcburt@ipsun.larc.nasa.gov (John Burton):
> [...]
> BTW: since the early 70's it has also been shown that the total LACK
> of GOTO's is also bad. Selective use of GOTO is the key.

Yes, I believe I mentioned that in the article to which you are responding. Although I have not been able to find any evidence that GOTO-less is actually _bad_, there were several experiments that showed that disciplined use of GOTOs was not worse (or better) than "Structured" coding in any statistically significant way. The conclusion of most researchers was that all the "Structured" alternatives to GOTO should be provided in a language - and GOTO should also be provided just in case.

> [...]
> Sorry, but it IS more expensive (execution time wise) to sequentially
> index through an array, than it is to simply increment a pointer...

Sorry, but it is not. The compiler technology to tell that array indexing is semantically _identical_ to the pointer incrementing scheme to which you are referring is about 30 years old: if your compiler is _that_ far behind the state-of-the-art, you got rooked. (The optimization is called "constant folding". The address of the array is added to the initial value of the array index variable at _compile_ time - the interior of the loop (or wherever) uses this combined value as your C program would use a pointer - including the simple increment as the loop progresses.)

> [...]
>>Recursive data structures are best implemented directly [...]
> [...]
> Not using pointers would make life easier for a novice programmer. It would
> severely limit the experienced programmer...

The method I gave is semantically _identical_ to using pointers. The only difference is the lack of the need for dereferencing. There is no functionality that C can perform that the features I gave cannot also perform.
There is no reason (at the present state-of-the-art) for the compiler to generate less efficient code than using explicit pointers would use. There is (at the present state-of-the-art) excellent reason to expect that the features I propose could be _MORE_ efficiently implemented since the presence (or absence) of aliasing is easier to detect. > [...] > C was designed to accomplish tasks quickly. The often made comparison > of indexing through a 1-dimensional array (linear representation of > a multidimensional array) instead of incrementing pointers is not strictly > valid. For every array access, there is a corresponding index calculation > (usually a multiplication and an addition) to determine where to look for > the data. Incrementing a pointer is faster (generally a register increment > operation). [...] As I pointed out, the modern (more recent than the late 50's) compiler can eliminate the addition you refer to. The multiplication is only needed for multidimensional arrays - which C doesn't, strictly speaking, even have. The multiply should also be eliminated by a modern (more recent than the early 60's) compiler - the technique is called "strength reduction". If your compiler doesn't have it, you been rooked again. > [...] > When asked the question, "what is the *best* programming language?" the > answer should be "it depends on what you want to do..." The first correct thing you've said. However, you have not made a convincing argument that the answer should _ever_ be C - no matter what the application is. J. Giles
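The "strength reduction" named above can be shown by doing it by hand. The sketch below (function names invented for illustration) walks one column of a matrix stored as a flat one dimensional array, first with a multiply on every access and then with the multiply replaced by a running offset. It assumes row-major layout, and it shows only the transformation itself, not what any particular compiler does.

```c
#include <stddef.h>

/* Naive form: one multiply and one add for every element access. */
long sum_column_multiply(const int *matrix, size_t rows, size_t cols,
                         size_t col)
{
    long total = 0;

    for (size_t row = 0; row < rows; row++)
        total += matrix[row * cols + col];
    return total;
}

/* Strength-reduced form: the multiply is replaced by a running
   offset that advances by 'cols' on each trip through the loop. */
long sum_column_reduced(const int *matrix, size_t rows, size_t cols,
                        size_t col)
{
    long total = 0;
    size_t offset = col;

    for (size_t row = 0; row < rows; row++) {
        total += matrix[offset];
        offset += cols;
    }
    return total;
}
```

Both functions visit the same elements in the same order, so they always return the same sum; the second is simply the first with row * cols + col maintained incrementally.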
jlg@lanl.gov (Jim Giles) (09/18/90)
From article <3643@vexpert.dbai.tuwien.ac.at>, by mst@vexpert.dbai.tuwien.ac.at (Markus Stumptner):
> [...]
> - Do not use the increment and decrement operators (++ and --)
>
> - Do not use pointer incrementing for sequential array access
>
> This was a year ago, and my memory is fuzzy. As far as I remember,
> according to the manual, in most cases using ordinary
> assignment/expression syntax and incrementing an array index will be
> MUCH FASTER since the compiler has more freedom in keeping values in
> registers instead of having to store them back in memory immediately.

This is quite possibly true. You see, pointers are unrestricted alias generators. If you have a subroutine which (say) copies one array into another:

      for (i=0;i<max;i++) a[i] = b[i];

The compiler probably just does the constant folding and zips through the assignment. If you do the following instead:

      for (lim=a+max; a<lim; *a++ = *b++);   /* the usual idiom */

The compiler probably has to complete each store of 'a' before the load of the next 'b'. Further, since *b may be an alias for itself or for 'a', the values of the pointers are probably stored and reloaded each trip through the loop as well. For example, suppose memory is like this:

      address   content   name
      0199      0200      a
      ...
      0200      0300      b _and_ *a
      ...
      0300      0123      *b

In this case, the arrays don't overlap, but *a points to the place that 'b' is stored - so the first trip through the loop alters the location of the *b array. This is legal in C, the compiler has no way of knowing that the user didn't do this, the compiler must generate code to allow this to happen - in this case: 'b' must be reloaded after each loop trip. Other memory configurations would require other special actions. The only "safe" thing the compiler can do is store/reload _all_ the variables on each loop trip. Of course, this shows that the CDC compiler was wrong.
The two programs given here should both generate the same code (since compiler technology is sufficiently advanced for the compiler to see that both do the same thing - except for setting 'i' and/or 'lim'). However, the compiler should generate the same _slow_ code for both. J. Giles
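The aliasing concern in the post above can be made concrete with a small C sketch. It does not reproduce the exact memory layout from the post; it uses the simpler case of a destination that overlaps its source, and the names `copy_forward`, `overlap_smears_first_value`, and `disjoint_copy_is_exact` are invented for illustration. The point is that the language obliges the compiler to produce the "smeared" result when the arguments overlap, so it cannot freely reorder the loads and stores unless it can prove they do not.

```c
#include <assert.h>
#include <stddef.h>

/* Element-by-element forward copy, written with indexing. Nothing in
   the signature tells the compiler that dst and src are distinct. */
void copy_forward(int *dst, const int *src, size_t n)
{
    assert(dst != NULL && src != NULL);
    for (size_t i = 0; i < n; i++)
        dst[i] = src[i];
}

/* Overlapping case: dst is src shifted up by one element, so each
   store feeds the next load and the first value smears across. */
int overlap_smears_first_value(void)
{
    int buf[5] = { 1, 2, 3, 4, 5 };
    copy_forward(buf + 1, buf, 4);
    return buf[0] == 1 && buf[1] == 1 && buf[2] == 1
        && buf[3] == 1 && buf[4] == 1;
}

/* Disjoint case: the same routine produces an exact copy. */
int disjoint_copy_is_exact(void)
{
    int src[4] = { 1, 2, 3, 4 };
    int dst[4] = { 0, 0, 0, 0 };
    copy_forward(dst, src, 4);
    return dst[0] == 1 && dst[1] == 2 && dst[2] == 3 && dst[3] == 4;
}
```

Later revisions of C (C99) added the `restrict` qualifier so that a programmer can promise the compiler that such pointers do not overlap; that postdates this thread.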
jlg@lanl.gov (Jim Giles) (09/18/90)
From article <1990Sep14.212806.8131@abcfd20.larc.nasa.gov>, by jcburt@ipsun.larc.nasa.gov (John Burton):
> [...]
> I guess the point I was trying to make was
> that pointers are NOT difficult to understand AND they provide
> much needed flexibility...

Really? I've been asking this same question for over two years on the net - no one has yet answered it: Please give me a specific example of a _legal_ C data structure which _cannot_ be implemented with the same efficiency with the data structuring features below. Note, there is not a _single_ explicit pointer data type in the following list.

1) 'Atomic' types (floats (various sizes), ints (various sizes), booleans, characters (various character sets), etc....)

2) Enumerated data types. These are simply a way to allow the user to invent a new 'atomic' type.

3) Arrays. Mappings from a tuple of indices to a typed value. Note: an array of arrays is legal and is _NOT_ the same as a 2-d array (although a little syntactic sugar could allow this later - no one has ever asked for it).

4) Sequences. An ordered collection of zero or more objects to be accessed in a specific order. Obvious syntactic sugar (like direct referencing of the last element or an arbitrary element) is permitted. The usual implementation of character strings in C is an example of an inefficient implementation of a sequence. You can have a sequence of any data type (including a sequence of sequences - which is what a dictionary is).

5) Records. Like C struct. No difference at all really.

6) Unions. These are _always_ discriminated. The compiler is responsible for maintaining and checking the type tags. Note that this only _seems_ inefficient: _legal_ C programs should always explicitly maintain a tag anyway.

7) Recursive types. These may be given the attribute 'aliased' in order to allow circular and overlapping references. Other than that, we have discussed these before.
In addition, all variables can be declared with a 'dynamic' attribute, which means that they must be allocated before use (dynamic arrays give their size at allocation time). It might be desirable for sequences and recursive data types to be given the dynamic attribute automatically. I can demonstrate sample syntax for these if anyone thinks it is required. Anyone who proposes a C data object that he claims is not representable here is invited to do so (I'm not joking - I'm designing a language with these features - this challenge is an attempt to find out whether I'm leaving something out). > [...] >>Programming is a skill, like driving a semi, most people shouldn't >>need that skill. > > Exactly!!! Programmers should help provide the tools for non-programmers > to use...Programming languages SHOULD NOT be restricted to provide > safety for novice/non-programmers at the expense of those that can > benefit from the flexibility of "dangerous" attributes such as pointers... > [...] I agree completely ... except with the last line. Pointers are a bad example of the philosophy you are discussing. The presence of pointers is an example of something useful having been _left_out_ of programming languages. Making me use pointers when what I really _want_ is dynamic memory, or arrays, or sequences, or recursive data structures, etc., is like forcing the semi driver to use a crank to start his truck. We give semi trucks electric starters because it is a simple, reliable, and easy to use replacement for the crank (in the role of truck starter anyway). In fact, the electric starter is _so_ good, they no longer bother providing cranks for trucks (or any way to use a crank even if you had one). In fact, truck engines are now so large and heavy, you couldn't turn one over by hand anyway - the presence of the starter has made it possible to design larger and more powerful trucks than would otherwise be possible.
A programming language should be designed with simple, reliable, and easy to use replacements for hardware level concepts as well - or would you rather have only conditional jumps for flow control and bit twiddling for data? The presence of more powerful language features should allow programmers to concentrate on _what_ they want the code to do, not _how_ the machine does it internally. This should make possible larger and more powerful programs than are presently feasible with sufficient reliability. J. Giles
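Item 6 in the list above says that legal C programs end up maintaining a union's tag by hand anyway. A minimal C sketch of that hand-maintained discipline follows; the type and function names (`struct value`, `make_int`, `make_text`, `get_int`) are hypothetical and chosen only for this illustration. In the proposed language the compiler would do this bookkeeping itself.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* The tag records which union member is currently valid. */
enum tag { TAG_INT, TAG_TEXT };

struct value {
    enum tag tag;
    union {
        int  i;
        char text[16];
    } u;
};

/* Constructors set the tag and the matching member together. */
struct value make_int(int i)
{
    struct value v;
    v.tag = TAG_INT;
    v.u.i = i;
    return v;
}

struct value make_text(const char *s)
{
    struct value v;
    assert(s != NULL);
    v.tag = TAG_TEXT;
    strncpy(v.u.text, s, sizeof v.u.text - 1);
    v.u.text[sizeof v.u.text - 1] = '\0';   /* always terminated */
    return v;
}

/* Checked accessor: returns 1 and writes *out if the value holds an
   int, returns 0 (leaving *out alone) if the tag does not match. */
int get_int(const struct value *v, int *out)
{
    assert(v != NULL && out != NULL);
    if (v->tag != TAG_INT)
        return 0;
    *out = v->u.i;
    return 1;
}
```

Nothing stops a C caller from bypassing `get_int` and reading `v.u.i` directly, which is the gap a compiler-checked discriminated union is meant to close.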
pirinen@cc.helsinki.fi (09/18/90)
From article <1990Sep14.160429.2732@abcfd20.larc.nasa.gov>, by jcburt@ipsun.larc.nasa.gov (John Burton): > Sorry, but it IS more expensive (execution time wise) to sequentially > index through an array, than it is to simply increment a pointer... > Array indexing is better for randomly accessing the array... > just because random access is better for some things, should we > totally do away with sequential access? come on, be fair... Pointer dereferencing, as such, is not sequential, it is in fact more random than array indexing (can you say "aliasing"?). A loop over an array that uses indexing can be compiled using pointer incrementing -- it's a standard compiler technique. In article <3643@vexpert.dbai.tuwien.ac.at>, mst@vexpert.dbai.tuwien.ac.at (Markus Stumptner) writes: > [C compiler manual for MIPS RISC chip-based computer says:] > - Do not use pointer incrementing for sequential array access This doesn't surprise me a bit: modern chips have been designed to execute high-level languages effectively, array indexing being a case in point. Intel 80286, 386, and i486 data sheets say that all indirect memory addressing modes take an equal number of clocks, including scaled indexed addressing. This eliminates the supposed advantage of pointer incrementing in most cases, if the compiler didn't already. Pekka P. Pirinen University of Helsinki pirinen@cc.helsinki.fi pirinen@finuh.bitnet ..!mcvax!cc.helsinki.fi!pirinen Read my Lisp: no new syntax! -nil
pirinen@cc.helsinki.fi (09/18/90)
In article <1990Sep14.212806.8131@abcfd20.larc.nasa.gov>, jcburt@ipsun.larc.nasa.gov (John Burton) writes: > I guess the point I was trying to make was > that pointers are NOT difficult to understand AND they provide > much needed flexibility...If a programmer does not understand them, > most languages provide useful alternatives for them to use... Except C, Pascal, etc. > but don't take them away from people who understand and can use them > effectively... I agree pointers can be used effectively. It would be interesting to program in a language that had pointers AND a useful alternative, to see how often one would choose each. Are there any such languages? > Programming languages SHOULD NOT be restricted to provide > safety for novice/non-programmers at the expense of those that can > benefit from the flexibility of "dangerous" attributes such as pointers... Where does this idea of C-hackers come from, that only novices need safety? I'm no novice (10 years of programming), and I want all the safety I can get. I'm sick and tired of debugging for hours to find simple errors that could have been caught at the expense of a few seconds of the compiler's time. Programmers are not machines, even good programmers make simple mistakes. Pekka P. Pirinen University of Helsinki pirinen@cc.helsinki.fi pirinen@finuh.bitnet ..!mcvax!cc.helsinki.fi!pirinen Read my Lisp: no new syntax! -nil
brendan@batserver.cs.uq.oz.au (Brendan Mahony) (09/18/90)
jlg@lanl.gov (Jim Giles) writes:
- 1) 'Atomic' types (floats (various sizes), ints (various sizes), booleans,
- characters (various character sets), etc....)
- 2) Enumerated data types. These are simply a way to allow the user to
- invent a new 'atomic' type.
- 3) Arrays. Mappings from a tuple of indices to a typed value. Note: an
- array of arrays is legal and is _NOT_ the same as a 2-d array (although
- a little syntactic sugar could allow this later - no one has ever asked
- for it).
- 4) Sequences. An ordered collection of zero or more objects to be
- accessed in a specific order. Obvious syntactic sugar (like direct
- referencing of the last element or an arbitrary element) is permitted.
- The usual implementation of character strings in C is an example of an
- inefficient implementation of a sequence. You can have a sequence of
- any data type (including a sequence of sequences - which is what a
- dictionary is).
- 5) Records. Like C struct. No difference at all really.
- 6) Unions. These are _always_ discriminated. The compiler is
- responsible for maintaining and checking the type tags. Note that
- this only _seems_ inefficient: _legal_ C programs should always
- explicitly maintain a tag anyway.
- 7) Recursive types. These may be given the attribute 'aliased' in order
- to allow circular and overlapping references. Other than that, we have
- discussed these before.
- In addition, all variables can be declared with a 'dynamic' attribute,
- which means that they must be allocated before use (dynamic arrays give
- their size at allocation time). It might be desirable for sequences
- and recursive data types to be given the dynamic attribute automatically.
- I can demonstrate sample syntax for these if anyone thinks it is required.
- Anyone who proposes a C data object that he claims is not representable
- here is invited to do so (I'm not joking - I'm designing a language with
- these features - this challenge is an attempt to find out whether I'm
- leaving something out).
Yes you are. You are leaving out memory mapped I/O and operating system
vectors and other disgusting kludges that make the computing world go
round. Other than that you are spot on!
--
Brendan Mahony | brendan@batserver.cs.uq.oz
Department of Computer Science | heretic: someone who disagrees with you
University of Queensland | about something neither of you knows
Australia | anything about.
KPURCELL@liverpool.ac.uk (Kevin Purcell) (09/18/90)
On 17 Sep 90 08:21:00 GMT eru!hagbard!sunic!mcsun!tuvie!vexpert.dbai.tuwien.ac. (eru!hagbard!sunic!mcsun!tuvie!vexpert.dbai.tuwien.ac.%!mst@edu.mit.bl) said:
>From article <1990Sep14.160429.2732@abcfd20.larc.nasa.gov>, by
> jcburt@ipsun.larc.nasa.gov (John Burton):
[stuff about pointers and merits versus multiply and add indexing of arrays]
>
> ... he discovered to his amazement that the programming
>guidelines include the following rules:
>
> - Do not use the increment and decrement operators (++ and --)
>
> - Do not use pointer incrementing for sequential array access
>
>This was a year ago, and my memory is fuzzy. As far as I remember,
>according to the manual, in most cases using ordinary
>assignment/expression syntax and incrementing an array index will be
>MUCH FASTER since the compiler has more freedom in keeping values in
>registers instead of having to store them back in memory immediately.

If the machine has access to a vectorising processor or can run stuff through a pipeline processor very quickly, it is sometimes better to avoid the explicitly fast pointer idiom (fast compared to, say, a PDP-11) and just say what you really want. The compiler is probably better at picking up this form to vectorise or unroll. For example, on some machines,

      for(i=0; i<5; i++) a[i] = 0.0;

might execute faster as an unrolled loop:

      a[0] = 0.0; a[1] = 0.0; a[2] = 0.0; a[3] = 0.0; a[4] = 0.0;

or it may get dispatched to, say, 5 CPUs in a vectorised processor. Writing it as:

      i = 5; ap = a; while(i--) *ap++ = 0.0;

may not be pulled out by the optimiser. In this case I think most optimisers would find it, but I can imagine some slightly more complex constructs that might fool them. There is a simple review of what RISC compilers do in UnixWorld Aug 1990 that expands on these ideas.

>
>Markus Stumptner mst@vexpert.at
>Technical University of Vienna vexpert!mst@uunet.uu.net
>Paniglg. 16, A-1040 Vienna, Austria ...mcsun!vexpert!mst

I fear we are in danger of drifting into alt.religion.comp or even comp.lang.c territory and away from futures.

Kevin Purcell | kpurcell@liverpool.ac.uk Surface Science, | Liverpool University | Programming the Macintosh is easy if you understand Liverpool L69 3BX | how the Mac works and hard if you don't. -- Dan Allen
jeremy@ultima.socs.uts.edu.au (Jeremy Fitzhardinge) (09/19/90)
In comp.society.futures you write: |From article <1990Sep14.212806.8131@abcfd20.larc.nasa.gov>, by jcburt@ipsun.larc.nasa.gov (John Burton): |> [...] |> I guess the point I was trying to make was |> that pointers are NOT difficult to understand AND they provide |> much needed flexibility... | |Really? I've been asking this same question for over two years |on the net - no one has yet answered it: Please give me a specific |example of a _legal_ C data structure which _cannot_ be implemented |with the same efficiency with the data structuring features below. |Note, there is not a _single_ explicit pointer data type in the |following list. | |1) 'Atomic' types (floats (various sizes), ints (various sizes), booleans, | characters (various character sets), etc....) | |2) Enumerated data types. These are simply a way to allow the user to | invent a new 'atomic' type. | |3) Arrays. Mappings from a tuple of indices to a typed value. Note: an | array of arrays is legal and is _NOT_ the same as a 2-d array (although | a little syntactic sugar could allow this later - no one has ever asked | for it). | |4) Sequences. An ordered collection of zero or more objects to be | accessed in a specific order. Obvious syntactic sugar (like direct | referencing of the last element or an arbitrary element) is permitted. | The usual implementation of character strings in C is an example of an | inefficient implementation of a sequence. You can have a sequence of | any data type (including a sequence of sequences - which is what a | dictionary is). How does this differ from an array (or vice versa)? Is it that a sequence is a completely dynamic object, where arrays are created with a specific size (whether it be at compile or run time)? You seem to imply this later. Is an object in a sequence of any type? Do all objects have to be the same type? I guess a sequence of unions would achieve that effect while still only allowing a sequence of one type. |5) Records. Like C struct.
No difference at all really. | |6) Unions. These are _always_ discriminated. The compiler is | responsible for maintaining and checking the type tags. Note that | this only _seems_ inefficient: _legal_ C programs should always | explicitly maintain a tag anyway. Careful selection of the tagging mechanism would be needed, I suppose. At a guess you would have some sort of tag field in the union that has a number representing the current type of the union. Since the union can be of any type (simple and derived) the actual tags used would have to be decided at compile time. This would make linking individually compiled modules and libraries using shared union types difficult, since they would all have to use the same tagging convention. I think it should become a task of the linker to organize this kind of thing. Have something like a "tag table" along side the "symbol table", and treat them similarly. Scoping of [union] types would have to be handled like scoping of variables, with similar name-space conflicts. Perhaps I'm taking a too C oriented approach, but this seems to accord with current practice with practical languages I know/use. |7) Recursive types. These may be given the attribute 'aliased' in order | to allow circular and overlapping references. Other than that, we have | discussed these before. | |In addition, all variables can be declared with a 'dynamic' attribute, |which means that they must be allocated before use (dynamic arrays give |their size at allocation time). It might be desireable for sequences |and recursive data type to be given the dynamic attribute automatically. | |I can demonstrate sample syntax for these if anyone thinks it is required. |Anyone who proposes a C data object that he claims is not representable |here is invited to do so (I'm not joking - I'm designing a language with |these features - this challenge is an attempt to find out whether I'm |leaving something out). 
How would you handle what is currently handled by pointers to functions in C? I'm primarily a C programmer, but I certainly don't want to let myself be caught by "How can I do it in C" as opposed to "How can it be done". For the things I do (OS hacks, graphics, realtime interactive) I've found C to be the most useful since it is a simple language that can be found on a wide range of machines, or tends to be the best supported/implemented on those machines. Recently I've been teaching myself C++ and found it to fill a lot of gaps and problems in C (although I hadn't noticed them until I used C++). No doubt there are other languages I will come across that have features I want in C++. I don't think they will be FORTRAN or COBOL. -- Jeremy Fitzhardinge: jeremy@ultima.socs.uts.edu.au jeremy@utscsd.csd.uts.edu.au DEATH TO ALL FANATICS!
jlg@lanl.gov (Jim Giles) (09/20/90)
From article <4905@uqcspe.cs.uq.oz.au>, by brendan@batserver.cs.uq.oz.au (Brendan Mahony): > [...] > Yes you are. You are leaving out memory mapped I/O and operating system > vectors and other disgusting kludges that make the computing world go > round. Other than that you are spot on! Perhaps you can be kind enough to point out the reason I need pointers (or anything else that's not on my list) to provide the functionality you mention. The first memory mapped I/O I ever used was done in Fortran. And not an extended Fortran either - passing arrays with call-by-reference is quite adequate to tell the system where my I/O buffer is to be. J. Giles
jlg@lanl.gov (Jim Giles) (09/20/90)
From article <18377@ultima.socs.uts.edu.au>, by jeremy@ultima.socs.uts.edu.au (Jeremy Fitzhardinge):
> In comp.society.futures you write:
> [...]
> |4) Sequences. An ordered collection of zero or more objects to be
> | accessed in a specific order. [...]
>
> How does this differ from an array (or vice versa)? Is it that a sequence
> is a completely dynamic object, where arrays are created with a specific
> size (whether it be at compile or run time)? You seem to imply this later.
> Is an object in a sequence of any type? Do all objects have to be the same
> type? I guess a sequence of unions would achieve that effect while still
> only allowing a sequence of one type.

The answers to your questions are: 1) they're different (see below); 2) exactly, arrays are fixed size/shape, sequences are always one-d and variable length (initialized empty unless the declaration does an initialization); 3) and 4) "sequence" is a declaration attribute which can be applied to any type - all elements of a sequence have the same type; 5) exactly, all the elements of a sequence can be in the same union type - the union can be collections of any types. Examples:

      Integer sequence :: x            !empty sequence of integers
      Integer sequence(0:256:16) :: y  !empty sequence, no space initially
                                       !allocated (the zero), max space is
                                       !256 elements, allocate in 16 element
                                       !chunks.
      ASCII sequence :: s              !empty character string - ASCII
      char sequence :: ss = "abc"      !character string in native character
                                       !set (which may or may not be ASCII),
                                       !initial value is three long "abc".
                                       !quotes is syntactic sugar for character
                                       !sequence types.
      type u_test is union(integer, ASCII sequence)
                                       !declares a union type where members
                                       !are integers or ASCII strings
      u_test sequence :: directory     !directory is a sequence of ints or
                                       !ASCII sequences (each element may differ).
      ...
      s = ss                           !native character set is automatically
                                       !converted to ASCII
      ss = ss | "def"                  !concatenate is '|', ss is "abcdef"
      ss(2:4) = "xyz"                  !substring usage, ss is "axyzef"
      x = (1,3,5)                      !parentheses are sequence constructor
                                       !for non-character types
      ...

Anyway, I think you get the picture.

> [... unions ...]
> Careful selection of the tagging mechanism would be needed, I suppose.
> At a guess you would have some sort of tag field in the union that has
> a number representing the current type of the union. Since the union
> can be of any type (simple and derived) the actual tags used would have
> to be decided at compile time. This would make linking individually
> compiled modules and libraries using shared union types difficult, since
> they would all have to use the same tagging convention. I think it
> should become a task of the linker to organize this kind of thing. Have
> something like a "tag table" along side the "symbol table", and treat
> them similarly. Scoping of [union] types would have to be handled like
> scoping of variables, with similar name-space conflicts.
> [...]

The type tags would indeed be in the representation of the object itself. As a practical matter, the union object would probably consist of a type tag and a reference (pointer) to the data that represents the object. This would permit arrays and sequences (etc.) of unions to be allocated without having to worry about variable space depending on the type of object actually stored. Note: this use of pointers would be hidden from the user's view and subject to stronger compiler control so that it shouldn't raise any concerns about aliasing and other abuse - this is no different than the fact that IF/THEN/ELSE uses GOTOs internally. The scoping of user defined types of all flavors (not just unions) is a problem that the linker has to worry about (or <shudder> the run-time environment).
Actually, the proper use of interface specifications and the requirement that the type declarations in the caller match the ones in the callee would simplify the loader's work in this regard. This would also be a safer approach from the user's point of view, since it would make the type assumptions the user is making very explicit. As a practical matter, the type definitions and function prototypes could be contained in an include file so that the user would not be forever retyping them.

> [...]
> How would you handle what is currently handled by pointers to functions
> in C? [...]

Functions are an 'atomic' data type. Their attributes include the number and type of their arguments as well as the type of their result (if any). A variable of type 'function' can be declared and assigned to. Function definitions declare a named function with a 'constant' attribute. A possible syntax is:

      !comment sin() and cos() are the usual trig functions
      !comment the exclamation point begins a comment
      !end-of-line ends a comment
      float function (float) :: x   ! x is a variable whose type
                                    ! is the same as sin() and cos()
      if (some_condition) then
         x=sin    !function name without arg list causes no evaluation
      else
         x=cos
      endif
      ...
      ans = x(0.123)   !does either sin(0.123) or cos(0.123)

Syntactic/semantic sugar (such as adding function constructors, etc.) would allow adding complete functional language support. No pointers are implied here - not necessarily even in the low-level implementation - the assignment _could_ actually copy the code for the function. As a practical matter, function assignment should only copy the local variable space of the function and use a pointer to the code. This would permit, for example, multiple copies of a random number generator with a different seed in each. J. Giles
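The closing remark above, that assigning a function value could copy the function's local state while sharing one copy of the code, can be approximated in today's C with a struct that pairs private state with a function pointer. This is only a sketch of the idea in C terms, not the proposed language; the names `struct generator`, `lcg_step`, `make_generator`, and `next_value` are hypothetical, and the generator constants are illustrative rather than a recommendation.

```c
#include <assert.h>
#include <stddef.h>

/* A "function value": a pointer to shared code plus a private copy of
   the state that code works on. */
struct generator {
    unsigned long state;
    unsigned long (*step)(unsigned long *state);
};

/* One linear congruential step, reduced modulo 2^31. */
static unsigned long lcg_step(unsigned long *state)
{
    *state = (*state * 1103515245UL + 12345UL) % 2147483648UL;
    return *state;
}

/* Build a generator with its own seed; the code pointer is shared. */
struct generator make_generator(unsigned long seed)
{
    struct generator g;
    g.state = seed % 2147483648UL;
    g.step = lcg_step;
    return g;
}

/* Advance one generator without touching any other. */
unsigned long next_value(struct generator *g)
{
    assert(g != NULL && g->step != NULL);
    return g->step(&g->state);
}
```

Plain struct assignment (`struct generator c = a;`) then behaves like the function assignment described in the post: `c` gets its own copy of the state and continues independently of `a`, while both call the same code.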
bson@AI.MIT.EDU (Jan Brittenson) (09/20/90)
So how would you propose to accomplish the following, for example, without pointers or pointer arithmetic?

1. Pointer range check (see if a buffer crosses page boundaries, for instance).
2. Calculate physical addresses for DMA controllers.
3. Sort a linked list on addresses of some data pointed to from within the node. Or to keep it sorted as new (addresses of) data is added.
4. Implement malloc()/free().
5. Read and write addresses from/to pipes.

I wonder whether any compiler can be designed to successfully determine when to duplicate data and when to use a reference. As a rule, it's bad practise to duplicate data other than in the rare occasions when an explicit duplicate is needed. Usually any processing of the data constitutes duplication in itself, with the exact type/size of the resultant duplicate depending on the interpretation of the original.

I'm curious as to why so many programmers engage themselves in hot debates over how to best implement strings. String processing is proportionally insignificant - the first thing done after a read is usually a tokenization, either through hand-written code or the output of a lexical front-end generator. The string is nothing but a character buffer, irrelevant to the data or command itself - it's only the written/read representation of it, and can be discarded once tokenized. Strings held on to rarely need any further processing, they mostly get passed around by reference, or located through hash tables.

Just like some algorithms cannot be reasonably coded without gotos - the alternative would likely be even worse - some operations on data or certain functionality cannot reasonably be performed without pointers. Certainly, all data types can be expressed without pointers - but so can they also in everyday English. Declarations and talk are to be distinguished from working programs. (No macho intent.)

Finally, not all people agree with the view that "pointers are the data type equivalent of gotos."
I for one think it's been a little stretched lately. A program like a device driver could probably be coded perfectly well without a single goto, but not without pointer arithmetic. Just my two cents worth. -- Jan Brittenson bson@ai.mit.edu
isr@rodan.acs.syr.edu (Michael S. Schechter - ISR group account) (09/20/90)
In article <63475@lanl.gov> jlg@lanl.gov (Jim Giles) writes: >From article <4905@uqcspe.cs.uq.oz.au>, by brendan@batserver.cs.uq.oz.au (Brendan Mahony): >> Yes you are. You are leaving out memory mapped I/O and operating system >> vectors and other disgusting kludges that make the computing world go >> round. Other than that you are spot on! >And, not an extended Fortran either - passing arrays with call-by-reference >is quite adequate to tell the system where my I/O buffer is to be. Now in C it's easy to go MyPtr=(Pointer)0x100 In Fortran, it's a pain. At least in the ones I've used I'd have to use IPEEK and IPOKE functions to access memory in this way. I suppose it could be done via CALL MYSUB(ivalue,100) SUBROUTINE MYSUB(ivalue,MYADD()) MYADD(0)=ivalue But let's face it, what's the difference between them? None. They both allow you to get into trouble. So why complain about being allowed to do EASILY what you can do in any other language with a little effort? This further illustrates what started the entire thread- FORTRAN is UGLY and a PAIN for things that use a lot of pointers. -- Mike Schechter, Computer Engineer,Institute Sensory Research, Syracuse Univ. InterNet: Mike_Schechter@isr.syr.edu isr@rodan.syr.edu Bitnet: SENSORY@SUNRISE
jlg@lanl.gov (Jim Giles) (09/21/90)
From article <9009201017.AA06087@rice-chex>, by bson@AI.MIT.EDU (Jan Brittenson): > > So how would you propose to accomplish the following, for example, > without pointers or pointer arithmetic? > > 1. Pointer range check (see if a buffer crosses page > boundaries, for instance). Well, without pointers, why do you need a pointer range check? Computing the range of something that doesn't exist seems a little silly. However, your parenthetical remark is of value - unfortunately, it's not possible in _legal_ ANSI C. Pointer arithmetic cannot be carried out past the bounds of an individual object. Pointers to different objects cannot be subtracted or compared _legally_. This is so that pointer arithmetic operations can ignore the segment part of addresses on segmented machines. So, you can't tell with C pointers whether your buffer crosses page boundaries or anything because you can only compare the pointer _within_ the buffer itself - and you don't know the relative position of the beginning of the buffer to page boundaries. I think you had in mind casting the pointer to an int and looking at the raw address - the ANSI standard leaves this process undefined. Now, if you're talking about non-standard extensions to C which would allow you to do this stuff - then any other language can contain the same non-standard extensions. > [...] > 2. Calculate physical addresses for DMA controllers. Why should I care? The system/environment should be able to give me the address if I need it. But, how do I use a raw address anyway? _Standard_ C pointers don't give me any such access. Access to such things as hardware controllers should be privileged to the system - and _it_ can contain machine dependent code - like assembly. > [...] > 3. Sort a linked list on addresses of some data pointed to > from within the node. Or to keep it sorted as new (addresses > of) data is added.
I guess you'll have to tell me how this differs from sorting on the index of the data within an array or sequence. Since the sequence is dynamic, you can add all the elements you wish - and still sort on index. And, once again, the integer value of different pointers is _not_ defined by the ANSI standard - nor is their relative order. > [...] > 4. Implement malloc()/free(). When I found out that the ANSI C standard prohibited comparing/subtracting pointers to different objects, I pointed out on comp.lang.c that malloc() and free() could not be written in _standard_ C. They agreed with me. I pointed out that the ability to use pointers as raw addresses was the only thing of value that C pointers had (in my opinion). They said I was wrong for wanting it, I couldn't do it, that's that. In fact, I'm on the side of the rest of you who _want_ pointers to do raw address calculations - C pointers don't. > [...] > 5. Read and write addresses from/to pipes. Again, standard C can't do this. However, this is also something that the system should provide a clearer, higher-level way to do. > [...] > I wonder whether any compiler can be designed to successfully > determine when to duplicate data and when to use a reference. As a rule, > it's bad practise to duplicate data other than in the rare occasions when > an explicit duplicate is needed. [...] It is even worse programming practice to alias data (by copying references) other than those rare occasions when aliasing is a required part of the algorithm. Inadvertent aliasing leads to many man-hours of unnecessary debugging time. Besides, you aren't paying attention. The list of data structures I gave included an alias attribute. Data types with the alias attribute are assigned by copying the reference instead of the data. Thus, the programmer has explicit control over whether aliasing is allowed or not. And, when it's not, the compiler can detect and signal an error when the programmer inadvertently tries to do it. > [...]
> I'm curious as to why so many programmers engage themselves in hot
> debates over how to best implement strings. String processing is
> proportionally insignificant - the first thing done after a read is
> usually a tokenization, either through hand-written code or the output of
> a lexical front-end generator. [...]

Tokens are also strings (which must be frequently compared and efficiently stored). Symbol tables also contain strings (among other stuff). Text processors usually don't have much data that isn't part of one string or another. This is a vital question to _some_ applications. If it isn't for you, so be it. But don't question the need that others feel for strings - they may know something about their application that you don't.

> [...]
> Just like some algorithms cannot be reasonably coded without gotos -
> the alternative would likely be even worse - some operations on data or
> certain functionality cannot reasonably be performed without pointers.
> [...]

I still wish you'd tell me what those applications are. Absolute raw addresses aren't even in C (though, I think that for the systems programmer, they are absolutely necessary - not for anyone else though). I'm still looking for a _legal_ C application that can't be done with the 7 data structuring tools I gave.

J. Giles
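The page-boundary test from item 1 is the kind of thing both sides have in mind when they talk about "looking at the raw address". As a rough sketch only: the helper below is hypothetical (the name `crosses_page` and the `page_size` parameter are not from the thread), it relies on `uintptr_t`, which arrived in a later revision of the C standard than the one being argued about here, and it assumes a flat address space in which the converted integer is the byte address. The result of the pointer-to-integer conversion is left to the implementation, which is exactly the portability gap under discussion.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: reports whether the bytes [buf, buf + len) fall on
 * more than one page of page_size bytes.
 * Assumption: converting a pointer to uintptr_t yields a flat byte
 * address. The C standard does not promise this on every machine. */
static int crosses_page(const void *buf, size_t len, uintptr_t page_size)
{
    assert(page_size != 0);
    if (len == 0)
        return 0;                          /* an empty buffer crosses nothing */
    uintptr_t first = (uintptr_t)buf;      /* address of the first byte */
    uintptr_t last  = first + (len - 1);   /* address of the last byte */
    return (first / page_size) != (last / page_size);
}
```

A caller would pass the buffer, its length, and whatever page size the system reports; on a segmented or otherwise unusual machine the answer may simply be meaningless.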
jlg@lanl.gov (Jim Giles) (09/21/90)
From article <1990Sep20.161852.22977@rodan.acs.syr.edu>, by isr@rodan.acs.syr.edu (Michael S. Schechter - ISR group account):
> [...]
> Now in C it's easy to go MyPtr=(Pointer)0x100 [...]

Well, yes, you can do that in some extended versions of C no doubt. You could extend Fortran to do such things too. But, _standard_ C leaves the result of that cast undefined. For example, on a segmented machine, the above statement _may_ put the number 0x100 into the offset part of MyPtr - leaving the segment part of the pointer unchanged - and _still_ be standard conforming C. This is fine if it is what you want, but you may have intended that the statement set the segment address as well as the offset.

> [...] In fortran, it's a
> pain. [...]

In _standard_ C it's a pain. Some extended Fortrans I've seen will let me declare an array and base it at any hardware address I want. Some PC users do this to allow them to address display memory as a simple 2-d array. Presumably, _extended_ versions of C can do the same. But, we were talking about _legal_ uses of the language as defined by the appropriate language definition.

> [...]
> This further illustrates what started the entire thread-
> FORTRAN is UGLY and a PAIN for things that use a lot of pointers.

Perhaps it's the use of pointers that's "UGLY and a PAIN". It's maybe less ugly in C, but it's usually not necessary elsewhere. In fact, it's usually better not to use pointers at all if they can be avoided.

J. Giles
bzs@WORLD.STD.COM (Barry Shein) (09/21/90)
LISP is a good example of a language which has rarely included any explicit pointers yet has been made to do most anything. I've seen LISPs with explicit pointer types (address foo) but I've never seen that do anything which a dynamically allocated, type-coercible array can't do (e.g. load object code into an array and then set its type to compiled-function.) A nice example is the (PD) Franz Lisp arrays package, implemented entirely in LISP except for one or two little malloc-like primitives.

Of course, LISP has its own problems (which might make for interesting "futures" discussions), notably its run-time interpreter overhead making it difficult to deliver layered products w/o first making the customer buy an entire LISP system.

For example, I have friends with LISP products they'd like to sell for, say, $995/workstation. But the customer first has to buy a $2500 (or more) LISP to load the package into (and the customer may have no interest in the LISP.) This has really stifled the market for low-cost software implemented in LISP (not to mention that you're also dependent on the OS and LISP system continuing to work together thru upgrades which is often not the case, and the LISP vendor wants $$$ for new releases.)

        -Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD
mwm@DECWRL.DEC.COM (Mike Meyer, My Watch Has Windows) (09/21/90)
>> I think you had in mind casting the pointer to an int and looking at
>> the raw address - the ANSI standard leaves this process undefined.

You keep acting like "undefined" means "unusable". That's not the case. It means that the compiler can do whatever it wants, which may or may not be useful in your environment. If there's an environment where the C compiler does something useful that some "replacement" can't do, then that replacement is lacking.

One of the reasons C became popular was because operations were undefined originally, and compiler writers were free to do whatever made sense on their machines. The ANSI standard holds true to that spirit, but marks such areas clearly, so that you have a fighting chance of writing code that will port to multiple environments.

Used in that light, as a high level, semi-portable assembler (aka a "systems" language), C is acceptable. If you're working on a replacement for it in that sense, then you have to be able to allow users to do things that they do in C in their environment, even if those things are "undefined" by the standard. I'd be interested in any such attempt.

As a general-purpose programming language, C is old and has a large number of problems (many of which are shared by lots of languages from that era). However, there are a large number of replacements around, and so far, I've seen nothing in your proposals that is new. If you haven't yet, you might take a look at Euclid and people's comments on it to see how some of your proposed features work in practice.

You've as yet to answer my questions about garbage collection, and what constructs would be provided for working with sequences. As an addendum to that list, I'd be interested to know how you would recode the following C sequence with undefined behavior using your proposed constructs so that it would do what its author intended:

    long word ;
    char *pointer ;

    word = '1234' ;
    pointer = (char *) word ;
    printf("%c %c %c %c\n", pointer++, pointer++, pointer++, pointer++) ;

The intent was to determine the byte ordering of the machine the code was running on.

    <mike
pmorris@BBN.COM (Phil Morris) (09/21/90)
>Date: Thu, 20 Sep 90 14:04:29 PDT
>From: Mike Meyer <mwm@decwrl.dec.com>
>Subject: Re: C's sins of commission (was: (pssst...fortran?))
>To: jlg@lanl.gov
>Cc: info-futures@encore.com
[...]
>You've as yet to answer my questions about garbage collection, and
>what constructs would be provided for working with sequences. As an
>addendum to that list, I'd be interested to know how you would recode
>the following C sequence with undefined behavior using your proposed
>constructs so that it would do what its author intended:
>
> long word ;
> char *pointer ;
>
> word = '1234' ;
> pointer = (char *) word ;
> printf("%c %c %c %c\n", pointer++, pointer++, pointer++, pointer++) ;
>
>The intent was to determine the byte ordering of the machine the code was
>running on.
>
> <mike

How about something as simple as:

    #define SZLONG sizeof(long)

    union xxx {
        long word;
        char str[SZLONG];
    } un;

    un.word = '1234';
    printf("%c %c %c %c\n", un.str[0], un.str[1], un.str[2], un.str[3]);

It works on my machine, and the hypothetical language under discussion can handle these constructs (except he didn't mention how to do sizeof(xxx)).

-Phil
--------
Phil Morris (pmorris@dgi0.bbn.com)
Disclaimer: ME? I'm only a non-smoking cat; can't believe a word I meow.
pmorris@BBN.COM (Phil Morris) (09/21/90)
> printf("%c %c %c %c\n", un.str[0], un.str[1], un.str[2], un.str[3]);
Whoops -- stupid assumption -- make that:
int i;
...
for (i = 0; i < SZLONG; i++)
    printf("%c ", un.str[i]);
printf("\n");
Sorry,
-Phil
--------
Phil Morris (pmorris@dgi0.bbn.com)
Disclaimer: ME? I'm only a non-smoking cat; can't believe a word I meow.
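For anyone who wants to try the union idea as a self-contained test, here is a minimal sketch along the same lines. It is not Phil's code: the names `long_bytes` and `byte_order` are made up for illustration, a fixed bit pattern replaces the multi-character constant '1234' (whose value is left to the implementation), and it assumes 8-bit chars. Inspecting an object through unsigned char is one of the few kinds of type punning C does allow.

```c
#include <assert.h>
#include <limits.h>

/* Overlay an unsigned long with its bytes, lowest address first. */
union long_bytes {
    unsigned long word;
    unsigned char byte[sizeof(unsigned long)];
};

/* Hypothetical helper: returns 1 if the least significant byte is stored
 * first, 0 if it is stored last, and -1 for any other layout. */
static int byte_order(void)
{
    union long_bytes u;

    assert(CHAR_BIT == 8);          /* assumption: 8-bit bytes */
    u.word = 0x01020304UL;
    if (u.byte[0] == 0x04)
        return 1;                   /* least significant byte first */
    if (u.byte[sizeof u.byte - 1] == 0x04)
        return 0;                   /* least significant byte last */
    return -1;                      /* mixed or unexpected ordering */
}
```

Looping over all sizeof(long) bytes, as in the correction above, shows the whole layout; this version only classifies the two common cases.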
mwm@DECWRL.DEC.COM (Mike Meyer, My Watch Has Windows) (09/21/90)
>> >The intent was to determine the byte ordering of the machine the code was >> >running on. >> > >> > <mike >> >> How about something as simple as: I goofed. Should have stated the problem, rather than trying to demonstrate it. The problem is with order of evaluation. You generally have to choose between three options: 1) Don't allow side effects, so that normally it isn't critical. 2) Specify it exactly, so you can predict side effects. 3) Leave it "undefined". This problem arises in all expressions, not just function arguments. There are problems with all three solutions. I was curious as to which was going to be taken in this case. <mike
jlg@lanl.gov (Jim Giles) (09/21/90)
From article <9009202104.AA21146@raven.pa.dec.com>, by mwm@DECWRL.DEC.COM (Mike Meyer, My Watch Has Windows):
> [...]
|> You've as yet to answer my questions about garbage collection, and
|> what constructs would be provided for working with sequences. As an
|> addendum to that list, I'd be interested to know how you would recode
|> the following C sequence with undefined behavior using your proposed
|> constructs so that it would do what its author intended:
|>
|> long word ;
|> char *pointer ;
|>
|> word = '1234' ;
|> pointer = (char *) word ;
|> printf("%c %c %c %c\n", pointer++, pointer++, pointer++, pointer++) ;
|>
|> The intent was to determine the byte ordering of the machine the code was
|> running on.

Answered via email. I can post the answer here if there's any interest.

J. Giles
jlg@lanl.gov (Jim Giles) (09/21/90)
From article <9009202211.AA15218@encore.encore.com>, by pmorris@BBN.COM (Phil Morris):
> [...]
|> How about something as simple as:
|>
|> #define SZLONG sizeof(long)
|>
|> union xxx {
|> long word;
|> char str[SZLONG];
|> } un;
|>
|> un.word = '1234';
|> printf("%c %c %c %c\n", un.str[0], un.str[1], un.str[2], un.str[3]);
|>
|>
|> It works on my machine, and the hypothetical language under discussion can handle these
|> constructs (except he didn't mention how to do sizeof(xxx)).

Actually, I prefer sizeof() to be measured in bits, not bytes. And I prefer 'union' to be non-storage order dependent. I actually prefer the use of 'mapping' declarations (such as I described in the first article I posted with this title). Other than that, your solution is much like the one I sent via email to the person who posted the problem.

J. Giles
jlg@lanl.gov (Jim Giles) (09/21/90)
From article <9009202236.AA21344@raven.pa.dec.com>, by mwm@DECWRL.DEC.COM (Mike Meyer, My Watch Has Windows):
> [...]
|> I goofed. Should have stated the problem, rather than trying to
|> demonstrate it.
|>
|> The problem is with order of evaluation. You generally have to choose
|> between three options:
|>
|> 1) Don't allow side effects, so that normally it isn't critical.
|>
|> 2) Specify it exactly, so you can predict side effects.
|>
|> 3) Leave it "undefined".
|>
|> This problem arises in all expressions, not just function arguments.
|> There are problems with all three solutions. I was curious as to which
|> was going to be taken in this case.

Oh, well. If it's a question of side-effects, I oppose them outright. As a practical matter, you can convince most programmers that operators should not have side-effects. But, when it comes to functions, they always demand to be allowed side-effects. So, functions should be locally declared (in the interface or prototype or whatever you choose to call it) if they have side-effects. Those functions that don't can be evaluated in any order in the expression. Functions that _do_ have side-effects should be evaluated in a specific order that can be determined by the user from looking at the source.

Unfortunately, nested function calls and even funny operator precedence rules can require certain constraints on the evaluation order. The compiler has no trouble discovering these, but the user might. This can be hard for the user if several of the operators are "left-associative" while several others are "right-associative". Clearly, operator precedence should be consistent. And there should be as few different precedence levels as possible.

In any case, the order of side-effects _should_not_ be left undefined.

J. Giles
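In C as it stands, the usual way to get the "specific order that can be determined from the source" is to give each side-effecting call its own statement. The sketch below illustrates that workaround only; the `cursor` type and the `next_byte` and `pack2` names are hypothetical and belong to neither C nor the language being proposed in this thread.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read cursor over a byte buffer. */
struct cursor {
    const unsigned char *p;   /* next byte to read */
    size_t left;              /* bytes remaining */
};

/* Has a side effect: advances the cursor by one byte. */
static int next_byte(struct cursor *c)
{
    assert(c->left > 0);
    c->left--;
    return *c->p++;
}

/* Packs two consecutive bytes as (first << 8) | second.
 * Writing (next_byte(c) << 8) | next_byte(c) as one expression would let
 * the compiler make the two calls in either order. Separate statements
 * make the order visible in the source. */
static unsigned pack2(struct cursor *c)
{
    unsigned hi = (unsigned)next_byte(c);   /* always evaluated first */
    unsigned lo = (unsigned)next_byte(c);   /* always evaluated second */
    return (hi << 8) | lo;
}
```

The cost is a couple of named temporaries; the benefit is that a reader can see the order of the side effects without knowing the compiler.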
michels@cs.UAlberta.CA (Michael Michels) (09/21/90)
To put this discussion about C back in the "futures" group I would like to present here my two words, or 64 bits on my machine, as Jim Giles would prefer it :-).

I like C because it allows me to do things I want and am paid to do in a nice and easy way. If I want my code analyzed to death then I can use 'lint' or "syntax" options on the compiler. If I want my code to stay as it is, my C compiler lets me do it. Why should I be forced to write my programs in a cryptic form just because someone else has a different opinion.

As already mentioned, there are other languages that are better for other tasks, but for writing system routines and compilers I cannot imagine a better one. I would like to see Jim Giles write that sort of code in any of the solutions that he proposed :-).

Actually that all may change now that ANSI has started to play with it. I think that one "ADA" is enough :-). Besides, C like any other language evolves and changes as the times change. To standardize it and stop it from evolving is the same as killing it, and I would not like to see that happen.

Another aspect that was touched on was the "wonderful" role of optimizers. Sure, they are getting "better" all the time, but when I hear that the compiler ignores the "register" modifier I get upset. It is like someone telling you that automatic optimization can do a better job than you can.

My view on this subject is that if someone wants to drive TOYOTA let them but if I want to build a FERRARI I should be allowed as well.

In any futaristic languages I would like to see the same things that I like about C. I want to be able to write my programs that do the job and are short and easy to understand. I guess that is the same thing that George Orwell said about writing. Why should writing programs be any different?

Michael Michels
bzs@WORLD.STD.COM (Barry Shein) (09/21/90)
Jim Giles,

Have you looked closely at PL/I? I could probably dig up some old, good, textbook recommendations. It has almost everything exactly as you're describing (including the declaration of dynamic, recursive objects.)

PL/I was definitely a bloated language with far too many rules violating "the law of least astonishment", but it did have a lot of good ideas.

I think PL/I's main problem was that it was perceived as a "pig" of a language in a time when resources were much more dear. The compiler probably needed almost 1MB to run! (In a day when large mainframes had 2MB of real memory and tried to run 100 logged in users, this made it anathema at university computing centers and the rest was history.)

The bit-oriented sizeof() is also in PL/I (which is what touched off this remark.) In fact, it worked both ways:

    declare array fixed bin(31);

declared a 32-bit integer array (you always omitted the sign bit if there was one, strange.) In fact you could declare any odd-sized object pretty much:

    declare I fixed bin(11);

and it would do its arithmetic on such objects constrained to that many bits. How? By turning all ops into function calls and linking into a variable bit library...like I said, bloated...it took an expert to get mostly native code out of PL/I as some small indiscretion might make it turn all your code into library function calls.

But it did work in all sorts of crazy situations (e.g. if you recompiled code with bit-level declarations that didn't match the current hardware, it would still work, albeit slowly.)

Anyhow, doomed to repeat it...as they say.

        -Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD
bzs@WORLD.STD.COM (Barry Shein) (09/22/90)
Probably an unnecessary correction but...

> declare array fixed bin(31);
>
>declared a 32-bit integer array (you always omitted the sign bit if
>there was one, strange.)

Should have been something like:

    declare array(MAXARRAY) fixed bin(31);

or thereabouts. My PL/I is rusty, but it ain't that rusty...

    DCL FOO(MAXFOO) FIXED BIN(31) CONTIGUOUS BASED (THING);

heh heh.

        -Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD
jlg@lanl.gov (Jim Giles) (09/22/90)
From article <michels.653892230@menaik>, by michels@cs.UAlberta.CA (Michael Michels):
> [...]
> I like C because it allows me to do things I want and am paid to do
> in a nice and easy way. [...]

That's exactly why I don't like C. It doesn't let me do anything I find useful without first dropping myself to my knees and scraping around on the implementation level of a computer model (which doesn't even match the _real_ machine that I'm on).

> [...] Why should I be forced
> to write my programs in a cryptic form just because someone else
> has a different opinion.

Hear, Hear!! I like a programming language to allow me to say what I mean - not to have to convert my algorithm into something cryptic. However, C forces me to encrypt my programs - I can't use arrays, I have to encrypt them as pointers; I can't use dynamic memory, I have to encrypt them as pointers; I can't use mapping (run-time equivalence), I have to encrypt them as pointers; etc.... And that's just the problem with _pointers_ - C promotes other difficulties as well. And, with as many _different_ things all being encrypted as pointers, how can I hope to easily decipher someone else's code to determine which of these concepts he intends his variables to represent?

> [...]
> I would like to see Jim Giles write that sort of code in any
> of the solutions that he proposed :-).

I can't see that it could be anything but easier. Being able to say what you _mean_ - and not have to squash down into the confines of an inadequate language model - can only be an improvement.

> [...]
> My view on this subject is that if someone wants to drive TOYOTA let them
> but if I want to build a FERRARI I should be allowed as well.

Of course, the proper automobile analogy for C is a '72 Jeep CJ (with the wrong transmission). It's clunky and uncomfortable. It does poorly on the road (that is, as a machine independent portable language). It has pretenses of being an all-terrain vehicle, but it only does well on its home turf - byte addressed, 32-bit word, CISC architectures with VAX style structure.

> [...]
> In any futaristic languages I would like to see the same things that
> I like about C. [...]

And, I don't want to see anything I don't like about C. (By the way, it's "futuristic". I spell badly too, but if I don't complain someone else will. :-)

> [...] I want to be able to write my programs that do the job
> and are short and easy to understand. [...]

Well, at least we agree about something.

J. Giles
jlg@lanl.gov (Jim Giles) (09/22/90)
From article <9009211543.AA25709@world.std.com>, by bzs@WORLD.STD.COM (Barry Shein):
> [...]
> PL/I was definitely a bloated language with far too many rules
> violating "the law of least astonishment", [...]

I answered the bulk of this message over email. However, it is my opinion that the worst violator of "the law of least astonishment" among currently popular languages is C - by a _very_ long margin.

J. Giles
isr@rodan.acs.syr.edu (Michael S. Schechter - ISR group account) (09/22/90)
In article <63613@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>From article <1990Sep20.161852.22977@rodan.acs.syr.edu>, by isr@rodan.acs.syr.edu (Michael S. Schechter - ISR group account):
>> Now in C it's easy to go MyPtr=(Pointer)0x100 [...]
>Well, yes, you can do that in some extended versions of C no doubt.
>You could extend Fortran to do such things too. But, _standard_

I'm sorry, ANSI can dream all they want, I think you'll find that most people will agree that K&R is 'standard' C, NOT ANSI. That's why the compiler mfr's advertise that they support ANSI, because it's NOT standard enough to be assumed.

AND AS YOU SAY: "You ****could**** extend Fortran"

Yeah, but I do real work, not write preprocessors; that's better left as exercises for students. And since virtually every ****EXISTING**** C will allow it in some way, why bother doing Satan's work and extending Fortran?

>C leaves the result of that cast undefined. For example, on a
>segmented machine, the above statement _may_ put the number 0x100

Out of context, your point is valid. However, I was talking about hardware addresses; presumably, the system programmer or real-time programmer (ones who _must_ access hardware, not just use system calls) knows what must be done to get valid pointers.

Enough. I quit.

--
Mike Schechter, Computer Engineer, Institute Sensory Research, Syracuse Univ.
InterNet: Mike_Schechter@isr.syr.edu isr@rodan.syr.edu Bitnet: SENSORY@SUNRISE
jcburt@ipsun.larc.nasa.gov (John Burton) (09/22/90)
In article <63722@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>From article <michels.653892230@menaik>, by michels@cs.UAlberta.CA (Michael Michels):
>> [...] Why should I be forced
>> to write my programs in a cryptic form just because someone else
>> has a different opinion.
>
>Hear, Hear!! I like a programming language to allow me to say what I
>mean - not to have to convert my algorithm into something cryptic.
>However, C forces me to encrypt my programs - I can't use arrays, I
>have to encrypt them as pointers; I can't use dynamic memory, I have
>to encrypt them as pointers; I can't use mapping (run-time equivalence),
> [...]
>

Excuse me? Are you REALLY saying you CAN'T use arrays in C without resorting to pointers??? I'm confused. Does this mean that I can't use the statement

    a[i][j] = 123;

in a C program? If that's so, you'd better tell my C compilers that (SunOS 4.1 C compiler, Turbo C, Turbo C++...all using ANSI standard mode).

Can't use dynamic memory without using pointers? Again, I assume that's not *really* what you mean...I can send you code to create a 2-, 3-, ... whatever-dimensioned array you want from the heap (using malloc & calloc) that can be used in any situation where you use a statically declared one (I have yet to find a situation where it doesn't work) using exactly the same syntax. It works on all the compilers mentioned above using the ANSI standard mode. I use this routine regularly in the image processing work I do...the only problem I've run into is running out of memory for 2-D arrays larger than 1024x1024 of type float.

Obviously I have misinterpreted what you are saying; perhaps you could clarify?

John Burton
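The routine John offers is not shown in the thread. One common way to get heap storage that can be indexed as a[i][j] is a block of row pointers on top of a block of elements, roughly as sketched below. The names `alloc2d` and `free2d` are hypothetical, the element type is fixed at float for brevity, and the size arithmetic is not checked for overflow. This is a guess at the general technique, not a reconstruction of his code.

```c
#include <assert.h>
#include <stdlib.h>

/* Hypothetical helper: allocates rows x cols floats, zero-filled, that
 * can be indexed as a[i][j]. One block holds the row pointers and a
 * second, contiguous block holds the elements. Returns NULL on failure. */
static float **alloc2d(size_t rows, size_t cols)
{
    float **a;
    float *data;
    size_t i;

    assert(rows > 0 && cols > 0);
    a = malloc(rows * sizeof *a);
    data = calloc(rows * cols, sizeof *data);
    if (a == NULL || data == NULL) {
        free(a);
        free(data);
        return NULL;
    }
    for (i = 0; i < rows; i++)
        a[i] = data + i * cols;   /* row i starts i * cols elements in */
    return a;
}

/* Releases both blocks allocated by alloc2d. */
static void free2d(float **a)
{
    if (a != NULL) {
        free(a[0]);   /* the element block */
        free(a);      /* the row-pointer block */
    }
}
```

The element access reads exactly like a static array, which is John's point. The declaration, however, is `float **a` rather than `float a[3][4]`, which is the difference the rest of the thread argues about.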
jlg@lanl.gov (Jim Giles) (09/22/90)
From article <1990Sep21.193403.20381@abcfd20.larc.nasa.gov>, by jcburt@ipsun.larc.nasa.gov (John Burton):
> In article <63722@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
> [...]
>>Hear, Hear!! I like a programming language to allow me to say what I
>>mean - not to have to convert my algorithm into something cryptic.
>>However, C forces me to encrypt my programs - I can't use arrays, I
>>have to encrypt them as pointers; I can't use dynamic memory, I have
>>to encrypt them as pointers; I can't use mapping (run-time equivalence),
>> [...]
>>
> Excuse me? are you REALLY saying you CAN'T use arrays in C without
> resorting to pointers??? I'm confused. Does this mean that I can't
> use the statement
> a[i][j] = 123;

Syntactic sugar. Try sending the array to a subroutine as a parameter. Then you'll find out what the array _really_ is. Try referencing the array in the subroutine with the above statement - lots of luck.

What I want is an array that _stays_ an array when I pass it around. Arrays that have to be locally declared or global are practically useless for programs which do any serious array manipulation. For them, arrays in C are not anything but another name for pointer.

> Can't use dynamic memory without using pointers? Again, I assume that's
> not *really* what you mean...I can send you code to create 2,3,...
> whatever Dimensioned array you want from the heap (using malloc &
> calloc) that can be used in any situation where you use a statically
> declared one (I have yet to find a situation where it doesn't work)
> using exactly the same syntax. It works on all the compilers mentioned
> above using the ANSI standard mode. I use this routine regularly in
> the image processing work I do...the only problem I've run into is
> running out of memory for 2-D arrays larger than 1024x1024 of type
> float.

Ok. Now give me code in which those declarations resemble static array declarations in any significant way. The declaration of a dynamic object should be _identical_ to the declaration of a static object of the same type (with the possible exception of place-holders for the information to be filled in at allocation time).

Once you've failed to do that, you can tell me how the compiler knows that the pointers (hidden under your clever declarations) are to dynamic objects and are not aliased to _any_ other pointers. In order to get any kind of efficiency, the compiler must be able to detect aliasing so that it can optimize non-aliased references. However, as far as the compiler knows, the result of a malloc() or calloc() call is just any old pointer that could be aliased to anything.

No, the techniques you are advocating (which I've seen before and dismissed for the same reasons) merely hide the facts. This is doubly cryptic - you are pretending to use arrays when you're really using pointers, and you are using pointers because the language doesn't really have arrays. I still prefer to tell the compiler straight out what it is I want to do.

J. Giles
jcburt@ipsun.larc.nasa.gov (John Burton) (09/22/90)
In article <63751@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>From article <1990Sep21.193403.20381@abcfd20.larc.nasa.gov>, by jcburt@ipsun.larc.nasa.gov (John Burton):
>> In article <63722@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>> [...]
>>>Hear, Hear!! I like a programming language to allow me to say what I
>>>mean - not to have to convert my algorithm into something cryptic.
>>>However, C forces me to encrypt my programs - I can't use arrays, I
>>>have to encrypt them as pointers; I can't use dynamic memory, I have
>>>to encrypt them as pointers; I can't use mapping (run-time equivalence),
>>> [...]
>>>
>> Excuse me? are you REALLY saying you CAN'T use arrays in C without
>> resorting to pointers??? I'm confused. Does this mean that I can't
>> use the statement
>> a[i][j] = 123;
>
>Syntactic sugar. Try sending the array to a subroutine as a parameter.
>Then you'll find out what the array _really_ is. Try referencing the
>array in the subroutine with the above statement - lots of luck.
>
>What I want is an array that _stays_ an array when I pass it around.
>Arrays that have to be locally declared or global are practically
>useless for programs which do any serious array manipulation. For
>them, arrays in C are not anything but another name for pointer.

By the same token, in reality ALL variables in C, FORTRAN, Pascal, etc. are simply pointers. They do not contain a value themselves, they point to a memory location that contains the value. Also, in reality an array is just a sequence of memory locations. How you access those locations is a matter of preference. Any way you do it, you ultimately reference a memory location (or register) and obtain the value there. Explicitly defined pointers are simply identifiers pointing to a memory location which contains the address of another memory location.

One point that intrigues me is _how_ you plan to pass your array around so that it _stays_ an array without using pointers. The primary choices for parameter passing seem to be either to make a copy of the data within the subroutine (pass by value or copy-in) or to tell the routine _where_ the data is stored (pass by reference), i.e. you either pass the value or a pointer to a value. My interpretation of what you say is that you want to eliminate passing the pointer (pass by reference). This does have the advantage of not allowing anyone else access to the copy of the array as the subroutine works on it, but it has the disadvantage of requiring copies to be made of the array. I'm not convinced that making a copy of a 4 meg array can be done faster than passing a pointer.

>
>> Can't use dynamic memory without using pointers? Again, I assume that's
>> not *really* what you mean...I can send you code to create 2,3,...
>> whatever Dimensioned array you want from the heap (using malloc &
>> calloc) that can be used in any situation where you use a statically
>> declared one (I have yet to find a situation where it doesn't work)
>> using exactly the same syntax. It works on all the compilers mentioned
>> above using the ANSI standard mode. I use this routine regularly in
>> the image processing work I do...the only problem I've run into is
>> running out of memory for 2-D arrays larger than 1024x1024 of type
>> float.
>
>Ok. Now give me code in which those declarations resemble static
>array declarations in any significant way. The declaration of a
>dynamic object should be _identical_ to the declaration of a static
>object of the same type (with the possible exception of place-holders

Why should it be identical? What purpose would that serve save hiding the fact that the machine is doing two different operations (allocating space at compile time vs. allocating space at run time)?

>
>Once you've failed to do that. Then you can tell me how the compiler
>knows that the pointers (hidden under your clever declarations) are to
>dynamic objects and are not aliased to _any_ other pointers. In order to
>get any kind of efficiency, the compiler must be able to detect aliasing
>so that it can optimize non-aliased references. However, as far as the
>compiler knows, the result of a malloc() or calloc() call is just any old
>pointer, could be aliased to anything.
>
>No, the techniques you are advocating (which I've seen before and
>dismissed for the same reasons) merely hide the facts. This is
>doubly cryptic - you are pretending to be arrays when you're really
>using pointers, and you are using pointers because the language doesn't
>really have arrays. I still prefer to tell the compiler straight out
>what it is I want to do.

As far as a machine goes, there is no such thing as an array. So what's the difference between *me* "hiding the facts" and the compiler "hiding the facts"? Personally I prefer to know what's going on as opposed to handing the job to the compiler and hoping it does what I think it does. If I want to use pointers as opposed to arrays, or vice versa, that should be my choice, NOT a restriction of the language.

This whole discussion boils down to a difference of opinion. I hold that a programmer should be allowed the freedom to create programs in whatever way he/she chooses and be provided the tools to do the job. What you're proposing significantly limits this freedom of choice. What advantages does this limiting provide? By your own words, none...both methods (supposedly) can be used to create the same end product.

John Burton

"Save me from those who seek to save me from myself"
bson@AI.MIT.EDU (Jan Brittenson) (09/22/90)
In article <63751@lanl.gov> jlg@lanl.gov (Jim Giles) writes:

>> Excuse me? are you REALLY saying you CAN'T use arrays in C without
>> resorting to pointers??? I'm confused. Does this mean that I can't
>> use the statement
>> a[i][j] = 123;

> Syntactic sugar. Try sending the array to a subroutine as a parameter.
> Then you'll find out what the array _really_ is. Try referencing the
> array in the subroutine with the above statement - lots of luck.

    bar(some_array)
    int some_array[3][4];
    {
        some_array[0][1] += some_array[1][0];
    }

    snorf()
    {
        int foo[3][4];

        bar(foo);
    }

"Look Mom, no pointers!!!"
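As a side note on the example above: C adjusts a parameter declared as an array to a pointer to the array's element type, so a parameter written int some_array[3][4] has type "pointer to array of 4 int". The sketch below (function names are invented for illustration, and foo is initialised so the result is defined) shows that the two spellings name the same function type and that the caller's array is modified in place rather than copied.

```c
/* Parameter spelled as an array, as in the posted example. */
void bar_arr(int some_array[3][4])
{
    some_array[0][1] += some_array[1][0];
}

/* Same body, with the parameter spelled the way the compiler treats it:
 * a pointer to rows of 4 ints. */
void bar_ptr(int (*some_array)[4])
{
    some_array[0][1] += some_array[1][0];
}

/* Returns foo[0][1] after calling both versions on the same array. */
int snorf(void)
{
    int foo[3][4] = {{0}};
    /* bar_arr has type void(int (*)[4]), so this initialisation is valid. */
    void (*f)(int (*)[4]) = bar_arr;

    foo[1][0] = 5;
    f(foo);          /* modifies the caller's foo: no copy was passed */
    bar_ptr(foo);    /* identical effect through the explicit pointer form */
    return foo[0][1];
}
```

So the source text contains no pointer syntax, but what crosses the call boundary is still an address, which is roughly the distinction the two posters are arguing about.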
bson@AI.MIT.EDU (Jan Brittenson) (09/22/90)
This message is quite long. I apologize if you think I'm filling up your mailbox with junk flamage. Jim Giles: >> 1. Pointer range check (to see if a buffer crosses page >> boundaries, for instance). > Well, without pointers, why do you need a pointer range check? Computing > the range of something that doesn't exist seems a little silly. Pointers _are_ addresses, and nothing else. Regardless of whether they include segment information, or other information relevant only to non-state-of-the-art architectures. The "address" idiom covers all information relevant to locating the addressee. Pointers may be interpreted differently, depending on the datum, though. On a pdp-10, not only is a word address necessary, but also a character index within the word if it's a character pointer. > I think you had in mind casting the pointer to an int and looking at > the raw address - the ANSI standard leaves this process undefined. You're right, that was my intent with the buffer example. But unless _somehow_ a means of retrieving the address of the buffer - a pointer to it - is provided, the page boundary check can not be done _at all_, defined or undefined, portable or not. To me the simple C-style casting is preferable to some obscure union declared miles away, since pointer-to-int casting at least tells me what is going on. Besides, in almost any implementation casting a pointer to an int of sufficient size and then later back, will yield the original pointer. I most certainly would refuse to use a compiler for which this assumption wasn't correct. If the machine hardware is such that it's not a reasonable assumption to make - say on a Lisp Machine, for instance - then, well, forget about portable C code. > Now, if you're talking about non-standard extensions to C which would > allow you to do this stuff - then any other language can contain the > same non-standard extensions. Extensions, or non-uptight about pointer typing, call it whatever you like. >> [...] >> 2. 
Calculate physical addresses for DMA controllers. > Why should I care? The system/environment should be able to give me the > address if I need it. But, how do I use a raw address anyway? _Standard_ > C pointers don't give me any such access. Access to such things as > hardware controllers should be privilaged to the system - and _it_ > can contain machine dependent code - like assembly. ...or like C, which most certainly is more defined than assembler! I'm not sure what kind of programming you're talking about. There are languages which are defined similar to what you have described here, but few outside academia use them - Euclid for instance. According to my experience, programmers can be put into either of two major groups: application programmers and system programmers. While the former use various 4G and other kinds of application-oriented tools - such as XYZ-SQL, COBOL, or Prolog, to write applications, the latter do the system-dependent stuff, such as database, server, and support tool implementations - mostly things that are system-dependent to start with. Neither of these groups would have particular use for your proposed language - the application people would ask you what syntax applies to selecting records in a database, while the system people would ask you how to set up 2D bitblt operation in a graphics device, or how to create a channel program in a mainframe environment. For sure, some of the work done by system folks falls somewhere in-between. But I seriously doubt programming efficiency or maintenance would be improved to any degree worth mentioning by forcing everyone to learn Yet Another Language and an entirely new set of idioms when the previous ones are considered quite sufficient. 
Can you give me one example of a project you or a first-hand reference has been involved in that falls between the two major categories I've outlined above, and which by itself constitutes a project large enough to warrant not simply making do with what you've got and are used to, and possibly for an employer to require experience with your language as desirable? >> [...] > 3. Sort a linked list on addresses of some data >> pointed to > from within the node. Or to keep it sorted as new >> (addresses > of) data is added. > I guess you'll have to tell me how this differs from sorting on the > index of the data within an array or sequence. Since the sequence is > dynamic, ... So how do I know where a certain index resides? I guess this would be an undefined topic - although in this example it would be well defined in C, since the buffers would be of the same type (i.e. arbitrarily dimensioned character vectors). > ... you can add all the elements you wish - and still sort on index. How do I know that the addresses of the previous indexes do not change as new elements are added? This would have to be undefined, as well. >> [...] >> 4. Implement malloc()/free(). > When I found out that the ANSI C standard prohibited comparing/subtracting > pointers to different objects, I pointed out on comp.lang.c that malloc() > and free() could not not be written in _standard_ C. They agreed with me. No doubt you're correct. Implementation is fairly trivial in "nonstandard" C, and I fail to see how it could be made easier or more "defined" without any pointers (i.e. explicit object addresses) at all? >> [...] >> I'm curious as to why so many programmers engage themselves in hot >> debates over how to best implement strings. String processing is >> proportionally insignificant - the first thing done after a read is >> usually a tokenization, either through hand-written code or the output of >> a lexical front-end generator. [...] > Tokens are also strings .... 
Symbol tables also contain strings > among other stuff). First, tokens are best handled as small integers or enumerated types, while symbol tables are commonly hashed. Other than converting strings-to-int-tokens and symbols-to-hash-values, very little string processing is done. Second, take a look at an assembler or compiler, and you'll be amazed at the total lack of string operations. (Apart from the lexical front-ends, of course.) > Text processors usually don't have much data that isn't part of one > string or another. Granted, but then for most text processors, a simple string or any other sequence isn't enough to store the text and all relevant information. A couple of years ago I wrote a type-setting system - it should qualify as a "text processor" as good as any. The first thing done with the incoming text was chopping it up in segments containing font-pitch-kerning-etc-info unique to the segment. The actual characters of the segment weren't used again until it was time to print them. _All_ work was performed on the remaining segment information, the lists of segments, and lists of lists of segments. Of all hairy things done, _none_ involved character data. (And rarely any duplication either, for that matter.) Let's distinguish between "defined," and "portable." Even if a program adheres to a formal definition, there is no guarantee that it's going to run on every other system that adheres to the same definition. In the end, common sense and portability constraints will have to lead all development. -- Jan Brittenson bson@ai.mit.edu
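The page-boundary check mentioned in point 1 of this message can be sketched as follows. This is only an illustration under stated assumptions: PAGE_SIZE is a made-up constant, the address space is assumed to be flat, and uintptr_t comes from <stdint.h>, which was added to C well after this thread (and is an optional type even there). The pointer-to-integer conversion is implementation-defined, which is exactly the portability caveat raised in the discussion.

```c
#include <stddef.h>
#include <stdint.h>

#define PAGE_SIZE 4096u   /* assumed page size, for illustration only */

/* Returns 1 if the len bytes starting at buf span more than one page
 * under the assumed PAGE_SIZE, and 0 otherwise. Relies on the
 * implementation-defined conversion from pointer to integer. */
int crosses_page(const void *buf, size_t len)
{
    uintptr_t first, last;

    if (len == 0)
        return 0;
    first = (uintptr_t)buf;          /* address of the first byte */
    last = first + (len - 1);        /* address of the last byte */
    return (first / PAGE_SIZE) != (last / PAGE_SIZE);
}
```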
jlg@lanl.gov (Jim Giles) (09/25/90)
From article <9009220030.AA03386@rice-chex>, by bson@AI.MIT.EDU (Jan Brittenson):
> [...]
| bar(some_array)
| int some_array[3][4];
| {
|     some_array[0][1] += some_array[1][0];
| }
|
| snorf()
| {
|     int foo[3][4];
|
|     bar(foo);
| }
>
>
> "Look Mom, no pointers!!!"

A procedure which claims to be able to do array manipulation and yet only works on a _fixed_ array size is useless. When I pass the _other_ array (the one you left out: int bilvet [4][3]), the procedure "bar" will mangle it. Try again.

J. Giles
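The objection above is that bar() hard-codes the 3x4 shape. For what it is worth, later revisions of C (C99 onward, where the feature is optional in some later standards) allow a parameter whose dimensions are themselves parameters. The sketch below assumes a compiler with variable-length array support; it was not available when this thread was written, so it does not settle the 1990 argument either way.

```c
#include <stddef.h>

/* Adds a[1][0] into a[0][1] for an array of any shape.
 * Requires rows >= 2 and cols >= 2. The compiler uses cols to compute
 * where each row starts, so the same code is correct for 3x4 and 4x3. */
void bar(size_t rows, size_t cols, int a[rows][cols])
{
    (void)rows;                 /* only cols is needed for the indexing */
    a[0][1] += a[1][0];
}
```

With this signature, passing foo[3][4] and bilvet[4][3] both work, provided the caller supplies the matching dimensions; nothing checks that they are right.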
jlg@lanl.gov (Jim Giles) (09/25/90)
From article <9009220848.AA00539@wheat-chex>, by bson@AI.MIT.EDU (Jan Brittenson):
> [...]
> Jim Giles:
>
> >> 1. Pointer range check (to see if a buffer crosses page
> >> boundaries, for instance).
>
> > Well, without pointers, why do you need a pointer range check? Computing
> > the range of something that doesn't exist seems a little silly.
>
> Pointers _are_ addresses, and nothing else. [...]

Yes, but you're missing the point. Surely what's wanted above is a reliable method to allocate buffers that don't contain page boundaries or other unpleasant hardware dependent things. This is clearly the job of the memory manager - to give the programmer adequate support for machine dependent problems of this kind. The programmer should merely have to allocate memory (with the right mode flags on his request), and the manager should return an allocated object with the right memory boundary properties (whether this means "doesn't cross a boundary", or "starts on a boundary", etc.). And, as I've pointed out before, dynamic memory allocation probably should not involve programmer visible pointers.

> [...]
> > Now, if you're talking about non-standard extensions to C which would
> > allow you to do this stuff - then any other language can contain the
> > same non-standard extensions.
>
> Extensions, or non-uptight about pointer typing, call it whatever
> you like.

Ah, but I'm still not convinced that the language of the future should even _contain_ pointers. No one has yet provided an example of a user level application that _requires_ them. System level applications also _mostly_ don't need them. And, like GOTOs in flow control, pointers in data structuring tend to result in spaghetti. If this is deliberate, it's on the programmer's head. If someone just gets some wires crossed though, I'm willing to apportion at least _some_ of the blame to the language feature itself.

>
> >> [...]
> >> 2. Calculate physical addresses for DMA controllers.
>
> > Why should I care?
> > The system/environment should be able to give me the
> > address if I need it. But, how do I use a raw address anyway? [...]
> [...]
> ...or like C, which most certainly is more defined than assembler!

Once again, you've missed the point. Not even the systems programmer that has to write the access routines for the DMA controller _cares_ what its address is. What he wants is to be able to say "DMA_port=command" whenever he needs to. Why not have the compiler (or the loader) contain a list of all the hardware-specific addresses with some mnemonic names that the programmer can just declare and use? Why does the programmer have to mess with addresses at all?

> [...]
> Neither of these groups would have particular use for
> your proposed language - the application people would ask you what
> syntax applies to selecting records in a database, [...]

I don't understand the objection. I give the programmer _more_ clear, explicit, direct, data structuring tools than C has, remove pointers (which still haven't been proven useful), and you claim it will be harder to use. The database person would probably use whatever syntax he _presently_ uses except without the need for dereferencing on the linked list types of stuff. Give me a _specific_ example of what you think would be hard.

> [...] while the system
> people would ask you how to set up 2D bitblt operation in a graphics
> device, [...]

Again, the same way they do now - except the graphics device itself would now be a named object and the programmers would no longer have to pretend the absolute address of it was somehow part of their task.

> [...] or how to create a channel program in a mainframe environment.

Oh? You can do that in C? On the Cray for example, channel programs aren't even written in the same _machine_ language. The C compiler doesn't even generate channel code. Not at all. (Maybe there's a different C compiler that does, but I've not seen it.)
What does this have to do with the discussion about whether pointers (or any other feature) should be incorporated in a programming language?

> [...]
> For sure, some of the work done by system folks falls somewhere
> in-between. But I seriously doubt programming efficiency or
> maintenance would be improved to any degree worth mentioning by
> forcing everyone to learn Yet Another Language and an entirely new set
> of idioms when the previous ones are considered quite sufficient.

Yes, I can see your point. Once you've learned a language it _is_ kind of like a trap. You begin to see only how _that_ language works and not how to solve your _actual_ problems at all. This kind of thing has happened _many_ times before in history. The great modern bridges were not built by the same people who built the great ancient ones - or even their intellectual descendants. Stone masons couldn't _ever_ span distances over 200 feet - yet they considered their work to be "quite sufficient" for all practical bridge building work. Yes, perhaps the current generation of C programmers will have to retire before a fresh group - without the biases - can address the problem from new and more effective points of view.

> [...]
> Can you give me one example of a project you or a first-hand
> reference has been involved in that falls between the two major
> categories I've outlined above, and which by itself constitutes a
> project large enough to warrant not simply making do with what you've
> got and are used to, and possibly for an employer to require
> experience with your language as desirable?

Yes. Our organization is currently switching _TO_ C/UNIX from something else. All the trouble you predict is indeed upon us - the retraining, the expense, the incidental blunders along the way. Unfortunately, after all this, it is becoming clear that C/UNIX are worse than what we had. Fewer features, harder to use, _SLOW_.
Those of us that _knew_ C/UNIX before the conversion warned that this would be the case. But, both users and management succumbed to the hype. (Note, there are still people here that think we did the right thing. In a couple of years all this trauma will be behind us, they say. Then, we will have advantages of a industry standard system, they say. Unfortunately, we had to add so much non-UNIX stuff to the system - just to make it marginally acceptable - that switching to any other UNIX system later would be just as traumatic as what we are doing now. Oh, well) > >> [...] > 3. Sort a linked list on addresses of some data > >> pointed to > from within the node. Or to keep it sorted as new > >> (addresses > of) data is added. > > > I guess you'll have to tell me how this differs from sorting on the > > index of the data within an array or sequence. Since the sequence is > > dynamic, ... I don't understand your answer on this. It seems to me that the objections you raise are _MORE_ applicable to pointers, not less. > >> [...] > >> 4. Implement malloc()/free(). > > [... I said standard C can't do it ...] > > No doubt you're correct. Implementation is fairly trivial in > "nonstandard" C, and I fail to see how it could be made easier or more > "defined" without any pointers (i.e. explicit object addresses) at all? Actually, the implementation is quite trivial. period. I don't see why we have to mung-up the language design to do something which should be done in assembly for efficiency anyway. Even so, the system kernel starts up a tool called the memory manager which handles all the rest of memory as a single large array, using indices relative to wherever the kernel ends. Where's the user-visible pointers in that? The run-time memory manager for the I/O library and application programs works the same way - it has all the memory that the system allocated for the heap in one large array. 
The only piece that needs to have _raw_ pointer access is the part that performs the alias - that is, the copy of the reference when the allocation process finishes (or, when the deallocation process starts). This little fragment amounts to less than a half-dozen instructions on most machines -surely you can't object to _that_ much assembly in an operation that is _blatantly_ machine dependent. > [... discussion about text string efficiency. I refuse to argue ...] > [... any further. It is clear that _some_ people regard it as an ...] > [... important issue. You don't. I regard it as a subset of the ...] > [... sequence type construct - Which has other applications: most ...] > [... linked lists in C that I encounter would really have been _much_...] > [... more efficient as sequences - for example. ...] J. Giles
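The "one large array, managed by index" idea described in this message can be sketched in a few lines. This is a deliberately simplified illustration, not the memory manager Giles describes: it hands out fixed-size blocks only, all sizes and names (HEAP_WORDS, BLOCK_WORDS, heap_alloc and so on) are invented, and a real manager would handle variable sizes and error checking. The point is only that allocation, release and access can all be expressed with array indices and no user-visible pointers.

```c
#define HEAP_WORDS  1024                    /* size of the managed array (assumed) */
#define BLOCK_WORDS 16                      /* fixed block size (assumed) */
#define NBLOCKS     (HEAP_WORDS / BLOCK_WORDS)
#define NO_BLOCK    (-1)

static long heap[HEAP_WORDS];               /* all managed memory, as one array */
static int  next_free[NBLOCKS];             /* free list, linked by block number */
static int  free_head = NO_BLOCK;

/* Put every block on the free list, in ascending order. */
void heap_init(void)
{
    int b;

    for (b = 0; b < NBLOCKS; b++)
        next_free[b] = (b + 1 < NBLOCKS) ? b + 1 : NO_BLOCK;
    free_head = 0;
}

/* Returns the index in heap[] of the first word of a free block,
 * or NO_BLOCK if no block is left. */
int heap_alloc(void)
{
    int b = free_head;

    if (b == NO_BLOCK)
        return NO_BLOCK;
    free_head = next_free[b];
    return b * BLOCK_WORDS;
}

/* Give back a block previously returned by heap_alloc(). */
void heap_free(int index)
{
    int b = index / BLOCK_WORDS;

    next_free[b] = free_head;
    free_head = b;
}

/* Read or write word `offset` of the block that starts at `index`. */
long heap_get(int index, int offset)         { return heap[index + offset]; }
void heap_set(int index, int offset, long v) { heap[index + offset] = v; }
```

An index handed out this way can still be stale or out of range, so this changes how references are spelled rather than removing the possibility of dangling ones.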
bson@AI.MIT.EDU (Jan Brittenson) (09/25/90)
Jim Giles:
> From article <9009220848.AA00539@wheat-chex>, by bson@AI.MIT.EDU (Jan Brittenson):

> Surely what's wanted above is a reliable method to allocate buffers
> that don't contain page boundaries or other unpleasant hardware
> dependent things. This is clearly the job of the memory manager - to
> give the programmer adequate support

Who is the `programmer' - the application programmer or the system programmer? You don't seem to have a very clear concept of who does what. I for one wouldn't touch 4G programs with a 5-ft pole; the same can be heard from the 4G people. Which isn't a big surprise: different programmers have different concerns. And at least in my experience there have easily been 10-15 4G/COBOL/you-name-it application programmers to one or two system programmers.

> Ah, but I'm still not convinced that the language of the future should
> even _contain_ pointers.

I'm not even convinced computers of the future will execute instructions sequentially, just because this happens to be the case today.

> And, like GOTOs in flow control, pointers in data structuring tend to
> result in spaghetti. If this is deliberate, it's on the programmer's
> head. If someone just gets some wires crossed though, I'm willing to
> apportion at least _some_ of the blame to the language feature itself.

Yeah, I tend to agree here. No one has said that system programming is easy to learn or get accustomed to, and regardless, it constitutes a mere fraction of all programming effort as well. You'll find a lot of system hackers on the networks writing free software of course - but I think they're a considerable minority at their respective workplaces, or else they're students.

> What [the programmer] wants is to be able to say "DMA_port=command"
> whenever he needs to. Why not have the compiler (or the loader)
> contain a list of all the hardware-specific addresses with some
> mnemonic names that the programmer can just declare and use?
> Why does the programmer have to mess with addresses at all?

Now this is a catch-22 as good as any! Who is to implement the semantics of "DMA_port=command"? I mean, I can type it at my terminal as many times as I like, even check the syntax, without anything happening to the DMA_port.

> The database person would probably use whatever syntax he _presently_
> uses except without the need for dereferencing on the linked list
> types of stuff. Give me a _specific_ example of what you think would
> be hard.

    SELECT NAME=some_name AND SHOESIZE=10.5 FROM Register_1

You tell an applications programmer that he or she is to use something like:

    execute_sql(CMD_SELECT_ROM, 3, KEY("NAME"), find_var("some_name"),
                KEY_LOGICAL("AND"), KEY("SHOESIZE"), 10.5, "Register_1");

and the person is going to laugh in your face! You *don't* want to build *everything* perceivably useful into the compiler, or add syntax for it. There are excellent (well...) 4G compilers that generate C code, and they don't mind generating code that uses pointers one itsy bit.

>> [...] or how to create a channel program in a mainframe environment.

> Oh? You can do that in C?

Certainly, with data structures and pointers. Although on IBM machines it's more commonly done in assembler, which compared to C is totally unintelligible and structureless. I would much rather maintain a program written in C than ASSEMBLER XF.

> Once you've learned a language it _is_ kind of like a trap. You begin
> to see only how _that_ language works and not how to solve your
> _actual_ problems at all.

I am very wary of falling into that trap. But I'm not obsessed with syntax or a clear correspondence between syntax and semantics - semantics is most important to me. To a student or application programmer a clear correspondence between syntax and semantics is probably of more importance than it is to me. If C doesn't allow me to do what I need to do, then I'd ditch C and go dig up an assembler.
Which is a rare occurrence indeed, but I doubt it would be less frequent without pointers or some address data type.

> Our organization is currently switching _TO_ C/UNIX from something
> else. All the trouble you predict is indeed upon us - the retraining,
> the expense, the incidental blunders along the way.

Perhaps C/UNIX was a bad choice then. Why C? What kind of software did you port?

>> >> [...]
>> >> 4. Implement malloc()/free().
>>
>> [... I said standard C can't do it ...]
>>
>> No doubt you're correct. Implementation is fairly trivial in
>> "nonstandard" C, and I fail to see how it could be made easier or more
>> "defined" without any pointers (i.e. explicit object addresses) at all?

> Actually, the implementation is quite trivial. period. I don't see why
> we have to mung-up the language design to do something which should be
> done in assembly for efficiency anyway.

I don't think the efficiency issue is all that important. Portability and maintainability are of greater economic concern - it's penny-wise to code RTLs and kernels in assembler to gain 15% speed, while it'll take at least twice as long, be twice as hard to maintain, and not portable for shit. As a customer, you're stuck with a single vendor who can literally rip you off with upgrades and patches, i.e. make you even more dependent on half-working assembler code.

---

Regarding the example I posted, against the importance of strings: I'd just like to add that it was not an argument for pointers, only against the often stated importance of strings as a data type. In fact, the program would probably have been much easier to write *without* explicit pointers, in Common Lisp for instance. Unfortunately, that was not an option.
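On the "DMA_port=command" exchange earlier in this message: in C as commonly practised, somebody does write the address down once, usually in a header, and everyone else uses the name. The sketch below is an assumption-laden illustration of that convention, not code from either poster. The register address in the comment and the command code are hypothetical, and an ordinary variable stands in for the hardware register so the sketch can run on any machine.

```c
#include <stdint.h>

/* On real hardware the name would be bound to a fixed address taken from
 * the hardware manual, for example (hypothetical address):
 *     #define DMA_PORT (*(volatile uint32_t *)0x40001000u)
 * Here a plain variable stands in for the register. */
static volatile uint32_t fake_dma_register;
#define DMA_PORT fake_dma_register

#define DMA_CMD_START 0x1u      /* hypothetical command code */

/* Issue the start command; the assignment reads as "DMA_port = command". */
void dma_start(void)
{
    DMA_PORT = DMA_CMD_START;
}

/* Read back whatever was last written to the (stand-in) register. */
uint32_t dma_last_command(void)
{
    return DMA_PORT;
}
```

Whether the name-to-address binding lives in a header like this, in the loader, or in the compiler is the design question the two posters disagree on.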
brendan@batserver.cs.uq.oz.au (Brendan Mahony) (09/26/90)
jlg@lanl.gov (Jim Giles) writes:

>Oh, well. If it's a question of side-effects. I oppose them outright.
>As a practical matter, you can convince most programmers that operators
>should not have side-effects. But, when it comes to functions, they
>always demand to be allowed side-effects.

Do they really? Why in the world do they want to make their life so difficult? I don't see what side-effects in functions can do that tuple valued expressions can't do, except make the code unreadable and impossible to reason about. For example if

    f : int -> int;
    a := f(b);

also updates c and d, what is wrong with using a tuple valued function,

    f : int x int x int -> int x int x int;
    (a,c,d) := f(b,c,d);

If that takes too much typing for you why not

    #define (A):=F(B) (A,c,d) := f(B,c,d)

Even allowing functions to look at global state variables leads to confusion; letting them change them means you don't have a function and terms are not terms. What is the point of such confusion?

--
Brendan Mahony                 | brendan@batserver.cs.uq.oz
Department of Computer Science | heretic: someone who disagrees with you
University of Queensland       | about something neither of you knows
Australia                      | anything about.
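C has no tuple assignment, but the shape of the tuple-valued f above can be approximated by returning a struct. The following is a small sketch under that assumption; the struct name and the arithmetic inside f are placeholders invented for illustration, since the post does not say what f computes.

```c
/* The three results that the hypothetical f produces. */
struct triple {
    int a;
    int c;
    int d;
};

/* f : int x int x int -> int x int x int.
 * Everything f reads arrives as a parameter and everything it changes
 * leaves through the return value; no global state is touched.
 * The arithmetic is a placeholder. */
struct triple f(int b, int c, int d)
{
    struct triple r;

    r.a = b + c + d;
    r.c = c + 1;
    r.d = d * 2;
    return r;
}
```

The caller then writes the update explicitly, for instance by copying the returned c and d fields back into its own variables, which is the visibility the post is asking for.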
mickey@ncst.ernet.in (R Chandrasekar) (09/27/90)
In article <3114.26f57247@cc.helsinki.fi> pirinen@cc.helsinki.fi writes:

>Where does this idea of C-hackers come from, that only novices need
>safety? I'm no novice (10 years of programming), and I want all the
>safety I can get. I'm sick and tired of debugging for hours to find
>simple errors that could have been caught at the expense of a few
>seconds of the compiler's time. Programmers are not machines, even good
>programmers make simple mistakes.

I agree completely. In fact, it is the more experienced programmers who need to be 'protected' -- they are more likely to be writing bigger applications, and many of them might be over-confident of their prowess with a programming language.

My complaint is not necessarily with C - it is with any language which provides 'flexible' ways to goof.

C-philes say that a variety of syntactic problems could be trapped with tools such as lint. But hardly anyone uses lint or lint-like programs (usual comments: "lint gives too many vague messages" etc etc).

The smart programmer is one who uses safe programming practices, perhaps a layer of code over the basic language, to achieve what is required.

>Pekka P. Pirinen University of Helsinki
>pirinen@cc.helsinki.fi pirinen@finuh.bitnet ..!mcvax!cc.helsinki.fi!pirinen

-- Chandrasekar
______________________________________________________________________
R Chandrasekar, National Centre for Software Technology,
Gulmohar Cross Rd No. 9, Juhu, Bombay 400 049, INDIA
E-mail : mickey@ncst.ernet.in OR mickey@ncst.in
______________________________________________________________________
peter@ficc.ferranti.com (Peter da Silva) (09/29/90)
In article <5006@uqcspe.cs.uq.oz.au> brendan@batserver.cs.uq.oz.au writes: > Even allowing functions to look at global state variables leads to confusion, > letting them change them means you don't have a function and terms are > not terms. What is the point of such confusion? OK, how do you implement the function "rnd", which returns a random number, without letting it have side effects? -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com
asylvain@felix.UUCP (Alvin E. Sylvain) (09/29/90)
Say, I'd like to suggest that these articles on the sins of C be posted to comp.lang.c and/or comp.std.c. Then, let's get back on track to this newsgroup's actual charter. I don't have the actual charter in front of me, but it seems to me that 'society' and 'futures' kinda sez it all ... and doesn't include discussion of whether pointers in C are good, bad or indifferent. What is the *future* of computing, and how will it impact *society*. Thanks mucho. ------------------------------------------------------------------------ "I got protection for my | Alvin "the Chipmunk" Sylvain affections, so swing your | Natch, nobody'd be *fool* enough to have bootie in my direction!" | *my* opinions, 'ceptin' *me*, of course! -=--=--=--"BANDWIDTH?? WE DON'T NEED NO STINKING BANDWIDTH!!"--=--=--=-
brendan@batserver.cs.uq.oz.au (Brendan Mahony) (10/01/90)
My line:
-> Even allowing functions to look at global state variables leads to confusion,
-> letting them change them means you don't have a function and terms are
-> not terms. What is the point of such confusion?

peter@ficc.ferranti.com (Peter da Silva) writes:
>OK, how do you implement the function "rnd", which returns a random
>number, without letting it have side effects?

Fine, use a tuple valued function

    function rnd (oldseed : integer) -> (newseed, ran : integer)
    begin
        newseed := ...
        ran := ...
    end

The function would be used in code

    (seed, ran) := rnd(seed);

But seriously folks, what you really want is a procedure. Note that this function (which makes explicit the action of the operation) cannot (easily) be used as an integer term, and must be accompanied with an update to seed for the whole thing to work properly. rnd is not an integer term; its purpose is not solely to define the value of an integer. Why then do you want to include rnd in the grammar of integer terms? Is it just to save a few keystrokes in the initial coding? Pretty silly given that this is such a small part of the software cycle. The idea is not conciseness but clarity!

Side effects in rnd may not worry you, but that is only because everyone knows that it must have side effects. If it was not as well known it would be easy to overlook the fact that a function called rnd, appearing deep in some complicated expression, actually goes and plays with global variables. I think it is very worthwhile to have a clear separate notion of

    term: expression defining a value

and not to have to worry about any side effects when reading the code, and also when using the function. The second point is important: for my function an expression like

    4*second(rnd(seed)) + 3-second(rnd(seed))

has a well defined meaning. For a side effect rnd it may not be as clear what the result is. At the very least order of evaluation becomes important. I know we don't care with rnd but we often will.
Note that the compiler also knows for sure that it will only have to evaluate rnd(seed) once for this expression; if side-effects are possible this optimisation becomes very difficult to determine. For instance it is not even clear that two references to the same variable will yield the same result. Consider

    seed*rnd + seed

There are several possible evaluation strategies for this expression, all of which yield different results.

Are the few keystrokes saved worth the extra complication? Can't think of any more problems at the moment, but I am sure they are there.

--
Brendan Mahony                 | brendan@batserver.cs.uq.oz
Department of Computer Science | heretic: someone who disagrees with you
University of Queensland       | about something neither of you knows
Australia                      | anything about.
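The tuple-valued rnd described in this message can be approximated in C by returning a struct. This is a sketch only: the post leaves the generator unspecified ("newseed := ..."), so the linear congruential step and its constants (a commonly cited pair, 1664525 and 1013904223, modulo 2^32) are my choice for illustration, as are the struct and field names. It is not a high-quality generator.

```c
#include <stdint.h>

/* The pair (newseed, ran) from the post, as a C struct. */
struct rnd_result {
    uint32_t newseed;
    uint32_t ran;
};

/* One step of a linear congruential generator, written as a pure
 * function: the state comes in as a parameter and goes out in the
 * result, so calling it twice with the same seed gives the same pair. */
struct rnd_result rnd(uint32_t oldseed)
{
    struct rnd_result r;

    r.newseed = oldseed * 1664525u + 1013904223u;  /* reduced modulo 2^32 */
    r.ran = r.newseed >> 16;                       /* keep the high 16 bits */
    return r;
}
```

Because rnd(seed) has no side effect, an expression that mentions it twice really does use the same value twice, which is the property claimed for second(rnd(seed)) above; advancing the sequence requires the explicit seed update in the caller.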
aahz@netcom.UUCP (Dan Bernstein) (10/01/90)
Okay, I hate sounding like an ignoramus, but just WHERE do you get the ability to return tuples?
brendan@batserver.cs.uq.oz.au (Brendan Mahony) (10/01/90)
aahz@netcom.UUCP (Dan Bernstein) writes:

>Okay, I hate sounding like an ignoramus, but just WHERE do you get the
>ability to return tuples?

Not sure what you mean. Are you questioning the theoretical possibility or are you simply telling us that this facility does not exist in C? It does exist in some (functional) languages and is a simple extension to a procedural language's run-time stack conventions. If your problem is the second then I think you have lost this thread, as we are discussing the inadequacies of C and other "industrial" programming languages.

--
Brendan Mahony                 | brendan@batserver.cs.uq.oz
Department of Computer Science | heretic: someone who disagrees with you
University of Queensland       | about something neither of you knows
Australia                      | anything about.
reg@lti2.UUCP (Rick Genter x18) (10/01/90)
> OK, how do you implement the function "rnd", which returns a random > number, without letting it have side effects? This is trivial. Pass in the initial seed; rnd() must return the new seed as well as the random number (some algorithms may allow the new seed to be the random number). Let us get the discussion back to *futures*. - reg --- Rick Genter reg%lti.uucp@bu.edu Language Technology, Inc.
peter@ficc.ferranti.com (Peter da Silva) (10/01/90)
In article <5049@uqcspe.cs.uq.oz.au> brendan@batserver.cs.uq.oz.au writes:
> But seriously folks what you really want is a procedure. Note that this
> function (which makes explicit the action of the operation)
> cannot (easily) be used as an integer term, and must be accompanied with
> an update to seed for the whole thing to work properly.

Precisely.

> rnd is not an
> integer term, its purpose is not solely to define the value of an
> integer. Why then do you want to include rnd in the grammar of integer
> terms?

Why do you assume that the algebra you find useful for the basis of a programming language is the same algebra that I find useful for the basis of a programming language?

> Is it just to save a few keystrokes in the initial coding? Pretty
> silly given that this is such a small part of the software cycle. The
> idea is not conciseness but clarity!

Suppose one is implementing an algorithm taken from the literature. Would it not be clearer to use the same syntax as that in the source document? (This is beginning to sound like the famous GOTO debates. If so, GOTO alt.flame.)

Suppose one is working with a database library, then. How about the following code:

    struc3 = join(struc1, struc2, key_info);

This is the clearest way of expressing this, yet join() modifies all sorts of global state. Even if struc1 and struc2 are local, the new struc3 has to be allocated... now you're modifying the heap. If the structs are in external files, you've done lots of global operations to read and write parts of the files. Yet it is desirable to implement the database library with this sort of interface... it's clearer.

--
Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com
jlg@lanl.gov (Jim Giles) (10/03/90)
From article <5006@uqcspe.cs.uq.oz.au>, by brendan@batserver.cs.uq.oz.au (Brendan Mahony):
> jlg@lanl.gov (Jim Giles) writes:
>
>>Oh, well. If it's a question of side-effects. I oppose them outright.
>>As a practical matter, you can convince most programmers that operators
>>should not have side-effects. But, when it comes to functions, they
>>always demand to be allowed side-effects.
>
> Do they really? Why in the world do they want to make their life so
> difficult? I don't see what side-effects in functions can do that tuple
> valued expressions can't do, except make the code unreadable and
> impossible to reason about. [...]

I quite agree with what you said. I don't think that functions need to be 'tuple valued' - they just need to be able to return data of _any_ type, including user defined types. This way, if a tuple result is needed, the function would be defined as returning values of some 'record' type (said type describes the tuple).

However, tuples don't solve the function side-effect problem. There are three distinct ways a function might have a side-effect:

  1) the function may modify its arguments (which is what the tuple idea would eliminate the need for);
  2) the function may perform I/O or modify global data;
  3) the function may contain internal context which causes its return value to be dependent on the number of times it's called or the order in which it receives arguments.

Now, some people may disagree with the last two points above and refuse to use the term 'side-effect' for these features. I don't want to argue about it. From a practical standpoint, all three types of side-effects inhibit optimization in exactly the same ways. Such calls can't be reordered, they can't be eliminated (if they happen to have the same argument values), and they can't run in parallel.

In any case, the user community seems to regard the ability to have functions with side-effects as indispensable (and, at least with regard to random number generators, I tend to agree).
This means that the language designer must think about this issue very carefully. I still support allowing functions to have side-effects - but only if the nature of their side-effects is clearly described in the 'function prototype' or 'interface' or whatever you decide to call it.

J. Giles
jlg@lanl.gov (Jim Giles) (10/03/90)
From article <14197@netcom.UUCP>, by aahz@netcom.UUCP (Dan Bernstein): > Okay, I hate sounding like an ignoramus, but just WHERE do you get the > ability to return tuples? This _is_ comp.society.futures you know. You will, I hope, get the ability in future languages. Nemesis has them. By the way Dan, I still have a lot of back mail from you I need to answer. Don't assume silence is agreement! J. Giles
jlg@lanl.gov (Jim Giles) (10/03/90)
From article <151675@felix.UUCP>, by asylvain@felix.UUCP (Alvin E. Sylvain): > [...] > I don't have the actual charter in front of me, but it seems to me > that 'society' and 'futures' kinda sez it all ... and doesn't include > discussion of whether pointers in C are good, bad or indifferent. > [...] On the contrary, in a discussion about future language design, this is a quite appropriate topic. It is my contention that future languages shouldn't have pointers at all. Not just no C-like pointers, none at all. I just picked on C as the most unpleasant example of what I'm against. J. Giles
brendan@batserver.cs.uq.oz.au (Brendan Mahony) (10/03/90)
jlg@lanl.gov (Jim Giles) writes:

[a good exposition of the nature of side effects]

>In any case, the user community seems to regard the ability to have
>functions with side-effects as indispensable

It seems to me that this attitude could reasonably be likened to the antiquated European belief that regular bathing was dangerous to the health. Indeed we now believe that bathing is good for the health provided the water supply is clean and hygienic. The perceived need for side-effects in terms is merely a by-product of the poor state of language design, and would not be missed at all in better languages.

The purpose of terms is to define values. In serving this purpose they need above all to be UNAMBIGUOUS. Terms with side effects are ambiguous.

>This means that the
>language designer must think about this issue very carefully. I still
>support allowing functions to have side-effects - but only if the nature
>of their side-effects are clearly described in the 'function prototype'
>or 'interface' or whatever you decide to call it.

Actually to do this you will need to specify the local/global state being side effected by the function. Since this is required, why not "officially" include it in the interface?

>(and, at least with regard to random number generators, I tend to agree).

I have discussed random number generators elsewhere, but I do not see how using a procedure instead of a function is such a burden that we must admit ambiguous terms.

--
Brendan Mahony                 | brendan@batserver.cs.uq.oz
Department of Computer Science | heretic: someone who disagrees with you
University of Queensland       | about something neither of you knows
Australia                      | anything about.
bzs@WORLD.STD.COM (Barry Shein) (10/03/90)
>Okay, I hate sounding like an ignoramus, but just WHERE do you get the
>ability to return tuples?

You mean what language supports this feature? Common Lisp; it's in the Guy Steele book. You need all sorts of additional operators to support it, like a parallel assignment statement.

The (more than) rumor I heard was that Symbolics successfully lobbied to have multiple-value-return put into the Common Lisp standard because there was something about their hardware that made this very desirable (I think it was that the top of their return stack, the first 256 bytes, was made of very fast stuff mapped into memory.)

So the whole thing may have been a hoax (as far as any abstract motivations were concerned.) It didn't really add anything useful to the language that people hadn't been doing for decades by returning a list (it is a LISt Processor..., but that's the same basic crew that added hunks and hasharrays and all sorts of other non-lists, some useful, some sorta dumb.)

-Barry Shein

Software Tool & Die    | {xylogics,uunet}!world!bzs | bzs@world.std.com
Purveyors to the Trade | Voice: 617-739-0202        | Login: 617-739-WRLD
nevin@igloo.scum.com (Nevin Liber) (10/04/90)
[I added comp.lang.misc to the newsgroup list; please follow-up to the appropriate newsgroup only.]

In article <64618@lanl.gov> jlg@lanl.gov (Jim Giles) writes:

>It is my contention that future languages
>shouldn't have pointers at all. Not just no C-like pointers, none at
>all. I just picked on C as the most unpleasant example of what I'm
>against.

I really hate to agree with you Jim :-), but I'm beginning to think that you are right. The only real argument I can see _for_ having pointers is efficiency; more specifically, to help in hand-optimisation. Extensions to C such as C++ are showing that pointers aren't needed nearly as much as they used to be; heck, code seems to be more readable w/o them. In languages such as Icon and LISP I find that I don't even miss them.

--
NEVIN ":-)" LIBER nevin@igloo.Scum.com or ..!gargoyle!igloo!nevin (708) 831-FLYS California, here I come! Public Service Announcement: Say NO to Rugs!
nevin@igloo.scum.com (Nevin Liber) (10/04/90)
[I added comp.lang.misc to the list of newsgroups; please follow-up to the appropriate newsgroup ONLY.]

In article <5088@uqcspe.cs.uq.oz.au> brendan@batserver.cs.uq.oz.au writes:

>The perceived need for
>side-effects in terms is merely a by-product of the poor state of language
>design, and would not be missed at all in better languages.

I disagree. This would throw out all functions which maintain their own state (e.g. I/O). Heck, you might ask why we even have variables. Even the LISP community gave in to this as being a helpful programming technique.

--
NEVIN ":-)" LIBER nevin@igloo.Scum.com or ..!gargoyle!igloo!nevin (708) 831-FLYS California, here I come! Public Service Announcement: Say NO to Rugs!
jlg@lanl.gov (Jim Giles) (10/05/90)
From article <5088@uqcspe.cs.uq.oz.au>, by brendan@batserver.cs.uq.oz.au (Brendan Mahony):
> [...]
> It seems to me that this attitude could reasonably be likened to the
> antiquated European belief that regular bathing was dangerous to the
> health. Indeed we now believe that bathing is good for the health
> provided the water supply is clean and hygienic. [...]

A very good analogy, and one which is also directly applicable to the issue of pointers in languages.

> [...] The perceived need for
> side-effects in terms is merely a by-product of the poor state of language
> design, and would not be missed at all in better languages.

This, however, is not clear. Mathematical notation for random variates existed before computing languages. The variates were always of the same form as ordinary variables. In fact, that is what I think the use of random generators should look like. What you are suggesting is that the programmer should alter his notation to suit the language, not the other way around. I think that it is the language that should cater to the desires of the programmer.

> [...]
> The purpose of terms is to define values. In serving this purpose they
> need above all to be UNAMBIGUOUS. Terms with side effects are ambiguous.

No, they aren't ambiguous (necessarily). As long as the order of evaluation of such terms is predictable and the nature of the side-effect of each is known, their use is completely unambiguous. Now, in languages like C (where the order of evaluation is not specified and the nature of the side-effects needn't be declared), the feature is indeed quite ambiguous.

If all side-effects that a function might cause are clearly defined in the function interface, the compiler can then generate code so that the side-effects are evaluated in a fixed order with respect to other functions (or local assignments) which have side-effects on the same objects.
This is a completely unambiguous solution - and functions that _don't_ have side-effects can still be optimized fully.

> [...]
> Actually to do this you will need to specify the local/global state
> being side effected by the function. Since this is required why not
> "officially" include it in the interface?

Indeed. The interface should contain a complete list of global variables that it modifies, the specific arguments that it modifies should be identified, and the function declaration should specify if it has any internal, time-dependent, state.

> [...]
>>(and, at least with regard to random number generators, I tend to agree).
>
> I have discussed random number generators elsewhere, but I do not see
> how using a procedure instead of a function is such a burden that we
> must admit ambiguous terms.

One of the problems with language design is that most designers are not in daily touch with their potential user base. How much of a burden the prohibition of side-effects in functions would pose is not for the language designer to say. I know mathematical/scientific users who would deem it a considerable burden indeed.

Note: the requirement of explicit declarations of all side-effects, especially if backed up by the loader which can check that such assertions are true, would actually decrease the use of side-effects on its own. After all, who would go to such trouble unless it was outweighed by the trouble of avoiding it? This, indeed, would be a good empirical test of your claim that side-effect free functions aren't a burden - how much trouble are the users willing to go to in order to still have them?

J. Giles
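[Editorial sketch] The idea of listing a function's side-effects in its interface can be imitated in Python with a decorator that attaches a declaration to each function. Everything here is hypothetical: `Effects`, `declares` and `may_merge_duplicate_calls` are invented names, the generator constants are arbitrary, and nothing checks that a declaration is truthful (the post assigns that job to the compiler or loader).

```python
from dataclasses import dataclass
from typing import Callable, Tuple


@dataclass(frozen=True)
class Effects:
    """A declared interface: which globals and arguments a function modifies."""
    modifies_globals: Tuple[str, ...] = ()
    modifies_args: Tuple[str, ...] = ()
    fickle: bool = False  # has internal, time-dependent state

    @property
    def is_pure(self) -> bool:
        return not (self.modifies_globals or self.modifies_args or self.fickle)


def declares(effects: Effects) -> Callable:
    """Attach a declaration to a function; this sketch does not verify it."""
    def wrap(fn):
        fn.effects = effects
        return fn
    return wrap


@declares(Effects())
def square(x):
    return x * x


_state = {"seed": 1}


@declares(Effects(modifies_globals=("_state",), fickle=True))
def ranf():
    # Assumed linear congruential step; the thread does not specify a generator.
    _state["seed"] = (1103515245 * _state["seed"] + 12345) % 2**31
    return _state["seed"] / 2**31


def may_merge_duplicate_calls(fn) -> bool:
    """Two identical calls may be evaluated once only if the function is pure."""
    return fn.effects.is_pure
```

A translator consulting such declarations could keep calls to ranf in a fixed order while still merging or reordering calls to square, which is the split the post argues for.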
jbickers@templar.actrix.co.nz (John Bickers) (10/05/90)
Quoted from - jlg@lanl.gov (Jim Giles):
> On the contrary, in a discussion about future language design, this is
> a quite appropriate topic. It is my contention that future languages
> shouldn't have pointers at all. Not just no C-like pointers, none at

Perhaps future languages should control pointers in new and more fascinating ways, rather than do away with them altogether. This seems similar to the argument about goto, except with less basis in reality, and goto is still with us.

Look for ways to improve on what seems to be deficient, rather than ban it altogether. Since the usefulness of pointers seems to be a matter of judgement (I intensely dislike the ideas of Pascal or BASIC string constructs), it'd probably be more useful to look at improving "lint"s, having separate languages for separate applications, and so on.

> J. Giles

--
*** John Bickers, TAP, NZAmigaUG. jbickers@templar.actrix.co.nz ***
*** "All I can do now is wait for the noise." - Numan ***
brendan@batserver.cs.uq.oz.au (Brendan Mahony) (10/05/90)
Me:
-> [...] The perceived need for
-> side-effects in terms is merely a by-product of the poor state of language
-> design, and would not be missed at all in better languages.

jlg@lanl.gov (Jim Giles) writes:
>This, however, is not clear. Mathematical notation for random variates
>existed before computing languages. The variates were always of the same
>form as ordinary variables.

I can't remember my probability theory too well. Is it true that no distinction is made between integer variables and integer random variables? I have a vague feeling that random variables represented sequences of values; you could talk about the average of a random variable and that sort of thing. Perhaps what is required is a more sophisticated data structure? How about a non-deterministic choose operator? Use a special notation in the special case; don't force people to have to worry about non-determinism all the time! As a possible suggestion, declare a random variable

    r : seq of (1..10) | "normally distributed"

Now when you want a random number, use "choose(r)".

>In fact, that is what I think the use of
>random generators should look like. What you are suggesting is that
>the programmer should alter his notation to suit the language, not
>the other way around. I think that it is the language that should
>cater to the desires of the programmer.

Look, the programmer is not the only person who has to cope with the code that is written. The programmer may well be the person who spends the least amount of time trying to understand the stuff. The idea should be to produce code that is easily comprehensible, rather than easily written. Included in that criterion should be the ability to easily reason about the behaviour of the code. I would argue against global variables in both procedures and functions on the grounds of comprehensibility. Procedures have the mediating factor that their syntactic intent is to change program state.
Side effects in terms deny the syntactic intent of terms, which is to define a value.

The rest of your article gives a reasonable way of formalising the action of side effects. The general gist of it seems to be that to understand the "meaning" of a term with side effects you must break its evaluation down to a set of state changes, and determine the sequence of these actions. If this activity is required to make the code readable, it should be reflected in the code.

--
Brendan Mahony                 | brendan@batserver.cs.uq.oz
Department of Computer Science | heretic: someone who disagrees with you
University of Queensland       | about something neither of you knows
Australia                      | anything about.
jlg@lanl.gov (Jim Giles) (10/06/90)
From article <5116@uqcspe.cs.uq.oz.au>, by brendan@batserver.cs.uq.oz.au (Brendan Mahony):
> [...]
> Look the programmer is not the only person who has to cope with
> the code that is written. The programmer may well be the person
> who spends the least amount of time trying to understand the
> stuff. The idea should be to produce code that is easily
> comprehensible, rather than easily written. [...]

All the more reason to use the conventional terminology and notation rather than force the user to conform to some purist's idea of what should be allowed by a programming language. My experience talking to large-scale users of such features is that they would be quite willing to spend considerable effort in the declaration of a random generator in order that the _use_ of the thing retain conventional properties. For example, say I want a triangular probability distribution. The following two codes are examples of your style and mine:

Yours:
    qran(z,seed)
    qran(x,seed)
    tri_dist = z-x

Mine:
    tri_dist = ranf - ranf
or:
    tri_dist = ranf() - ranf()

Note, my experience is that the second of my forms (with the explicit denotation that ranf is a function call) is quite acceptable to users while your form is usually not.

> [...] Included in that
> criteria should be the ability to easily reason about the
> behaviour of the code. [...]

Given the rules of the language and a clear declaration of the fact that ranf() has side-effects, the forms I gave are susceptible to reasoning _identically_ well compared to your proposed form. Reasoning about programs is impossible without knowledge of the language's rules - but it should be equally possible in any two well defined languages.

> [...] I would argue against global variables in
> both procedures and functions on the grounds of
> comprehensibility. [...]

And I would argue in favor of global variables for the same reason. I find procedures with large and complicated calling sequences to be quite incomprehensible.
Further, having to pass a data structure around through the calling sequence because it represents information which is shared by low-level routines I find appalling.

For example, a simulator of a helicopter might have three routines (all deep in the call chain) which need the data structure describing the tail rotor: the power routine needs to know the state of the rotor to compute the power required to drive it, the structure routine needs to know the stresses the tail rotor is producing, and the aerodynamics routine needs to know how much torque the rotor is imparting into the air. Clearly, these functionalities are completely separate in the simulation - so you don't want to combine the three routines into one multi-purpose routine. But, you also don't want to have to force the rest of the program to carry around the tail rotor data which you don't want anything except the three low-level routines to be able to change or examine.

The problem is, your data is 'helicopter shaped' but your program's procedure call chain is roughly tree shaped. By depriving the programmer of the use of global data, you deprive him of the ability to partition his data into manageable small pieces which are imported only by those routines which actually use them.

Now, of course, global data can be misused. I have seen some programs which deliberately import _all_ global variables into every routine. This means that you have no means of determining where a given data item might be used or changed. However, carefully used, global data can improve the comprehensibility of programs by isolating the data to those routines which actually need it and guaranteeing that all other routines will keep their electronic hands off.

> [...] Procedures have the mediating factor that
> their syntactic intent is to change program state. Side effects
> in terms deny the syntactic intent of terms, which is to define
> a value.

I agree that this constraint makes the analysis of expressions much simpler.
This is why I advocate explicit declaration of all side-effects that a function may produce - so that side-effect free expressions (the majority) can be analysed in this simple way. Procedures which _have_ side-effects may make the program easier to analyse in other ways and at other levels than the expression level. I think that the user should be the one to decide which is most important to him.

> [...] to understand the "meaning" of a term with side effects you
> must break its evaluation down to a set of state changes, and
> determine the sequence of this actions. If this activity is
> required to make the code readable it should be reflected in the
> code.

Exactly. But this usually need not be such a burden as you seem to think. For random number generators, for example, all that's needed is an attribute on the interface specification to the effect that it has side-effects. (The language I'm designing presently has the rather fanciful term 'fickle' for this property: a random number generator is a fickle function. Before we actually release the language to outside users we will probably switch to some more dignified or techie type of word. I don't know though - look through your thesaurus some time to see if you can find a better word - we couldn't.) In any case, a 'fickle' function must be regarded as having some internal state that causes its value to be different from call to call - even if the same arguments are sent (or no arguments at all in the case of random number functions).

By the way, with your attitude toward side-effects, you must dislike C even more than I do. I thought I was the most anti-C person on the net. Maybe not.

J. Giles
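[Editorial sketch] The two notations contrasted in this post can be put side by side in Python. This is an illustration under assumptions: the thread specifies no algorithm for qran or ranf, so the linear congruential step below is arbitrary, and `Ranf` is a made-up class standing in for a 'fickle' function. Python evaluates operands left to right, which plays the role of the fixed evaluation order the post asks for.

```python
from typing import Tuple


def qran(seed: int) -> Tuple[float, int]:
    """Explicit style: return (value, newseed), with no hidden state."""
    newseed = (1103515245 * seed + 12345) % 2**31  # assumed generator step
    return newseed / 2**31, newseed


def tri_explicit(seed: int) -> Tuple[float, int]:
    """Triangular variate with the seed threaded through by hand."""
    z, seed = qran(seed)
    x, seed = qran(seed)
    return z - x, seed


class Ranf:
    """'Fickle' style: the generator keeps its own state between calls."""

    def __init__(self, seed: int) -> None:
        self._seed = seed

    def __call__(self) -> float:
        value, self._seed = qran(self._seed)
        return value


ranf = Ranf(seed=7)
tri_fickle = ranf() - ranf()  # left operand is evaluated first in Python

tri_pure, next_seed = tri_explicit(7)
```

With the evaluation order fixed, both forms compute the same number from the same starting seed; what differs is whether the state change shows up at the call site.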
peter@ficc.ferranti.com (Peter da Silva) (10/08/90)
In article <2883@igloo.scum.com> nevin@igloo.UUCP (Nevin Liber) writes: > pointers aren't needed nearly as much as they use to be; heck, code > seems to be more readable w/o them. In languages such as Icon and > LISP I find that I don't even miss them. Last time I checked the primary data objects in Lisp were the symbol and the pointer. (oh sure, a DOTPR is a constrained pointer (well, pointer pair)... but when it can in principle point to any data or code object it's just as dangerous as pointers in C. What makes it safe is the limited types of the objects it can point to: other pointers or symbols. It can't point to the second part of a DOTPR, or into a primitive, or the middle of a symbol). -- Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com
dmiller@YODA.EECS.WSU.EDU (10/11/90)
Just a minor clarification, you should probably be saying "no programmer- visible pointers" rather than "no pointers." An unqualified "no pointers" can be read to imply that a particular implementation would not use pointers. This would, however, be dependent upon the underlying architecture of the target machine and other implementation specific considerations. --DLM ********************************************************************** David L. Miller Internet: millerd@prime.tricity.wsu.edu (?) Systems Analyst or dmiller@yoda.eecs.wsu.edu WSU Tri-Cities Bitnet: MILLERD@WSUVM1 100 Sprout Rd. Voice: (509) 375-9245 or 375-3176 Richland, WA 99352 >>>>>>>>>>> Support the FSF <<<<<<<<<<<
jlg@lanl.gov (Jim Giles) (10/11/90)
From article <9010102226.AA16028@yoda.eecs.wsu.edu>, by dmiller@YODA.EECS.WSU.EDU: > Just a minor clarification, you should probably be saying "no programmer- > visible pointers" rather than "no pointers." An unqualified "no pointers" > can be read to imply that a particular implementation would not use pointers. I have been saying that all along. The word "pointer" or the phrase "explicit pointer" means "a variable whose _value_ is an address". It is my contention that no high-level language should have or should need such a data type. What the implementation does internally is the compiler writer's decision. I assume that the people who champion GOTO free languages don't object to the compiler generating jump instructions internally to support IFs and SELECT/CASE, etc.. J. Giles
asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) (10/13/90)
In article <2884@igloo.scum.com> nevin@igloo.UUCP (Nevin Liber) writes:
::[I added comp.lang.misc to the list of newsgroups; please follow-up to
::the appropriate newsgroup ONLY.]

Which is the appropriate newsgroup? If comp.lang.misc is not appropriate, why'd you add it? Why do I get the feeling that no one in comp.society.futures was or is or will be interested? Anywho ...

::In article <5088@uqcspe.cs.uq.oz.au> brendan@batserver.cs.uq.oz.au writes:
::
::>The perceived need for
::>side-effects in terms is merely a by-product of the poor state of language
::>design, and would not be missed at all in better languages.
::
::I disagree. This would throw out all functions which maintain their
::own state (eg: i/o). [...]

Nope. C (at least) allows for variables to be 'static'. No need for side-effects to maintain the function's internal state.

Unfortunately, Pascal doesn't suffer from this convenience. This causes a programmer to load up more and more junk into the 'program'-level 'var' declaration, until it's almost as hard to debug as FORTRAN COMMON statements. Heaven help you if two or more subroutines are both using a global 'IDX' in different contexts. There, your argument about side-effects probably holds.

Any post-modern (after Pascal) language allows the declaration of a variable which is local to the routine, but doesn't change between invocations. Side-effects are not necessary for state-maintenance. Period.

--
=============Opinions are Mine, typos belong to /bin/ucb/vi=============
"We're sorry, but the reality you have dialed is no | Alvin
longer in service. Please check the value of pi,    | "the Chipmunk"
or pray to your local deity for assistance."        | Sylvain
= = = = = =I haven't the smoggiest notion what my address is!= = = = = =
peter@ficc.ferranti.com (Peter da Silva) (10/20/90)
In article <152323@felix.UUCP> asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) writes:
> Nope. C (at least) allows for variables to be 'static'. No need
> for side-effects to maintain the function's internal state.

Those *are* side-effects, since they mean the same function may return different values on successive calls with the same calling sequence. This has the same effects on predictability and optimisation as more obvious side effects.

--
Peter da Silva. `-_-' +1 713 274 5180. 'U` peter@ferranti.com
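[Editorial sketch] The point made in this reply can be seen in a few lines of Python. Python has no 'static' locals, so a closure is used here as a rough analogue of a C function with a static counter; `make_next_id` and `next_id` are invented names.

```python
import itertools


def make_next_id():
    """Build a function whose state persists between calls, hidden from callers."""
    counter = itertools.count(1)  # plays the role of a C 'static' variable

    def next_id() -> int:
        return next(counter)

    return next_id


next_id = make_next_id()

first = next_id()
second = next_id()  # same calling sequence, different value
```

Nothing outside next_id is modified, yet the two calls cannot be merged into one or swapped without changing the program's result.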
cik@l.cc.purdue.edu (Herman Rubin) (10/23/90)
In article <5EJ64J3@xds13.ferranti.com>, peter@ficc.ferranti.com (Peter da Silva) writes:
> In article <152323@felix.UUCP> asylvain@felix.UUCP (Alvin "the Chipmunk" Sylvain) writes:
> > Nope. C (at least) allows for variables to be 'static'. No need
> > for side-effects to maintain the function's internal state.
>
> Those *are* side-effects, since they mean the same function may return
> different values on successive calls with the same calling sequence. This
> has the same effects on predictability and optimisation as more obvious side
> effects.

I have yet to see a random number, or pseudo-random number, procedure which did not exploit this. The same is true for uses of buffers, reading external media, etc. It is also the case when one makes calls by reference, and uses code to change the values of the arguments. This even applies to a subroutine to multiply two matrices.

This means it is the programmer who must decide, and pass on the information to the compiler, about the side-effects. Sometimes, but not always, the compiler can tell by looking at the global code.

--
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet) {purdue,pur-ee}!l.cc!cik(UUCP)