jcallen@Encore.COM (Jerry Callen) (08/28/90)
I've run across a curious anomaly with the value returned by a function; if the function value is returned by reference, an alias can be created. The following program illustrates the problem. NOTE: I will not attempt to justify the following code; it's boiled down from some code supplied by a user. Surely _I_ would never write such bizarre code. :-) ----------------------------------------------------------------- with report; use report; procedure alias1 is package p1 is ELEMENTS: constant := 100; subtype range_t is integer range 1..ELEMENTS; type array_t is array(range_t) of boolean; function make_set (which: range_t) return array_t; function transform (a, b: array_t) return array_t; end p1; use p1; v1, v2, v3: array_t; package body p1 is global: array_t; null_array: constant array_t := (others => FALSE); function make_set (which: range_t) return array_t is begin global := null_array; global(which) := TRUE; return global; -- may create an alias for "global" end make_set; function transform (a, b: array_t) return array_t is begin -- if the value returned by make_set is used as a -- parameter to this function, AND the compiler -- created an alias, the next statement will tromp -- on the parameter. global := null_array; global := (a xor b) and a; return global; end transform; end p1; begin test ("alias1", "see if an alias is created for function return"); v1 := (others => TRUE); v2 := make_set (20); v3 := transform (v1, v2); if v3(20) then failed ("base case failed - compiler is TOTALLY broke"); end if; v3 := transform (v1, make_set (20)); -- interesting call if v3(20) then failed ("compiler used 'return by value' and created an alias"); end if; result; end alias1; ----------------------------------------------------------------------- This is obviously a pretty contrived test case, but it should get the point across. Is a compiler permitted to return objects "by reference" (with the possibility of creating aliases) or must it create a copy of the return value? I can't find anything in the LRM on this point, although there is some gobble-de-gook in 6.2.13 that might be relevent. Aside: this program is also dependent upon the mechanism used to pass parameters; to fail, "return by reference" and "call by reference" must BOTH be used. I would LIKE to think that objects may be returned by reference; otherwise potentially large objects would have to be copied. I would call the program above erroneous, since it is dependent upon the mechanism used for parameter passing and value return. -- Jerry Callen jcallen@encore.com
davidr@inmet.inmet.com (08/29/90)
It's a compiler bug, but a useful one, and there are no ACVC's which test for it; so it's probably best to assume that the program is erroneous. I couldn't find an obvious language rule that proves it's a compiler bug, but 5.8:5 states that a "value" is returned by a return statement, and the note in 5.2.1 implies that values are always independent of each other. David Rosenfeld Intermetrics Inc. -- davidr@inmet.com
NCOHEN@IBM.COM ("Norman H. Cohen") (08/30/90)
Jerry Callen presents a program that fails if the statement return Global; is implemented by returning a REFERENCE to Global, rather than a COPY of Global, to the caller. He then writes: >I would LIKE to think that objects may be returned by reference; >otherwise potentially large objects would have to be copied. >I would call the program above erroneous, since it is dependent upon >the mechanism used for parameter passing and value return. David Rosenfeld, referring to a compiler that "returns by reference," responds: >It's a compiler bug, but a useful one, and there are no ACVC's which >test for it; so it's probably best to assume that the program >is erroneous. Hold on there! It is not up to the programmer or the implementer to decide what will be considered erroneous. Dependence on the PARAMETER- passing mechanism is erroneous because the reference manual explicitly states (6.2(7)): "For a parameter whose type is an array, record, or task type, an implementation may likewise achieve the above effects by copy, as for scalar types.... Alternatively, an implementation may achieve these effects by reference.... The execution of a program is erroneous if its effect depends on which mechanism is selected by the implementation." No such choice is given for passing function results back to the caller. Indeed, RM 5.8(6) specifically states, "For the execution of a return statement, the expression (if any) is first evaluated," and 5.8(5) states, "The value of the expression determines the result returned by the function." The offenses constituting erroneous execution are specifically enumerated in the RM (with a few more added by AIs). Dependence on the result-passing mechanism is not such an offense. Switching my perspective from language legalities to implementation practicalities, the implementation approach suggested by Callen will very rarely be useful. Most of the time, the expression in a return statement does not name a global variable, but a variable local to the function body, or some expression such as a catenation or aggregate. A reference to a local variable is a pointer into the stack frame that is about to disappear. Indeed, in a context like if Array_Func_1(X) = Array_Func_2(X) then ... the stack frame will likely be overwritten by the second function call before the value of the first function call is dereferenced. When the function result subtype is CONSTRAINED, the caller can determine an appropriate destination for the function result and pass a pointer to that destination into the function body. The return statement is then implemented by copying the result value into the specified destination. In a context like A := Array_Func(X); the caller can pass the address of A, so that both the return and the assignment statement are achieved with one copy of the array value. In a context like Proc(Array_Func(X)); the caller can pass an address in the stack frame it is setting up for the call on Proc, or the address of a temporary whose address will be placed in that stack frame. Usually, the only useful context for a call on a function with an UNCONSTRAINED array result is as a parameter to some outer subprogram call--as in Text_IO.Put(Text_IO.Name(Input_File)); --or as the initial value of an allocator--as in String_Pointer := new String'(Text_IO.Name(Input_File)); --or as PART of the initial value for an allocator for some record type with discriminants--as in String_List := new List_Node_Type' (Size => Name_Length, Text => Text_IO.Name(Input_File), Link => String_List); In the first case, the implementation can allocate temporary space at the other end of storage, construct the result value there (entailing a copy of each array component value), and return a pointer into that space for the caller to use in passing the array by reference to the outer subprogram call. (It is essential that the caller free the allocated space at the end of the outer call, to prevent storage leaks.) In the second case, the implementation can proceed as above, but allocating space for the function result directly in the collection for the corresponding access type, so that a pointer to the allocated space can be used directly as the value of the allocator. The third case is more complicated. The most reasonable approach is probably for the function to yield a pointer as in the first case and for evaluation of the record aggregate to perform the check that the array pointed to satisfies the index constraint for the corresponding array component, then copy the array yet again, then free the temporary copy constructed by the function. The extra copy can be avoided by passing the address and index constraint of the record component, which is allocated on the basis of discriminant values before the function is called, and proceeding as in the constrained case; however, potential benefits may be outweighed by the cost and complexity of adding an extra implicit parameter to functions with unconstrained array-type results, indicating which return method to use for a given call. Callen bemoans the fact that potentially large objects have to be copied, but returning an array as a function result is inherently a potentially expensive operation, especially if the result subtype is unconstrained. Fortunately, clever compilation can keep the cost reasonable in most cases, while preserving the semantics required by the reference manual. Norman H. Cohen
sampson@cod.NOSC.MIL (Charles H. Sampson) (08/31/90)
In article <12594@encore.Encore.COM> jcallen@encore.com (Jerry Callen) writes: >I've run across a curious anomaly with the value returned by a function; >if the function value is returned by reference, an alias can be created. > > [Example showing one of the usual aliasing problems deleted.] > > ... Is a compiler permitted to return objects "by reference" >(with the possibility of creating aliases) or must it create a copy >of the return value? I can't find anything in the LRM on this point, >although there is some gobble-de-gook in 6.2.13 that might be relevent. I don't remember this problem being explicitly covered in the LRM, I can't think of any place to look, the ARG appears never to have dis- cussed it, and 6.2(13) doesn't apply because the subject there is sub- program parameters and a function's value is not one of its parameters. However, there's no need for an explicit statement in the LRM. The implementation is wrong because it violates the semantics of a function invocation. Each invocation of a function determines an associated value and that value cannot mysteriously change after the invocation has com- pleted. Another respondent has said this might be a useful compiler error, presumably on the basis of efficiency of the implementation. I don't want my function values to change after they've been calculated, no matter how efficient the code is. It should just be accepted that func- tions that return large structures are not going to do so with the same kind of code as a Boolean-value function, but this "inefficiency" might well be offset by other considerations, for example the avoidance of un- pleasant side-effects. (Does anybody else remember the quote from The Psychology of Computer Programming: "If you're not interested in correct answers, I can write a very fast program for you.", or words to that effect?) Charlie Sampson
eachus@linus.mitre.org (Robert I. Eachus) (09/08/90)
I'll bring this up informally at the URG adn ARG meetings next week, but it looks like a confirmation...the return statement returns the value of an expression, evaluated before the return. -- Robert I. Eachus with STANDARD_DISCLAIMER; use STANDARD_DISCLAIMER; function MESSAGE (TEXT: in CLEVER_IDEAS) return BETTER_IDEAS is...
mfeldman@seas.gwu.edu (Michael Feldman) (09/08/90)
In article <EACHUS.90Sep7150018@aries.linus.mitre.org> eachus@linus.mitre.org (Robert I. Eachus) writes: > > I'll bring this up informally at the URG adn ARG meetings next >week, but it looks like a confirmation...the return statement returns >the value of an expression, evaluated before the return. > Let's talk about abstraction vs. implementation here. Does this mean that a value that happens to be large - say a large object of an unconstrained array type - MUST be copied twice, i.e. once to return it to the caller and again to store it in its final resting place? Or can an implementer store it in heap space, for example, returning its address (or a dope vector only) to the caller? Since some implementers build all unconstrained array types this way, by allocating only a dope vector and acquiring object space dynamically, it would seem that returning function values in the same way makes sense, right? Then the "value" of the expression is the dope vector which points to the actual data in the heap, instead of the data itself. Since the LRM generally leaves storage management issues to the implementer, especially for arrays, there seems to be no violation here. Am I right? --------------------------------------------------------------------------- Prof. Michael Feldman Department of Electrical Engineering and Computer Science The George Washington University Washington, DC 20052 202-994-5253 mfeldman@seas.gwu.edu ---------------------------------------------------------------------------