dl@cmx.npac.syr.edu (Doug Lea) (08/08/89)
This is an argument for the `official' adoption in C++ of the notion
of named return values, as have been present in GNU g++ for the last
few versions.
Named return values are a g++ extension that gives a name to the
object returned by a function, in a way vaguely similar to named
return `variables' in Pascal and other languages, but with some
important C++-specific benefits.
To allow for some examples and discussion, first declare a
very simple vector-like class X:
static const int Xsize=100000; // constant size, just for simplicity
class X
{
int p[Xsize]; // Xs are BIG objects
void copy(const X& a) { for (int i=0; i<Xsize; ++i) p[i] = a.p[i]; }
void fill(int v) { for (int i=0; i<Xsize; ++i) p[i] = v; }
public:
X() {}
X(const X& a) { copy(a); }
X(int v) { fill(v); }
void operator=(const X& a) { copy(a); }
int get(int i) const { return p[i]; }
void set(int i, int x) { p[i] = x; }
};
and an X-returning function, iota, first in standard C++ form,
X iota() // return an X with all p[i] = i
{
X y; // create a local X
for (int i=0; i<Xsize; ++i) // fill it up
y.set(i, i);
return y; // return a copy via X(X&)
}
and then using named return values,
X iota_nrv() return x // x is the name of the X we are returning
{
for (int i=0; i<Xsize; ++i) // fill it up & get out
x.set(i, i);
}
First off, the rules associated with named return values are:
1) The named object is the one received by the caller upon
function return.
2) Any kind of constructor, perhaps based on runtime function
arguments, may be specified for the return value. Thus
something like X f(int i) return x(i) {...} is legal.
3) The constructor for the named return value is executed
before any statements in the function body.
4) A plain `return' (or returning by falling off the end) is the
only legal way to return from a function declaring a named return
value. Use of `return <expression>' is illegal.
(Note: (4) is open for negotiation. It is arguable that a
return <expression> should be legal, meaning to perform
re-initialization or assignment to the already-initialized
return value. However, this seems inelegant, error-prone,
and substantially less simple than just disallowing it.)
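To make these rules concrete, here is a small sketch in the g++
extension syntax (the function and its body are illustrative, not part
of the proposal itself):

X partial_iota(int i) return x(i) // rule 2: x is constructed via X(int)
{                                 // rule 3: x(i) has already run by this point
  if (i < 0)
    return;                       // rule 4: plain `return' only, never `return <expr>'
  x.set(0, 0);
}                                 // falling off the end also returns x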
The most practical reason for desiring named return values is
efficiency. Named return values allow one to write a function
returning some large object without also first allocating space for
and creating a local version that is to be somehow built-up solely for
the sake of returning via a copy (i.e., using X(X&)).
In the example, function iota declares local X y, which is allocated,
constructed, modified, and then returned by value (via X(X&)) to
the caller, whereas function iota_nrv directly creates and modifies
the object being sent back to the caller, saving a time-consuming
X(X&) constructor, and also saving the space otherwise needed to
hold two Xs (during X(X&) construction), rather than just one.
Thus, function iota_nrv is about twice as fast as function iota. The
difference can, of course, be made arbitrarily dramatic by increasing
Xsize.
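One way to observe the cost directly is to count X(X&) invocations (a
minimal sketch; CountedX, x_copies, and the smaller CXsize are
illustrative names, not part of the original classes):

static int x_copies = 0;        // bumped by the counting copy constructor
static const int CXsize = 1000; // smaller than Xsize, so the demo runs anywhere

class CountedX
{
  int p[CXsize];
public:
  CountedX() {}
  CountedX(const CountedX& a)
    { ++x_copies; for (int i=0; i<CXsize; ++i) p[i] = a.p[i]; }
  void set(int i, int v) { p[i] = v; }
};

CountedX iota_counted() // same shape as iota above
{
  CountedX y;
  for (int i=0; i<CXsize; ++i) y.set(i, i);
  return y;             // the counted copy, if any, happens here
}

Calling iota_counted() and printing x_copies shows 1 on a compiler
that performs the return copy and 0 on one that elides it; an
iota_nrv-style function never incurs it at all.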
The best counterargument against this efficiency claim is that a
*very* smart compiler could translate function iota into something
like function iota_nrv all by itself. While conceivably true, in
cases of arbitrarily complicated nesting of arbitrarily complicated
functions, eliminating all such useless copying would sometimes
require interprocedural data flow analyses stretching the capabilities
of any optimizing compiler for any language I know. Thus, relying on
automated detection seems unrealistic, especially considering that
this effect can be had in such a simple and natural way with named
return values.
The second reason is in terms of C++ design goals and semantics,
at least as I perceive them.
[Please forgive the grandiosity of this apparent digression!]
In a language like C++, which supports, to varying extents,
object-oriented, procedural, and functional programming styles, one
can think of objects (and their member variables and methods) as being
central, and procedures and functions as being constructs that use and
extend object capabilities in specific ways.
To illustrate, assume you need procedure setv, a special-purpose
routine that modifies some chunk of an X.
void setv(X& x, int from, int to, int val)
{
for (int i = from; i <= to; ++i) x.set(i, val);
}
But there is often no reason that setv *must* exist as a stand-alone
procedure. One could obtain the same kind of utility, albeit more
awkwardly, by staying within a pure object-oriented framework and
creating a new subclass that incorporates setv:
class X_with_setv : public X
{
public:
X_with_setv() :X() {}
X_with_setv(int v) :X(v) {}
void setv(int from, int to, int val)
{ for (int i = from; i <= to; ++i) set(i, val); }
};
(The constructors are needed to ensure everything inherits transparently.)
In some `pure' object-oriented languages, you might have to do
something like this, but in C++, you can just write setv as a
stand-alone procedure (as before) without all of this fuss, and
with various other advantages.
Enough warm-up. Now consider how one would similarly create a subclass
incorporating function iota if one had to:
class Xiota : public X
{
public:
Xiota() :X() { for (int i=0; i<Xsize; ++i) set(i,i); }
};
The most defensible way to encapsulate iota is as a constructor, since
it constructs a new object.
The Xiota::Xiota() constructor is suspiciously similar in form to the
iota_nrv function. This is because FUNCTIONS ARE CONSTRUCTORS in C++.
Since C++ allows non-object-based functions, one can think of such
functions as ways of extending constructors, in exactly the same sense
that procedures extend methods (as with setv). Both are very
useful in cases where providing such extensions does not merit the
creation of new subclasses, or where doing so runs into logistical
problems.
Even *member* functions that return new objects of the current class
(as is often the case with overloaded operators) fall into this category.
In a sense, a constructive operator like, say, unary negation:
class X // as above
{
//...
public:
X operator - () return x { for (int i=0; i<Xsize; ++i) x.p[i]= -p[i]; }
//...
};
is just syntactic sugar for either a new constructor,
class X // as above
{
//...
public:
enum NEGOP { NEG };
X(const X& a, NEGOP) { for (int i=0; i<Xsize; ++i) p[i] = -a.p[i]; }
//
};
or even a new subclass,
class XNeg : public X
{
public:
XNeg(const X& a) :X() { for (int i=0; i<Xsize; ++i) set(i, -a.get(i)); }
};
If you accept the argument that functions are extensions and/or
disguised forms of constructors, then it seems only natural that the
use of named return values should nicely parallel existing constructor
syntax/semantics, substituting `return <id><opt-constr>' for
`:<type><constr>'. (Compare rules (1-4) above with similar rules for
constructors).
If it weren't for the probable error-proneness resulting from the
introduction of a different syntactic form for `:' (i.e., `:' would be
followed by a declaration in functions, but by base initializers in
constructors), using the colon instead of the return keyword for named
return values, as in `X f() :x { ... }', might be a more logical
choice. But the `return' syntax seems natural enough, and has some
precedent in other languages.
The only disadvantage I know of is that there is no programmer-defined
control over when the constructor for the named return value is
called. It must be called before any function body statements. But
even this has a compensating advantage: It is impossible to write a
function using named return values that `forgets' to return anything
at all on one or more of its execution paths. (Although, of course, it
is still possible to mistakenly write one that doesn't return what you
had in mind!)
There are still plenty of cases in which the old function syntax is
useful, often for logistical reasons:
X f(int v) // no way to generate named ret val via a single expression
{
extern void g(int&);
g(v);
return X(v);
}
This example was also designed to show a construct, `return X(...)'
(rather than a return of some local) that is compiled (probably by
all C++ compilers) to be just as efficient as named return values since
the constructor is in a tail call. This is among the very few cases
where this is true. Its use is especially encouraged for those using
C++ compilers lacking named return values.
Support? Disagreement? Comments?
Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl
jima@hplsla.HP.COM (Jim Adcock) (08/09/89)
> X iota() // return an X with all p[i] = i
> {
> X y; // create a local X
> for (int i=0; i<Xsize; ++i) // fill it up
> y.set(i, i);
> return y; // return a copy via X(X&)
> }
>
> ...
>
> The best counterargument against this efficiency claim is that a
> *very* smart compiler could translate function iota into something
> like function iota_nrv all by itself. While conceivably true, in
> cases of arbitrarily complicated nesting of arbitrarily complicated
> functions, eliminating all such useless copying would sometimes
> require interprocedural data flow analyses stretching the capabilities
> of any optimizing compiler for any language I know. Thus, relying on
> automated detection seems unrealistic, especially considering that
> this effect can be had in such a simple and natural way with named
> return values.

It doesn't seem to me like it takes a very smart compiler to be able
to figure out that iota uses y as a dummy return variable. But these
simple cases are precisely where one needs compilers to be a little
smarter. Long and complex functions can afford the cost of the extra
copy. Little functions can't afford the cost -- but should be easy to
optimize.

Let's get C++ compilers that can truly handle the uniquely C++
features of the language -- initialization, assignment, return values,
array initialization, registering small structures, efficient this
passing, etc. These issues are *not* the same as in C compilers -- C++
makes very different demands on a compiler's code generating
capabilities. Let's ask C++ compiler writers to design their compilers
to meet the state of the art for the 1990's. Let's not bollix up the
language with a lot more tired anachronisms like "register."
rjc@maui.cs.ucla.edu (Robert Collins) (08/09/89)
In article <1826@cmx.npac.syr.edu> dl@oswego.edu (Doug Lea) writes:
>
>This is an argument for the `official' adoption in C++ of the notion
>of named return values, as have been present in GNU g++ for the last
>few versions.
>
>Named return values are a g++ extension that gives a name to the
>object returned by a function, in a way vaguely similar to named
>return `variables' in Pascal and other languages, but with some
>important C++-specific benefits.
>
[ ... ]
>
>The most practical reason for desiring named return values is
>efficiency. Named return values allow one to write a function
>returning some large object without also first allocating space for
>and creating a local version that is to be somehow built-up solely for
>the sake of returning via a copy (i.e., using X(X&)).
>
[ ... ]
>
>Support? Disagreement? Comments?

I think named return values are a great idea. I have been using them
since they appeared in g++ (and whined enough to get some of the bugs
fixed :-) ). I use them primarily for efficiency. For my particular
application, I have seen a 5-10% reduction in run-time. This is a big
win, since this application has already run for many hundreds of hours
on our Connection Machine. In addition, I find the syntax easy to
write and easy to read, and the semantics are clear and easy to
understand (at least to me).

>Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
>email: dl@oswego.edu or dl%oswego.edu@nisc.nyser.net
>UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl

rob collins
-------------------------------------------------------------------------------
rjc@cs.ucla.edu
C++/Paris on the CM2:  The only way to fly.
-------------------------------------------------------------------------------
kearns@read.columbia.edu (Steve Kearns) (08/09/89)
I am in favor of named return values; however.... much of their use
can be avoided, I have found.

Many of the libg++ library classes attempt to simulate "value"
semantics. For example, you might be able to use a matrix class by
saying:

    matrix m1, m2, m3;
    ...
    m1 = m1 + 2*m2 + m3;

While this is very nice in theory, in practice it can lead to horrible
performance because of the various temporary matrices that are
created. I think the best near term solution to this problem is to
allow the programmer to explicitly control memory use. The way this
is done is by declaring destructive plus and times operations.
m1.plus(m2) is a routine which modifies m1 so that (new)m1 =
(old)m1+m2. Then the above program could be rewritten:

    m1.plus(matrix(m2).times(2)).plus(m3)

(assume that matrix(m2) is a constructor returning a matrix equal to
m2.) Admittedly, the syntax is much less readable. On the other
hand, it gives the programmer the chance to optimize his expressions
and significantly reduce the number of large temporaries. It is also
more "honest": matrices are NOT good candidates for having value
semantics because their copying time is large.

By writing in this style when appropriate, most uses of named return
values go away.

-steve
(kearns@cs.columbia.edu)
rjc@maui.cs.ucla.edu (Robert Collins) (08/09/89)
In article <6444@columbia.edu> kearns@cs.columbia.edu writes:
>I am in favor of named return values; however....
>
[ ... ]
>
>matrix m1, m2, m3;
>...
>m1 = m1 + 2*m2 + m3;
>
[ rewrite as ]
>
>m1.plus(matrix(m2).times(2)).plus(m3)

That is REALLY ugly. I am using C++ so I don't have to write code
like that!

>(assume that matrix(m2) is a constructor returning a matrix equal to
>m2.) Admittedly, the syntax is much less readable. On the other
>hand, it gives the programmer the chance to optimize his expressions
>and significantly reduce the number of large temporaries. It is also
>more "honest": matrices are NOT good candidates for having value semantics
>because their copying time is large.

Part of this cost is unnecessary copies that named return values can
help alleviate.

>By writing in this style when appropriate, most uses of named return values
>go away.

Yes, if you write code like that, you don't need named return values.
I don't know anyone who wants to write code like that! I think the
fact that people are tempted to write in this sort of style is a very
strong argument for named return values! Anyone who is considering
writing code in C++ and sees examples in this `efficient' style is
going to run away screaming into the night. Just say no.

>-steve
>(kearns@cs.columbia.edu)

rob collins
-------------------------------------------------------------------------------
rjc@cs.ucla.edu
C++/Paris on the CM2:  The only way to fly.
-------------------------------------------------------------------------------
lgy@blake.acs.washington.edu (Laurence Yaffe) (08/09/89)
In article <6444@columbia.edu> kearns@cs.columbia.edu writes:
>While this is very nice in theory, in practice it can lead to horrible
>performance because of the various temporary matrices that are created.
[ various comments about the desirability of explicitly controlling
  memory use for matrix operations deleted ]
> It is also
>more "honest": matrices are NOT good candidates for having value semantics
>because their copying time is large.
>-steve
>(kearns@cs.columbia.edu)

The claim that frequent copying of matrices causes unacceptable
performance degradation appears to be common dogma, but what real
evidence supports this?

Since most common operations on matrices (multiplication,
diagonalization, decomposition, inversion, ...) involve order N^3
operations for N-dimensional matrices, while copying is only order
N^2, the overhead of copying will be significant only if (a) matrices
are small and copies are very frequent (compared to other operations),
(b) matrices are so large that memory limitations intervene, or (c) no
O(N^3) operations are being performed.

In years of my own work, I've never seen real examples of case (c),
and only a few examples of case (a). Over quite a range of
applications, I've found that the breakeven point where O(N^2) copies
become important is well under N=10, typically 3 or 4. And for
compute intensive applications with matrices that small, special
methods tend to be more appropriate (fixed dimension types, inline
coding, ...).

I have run into examples in case (b), most recently in a calculation
involving 1280 x 1280 dimensional matrices which needed more than 80
Mb of swap space! But this type of problem seems to be largely a
thing of the past - unless you have a very fast machine or the
patience to do O(N^3) operations on 1000 x 1000 matrices.

On all the machines I've used, sequentially accessing all matrix
elements in a row is significantly faster than accessing a column
(better locality of reference, faster pointer increment). And yet
surprisingly few canned matrix multiply routines pre-transpose one of
the matrices (or use equivalent tricks involving an O(N^2) movement of
data) in order to take advantage of this fact. Absolutely criminal...

Anyone have real data (or just more anecdotal tales) on the
significance of matrix copies in real applications?
--
Laurence G. Yaffe		Internet: lgy@newton.phys.washington.edu
University of Washington	Bitnet:   yaffe@uwaphast.bitnet
lisch@mentor.com (Ray Lischner) (08/10/89)
In <6444@columbia.edu>, kearns@read.columbia.edu writes that

> m1.plus(matrix(m2).times(2)).plus(m3)

can prevent the "horrible performance" of

> m1 = m1 + 2*m2 + m3;

when m1, m2, and m3 are matrices.

If efficiency is that important, then why not define assignment
operators: +=, *=, etc., to do what you call plus(), times(), etc.:

    tmp = m2;
    tmp *= 2;
    tmp += m3;
    m1 += tmp;

I would still rather write readable code, but if, for a given
compiler, the readable version does not perform adequately, then I
would rather see assignment operators than plus(), etc.
--
Internet: lisch@mntgfx.mentor.com
UUCP:     tektronix!sequent!mntgfx!lisch
dl@cmx.npac.syr.edu (Doug Lea) (08/10/89)
I have a great deal of sympathy with Steve Kearns's views, but disagree
with his conclusions. For many `algebraic' classes, including vectors
and matrices, there is a natural object-oriented set of operations &
semantics that looks substantially different from the more
familiar value-based operations & semantics. I think many such classes
ought to support BOTH.
To compose an argument about why this should be so, I'll scale down,
and consider a class-based re-implementation of simple integers:
class Int
{
int rep;
public:
Int(const Int& a) :rep(a.rep) {}
Int(int i = 0) :rep(i) {}
void operator = (const Int& a) { rep = a.rep; }
void negate() { rep = -rep; }
void operator +=(const Int& a) { rep += a.rep; }
void operator -=(const Int& a) { rep -= a.rep; }
void operator *=(const Int& a) { rep *= a.rep; }
void operator /=(const Int& a) { rep /= a.rep; }
};
This definition is very much an object-oriented one. However, I bet
that few programmers would like to use it. Instead of coding integer
operations in a familiar expression-based (value-based) notation like,
for Int a, b, c, d;
a = (b - a) * -(d / c); (*)
they would be forced into something akin to assembly language
calculations, hand-translating their intentions (i.e., (*)) into
Int t1(b); t1 -= a; (**)
Int t2(d); t2 /= c;
Int t3(t2); t3.negate();
Int t4(t1); t4 *= t3;
a = t4;
Unless, of course, they are pretty good hand-optimizers! In which
case, using some basic, well-known arithmetic optimization (rewriting)
principles, they could write this more efficiently as,
a -= b; a *= d; a /= c; (***)
Hardly anyone likes to hand-optimize such expressions. (The fact that
the object operations employ operator notation (`t4 *= t3') instead of
prefix notation (e.g., `t4.mul(t3)') scarcely makes this any more fun.)
Now, it is not hard to completely automate the first (*) => (**)
translation step via
Int operator - (const Int& a) return r(a) { r.negate(); }
Int operator + (const Int& a, const Int& b) return r(a) { r += b; }
Int operator - (const Int& a, const Int& b) return r(a) { r -= b; }
Int operator * (const Int& a, const Int& b) return r(a) { r *= b; }
Int operator / (const Int& a, const Int& b) return r(a) { r /= b; }
As I mentioned in my previous note, these are perhaps best thought of
as extensions or disguised forms of Int constructors. Named return
values make these simpler, faster, and more obvious, I think.
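For comparison, and for compilers without the extension, the portable
formulation of the same operators routes each result through a local
and a return-by-value, leaving elimination of the extra X(X&) to the
compiler (a sketch only; one would define one set or the other, not both):

Int operator - (const Int& a)               { Int r(a); r.negate(); return r; }
Int operator + (const Int& a, const Int& b) { Int r(a); r += b; return r; }
Int operator - (const Int& a, const Int& b) { Int r(a); r -= b; return r; }
Int operator * (const Int& a, const Int& b) { Int r(a); r *= b; return r; }
Int operator / (const Int& a, const Int& b) { Int r(a); r /= b; return r; }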
My points so far are:
1) For many classes, expression-/value- based operations support a more
natural programming style, that C++, as a hybrid language, is fully
capable of supporting.
2) Object-based operations are surely the more central and basic in
any object-oriented language, since value-based operations may
be layered on top of the object-based ones.
3) As proponents of functional programming like to argue, the use of
value-based operations results in code that is almost always
easier to informally and formally verify for correctness. However,
layering-in value operations as a `translation' step makes design
and verification at the object level easier too, since one need
`merely' determine that the translation preserves correctness.
The kinds of optimizations seen going from (**) to (***) are further
refinements of this expression translation process, familiar to
assembly language programmers, compiler writers, and those studying
general program transformation techniques (e.g., the work inspired by
Burstall & Darlington).
Automation of such optimizations requires capabilities that seem
currently unavailable in C++. There are a lot of well known `tricks'
(reference counting, temp classes, etc.) that can exploit some of the
most glaring opportunities, but there are no built-in
`compiler-compiler' constructs that would allow programmers to specify
all of the kinds of optimizations possible for these kinds of classes.
Among the most frustrating aspects of all this is that these
optimizations are, on the one hand, fairly obvious and well-known,
but, on the other hand, very tedious and error-prone to carry out
manually every time you write out an expression. Yet more frustrating
is the fact that nearly any compiler already knows how to optimize
expressions involving built-in data types like int, float, char, but
knows nothing at all about expressions involving user-defined types
like Int.
These seem to be the available options for implementors of such classes:
1) Refuse to support expression-oriented operations.
2) Support both the object-oriented methods, and simple
value-to-object translations, thereby allowing programmers to
hand-optimize using the object operations if they need to.
3) Support as much optimization as you can manage from within
the confines of C++.
4) Write expression node classes that form expression trees during
*runtime*, and fully optimize the trees, again during runtime
before processing. (E.g., ExprNode operator + (Int&, Int&); makes
a node, and Int::operator=(ExprNode&); evaluates it.)
5) Write class-specific preprocessors, that perform optimized
translations using knowledge that can't be expressed in C++.
6) Extend the language.
Choosing (1) means that people will find the class difficult and/or
annoying to use. Even the purest object-oriented languages provide at
least some arithmetic operator support, probably because programmers
would refuse to use them otherwise. As Michael Tiemann says, code
isn't reusable if it's not usable!
(2) is perhaps the most practical way to proceed right now, and is
fully within the spirit of the C/C++ maxim of `make it right, *then*
make it fast'. (Note, for example, that C/C++, unlike many languages,
already supports many object-based operations on builtin types, like
operator +=, that were designed, in part, to make hand-optimization by
programmers easier.)
I have written a bunch of libg++ classes along the lines of (3).
Unfortunately, such coding often involves cleverness or trickery that
obfuscates otherwise simple designs, and does not have a great payoff
since only a very small subset of common optimizations can be done
within such limitations.
In effect, (4) turns such classes into little interpreters. This *is*
doable, but approaches viability only when the objects and expressions
are so large and complex that the time taken to create and evaluate
the expression tree during runtime is always paid off by the resulting
savings. Moreover, it seems just plain wrong to do compile-time
processing during run time.
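For concreteness, a deliberately tiny sketch of the style (4) implies
(Val and ExprNode are illustrative names, and only a single `+' node
is handled; a real version would build and optimize arbitrary trees):

class Val;

class ExprNode // built at run time by operator +; computes nothing itself
{
public:
  const Val* left;
  const Val* right;
  ExprNode(const Val* l, const Val* r) :left(l), right(r) {}
};

class Val
{
  int rep;
public:
  Val(int i = 0) :rep(i) {}
  int value() const { return rep; }
  void operator = (const ExprNode& e); // evaluates (and could first optimize) the tree
};

ExprNode operator + (const Val& a, const Val& b) { return ExprNode(&a, &b); }

void Val::operator = (const ExprNode& e)
{
  rep = e.left->value() + e.right->value(); // evaluation happens here, at run time
}

With this in place, `c = a + b' first builds an ExprNode and only then
computes, which is exactly the compile-time-work-at-run-time tradeoff
described above.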
I am currently investigating some forms of (5), in part, to scope out
the kinds of issues and problems inherent in (6) about which I
do not yet have any good concrete suggestions.
Among the things I find most interesting about this topic is that
there are some classes (numbers, strings...) in which value-based
programming is seen by most everyone as the most natural and desirable
methodology, others (queues, windows, ...) in which object-oriented
programming is most natural, and some (sets, matrices, ...) where people
like to use both. I'm not exactly sure why this is so. But a great
attraction of C++ is that there is the possibility that both views can
be accommodated.
Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl
nagle@well.UUCP (John Nagle) (08/14/89)
In article <1989Aug9.175337.12165@mentor.com> lisch@mentor.com (Ray Lischner) writes:
>If efficiency is that important, then why not define assignment
>operators: +=, *=, etc., to do what you call plus(), times(), etc.:
>
>    tmp = m2;
>    tmp *= 2;
>    tmp += m3;
>    m1 += tmp;
>

Very nice. A useful approach to matrix math.

An interesting thought is that a compiler could optimize

    m1 += 2*m2 + m3;

into the above sequence by itself, thus minimizing the intermediate
storage consumed. This is a strange, but valid, way to optimize
arithmetic expressions generally. Should a compiler do such a thing?
Should there be pragmas associated with overloaded operators to advise
the compiler on which operations cost more than others, so that the
optimizer can perform better?

Copying big objects is expensive, and the ability to reduce it is a
worthwhile optimization. It's worth thinking about what this means in
string operations, where such operations as sequential concatenation
usually result in either ugly source code or multiple recopying.

					John Nagle
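To make the pattern concrete, a minimal sketch of a matrix class
supporting the assignment-operator style (the name Mat and the fixed
small dimension are illustrative only):

class Mat
{
  double m[4][4]; // fixed size, just to keep the sketch short
public:
  Mat() { for (int i=0; i<4; ++i) for (int j=0; j<4; ++j) m[i][j] = 0.0; }
  void operator += (const Mat& a)
    { for (int i=0; i<4; ++i) for (int j=0; j<4; ++j) m[i][j] += a.m[i][j]; }
  void operator *= (double s)
    { for (int i=0; i<4; ++i) for (int j=0; j<4; ++j) m[i][j] *= s; }
};

// The rewritten sequence then reads, with only the one temporary:
//   Mat tmp = m2;  tmp *= 2.0;  tmp += m3;  m1 += tmp;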
SRWMRBD@windy.dsir.govt.nz (ROBERT) (08/16/89)
This is to support Doug Lea's proposal for "named return values". I
don't want to comment on the specific syntax he proposed - simply to
say we need a better way of getting objects back from functions.

I think this is more important for small objects than big ones, where
one can use tricks to avoid the copying. For example, I presume (I
haven't tested it) that in a complex add a significant amount of time
(and space) is used in the return. For big objects like big matrices,
only the pointers and dimensional information need be copied at the
end of a function or operation, for functions or operations written by
the library developer. For example, if you use my vector/matrix
package (see my note of some time ago)

   Matrix m1(100,100), m2(100,100), m3(100,100);
   ...
   m1 = m1 + m2*2.0 + m3;

will create only one temporary which gets destroyed almost immediately
and does no copying of matrices. (At least it doesn't in Zortech;
Doug Lea found that AT&T C++ could generate extra initialisations
under some circumstances). However if you, a user, write a function

   Matrix Sqrt(Matrix&)

then

   m1 = Sqrt(m2);

will do one unnecessary copy. That is, unless you know how to tell the
initialisation routines that the space occupied by the matrix you are
returning can be recycled.

So I want named return values or the equivalent to make handling small
objects more efficient and make life simpler for people handling big
objects.
kearns@read.columbia.edu (Steve Kearns) (08/17/89)
Here is another argument for named return values.

In the olden days, C routines could only return something pointer
sized or smaller. (Let's ignore floats!) Later, returning structures
was allowed. This is where a basic conceptual mistake was made.

C is not a "value" language. Data structures not only have values,
they have locations. When routines return small structures we can
ignore their location because they can be copied from place to place
cheaply. For large objects this is no longer true.

EVERY OBJECT RETURNED IS RETURNED TO SOME LOCATION. Therefore the
default semantics should allow the programmer access to this location,
and the special case should be the small structures for which copying
is cheap. Named return values provide this functionality.

-steve
(kearns@cs.columbia.edu)
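A sketch of the compilation model behind this argument (illustrative
names, not any particular compiler's actual output): a function
returning a large object is compiled roughly as a function taking a
hidden pointer to the caller's location.

struct BigObj { int p[100]; };

void f_lowered(BigObj* result) // what `BigObj f()' becomes, more or less
{
  for (int i=0; i<100; ++i) result->p[i] = i; // construct directly in place
}

// caller: `BigObj a = f();' becomes, more or less,
//   BigObj a;      // raw storage
//   f_lowered(&a); // &a is the location the object is returned to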
jss@jra.ardent.com (Jerry Schwarz (Compiler)) (08/18/89)
There are two distinct questions about named function return values.
A) Are they a good idea stylistically?
B) Do they add something to the language? (I.e. is there something
you can say with them that you can't without them?)
A is a non-technical question and I will grant that there
are arguments on both sides. My conclusion is that
they are a bad idea, but I will not discuss this issue further
in this note.
The answer to B hinges on some subtle points of C++ semantics.
To answer B we must consider whether there is any difference in
meaning between the proposed
X n() return x { // named return value
... x = ...
}
and
X u() { // unnamed return value
X x ;
...
return x ;
}
In C++, functions that return an X always do so by construction.
The above differ in what I call their canonical interpretation.
According to this interpretation, the x is constructed
on entry to n in the place where the value is to be returned.
u constructs x on the stack on entry and uses
X(X&) to construct the return value just before returning.
The key question is a subtle one: is it a legal optimization for
u to behave like n? I believe the answer is yes, and that
having named return values adds nothing to the language.
(Although it forces the compiler to implement the optimization when
named return values are used.)
Doug Lea and I have had discussions about this kind of issue
before. He has generally taken the position that such optimizations
should not be allowed.
That may be why he wants named return values.
Jerry Schwarz
jss@ardent.com
cline@sun.soe.clarkson.edu (Marshall Cline) (08/18/89)
In article <13118@well.UUCP> nagle@well.UUCP (John Nagle) writes about
optimizing:

    m1 += 2*m2 + m3;

into:

    tmp = m2;
    tmp *= 2;
    tmp += m3;
    m1 += tmp;

He says:
>Very nice. A useful approach to matrix math.
>An interesting thought is that a compiler could optimize
>    m1 += 2*m2 + m3;
>into the above sequence by itself, thus minimizing the intermediate
>storage consumed. This is a strange, but valid, way to optimize arithmetic
                                           ^^^^^--but see below
>expressions generally. Should a compiler do such a thing?
> ...
>John Nagle

Unfortunately it is valid _only_ if:
    a += b
always does the same thing as:
    a = a + b
(and similarly for a *= b and a = a * b, etc).

The meanings of +=, *=, -=, etc, are firmly entrenched in the minds of
C programmers; however, C++ doesn't _require_ the same semantics to be
preserved for user-defined types. For example, there's no reason why
(a *= b) can't open the file named (b) and print the value of (a)
[Kids: don't try this at home]. Even "=" doesn't need to assign
(although assignment is its default behavior).

Marshall Cline
--
__________________________________________________________________
Marshall P. Cline	Internet: cline@sun.soe.clarkson.edu
ECE Department		Usenet:   uunet!sun.soe.clarkson.edu!cline
Clarkson University	Bitnet:   BH0W@CLUTX
Potsdam, NY 13676	AT&T:     315-268-6591
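A contrived sketch of Cline's point (the class Odd and its names are
illustrative): nothing obliges a user-defined += to agree with +, so
the rewrite above could change a program's meaning.

class Odd
{
  int v;
public:
  Odd(int i = 0) :v(i) {}
  int value() const { return v; }
  void operator += (const Odd& b) // saturating add: never exceeds 100
    { v += b.v; if (v > 100) v = 100; }
  friend Odd operator + (const Odd& a, const Odd& b)
    { return Odd(a.v + b.v); }    // plain add: may exceed 100
};

// For Odd a(60), b(60): `a += b' leaves a at 100,
// but `a = a + b' leaves a at 120.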
dl@cmx.npac.syr.edu (Doug Lea) (08/20/89)
Jerry Schwarz contemplates whether an optimizing C++ compiler should be
allowed to optimize a function like u(),
class X { ... X(X&) { ... } ... };
X u() { // unnamed return value
X y ;
...
return y ;
}
into that which would be produced via a function like n()
X n() return x { // named return value
... x ...
}
The question is whether a compiler may *legally* skip the X(X&)
constructor to get the value of y out of function u by constructing
and dealing with y as if y itself were the return value, thus, in this
case at least, automating some of the efficiency benefits of
named return values.
Jerry and I have indeed been through a few exchanges on such points.
To summarize some of the discussions:
The C++ reference manual does not enumerate exactly those conditions
under which X(X&) will or will not be invoked, and further, does not
impose any restrictions upon user definitions of X(X&). Therefore, as
I demonstrated in a posting last spring, one can write an X(X&)
constructor that possesses arbitrary side effects, and the results of
the corresponding program will differ depending on whether you use a
compiler that does invoke X(X&) in situations like getting the return
value out of function u() versus one that does not. This has some
pragmatic interest since various C++ compilers `elide' out X(X&)
constructors in some circumstances but not others.
The use of named return values happily and successfully evades this
issue, at least with respect to return values, so I didn't bother to
get into it in my previous NRV postings. But some kind of resolution
of this part of the story could be had by tightening up the
description of X(X&) semantics just enough to better reflect the
general assumptions that are or could be made inside of cfront, g++,
and probably all other C++ compilers. Jerry and I once came up with
something along the lines of
Given any class X, and any X object `obj', when another object
with the value of obj is required, C++ compilers may optionally
use obj instead of a new object, `newobj', constructed via X(obj)
if obj is not subsequently modified differently than newobj during
newobj's potential lifetime. Similarly, compilers may in such
cases use any other X object constructed via X(obj), provided it has
not since been, and is not subsequently, modified differently than obj
during obj's lifetime. X(X&) constructors should be defined in such a way that
these optional compiler-generated actions do not change the
meanings of programs.
To be more complete, this description would have to be accompanied by
an enumeration of those cases where any such value/object is
`required'. The current reference manual appears to list these,
although not all in one place, or in these terms.
The idea is to spell out those situations in which a compiler may safely
(1) alias and/or reuse objects (as in the case of a local and a
return value) rather than forcing copies via X(X&), and
(2) invoke X(X&) (say, into a register) even in cases where it is not
explicitly needed if it somehow discovered that this would improve
overall efficiency.
The reference manual currently touches on some of these issues in
terms of `generating temporaries', leading one to believe that it is
describing case (2) here, in situations where it is really referring
to unexploited instances of case (1), i.e., the fact that compilers
don't have to guarantee that they will optimize out logically
required, but obviously unnecessary X(X&)'s.
Part of the reason that all this is controversial is that, both from a
safety perspective and in order to enable more aggressive optimization
(safety, correctness, and optimizability are almost always just
different ways of looking at the same problem), it would be nice to
banish *all* side effects from X(X&) constructors. However, it does
not seem at all desirable to disallow `innocuous' side effects, such as
reference counting manipulations. Additionally, a case can be made
that programmers should be allowed to write non-innocuous-side-effect
laden X(X&)'s and to have a way to disable optimizations in order to
force X(X&) constructors whenever they are logically or explicitly
required, in a way analogous to how ANSI C `volatile' variables are
handled, although I am no longer convinced that this is especially
worthwhile.
The important practical implication is that programmers should write
X(X&) constructors that build faithful copies of their arguments
without any kinds of side effects that would cause programs to behave
differently depending on whether the constructors were actually called
or not when they are logically required. Of course, it is impossible
for a compiler alone to prove to itself whether X(X&) side effects are
considered by the programmer to be innocuous or not, or even whether a
compiler-generated or programmer-defined X(X&) really does make a
`correct' copy, so any kind of restriction on X(X&) is mostly
unenforceable by compilers.
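To see why this matters, consider a sketch of a non-innocuous X(X&)
(Traced and trace_count are illustrative names): the global count makes
every copy observable, so the program's behavior depends on whether the
compiler actually calls the constructor.

static int trace_count = 0; // visible side effect of copying

class Traced
{
  int v;
public:
  Traced(int i = 0) :v(i) {}
  Traced(const Traced& a) :v(a.v) { ++trace_count; } // non-innocuous side effect
};

Traced make_traced()
{
  Traced t(1);
  return t; // a compiler may or may not invoke Traced(Traced&) here
}

After `Traced x = make_traced();', trace_count is 1 under the canonical
interpretation but 0 under the aliasing optimization described above --
precisely the difference the wording above asks programmers to tolerate.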
Again, these semantic issues do not impact one way or another my other
arguments for the utility of named return values. In particular, they
do not alter the fact that named return values provide a deterministic
guarantee that X(X&) will NOT be invoked, whereas, even with further
refinement of X(X&) semantics, the question of whether any compiler
actually can and does optimize out all return-based X(X&) constructors
remains a non-trivial compiler implementation matter.
Why is return-value X(X&) optimization hard for a compiler to guarantee?
Consider a function with lots of local X's, and lots of conditionals,
loops, inline and non-inline function calls, returns, etc. It would
take analyses at the edge of current state of the art compiler
technology to discover which variable(s) could safely be aliased to
the return value slot, and how to manage the slot. I am all in favor
of compiler implementors adding such features to C++ compilers. But
even with these kinds of analyses, it seems naive to believe that a
compiler could always arrive at code that would always be as good as
that specified in a simple and natural manner by a programmer using
named return values. Permitting such control over how one's code gets
compiled is surely within the spirit of C and C++.
Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl