[comp.lang.c++] named return values

dl@cmx.npac.syr.edu (Doug Lea) (08/08/89)

This is an argument for the `official' adoption in C++ of the notion
of named return values, which have been present in GNU g++ for the
last few versions.

Named return values are a g++ extension that gives a name to the
object returned by a function, in a way vaguely similar to named
return `variables' in Pascal and other languages, but with some
important C++-specific benefits.

To allow for some examples and discussion, first declare a 
very simple vector-like class X:

static const int Xsize=100000; // constant size, just for simplicity

class X
{
    int  p[Xsize];             // Xs are BIG objects

    void copy(const X& a)      { for (int i=0; i<Xsize; ++i) p[i] = a.p[i]; }
    void fill(int v)           { for (int i=0; i<Xsize; ++i) p[i] = v; }

public:

         X()                   {}
         X(const X& a)         { copy(a); }
         X(int v)              { fill(v); }
    
    void operator=(const X& a) { copy(a); }

    int  get(int i) const      { return p[i]; }
    void set(int i, int x)     { p[i] = x; }
};

and an X-returning function, iota, first in standard C++ form,

X iota()                       // return an X with all p[i] = i
{
  X y;                         // create a local X
  for (int i=0; i<Xsize; ++i)  // fill it up
    y.set(i, i); 
  return y;                    // return a copy via X(X&)
}

and then using named return values,

X iota_nrv() return x          // x is the name of the X we are returning
{
  for (int i=0; i<Xsize; ++i)  // fill it up & get out
     x.set(i, i); 
}


First off, the rules associated with named return values are:

    1) The named object is the one received by the caller upon
       function return.

    2) Any kind of constructor, perhaps based on runtime function
       arguments, may be specified for the return value. Thus
       something like X f(int i) return x(i) {...} is legal.

    3) The constructor for the named return value is executed
       before any statements in the function body. 

    4) A plain `return' (or returning by falling off the end) is the
       only legal way to return from a function declaring a named
       return value.  Use of `return <expression>' is illegal.

           (Note: (4) is open for negotiation. It is arguable that a
           return <expression> should be legal, meaning to perform
           re-initialization or assignment to the already-initialized
           return value. However, this seems inelegant, error-prone,
           and substantially less simple than just disallowing it.)
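
For concreteness, here is rule (2) spelled out as a complete function
(the function name is hypothetical; like all the NRV examples in this
note, the syntax requires a g++ release supporting the extension):

X constant_vector(int v) return x(v)   // x built directly via X(int)
{
  // nothing left to do: X(int) has already filled every element with v
}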




The most practical reason for desiring named return values is
efficiency.  Named return values allow one to write a function
returning some large object without also first allocating space for
and creating a local version that is to be somehow built-up solely for
the sake of returning via a copy (i.e., using X(X&)).

In the example, function iota declares local X y, which is allocated,
constructed, modified, and then returned by value (via X(X&)) to
the caller, whereas function iota_nrv directly creates and modifies
the object being sent back to the caller, saving a time-consuming
X(X&) constructor, and also saving the space otherwise needed to
hold two Xs (during X(X&) construction), rather than just one.

Thus, function iota_nrv is about twice as fast as function iota. The
difference can, of course, be made arbitrarily dramatic by increasing
Xsize.
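
The caller's side makes the difference plain (a sketch, assuming the
compiler does not optimize the copy away):

X a = iota();       // iota fills its local y, then X(X&) copies y into
                    // a; two Xsize-sized objects exist at once
X b = iota_nrv();   // b is the very x filled inside iota_nrv; no copy
                    // is made, and only one object ever exists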

The best counterargument against this efficiency claim is that a
*very* smart compiler could translate function iota into something
like function iota_nrv all by itself. While conceivably true, in
cases of arbitrarily complicated nesting of arbitrarily complicated
functions, eliminating all such useless copying would sometimes
require interprocedural data flow analyses stretching the capabilities
of any optimizing compiler for any language I know. Thus, relying on
automated detection seems unrealistic, especially considering that
this effect can be had in such a simple and natural way with named
return values.



The second reason is in terms of C++ design goals and semantics,
at least as I perceive them. 

[Please forgive the grandiosity of this apparent digression!]

In a language like C++, which supports, to varying extents,
object-oriented, procedural, and functional programming styles, one
can think of objects (and their member variables and methods) as being
central, and procedures and functions as being constructs that use and
extend object capabilities in specific ways.

To illustrate, assume you need procedure setv, a special-purpose
routine that modifies some chunk of an X.

void setv(X& x, int from, int to, int val)
{
  for (int i = from; i <= to; ++i) x.set(i, val);
}

But there is often no reason that setv *must* exist as a stand-alone
procedure. One could obtain the same kind of utility, albeit more
awkwardly, by staying within a pure object-oriented framework and
creating a new subclass that incorporates setv:

class X_with_setv : public X
{
public:

         X_with_setv()      :X()  {}
         X_with_setv(int v) :X(v) {}       
    
    void setv(int from, int to, int val)
         { for (int i = from; i <= to; ++i) set(i, val); }
};

(The constructors are needed to ensure everything inherits transparently.)
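
To see the two forms in use (an illustrative snippet):

X a;
setv(a, 0, 9, 42);           // stand-alone procedure: works on any X

X_with_setv b;
b.setv(0, 9, 42);            // member version: only on X_with_setv objects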

In some `pure' object-oriented languages, you might have to do
something like this, but in C++, you can just write setv as a
stand-alone procedure (as before) without all of this fuss, and
with various other advantages.

Enough warm-up. Now consider how one would similarly create a subclass
incorporating function iota if one had to:

class Xiota : public X
{
public:
         Xiota() :X() { for (int i=0; i<Xsize; ++i) set(i,i); }
};

The most defensible way to encapsulate iota is as a constructor, since
it constructs a new object.

The Xiota::Xiota() constructor is suspiciously similar in form to the
iota_nrv function. This is because FUNCTIONS ARE CONSTRUCTORS in C++.

Since C++ allows non-object-based functions, one can think of such
functions as ways of extending constructors, in exactly the same sense
that procedures extend methods (as with setv).  These are both very
useful in cases where providing such extensions does not merit (or
would run into logistical problems from) the creation of new
subclasses.

Even *member* functions that return new objects of the current class
(as is often the case with overloaded operators) fall into this category.
In a sense, a constructive operator like, say, unary negation:

class X // as above
{
  //...
public:
  X      operator - () return x { for (int i=0; i<Xsize; ++i) x.p[i]= -p[i]; }
  //...
};

is just syntactic sugar for either a new constructor,

class X // as above
{
  //...
public:
  enum NEGOP { NEG };
         X(const X& a, NEGOP) { for (int i=0; i<Xsize; ++i) p[i] = -a.p[i]; }
  //...
};

or even a new subclass,

class XNeg : public X
{
public:
         XNeg(const X& a) :X() { for (int i=0; i<Xsize; ++i) set(i, -a.get(i)); }
};


If you accept the argument that functions are extensions and/or
disguised forms of constructors, then it seems only natural that the
use of named return values should nicely parallel existing constructor
syntax/semantics, substituting `return <id><opt-constr>' for
`:<type><constr>'. (Compare rules (1-4) above with similar rules for
constructors).
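
To display the parallel side by side (class Y and function f are
hypothetical; the function form uses the g++ extension syntax):

class Y
{
    X member;
public:
    Y(int i) : member(i) {}        // `: member(i)' constructs the member
};

X f(int i) return x(i) {}          // `return x(i)' constructs the result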

If it weren't for the probable error-proneness resulting from the
introduction of a different syntactic form for `:', (i.e., following
it by a declaration for functions, but base initializers for
constructors), using the colon instead of the return keyword for named
return values, as in `X f() :x { ... }' might be a more logical
choice. But the `return' syntax seems natural enough, and has some
precedent in other languages.



The only disadvantage I know of is that there is no programmer-defined
control over when the constructor for the named return value is
called. It must be called before any function body statements. But
even this has a compensating advantage: It is impossible to write a
function using named return values that `forgets' to return anything
at all on one or more of its execution paths. (Although, of course, it
is still possible to mistakenly write one that doesn't return what you
had in mind!)
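
For instance (a hypothetical pair; the first version compiles, but
silently returns garbage on the path that falls off the end):

X clamp_iota(int hi)                 // conventional form
{
  if (hi > Xsize) hi = Xsize;
  X y;
  for (int i=0; i<hi; ++i) y.set(i, i);
  // oops: `return y' forgotten; the caller receives garbage
}

X clamp_iota_nrv(int hi) return x    // NRV form
{
  if (hi > Xsize) hi = Xsize;
  for (int i=0; i<hi; ++i) x.set(i, i);
  // falling off the end is fine: x was constructed up front
}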



There are still plenty of cases in which the old function syntax is
useful, often for logistical reasons:

X f(int v) // no way to generate named ret val via a single expression
{
  extern void g(int&);
  g(v);
  return X(v); 
}

This example was also designed to show a construct, `return X(...)'
(rather than a return of some local) that is compiled (probably by
all C++ compilers) to be just as efficient as named return values since
the constructor is in a tail call. This is among the very few cases
where this is true. Its use is especially encouraged for those using
C++ compilers lacking named return values.

Support? Disagreement? Comments? 

Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu              or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl

jima@hplsla.HP.COM (Jim Adcock) (08/09/89)

> 
> X iota()                       // return an X with all p[i] = i
> {
>   X y;                         // create a local X
>   for (int i=0; i<Xsize; ++i)  // fill it up
>     y.set(i, i); 
>   return y;                    // return a copy via X(X&)
> }
> 
> ...
> 
> The best counterargument against this efficiency claim is that a
> *very* smart compiler could translate function iota into something
> like function iota_nrv all by itself. While conceivably true, in
> cases of arbitrarily complicated nesting of arbitrarily complicated
> functions, eliminating all such useless copying would sometimes
> require interprocedural data flow analyses stretching the capabilities
> of any optimizing compiler for any language I know. Thus, relying on
> automated detection seems unrealistic, especially considering that
> this effect can be had in such a simple and natural way with named
> return values.

It doesn't seem to me like it takes a very smart compiler to be able
to figure out that iota uses y as a dummy return variable.  But these
simple cases are precisely where one needs compilers to be a little smarter.
Long and complex functions can afford the cost of the extra copy.  Little
functions can't afford the cost -- but should be easy to optimize.
Let's get C++ compilers that can truly handle the uniquely C++ features
of the language -- initialization, assignment, return values, array 
initialization, registering small structures, efficient this passing, etc.  
These issues are *not* the same as in C compilers -- C++ makes very different 
demands on a compiler's code generating capabilities.

Let's ask C++ compiler writers to design their compilers to meet the state of
the art for the 1990's.  Let's not bollix up the language with a lot more tired
anachronisms like "register."

rjc@maui.cs.ucla.edu (Robert Collins) (08/09/89)

In article <1826@cmx.npac.syr.edu> dl@oswego.edu (Doug Lea) writes:
>
>
>This is an argument for the `official' adoption in C++ of the notion
>of named return values, which have been present in GNU g++ for the
>last few versions.
>
>Named return values are a g++ extension that gives a name to the
>object returned by a function, in a way vaguely similar to named
>return `variables' in Pascal and other languages, but with some
>important C++-specific benefits.
> [ ... ]
>The most practical reason for desiring named return values is
>efficiency.  Named return values allow one to write a function
>returning some large object without also first allocating space for
>and creating a local version that is to be somehow built-up solely for
>the sake of returning via a copy (i.e., using X(X&)).
> [ ... ]
>Support? Disagreement? Comments? 
>

I think named return values are a great idea.  I have been using them
since they appeared in g++ (and whined enough to get some of the
bugs fixed :-) ).  I use them primarily for efficiency.  For my
particular application, I have seen a 5-10% reduction in run-time.
This is a big win, since this application has already run for many
hundreds of hours on our Connection Machine.

In addition, I find the syntax easy to write and easy to read, and the
semantics are clear and easy to understand (at least to me).

>Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
>email: dl@oswego.edu              or dl%oswego.edu@nisc.nyser.net
>UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl

rob collins

-------------------------------------------------------------------------------
rjc@cs.ucla.edu	                    C++/Paris on the CM2:  The only way to fly.
-------------------------------------------------------------------------------

kearns@read.columbia.edu (Steve Kearns) (08/09/89)

I am in favor of named return values; however....

much of their use can be avoided, I have found.  Many of the libg++
library classes attempt to simulate "value" semantics.  For example,
you might be able to use a matrix class by saying:

matrix m1, m2, m3;
...
m1 = m1 + 2*m2 + m3;

While this is very nice in theory, in practice it can lead to horrible
performance because of the various temporary matrices that are created.  

I think the best near term solution to this problem is to allow the
programmer to explicitly control memory use.  The way this is done is
by declaring destructive plus and times operations.  m1.plus(m2) is a
routine which modifies m1 so that (new)m1 = (old)m1+m2.  Then the
above program could be rewritten:

m1.plus(matrix(m2).times(2)).plus(m3)

(assume that matrix(m2) is a constructor returning a matrix equal to
m2.)  Admittedly, the syntax is much less readable.  On the other
hand, it gives the programmer the chance to optimize his expressions
and significantly reduce the number of large temporaries.  It is also
more "honest":  matrices are NOT good candidates for having value semantics
because their copying time is large.  
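
A minimal sketch of a class in this style (my own rendering, not the
actual library classes; the key point is that plus and times update in
place and return *this, so that calls chain as in the example above):

class matrix
{
    enum { N = 100 };               // fixed size, just for illustration
    double p[N];
public:
    matrix()                 { for (int i = 0; i < N; ++i) p[i] = 0.0; }
    matrix(const matrix& m)  { for (int i = 0; i < N; ++i) p[i] = m.p[i]; }

    matrix& plus(const matrix& m)   // this = this + m, no temporary
        { for (int i = 0; i < N; ++i) p[i] += m.p[i]; return *this; }
    matrix& times(double s)         // this = this * s, no temporary
        { for (int i = 0; i < N; ++i) p[i] *= s; return *this; }
};

With these definitions, m1.plus(matrix(m2).times(2)).plus(m3) creates
exactly one temporary: the explicit matrix(m2) copy.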

By writing in this style when appropriate, most uses of named return values
go away.

-steve
(kearns@cs.columbia.edu)

rjc@maui.cs.ucla.edu (Robert Collins) (08/09/89)

In article <6444@columbia.edu> kearns@cs.columbia.edu writes:
>I am in favor of named return values; however....
> [ ... ]
>matrix m1, m2, m3;
>...
>m1 = m1 + 2*m2 + m3;
> [ rewrite as ]
>m1.plus(matrix(m2).times(2)).plus(m3)

That is REALLY ugly.  I am using C++ so I don't have to write
code like that!

>(assume that matrix(m2) is a constructor returning a matrix equal to
>m2.)  Admittedly, the syntax is much less readable.  On the other
>hand, it gives the programmer the chance to optimize his expressions
>and significantly reduce the number of large temporaries.  It is also
>more "honest":  matrices are NOT good candidates for having value semantics
>because their copying time is large.  

Part of this cost is unnecessary copies that named return values can
help alleviate.

>By writing in this style when appropriate, most uses of named return values
>go away.

Yes, if you write code like that, you don't need named return values.  I
don't know anyone who wants to write code like that!  I think the fact
that people are tempted to write in this sort of style is a very strong
argument for named return values!  Anyone who is considering writing
code in C++ and sees examples in this `efficient' style is going to
run away screaming into the night.  Just say no.

>-steve
>(kearns@cs.columbia.edu)

rob collins
-------------------------------------------------------------------------------
rjc@cs.ucla.edu	                    C++/Paris on the CM2:  The only way to fly.
-------------------------------------------------------------------------------

lgy@blake.acs.washington.edu (Laurence Yaffe) (08/09/89)

In article <6444@columbia.edu> kearns@cs.columbia.edu writes:

>While this is very nice in theory, in practice it can lead to horrible
>performance because of the various temporary matrices that are created.  

	[ various comments about the desirability of explicitly
	  controlling memory use for matrix operations deleted ]

> It is also
>more "honest":  matrices are NOT good candidates for having value semantics
>because their copying time is large.  

>-steve
>(kearns@cs.columbia.edu)

    The claim that frequent copying of matrices causes unacceptable
performance degradation appears to be common dogma, but what real evidence
supports this?  Since most common operations on matrices  (multiplication,
diagonalization, decomposition, inversion, ...) involve order N^3 operations
for N-dimensional matrices, while copying is only order N^2, the overhead
of copying will be significant only if (a) matrices are small and copies
are very frequent (compared to other operations), (b) matrices are so large
that memory limitations intervene, or (c) no O(N^3) operations are being
performed.

    In years of my own work, I've never seen real examples of case (c),
and only a few examples of case (a).  Over quite a range of applications,
I've found that the breakeven point where O(N^2) copies become important
is well under N=10, typically 3 or 4.  And for compute intensive
applications with matrices that small, special methods tend to be more
appropriate (fixed dimension types, inline coding, ...).  I have run
into examples in case (b), most recently in a calculation involving
1280 x 1280 dimensional matrices which needed more than 80 Mb of swap space!
But this type of problem seems to be largely a thing of the past - unless
you have a very fast machine or the patience to do O(N^3) operations on
1000 x 1000 matrices.

    On all the machines I've used, sequentially accessing all matrix elements
in a row is significantly faster than accessing a column (better locality
of reference, faster pointer increment).  And yet surprisingly few canned
matrix multiply routines pre-transpose one of the matrices (or use equivalent
tricks involving an O(N^2) movement of data) in order to take advantage of this
fact.  Absolutely criminal...
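
For what it's worth, a sketch of the pre-transpose trick (my own
illustration, assuming row-major C arrays of fixed size):

const int N = 100;                        // illustrative dimension

void multiply(const double a[N][N], const double b[N][N], double c[N][N])
{
    static double bt[N][N];               // transposed copy of b
    int i, j, k;

    for (i = 0; i < N; ++i)               // O(N^2) data movement...
        for (j = 0; j < N; ++j)
            bt[j][i] = b[i][j];

    for (i = 0; i < N; ++i)               // ...repaid in the O(N^3) part:
        for (j = 0; j < N; ++j)
        {
            double sum = 0.0;
            for (k = 0; k < N; ++k)       // both operands walk along rows
                sum += a[i][k] * bt[j][k];
            c[i][j] = sum;
        }
}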

    Anyone have real data (or just more anecdotal tales) on the significance
of matrix copies in real applications?

-- 
Laurence G. Yaffe		Internet: lgy@newton.phys.washington.edu
University of Washington	  Bitnet: yaffe@uwaphast.bitnet

lisch@mentor.com (Ray Lischner) (08/10/89)

In <6444@columbia.edu>, kearns@read.columbia.edu writes that

> m1.plus(matrix(m2).times(2)).plus(m3)

can prevent the "horrible performance" of

> m1 = m1 + 2*m2 + m3;

when m1, m2, and m3 are matrices.

If efficiency is that important, then why not define assignment
operators: +=, *=, etc., to do what you call plus(), times(), etc.:

    tmp = m2;
    tmp *= 2;
    tmp += m3;
    m1 += tmp;

I would still rather write readable code, but if, for a given compiler,
the readable version does not perform adequately, then I would rather
see assignment operators than plus(), etc.
-- 
Internet: lisch@mntgfx.mentor.com     UUCP: tektronix!sequent!mntgfx!lisch

dl@cmx.npac.syr.edu (Doug Lea) (08/10/89)

I have a great deal of sympathy with Steve Kearns' views, but disagree
with his conclusions. For many `algebraic' classes, including vectors
and matrices, there is a natural object-oriented set of operations &
semantics that looks substantially different from the more familiar
value-based operations & semantics. I think many such classes ought to
support BOTH.

To compose an argument about why this should be so, I'll scale down,
and consider a class-based re-implementation of simple integers:

class Int
{
  int  rep;
public:
       Int(const Int& a)         :rep(a.rep) {}
       Int(int i = 0)            :rep(i) {}

  void operator = (const Int& a) { rep = a.rep; }

  void negate()                  { rep = -rep; }
  void operator +=(const Int& a) { rep += a.rep; }
  void operator -=(const Int& a) { rep -= a.rep; }
  void operator *=(const Int& a) { rep *= a.rep; }
  void operator /=(const Int& a) { rep /= a.rep; }
};


This definition is very much an object-oriented one. However, I bet
that few programmers would like to use it. Instead of coding integer
operations in a familiar expression-based (value-based) notation like,
for Int a, b, c, d;

   a = (b - a) * -(d / c);     (*)

they would be forced into something akin to assembly language
calculations, hand-translating their intentions (i.e., (*)) into


  Int t1(b);  t1 -= a;         (**)
  Int t2(d);  t2 /= c;
  Int t3(t2); t3.negate();
  Int t4(t1); t4 *= t3;
  a = t4;

Unless, of course, they are pretty good hand-optimizers! In which
case, using some basic, well-known arithmetic optimization (rewriting)
principles (algebraically, (b - a) * -(d / c) is just (a - b) * d / c,
ignoring how the rewrite reorders truncating integer division), they
could write this more efficiently as,

  a -= b; a *= d; a /= c;      (***)

Hardly anyone likes to hand-optimize such expressions. (The fact that
the object operations employ operator notation (`t4 *= t3') instead of
prefix notation (e.g., `t4.mul(t3)') scarcely makes this any more fun.)

Now, it is not hard to completely automate the first (*) => (**)
translation step via

Int operator - (const Int& a)               return r(a) { r.negate(); }
Int operator + (const Int& a, const Int& b) return r(a) { r += b; }
Int operator - (const Int& a, const Int& b) return r(a) { r -= b; }
Int operator * (const Int& a, const Int& b) return r(a) { r *= b; }
Int operator / (const Int& a, const Int& b) return r(a) { r /= b; }

As I mentioned in my previous note, these are perhaps best thought of
as extensions or disguised forms of Int constructors.  Named return
values make these simpler, faster, and more obvious, I think.


My points so far are:

1) For many classes, expression-/value- based operations support a more
    natural programming style, that C++, as a hybrid language, is fully
    capable of supporting.

2) Object-based operations are surely the more central and basic in
    any object-oriented language, since value-based operations may
    be layered on top of the object-based ones.

3) As proponents of functional programming like to argue, the use of
    value-based operations results in code that is almost always
    easier to informally and formally verify for correctness. However,
    layering-in value operations as a `translation' step makes design
    and verification at the object level easier too, since one need
    `merely' determine that the translation preserves correctness.



The kinds of optimizations seen going from (**) to (***) are further
refinements of this expression translation process, familiar to
assembly language programmers, compiler writers, and those studying
general program transformation techniques (e.g., the work inspired by
Burstall & Darlington).

Automation of such optimizations requires capabilities that seem
currently unavailable in C++. There are a lot of well known `tricks'
(reference counting, temp classes, etc.) that can exploit some of the
most glaring opportunities, but there are no built-in
`compiler-compiler' constructs that would allow programmers to specify
all of the kinds of optimizations possible for these kinds of classes.

Among the most frustrating aspects of all this is that these
optimizations are, on the one hand, fairly obvious and well-known,
but, on the other hand, very tedious and error-prone to carry out
manually every time you write out an expression.  Yet more frustrating
is the fact that nearly any compiler already knows how to optimize
expressions involving built-in data types like int, float, char, but
knows nothing at all about expressions involving user-defined types
like Int.


These seem to be the available options for implementors of such classes:

    1) Refuse to support expression-oriented operations.

    2) Support both the object-oriented methods, and simple
       value-to-object translations, thereby allowing programmers to
       hand-optimize using the object operations if they need to.

    3) Support as much optimization as you can manage from within
       the confines of C++.

    4) Write expression node classes that form expression trees during
       *runtime*, and fully optimize the trees, again during runtime
       before processing. (E.g., ExprNode operator + (Int&, Int&); makes
       a node, and Int::operator=(ExprNode&); evaluates it.)
 
    5) Write class-specific preprocessors, that perform optimized
       translations using knowledge that can't be expressed in C++.

    6) Extend the language.

Choosing (1) means that people will find the class difficult and/or
annoying to use.  Even the purest object-oriented languages provide at
least some arithmetic operator support, probably because programmers
would refuse to use them otherwise. As Michael Tiemann says, code
isn't reusable if it's not usable!

(2) is perhaps the most practical way to proceed right now, and is
fully within the spirit of the C/C++ maxim of `make it right, *then*
make it fast'. (Note, for example, that C/C++, unlike many languages,
already supports many object-based operations on builtin types, like
operator +=, that were designed, in part, to make hand-optimization by
programmers easier.)

I have written a bunch of libg++ classes along the lines of (3).
Unfortunately, such coding often involves cleverness or trickery that
obfuscates otherwise simple designs, and does not have a great payoff
since only a very small subset of common optimizations can be done
within such limitations.

In effect, (4) turns such classes into little interpreters. This *is*
doable, but approaches viability only when the objects and expressions
are so large and complex that the time taken to create and evaluate
the expression tree during runtime is always paid off by the resulting
savings. Moreover, it seems just plain wrong to do compile-time
processing during run time.
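
To make (4) concrete, here is a bare-bones rendering (my own names;
leaves only, a single operator, and no optimization pass; the point is
just that operator + defers evaluation and operator = performs it):

struct ExprNode;                     // built by operator +, consumed by =

struct Val
{
    int rep;
    Val(int i = 0) : rep(i) {}
    void operator=(const ExprNode& e);   // evaluates an expression tree
};

struct ExprNode
{
    char op;                         // just '+' in this tiny sketch
    const Val *left, *right;         // a real version would allow subtrees
};

inline ExprNode operator+(const Val& a, const Val& b)
{
    ExprNode e; e.op = '+'; e.left = &a; e.right = &b; return e;
}

inline void Val::operator=(const ExprNode& e)
{
    // a full version would rewrite/optimize the tree before evaluating
    rep = e.left->rep + e.right->rep;
}

So for Val a, b, c, the statement `c = a + b;' builds a node and
immediately evaluates it into c; a cleverer operator = could first
optimize an entire tree.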

I am currently investigating some forms of (5), in part, to scope out
the kinds of issues and problems inherent in (6) about which I
do not yet have any good concrete suggestions.



Among the things I find most interesting about this topic is that
there are some classes (numbers, strings...) in which value-based
programming is seen by most everyone as the most natural and desirable
methodology, others (queues, windows, ...) in which object-oriented
programming is most natural, and some (sets, matrices, ...)  where people
like to use both. I'm not exactly sure why this is so. But a great
attraction of C++ is that there is the possibility that both views can
be accommodated.


Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu              or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl

nagle@well.UUCP (John Nagle) (08/14/89)

In article <1989Aug9.175337.12165@mentor.com> lisch@mentor.com (Ray Lischner) writes:
>If efficiency is that important, then why not define assignment
>operators: +=, *=, etc., to do what you call plus(), times(), etc.:
>
>    tmp = m2;
>    tmp *= 2;
>    tmp += m3;
>    m1 += tmp;
>
     Very nice.  A useful approach to matrix math.

     An interesting thought is that a compiler could optimize 

	m1 += 2*m2 + m3;

     into the above sequence by itself, thus minimizing the intermediate
storage consumed.  This is a strange, but valid, way to optimize arithmetic
expressions generally.  Should a compiler do such a thing?  Should there
be pragmas associated with overloaded operators to advise the compiler on
which operations cost more than others, so that the optimizer can perform
better?  Copying big objects is expensive, and the ability to reduce
it is a worthwhile optimization.  It's worth thinking about what this
means in string operations, where such operations as sequential concatenation
usually result in either ugly source code or multiple recopying.

					John Nagle

SRWMRBD@windy.dsir.govt.nz (ROBERT) (08/16/89)

This is to support Doug Lea's proposal for "named return values". I
don't want to comment on the specific syntax he proposed - simply to say
we need a better way of getting objects back from functions.

I think this is more important for small objects than for big ones,
where one can use tricks to avoid the copying. For example, I presume
(I haven't tested it) that in a complex add a significant amount of
time (and space) is used in the return.

For big objects like big matrices, only the pointers and dimensional
information need be copied at the end of a function or operation, at
least for functions or operations written by the library developer.

For example if you use my vector/matrix package (see my note of sometime
ago)

Matrix m1(100,100), m2(100,100), m3(100,100);

...

m1 = m1 + m2*2.0 + m3;

will create only one temporary which gets destroyed almost immediately
and does no copying of matrices. (At least it doesn't in Zortech; Doug
Lea found that AT&T C++ could generate extra initialisations under some
circumstances).

However if you, a user, write a function  Matrix Sqrt(Matrix&)  then

m1 = Sqrt(m2);

will do one unnecessary copy. That is unless you know how to tell the
initialisation routines that the space occupied by the matrix you are
returning can be recycled.
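
To illustrate the sort of recycling I mean, a rough sketch (all names
hypothetical; a matrix flagged as a dying temporary hands its storage
over instead of being copied, using the old-style X(X&) form so the
source can be modified):

class Matrix
{
    double* p;
    int     nelem;                      // rows*cols, collapsed for brevity
    int     is_temp;                    // set on values about to die
public:
    Matrix(int n) : p(new double[n]), nelem(n), is_temp(0) {}
    Matrix(Matrix& m) : nelem(m.nelem), is_temp(0)
    {
        if (m.is_temp) { p = m.p; m.p = 0; }           // recycle storage
        else
        {
            p = new double[nelem];
            for (int i = 0; i < nelem; ++i) p[i] = m.p[i];
        }
    }
    ~Matrix() { delete [] p; }
    void mark_temp() { is_temp = 1; }   // library return paths call this
};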

So I want named return values or the equivalent to make handling small
objects more efficient and make life simpler for people handling big
objects.

kearns@read.columbia.edu (Steve Kearns) (08/17/89)

Here is another argument for named return values.  

In the olden days, C routines could only return something pointer sized or
smaller.  (Let's ignore floats!)  Later, returning structures was allowed.

This is where a basic conceptual mistake was made.  C is not a "value"
language.  Data structures not only have values, they have locations.
When routines return small structures we can ignore their location 
because they can be copied from place to place cheaply.  For large
objects this is no longer true.  

EVERY OBJECT RETURNED IS RETURNED TO SOME LOCATION.  Therefore the default
semantics should allow the programmer access to this location, and the
special case should be the small structures for which copying is cheap.
Named return values provide this functionality.

-steve
(kearns@cs.columbia.edu)

jss@jra.ardent.com (Jerry Schwarz (Compiler)) (08/18/89)

There are two distinct questions about named function return values.

    A) Are they a good idea stylistically?
    B) Do they add something to the language? (I.e. is there something
       you can say with them that you can't without them?)

A is a non-technical question and I will grant that there
are arguments on both sides. My conclusion is that
they are a bad idea, but I will not discuss this issue further
in this note.

The answer to B hinges on some subtle points of C++ semantics.

To answer B we must consider whether there is any difference in 
meaning between the proposed

    X n() return x {	// named return value
	... x = ...
	}

and

    X u() {		// unnamed return value
	X x ;
	...
	return x ;
	} 

In C++, functions that return an X always do so by construction.
The above differ in what I call their cannonical interpretation.
According to this interpretation, the x is constructed 
on entry to n in the place where the value is to be returned.  
u constructs x on the stack on entry and uses 
X(X&) to construct the return value just before returning.  

The key question is a subtle one, is it a legal optimization for
u to behave like n.  I believe the answer is yes, and that
having named return values adds nothing to the language.
(Although it forces the compiler to implement the optimization when
naemd return values are used.)

Doug Lea and I have had discussions about this kind of issue
before. He has generally taken the position that such optimizations
should not be allowed. 
That may be  why he wants named return values.

Jerry Schwarz
jss@ardent.com

cline@sun.soe.clarkson.edu (Marshall Cline) (08/18/89)

In article <13118@well.UUCP> nagle@well.UUCP (John Nagle) writes
about optimizing:	m1 += 2*m2 + m3;
into:			tmp = m2;  tmp *= 2;  tmp += m3;  m1 += tmp;

He says:
>Very nice.  A useful approach to matrix math.
>An interesting thought is that a compiler could optimize 
>	m1 += 2*m2 + m3;
>into the above sequence by itself, thus minimizing the intermediate
>storage consumed.  This is a strange, but valid, way to optimize arithmetic
                                           ^^^^^--but see below
>expressions generally.  Should a compiler do such a thing?
> ...
>John Nagle

Unfortunately it is valid _only_ if:		a += b
always does the same thing as:			a = a + b
(and similarly for a *= b and a = a * b, etc).

The meanings of +=, *=, -=, etc, are firmly entrenched in the minds of
C programmers, however C++ doesn't _require_ the same semantics to be
preserved for user-defined types.  For example, there's no reason why
(a *= b) can't open the file named (b) and print the value of (a) to it
[Kids: don't try this at home].  Even "=" doesn't need to assign (although
assignment is its default behavior).

Marshall Cline
--
	__________________________________________________________________
	Marshall P. Cline	Internet: cline@sun.soe.clarkson.edu
	ECE Department		Usenet:   uunet!sun.soe.clarkson.edu!cline
	Clarkson University	Bitnet:   BH0W@CLUTX
	Potsdam, NY  13676	AT&T:     315-268-6591

dl@cmx.npac.syr.edu (Doug Lea) (08/20/89)

Jerry Schwarz contemplates whether an optimizing C++ compiler should be
allowed to optimize a function like u(),

    class X { ... X(X&) { ... } ... }; 

    X u() {		// unnamed return value
	X y ;
	...
	return y ;
	} 

into that which would be produced via a function like n()

    X n() return x	{ // named return value
	... x ... 		    
	}

The question is whether a compiler may *legally* skip the X(X&)
constructor to get the value of y out of function u by constructing
and dealing with y as if y itself were the return value, thus, in this
case at least, automating some of the efficiency benefits of
named return values. 

Jerry and I have indeed been through a few exchanges on such points.
To summarize some of the discussions:

The C++ reference manual does not enumerate exactly those conditions
under which X(X&) will or will not be invoked, and further, does not
impose any restrictions upon user definitions of X(X&).  Therefore, as
I demonstrated in a posting last spring, one can write an X(X&)
constructor that possesses arbitrary side effects, and the results of
the corresponding program will differ depending on whether you use a
compiler that does invoke X(X&) in situations like getting the return
value out of function u() versus one that does not.  This has some
pragmatic interest since various C++ compilers `elide' out X(X&)
constructors in some circumstances but not others.

The use of named return values happily and successfully evades this
issue, at least with respect to return values, so I didn't bother to
get into it in my previous NRV postings.  But some kind of resolution
of this part of the story could be had by tightening up the
description of X(X&) semantics just enough to better reflect the
general assumptions that are or could be made inside of cfront, g++,
and probably all other C++ compilers. Jerry and I once came up with
something along the lines of

    Given any class X, and any X object `obj', when another object
    with the value of obj is required, C++ compilers may optionally
    use obj instead of a new object, `newobj', constructed via X(obj)
    if obj is not subsequently modified differently than newobj during
    newobj's potential lifetime.  Similarly, compilers may in such
    cases use any other X object constructed via X(obj), so long as it
    has not since been modified differently than obj during obj's
    lifetime.  X(X&) constructors should be defined in such a way that
    these optional compiler-generated actions do not change the
    meanings of programs.

To be more complete, this description would have to be accompanied by
an enumeration of those cases where any such value/object is
`required'.  The current reference manual appears to list these,
although not all in one place, or in these terms. 

The idea is to spell out those situations in which a compiler may safely
(1) alias and/or reuse objects (as in the case of a local and a
return value) rather than forcing copies via X(X&), and 
(2) invoke X(X&) (say, into a register) even in cases where it is not
explicitly needed if it somehow discovered that this would improve 
overall efficiency.  

The reference manual currently touches on some of these issues in
terms of `generating temporaries', leading one to believe that it is
describing case (2) here, in situations where it is really referring
to unexploited instances of case (1), i.e., the fact that compilers
don't have to guarantee that they will optimize out logically
required, but obviously unnecessary X(X&)'s.

Part of the reason that all this is controversial is that, both from a
safety perspective and in order to enable more aggressive optimization
(safety, correctness, and optimizability are almost always just
different ways of looking at the same problem), it would be nice to
banish *all* side effects from X(X&) constructors. However, it does
not seem at all desirable to disallow `innocuous' side effects, such as
reference counting manipulations.  Additionally, a case can be made
that programmers should be allowed to write non-innocuous-side-effect
laden X(X&)'s and to have a way to disable optimizations in order to
force X(X&) constructors whenever they are logically or explicitly
required, in a way analogous to how ANSI C `volatile' variables are
handled, although I am no longer convinced that this is especially
worthwhile.

The important practical implication is that programmers should write
X(X&) constructors that build faithful copies of their arguments
without any kinds of side effects that would cause programs to behave
differently depending on whether the constructors were actually called
or not when they are logically required.  Of course, it is impossible
for a compiler alone to prove to itself whether X(X&) side effects are
considered by the programmer to be innocuous or not, or even whether a
compiler-generated or programmer-defined X(X&) really does make a
`correct' copy, so any kind of restriction on X(X&) is mostly
unenforceable by compilers.

Again, these semantic issues do not impact one way or another my other
arguments for the utility of named return values. In particular, they
do not alter the fact that named return values provide a deterministic
guarantee that X(X&) will NOT be invoked, whereas, even with further
refinement of X(X&) semantics, the question of whether any compiler
actually can and does optimize out all return-based X(X&) constructors
remains a non-trivial compiler implementation matter.

Why is return-value X(X&) optimization hard for a compiler to guarantee?
Consider a function with lots of local X's, and lots of conditionals,
loops, inline and non-inline function calls, returns, etc. It would
take analyses at the edge of current state of the art compiler
technology to discover which variable(s) could safely be aliased to
the return value slot, and how to manage the slot. I am all in favor
of compiler implementors adding such features to C++ compilers.  But
even with these kinds of analyses, it seems naive to believe that a
compiler could always arrive at code that would always be as good as
that specified in a simple and natural manner by a programmer using
named return values. Permitting such control over how one's code gets
compiled is surely within the spirit of C and C++.
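
To make the difficulty concrete, a small hypothetical instance:

X pick(int which)
{
  X y;
  X z;
  for (int i=0; i<Xsize; ++i) { y.set(i, i); z.set(i, -i); }
  if (which)
    return y;    // which local, if either, may alias the return slot?
  return z;
}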


Doug Lea, Computer Science Dept., SUNY Oswego, Oswego, NY, 13126 (315)341-2367
email: dl@oswego.edu              or dl%oswego.edu@nisc.nyser.net
UUCP :...cornell!devvax!oswego!dl or ...rutgers!sunybcs!oswego!dl