[comp.lang.c++] distinguishing operator[] on left and right

gilley@ndl.com (Greg Gilley) (03/01/91)

Is there any way to distinguish when the operator[] is used as an
lvalue as opposed to an rvalue?

The problem is that I am attempting to do a delayed copy of contents
with a reference count (a sort of copy-on-write).  So what I have is
a class of the form:

class X
{
    struct Xrep
    {
	float *data;
	int refcnt;
	int length;
    } *p;
    .
    .
    .
};

and operator= looks like:
void X::operator=(const X &x)
{
    x.p->refcnt++;

    if (--p->refcnt == 0)
    {
	delete [] p->data;
	delete p;
    }

    p = x.p;
}

Which is all well and good.  If I do other operators, they create new
ones, etc.  The problem arises with [].  Suppose we have:

     X a(1);
     X b(1);

     a[0] = 1.0;
     b = a;
     b[0] = 10.0;

What you would like to have is a[0] == 1.0 and b[0] == 10.0 (which means
that b has to be "detached" from a before it changes).  However,
I can't find a way to distinguish when the result of [] is being used
as an lvalue or rvalue.

Any suggestions?  Thanks,

			Greg

-- 
-------------------------------------------------------
  Greg Gilley
  gilley@ndl.COM   [Numerical Design Limited]
  919-929-2917 (voice)

jbuck@galileo.berkeley.edu (Joe Buck) (03/02/91)

In article <1991Feb28.212419.20920@ndl.com>, gilley@ndl.com (Greg Gilley) writes:
|> Is there any way to distinguish when the operator[] is used as an
|> lvalue as opposed to an rvalue?

Yes, something like this:

class ElementRef {
public:
	operator int ();
	ElementRef& operator=(int);
};

class IntVector {
...
public:
	ElementRef operator[] (int);
};

The idea is that [] returns a special class called ElementRef, which
has two operators defined: assignment operator and cast-to-int.
If I say

	IntVector v;
	int i, j;
	...
	v[3] = i;
	j = v[4];

in the first case, ElementRef::operator=(int) is used.  In the
second case, ElementRef::operator int() is used.
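
The two declarations above can be fleshed out into something that compiles.
What follows is only a sketch under assumptions that are not in the post: the
std::vector storage, the get/set helpers, the reads/writes counters and the
demo function are made up for illustration, and it is written in present-day
C++ rather than the 1991 dialect.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class IntVector;  // forward declaration so ElementRef can refer to it

// Proxy returned by IntVector::operator[]. It remembers which element was
// asked for and defers the choice between reading and writing.
class ElementRef {
    IntVector& vec;
    std::size_t index;
public:
    ElementRef(IntVector& v, std::size_t i) : vec(v), index(i) {}
    operator int() const;              // rvalue use: j = v[4]
    ElementRef& operator=(int value);  // lvalue use: v[3] = i
};

class IntVector {
    std::vector<int> data;             // hypothetical storage
public:
    int reads = 0;                     // counters only make the dispatch visible
    int writes = 0;
    explicit IntVector(std::size_t n) : data(n) {}
    ElementRef operator[](std::size_t i) { return ElementRef(*this, i); }
    int get(std::size_t i) { ++reads; return data[i]; }
    void set(std::size_t i, int value) { ++writes; data[i] = value; }
};

inline ElementRef::operator int() const { return vec.get(index); }

inline ElementRef& ElementRef::operator=(int value) {
    vec.set(index, value);
    return *this;
}

// Usage: one write through operator=, one read through operator int.
inline void element_ref_demo() {
    IntVector v(8);
    v[3] = 42;
    int j = v[3];
    assert(j == 42);
    assert(v.writes == 1 && v.reads == 1);
}
```

The proxy holds a reference to its vector, so it is only meant to live for
the duration of the expression that produced it.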

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck

horstman@mathcs.sjsu.edu (Cay Horstmann) (03/02/91)

In article <1991Feb28.212419.20920@ndl.com> gilley@ndl.com (Greg Gilley) writes:
>Is there any way to distinguish when the operator[] is used as an
>lvalue as opposed to an rvalue?
>
Surely I am not the first one to propose this one, but here goes anyway.

Just as pre- and postincrement operator++ are distinguished by a hidden
int argument, could this not be done for lvalue and rvalue operator[]?

I.e. const X& operator[]( int ) and X& operator[]( int, int )?

The second int is always 0. 

If the second operator is not present, the first one is taken for both
lvalues and rvalues. 

Points to consider:
   (1) It is ugly as hell
   (2) It has precedent (operator++)
   (3) It won't break existing code
   (4) It does not use the keyword "static" in unusual ways.

Cay

wmm@world.std.com (William M Miller) (03/03/91)

horstman@mathcs.sjsu.edu (Cay Horstmann) writes:
> Just as pre- and postincrement operator++ are distinguished by a hidden
> int argument, could this not be done for lvalue and rvalue operator[]?
>
> I.e. const X& operator[]( int ) and X& operator[]( int, int )?
>
> The second int is always 0.
>
> If the second operator is not present, the first one is taken for both
> lvalues and rvalues.
>
> Points to consider:
>    (1) It is ugly as hell

No argument here. :-)  Seriously, I don't like the operator++() and
operator--() solution in E&S, either, although I don't know whether it's
worth fighting to change.

>    (2) It has precedent (operator++)
>    (3) It won't break existing code

Here's where I think the real problem with this proposal lies: it really
isn't parallel to the operator++() design.  The operator++() design *does*
break existing code -- if you don't provide the two-argument form, you can't
use postfix ++.  It would be YACC (Yet Another C++ Complexity :-) for
operator[]() to fold the cases but for operator++() not to do so.

The proposal also suffers from lack of generality: operator[]() is not the
only operator that can be used in lvalue and rvalue contexts.  Even if we
restrict ourselves to operators that can have lvalue results on builtin
types, there are (prefix) ++, (prefix) --, "," (comma), and all the
assignment operators (not to mention ?:, since it can't be overloaded); more
generally, all the overloaded operators can return a reference and hence be
used in lvalue contexts.  There's no compelling reason to believe that only
operator[]() would benefit from being able to distinguish between lvalue and
rvalue contexts.  For instance, how would you extend this to apply to prefix
operator++()?  There's already a meaning for operator++(int).

Another major consideration X3J16 applies to proposals is if there is a
straightforward way to achieve the desired results in the existing language.
As someone pointed out in an earlier posting, making operator[]() return an
object of an auxiliary class with separate operator=() and conversion
operator member functions is a pretty reasonable way to address this problem
where it's needed, and it's more generally applicable, as well.

-- William M. Miller, Glockenspiel, Ltd.
   wmm@world.std.com

jbuck@galileo.berkeley.edu (Joe Buck) (03/04/91)

In article <1991Mar2.000705.3496@mathcs.sjsu.edu> horstman@mathcs.sjsu.edu (Cay Horstmann) writes:
>In article <1991Feb28.212419.20920@ndl.com> gilley@ndl.com (Greg Gilley) writes:
>>Is there any way to distinguish when the operator[] is used as an
>>lvalue as opposed to an rvalue?
>>
>Surely I am not the first one to propose this one, but here goes anyway.
>
>Just as pre- and postincrement operator++ are distinguished by a hidden
>int argument, could this not be done for lvalue and rvalue operator[]?

The difference is that it was needed for ++; there is otherwise no way to
tell the difference between predecrement and postdecrement, so you couldn't
make smart pointer classes look like pointers.

However, it's not difficult at all to get the proper behavior from the
existing language with operator[].  If you want to get a different
operation when the result of operator[] is used as an rvalue than you
do when it is used as an lvalue, you can have operator[] return a special
class that has both an assignment operator (which is used in the lvalue
case) and a cast operator (which is used in the rvalue case).
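
Applied to the reference-counted class from the original question, the idea
might look roughly like the sketch below. The Xrep layout and the operator=
logic follow Greg Gilley's post; everything else (the nested Ref class,
detach(), release(), shares_with() and the demo function) is hypothetical and
only meant to show where the delayed copy would happen.

```cpp
#include <algorithm>
#include <cassert>

class X {
    struct Xrep {
        float* data;
        int refcnt;
        int length;
    } *p;

    void release() {
        if (--p->refcnt == 0) {
            delete [] p->data;
            delete p;
        }
    }

    // Give this object a private copy of the data if it is shared.
    void detach() {
        if (p->refcnt == 1) return;
        Xrep* q = new Xrep;
        q->length = p->length;
        q->refcnt = 1;
        q->data = new float[q->length];
        std::copy(p->data, p->data + p->length, q->data);
        --p->refcnt;          // cannot reach zero here, it was > 1
        p = q;
    }

public:
    // Proxy standing in for one element of an X.
    class Ref {
        X& x;
        int i;
    public:
        Ref(X& owner, int index) : x(owner), i(index) {}
        operator float() const { return x.p->data[i]; }  // read: no copy
        Ref& operator=(float v) {                        // write: detach first
            x.detach();
            x.p->data[i] = v;
            return *this;
        }
    };

    explicit X(int n) : p(new Xrep) {
        p->data = new float[n]();   // zero-initialized
        p->refcnt = 1;
        p->length = n;
    }
    X(const X& other) : p(other.p) { ++p->refcnt; }
    X& operator=(const X& other) {
        ++other.p->refcnt;          // increment first: safe for self-assignment
        release();
        p = other.p;
        return *this;
    }
    ~X() { release(); }

    Ref operator[](int i) { return Ref(*this, i); }
    bool shares_with(const X& other) const { return p == other.p; }
};

inline void cow_demo() {
    X a(1), b(1);
    a[0] = 1.0f;
    b = a;                  // b now shares a's representation
    float seen = b[0];      // read through the proxy: still shared
    assert(seen == 1.0f && b.shares_with(a));
    b[0] = 10.0f;           // write through the proxy: b detaches first
    assert(!b.shares_with(a));
}
```

One cost of this design is that code such as `float& r = a[0];` no longer
compiles, because operator[] does not return a real reference.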

>Points to consider:
>   (1) It is ugly as hell
>   (2) It has precedent (operator++)

I disagree that this is a precedent.  There was no way before to tell
++p from p++ where p is a class.  It's not difficult at all to handle
v[key] = x, and x = v[key] correctly where v is, say, a hash table and
key is a key, and the entry for key may or may not exist.  I take it
this is a case where you think there is a problem that needs to be
solved by a language extension.  But there is no problem, and the code
to do it right is less ugly than your method.
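
For the hash table case, a sketch along these lines shows a lookup that does
not create an entry and an assignment that does. The names Table and Slot,
the use of std::map as the underlying container, and the choice to return 0
for a missing key are all assumptions made for the example, not anything
taken from the post.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

class Table {
    std::map<std::string, int> entries;   // stand-in for a real hash table
public:
    // Proxy for "the entry for this key, which may or may not exist".
    class Slot {
        Table& t;
        std::string key;
    public:
        Slot(Table& table, const std::string& k) : t(table), key(k) {}

        // x = v[key]: look the key up without inserting it.
        operator int() const {
            auto it = t.entries.find(key);
            return it == t.entries.end() ? 0 : it->second;
        }

        // v[key] = x: create the entry if needed, then store.
        Slot& operator=(int value) {
            t.entries[key] = value;
            return *this;
        }
    };

    Slot operator[](const std::string& key) { return Slot(*this, key); }
    std::size_t size() const { return entries.size(); }
};

inline void table_demo() {
    Table v;
    int x = v["absent"];      // read of a missing key
    assert(x == 0);
    assert(v.size() == 0);    // the read did not insert anything
    v["present"] = 5;         // assignment inserts
    assert(v.size() == 1);
}
```

A plain reference-returning operator[] cannot make this distinction, since
it has to produce an element before it knows how the result will be used.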

>   (3) It won't break existing code
>   (4) It does not use the keyword "static" in unusual ways.

    (5) It is completely unnecessary
    (6) It will lead to people making other unnecessary "extensions"
because there are other analogous situations that people will think need
"solutions"

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck

jar@ifi.uio.no (Jo Are Rosland) (03/04/91)

In article <1991Mar2.000705.3496@mathcs.sjsu.edu> horstman@mathcs.sjsu.edu (Cay Horstmann) writes:
   Surely I am not the first one to propose this one, but here goes anyway.

   Just as pre- and postincrement operator++ are distinguished by a hidden
   int argument, could this not be done for lvalue and rvalue operator[]?

As someone else already pointed out, the lvalue/rvalue distinction is
already achievable within the current language, through the use of an
intermediate value returned from operator[].

The problem with this is optimization.  You probably want something
like operator[] to be as fast as possible, and to do this in C++
today, you have to create an interface based on separate, inlined,
get/set operations.

A related example of something that's achievable, but not efficient,
is a string concatenation operator.  An interface like:

	String s1 = "foo";
	String s2 = "bar";
	String s3;

	s3 = s1 + s2;

would probably be a nice way to handle string concatenation, but this
can't be done efficiently.  Instead you'll probably have to do
something like:

	String s1 = "foo";
	String s2 = "bar";
	String s3;

	s3 = s1;
	s3 += s2;

To me, this use of intermediate values -- both compiler generated and
as part of class interface implementations -- is a serious problem
with C++, due to the performance degradation it leads to.

I mean, a very popular first project (and perhaps second and third
:-)) after having learned C++, is to create some kind of string class.
But how many of those string classes are actually in use?  After one
realizes one has to choose between a significant performance hit, or
a counterintuitive interface, I think many programmers go back to the
traditional strdup/strcpy/strcat way of handling strings.

There's not really that much to gain by renaming these as operator=,
operator+= and so on, since you (and the maintainers of your code)
would have to look up the implementations of these operators to make
sure there are no hidden surprises concerning things like copying.

Which brings me to another problem with C++, and probably the whole
OOP paradigm.  We're badly in need of some way of precisely specifying
interfaces to modules/classes.  This specification should include
performance of methods, as well as all the interesting parts of their
behaviour (sp?).

In practice, this should be something halfway between a C++ class
header file, and its implementation.  Specifications should be
standardized and powerful enough to allow browse/search tools that
can aid in finding classes that match criteria like language, a set of
operations needed, and performance.

Only after this is achieved, can we hope to meet the "software ic"
goal of OOP, including things like interchangeable software modules.
--
Jo Are Rosland
jar@ifi.uio.no

robert@am.dsir.govt.nz (Robert Davies) (03/05/91)

re: distinguishing operator[] on the left and right.

This follows up Greg Gilley's item. I think C++ could be improved here.
The "right" way of distinguishing the access to g in

      char c = g[3];

and

      g[3] = c;

would be with "const".

You need two versions of the subscript function (my examples are based on Tony
Hansen's string class):


      char &operator[](int i)
      {
         if (p->refcount > 1) disconnect();
         return str()[i];
      }

      char operator[](int i) const
      {
         return str()[i];
      }

Currently (in Turbo C++ or Glockenspiel C++; my version of Zortech can't
distinguish between the 2 versions) you need to write

      char c = ((const string)g)[3];

to get the second version. Which is a bit messy.
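
The overload selection described here can be observed with a small
stand-in class. This is a sketch only: CowString, its shared_ptr
representation and the copies counter are hypothetical, and a cast to a
const reference is used instead of the cast to a const object so that the
cast itself makes no copy.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>

class CowString {
    std::shared_ptr<std::string> rep;   // shared representation
public:
    int copies = 0;                     // how often disconnect() really copied

    explicit CowString(const char* s)
        : rep(std::make_shared<std::string>(s)) {}

    void disconnect() {
        if (rep.use_count() > 1) {
            rep = std::make_shared<std::string>(*rep);
            ++copies;
        }
    }

    // Chosen whenever the object expression is non-const, even for a read.
    char& operator[](std::size_t i) {
        disconnect();
        return (*rep)[i];
    }

    // Chosen only when the object expression is const.
    char operator[](std::size_t i) const {
        return (*rep)[i];
    }
};

inline void overload_demo() {
    CowString f("hello");
    CowString g = f;                    // g shares f's representation

    char c = g[1];                      // non-const overload: copies
    assert(c == 'e' && g.copies == 1);

    CowString h = f;
    char d = static_cast<const CowString&>(h)[1];   // const overload
    assert(d == 'e' && h.copies == 0);
}
```

The choice depends only on whether the object expression is const, not on
which side of an assignment the subscript appears.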

But wouldn't it be reasonable for the second version to be the default if the
compiler can tell that the operation won't affect the value of g?  Was this
kind of issue raised in the recent discussion on const in comp.std.c++?

I don't think Joe Buck's solution using an extra class is fully satisfactory
as it uses up the one coercion that C++ allows. For example, in his example,

      double d = v[4];

will not work.
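
This claim is easy to check against a present-day compiler. Under the rules
as later standardized, an implicit conversion may consist of one user-defined
conversion followed by a standard conversion such as int to double, so the
double case is accepted; compilers of the period may have behaved
differently. What is rejected is a chain of two user-defined conversions.
The types below (a cut-down ElementRef, plus Wrapper and unwrap) are
hypothetical and exist only for this test.

```cpp
#include <cassert>

// Cut-down proxy: only the conversion matters for this experiment.
struct ElementRef {
    int stored;
    operator int() const { return stored; }
};

// A class that is itself constructible from int.
struct Wrapper {
    Wrapper(int v) : value(v) {}
    int value;
};

inline int unwrap(Wrapper w) { return w.value; }

inline void conversion_demo() {
    ElementRef e = {4};

    double d = e;      // ElementRef -> int (user-defined), int -> double (standard)
    assert(d == 4.0);

    // unwrap(e);      // ElementRef -> int -> Wrapper needs two user-defined
    //                 // conversions, so this line would not compile

    assert(unwrap(Wrapper(e)) == 4);   // writing one conversion explicitly is fine
}
```

So the cost of the proxy shows up when the element type is expected to
convert further to another class type, rather than to a built-in type.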


Finally, do we really need delayed copy - or does it just cause more trouble
than it is worth? The only place it might be necessary is in returning values
from a function. And someone suggested that under some circumstances a clever 
compiler could avoid the copy you would ordinarily expect in a return.

Robert

fuchs@tmipe0.telematik.informatik.uni-karlsruhe.de (Harald Fuchs) (03/05/91)

robert@am.dsir.govt.nz (Robert Davies) writes:

>Currently (in Turbo C++ or Glockenspiel C++; my version of Zortech can't
>distinguish between the 2 versions) you need to write

>      char c = ((const string)g)[3];

>to get the second version. Which is a bit messy.

>But wouldn't it be reasonable for the second version to be the default if the
>compiler can tell that the operation won't affect the value of g?

Won't work. In general, the only way for a compiler to know about
constness is just by declaring a member function const. A non-const
operator[] is legal and sometimes even reasonable. Const vs. non-const
and LHS vs. RHS are completely different matters.
--

Harald Fuchs <fuchs@telematik.informatik.uni-karlsruhe.de>

stephens@motcid.UUCP (Kurt Stephens) (03/06/91)

robert@am.dsir.govt.nz (Robert Davies) writes:
>Currently (in Turbo C++ or Glockenspiel C++; my version of Zortech can't
>distinguish between the 2 versions) you need to write

>      char c = ((const string)g)[3];

>to get the second version. Which is a bit messy.

>But wouldn't it be reasonable for the second version to be the default if the
>compiler can tell that the operation won't affect the value of g?  Was this
>kind of issue raised in the recent discussion on const in comp.std.c++?

	How could the compiler tell that the operation won't affect the
value of g?  The string::operator[]() could be doing just about anything
to the privates of g.  C++ compilers cannot read minds,
or understand the internal semantics of any functions.

Example:
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-==-=-=-=-=-=-=-=-=-
#include	<stdio.h>

class X {
	int	i;
public:
	X( int I ) : i(I) {}
	int	operator[](int I) { return i = I; }
	void	print() { printf("%d\n", i ); }
};

int main() {
	X	a = 1;
	a.print();
	a[-1];
	a.print();
}
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=

X::operator[](int) means "assign to member i", but because of the
association with C's operator[], most container classes limit operator[]()
semantics to be "return a lvalue", "return a rvalue" or "lookup in table",
which is great because it makes for readable code.

Obviously, X::operator[](int) is a bad choice for "assign to member i";
X::operator=(int) would have been a much more intuitive handle.

But we (and the compiler) cannot assume that operator[]() never
modifies the state of the class or its instances.

Kurt A. Stephens		Foo::Foo(){return Foo();}
stephens@void.rtsg.mot.com	"When in doubt, recurse."

chip@tct.uucp (Chip Salzenberg) (03/06/91)

According to jar@ifi.uio.no (Jo Are Rosland):
>I mean, a very popular first project (and perhaps second and third
>:-)) after having learned C++, is to create some kind of string class.
>But how many of those string classes are actually in use?

We still use ours.  The "a = b; a += c" syntax is a little awkward,
but it's sure better than calculating sizes, calling new/delete and
strcpy/strcat.

Note, though, that a |String| is often implicitly converted to a
|const char *| when used as a function parameter, since functions that
expect strings may well be called with old-style C strings, and I
don't want to construct a temporary |String| in such cases.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
   "All this is conjecture of course, since I *only* post in the nude.
    Nothing comes between me and my t.b.  Nothing."   -- Bill Coderre

ssd@engr.ucf.edu (Steven S. Dick) (03/07/91)

What if I have a [] operator that does something unusual to extract the
data from the object... for instance, a bitfield...

// interface parts only...
class packedbits 
{
 public:
  packedbits(int size);
  int operator[](int index);
  void set(int index);
  void clear(int index);
};

void doit() 
{
   packedbits flags(100);

   if (flags[4])	// this works
     ....

   flags.set(4);	// works--but ugly
   flags[4] = 1;	// how can I make this work???
}

	Steve
ssd@engr.ucf.edu

robert@am.dsir.govt.nz (Robert Davies) (03/07/91)

Distinguishing lvalues and rvalues and delayed copy.

I posted a note suggesting that Greg Gilley's problem could be resolved if
C++ handled constant member functions slightly differently.

The problem is to decide whether you need to do a copy when you access an
element of a string or vector in a delayed copy situation. See for example the
string class in Tony Hansen's book.

A couple of people replied. The answers didn't make a lot of sense so I wonder
if my item got damaged in transmission. I repeat the problem posed by Greg
Gilley. If you have a string class (for example) with delayed copy how do you
tell it to copy, if necessary, when you have a statement like

   String g = f;
   ....
   g[3] = 'a';

so that g gets changed but not f (both get changed if you use the code in Tony
Hansen's book). But you don't want to copy when you have

   String g = f;
   ....
   char c = g[3];


I suggested having two versions of the operator[]

      char &operator[](int i)
      {
         if (p->refcount > 1) disconnect();   // do the delayed copy now
         return str()[i];                     // get the ref to the element
      }

      char operator[](int i) const            // constant member function
                                              // ARM 9.3.1
      {
         return str()[i];                     // get the element
      }


Currently (in Turbo C++ or Sun C++) you need to write 

      char c = ((const String)g)[3];

to get the second version. Which is a bit messy. And the Sun version makes an
extra copy of g.

I suggest that it would be reasonable for the second version to be the
default if the compiler can tell that the operation won't affect the value of
g.

The line

   char c = g[3];

will not affect g and the compiler knows this, since in this case the = is
predefined. If it is a user defined = then it will depend on whether the = has
its argument declared const. Assume it has.

Now suppose there are the two versions of operator[] defined in class String:

   char& operator[](int);

and

   char operator[](int) const;

The second version is guaranteed not to affect g. So either version of
operator[] is OK. The compilers I have access to choose the first version
in c = g[3] (unless g is declared const). They could equally well have used
the second version. In other words there would be no damage if the compiler
had decided that g had been declared constant.

So I suggest, if there are both a const member function and an ordinary member
function defined, then the compiler should pretend that the object (g in our
case) has been declared "const" and choose the const member version of the
function if this compiles OK.

This will ensure that the statement

   char c = g[3];

will get the version of operator[] that doesn't cause a copy.

On the other hand

   g[3] = 'a';

is not allowed if g is const so the compiler must choose the ordinary member
function version of operator [].

jimad@microsoft.UUCP (Jim ADCOCK) (03/08/91)

In article <1991Mar2.212017.13885@world.std.com> wmm@world.std.com (William M Miller) writes:
|Another major consideration X3J16 applies to proposals is if there is a
|straightforward way to achieve the desired results in the existing language.
|As someone pointed out in an earlier posting, making operator[]() return an
|object of an auxiliary class with separate operator=() and conversion
|operator member functions is a pretty reasonable way to address this problem
|where it's needed, and it's more generally applicable, as well.

.... assuming that the committee accepts overloaded operator dot.  Otherwise,
as Cay has pointed out, making operator[] return an auxiliary class
[ dare we call it a "reference" class ??? ] is not a general solution,
since it cannot be dereferenced like a normal object.

chip@tct.uucp (Chip Salzenberg) (03/09/91)

According to robert@am.dsir.govt.nz (Robert Davies):
>I suggested having two versions of the operator[]
>      char &operator[](int i)
>      char operator[](int i) const

I've done this.

>      char c = ((const String)g)[3];

Perhaps you meant

       char c = ((const String &)g)[3];

That's less likely to create a temporary.

>I suggest that it would be reasonable for the second version to be the
>default if the compiler can tell that the operation won't affect the value
>of g.

There's the rub.  How's the compiler supposed to know that?  We may
know from reading the class definition what's meant, but the compiler
hasn't got a prayer at figuring out when to call the const function
even though you're not operating on a const object.

A workaround would be to create a const reference to the object in
question, and use it for access:

     String s;
     const String &r = s;
     char c = s[0];    // slow
     char d = r[0];    // fast

-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
 "Most of my code is written by myself.  That is why so little gets done."
                 -- Herman "HLLs will never fly" Rubin

beng@microsoft.UUCP (Ben GOETTER) (03/09/91)

In article <27D3E61F.6226@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
| According to jar@ifi.uio.no (Jo Are Rosland):
| >I mean, a very popular first project (and perhaps second and third
| >:-)) after having learned C++, is to create some kind of string class.
| >But how many of those string classes are actually in use?
| 
| We still use ours.

As do we.  For us, the big win lies less in automatic storage management than
it does in localization of double-byte character-set (DBCS) dependencies.
It has also in the past handled conversion from application resourcefiles
and to/from alien string formats (e.g. non-ASCIZ).

Unfortunately, an airtight set of DBCS-safe methods renders impossible the
traditional byte-vector string manipulation so beloved by C hacks.  The
usual ad-hoc lexing becomes very awkward.  Make sure you have an adequate
replacement for char-at-a-time tokenizing and pattern-matching, lest you
be lynched by your angry clients.  (I still sport rope burns....)

--
Ben Goetter, microsoft!beng

markt@nro.cs.athabascau.ca (Mark Tarrabain) (03/10/91)

ssd@engr.ucf.edu (Steven S. Dick) writes:

> What if I have a [] operator that does something unusual to extract the
> data from the object... for instance, a bitfield...
> 
> // interface parts only...
> class packedbits 
> {
>  public:
>   packedbits(int size);
>   int operator[](int index);
>   void set(int index);
>   void clear(int index);
> };
> 
> void doit() 
> {
>    packedbits flags(100);
> 
>    if (flags[4])	// this works
>      ....
> 
>    flags.set(4);	// works--but ugly
>    flags[4] = 1;	// how can I make this work???
> }
> 
> 	Steve
> ssd@engr.ucf.edu

The line reading:
        int operator[](int index);
should be:
        int &operator[](int index);
then it will work on the left or the right side of an =.

>> Mark

jbuck@galileo.berkeley.edu (Joe Buck) (03/12/91)

In article <1991Mar6.235058.3641@osceola.cs.ucf.edu>, ssd@engr.ucf.edu (Steven S. Dick) writes:
|> What if I have a [] operator that does something unusual to extract the
|> data from the object... for instance, a bitfield...
|> 
|> // interface parts only...
|> class packedbits 
|> {
|>  public:
|>   packedbits(int size);
|>   int operator[](int index);
|>   void set(int index);
|>   void clear(int index);
|> };

Your problem is right there.  You're having operator[] return an int.  This
means that it can only be used as an rvalue and cannot be used to set the
bit.

Let's send a helper class to the rescue: change operator[](int) to return
a BitRef helper class:

class BitRef {
private:
	packedbits& pb;
	int index;
public:
	BitRef (packedbits& obj, int idx) : pb(obj), index(idx) {}
	operator int() { return pb.readBit(index);}
	BitRef& operator=(int newBit) {
		if (newBit) pb.set(index);
		else pb.clear(index);
		return *this;
	}
};

I need a new function in class packedbits: readBit(int) returns
the value of the bit at the given position.

Now when I say

	packedbits bitarray(100);

	int x = bitarray[23];

this turns into x = bitarray.readBit(23);

and

	bitarray[34] = x;

turns into

	if (x) bitarray.set(34); else bitarray.clear(34);

Note that the returned object acts like a reference to the
given bit.  In the case where the thing returned is an object,
we'd like to be able to redefine operator dot (to have a "smart
reference" class).  We can't with the ARM, though smart references
have been proposed as an extension to the language.
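
Putting the pieces of this post together gives something that can be
compiled and tried. Treat it as a sketch: the byte-vector storage and the
bodies of readBit, set and clear are assumptions, since the thread only
shows the interface. The extra BitRef-to-BitRef assignment is also an
addition; without it, a statement like flags[7] = flags[99] would pick the
implicitly declared copy assignment, which cannot be generated for a class
with a reference member.

```cpp
#include <cassert>
#include <climits>
#include <cstddef>
#include <vector>

class packedbits;

class BitRef {
    packedbits& pb;
    int index;
public:
    BitRef(packedbits& obj, int idx) : pb(obj), index(idx) {}
    operator int() const;                  // read the bit
    BitRef& operator=(int newBit);         // set or clear the bit
    BitRef& operator=(const BitRef& rhs) { // bit-to-bit assignment
        return *this = static_cast<int>(rhs);
    }
};

class packedbits {
    std::vector<unsigned char> bytes;      // assumed storage layout
public:
    explicit packedbits(int size)
        : bytes(static_cast<std::size_t>((size + CHAR_BIT - 1) / CHAR_BIT)) {}

    BitRef operator[](int index) { return BitRef(*this, index); }

    int readBit(int index) const {
        return (bytes[index / CHAR_BIT] >> (index % CHAR_BIT)) & 1;
    }
    void set(int index) {
        bytes[index / CHAR_BIT] |=
            static_cast<unsigned char>(1u << (index % CHAR_BIT));
    }
    void clear(int index) {
        bytes[index / CHAR_BIT] &=
            static_cast<unsigned char>(~(1u << (index % CHAR_BIT)));
    }
};

inline BitRef::operator int() const { return pb.readBit(index); }

inline BitRef& BitRef::operator=(int newBit) {
    if (newBit) pb.set(index);
    else pb.clear(index);
    return *this;
}

inline void bitref_demo() {
    packedbits flags(100);
    flags[4] = 1;                 // goes through BitRef::operator=(int)
    int x = flags[4];             // goes through BitRef::operator int()
    assert(x == 1);
    assert(flags.readBit(3) == 0 && flags.readBit(5) == 0);
}
```

The standard library later took the same approach for std::vector<bool>,
whose operator[] returns a proxy object rather than a bool&.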

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck