[comp.lang.c++] Continuing efforts to build a CM++

casey@COGNET.UCLA.EDU (11/11/88)

  We must be doing something wrong.  We probably just don't know how to
use C++ (GNU C++) properly.  Could some kind soul flame the shit out of
us and then point us in the right direction???

  Appended to the end of this is a small sample of what we're trying to
do with C++ in building a development environment for the Connection
Machine.  (For the real masochists among you, I've also appended the
annotated assembly output (GNU C++ 1.27, VAX 8350, Ultrix 2.2).)  It's
highly abstracted, but demonstrates two of the problems we've been unable
to get past.  Both of these are efficiency problems.  We *can* get C++ to
do the right thing, but shades of Lisp, what a mess!!  The code it
generates is horrible!

  I've already described the first problem: the desire to have compile
time attributes for objects.  Our second problem is more likely just us
not doing the right thing.  This problem is demonstrated by the CM_int
operator+ function below.  When the statement ``x = y + z'' is compiled,
y + z is computed into the explicit temporary result, result is copied
into a compiler generated temporary, and then that is copied into x ...

  What we'd like to do is define the operator+ something like the
following:

	inline CM_int operator+(CM_int &a, CM_int &b)
	{
		if (!this)
			this = new CM_int(max(a.len, b.len));
		if (a.len == b.len && a.len == this->len)
			CM_s_add_3_1L(this->addr, a.addr, b.addr, a.len);
		else
			CM_s_add_3_3L(this->addr, a.addr, b.addr, this->len, a.len, b.len);
	}

where the compiler generates (essentially) three-address binary operator
streams, i.e. the two sources and where it wants the result.  This
would completely eliminate the worthless copies being made in the
present code.  Note the code to create the compiler temporaries properly
via the test ``if (!this) ...'' - the compiler can't possibly know what
values to pass to the constructor for the temporary.
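
  Lacking that compiler support, one way to approximate the three-address
form might be an ordinary function that takes the destination explicitly,
giving up the infix syntax.  The following is only a sketch under stated
assumptions: the CM_* routines are hypothetical stubs that merely count
calls (the real ones are not shown in this post), add_into and demo are
made-up names, the class is a cut-down CM_int, and copying is simply
disabled.

```cpp
#include <cassert>

// Hypothetical stubs: the real CM_* routines are not shown in the post.
// These only count calls so the effect can be observed.
static int heap_allocations = 0;
static int add_calls = 0;
static unsigned next_addr = 0;

static unsigned CM_allocate_heap_field(unsigned size)
{
	++heap_allocations;
	unsigned a = next_addr;
	next_addr += size;
	return a;
}
static void CM_deallocate_heap_field(unsigned) {}
static void CM_s_add_3_1L(unsigned, unsigned, unsigned, unsigned) { ++add_calls; }
static void CM_s_add_3_3L(unsigned, unsigned, unsigned,
			  unsigned, unsigned, unsigned) { ++add_calls; }

// Cut-down version of the CM_int class, for illustration only.
class CM_int {
	unsigned int addr;
	unsigned int len;
	CM_int(const CM_int &);			// copying disabled in this sketch
	CM_int &operator=(const CM_int &);
public:
	explicit CM_int(unsigned int size)
		: addr(CM_allocate_heap_field(size)), len(size) {}
	~CM_int() { CM_deallocate_heap_field(addr); }
	friend void add_into(CM_int &dst, const CM_int &a, const CM_int &b);
};

// Three-address add: the caller supplies the destination, so no result
// object and no compiler temporary is constructed.
void add_into(CM_int &dst, const CM_int &a, const CM_int &b)
{
	if (a.len == b.len && a.len == dst.len)
		CM_s_add_3_1L(dst.addr, a.addr, b.addr, a.len);
	else
		CM_s_add_3_3L(dst.addr, a.addr, b.addr, dst.len, a.len, b.len);
}

// Usage: add y and z straight into x and check that nothing was allocated.
void demo()
{
	CM_int x(16), y(32), z(64);
	int before = heap_allocations;
	add_into(x, y, z);
	assert(heap_allocations == before);	// no temporaries were built
}
```

This trades away the ``x = y + z'' notation, which is exactly the cost the
post is trying to avoid, so it is a workaround rather than an answer.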

  Also note that if the "len" field were a compile time attribute as we
want it to be, the optimizer could pull out all the length comparison
conditionals and simply replace them with the correct branch.
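
  To make the idea of a compile time "len" concrete, here is a rough sketch
that assumes a C++ compiler with template support, which is an assumption
about the toolchain and not something used in the code in this post.  The
length becomes part of the type, so the comparison is between constants.
FixedInt, add3, AddKind and demo are hypothetical names, and the function
only reports which CM add routine would have been chosen.

```cpp
#include <cassert>

// Hypothetical result type: says which add routine would be selected.
enum AddKind { SameLength, MixedLength };

// Len is part of the type, so it is known to the compiler and is not
// stored in each object.
template <unsigned Len>
struct FixedInt {
	unsigned addr;			// field address, as in CM_int
	explicit FixedInt(unsigned a) : addr(a) {}
};

// Three-address form with all three lengths carried in the types.
template <unsigned R, unsigned A, unsigned B>
AddKind add3(FixedInt<R> &dst, const FixedInt<A> &a, const FixedInt<B> &b)
{
	(void)dst; (void)a; (void)b;	// addresses unused in this sketch
	// Both sides of each comparison are compile time constants, so an
	// optimizer can keep just one branch per instantiation.
	if (A == B && A == R)
		return SameLength;	// would call CM_s_add_3_1L
	return MixedLength;		// would call CM_s_add_3_3L
}

// Usage: the branch taken is decided by the argument types alone.
inline void demo()
{
	FixedInt<32> x(0), y(1), z(2);
	assert(add3(x, y, z) == SameLength);
}
```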

Casey

P.S.
  Just as a note on the complexity issue that's been running around on
the group recently, I would have to say that C++ is not too complex.
What you're looking at is the inherent complexity in fully defining a
first class object class in the language.  Just think of the code in your
favorite C compiler to implement the stock set of types and you may get
an idea of what I'm talking about.  It takes a lot to define all the
attributes of how an object class behaves.

  I think what we'll probably be seeing in the future is programmers
writing/implementing classes in libraries, and other programmers simply
using those classes much as they already use "int", "char", etc.  After
all, we rarely object when people put B-tree database libraries together
for the use of other people, even though the code necessary to implement
that B-tree library is probably anything but trivial.

  On the contrary, I object that C++ doesn't bring enough of the power of
the compiler out to the class implementor.  When a C compiler implements
the int, float, char, etc. classes internally, it can attach any amount
of compile time attribute information to an object - I can't do that with
C++.  Likewise, the C compiler has control of how to instantiate
temporaries and properly control the data flow without loss of efficiency.

  But maybe someone will tell us that we're just being foolish and give
us a three line fix that does exactly what we need.  If not however, I
think that C++ still has a ways to go.  If we're going to create
facilities for creating new first class language classes, let's do it
right.  Right now I still can't create truly first class object classes -
my classes aren't as good as the classes the compiler inherently
implements.

-----<t.cc>-----
class CM_int {
	private:
		unsigned int	addr;
		unsigned int	len;
	public:
		CM_int(unsigned int size);
		CM_int(CM_int &a);
		~CM_int(void);
		CM_int operator=(CM_int &a);
		friend CM_int operator+(CM_int &a, CM_int &b);
};

inline CM_int::CM_int(unsigned int size)
{
	addr = CM_allocate_heap_field(size);
	len = size;
}

inline CM_int::CM_int(CM_int &a)
{
	addr = CM_allocate_heap_field(a.len);
	len = a.len;
	CM_s_move_1L(addr, a.addr, len);
}

inline CM_int::~CM_int(void)
{
	CM_deallocate_heap_field(addr);
}

inline CM_int CM_int::operator=(CM_int &a)
{
	if (len == a.len)
		CM_s_move_1L(addr, a.addr, len);
	else
		CM_s_move_2L(addr, a.addr, len, a.len);
}

inline CM_int operator+(CM_int &a, CM_int &b)
{
	int m = max(a.len, b.len);
	CM_int result(m);

	if (a.len == b.len)
		CM_s_add_3_1L(result.addr, a.addr, b.addr, result.len);
	else
		CM_s_add_3_3L(result.addr, a.addr, b.addr, result.len, a.len, b.len);
	return(result);
}


main()
{
	CM_int x(16), y(32), z(64);

	x = y+z;
}
-----<EOF t.cc>----

-----<t.s>-----
#NO_APP
.text
	.align 1
.globl _main
_main:
	.word 0xfc0
	subl2 $48,sp

	movab -8(fp),r6			// r6 = &x
	movl $16,r7
	pushl r7
	calls $1,_CM_allocate_heap_field// CM_allocate_heap_field(16)
	movl r0,(r6)			// x.addr = !!
	movl r7,4(r6)			// x.len = 16

	movab -16(fp),r6		// r6 = &y
	movl $32,r7
	pushl r7
	calls $1,_CM_allocate_heap_field// CM_allocate_heap_field(32)
	movl r0,(r6)			// y.addr = !!
	movl r7,4(r6)			// y.len = 32

	movab -24(fp),r6		// r6 = &z
	movzbl $64,r7
	pushl r7
	calls $1,_CM_allocate_heap_field// CM_allocate_heap_field(64)
	movl r0,(r6)			// z.addr = !!
	movl r7,4(r6)			// z.len = 64

	movab -8(fp),r10		// r10 = &x
	movab -16(fp),r9		// r9 = &y
	movab -24(fp),r8		// r8 = &z
	movab -40(fp),r1		// r11 = r1 = &<compiler temp>
	movl r1,r11

	pushl 4(r8)			// z.len
	pushl 4(r9)			// y.len
	calls $2,_max
	movl r0,r6			// r6 is m = max ...

	movab -48(fp),r7		// r7 = &result
	pushl r6
	calls $1,_CM_allocate_heap_field// CM_allocate_heap_field(m)
	movl r0,(r7)			// result.addr = !!
	movl r6,4(r7)			// result.len = m

	cmpl 4(r9),4(r8)		// y.len == z.len
	jneq L59

	pushl -44(fp)			// result.len
	pushl (r8)			// z.addr
	pushl (r9)			// y.addr
	pushl -48(fp)			// result.addr
	calls $4,_CM_s_add_3_1L
	jbr L60
L59:
	pushl 4(r8)			// z.len
	pushl 4(r9)			// y.len
	pushl -44(fp)			// result.len
	pushl (r8)			// z.addr
	pushl (r9)			// y.addr
	pushl -48(fp)			// result.addr
	calls $6,_CM_s_add_3_3L
L60:
	movl r11,r6			// r6 = &<compiler temp>
	movab -48(fp),r7		// r7 = &result

	tstl r6				// testing to see if compiler
	jneq L61			// temp has already been
	pushl $8			// allocated - it has, so
	calls $1,___builtin_new		// we never call this
	movl r0,r6
L61:
	pushl 4(r7)
	calls $1,_CM_allocate_heap_field// CM_allocate_heap_field(result.len)
	movl r0,(r6)			// <compiler temp>.addr = !!
	movl 4(r7),4(r6)		// <compiler temp>.len = result.len

	pushl 4(r6)			// assign result to compiler temp
	pushl (r7)
	pushl (r6)
	movab _CM_s_move_1L,r6
	calls $3,(r6)

	movab -40(fp),r0		// assign compiler temp to x
	cmpl 4(r10),4(r0)
	jneq L70
	pushl 4(r10)
	pushl (r0)
	pushl (r10)
	calls $3,(r6)
	jbr L73
L70:
	pushl 4(r0)
	pushl 4(r10)
	pushl (r0)
	pushl (r10)
	calls $4,_CM_s_move_2L
L73:
	pushl -32(fp)			// destroy never used CM_int
	calls $1,_CM_deallocate_heap_field
	pushl -24(fp)			// destroy z
	calls $1,_CM_deallocate_heap_field
	pushl -16(fp)			// destroy y
	calls $1,_CM_deallocate_heap_field
	pushl -8(fp)			// destroy x
	calls $1,_CM_deallocate_heap_field
	ret				// forget to destroy result and
					// compiler temp
-----<EOF t.s>-----

bs@alice.UUCP (Bjarne Stroustrup) (11/12/88)

Giving control to the (library) programmer to the point of controlling
temporary variable usage is non-trivial. However, there are ways of using
C++ so that temporaries aren't used.

Assume a class Matrix for which creation/copying/destruction is so expensive
that you'd rather not have temporary variables introduced:

	f() {
		Matrix a,b,c;
		// ...
		a = b + c;	// yuck: temporary used
		a = a * c;	// yuck: temporary used
	}

however:

	f()
	{
		Matrix b,c;
		// ...
		Matrix a = b + c;	// temporary not used
		a *= c;			// temporary not used
	}

Initialization is typically cheaper than assignment (since the compiler
doesn't need to introduce a temporary to protect against aliasing) and
*=, +=, etc. are almost always a win for user-defined types.
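
A small sketch of why the compound operators tend to win: operator+=
updates its left operand in place, while binary + has to build a result
object.  Field is a hypothetical stand-in for an expensive class such as
Matrix, and the counter and helper function are illustrative assumptions,
not part of any real library.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for an expensive class such as Matrix.
class Field {
	std::vector<int> data;
public:
	static int constructed;	// counts every object creation, copies included

	explicit Field(std::size_t n, int v = 0) : data(n, v) { ++constructed; }
	Field(const Field &o) : data(o.data) { ++constructed; }
	Field &operator=(const Field &o) { data = o.data; return *this; }

	// Compound assignment updates *this in place: no new Field is created.
	Field &operator+=(const Field &o)
	{
		assert(data.size() == o.data.size());
		for (std::size_t i = 0; i < data.size(); ++i)
			data[i] += o.data[i];
		return *this;
	}

	int at(std::size_t i) const { return data[i]; }
};

int Field::constructed = 0;

// Binary + must build a result object; how many further copies the
// return costs depends on the compiler.
inline Field operator+(const Field &a, const Field &b)
{
	Field r(a);
	r += b;
	return r;
}

// Returns how many Field objects were created while running "a += c".
inline int creations_during_compound_add(Field &a, const Field &c)
{
	int before = Field::constructed;
	a += c;
	return Field::constructed - before;
}
```

With this class, creations_during_compound_add reports zero new objects,
whereas "a + c" always creates at least one.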