[net.lang.c++] further C++ fix opportunities

nathan@orstcs.UUCP (nathan) (02/25/86)

Comments on C++ opportunities

The advent of a new language is an exciting time.  We each
see areas where we would like development, old pet peeves
to be fixed, features yearning to be added.  A strong
guiding hand (used judiciously) becomes extremely valuable.
It makes the difference between, for instance, C++ and Ada.

With that said, I would like to address an area in which I
hope there is still room for flexibility: re-entrancy. 
Modern C is in fact only partially re-entrant, and the
standard library is (in many routines) not at all.

The only non-re-entrant aspect of C itself is in its handling
of structures  returned from functions.

The problem of returning structures from functions has always
been one of where to put the structure.  It could easily go on
stack, except that the calling routine would have to allocate
space for it; but then if the routine was misdeclared, local
variables of the caller could get trashed.  (In C, this was
something to worry about.)  Steve Johnson, author of "pcc",
"resolved" this by reserving a static structure for the
routine, and returning a pointer to it.

This approach works fine, except that functions which return
structures are not re-entrant any more.  If a function is
called again between the time it assigns the first field
of a return value and the time the first caller uses a field of the
returned structure, the first caller may receive trash. 
If such functions find their way into a standard library, it
may become very difficult to write a portable program which
requires re-entrancy.
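
A minimal sketch of the hazard, assuming a compiler that returns
structures through one static area per routine.  The names
func_static, func_result and first_result_survives are hypothetical,
and the second call stands in for an interrupt or signal handler:

```cpp
#include <cassert>

struct ab { int a, b; };

// Hypothetical stand-in for the static return area a pcc-style
// compiler reserves for the routine.
static ab func_result;

// The routine fills the static area and hands back a pointer to it.
ab* func_static(int a, int b) {
    func_result.a = a;
    func_result.b = b;
    return &func_result;
}

// Simulates a second call arriving after the first call has returned
// but before the first caller has looked at its result.
bool first_result_survives() {
    ab* first = func_static(1, 2);
    func_static(3, 4);   // stands in for a re-entrant call
    return first->a == 1 && first->b == 2;
}
```

Here first_result_survives() returns false: the first caller sees the
values written by the second call.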

The "pcc" method is no longer necessary.  Now that each
routine's type is known, space may be allocated for it before
the call.  (Is it?)  Another problem, already present, now
becomes evident.  In returning a structure, one assigns to
fields of a local variable, then returns it.  This is copied
into the return-value structure (whether on the stack, or
elsewhere), after which the caller either copies it again
or uses one field.  This extra copying seems unnecessary.
To avoid it, some sort of "self" structure is needed.

I propose three alternatives:
given
	struct ab { int a, b; };

1. the "null structure": that is, the following:

	struct ab func() {
		.a = 1;
		.b = 2;
	}		// implicit return, no copying needed

	The problem I see with this is twofold; first, what do you call
	the whole structure?  (say I want to pass a pointer to it down
	the line); second, could lint++ (or whatever) cope with such
	a method of returning a value?

2. the function-name structure (a la Pascal):

	struct ab func() {
		func.a = 1; func.b = 2;
		return func;	// type-checking resolves return value
	}

3. The keyword "this" may not be too heavily loaded to take on this
   chore.  A cursory study shows that it is only used now as a pointer:

	struct ab func() {
		this.a = 1; this.b = 2;
		return this;		// explicit return could be optional
	}

   I rather like this last one, though the "&this" problem may remain.

Clearly there are other alternatives.

Whichever method is used, backward compatibility is needed. 
The solution is easy: If a routine returns a pointer to its structure
(wherever it may be), then libraries compiled by the old
method will still work in new programs.

The subject of how to do re-entrant libraries I save for another posting.

	Nathan C. Myers
	{hplabs!hp-pcd | tektronix}!orstcs!nathan OR  nathan@oregon-state

rose@think.ARPA (John Rose) (02/27/86)

In article <34200001@orstcs.UUCP> nathan@orstcs.UUCP (nathan) writes:
>       In returning a structure, one assigns to
>fields of a local variable, then returns it.
>To avoid [extra copying], some sort of "self" structure is needed.
>I propose three alternatives:

>1.		.a = 1;
As you say, what do you call the whole structure?
Also, it uses an intriguing general syntax for
a very small purpose.

>2.	struct ab func() {
>		func.a = 1;
>		return func;	// type-checking resolves return value
>	}
Type checking does not resolve anything if there is a conversion
operator from (struct ab (*)()) to (struct ab)--not inconceivable.

>3.	struct ab func() {
>		this.a = 1; this.b = 2;
Confusing to have "this" be pointer or struct depending on usage?

>Clearly there are other alternatives.
How about something ugly, unmistakable, and specific:
4.		return.a = 1; return.b = 2; // and, subr(&return)

By the way, if the copying bothers you, you can use reference types
in new code.  E.g.:  struct ab& func() {....}

There's actually a deeper problem:  Values are returned not by
assignment but by initialization.  That's a big difference in C++.
So "subr(&return)" would pass a pointer to an uninitialized object
to subr(), and so on.  This is probably an unacceptable breach of
C++ data integrity rules.  So C++ syntax for getting at the
yet-to-be-returned structure is a bad idea.  But all those copying
steps mentioned are not a problem in C++.  This C code:
	temp.a = 1; temp.b = 2; return temp;
could be replaced by this C++ code:
	return ab(1, 2);
(assuming a definition of ab::ab(int, int)) and the compiler is responsible
for making it efficient.
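
A minimal sketch of that constructor-based return, written in
present-day C++ syntax.  The two-int constructor is the assumed
ab::ab(int, int); whether the temporary is actually elided is up to
the compiler:

```cpp
#include <cassert>

struct ab {
    int a, b;
    // Assumed constructor: initializes both fields directly.
    ab(int a_, int b_) : a(a_), b(b_) {}
};

// The result is created by initialization, not by assigning
// fields of a local and then copying it out.
ab func() {
    return ab(1, 2);
}

// Usage: the caller's object is initialized from the returned value.
void demo() {
    ab r = func();
    assert(r.a == 1 && r.b == 2);
}
```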

-- 
----------------------------------------------------------
John R. Rose, Thinking Machines Corporation, Cambridge, MA
245 First St., Cambridge, MA  02142  (617) 876-1111 X270
rose@think.arpa				ihnp4!think!rose

keith@cecil.UUCP (keith gorlen) (02/28/86)

>With that said, I would like to address an area in which I
>hope there is still room for flexibility: re-entrancy. 
>Modern C is in fact only partially re-entrant, and the
>standard library is (in many routines) not at all.

Let me second this motion!  The current implementation of
structure-returning functions in C is an accident waiting to happen, and
is inefficient as well.  I have a MASSCOMP with their "Real-Time" UNIX,
which has an asynchronous trap (AST) facility similar to that of RSX-11.
It's really handy, but I wonder how many programmers are aware of the
potential timing-dependent bugs that can occur because of the
non-reentrancy of the code produced by the C compiler.  With more talk
of adding real-time features to UNIX, it's time to fix this deficiency.

As for how to correct this in C++, it appears that Bjarne Stroustrup has
already thought about this.  See Section 17 of his article "Operator
Overloading in C++", AT&T Bell Laboratories Technical Journal, Vol. 63,
No. 8, October 1984 (I think).  Basically, the idea is that since the
caller knows that a structure is to be returned, it can pass the address
of where it wants the result placed as a "hidden" argument (like
"this" is handled now for member functions) to the called function.  The
called function knows it is returning a structure, so it expects the
hidden result pointer and copies the local variable that appears in the
function return to this area.  The C++ translator could thus translate
structure-returning functions and function calls into C code that
doesn't use structure-returning functions, without any change to the C++
syntax.  I don't think C++ Release 1.0 does this, however.
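
The translation described above might look roughly like this.  It is
a hand-written sketch, not output from any real translator, and the
names make_ab and make_ab_lowered are hypothetical:

```cpp
#include <cassert>

struct ab { int a, b; };

// What the programmer writes: a structure-returning function.
ab make_ab(int a, int b) {
    ab local;
    local.a = a;
    local.b = b;
    return local;
}

// What a translator could emit instead: the caller passes the
// address of the result area as a hidden first argument, and the
// function copies its local into that area.
void make_ab_lowered(ab* result, int a, int b) {
    ab local;
    local.a = a;
    local.b = b;
    *result = local;
}

// The call "ab y = make_ab(1, 2);" would become the two lines below.
void demo() {
    ab x = make_ab(1, 2);
    ab y;
    make_ab_lowered(&y, 1, 2);
    assert(x.a == y.a && x.b == y.b);
}
```

Since every call gets its own result area from its caller, no static
storage is shared between calls.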
-- 
---
	Keith Gorlen
	Computer Systems Laboratory
	Division of Computer Research and Technology
	National Institutes of Health
	Bethesda, MD 20892
	phone:	(301) 496-5363
	uucp:	{decvax!}seismo!elsie!cecil!keith

weemba@brahms.BERKELEY.EDU (Matthew P. Wiener) (02/28/86)

In article <34200001@orstcs.UUCP> nathan@orstcs.UUCP (nathan) writes:
>Comments on C++ opportunities
>
>The advent of a new language is an exciting time.  We each
>see areas where we would like development, old pet peeves
>to be fixed, features yearning to be added.  A strong
>guiding hand (used judiciously) becomes extremely valuable.
>It makes the difference between, for instance, C++ and Ada.

Hear, hear.  I'd like to throw in one small request before C++ becomes
rigid: how about an exponentiation operator?  Just add one more infix
operator to the precedence table, stronger than multiplication and
division.  a^^b or a**b come to mind.  (The latter has the same ambiguity
as a/*b does, though.)
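
To make the a**b ambiguity concrete: when b is a pointer, the
expression already has a meaning, namely a times the value b points
to.  A small sketch (the function name star_star is hypothetical):

```cpp
#include <cassert>

// With b a pointer, "a**b" is lexed as a * (*b): multiply a by the
// value b points to.  An exponentiation operator spelled ** would
// collide with this existing reading.
int star_star(int a, int* b) {
    return a**b;
}

void demo() {
    int x = 4;
    assert(star_star(3, &x) == 12);   // 3 * 4, not 3 to the 4th power
}
```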
ucbvax!brahms!weemba	Matthew P Wiener/UCB Math Dept/Berkeley CA 94720

sam@delftcc.UUCP (02/28/86)

In article <34200001@orstcs.UUCP>, nathan@orstcs.UUCP (nathan) writes:
> The "pcc" method [for returning structures] is no longer necessary.
> Now that each routine's type is known, space may be allocated for it
> before the call.  (Is it?)

Every routine's return type is supposed to be known in regular C, and
indeed, several C compilers (including some PCC-based ones) handle
structure returns the way you suggest.

In any case, as I understand it, C++ is implemented as an extra
pass which outputs C code (PCC intermediate code on some machines?),
and it can't affect the function calling sequence or any other part of
code generation.  Even if it could, you want to keep the calling
sequences of C and C++ the same.

> Another problem, already present, now
> becomes evident.  In returning a structure, one assigns to fields of a
> local variable, then returns it.  This is copied into the return-value
> structure (whether on the stack, or elsewhere), after which the caller
> either copies it again or uses one field.  This extra copying seems
> unnecessary.  To avoid it, some sort of "self" structure is needed.

A good optimizer should eliminate the local-variable-to-return-value
copy.  Question: are there any that do?

----
Sam Kendall			     allegra \
Delft Consulting Corp.		seismo!cmcl2  ! delftcc!sam
+1 212 243-8700			       ihnp4 /
ARPA: delftcc!sam@nyu.ARPA

mc68020@gilbbs.UUCP (Tom Keller) (03/05/86)

Hello.  We just got our newsfeed up about a week and a half ago.  I am
fascinated with what I am reading about C++.   I wonder if someone would
be kind enough to fill me in on exactly what it is, how one obtains it, 
etc. etc.



-- 

====================================

Disclaimer:  I hereby disclaim any and all responsibility for disclaimers.

tom keller
{ihnp4, dual}!ptsfa!gilbbs!mc68020

(* we may not be big, but we're small! *)

rfm@frog.UUCP (Bob Mabee, Software) (03/07/86)

In article <34200001@orstcs.UUCP> nathan@orstcs.UUCP (nathan) writes:
>The problem of returning structures from functions has always
>been one of where to put the structure.  It could easily go on
>stack, except that the calling routine would have to allocate
>space for it; but then if the routine was misdeclared, local
>variables of the caller could get trashed.  (In C, this was
>something to worry about.)  Steve Johnson, author of "pcc",
>"resolved" this by reserving a static structure for the
>routine, and returning a pointer to it.
>	Nathan C. Myers
>	{hplabs!hp-pcd | tektronix}!orstcs!nathan OR  nathan@oregon-state

You don't need to put the structure on the stack to get reentrancy.
Think of this return value as like a number in a register, just too
large to fit in the one or two registers normally allowed for return
values.  All you need is a virtual register: an agreed-upon place in
memory (static) which all interrupt paths are bound to save and restore
just like the real registers.  Most linkers have the ability to allocate
such a data object with the maximum size of any reference.  If the linker
will not provide the size of that object, however, you may need to create
a convention such as putting the size of the current value in another
known variable.
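
A rough sketch of the virtual-register idea, with an ordinary function
standing in for the interrupt path.  The names vreg, func_into_vreg
and interrupt_path are hypothetical, and a real system would do the
save and restore in its interrupt entry and exit code:

```cpp
#include <cassert>
#include <cstring>

struct ab { int a, b; };

// Hypothetical "virtual register": one agreed-upon static area,
// sized here for the largest structure any routine returns.
static unsigned char vreg[sizeof(ab)];

// A routine "returns" its structure by leaving it in the area.
void func_into_vreg(int a, int b) {
    ab r = {a, b};
    std::memcpy(vreg, &r, sizeof r);
}

// The interrupt path saves the area on entry and restores it on
// exit, just as it would for the real registers.
void interrupt_path() {
    unsigned char saved[sizeof vreg];
    std::memcpy(saved, vreg, sizeof vreg);
    func_into_vreg(3, 4);                    // the handler's own call
    std::memcpy(vreg, saved, sizeof vreg);
}

// The interrupted caller still finds its own value afterwards.
bool result_survives_interrupt() {
    func_into_vreg(1, 2);
    interrupt_path();
    ab r;
    std::memcpy(&r, vreg, sizeof r);
    return r.a == 1 && r.b == 2;
}
```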

jbn@wdl1.UUCP (03/15/86)

      This can be done without any new syntax.  First,
the right way to handle function return values is to have the caller
provide the space for them, of course.  Even in stock C the caller
has enough information to do this, since the caller is supposed to
have a skeletal function definition available if the function returns
something other than an "int".  (If your function returns a long, for
example, you are supposed to have "long foo()" visible to the caller,
and if you don't, results on a 16-bit machine will probably be disappointing.)
So the caller has enough information to allocate the object at call time.
      A useful optimization by the caller avoids the copy.  The
caller should treat large returned values as arguments passed by address.
In other words, when the returned value is large, it should be treated
as an additional parameter to the function passed by address.
If the result is used in an expression, the caller will have to allocate
temporary space for it.  But if the result is simply used in a replacement,
as in
		struct ssstruct {
			int ff;
			char cc;
			};
		/*	function foo	*/
		struct ssstruct foo(p)
		int p;
		{ ... }
		struct ssstruct ss;
		ss = foo(1);

the compiler should observe that the immediate use of the result is in an
assignment, and should generate code much as if the call were

		foo(1,&ss);

Of course, if the call looks like

		int x;
		x = foo(1).ff;

the compiler has to generate something like

		{	struct ssstruct TEMPXXX1;
			foo(1,&TEMPXXX1);
			x = TEMPXXX1.ff;
		}

But I suspect that the optimized case appears more often in practice than
the nonoptimized one.

				John Nagle