[comp.lang.c++] cast of a bound pointer: why ?

pauld@cs.washington.edu (Paul Barton-Davis) (06/12/91)

Something that's been bothering me for a little while: we have some
code that often casts a bound pointer to a member function:

	class Thread;
	class Scheduler;

	Thread *thread = new Thread;
	Scheduler *instance = new Scheduler;

	...

	thread->start((int (*)()) instance->member_function, arg);

being the most common example. I notice from browsing the
anachronisms section of the AT&T 2.1 reference manual (not the ARM,
but the manual that comes with the compiler itself), that this is now
deemed an anachronism, raising the possibility that it may never be
allowed in future versions.

My understanding of this position is that it really boils down to a
syntactic nicety:

	ptr to member function != ptr to function

Can anyone explain the necessity for this distinction, and perhaps
more importantly, can anyone comment on the possibility that it will
no longer be possible to use this construct ?
-- 
Paul Barton-Davis                                 <pauld@cs.washington.edu>

Man has survived because he did not know how to realize his wishes.
Now that he can realize them, he must either change them, or perish.

steve@taumet.com (Stephen Clamage) (06/12/91)

pauld@cs.washington.edu (Paul Barton-Davis) writes:

>My understanding of this position is that it really boils down to a
>syntactic nicety:

>	ptr to member function != ptr to function

>Can anyone explain the necessity for this distinction, and perhaps
>more importantly, can anyone comment on the possibility that it will
>no longer be possible to use this construct ?

It is not a syntactic nicety, but a fundamental semantic issue.

A non-static member function requires a "this" pointer, which is
implicitly supplied by the compiler when you call it.  That is, given
	class C { ... foo(); ... } c;
when you write
	c.foo();
it really means something like this (with artificial syntax):
	C::foo(&c);

If you take the address of C::foo, cast it to an ordinary function
pointer, and call foo via that pointer, no "this" pointer is supplied.
	int (*fp)() = (int(*)())C::foo;
	fp();		// call C::foo with no "this" pointer
Function C::foo presumably uses "this", and will fail.
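
A minimal sketch of the point above. The class body, the data member
"value" and the helper names are assumptions made for illustration; they
are not from the original post. A pointer to member function has its own
type, and a call through it must name an object, which supplies "this":

```cpp
#include <cassert>

// Hypothetical class, for illustration only.
class C {
public:
    explicit C(int v) : value(v) {}
    int foo() { return value; }      // reads this->value
private:
    int value;
};

// A pointer to member function has its own type and call syntax.
typedef int (C::*member_fn)();

// The object on the left of .* supplies the "this" pointer.
int call_through_member_pointer(C& obj, member_fn pmf) {
    return (obj.*pmf)();
}

void example() {
    C a(7), b(9);
    member_fn pmf = &C::foo;
    // Same pointer, different objects, different results.
    assert(call_through_member_pointer(a, pmf) == 7);
    assert(call_through_member_pointer(b, pmf) == 9);
    // int (*fp)() = (int (*)())&C::foo;   // rejected by standard C++
}
```

The same pmf value yields 7 for one object and 9 for the other, which is
exactly the information an ordinary function pointer has no place to carry.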

Now suppose foo() is a virtual function.  The actual function to be
called depends on the actual object it is called in conjunction with:
	C *p = ...;
	p->foo();
might call C::foo(), or another foo() in class C's hierarchy.  Therefore,
the pointer-to-member-function must carry with it enough data about 
the virtual function so that the right virtual function is called.
A simple address is not sufficient.  (There are further complications
with multiple inheritance when adjustment to "this" is required.)
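
The virtual case can be sketched as follows. Base, Derived and call_foo
are hypothetical names chosen for this example. One pointer-to-member
value selects different function bodies depending on the dynamic type of
the object it is applied to, so it cannot be just a code address:

```cpp
#include <cassert>

// Hypothetical hierarchy, for illustration only.
struct Base {
    virtual ~Base() {}
    virtual int foo() { return 1; }
};

struct Derived : Base {
    virtual int foo() { return 2; }   // overrides Base::foo
};

// Applies the same pointer-to-member value to whatever object is passed in.
int call_foo(Base& obj) {
    int (Base::*pmf)() = &Base::foo;
    return (obj.*pmf)();              // dispatches on the dynamic type of obj
}

void example() {
    Base b;
    Derived d;
    assert(call_foo(b) == 1);         // runs Base::foo
    assert(call_foo(d) == 2);         // runs Derived::foo
}
```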
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

pauld@stowe.cs.washington.edu (Paul Barton-Davis) (06/13/91)

In article <763@taumet.com> steve@taumet.com (Stephen Clamage) writes:
>pauld@cs.washington.edu (Paul Barton-Davis) writes:
>
>>My understanding of this position is that it really boils down to a
>>syntactic nicety:
>
>>	ptr to member function != ptr to function
>
>>Can anyone explain the necessity for this distinction, and perhaps
>>more importantly, can anyone comment on the possibility that it will
>>no longer be possible to use this construct ?
>
>It is not a syntactic nicety, but a fundamental semantic issue.
>
>A non-static member function requires a "this" pointer, which is
>implicitly supplied by the compiler when you call it.  That is, given
>	class C { ... foo(); ... } c;
>when you write
>	c.foo();
>it really means something like this (with artificial syntax):
>	C::foo(&c);
>
>If you take the address of C::foo, cast it to an ordinary function
>pointer, and call foo via that pointer, no "this" pointer is supplied.
>	int (*fp)() = (int(*)())C::foo;
>	fp();		// call C::foo with no "this" pointer
>Function C::foo presumably uses "this", and will fail.

This is the heart of the matter. C++ assumes that EVERY function
*requires* a "this" pointer. If one is using C++ for "systems"
programming, there are a number of cases where this is not true. The
particular one I'm facing involves the use of a much lower level call
(via direct asm macros) to a function whose address is already known.
The stack and frame pointers are set up *by the C++ code itself*
before calling, and it knows exactly what to pass the function.

Although I appreciate the idea behind the "this" pointer, it seems to
me that there ought to be a way to access a pointer to a member
function as if it were a pointer to a function. The fact that there
are supposed to be some conventions for how a member function gets
called is a side issue, and can be dealt with by noting that "any use
of the pointer is undefined", meaning that the implementor is fully
responsible for ensuring compatibility between the calling conventions
of a C++ implementation and their use of the pointer. That should not
mean that there is no way to obtain the pointer however, and it looks
as if this is the plan in some future release.

So, let me put these criticisms aside, and ask what a better solution is.
The context is a threads package. Each thread, when started, runs a
function specified by a member function call to "start":

	Thread::start (object, function, ...)

where function will be invoked with "object" as a sort of "this", and
any number of additional arguments can be passed to function. The
problem is that a thread is not allowed to actually start itself
running, but has to wait until the scheduler gets around to giving it
time. Hence, between the time a thread calls start() and the time it
actually begins, it is in a state of "suspended animation", which it
enters AFTER having kept a record of the calling state it should be in
when it actually begins.

The scheduler uses some asm macros to put together a stack frame, and
then gets the thread on the move by an explicit assembly call to the
function. This means that it MUST have a pointer to the relevant
function, which some future implementation of C++ might make it
impossible to obtain.

Clearly this is only a problem when the function that a thread is
supposed to execute is a member function. However, since this is
nearly always the case (this is C++, after all), it represents a real
difficulty. 

Any ideas on how one would do this if it were not possible (as it
currently is) to obtain the actual address of a member function ?
I have a few ideas myself, but would like to see some others if there
are any.

-- 
Paul Barton-Davis                                 <pauld@cs.washington.edu>

Man has survived because he did not know how to realize his wishes.
Now that he can realize them, he must either change them, or perish.

barmar@think.com (Barry Margolin) (06/13/91)

In article <1991Jun12.180414.16718@beaver.cs.washington.edu> pauld@stowe.cs.washington.edu (Paul Barton-Davis) writes:
>This is the heart of the matter. C++ assumes that EVERY function
>*requires* a "this" pointer. 

No, only member functions.

>Any ideas on how one would do this if it were not possible (as it
>currently is) to obtain the actual address of a member function ?
>I have a few ideas myself, but would like to see some others if there
>are any.

Any problem (except efficiency, maybe) in computer science can be solved by
adding enough levels of indirection.

In this case, define a non-member function that takes the class object as a
regular argument and simply invokes the member function, and pass the
address of this non-member function.  The non-member function would look
something like this:

int nonmember_fn (my_class object, int arg1, char *arg2) {
	return object.member_fn (arg1, arg2);
}

Actually, it might be a little easier in your case to use a my_class& first
argument; otherwise, your assembly routine might have to invoke the class's
memberwise initialization constructor during the calling sequence (this
would actually be the case as well if any of the regular arguments were
class objects).
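
A sketch of the by-reference variant, under some assumptions: the body of
my_class and the arithmetic in member_fn are invented for the example, and
arg2 is made a const char* so that a string literal can be passed. Only the
shape of nonmember_fn comes from the suggestion above:

```cpp
#include <cassert>

// Hypothetical stand-in for my_class; the members are assumptions.
class my_class {
public:
    explicit my_class(int base) : base_(base) {}
    int member_fn(int arg1, const char* arg2) {
        return base_ + arg1 + (arg2 ? 1 : 0);
    }
private:
    int base_;
};

// Ordinary function: the object arrives as a regular argument.
// Passing by reference means no copy constructor runs during the call.
int nonmember_fn(my_class& object, int arg1, const char* arg2) {
    return object.member_fn(arg1, arg2);
}

// The wrapper's address is a plain function pointer.
typedef int (*plain_fn)(my_class&, int, const char*);

void example() {
    my_class obj(10);
    plain_fn fp = &nonmember_fn;
    assert(fp(obj, 5, "x") == 16);    // 10 + 5 + 1
}
```

The cost is one extra call per invocation, which is the efficiency caveat
mentioned above.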

-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

glenn@bitstream.com (Glenn P. Parker) (06/13/91)

In article <1991Jun12.192101.20607@Think.COM> barmar@think.com (Barry Margolin) writes:
> In article <1991Jun12.180414.16718@beaver.cs.washington.edu> pauld@stowe.cs.washington.edu (Paul Barton-Davis) writes:
> >This is the heart of the matter. C++ assumes that EVERY function
> >*requires* a "this" pointer. 
> 
> No, only member functions.

Close.  Only non-static member functions require a "this" pointer.

> >Any ideas on how one would do this if it were not possible (as it
> >currently is) to obtain the actual address of a member function ?
> >I have a few ideas myself, but would like to see some others if there
> >are any.

It *is* possible to take the address of a static member function.

> Any problem (except efficiency, maybe) in computer science can be solved by
> adding enough levels of indirection.
> 
> In this case, define a non-member function that takes the class object as a
> regular argument and simply invokes the member function, and pass the
> address of this non-member function.  The non-member function would look
> something like this:
> 
> int nonmember_fn (my_class object, int arg1, char *arg2) {
> 	return object.member_fn (arg1, arg2);
> }

Or, define a static member function, and let it do all the work.
Like so:

    class foo
    {
      public:
        // Don't confuse the following "static" with file static linkage.
        static int bar(foo*, int arg1, char* arg2);
      private:
        int baz;
    };

    int foo::bar(foo* self, int arg1, char* arg2)
    {
        // Use self->baz to access private data member.
        return self->baz;   // placeholder result so the function returns a value
    }

The address of foo::bar is &foo::bar.
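
Here is one way this could fit a deferred start, as a sketch only. The
PendingStart record and the run function are hypothetical and stand in for
whatever the real threads package stores; the constructor and the body of
bar are also assumptions, and arg2 is a const char* so a literal can be
passed:

```cpp
#include <cassert>

class foo {
public:
    explicit foo(int initial) : baz(initial) {}
    // Static member: no hidden "this", but still has access to private data.
    static int bar(foo* self, int arg1, const char* arg2);
private:
    int baz;
};

int foo::bar(foo* self, int arg1, const char* arg2) {
    (void)arg2;                 // unused in this sketch
    self->baz += arg1;
    return self->baz;
}

// Hypothetical record of a start request, kept until the scheduler runs it.
struct PendingStart {
    int (*entry)(foo*, int, const char*);   // ordinary function pointer
    foo* object;
    int arg1;
    const char* arg2;
};

// What a scheduler might do later, written as a normal C++ call.
int run(const PendingStart& p) {
    return p.entry(p.object, p.arg1, p.arg2);
}

void example() {
    foo f(40);
    PendingStart p = { &foo::bar, &f, 2, "unused" };
    assert(run(p) == 42);
}
```

No cast is needed anywhere: &foo::bar already has the ordinary function
pointer type that the record stores.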

--
Glenn P. Parker       glenn@bitstream.com       Bitstream, Inc.
                      uunet!huxley!glenn        215 First Street
                      BIX: parker               Cambridge, MA 02142-1270

steve@taumet.com (Stephen Clamage) (06/13/91)

pauld@stowe.cs.washington.edu (Paul Barton-Davis) writes:

>This is the heart of the matter. C++ assumes that EVERY function
>*requires* a "this" pointer. If one is using C++ for "systems"
>programming, there are a number of cases where this is not true. The
>particular one I'm facing involves the use of a much lower level call
>(via direct asm macros) to a function whose address is already known.
>The stack and frame pointers are set up *by the C++ code itself*
>before calling, and it knows exactly what to pass the function.

First of all, only non-static class member functions have a "this"
pointer.  You can declare a static member function, or a friend
function, or even an ordinary function which takes a parameter
of class pointer type and achieve the same effect.  For example,
instead of this:
	class C {
	    ...
	    int foo(int);
	};
use this:
	class C {
	    ...
	    friend int foo(C*, int);
	};
Now you can take foo's address and call it as any ordinary function,
which it is.  This is appropriate when interfacing to low-level
system calls, but not necessarily for high-level C++ programming.

What you describe is very system-specific, and far from being portable
to other systems, may not even survive the next release of the same
C++ compiler on the same system.

It is not appropriate, IMHO, to mix low-level assembler interface code
with C++.  If you have to do work in assembler, write that code in
assembler, but give it a standard function interface so it can be
called from C++.  Then your C++ code remains portable.  The non-
portable bits are well-isolated in assembler files.

You can also use C functions similarly.  For example, function foo
above could be declared extern "C" and do whatever you want.
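
A sketch of the friend function with C linkage. The data member "total",
the accessor "get" and the typedef name are assumptions for illustration.
The function is declared extern "C" before the class so that the friend
declaration refers to that same function:

```cpp
#include <cassert>

class C;                                   // forward declaration

// Declared with C linkage first; the friend declaration below
// then names this same function.
extern "C" int foo(C* self, int n);

// Pointer type for a function with C linkage.
extern "C" { typedef int (*c_entry_fn)(C*, int); }

class C {
public:
    C() : total(0) {}
    int get() const { return total; }
    friend int foo(C* self, int n);        // may touch private members
private:
    int total;
};

// Ordinary function: the object pointer is an explicit first argument.
extern "C" int foo(C* self, int n) {
    self->total += n;
    return self->total;
}

void example() {
    C c;
    c_entry_fn fp = &foo;                  // plain function pointer
    assert(fp(&c, 3) == 3);
    assert(fp(&c, 4) == 7);
    assert(c.get() == 7);
}
```

With C linkage the function uses the platform's C calling convention, which
is usually the one that hand-written assembly is prepared to call.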
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

jgro@lia (Jeremy Grodberg) (06/15/91)

In article <1991Jun11.191016.9873@beaver.cs.washington.edu> pauld@cs.washington.edu (Paul Barton-Davis) writes:
>[...]
>My understanding of this position is that it really boils down to a
>syntactic nicety:
>
>	ptr to member function != ptr to function
>
>Can anyone explain the necessity for this distinction, and perhaps
>more importantly, can anyone comment on the possibility that it will
>no longer be possible to use this construct ?

The difference is that member functions have an implied (or hidden) 
first argument which is the "this" pointer.  Thus a pointer to a member
function of class T taking an int is really a pointer to a function
taking a pointer to an object of class T, and an int.  Thus, member 
functions are not interchangeable with non-member functions (the exception
being static member functions, which do *not* take a "this" pointer).
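
The type distinction can be checked at compile time. This sketch assumes a
C++11 compiler (for static_assert and <type_traits>), and the members of T
are invented for the example:

```cpp
#include <cassert>
#include <type_traits>

// Hypothetical class T, for illustration only.
struct T {
    int scale;
    int member(int x) { return scale * x; }                       // hidden "this"
    static int helper(T* self, int x) { return self->scale * x; } // explicit object
};

typedef int (T::*MemberFn)(int);   // pointer to member function
typedef int (*PlainFn)(T*, int);   // pointer to ordinary function

// The two pointer types are distinct even though the parameters "line up".
static_assert(!std::is_same<MemberFn, PlainFn>::value,
              "member and plain function pointers are different types");
// Taking the address of a non-static member yields the member pointer type.
static_assert(std::is_same<decltype(&T::member), MemberFn>::value,
              "&T::member is a pointer to member function");
// Taking the address of a static member yields an ordinary function pointer.
static_assert(std::is_same<decltype(&T::helper), PlainFn>::value,
              "&T::helper is an ordinary function pointer");

int example() {
    T t = {3};
    MemberFn m = &T::member;
    PlainFn p = &T::helper;
    assert((t.*m)(4) == p(&t, 4));   // both compute 3 * 4
    return p(&t, 4);
}
```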

You could argue that you should be able to cast a member function to a 
non-member function with an object pointer, but a) this would not 
buy you much, if anything, b) it would destroy the abstraction of 
member functions being associated with objects, and c) (I believe) the
position of the "this" pointer in the function call is implementation 
dependent.

So conversions from pointer to member function to pointer to non-member
function (and vice versa) will be forbidden as soon as is practical.

-- 
Jeremy Grodberg      "Show me a new widget that's bug-free, and I'll show
jgro@lia.com         you something that's been through several releases."

jimad@microsoft.UUCP (Jim ADCOCK) (06/19/91)

In article <1991Jun12.180414.16718@beaver.cs.washington.edu> pauld@stowe.cs.washington.edu (Paul Barton-Davis) writes:

|Although I appreciate the idea behind the "this" pointer, it seems to
|me that there ought to be a way to access a pointer to a member
|function as if it were a pointer to a function. The fact that there
|are supposed to be some conventions for how a member function gets
|called is a side issue, and can be dealt with by noting that "any use
|of the pointer is undefined", meaning that the implementor is fully
|responsible for ensuring compatibility between the calling conventions
|of a C++ implementation and their use of the pointer. That should not
|mean that there is no way to obtain the pointer however, and it looks
|as if this is the plan in some future release.

Okay, since you've already agreed that "any use of the pointer is 
undefined" -- then the problem is solved.  Just define your "pointer" to be
the standard C++ member function pointer, and refuse to call that
pointer using standard C++ syntax.  Given that you are willing to
refuse to use that pointer using standard C++ syntax, you have exactly
what you're asking for: namely a pointer whose use is undefined.

Note there are some "special" aspects about member function pointers that you 
need to be aware of:

* on many C++ implementations member function pointers are more than 32 bits

* it's unlikely that any two independently created C++ compilers will
  use the same internal bit representation for these pointers of more than
  32 bits.
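
A sketch for checking the first point on a given compiler. The classes A,
B and D and the function names are hypothetical. The sizes reported are
implementation-specific, so no expected output is shown. The byte-copy
helper illustrates that low-level code has to treat the pointer as an
opaque object of sizeof(MemberFn) bytes, not as a machine address:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdio>
#include <cstring>

// Hypothetical classes; D uses multiple inheritance on purpose.
struct A { virtual ~A() {} virtual int f() { return 1; } };
struct B { virtual ~B() {} virtual int g() { return 2; } };
struct D : A, B { int h() { return 3; } };

typedef int (D::*MemberFn)();
typedef int (*PlainFn)();

std::size_t plain_pointer_size()  { return sizeof(PlainFn); }
std::size_t member_pointer_size() { return sizeof(MemberFn); }

// Copies the pointer through raw bytes and back, then calls through it.
// The buffer is sized from the type, never assumed to be 32 bits.
int call_after_byte_copy(D& obj, MemberFn pmf) {
    unsigned char raw[sizeof(MemberFn)];
    std::memcpy(raw, &pmf, sizeof raw);
    MemberFn restored;
    std::memcpy(&restored, raw, sizeof raw);
    return (obj.*restored)();
}

void report_sizes() {
    std::printf("plain function pointer:  %lu bytes\n",
                static_cast<unsigned long>(plain_pointer_size()));
    std::printf("member function pointer: %lu bytes\n",
                static_cast<unsigned long>(member_pointer_size()));
}

void example() {
    D d;
    assert(call_after_byte_copy(d, &D::h) == 3);
    assert(call_after_byte_copy(d, &D::g) == 2);   // inherited from B
    report_sizes();
}
```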

But then, what can one *do* with these pointers whose use is undefined?

Well, one can note that their use is only undefined within the standard C++
language:

* One can still use assembly language to do what you want with them.

* One can still use "C" to do what you want with them.

* One can still use "C++" and unions to do what you want with them.

Well, what *can't* one do with these pointers then?

* One can't insist that all C++ implementations use the same calling
conventions for member functions -- which I think was the whole reason
for not requiring a standard conversion in the first place.  Rather,
compiler implementors are given the freedom to use whatever 
member function calling conventions are best for their architectures
and their customers.  

[If, for example, member function pointers were defined 

	*in the C++ language* 

to just be the same as "C" functions with the this pointer passed 
as the first parameter, then C++ member functions
would have to use "C" calling convention -- but many C++ compilers
are already using more efficient calling conventions for member
functions than this!]

Such is the difference between *language* and *implementation*.
The C++ *language* needs to give the C++ *implementers* the freedom
to write compilers making the best use of present -- and future --
CPU architectures.

PS:

Okay, let's say one decides to use C++ unions to extract the 32-bit byte
address of a member function.  One still has to figure out how to 
correctly set up the call stack [or the appropriate registers in 
register calling-convention compilers.]  Given that some C++ implementations
use different calling conventions for C++ member functions than for
C functions, one may still have to use some macros, assembly,
or whatever, to complete the job.

PPS:

Given the difference between language and implementation, there's 
nothing much to keep your favorite compiler vendor from adding some
optional extensions to their implementation [assuming their implementation
uses standard "C" calling convention] to do what you originally asked
for.  They, and their customers, just have to be willing to put up with
a little slower code.  If that's what you think is best, go lobby your
vendor for it!

PPPS:

Personally, *I* think C++ compiler vendors should aim for the fastest 
possible code, and should leave out hack "features" that lead to slower
code.  [cast from const being one such "feature"]