[comp.lang.c++] Virtual things, lists, collections, ...

miller@FS1.cam.nist.gov (Bruce R. Miller) (06/14/91)

All this talk about various ways to do lists & collections, type casting
vs. sub-classes vs. templates, has got me wondering.
First off, I hope that I am not approaching this thing from a standpoint
that is either too naive or too biased (I'm a hard-core lisper, flavors &
CLOS).  Maybe I'd understand it if I studied the books some more, but...

Just how is it that Virtual functions work?  ie. Where's the type info
stored at run time?  Do the objects have (hidden) headers?

Suppose I've defined a class BASE which has a function which does
something, say SCAN, in a generic way and invokes one virtual function within
it, say VIRT. An arbitrary number of sub-classes later, say class Z, I
define the virtual function for a Z.
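
A minimal sketch of the setup just described, written in present-day C++
rather than the 1991 dialect.  Only BASE, SCAN, VIRT and Z come from the
post; the intermediate class Middle, the helper run_scan, and the string
return values are assumptions made purely for illustration.

```cpp
#include <cassert>
#include <string>

class Base {
public:
    virtual ~Base() {}
    // SCAN: an ordinary (non-virtual) member, written once in terms of VIRT.
    std::string scan() const { return "scan:" + virt(); }
    // VIRT: the hook that sub-classes may replace.
    virtual std::string virt() const { return "base"; }
};

// Hypothetical stand-in for "an arbitrary number of sub-classes later".
class Middle : public Base {};

class Z : public Middle {
public:
    std::string virt() const override { return "z"; }
};

// Hypothetical helper: it sees only a Base pointer, yet scan() ends up
// calling Z::virt when the object really is a Z.
std::string run_scan(const Base* b) {
    assert(b != nullptr);
    return b->scan();
}
```

Passing the address of a Z to run_scan yields "scan:z", and passing the
address of a plain Base yields "scan:base", even though scan itself is
written only once in the source.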

As I understand it the code for SCAN is not replicated (there's only
one version in the object code).  So somehow that code (or some dispatch
code for VIRT) figures out what the arg is and dispatches.  How?  I can
think of two ways offhand; 1) every class instance has a header with its type
encoded -- which might be nice to know if you are worried about
tradeoffs of space -- and that might only be needed if the class (or its
children?) define virtuals (can the compiler tell?).  2) whenever a
virtual (or caller of virtual! or caller of....) is called an extra type
code is pushed on the stack (again, can the compiler tell?).

Is either of these the `right' way? or both? (so long as it works, it
doesn't matter...)

At least if 1) were the answer, then it seems that a lot of the problems
of collections are simplified if you've got ONE base class for
everything you are going to use.  And, you can easily have an individual
container which contains various objects of _different_ classes in the
same list. [Natural for lispers, but nobody seems to be considering that here]
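
As a rough illustration of the mixed container mentioned above, here is a
sketch that assumes a single hand-made base class.  Thing, Apple, Bolt,
make_mixed_list and describe_all are all hypothetical names, and
std::vector and std::unique_ptr are modern standard-library facilities
that postdate this thread.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <vector>

// Hypothetical common base class for everything that goes in the list.
class Thing {
public:
    virtual ~Thing() {}
    virtual std::string name() const = 0;
};

class Apple : public Thing {
public:
    std::string name() const override { return "apple"; }
};

class Bolt : public Thing {
public:
    std::string name() const override { return "bolt"; }
};

// One list holding objects of different classes, via base-class pointers.
std::vector<std::unique_ptr<Thing>> make_mixed_list() {
    std::vector<std::unique_ptr<Thing>> items;
    items.push_back(std::unique_ptr<Thing>(new Apple));
    items.push_back(std::unique_ptr<Thing>(new Bolt));
    return items;
}

// Walks the list without knowing any concrete class; name() is dispatched
// per element at run time.
std::string describe_all(const std::vector<std::unique_ptr<Thing>>& items) {
    std::string out;
    for (const auto& item : items) {
        assert(item != nullptr);
        if (!out.empty()) out += ",";
        out += item->name();
    }
    return out;
}
```

The list stores pointers rather than the objects themselves, since
objects of different classes may differ in size.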

But, I understand the problem there:  C++ declined to define a single
base class that EVERYTHING is an instance of (like class T in CLOS).
So everybody makes their own; every library I've browsed defines its OWN
class GenericThingy!  Gets messy!

If C++ would define it, that would seem to handle half the cases people
seem to want templates for, doesn't it?  Templates vs GenericThingy
classes seems to be a tradeoff with:
   Template               vs.  GenericThingy
   replicated object code vs.  not replicated
   `fast' function call   vs.  dispatch
Missing anything?
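
One way to picture the two columns of that tradeoff, as a sketch only.
Generic, Coin, Brick, PlainCoin and both total_weight functions are
hypothetical names invented for this example, and how much object code
a given compiler actually emits for each version is not something the
language specifies.

```cpp
#include <cassert>
#include <cstddef>

// GenericThingy style: one base class, virtual dispatch.
struct Generic {
    virtual ~Generic() {}
    virtual int weight() const = 0;
};
struct Coin : Generic {
    int weight() const override { return 5; }
};
struct Brick : Generic {
    int weight() const override { return 2000; }
};

// A single function serves every derived class; each weight() call is
// resolved at run time, and the array may mix classes.
int total_weight_generic(const Generic* const* items, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) {
        assert(items[i] != nullptr);
        total += items[i]->weight();
    }
    return total;
}

// Template style: no base class needed, but every element has the same type.
struct PlainCoin {
    int weight() const { return 5; }
};

// Instantiated separately for each T it is used with; the weight() call
// is an ordinary call whose target is known at compile time.
template <class T>
int total_weight_template(const T* items, std::size_t n) {
    int total = 0;
    for (std::size_t i = 0; i < n; ++i) total += items[i].weight();
    return total;
}
```

The sketch also hints at one more row for the table: the template
version works on a homogeneous array, while the GenericThingy version
can hold a mixture of classes.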

OK, I've rambled on, speculated, wondered out loud...
Flame me or congratulate me on my insight, but at least, TEACH ME!

sarima@tdatirv.UUCP (Stanley Friesen) (06/15/91)

In article <2885836642@ARTEMIS.cam.nist.gov> miller@FS1.cam.nist.gov (Bruce R. Miller) writes:
>Just how is it that Virtual functions work?  ie. Where's the type info
>stored at run time?  Do the objects have (hidden) headers?

This is implementation dependent.  Any mechanism that provides the required
semantics is acceptable.  (In real life it must also provide reasonable
performance.)

Thus *any* dependence on the implementation of virtual functions is
intrinsically unportable.

>As I understand it the code for SCAN is not replicated (there's only
>one version in the object code).

Not necessarily.  The best compilers do indeed accomplish this in most
cases, but it is not guaranteed.  (Though it is guaranteed that you will
never get any multiple definition errors from linking code using virtual
functions.)

> So somehow that code (or some dispatch
>code for VIRT) figures out what the arg is and dispatches.  How?  I can
>think of two ways offhand; 1) every class instance has a header with its type
>encoded -- which might be nice to know if you are worried about
>tradeoffs of space -- and that might only be needed if the class (or its
>children?) define virtuals (can the compiler tell?).  2) whenever a
>virtual (or caller of virtual! or caller of....) is called an extra type
>code is pushed on the stack (again, can the compiler tell?).
 
>Is either of these the `right' way? or both? (so long as it works, it
>doesn't matter...)

I think either would be a legal implementation. But I do not know of any
existing compiler that uses either one.

Most existing compilers implement virtual functions by adding an extra
hidden data member to the class record layout.  This member is a pointer
to a table of function pointers, with one entry per virtual function.
[This table is traditionally called the vtable]. The constructor for each
class (including each derived class) is augmented with code to initialize
this pointer to the proper value for the actual class of the object. This
is possible because you must know the exact type of an object to construct
it, so the constructor "knows" the type exactly.  A call to a virtual
function is then simply indirected through the vtable. In pseudo-C this is
essentially (*virt_object.__vtbl_ptr__[FUNC_ID])(args).
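
To make the pseudo-C above concrete, here is a hand-rolled emulation of
that scheme using plain structs and function pointers.  It is only a
sketch of the idea, not what any particular compiler emits: Object,
VirtualFn, the two tables, the make_ functions and the arithmetic are
all assumptions, and the hidden member is spelled vtbl_ptr because names
containing double underscores are reserved for the implementation.

```cpp
#include <cassert>

// Index of each virtual function in the table (the FUNC_ID of the pseudo-C).
enum FuncId { VIRT = 0, FUNC_COUNT };

struct Object;
typedef int (*VirtualFn)(const Object*);

struct Object {
    const VirtualFn* vtbl_ptr;  // the "hidden" member a compiler would add
    int value;                  // an ordinary data member
};

// The two implementations of VIRT, one per class.
int base_virt(const Object* self) { return self->value; }
int z_virt(const Object* self) { return self->value * 2; }

// One table per class, shared by all objects of that class.
const VirtualFn base_vtable[FUNC_COUNT] = { base_virt };
const VirtualFn z_vtable[FUNC_COUNT] = { z_virt };

// Stand-ins for constructors: each knows its exact class, so it can
// store the matching table pointer.
Object make_base(int v) { Object o = { base_vtable, v }; return o; }
Object make_z(int v) { Object o = { z_vtable, v }; return o; }

// Stand-in for SCAN: one copy of the code, with the call indirected
// through whatever table the object carries.
int scan(const Object* obj) {
    assert(obj != nullptr && obj->vtbl_ptr != nullptr);
    return (*obj->vtbl_ptr[VIRT])(obj) + 1;
}
```

With this sketch, scan applied to make_base(10) goes through base_virt
and scan applied to make_z(10) goes through z_virt, although scan never
tests the type of its argument.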

Now, remember - DO NOT RELY ON THIS IMPLEMENTATION - it is NOT guaranteed.

-- 
---------------
uunet!tdatirv!sarima				(Stanley Friesen)

miller@FS1.cam.nist.gov (Bruce R. Miller) (06/17/91)

In article <38@tdatirv.UUCP>, Stanley Friesen writes: 
> In article <2885836642@ARTEMIS.cam.nist.gov> miller@FS1.cam.nist.gov (Bruce R. Miller) writes:
> >Just how is it that Virtual functions work?  ie. Where's the type info
> >stored at run time?  Do the objects have (hidden) headers?
> ...
Thanks for the info.

> Now, remember - DO NOT RELY ON THIS IMPLEMENTATION - it is NOT guaranteed.

and thanks for the reminder.

I wasn't really interested in _using_ the information to hack up mystery
code, but in trying to understand what C++ `typically' does with a
program.  Given all the options in C++, I need a better model of what
goes on underneath to choose the `best' way to code foo.