[comp.object] Shouldn't debuggers be able to debug dynamic objects?

andrew@resam.dk (Leif Andrew Rump) (01/24/91)

I experienced a rather annoying feature in TurboPascal 5.5 (but I
think the problem may exist in a lot of other (integrated) debuggers
used to debug object oriented programmes).
The program enclosed creates two (2) dynamic objects: a square & a
box.
The first thing to notice is that the compiler only allows squareptr
to be used as a general object for squares & boxes! Well I could
accept that _IF_ the debugger (integrated & external (TD)) at least
would notice when I create a box! But no - it only displays the
content of a square - the length has to be extracted manually!!!

program dynamic;

type
  generalptr = ^general;
  general =
    object
      destructor done;               virtual;
    end;

  squareptr = ^square;
  square =
    object(general)
      height, width : real;

      constructor init(h, w : real);
    end;

  boxptr = ^box;
  box =
    object(square)
      length : real;

      constructor init(h, w, l : real);
    end;

  destructor general.done; begin end; (* everything is done behind our back *)

  constructor square.init(h, w : real);
  begin
    height := h;
    width := w;
  end;

  constructor box.init(h, w, l : real);
  begin
    square.init(h, w);
    length := l;
  end;

const
  n = 2;

var
  general_object : array(.1..n.) of squareptr;
  nr : integer;

begin
  for nr := 1 to n do
    if odd(nr) then
      general_object(.nr.) := new(squareptr, init(3, 4))
    else
      general_object(.nr.) := new(boxptr, init(3, 4, 2));

  for nr := 1 to n do
    dispose(general_object(.nr.), done);
end.


Leif Andrew Rump, AmbraSoft A/S, Stroedamvej 50, DK-2100 Copenhagen OE, Denmark
UUCP: andrew@ambra.dk, phone: +45 39 27 11 77                /
Currently at Scandinavian Airline Systems                =======/
UUCP: andrew@resam.dk, phone: +45 32 32 51 54                \
SAS, RESAM Project Office, CPHML-V, P.O.BOX 150, DK-2770 Kastrup, Denmark

> > Read oe as: o <backspace> / (slash) and OE as O <backspace> / (slash) < <

bytor@ctt.bellcore.com (Ross Huitt) (01/25/91)

In article <1991Jan24.152256.24273@resam.dk> andrew@resam.dk (Leif Andrew Rump) writes:
>I experienced a rather annoying feature in TurboPascal 5.5 (but I
>think the problem may exist in a lot of other (integrated) debuggers
>used to debug object oriented programmes).
>The program enclosed creates two (2) dynamic objects: a square & a
>box.
>The first thing to notice is that the compiler only allows squareptr
>to be used as a general object for squares & boxes! Well I could
>accept that _IF_ the debugger (integrated & external (TD)) at least
>would notice when I create a box! But no - it only displays the
>content of a square - the length has to be extracted manually!!!
>
...code deleted

When I _know_ which derived class a base class ptr is pointing to
and I want to inspect or display the object as a derived class, a cast
is usually sufficient to get the job done.  With the Turbo products,
simply type in 'squareptr(general_object(1))' at the Display or 
Evaluate prompts.  I don't know for sure that the technique works with
Turbo Pascal or if this particular syntax is correct, but I do use the
technique with C++ frequently.
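
In C++ terms the cast looks roughly like the sketch below. The Square and
Box types are hypothetical stand-ins for the Pascal objects in the original
post, and the cast is only safe because the caller already knows which
element holds a box; static_cast checks nothing at run time.

```cpp
#include <cassert>

// Hypothetical C++ stand-ins for the Pascal square and box objects.
struct Square {
    double height, width;
    Square(double h, double w) : height(h), width(w) {}
    virtual ~Square() {}
};

struct Box : Square {
    double length;
    Box(double h, double w, double l) : Square(h, w), length(l) {}
};

// Valid only when the caller already knows p points to a Box:
// static_cast performs no run-time check.
double lengthOfKnownBox(Square *p) {
    return static_cast<Box *>(p)->length;
}

// Mirrors the original program: element 0 is a square, element 1 a box.
void demo() {
    Square *objects[2] = { new Square(3, 4), new Box(3, 4, 2) };
    assert(lengthOfKnownBox(objects[1]) == 2.0);
    delete objects[0];
    delete objects[1];
}
```

Applying the same cast to objects[0] would compile just as happily and read
past the end of the square, which is exactly the kind of mistake the manual
approach invites.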

	-ross aka bytor@ctt.bellcore.com

andrew@resam.dk (Leif Andrew Rump) (01/28/91)

In <1991Jan25.145947.16431@bellcore.bellcore.com> bytor@ctt.bellcore.com (Ross Huitt) writes:

>In article <1991Jan24.152256.24273@resam.dk> andrew@resam.dk (Leif Andrew Rump) writes:
>>I experienced a rather annoying feature in TurboPascal 5.5 (but I
>>think the problem may exist in a lot of other (integrated) debuggers
>>used to debug object oriented programmes).
>>The program enclosed creates two (2) dynamic objects: a square & a
>>box.
>>The first thing to notice is that the compiler only allows squareptr
>>to be used as a general object for squares & boxes! Well I could
>>accept that _IF_ the debugger (integrated & external (TD)) at least
>>would notice when I create a box! But no - it only displays the
>>content of a square - the length has to be extracted manually!!!
>>
>...code deleted
>>

>When I _know_ which derived class a base class ptr is pointing to
>and I want to inspect or display the object as a derived class, a cast
>is usually sufficient to get the job done.  With the Turbo products,
>simply type in 'squareptr(general_object(1))' at the Display or 
>Evaluate prompts.  I don't know for sure that the technique works with
>Turbo Pascal or if this particular syntax is correct, but I do use the
>technique with C++ frequently.

But, but, but, ... Why can't the debugger that has _all_ the information
do this job! The main reason for me to write this letter was that one
of my programmes used a dynamic list of objects and if I was going to
use the debugger to show me the content of for example the first three
objects in a list I had to remember what type each object was and change
casts accordingly - then (as I wrote in my original letter) I surely
will make mistakes and then the debugger only causes havoc!!!

Leif Andrew



bytor@ctt.bellcore.com (Ross Huitt) (01/28/91)

In article <1991Jan28.100213.1822@resam.dk> andrew@resam.dk (Leif Andrew Rump) writes:
>In <1991Jan25.145947.16431@bellcore.bellcore.com> bytor@ctt.bellcore.com (Ross Huitt) writes:
>
>>When I _know_ which derived class a base class ptr is pointing to
>>and I want to inspect or display the object as a derived class, a cast
>>is usually sufficient to get the job done. 
>
>But, but, but, ... Why can't the debugger that has _all_ the information
>do this job! The main reason for me to write this letter was that one
>of my programmes used a dynamic list of objects and if I was going to
>use the debugger to show me the content of for example the first three
>objects in a list I had to remember what type each object was and change
>casts accordingly - then (as I wrote in my original letter) I surely
>will make mistakes and then the debugger only causes havoc!!!
>
>Leif Andrew

The debugger does not necessarily have _all_ of the information for
objects that are dynamically allocated.  If a derived object is later
referenced by a base pointer the only piece of information I can
think of that a debugger could use to determine the actual type of
the object would be the pointer to the virtual function table.  If the
classes involved don't have any virtuals then even this won't work.  (I know
there are other cases, too, so don't bother flaming me.) To accomplish
what you want would require support from the compiler such as embedding
a class signature in each instance of a class.  The debugger would then
read this signature to determine the class of the object regardless
of the class of the pointer.  This level of meta-class support would
be very welcome especially if we programmers had access to it.  It's
one aspect of Smalltalk that I miss very much in C++. BTW, there
are programming techniques that you can use to determine the classes of your
objects at run time.  You would still have to use casts in the debugger,
but at least you could execute a member function to determine the
class of the object before you used a cast.
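
One way to read the "class signature" idea is a hand-written tag stored in
every instance. The sketch below is only an illustration under that
assumption: ClassId, classId() and volumeOrArea() are made-up names, and
nothing here is generated by a compiler. It needs no virtual functions at
all, since the tag is ordinary data.

```cpp
#include <cassert>

// Hypothetical class signature stored in every instance.
enum ClassId { SquareId, BoxId };

class Square {
public:
    Square(double h, double w) : height(h), width(w), id(SquareId) {}
    ClassId classId() const { return id; }   // not virtual: just reads the tag
    double height, width;
protected:
    // Derived classes pass their own signature down to the base.
    Square(double h, double w, ClassId derivedId)
        : height(h), width(w), id(derivedId) {}
private:
    ClassId id;
};

class Box : public Square {
public:
    Box(double h, double w, double l) : Square(h, w, BoxId), length(l) {}
    double length;
};

// Ask the object what it is first, and only then cast.
double volumeOrArea(const Square *p) {
    if (p->classId() == BoxId) {
        const Box *b = static_cast<const Box *>(p);
        return b->height * b->width * b->length;
    }
    return p->height * p->width;
}

void demo() {
    Square s(3, 4);
    Box b(3, 4, 2);
    assert(volumeOrArea(&s) == 12.0);
    assert(volumeOrArea(&b) == 24.0);
}
```

In a debugger one would evaluate p->classId() by hand and then pick the
matching cast, so the cast is still manual but no longer a guess.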

	-bytor@ctt.bellcore.com

mas@genrad.com (Mark A. Swanson) (01/29/91)

Determining what a dynamic object is is not always possible in raw C++.
However, if one is developing within a development environment with a standard
set of object coding conventions (e.g. a required char *myclass() method)
an integrated debugger can use them to provide such functionality.

Such a layered approach is practical now and avoids having to wait years
for invention and agreement on how best to support meta-identities without
adding to C++'s minimal object overhead.
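
A minimal sketch of such a convention, assuming every class in the
environment overrides a virtual myclass() method. The post writes it as
char *myclass(); the const-qualified form below is a small liberty, and
isExactly() is a hypothetical helper standing in for whatever the
integrated debugger would do with the name.

```cpp
#include <cassert>
#include <cstring>

class General {
public:
    virtual ~General() {}
    // The convention: every class reports its own name.
    virtual const char *myclass() const { return "General"; }
};

class Square : public General {
public:
    virtual const char *myclass() const { return "Square"; }
};

class Box : public Square {
public:
    virtual const char *myclass() const { return "Box"; }
};

// Hypothetical tool-side check: compares the reported name exactly,
// so a Box is not reported as a Square here.
bool isExactly(const General *p, const char *name) {
    return std::strcmp(p->myclass(), name) == 0;
}

void demo() {
    Box b;
    General *p = &b;
    assert(isExactly(p, "Box"));
    assert(!isExactly(p, "Square"));
}
```

The cost is one virtual function per class and the discipline to override
it everywhere; a class that forgets silently reports its parent's name.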

shankar@hplego.cup.hp.com (Shankar Unni) (01/30/91)

bytor@ctt.bellcore.com (Ross Huitt) writes:

> >But, but, but, ... Why can't the debugger that has _all_ the information
> >do this job! [ ... ]

> The debugger does not necessarily have _all_ of the information for
> objects that are dynamically allocated.  If a derived object is later
> referenced by a base pointer the only piece of information I can
> think of that a debugger could use to determine the actual type of
> the object would be the pointer to the virtual function table.  If the
> classes involved don't have any virtuals then even this won't work. [...]

The HP C++ debugger (on HP-UX) uses exactly this sort of scheme to figure
out what the real type of the object is. And of course, as you point out,
if the classes involved do not have any virtual functions, it is helpless
as far as figuring out the "real type".

In the latter case, it is possible to rationalize this behavior by saying
that the program itself cannot distinguish between a base_ptr pointing to
a base and a base_ptr pointing to a derived when there are no virtual
functions involved...

(Of course, the compiler could have gone and stuck a type-id into each object,
but that would break lots of other things)
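
The same distinction can be seen from inside a program with the typeid
operator, which was standardized for C++ after this thread was written. The
sketch below assumes a compiler with run-time type information enabled; all
type and function names are made up for the illustration.

```cpp
#include <cassert>
#include <typeinfo>

// With at least one virtual function, the object carries enough
// information for its real type to be recovered through a base pointer.
struct PolyBase { virtual ~PolyBase() {} };
struct PolyDerived : PolyBase {};

// Without virtual functions there is nothing in the object to consult.
struct PlainBase { int x; };
struct PlainDerived : PlainBase { int y; };

bool seesDerivedPoly(const PolyBase *p) {
    return typeid(*p) == typeid(PolyDerived);    // looks at the object itself
}

bool seesDerivedPlain(const PlainBase *p) {
    return typeid(*p) == typeid(PlainDerived);   // uses only the pointer's static type
}

void demo() {
    PolyDerived pd;
    PlainDerived pl;
    assert(seesDerivedPoly(&pd));      // real type found
    assert(!seesDerivedPlain(&pl));    // reported as PlainBase
}
```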
-----
Shankar Unni                                   E-Mail:
Hewlett-Packard California Language Lab.     Internet: shankar@hpda.hp.com
Phone : (408) 447-5797                           UUCP: ...!hplabs!hpda!shankar

DISCLAIMER:
This response does not represent the official position of, or statement by,
the Hewlett-Packard Company.  The above data is provided for informational
purposes only.  It is supplied without warranty of any kind.

tma@m5.COM (Tim Atkins) (02/28/91)

In article <40541@genrad.UUCP> mas@crom.genrad.COM (Mark A. Swanson) writes:
>Determining what a dynamic object is is not always possible in raw C++.
>However, if one is developing within a development environment with a standard
>set of object coding conventions (e.g. a required char *myclass() method)
>an integrated debugger can use them to provide such functionality.
>
>Such a layered approach is practical now and avoids having to wait years
>for invention and agreement on how best to support meta-identities without
>adding to C++'s minimal object overhead.

I fail to understand the phrase "minimal object overhead" regarding
C++.  Surely such can not be claimed in terms of space.  If one cares
about issues of polymorphism for a given set of classes then each instance
will contain an overhead of at least a 4 byte virtual table pointer for
every class in its inheritance hierarchy that defines virtual functions.
In comparison, languages that only include a single pointer to the object's
class where information allowing message binding is to be found are much
more minimal in terms of space overhead.  Nor is this the complete story
on C++ space overhead.  Due to the way MI is implemented in relation to
the implementation requirement that all superclass variables be contiguous,
it is quite possible that there is duplication of entire substructures
in the C++ instance.   I understand that the goal of these implementation
practices is greater run-time efficiency,  but certainly no one should
claim they came without price nor that C++ achieves "minimal object
overhead".

I also have considerable problem with C++ in that given an instance one
cannot know its class at run-time.  In my view this means that the object
is in some sense not truly first class and possesses only a partial identity.
Among other problems this totally blows away encapsulation in that the
behavior of the object and indeed any metalevel information is known only
to the compiler.  Even within cfront this information is never explicitly
present from what I have seen.   C++ instances are really only data structures
with most of their knowledge not present at all within running programs.  They
are not objects in the sense that they have knowledge present within themselves
(or their class) of their own structure or behavior.  They literally do not
know what they are.

- Tim

jbuck@galileo.berkeley.edu (Joe Buck) (03/01/91)

In article <4578@m5.COM>, tma@m5.COM (Tim Atkins) writes:
|> I fail to understand the phrase "minimal object overhead" regarding
|> C++.  Surely such can not be claimed in terms of space.  If one cares
|> about issues of polymorphism for a given set of classes then each instance
|> will contain an overhead of at least a 4 byte virtual table pointer for
|> every class in its inheritance hierarchy that defines virtual functions.

Perhaps an implementation could be written this way, but both cfront and g++
use only a single pointer per object, regardless of its class hierarchy.

|> In comparison, languages that only include a single pointer to the object's
|> class where information allowing message binding is to be found are much
|> more minimal in terms of space overhead.

You've just described C++ (there is only a single pointer to the object's
virtual function table per object in most implementations, and there is
only one virtual function table per class).  However, since C++ also
allows classes with no virtual functions, it's possible to have no
overhead at all.
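
The claim is easy to check with sizeof on a given compiler. The sketch
below uses made-up types; the exact numbers are implementation-dependent,
and the equalities in demo() are what the common vtable-based compilers
produce rather than anything the language promises.

```cpp
#include <cassert>
#include <cstddef>

// No virtual functions: just the two data members.
struct Plain { double height, width; };

// A single-inheritance chain in which every level adds virtual functions.
struct A { virtual ~A() {} double height, width; };
struct B : A { virtual void f() {} };
struct C : B { virtual void g() {} };

// Bytes added to each object by having virtual functions at all.
std::size_t hiddenBytes() { return sizeof(A) - sizeof(Plain); }

// True when deeper levels of the chain add nothing per object.
bool chainAddsNothing() {
    return sizeof(B) == sizeof(A) && sizeof(C) == sizeof(A);
}

void demo() {
    assert(sizeof(Plain) == 2 * sizeof(double));   // no overhead at all
    assert(hiddenBytes() >= sizeof(void *));       // one hidden pointer, plus any padding
    assert(chainAddsNothing());                    // still one pointer three levels down
}
```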

|> Nor is this the complete story
|> on C++ space overhead.  Due to the way MI is implemented in relation to
|> the implementation requirement that all superclass variables be contiguous,
|> it is quite possible that there is duplication of entire substructures
|> in the C++ instance.   I understand that the goal of these implementation
|> practices is greater run-time efficiency,  but certainly no one should
|> claim they came without price nor that C++ achieves "minimal object
|> overhead".

Will you actually learn the language before you criticize it?  You're
stating imaginary requirements and reaching false conclusions.  Base
classes can be declared as virtual and shared if desired in multiple
inheritance hierarchies.  Alternatively, there can be duplication if
the programmer chooses.  All of these "wasted space" arguments you're
making are based on misunderstandings of the language.
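
A small sketch of the two choices, using made-up class names. With ordinary
inheritance the joined class holds two separate Base subobjects; with
virtual inheritance it holds one shared Base, so both paths lead to the
same address.

```cpp
#include <cassert>

struct Base { int value; };

// Ordinary inheritance: Base is duplicated, once per path.
struct LeftDup : Base {};
struct RightDup : Base {};
struct JoinedDup : LeftDup, RightDup {};

// Virtual inheritance: one Base shared by both paths.
struct LeftShared : virtual Base {};
struct RightShared : virtual Base {};
struct JoinedShared : LeftShared, RightShared {};

// True when the two paths reach different Base subobjects.
bool duplicated(JoinedDup &j) {
    Base *viaLeft = static_cast<LeftDup *>(&j);
    Base *viaRight = static_cast<RightDup *>(&j);
    return viaLeft != viaRight;
}

// True when the two paths reach the same Base subobject.
bool shared(JoinedShared &j) {
    Base *viaLeft = static_cast<LeftShared *>(&j);
    Base *viaRight = static_cast<RightShared *>(&j);
    return viaLeft == viaRight;
}

void demo() {
    JoinedDup d;
    JoinedShared s;
    assert(duplicated(d));
    assert(shared(s));
}
```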

There is no implementation requirement that all superclass variables
be contiguous.
 
|> I also have considerable problem with C++

I think I'll just cut you off right there.  It seems appropriate. :-)

C++ certainly has flaws, but you aren't yet competent enough to write
about them since your understanding of the language has serious holes.

--
Joe Buck
jbuck@galileo.berkeley.edu	 {uunet,ucbvax}!galileo.berkeley.edu!jbuck

peter@prefect.Berkeley.EDU (Peter Moore) (03/02/91)

In article <4578@m5.COM>, tma@m5.COM (Tim Atkins) writes:
|> I fail to understand the phrase "minimal object overhead" regarding
|> C++.  Surely such can not be claimed in terms of space.  If one cares
|> about issues of polymorphism for a given set of classes then each instance
|> will contain an overhead of at least a 4 byte virtual table pointer for
|> every class in its inheritance hierarchy that defines virtual functions.

You are mistaking features of one particular implementation of C++ with
features of the language itself.  Virtual tables (vtbls for short) ARE
NOT PART OF THE LANGUAGE.  They are merely one way of implementing the
virtual function behavior of the language.

Secondly, the particular implementation you are probably referring to,
cfront from ATT, does NOT have a vtbl pointer for every super class
that has virtual functions.  In the single inheritance case, there is
at most one vtbl pointer no matter how many classes in the inheritance
chain have virtual functions.  Multiple inheritance is more
complicated.

|> Due to the way MI is implemented in relation to
|> the implementation requirement that all superclass variables be contiguous,
|> it is quite possible that there is duplication of entire substructures
|> in the C++ instance.

Again, this is NOT A CHARACTERISTIC of the language.  The people at Tau Metric
and Oregon Software would probably be glad to tell you about how their compiler
doesn't make any redundant copies of base classes.


	Peter Moore

jimad@microsoft.UUCP (Jim ADCOCK) (03/05/91)

Sorry, but I really have to question how familiar this poster is
with C++:

In article <4578@m5.COM> tma@m5.UUCP (Tim Atkins) writes:
|I fail to understand the phrase "minimal object overhead" regarding
|C++.  Surely such can not be claimed in terms of space.  If one cares
|about issues of polymorphism for a given set of classes then each instance
|will contain an overhead of at least a 4 byte virtual table pointer for
|every class in its inheritance hierarchy that defines virtual functions.

I have never seen any C++ compiler that requires 4 bytes per object
per class in the inheritance hierarchy.  Most C++ objects as compiled
by most C++ compilers only require a 2/4 byte overhead per object
regardless of how many classes are in the inheritance hierarchy.
This is regardless of how many places in the inheritance hierarchy
virtual functions are defined.  In the infrequent case of multiple
inheritance, one common C++ implementation scheme requires an extra
2/4 byte overhead per additional class being immediately
[multiply] inherited from.  But this is still far less than the 4 bytes
per class in the inheritance hierarchy you claim.
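
The multiple-inheritance case can be measured the same way. This is a
sketch with made-up types, and the result is whatever the compiler at hand
chooses to do; with the common scheme described above, each additional
polymorphic base costs one more hidden pointer.

```cpp
#include <cassert>
#include <cstddef>

struct Drawable  { virtual ~Drawable() {} };
struct Printable { virtual ~Printable() {} };

// Single inheritance, with a new virtual function added in the derived class.
struct Single : Drawable { virtual void extra() {} };

// Multiple inheritance from two polymorphic bases.
struct Both : Drawable, Printable {};

// Per-object cost of the second polymorphic base.
std::size_t extraBytesForSecondBase() {
    return sizeof(Both) - sizeof(Single);
}

void demo() {
    assert(sizeof(Single) == sizeof(Drawable));            // no growth in a single chain
    assert(extraBytesForSecondBase() >= sizeof(void *));   // one more pointer for the second base
}
```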

|In comparison, languages that only include a single pointer to the object's
|class where information allowing message binding is to be found are much
|more minimal in terms of space overhead.  Nor is this the complete story
|on C++ space overhead.  Due to the way MI is implemented in relation to
|the implementation requirement that all superclass variables be contiguous,
|it is quite possible that there is duplication of entire substructures
|in the C++ instance.   

C++ does not require that all superclass variables be contiguous.
Rather, it only requires variables declared within one labeled section
within one class declaration be contiguous.  C++ substructures are
duplicated if C++ programmers require substructures be duplicated.
If C++ programmers desire substructures be shared, that can also be
specified.

|I understand that the goal of these implementation
|practices is greater run-time efficiency,  but certainly no one should
|claim they came without price nor that C++ achieves "minimal object
|overhead".

I will not claim that the common C++ implementation choices come without
price, but I will claim that C++ remains a leader in OOPL
efficiency, and that is why more people are using C++ than any other OOPL.
Most C++ objects have a single 2/4 byte "vtable ptr" used to represent
their class type, and require one extra indirection for virtual
function dispatch.  This is not minimal -- it's just a lot better than just
about all other OOPLs.  In addition, in many simple situations C++
inline functions can be used, frequently executing 20-100 times faster
than C++ virtual functions, which in turn makes them 200-1000 times faster
than some traditional OOPL dispatches.  I claim inlining or other
"customization" techniques are necessary to make OOPLs sufficiently
efficient to attract mainstream programming use.

If C++ has a weakness in "efficiency" it is that under some situations
trivial class derivations making slight changes in virtual functions
require entirely new dispatch tables to be created for a class.  Thus,
if C++ classes typically inherit hundreds of virtual functions, but only
slightly modify one or two of those inherited functions, then the size
costs of the dispatch tables can become expensive.  Still, this is not
a restriction of C++, but rather of current compiler implementations.
Alternative compiler approaches could avoid the large dispatch table costs
even in these infrequent situations.

|I also have considerable problem with C++ in that given an instance one
|cannot know its class at run-time.  In my view this means that the object
|is in some sense not truly first class and possesses only a partial identity.
|Among other problems this totally blows away encapsulation in that the
|behavior of the object and indeed any metalevel information is known only
|to the compiler.  Even within cfront this information is never explicitly
|present from what I have seen.   C++ instances are really only data structures
|with most of their knowledge not present at all within running programs.  They
|are not objects in the sense that they have knowledge present within themselves
|(or their class) of their own structure or behavior.  They literally do not
|know what they are.

I counter-claim that C++ objects do know what they are, they just aren't
currently allowed to tell you. :-)  The reason for this is that
when programmers can find out an object's type, so many programmers
write bad code depending on an object's type, thus destroying code
reuse.  I, for one, am lobbying for allowing C++ objects to tell you some
things about themselves.  It would then be up to programmers whether they
choose to write bad code or not.
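
For illustration, this is roughly what "telling you" looks like with
dynamic_cast, which became part of standard C++ after this thread. The
sketch assumes a compiler with run-time type information enabled, and the
Square and Box types are hypothetical stand-ins for the objects in the
original Pascal program.

```cpp
#include <cassert>
#include <cstddef>

struct Square {
    double height, width;
    Square(double h, double w) : height(h), width(w) {}
    virtual ~Square() {}   // at least one virtual function is required
};

struct Box : Square {
    double length;
    Box(double h, double w, double l) : Square(h, w), length(l) {}
};

// Walks a mixed list and asks each object whether it is a Box.
// dynamic_cast yields a null pointer when the object is not one.
std::size_t countBoxes(Square *const *objects, std::size_t n) {
    std::size_t boxes = 0;
    for (std::size_t i = 0; i < n; ++i) {
        if (dynamic_cast<Box *>(objects[i]) != 0) {
            ++boxes;
        }
    }
    return boxes;
}

void demo() {
    Square s(3, 4);
    Box b(3, 4, 2);
    Square *list[2] = { &s, &b };
    assert(countBoxes(list, 2) == 1);
}
```

Code that branches on the result like this is also the style being warned
about above: every new derived class means revisiting each such test.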

I do agree with you that C++ objects do have some identity weaknesses when
using multiple inheritance.  It would not be hard to fix such weaknesses
along with allowing objects to report some run-time type information 
about themselves.

tma@m5.COM (Tim Atkins) (03/23/91)

In article <11550@pasteur.Berkeley.EDU> jbuck@galileo.berkeley.edu (Joe Buck) writes:
>
>Perhaps an implementation could be written this way, but both cfront and g++
>use only a single pointer per object, regardless of its class hierarchy.

Joe,

You quite simply do not know what you are talking about when it comes to
cfront.  I haven't tried g++ to check your contention there.

- Tim

tma@m5.COM (Tim Atkins) (03/23/91)

In article <1991Mar2.052423.3231@objy.com> peter@objy.com writes:
>In article <4578@m5.COM>, tma@m5.COM (Tim Atkins) writes:
>|> I fail to understand the phrase "minimal object overhead" regarding
>|> C++.  Surely such can not be claimed in terms of space.  If one cares
>|> about issues of polymorphism for a given set of classes then each instance
>|> will contain an overhead of at least a 4 byte virtual table pointer for
>|> every class in its inheritance hierarchy that defines virtual functions.
>
>You are mistaking features of one particular implementation of C++ with
>features of the language itself.  Virtual tables (vtbls for short) ARE
>NOT PART OF THE LANGUAGE.  They are merely one way of implementing the
>virtual function behavior of the language.

True enough.  I am not mistaking implementation for language, however.
I am merely referring to the nearest thing to a
"standard" we have today, namely Cfront from ATT.  Unless things
have changed radically in later versions what I wrote also applies
to g++, as I remember it.  These two implementations unfortunately seem
to cover the majority of workstation users I find posts from on
comp.lang.c++.  I'm not sure how Borland, Zortech, or other
companies do things.  I would be very pleased if
someone would enlighten me on alternate strategies that are *market
realities*.  It is no big challenge merely to think of alternate strategies.

>
>Secondly, the particular implementation you are probably referring to,
>cfront from ATT, does NOT have a vtbl pointer for every super class
>that has virtual functions.  In the single inheritance case, there is
>at most one vtbl pointer no matter how many classes in the inheritance
>chain have virtual functions.  Multiple inheritance is more
>complicated.

False, or at best only partially correct.  Before version 2.0, when there was
no MI, you are correct.  After 2.0, even hierarchies with no multiple inheritance
will have multiple vtbl pointers, one for each class in the hierarchy that
defines new virtual functions.
>
>|> Due to the way MI is implemented in relation to
>|> the implementation requirement that all superclass variables be contiguous,
>|> it is quite possible that there is duplication of entire substructures
>|> in the C++ instance.
>
>Again, this is NOT A CHARACTERISTIC of the language.  The people at Tau Metric
>and Oregon Software would probably be glad to tell you about how their compiler
>doesn't make any redundant copies of base classes.

I would love to hear more about how these and other companies do things.  But
do note that I did not say these latter things are language features.  I 
specifically tied it to particular implementation practices in at least
two places shown in the quoted material.

- Tim Atkins