[comp.object] Overhead in object tables

pkr@media01.UUCP (Peter Kriens) (02/05/91)

I found the following message in the object group:

>There is more than the indirect reference.  There is the virtual function
>table which must be maintained for each object.  If you are dealing with
>many (thousands) of small (2-4 int) objects, by using virtual functions
>you are adding at least 25%-50% extra size to your objects (assuming ints
>and the index into the virtual function table to be equal in size - it's
>probably worse than that).

First, how many objects of 2-4 ints are there in a reasonable system? Even an
int object takes 4 bytes on any system, and Smalltalk, for example, has
a solution that lets small integers take no object space at all. In our own
object-oriented system, CO2, we found a reasonable modal object size of
32 bytes.

This means that even if you used this kind of calculation, the
overhead is much less.
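
The arithmetic behind both estimates can be sketched as follows. This is only
an illustration: the 4-byte int and 4-byte per-object header (vtable pointer
or table index) are the assumptions of the quoted message, and the function
name is made up for this example.

```cpp
#include <cassert>
#include <cstddef>

// Per-object overhead of a fixed-size header (a vtable pointer or an
// object-table index), as a percentage of the payload it is attached to.
// Hypothetical helper, not taken from any of the systems discussed.
constexpr double overhead_percent(std::size_t payload_bytes,
                                  std::size_t header_bytes) {
    return 100.0 * static_cast<double>(header_bytes) /
           static_cast<double>(payload_bytes);
}

// Assumed sizes, following the quoted estimate: 4-byte ints, 4-byte header.
constexpr std::size_t kIntBytes = 4;
constexpr std::size_t kHeaderBytes = 4;

inline void check_quoted_figures() {
    // Objects of 2 and 4 ints: the 50% and 25% figures of the quoted message.
    assert(overhead_percent(2 * kIntBytes, kHeaderBytes) == 50.0);
    assert(overhead_percent(4 * kIntBytes, kHeaderBytes) == 25.0);
    // A 32-byte modal object: the same header costs 12.5%.
    assert(overhead_percent(32, kHeaderBytes) == 12.5);
}
```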

Not to mention that the object table contains useful information, such as
class and size, that is needed for dynamic binding and would almost always
be stored in the object's own memory anyway.

The strange thing is that in most systems the load on memory is actually
less than in pointer-based systems. Smalltalk V286 in a loaded image has about
20000 objects. This means that an object pointer could be represented in
16 bits. Because in an object-oriented system most references are to objects,
this actually saves 16 of the 32 bits C++ uses to refer to an object.
(By the way, V286 itself uses pointers, so there is no difference
there.)
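
A small sketch of the saving, assuming references are stored as 16-bit
indexes into an object table. The type names are invented for the example,
and the 32-bit pointer is modelled with a fixed-width integer so that the
numbers do not depend on the machine compiling it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical reference types for the comparison.
using Handle = std::uint16_t;     // index into an object table
using Pointer32 = std::uint32_t;  // stand-in for a 32-bit machine pointer

// About 20000 live objects, the figure given for a loaded image.
constexpr unsigned kLoadedImageObjects = 20000;

// An object whose two fields both refer to other objects.
struct PairWithHandles {
    Handle first;
    Handle second;
};

struct PairWithPointers {
    Pointer32 first;
    Pointer32 second;
};

inline void example() {
    // Every live object can be named by a 16-bit handle.
    assert(kLoadedImageObjects <= 65536u);
    // On common ABIs the handle version is half the size: 4 bytes versus 8.
    assert(sizeof(PairWithHandles) == 4);
    assert(sizeof(PairWithPointers) == 8);
}
```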

The other advantage of an object table is the fact that you can enumerate
all the instances of a certain class. This can only be done when you
have an object table or link the objects together (much harder).
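
As a rough illustration of enumeration through an object table, here is a
minimal sketch. The entry layout (class, size, body pointer) and all names
are assumptions made for the example, not the actual layout of CO2 or any
Smalltalk system.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

using Handle = std::uint16_t;   // hypothetical 16-bit object reference
using ClassId = std::uint16_t;  // hypothetical class identifier

// One table slot per object: the class and size needed for dynamic
// binding, plus the location of the object's body.
struct TableEntry {
    ClassId class_id;
    std::uint16_t size;
    void* body;  // nullptr would mark a free slot
};

class ObjectTable {
public:
    // Registers an object and returns its handle.
    Handle add(ClassId class_id, std::uint16_t size, void* body) {
        assert(entries_.size() < 65536u);  // handles are 16 bits wide
        entries_.push_back(TableEntry{class_id, size, body});
        return static_cast<Handle>(entries_.size() - 1);
    }

    const TableEntry& at(Handle h) const { return entries_.at(h); }

    // Enumerating instances is one linear pass over the table.
    std::vector<Handle> instances_of(ClassId class_id) const {
        std::vector<Handle> found;
        for (std::size_t i = 0; i < entries_.size(); ++i) {
            const TableEntry& e = entries_[i];
            if (e.body != nullptr && e.class_id == class_id) {
                found.push_back(static_cast<Handle>(i));
            }
        }
        return found;
    }

private:
    std::vector<TableEntry> entries_;
};
```

The pass touches every slot, so its cost grows with the total number of
objects rather than with the number of instances of the class asked for.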

Peter Kriens
email : pkr@media01.uucp or hp4nl!media01!pkr

moss@cs.umass.edu (Eliot Moss) (02/05/91)

In article <1913@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:

   The other advantage of an object table is the fact that you can enumerate
   all the instances of a certain class. This can only be done when you
   have an object table or link the objects (much harder). 

You can enumerate objects even *without* an object table if you make the heap
scannable. This is easy to do for Smalltalk, I expect more difficult to do for
C++ without a little additional overhead to help you find object boundaries
and such. It would not be a problem for our Modula-3 memory management system.
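
One way to picture a scannable heap is a bump allocator that writes a small
header in front of every object, so a scan can hop from one object to the
next. The sketch below is an assumption-laden illustration (header layout,
names, 8-byte rounding), not the Smalltalk or Modula-3 implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <vector>

// Header written in front of every object: the "little additional
// overhead" that marks object boundaries and records the class.
struct Header {
    std::uint32_t class_id;
    std::uint32_t payload_bytes;  // already rounded up to a multiple of 8
};

class ScannableHeap {
public:
    explicit ScannableHeap(std::size_t capacity) : bytes_(capacity), used_(0) {}

    // Bump-allocates header plus payload; returns nullptr when full.
    void* allocate(std::uint32_t class_id, std::uint32_t payload_bytes) {
        const std::size_t rounded =
            (static_cast<std::size_t>(payload_bytes) + 7) & ~std::size_t{7};
        if (used_ + sizeof(Header) + rounded > bytes_.size()) return nullptr;
        const Header h{class_id, static_cast<std::uint32_t>(rounded)};
        std::memcpy(&bytes_[used_], &h, sizeof h);
        void* payload = &bytes_[used_ + sizeof(Header)];
        used_ += sizeof(Header) + rounded;
        return payload;
    }

    // Walks the heap from the start, stepping over each object by the
    // size stored in its header. No object table is involved.
    std::vector<void*> instances_of(std::uint32_t class_id) {
        std::vector<void*> found;
        std::size_t pos = 0;
        while (pos < used_) {
            Header h;
            std::memcpy(&h, &bytes_[pos], sizeof h);
            if (h.class_id == class_id) {
                found.push_back(&bytes_[pos + sizeof(Header)]);
            }
            pos += sizeof(Header) + h.payload_bytes;
        }
        return found;
    }

private:
    std::vector<unsigned char> bytes_;
    std::size_t used_;
};
```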
--

		J. Eliot B. Moss, Assistant Professor
		Department of Computer and Information Science
		Lederle Graduate Research Center
		University of Massachusetts
		Amherst, MA  01003
		(413) 545-4206, 545-1249 (fax); Moss@cs.umass.edu

sakkinen@tukki.jyu.fi (Markku Sakkinen) (02/06/91)

In article <MOSS.91Feb5094917@ibis.cs.umass.edu> moss@cs.umass.edu writes:
>In article <1913@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:
>
>   The other advantage of an object table is the fact that you can enumerate
>   all the instances of a certain class. This can only be done when you
>   have an object table or link the objects (much harder). 
>
>You can enumerate objects even *without* an object table if you make the heap
>scannable. This is easy to do for Smalltalk, I expect more difficult to do for
>C++ without a little additional overhead to help you find object boundaries
>and such. It would not be a problem for our Modula-3 memory management system.

I would believe that the task is _impossible_ for C++.
(Even if you by some means knew the boundaries of an object,
you would normally not be able to tell its class.)
However, some extremely sophisticated implementation, completely
different from the standard ones, might perhaps make it possible.
Such an implementation should be focussed on a strong object orientation
instead of saving storage.
Of course the programmer-accessible feature for scanning through
all existing instances of a given class would itself be an extension
of the standard language.

Markku Sakkinen
Department of Computer Science and Information Systems
University of Jyvaskyla (a's with umlauts)
PL 35
SF-40351 Jyvaskyla (umlauts again)
Finland
          SAKKINEN@FINJYU.bitnet (alternative network address)

jimad@microsoft.UUCP (Jim ADCOCK) (02/14/91)

In article <1991Feb6.111013.8412@tukki.jyu.fi> sakkinen@jytko.jyu.fi (Markku Sakkinen) writes:
|In article <MOSS.91Feb5094917@ibis.cs.umass.edu> moss@cs.umass.edu writes:
|>In article <1913@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:
|>
|>   The other advantage of an object table is the fact that you can enumerate
|>   all the instances of a certain class. This can only be done when you
|>   have an object table or link the objects (much harder). 
|>
|>You can enumerate objects even *without* an object table if you make the heap
|>scannable. This is easy to do for Smalltalk, I expect more difficult to do for
|>C++ without a little additional overhead to help you find object boundaries
|>and such. It would not be a problem for our Modula-3 memory management system.
|
|I would believe that the task is _impossible_ for C++.

I disagree -- but of course, this all depends exactly on how one 
defines the task of "enumerating all instances of a certain class."

The general approach people typically take to solving this problem in
C++ is to create a base class that supports the methods necessary to
do the enumerating of "objects."  If you want all "objects" to be
enumerable, you put the necessary support in the base class of all
"objects."

The creator of a new class then needs to implement a few support functions
for that particular class.  Templates can typically reduce this to a
one-liner.  If you want to do better than that, then you indeed need to make
compiler changes, and/or support tools to automatically generate the
support routines.
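
A sketch of how the template approach might look, assuming a registry kept
by a templated base class. The names Enumerable and Point are invented for
the example; this is one possible arrangement, not a description of any
particular library.

```cpp
#include <cassert>
#include <cstddef>
#include <unordered_set>

// Base class template that records every live instance of T.
// Hypothetical helper written for this illustration.
template <class T>
class Enumerable {
public:
    static std::size_t count() { return registry().size(); }

    // Calls f on every live instance of T.
    template <class F>
    static void for_each(F f) {
        for (Enumerable* e : registry()) f(static_cast<T&>(*e));
    }

protected:
    Enumerable() { registry().insert(this); }
    Enumerable(const Enumerable&) { registry().insert(this); }
    Enumerable& operator=(const Enumerable&) = default;
    ~Enumerable() { registry().erase(this); }

private:
    // One registry per class T, created on first use.
    static std::unordered_set<Enumerable*>& registry() {
        static std::unordered_set<Enumerable*> set;
        return set;
    }
};

// The per-class work is the single base-class clause.
class Point : public Enumerable<Point> {
public:
    Point(int px, int py) : x(px), y(py) {}
    int x;
    int y;
};

inline int sum_of_x() {
    int sum = 0;
    Point::for_each([&sum](Point& p) { sum += p.x; });
    return sum;
}
```

The cost is a set insertion and removal per object lifetime, which is the
kind of bookkeeping an object table or scannable heap would otherwise do.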

In general, it's doable, and people are doing it, but it may not be quite
as simple and clean today as people might like.

|(Even if you by some means knew the boundaries of an object,
|you would normally not be able to tell its class.)

Depends on what one considers an "object."  If all "objects" have vtables,
then all are essentially type-tagged, and it is easy to tell classes.
Or if all "objects" are segregated by pools, then it's easy to tell from
their pools.  Or if all "objects" of a class are linked together ....
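
The vtable case can be illustrated with typeid, a run-time type facility
that was standardized in C++ after this thread was written. The sketch
assumes every "object" derives from one polymorphic base; Object, Circle
and Square are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <typeinfo>
#include <vector>

// Every "object" has a vtable because the common base is polymorphic.
struct Object {
    virtual ~Object() = default;
};
struct Circle : Object {};
struct Square : Object {};

// Counts the objects whose dynamic class is exactly T, given some way of
// reaching every object (here simply a list of pointers).
template <class T>
std::size_t count_exact(const std::vector<const Object*>& all_objects) {
    std::size_t n = 0;
    for (const Object* o : all_objects) {
        // typeid on a polymorphic object reports its dynamic type.
        if (typeid(*o) == typeid(T)) ++n;
    }
    return n;
}

inline void example() {
    Circle c1, c2;
    Square s;
    const std::vector<const Object*> all{&c1, &s, &c2};
    assert(count_exact<Circle>(all) == 2);
    assert(count_exact<Square>(all) == 1);
}
```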

|However, some extremely sophisticated implementation, completely
|different from the standard ones, might perhaps make it possible.
|Such an implementation should be focussed on a strong object orientation
|instead of saving storage.

I don't believe C++ was designed for saving storage -- rather, I believe
it was designed to be close to optimal in speed while using close to
traditional compiler techniques.  There are areas where C++ is not very
efficient in the use of storage -- constructors, destructors, and
the approach taken to method dispatch come to mind.  Creating subclasses
in C++ where only one or two virtual methods are overridden can be
quite expensive in space.  Hashed dispatch would be slower, but more
space efficient under these conditions.
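
To make the space trade-off concrete, here is a sketch of one possible
reading of "hashed dispatch": each class keeps a hash table of only the
methods it defines itself, and lookup walks up the parent chain. All names
are hypothetical, and real implementations differ in many details.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <unordered_set>

using Method = int (*)();

struct ClassInfo {
    const ClassInfo* parent;
    // Only the methods defined or overridden in this class.
    std::unordered_map<std::string, Method> own_methods;
};

// Hashed lookup: probe this class, then each ancestor in turn.
// Slower than indexing a vtable, but a subclass stores only its overrides.
inline Method lookup(const ClassInfo& cls, const std::string& selector) {
    for (const ClassInfo* c = &cls; c != nullptr; c = c->parent) {
        auto it = c->own_methods.find(selector);
        if (it != c->own_methods.end()) return it->second;
    }
    return nullptr;
}

// Number of slots a per-class vtable would need: one for every virtual
// method visible in the class, inherited or overridden.
inline std::size_t vtable_slots(const ClassInfo& cls) {
    std::unordered_set<std::string> selectors;
    for (const ClassInfo* c = &cls; c != nullptr; c = c->parent) {
        for (const auto& entry : c->own_methods) selectors.insert(entry.first);
    }
    return selectors.size();
}

inline void example() {
    ClassInfo base{nullptr, {}};
    for (int i = 0; i < 10; ++i) {
        base.own_methods["m" + std::to_string(i)] = []() { return 1; };
    }
    ClassInfo derived{&base, {}};
    derived.own_methods["m3"] = []() { return 2; };  // a single override

    assert(lookup(derived, "m3")() == 2);     // overridden method
    assert(lookup(derived, "m4")() == 1);     // inherited method
    assert(vtable_slots(derived) == 10);      // vtable: ten slots again
    assert(derived.own_methods.size() == 1);  // hashed: one entry
}
```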

|Of course the programmer-accessible feature for scanning through
|all existing instances of a given class would itself be an extension
|of the standard language.

Could be, needn't be.  Either the programmer has to do some programming
to support this, in which case it wouldn't be an extension to the language,
or the compiler would have to add the feature, in which case it would be
an extension to the language.

moss@cs.umass.edu (Eliot Moss) (02/14/91)

Concerning some of the comments about enumerating instances of classes in C++,
what I meant was it is possible to design a C++ implementation that would do
this. I was not thinking of some add-on to existing implementations, or that
the code would (necessarily) be written in C++.

I think it's a real shame that better memory management and garbage collection
were not designed into C++ from the start. I will tend to trust programs
written in Modula-3 (for example) rather more than ones written in C++ because
of the insidious nature of memory management bugs (storage leaks, dangling
references, etc.) and the fact that C++ does little to help the programmer
attain correct memory management (a little better than C, but that's not
saying much). Well, enough soapbox speechifying ....