budd@fog.cs.orst.edu (Tim Budd) (06/11/91)
I was surprized that Jean-Ren Bouvier's request for a simple ``List'' data structure generated so little discussion. Surely this must be one of the most common programs written in C++. But that is not to say it is easy to do. In my book ``An Introduction to Object-Oriented Programming'' I tried to create just such classes (Chapter 17). I was somewhat dissatisfied with the result, and seemingly a few readers were as well. So recently I returned to the problem. My objectives were as follows: 1) The data structures created should be small and easy to understand. 2) Use of the data structure classes should not imply nor prohibit use of any other classes. That is, using the data structure classes should not force you into using a particular memory management package, or whatever. 3) They should not make any restriction on the type of values they contain. For example they should not require that all values held in the data structures be subclasses of a specific class. [ This was the constraint I voilated in my original Chapter 17, and one violated in many other examples I have seen]. 4) The solution should *not* use templates, but should be *guaranteed* type-safe. That is, it should be possible to guarantee that any casts used (and clearly some casts will be necessary) are type-safe. [ Avoiding templates was not a philosophical decision - I merely wanted to show it could be done, and easily done at that]. 5) It should be possible to manipulate the data structures, for example iterating over the elements of a list, without any knowledge of the internal structure of the data structure. Anyway. To make a long story short, you can obtain the fruits of my labors via an automatic mail server. I've written classes for lists (ordered and unordered), sets, stacks, queues, trees and tables. I will probably be adding to this as times goes on. Send a note to ``oopintro@cs.orst.edu'' with subject line ``Generic'' to obtain the code and a latex description. I would be interested in feedback on either code or document. Note that mail to ``oopintro'' is not read by human beings, if you want to reach me personally my address is below. --tim budd budd@cs.orst.edu
dag@control.lth.se (Dag Bruck) (06/11/91)
I agree: writing a list package is a non-trivial task, and we should
encourage anyone who's willing to share his/her interpretation of what
it should look like. Here are three additional suggestions:
1. Member of multiple lists
In most list packages an element can be the member of only one list
at a time. For me and my colleagues this is quite unacceptable. We
often keep a large group objects in a "master" list and then create
temporary sub-lists for some particular processing task.
The straigh-forward use of a base class "Link" which all other classes
are derived from won't work; some sort of indirection is needed.
2. Constant objects
It should be possible to declare a list of `const' objects. This may
not be easy depending on how you use the macro processor. My own list
package does not handle this case gracefully. A `typedef' may help:
typedef const T cT;
List(cT) clist;
(Note that I assume that a real template implementation handles this
case without any additional typedef.)
3. Const compatibility
If I have a list of Foo objects, I would like to pass it to a function
than takes a list of `const Foo' objects:
typedef const Foo cFoo;
void f(const List(Foo) x) { .... }
void g(List(cFoo) x) { .... }
void h() {
List(Foo) fl;
f(fl); // no problem
g(fl); // hmmm
}
In function h(), `fl' is a mutable list of mutable Foo objects. There
is no problem calling f(), which takes a constant list of mutable
objects. The difficulty is to explain to the compiler that a list of
non-mutable objects is compatible with a list of mutable objects.
Comments would be appreciated.
Dag Bruck
--
Department of Automatic Control E-mail: dag@control.lth.se
Lund Institute of Technology
P. O. Box 118 Phone: +46 46-104287
S-221 00 Lund, SWEDEN Fax: +46 46-138118
ttoupin@diana.cair.du.edu (Aerin) (06/12/91)
In article <1991Jun11.061402.21934@lth.se> dag@control.lth.se (Dag Bruck) writes: >1. Member of multiple lists > >In most list packages an element can be the member of only one list >at a time. For me and my colleagues this is quite unacceptable. We >often keep a large group objects in a "master" list and then create >temporary sub-lists for some particular processing task. > >The straigh-forward use of a base class "Link" which all other classes >are derived from won't work; some sort of indirection is needed. I have done something similar in the past. First, we must have class Object { private: unsigned long int _users; public: Object() { _users=0; } Object(const Object &) { _users=0; } ~Object() { --_users; } void operator delete(void *__mem) { if(!(((Object *)__mem)->_users)) ::operator delete(__mem); } unsigned long int operator++() { return _users++; } unsigned long int operator--() { return --_users; } operator unsigned long int() { return _users; } virtual Object *copy(int =0) =0; virtual enum ClassType typeOf() const =0; virtual void *delta(Object *__ptr) const { return __ptr; } virtual void *delta() { return this; } } ; The copy member should, when given a non-zero value, make a legitimate copy of the Object (i.e.: in a new spot in memory), or, otherwise, increment the user count and just return a pointer to the object itself. The delta functions are used for casting up from a base to a derived class. To keep list, use a structure class ListUser : virtual public Object { private: Object *_node; ListUser *_next; public: ListUser(Object *__node,ListUser *__next=NULL) { _node=__node;_next=__next; } ListUser(const ListUser &__list) { if(_node=__list._node) _node=__list.node->copy(-1); // (1) if(_next) _next=new ListUser(*_next); } ~ListUser() { if(_node) delete _node; if (_next) delete _next; } // Put append, insert, search, etc. operations here /* * The typing information should make the ListUser object act like what- * ever it contains... */ virtual enum ClassType typeOf() const { if(_node) return _node->typeOf(); else return ListUserType; } virtual void *delta() const { if(_node) return _node; else return this; } } ; Finally, a List class should have a ListUser which is the list that the List is representing. A copy of a master list in this way could be manipulated as desired, w/o affecting the order of the master; however, modifications of the copies affect the master as well. (There is a lot missing here--if you want the working source code, send me email.) >Dag Bruck >-- >Department of Automatic Control E-mail: dag@control.lth.se >Lund Institute of Technology >P. O. Box 118 Phone: +46 46-104287 >S-221 00 Lund, SWEDEN Fax: +46 46-138118 -- Tory S. Toupin | ttoupin@diana.cair.du.edu | Existence toward perfection... Unversity of Denver | Life of mediocrity! Undergraduate: Math & Computer Sciences| Denver, CO 80208 | - M. E. ----- Ceci n'est pas une signature.
bill@robots.ox.ac.uk (& Triggs) (06/12/91)
I would agree with Dag Bruck that any real list package must let objects be members of multiple lists - there are too many toy C++ examples and not enough real implementations about at present! However if you have a parametrised list class, you can always use it to store pointers or references to objects rather than the objects themselves, whereas if you have a 'list-of-pointer' class you can't use it to store real objects. So any decent list package should allow you to store an arbitrary data type in the nodes for flexibility, even though most people will only use it to store pointers. He also mentions the problem that 'list-of-foo' and 'list-of-const-foo' are completely independent types as far as the compiler is concerned. This is a bit of a nuisance, but can be worked around either by (i) deriving a 'list-of-foo' from 'list-of-const-foo' which casts away the constness of the returned foo's, or (ii) providing a list-of-foo -> list-of-const-foo cast operator. (i) is cleaner & reuses code more, but (ii) is likely to be easier to implement if you have a generic list template class, since you don't need to figure out which const-foo -> foo wrappers to build. Do other people share my feeling that C++ is not quite making the grade as far as code reuse goes? - All of the 'solutions' I see seem to be convoluted, bulky, and difficult to use. In particular, too many little 'helper' and 'wrapper' functions seem to be required for hygiene and sanity... Maybe we should all be using sm*llt*lk ::-)) Sorry, I'm probably just dumb and I'll wash my mouth out, but I find smalltalk collections incomprehensible when they're implemented in C++ - ie NIHCL. Too many tangled pages of macros and headers for me! Libg++ is OK I guess, but it would be nicer if you didn't need to include an entire separate list implementation for each type of list you have - a better solution would probably be a really tight & nasty void*-chasing core implementation with a fashionable replicatable C++/template wrapper class to interface with it. The list code would require comparison operations from the wrapper and would provide raw bytes for the wrapper to construct list node data objects into (disgusting!), so we might have to sort out this silly 'constructors are not real functions' business [ARM 12.1 p 265]. An explicit constructor call 'bytes->Foo(...)' or '((Foo*)bytes)->Foo(...)' OUGHT to be a legal construct creating a Foo in the given (void*) bytes, just as an explicit destructor call is legal [ARM 12.4]. Having to overload operator new just to get this behaviour seems silly to say the least. I guess the advantages of this approach are not so clear when you only have lists, but when your application starts needing multiple copies of B+ or splay trees you begin to notice the size difference... -- Bill Triggs Oxford University Computing Laboratory 11 Keble Rd Oxford OX1 3QD U.K. tel +44-865-273157 fax +44-865-273839
schwartz@karl.cs.psu.edu (Scott Schwartz) (06/13/91)
bill@robots.ox.ac.uk (& Triggs) writes:
Do other people share my feeling that C++ is not quite
making the grade as far as code reuse goes?
I wonder why users of other languages aren't publicly complaining
about having these problems. There are substantial libraries for Ada
and Eiffel, and their newsgroups are quiet. ML and Scheme users don't
seem to be complaining. The Oberon and Modula-3 crowds seem
optimisic. Why all the noise here?
pcg@aber.ac.uk (Piercarlo Grandi) (06/15/91)
On 12 Jun 91 13:34:50 GMT, bill@robots.ox.ac.uk (& Triggs) said: bill> I would agree with Dag Bruck that any real list package must bill> let objects be members of multiple lists I believe that this particular example is the core of the difficulties of making a good reuse oriented language. The issues are very subtle, and only using pointers solves them, if by gross oversimplification, in current languages. It is true but not sufficient to say that: bill> However if you have a parametrised list class, you can always use bill> it to store pointers or references to objects rather than the bill> objects themselves, whereas if you have a 'list-of-pointer' class bill> you can't use it to store real objects. Yet as to: bill> - there are too many toy C++ examples and not enough real bill> implementations about at present! This is inaccurate; there are whole operating systems and major applications (in the hudnreds of thousands of line each range) in C++. It is true though that all these systems have been possible because tha authors, when in tight spots, have solved them by cheating, like in C (casts and other horrible things :->). bill> Do other people share my feeling that C++ is not quite making the bill> grade as far as code reuse goes? - All of the 'solutions' I see bill> seem to be convoluted, bulky, and difficult to use. bill> In particular, too many little 'helper' and 'wrapper' functions bill> seem to be required for hygiene and sanity... I agree with you; I think that there is little understanding of typing and modularization issues inc omputer science. I understand them better than most, IMNHO, and I am amused by all the searching in the dark that goes on. bill> Maybe we should all be using sm*llt*lk ::-)) Smalltalk, and other similar languages, resolve many issues by simply using references and late binding everywhere, which is bit overkill. The challenge is not to build a language that works in the general case, but one that also can be tailored to special cases efficiently. I reckon that this can only be done with symbolic evaluation techniques (supercompiling, and other earlier concepts). In any case C++ is unique in being, bey design, an OO systems implementation language. This means that it must be lean enough to allow you to write a production Smalltalk implementation in it. This is a big constraint, and a big challenge. Too bad that just like with C, C++ is being used also in application contexts where a less systems oriented language, for example Objective C, could be used. bill> Libg++ is OK I guess, but it would be nicer if you didn't need to bill> include an entire separate list implementation for each type of bill> list you have - a better solution would probably be a really tight bill> & nasty void*-chasing core implementation with a fashionable bill> replicatable C++/template wrapper class to interface with it. How do you think are template classes implemented in most C++ compilers that have them? Just like LIBG++ does, a simple preprocessor, just like generics in Ada. A sumbolic reduction compiler could do much better. Some similar technology has been described, but I am not sure it has been implemented. bill> [ ... ] so we might have to sort out this silly 'constructors are bill> not real functions' business [ARM 12.1 p 265]. An explicit bill> constructor call 'bytes->Foo(...)' or '((Foo*)bytes)->Foo(...)' bill> OUGHT to be a legal construct creating a Foo in the given (void*) bill> bytes, just as an explicit destructor call is legal [ARM 12.4]. It is legal indeed. C++ here is very unelegant; a difference is made between the constructor *syntax* and the constructor *function*, and between the new operator and 'operator new'. In particular by using the so called placement syntax for 'operator new' one can invoke just the constructor. In other words while new Foo(...) is an invocation of the new operator that is equivalent to (if it were legal): (Foo::operator new(sizeof (Foo)))->Foo::Foo(...) this invocation of the new operator (where 'bytes' is some suitable pointer): new(bytes) Foo(...) does not imply any call to 'operator new', and is equivalent to (if it were legal): ((Foo *) bytes)->Foo::Foo(...) as you require. The placement syntax is used to construct object for which space has already been allocated. This is typically used for shared memory or otherwise persistent objects which are to be created at a specific address. Admittedly the distinction between the new operator and 'operator new' is a bit subtle and confusing... The ice is particularly thin here :-). -- Piercarlo Grandi | ARPA: pcg%uk.ac.aber@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@aber.ac.uk
tmb@ai.mit.edu (Thomas M. Breuel) (06/19/91)
In article <PCG.91Jun14183952@aberdb.aber.ac.uk> pcg@aber.ac.uk (Piercarlo Grandi) writes:
bill> I would agree with Dag Bruck that any real list package must
bill> let objects be members of multiple lists
I believe that this particular example is the core of the difficulties
of making a good reuse oriented language.
The problem is that C++'s version of polymorphism is rather inconvenient
to use.
In C++, you must introduce type names in templates and specify them at
the point of use even if the compiler has enough information to infer
this information (I don't have a C++ compiler with templates, so the
syntax that your C++ implementation uses for templates may differ
slightly from what I am using in this example):
template<base> list<base> map(base (*f)(base),list<base> l) ... ;
...
map<float>(f,l);
By contrast, in ML, you can simply write:
fun map(f,l) = ...
...
map(f,l)
or
fun map(f:'base -> 'base,l:'base list): 'base list = ...
...
map(f,l)
Note that when using the function "map" in ML, you never have to
specify what the value of the type variable "'base" actually is;
the type checker figures it out by itself.
This means that in ML you only need to specify types if you want a
function to be monomorphic rather than polymorphic, or to resolve
ambiguities in operator overloading.
In fact, Scheme and Lisp go one step further: "generic" arithmetic and
other "generic" operations are provided via dynamic typing rather than
via overloading, so you don't ever need to declare types to resolve
overloaded functions.
The issues are very subtle,
and only using pointers solves them, if by gross oversimplification, in
current languages. It is true but not sufficient to say that:
ML certainly does not force the programmer to use pointers in order to
achieve polymorphism (in fact, ML does not have "pointers" in the
usual sense). However, many implementations of ML use pointers
internally to implement polymorphism.
But, implementations that are concerned with optimizing data space and
execution time at the cost of program size and compilation time do not
need to be based on pointers; my understanding is that SISAL 2.0 is an
example of a polymorphic language that does not use pointers for
implementing polymorphic functions (it probably generates different
versions of functions for different argument type patterns).
In summary, statically typed languages with efficient implementations
of polymorphism are possible. The C++ programming language, however,
was not designed with this in mind, and I doubt if this is easily
fixable. In particular, polymorphism and overloading interact in
unpleasant ways.
If you want the convenience of a fully polymorphic type system, you
are probably better off using a language that was designed with such a
type system in mind from the ground up.
Thomas.
hitz@csi.uottawa.ca (Martin Hitz) (06/19/91)
In article <TMB.91Jun18235004@volterra.ai.mit.edu> tmb@ai.mit.edu (Thomas M. Breuel) writes: >The problem is that C++'s version of polymorphism is rather inconvenient >to use. > >In C++, you must introduce type names in templates and specify them at >the point of use even if the compiler has enough information to infer >this information (I don't have a C++ compiler with templates, so the >syntax that your C++ implementation uses for templates may differ >slightly from what I am using in this example): > > template<base> list<base> map(base (*f)(base),list<base> l) ... ; > ... > map<float>(f,l); > According to the ARM, the example should read: template<class base> list<base> map(base (*f)(base),list<base> l); float f (float); list<float> l; ... map(f,l); Note that at the call to map(), the compiler DOES infer the correct type. Martin Hitz@csi.uottawa.ca