rfg@NCD.COM (Ron Guilmette) (12/22/90)
In article <CLINE.90Dec21171912@cheetah.ece.clarkson.edu> cline@sun.soe.clarkson.edu (Marshall Cline) writes:
>
>So the basic problem is that C++ pointers aren't ``smart'' enough.  Ie:
>a copy-collect GC routine moves objects, and the pointers are machine
>addresses rather than some higher level `logical' pointers, so the
>machine addresses become invalid.
>
>A possible solution is to ban `Cons*' and replace it with ConsPtr,
>which is a smart pointer...

Great!  Now how do I "ban" the use of the type Cons* ???

Marshall's idea of "banning" the use of regular pointers is a good one
except for one thing.  The C++ language (as currently defined) does
not provide any form of support for "banning" the use of specific
pointer types.

This notion of banning pointers to "managed" or "controlled" classes
seems to be the general answer for a lot of the questions which arise
when `smart pointers' are used.

It seems that `smart pointers' work wonderfully, but only so long as
nobody on your project team forgets what's going on and then
(accidentally) codes up a declaration for a T* object rather than a
smart_pointer_to_T object.

Of course we could even deal with *those* cases if we were allowed to
overload such things as operator= for T*'s, but we can't because the
language rules say that we can't.  (The justification for this is a bit
esoteric, and has to do with language mutability.  But I digress...)

Anyway, given that we can't defend ourselves (e.g. by overloading
operator= for pointer types) from the random dummies who may happen to
get assigned to our project teams at any given instant in time, wouldn't
it be nice if (as an alternative) we could have some way of instructing
the compiler to absolutely "ban" people from declaring objects of type T*
for cases where we want Ts to be accessed only via values of some
`smart_pointer_to_T' type?

Rather than inventing yet another keyword (banned?) how about something
like this:

    class T;

    class smart_pointer_to_T {
        /* ... magical stuff, possibly including ref counts, etc. */
        class T* pointer;
    public:
        smart_pointer_to_T ();
        smart_pointer_to_T& operator= (smart_pointer_to_T);
        T& operator * ();
    };

    protected class T {
        /* ... innards of T ... */
    };

The idea here is that once we declare `class T' to be `protected', then
from that point forward (in the compilation) it becomes impossible to
use the type T* in any way, shape, or form.

In particular, it would be impossible to declare objects to be of type
T* or objects of type T**, or objects of type T*[] or anything like that.
It would also become illegal to cast anything to (T*) or (T**) or (T*[])
or (T*&) or anything like that.  Furthermore, it would be impossible to
use, or even to generate, values of type T*.

Assume for the moment that C++ already included the concept of "protected"
classes (as described herein).  Looking in my crystal ball, I see one
very curious side-effect of making a given class "protected".  Consider:

    class T T_array[10];

    ... T_array[i] ...      // ERROR: value of type T* generated/used

The indicated line would contain an error because in C and C++ any
reference to an array element gets converted (internally in the compiler)
to its simpler equivalent.  For an array reference like `a[i]' the
compiler effectively converts this to `*(a+i)' where the evaluation
of the sub-expression `a' yields a pointer to the first element of
the array `a'.

I'm not at all sure whether this curious side-effect of declaring a
class to be "protected" would be good or bad in practice.  I can see
where it might at first seem to be an irritant, but I have a funny
feeling that it would end up *aiding* in the enforcement of the
programmer's wish to ensure that all accesses to type T objects go
(indirectly) through some object of type `smart_pointer_to_T'.

In summary, I have believed for quite some time that the concept of
`smart pointers' is a terrific one, but that C++ desperately needs some
way to ENFORCE their universal use for certain "controlled" types.

I believe that allowing classes to be declared as "protected" could be
a good way to provide such enforcement because it should be relatively
simple to implement in actual C++ compilers.  After all, it involves no
new keywords and no new run-time semantics.  All it does is to provide
the programmer with a convenient way to tell the compiler to treat a
slightly larger class of things as compile-time errors.

I welcome discussion of the idea of "protected" classes.  If people have
no objection to the idea, I may just decide to ask x3j16 to consider it.
I'd like to get some feedback first however.

P.S.  When I spoke earlier about "random dummies" getting assigned to
project teams, I was definitely *NOT* referring to anybody here at NCD.
We don't have any dummies here at all... random or otherwise.  Everybody
I work with here is incredibly bright.  Unfortunately, I cannot say the
same thing about *all* of the places that I have worked over the years. :-(
--
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
rfg@NCD.COM (Ron Guilmette) (12/25/90)
I wrote:
>In article <CLINE.90Dec21171912@cheetah.ece.clarkson.edu> cline@sun.soe.clarkson.edu (Marshall Cline) writes:
>>
>>So the basic problem is that C++ pointers aren't ``smart'' enough.  Ie:
>>a copy-collect GC routine moves objects, and the pointers are machine
>>addresses rather than some higher level `logical' pointers, so the
>>machine addresses become invalid.
>>
>>A possible solution is to ban `Cons*' and replace it with ConsPtr,
>>which is a smart pointer...
>
>Great!  Now how do I "ban" the use of the type Cons* ???
>
>Marshall's idea of "banning" the use of regular pointers is a good one
>except for one thing.  The C++ language (as currently defined) does
>not provide any form of support for "banning" the use of specific
>pointer types.

I went on to propose a syntax for specifying that (for a given class)
no regular (stupid? :-) pointers to that class could be declared or used.

I've since realized that there was a rather serious problem with my
proposal.  It provided no way to circumvent the restriction against
declaring and/or using regular pointer types (to the given class).  You
may occasionally need to circumvent such a restriction (for instance
within the definition of the `smart-pointer' class, where you typically
declare one of the fields of the smart-pointer class to be a stupid
pointer).

Here is an alternative proposal which solves this problem:

    By default, for any class type T, the use of the type `T*' (or
    values thereof) is unrestricted (as is true currently).

    If however the definition of the class type T contains a (member)
    declaration statement which mentions the type T*, but which includes
    no actual declarators, and if that declaration statement appears
    either in a private or a protected part of the class declaration,
    then the visibility of the type T* will be restricted as if it were
    a private or protected member of the class (respectively).

Here is an example of how this could be put to work in a
reference-counted class and in its associated `smart-pointer' class.

//////////////////////////////////////////////////////////////////////////

class managed_pointer;

class managed {
    unsigned ref_count;
    managed*;                       // Type `managed*' is now private...
    friend class managed_pointer;   // ... but let managed_pointer type ...
                                    // ... use the stupid type.
public:
    managed () : ref_count (0) { }
    ~managed ()
    {
        if (ref_count > 0)
            throw Exception ();     // Dangling references !!!
    }
    managed_pointer operator & ();  // addressof yields a managed_pointer.
};

class managed_pointer {
    managed *ptr;                   // Only allowed due to friendship!
public:
    managed_pointer (managed *);    // Only called from managed::operator&
    managed_pointer () : ptr (0) { }
    managed_pointer (managed_pointer& arg) : ptr (arg.ptr)
    {
        if (ptr)                    // a null smart pointer counts nothing
            ptr->ref_count++;
    }
    ~managed_pointer ()
    {
        if (ptr && --ptr->ref_count == 0)   // unreachable
            delete ptr;
    }
    managed& operator * () { return *ptr; }
    managed* operator -> () { return ptr; }
};

managed_pointer managed::operator& ()
{
    return managed_pointer (this);  // stupid converted to smart
}

//////////////////////////////////////////////////////////////////////////

Although not shown here, the declaration of `managed*' in the private part
of `class managed' would make that type act as if the type itself was
private to `class managed'.  Thus, it would only be usable by members and
friends of `class managed'.

Note that for the classes shown above, if you ever (accidentally) attempt
to delete an object of type `managed' you will be informed immediately
(via an exception) if there are **any** outstanding pointers to the given
object (which have not themselves been cleaned up).  This will be true
*regardless* of the behavior of other programmers who may make use of
these classes (which is the whole point of this idea).

With this scheme, you can use smart pointers and still be assured that
you will be protected from the mistakes of others, and also from (most of)
your own mistakes!

Comments welcomed.
--
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
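
Ron's `managed' / `managed_pointer' pair relies on a proposed language
feature (a private `managed*' type), so it cannot be compiled as posted.
What follows is only a rough approximation in standard C++ (C++11 or
later is assumed), not his proposal: the pointee's constructor and
destructor are made private and the smart pointer is its only friend.
The names `make', `use_count', `get', `set' and `demo_ref_counts' are
made up for the illustration.  The sketch does not achieve the ban Ron
asks for, because `operator->' and `&*p' still hand out a raw pointer.

```cpp
#include <cassert>
#include <cstddef>

class managed_pointer;

class managed {
    std::size_t ref_count = 0;
    int value;
    explicit managed(int v) : value(v) {}   // private: only the friend can create one
    ~managed() = default;                   // private: only the friend can destroy one
    friend class managed_pointer;
public:
    managed(const managed&) = delete;
    managed& operator=(const managed&) = delete;
    int get() const { return value; }
    void set(int v) { value = v; }
};

class managed_pointer {
    managed* ptr = nullptr;                 // the raw pointer lives only in here
    void acquire() { if (ptr) ++ptr->ref_count; }
    void release() {
        if (ptr && --ptr->ref_count == 0) delete ptr;
        ptr = nullptr;
    }
    explicit managed_pointer(managed* p) : ptr(p) { acquire(); }
public:
    managed_pointer() = default;
    managed_pointer(const managed_pointer& o) : ptr(o.ptr) { acquire(); }
    managed_pointer& operator=(const managed_pointer& o) {
        if (this != &o) { release(); ptr = o.ptr; acquire(); }
        return *this;
    }
    ~managed_pointer() { release(); }

    // Hypothetical factory: the only way to obtain a managed object.
    static managed_pointer make(int v) { return managed_pointer(new managed(v)); }

    managed& operator*() const { return *ptr; }
    managed* operator->() const { return ptr; }   // still leaks a raw pointer
    std::size_t use_count() const { return ptr ? ptr->ref_count : 0; }
};

// Small exercise of the counts; returns the final count held by `a`.
std::size_t demo_ref_counts() {
    managed_pointer a = managed_pointer::make(7);
    assert(a.use_count() == 1);
    {
        managed_pointer b = a;              // second owner
        b->set(9);
        assert(a.use_count() == 2);
    }                                       // b goes away, count drops back
    assert(a->get() == 9);
    return a.use_count();
}
```

Because the destructor is private, a stray `delete' on a managed object
outside the smart pointer is rejected at compile time, which is a
different (and weaker) guarantee than the run-time check in Ron's post.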
markb@agora.uucp (Mark Biggar) (12/25/90)
In article <3071@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>>> bunch of stuff about smart pointers and protected classes <<<
>Assume for the moment that C++ already included the concept of "protected"
>classes (as described herein).  Looking in my crystal ball, I see one
>very curious side-effect of making a given class "protected".  Consider:
>
>	class T T_array[10];
>
>	... T_array[i] ...	// ERROR: value of type T* generated/used
>
>The indicated line would contain an error because in C and C++ any
>reference to an array element gets converted (internally in the compiler)
>to its simpler equivalent.  For an array reference like `a[i]' the
>compiler effectively converts this to `*(a+i)' where the evaluation
>of the sub-expression `a' yields a pointer to the first element of
>the array `a'.

Fine, allow the above error.  If the writer of protected class T wants to
allow for arrays of T, he should define an operator[] for T.  Now there is
no problem.  I would even go so far as to suggest that if you don't define
an operator[] for T then a declaration of an array of T should also be just
as illegal as a pointer to T.
--
Mark Biggar
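
The `a[i]' versus `*(a+i)' equivalence that Ron and Mark are arguing
about can be checked directly, and a user-defined operator[] shows how
indexing can be offered without user code ever naming T*.  This is a
minimal sketch in standard C++ (C++11 or later assumed); `T_block' and
`demo_indexing' are hypothetical names, and the operator[] is placed on
a container class rather than on T itself, which is one possible
reading of Mark's suggestion.

```cpp
#include <cassert>
#include <cstddef>

struct T { int value = 0; };

// Hypothetical fixed-size container standing in for "an array of T".
class T_block {
public:
    static constexpr std::size_t size = 10;
    // Indexing is under the class author's control and yields a T&, not a T*.
    T& operator[](std::size_t i) { assert(i < size); return items[i]; }
private:
    T items[size];
};

int demo_indexing() {
    T raw[10];
    raw[3].value = 42;
    // Built-in subscripting is pointer arithmetic: `raw` decays to a T*.
    assert(&raw[3] == raw + 3);
    assert((*(raw + 3)).value == 42);

    T_block block;
    block[3].value = 42;        // no T* appears in the caller's code
    return block[3].value;
}
```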
rmartin@clear.com (Bob Martin) (12/27/90)
In article <3071@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>
>Rather than inventing yet another keyword (banned?) how about something
>like this:
>
>	class T;
>
>	class smart_pointer_to_T {
>		/* ... magical stuff, possibly including ref counts, etc. */
>		class T* pointer;
>	public:
>		smart_pointer_to_T ();
>		smart_pointer_to_T& operator= (smart_pointer_to_T);
>		T& operator * ();
>	};
>
>	protected class T {
>		/* ... innards of T ... */
>	};
>
>The idea here is that once we declare `class T' to be `protected', then
>from that point forward (in the compilation) it becomes impossible to
>use the type T* in any way, shape, or form.

Let me suggest an alternative:

    class T;

    class T* {
        /* ... magical stuff, possibly including ref counts, etc. */
        T* pointer;
    public:
        T* ();
        T& operator= (T*);
        T& operator * ();
        T& operator [](int);
    };

I am not sure that I have covered every contingency but the idea is that
instead of 'banning' the creation of pointers to certain classes, we
create classes which the compiler recognizes to be the pointers to
the classes.  Then all the dereferencing operations are under the
programmer's control regardless of whether anyone else on the project
team knows that the pointer is smart.
--
+-Robert C. Martin-----+:RRR:::CCC:M:::::M:| Nobody is responsible for |
| rmartin@clear.com    |:R::R:C::::M:M:M:M:| my words but me.  I want  |
| uunet!clrcom!rmartin |:RRR::C::::M::M::M:| all the credit, and all   |
+----------------------+:R::R::CCC:M:::::M:| the blame.  So there.     |
cline@cheetah.ece.clarkson.edu (Marshall Cline) (12/27/90)
In article <3071@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
(Ron suggests a syntax whereby a class smart_pointer_to_T can have
exclusive access to a class T)
>The idea here is that once we declare `class T' to be `protected', then
>from that point forward (in the compilation) it becomes impossible to
>use the type T* in any way, shape, or form.
...

Sounds to me that you want class T to be `owned' by class smart_pointer_to_T.
Or said another way, you want class T to be hidden `inside' smart_ptr_to_T.
Sounds like `nested classes' fit the bill!

class smart_ptr_to_T {
    class T {
        //...
    };
public:
    //...
};

Q: The class name `smart_ptr_to_T::T' is private to T, right?

Marshall Cline
--
PS: If your company is interested in on-site C++/OOD training, drop me a line!
PPS: Career search in progress; ECE faculty; research oriented; will send vita.
--
Marshall Cline / Asst.Prof / ECE Dept / Clarkson Univ / Potsdam, NY 13676
cline@sun.soe.clarkson.edu / Bitnet:BH0W@CLUTX / uunet!clutx.clarkson.edu!bh0w
Voice: 315-268-3868 / Secretary: 315-268-6511 / FAX: 315-268-7600
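
Marshall's nested-class suggestion can be tried out as below.  This is a
sketch assuming present-day C++ rules (C++11 or later), under which a
class declared in the private part of smart_ptr_to_T has a name that is
private to smart_ptr_to_T; the nested-class rules of 1990-era compilers
may well have differed.  The members `rep', `get', `set' and the
function `demo_nested' are invented for the example.

```cpp
#include <cassert>

class smart_ptr_to_T {
    class T {                       // private nested class
    public:
        int value;
        explicit T(int v) : value(v) {}
    };
    T* rep;                         // the one place where T* is spelled out
public:
    // The wrapper creates the T object and destroys it too.
    explicit smart_ptr_to_T(int v) : rep(new T(v)) {}
    ~smart_ptr_to_T() { delete rep; }
    smart_ptr_to_T(const smart_ptr_to_T&) = delete;
    smart_ptr_to_T& operator=(const smart_ptr_to_T&) = delete;

    int get() const { return rep->value; }
    void set(int v) { rep->value = v; }
};

// Outside the class the name is inaccessible, so this line would not compile:
// smart_ptr_to_T::T* p = nullptr;

int demo_nested() {
    smart_ptr_to_T p(5);
    p.set(p.get() + 1);
    return p.get();
}
```

Access control applies to the name only.  If a public member ever
returned a T& or T*, a caller could still capture it with `auto', so
the interface above deliberately exposes values rather than the object.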
tma@osc.COM (Tim Atkins) (12/27/90)
Smart pointers are not any sort of real solution to problems of pointer
maintenance in the face of any sort of object movement including that found
in copying garbage collectors.  The reason is simply that regardless of how
smart your pointer type is, at some point it is evaluated to a real pointer
and stored into some temp variable and/or is sitting around on the stack or
in registers.  It is precisely these implied direct pointers built behind
the scenes that give the biggest headaches.  Without some way of knowing
exactly where all pointers and "internal" pointers are, there is simply no
way to move objects around and update pointers safely in C++.

- Tim Atkins
davidm@uunet.UU.NET (David S. Masterson) (12/29/90)
>>>>> On 26 Dec 90 16:19:50 GMT, cline@cheetah.ece.clarkson.edu
>>>>> (Marshall Cline) said:
Marshall> In article <3071@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
Marshall> (Ron suggests a syntax whereby a class smart_pointer_to_T can have
Marshall> exclusive access to a class T)
Ron> The idea here is that once we declare `class T' to be `protected', then
Ron> from that point forward (in the compilation) it becomes impossible to use
Ron> the type T* in any way, shape, or form.
Marshall> ...
Marshall> Sounds to me that you want class T to be `owned' by class
Marshall> smart_pointer_to_T.  Or said another way, you want class T to be
Marshall> hidden `inside' smart_ptr_to_T.  Sounds like `nested classes' fit
Marshall> the bill!
Marshall> class smart_ptr_to_T {
Marshall>   class T { //...
Marshall>   };
Marshall> public: //...
Marshall> };
Marshall> Q: The class name `smart_ptr_to_T::T' is private to T, right?
I missed something here.  If T is nested inside of smart_ptr_to_T, then how
would someone go about creating a T object for the smart_ptr_to_T to point to?
(Is this what you mean by "smart_ptr_to_T::T"?)
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
davidm@uunet.UU.NET (David S. Masterson) (12/29/90)
>>>>> On 27 Dec 90 06:29:19 GMT, tma@osc.COM (Tim Atkins) said:
Tim> Smart pointers are not any sort of real solution to problems of pointer
Tim> maintenance in the face of any sort of object movement including that
Tim> found in copying garbage collectors.  The reason is simply that
Tim> regardless of how smart your pointer type is, at some point it is
Tim> evaluated to a real pointer and stored into some temp variable and or is
Tim> sitting around on the stack or in registers.  It is precisely these
Tim> implied direct pointers built behind the scenes that give the biggest
Tim> headaches.  Without some way of knowing exactly where all pointers and
Tim> "internal" pointers are, there is simply no way to move objects around
Tim> and update pointers safely in C++.
I don't agree.
Object movement should be as easily accomplished in C++ as it would be in any
other language.  Ensuring internal consistency of pointers is just a locking
problem to make sure that pointers are not dereferenced before they are valid.
Smart pointers should be able to handle lock primitives in concert with a
garbage collection routine to ensure that they don't go to the actual data
while the garbage collection is in progress.
The problem in C++ is ensuring (as Ron pointed out) that only the smart
pointer is used in referencing the object pointed to and no real pointers are
created in uncontrolled circumstances.
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
davidm@uunet.UU.NET (David S. Masterson) (12/29/90)
>>>>> On 22 Dec 90 08:07:55 GMT, rfg@NCD.COM (Ron Guilmette) said:
Ron> Anyway, given that we can't defend ourselves (e.g. by overloading
Ron> operator= for pointer types) from the random dummies who may happen to
Ron> get assigned to our project teams at any given instant in time, wouldn't
Ron> it be nice if (as an alternative) we could have some way of instructing
Ron> the compiler to absolutely "ban" people from declaring objects of type T*
Ron> for cases where we want Ts to be accessed only via values of some
Ron> `smart_pointer_to_T' type?
Ron> Rather than inventing yet another keyword (banned?) how about something
Ron> like this:
Ron> 	class T;
Ron> 	class smart_pointer_to_T {
Ron> 		/* ... magical stuff, possibly including ref counts, etc. */
Ron>		class T* pointer;
Ron>	public: 
Ron>		smart_pointer_to_T ();
Ron>		smart_pointer_to_T& operator= (smart_pointer_to_T);
Ron>		T& operator * ();
Ron>	};
Ron> 	protected class T {
Ron>		/* ... innards of T ... */
Ron>	};
Isn't there a problem here in that, if T is declared protected, how will
smart_pointer_to_T point to it internally?
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
pcg@cs.aber.ac.uk (Piercarlo Grandi) (12/30/90)
On 22 Dec 90 08:07:55 GMT, rfg@NCD.COM (Ron Guilmette) said:

rfg> Great!  Now how do I "ban" the use of the type Cons* ???

rfg> Marshall's idea of "banning" the use of regular pointers is a good one
rfg> except for one thing.  The C++ language (as currently defined) does
rfg> not provide any form of support for "banning" the use of specific
rfg> pointer types.

[ ... ]

rfg> Anyway, given that we can't defend ourselves (e.g. by overloading
rfg> operator= for pointer types) from the random dummies who may happen
rfg> to get assigned to our project teams at any given instant in time,
rfg> wouldn't it be nice if (as an alternative) we could have some way
rfg> of instructing the compiler to absolutely "ban" people from declaring
rfg> objects of type T* for cases where we want Ts to be accessed only
rfg> via values of some `smart_pointer_to_T' type?

Frankly here you are asking too much of C++.  What you are describing is
the classic capability system property that only the type manager for a
given type can access the representation for that type.

C++ was not designed as a capability language, and indeed, except for
some small thing like private:, the representation is not really
accessible only to the type manager.  In particular there is no way, as
you note, to hide the representation of fundamental values like int or
pointer.

That's it.  It would take a major redesign of the language to fix this.
--
Piercarlo Grandi                   | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
hcobb@ccwf.cc.utexas.edu (Henry J. Cobb) (12/31/90)
To ensure that the luser doesn't get a grip on any members of T (or *T
which is the same thing), make all constructors private and return only
smart_pointer_to_T from your own functions.

The luser will be unable to create any T or get a firm grip on any of yours.
--
Henry J. Cobb   hcobb@ccwf.cc.utexas.edu   SFB Tyrant
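
Henry's recipe (private constructors, plus functions that return only
the smart pointer) looks roughly like this in standard C++ (C++11 or
later assumed).  Here std::shared_ptr stands in for smart_pointer_to_T,
which is a substitution and not something from the thread; `create',
`label' and `demo_factory' are hypothetical names.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

class T {
    std::string name;
    // Private constructor: no T can be made outside create().
    explicit T(std::string n) : name(std::move(n)) {}
public:
    T(const T&) = delete;               // no copies sneaking out either
    T& operator=(const T&) = delete;

    // The only way to get a T is already wrapped in a smart pointer.
    static std::shared_ptr<T> create(std::string n) {
        return std::shared_ptr<T>(new T(std::move(n)));
    }
    const std::string& label() const { return name; }
};

// Both of these would be rejected by the compiler:
// T t("x");
// T* p = new T("x");

std::string demo_factory() {
    std::shared_ptr<T> a = T::create("cons");
    std::shared_ptr<T> b = a;           // shared ownership, count is now 2
    assert(a.use_count() == 2);
    return b->label();
}
```

This stops accidental creation of bare T objects, but a determined
caller can still write `a.get()' or `&*a' and obtain a T*, so it
discourages raw pointers rather than banning them.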
cline@cheetah.ece.clarkson.edu (Marshall Cline) (01/01/91)
In article <CIMSHOP!DAVIDM.90Dec28222804@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
> Marshall> Sounds to me that you want class T to be `owned' by class
> Marshall> smart_pointer_to_T.  Or said another way, you want class T to be
> Marshall> hidden `inside' smart_ptr_to_T.  Sounds like `nested classes' fit
> Marshall> the bill!
> Marshall> class smart_ptr_to_T {
> Marshall>   class T { //...
> Marshall>   };
> Marshall> public: //...
> Marshall> };
> Marshall> Q: The class name `smart_ptr_to_T::T' is private to T, right?
> I missed something here.  If T is nested inside of smart_ptr_to_T, then how
> would someone go about creating a T object for a smart_ptr_to_T to point to?
> (Is this what you mean by "smart_ptr_to_T::T"?)

Alas the discussion has moved too far from its roots.  My mistake.

The original question (Doug Moore, dougm@zamenhof.rice.edu) involved a
Scheme interpreter that had to collect dead `Cons' objects.  My suggestion
was to `ban' Cons* and replace them with a smart pointer or perhaps
with a surrogate object that acted like a smart reference.

The real point is that machine addresses (T*) are too dumb.  If you do
things through real-live objects, the object's destructor can do magical
things for you (like deleting the pointed-to data if that's what you want).

In Doug Moore's case, moving a Cons object during garbage collection
required changing `this', an anachronism.  By doing everything through a
surrogate wrapper around the actual Cons object (the smart-ptr or
smart-ref, if you will), the wrapper itself doesn't have to move, so you
eliminate changing `this' (only the contained `Cons' has to move).

I think this answers Dave Masterson's question:
> If T is nested inside of smart_ptr_to_T, then how
> would someone go about creating a T obj for a smart_ptr_to_T to point to?

The smart_ptr_to_T pointer creates the T object.  And destroys it too.
Smart_REF_of_T would have been a much less confusing name for this
abstraction.

Marshall Cline
--
PS: If your company is interested in on-site C++/OOD training, drop me a line!
PPS: Career search in progress; ECE faculty; research oriented; will send vita.
--
Marshall Cline / Asst.Prof / ECE Dept / Clarkson Univ / Potsdam, NY 13676
cline@sun.soe.clarkson.edu / Bitnet:BH0W@CLUTX / uunet!clutx.clarkson.edu!bh0w
Voice: 315-268-3868 / Secretary: 315-268-6511 / FAX: 315-268-7600
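
The surrogate idea in Marshall's follow-up (the wrapper stays where it
is, only the contained Cons moves) can be sketched as follows.  This is
an illustration in standard C++ (C++11 or later assumed), not code from
Doug Moore's interpreter: `ConsRef', its `car'/`cdr' fields, and
`relocate' are invented names, and `relocate' merely imitates what a
copying collector would do to one object.

```cpp
#include <cassert>

class ConsRef {
    struct Cons { int car; int cdr; };   // hypothetical payload, hidden from users
    Cons* rep;
public:
    // The wrapper creates the Cons and destroys it too.
    ConsRef(int car, int cdr) : rep(new Cons{car, cdr}) {}
    ~ConsRef() { delete rep; }
    ConsRef(const ConsRef&) = delete;
    ConsRef& operator=(const ConsRef&) = delete;

    int car() const { return rep->car; }
    int cdr() const { return rep->cdr; }

    // Stand-in for a copying collector: copy the payload to a fresh
    // location, free the old one, and repoint the wrapper.
    void relocate() {
        Cons* moved = new Cons(*rep);
        delete rep;
        rep = moved;
    }
};

bool demo_relocate() {
    ConsRef cell(1, 2);
    const ConsRef* wrapper_before = &cell;
    cell.relocate();                     // payload moves, wrapper does not
    return &cell == wrapper_before && cell.car() == 1 && cell.cdr() == 2;
}
```

Since callers only ever hold the wrapper, nothing they hold is
invalidated by the move, which is the property Marshall is after.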
tma@osc.COM (Tim Atkins) (01/01/91)
In article <CIMSHOP!DAVIDM.90Dec28224114@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>X-Posting-Software: GNUS 3.12  [ NNTP-based News Reader for GNU Emacs ]
>
>>>>>> On 27 Dec 90 06:29:19 GMT, tma@osc.COM (Tim Atkins) said:
>
>Tim> Smart pointers are not any sort of real solution to problems of pointer
>Tim> maintenance in the face of any sort of object movement including that
>Tim> found in copying garbage collectors.  The reason is simply that
>Tim> regardless of how smart your pointer type is, at some point it is
>Tim> evaluated to a real pointer and stored into some temp variable and or is
>Tim> sitting around on the stack or in registers.  It is precisely these
>Tim> implied direct pointers built behind the scenes that give the biggest
>Tim> headaches.  Without some way of knowing exactly where all pointers and
>Tim> "internal" pointers are, there is simply no way to move objects around
>Tim> and update pointers safely in C++.
>
>I don't agree.
>
>Object movement should be as easily accomplished in C++ as it would be in any
>other language.  Ensuring internal consistency of pointers is just a locking
>problem to make sure that pointers are not dereferenced before they are valid.
>Smart pointers should be able to handle lock primitives in concert with a
>garbage collection routine to ensure that they don't go to the actual data
>while the garbage collection is in progress.
>
>The problem in C++ is ensuring (as Ron pointed out) that only the smart
>pointer is used in referencing the object pointed to and no real pointers are
>created in uncontrolled circumstances.
>--

I don't understand your disagreement at all.  If I have a smart pointer
and, for instance, reference a field of the underlying object as a
parameter that is passed to another function, then the real pointer is
most definitely in some temporary location on the stack for at least the
duration of that function call.  Actually in 2.0 implementations it seems
to remain on the stack for at least the duration of the surrounding block
if not the duration of the entire enclosing function.  If anything
promotes a low memory condition during this time that requires a garbage
collect then there is at least one unplanned direct pointer in stack
space.  How are you planning to identify such pointers and distinguish
them from other data types that might have the exact same bit pattern?
This was and is the substance of my objection.

I would be simply thrilled if you or any of the other net readers has a
solution to this one.

- Tim
davidm@uunet.UU.NET (David S. Masterson) (01/03/91)
>>>>>> On 27 Dec 90 06:29:19 GMT, tma@osc.COM (Tim Atkins) said:
Tim> Smart pointers are not any sort of real solution to problems of pointer
Tim> maintenance in the face of any sort of object movement including that
Tim> found in copying garbage collectors.  The reason is simply that
Tim> regardless of how smart your pointer type is, at some point it is
Tim> evaluated to a real pointer and stored into some temp variable and or is
Tim> sitting around on the stack or in registers.  It is precisely these
Tim> implied direct pointers built behind the scenes that give the biggest
Tim> headaches.  Without some way of knowing exactly where all pointers and
Tim> "internal" pointers are, there is simply no way to move objects around
Tim> and update pointers safely in C++.
>>>>> On 28 Dec 90 22:41:14 GMT, cimshop!davidm@uunet.UU.NET (David S. Masterson) said:
David> Object movement should be as easily accomplished in C++ as it would be
David> in any other language.  Ensuring internal consistency of pointers is
David> just a locking problem to make sure that pointers are not dereferenced
David> before they are valid.  Smart pointers should be able to handle lock
David> primitives in concert with a garbage collection routine to ensure that
David> they don't go to the actual data while the garbage collection is in
David> progress.
>>>>> On 1 Jan 91 06:58:23 GMT, tma@osc.COM (Tim Atkins) said:
Tim> If I have a smart pointer and, for instance, reference a field of the
Tim> underlying object as a parameter that is passed to another function, then
Tim> the real pointer is most definitely in some temporary location on the
Tim> stack for at least the duration of that function call.  Actually in 2.0
Tim> implementations it seems to remain on the stack for at least the duration
Tim> of the surrounding block if not the duration of the entire enclosing
Tim> function.  If anything promotes a low memory condition during this time
Tim> that requires a garbage collect then there is at least one unplanned
Tim> direct pointer in stack space.  How are you planning to identify such
Tim> pointers and distinguish them from other data types that might have the
Tim> exact same bit pattern?  This was and is the substance of my objection.

An off-the-cuff C++ approach to handling objects that may be garbage
collected and the pointers that point to them:

1.  The garbage collection routines (GC) would have full knowledge of the
    type of objects that can be garbage collected, how they would be
    moved, and how to clean up pointers to those objects.  This implies
    that the "direct" pointers to garbage collected objects are in a
    "well-known" place (PtrList).

2.  These "direct" pointers can, therefore, be allocated as an object is
    created and put in the list in an indexable fashion for the "smart"
    pointers to reference.  There should only be one static list of these
    pointers so that the GC can find them when it needs to do some
    shuffling.

3.  (As Marshall Cline suggested) Smart pointers would be the only thing
    that creates the actual object.  Thus, creating a "smart" pointer
    would create the object that it points to and have its "direct"
    pointer entered into the static list.  The "smart" pointer would
    maintain an index into the static list of "direct" pointers that
    corresponds to its object.

4.  When GC decides to move an object, it uses the old address to index
    (somehow) into the static list of pointers and replaces that address
    with the new address of the object.  GC should only move entire
    objects.

5.  When a "smart" pointer needs to dereference the object it points to,
    it uses its smart index to find the real address from the static
    pointer list.  This is probably the definition of the overloaded "->"
    operator on the smart pointer.

6.  The only problem left is ensuring that (4) and (5) do not step on one
    another.  This should be accomplished via a locking protocol that both
    (4) and (5) pay attention to.  When (4) finds a candidate object to
    move, it checks its lock status (perhaps in the static pointer list),
    and moves it when it isn't locked.  When (5) decides to dereference an
    object, it checks its lock status, and waits till the object is free
    to dereference it.

This protocol isn't necessarily efficient, but it's flexible and should be
improvable.
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
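
David's six steps describe what is often called a handle table.  The
sketch below is one possible reading of them in standard C++ (C++11 or
later assumed), with several simplifications that are assumptions of
the example and not part of his post: the table is passed in explicitly
rather than being a single static list, "moving" an object is imitated
by copying it to a fresh allocation, and the lock is a plain
single-threaded pin count, so the collector skips pinned objects and
nobody ever waits.  `Obj', `pin', `unpin', `collect', `slot' and
`demo_handles' are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Obj { int value; };

// Steps 1-2: the well-known place holding every direct pointer.
class PtrList {
    std::vector<Obj*> slots;
    std::vector<int> pins;              // step 6: per-object lock state
public:
    PtrList() = default;
    PtrList(const PtrList&) = delete;
    PtrList& operator=(const PtrList&) = delete;
    ~PtrList() { for (Obj* p : slots) delete p; }

    std::size_t add(int v) {            // step 3: called only by SmartPtr
        slots.push_back(nullptr);
        pins.push_back(0);
        slots.back() = new Obj{v};
        return slots.size() - 1;
    }
    Obj* at(std::size_t i) { return slots[i]; }
    void pin(std::size_t i) { ++pins[i]; }
    void unpin(std::size_t i) { --pins[i]; }

    // Step 4: relocate every unpinned object and patch its table entry.
    std::size_t collect() {
        std::size_t moved = 0;
        for (std::size_t i = 0; i < slots.size(); ++i) {
            if (pins[i] != 0) continue; // step 6: leave locked objects alone
            Obj* fresh = new Obj(*slots[i]);
            delete slots[i];
            slots[i] = fresh;
            ++moved;
        }
        return moved;
    }
};

// Steps 3 and 5: the smart pointer stores an index, never an address.
class SmartPtr {
    PtrList& table;
    std::size_t index;
public:
    SmartPtr(PtrList& t, int v) : table(t), index(t.add(v)) {}
    std::size_t slot() const { return index; }
    int get() const {
        table.pin(index);               // hold the object still while reading
        int v = table.at(index)->value;
        table.unpin(index);
        return v;
    }
    void set(int v) {
        table.pin(index);
        table.at(index)->value = v;
        table.unpin(index);
    }
};

int demo_handles() {
    PtrList table;
    SmartPtr a(table, 10), b(table, 20);
    std::size_t moved = table.collect();    // nothing pinned: both move
    assert(moved == 2);
    table.pin(a.slot());                    // pretend a is mid-dereference
    moved = table.collect();                // only b moves this time
    table.unpin(a.slot());
    assert(moved == 1);
    a.set(a.get() + 1);
    return a.get() + b.get();
}
```

Note that `get' and `set' copy values in and out instead of returning
an Obj* or Obj&.  An overloaded "->" as suggested in step 5 would have
to return a raw pointer that outlives the pin, which is exactly the
temporary Tim is worried about.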
jgro@lia (Jeremy Grodberg) (01/03/91)
In article <3090@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>[...]
>I went on to propose a syntax for specifying that (for a given class)
>no regular (stupid? :-) pointers to that class could be declared or used.
>
>I've since realized that there was a rather serious problem with my
>proposal. [...]
>
>Here is an alternative proposal which solves this problem:
>[...]

It is easy to make pointers hard enough to use that programmers won't use
them, which I think is good enough.  By overloading & (the address-of
operator) and new() for the class, one can force them to return
smart-pointers, as Ron started to do in his proposal.  The only way left
for people to create regular pointers to the objects is by new-ing an
array.

The fact that a class' operator new() is not called when an array of that
class' objects is created, and thus a class has no way to prevent
creation of an array, is a much discussed shortcoming which I would invite
everyone to complain about to Messrs. Stroustrup and Koenig.  While Ron's
proposal might make it possible to prevent allocating an array because the
compiler would not let you create the resulting pointer, that solution is
very confusing to the programmer; I would much rather have a new()
operator that I could overload to allocate an array of objects.

If you do the overloading I suggested, then a programmer will have to
make a conscious effort to get a regular pointer to the object, which one
can presume they are doing because that is what is truly needed.  If the
programmer unknowingly and inappropriately tries to create a regular
pointer to the object, they would get a useful and informative compiler
error saying something like "Can't convert a (classPtr) to a (* class)."
Thus you will have accomplished the main objective, which is to prevent
programmers from erroneously getting regular pointers to the objects of
the given class.

The truly general solution is neither the above nor Mr. Guilmette's, but
rather to allow programmers to define a class pointer as another class.
For example:

class T;
class T* {
public:
    operator*();
    operator->();
    //...
private:
    T*();       // make constructor private, so no one can create one
    void* ptr;  // the actual address of the T object.
};

The problem with this proposal is that it completely changes the concept
of what a pointer is and can be, and would likely cause major ripples
throughout the compiler.  You now need a new syntax to describe a regular
pointer to T.  I really want "ptr" to be a regular pointer to T, but I
can't call it a T* now.  And what should &T return by default, a T* or a
regular pointer?  Perhaps you could get away with

    T t;
    void* regularPointer = &(void)t;
    t = (T) *regularPointer;

but now you've blown a great big hole in the type-checking system.

Compared to many of the proposed enhancements to C++, I think adding
language support to restrict one's ability to create pointers to objects
is too difficult in relation to its utility to be considered at this time.
--
Jeremy Grodberg      "I don't feel witty today.  Don't bug me."
jgro@lia.com
dkeisen@Gang-of-Four.Stanford.EDU (Dave Eisen) (01/04/91)
>In article <3071@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
> >
>
>Sounds to me that you want class T to be `owned' by class smart_pointer_to_T.
>Or said another way, you want class T to be hidden `inside' smart_ptr_to_T.
>Sounds like `nested classes' fit the bill!
>
>class smart_ptr_to_T {
>  class T {
>    //...
>  };

It seems to me that nested classes raise some of the issues that C so
successfully avoided by not allowing nested function declarations.  C code
is easy to read in part because the scope of things is pretty clear from the
text; anything that moves away from that is a step in the wrong direction.
--
Dave Eisen                      There's something in my library
1447 N. Shoreline Blvd.           to offend everybody.
Mountain View, CA 94043           --- Washington Coalition Against Censorship
(415) 967-5644                  dkeisen@Gang-of-Four.Stanford.EDU
tma@osc.COM (Tim Atkins) (01/05/91)
In article <CIMSHOP!DAVIDM.91Jan2113026@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>> [ old exchange on smart-pointers not helping with object movement deleted]
>
>An off-the-cuff C++ approach to handling objects that may be garbage collected
>and the pointers that point to them:
>
>1. The garbage collection routines (GC) would have full knowledge of the type
>of objects that can be garbage collected, how they would be moved, and how to
>clean up pointers to those objects.  This implies that the "direct" pointers
>to garbage collected objects are in a "well-known" place (PtrList).
>
>2. These "direct" pointers can, therefore, be allocated as an object is
>created and put in the list in an indexable fashion for the "smart" pointers
>to reference.  There should only be one static list of these pointers so that
>the GC can find them when it needs to do some shuffling.

This totally misses the point it seems to me.  The point is that regardless
of what clever contortions one goes through on the C++ end there is
absolutely no way at all to control the C intermediate code and/or object
code that gets generated.  The object code when run will most definitely
create and use direct pointers to objects in an uncontrolled fashion.  The
challenge of moving C++ objects at will is to find and update these
pointers.  As the architecture is not tagged it is in principle impossible
to do a simple scan and distinguish pointers from other memory words with
the same pattern.  Nor will knowing what the structure of objects is in the
garbage collector solve all the problems as it will not handle any temporary
variables set up by the compilers or copies of pointers sitting around in
registers.

>
>4. When GC decides to move an object, it uses the old address to index
>(somehow) into the static list of pointers and replaces that address with the
>new address of the object.  GC should only move entire objects.
>
>5. When a "smart" pointer needs to dereference the object it points to, it
>uses its smart index to find the real address from the static pointer list.
>This is probably the definition of the overloaded "->" operator on the smart
>pointer.
>

Hmmm.  I think I see what you're getting at.  One approach is to have all
pointers have one level of indirection so a double dereferencing always
happens.  When an object moves only the interior pointer (guaranteed only
one per object) changes and not the exterior ones.  Only trouble is this
still doesn't help with temp copies on the stack and register references.

> [locking discussion removed ]

So I still have to take the position that without going into the compiler
business and most likely losing some C compatibility, it is not possible to
provide a fool-proof object movement scheme in C++.

- Tim
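The double-dereference scheme discussed in this post might look roughly like the sketch below. `Cons` is borrowed from earlier in the thread as a stand-in type; `HandleTable`, `ConsPtr` and `move` are names assumed here for illustration. The object is relocated with a plain byte copy, which is only valid for trivially copyable types such as this one.

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>
#include <vector>

struct Cons { int car; int cdr; };   // stand-in for a collected type

// One interior pointer per object lives here; handles store only an index.
class HandleTable {
public:
    std::size_t add(Cons* p) { table_.push_back(p); return table_.size() - 1; }
    Cons* at(std::size_t i) const { return table_[i]; }
    // Relocate the object: copy its bytes, then patch the one interior pointer.
    void move(std::size_t i, Cons* dest) {
        std::memcpy(dest, table_[i], sizeof(Cons));
        table_[i] = dest;
    }
private:
    std::vector<Cons*> table_;
};

// The "exterior" pointer: copies of it never need updating after a move.
class ConsPtr {
public:
    ConsPtr(HandleTable& t, std::size_t i) : table_(&t), index_(i) {}
    // Every access goes through the table, so it sees the current address.
    Cons* operator->() const { return table_->at(index_); }
private:
    HandleTable* table_;
    std::size_t index_;
};
```

Note that `operator->` itself has to hand back a raw `Cons*`, if only for the duration of one member access. That short-lived value is exactly the kind of temporary copy the post worries about: if the collector could run while it is live (or if someone stores it), it would go stale.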
davidm@uunet.UU.NET (David S. Masterson) (01/08/91)
>>>>> On 4 Jan 91 21:53:34 GMT, tma@osc.COM (Tim Atkins) said:

Tim> In article <CIMSHOP!DAVIDM.91Jan2113026@uunet.UU.NET>
Tim> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:

David> An off-the-cuff C++ approach to handling objects that may be garbage
David> collected and the pointers that point to them:
David> ...

Tim> This totally misses the point it seems to me.  The point is that
Tim> regardless of what clever contortions one goes through on the C++ end
Tim> there is absolutely no way at all to control the C intermediate code
Tim> and/or object code that gets generated.

I've never really examined the code that C++ generates, but I would think
that the C code could only do what the C++ code specifies at the time that
it specifies it.  That is to say that if the C++ code does (A, B, C), then
the generated C code will do (A+, B+, C+) (where the '+' is possible extra
code to fill in the intentions of the C++ statement as C understands them).
Therefore, if B is doing something in an uncontrolled fashion, then A and C
could be used to control B and the generated C code would follow suit.  Is
this a wrong impression of what C++ generates?

David> 4. When GC decides to move an object, it uses the old address to index
David> (somehow) into the static list of pointers and replaces that address
David> with the new address of the object.  GC should only move entire
David> objects.

David> 5. When a "smart" pointer needs to dereference the object it points
David> to, it uses its smart index to find the real address from the static
David> pointer list.  This is probably the definition of the overloaded "->"
David> operator on the smart pointer.

Tim> Hmmm.  I think I see what you're getting at.  One approach is to have all
Tim> pointers have one level of indirection so a double dereferencing always
Tim> happens.  When an object moves only the interior pointer (guaranteed only
Tim> one per object) changes and not the exterior ones.  Only trouble is this
Tim> still doesn't help with temp copies on the stack and register references.

Isn't that what locking is for (assuming my above statement is correct)?  In
particular, if there are temp copies of the pointers on the stack or
register references, who cares?  Once outside the lock protocol, these
vestiges will never be used again (they may be REINITIALIZED when reentering
the lock protocol, though).
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
jimad@microsoft.UUCP (Jim ADCOCK) (01/08/91)
In article <4127@osc.COM> tma@osc.UUCP (Tim Atkins) writes:
|So I still have to take the position that without going into the compiler
|business and most likely losing some C compatibility, it is not possible to
|provide a fool-proof object movement scheme in C++.

I disagree with your position as stated.  One simple counter-example to your
statement is to use the fool-proof object movement scheme of simply not
moving objects.  Simple, and fool-proof.  Other, less academic
counter-examples are also possible.
jimad@microsoft.UUCP (Jim ADCOCK) (01/08/91)
Also, note that the smart pointer problem is really just a sub-part of the metaclass problem in C++, to which I say: C++ needs metaclass support. It should be possible to tie-in the metaclass support with templates, so that the class programmer is not tied to one particular definition of what the metaclass support should be, but rather can provide the individualized metaclass support required for a particular job.
davidm@uunet.UU.NET (David S. Masterson) (01/09/91)
>>>>> On 7 Jan 91 23:15:17 GMT, jimad@microsoft.UUCP (Jim ADCOCK) said:
Jim> C++ needs metaclass support.
Does it?  Or is there just a need for a standard functionality when dealing
with metaclasses?  That is, metaclasses are not part of the language, but more
a part of a standard convention for using the language.
Jim> It should be possible to tie-in the metaclass support with templates, so
Jim> that the class programmer is not tied to one particular definition of
Jim> what the metaclass support should be, but rather can provide the
Jim> individualized metaclass support required for a particular job.
As long as the individualized metaclass support still follows conventions and
allows for asking a question of an object like "are you a metaclass?"  Then
again, the generalization of this is to be able to ask an object "are you an
object?"  Should C++ support as a standard derivation from the object Object
(a la NIH)?
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
rfg@lupine.ncd.com (Ron Guilmette) (01/13/91)
Well, I'm gratified that there has been so much discussion on this
question of how to deal with the "smart pointer problem".  At the very
least this should indicate to all concerned that this is an important
issue and is worthy of careful consideration (in particular by x3j16).
To recap (for those of you who were sleeping) the "smart pointer problem"
is just this:
	If you have a class type T and another class type smart_pointer_to_T
	there is currently no way in C++ to force all "pointers" to type T
	objects (except those within some limited contexts) to be
	smart_pointer_to_T type values rather than T* type values.
	It may sometimes be important to enforce such a restriction on
	T*, either because your garbage collection scheme requires it
	or for some other reason.
	The basic problem is to prevent values of type T* from "leaking"
	out of a certain region of your source code.
Here is a (perhaps biased) summary of the highlights of the discussion so far.
In a separate and subsequent message, I offer my own (definitely biased)
analysis of the state of the discussion and consider yet another solution.
---------------------------------
Marshall Cline really started this thread by mentioning a programming
problem in which it would be useful to "ban" the use of stupid pointers
(perhaps not entirely, but at least over some part of the code in a given
program).
I proposed a combination of new syntax/semantics to allow the programmer
to say (in effect) that the use of values of type T* was restricted to
only some certain limited lexical contexts.  The idea rested on what I
called "protected" class types.  Subsequently I amended this proposal.
After that, Marshall Cline (cline@cheetah.ece.clarkson.edu) proposed that
the solution to this problem was already available within the current 
definitions of the C++ language, via the judicious use of nested classes.
(More on this later).
Then Robert C. (Bob) Martin (rmartin@clear.com) proposed an alternative
to my proposal in which pointer types themselves would be allowed to be
treated as classes, thus allowing class definitions like:
	class T* { ... } ;
where the '...' could be replaced with constructors, destructors, type
convertors, and all sorts of other data members, member functions,
operators (including operator=) and what-not.
Later on, Bob amended his proposal and cross-posted it to comp.std.c++
for additional comments.
Henry J. Cobb (hcobb@ccwf.cc.utexas.edu) proposed simply making all of the
constructors for the type T private so that they would be difficult to
create at random (unrestricted) places.  (I assume that he also would
suggest making the class smart_pointer_to_T a friend of type T so that
at least smart_pointer_to_T could create T's.)
Jeremy Grodberg (jgro@lia.com) also independently proposed the same
thing that Bob Martin proposed (i.e. allowing T* to be declared as its
own class type and to have operators and what-not defined for it).
Jeremy Grodberg (jgro@lia.com) also noted that there is an important
(and currently uncontrollable) source of "leakage" of regular pointers
built-in to the current definition of the C++ language, i.e. doing
a new() on an array of T's always returns a T* because the language rules
say that new() for an array always uses the global new() rather than
any class-specific new() operator.
Andrew Ginter (gintera@cpsc.ucalgary.ca) suggested that any solution would
have to take into account issues relating to pointers returned from functions,
references (to pointers?), the `this' pointer, and temporaries generated
during expression evaluation (even where their ordering of creation &
destruction may not be known).  Furthermore, he suggested that data
members of class objects (which he referred to as "instance variables" in
the Smalltalk tradition of nomenclature) could be a cause for concern
if people were allowed to randomly take their addresses.  It is apparent
from this that Andrew G. is not aware of the (relatively new) C++ features
with respect to "pointer-to-members".
Tim Atkins (tma@osc.UUCP) seemed to feel that if you could not control
*all* of the  pointers to T everywhere in the program (and make them
all the smart kind) then you would still have a problem.  He also
alluded to some (unspecified) problems which he believed might arise
in the generated C code (because it might make use of T* values in
uncontrolled ways behind the back of the C++ programmer).  In particular,
he seemed to feel that T* values might end up in registers, and that this
could cause problems when and if a garbage collector needed to relocate
the pointed-at object(s).
Piercarlo Grandi (pcg@cs.aber.ac.uk) said that the solution to this problem
required a "capability" language and that C++ was not such a language.
Jim ADCOCK (jimad@microsoft.UUCP) said that the solution to this problem
required a new set of features relating to "metaclasses".
rfg@lupine.ncd.com (Ron Guilmette) (01/13/91)
Now for my detailed comments on the "smart pointer problem" discussion
so far.  (This is where it really starts to get biased! :-)
--------------------------------------------------------------------------
I believe that the concerns expressed by Andrew Ginter are, for the most
part, non-issues.  I don't see where functions which return pointers
(either smart or stupid ones) need to cause us any special concerns.
Likewise for temporaries and expression evaluation ordering.  The `this'
pointer is worthy of note only in that it must have type T* and thus,
the type T* should be unrestricted wherever `this' is accessible.
Taking the address of a data member of an object of type T need not cause
us any special concern either because the value yielded by this use of the
unary & operator will be of some pointer-to-member-of-T type, which cannot
subsequently be used in isolation.  Rather, any such pointer-to-member-of-T
may only be used in conjunction with honest-to-goodness pointers to
objects of the type T and if these are maintained correctly then all will
work out just fine.
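To illustrate the pointer-to-member point, here is a small sketch in standard C++. The struct layout and the function names `select_y`, `read` and `address_of_y` are assumed purely for the example.

```cpp
#include <cassert>

struct T { int x; int y; };

// &T::y is a pointer-to-member: it names "the y of some T", not an address.
int T::* select_y() { return &T::y; }

// It can only be applied together with an actual object (or pointer to one).
int read(const T& obj, int T::* member) { return obj.*member; }

// By contrast, & applied to a member of a particular object is an ordinary
// pointer to that member's storage.
int* address_of_y(T& obj) { return &obj.y; }
```

It may be worth keeping in mind that only the qualified form `&T::y` produces the pointer-to-member type. In standard C++ the expression `&obj.y` yields a plain `int*` into the object, so that form would still need attention if the object can be moved.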
Likewise, Tim Atkins' concerns are (I believe) misplaced.  I don't think
that it is necessary to have *all* pointers to some type T be smart in
order for a program containing objects of type T to be useful.  Quite
the contrary, it seems to me that for any pointed-at type (T) you may
want to use smart pointers (to T) in most places and you will absolutely
have to use stupid pointers to T in certain (limited) places.  Additionally,
I don't see where low level implementation-specific details (e.g. the
code that cfront generates) needs to enter into this discussion unless
cfront has bugs that become apparent when we are fiddling with smart
(or stupid) pointers.  The issue of T*'s in registers also seems
unrelated, unless of course our garbage collector can be triggered into
action asynchronously (e.g. as the result of a signal).  In that case,
it may be wise to declare all of our stupid pointers to be volatile
(so that we don't get into memory/register synchronization problems)
but that is all unrelated to the point of this discussion.
I believe that both Piercarlo Grandi and Jim Adcock are saying that we need
to restrict the use of the type T* at run-time via run-time mechanisms.
If so, I disagree with both of them.  I feel that we ought to be able
to do something at compile-time where the performance cost is not so high
as it is for things done at run-time.
Henry Cobb's idea to make all constructors for type T private is somewhat
similar to Marshall Cline's suggestion of nesting the declaration (and
definition) of the class T within a smart_pointer_to_T class.  In both
cases, the idea seems to be to restrict the ability to create objects
of type T to some particular (limited) set of lexical scopes (all of
which are under complete control of the smart_pointer_to_T type).
To varying degrees, these two proposals solve the "smart pointer
problem" by making the type T unknown to the outside world. (In the case
of Henry's proposal, the whole program could at least say `sizeof(T)'
whereas in Marshall's proposal, even that would be illegal outside of
the encapsulating outer class.) 
Anyway, these two proposals succeed by hiding the type T from those who
would attempt to use it directly, and by forcing such potential users
to ask for assistance from the smart_pointer_to_T type in order to
do anything (including creation and destruction) with an object of type T.
These solutions have definite merits, but there is a downside to hiding
the type T.  (More on this later.)
The solution proposed by Bob Martin and (independently) also by Jeremy
Grodberg to allow the type T* to be  treated like a class (which can
be declared and which can have member functions and operators defined
for it) is clever and I had myself considered it, however I fear that
Bjarne will never like it.  The reason?  Well, it makes the language
"mutable" (in Stroustrup's terms).  One early (and related) idea
which I had some time ago for solving this "smart pointer" problem
was to allow stuff like:
	T*& operator= (T*, T*&);
	T operator* (T*);
In effect, I wanted to let the user just redefine the meaning of = and
(unary) * for plain old pointer types.  If you could do that, then you
could be in complete control of all operations done with stupid pointers.
That idea was almost the same as allowing:
	class T* {
	public:
		T*& operator= (T*&);
		T operator* ();
	};
But in both cases, you are allowing the user to change the existing
meaning of things whose meaning is already well defined in the language
(e.g. the meaning of unary * when applied to a pointer type value).
Bjarne doesn't want to open that Pandora's box.  I tend to feel that
this one important case (of pointer types) might warrant a bit of
"mutability" being allowed to stick its nose into the tent, but it
doesn't much matter what I think.  I doubt that Bjarne will have any
part of it.
Of all of these ideas, I think that I like Marshall Cline's the best.
It certainly has good prospects of being implemented widely so that we
can all start to use it soon.  After all, it relies only on features of
C++ which are already described in current drafts of the x3j16 working
documents!  In effect, nested classes are already "in" the standard.
(I hope nobody in x3j16 kills me for having said that.)
Likewise, Henry Cobb's idea (to make all constructors for T private
and then to just make functions and classes which actually have to
create T's into friends of T) is a good solution which ought to work
even with current implementations.
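A minimal sketch of the private-constructor approach, assuming a trivial T that holds one int. The members `make`, `destroy` and `is_null` are hypothetical names chosen for this illustration, and ownership handling is left out on purpose.

```cpp
#include <cassert>

class T {
public:
    int value() const { return value_; }
private:
    explicit T(int v) : value_(v) {}   // private: outsiders cannot create a T
    ~T() {}                            // private: outsiders cannot delete one
    int value_;
    friend class smart_pointer_to_T;   // the only code allowed to do either
};

class smart_pointer_to_T {
public:
    // All creation of T objects is funnelled through the smart pointer type.
    static smart_pointer_to_T make(int v) { return smart_pointer_to_T(new T(v)); }
    T* operator->() const { return p_; }
    void destroy() { delete p_; p_ = nullptr; }
    bool is_null() const { return p_ == nullptr; }
private:
    explicit smart_pointer_to_T(T* p) : p_(p) {}
    T* p_;
};
```

With this arrangement a declaration such as `T t(1);` outside the two classes is rejected at compile time, since both the constructor and the destructor are inaccessible there.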
I do see some problems with these two ideas however.  First and foremost,
by using either of these approaches, I have to give up the ability
(which I would otherwise have) to simply declare an object of type T
as a storage-class `static' file-scope variable, or as storage-class
`auto' variable (local to a function) or even as a member.
I don't like that one bit!  Just because I want the use of T*'s to be
restricted does not mean that I also want to be restricted in
what I can do with a T.  Gosh darn it!  I want my cake and I want to
eat it too!
Another problem with both Marshall's idea and with Henry's idea is that
they both require me to put the entire *definition* of the (controlled)
class T into header files where I don't even want it to be!  That slows
down compilation unnecessarily (which irks me).  For example, with Henry's
proposal, I have to put this into my header file:
	smart_tp.h:
	---------------------------------------------------------------
	class T {
		/* ... the complete definition of T ...*/
		friend class smart_pointer_to_T;
	};
	class smart_pointer_to_T {
		/* ... definition of smart_pointer_to_T ... */
	};
	---------------------------------------------------------------
Here, the definitions of both classes have to be scanned and compiled
for each .C file which includes "smart_tp.h".  Many of these may not even
need to know *any* of the details of the definition of class T.
Likewise, for Marshall's proposal, I need:
	smart_tp.h:
	---------------------------------------------------------------
	class smart_pointer_to_T {
		class T {
			/* ... the complete definition of T ...*/
		};
		/* ... definition of smart_pointer_to_T ... */
	};
	---------------------------------------------------------------
Which is equally wasteful of compile time.
Now somebody else was asking over in comp.std.c++ if it was legal to
incompletely declare a nested class, so that you could have (for example):
	smart_tp.h:
	---------------------------------------------------------------
	class smart_pointer_to_T {
		class T;	/* incomplete declaration of T */
		/* ... definition of smart_pointer_to_T ... */
	};
	---------------------------------------------------------------
and then later on in a different file:
	complete_t.C:
	---------------------------------------------------------------
	#include "smart_tp.h"
	class smart_pointer_to_T::T {
		/* completion of type smart_pointer_to_T::T */
	};
	---------------------------------------------------------------
In my opinion, that would be "way cool" if you could do that, but I don't
think that it is legal.  Furthermore, even if it is legal, it only
provides a way of eliminating one of my two objections to Marshall's
proposed solution to the "smart pointer problem".  The other (more
important) objection still remains.  You still couldn't declare T
objects all over the place.  You could only create them where the
smart_pointer_to_T type would let you (probably only in the heap).
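For what it is worth, ISO C++ as it was later standardized does accept an incomplete declaration of a nested class that is completed outside the enclosing class. The sketch below follows the smart_tp.h / complete_t.C split shown above but puts both parts in one block; the constructor taking an int and the `value` member are assumptions made only so there is something to run.

```cpp
#include <cassert>

// Would live in the header (smart_tp.h in the example above).
class smart_pointer_to_T {
    class T;                      // incomplete declaration of the nested class
public:
    explicit smart_pointer_to_T(int v);
    ~smart_pointer_to_T();
    smart_pointer_to_T(const smart_pointer_to_T&) = delete;            // copying is
    smart_pointer_to_T& operator=(const smart_pointer_to_T&) = delete; // left out here
    int value() const;
private:
    T* p_;
};

// Would live in a single source file (complete_t.C in the example above).
class smart_pointer_to_T::T {
public:
    explicit T(int v) : value_(v) {}
    int value_;
};

smart_pointer_to_T::smart_pointer_to_T(int v) : p_(new T(v)) {}
smart_pointer_to_T::~smart_pointer_to_T() { delete p_; }
int smart_pointer_to_T::value() const { return p_->value_; }
```

Files that include only the header part see T as an incomplete type and never have to parse its definition. As the text says, this addresses the compile-time objection only; T objects still cannot be declared freely.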
My initial proposal was intended to solve the "smart pointer problem"
while keeping the language "immutable", allowing declarations of T
objects in most places, and avoiding any need to have a complete
definition of the type T precede the definition of the type
smart_pointer_to_T.
I believe that my proposal did all that, but I'm now starting to wonder
if it was really such a hot idea after all.
My proposal simply provided a means for telling the compiler that (in
certain contexts) it should treat uses of type T* values as illegal
(thus forcing the user to use the smart pointer type in those contexts
instead).
Perhaps I grabbed the problem by the wrong end.
I now believe that it might be equally effective to simply make it
impossible to even generate a valid (non-null) stupid pointer-to-T
value in certain contexts.  Obviously, if you can prevent valid
values of type T* from leaking out into some area then you don't
even need to worry about whether or not operations on T*'s are
restricted (over that area) or not.
Obviously, for a class type T, you can overload operator& (either as
a member function of T or as a global function taking a T&).
That right there puts you in control of most of the cases where a T*
could potentially be generated.
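A minimal sketch of the member-function form of this overload. The `value` member and the exact shape of `smart_pointer_to_T` are assumed for illustration.

```cpp
#include <cassert>

class T;

class smart_pointer_to_T {
public:
    explicit smart_pointer_to_T(T* p) : p_(p) {}
    T& operator*() const { return *p_; }
    T* operator->() const { return p_; }
private:
    T* p_;
};

class T {
public:
    explicit T(int v) : value(v) {}
    // Unary & applied to a T now yields the smart pointer type, not a T*.
    smart_pointer_to_T operator&() { return smart_pointer_to_T(this); }
    int value;
};
```

Given `T t(5);`, the declaration `smart_pointer_to_T p = &t;` compiles, while `T* raw = &t;` is rejected because there is no conversion from `smart_pointer_to_T` to `T*`. Inside T's own member functions, `this` is of course still a plain `T*`.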
Unfortunately, there are others that you (currently) can't control.
As Jeremy Grodberg (jgro@lia.com) noted, the language rules currently say
that if you new() an array of objects of some type T, the global operator
new is invoked for this regardless of whether or not the type T has its
own class-specific operator new() defined.  As a result, whenever you
new() an array of T, you'll get back a value of type T* even if you
would have preferred getting back a value of type smart_pointer_to_T.
This is one means by which unwanted (but valid and non-void)
values of type T* may leak into some context.  This leakage is very bad
and it ought to be rectified by x3j16.
Also, there is one more leakage problem.  Given  some local or global
variable called `ta' of type array of T, the following expression
yields a value of type T* even if the type T has its own class-specific
operator& defined for it:
		ta
That's it!  The name of an array is generally converted (implicitly) into
a pointer to the zeroth member of that array.  This implicit conversion
currently circumvents any class-specific operator& definition (if one
is present) for the class T and allows values of type T* to leak into
contexts where they may not be welcomed.
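Both leaks can be checked at compile time in present-day C++. The sketch below assumes the same kind of overloaded operator& as discussed above; `first_element` and the `value` member are hypothetical names, and the static_asserts use facilities that postdate this discussion.

```cpp
#include <cassert>
#include <type_traits>

class T;

class smart_pointer_to_T {
public:
    explicit smart_pointer_to_T(T* p) : p_(p) {}
    T* operator->() const { return p_; }
private:
    T* p_;
};

class T {
public:
    T() : value(0) {}
    smart_pointer_to_T operator&() { return smart_pointer_to_T(this); }
    int value;
};

T ta[3];

// Explicit & on an element goes through the overload, as intended.
static_assert(std::is_same<decltype(&ta[0]), smart_pointer_to_T>::value,
              "&ta[0] uses the class-specific operator&");

// Leak 1: an array new-expression has type T* whatever T overloads.
static_assert(std::is_same<decltype(new T[2]), T*>::value,
              "new T[n] yields a plain T*");

// Leak 2: array-to-pointer conversion never consults operator&.
static_assert(std::is_same<decltype(ta + 0), T*>::value,
              "the array name decays to a plain T*");

// The decay in action: a raw T* obtained without writing & anywhere.
T* first_element(T (&array)[3]) { return array; }
```

The first assertion shows the overload doing its job; the other two show a plain `T*` appearing without operator& ever being called.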
If both of these unfortunate leaks in the language could be plugged, we
might be able to achieve really water-tight "safe" smart pointer types
just by overloading operator& (and having it yield a smart pointer type)
for some "controlled" type T.
Both leaks could be easily plugged while doing little harm to the existing
language.
For the first leak, it would be easy enough to say that a class-specific
operator new() for a class T is called whenever a single object *or* an
array of objects of type T is new'ed.  Such operators could then be defined
by the user to return some smart pointer type.
For the second leak, we could simply redefine the semantics of "array-name"
(where "array-name" names an array of objects of some class type) to be
equivalent to invoking (implicitly) an applicable operator& (either member
or global) on the zeroth element of the array.
There now.  That was simple, eh?
Note that by plugging these two leaks, we have not destroyed the user's
ability to declare objects of type T (or even objects of type T*)
but what we have done is to give the user all of the tools he/she needs
in order to insure that no useful values of type T* (other than NULL)
ever leak into a given area (where they might be misused).
jimad@microsoft.UUCP (Jim ADCOCK) (01/15/91)
In article <CIMSHOP!DAVIDM.91Jan8135218@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>X-Posting-Software: GNUS 3.12  [ NNTP-based News Reader for GNU Emacs ]
>
>>>>>> On 7 Jan 91 23:15:17 GMT, jimad@microsoft.UUCP (Jim ADCOCK) said:
>
>Jim> C++ needs metaclass support.
>
>Does it?  Or is there just a need for a standard functionality when dealing
>with metaclasses?  That is, metaclasses are not part of the language, but more
>a part of a standard convention for using the language.

My position on "metaclass support" is very simple:

* Programming tasks commonly performed in C++ should be supported by the
  language.

* Programming tasks commonly performed by the C preprocessor indicate a
  _lack_ of support for that task in the C++ language.

* While templates should help in the creation of metaclass information,
  they won't help enough to keep people from using the C preprocessor for
  these tasks.  Therefore templates are not sufficient to be considered
  metaclass support.

* Creating metaclass-like information is commonly performed in C++.
  [Into the metaclass-like category I'd also lump "smart pointers" and
  "helper classes."]

---

My [vague] idea of what metaclass support might be could be a set of
template classes that are automatically created when a class is derived,
without requiring a

	DECLARE_CLASS(foo, base)

macro, or whatever.  Not only are such macros ugly, unsafe, a pain to
implement and debug, but they also leave open ample opportunity for
subsequent class derivers to screw 'em up.
davidm@uunet.UU.NET (David S. Masterson) (01/16/91)
>>>>> On 14 Jan 91 23:47:25 GMT, jimad@microsoft.UUCP (Jim ADCOCK) said:
Jim> My [vague] idea of what metaclass support might be could be a set of
Jim> template classes that are automatically created when a class is derived,
Jim> without requiring a
Jim> 	DECLARE_CLASS(foo, base)
Jim> macro, or whatever.  Not only are such macros ugly, unsafe, a pain to 
Jim> implement and debug, but they also leave open ample opportunity for 
Jim> subsequent class derivers to screw 'em up.
The Fall 1990 issue of C++ Journal discusses the Dossier type and the
MKDossier command to capture information for generating Dossiers.  I don't
fully understand the methodology being expressed in the article, but it sounds
like an alternative idea with the same goals as above.  Anyone care to
comment?
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
davidm@uunet.UU.NET (David S. Masterson) (01/17/91)
>>>>> On 12 Jan 91 23:05:11 GMT, rfg@NCD.COM (Ron Guilmette) said:
David> In particular, if there are temp copies of the pointers on the stack or
David> register references, who cares?
Ron> Exactly.
Ahhh, but (as you posted in other messages) can this statement be
guaranteed through lexical restrictions in the language?  My statement is made
from the perspective that, even if there are "dead" pointers lying around, as
long as you don't use them (and so, write your program accordingly), you're
safe.  Viewing that from the language implementation end, though, means that
the language must guarantee that these temporary copies of pointers *no longer
exist* outside their lexical scope (even passing pointers from one
valid scope to another valid scope cannot be allowed).  It's an entirely
different emphasis.
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
brucec@phoebus.labs.tek.com (Bruce Cohen;;50-662;LP=A;) (01/18/91)
In article <CIMSHOP!DAVIDM.91Jan15104514@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>
> The Fall 1990 issue of C++ Journal discusses the Dossier type and the
> MKDossier command to capture information for generating Dossiers.  I don't
> fully understand the methodology being expressed in the article, but it sounds
> like an alternative idea with the same goals as above.  Anyone care to
> comment?

I've looked at Dossier in the course of trying to develop a tool for
instrumenting C++ programs.  I talked briefly with both Mark Linton and John
Interrante, and got access to the Dossier source, but I want to emphasize
that these are my own opinions, with reference to my application
requirements, and I apologize if I misrepresent Mark or John, or if they've
done something in the last month or two which invalidates what I say now.

Mkdossier is a text filter that, given a C++ class definition in a source
file, builds a source declaration of a Dossier object containing a classname
string and information about the inheritance (lists of parents and children
classes) and the file and line number of the class definition.  Mark has
talked about adding additional information, but as far as I know there's no
one who plans to do that anytime soon.

The current version of Dossier doesn't contain types and offsets of data
members.  For some applications, you would want a still more complete
definition of the class, including descriptions of the member functions.  If
you want to get even more complex, I think you could implement most of the
CLOS metaobject protocol by subclassing the MetaClass class which is the
class of all class objects (I swear that sounds like Gilbert and Sullivan
:-)).

Getting back to earth, my application requires that I be able to find the
types of all data members, and if they are pointers to class objects that I
be able to follow those pointers recursively.  So Dossier is insufficient
for my purposes, and I've been forced to develop my own tools.  Your mileage
may vary.

The basic idea, however, of an automatic tool which generates source code
for metaobjects from your source, is (in my opinion) the right way to handle
the problem of maintaining run-time information about classes.  The tool I'm
building is not really different in principle from mkdossier; it just has to
be more complex to extract and use the information I need.
--
------------------------------------------------------------------------
Speaker-to-managers, aka
Bruce Cohen, Computer Research Lab        email: brucec@tekchips.labs.tek.com
Tektronix Laboratories, Tektronix, Inc.                phone: (503)627-5241
M/S 50-662, P.O. Box 500, Beaverton, OR  97077
rmartin@clear.com (Bob Martin) (01/20/91)
In article <3348@lupine.NCD.COM> rfg@lupine.ncd.com (Ron Guilmette) writes: >Now for my detailed comments on the "smart pointer problem" discussion >so far. (This is where it really starts to get biased! :-) >-------------------------------------------------------------------------- > >The solution proposed by Bob Martin and (independently) also by Jeremy >Grodberg to allow the type T* to be treated like a class (which can >be declared and which can have member functions and operators defined >for it) is clever and I had myself considered it, however I fear that >Bjarne will never like it. The reason? Well, it makes the language >"mutable" (in Stroustrup's terms). One early (and related) idea >which I had some time ago for solving this "smart pointer" problem >was to allow stuff like: > > T*& operator= (T*, T*&); > T operator* (T*); > >In effect, I wanted to let the user just redefine the meaning of = and >(unary) * for plain old pointer types. If you could do that, then you >could be in complete control of all operations done with stupid pointers. >That idea was almost the same as allowing: > > class T* { > public: > T*& operator= (T*&); > T operator* (); > }; > >But in both cases, you are allowing the user to change the existing >meaning of things whose meaning is already well defined in the language >(e.g. the meaning of unary * when applied to a pointer type value). But don't the unary operator& and operator* already violate this rule of mutability? ARM 13.4 implies that there is no return type restriction on these operators. So by using an Y operator*(X*) you "Change the meaning of unary * when applied to a pointer type value." Can you elaborate more on this? -- +-Robert C. Martin-----+:RRR:::CCC:M:::::M:| Nobody is responsible for | | rmartin@clear.com |:R::R:C::::M:M:M:M:| my words but me. I want | | uunet!clrcom!rmartin |:RRR::C::::M::M::M:| all the credit, and all | +----------------------+:R::R::CCC:M:::::M:| the blame. So there. |
jimad@microsoft.UUCP (Jim ADCOCK) (01/23/91)
In article <6059@exodus.Eng.Sun.COM> chased@rbbb.Eng.Sun.COM (David Chase) writes:
>Your "stupid pointers" have vanished, to be replaced by pointers to
>the interior of the object.  There are several things to consider:
The pointer problem cuts two ways.  People writing for GC would like a
foo* pointer to alias a whole range of objects, say foo_array[100], thus
keeping the whole array alive.  People not writing for GC would [presumably]
like a foo* to alias only one object -- the one currently pointed to -- in
order to cut down on the amount of pessimism [anti-optimization] required
of the code generator.
The fundamental problem is that a pointer in C/C++ is designed to allow
indexing of an infinite array of objects +- from the present object pointed
at.  But as often as not, people don't use pointers to point into arrays at
all -- but rather use them to refer to isolated instances of objects that
are not even part of any array.  [No, an isolated foo is not the same as a
foo[1] -- consider the delete operator.]  And given that `this' pointers are
fundamental to C++, one cannot even avoid the problem by programming using
references.
This flawed notion of pointers is so fundamental to C/C++ that I cannot
propose much to improve the situation -- except that I believe that
non-const references should be added to the language -- to at least
partially allow people to avoid pointer problems.  As Bjarne has noted, this
would require adding a new operator, say ':=', used to represent
re-assignment of a non-const reference to point to another object.  This is
required because '=' already has the well-established meaning of copying the
object being referenced.  Adding such an operator would move references from
being 3rd class things to being 2nd class things.  They still wouldn't have
the full set of operators required to make them first class things.
>There is an easy, portable answer to this problem:  "volatile".
Hmm, how does that solve the `this' pointer problems?
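The bracketed remark about an isolated foo versus a foo[1] can be made concrete. What follows is only a minimal sketch in present-day C++; `Foo`, its `live` counter, and `isolated_versus_array` are hypothetical names introduced for illustration and do not come from the post.

```cpp
#include <cassert>

// Hypothetical class used only for illustration; it counts live instances.
struct Foo {
    static int live;
    Foo() { ++live; }
    ~Foo() { --live; }
};
int Foo::live = 0;

// Allocates one isolated Foo and one single-element array of Foo, then
// releases each with the matching form of delete.  Returns the number of
// Foo instances still alive afterwards.
int isolated_versus_array() {
    Foo* single = new Foo;      // an isolated object
    Foo* array  = new Foo[1];   // an array that happens to hold one element
    assert(Foo::live == 2);

    delete single;              // scalar delete pairs with scalar new
    delete[] array;             // array delete pairs with array new
    // Swapping the two forms (delete array; or delete[] single;) is
    // undefined behavior, even though both pointers have type Foo*.
    return Foo::live;
}
```

Both pointers have the same static type, so nothing in the type system records which form of delete is the right one. That is the sense in which a plain pointer does not say whether it refers to one object or to an array.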
rfg@NCD.COM (Ron Guilmette) (01/23/91)
In article <6059@exodus.Eng.Sun.COM> chased@rbbb.Eng.Sun.COM (David Chase) writes:
+In the absence of any optimizer cooperation (distinct from front-end
+cooperation -- the front-end might automatically generate relocation
+or mark methods), root pointers MUST be "registered" or otherwise
+protected from the optimizer's helping hand.  Even so, they cannot be
+relocated, because temporaries might also be created that would not be
+updated.
+
...
+There is an easy, portable answer to this problem:  "volatile".
I get the feeling that David Chase is talking about what I would call
"asynchronous" garbage collectors which are invoked every so often
(perhaps as a result of a timer signal).
That is indeed a problem which needs the help of "volatile", but it is
also quite a different problem from the one I was talking about in
my previous postings about "the smart pointer problem".
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
tma@osc.COM (Tim Atkins) (01/26/91)
Responding to Ron Guilmette's post on limiting pointer bugs...
This is a pretty interesting proposal but it seems to break down when
object movement is allowed.  I assume it is to be considered a bug
if an object moves but a subsequent update goes to the old location
of the object instead of the new one.  Even with good encapsulation I'm
not sure that you will get much joy in the situation that a smart
pointer is dereferenced and stored by the C or object code generated
in a register just before some other arbitrary function gets executed
which results in a situation where the object moves.  On return the code
(much of which you did not write but was generated from your code by the
compiler) will happily pick up the old address from the register and run
some updating method on it.
I don't really see how encapsulation helps you here.  You can't just look
at the class's methods but must also look at the probability that something
they call will move "this" object before mutation occurs.  Do you have a
reasonable solution for dealing with this problem?  It is also a class of
bug that will happen only sporadically, depending on the entire state of
the process with respect to the need for object movement.
- Tim
tma@osc.COM (Tim Atkins) (01/26/91)
In article <3344@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes: >In article <CIMSHOP!DAVIDM.91Jan7105709@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes: >+>>>>> On 4 Jan 91 21:53:34 GMT, tma@osc.COM (Tim Atkins) said: >+ >+Tim> This totally misses the point it seems to me. The point is that >+Tim> regardless of what clever contortions one goes through on the C++ end >+Tim> there is absolutely no way at all to control the C intermediate code >+Tim> and/or object code that gets generated. >No. It isn't. I believe that you are correct. In short, I believe that >Tim's vague concerns about rogue pointers running rampage (without our >permission) in the (mysterious?) C code which is generated by cfront is >a non-issue. > >+In particular, if there are temp copies of the pointers on the stack or >+register references, who cares? > >Exactly. > I can scarcely believe that folks of your caliber can not seem to see the obvious problem for object movement that these supposedly innocent generated pointers create. Obviously, a dereference of a smart pointer MUST happen somewhere in the generated code for any real work to get done. Obviously, generated results can be and often are stored away by compiler generated code even without counting optimization for later use in any context where it can be determined by the compiler that any depended on values are constant in particular. Now, if I call something that causes the object to move, then these stored away pointers are now invalid. However, HOW DOES THE COMPILER know? I don't know of any way to make it know this strictly from the language end. True a compiler can be built that does makes this possible. On return from the movement causer the code can happily use the old pointer as, say the ultimate receiver of a mutation function. You now have updated the object's old location NOT the working object. This seems to me to obviously be a bug. Does this really make it clear enough what I'm getting at? 
I would prefer to stick to a general statement of the problem as it is
always easy to finagle a particular example to where all works well.  This
can lead to false confidence that there is no real problem.  However,
consider the following simple-minded example of where the problem might
occur:
	void Foo::a( SmartB x, SmartC y){
		something(x,y);
		somethingMore(x);
		this->doOther();
	}
Here if something and somethingMore take real pointers, as they would if
they were extern functions, then obviously room for grief exists.  If this
problem is bypassed, then what if "something" causes a garbage collect in a
compacting collector?  Then the "this" is wrong and the last call will
update the wrong location.
This is not the best of examples and I'm sure you or other folks are
creative enough to blow holes in it or at least come up with exotic
variations of how to do it right.  BUT I think it does illustrate the sort
of tangles that have to be dealt with.  So far, none of the solutions that
stop short of a compiler change will deal successfully with all such
situations, in my opinion.
If this problem can be easily dismissed then no one will be happier than I.
But at least be so kind as to tell us what the solution is instead of
merely making bald statements that no problem is there.
- Tim Atkins
tma@osc.COM (Tim Atkins) (01/26/91)
In article <CIMSHOP!DAVIDM.91Jan16120513@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>X-Posting-Software: GNUS 3.12  [ NNTP-based News Reader for GNU Emacs ]
>
>>>>>> On 12 Jan 91 23:05:11 GMT, rfg@NCD.COM (Ron Guilmette) said:
>
>David> In particular, if there are temp copies of the pointers on the stack or
>David> register references, who cares?
>
>Ron> Exactly.
>
>Ahhh, but (as you posted in other messages) can this be statement be
>guaranteed through lexical restrictions in the language.  My statement is made
>from the perspective that, even if there are "dead" pointers lying around, as
It is precisely the assumption of "dead" that makes object relocation such
a nightmare in an untagged environment.  Most compilers and optimizers
work hard at not putting pointers and other intermediate products in valuable
register and less valuable stack space UNLESS they expect to reuse them
in the near future.  However, there is no guarantee that an operation that
invalidates those stored intermediaries, such as a copying GC, will not occur
before the reuse without the compiler being any the wiser.  If it occurs then
the subsequent use must be restricted not from the user but from the
generated code!  If there is some unknown magic for doing this outside
of changing the compiler then I am not aware of it.  Also, I would expect
such magic to make the code decidedly less efficient if it did exist.
The only other, and the traditional, option is to change these stored pointers
which brings us right back to the problem of knowing a real pointer from
some arbitrary data with the same bit pattern.  Various semi-solutions have
been proposed but I would argue that only a compiler could be expected to
do the job without overburdening the user.
- Tim
tom@ssd.csd.harris.com (Tom Horsley) (01/28/91)
>>>>> Regarding Re: Smart pointers and stupid people (was: garbage collection...); tma@osc.COM (Tim Atkins) adds: tma> Obviously, generated results can be and often are stored away by compiler tma> generated code even without counting optimization for later use in any tma> context where it can be determined by the compiler that any depended on tma> values are constant in particular. "...where it can be determined by the compiler that any depended on values are constant..." This is the key point. No compiler with any brains keeps something in a register across a function call where that something might be aliased. The question then becomes, how does the compiler find out the thing is aliased? tma> Now, if I call something that causes the object to move, then these tma> stored away pointers are now invalid. However, HOW DOES THE COMPILER tma> know? I don't know of any way to make it know this strictly from the tma> language end. True a compiler can be built that does makes this possible. This is the other key point. How do you make sure that 'this' is also a "smart pointer" so the compiler will know it had its address taken and know it is aliased? Without changes to the language (which is what most of this discussion seems to have been about in the first place) this is that hard part, with changes to the language it is not all that difficult, you simply need 'this' to be some kind of "smart pointer" as well as all the other pointers that are laying around. The only way (without language changes) seems to be following a coding convention that will avoid ever having 'this' move. As someone pointed out early in this thread (it seems like years ago now) this can be done by always using an extra level of indirection so 'this' is not actually in the heap and is not garbage collected. This involves extra overhead for the indirection and the coding style is cumbersome and easy to get wrong, which is why so many people are talking about changes to the language. 
-- ====================================================================== domain: tahorsley@csd.harris.com USMail: Tom Horsley uucp: ...!uunet!hcx1!tahorsley 511 Kingbird Circle Delray Beach, FL 33444 +==== Censorship is the only form of Obscenity ======================+ | (Wait, I forgot government tobacco subsidies...) | +====================================================================+
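The "extra level of indirection" convention mentioned in the post above might look roughly like the following. This is a minimal sketch under stated assumptions, not anyone's actual collector: `MovingHeap`, `Handle`, `Counter`, `relocate`, and `bump_after_collection` are all hypothetical names, and `relocate` merely stands in for a copying collection by shuffling the objects.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical payload type.
struct Counter { int value; };

// A toy "heap" whose objects can be relocated.  Clients hold handles
// (indices into a table); only the table records where an object is now.
class MovingHeap {
public:
    typedef std::size_t Handle;

    Handle allocate(int initial) {
        Counter c;
        c.value = initial;
        storage_.push_back(c);
        table_.push_back(storage_.size() - 1);
        return table_.size() - 1;
    }

    // Resolve a handle to the object's current location.  The reference
    // is only good until the next call to relocate().
    Counter& deref(Handle h) { return storage_[table_[h]]; }

    // Stand-in for a copying collector: every object is copied to a new
    // position (reversed order) and the table is updated to match.
    void relocate() {
        std::vector<Counter> moved(storage_.rbegin(), storage_.rend());
        for (std::size_t i = 0; i < table_.size(); ++i)
            table_[i] = storage_.size() - 1 - table_[i];
        storage_.swap(moved);
    }

private:
    std::vector<Counter> storage_;      // objects, free to move
    std::vector<std::size_t> table_;    // handle -> current position
};

// Written as a free function taking a handle instead of a member function,
// so there is no raw 'this' that could go stale across the relocation.
int bump_after_collection(MovingHeap& heap, MovingHeap::Handle h) {
    heap.deref(h).value += 1;
    heap.relocate();                 // the object is now somewhere else
    heap.deref(h).value += 1;        // handle is re-resolved, so this is safe
    return heap.deref(h).value;
}
```

The cost is visible in the sketch: every access goes through `deref`, and the style only works if nobody keeps the returned reference across a call that might relocate. That is the overhead and fragility the post describes.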
jimad@microsoft.UUCP (Jim ADCOCK) (01/30/91)
In article <TOM.91Jan28082617@hcx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
>This is the key point. No compiler with any brains keeps something in a
>register across a function call where that something might be aliased. The
>question then becomes, how does the compiler find out the thing is aliased?
This is an interesting point in terms of optimization too.  Conversely, no
compiler with any brains flushes values needlessly from registers when
optimization is turned on.  The question then becomes: when is an object
legitimately aliased by a function call, and when not?  There isn't
consensus in the C++ community on what constitutes a legitimate alias of an
object.  So, I guess compilers just have to pessimistically flush registers
across function calls.... so much for C++ "optimization".
jimad@microsoft.UUCP (Jim ADCOCK) (02/01/91)
In article <4175@osc.COM> tma@osc.UUCP (Tim Atkins) writes:
|It is precisely the assumption of "dead" that makes object relocation such
|a nightmare in an untagged environment.  Most compilers and optimizers
|work hard at not putting pointers and other intermediate products in valuable
|register and less valuable stack space UNLESS they expect to reuse them
|in the near future.  However, there is no guarantee that an operation that
|invalidates those stored intermediaries such as a copying GC will not occur
|before the reuse without the compiler being any the wiser.  If it occurs then
|the subsequent use must be restricted not from the user but from the
|generated code!  If there is some unknown magic for doing this outside
|of changing the compiler then I am not aware of it.  Also, I would expect
|such magic to make the code decidedly less efficient if it did exist.
Such "magic" does exist.  It need not make code much "less efficient" than
the relatively minor inefficiencies already designed into C++.  Note that
if GC is forced to occur only when it is known that there are no such
intermediaries in existence, or if GC doesn't move any objects that might
be referred to by such intermediaries, then there is no problem.  Both
these approaches are frequently pretty easy to implement.
|The only other, and the traditional, option is to change these stored pointers
|which brings us right back to the problem of knowing a real pointer from
|some arbitrary data with the same bit pattern.  Various semi-solutions have
|been proposed but I would argue that only a compiler could be expected to
|do the job without overburdening the user.
I disagree with the "overburdening" part of the argument.  In one limit, a
GC scheme could choose to never move objects.  Then there is no burden on
the user to know where pointers are.  So, at least one GC design choice is
*not* overburdening.  The question then becomes, how many *other* GC design
choices are not overburdening?
And what kinds of applications do these various GC schemes represent good design trade-offs for? In general, more efficient GC schemes require more work -- either from the programmers of the compiler, or programmers of the code compiled by that compiler. How efficient a GC scheme needs to be is critically dependent on what it is you're trying to do. In some applications, GC can be _truly_ trivial. I claim that there are lots of GC schemes that can work perfectly well in lots of applications, though I know of no scheme that will work in all applications. Note that there is no rule that says one must implement GC for all one's objects -- schemes that only perform GC on certain classes will solve the problem for some applications. Reference counted string classes being one pervasive, if silly, example. You do correctly point out the central issue in "conservative" garbage collection schemes: If one has to "conservatively" guess that some location *might* contain a pointer, then you can't move the *maybe* referenced object, because you can't update the location, because it *might not* contain a pointer. The important design choice is: Are my classes going to use GC or not? Using GC or not leads to pervasive changes in how one writes and thinks about software. If you don't believe this, spend a week or two writing little programs, ignoring issues of memory reclamation, and *pretend* that a GC is cleaning up after you. The freedom to design and use new data structures is tremendous. The exact details of what GC scheme to use are relatively unimportant design issues. GC or not GC is the real question. What the C++ community needs to be working on is ways to abstract-out issues of GC and memory management from other aspects of class design, so that class libraries are not "hard-wired" to one or another memory management or GC schemes.
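One of the approaches mentioned earlier in this post, letting collection happen only when no raw intermediaries exist, could be sketched with a scope guard. This is an illustration under assumptions, not a description of any real collector: `Collector`, `NoMoveScope`, `request_collection`, and `collections_inside_and_after_scope` are hypothetical names, and the "collection" is just a counter.

```cpp
#include <cassert>

// Hypothetical collector front end.  A real one would trace and move
// objects; this one only counts how many collections actually ran.
class Collector {
public:
    Collector() : pins_(0), pending_(false), collections_(0) {}

    // Ask for a collection.  It is deferred while any NoMoveScope is alive.
    void request_collection() {
        if (pins_ > 0) { pending_ = true; return; }
        collect();
    }

    int collections() const { return collections_; }

private:
    friend class NoMoveScope;
    void collect() { pending_ = false; ++collections_; }

    int  pins_;         // number of live NoMoveScope guards
    bool pending_;      // a collection was requested while pinned
    int  collections_;  // collections actually performed
};

// While an object of this type is alive, raw pointers may be held safely
// because the collector promises not to run.
class NoMoveScope {
public:
    explicit NoMoveScope(Collector& c) : c_(c) { ++c_.pins_; }
    ~NoMoveScope() {
        // The last guard to go away runs any collection that was deferred.
        if (--c_.pins_ == 0 && c_.pending_) c_.collect();
    }
private:
    NoMoveScope(const NoMoveScope&);             // not copyable
    NoMoveScope& operator=(const NoMoveScope&);  // not assignable
    Collector& c_;
};

// Requests a collection while pinned; reports the collection count seen
// inside the scope through 'inside' and returns the count after it ends.
int collections_inside_and_after_scope(Collector& gc, int& inside) {
    {
        NoMoveScope guard(gc);
        gc.request_collection();     // deferred: raw pointers may exist
        inside = gc.collections();
    }                                // guard destroyed: deferred GC runs
    return gc.collections();
}
```

The guard only helps if every region that holds raw pointers actually creates one, so the burden on the programmer is reduced rather than removed.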
chased@rbbb.Eng.Sun.COM (David Chase) (02/01/91)
You didn't put the most important part in all caps like you should have. People seem not to appreciate this point. In article <70353@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes: >The important design choice is: Are my classes going to use GC or not? >Using GC or not leads to pervasive changes in how one writes and thinks >about software. ... >The exact details of what GC scheme to use are relatively unimportant design >issues. GC or not GC is the real question. >What the C++ community needs to be working on is ways to abstract-out >issues of GC and memory management from other aspects of class design, >so that class libraries are not "hard-wired" to one or another memory >management or GC schemes. If I read this as "abstracting decisions of GC or not-GC from my class design", it seems like you have just contradicted yourself. If I write an interface that expects the assistance of a garbage collector, it will be different from (simpler than) the one I would have written without the garbage collector. I don't see how I can abstract on that, or why (given a garbage collector) I would want to -- it would take time, and not taking that time is one of the big plusses of using a garbage collector. Perhaps I misread your intention and we really agree, but in short -- C++ and C+++GC are different languages. It's just an accident that they use the same syntax. David Chase Sun
daniel@terra.ucsc.edu (Daniel Edelson) (02/01/91)
In article <70353@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
>
><argument for GC omitted>
>
>What the C++ community needs to be working on is ways to abstract-out
>issues of GC and memory management from other aspects of class design,
>so that class libraries are not "hard-wired" to one or another memory
>management or GC schemes.
Yes but no.  This is a wonderful idea for about 5 minutes.  There is
something very attractive about contemplating something like the following:
	CC -F -GCgenerations=2 -GCtype=copying cfront.C
	...
	# Output is:
	/* <<AT&T C++ Translator 5.0 11/30/93>> */
	... C program omitted for brevity ...
That is, having a very flexible GC interface so that you really can plug in
different algorithms.  Unfortunately, this seems to be an unrealistic ideal.
Copying GC probably imposes stricter stylistic requirements on the
programmer.  Will programmers be willing to code to the stricter standards
unless they intend to use copying GC, even if it introduces inefficiencies?
Perhaps I'm wrong.  Perhaps programmers will be willing to obey stricter
coding practices in order to bind late to the GC algorithm.  I doubt it but
hope so.
jimad@microsoft.UUCP (Jim ADCOCK) (02/02/91)
In article <3348@lupine.NCD.COM> rfg@lupine.ncd.com (Ron Guilmette) writes:
|Now for my detailed comments on the "smart pointer problem" discussion
|so far.  (This is where it really starts to get biased! :-)
|--------------------------------------------------------------------------
|
|The solution proposed by Bob Martin and (independently) also by Jeremy
|Grodberg to allow the type T* to be treated like a class (which can
|be declared and which can have member functions and operators defined
|for it) is clever and I had myself considered it, however I fear that
|Bjarne will never like it.  The reason?  Well, it makes the language
|"mutable" (in Stroustrup's terms).  One early (and related) idea
|which I had some time ago for solving this "smart pointer" problem
|was to allow stuff like:
|
|	T*& operator= (T*, T*&);
|	T operator* (T*);
|
|In effect, I wanted to let the user just redefine the meaning of = and
|(unary) * for plain old pointer types.  If you could do that, then you
|could be in complete control of all operations done with stupid pointers.
Hm, again, I can see that if T is a built-in type, say an int, or a type
derived from a built-in type, say an int[100] or an int*, then it makes
sense not to allow the basic built-in types to be mutated.  Allowing that
would give one programmer the right to change the meaning of another
programmer's programming efforts.
However if T is a class type, say a FOO, or a type derived from a class
type, say a FOO* or a FOO[100], then the only people affected by mutating
the meaning of T are people who have bought into using that particular
type.  This seems more than fair to me.
Thus, as I suggested on comp.std.c++, I don't see why overloading is
allowed only on class types.  It would seem that FOO*'s, FOO[100]'s, and
enums would also be "fair" types to overload.  The only people affected
would be those people who *choose* to use those class types, enum types, or
types derived from those class or enum types.
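For comparison with the suggestion above: C++ as later standardized does accept operator functions with a parameter of enumeration type, but still requires at least one parameter of class or enumeration type (or a reference to one), so operators on two plain pointers remain unavailable. A minimal sketch follows; `Color` and `Foo` are hypothetical names used only for illustration.

```cpp
#include <cassert>

// Hypothetical enumeration used only for illustration.
enum Color { Red, Green, Blue };

// Accepted: the parameter is a reference to an enumeration type.
// Advances to the next enumerator, wrapping from Blue back to Red.
Color& operator++(Color& c) {
    c = (c == Blue) ? Red : static_cast<Color>(c + 1);
    return c;
}

struct Foo { int id; };

// Rejected by the compiler if uncommented: both operands are pointers,
// so no parameter has class or enumeration type.
// bool operator<(Foo* a, Foo* b);
```

So the enum half of the suggestion is available in today's language, while the FOO* half is not.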
davidm@uunet.UU.NET (David S. Masterson) (02/04/91)
>>>>> On 26 Jan 91 08:47:22 GMT, tma@osc.COM (Tim Atkins) said: Tim> In article <3344@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes: Ron> In article <CIMSHOP!DAVIDM.91Jan7105709@uunet.UU.NET> Ron> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes: David> On 4 Jan 91 21:53:34 GMT, tma@osc.COM (Tim Atkins) said: Tim> This totally misses the point it seems to me. The point is that Tim> regardless of what clever contortions one goes through on the C++ end Tim> there is absolutely no way at all to control the C intermediate code Tim> and/or object code that gets generated. My quote is missing here and it is very important for the conversation. Basically, the assumption has been that generated code from C++ will not do anything outside of the sequential statements expressed in the C++ program. That is, if C++ says do A B C, then the generated code will do no more than A+ B+ C+ (none of B's behavior will be done after C). This means that B can be controlled by A and C. Ron> No. It isn't. I believe that you are correct. In short, I believe that Ron> Tim's vague concerns about rogue pointers running rampage (without our Ron> permission) in the (mysterious?) C code which is generated by cfront is a Ron> non-issue. David> In particular, if there are temp copies of the pointers on the stack or David> register references, who cares? Ron> Exactly. Tim> I can scarcely believe that folks of your caliber can not seem to see the Tim> obvious problem for object movement that these supposedly innocent Tim> generated pointers create. [... more on uncontrolled dereferences ...] The point of my statement above is that, if it is true (I believe it is, but I might be wrong), then controlled GC can be implemented safely with respect to C++ as easily as it might be implemented with respect to C. Controlled GC means that garbage collection can only occur on an object when the controls on the object allow it. 
Also, during the period where GC on an object is not allowed (call this the transaction period), all invoked references to the object must first recompute themselves at least the first time in the transaction period. With this design for GC, the key is to implement objects AND access to objects that obey this. The problem has been that the standard (dumb) dereferencing of pointers is not safe and, therefore, must be controlled (which means use them only when controls allow AND they have been computed as being correct). -- ==================================================================== David Masterson Consilium, Inc. (415) 691-6311 640 Clyde Ct. uunet!cimshop!davidm Mtn. View, CA 94043 ==================================================================== "If someone thinks they know what I said, then I didn't say it!"
rfg@NCD.COM (Ron Guilmette) (02/05/91)
In article <4173@osc.COM> tma@osc.UUCP (Tim Atkins) writes:
+Responding to Ron Guilmete's post on limiting pointer bugs...
+
+This is a pretty interesting proposal but it seems to break down when
+object movement is allowed.  I assume it is to be considered a bug
+if an object moves but a subsequent update goes to the old location
+of the object instead of the new one.  Even with good encapsulation I'm
+not sure that you will get much joy in the situation that a smart
+pointer is dereferenced and stored by the C or object code generated
+in a register just before some other arbitrary function gets executed
+which results in a situation where the object moves...
You talk about a "smart pointer being dereferenced".  I assume by that you
mean a smart pointer being turned into a dumb pointer so that the
pointed-at object can actually be accessed.
Anyway, I don't think that this is a problem.  The basic idea in my
proposal is *not* to make everyone wear bullet-proof vests so that nobody
could ever do themselves any damage by accidentally discharging some firearms.
Rather, my idea was to give people safety catches on their firearms so that
they could protect themselves when they consciously decided they needed to.
In short, to make use of the ideas in my proposal, you would still be
required to cooperate by programming carefully.  In particular, you would
have to limit the regions of code in which dumb pointers are generated
and/or used, and in those (hopefully few) areas, you would have to
avoid doing things which may cause the pointed-at objects to move around
or to be deleted (thus invalidating one or more of the "stupid" pointers
that you are manipulating in that small region of code).
I think a good analogy is "critical sections".  You're not supposed to
spin-wait on a lock within a critical section with interrupts disabled.
That's just dumb practice and it gets you nowhere fast.
Likewise, within the small regions of code where "stupid" pointers are
actually generated and/or manipulated (in my scheme) you should not do
stuff that can cause movement of the pointed-at object(s).
Now just because some yahoo *could* write some code in which he did a
spin-wait on a lock within a critical section with interrupts disabled
does NOT mean that critical sections are a bad concept.  It only means
that you must use them carefully in order to use them properly.
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
rfg@NCD.COM (Ron Guilmette) (02/05/91)
In article <4174@osc.COM> tma@osc.UUCP (Tim Atkins) writes:
<
<Obviously, a dereference of a smart pointer MUST happen somewhere in the
<generated code for any real work to get done.
Right.
<Obviously, generated results can be and often are stored away by compiler
<generated code even without counting optimization for later use in any
<context where it can be determined by the compiler that any depended on
<values are constant in particular.
Right.
<Now, if I call something that causes the object to move, then these
<stored away pointers are now invalid.
Right.
<However, HOW DOES THE COMPILER
<know?  I don't know of any way to make it know this strictly from the
<language end...
The compiler need not know.  It is up to the programmer to write code
in which the following *does not* happen:
	generate a dumb pointer value
	call a function which may move the pointed-at object
	use the dumb pointer value generated earlier
My proposal attempted to give the programmer some extra help when it
comes time to make sure sequences of actions like this don't ever happen.
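The three-step sequence above can be reproduced in a small, well-defined model. This is a minimal sketch, assuming a toy two-space "heap" in which moving an object means copying it to the other space; `TwoSpace`, `Account`, `flip`, `unsafe_update`, and `safe_update` are hypothetical names, not part of the proposal.

```cpp
#include <cassert>

// Hypothetical payload type.
struct Account { int balance; };

// Two fixed spaces; the live object sits in exactly one of them at a time.
class TwoSpace {
public:
    TwoSpace() : current_(0) {
        space_[0].balance = 0;
        space_[1].balance = 0;
    }

    // Generates a dumb pointer to the object's current location.
    Account* raw() { return &space_[current_]; }

    // "Moves" the object: copies it to the other space and switches over.
    void flip() {
        space_[1 - current_] = space_[current_];
        current_ = 1 - current_;
    }

    // Balance of the live (current) copy.
    int balance() const { return space_[current_].balance; }

private:
    Account space_[2];
    int current_;
};

// The sequence the programmer must avoid.
int unsafe_update(TwoSpace& heap) {
    Account* dumb = heap.raw();  // 1. generate a dumb pointer value
    heap.flip();                 // 2. call something that moves the object
    dumb->balance += 100;        // 3. use the stale pointer: old copy updated
    return heap.balance();       // the live object never saw the deposit
}

// Same work with the dumb pointer confined to a region with no movement.
int safe_update(TwoSpace& heap) {
    heap.flip();
    Account* dumb = heap.raw();  // generated after the move
    dumb->balance += 100;
    return heap.balance();
}
```

In this model the stale write lands in the abandoned copy, so the live object silently keeps its old balance. With a real collector the old storage could have been reused, and the damage would be less predictable.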
Briefly, my proposal would have allowed the programmer to *restrict* the
area of the program in which dumb pointer values could even be generated.
Once that is done, and the number of places where a dumb pointer value
can even exist is reduced to a minimum, then the programmer would need
to make sure that in all of those (hopefully few) places where the dumb
pointers could exist, no sequences like the one above occur.
In short, if my proposal were adopted, you could still do stupid things,
however you could (with the help of the compiler) impose restrictions on
the number of places in your program where you could *either* use or
*misuse* dumb pointers.  That's at least a step in the right direction.
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
rfg@NCD.COM (Ron Guilmette) (02/05/91)
Regarding the idea of allowing declarations of class T* { ... };
In article <1991Jan19.190403.24325@clear.com> rmartin@clear.com (Bob Martin) writes:
>>... however I fear that
>>Bjarne will never like it.  The reason?  Well, it makes the language
>>"mutable" (in Stroustrup's terms)...
...
>
>  But don't the unary operator& and operator*  already violate this
>  rule of mutability?  ARM 13.4 implies that there is no return
>  type restriction on these operators.  So by using an Y operator*(X*)
>  you "Change the meaning of unary * when applied to a pointer type value."
You are both right and wrong.
Note that those operators can only be defined for parameters
which have either some class type or some reference-to-class type.
Now in the case of operator*, you are dead wrong.  Allowing a definition
of this operator for some class type does not make the language mutable
because operator* has no existing "built-in" definition for objects of
a class type.  Thus, by providing your own explicit definition for
operator* for one of your own class types, you are not changing the
meaning of any previously existing operator which was legally applicable
to objects of that class type.
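A small sketch of that point, assuming a bare-bones `smart_pointer_to_T`
(the class body here is illustrative, not anyone's actual design).  The
user-defined operator* applies only to the class type; the built-in unary *
on T* keeps its usual meaning.

```cpp
#include <cassert>

struct T { int value; };

// Hypothetical smart pointer; only the members needed for the sketch.
class smart_pointer_to_T {
    T* raw_;
public:
    explicit smart_pointer_to_T(T* raw) : raw_(raw) {}
    T& operator*() const { return *raw_; }   // new meaning, for a class type only
    T* operator->() const { return raw_; }
};

// Something like `Y operator*(X*);` is rejected by the compiler: an operator
// function needs at least one parameter of class (or enumeration) type.

int read_through_both(T& object) {
    smart_pointer_to_T sp(&object);
    T* dumb = &object;
    // (*dumb) uses the built-in operator, (*sp) and sp-> use the user-defined ones.
    return (*sp).value + (*dumb).value + sp->value;
}
```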
In the case of (unary) operator&, however, you make a very good point.  It
seems to me that Stroustrup has in fact violated his own principle
of "non-mutability" by allowing (unary) operator& to be defined by the
user for class types.
I will make it a point to raise this philosophical inconsistency with x3j16.
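To illustrate, here is a sketch in which a user-defined unary operator&
changes the meaning of an expression that already had a built-in meaning
for the class.  `Plain` and `Tricky` are hypothetical types; `std::addressof`
is a library facility added to C++ long after this thread (C++11), used here
only to recover the real address for comparison.

```cpp
#include <cassert>
#include <memory>   // std::addressof

struct Plain { int x; };

struct Tricky {
    int x;
    // User-defined unary &: deliberately does not return the object's address.
    Tricky* operator&() { return nullptr; }
};

bool ampersand_meaning_changed() {
    Plain p{1};
    Tricky t{2};
    bool plain_ok = (&p == std::addressof(p));        // built-in & : the real address
    bool tricky_differs = (&t != std::addressof(t));  // overloaded & : something else
    return plain_ok && tricky_differs;
}
```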
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
rfg@NCD.COM (Ron Guilmette) (02/08/91)
In article <TOM.91Jan28082617@hcx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
+>>>>> Regarding Re: Smart pointers and stupid people (was: garbage collection...); tma@osc.COM (Tim Atkins) adds:
+
+tma> Now, if I call something that causes the object to move, then these
+tma> stored away pointers are now invalid.
It may seem to be a trivial point, but please keep in mind that C++ objects
do not "move".  You can copy the value of one C++ object into another, and
then destroy the first object, but the objects themselves don't move.
Still, if we modify the statement of the problem a bit, we can see that
there is in fact still a potential problem here.  If we change the word
"move" to "be destroyed" then we will see that it would indeed be a problem
if you had a sequence of actions like:
	create a (dumb) pointer value to some object
	invoke something which may cause the object to be destroyed
	use the (now stale) pointer
That's definitely a problem.  So what else is new?
My suggestion for two minor language changes does not (and did not) attempt
to make such cases impossible.  My suggested language changes were aimed only
at giving the programmer the ability to *encapsulate* such problems (arising
from messed up uses of dumb pointers) into a limited area of the program so
that when your program crashes, you will know where to start looking for the
problem.
No language can totally prevent you from writing code which contains errors.
Some languages (and language rules) however do make it *harder* to write
incorrect code and/or make it easier for you to find the bugs that you *do*
create.
I do not understand why people seem to want to insist that any minor change
made to the language should instantly make it impossible to write code
which has bugs.  Sorry folks.  I cannot propose any change which will
make it impossible to write programs which have bugs.
+This is the other key point. How do you make sure that 'this' is also a
+"smart pointer" so the compiler will know it had its address taken and know
+it is aliased?
I claim that it is not important to do that.  Remember that `this' can only
be used within member functions of a given class.  Thus, `this' is
already *encapsulated* and thus any logical errors in its use are also
encapsulated (within the class).
+The only way (without language changes) seems to be following a coding
+convention that will avoid ever having 'this' move.
Remember, C++ objects don't move.
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
rfg@NCD.COM (Ron Guilmette) (02/08/91)
In article <CIMSHOP!DAVIDM.91Feb3201846@cimshop4.uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
>>>>>> On 26 Jan 91 08:47:22 GMT, tma@osc.COM (Tim Atkins) said:
>
>With this design for GC, the key is to implement objects AND access to objects
>that obey this.  The problem has been that the standard (dumb) dereferencing of
>pointers is not safe and, therefore, must be controlled (which means use them
>only when controls allow AND they have been computed as being correct).
Right.  If you are going to be using dumb pointers to things that may be
deleted, then you must do so with great caution.  These uses of dumb pointers
must be "controlled".
One useful approach to begin to control all such uses of dumb pointers is
to first *contain* them (in a limited area of your program).
Another word for "containment" is "encapsulation".
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
jimad@microsoft.UUCP (Jim ADCOCK) (02/12/91)
In article <3779@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
|Remember, C++ objects don't move.
Essentially, GC proposals for C++ fall into two camps:
1) Those GC techniques that don't move C++ objects.
2) Those GC techniques that do move C++ objects.
Clearly, moving C++ objects either represents [depending on your point of
view] a change in the language, or a hack performed on a particular
implementation of the language.  Still, I don't see how this prevents
one from considering the ramifications of performing GC on C++ with
moveable objects.
MS-Windows, for example, uses moveable objects in *C*, no less.  The impact
is not as bad on the programmer as one might expect, if object movement is
restricted to relatively rare, well-defined times.
tom@ssd.csd.harris.com (Tom Horsley) (02/12/91)
>>>>> Regarding Re: Smart pointers and stupid people (was: garbage collection...); jimad@microsoft.UUCP (Jim ADCOCK) adds:
jimad> In article <3779@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
|Remember, C++ objects don't move.
jimad> Essentially, GC proposals for C++ fall into two camps:
jimad> 1) Those GC techniques that don't move C++ objects.
jimad> 2) Those GC techniques that do move C++ objects.
jimad> Clearly, moving C++ objects either represents [depending on your point
jimad> of view] a change in the language, or a hack performed on a particular
jimad> implementation of the language.  Still, I don't see how this prevents
jimad> one from considering the ramifications of performing GC on C++ with
jimad> moveable objects.
Yep. Mark-and-sweep type algorithms can leave objects in their original
location, but copy collecting algorithms move them around (in fact, the way
most simple copy collectors are implemented, they *always* move an object
when GC is run).
If you examine the literature on garbage collection, you will find that the
performance is generally vastly superior for variations of copy collectors.
This is the main reason a lot of people are interested in GC schemes in
which objects *do* move.
-- 
======================================================================
domain: tahorsley@csd.harris.com       USMail: Tom Horsley
  uucp: ...!uunet!hcx1!tahorsley               511 Kingbird Circle
                                               Delray Beach, FL  33444
+==== Censorship is the only form of Obscenity ======================+
|     (Wait, I forgot government tobacco subsidies...)               |
+====================================================================+
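As a rough illustration of why a simple copy collector always moves live
objects, here is a toy semispace collector in the style of Cheney's
algorithm.  Everything in it is an assumption made for the sketch: cells
refer to each other by index rather than by address, there is a single
root, and `Heap`, `Cell` and `NIL` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

constexpr int NIL = -1;   // "no cell"

// Hypothetical cons cell; car/cdr are indices into the current semispace.
struct Cell {
    int value;
    int car;
    int cdr;
    int forward;   // index of the copy in to-space, or NIL if not yet copied
};

struct Heap {
    std::vector<Cell> from;   // current semispace
    std::vector<Cell> to;     // empty except during a collection

    int alloc(int value, int car, int cdr) {
        from.push_back(Cell{value, car, cdr, NIL});
        return static_cast<int>(from.size()) - 1;
    }

    // Copy one cell into to-space (once) and return its new index.
    int copy(int idx) {
        if (idx == NIL) return NIL;
        Cell& old = from[idx];
        if (old.forward == NIL) {
            to.push_back(Cell{old.value, old.car, old.cdr, NIL});
            old.forward = static_cast<int>(to.size()) - 1;
        }
        return old.forward;
    }

    // Collect from a single root; returns the root's new index.
    int collect(int root) {
        to.clear();
        int new_root = copy(root);
        // Breadth-first scan: fix up the fields of each copied cell in turn.
        for (std::size_t scan = 0; scan < to.size(); ++scan) {
            int car = copy(to[scan].car);   // may grow `to`, so re-index below
            int cdr = copy(to[scan].cdr);
            to[scan].car = car;
            to[scan].cdr = cdr;
        }
        from.swap(to);   // survivors now live in the new `from`
        to.clear();      // everything left behind is garbage
        return new_root;
    }
};
```

Every surviving cell ends up at a new index after `collect`, and unreachable
cells are simply never copied; any index saved before the collection is
stale afterwards, which is the moving-object problem discussed above.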
boehm@parc.xerox.com (Hans Boehm) (02/13/91)
tom@ssd.csd.harris.com (Tom Horsley) writes:
>Yep. Mark-and-sweep type algorithms can leave objects in their original
>location, but copy collecting algorithms move them around (in fact, the way
>most simple copy collectors are implemented, they *always* move an object
>when GC is run).
>If you examine the literature on garbage collection, you will find that the
>performance is generally vastly superior for variations of copy collectors.
>This is the main reason a lot of people are interested in GC schemes in
>which objects *do* move.
Neither copying nor mark-sweep collection performs uniformly better under
all circumstances.  The tradeoffs are fairly complicated.
The standard argument about copying collection taking time proportional to
the number of accessible objects and mark/sweep collection taking time
proportional to the size of the heap is 95% bogus.  If you include
allocation time, they are both the same.  If you do not include allocation
time, I can write my Mark-Sweep collector so that the Sweep phase takes
place incrementally during allocation.
Copying collection tends to do better (by a constant) with large amounts of
available physical memory (and no shortage of resources needed to map it
into your address space).  It also interacts better with generational
collection, since I can keep the generations physically separated.
However, it is not at all clear that copying outperforms M/S when paging
becomes an issue.  Copying requires that you allocate roughly twice as much
virtual memory as for M/S, all of which is touched at least once for every
full collection cycle.  A M/S collector may run in physical memory, when a
copying collector would be thrashing.  (It takes a long time to page in
20 MB, even if it happens only every once in a while.)
It is also unclear which strategy produces better locality between
collections.  Simple breadth-first copying collectors do compact, but in
such a way that a single data structure tends to get smeared over the
entire heap.  Based on what little data I've seen, this is probably a lot
worse than the approximation to allocation order maintained by a M/S
collector.  Cleverer copying collectors are possible, but something like
the Appel-Ellis-Li parallel collector is nontrivial to adapt to something
like depth-first traversal.  I'm routinely using a parallel M/S collector.
I have yet to see much realistic data on a comparison between the two
strategies.  (Note that the ratio of
time-to-page-in-my-entire-address-space to cpu speed has gotten larger over
time.  Old claims should be treated with caution.)  Zorn's paper in the
1990 Lisp conference does compare a copying collector with a mixed
strategy, and points out that the mixed strategy often wins.
(My apologies for repeating this argument again.)
Hans-J. Boehm
rfg@NCD.COM (Ron Guilmette) (02/17/91)
In article <70606@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
+In article <3779@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
+
+|Remember, C++ objects don't move.
+
+Essentially, GC proposals for C++ fall into two camps:
+
+1) Those GC techniques that don't move C++ objects.
+
+2) Those GC techniques that do move C++ objects.
Jim Adcock (and others) have shown an inability to grasp the meaning of
a simple sentence written in plain english.
Read my lips: C++ objects do not move.
You can copy the contents of one C++ object to another C++ object, and
you can then delete the first object, but C++ objects do not "move".
Oh, you can make tricky use of pointers and maybe even memcpy() to
make it *appear* that a C++ object has moved, but as far as the language
itself is concerned, there is no such thing as a "move object"
operation.
The distinction may seem like a merely semantic one, but it becomes
important when you start to discuss (in detail) issues relating to
smart pointers and other such things.
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
ksand@Apple.COM (Kent Sandvik) (02/18/91)
In article <3945@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>Jim Adcock (and others) have shown an inability to grasp the meaning of
>a simple sentence written in plain english.
>
>Read my lips: C++ objects do not move.
...as long as we talk about C++ compilers working under UNIX.  Note that
there are other operating systems that have a wide range of ideas about how
memory is handled.
Speaking from my own point of view, in the case of MacOS we have
relocatable blocks called handles, and in our C++ AT&T port we do have
an object type called HandleObject, which is an object which indeed moves
in memory (in order to compact the heap).
Anyway, I have not checked ARM and the ANSI specs/drafts, but I would
be highly surprised if the standards lock the underlying object
memory handling scheme to a plain non-moveable version only.
Also, is it in the interest of a language to specify the implementation?
Regards,
Kent Sandvik
-- 
Kent Sandvik, Apple Computer Inc, Developer Technical Support
NET:ksand@apple.com, AppleLink: KSAND  DISCLAIMER: Private mumbo-jumbo
Zippy++ says: "C++ is a write-only language, I can write programs in C++
but I can't read any of them".
jimad@microsoft.UUCP (Jim ADCOCK) (02/20/91)
In article <3945@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
|In article <70606@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
|+In article <3779@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
|+
|+|Remember, C++ objects don't move.
|+
|+Essentially, GC proposals for C++ fall into two camps:
|+
|+1) Those GC techniques that don't move C++ objects.
|+
|+2) Those GC techniques that do move C++ objects.
|
|Jim Adcock (and others) have shown an inability to grasp the meaning of
|a simple sentence written in plain english.
|
|Read my lips: C++ objects do not move.
I fully understand what you're saying Ron -- what you are saying is not a
hard concept to grasp.  It's just that I disagree with what you are saying.
|You can copy the contents of one C++ object to another C++ object, and
|you can then delete the first object, but C++ objects do not "move".
|
|Oh, you can make tricky use of pointers and maybe even memcpy() to
|make it *appear* that a C++ object has moved, but as far as the language
|itself is concerned, there is no such thing as a "move object"
|operation.
The language also doesn't have a concept of "page this region of memory
to secondary store" -- but that doesn't mean a particular implementation
can't use virtual memory.  Usually with hardware assist, some
implementations do indeed move objects, but do so in a way that is totally
transparent to the language.  Thus, the language doesn't care if objects
move or not, as long as the requirements of the language are met.
I claim implementations of the C++ language are free to move objects or
not -- as long as the implementation meets the stated requirements of ARM.
Mind you, I think there are going to be some regions of ARM that are very
hard to meet using a garbage collected moveable objects strategy -- thus
some implementations using GC may do so as an extension to the language,
or alternatively as a restriction on some of the traditionally allowed
pointer hacks.  But then, *any* successful GC scheme is going to have to
make some restrictions on the traditionally available set of pointer hacks.
In Summary:
Ron says C++ objects don't move period.
I say C++ objects can move as long as they obey ARM.
chip@tct.uucp (Chip Salzenberg) (02/20/91)
According to ksand@Apple.COM (Kent Sandvik):
>Anyway, I have not checked ARM and the ANSI specs/drafts, but I would
>be highly surprised if the standards lock the underlying object
>memory handling scheme to a plain non-moveable version only.
Prepare for a big surprise, then: They _do_ require unmoving objects.
A pointer value derived at the beginning of a program had better
compare equal to the same pointer value derived at any later time in
the same program.  And you must remember that pointers to members are
not necessarily class pointers:
	FOO *foo = new FOO;
	char *p = &foo->member;
	// other code in which the value of "foo" is not modified
	if (p != &foo->member)
		cerr << "This C++ implementation is broken\n";
On the other hand, you can take advantage of the as-if rule.  If you
(the implementor) can make moveable objects look like fixed objects,
perhaps by making _all_ pointers "handle pointers," then that's fine.
But the illusion had better be complete, or your compiler will not be
standard-conforming.
-- 
Chip Salzenberg at Teltronics/TCT <chip@tct.uucp>, <uunet!pdn!tct!chip>
"I want to mention that my opinions whether real or not are MY opinions."
             -- the inevitable William "Billy" Steinmetz
rfg@NCD.COM (Ron Guilmette) (03/03/91)
In article <27C19834.162A@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
+According to ksand@Apple.COM (Kent Sandvik):
+>Anyway, I have not checked ARM and the ANSI specs/drafts, but I would
+>be highly surprised if the standards lock the underlying object
+>memory handling scheme to a plain non-moveable version only.
+
+Prepare for a big surprise, then: They _do_ require unmoving objects.
+A pointer value derived at the beginning of a program had better
+compare equal to the same pointer value derived at any later time in
+the same program...
I think that there may have been some misunderstanding regarding the true
meaning of my statement to the effect that C++ objects don't move.
When I said that C++ objects don't move, I meant that they do not move
just like memory locations themselves do not move.  Sure, you can copy
the *contents* of one location to another, but I have yet to see any
program on any computer which caused a little mechanical arm to whiz
around inside the machine pulling DRAMs out of sockets and then plugging
them into different sockets.  (It sure would be entertaining to see such
a machine, or such a program however.  Perhaps that would be the ultimate
in "hardware assisted relocation". :-)
Likewise, given some C++ code like:
	{
		SOME_TYPE some_object;
		//...
		//...
	}
I think that it is fairly clear that `some_object' resides in one (and only
one) place throughout its lifetime.  That's true even if you copy its
*contents* into some other object.
-- 
// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.
horstman@mathcs.sjsu.edu (Cay Horstmann) (03/04/91)
In article <4186@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
>
>I think that there may have been some misunderstanding regarding the true
>meaning of my statement to the effect that C++ objects don't move.
>
>When I said that C++ objects don't move, I meant that they do not move
>just like memory locations themselves do not move.  Sure, you can copy
>the *contents* of one location to another, but I have yet to see any
(cute stuff deleted)
>
>Likewise, given some C++ code like:
>
>	{
>		SOME_TYPE some_object;
>
>		//...
>		//...
>	}
>
>I think that it is fairly clear that `some_object' resides in one (and only
>one) place throughout its lifetime.  That's true even if you copy its
>*contents* into some other object.
>
In C, objects can move.  A typical scenario is to malloc an array of structs
(X* p = malloc( sizeof(X) * n )), fill it and later realloc.  The same
behavior occurs in a C++ variable array class.
Cay
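A minimal sketch of the variable array scenario, assuming a hypothetical
`XArray` class written for this illustration (it is not any particular
library's array).  Growth allocates new storage, copies the contents across
and destroys the old elements, so the address of element 0 changes.  In the
terms used earlier in the thread, the contents are copied into new objects;
either way, a pointer taken before the growth is stale afterwards.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

struct X { int id; };

// Hypothetical minimal variable-size array; copying is disabled for brevity.
class XArray {
    X* data_ = nullptr;
    std::size_t size_ = 0;
    std::size_t capacity_ = 0;
public:
    XArray() = default;
    XArray(const XArray&) = delete;
    XArray& operator=(const XArray&) = delete;
    ~XArray() { delete[] data_; }

    void push_back(const X& x) {
        if (size_ == capacity_) {
            std::size_t new_capacity = capacity_ ? capacity_ * 2 : 1;
            X* bigger = new X[new_capacity];   // new storage while the old is still alive
            for (std::size_t i = 0; i < size_; ++i)
                bigger[i] = data_[i];          // copy the contents across
            delete[] data_;                    // old elements are destroyed
            data_ = bigger;
            capacity_ = new_capacity;
        }
        data_[size_++] = x;
    }

    X& operator[](std::size_t i) { return data_[i]; }
    std::size_t size() const { return size_; }
};

// Returns true if element 0 lives at a different address after growth.
bool first_element_relocated() {
    XArray a;
    a.push_back(X{10});
    auto before = reinterpret_cast<std::uintptr_t>(&a[0]);
    a.push_back(X{20});   // capacity 1 -> 2 forces reallocation
    auto after = reinterpret_cast<std::uintptr_t>(&a[0]);
    return before != after && a[0].id == 10 && a[1].id == 20;
}
```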