[comp.lang.c++] Smart Pointers -- A proposed language extension

rmartin@clrcom.clear.com (Bob Martin) (12/29/90)

				Smart Pointers for C++
					A Proposal for
				A new language feature

				REQUEST FOR COMMENTS

THE PROBLEM
	The facilities of C++ allow for overloading the pointer operators
	so as to create "smart pointers".  These are classes which behave
	in some ways as pointers to an object, but have other more
	intelligent behaviors as well.  For example it would be possible to
	create a smart pointer to a block of bytes which would not allow
	itself to be assigned any value outside the bounds of the block.
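
	As an illustration of the bounds-checked case, here is a minimal
	sketch in today's C++.  The class name BytePtr, its offset()
	accessor and the helper move_is_rejected() are hypothetical, and
	throwing std::out_of_range is only one possible way to refuse an
	out-of-bounds value.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical bounds-checked pointer into a fixed block of bytes.
class BytePtr {
    unsigned char* base_;   // start of the block
    std::size_t    size_;   // number of bytes in the block
    std::size_t    pos_;    // current offset, always <= size_

public:
    BytePtr(unsigned char* base, std::size_t size)
        : base_(base), size_(size), pos_(0) {}

    // Move forward or backward; refuse to leave the range [0, size_].
    BytePtr& operator+=(std::ptrdiff_t n) {
        std::ptrdiff_t next = static_cast<std::ptrdiff_t>(pos_) + n;
        if (next < 0 || static_cast<std::size_t>(next) > size_)
            throw std::out_of_range("BytePtr: move outside block");
        pos_ = static_cast<std::size_t>(next);
        return *this;
    }

    // The one-past-the-end position may be held but not dereferenced.
    unsigned char& operator*() const {
        if (pos_ >= size_)
            throw std::out_of_range("BytePtr: dereference at end of block");
        return base_[pos_];
    }

    std::size_t offset() const { return pos_; }
};

// Returns true if moving a copy of p by n is rejected.
inline bool move_is_rejected(BytePtr p, std::ptrdiff_t n) {
    try { p += n; } catch (const std::out_of_range&) { return true; }
    return false;
}
```

	Nothing here stops a caller from keeping the original
	unsigned char* and bypassing the checks, which is the limitation
	the next paragraph describes.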

	Although this concept is useful, it is limited in that there is no
	way to enforce the use of smart pointers to classes designed to
	be referenced by them.  Thus if a user creates a class and intends 
	that it should only be referenced via a "smart pointer" he must 
	inform all the users of the class that they may not use normal 
	pointers but must use his "smart pointer" class.  If a programmer
	forgets and uses a normal pointer, then the program will not work
	and it will be very difficult to find the error.

PURPOSE
	The following is a proposal for a language addition which would
	allow the user to declare a smart pointer to a class, and enforce
	its use by causing the compiler to transform all pointer 
	declarations or expressions referring to that class to be 
	instantiations of the smart pointer.

PROPOSED SYNTAX
	Whereas "class T" is a declaration of T, "class *T" is a
	declaration of a "smart pointer" to T.  If such a class is
	defined then the compiler is warned that a smart pointer to T
	exists and that all expressions and declarations which
	involve pointers to T (T*) should be converted to expressions or
	declarations involving the smart pointer (*T) instead.  For example
	the following declaration:  T *a;  would be interpreted as: *T a;
	Or the following code segment:
		T x;
		T *p;
		p = &x;
	Would be interpreted as:
		T x;
		*T p;
		p = *T(x);
	
	Thus a user can enforce the use of smart pointers without having 
	to trust the memory of other users. The other users simply declare 
	and use pointers to these objects normally without necessarily 
	knowing that the pointers are in fact "smart pointers".
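
	The "class *T" syntax is not part of C++.  For comparison, the
	sketch below shows roughly how far the existing language (C++11
	here) can go by hand: the class hides its constructors and deletes
	its address-of operator so that clients receive only a smart
	pointer.  Widget, WidgetPtr, create(), destroy() and the
	can_take_address trait are hypothetical names, and this is a
	different technique from the proposed one, since ordinary Widget*
	declarations are not rewritten, merely made hard to initialize.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

class Widget;

// Hand-written stand-in for the proposed "*Widget".
class WidgetPtr {
    Widget* p_;
    explicit WidgetPtr(Widget* p) : p_(p) {}   // only Widget may wrap a raw pointer
    friend class Widget;
public:
    WidgetPtr() : p_(nullptr) {}
    Widget& operator*() const;
    Widget* operator->() const { return p_; }
    bool is_null() const { return p_ == nullptr; }
    void destroy();
};

class Widget {
    int value_;
    explicit Widget(int v) : value_(v) {}      // only create() may construct
    ~Widget() {}                               // only WidgetPtr may destroy
    friend class WidgetPtr;
public:
    Widget(const Widget&) = delete;
    Widget& operator=(const Widget&) = delete;
    Widget* operator&() = delete;              // "&w" no longer yields a Widget*
    const Widget* operator&() const = delete;

    static WidgetPtr create(int v) { return WidgetPtr(new Widget(v)); }
    int value() const { return value_; }
};

inline Widget& WidgetPtr::operator*() const { return *p_; }
inline void WidgetPtr::destroy() { delete p_; p_ = nullptr; }

// Detects at compile time whether "&x" is a valid expression for an lvalue x of type T.
template <class T, class = void>
struct can_take_address : std::false_type {};
template <class T>
struct can_take_address<T, decltype(void(&std::declval<T&>()))> : std::true_type {};
```

	A determined user can still write w.operator->() and obtain a dumb
	pointer, so this narrows the hole rather than closing it.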

DEFINING SMART POINTERS
	When in the scope of a *T definition, the use of pointer operations
	on type T are not converted to *T.  Thus within the member
	functions of *T, pointers to T can be used normally.  Thus:

		class *T {
			T *p;

		public:
			*T() 		{p = (T*)NULL;}
			*T(T& t) 	{p = &t;}
		};
	works all right because T* is a dumb pointer when in the context of
	the *T definition.

	Smart pointers to T can also be used in the scope of the *T 
	definition, but they must be declared as *T;

	A definition of the form class *T{}; creates a dummy smart pointer
	which behaves exactly the way a dumb pointer does.  It is a way
	of saying "I don't want a smart pointer to T."  This is useful
	if you wish to cancel the inheritance of a smart pointer to a 
	BASE CLASS.  (See SMART POINTER INHERITANCE)

CASTING SMART POINTERS
	Variables of type *T cannot normally be cast to pointer types.
	Specifically they cannot be cast to (void *), unless the
	conversion has been added as a member function operator void*().

	Variables or expressions of type void* can be converted into *T
	by casting as follows: 
		void *v;
		T *p;
		p = (T *)v;  // converts void* into *T;

	This allows functions of the form:
		operator *T(void*) to be written so as to aid in the 
		conversion from void* to *T;

SMART POINTER INHERITANCE
	When a smart pointer *B has been declared for a class B, then any
	use of pointers to class D derived from B will use *B smart
	pointers unless a *D smart pointer has been declared.  Thus:

		class B;
		class *B;
		class D : B;  // D is derived from B

		D *p;		// p is really a *B

	In the case of multiple inheritance, if one or more of the base
	classes has a smart pointer associated with it, it is illegal to
	use any pointer operations unless a smart pointer has been declared
	for the derived class:

		class A;
		class *A;	// smart pointer to A
		class B;
		class D : A,B; // D is derived from A and B.

		D *p;		// illegal unless *D is declared.

SMART POINTERS AS CLASSES
	Smart pointers are true classes.  They can have member data and
	functions, can inherit from one or more base classes, etc.  It
	seems likely that an inheritance hierarchy of classes would be
	shadowed by an inheritance hierarchy of smart pointers to those
	classes.
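
	One way such a shadow hierarchy might look if written by hand in
	today's C++ is sketched below.  Base, Derived, BasePtr, DerivedPtr
	and id_through() are all hypothetical names; the proposal itself
	does not prescribe this layout.

```cpp
#include <cassert>

struct Base {
    virtual ~Base() {}
    virtual int id() const { return 1; }
};

struct Derived : Base {
    int id() const override { return 2; }
    int extra() const { return 42; }
};

// Smart pointer to Base (stands in for "*Base").
class BasePtr {
protected:
    Base* p_;
public:
    BasePtr() : p_(nullptr) {}
    BasePtr(Base& b) : p_(&b) {}
    Base& operator*() const { return *p_; }
    Base* operator->() const { return p_; }
};

// Smart pointer to Derived (stands in for "*Derived"); it shadows the
// class hierarchy by deriving from BasePtr.
class DerivedPtr : public BasePtr {
public:
    DerivedPtr() {}
    DerivedPtr(Derived& d) : BasePtr(d) {}
    // Safe because a DerivedPtr is only ever built from a Derived.
    Derived& operator*() const { return static_cast<Derived&>(*p_); }
    Derived* operator->() const { return static_cast<Derived*>(p_); }
};

// Accepts any smart pointer in the hierarchy, as a Base* parameter would.
inline int id_through(BasePtr p) { return p->id(); }
```

	Passing a DerivedPtr where a BasePtr is expected slices the smart
	pointer but still refers to the same object, mirroring the usual
	Derived* to Base* conversion.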

REQUIRED MEMBER FUNCTIONS FOR *T CLASS DEFINITIONS
	*T()
		The default constructor does not exist.  One must be defined
		with no parameters so that the compiler can create empty
		smart pointers.

	*T(T&)
		The pointerizing constructor.  Used by the compiler to create
		pointers to existing objects.

	operator *()
		Should return a reference to an object of type T;

	operator ->()
		Must return a true pointer to type T. (Not a smart pointer)
		The compiler will not attempt to convert it to a smart pointer.

RECOMMENDED MEMBER FUNCTIONS FOR *T CLASS DEFINITIONS		
	operator[](int)
		Should return a reference to an object of type T;

	*T& operator+=(int) and *T& operator-=(int)
		Should perform reasonable transformations on the pointer and
		return references to *T (hopefully references to "this").

	*T operator+(int) and *T operator-(int);
	friend *T operator+(int, *T&);
		Should perform reasonable transformations on the pointer and
		return a new *T;

	operator void*()
		Should return a void* 'v' which can properly be converted back
		to type *T via (T *)v
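
	The required and recommended members listed above can be gathered
	into one class.  The following is a minimal sketch using a
	template named Ptr<T> in place of the proposed "*T"; the name Ptr,
	the private raw-pointer constructor and the from_void() helper are
	assumptions made for illustration, and no bounds or null checking
	is attempted.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical Ptr<T>, standing in for the proposed "*T".
template <class T>
class Ptr {
    T* p_;   // inside the smart pointer, T* is an ordinary ("dumb") pointer
    explicit Ptr(T* raw) : p_(raw) {}
public:
    // Required members
    Ptr() : p_(nullptr) {}                  // *T(): an empty smart pointer
    Ptr(T& t) : p_(&t) {}                   // *T(T&): the pointerizing constructor
    T& operator*() const { return *p_; }    // reference to the object
    T* operator->() const { return p_; }    // a true pointer, not a Ptr<T>

    // Recommended members
    T& operator[](int i) const { return p_[i]; }
    Ptr& operator+=(int n) { p_ += n; return *this; }
    Ptr& operator-=(int n) { p_ -= n; return *this; }
    Ptr operator+(int n) const { return Ptr(p_ + n); }
    Ptr operator-(int n) const { return Ptr(p_ - n); }
    friend Ptr operator+(int n, const Ptr& p) { return Ptr(p.p_ + n); }

    // Round trip through void*, mirroring "operator void*()" and "(T *)v".
    explicit operator void*() const { return p_; }
    static Ptr from_void(void* v) { return Ptr(static_cast<T*>(v)); }
};
```

	For example, given "int a[4] = {1, 2, 3, 4}; Ptr<int> p(a[0]);",
	the expressions p[2], *(p + 2) and *(1 + p) behave as they would
	for an int* pointing at a[0].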
		
		
CONCLUSION
	This technique could be a useful feature to the C++ language.  It
	seems in character with other features of the language, and
	provides a way to control all indirect access to any type of
	object.
				

-- 
+-Robert C. Martin-----+:RRR:::CCC:M:::::M:| Nobody is responsible for |
| rmartin@clear.com    |:R::R:C::::M:M:M:M:| my words but me.  I want  |
| uunet!clrcom!rmartin |:RRR::C::::M::M::M:| all the credit, and all   |
+----------------------+:R::R::CCC:M:::::M:| the blame.  So there.     |

gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) (01/06/91)

In <1990Dec28.203554.21028@clear.com>, rmartin@clrcom.clear.com (Bob Martin) writes:

>	The following is a proposal for a language addition which would
>	allow the user to declare a smart pointer to a class, and enforce
>	its use by causing the compiler to transform all pointer 
>	declarations or expressions refering to that class to be 
>	instantiations of the smart pointer.

To be powerful enough to implement a garbage collector, the smart
pointer proposal must emphasize that, outside of the definition of the
smart pointer class, all language constructs involving machine
addresses must use smart pointers.  This includes:

* all pointers declared by the user and declared as being returned
  from functions (these were mentioned in the proposal),

* reference variables, 

* "this" pointers declared implicitly by the compiler,

* pointers to member functions (which use a pointer to an object
  internally),

* and other temporary pointers which may arise when invoking
  memberwise constructors, destructors or assignment operators.

This will guarantee that expressions like the following no longer goof
up:

	ptr_to_bar X;
	X -> boo (foo ());

With the existing "smart pointer" support, this would be translated to
something like:

	ptr_to_bar X;
	boo ( operator-> (&X), foo ());

The problem with this is that the order of evaluation of arguments to
boo is not defined.  If "operator->" is called before foo is, then a
machine pointer to a "bar" will have been left on the stack while foo
potentially activated a compacting garbage collector.  With "this"
pointers and all other internal pointers made smart pointers, the same
expression would translate to something like:

	ptr_to_bar X, tmp;
	ptr_to_bar_constructor (&tmp, &X);   /* initialize tmp = X */
	boo (&tmp, foo ());
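
The hazard can be reproduced in miniature.  The sketch below is a toy
model, not a real collector: ToyHeap, BarHandle and relocate_all() are
hypothetical names, and old copies are deliberately kept allocated so
that the stale pointer stays safe to use in the demonstration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Bar { int value; };

// Toy relocating heap: a handle is an index into a table of current addresses.
class ToyHeap {
    std::vector<Bar*> table_;       // handle index -> current address
    std::vector<Bar*> from_space_;  // abandoned copies, freed with the heap
public:
    ~ToyHeap() {
        for (Bar* b : table_) delete b;
        for (Bar* b : from_space_) delete b;
    }
    std::size_t allocate(int v) {
        table_.push_back(new Bar{v});
        return table_.size() - 1;
    }
    Bar* address(std::size_t h) const { return table_[h]; }

    // Stand-in for a compacting collection: every object moves.
    void relocate_all() {
        for (Bar*& slot : table_) {
            from_space_.push_back(slot);
            slot = new Bar(*slot);
        }
    }
};

// Smart pointer that looks the address up on every access.
class BarHandle {
    ToyHeap* heap_;
    std::size_t index_;
public:
    BarHandle(ToyHeap& heap, std::size_t index) : heap_(&heap), index_(index) {}
    Bar* operator->() const { return heap_->address(index_); }
};

// Plays the role of foo(): calling it triggers a collection.
inline int foo(ToyHeap& heap) { heap.relocate_all(); return 99; }
```

If a dumb pointer is obtained from operator-> before foo() runs, a
write through it afterwards lands in the abandoned copy and the live
object never sees it; going through the handle after foo() returns
reaches the relocated object.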

Even with this support, implementing a garbage collector is not easy.
The big loophole is people taking the address of instance variables
in classes with smart pointers defined for them.  If the instance
variables have no smart pointers defined to them, the user will wind up
with a dumb pointer to the middle of a smart object.  If a garbage
collector tries to relocate the object, the dumb pointer is not updated.

You can get around this by requiring that all instance variables in
smart classes be themselves smart.  In an application with different
garbage collectors for different smart classes, this won't buy you
anything though.  In such an application it will be impossible for the
compiler to decide whether or not all of the instance variables in a
class are of the same "smartness" (i.e., register pointers with the same
garbage collector).

You can get around this by making all instance variables in the smart
class private.  This limits the use of smart pointers to special
purpose applications.  A general purpose garbage collector can't
require all instances of all collected classes to be private.  And
even for special purpose applications, programmers will have to watch
very carefully.  If a method in the smart class invokes a method in a
dumb instance class, the dumb method will be passed a dumb "this"
pointer.  If that method calls other functions which eventually
trigger a garbage collection in the original smart class, the dumb
method's "this" pointer will be invalidated.

For anyone who's interested, I'm almost finished writing a report
describing the interaction between smart pointers and various garbage
collection options.  If you send me mail, I can send you a copy of the
report when it's finished.

Andrew Ginter, 403-220-6320, gintera@cpsc.ucalgary.ca

chased@rbbb.Eng.Sun.COM (David Chase) (01/19/91)

In article <1991Jan18.014050.27108@cpsc.ucalgary.ca> gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) writes:
>I'm not sure what Chase's example of how optimizers generate
>unregistered temporary pointers was supposed to show.  His example is
>a problem for a parallel garbage collector, yes.  But serial and
>incremental collectors would have no problems with his example since
>his function does nothing that could trigger a garbage collection.

If you re-examine Chase's example, you'll notice that it calls
a function "f" in the middle of that loop.  I thought it was clear
what "f" might do.

>I also disagree with Chase's assertion that "volatile" solves the
>problem.  ... [volatile] is good enough for memory
>mapped registers and may be good enough for some smart pointer
>applications, but it's not good enough for a compacting collector.

I guess I wasn't clear.  With existing compilation technology, it's
not clear that you can get a fully compacting collector (relocating
ALL the pointers) for C++, period.  Edelson does this by registering
all the pointers (creates aliases, too) and assuming
single-threadedness.  As soon as you have more than one thread sharing
an address space (preemptively scheduled, just for entertainment), all
bets on nailing all the references are off because one thread may get
suspended in the middle of an addressing calculation.  Since I ported
a collector to Mach about a year ago, I regard this as a distinct
possibility.  Even without this, there's always Your Friend the
signal handler.

Bartlett combines compacting collection with conservative collection
by not relocating objects referenced by "ambiguous" pointers.  It's a
neat trick, and I think it is a necessary one.  Still, it needs at
least one pointer to the base of an object, and that's what volatile
solves for you, more or less (other people have pointed out that
volatile is underspecified and slow -- what's not clear is whether it
is significantly slower than any registration scheme that relies on
aliasing to thwart the optimizer.  Think about it.)
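
The idea of leaving ambiguously referenced objects in place can be
sketched as follows.  This is a toy model only: MostlyCopyingHeap,
Cell, points_into() and collect() are hypothetical names, pinning is
done per object purely for brevity, and the "ambiguous" words are
passed in explicitly instead of being found by scanning a real stack.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

struct Cell { int a; int b; };

class MostlyCopyingHeap {
    std::vector<Cell*> live_;     // index -> current address
    std::vector<Cell*> retired_;  // abandoned copies, freed with the heap
public:
    ~MostlyCopyingHeap() {
        for (Cell* c : live_) delete c;
        for (Cell* c : retired_) delete c;
    }
    std::size_t allocate(int a, int b) {
        live_.push_back(new Cell{a, b});
        return live_.size() - 1;
    }
    Cell* address(std::size_t i) const { return live_[i]; }

    // True if the word, read as an address, falls anywhere inside *c.
    static bool points_into(std::uintptr_t word, const Cell* c) {
        std::uintptr_t lo = reinterpret_cast<std::uintptr_t>(c);
        return word >= lo && word < lo + sizeof(Cell);
    }

    // Objects that any ambiguous word might refer to stay where they are;
    // all other objects are copied to a new location.
    void collect(const std::vector<std::uintptr_t>& ambiguous) {
        for (Cell*& slot : live_) {
            bool pinned = false;
            for (std::uintptr_t w : ambiguous)
                if (points_into(w, slot)) pinned = true;
            if (pinned) continue;
            retired_.push_back(slot);
            slot = new Cell(*slot);
        }
    }
};
```

An interior address such as &cell->b is enough to pin the whole
object, which is why at least one recognizable reference must survive
somewhere the collector can see it.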

I also think single-threaded programs will soon (within a few years)
be regarded as a special case (on Unix, at least).  It is certainly
true that a garbage collector is more useful in a multi-threaded
environment, because manual storage management is harder there.  To
me, fancy schemes for single-threaded garbage collection are
interesting, but not useful in the long run.  Not useful in the long
run means that portability is less of an issue.  (An aside -- what
does it mean to be "portable" in a language that mutates every six
months?  Ack -- stifle that -- I must not think bad thoughts.)

>Chase's claim that registration methods that work today may fail
>tomorrow is also wrong.  All of today's smart pointer registration
>methods alias the smart pointer being registered.  If the pointer is a
>temporary, this aliasing means it must be allocated in memory.  

(All?  You're certain of that?  My intent was to warn anyone reading
away from making the usual brain-dead assumptions about what the
compiler "surely" must do in compiling code.  "I'll do this -- it's
faster, and it works for me, so it must be ok."  You must have come
across at least one person who programmed in that style.)

Unless it says so in the standard, you cannot rely on this in general.
Whole-program optimizers (e.g., Convex's) have the opportunity of
caching "aliased" variables in registers when they cannot be modified.
(So do less ambitious optimizers, but they have fewer opportunities
for figuring this out.)  The alias created MUST be to a variable that
can eventually be reached by the garbage collector by
standard-conforming code (and in fact, all of the registration
schemes I've seen do satisfy this constraint), and thus the optimizer
will conclude (correctly) that the variable could change whenever the
collector might be called.
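
A registration scheme of the kind under discussion might look like the
sketch below.  It is not any particular author's implementation:
Collector, Root, allocate() and relocate_all() are hypothetical names,
and the "collection" merely copies every referenced object and patches
the registered pointers.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <map>
#include <vector>

struct Node { int value; };

class Root;

// Toy collector: holds an alias (Root*) to every registered smart pointer.
class Collector {
    std::vector<Root*> roots_;
    std::vector<Node*> owned_;   // every Node ever allocated, freed in the destructor
public:
    ~Collector() { for (Node* n : owned_) delete n; }
    Node* allocate(int v) { owned_.push_back(new Node{v}); return owned_.back(); }
    void add(Root* r) { roots_.push_back(r); }
    void remove(Root* r) {
        roots_.erase(std::remove(roots_.begin(), roots_.end(), r), roots_.end());
    }
    std::size_t root_count() const { return roots_.size(); }
    void relocate_all();
};

// Registered smart pointer: constructors register, the destructor unregisters.
class Root {
    Collector& gc_;
    Node* p_;
    friend class Collector;
public:
    Root(Collector& gc, Node* p) : gc_(gc), p_(p) { gc_.add(this); }
    Root(const Root& other) : gc_(other.gc_), p_(other.p_) { gc_.add(this); }
    ~Root() { gc_.remove(this); }
    Root& operator=(const Root& other) { p_ = other.p_; return *this; }
    Node* operator->() const { return p_; }
    Node* get() const { return p_; }
};

// Moves every referenced Node and rewrites each root through its alias.
inline void Collector::relocate_all() {
    std::map<Node*, Node*> forward;   // old address -> new address
    for (Root* r : roots_) {
        if (r->p_ == nullptr) continue;
        Node*& to = forward[r->p_];
        if (to == nullptr) to = allocate(r->p_->value);
        r->p_ = to;
    }
}
```

Because the collector reaches each Root through a stored address, a
conforming optimizer has to assume that a call which may reach
relocate_all() can change the pointer held inside the Root.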

Again, this is heavyweight, and fails for multi-threaded programs
(unless you say volatile). 

I think a better answer would be to blow off this goal of portability
in standard C++, and instead get some compiler support or use a
language that already supports garbage collection.  IF you get garbage
collection, it's a different language, anyway.  Compiler support for
garbage collection is a massive win as far as efficiency goes, and you
don't have to screw around (as Edelson did) with different types for
references from different locations (Local_Type, Global_Type,
Heap_Type).  It works, but it is tedious.

David Chase
Sun Microsystems

gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) (01/19/91)

Chase: If you re-examine Chase's example, you'll notice that it calls
Chase: a function "f" in the middle of that loop.  I thought it was clear
Chase: what "f" might do.

My apologies - I missed "f".  I still argue that the example is
misleading though.  The C code declares "a", "b", and "c" to be smart
pointers, but never executes constructors or other gc registration
procedures on them.  On the other hand, the point isn't worth arguing,
since the example I gave is one where dumb pointer temporaries bite
you even though all possible registration procedures have been called, 
given the current state of the language.

I agree with your points about preemptively scheduled multi-threaded
applications and signal handlers.

Chase: I think a better answer would be to blow off this goal of portability
Chase: in standard C++, and instead get some compiler support or use a
Chase: language that already supports garbage collection.  IF you get garbage
Chase: collection, it's a different language, anyway.  Compiler support for
Chase: garbage collection is a massive win as far as efficiency goes, and you
Chase: don't have to screw around (as Edelson did) with different types for
Chase: references from different locations (Local_Type, Global_Type,
Chase: Heap_Type).  It works, but it is tedious.

That's just it.  Right now, C++ supports reasonably portable
conservative collectors.  Right now, the C++ language does NOT
guarantee the operation of a type safe, reliable, portable,
cooperative collector.  The flaws I've pointed out affect ALL general
purpose cooperative collectors for "standard" C++.  Language support
for cooperative collectors is at least one intended application of all
of the various smart pointer proposals that have come by.  The
question is, can it be done while changing the rest of the language
very little?  Can it be done without sacrificing compatibility with C
source and object code?  Can the result be competitive with manual
allocation?  With conservative collectors?  Can it be done for
parallel applications and applications with signal handlers that may
trigger collections?

Andrew Ginter, 403-220-6320, gintera@cpsc.ucalgary.ca

chased@rbbb.Eng.Sun.COM (David Chase) (01/19/91)

gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) writes:
>Chase: I think a better answer would be to blow off this goal of portability
>Chase: in standard C++, and instead get some compiler support or use a
>Chase: language that already supports garbage collection.  ...

>That's just it.  ... Right now, the C++ language does NOT guarantee
>the operation of a type safe, reliable, portable, cooperative
>collector.  ... The question is, can it be done while changing the
>rest of the language very little?  Can it be done without sacrificing
>compatibility with C source and object code?

Strictly speaking, language changes (in the form of restrictions) are
probably necessary.  It's too easy for me to take the address of a
member of an object and propagate that all over the world (including
into the heap), which forces me to go to "hyper-conservative"
collection, which is dangerously leaky.  Pointers smart enough to deal
with this are probably (after a moment's reflection) too expensive for
practical use.

I think that compatibility with C (and C++) source and object code is
largely a red herring.  In the short term, yes, this is useful.  In
the long term, the two languages are just plain different.  They may
look the same, but people will write vastly different code.  They'll
write different INTERFACES; no need to worry about reclaiming memory
resources if there is a garbage collector to back you up.  (I'm not
explaining this well -- program with a garbage collector for two
years, and you'll see what I mean.  It becomes painfully clear after
the garbage collector is taken away).

By the way, why is it important to have a portable (in the sense of
"contains no machine-dependent #ifdef's") garbage collector?  I
realize that this is a blasphemous question, but the Boehm-Weiser
collector is quite good (except for when the optimizer outsmarts it),
easy to use, and not "portable", though it is easily ported.  People
don't seem to be bothered that they use different compilers, linkers,
and libraries on different machines -- only the interfaces are the
same.

>Can the result [nonconservative collection?] be competitive with
>manual allocation?  With conservative collectors?

Define "competitive".  You probably mean, "run as quickly as".

Consider the other metrics you might use:

is it less prone to leaking memory?  yes. yes.
is it less prone to pointer smashes?  yes. no worse.
is it more convenient to the programmer? yes. to be seen.

A quote taped up next to my office says:

 "Time spent by programmers to solve storage management related
  problems has decreased from an estimated 40% in Mesa [no GC] to an
  estimated 5% in Cedar [with GC] ....  Automatic storage management can
  improve programmer productivity dramatically without affecting
  performance adversely."  (Rovner, 1984)

Ponder that next time you hear about a product that failed to ship on
time.

Edelson presents evidence that a fairly naive compacting C++ collector
can run about as quickly as manual allocation.  I've been satisfied
with the performance of the Boehm-Weiser collector in my use of it
(but this is without using "volatile" to ensure correctness after
optimization).

>Can it be done for parallel applications and applications with signal
>handlers that may trigger collections?

No, you'll need to introduce a degree of conservativism.  You cannot
control the code that is ultimately generated by your backend if you
are claiming to be "portable".  In a parallel world, there will almost
always be some temporary floating around that your "smart pointer"
won't know about and your collector will be unable to relocate.  Read
the papers by Bartlett and Boehm&Weiser.  Get their code, and study
it.

David Chase
Sun Microsystems

rfg@NCD.COM (Ron Guilmette) (01/23/91)

In article <1991Jan18.014050.27108@cpsc.ucalgary.ca> gintera@fsd.cpsc.ucalgary.ca (Andrew Ginter) writes:
+
+Guilemette's claim that support for overloading the "address of
+member" operator prevents dumb pointers into the middle of smart
+objects is wrong.  When invoking a member function of a data member of
+a smart object (smart object = one that has a smart pointer defined
+for it), there is nothing that requires the data member itself to be a
+smart object.

Now I understand the point that you were trying to make.  I believed before
that the only thing that you were worried about was the (non-existent)
problem of getting a dumb pointer via `&object.member'.  As I say, that
is not a problem because nowadays, C++ is defined to give you a very
special kind of pointer (a pointer-to-member) when you use such a
construct.

You are correct however that if I do `object.data_member.func_member()'
then the invoked `func_member' will get (as its `this' pointer) a pointer
to just the `data_member' subpart of `object'.  You are also correct that
this is worth thinking about, because if (while func_member is being
executed) the entire `object' moves (or is deleted) you are in big
trouble.
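
Both forms can be checked against a present-day compiler.  In C++ as
standardized, the pointer-to-member is produced by the qualified form
&Whole::data_member, while &object.data_member applied to an actual
object yields an ordinary pointer; and a member function invoked on a
data member does receive a `this' that points into the enclosing
object.  Part, Whole and self() below are hypothetical names used only
for this sketch.

```cpp
#include <cassert>
#include <type_traits>
#include <utility>

struct Part {
    int n;
    const Part* self() const { return this; }   // exposes the `this' it was given
};

struct Whole {
    int  header;
    Part data_member;
};

// The qualified name, with no object involved, gives a pointer-to-member.
static_assert(std::is_member_object_pointer<decltype(&Whole::data_member)>::value,
              "&Whole::data_member is a pointer-to-member");

// Taking the address of the member of an actual object gives a plain Part*.
static_assert(std::is_same<decltype(&std::declval<Whole&>().data_member), Part*>::value,
              "&object.data_member is an ordinary pointer");
```

With "Whole object = {0, {7}};", object.data_member.self() compares
equal to &object.data_member and differs from the address of object
itself, so if object were relocated during the call the callee would
be left holding a stale interior pointer.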

While this question is worth thinking about, it is not a critical
concern for the problem I was hoping to see a solution for.  Basically,
I just didn't want dumb pointers to objects (or, as you correctly note,
even dumb pointers to subparts of objects) to be "leaking out" all over
the program.  In the case of `object.data_member.func_member()' you could
prevent dumb pointers to data members from "leaking" all over the map
simply by doing what C++ programmers often do anyway... i.e. by making
`data_member' private to the class T (where T is the type of `object').
If you did that, then you have done a part of what I was asking for...
you have "encapsulated" pointers to `object' (or its subparts) to the
point where they might only be floating around in *some* (but not all)
parts of the program.

Of course you need additional language support to complete the "encapsulation
of the dumbness", but even if you have the additional language support,
the programmer *must* cooperate if you are to have any hope at all of
keeping the dumbness confined (to a limited area of code).

+Guilemette's smart pointer proposal seems to assume that it's possible
+to exclude dumb pointers to smart objects from some lexical contexts.
+While this is true, I think the number of such contexts is more
+limited than most people would consider useful.  In particular,
+consider the statement:
+
+  X -> A = boo ();
+
+where X is a smart pointer and boo is a function that sometimes
+triggers a compacting garbage collection.  The order of evaluation of
+this expression is undefined, and the left hand side (lhs) of the
+expression is an lvalue - a dumb pointer to A.

HOLD IT!  STOP RIGHT THERE!  Who says that an lvalue is the same thing as
a pointer?  This is an interpretation of C/C++ semantics I have never heard
before!!!

+If the language evaluates the lhs of the expression before evaluating
+the rhs and boo triggers a collection, the dumb pointer will be stored
+in a temporary or in a register during the collection.

You have noted a case where the ordering criteria for C/C++ operations
is sufficiently weak that a programmer could (as in your example) get
himself into trouble.  Does that mean that attempts to improve upon
the safety of smart pointers should be abandoned?  I think not.

+Gregory's claim that normal pointers to objects are OK as long as
+a smart pointer still refers to the object is of course true for
+non-compacting collectors.  However, this is a rule USERS must
+enforce - compilers can't...

Right.  Compilers can only give you the basic tools to write "good" or
"safe" code.  They cannot actually write the code for you.  You have
to help a little. :-)

As I said earlier, you (the programmer) must do your part to "encapsulate
the dumbness".

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

rfg@NCD.COM (Ron Guilmette) (01/23/91)

In article <TOM.91Jan21083134@hcx2.ssd.csd.harris.com> tom@ssd.csd.harris.com (Tom Horsley) writes:
+Just to throw some more dust around the smart pointer issue :-)
+
+It seems to me that what the "smart pointer" proposals all boil down to is a
+request for language support to prevent programmers from using something
+"improperly". I believe the general consensus throughout this discussion has
+been that garbage collection and smart pointers are possible now, but to use
+them effectively you have to employ somewhat restrictive coding standards
+which are easy to unintentionally violate.

Gee!  At least one person caught my drift. :-)

+Given this, could the smart pointer question be considered an "environment"
+issue rather than a "language" issue? Could a sufficiently sophisticated C++
+development environment provide the sort of additional semantic checks
+needed to insure that no one ever used a class "improperly"?

I want to refute this as strongly as possible.  NO!!!  This has nothing
to do with environments!  Type-checking is one way that the compiler
can restrict you from shooting yourself in the foot.  Is type-checking
an environment issue?  NO WAY!

+... Without multi-threading I don't
+believe there are any problems...

Except for those darn dumb pointers!

+...GC in a single thread environment is always
+triggered by some sort of subroutine call, and no self-respecting compiler
+will keep something alive in a temporary across a routine call when that
+value might be aliased by the routine call.

Right.  "Synchronous" garbage collection (triggered by a function call)
is a LOT easier than BIG TIME "asynchronous" garbage collection (or
garbage collection which gets performed by other threads, which
amounts to the same thing).

I realize that the latter problem is more entertaining (and perhaps
even of more practical value) but right now, I'll settle for a "safe"
way to do the former.

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

roger@xanadu.com (Roger Gregory) (01/30/91)

> rfg@NCD.COM (Ron Guilmette)
>+ tom@ssd.csd.harris.com (Tom Horsley) 
>+Just to throw some more dust around the smart pointer issue :-)
>+
>+It seems to me that what the "smart pointer" proposals all boil down to is a
>+request for language support to prevent programmers from using something
>+"improperly". I believe the general consensus throughout this discussion has
>+been that garbage collection and smart pointers are possible now, but to use
>+them effectively you have to employ somewhat restrictive coding standards
>+which are easy to unintentionally violate.
We do this routinely.

>+Given this, could the smart pointer question be considered an "environment"
>+issue rather than a "language" issue? Could a sufficiently sophisticated C++
>+development environment provide the sort of additional semantic checks
>+needed to insure that no one ever used a class "improperly"?
>
>I want to refute this as strongly as possible.  NO!!!  This has nothing
>to do with environments!  Type-checking is one way that the compiler
>can restrict you from shooting yourself in the foot.  Is type-checking
>an environment issue?  NO WAY!

I disagree.  Type checking certainly can be made into an environmental
issue: consider a c++lint powerful enough to check c++ code for
various eccentric restrictions (we call ours xlint).  This enables
us to check lots of restrictions, though not all possible sources of
errors.

I contend that this gives us extended type checking OUTSIDE the
compiler.  Just as C originally had lint to do additional checking,
we can just declare that we will not use code that doesn't pass
xlint.  I could even have make enforce this if I wanted to; I don't.

The last thing that we thought we required C++ to do that wasn't
in the ARM has recently been finessed.  We could still wish for
a way to make smart pointers that polymorphize properly, but we make do
with some hackery.



-- 
Roger Gregory
Xanadu Operating Company
(415) 856-4112 x113
roger@xanadu.com

rfg@NCD.COM (Ron Guilmette) (02/05/91)

In article <1991Jan30.002612.27260@xanadu.com> roger@xanadu.com (Roger Gregory) writes:
>
>>+Given this, could the smart pointer question be considered an "environment"
>>+issue rather than a "language" issue? Could a sufficiently sophisticated C++
>>+development environment provide the sort of additional semantic checks
>>+needed to insure that no one ever used a class "improperly"?
>>
>>I want to refute this as strongly as possible.  NO!!!  This has nothing
>>to do with environments!  Type-checking is one way that the compiler
>>can restrict you from shooting yourself in the foot.  Is type-checking
>>an environment issue?  NO WAY!
>
>I disagree, type checking certainly can be made into an environmental
>issue, consider a c++lint powerful enough to check c++ code for
>various eccentric restrictions (we call ours xlint).  This enables
>us to check lots of restrictions, though all possible sources of
>errors.

OK.  Let me re-phrase my statement.

Type checking, as well as numerous other sorts of checking for "correctness"
of C or C++ programs *may* be incorporated into a tool which you don't
call a "compiler", however that is often a foolish implementation
technique.

I consider it to be a fundamental part of the job of a "compiler" to
perform a good deal of "checking" on the programs (or sub-hunks of
programs) that I feed it.  Placing some of the checking into a separate
tool is like inviting somebody to write crappy code.  If you don't
believe that, try running lint on some big hunks of code where the
original developer decided not to be bothered with trivia like using
lint during the initial development.  Most likely you will be horrified
by what you find.


-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

sdm@cs.brown.edu (Scott Meyers) (02/05/91)

In article <3706@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
| Type checking, as well as numerous other sorts of checking for "correctness"
| of C or C++ programs *may* be incorporated into a tool which you don't
| call a "compiler", however that is often a foolish implementation
| technique.
| 
| I consider it to be a fundamental part of the job of a "compiler" to
| perform a good deal of "checking" on the programs (or sub-hunks of
| programs) that I feed it.  Placing some of the checking into a separate
| tool is like inviting somebody to write crappy code.  If you don't
| believe that, try running lint on some big hunks of code where the
| original developer decided not to be bothered with trivia like using
| lint during the initial development.  Most likely you will be horrified
| by what you find.

I'll be presenting a paper at the USENIX C++ conference in April on the
kinds of likely programming errors that could (and I believe should) be
detected by an as-yet-fictitious lint++.  Moises Lejter is a coauthor of
the paper.  We list a dozen or so candidate errors for detection, show that
they can be efficiently detected, and argue that a tool for doing such
detection should be distinct from a compiler.

The primary impetus for this last point is that a lint++ could look for
things far afield from the usual domain of a compiler.  For example, a
lint++ could read special comments that say "this virtual function should
*always* be redefined in any derived class, no matter how far down the
inheritance hierarchy".  This would be useful for the following member
function:

    const char * className() const;  // returns the name of the class

I don't think you want your compiler checking for the enforcement of such
constraints.  At the same time, I think it's very useful to be able to
state such constraints and to have violations detected by some tool.
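
To make the className() case concrete, here is a small sketch of the
kind of slip a compiler accepts without complaint.  Shape, Circle and
UnitCircle are hypothetical classes, and the "lint++:" comment is an
invented annotation format, not the syntax of any existing tool.

```cpp
#include <cassert>
#include <cstring>

class Shape {
public:
    virtual ~Shape() {}
    // lint++: redefine in every derived class   (hypothetical annotation)
    virtual const char* className() const { return "Shape"; }
};

class Circle : public Shape {
public:
    const char* className() const override { return "Circle"; }
};

// Forgets to redefine className(); this is legal C++ and compiles cleanly.
class UnitCircle : public Circle {
};
```

Calling className() on a UnitCircle returns "Circle", the name
inherited from its nearest base, which is exactly the violation a
separate checking tool could be asked to report.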

Scott


-------------------------------------------------------------------------------
What do you say to a convicted felon in Providence?  "Hello, Mr. Mayor."