[comp.lang.c++] Current O-O Languages as Software Engineering Tools

coggins@retina.cs.unc.edu (Dr. James Coggins) (11/08/88)

In order to give the religious debate on C++ vs. Objective-C a more
solid (well, a little less vaporous) foundation, consider the
following propositions that were proposed to me recently:

1. Smalltalk-like languages (including Obj-C) do a better job of
separating specification and implementation than Simula-like languages
(including C++). 

2. Smalltalk-like languages are better tools for developing small
programs because of their massive built-in class libraries and their
more flexible (later, dynamic) binding which allows polymorphic types. 

3. Simula-like languages are better tools for medium size software
development because for these larger projects it is worth building
specialized class hierarchies for the particular system, yielding a
better match to conceptual structures and increased programmer and
run-time efficiency. 

Now for the leap...

4. The Smalltalk-like languages will be the better tools for large
software development because item #1 above will kick in to make the
project more manageable. 

Opinions?

---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   
UNC-Chapel Hill               Old: Data + Algorithms = Programs
Chapel Hill, NC 27514-3175    New: Objects + Objects = Objects
---------------------------------------------------------------------

jima@hplsla.HP.COM (Jim Adcock) (11/10/88)

Why ask for opinions when you can check out the real
development work being done in C++ versus other
OOPLs ???

Ultimately, languages are good at what they prove to
be good at.

Some of the things that C++ is already proving to be
good at: fast operating systems, fast graphics programs,
fast compilers....

smryan@garth.UUCP (Steven Ryan) (11/11/88)

>1. Smalltalk-like languages (including Obj-C) do a better job of ....
>3. Simula-like languages are better tools for medium size software ....

Well, provided we're just being informal.....

I think it might be nice to have a sequence of closely related languages.
At one end would be a strongly typed Smalltalk (or whatever) which could be
compiled into fast code, Smalltalk in the middle, and at the other end
something beyond Smalltalk (where classes are first-class objects).

People could start at the high end with rapid development and then migrate
toward the other end if the code was used often enough to justify the
effort to make it compilable.
-- 
                                                   -- s m ryan
+-------------------------------+----------------------------------------------+
| Home of the brave and land of |  Congress shall make no law respecting the   |
| [deleted by the authority of  |  establishment of religion,...; or abridging |
| the Official Secrets Act].    |  the freedom of speech, or of the press;...  |
+-------------------------------+----------------------------------------------+

bs@alice.UUCP (Bjarne Stroustrup) (11/12/88)

coggins@retina.cs.unc.edu (Dr. James Coggins) presents 3 propositions
and a conjecture for comments. As people would expect (after reading
the original note) I'll challenge the conjecture. For good measure I'll
also challenge ALL of the propositions!

 > In order to give the religious debate on C++ vs. Objective-C a more
 > solid (well, a little less vaporous) foundation, consider the
 > following propositions that were proposed to me recently:

 > 1. Smalltalk-like languages (including Obj-C) do a better job of
 > separating specification and implementation than Simula-like languages
 > (including C++). 

Not at all! They do a different and often inferior job of this separation.
Let us first consider the key difference between the two styles of language
and to simplify matters let us compare Smalltalk and C++ as representatives
of their respective schools of thought. Each carries scars from its
history, each lacks features provided by newer and more ``researchy''
members of its family, and each has proven itself successful in its
own core application areas. Furthermore, I like both better than most
of the other languages and systems in the OOP arena.

Naturally, I have to make some simplifications in this discussion. The topic
is more suitable for a couple of major dissertations than a comp.lang note.
Try considering the major points rather than simply looking at each sentence
and finding the 5 qualifications necessary for making it strictly true. I probably
know them too, but I'm not writing a Ph.D. -- Actually, it would be nice if
someone would do a really thorough discussion of these issues.

In my opinion the key issue/difference is the type system. Most of my comments
will relate to this:

	ST relies exclusively on run-time type checking
	C++ relies extensively on static type checking

So, what does ``separating specification and implementation'' mean?
For now, let us consider this by comparing the answers to the following
questions:

	I changed the implementation of something,
		how do I know I didn't change the specification?
		how are clients affected? and
		what do I have to do to get the program running again?

Ideally, with a perfect separation of specification and implementation
the answers are ``it's obvious,'' ``not at all,'' and ``nothing''.

C++ class declarations provide the user with the ability to specify strongly
typed interfaces. If you change the interface it is obvious - and clients
depending on the interface will fail to compile. Furthermore you can specify
separate interfaces to different kinds of clients. In particular, you can
provide one interface to the general public, another to implementors of
derived classes, and a third to yourself (see for example Alan Snyder's
paper for OOPSLA'86 or Barbara Liskov's formulation of the same ideas in
the OOPSLA'87 keynote address). Smalltalk has problems in this area.
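To make that concrete, here is a sketch (the Stack class and its members are purely illustrative, not taken from any real library) of how a single C++ class declaration carries all three interfaces:

```cpp
#include <cassert>

// Illustrative class showing three separate interfaces in one
// declaration: public for general clients, protected for implementors
// of derived classes, and private for the class itself.
class Stack {
public:                          // interface to the general public
    Stack() : top_(0) {}
    void push(int v) { data_[top_++] = v; ok(); }
    int  pop()       { int v = data_[--top_]; ok(); return v; }
    int  size() const { return top_; }
protected:                       // interface for derived classes only
    void ok() const { assert(top_ >= 0 && top_ <= MAX); }
private:                         // implementation, invisible to all clients
    enum { MAX = 100 };
    int data_[MAX];
    int top_;
};
```

A client that calls s.ok() or reads s.top_ directly simply fails to compile; that is the mechanical checking of the interface boundary.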

There are of course many subtle dependencies that are not amenable to
verification by static type checking such as ``f() must be called before g()''.
This is a problem in every language. C++'s constructors and destructors
and initialization rules provide a few features in this direction and
dynamic checking of all sorts of properties can always be done. I do,
however, strongly prefer declarative properties that can be checked before
execution to properties that are essentially dynamic in nature and can only
be checked when running.
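A sketch of what the constructor rules buy here (the Log class and its members are hypothetical): if the rule is ``opening must precede writing'', doing the opening in the constructor leaves no way for a client to get the order wrong:

```cpp
#include <cassert>

// Hypothetical example: suppose open() must be called before
// write_one(). By folding the opening step into the constructor, no
// client can ever obtain a Log on which the prerequisite hasn't run.
class Log {
public:
    Log() : open_(true), writes_(0) {}   // mandatory first step runs here
    ~Log() { open_ = false; }            // mandatory last step runs here
    int write_one() { assert(open_); return ++writes_; }
private:
    bool open_;
    int  writes_;
};
```

Contrast this with checking a flag dynamically in every call: here the prerequisite is established before any member function can be reached at all.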

The snag with run-time checking is that it is notoriously difficult to
ensure that errors detected at run-time are handled in a reasonable manner.
Entering even the most advanced debugger after detecting a vector range
violation or a ``method not found'' is useless if there is no programmer
present.

So C++ can detect a large class of problems at compile time, but what can
it do once they are detected? This is the notorious and I think largely
misunderstood ``header file problem''. Smalltalk doesn't have it: once you
have decided what to change, you simply do it and keep running. The effect
of a change ``instantly'' affects the whole program.

Once you make a change in a C++ program you need to recompile. If you simply
make a change to a member function, in principle you need only recompile
and link that function (and there are systems being built that do exactly
and only that). The problem comes when you make a change to something that
is part of an interface. Then you need to recompile the clients too.
That is the essential price you pay for having the static checking. I contend
that in all but the most trivial programs it is worth it. In this, I am backed
by evidence of actual compile, debug, and integration times of medium sized
(<500K lines of code) projects.

As I have often said before C++ needs a tool that determines the minimal
recompilation needed after a change and preferably an incremental compiler
and linker to minimize the work of doing this recompilation. A UNIX `make'
that recompiles the world because you changed a comma in a comment in a
heavily included header file simply isn't a suitable component in a C++
programming environment.

There is one C++ design decision that affects the amount of recompilation
that people usually fail to appreciate (I think largely because of the lack
of tools). The private part of a C++ class is part of the declaration of the
class itself and a change to it may force the recompilation of clients.
For example:
		class X {
			int a;
		public:
			int f();
		};

		int main()
		{
			X x;
			x.f();
		}

If you change X to

		class X {
			int a;
			int b;
		public:
			int f();
		};

main() will have to be recompiled. The reason is that by declaring an automatic
(on the stack) variable the client main() requires knowledge of the size of X
(the compiler needs that information to generate code for the function call and
return since that requires knowledge of the size of the stack frame).
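To make the dependency concrete, here is a sketch with the two versions of X side by side (renamed X1 and X2, with trivial stand-in bodies); the object's size is exactly the knowledge the compiled client had baked into its stack-frame layout:

```cpp
#include <cassert>

// The two versions of X from the text, before and after adding `int b;'.
// The function bodies touch the members only to keep them from being
// flagged as unused; they stand in for the real implementation.
class X1 { int a;        public: int f() { return a = 0; } };     // before
class X2 { int a; int b; public: int f() { return a = b = 0; } }; // after
```

On typical implementations sizeof(X2) exceeds sizeof(X1); a client that allocated an X on its stack compiled against the old size, which is why it must be recompiled.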

Clearly that could be avoided (even in a C++ implementation) if we were
willing to use a smarter link environment or code generation strategy
(so that the layout of stack frames was handled dynamically), but that
would seriously hurt C++'s portability and its ability to fit into a
traditional environment.

Alternatively we could use the trick of never allocating class objects on
the stack. This is the strategy of Simula and most of its descendants.
The snag is that if we did that we would incur an overhead of two memory
management operations per function call and the cost of indirecting every
access to a class object. Measurements on Simula indicate that this cost
is at least a factor of 2 in run-time. Most of Simula's descendants pay
an even higher price. Accepting this overhead would imply giving up large
application areas to C, assembler, and Fortran. C++ was specifically designed
to preserve efficiency in this area. The apparent cost is recompilation time.

However, if you don't use these features, that is, if you don't declare
automatic or static variables of a type, if you don't have inline functions
that depend directly on the contents of the private part in a type, and if
you don't take the sizeof a type THEN the users of a type are insulated from
changes in the implementation of a type EXACTLY as in languages that always
use indirection in the access to class objects. For example, had main()
been written like this:

	int main()
	{
		X* p = new X;
		p->f();
	}

it would have been unaffected by the change to X's representation - and it
would of course have incurred the run-time cost.

My contention is that often run-time cost matters and often it doesn't.
C++ serves you in both cases. However, the current C++ tools do not
help you sufficiently. Curiously enough, I was building tools for finer
grain dependency analysis and tools for taking advantage of it 4 years
ago. It is not particularly hard, but the explosion of C++ use distracted
me. What you see here is not a design flaw in the C++ language but a
deficiency in the available support tools. This deficiency is finally
being remedied.

 > 2. Smalltalk-like languages are better tools for developing small
 > programs because of their massive built-in class libraries and their
 > more flexible (later, dynamic) binding which allows polymorphic types. 

Again I must disagree. Of course you can throw a small Smalltalk program
together to do many things that would be painful to build from scratch in
C++. However, the massive libraries and the wonderful program development
environment of Smalltalk are not something that C++ lacks because of some
inherent defect. Rather, Smalltalk is about 10 years older than C++ and
has had something like 100 times more effort and resources lavished on
its environment and libraries. I am in particular looking forward to
trying ParcPlace's Cynegy C++ program development environment. It has
the potential of bringing some of the Smalltalk expertise to bear on the
fairly well understood problems with C++'s (lack of) program development
tools/environment.

If you consider UNIX and MS-DOS to be C++ toolsets - imperfectly inherited
from C - C++ fares a bit better, but C++ can and will progress much further
in the programming environment and standard libraries areas. Where it comes
to using a traditional tool such as a data base system, a standard (grubby)
operating system interface, a Fortran engineering library, a C device driver,
etc. C++ has an edge. If your small program needs to run on a very small
machine, an unusual machine, or a mainframe, C++ has an edge. The ability
to coexist and ease of porting can be essential even for very small projects.

 > 3. Simula-like languages are better tools for medium size software
 > development because for these larger projects it is worth building
 > specialized class hierarchies for the particular system, yielding a
 > better match to conceptual structures and increased programmer and
 > run-time efficiency.

Here I was naturally sorely tempted to agree, but #3 also is an
oversimplification that could confuse the issues. It really also depends
on how you define ``medium''. If medium is defined as programs that take
more than one person more than one year to build, I agree, but here
of course the choice of language/system can radically affect the perceived
size of the project. If the problem is a good ``fit'' for a Smalltalk
system on a suitably sized workstation you can see spectacular benefits
from using Smalltalk. However, if that fit isn't there using Smalltalk
could turn into an exercise in horrendous contortions.

For all sizes of projects it is essential to choose a suitable tool for the
problem. There is no perfect tool.

 > Now for the leap...

 > 4. The Smalltalk-like languages will be the better tools for large
 > software development because item #1 above will kick in to make the
 > project more manageable.

Well, I disagreed with #1 so I could rest my case here.

 > Opinions?

There is absolutely no basis for this conjecture and the absence of really
large projects successfully developed and supported in a Smalltalk-like
language is a contraindication. After 16 years of Smalltalk variants and
offshoots we should not be conjecturing on this point. Smalltalk has a
string of spectacular successes to its credit, but as far as I'm aware
large scale systems development isn't among them. Nor was it mentioned
among the things Smalltalk was supposed to be especially good at.

The Smalltalk imitations, hybrids, and commercial offshoots have consistently
been far inferior to ``the real thing''. 

For larger projects C++'s strong static type checking provides essential
benefits in the integration phase by providing mechanisms for documenting
interfaces in a way that can be mechanically checked. For many-person projects
the ability to plug exclusively dynamically checked components together with
great ease and no (static) checking simply defers checking until after
integration, where nobody is able to comprehend the total system anyway.

This is the reason for all the incompatible plugs we have in the hardware
arena. You don't really want to be able to plug everything together. People
using electric shavers appreciate not being able to plug them into high
power outlets.

The value of a Smalltalk-like environment and superb debugging features is
greatly diminished when the system is operated by someone who does not
understand the total system, isn't allowed to change it on the fly, and
might not even be a programmer. Also, it still is (and I expect it will remain)
most unusual for a large system to fit on a (single) personal workstation.
The ST-style programming environments still appear to be primarily aimed
at serving a single user on a single system. Often a large software system
will have to serve a large user-community on a large (typically ugly and
often diverse) hardware base.

Also, run-time efficiency often matters when you start pressing against the
limits of the hardware (memory sizes, mainframe CPUs, network connections,
disk bandwidth). Here C++ comes into its own.

LARGE system development is typically a mess. The state of the art is poor.
There is much that can be done and I think that OOP (in its various incarnations)
has a vital role to play, but we have a long way to go yet. I am convinced,
however, that statically typed interfaces that can provide a base for
documentation and design methods allowing relatively large numbers of people
to cooperate are an essential part of any remedy for LARGE system development.

Anyway, I consider the belief that there is exactly one right way of doing
things, and especially the belief that there is exactly one right language
to do it in, an infantile disorder.

		``Anything works on small projects.''
			- James Coggins

Thanks, Jim, for posting these propositions. 

	- Bjarne Stroustrup

coggins@alanine.cs.unc.edu (Dr. James Coggins) (11/16/88)

In article <77300015@p.cs.uiuc.edu> johnson@p.cs.uiuc.edu writes:
>
>> > 2. Smalltalk-like languages are better tools for developing small
>> > programs because of their massive built-in class libraries and their
>> > more flexible (later, dynamic) binding which allows polymorphic types. 
>
>I understood Jim Coggins to be saying that he thought that C++ DID have an
>inherent defect that prevented the large class libraries from being created.
>I don't really understand what it is about C++ that he thinks causes this,
>nor am I convinced that C++ has any fatal errors in this regard, but that
>was his claim.

Hold it - nobody move!
My list of propositions was a summary of an hour-long discussion with
someone not on the net, primarily reflecting my colleague's main points. 
(I said something like that in the first line of my posting, BTW.)
I will see him again in a couple of weeks and I'm hoping you net.experts
can tell me how to demolish his argument.

We were having a C++ vs. Objective-C debate, and he was claiming that
the biggest hindrance to using C++ on large projects was that C++
inherently binds an implementation to your architecture. His claim was
that dynamic binding at run time allowed architecture and
implementation to be more independent of each other in practice,
therefore the Smalltalk model would be better for large-system
development than the Simula model.

Bjarne Stroustrup gave me lots of ammo to fire at his propositions and
his logic.  I'm still going through that long paper so I haven't
responded yet.

I am not aware of any inherent defect in C++ that would prevent C++
from being usable on large projects, but my colleague's logic seems
plausible so I posted his ideas to see who shot at them.  It drew out
the big guns, huh?

>My experience is that Smalltalk is better for prototyping a design, but
>that once it is finished it doesn't matter that much which language you use.

This is the conventional wisdom.  Since I like to challenge
conventional wisdom, let me offer this counterargument. If you
prototype using Smalltalk, you adopt the Smalltalk object hierarchy as
your architectural tools/metaphors/thought patterns. You also adopt
the underlying implementations, but they can be thrown out or redone
later (if you have strong personal and professional discipline - see
my .sig).  By adopting the Smalltalk environment, you might be
starting off your architectural design effort by putting on a
straitjacket and blindfold.  Anything works for small projects.  For
large projects the fit of mental models, architecture, and language
abstractions needs to be tighter in order to manage and control the
project.  I believe that this fit can be made tighter by designing
abstractions (classes) to fit the project, not twisting the mental
model to fit a predefined hierarchy.  This is the '80's version of the
top-down heuristic: 
          Let the implementation conform to the conception. 

My colleague thinks that dynamic binding results in an advantage in
abstraction that will make Smalltalk-like systems better tools for
large projects.  I didn't buy it in our discussion then, but I was
unable to marshal a good argument.  Now I know how to approach the issue.

Bjarne once said, "C++ is not Smalltalk." Hear! Hear!

>The reason that Smalltalk programming environments are so good is the same
>as why Lisp programming environments are so good.  Programmers can access
>contexts and classes from inside Smalltalk so they can easily write
>debuggers and browsers.  A similar system for C++ will require a lot of
>interaction with the operating system, which will be more complicated and
>less portable.  Thus, making a nice programming environment for C++ is
>a harder job than making one for Smalltalk, so it will take longer.

Hey, there's a general rule that applies: interpreters yield better
programming environments than the compile-edit-run cycle because they
are virtual machine simulators, and you can cause that virtual machine
to help you out in any way you can define. So who will be first out
with a C++ interpreter? 

>Smalltalk certainly has some faults when it comes to building large
>systems, primarily in its support for multiperson projects, but I think that
>Dr. Coggins was speaking of "Smalltalk-like" languages (whatever that
>means) and not Smalltalk itself.

Um, I plead guilty to necessary oversimplification.  In doing so, I
leave to respondents the selection of what more specific aspect they
wish to address.  Bjarne did exactly that, for which I'm grateful.

>My type system for Smalltalk is much more flexible than the one for C++.
>I am curious about Dr. Coggins's explanation of why C++ is not as good
>for reuse, and wonder how types fit in to it.

The question is whether the flexibility in the type system afforded by
late dynamic binding is ultimately an advantage (by virtue of its
additional support for abstracting architectures independent of
implementations) or ultimately a disadvantage (by delaying detection
of errors and in fact not detecting as errors some things that in fact
are errors). 
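For concreteness, a sketch of the C++ side of that trade-off (Shape and Square are illustrative names, not from any real library): the call is bound late, yet every message is checked statically:

```cpp
#include <cassert>

// Late binding with static checking: which area() runs is decided at
// run time through the virtual call, but the compiler verifies that
// every message sent through a Shape* is one that Shape declares.
class Shape {
public:
    virtual int area() const { return 0; }
    virtual ~Shape() {}
};

class Square : public Shape {
public:
    Square(int s) : side_(s) {}
    int area() const { return side_ * side_; }   // chosen at run time
private:
    int side_;
};
```

Writing p->perimeter() would be rejected at compile time, since Shape declares no such member; the case Smalltalk reports at run time as ``message not understood'' never survives to execution here.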

---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   Working code is EXTREMELY valuable.
UNC-Chapel Hill               That's why it's hard to throw out
Chapel Hill, NC 27514-3175    ditzy prototypes.
---------------------------------------------------------------------

coggins@coggins.cs.unc.edu (Dr. James Coggins) (11/20/88)

In article <77300016@p.cs.uiuc.edu> johnson@p.cs.uiuc.edu raises lots
of interesting points that are worth responding to in spite of the
resulting length of this posting.

>Claim:  (by a colleague of Dr. Coggins)
>  dynamic binding at run time allowed architecture and implementation
>  to be more independent of each other in practice, therefore the Smalltalk
>  model would be better for large-system development than the Simula model.
johnson@p.cs.uiuc.edu says...
>I agree with the first part of the statement, but C++ provides dynamic
>binding at run time, so the last half is a non sequitur.

I reply,
Well, the semantic sloppiness of fairly casual conversation makes this
a point of debate, but not a particularly enlightening one.  Is C++
dynamic or not?  Yes and no.  My colleague was claiming that the more
dynamic environment of Smalltalk impacts large-scale software
engineering. My summary at the end (I'll *** it below) perhaps
expresses the real issue more clearly.

I wrote...
>>If you prototype using Smalltalk, you adopt the Smalltalk object hierarchy
>>as your architectural tools/metaphors/thought patterns. You also adopt
>>the underlying implementations, but they can be thrown out or redone
>>later.  By adopting the Smalltalk environment, you might be
>>starting off your architectural design effort by putting on a
>>straitjacket and blindfold.  Anything works for small projects.  For
>>large projects the fit of mental models, architecture, and language
>>abstractions needs to be tighter in order to manage and control the
>>project.  I believe that this fit can be made tighter by designing
>>abstractions (classes) to fit the project, not twisting the mental
>>model to fit a predefined hierarchy.  This is the '80's version of the
>>top-down heuristic: 
>>          Let the implementation conform to the conception. 
>
johnson@p.cs.uiuc.edu commented...
>There seem to be several misunderstandings of Smalltalk here.  Smalltalk
>certainly does not force anyone to use existing classes.  People use them
>because they are well designed. 

I (the establishment iconoclast) respond...
That's a correct Party Line response that is meaningless in practice.
This is exactly the kind of conventional wisdom that should be
examined more carefully.  Each of these classes in the Smalltalk
architecture comes with an implementation bound up with it. Part of
the price you pay for using Smalltalk is the tacit adoption of those
implementations along with the nice architecture.

johnson@p.cs.uiuc.edu commented...
>It seems to me that you are arguing against reuse. [Who, me?] I find that hard to
>believe. [Me too.] It is very hard to write reusable code, and trying to reuse
>code that is not reusable is no fun.  However, it is possible for code
>to be reusable, and Smalltalk is an example.  Smalltalk programmers reuse
>code because that is the easiest way to develop a high-quality product.

My argument is not against reuse but against the proposition that the
mechanisms we currently have available (even in Smalltalk) are
sufficient to support any meaningful code reuse - enough to make a
lead bullet if not a silver one [see Brooks' IEEE Computer cover
article, April 1987].  Instead, the existing mechanisms
require the reuser to adopt conventions and implementation assumptions
that may or may not conform to the reuser's conception of the task at
hand.  We have basically no mechanism other than complete rewrite for
exchanging implementations within an architecture. (Unless you want to
try to bring in "executable specifications", perhaps.)  To enter
Silver Bullet country we are going to require mechanisms that are
foreshadowed by OOP but that allow a yet-higher level of abstraction.

[and I have no axe of my own to grind - this is not my principal research
area]

>I claim that no good
>designer is going to find a class hierarchy a straitjacket, because if
>it is not working then it will be changed.

I am a good designer (only a fair implementer, though), and I found the 
Smalltalk hierarchy restricting when I needed to implement large objects
like images and 3-D graphics models.  I wasted time trying to fit what
I needed into what Smalltalk provided. 

This may shock some people, but here it is...
Smalltalk is NOT the Ultimate Software Engineering Environment!

>In a couple of years C++ will have just as large a class library as
>Smalltalk, maybe larger.  Lots of companies are already developing their
>own.  Nobody will force you to use them, but the high-quality libraries
>will have lots of users.

You must be at a university, too. A more hard-nosed business attitude
would hold back on the enthusiasm until mechanisms appear for USING
what somebody claims to be "good" classes.  The C++ folks are working
on the problem (and good luck to them!).

>There should be much, much, much more reuse in software systems.  We don't
>keep inventing new abstractions for arithmetic, why should we be doing it
>for everything else?  

Because there is no "implementation" for arithmetic, of course!
Computer stuff is bound to implementations requiring engineering
decisions concerning tradeoffs of time/space, compilation
effort/run-time support effort, etc.

****************************************************************
>>The question is whether the flexibility in the type system afforded by
>>late dynamic binding is ultimately an advantage (by virtue of its
>>additional support for abstracting architectures independent of
>>implementations) or ultimately a disadvantage (by delaying detection
>>of errors and in fact not detecting as errors some things that in fact
>>are errors). 
****************************************************************
>One of the purposes of a type system is to detect errors at compile-time.
>A type system for an object-oriented language, for example, should
>prevent any "message not understood" errors at run-time.  I don't see
>why late binding itself prevents the detection of errors.  As I said
>earlier, C++ virtual functions provide late binding, and the type system
>for C++ ensures that an object understands the messages sent to it.

Smalltalk people tend to think that this characterization of
late-binding is itself a straitjacket!

---------------------------------------------------------------------
Dr. James M. Coggins          coggins@cs.unc.edu
Computer Science Department   Old: Algorithms + Data Structures = Programs
UNC-Chapel Hill               New: Objects + Objects = Objects
Chapel Hill, NC 27514-3175    
---------------------------------------------------------------------