[comp.lang.smalltalk] Teaching object oriented programming

johnson@p.cs.uiuc.edu (08/03/89)

I'd like to thank Edward Berard for the good reference list.  There
were several Ada related articles that I haven't seen.  

He writes:
>There seem to be three classes of example problems:

Most of the programming language books have examples.  However,
the purpose is often to describe the language instead of the
programming style.  Stroustrup's book is a good example of this
phenomenon--you can't learn object-oriented design from that
book.  On the other hand, Smalltalk books often make an attempt
to teach design.  The language is so small that they would run
out of things to talk about if they didn't.  They talk about
both the design of the standard class libraries and the design
of new applications.  However, the examples are all pretty small,
and object-oriented programming becomes most valuable on larger
projects.

>If you find any object-oriented examples of any size, you will find
>that they often address such issues as "object-oriented domain
>analysis," "object-oriented requirements analysis," and
>"object-oriented design." Unfortunately you will find precious few
>references to "object-oriented development in the large" (OODIL) --
>but there are a few.

As far as I can tell, there aren't any problems in OODIL that are
unique to OO, they are all caused by the DIL.

One feature of object-oriented technology is that it squashes together
the various phases of the design cycle.  OOP is great for prototyping,
so usually systems are built during the requirements analysis phase to
help figure out what the user needs.  Not surprisingly, a certain amount
of design information is discovered here as well, and sometimes a few
components are developed that are reused later.  At the other end of
the life-cycle, I find that code is not reusable until it has been
reused, and this implies that you have to redesign software after it
has been used in a couple of projects.

On the other hand, OOP provides a consistent world-view that can carry
you through the whole life-cycle.  Requirements analysis is determining
the objects from the user's point of view and figuring out what operations
will take place on them.  Design is decomposing these objects into smaller
objects, delegating responsibility for operations to various objects,
determining the class hierarchy, and discovering new classes that are
part of the solution domain, not the problem domain.  Implementation
is writing code for all the operations and defining data structures
for the classes.  Invariably new classes are discovered and the design
is revised during implementation, but that's just because top-down
development is impossible to carry off perfectly.

>Most of the work which has been done in the area of object-oriented
>life-cycle issues, outside of object-oriented programming, has been
>accomplished within the Ada community. 

In general, I don't like this work.  Speaking from the point of view
of a Smalltalk programmer, I think that they are really talking about
data abstraction and NOT object-oriented programming.  Important
issues that they miss are building families of components with the
same interface, using abstract classes as templates for new
components, and polymorphism.
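
To make the distinction concrete, here is a minimal sketch in Python
(not Smalltalk) of a family of components that share one interface.
The class names Collection, Bag and UniqueSet are hypothetical and are
not taken from any of the papers under discussion.  The abstract class
acts as a template: add_all and includes are written once in terms of
the abstract operations, and each new component only fills in the rest.

```python
from abc import ABC, abstractmethod


class Collection(ABC):
    """Abstract class: the shared interface and a template for new components."""

    @abstractmethod
    def add(self, item):
        """Store one item; each subclass decides how."""

    @abstractmethod
    def items(self):
        """Return the stored items as a list."""

    # Written once, in terms of the abstract operations above.
    def add_all(self, others):
        for item in others:
            self.add(item)
        return self

    def includes(self, item):
        return any(x == item for x in self.items())


class Bag(Collection):
    """Keeps duplicates."""

    def __init__(self):
        self._items = []

    def add(self, item):
        self._items.append(item)

    def items(self):
        return list(self._items)


class UniqueSet(Collection):
    """Drops duplicates."""

    def __init__(self):
        self._items = []

    def add(self, item):
        if item not in self._items:
            self._items.append(item)

    def items(self):
        return list(self._items)


def count_after_adding(collection, values):
    # Polymorphic client: works with any member of the family.
    return len(collection.add_all(values).items())
```

Data abstraction alone gives you Bag or UniqueSet in isolation.  The
client function, which never names a concrete class, is what depends
on the family and on polymorphism.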

On the other hand, I haven't read all these papers, so perhaps I am
missing something.  I'll read them.

By the way, I encourage people to take a look at Ed's object and
class specifications.  He sent me his notes and I found them worthwhile.

I agree that the details of execution of Smalltalk are unimportant.
However, the general world-view of objects communicating by sending
messages is important, and close to what Smalltalk actually does.
This notion is important for understanding polymorphism and inheritance.

Ralph Johnson

johnson@p.cs.uiuc.edu (08/07/89)

I said:
>> As far as I can tell, there aren't any problems in OODIL that are
>> unique to OO, they are all caused by the DIL.
(OODIL being object-oriented development in the large)

Ed Berard then gave a nice description of how object-oriented programming
makes large system design easier, and discussed particular design
techniques that can be used to provide better modularity in large
systems, which he calls subsystems and systems of objects.  I agree
completely.  In fact, I teach them in MY course, too, Ed!  The point
I was trying to make is that OO provides solutions to the problems
of DIL, it does not introduce any.  (Other than the problem of learning
to use the solutions.)

Actually, what I call subsystems is probably more like your systems
of objects.  One of the problems with object-oriented technology is
that there is no standardization of names.

I teach the notions of subsystems and frameworks.  A subsystem
is a collection of objects with a narrow interface to the rest
of the system.  For example, a file system would be a subsystem
within an operating system.  A user interface could be a subsystem
within an application (and then again, it might not be a subsystem).
A framework, on the other hand, is a design expressed as a collection
of abstract classes and a description of how objects in those classes
are interconnected.  A framework tells a programmer how to build an
application or a subsystem for a particular purpose.
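
As a rough illustration of this definition of a framework, the Python
sketch below uses two abstract classes with hypothetical names (Step
and Application).  The framework fixes how the objects are connected
(an application owns an ordered list of steps and passes data through
them) and leaves the concrete classes to the programmer.

```python
from abc import ABC, abstractmethod


class Step(ABC):
    """Abstract class: one stage of processing."""

    @abstractmethod
    def run(self, data):
        """Transform data and return the result."""


class Application(ABC):
    """Abstract class that fixes how Step objects are interconnected."""

    def __init__(self):
        self.steps = self.build_steps()

    @abstractmethod
    def build_steps(self):
        """Return the ordered list of Step objects for this application."""

    def execute(self, data):
        # The framework, not the subclass, decides the flow of control.
        for step in self.steps:
            data = step.run(data)
        return data


# An application built by filling in the framework.
class Strip(Step):
    def run(self, data):
        return data.strip()


class Upper(Step):
    def run(self, data):
        return data.upper()


class ShoutingApp(Application):
    def build_steps(self):
        return [Strip(), Upper()]
```

Here ShoutingApp().execute("  hi ") strips the text and then
upper-cases it.  The author of ShoutingApp wrote no control flow; the
framework supplied it.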

A language for building large systems needs a mechanism for limiting
the visibility of names.  There is nothing in Smalltalk to implement
exporting and importing in the way Ed describes.  Some people think
that classes provide sufficient modularity, but they are wrong.
A common problem in Smalltalk is for a subsystem (using my terminology)
to give a class the same name as a class in another subsystem.  Because
class names are global, the other subsystem breaks.  This particular problem
can be solved relatively easily by introducing modules into Smalltalk
and defining classes within modules, not within the global dictionary.
A student in one of my Smalltalk classes built a prototype of this in
a few weeks (he got the browser to work, but not the change management
system).  My guess is that a good Smalltalk programmer could build a
bullet-proof version in a couple of weeks.
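
The name clash and the module fix can be modelled in a few lines of
Python.  This is only a toy model under stated assumptions: the
dictionary global_classes stands in for the global dictionary of class
names, and the Module class is a hypothetical illustration, not the
student's prototype.

```python
# Toy model of one global dictionary of class names.
global_classes = {}


def define_global(name, definition):
    # A later definition silently replaces an earlier one.
    global_classes[name] = definition


define_global("Node", "graphics subsystem Node")
define_global("Node", "parser subsystem Node")  # the graphics Node is gone


class Module:
    """Hypothetical module: a private dictionary of class names."""

    def __init__(self, name):
        self.name = name
        self.classes = {}

    def define(self, name, definition):
        self.classes[name] = definition

    def lookup(self, name):
        return self.classes[name]


graphics = Module("Graphics")
parser = Module("Parser")
graphics.define("Node", "graphics subsystem Node")
parser.define("Node", "parser subsystem Node")  # no clash: separate name spaces
```

With modules, each subsystem looks up Node in its own dictionary, so
both definitions survive.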

However, type checking is also necessary if you really want to enforce
interfaces.  Type systems for Smalltalk are a major interest of mine.
I am building an optimizing compiler for Smalltalk that will produce code
that is as efficient as that for any other language.  It uses a type 
system to figure out where it can perform optimizations.  The purpose
of the type system is to describe what Smalltalk programs do, not to
force a particular design style on programmers.  Thus, it should be
able to type-check nearly any correct program, though in practice it
only type-checks the kind of programs that Smalltalk programmers write,
such as the standard image.  We have a type inference system, so that
a program in untyped Smalltalk can be filed in with little effort.
(Instance variables, class variables, and global variables still must
have their types declared.)  Justin Graver has just finished a PhD
thesis on the topic.

>You might also want to give some thought as to how object-oriented
>configuration management might be different from traditional
>configuration management.

I not only have thought about the subject, I know how it should be done!
Traditional configuration management systems do two things: version
management, and selecting components from a database and processing them
to form a program.  For example, people often use RCS (or SCCS) and "make"
for configuration management.  In my opinion, these two functions should
NOT be related at all.  The purpose of version management is just to get
old versions of the system so that you can respond to bug reports properly.
However, it gets used for configuration management because traditional
programmers cannot use inheritance to describe variants of a component.
Instead, they use different versions.

The goal of object-oriented design is a large library of reusable
components that can be used to quickly build applications.  The applications
are built by plugging together existing components.  A configuration is a
description of which components are used and how they are plugged together.
A version is a complete copy of the component library at some particular
time.  
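
A minimal Python sketch of this idea follows.  All names in it
(Scanner, CommaScanner, Counter, LIBRARY, build) are hypothetical.
The variant component is a subclass rather than another version of the
same file, and a configuration is plain data that says which
components to use and how to plug them together.

```python
class Scanner:
    """Splits text into whitespace-separated tokens."""

    def tokens(self, text):
        return text.split()


class CommaScanner(Scanner):
    """A variant of Scanner, described by inheritance, not by a separate version."""

    def tokens(self, text):
        return [t.strip() for t in text.split(",")]


class Counter:
    """Counts tokens using whichever scanner it is plugged into."""

    def __init__(self, scanner):
        self.scanner = scanner

    def count(self, text):
        return len(self.scanner.tokens(text))


# The component library.
LIBRARY = {"Scanner": Scanner, "CommaScanner": CommaScanner, "Counter": Counter}

# Two configurations built from the same library.
WORD_COUNT = {"root": "Counter", "scanner": "Scanner"}
FIELD_COUNT = {"root": "Counter", "scanner": "CommaScanner"}


def build(config, library=LIBRARY):
    # Instantiate the named components and connect them.
    scanner = library[config["scanner"]]()
    return library[config["root"]](scanner)
```

Both applications live side by side in one copy of the library.
Version management is needed only to recover what the whole library
looked like at an earlier date.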

The component library changes in several ways.  One of the most common
is that new classes can get added to it.  Of course, this doesn't affect
existing configurations at all.  Another is that new methods can be added
to existing classes.  This can have an effect on existing configurations
only if the new methods override an inherited method.  Usually they don't.
Sometimes bugs are fixed by modifying some methods.  This requires all
the existing configurations to be retested.  Sometimes class hierarchies
are reorganized to make them more reusable.  This is currently a big
problem, but I have some ideas that I hope will solve it.  I wrote a
paper in the June/July 1988 issue of the Journal of Object-Oriented
Programming (with Brian Foote) on why and how class hierarchies are
reorganized.
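
The rule about added methods can be checked mechanically.  The sketch
below, in Python, uses hypothetical names (Shape, Square,
overrides_inherited).  It asks whether a method name is already
defined by some superclass, which is the only case where adding it to
the subclass can change what existing clients see.

```python
class Shape:
    def area(self):
        return 0

    def describe(self):
        return f"area={self.area()}"


class Square(Shape):
    def __init__(self, side):
        self.side = side

    def area(self):
        return self.side ** 2


def overrides_inherited(cls, method_name):
    """True if defining method_name on cls would shadow a superclass method."""
    # cls.__mro__[0] is cls itself, so skip it and look only at ancestors.
    return any(method_name in vars(base) for base in cls.__mro__[1:])
```

Adding a method named perimeter to Square is safe for existing
configurations, because no ancestor defines it.  Adding describe is
not, because every client that relied on the inherited behaviour would
now get the new one.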

>> At the other end of
>> the life-cycle, I find that code is not reusable until it has been
>> reused, and this implies that you have to redesign software after it
>> has been used in a couple of projects.

>If you study software reusability technology, you find that you do not
>have to approach software reusability in a "trial and error" manner.
>You really can design software so that it is reusable _without_
>_modifications_ later. There are techniques, rules, methods, and
>guidelines which can be used at all levels of abstraction to ensure
>and enhance software reusability. (I incorporate many of these items
>into the courses on object-oriented technologies that I teach.)

I strongly disagree with Ed on this point.  I have studied the literature
on software reusability, and I don't think much of it.  I have been a part
of several medium sized projects, all of which have reusability as one
of their goals.  The Typed Smalltalk compiler project is three years
old, currently has half a dozen programmers working on it and has had
over a dozen over the years.  The Choices operating system (in C++) has
a dozen people working on it at present.  These are not large projects
from an industrial point of view, but are from an academic point of view.

We use good design techniques, and the more experienced people do a fairly
good job of making reusable designs.  However, there are ALWAYS surprises.  
It is not possible to predict all the ways that software will need to be 
reused. However, experience shows that experience will show the range of 
uses of software.  Thus, it is relatively straightforward to produce
reusable software if you expect that it will be the RESULT of the project,
not something that gets produced in the first phase.

> It is not the purpose of OORA to identify _all_ of the code
>  software objects and classes in the system. It is perfectly
>  normal, healthy, and _expected_ to uncover additional
>  objects and classes during an OOD process.

I distinguish between problem domain objects and solution domain
objects.  The problem domain objects are the ones that are identified
during the analysis and are the ones that the users think about.
As Ed says, these often include parts of the environment that do not
need to be represented as classes in the final solution.  The solution
domain objects are the ones that turn up during design and implementation.
Of course, sometimes you discover problem domain objects later than you
would like, since the users often don't tell you everything you need
to know.  That's life!

Booch's book "Software Components With Ada" is pretty good.
I read it shortly after it came out, and was impressed by how well
the standard Smalltalk classes were implemented in Ada.  I was also
impressed by the large number of different implementations that were
needed because of questions about memory management and parallelism.
It has always seemed to me that those are the sort of issues that a
compiler should address, and that programmers should usually be able
to ignore.  Of course, my opinion is probably influenced by the fact
that I'm writing a compiler.

The set of components described by Booch certainly reused a small
number of interfaces extensively.  However, it seemed to me that the main
reason for this was to make the set easier to understand, because
the number of names you had to learn was smaller.  This is an important
reason to reuse interfaces, but not THE most important reason.
The most important reason is that it makes it possible to provide
a set of components that you can connect together to build an application.
For example, a user interface framework like InterViews lets you build
an interface by connecting buttons, text viewers, scrollers, etc. together.
Unless you need customized graphics there is unlikely to be any need to
define any new components.  This is impossible unless you are able to
connect any component to any other component, i.e. unless there are
shared interfaces.  Moreover, this also requires a kind of polymorphism
that cannot be simulated with generics.
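
Here is a small Python sketch of what "connect any component to any
other component" means.  The classes are hypothetical and are not the
InterViews API.  Every component answers the same render message, so
containers can hold any mixture of components, including other
containers.

```python
from abc import ABC, abstractmethod


class Component(ABC):
    """The shared interface: every component can render itself."""

    @abstractmethod
    def render(self):
        """Return a text rendering of this component."""


class Button(Component):
    def __init__(self, label):
        self.label = label

    def render(self):
        return f"[{self.label}]"


class Text(Component):
    def __init__(self, body):
        self.body = body

    def render(self):
        return self.body


class Scroller(Component):
    """Wraps any Component, including another Scroller or a Row."""

    def __init__(self, inner):
        self.inner = inner

    def render(self):
        return f"<{self.inner.render()}>"


class Row(Component):
    """Holds a mixed list of Components."""

    def __init__(self, *children):
        self.children = children

    def render(self):
        return " ".join(child.render() for child in self.children)


ui = Row(Scroller(Text("hello")), Button("OK"))
```

The children of a Row have different classes and are chosen when the
interface is assembled.  That run-time mixture is the kind of
polymorphism a generic container fixed to one element type does not
provide.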

Ralph Johnson -- University of Illinois at Urbana-Champaign