[comp.object] The essence of objects...

pkr@media01.UUCP (Peter Kriens) (07/10/90)

The essence of objects...

4 years ago we met the object oriented paradigm by
using Smalltalk. The shock was lessened by the fact that
parts of the ideas behind object oriented "thinking" were already
in use in our previous software developments. Many times we felt
like "coming-home". Finally a way of thinking that matched
our own experiences. 

These years of using Smalltalk and later CO2 (something
like Objective-C) taught us many different things. We
made many mistakes and we have learned more. But we
still feel very strongly about the advantages.

But I still have problems explaining oops to people who
have no prior experience. When I start talking about
inheritance, dynamic binding they start to gaze and
we both feel lost. Why is it so difficult to explain
something which is so advantageous to use and which feels 
so right? How do I explain the essence of objects?

When I think about it further it seems that I have never 
really seen a description about oops that touches the
nucleus. There is a lot of talk about reuse, encapsulation,
dynamic binding versus static binding, multiple 
inheritance versus single inheritance and many more 
"symptoms". But they seem to be the tools and not the 
essence.

There is one element that I feel very strongly is
close to the nucleus of oops. My problem is that I find 
it very hard to make myself clear. I would like to try
it here and maybe some people can help to come to a
better formulation, or prove me wrong.

	The essence of oops is to specify 
	a functionality only once.

Let me try to make myself clear. If I design a hashing
collection I would like to write the locate function
in such a way that it will work for all types of objects. If
I design a dragging routine I would like it to work on
all objects that could have a visual representation.
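To make the hashing example concrete, here is a minimal sketch in Python (a language not used elsewhere in this thread). `HashBag`, `locate`, `includes` and `Point` are names invented for illustration. The only point is that `locate` is written once and relies on nothing but each object's own hash and equality.

```python
class HashBag:
    """Fixed-size hash collection with linear probing (illustrative only)."""

    def __init__(self, size=8):
        self.slots = [None] * size

    def locate(self, item):
        # Written once: depends only on the item's own hash and equality.
        i = hash(item) % len(self.slots)
        for _ in range(len(self.slots)):
            if self.slots[i] is None or self.slots[i] == item:
                return i
            i = (i + 1) % len(self.slots)  # try the next slot
        raise OverflowError("collection is full")

    def add(self, item):
        self.slots[self.locate(item)] = item

    def includes(self, item):
        try:
            return self.slots[self.locate(item)] is not None
        except OverflowError:
            return False  # table full and item not found


class Point:
    """A user-defined type that supplies its own hash and equality."""

    def __init__(self, x, y):
        self.x, self.y = x, y

    def __eq__(self, other):
        return isinstance(other, Point) and (self.x, self.y) == (other.x, other.y)

    def __hash__(self):
        return hash((self.x, self.y))
```

Integers, strings and `Point` instances can all go into the same `HashBag` without `locate` being touched.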

If this is the essence of oops, then I feel that we should
focus on the specification of that functionality. How can
we find those "functions" that fit in a world where they
are only described once?

It seems that those functions need to be very "narrow" to
allow infinite combinations. It seems almost like we
need some sort of normalisation technique to find those
functions. How could CASE tools help us here?

Oops seems to give me exactly this behaviour. The inheritance
allows me to reuse specifications higher up in the
hierarchy and polymorphism gives me the handle to
specify a hash function which is defined in the class
itself.

I would really like to start a discussion about this
subject. So please post back any comments.

Peter Kriens

davidm@uunet.UU.NET (David S. Masterson) (07/11/90)

In article <1280@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:

   The essence of objects...

...is that they are objects.  Sorry, I couldn't resist.  However, this
statement may actually be the focus of the rest of the article.

   But I still have problems explaining oops to people who
   have no prior experience. When I start talking about
   inheritance, dynamic binding they start to gaze and
   we both feel lost. Why is it so difficult to explain
   something which is so advantageous to use and which feels 
   so right? How do I explain the essence of objects?

Object orientation is "supposedly" an expression of problem modelling that is
more *natural* to the way people think.  I make no statement as to whether
this is true or false.  However, people's thought processes are their own, so
perhaps the best way of explaining OOPS is to allow people to explain it to
themselves.  Lead them through a problem and see what manner they use to
arrive at a solution, then equate that process to OOPS philosophy.

   When I think about it further it seems that I have never 
   really seen a description about oops that touches the
   nucleus. There is a lot of talk about reuse, encapsulation,
   dynamic binding versus static binding, multiple 
   inheritance versus single inheritance and many more 
   "symptoms". But they seem to be the tools and not the 
   essence.

True.  All the terms are for technical specialists to worry about.  An
"essence" may not be achievable because the "essence" of the problem is
defined in the particular problem being solved at the moment.

   If this is the essence of oops, then I feel that we should
   focus on the specification of that functionality. How can
   we find those "functions" that fit in a world where they
   are only described once. 

Is your concern with "functionality" any more the essence of the problem than
someone else's concern with the "information" involved with the problem?  One
is functional decomposition, the other is data definition, but is either the
essence of object orientation?  (Now for the tricky question) Are both?

--
===================================================================
David Masterson					Consilium, Inc.
uunet!cimshop!davidm				Mt. View, CA  94043
===================================================================
"If someone thinks they know what I said, then I didn't say it!"

timm@runxtsa.runx.oz.au (Tim Menzies) (07/12/90)

In article <1280@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:
>The essence of objects...
>
>	The essence of oops is to specify 
>	a functionality only once.

yes, yes, yes. oop is a way of normalising procedural knowledge. database
designers realised that we had to normalise data decades ago. now, we
normalise code as well.

pkr@media01.UUCP (Peter Kriens) (07/13/90)

>In article <1280@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:
>>The essence of objects...
>>
>>	The essence of oops is to specify 
>>	a functionality only once.

Tim Menzies answers
> yes, yes, yes. oop is a way of normalising procedural knowledge. database
> designers realised that we had to normalise data decades ago. now, we
> normalise code as well.

This answer seems to imply that we now normalize code. I think we should
normalize our code, one way or the other. But it seems that the
database designers have some formal logic, and at least some understanding
of what it means to normalize data. I wouldn't know where to start
to normalize code.

Is there anyone who does? Are there formal techniques like normalizing
which can be adapted to software design?

I would love to hear about those.

Peter Kriens
Mediasystemen

cdurrett@cup.portal.com (chuck m durrett) (07/15/90)

>Tim Menzies answers
>> yes, yes, yes. oop is a way of normalising procedural knowledge. database
>> designers realised that we had to normalise data decades ago. now, we
>> normalise code as well.
>
>This answer seems to imply that we now normalize code. I think we should
>normalize our code, one way or the other. But it seems that the
>database designers have some formal logic, and at least some understanding
>of what it means to normalize data. I wouldn't know where to start
>to normalize code.
 
>Is there anyone who does? Are there formal techniques like normalizing
>which can be adapted to software design?
>
>I would love to hear about those.
>
>Peter Kriens
>Mediasystemen
 
An object is the fusion of function and state.  When teaching people
about OOP, we use the terms "code" and "data" to help get some of
the ideas across.  Once across, these terms should be dropped.
 
An object is a compound, not a mixture.  E.g., I can put sodium and
chlorine on my food but it makes a lot of practical difference
whether they are together as a compound (salt) or a mixture
(burning poison).
 
An object is an instance of a class(ification) or abstraction.
It exhibits behavior.  We shouldn't let our understanding of HOW
it exhibits that behavior interfere with WHAT that behavior is.
 
In traditional programming environments, we manipulate descriptions
about the world (code, data, etc.).  In OO we are working with the
world directly.
 
Chuck Durrett

Chris.Holt@newcastle.ac.uk (Chris Holt) (07/17/90)

In article <1288@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:
>...
>Tim Menzies answers
>>            ... oop is a way of normalising procedural knowledge. database
>> designers realised that we had to normalise data decades ago. now, we
>> normalise code as well.
>
>This answer seems to imply that we now normalize code. I think we should
>normalize our code, one way or the other. But it seems that the
>database designers have some formal logic, and at least some understanding
>of what it means to normalize data. I wouldn't know where to start
>to normalize code.

Relational databases have a mathematical model of values, and operations
that can be used to combine them, and generate new structures from
old ones.  Denotational semantics and process algebras have a
mathematical model of functions and state transformers, and define
operations that can be used to combine them and generate new
structures from old ones.

Objects comprise in themselves values and operations that can be
performed on them.  They have to be modelled, and an algebra has
to be developed that incorporates their combination (e.g. multiple
inheritance) and the generation of new objects from old ones.  Is
this the kind of thing you mean?
-----------------------------------------------------------------------------
 Chris.Holt@newcastle.ac.uk      Computing Lab, U of Newcastle upon Tyne, UK
-----------------------------------------------------------------------------
 "Algebraic expression is something that is to be surpassed..."

pcg@cs.aber.ac.uk (Piercarlo Grandi) (07/23/90)

In article <1280@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:

   But I still have problems explaining oops to people who have no
   prior experience. When I start talking about inheritance, dynamic
   binding they start to gaze and we both feel lost. Why is it so
   difficult to explain something which is so advantageous to use and
   which feels so right? How do I explain the essence of objects?

I will give you some of my ideas, with a premise: all we know of OO
programming is OO languages really.

The first idea is that when structuring a program we have an implicit
matrix that maps (function,datatype) -> implementation. OO programming
is that style of programming in which you list the matrix by datatype
and not by function or any other order, i.e. you assume that the
diverse implementations that apply to the same datatype are
fundamentally more related than the diverse implementations of an
operator applied to different datatypes.
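A small sketch of this matrix idea, in Python (not a language from this thread). The shapes, the dictionary layout and the names `by_function` and `by_datatype` are assumptions made for illustration. The same four implementations are listed once per function (the procedural view) and once per datatype (the OO view).

```python
import math

# The implicit matrix: (function, datatype) -> implementation.
matrix = {
    ("area", "circle"): lambda s: math.pi * s["r"] ** 2,
    ("area", "square"): lambda s: s["side"] ** 2,
    ("perimeter", "circle"): lambda s: 2 * math.pi * s["r"],
    ("perimeter", "square"): lambda s: 4 * s["side"],
}


def by_function(matrix):
    """Procedural listing: one entry per function, one case per datatype."""
    listing = {}
    for (function, datatype), impl in matrix.items():
        listing.setdefault(function, {})[datatype] = impl
    return listing


def by_datatype(matrix):
    """OO listing: one entry per datatype, holding all of its functions."""
    listing = {}
    for (function, datatype), impl in matrix.items():
        listing.setdefault(datatype, {})[function] = impl
    return listing
```

Both listings contain exactly the same implementations; only the grouping differs, which is the choice the paragraph above describes.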

The second, related, idea is that the principal tool of program design
is decomposition, and that OO programming is that style of programming
where you decompose not by function (FORTRAN), not by datatype and
function (PASCAL), not by module (ADA), but by abstract data type.

	The first idea comes from some paper by Fabry on capability
	systems, the second from some article by Stroustrup on
	decomposition.

As a general comment, why are these two things important?  Not in
themselves, but because we believe that selecting the proper program
description and decomposition paradigms will enhance reuse of interface,
semantics, implementation.

Too bad that many OO practitioners have lost sight of this ultimate
goal, and use fuzzily defined concepts like polymorphism or dynamic
binding or inheritance (which are merely devices) without reference to
such goal.


More specifically I think that the two points of view given above on why
we want to do OO programming reveal both strengths and weaknesses of the
idea.

For example, it is usually true that listing the
(function,datatype) matrix by datatype is a good choice; concatenation
of matrices and concatenation of lists require quite different
implementations, while concatenation and cloning of lists have mostly
similar implementations (but this is because the navigational part is
usually small).  It is
also true that decomposition by ADT is usually, by the same token, more
appropriate than the other forms, which either miss out one of the
dimensions or are too unstructured.

On the weakness side, it is apparent that programs are not just
collections of ADTs, but also of "glue" logic, and that the real
semantics of the program belong to this "glue" logic, which usually is
a graph navigator (interpreter). Where the "glue" logic prevails over
the data manipulation, functional decomposition may be more
appropriate, for example.



My own view of programming is that programs cannot be decomposed at
every level with a single paradigm, like OO purports to do, but that
you need *two* paradigms, that alternate, where you have a navigational
sublayer, a driver that explores some overall data structure and calls
upon one of a library of ADTs as the next layer to operate on part of
the structure, or upon a lower level navigational component if the
structure is many layered.
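One possible reading of this layering, sketched in Python (not a language from this thread): `walk` is the navigational driver, and `Counter` and `Summer` stand in for a library of ADTs. All three names, and the `accept` method, are hypothetical.

```python
class Counter:
    """ADT layer: counts the items it is given."""

    def __init__(self):
        self.count = 0

    def accept(self, item):
        self.count += 1


class Summer:
    """ADT layer: accumulates a running total."""

    def __init__(self):
        self.total = 0

    def accept(self, item):
        self.total += item


def walk(structure, adt):
    """Navigational layer: explores a nested list, hands each leaf to an ADT."""
    for part in structure:
        if isinstance(part, list):
            walk(part, adt)   # lower-level navigation for a deeper layer
        else:
            adt.accept(part)  # data manipulation is left to the ADT
    return adt
```

The driver knows the shape of the structure and nothing about what is done with the leaves; the ADTs know the reverse.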

It is especially difficult to use this model in OO programming, because
for a number of reasons it is difficult in most OO schemes to express
control abstraction and semantics that straddle ADT boundaries, i.e.
the "glue" navigational code or driver.

In other words, IMNHO it is the case that OO is nice to describe
*libraries* of ADTs, but it is less appropriate to describe *programs*,
especially if they have a strong navigational or control component.

In practice only continuation passing (and maybe polymorphism) provide
the required support for control abstraction. Too bad that they are
arcane arts outside the advanced fringes of the Lisp community.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

pcg@cs.aber.ac.uk (Piercarlo Grandi) (07/23/90)

In article <1990Jul17.014837.2370@newcastle.ac.uk>
Chris.Holt@newcastle.ac.uk (Chris Holt) writes:

   In article <1288@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:
   >...
   >Tim Menzies answers
   >>            ... oop is a way of normalising procedural knowledge. database
   >> designers realised that we had to normalise data decades ago. now, we
   >> normalise code as well.
   >
   >This answer seems to imply that we now normalize code. I think we should
   >normalize our code, one way or the other. But it seems that the
   >database designers have some formal logic, and at least some understanding
   >of what it means to normalize data. I wouldn't know where to start
   >to normalize code.

   Relational databases have a mathematical model of values, and operations
   that can be used to combine them, and generate new structures from
   old ones.  Denotational semantics and process algebras have a
   mathematical model of functions and state transformers, and define
   operations that can be used to combine them and generate new
   structures from old ones.

Yes, but what we lack is some rule or set of guidelines that allow us to
decompose "optimally" this mathematical model, e.g. to enhance reuse at
various levels. Relational normalization theory gives us some fairly
specific guidelines on how to look for dependencies and resolve them, OO
programming some pretty generic ones.

In another article I argue that while for many types of programs the
guidelines implicit in OO programming are fairly appropriate, they fall
short of being complete in general, because for some classes of programs
the best "normalization" is not achieved by eliminating redundancy on
the datatype dimension, but on the control logic one (or both, of course).

Let me repeat here that I think one should apply control and data
decomposition in alternating layers, not data (or control) decomposition
alone throughout.
--
Piercarlo "Peter" Grandi           | ARPA: pcg%cs.aber.ac.uk@nsfnet-relay.ac.uk
Dept of CS, UCW Aberystwyth        | UUCP: ...!mcsun!ukc!aber-cs!pcg
Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk

cox@stpstn.UUCP (Brad Cox) (07/26/90)

In article <PCG.90Jul23125449@thor.cs.aber.ac.uk> pcg@cs.aber.ac.uk (Piercarlo Grandi) writes:
< 
< In article <1280@media01.UUCP> pkr@media01.UUCP (Peter Kriens) writes:
< 
<    But I still have problems explaining oops to people who have no
<    prior experience. When I start talking about inheritance, dynamic
<    binding they start to gaze and we both feel lost. Why is it so
<    difficult to explain something which is so advantageous to use and
<    which feels so right? How do I explain the essence of objects?
< 
< I will give you some of my ideas, with a premise: all we know of OO
< programming is OO languages really.

I tend to come at this with exactly the opposite premise, that all that
mankind really knows about are the tangible objects of everyday experience,
and that the programming language community is at least fifty years behind.
I'd like to think OO means that we're about to start gaining on them,
but the apparent inability of the programming language community to escape
its traditional process-centered view of the software universe (Ada, 2167,
Cleanroom, C++) and adopt a product-centered universe, based on marketplaces
in standard off-the-shelf components (component environments like Smalltalk
and Objective-C), makes me fear the reverse.

< The second, related, idea is that the principal tool of program design
< is decomposition, and that OO programming is that style of programming
< where you decompose not by function (FORTRAN), not by datatype and
< function (PASCAL), not by module (ADA), but by abstract data type.

Mature domains design by decomposition, but implement by *composition*; i.e.
they assemble larger solutions from off-the-shelf components; i.e. plumbing
systems, bridges, space shuttles. The stockroom of available components
is one of the major inputs to the design by decomposition process; the
horse, not the cart.

< Too bad that many OO practitioners have lost sight of this ultimate
< goal, and use fuzzily defined concepts like polymorphism or dynamic
< binding or inheritance (which are merely devices) without reference to
< such goal.

Dynamic binding is very nearly a *synonym* for what I earlier referred
to as composition. Binding is a fuzzy concept only so long as we view
programming as a decomposition process, rather than a composition process.
Binding is that thing on your boots that transforms you + skis into
skier. Dynamic versus static binding is the difference between doing 
this at the ski resort as opposed to having your parents do it for you
at birth. Run-time versus compile-time is hardly a fuzzy distinction.
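The ski analogy can be sketched roughly as follows, in Python (not a language from this thread). The class and function names are invented. Python resolves every method call at run time, so the "static" case is only imitated here by hard-coding the class in the source.

```python
class Skis:
    def glide(self):
        return "skiing"


class Snowboard:
    def glide(self):
        return "snowboarding"


def static_descent():
    # "Bound at birth": the equipment is fixed when the code is written.
    return Skis().glide()


def dynamic_descent(equipment):
    # "Bound at the resort": glide() is looked up on whatever object arrives.
    return equipment.glide()
```

`dynamic_descent` composes with any component that answers `glide`, including ones written after it; `static_descent` has to be edited to change equipment.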

I agree with respect to inheritance. Sometimes I wish it had never been
added to implementation languages, and had instead been reserved for where
it would be far more appropriate, as a specification/testing technology.

< On the weakness side, it is apparent that programs are not just
< collections of ADTs, but also of "glue" logic, and that the real
< semantics of the program belong to this "glue" logic, which usually is
< a graph navigator (interpreter). Where the "glue" logic is prevalent on
< the data manipulation, maybe then functional decomposition is more
< appropriate for example.

You're talking about what I call "The sucker trap of object-oriented
programming", the notion that OOD means make the nouns objects and the
verbs methods. It is far better to do what Smalltalk people call
"model/view/controller", and as you seem to be suggesting: make
passive objects represent the nouns and active objects the verbs
(your "glue").
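As one possible illustration of the passive/active split, here is a sketch in Python (not a language from this thread). `Account` and `Transfer` are hypothetical names, and this is only an interpretation of the distinction, not a model of Smalltalk's model/view/controller.

```python
class Account:
    """Passive object (a noun): holds state and offers small operations."""

    def __init__(self, balance):
        self.balance = balance

    def deposit(self, amount):
        self.balance += amount

    def withdraw(self, amount):
        if amount > self.balance:
            raise ValueError("insufficient funds")
        self.balance -= amount


class Transfer:
    """Active object (a verb): the "glue" that spans two accounts."""

    def __init__(self, source, target, amount):
        self.source, self.target, self.amount = source, target, amount

    def run(self):
        # Withdraw first, so a failed withdrawal leaves the target untouched.
        self.source.withdraw(self.amount)
        self.target.deposit(self.amount)
```

The logic that straddles two accounts lives in its own object instead of being forced into a method of either account.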

< it is difficult in most OO schemes to express
< control abstraction and semantics that straddle ADT boundaries, i.e.
< the "glue" navigational code or driver.

No, it's not difficult at all. See the active/passive object distinction
above. The difficulty is only that so many authors (myself included!) 
have fallen into the sucker trap and their readers with them.

-- 

Brad Cox; cox@stepstone.com; CI$ 71230,647; 203 426 1875
The Stepstone Corporation; 75 Glen Road; Sandy Hook CT 06482

cook@hplabsz.HPL.HP.COM (William Cook) (07/27/90)

I gave a presentation on the essence of object-oriented programming at the
school/workshop on the foundations of object-oriented languages (REX/FOOL)
this summer.  The proceedings will be printed as a Springer volume this
fall.  In summary, I contrast object-oriented programming (which I prefer to
call procedural data abstraction (PDA)) with abstract data types (ADT).
They are not the same.  I can only illustrate the dichotomy here, and give
some references.

Here is an ADT for integer lists, written in ML (it would look about the
same in CLU, Ada, Modula-2, etc).

    exception error;

    abstype intlist = NIL | CELL of int * intlist with
        val xnil = NIL;

        fun null(l : intlist) = case l of
                NIL => true |
                CELL(x, l) => false;

        fun head(l : intlist) = case l of
                NIL => raise error |
                CELL(x, l') => x;

        fun tail(l : intlist) = case l of
                NIL => raise error |
                CELL(x, l') => l';

        fun cons(x : int, l : intlist) =
                CELL(x,l);
        end;

   head(cons(4, xnil)) --> 4

This is called an "abstract data type" because the actual definition of
intlist is hidden, or abstracted, from the rest of the program.

Here's an OOP version, written in a syntactically sugared version of ML (a
correct implementation is given in the appendix).

    datatype IntListObject = {
        null: bool,
        head: int,
        tail: IntListObject,
        cons: int -> IntListObject
        };

    let MakeNil : IntListObject =
          recursive self = {            -- the Nil "class"
            null = true,
            head = raise error,
            tail = raise error,
            cons = fun(y : int) MakeCell(y, self)
            }
    and MakeCell(x : int, r : IntListObject) : IntListObject =
          recursive self = {        -- the Cell "class";
            null = false,           -- x and r are "instance variables"
            head = x,
            tail = r,
            cons = fun(y : int) MakeCell(y, self)
            }

This program does not have any hidden or abstract types.  The full
definition of IntListObject is visible everywhere.  It uses procedural
abstraction to get information hiding.  Creating a list with element 4 and
then retrieving that element works like this:

   MakeNil.cons(4).head  --> 4

As you can see, all of the NIL cases from the ADT are collected into the
MakeNil object, and all of the CELL cases are collected into MakeCell.
There are no case statements in the OOP version;  case statements are bad
because you are always having to add new cases (in this situation, if you
add a new representation).  This organizational difference has great
consequences for programming methodology.  For example, consider adding an
"interval" class that represents the integer list [n...m].  In the OOP
format it's easy:

    MakeInterval(x : int, y: int) : IntListObject =
        if x > y then           -- the Interval "class";
            MakeNil             -- x and y are "instance variables"
        else
            recursive self = {
                null = false,
                head = x,
                tail = MakeInterval(x+1, y),
                cons = fun(y : int) MakeCell(y, self)
                }

Exercise: consider adding this to the ADT code.

In short: Object-oriented programming is using procedures to represent data.
That is, procedural data abstraction.
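For readers without an ML system at hand, the same idea can be sketched in Python (not a language from this post). This is a loose transcription, not the code above: objects are dictionaries of procedures, and the helper names `make_object`, `_fail` and `elements` are assumptions of the sketch.

```python
def _fail():
    raise ValueError("empty list")


def make_object(null, head, tail):
    """Build a record of procedures; cons closes over the record itself."""
    self = {"null": null, "head": head, "tail": tail}
    self["cons"] = lambda y: make_cell(y, self)
    return self


def make_nil():
    return make_object(lambda: True, _fail, _fail)


def make_cell(x, r):
    return make_object(lambda: False, lambda: x, lambda: r)


def make_interval(x, y):
    # A new representation of [x..y]; nothing defined above is edited.
    if x > y:
        return make_nil()
    return make_object(lambda: False, lambda: x,
                       lambda: make_interval(x + 1, y))


def elements(lst):
    """Client code: no case analysis on how the list is represented."""
    out = []
    while not lst["null"]():
        out.append(lst["head"]())
        lst = lst["tail"]()
    return out
```

Cells and intervals mix freely, for example `make_interval(2, 4)["cons"](1)`, because clients only ever call the procedures in the record.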

-william cook@hplabs.hp.com


REFERENCES:

Early definition and discussion of virtues of "OOP":
    @article{Zilles73
    ,title="Procedural Encapsulation: A Linguistic Protection Mechanism"
    ,author = "Stephen N. Zilles"
    ,journal = "SIGPlan Notices"
    ,pages = "142-146"
    ,volume = 8
    ,number = 9
    ,year = 1973
    }

Change of direction, away from OOP, to invent ADTs:
    @article{LiskovZilles74
    ,author = "B. Liskov and S. Zilles"
    ,title = "Programming with Abstract Data Types"
    ,journal = "SIGPlan Notices"
    ,year = "1974"
    ,volume = "9"
    ,number = "4"
    ,pages = "50-59"
    }

Comparison of OOP and ADT, summarizes most significant issues:
    @incollection{Reynolds78
    ,author="John C. Reynolds"
    ,title="User Defined Types and Procedural Data Structures as
             Complementary Approaches to Data Abstraction"
    ,booktitle="Programming Methodology, A Collection of Articles by IFIP WG2.3"
    ,publisher="Springer-Verlag"
    ,address=NY
    ,editor="David Gries"
    ,year=1978
    ,pages="309-317"
    ,note="Reprinted from S. A. Schuman (ed.),
             {\em New Advances in Algorithmic Languages 1975},
             Inst. de Recherche d'Informatique et d'Automatique,
             Rocquencourt, 1975, pages 157-168",
    }

Discussion of OOP in chapter on "data-directed programming", similar to CLOS:
    @book{AbelsonSussman85
    ,author = "Abelson and Sussman"
    ,title = "Structure and Interpretation of Computer Programs"
    ,publisher = "MIT Press"
    ,year = 1985
    }


APPENDIX: ML version of OOP code

    datatype intlistobject = tag of {
        null: unit -> bool,
        head: unit -> int,
        tail: unit -> intlistobject,
        cons: int -> intlistobject };

    fun MakeNil() =
          let fun self() = tag({
            null = fn() => true,
            head = fn() => raise error,
            tail = fn() => raise error,
            cons = fn y => MakeCell(y, self())
            })
          in self() end
    and MakeCell(x,r) =
          let fun self() = tag({
            null = fn() => false,
            head = fn() => x,
            tail = fn() => r,
            cons = fn y => MakeCell(y, self())
            })
          in self() end;

cook@hplabsz.HPL.HP.COM (William Cook) (08/08/90)

I was just wondering why nobody ever replies to my postings... it's
been almost two weeks.  I know I didn't use any jargon
and my example was concrete, but can't you come up with something
to flame about?  Here, let me try again.  My point was that object-oriented
programming arises from a technical dichotomy in the organization of
the observers and constructors of a data abstraction.  Abstract data
types and objects (which I call procedural data abstractions)
are the result of these two alternatives.  Organization by `observer'
results in an abstract data type; organization by `constructor' results
in an object/class structure.  They *both*
model the "real world"; they *both* have `objects' and `operations';
finding the nouns and the verbs will help just as much in each case
(or be of as little help...).  The difference is how
things are organized, and this makes a big difference:  multiple
interacting implementations are easier in object-oriented programming,
while correctness proofs and optimizations are easier in abstract 
data types.  Shall I take silence for agreement?  I mean, isn't 
usenet the place to grind one's axes...

william cook@hplabs.hp.com

daugher@cs.tamu.edu (Dr. Walter C. Daugherity) (08/08/90)

In article <5743@hplabsz.HPL.HP.COM> cook@hplabsz.HPL.HP.COM (William Cook) writes:
>....  My point was that object-oriented
>programming arises from a technical dichotomy in the organization of
>the observers and constructors of a data abstraction.  Abstract data
>types and objects (which I call procedural data abstractions)
>are the result of these two alternatives.  Organization by `observer'
>results in an abstract data type; organization by `constructor' results
>in an object/class structure.

I think I would say that ADTs lack inheritance and polymorphism.  One 
important consequence of the latter is that any extension to an ADT requires 
revising the code for all of its operators.   This is not true in OOP.

For example, consider a "graphic object" ADT (i.e., something you want to
manipulate on a display) with operations such as "draw yourself" and "report
your area."  If this ADT encompassed rectangles and ellipses then each
operation would have a SWITCH/CASE statement, e.g.,
	float area(graphic_object *g_o)
	{
		switch (type_of(g_o)) {
		case RECTANGLE:
			/* return the result; C has no "area = ..." result assignment */
			return height_of(g_o) * width_of(g_o);
		case ELLIPSE:
			...
		}
	}
The draw_self procedure is similar.  Now to add another type, say TRIANGLE,
requires both functions to be modified (by adding another case and its code).

In contrast, in OOP you do not code the SWITCH statement: it is implicit in
the system that a message to an object invokes the method for the object's
class **INDEPENDENTLY** of any other classes which may have methods of the same
name.  So the AREA and DRAW_SELF methods of the RECTANGLE and ELLIPSE classes
remain intact, and you only code the methods for the new TRIANGLE class.  This
eliminates the prospect of ruining previous code because it isn't touched.
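A minimal sketch of the OOP side, in Python (not the language of the fragment above). The constructor parameters, the ellipse and triangle formulas, and `total_area` are assumptions added for illustration.

```python
import math


class Rectangle:
    def __init__(self, height, width):
        self.height, self.width = height, width

    def area(self):
        return self.height * self.width


class Ellipse:
    def __init__(self, a, b):  # a and b are the semi-axes
        self.a, self.b = a, b

    def area(self):
        return math.pi * self.a * self.b


# Added later: Rectangle and Ellipse above are not edited.
class Triangle:
    def __init__(self, base, height):
        self.base, self.height = base, height

    def area(self):
        return 0.5 * self.base * self.height


def total_area(shapes):
    # No switch on a type tag: each object selects its own area method.
    return sum(shape.area() for shape in shapes)
```

`total_area` was not changed either when `Triangle` was added, which is the property described above.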

Hope this helps.


-------------------------------------------------------------------------------
Walter C. Daugherity			Internet, NeXTmail: daugher@cs.tamu.edu
Knowledge Systems Research Center	uucp: uunet!cs.tamu.edu!daugher
Texas A & M University			BITNET: DAUGHER@TAMVENUS
College Station, TX 77843-3112		CSNET: daugher%cs.tamu.edu@RELAY.CS.NET
	---Not an official document of Texas A&M---