[comp.lang.eiffel] CHALLENGE: typing and reusability

sommar@enea.se (Erland Sommarskog) (03/26/91)

(For comp.lang.eiffel readers: there's been a long discussion going
on in comp.lang.misc on dynamic vs. static typing.)

As the first propoent of dynamic typing, Urs Hoelzle
(hoelzle@neon.Stanford.EDU) gives us a concrete example
where langauges with static typing is said to bite the dust.
I have deferred the quote of his example to end of this article
for new readers in comp.lang.eiffel.

>** So, would everybody who does *NOT* believe that dynamically-typed       **
>** languages can achieve greater reusability than today's statically-typed **
>** languages PLEASE post a solution to this (written in his/her favorite   **
>** programming language)???                                                **
>...
>I claim that this piece of code won't type-check in Ada, C++, Eiffel, ...

Ada is out. But in Eiffel this is a piece of cake, however with
one non-ignorable objection: we must change the inheritance
structure. If we introduce a new class SUPER, preferably deferred
(which means that no objects of this class be created), and make
A and B heirs of SUPER, we can then define a list class which
looks something like:

    class FOO_DISPLAY_LIST(T -> SUPER)
         -- () is easier to read than [] on my screen.
    export  repeat LINKED_LIST, do_foo, display
    inherit LINKED_LIST   -- or whichever list that's appropriate
    feature
      -- define iterators that call foo and display
    end

We can now declare:
   ListA : FOO_DISPLAY_LIST(A);
   ListB : FOO_DISPLAY_LIST(B);
   ListC : FOO_DISPLAY_LIST(SUPER);

Of course having to modify the existing classes with a common ancestor,
may be out of the question, and the example says that A and B are
unrelated. On the other hand, if they are going to be in the same
list, they are no longer unrelated, are they? And remember that it
does not suffice that they both understand Foo and Display, they have
to understand them in the same way, i.e. they must have the same
parameter profile in both classes.

With adding the example code to our system, we do add a dependency 
between A and B, and in a dynamically typed language if we modify 
A or B we must run a test suite which covers all a referenses to 
A and B to enure we didn't break anything. Not only the test suite is
labourous task to develop - albeit very useful in any environment - 
we still don't *know* that type or parameter-profile error will not 
occur, we can only assume. In a statically typed language, the compiler
tells ensures us that these errors won't happen. Of course several
other errors are still possible, but we are at least one worry
shorter.

Now, if modifying A and B is undesireable *and* this is a common
scene, one could of course envision changes to Eiffel to alleviate
the situation. One idea would be to allow a class to declare
explicit heirs, which means that we could SUPER, but still leave
A and B untouched. To help up our sleep at night, this sort of
declaration would only be allowed in a deferred class. But to be
frank, I don't find the reason for such a change compelling enough.

I'm by no means an Eiffel expert, I would appreciate other Eiffel
people's input on this.

If you know the example, you can press "n" here.

>======================  example ==============================
>Assume the following types:
>
>type List = generic list; understands (among others)
>	     'concatenate': returns a new [generic] list which is the
>		concatenation of the two argument lists"
>	     'do': takes a procedure with one argument and applies it
>	        to all elements of the list (i.e. a generic iterator)
>
>type FooType = "object that understands 'foo'"
>
>type A = some random application type, understands (among others)
>	 'foo' and 'display'
>
>type B = some other application type (completely unrelated to A), also
>	 understands (among others) 'foo' and 'display'
>
>Also assume that there exists a procedure someProc(l:List of FooType)
>which iterates through the list and may send 'foo' to some of its
>elements.
>
>VAR listA: "list of objects of type A"
>VAR listB: "list of objects of type B"
>VAR listC: "see below"
>
>Now, the program you're writing has to do:
>
>listA := ...  // some computation to fill the list
>listB := ...  // some computation to fill the list
>
>listC := listA.concatenate(listB);	// make a new list containing As and Bs
>
>someProc(listC);			// do the "foo thing" to the list
>
>listC.do(display);			// display all elements of the list
>
>======================  end of example =========================
-- 
Erland Sommarskog - ENEA Data, Stockholm - sommar@enea.se

hoelzle@elaine13.Stanford.EDU (urs hoelzle) (03/27/91)

In <1991Mar26.005805.1914@kodak.kodak.com> cok@islsun.Kodak.COM (David Cok)
writes:

> However, I do prefer static typing where it works and am interested in 
                                     ^^^^^^^^^^^^^^
> discovering how to make it work better.  In any case Hoelzle's programming 

Sure... I guess we just disagree what "works" means :-)

But now to the technical points: your code example is nice, but (as
you mention yourself) it doesn't solve the problem because of a typing
snag, and that was exactly my point.  Furthermore, it has two other problems:

1) It assumes that you have source code access to A and B.  You write that

> If a programmer did not
> have source code access to A and B, one could still create classes MyA (MyB) 
> which inherited from both A (B) and FooDisplayable.  The programmer would have 
> to use MyA and MyB everywhere instead of A and B (at least where things might 
> be put on the list) and A and B would have had to declare display and foo 
> virtual.

This doesn't work as soon as you also have Lists of A (or other
classes parametrized by A) in existing modules, because List[MyA] is
not a subtype of List[A].  This is a result of the contravariance rule
(see e.g. "A Proposal for Making Eiffel Type-Safe" by W. Cook,
ECOOP'89; "Introduction to Trellis/Owl", OOPSLA'86 [I might have
gotten the years slightly wrong]).

The problem isn't really with contravariance, but with the way it is
applied: type-checkers don't only check the validity of your program,
but also the validity of *all possible variations*: 
someProc(List[FooDisplayable]) is illegal because someProc *could* use
List::insert to insert a Fooable instead of a FooDisplayable, which would
violate the type assertion for the list.

HOWEVER, the existing version of someProc *does not* use any "dangerous" 
List operation such as insert() and is perfectly safe.  But today's
type checkers are unable to recognize this because they still live
in a world of batch-style compilation and are prohibited from looking
at the *actual* use of objects.  Therefore, they have to use worst-case
assumptions about the *potential* uses and unnecessarily restrict the
programmer.

[For Eiffel programmers: before you agree with me, let me say that what
Eiffel does is *not* what I propose here.  Eiffel's "solution" to the
problem is, IMHO, totally wrong. :-)]

> As the original poster pointed out, this will not give any runtime message
> errors in a dynamically typed system.  However, I'd maintain that it is a bad
> use of dynamic typing.  The only thing that prevents errors is the semantic
> knowledge that someProc only does foo and not anything else that Footype allows.
> If someProc were modified later by someone who did not know that it was applied
> to a FooDisplayable list, errors would result.

Sure, type errors would result in a statically-typed language, and
run-time errors in a dynamically-typed language.  There's no problem
with that: if your change transforms a type-correct (in my sense, not
C++'s) program into an unsafe program, you get an error.


2) You need a new type of iterator for every type of message you want
   to send to the list.

This is actually orthogonal to the issue of dynamic vs. static typing
since it is about closures.  In Self/Smalltalk, you define one
iterator which takes a block.  The block is the called by the
iterator, once for every element, with the element as an argument.
Thus you can do

   list do: [ | :elem | elem foo ]             "original example"
   list do: [ | :elem | elem bas: 15]

or (assuming that prev is a local in the caller of do:)

   prev: nil.
   list do: [ | :elem | elem baz: prev. prev: elem ]

You could introduce closures to C++ et al, but it's interesting to
observe that no statically-typed language has them, while many
dynamically-typed languages do.  You could also simulate this in C++
with function pointers, but you have to "invent" a procedure for every
(trivial) block... reminds me of COBOL where you were forced to write
your loop bodys as separate sections ;-)

-Urs

PS:

> >>>>> Compiling takes too long. I want runtime syntax checking! <<<<<

Hey, you can have this in Self - almost every string of characters
with balanced parentheses is a legal Self expression. :-)

hoelzle@elaine2.Stanford.EDU (urs hoelzle) (03/27/91)

In <3090@enea.se> sommar@enea.se (Erland Sommarskog) writes:

> As the first proponent of dynamic typing, Urs Hoelzle
> (hoelzle@neon.Stanford.EDU) gives us a concrete example
> where langauges with static typing are said to bite the dust.

Just to set the record straight: I do not claim that dynamically-typed
languages are always better for all kinds of problems, or any such thing.
I posted this example because someone claimed that static typing does not
affect reusability, and I just couldn't let this stand unchallenged.

>     class FOO_DISPLAY_LIST(T -> SUPER)
>          -- () is easier to read than [] on my screen.
>     export  repeat LINKED_LIST, do_foo, display
>     inherit LINKED_LIST   -- or whichever list that's appropriate
>     feature
>       -- define iterators that call foo and display
>     end
> 
> We can now declare:
>    ListA : FOO_DISPLAY_LIST(A);
>    ListB : FOO_DISPLAY_LIST(B);
>    ListC : FOO_DISPLAY_LIST(SUPER);

This is nice, but it only addresses the trivial part of my example.  The
hard part comes when you actually try to use the lists you defined.  I don't
have an Eiffel system to try this, but I don't think this will type-check.

>listC := listA.concatenate(listB);     // make a new list containing As and Bs
>
>someProc(listC);                       // do the "foo thing" to the list
>
>listC.do(display);                     // display all elements of the list

(Remember that "someProc" is supposed to already exist, and its parameter
is a FOO_LIST, not a FOO_DISPLAY_LIST.)

For all your other comments see my other posting (response to a C++ example).

-Urs
--
------------------------------------------------------------------------------
Urs Hoelzle                 			       hoelzle@cs.stanford.EDU
Center for Integrated Systems, CIS 42, Stanford University, Stanford, CA 94305

jls@rutabaga.Rational.COM (Jim Showalter) (03/28/91)

>Sure, type errors would result in a statically-typed language, and
>run-time errors in a dynamically-typed language.  There's no problem
>with that:

Uh, yes there is. The Achilles Heel of dynamically-typed languages is
run-time errors. These may be fine for WIMPS interfaces on workstations:
just print out "Window Manager is confused", abort, and let the user
reboot the system. ;-)

This gets a little trickier when the code is running embedded in a
$100 million satellite orbiting Venus.
--
***** DISCLAIMER: The opinions expressed herein are my own, except in
      the realm of software engineering, in which case I've borrowed
      them from incredibly smart people.

hoelzle@neon.Stanford.EDU (Urs Hoelzle) (04/01/91)

NOTE: the following discussion is relatively long.  If you don't want to 
read all of it, skip to the "executive summary" at the end of the article.

Several people (e.g. paj@mrcu (Paul Johnson), rick@tetrauk.UUCP (Rick
Jones) and bertrand@eiffel.UUCP (Bertrand Meyer)) have argued that the
types A and B in my example should be related since they have common
behavior.  With this change, they showed how the problem can be solved
using multiple inheritance.

This is indeed the case, but my case rests on the claim that this is
not achievable in the Real World if we assume a high degree of
code reuse [I originally posted the challenge because someone had
claimed that static typing has absolutely no effect on reusability.]

Here's the scenario on which my claim is based:

1) You haven't written A and B yourself but bought them from someone.
   That is, you cannot change A or B to inherit from a common base
   class.  I consider this as an absolutely unavoidable problem (more
   justification below).

   There is a good objection why dynamic typing wouldn't help:

|> If A and B had completely independent designers then even in Smalltalk
|> it is unlikely that the two classes would have this commonality:
|> different names and different arguments would probably have been
|> chosen.

   The solution to this is to subclass A and B (with classes SubA and
   SubB) in order to define the necessary "interface glue". In a
   statically typed language, SubA and SubB would both inherit from a
   common superclass (FooDisplayable).

   So far, so good - no real disadvantages for statically-typed languages.

2) Unfortunately, the solution above only works well for leaves in the
   inhertiance tree.  If A already had subclasses C and D, it would be
   very tedious to subclass all of these, too.

   More importantly, it only works for "first-level imported types", but
   not for "second-level" types such as List[A]: in typed languages,
   you cannot pass a List[SubA] in place of a List[A].  Thus, every
   piece of functionality which has second-level arguments *cannot be
   reused*.  I contend that such cases are frequent (since "container
   classes" (collections) are very useful.)

   Bertrand Meyer argues that this problem (exemplified by someProc in
   my example) would not occur:

|> Of course, this means that B to must be a descendant of SPECIFIC,
|> and Mr. Hoelzle's point is probably that this is inappropriate since
|> some_proc will only be called with arguments of type LIST [A]. However
|> the problem does not arise in this way in ``pure'' object-oriented 
|> programming of the Simula/Eiffel (and, unless I am mistaken, Smalltalk) style.
|> In this world some_proc would not take an argument l of list type,
|> as in Mr. Hoelzle's statement of the problem given above, but
|> would be a procedure in a list class. 

   But this is just an artifact of my simplified example (I didn't
   want to include additional classes), and there are plenty of
   scenarios where some object takes a list of objects as one of the
   arguments of a function (e.g. scheduler.doSomething(list_of_processes)).
   Thus I still contend that the problem will show up very frequently:
   not every function which takes a list should be defined in a list
   class (actually, most of them probably shouldn't).

3) The Real Problem (C) doesn't show up until you actually try to do
   this in the Real World (TM).  If you do indeed have a high degree
   of code reuse, you basically compose your new program out of
   existing pieces of functionality.  Typically, there will be dozens
   if not hundreds of these components in your program, selected from
   a library of thousands.  That is, reuse is very fine-grained.

   In this world, the "impedance mismatch" caused by the problems
   outlined in 1) and 2) becomes very annoying since the main job
   of a new program is to pass objects (and often collections of
   objects) from one subpart to another, and static typing hinders
   this communication because of the type system's restrictions.

   Furthermore, it is extremely unlikely that the designers of the
   thousands of predefined pieces of functionality have foreseen every
   possible commonality between them and factored the inheritance
   hierarchy accordingly.  In fact, I claim that there will always be
   different (and equally valid) conceptual views (and thus type
   hierarchies) of some domains.

   Therefore, we must assume that situations where the types A and B
   are unrelated (do not share the "right" supertype) are the rule
   rather than the exception, even in a well-structured world.

   As Paul Johnson correctly observes, type hierarchies must be very
   fine-grained in order to achieve true reusability.  Most pieces of
   functionality depend only on subsets of broader interfaces, and it
   is impossible to foresee all combinations of such subsets at design
   time.  And even if it were possible, the subtyping problems
   outlined above would make it hard to take advantage of the full
   potential for reusability.  Johnson's "sibling-supertype rule" would
   help here, but it still exhibits the problems with "second-level
   types" shown in 2).  Furthermore, the introduction of numerous
   "impedance-matching" types could impair programming productivity
   ("one gets buried in paperwork").

   It is here where dynamically-typed languages have a subtle but
   important advantage.  All pieces of functionality truly (per
   definition) depend only on that part of their argument's interface
   which they actually use.  Thus, programs written in such languages
   do not experience "impedance mismatch" and can achieve maximal
   reuse with minimal effort; the languages do not prevent reuse
   because of technicalities (i.e. restrictions in the type system).
   [Of course, the downside of dynamically-typed languages is that
   they also do not prevent abuse.  But this is a philosophical issue
   orthogonal to this discussion, and I don't want to start another
   language war ;-)]

Executive summary: dynamically-typed languages facilitate the
communication between objects, whereas today's statically-typed
languages tend to create an "impedance mismatch" which can impair
communication.  In a world of high reusability, creating a new program
is mainly the task of coordinating the communication of existing
parts.  Therefore, this subtle advantage of dynamically-typed
languages can make all the difference in determining the degree of
reusability which can be achieved.

-- 
------------------------------------------------------------------------------
Urs Hoelzle                                            hoelzle@cs.stanford.EDU
Center for Integrated Systems, CIS 42, Stanford University, Stanford, CA 94305

chip@tct.com (Chip Salzenberg) (04/02/91)

According to hoelzle@neon.Stanford.EDU (Urs Hoelzle):
>Here's the scenario on which my claim is based:
>
>1) You haven't written A and B yourself but bought them from someone.
>   That is, you cannot change A or B to inherit from a common base
>   class.
>
>   The solution to this is to subclass A and B (with classes SubA and
>   SubB) in order to define the necessary "interface glue".

There is a better solution: Create an abstract GenericFoo class, which
is the element type for the FooCollection.  Then define subclasses of
GenericFoo called FooA, FooB, etc.  Here's the key part: each FooA
_refers_ to an A, each FooB _refers_ to a B, etc.

With this scheme, all member functions interesting to the collection
(and the collection's clients) can be implemented as virtual functions
defined in GenericFoo and implemented in all subclasses of GenericFoo.

This alternative solves the problem of subclassing A and B, because
FooA can refer as easily to a SubA as to an A.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.com>, <uunet!pdn!tct!chip>
   "All this is conjecture of course, since I *only* post in the nude.
    Nothing comes between me and my t.b.  Nothing."   -- Bill Coderre

matt@bacchus.esa.oz (Matt Atterbury) (04/02/91)

In article <1991Mar31.215721.18687@neon.Stanford.EDU> hoelzle@neon.Stanford.EDU (Urs Hoelzle) writes:

   NOTE: ...

   Here's the scenario on which my claim is based:

   1) You haven't written A and B yourself but bought them from someone.
      That is, you cannot change A or B to inherit from a common base
      class.  I consider this as an absolutely unavoidable problem (more
      justification below).

      :
      :

How many of these problems would be solved if the language allowed the
addition of 'class utilities' (as described p.82 of Booch's OODwA) to
_any_ class at _any_ time (i.e. even if you don't have the source
code). I _believe_ you would still have to add such utilities to
sub-classes by hand due to Eiffel's default non-inheritance of
`methods' ? With this facility, you could make a bought-in class (BiC)
look exactly like you want it to.

You could do this now by subclassing the BiC and adding your utilities
to this new class, however the point of a HLL is to make the
programmer's life easier - consider buying an Eiffel encapsulation of
the entire X windows library and making all/many of its classes handle
the method jump_through_the_hoop !

[ a `class utility' is a non-primitive `method' which only uses
  primitive methods to do its work (i.e. doesn't play with state
  directly) ]

Note that I do not use Eiffel in Real Life so do not know for sure
that one can/cannot do the above. Also, I am probably commiting heresy
by not using the `right' terminology (e.g. method) :-).
--
-------------------------------------------------------------------------------
Matt Atterbury [matt@bacchus.esa.oz]      Expert Solutions Australia, Melbourne
UUCP: ...!uunet!munnari!matt@bacchus.esa.oz               "klaatu barada nikto"
  or: ...!uunet!murtoa!bacchus.esa.oz!matt            "consider this a divorce"
ARPA: matt%bacchus.esa.oz.AU@uunet.UU.NET  "life? don't talk to me about life!"

gudeman@cs.arizona.edu (David Gudeman) (04/03/91)

In article  <MATT.91Apr2093735@henry.bacchus.esa.oz> Matt Atterbury writes:
]
]How many of these problems would be solved if the language allowed...

The problem with this and all the other solutions is that they involve
a non-trivial amount of extra work -- work that is only necessary to
overcome problems that are caused by static typing.  Rather than try
to find ways around the problem, why not eliminate the source of the
problem?
--
					David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman

hoelzle@neon.Stanford.EDU (Urs Hoelzle) (04/03/91)

chip@tct.com (Chip Salzenberg) writes:

>According to hoelzle@neon.Stanford.EDU (Urs Hoelzle):
>>Here's the scenario on which my claim is based:
>>
>>1) You haven't written A and B yourself but bought them from someone.
>>   That is, you cannot change A or B to inherit from a common base
>>   class.
>>
>>   The solution to this is to subclass A and B (with classes SubA and
>>   SubB) in order to define the necessary "interface glue".

>There is a better solution: Create an abstract GenericFoo class, which
>is the element type for the FooCollection.  Then define subclasses of
>GenericFoo called FooA, FooB, etc.  Here's the key part: each FooA
>_refers_ to an A, each FooB _refers_ to a B, etc.

>With this scheme, all member functions interesting to the collection
>(and the collection's clients) can be implemented as virtual functions
>defined in GenericFoo and implemented in all subclasses of GenericFoo.

>This alternative solves the problem of subclassing A and B, because
>FooA can refer as easily to a SubA as to an A.

Introducing "wrappers" has almost the same disadvantages than the
first method because it only solves an *isolated* problem and not the
"impedance mismatch".  A few points to consider are:

- Objects aren't in just one collection or used by just one subpart of
  the system.  Several subparts work with (partially overlapping)
  subinterfaces of your objects.  You may need different wrappers for
  the same object (with different type hierarchies).

- Besides being cumbersome to write (wasn't OOP supposed to reduce
  duplication and programming time?), wrappers create problems with
  object identity.

- Wrappers only work for "first-level" imports, i.e. when the
  subsystem that you're reusing gives you As and Bs rather than
  collection of As or Bs.  This is not a realistic assumption (or
  else you'd be reusing only low-level subsystems).


-Urs
-- 
------------------------------------------------------------------------------
Urs Hoelzle                                            hoelzle@cs.stanford.EDU
Center for Integrated Systems, CIS 42, Stanford University, Stanford, CA 94305

matt@bacchus.esa.oz (Matt Atterbury) (04/03/91)

In article <1404@optima.cs.arizona.edu> gudeman@cs.arizona.edu (David Gudeman) writes:

   ...  Rather than try
   to find ways around the problem, why not eliminate the source of the
   problem?

Because the problem (i.e. static typing) DOES have real advantages,
which I (and presumably others) prefer, despite its disadvantages.

cheers ...
--
-------------------------------------------------------------------------------
Matt Atterbury [matt@bacchus.esa.oz]      Expert Solutions Australia, Melbourne
UUCP: ...!uunet!munnari!matt@bacchus.esa.oz               "klaatu barada nikto"
  or: ...!uunet!murtoa!bacchus.esa.oz!matt            "consider this a divorce"
ARPA: matt%bacchus.esa.oz.AU@uunet.UU.NET  "life? don't talk to me about life!"