[comp.lang.eiffel] CHALLENGE: heterogeneous collections

hoelzle@leland.Stanford.EDU (urs hoelzle) (03/28/91)

> As someone who hasn't yet completely converted to the object oriented religion,
> this has always struck me as using the wrong tool for the job.  I really don't
> want a "list of T or subtype of T".  I want a "list of objects that understand
> foo".  Once I take that view, I have no problems, at least in the few
> statically typed languages that allow this (e.g. Russell, and probably FX from
> MIT, and probably Scratchpad II from IBM).  I also don't need to construct
> my types in such a way that all types that understand foo are "derived" from
> a common supertype;  I can use inheritance purely where it simplifies the
> implementation.  And I get full static checking, though elements of the list
> will, of course, somehow carry the "foo" function with them.  (In Russell, I
> would build a list of pairs <x: T, T: [algebra containing foo]>.)


This sounds good.  However, I suspect that these languages don't (didn't?)
have inheritance at all...?  Plus, some people don't like functional languages.
Could you post some pointers to the relevant literature?

Thanks,

-Urs

------------------------------------------------------------------------------
Urs Hoelzle                 			       hoelzle@cs.stanford.EDU
Center for Integrated Systems, CIS 42, Stanford University, Stanford, CA 94305

hoelzle@leland.Stanford.EDU (urs hoelzle) (03/28/91)

In article <pallas.670008199@red-dwarf>, pallas@eng.sun.com (Joseph Pallas) writes:
|> 
|> >    In a way, the problem exists because today's type systems are based
|> >    on what you *could* do to the list in someProc, not on what you actually
|> >    do.  
|> 
|> Indeed.  If I could provide more information about someProc, such as
|> the fact that it never attempts to modify its parameter (did I hear
|> someone cough "const"?), then even a simple type checker would be
|> happy to approve this operation.  Urs is correct that subtyping is NOT
|> the right way to express this relationship.  Subtyping is too strict a
|> conformance relation.

The problem is (unfortunately) harder than "const": if you have a generic
type (say List), e.g.:

type List[T] = 
	......
	procedure f(arg: T);

then the above problem pops up, even if f doesn't change the List (e.g.
"const arg: T" wouldn't change the problem a bit).  The problem is 
contravariance: for a List[SubT], f's signature is "procedure taking a SubT",
i.e. a more restricted type than T, violating contravariance.

Before any Eiffel fans say "Gotcha!" :-), let me go on to say that 
contravariance isn't really the *problem*, just a (correct) formulation of it:
if List[SubT] were a subtype of List[T], some variable v: List[T] could
contain either a List[T] or a List[SubT].  However, the statement
"v.f(t)" (with t:T) would be unsafe: if t holds a T, and v a List[SubT],
we're passing only a T to f, not a SubT as expected (declared).

So, again, the problem isn't contravariance, but how you apply it.

-Urs

------------------------------------------------------------------------------
Urs Hoelzle                 			       hoelzle@cs.stanford.EDU
Center for Integrated Systems, CIS 42, Stanford University, Stanford, CA 94305

cs450a03@uc780.umd.edu (03/28/91)

Urs Hoelzle writes:
>Suppose you have two classes T and SubT, where SubT is a subtype of T
>(in the strict sense of `subtype').  You also have the lists
> ...
>Now the problem: List[T] is not a subtype of List[SubT] because of the
>insert method (contravariance).  Thus, someProc(listB) is illegal even
> ...

That's not a problem, that's a slice of code.

Raul Rockwell

hoelzle@leland.Stanford.EDU (urs hoelzle) (03/28/91)

In article <519@eiffel.UUCP>, bertrand@eiffel.UUCP (Bertrand Meyer) writes:
|> From <pallas.669923769@red-dwarf> by pallas@eng.sun.com (Joseph Pallas)
|> quoted by boissier@irisa.fr (franck boissiere) in
|> <1991Mar26.101051.29527@irisa.fr>:
|> 
|> > > |> Sooner or later, opponents of static typing bring out the
|> > > |> heterogeneous collection.  
|> 
|> To handle heterogeneous collections without endangering static
|> typing, use genericity (Eiffel syntax):
|> 
|> 	class COLLECTION [G] ...
|>
|> [More details about Eiffel's generic classes]


Yes, genericity *is* useful (and needed), but it is only a partial
solution.  At the risk of repeating myself, I'll restate the problem
which originally started this discussion:

-------  Hit 'n' if you've already seen this -------------

Suppose you have two classes T and SubT, where SubT is a subtype of T
(in the strict sense of `subtype').  You also have the lists

listA: List[A] 
listB: List[B]

(where List[T] is a generic List including the method insert(elem: T))

Also suppose that there exists a procedure someProc(l: List[A]) which
(among other things) may invoke the function a() on some elements of the
list (a() is defined in A).

Now the problem: List[T] is not a subtype of List[SubT] because of the
insert method (contravariance).  Thus, someProc(listB) is illegal even
though it is perfectly safe since all of its elements are guaranteed to
understand a().  

Summary: instances of generic types are rarely subtypes of each other.
Thus, code which takes an instance of a generic type as an argument cannot
be reused for another instance of this generic type.  This can seriously 
restrict reusability in programs which use many related instances of generic
types, e.g. collections.

------------------------------------------------------------------------------
Urs Hoelzle                 			       hoelzle@cs.stanford.EDU
Center for Integrated Systems, CIS 42, Stanford University, Stanford, CA 94305

bertrand@eiffel.UUCP (Bertrand Meyer) (03/29/91)

From <1991Mar28.030354.25051@leland.Stanford.EDU> by
hoelzle@leland.Stanford.EDU [Order of paragraphs reversed]:

> Suppose you have two classes T and SubT, where SubT is a subtype of T
> (in the strict sense of `subtype').  You also have the lists
> 
> listA: List[A] 
> listB: List[B]
> 
> (where List[T] is a generic List including the method insert(elem: T))
> 
> Also suppose that there exists a procedure someProc(l: List[A]) which
> (among other things) may invoke the function a() on some elements of the
> list (a() is defined in A).
> 
> Now the problem: List[T] is not a subtype of List[SubT] because of the
> insert method (contravariance).  Thus, someProc(listB) is illegal even
> though it is perfectly safe since all of its elements are guaranteed to
> understand a().  


> Yes, genericity *is* useful (and needed), but it is only a partial
> solution.

[The discussion below uses LIST rather than List and some_proc
rather than someProc.]

Here constrained genericity does provide a solution.
(Not necessarily *the* solution, but the best one that I know.)
The reasoning (as sketched in a previous posting) is that if
some_proc is to apply function a to objects of type T, it needs
some reassurance that such objects indeed have a function a.
It does not need to know what the function actually does, simply
that it exists, has a certain signature (argument and result types)
and, if so desired, certain semantic properties expressed by
a precondition and/or a postcondition.

The solution, then, is to introduce a deferred (i.e. abstract)
class SPECIFIC, with a deferred (not implemented) function a:

	deferred class SPECIFIC feature

		a: SOME_TYPE is deferred end;

		...
	end -- class SPECIFIC

to request that any actual generic parameter LIST be a
descendant of SPECIFIC:

	class LIST [T -> SPECIFIC] feature

		...

	end -- class LIST

and to make sure that A includes SPECIFIC among its parents.

Of course, this means that B to must be a descendant of SPECIFIC,
and Mr. Hoelzle's point is probably that this is inappropriate since
some_proc will only be called with arguments of type LIST [A]. However
the problem does not arise in this way in ``pure'' object-oriented
programming of the Simula/Eiffel (and, unless I am mistaken, Smalltalk) style.
In this world some_proc would not take an argument l of list type,
as in Mr. Hoelzle's statement of the problem given above, but
would be a procedure in a list class. Then two possibilities arise:

- Either some_proc appears in the generic class LIST as given above,
in which case it is legitimate to require that B be a descendant of
SPECIFIC since some_proc will be applicable to targets of type B.

- Or we do want some_proc to be applicable only to targets of type A.
Then we do not need constrained genericity; we should declare some_proc
not in the basic LIST class but in a descendant written specifically
for that purpose:

class LIST_OF_A inherit

	LIST [A]  

feature

	some_proc is
		local
			current_list_element: A
			x: SOME_TYPE
		do
			from ... loop ...

				x := current_list_element.a  --<------

			end
			...
		end -- some_proc

end -- class LIST_OF_A
				

In the call marked --<-----, the application of function `a' to
`current_list_element' is typewise valid since the function is assumed
to be a feature of class A.


In summary, static typing means that the software text must
make explicit the assumptions about what operations are applicable
to what entities, and then must not go beyond these assumptions in
applying operations to entities.

-- 
-- Bertrand Meyer
bertrand@eiffel.uucp

boehm@parc.xerox.com (Hans Boehm) (03/29/91)

hoelzle@leland.Stanford.EDU (urs hoelzle) writes:

[Me:]
>> As someone who hasn't yet completely converted to the object oriented religion,
>> this has always struck me as using the wrong tool for the job.  I really don't
>> want a "list of T or subtype of T".  I want a "list of objects that understand
>> foo".  Once I take that view, I have no problems, at least in the few
>> statically typed languages that allow this (e.g. Russell, and probably FX from
>> MIT, and probably Scratchpad II from IBM).  I also don't need to construct
>> my types in such a way that all types that understand foo are "derived" from
>> a common supertype;  I can use inheritance purely where it simplifies the
>> implementation.  And I get full static checking, though elements of the list
>> will, of course, somehow carry the "foo" function with them.  (In Russell, I
>> would build a list of pairs <x: T, T: [algebra containing foo]>.)


>This sounds good.  However, I suspect that these languages don't (didn't?)
>have inheritance at all...?  Plus, some people don't like functional languages.
>Could you post some pointers to the relevant literature?

I'd be glad to.

An introduction to the ideas behind Russell is in Demers and Donahue,
"Data Types are Values", TOPLAS 7, 3 (1985), pp. 426-445.

A (the) Russell implementation is available by anonymous ftp from
arisia.xerox.com:~ftp/russell.tar.Z.  It includes some further documentation.

There is a paper by Jouvelot anf Gifford in the 1991 POPL Proceedings that
cites some of the earlier papers on FX.

None of these languages are purely functional.  They all contain assignment.
(I don't like purely functional languages either.)

As far as I know, these languages do not provide inheritance in the now
conventional sense.  (I know Russell does not.)  They often get a similar
effect by allowing the use of an algebra with many operations in a place
where few operations are called for.  But adding data fields (as opposed
to functions or methods) to existing algebra implementations can be a bit
clumsy.  (You can fake it by adding an extra field whose type is a parameter
to the implementation of the supertype, and instantiating that in the subtypes.)
There are ways to avoid this.  In the case of Russell, they were never
implemented.  I believe Cardelli's language "Quest" is flexible enough
to implement inheritance, and to express the above notions.  It is no doubt
described in a DEC SRC report, though I don't have the reference handy.


Hans
(boehm@xerox.com)

pallas@eng.sun.com (Joseph Pallas) (03/29/91)

In <1991Mar27.183206.5506@leland.Stanford.EDU>
hoelzle@leland.Stanford.EDU (urs hoelzle) writes:

>The problem is (unfortunately) harder than "const": if you have a generic
>type (say List), e.g.:

>type List[T] = 
>	......
>	procedure f(arg: T);

>then the above problem pops up, even if f doesn't change the List (e.g.
>"const arg: T" wouldn't change the problem a bit).  The problem is 
>contravariance: for a List[SubT], f's signature is "procedure taking a SubT",
>i.e. a more restricted type than T, violating contravariance.

We've strayed far afield from the topic of heterogeneous collections,
I'd say.  The problem of the relationship between List[T] and List[U]
when U is a subtype of T (which is definitely NOT a subtype
relationship) is actively being researched.  

This problem doesn't go away in dynamically typed languages.

joe