bertrand@eiffel.UUCP (Bertrand Meyer) (11/10/90)
From <1990Nov9.010930.26493@Neon.Stanford.EDU>, craig@Neon.Stanford.EDU (Craig D. Chambers) writes:

> [...]
> What are you saying? That you didn't make the comments that I
> attributed to you?

In mail to Mr. Chambers I listed the points on which I believe he attributed to me ideas which are the contrary of what I wrote, which is why I just cannot participate in that part of the debate (static typing). I'd like to confine any controversy to technical issues, rather than each person's (recursive) understanding of another's comments, so I'll stop here. By the way, it is quite clear that if Mr. Chambers did misunderstand my points that may be partially or even totally my fault - for not being clear enough.

On void references:

1. A void reference by any other name (nil, NONE, null object etc.) is still a void reference.

2.
> In Self we are rewriting our code to avoid using a
> generic nil object, always initializing variables to "real" objects.
> In some cases we have type-specific null objects that implement all
> the messages that should be understood by objects of the type,
> typically by doing nothing and terminating a recursion.

This is the worst idea I have heard in a long time (which means something). When you find that an operation is impossible to execute, you just ignore it! What about the specification - the contract - that your client expects?

In the end, THIS is the real difference. Of course Self, like everything else, has void references - they are just called null objects. Where Self, from this description, differs from Eiffel, is that an erroneous feature call will simply be ignored. This policy, of course, could have been implemented in Eiffel (as it could be in Simula or anything else) by accepting that x.f does nothing if x is void. But this is not acceptable! If there is an illegal call, we want it to signal failure by raising an exception - NOT to fake success!
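[To make the contrast concrete, here is a small Python sketch of the two policies just described - an editorial rendering, not from the original posts; all class and function names are invented for illustration.]

```python
# Hypothetical sketch of the two policies: a null object that silently
# swallows calls, versus an explicit void check that raises.

class Account:
    def __init__(self, balance):
        self.balance = balance
    def withdraw(self, amount):
        self.balance -= amount

class NullAccount:
    """Null-object policy: the call is understood but silently ignored."""
    def withdraw(self, amount):
        pass  # fakes success; the client's contract is not honored

def withdraw_checked(account, amount):
    """Void-reference policy: an illegal call signals failure."""
    if account is None:
        raise RuntimeError("feature call on void reference")
    account.withdraw(amount)
```

With NullAccount an erroneous call disappears without a trace; with the explicit check, the failure surfaces as an exception that a handler can deal with.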
Just to drive the point closer: if I have the call

    landing_gear.raise

in a flight control software system, then I certainly would check very carefully, by static means, that landing_gear will never be void (null, nil ...) when this is executed. Once I have indeed checked, if, for some reason (which can only be a bug, although if there is some fault-tolerance in the software a rescue mechanism may be available), landing_gear does happen to be void at run-time, is the proper reaction to do nothing and wait quietly?

``Ladies and Gentlemen, we are starting our final approach towards Stanford. For reasons too complicated to explain in detail, you may wish to fasten your seat belts. We hope that if your future plans call for air travel ...''

I believe that this is not proper. The proper reaction is to trigger an exception, with the hope, of course, that there is some exception handling code - even if all that code does is to flash a light on the pilot's control board.
-- Bertrand Meyer bertrand@eiffel.com
johnson@m.cs.uiuc.edu (11/11/90)
craig@Neon.Stanford.EDU (Craig D. Chambers) writes:

> In Self we are rewriting our code to avoid using a
> generic nil object, always initializing variables to "real" objects.
> In some cases we have type-specific null objects that implement all
> the messages that should be understood by objects of the type,
> typically by doing nothing and terminating a recursion.

to which bertrand@eiffel.com (Bertrand Meyer) replies:

> This is the worst idea I have heard in a
> long time (which means something). When you find that an operation
> is impossible to execute, you just ignore it! What about the
> specification - the contract - that your client expects?
> In the end, THIS is the real difference. Of course Self,
> like everything else, has void references - they are just called
> null objects. Where Self, from this description, differs
> from Eiffel, is that an erroneous feature call will simply
> be ignored.

This is not exactly the case. It is clear that Craig was not being clear, though that is not exactly his fault. I first heard this idea from Allan Wirfs-Brock about two or three years ago, and it took me a while to get it.

There are two different ways to use nil in Smalltalk. One is as the value of an unassigned variable. The other is as a generic "other" or "don't care" value. A common idiom in Smalltalk is

    x isNil ifFalse: [ x fee. x fi. x foe. x foo ]

nil doesn't understand fee, fi, foe, or foo, but if x is not nil then x will understand those messages.

Type-checking this code is tricky. One alternative is to say that nil is a subtype of every type. This implies that programs that have been type-checked could still send messages to objects that do not understand them. Another is to have the type-checking algorithm use flow analysis and case analysis (which is the alternative we chose in Typed Smalltalk). This makes type-checking expensive and hard to understand. The solution is to say that nil is used ONLY for undefined values.
"Don't care" values and "error" values should be user defined objects, defined specifically for a particular type. Thus, performing an operation on an "error" value raises an error, while performing an operation on a "don't care" object does nothing. Which one you use depends on which one you need. The result is that programs type-check and the type system has the property that programs that have been type-checked only send messages that are understood by their receivers. It might be that some checking is given up, since the programmer might mistakenly give a "don't care" object when an "error" object should have been given. However, the increase in the quality of the type-checking might overcome this. I would be happy to try this technique in Typed Smalltalk, and indeed we often use something like it in the code that we write. However, one of the goals of the Typed Smalltalk project is to handle typical Smalltalk code the way that it is written by typical Smalltalk programmers, and this is not yet the way most Smalltalk programmers program. Thus, we cannot depend on it in our type system. Ralph Johnson -- University of Illinois at Urbana-Champaign
bertrand@eiffel.UUCP (Bertrand Meyer) (11/12/90)
There is something strange in this discussion. In the messages by Ralph Johnson and Craig Chambers one gets the impression (please correct me if this is a wrong impression) that it is possible to guarantee statically that no error will ever occur at run-time. I don't see this as a fulfillable goal theoretically or practically, short of taking either or both of the following two measures:

1. Full-fledged program proving.
2. Unacceptable restrictions on the expressive power of the language (e.g. no references).

The aims of typing are more modest: guaranteeing statically that a certain class of run-time errors will never occur. Of course we should make that class as broad as possible.

Object-oriented technology is a particularly illuminating context for discussing these issues because in a pure object-oriented language such as Eiffel or Smalltalk, once you remove the supporting machinery, the analysis/design/programming language has essentially ONE operation: feature call, of the form

    x.f

Let us assume for a moment that f has no precondition (i.e. is always applicable to an instance of the class in which f was declared). Then the only problem of software reliability becomes:

/A/ Can we statically guarantee that for every possible execution of x.f at run time, x will be attached to an object capable of handling feature f?

Because answering this question positively would require either or both of 1. and 2. above and hence seems impossible, the most useful practical solution seems to be to settle for the question

/B/ Can we statically guarantee that for every possible execution of x.f at run time, if x is attached to an object, that object will be capable of handling feature f?

The answer in Eiffel to question /B/ is ``yes in principle; with Interactive's current compiler, yes in almost every case that counts'', and we are working to transform this into a ``yes'' without qualification.
[Anyone familiar with axiomatic semantics will recognize in the difference between /A/ and /B/ something similar to the difference between total and partial correctness.]

The idea behind /B/ is to cover statically as many as possible of the potential erroneous cases through type checking, and to relegate every other case to a single type of error, feature application to a void reference.

In practice, there remains another erroneous case, the one in which f does have an object to work on, but the object does not satisfy the precondition of f (e.g. f is extraction of the real square root and x is an object representing a real number). Then /B/ becomes:

/B'/ Can we statically guarantee that for every possible execution of x.f at run time, if x is attached to an object that satisfies the precondition of f, that object will be capable of handling f?

(Some practical cases, such as arithmetic overflow or operating system signals, are handled differently at the implementation level, although theoretically they too fall in this framework.)

The situation with respect to precondition violations is the same as for vacuity tests (checking whether x may be void): everyone would like to check statically, but that is impossible in the general case within the current state of program proving technology.

Of course, if we choose /B/ it is still essential to try to check statically, as extensively as possible, that x will always be non-void. In many cases this is possible or even trivial, as in

    x.Create; x.f ...

Similarly, if we consider /B'/ instead (which is necessary in practice) we will try to ascertain statically, in as many cases as possible, that all feature calls guarantee the corresponding precondition. However:

+ When it comes to static checking of both non-vacuity properties and preconditions, we have to accept that the techniques that work are NOT those commonly used for static type checking, and that, short of solving 1. and 2. above, there is no universal solution, only individual, specific and partial techniques.

+ Whenever we have less than perfect faith in the exhaustiveness of the static verification we may have performed, we need to rely on dynamic checking: for every execution of x.f that has not been proved correct, check at run-time that x is non-void and the associated object satisfies the precondition of f.

If dynamic checking does detect an anomaly, the only acceptable solution is to raise an exception, which is why it is so essential for any attempt at reliable object-oriented software to have a clean exception handling facility, tied in to the assertion mechanism.
-- Bertrand Meyer bertrand@eiffel.com
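[The dynamic-checking discipline sketched above can be illustrated in Python - an editorial example with an invented helper; Eiffel would express the precondition as an assertion on the feature itself:]

```python
import math

def checked_sqrt(x):
    """Run-time check for a call that has not been proved correct statically."""
    # non-vacuity check: is x attached to an object at all?
    if x is None:
        raise RuntimeError("feature applied to void reference")
    # precondition check, in the spirit of: require x >= 0
    if x < 0:
        raise ValueError("precondition violated: x >= 0 required")
    return math.sqrt(x)
```

Either anomaly surfaces as an exception that handling code (if any) can catch, rather than being silently ignored.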
craig@Neon.Stanford.EDU (Craig D. Chambers) (11/12/90)
In article <447@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:

> From <1990Nov9.010930.26493@Neon.Stanford.EDU>,
> craig@Neon.Stanford.EDU (Craig D. Chambers) writes:
>> In Self we are rewriting our code to avoid using a
>> generic nil object, always initializing variables to "real" objects.
>> In some cases we have type-specific null objects that implement all
>> the messages that should be understood by objects of the type,
>> typically by doing nothing and terminating a recursion.
>
> This is the worst idea I have heard in a
> long time (which means something). When you find that an operation
> is impossible to execute, you just ignore it! What about the
> specification - the contract - that your client expects?

OK. Let me try to be clear here. I'm not just calling nil a new name, and implementing messages that mask errors. That's clearly silly, and I try not to advocate silly things. Give me a little credit.

This business with type-specific null/empty objects is a useful technique for writing programs in a good object-oriented language; I had hoped it was a little more widely known. Let's consider a linked-list abstraction, implemented with cons-cell objects and some sort of end-of-list object. In most implementations of this abstraction, the end-of-list object is either a void reference (null pointer) (e.g. Eiffel) or a pointer to the generic nil object (e.g. Lisp, Smalltalk). The code for iterating down the list, starting at a cons-cell, typically tests whether the next element (cdr) is nil, and if so, returns. If the next element is non-nil, then the iteration recursively invokes the iteration function on the next element of the list. Here's some Smalltalk code for this function:

    "In class ConsCell"
    do: aBlock
        aBlock value: car.
        cdr isNil
            ifTrue: [ "do nothing" ]
            ifFalse: [ cdr do: aBlock ].

Most routines in the ConsCell class will have similar tests for nil of the cdr instance variable. Plus all users of a list will have to test for empty list (i.e. isNil) before sending a message like "do:" to the cons-cell. This is pretty ugly, and all the testing operations are contrary to the spirit of object-oriented programming.

A better way of implementing this list abstraction is to recognize that there are two representations of a list link (ConsCell and EndOfList) and that both have the same interface. EndOfList is an example of a type-specific null object, but it isn't just another name for the generic nil, since it is a perfectly reasonable representation of an instance of the list abstraction, namely, an empty list. Here's some Smalltalk code for this kind of implementation:

    "In class RevisedConsCell"
    do: aBlock
        aBlock value: car.
        "continue the recursion with the next list element"
        cdr do: aBlock.

    "In class EndOfList"
    do: aBlock
        "end of the list; do nothing"
        self.

Both RevisedConsCell and EndOfList are subtypes/subclasses/whatever of some List abstraction, but are different representations appropriate to different subsets of the instances of List. Since EndOfList is a perfectly reasonable List instance, understanding all the List protocol, neither the code in the RevisedConsCell class, nor the users of the list, need to check for some nil object. This is a more object-oriented way of implementing lists, and this same style works for other kinds of "behavior modes" that exist within an abstraction: expanded-vs.-iconified windows, empty-vs.-non-empty collections, etc.

To preempt the next comment/criticism of this scheme, I'll claim that it works well even for mutable objects, given the right language support. One problem with the EndOfList class above is that it's hard to add an element to an empty list in place, mutating an empty list into a non-empty list. Self offers a natural solution to this problem using dynamic inheritance. The list instance is represented by an empty object that is initially a *child* of the EndOfList object.
To allow the user to add an element to a (currently) empty list, the EndOfList object defines the "append:" method that creates a new ConsCell object containing the new element, and then switches the parent pointer of the child list instance to point to the new instance of ConsCell. No references to the list object need to be updated (i.e. no "become:" was used), but the list now behaves as if it were a singleton list object, since it inherits from the ConsCell object.

To find out more about dynamic inheritance and its usefulness for dynamically-changing behavior modes within an abstraction, you can read the paper "Organizing Programs without Classes" in the set of papers available via anonymous ftp from otis.stanford.edu. It is also scheduled to appear in an upcoming issue of "Lisp and Symbolic Computation" devoted entirely to Self.

I hope this discussion is a little more clear on what I mean by type-specific null/empty objects and why they are a better way of writing many kinds of abstractions.
-- Craig Chambers
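[Chambers's example, including the Self-style parent switch, can be approximated in Python - an editorial rendering, not from the posts; reassigning __class__ stands in roughly for switching the parent pointer:]

```python
# Hypothetical Python rendering of the type-specific empty-list object.
# EndOfList terminates the recursion; append mutates the empty list in
# place into a cons cell, loosely analogous to Self's dynamic inheritance.

class EndOfList:
    def do(self, f):
        pass  # end of the list; do nothing

    def append(self, x):
        # mutate this (currently empty) list into a cons cell in place;
        # no references to the list object need to be updated
        self.__class__ = ConsCell
        self.car, self.cdr = x, EndOfList()

class ConsCell:
    def do(self, f):
        f(self.car)
        self.cdr.do(f)  # no nil test needed: EndOfList handles "do"

    def append(self, x):
        self.cdr.append(x)
```

Neither "do" implementation tests for nil; the end-of-list case is handled by ordinary dynamic dispatch.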
craig@Neon.Stanford.EDU (Craig D. Chambers) (11/12/90)
In article <451@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:

> There is something strange in this discussion. In the messages
> by Ralph Johnson and Craig Chambers one gets the impression
> (please correct me if this is a wrong impression) that
> it is possible to guarantee statically that no error will
> ever occur at run-time.

You are so corrected. I'm not making such a claim (in fact, in my earlier posting I presented concrete examples that are extremely difficult to type check statically, let alone verify for "correctness"), and I'm certain that Ralph Johnson isn't making such a claim, either. Again, give us some credit.

> [stuff deleted]
>
> 1. Full-fledged program proving.
> 2. Unacceptable restrictions on the expressive power
>    of the language (e.g. no references).
>
> [stuff deleted]
>
>     x.f
>
> [stuff deleted]
>
> Then the only problem of software reliability becomes:
>
> /A/ Can we statically guarantee that for every possible execution
>     of x.f at run time, x will be attached to an object capable
>     of handling feature f?
>
> Because answering this question positively would require either
> or both of 1. and 2. above and hence seems impossible, the most
> useful practical solution seems to be to settle for the question
>
> /B/ Can we statically guarantee that for every possible execution
>     of x.f at run time, if x is attached to an object, that object
>     will be capable of handling feature f?

Why do you insist that void references are required? I've given you examples of languages that include neither void references nor generic nil objects (CLU, Trellis/Owl), and real programs can be written in them. Implementations of both are available on various platforms (although I think you probably have to pay money to DEC for Trellis/Owl). Can you present examples that *require* void references to be practical?
I can think of a few examples that without void references may be a little uglier or require a slightly more sophisticated type checker, but allowing void references throughout a program introduces a potential class of programming errors that is much more costly than a bit of complexity in a few instances.

My position is that static type checking may be useful for developing large, reliable systems. It is certainly not *the* solution, but may be one of several useful programming tools. I feel that static type checking, if it exists, should be as unrestrictive as possible *without* accepting without notice programs that contain what I would call type errors. [I know what I mean by "type error", but I don't have the time or inclination to define it completely here.] To me, this implies contravariant subtyping rules, support for multiple dispatching, and (as a last resort) the ability of the programmer to ignore compile-time type errors (of course the implementation will contain a run-time type test at every ignored type error). I consider void references to be an undesirable language feature, since they are difficult to check statically (thus reducing the reliability of programs) and offer little additional expressive power in return.

I agree with most people that subtyping is different from implementation inheritance, and that the two mechanisms should be linguistically distinct. This doesn't necessarily imply that classes cannot be interpreted as both types and chunks of code, but instead that at least two kinds of subclassing (subtyping and code inheritance) should be supported.

I hope that clears up any mistaken impressions of my position.
-- Craig Chambers
moss@cs.umass.edu (Eliot Moss) (11/14/90)
I have two things to add to this discussion:
(1) First, the idea of multiple representations for individual objects of the
same type, and specifically its application to situations such as the empty
list, has already appeared in the literature, so let's give some authors
credit:
@INPROCEEDINGS{LTP86,
AUTHOR = "Wilf R. {LaLonde} and Dave A. Thomas and John R. Pugh",
TITLE = "An Exemplar Based {Smalltalk}",
BOOKTITLE = oopsla,
ADDRESS = "Portland, OR",
YEAR = 1986,
MONTH = sep,
ORGANIZATION = acm,
PAGES = "322--330",
SERIES = sigplan,
VOLUME = "21(11)"
}
It does seem like a reasonable idea in that it cleans up code and takes
advantage of the implicit "type test" performed by the dynamic method lookup
mechanisms of OO languages. And it does so without violating static type
constraints. It is not clear if there is a performance difference between the
techniques. With nil you can detect end of list without having to follow that
last pointer (it has a distinguished magic value, frequently 0; in my
Smalltalk implementation nil is also a static constant rather than a pointer
to the Nil object precisely to speed up the code in a lot of places (we can
still send messages to it, etc.)); without nil you eliminate a number of
conditional checks (but will always have to do dynamic calls). A detailed
study of the relative performance of the techniques would be interesting; I am
not aware of its having been done.
(2) Since I was in on the design of CLU and also participated in Trellis/Owl,
perhaps I can add a little w.r.t. them as well. First, variables can
potentially be uninitialized; this is supposed to be detected by the run time
system. A compiler can, with simple data flow analysis, sort the variables
into three categories: definitely initialized before use; definitely *not*
initialized before use (produce an error message and code to raise an
exception if that point is reached in the program); and uncertain (requires
run time check and possible exception at run time). The uncertain category
will tend to be pretty rare if reasonable programming style is used. This does
not directly relate to the void pointer question, but I thought it might help
clarify things w.r.t. Bertrand's points about total/partial correctness
analogies, etc.
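[Moss's three-way classification can be sketched with a toy Python function - an editorial illustration with invented names, for the simplest interesting program shape (some straight-line code followed by a two-way branch, with the variable used afterwards):]

```python
# Toy sketch of the three categories: a variable is "definitely
# initialized" if assigned on every path before use, "definitely
# uninitialized" if assigned on no path, and "uncertain" otherwise.
# Each argument is the set of variables assigned in that region.

def classify(var, pre, then_branch, else_branch):
    assigned_then = var in pre or var in then_branch
    assigned_else = var in pre or var in else_branch
    if assigned_then and assigned_else:
        return "definitely initialized"
    if not assigned_then and not assigned_else:
        return "definitely uninitialized"
    return "uncertain"  # needs a run-time check and possible exception
```

Only the "uncertain" category requires the run-time check; with reasonable programming style it tends to be rare, as noted above.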
To do a linked list in CLU you use a "oneof" (or "variant"; the distinction is
not all that relevant here) type, which is similar to a variant record in
Pascal. Such an object has a tag, and there is a tagcase statement for testing
the tag, which allows you to interpret the contents of the oneof according to
its actual tag, in a type safe manner. This avoids the concept of a void
pointer and replaces it with the concept of a void case in an object format.
The CLU approach leads to a nicer type system, but the representation is not
as compact and the coding perhaps more verbose than if we had void pointers.
Some ML compilers are smart enough to use the actual void pointer
representation for their analog of "oneof" when there are only two cases, one of
them void. The exemplar approach does not exactly get around the data
compactness problem -- the "tag" is the type field in the object, used by the
run time for method lookup. Of course, if the type field *must* be there for
other reasons, then there is no *additional* overhead. The end of list object
must also be represented explicitly. But the main thing is that the code is
compact.
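[The oneof/tagcase style can be loosely rendered in Python - an editorial analogue with an invented representation; note that CLU's tagcase is checked statically, which plain Python is not:]

```python
# Loose Python analogue of a CLU "oneof" list type: each value carries
# a tag, and dispatch discriminates on the tag, tagcase-style. The
# "void" case is just an ordinary tagged case, not a null pointer.

EMPTY = ("empty",)

def cons(head, tail):
    return ("cons", head, tail)

def length(lst):
    tag = lst[0]               # the tagcase discriminates on the tag
    if tag == "empty":
        return 0
    elif tag == "cons":
        return 1 + length(lst[2])
    else:
        raise ValueError("unknown tag: %s" % tag)
```

The concept of a void pointer is replaced by a void *case* in the object format, exactly as described above.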
Personally, I would not call the end of list object a "don't care". Among
other things, it probably says nasty things if you send car or cdr to it, but
it *does* implement them, so it is type correct in the sense that the method
is guaranteed to exist. If one had full signatures including exceptions, as in
CLU, Trellis/Owl, Modula-3, etc., then car and cdr should be noted as possibly
raising an exception.
Well, I hope this was helpful to some of you anyway .....
--
J. Eliot B. Moss, Assistant Professor
Department of Computer and Information Science
Lederle Graduate Research Center
University of Massachusetts
Amherst, MA 01003
(413) 545-4206; Moss@cs.umass.edu
bertrand@eiffel.UUCP (Bertrand Meyer) (11/15/90)
Two more comments after Craig Chambers's and Eliot Moss's last postings.

1. The technique using two variants of list cell descriptions (two heirs to a class CELL or LINKABLE), one to represent internal cells and another to represent terminal cells, can be implemented in any object-oriented language. I do not find it particularly attractive:

A. It seems to require at least as much code as using a vacuity test.

B. One may also question the claim that it is ``more object-oriented''. Are we to understand that the conditional instruction is not object-oriented? This is going too far. Good object-oriented style implies getting rid of certain conditional instructions (those which discriminate among a number, often large, of data types, especially if the list of choices is bound to change). The philosophy that I see behind Mr. Chambers's method (although he did not express it in that way) is that whenever technically feasible an explicit test should be replaced by dynamic binding. I may be misrepresenting his position, in which case he'll correct me; if so, however, this means there is room for conditional instructions, and the discrimination between void and non-void references (or terminal and non-terminal cells) seems to be a legitimate case for such instructions.

C. The prospect of having to write TWO classes for every kind of cell (list element, tree node, ...) is rather unpleasant. Apparently this is easier in Self because it's not really classes but objects. But here I quit. Object-oriented programming as I understand it has little to do with objects; it is really about classes. Why one should try to have O-O Programming without classes, the major facility of the method, is beyond my understanding. I should probably have kept my mouth shut, but it's too late. (The term ``Class-based analysis, design and programming'' is what I personally would use to characterize the field.
By the way, I have never understood the rationale behind distinctions such as ``object-based'' vs. ``object-oriented''. If we are using English, it would seem that terms such as ``based'', ``oriented'' and the like should retain a meaning which is reasonably close to their usual sense, and technical terms such as ``object'' should be used in their accepted technical sense. What does the distinction between ``based'' and ``oriented'', if any, have in common with the distinction between encapsulation languages such as Ada, supporting a certain module structure, and approaches such as Smalltalk's or Eiffel's, supporting a module and typing structure based on classes? How do Ada ``objects'' differ from those of Pascal? In what way does the use of such terminology improve our understanding of the issues? But by now I have probably opened too many cans of worms anyway.)

2. On the topic of errors: It is nice to see that apparently everyone agrees that there may remain errors at execution-time. The next question is: which ones? I have described a framework (Eiffel's) in which, assuming a perfect compiler, only three kinds of errors may remain:

A. Attempt to call a feature on a void reference.
B. Assertion violation.
C. Hardware or operating system fault (e.g. arithmetic overflow).

In a theoretical study, case C would disappear, being covered by A and B. I believe this is clean, intellectually satisfying, practically convenient, and easy to explain. It is certainly not the only possible solution. But it is essential that anyone claiming that language or environment X ``solves the problems'' of errors of a certain type (e.g. vacuity of references) should also state what categories of possible run-time errors DO remain.
-- Bertrand Meyer bertrand@eiffel.com
craig@Neon.Stanford.EDU (Craig D. Chambers) (11/15/90)
In article <454@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:

> Apparently this is easier in Self because it's
> not really classes but objects.
> But here I quit. Object-oriented programming
> as I understand it has little to do with objects;
> it is really about classes. Why one should try to have
> O-O Programming without classes, the major facility of
> the method, is beyond my understanding. I should probably
> have kept my mouth shut, but it's too late.

I'm glad you didn't. There seems to be a common misconception that languages without classes cannot be used for normal kinds of object-oriented programming, in a "structured" way. This, I'd argue, is not the case.

The linguistic facility in many object-oriented languages called a "class" provides a number of features:

    Repository for shared behavior (e.g. inherited methods)
    Repository for shared state (e.g. class variables)
    Template for format of instances (e.g. instance variables)
    Object containing instance-independent behavior and state
        (e.g. class methods, class variables)

And for statically-typed languages:

    Specification of interface (e.g. signatures of methods, deferred methods)

All of these features are useful, and any reasonable object-oriented language should provide these mechanisms somehow. Class-based languages tie these features together into one all-consuming linguistic construct, the class. This in itself is not a real problem. The problem arises when one class is defined as a subclass of another. In most existing class-based languages, it is not possible to inherit only part of the aspects provided by the superclass; it is impossible in most class-based languages to inherit the behavior of a superclass (i.e. the methods) but not the representation. For example, suppose a new Rectangle class wants to inherit most of the implementation of the existing Polygon class, but replace the list-of-vertices representation with four integers.
In most existing class-based languages, the Rectangle class would be forced to accept the vertices instance variable as well. Trellis/Owl is an example of a class-based language that does not suffer from this problem (at least as documented in the manuals I have; perhaps the implementation is not this smart yet). Languages with extensive metaclass and reflective support, like CLOS, can probably solve this problem as well by doing extra programming at the metaclass level, but this seems a bit complicated for what should be a simple task.

Another example of two facilities being intertwined in some existing class-based languages is the inheritance of implementation and interface as a unit (i.e. subtyping = submoduling). The specification part of classes must be inherited if the behavior part is inherited, and vice versa. This issue has been beaten to death already.

In a classless language like Self, you'd use separate objects to provide each of the above facilities. One object holds shared behavior and shared state for an abstraction (called the traits object). Another object serves as a template for instances of the abstraction (called the prototype). Depending on the mood of the Self programmer, instance-independent behavior may be included in the traits object or moved out into a separate object. Objects are also used as name spaces, so that a simple message like "traits rectangle" will evaluate to the rectangle traits object. Object inheritance subsumes the subclassing hierarchy, the instantiation hierarchy, and even the variable name scoping rules. Since different objects support different aspects of classes, refining objects can inherit behavior without representation, or vice versa, by inheriting from only the appropriate objects. In Self defining two or three objects for what would normally be part of a single class declaration is a bit more verbose (by a few lines), but future classless languages need not have this problem.
People who've seen some of our talks on Self, or who have read some of our papers, sometimes feel like we've just re-invented classes and called them traits objects. In a way, they're right: traits objects are similar to class objects in that both fulfill the need to provide shared behavior. But classes bring along lots of other baggage (like representation) that may not always be needed or desired. What we've really done is to break up classes into several more primitive pieces that can be combined in useful ways that aren't possible in typical class-based languages.

My feeling is that "ideal" class-based languages and "ideal" classless languages turn out to be very similar. Ideal class-based languages would allow the different facilities provided by the class construct to be mixed and matched in subclasses (thus extending the idea of two kinds of inheritance links that I proposed earlier to allow inheritance of particular aspects of classes). Ideal classless languages might be able to define some objects as "abstract" (implying use as a traits object) and other objects as "concrete" (implying use as a prototype object); the syntax for defining these objects could be designed so that a single declaration would define several related objects. In the end, programs in the two kinds of languages would end up looking remarkably similar, and similar kinds of power and flexibility are afforded by both languages.

One remaining difference is that class-based languages include an instantiation hierarchy while classless languages do not; I'm not sure how important this distinction is. Another difference is that classless languages support inheritance from other objects (either abstract or concrete objects), while class-based languages only support inheritance from classes (analogous to inheriting from only abstract objects). This distinction may be more interesting, and in the end ideal classless languages may win out over ideal class-based languages for this very reason.
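[The traits/prototype split can be sketched in Python - an editorial approximation with invented names, using explicit delegation in place of Self's object inheritance:]

```python
# Hypothetical sketch: a traits object holds shared behavior only,
# while a prototype holds representation plus a link to its traits.

class RectangleTraits:
    """Shared behavior; no representation of its own."""
    def area(self, obj):
        return obj.width * obj.height

rectangle_traits = RectangleTraits()

class SelfObject:
    """An object with named slots and a parent link for inheritance."""
    def __init__(self, parent, **slots):
        self.parent = parent            # inheritance link to the traits object
        self.__dict__.update(slots)

    def send(self, selector, *args):
        # look the method up in the parent, run it on this object
        method = getattr(self.parent, selector)
        return method(self, *args)

# the prototype: representation plus a link to its traits object
rectangle_prototype = SelfObject(rectangle_traits, width=1, height=1)
```

New rectangles inherit behavior from the traits object without inheriting its (nonexistent) representation; the two roles of a class are held by two separate objects.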
-- Craig Chambers P.S. A less speculative paper on this topic, "Organizing Programs without Classes," is available via anonymous ftp from otis.stanford.edu.
ddean@rain.andrew.cmu.edu (Drew Dean) (11/16/90)
From a relative neophyte: if a subclass only inherits parts of its superclass (i.e. some subset of the methods and/or variables), how can an instance of the subclass be used anywhere an instance of the superclass can be used? In my (comparably primitive) understanding of OOP, this is important. Without this property, what good are abstract classes, which seem to be a useful organizational tool? Drew Dean Drew_Dean@rain.andrew.cmu.edu
barmar@think.com (Barry Margolin) (11/16/90)
In article <1990Nov15.011702.25087@Neon.Stanford.EDU> craig@Neon.Stanford.EDU (Craig D. Chambers) writes: > In most existing class-based languages, it is not >possible to inherit only part of the aspects provided by the >superclass; it is impossible in most class-based languages to inherit >the behavior of a superclass (i.e. the methods) but not the >representation. While this may be strictly true, it is easily solved in class-based languages with multiple inheritance, such as CLOS. In such languages you can put the representations in one set of classes and the behavioral implementations in another set of classes. You would then define new classes that inherit one from column A and one from column B (am I coining a new metaphor, related to Flavors' use of "mixins"?), and only make instances of these classes. Implementing things this way requires extreme care. Normally, the code implementing the behavior of a class likes to have direct access to the representation of the class. In many OO languages the components of the representation ("slots" in CLOS, "instance variables" in Flavors and Smalltalk, "members" in C++, etc.) can be accessed directly as variables in the methods (in CLOS this requires use of the WITH-SLOTS macro, as methods are not attached to a single class so there is no built-in notion of "this" or "self"). In order to cope with the flexibility of the Chinese Menu style of OO programming, you must take as much care in designing the internal interfaces of the classes as you do in designing the external interfaces. Separating out the interface specification can be done similarly, but it may require extra work. There can be a third set of classes that actually provide the externally visible interfaces. The interfaces to the above two sets of classes would not be available outside these three sets of classes (either using the language's visibility mechanisms or simply by adhering to naming conventions, e.g. 
calling the external interface FOO and the internal one FOO-INTERNAL). A particular specification class would only provide the interfaces that are part of the specification it models, but its implementations would simply pass the call on to the corresponding internal interface. In languages with explicit abstract superclasses you may be able to avoid the dual interfaces, and simply specify the interface in the abstract superclasses. Either way, the specification classes become column C in the class menu. -- Barry Margolin, Thinking Machines Corp. barmar@think.com {uunet,harvard}!think!barmar
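[Editorial sketch: Margolin's three-column scheme can be rendered in Python standing in for CLOS. Representation goes in one class (column A), behavior written against an agreed internal interface in another (column B), and a specification class that simply forwards external calls in a third (column C); only the combined class is instantiated. All class and method names here are invented for illustration.]

```python
class PointRep:                      # column A: representation only
    def __init__(self, x=0, y=0):
        self._x, self._y = x, y

class MovableBehavior:               # column B: behavior coded against an
    def move_internal(self, dx, dy): # agreed-upon internal interface
        self._x += dx
        self._y += dy

class MovableSpec:                   # column C: the external interface;
    def move(self, dx, dy):          # its implementation simply passes the
        self.move_internal(dx, dy)   # call on to the internal one

# Only this combined ("one from each column") class is ever instantiated.
class Point(MovableSpec, MovableBehavior, PointRep):
    pass

p = Point(2, 3)
p.move(1, -1)
print(p._x, p._y)   # 3 2
```

The care Margolin calls for shows up in the underscore convention: MovableBehavior relies on PointRep's _x/_y names, so that internal interface must be designed as deliberately as the external one.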
sakkinen@tukki.jyu.fi (Markku Sakkinen) (11/16/90)
In article <11113@pt.cs.cmu.edu> ddean@rain.andrew.cmu.edu (Drew Dean) writes: >From a relative neophyte, Keep on asking these questions! I suppose that the object-oriented people (like many others) live under a small but constant threat of "the emperor's new clothes" (H.C. Andersen). > if a subclass only inherits parts of its superclass >(ie. some subset of the methods and/or variables), how can an instance >of the subclass be used anywhere an instance of the superclass can be used ? >In my (comparably primitive) understanding of OOP, this is important. Without >this property, what good are abstract classes, which seem to be a useful >organizational tool ? Such partial inheritance can only be used for inheriting implementation (what I called "incidental inheritance" in my ECOOP'89 paper), thus more typically with concrete than abstract superclasses. An instance of the subclass can _not_ be substituted for an instance of the superclass. In my opinion, inheriting only parts of a superclass, say C, into a subclass, say D, is a highly dubious idea even for this purpose. One should rather have defined a fully-fledged class B containing those parts first, then both C and D as subclasses of B. It would be nice to have the kind of generalisation facilities that Claus Pedersen suggested in his OOPSLA'89 paper: that one could derive _super_classes approximately as easily as existing languages allow us to derive _sub_classes. Markku Sakkinen Department of Computer Science and Information Systems University of Jyvaskyla (a's with umlauts) Seminaarinkatu 15 SF-40100 Jyvaskyla (umlauts again) Finland SAKKINEN@FINJYU.bitnet (alternative network address)
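[Editorial sketch: Sakkinen's recommendation — factor the shared parts into a fully-fledged class B rather than letting D inherit only some parts of C — might look like this in Python. The class and method names are invented for illustration.]

```python
class B:                       # the shared parts, made explicit
    def shared(self):
        return "shared"

class C(B):                    # the original class, now a subclass of B
    def c_only(self):
        return "c-specific"

class D(B):                    # inherits exactly the shared parts, with
    def d_only(self):          # full substitutability for B preserved
        return "d-specific"

# Any C or D instance can now stand in wherever a B is expected.
def use(b: B) -> str:
    return b.shared()

print(use(C()), use(D()))   # shared shared
```

Unlike partial inheritance from C, this factoring keeps the substitution property Dean asked about: every instance of D really is usable as a B.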
pcg@cs.aber.ac.uk (Piercarlo Grandi) (11/19/90)
On 15 Nov 90 16:22:19 GMT, ddean@rain.andrew.cmu.edu (Drew Dean) said: ddean> From a relative neophyte, if a subclass only inherits parts of ddean> its superclass (ie. some subset of the methods and/or variables), ddean> how can an instance of the subclass be used anywhere an instance ddean> of the superclass can be used ? In my (comparably primitive) ddean> understanding of OOP, this is important. Well, this type of polymorphism does not arise in conventional OO languages, where the algebra of interface and implementation reuse is just inheritance via prefixing, which links the two together, and only for supersetting, not for subsetting -- often the reason for this is, more or less explicitly, precisely to avoid this problem. But if you look at less restrictive languages you see that the algebras of interface and implementation reuse can be considerably more flexible, and less interlinked; in another article Barry Margolin points out that the problem is... barmar> ... easily solved in class-based languages with multiple barmar> inheritance, such as CLOS. In such languages you can put the barmar> representations in one set of classes and the behavioral barmar> implementations in another set of classes. What he really means is that in CLOS (like in some sense in Smalltalk) you have a functional interface to both methods and attributes, and you can obtain a new class interface by merging the interfaces (or subsets thereof) of one (or more) class that gives you the attributes interface with one (or more) that gives you the methods interface. The problem, for data traits or attributes, is naturally how much of the (name,offset,type) table describing the components of an object's state we need to keep at runtime. 
At one extreme, in C and C++ 1.x and similar languages all attributes have characteristics known at compile time, even if their number may change (a subclass of A will have more attributes, but you are forbidden from accessing these, except via virtual functions, which indeed are accessed via a table). With virtual multiple inheritance, as in C++ 2.x (or Eiffel), some attributes have offsets that are variable at run time, so we need to store the offset at run time; in some languages the *type* of an attribute is not known until run time, as in Smalltalk. In the fully general case the state of an object can be dynamic in all three dimensions; the number of attributes may be variable, their offsets (representations!) may vary, and their types too. Now consider the domain of applicability (or polymorphism) of a method; a method may depend at the very least on the names of the attributes it uses in the object, or also on their offsets (representations!), or also on their types. The most extreme cases can be easily implemented; if the set of attributes of an object is known at compile time and so are their offsets (the C/C++ 1.x case above) we do not need any runtime table; we can match a method against an object by comparing their interfaces in the compiler. If nothing is known at compile time, the solution is equally easy; we associate a full (name,offset,type) table to each object, and we match it with that of the method and its implementation at runtime. A method (or its implementation) can be applied to any object whose interface is 'stronger' than the one it has been defined for. Whether it is a mistake to associate a stronger implementation with a weaker method interface is largely a matter of taste. On the other hand, applying methods with a stronger interface to objects with a weaker one is a partial function, and its validity must be determined at runtime, just as applying a method with a stronger specification to an object with a weaker one. 
Example of the latter: applying 'factorial' to a numeric object that is not guaranteed positive and integral. Example of the former: applying 'square(int)' to a numeric object that is declared as 'union(int,float)'. The runtime table based solution allows for easy mix and match of interfaces, by mixing and matching the tables; but, especially for attribute access, a table lookup on every use is not attractive, except to AI people, who have been doing this forever (alists, frames, ...), also because in their applications a trait which is an attribute may be *dynamically* reimplemented as a function, or vice versa, and so table/alist based approaches are nearly essential. But in many cases the full generality is not necessary, even if static foreknowledge is not possible either. Implementation techniques for the intermediate cases are being researched. Things like Self and automatically typed Smalltalk are examples of what can be done to reduce the cost of full generality when it can be constrained either statically (at least part of the table for an object can be deduced at compile time) or dynamically (it is expected that the contents of the table do not change frequently). There are other tricks; I am terribly sorry that I do not remember the exact reference, but in a 1989 SIGPLAN conference on languages and their implementation there is a paper on laying out different object types in memory so that subobjects that have the same name and type in all those object types are at the same offset relative to the object pointer. This naturally means that the access functions to the subobjects are constant and known at compile time and are the same for all those object types. The goal is to minimize holes, of course; we can make the layouts of two classes compatible in the sense of having attributes of the same name and type at the same offset by creating holes, but this is hardly desirable. 
In the simple inheritance case implemented with prefixing, avoiding this is trivial; with multiple prefixing things become more difficult (C++ has to resort to indirection via hidden pointers for virtual base classes). The paper I mention above demonstrates that a fairly simple trick, using postfixing (putting subobjects at negative offsets from the object pointer), can solve the problem in many interesting cases with greater efficiency, that is, no holes and no extra levels of indirection. ddean> Without this property, what good are abstract classes, which seem ddean> to be a useful organizational tool ? Abstract (deferred) classes are no good of course. They are just a hack to decouple a bit the algebra of interfaces from that of implementations, which are otherwise coupled in inheritance. The best thing is instead to uncouple them completely, and have different algebras for implementations and interfaces, and to have more general algebras as well, not just supersetting/subsetting. As to this I would like to point you also to some work in the software engineering reuse field on interface and component description languages, which allow you to describe how to mix and match interfaces (module A has an interface that is the union of those of modules B and C minus that of D). You could contact Bob Gautier (rjg%uk.ac.aber.cs@nsfnet-relay.ac.uk), who has done some work on this issue and on CDL, an interface description language, which I think is interesting. Some other interesting work is being done by the functional language people when they try to define clean type systems. Polymorphism is very important to them, and they have put a lot of effort into dealing with it. -- Piercarlo Grandi | ARPA: pcg%uk.ac.aber.cs@nsfnet-relay.ac.uk Dept of CS, UCW Aberystwyth | UUCP: ...!mcsun!ukc!aber-cs!pcg Penglais, Aberystwyth SY23 3BZ, UK | INET: pcg@cs.aber.ac.uk
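[Editorial sketch: Grandi's fully dynamic case — a per-object (name,offset,type) table matched at runtime against the interface a method was defined for — can be approximated in Python. Offsets are omitted, since Python attributes are already looked up by name; all helper names are invented for illustration.]

```python
def interface_of(obj):
    """The runtime (name, type) table for an object's attributes."""
    return {name: type(value) for name, value in vars(obj).items()}

def applicable(required, obj):
    """A method applies to any object whose interface is at least as
    'strong' as the one required: a superset with matching types."""
    table = interface_of(obj)
    return all(name in table and issubclass(table[name], ty)
               for name, ty in required.items())

def apply_method(required, body, obj):
    if not applicable(required, obj):
        raise TypeError("object interface too weak for this method")
    return body(obj)

class Anything:                     # objects are just bags of attributes
    def __init__(self, **attrs):
        self.__dict__.update(attrs)

area = ({"width": int, "height": int},     # the interface the method needs
        lambda o: o.width * o.height)      # its implementation

rect = Anything(width=3, height=4, colour="red")   # stronger interface: ok
point = Anything(x=0, y=0)                         # weaker interface: rejected

print(apply_method(*area, rect))    # 12
try:
    apply_method(*area, point)      # partial function: fails at runtime
except TypeError as e:
    print("rejected:", e)
```

This also shows Grandi's cost concern concretely: every application pays for a table construction and comparison, which is exactly what the static and intermediate techniques he cites try to avoid.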
davis@barbes.ilog.fr (Harley Davis) (11/20/90)
In article <454@eiffel.UUCP> bertrand@eiffel.UUCP (Bertrand Meyer) writes:
2. On the topic of errors: It is nice to see that apparently
everyone agrees that there may remain errors at execution-time.
The next question is: which ones? I have described a framework
(Eiffel's) in which, assuming a perfect compiler, only three kinds
of errors may remain:
A. Attempt to call a feature on a void reference.
B. Assertion violation.
C. Hardware or operating system fault (e.g. arithmetic overflow).
"Assertion violation" is a little vague as a type of error. What
doesn't it cover? In particular, why can't it include type assertions
(or other static errors normally caught by the perfect compiler) and
void references?
-- Harley Davis
--
------------------------------------------------------------------------------
Harley Davis internet: davis@ilog.fr
ILOG S.A. uucp: ..!mcvax!inria!ilog!davis
2 Avenue Galliéni, BP 85 tel: (33 1) 46 63 66 66
94253 Gentilly Cedex
France
bertrand@eiffel.UUCP (Bertrand Meyer) (11/20/90)
From <DAVIS.90Nov19190243@barbes.ilog.fr> by davis@barbes.ilog.fr (Harley Davis): > "Assertion violation" is a little vague as a type of error. What > doesn't it cover? An assertion violation is a routine precondition that is not satisfied on routine entry, a routine postcondition not satisfied on routine exit, or a class invariant not satisfied in either case. (There are also other cases involving loop invariants and check instructions.) This is discussed in detail in various references (``Object-Oriented Software Construction'', chapter 7; ``Eiffel: The Language'', chapters 12 and 13; ``Programming by Contract''). On the other hand, since I am the author of these references, I am not the best judge of just how ``vague'' they are. > In particular, why can't it include type assertions > (or other static errors normally caught by the perfect compiler) > and void references? If some condition can be expressed as a type constraint and caught statically we don't need assertions to express it. (There was a comment of a similar nature in a recent message by LaPalice@tauto.fr, although I can't find the exact reference.) -- -- Bertrand Meyer bertrand@eiffel.com
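[Editorial sketch: Meyer's taxonomy — precondition checked on entry, postcondition on exit, class invariant around both — can be written out by hand in Python. Eiffel checks these automatically; the Account class and its names are invented for illustration. The key point from the earlier landing-gear discussion is that an unsatisfied assertion raises an exception rather than faking success.]

```python
class AssertionViolation(Exception):
    pass

class Account:
    def __init__(self, balance=0):
        self.balance = balance
        self._check_invariant()

    def _check_invariant(self):
        if self.balance < 0:                       # class invariant
            raise AssertionViolation("invariant: balance >= 0")

    def withdraw(self, amount):
        if not (0 < amount <= self.balance):       # precondition
            raise AssertionViolation("precondition: 0 < amount <= balance")
        old = self.balance
        self.balance -= amount
        if self.balance != old - amount:           # postcondition
            raise AssertionViolation("postcondition: balance decreased")
        self._check_invariant()

a = Account(100)
a.withdraw(30)
print(a.balance)        # 70
try:
    a.withdraw(1000)    # precondition fails: an exception, not silence
except AssertionViolation as e:
    print("caught:", e)
```

Davis's question then becomes visible in the code: a type constraint like "amount is a number" never reaches these checks in Eiffel because the (perfect) compiler rejects it statically, which is exactly Meyer's point about not expressing statically checkable conditions as assertions.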
johnson@m.cs.uiuc.edu (11/21/90)
pcg@cs.aber.ac.uk said: >Abstract (deferred) classes are no good of course. They are just a hack >to decouple a bit the algebra of interfaces from that of >implementations, which are otherwise coupled in inheritance. I am shocked! What can this mean? Abstract classes are one of the most important design techniques in object-oriented programming! Reading on, I decided that Piercarlo and I have different definitions of abstract classes. I think that he defines an abstract class as one that has NO implementation, in which every method is deferred. However, my definition of an abstract class is that it is a class that does not have a COMPLETE definition, in which there is SOME method that is deferred. Abstract classes according to my definition are a crucial technique for reusing classes. An abstract class is a template for its subclasses, and replaces other ways of reusing design such as code skeletons. I agree with Piercarlo that abstract classes should not be used just to represent types (though of course if you are programming in C++ then you have no other choice). Thus, we probably don't really disagree. Ralph Johnson -- University of Illinois at Urbana-Champaign
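[Editorial sketch: Johnson's definition — a class with SOME deferred method, carrying real reusable implementation as a template for its subclasses — can be shown directly in Python. The Document example is invented for illustration.]

```python
from abc import ABC, abstractmethod

class Document(ABC):
    def publish(self):            # concrete template method, reused as-is
        return "<doc>" + self.render() + "</doc>"

    @abstractmethod
    def render(self):             # the one deferred method
        ...

class TextDocument(Document):
    def __init__(self, body):
        self.body = body

    def render(self):             # the subclass completes the template
        return self.body

print(TextDocument("hello").publish())   # <doc>hello</doc>
# Document() itself cannot be instantiated: it is incomplete by design.
```

Under Grandi's narrower definition the interesting part of Document would be empty; under Johnson's, publish is precisely the reusable design the abstract class exists to carry.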
davidm@uunet.UU.NET (David S. Masterson) (11/25/90)
>>>>> On 20 Nov 90 06:45:55 GMT, bertrand@eiffel.UUCP (Bertrand Meyer) said:
Bertrand> An assertion violation is a routine precondition that is not
Bertrand> satisfied on routine entry, a routine postcondition not satisfied on
Bertrand> routine exit, or a class invariant not satisfied in either case.
Bertrand> (There are also other cases involving loop invariants and check
Bertrand> instructions.)
Hmmm. A question from relational database constraint resolution.
Within a relational database, one can define constraints that updates to the
database must satisfy before the update is allowed. Might this be considered
an assertion (most likely a routine postcondition)?
If so, it's possible to define constraints on two (or more) different tables
(objects?) that are mutually exclusive (updating any of the tables violates a
constraint). I can't think of an example at the moment of when this would
occur (it's usually rather complex), but the resolution of the problem is to
check constraints at transaction commit time (when all updates are done)
rather than after every update. The user of the objects makes the
determination whether to consider each update atomic or take all updates
together as atomic.
Would this be expressible under assertion methodologies (control of assertion
checking being outside of the object) or is this something that is just
avoided in development of objects?
--
====================================================================
David Masterson Consilium, Inc.
(415) 691-6311 640 Clyde Ct.
uunet!cimshop!davidm Mtn. View, CA 94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"
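[Editorial sketch: Masterson's alternative — checking constraints once, at transaction commit, rather than after every update — can be sketched in Python. The Transaction machinery and the balancing constraint are invented for illustration; the point is that two updates which individually violate a cross-table constraint are validated together.]

```python
class ConstraintViolation(Exception):
    pass

class Transaction:
    def __init__(self, tables, constraints):
        self.tables = tables            # the objects being updated
        self.constraints = constraints  # checked only at commit time
        self.updates = []

    def update(self, fn):
        self.updates.append(fn)         # no per-update checking

    def commit(self):
        for fn in self.updates:         # apply all updates...
            fn(self.tables)
        for check in self.constraints:  # ...then validate them together
            if not check(self.tables):
                raise ConstraintViolation("constraint failed at commit")

# A constraint spanning two tables: their totals must stay equal.
balanced = lambda t: sum(t["a"]) == sum(t["b"])

txn = Transaction({"a": [10], "b": [10]}, [balanced])
txn.update(lambda t: t["a"].append(5))   # would violate if checked now...
txn.update(lambda t: t["b"].append(5))   # ...but this restores the invariant
txn.commit()                             # both updates checked as one unit
print("committed")
```

In Meyer's terms this moves the postcondition from each update routine to the commit routine, which is one plausible answer to Masterson's question: the object defining the transaction, not each updated object, owns the assertion.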