[comp.object] Types vs. Classes

stt@inmet.inmet.com (09/25/90)

I would like to begin a discussion on the distinction,
if any, between "types" and "classes".  There seems to be
no consistent distinction made between these two in the OOP literature.  
I have seen a type considered an implementation of a class, a class
as an implementation of a type, the two considered
synonyms, etc.  

Here is a proposed distinction:

1) Each value has a *unique* "type"
2) A value may be a member of one or more "class"es, determined
   by the type of the value.

Essentially, the type structure is considered a "partitioning"
of the value space (i.e. types are non-overlapping), 
whereas the class structure is a taxonomy (i.e. classes overlap
and enclose one another).

A class comprises a (potentially open-ended) set of types, 
and a subclass is a subset of a class, comprising a subset 
of the types in the superclass.  
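
A minimal sketch of this proposal, in Python purely for illustration
(the thread itself uses no Python; Shape, Polygon, Square,
types_in_class and smallest_class are hypothetical names).  It assumes
that "type" can be modelled by Python's exact type(value) and that a
"class" can be modelled as the set made up of a root type plus
everything derived from it:

```python
class Shape:
    """Root of a small example taxonomy."""


class Polygon(Shape):
    pass


class Square(Polygon):
    pass


def types_in_class(root):
    """Return the 'class' rooted at `root`: root plus every type derived from it."""
    members = {root}
    for sub in root.__subclasses__():
        members |= types_in_class(sub)
    return members


def smallest_class(value, roots):
    """Among the classes rooted at `roots`, pick the smallest one containing type(value)."""
    containing = [r for r in roots if type(value) in types_in_class(r)]
    return min(containing, key=lambda r: len(types_in_class(r)))
```

Under these assumptions a Square value has exactly one type (Square)
but is a member of three nested classes, and the class rooted at
Polygon is a strict subset of the class rooted at Shape.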

In formal terms, it is meaningless to ask "what is the class" of
a given value.  However, this can be reinterpreted as
"what is the smallest class which contains the type" of a given value.

-------------------------
I prefer these definitions, because I think they are consistent
with the use of the term "type" in strongly-typed non-OOP languages,
and consistent with the use of the term "class" in OOP languages.

Actually, I think OOP languages tend to be a bit schizophrenic
about the term "class" already, since the term is used both
for parameters/references which may in fact refer to a value
in any subclass of the class, and to identify *the* class of a value (which
only makes sense if you use the interpretation suggested above
of "smallest class which contains the value").

Comments, flames, etc. welcome...

S. Tucker Taft      stt@inmet.inmet.com   uunet!inmet!stt
Intermetrics, Inc.
Cambridge, MA  02138

gilstrap@aslan.sbc.com (Brian R. Gilstrap) (09/27/90)

In article <60700001@inmet>, stt@inmet.inmet.com writes:
|> I would like to begin a discussion on the distinction,
|> if any, between "types" and "classes".  There seems to be
|> no consistent distinction made between these two in the OOP literature.  
|> I have seen a type considered an implementation of a class, a class
|> as an implementation of a type, the two considered
|> synonyms, etc.  
|> 
|> Here is a proposed distinction:
|> 
|> 1) Each value has a *unique* "type"
|> 2) A value may be a member of one or more "class"es, determined
|>    by the type of the value.

What he continues with does not necessarily jibe with how I think of
things (could simply be that I didn't understand what he was saying).
So, in order to stir up discussion and perhaps have some misconceptions
of my own corrected, I'll give this a stab.

Every object conforms to one or more abstract types.  For example, a given
object might conform to the type "File", where "File" has been
defined as a set of signatures, invariants, etc. (i.e. an abstract type).
Note that there can be "anonymous" abstract types.  That is, there may
be an object which conforms to an abstract type, but we have not given that
abstract type a name.

However, we now get to the big split.  Some OOPLs, like Eiffel for example,
make class hierarchies conform to type restrictions.  For example, a
sub-class of a class conforming to type "File" would be required to satisfy
the requirements for being a "File".  The sub-class might satisfy even more
stringent requirements, but it must at least be a "File".

Other OOPLs, like Objective-C, use the class hierarchy primarily as a means
of code-reuse.  In these cases, the "every sub-class must conform to the
type requirements of all its super-classes" idea is not pursued.  That is,
my "NeatoFile" class would inherit from my "File" class because it saves me
a lot of implementation time.  However, a "NeatoFile" object might not behave
to an outside observer in the same manner as a "File" object, even if the
outside observer only invoked behaviors possessed by "File".  This means that
programming with these languages still involves (perhaps implicit) abstract
types.  That is, at any point in the program the programmer will be willing
to say "I expect every object used here will support at least the following
behaviors with the following arguments and return types".  This is, of course,
an abstract type.  Of course, it is rare that things are made this explicit,
and the programmer usually thinks "I expect every object used here will
be a-kind-of ______".
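
The "NeatoFile" situation can be sketched as follows, in Python purely
for illustration.  Only the names File and NeatoFile come from the
post; Readable, read and read_twice are hypothetical, and the consuming
read() is just one assumed way a subclass might reuse code while
behaving differently to an outside observer:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Readable(Protocol):
    """The implicit abstract type: 'every object used here supports read()'."""

    def read(self) -> str: ...


class File:
    def __init__(self, text):
        self._text = text

    def read(self):
        # Reading leaves the contents in place.
        return self._text


class NeatoFile(File):
    """Inherits from File to reuse its storage, not to behave like it."""

    def read(self):
        # Assumed difference: reading consumes the contents.
        text, self._text = self._text, ""
        return text


def read_twice(source: Readable):
    """A client that relies only on behaviors possessed by File."""
    return source.read(), source.read()
```

Both classes pass the signature-level check, yet read_twice sees
different results for each, which is the gap between inheriting
implementation and conforming to an abstract type.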

It is still possible to say that every class conforms to some abstract type,
regardless of whether it is "strongly" or "weakly" typed.  It is just that
in the "strongly" typed languages the typing information is used to (1)
require that no attempt is made to invoke a behavior which is not supported
by the abstract type, and (2) to aid in efficiently translating the program
into machine code by allowing various kinds of optimizations.  In the "weakly"
typed languages, type checks are left to runtime and abstract type 
information is not used in optimizing the program to machine code.


Where does this leave us?  Good question.  If someone can come up with a
succinct way of putting all that, I'd be very interested in seeing it.


And please, let's not turn this into another "strongly typed" versus
"weakly typed" OOPLs war.


Brian R. Gilstrap
gilstrap@aslan.sbc.com
gilstrap@swbatl.sbc.com
...!{texbell,uunet}!swbatl{!aslan}!gilstrap

render@cs.uiuc.edu (Hal Render) (09/28/90)

In article <60700001@inmet> stt@inmet.inmet.com writes:
>I would like to begin a discussion on the distinction,
>if any, between "types" and "classes".  There seems to be
>no consistent distinction made between these two in the OOP literature.  
>I have seen a type considered an implementation of a class, a class
>as an implementation of a type, the two considered
>synonyms, etc.  
>
>Here is a proposed distinction:
>
>1) Each value has a *unique* "type"
>2) A value may be a member of one or more "class"es, determined
>   by the type of the value.
>
>Essentially, the type structure is considered a "partitioning"
>of the value space (i.e. types are non-overlapping), 
>whereas the class structure is a taxonomy (i.e. classes overlap
>and enclose one another).

The problem with your definition is that it does not match my understanding
of the use of the term "type" even in non-OO languages.  My informal understanding
of a type is that of a set of values and a set of operations defined
on those values.  For a structured type (a type that has values of other
types as elements arranged in some fashion) there is also some expression
of the relationship between the constituent types.  This does not preclude
having overlapping types.  Intuitively, the numeric quantity represented by 
the numeral 5 is both an integer value and a real value.  Does this not mean
that 5 is a member of two different types?  If not, I'd like to know why
not, since it violates my understanding of types as expressed in Pascal,
Ada, and other strongly-typed programming languages. 

>
>A class comprises a (potentially open-ended) set of types, 
>and a subclass is a subset of a class, comprising a subset 
>of the types in the superclass.  
>
>In formal terms, it is meaningless to ask "what is the class" of
>a given value.  However, this can be reinterpreted as
>"what is the smallest class which contains the type" of a given value.

In most OO languages that use classes, the class of an object
is understood to be the smallest class that contains the object, 
so your point seems rather moot.

>I prefer these definitions, because I think they are consistent
>with the use of the term "type" in strongly-typed non-OOP languages,
>and consistent with the use of the term "class" in OOP languages.

As I've said, if types are non-overlapping then we have to throw
out any language that allows the definition of subranges and subtypes
such as Pascal, Modula-2, Ada, and others.  This seems a rather
arbitrary (and unnecessary) restriction.

>Actually, I think OOP languages tend to be a bit schizophrenic
>about the term "class" already, since the term is used both
>for parameters/references which may in fact refer to a value
>in any subclass of the class, and to identify *the* class of a value (which
>only makes sense if you use the interpretation suggested above
>of "smallest class which contains the value").

I've never seen a person refer to a variable as a class (which you seem
to say) although one can constrain a variable/parameter/reference object
to only contain or reference an instance of a specific class in some
languages.  With such mechanisms, the difference between a class and a type
is admittedly blurred.

There is a difference of opinion on what a class is, since most definitions
are operational, i.e. they are drawn from an implementation of a particular
OOPL.  This is where the problem lies, I think.  A "type" has a widely
accepted mathematical definition (or so I understand from my PL theory 
friends) but none yet exists for "class."  This may be good, however, since
it allows OOPL developers room to explore.  I think a consensual definition 
of "class" will come in time, but I think we're still in the midst
of discovering what we want and need it to mean given how we use it.

hal.

johnson@m.cs.uiuc.edu (09/28/90)

This is a continuation of the discussion of the difference between
types and classes in an object-oriented language.

First, let me talk about the difference between specification and
implementation.  A class definition contains a little of both,
since an important part of the specification of an object is the
set of operations that can be performed on it.  Most statically-typed
object-oriented languages equate them, e.g. C++, Eiffel, Trellis/Owl.
In these languages, a subclass inherits both the implementation and
the specification of its superclasses.

In general, there can be many implementations of a specification
and a particular implementation can match many specifications.
Moreover, a slight change to a particular implementation can completely
change the specification.  Thus, it seems to me that it is better to
separate specifications and implementations, and not to try to have
a single language construct that is used to describe both.

Assuming that you buy my argument, which of "type" and "class" should
be used to refer to implementation and specification?  This is my
interpretation of the original question.

What do we mean when we say that a language is statically typed?
I could say that this means that programs are type-checked at
compile-time, not run-time, but this just begs the question, because
what does type-checking mean?  What are we checking when we check
types?  Are we checking implementations or specifications?

It seems clear to me that we are checking specifications.  If I
type-check a Smalltalk expression (x + y) then it means that the
value of x is an object that understands the + message and that
the type of the argument of the method that is invoked contains
the type of y.  I think that the preceding sentence HAS to use
the word "type", and that using "class" instead would be very
counterintuitive.  Thus, it is clear (to me) that "type" means
specification and that "class" means implementation.

This fits very nicely with the intuition of the Smalltalk
community.  Everybody knows what I mean when I say that I
want to type-check Smalltalk.  Some like the idea, some don't
like it, but everybody knows that I am requiring the values of
variables to have certain properties, and that I have to write
down those properties, i.e. specifications.
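
A rough sketch of "type as specification, class as implementation", in
Python purely for illustration.  Addable, Money, Path2D and
understands_plus are hypothetical names.  The check below covers only
the first half of the (x + y) example, namely that x understands +,
and says nothing about the type of y:

```python
from collections.abc import Sized
from typing import Protocol, runtime_checkable


@runtime_checkable
class Addable(Protocol):
    """A specification: the object understands the + message."""

    def __add__(self, other): ...


class Money:
    """One implementation of Addable."""

    def __init__(self, cents):
        self.cents = cents

    def __add__(self, other):
        return Money(self.cents + other.cents)


class Path2D:
    """An unrelated implementation that matches two specifications."""

    def __init__(self, points):
        self.points = list(points)

    def __add__(self, other):
        return Path2D(self.points + other.points)

    def __len__(self):
        return len(self.points)


def understands_plus(x):
    """Check x against the specification, ignoring which class implements it."""
    return isinstance(x, Addable)
```

Money and Path2D share no implementation, yet both satisfy Addable,
and Path2D additionally satisfies the standard Sized specification:
many implementations per specification, and many specifications per
implementation.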

Ralph Johnson -- University of Illinois at Urbana-Champaign

dlw@odi.com (Dan Weinreb) (09/29/90)

In article <60700001@inmet> stt@inmet.inmet.com writes:

   I would like to begin a discussion on the distinction,
   if any, between "types" and "classes".

I'd like to suggest two things about such a discussion.

First, be very careful about distinguishing the concept of the "type
of a variable" and the concept of the "type of a value".  Sometimes
(in some contexts to some people), a "type" is something that a
variable has, and it serves to constrain the possible values that the
variable may have.

Second, remember that in any serious type system, types do not form a
disjoint partition.  One type can be a "subtype" of another.  Suppose
A is a subtype of B.  From the value point of view this means that a
value of type A is also of type B.  From the variable point of view,
this means that a variable of type B can hold a value of type A.  An
implication of this is that usually it is not meaningful to say "the
type of value X", since X is usually of more than one type.

The relationships between types might not even be strictly
hierarchical, which makes it even more complex.

There is interesting discussion of all this in the specification of
the Common Lisp type system (see Guy Steele's book "Common Lisp", read
the chapter on types, look at the functions "typep" and "subtypep").
Common Lisp has an extensive set of runtime features for talking about
types, asking questions about types, and so on.  It's also very
interesting to see how the Common Lisp Object System integrated the
notions of "class" and "type".
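
The subtype point can be sketched in Python, purely for illustration.
Python's isinstance and issubclass are used here only as loose analogs
of Common Lisp's "typep" and "subtypep", not as equivalents; Vehicle,
Car, Boat, AmphibiousCar and types_of are hypothetical names:

```python
def types_of(value):
    """Every type the value belongs to, most specific first."""
    return type(value).__mro__


class Vehicle:
    pass


class Car(Vehicle):
    pass


class Boat(Vehicle):
    pass


class AmphibiousCar(Car, Boat):
    """A type with two supertypes, so the relationships are not a strict tree."""
```

Here bool is a subtype of int, so True is a value of more than one
type, and AmphibiousCar is a subtype of both Car and Boat even though
neither of those is a subtype of the other.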

stt@inmet.inmet.com (09/29/90)

Re: Types vs. classes

Render@cs.uiuc.edu writes:
>The problem with your definition is that it does not match my understanding
>of the use of the term "type" even in non-OO languages.  My informal understanding
>of a type is that of a set of values and a set of operations defined
>on those values.  For a structured type (a type that has values of other
>types as elements arranged in some fashion) there is also some expression
>of the relationship between the constituent types.  This does not preclude
>having overlapping types.  Intuitively, the numeric quantity represented by 
>the numeral 5 is both an integer value and a real value.  Does this not mean
>that 5 is a member of two different types?  If not, I'd like to know why
>not, since it violates my understanding of types as expressed in Pascal,
>Ada, and other strongly-typed programming languages. 

In a legal Ada program, every expression has a unique type, 
but certain expressions are implicitly convertible if they 
have "universal" type, to some other type in the same "class".  
I will admit this is a fine distinction, but at least theoretically, 
every expression has a unique type.
Values are the result of evaluating an expression, and are conventionally
considered to have the same "type" as the expression which computes them.
There are "corresponding" values in multiple types (i.e., each
numeric type has its own zero value), but they are not considered
the "same" value.
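
The idea of "corresponding" values can be sketched in Python, purely
for illustration.  Python's int and float only approximate the Ada
rules described above, and corresponds, five_int and five_real are
hypothetical names:

```python
def corresponds(a, b):
    """True when a and b compare equal yet belong to different types."""
    return type(a) is not type(b) and a == b


five_int = 5      # the integer five
five_real = 5.0   # the corresponding real value

# An explicit conversion yields the corresponding value in the other type.
converted = float(five_int)
```

The two fives compare equal, but each has exactly one type, and
neither type is a subtype of the other.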

In the classic text by Dahl, Dijkstra, and Hoare, "Structured Programming,"
(Academic Press, 1972), Hoare states quite clearly on page 92:
"(3) In a higher-level programming language the concept of a type is
of central importance.  Again, each variable, constant and expression has
a unique type associated with it."

In a later chapter, the concept of a "sub-range" type is introduced.
These are considered distinct types, though convertible to
the "base" type of the sub-range.  In other programming languages,
sub-ranges have often been considered sub-types rather than distinct types.

One source of confusion is that "notations" like "5" are overloaded
in many languages, to be a value representing the concept of "five-ness,"
of a type determined by context.
Another source of confusion is that Ada uses the term "subtype"
to mean "a type plus a constraint" or a "subset of a type", *not*
to represent a distinct type in and of itself.  Pascal is a bit
more confused on this issue.

Again, I will admit that if two types are interconvertible,
implicitly or explicitly, then they may be said to overlap
in an informal sense, but, at least formally, they are considered
distinct, and any given value computed by an expression
is in one or the other.

To avoid confusion, it might be safer to use the term
"the result of an expression" rather than value,
and instead say:

1) The result of an expression has a single unique type
2) The result of an expression may be a member of
one or more classes, determined by its type.

I think this formulation even works for "type-less" languages
like Lisp, where types are associated with values, rather than
variables.

It is generally agreed that a type includes a set of values,
but it is not just a random collection, but rather
a very special kind of set:  First of all, all members of this
value set are usable with the type's operations.  And secondly, 
I would propose, any given "value" is only in one type's value set,
though there may be a "corresponding value" in another type's value set.

>>Actually, I think OOP languages tend to be a bit schizophrenic
>>about the term "class" already, since the term is used both
>>for parameters/references which may in fact refer to a value
>>in any subclass of the class, and to identify *the* class of a value (which
>>only makes sense if you use the interpretation suggested above
>>of "smallest class which contains the value").
>
>I've never seen a person refer to a variable as a class (which you seem
>to say) although one can constrain a variable/parameter/reference object
>to only contain or reference an instance of a specific class in some
>languages.  With such mechanisms, the difference between a class and a type
>is admittedly blurred.

I certainly didn't intend to equate a variable and a class (though
I admit my wording could be interpreted that way).  I meant
to say that the term class is used when specifying the "type"
of a pointer/reference, when, at least in C++, pointers/references
may in fact refer to any value in that class *or* one of its subclasses.
However, a variable in C++, when declared to be of a certain "class,"
may *not* contain a value of a subclass (it's not "big enough" to do so).  
This is the "schizophrenia" to which I was referring.

Using the proposed terminology, a variable in C++ has a single specified 
"type," whereas a pointer/reference refers to a "class" of types.
The C++ "class" X is defined to comprise the "type" X plus all types derived
directly or indirectly from X.  When calling a virtual
function given a pointer to a *class* X, the
*type* of the pointed-to object determines which specific
implementation of a virtual function is called.
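
The dispatch half of this can be sketched in Python, purely for
illustration.  Python has only references, so the by-value half of the
C++ contrast (a variable that is not "big enough" for a subclass value)
cannot be shown here.  Shape, Rect, Square, name and describe_all are
hypothetical names:

```python
class Shape:
    def name(self):
        return "shape"


class Rect(Shape):
    def name(self):
        return "rect"


class Square(Rect):
    def name(self):
        return "square"


def describe_all(shapes):
    """Accept any member of the class rooted at Shape.

    The exact type of each referenced object, not the declared class,
    selects which implementation of name() runs.
    """
    return [s.name() for s in shapes]
```

A caller only promises "something in the class Shape", while each call
to name() is resolved by the single type of the object referred to.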

S. Tucker Taft   stt@inmet.inmet.com   uunet!inmet!stt
Intermetrics, Inc.
Cambridge, MA  02138

cline@cheetah.ece.clarkson.edu (Marshall Cline) (10/02/90)

In article <77500058@m.cs.uiuc.edu> johnson@m.cs.uiuc.edu writes:
...
>What do we mean when we say that a language is statically typed?
>I could say that this means that programs are type-checked at
>compile-time, not run-time, but this just begs the question, because
>what does type-checking mean?  What are we checking when we check
>types?  Are we checking implementations or specifications?
>
>It seems clear to me that we are checking specifications.  If I
>type-check a Smalltalk expression (x + y) then it means that the
>value of x is an object that understands the + message and that
>the type of the argument of the method that is invoked contains
>the type of y.  I think that the preceding sentence HAS to use
>the word "type", and that using "class" instead would be very
>counterintuitive.  Thus, it is clear (to me) that "type" means
>specification and that "class" means implementation.
>
>This fits very nicely with the intuition of the Smalltalk
>community.  Everybody knows what I mean when I say that I
>want to type-check Smalltalk.  Some like the idea, some don't
>like it, but everybody knows that I am requiring the values of
>variables to have certain properties, and that I have to write
>down those properties, i.e. specifications.
>
>Ralph Johnson -- University of Illinois at Urbana-Champaign

I concur wholeheartedly: a type is a specification, and a class
implements a type.

In C++, a `type' can be simulated by an abstract base class (ABC) with
no representation and with 100% pure virtuals.  Similar constructs exist
in other strongly typed OOPLs.
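
A rough analog of a type simulated by an abstract base class, written
in Python rather than C++ purely for illustration.  Python's
abc.abstractmethod is assumed here to stand in for a pure virtual, and
Stack, ListStack and can_instantiate are hypothetical names:

```python
from abc import ABC, abstractmethod


class Stack(ABC):
    """The 'type': no representation, only abstract operations."""

    @abstractmethod
    def push(self, item): ...

    @abstractmethod
    def pop(self): ...

    @abstractmethod
    def is_empty(self): ...


class ListStack(Stack):
    """A 'class' attached to the type: it supplies the representation."""

    def __init__(self):
        self._items = []

    def push(self, item):
        self._items.append(item)

    def pop(self):
        return self._items.pop()

    def is_empty(self):
        return not self._items


def can_instantiate(cls):
    """Report whether cls can be instantiated with no arguments."""
    try:
        cls()
    except TypeError:
        return False
    return True
```

The specification alone cannot be instantiated; only a class that
implements every operation can, which mirrors specifying types first
and attaching classes to them afterwards.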

Doug Lea and I are presenting a paper at ECOOP/OOPSLA's ``graphical OO
design for software engineering'' (GOOSE) workshop.  The gist of the
paper is that types and classes are different, that types are
specifications, and that *designing* by specifying types *first*, then
by attaching classes to this type scaffolding *afterwards* is
analogous to the architectural-design/detailed-design of the waterfall
model.  Not that we should do them in strict waterfall order in OOP;
they are simply analogous concepts.

Marshall Cline

--
==============================================================================
Marshall Cline / Asst.Prof / ECE Dept / Clarkson Univ / Potsdam, NY 13676
cline@sun.soe.clarkson.edu / Bitnet:BH0W@CLUTX / uunet!clutx.clarkson.edu!bh0w
Voice: 315-268-3868 / Secretary: 315-268-6511 / FAX: 315-268-7600
Career search in progress; ECE faculty; research oriented; will send vita.
PS: If your company is interested in on-site C++/OOD training, drop me a line!
==============================================================================