liberte@ncsa.uiuc.edu (Daniel LaLiberte) (04/17/91)
I wonder if we can raise the level of the dicussion on static vs dynamic types to consider more generally where static typing fails to be sufficient or convenient, and why. I am also interested in the various ways languages that support *optional* static typing limit the extent of static typing one can do, and why. First of all which languages support optional static typing? I would guess that static typing is made optional in some languages because the designers recognized that static typing does not handle all typing situations, and thus dynamic typing (or, alternatively, C-like void typing) is a useful analog of the goto statement. But given optional static typing, the language designers could then get lazy and not provide sufficient static typing capability for even well known situations. One case of optional static typing that I am aware of is in Objective-C. And one limitation I have come up against is the static typing of mutually recursive instance variable types. An instance variable in each class is of type pointer to an instance of the other class. The problem comes in somehow declaring each class before the other is declared. Here is an example: ----- file A.h ---------- // interface of class A #include "B.h" // A needs to include B's interface @interface A:Object { B *b; // instance variable b is a pointer to B ... } ----- file B.h ---------- // interface of class B #include "A.h" // B needs to include A's interface <-- *ERROR* @interface B:Object { A *a; // instance variable a is a pointer to A ... } In C++ you could handle this situation by declaring "extern class A" and "extern class B" instead of including A.h and B.h. Objective-C provides no such solution, that I could discover at least. One is forced to declare either a or b as just a generic "id" to break the circularity. This is simply a silly limitation of the language, but I am curious to learn of other such examples. I suppose the problem of hetrogeneous arrays (a realistic application is where the types of elements and the functions to be applied to them are not known until runtime) is another area where static typing systems fall short of the ideal. A union of types would be better when it is appropriate than requiring either dynamic typing or unwieldy static typing and (most likely) multiple inheritance. Homogeneous generic types are not handled well in some languages, in that the base type(s) may not be specified so that instances are suitably constrained. Here are some more examples: From: brm@neon.Stanford.EDU (Brian R. Murphy) > For example, I can't I write a function which returns > either a boolean or an integer in a complex way. I can't write my own > Y combinator. I can't write a function which allows either a sequence > of functions which range over numbers or a sequence of numbers as an > argument. What are some other categories of problems that don't fit in existing static typing systems? Is it possible to come up with a "fix" for these problems in the ideal language, or will dynamic typing always be "necessary", like goto is. Can we organize these and maybe learn something about the problem? Dan LaLiberte National Center for Supercomputing Applications liberte@ncsa.uiuc.edu
strom@watson.ibm.com (Rob Strom) (04/17/91)
In article <LIBERTE.91Apr16164831@babbage.ncsa.uiuc.edu>, liberte@ncsa.uiuc.edu (Daniel LaLiberte) writes: |> I wonder if we can raise the level of the dicussion |> on static vs dynamic types to consider more generally where |> static typing fails to be sufficient or convenient, and why. |> |> I am also interested in the various ways languages that support |> *optional* static typing limit the extent of static typing one can do, |> and why. First of all which languages support optional static typing? I'll answer this question for the Hermes language. Hermes provides both strong static typing and strong dynamic typing. That is, it is never possible to apply an operation to an operand of an inappropriate type. This guarantee is sometimes enforced by compile-time checking (static), and sometimes by run-time checking (dynamic). Which technique is used is the programmer's choice --- in this sense static typing is "optional". However, I will candidly admit that Hermes is biased towards the static approach. That is, the language design assumes that most of the time, a variable will: (1) hold values of only a single type, and (2) that type will be known at compile-time. The cases in which the type is not known statically are assumed to occur less frequently. The constructs for manipulating dynamically typed data are more verbose. This bias comes partially from the historical tradition of the language, and partially from the intended application domain: large, long-lived, multi-application systems, in which ease of maintenance is more important than ease of initial writing. In that domain, type declarations serve as a form of machine-readable documentation. |> |> I would guess that static typing is made optional in some languages |> because the designers recognized that static typing does not handle |> all typing situations, and thus dynamic typing (or, alternatively, |> C-like void typing) is a useful analog of the goto statement. |> But given optional static typing, the language designers could then |> get lazy and not provide sufficient static typing capability for even |> well known situations. |> |> [examples from Objective-C and C++] |> |> What are some other categories of problems that don't fit in existing |> static typing systems? Is it possible to come up with a "fix" for |> these problems in the ideal language, or will dynamic typing always be |> "necessary", like goto is. Can we organize these and maybe learn |> something about the problem? |> I agree that there will always be problems for which static typing will be inadequate and therefore that an escape into dynamic typing will be necessary. I don't see the need for "language wars" between proponents of static vs. dynamic typing. When static typing is possible, I prefer it, because I always prefer errors to be caught before a program module begins executing rather than in the middle of execution. When static typing is impossible, I prefer dynamic typing over escaping from type-checking altogether. Had Hermes a different application domain, we might have chosen to make dynamic typing the default and static the exception, rather than the other way around. Here are some applications for which Hermes uses dynamic typing rather than static: o On-the-fly program creation: Hermes has a data type "program" whose value is a syntax-free representation of a Hermes program. Program values can be built at run-time and then instantiated as processes with a particular initialization interface. It is checked at instantiation time that the program you have built is type consistent with the initialization interface you expect. o Name servers: In Hermes, the right to access a service is encoded as a value of a particular "output port" type. A name server is a process which, in its simplest form, owns a table of pairs <name, port>. This table is typically heterogeneous since the "port" components will be of different types, representing the different services being named. o File systems: This is a situation similar to name servers. A file system is a table of pairs <name, thing>, but the <thing>s are of different types. Here is a situation in which Hermes can use static typing where some other languages cannot: Suppose I am managing a collection of operator consoles. These operator consoles have heterogeneous protocols: some are ASCII glass teletypes, some are bitmap displays, some are "windows", etc. However, all support a standard *interface* for printing and reading lines, prompting, etc. In Hermes these console "objects" (called "processes" in Hermes) are visible only through their ports. Since the ports have the same interface, they appear to the console manager as a homogeneous collection. The fact that these console objects have different internals does *not* make them different types. In some other languages, they *would* look heterogeneous. For example, they would be different task types in Ada, and different classes in Smalltalk. Hermes would not need dynamic typing to handle a case like this. Your Objective-C problem has no analog in Hermes. Please note that followups are directed to comp.lang.misc. -- Rob Strom, strom@ibm.com, (914) 784-7641 IBM Research, 30 Saw Mill River Road, P.O. Box 704, Yorktown Heights, NY 10958
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (04/18/91)
In article <LIBERTE.91Apr16164831@babbage.ncsa.uiuc.edu> liberte@ncsa.uiuc.edu (Daniel LaLiberte) writes: > What are some other categories of problems that don't fit in existing > static typing systems? Is it possible to come up with a "fix" for > these problems in the ideal language, or will dynamic typing always be > "necessary", like goto is. Can we organize these and maybe learn > something about the problem? Amazing! Someone else is actually trying to solve problems and make programming easier, instead of arguing about exactly how hard it is! I like this guy's attitude. I propose we focus on just one problem: What should a type system look like in order to allow everything people want to do with types? In other words, what kinds of constraints should the programmer be able to express upon objects (in the general sense, not the OO sense) in order to achieve both safety and generality? Let's not argue whether dynamic typing is necessary. Let's just do what we have to so that it's obviously unnecessary. (Is that too upbeat for comp.lang.misc readers?) As a starting point: Ben, could you describe the Russell type system in more detail, or point us to references? I think the basic idea makes sense. If you apply certain operations to a value inside function f, then f should specify the value as anything subject to those operations. And, as a first sample problem to be solved by a typing system: How do we write a generic, type-safe sorting routine? What's the best that available languages can do to make this work? Remember, type-safe means that the sort routine will never tell you ``can't apply < to glorps'' at run-time, as a dynamically typed system would. We may have to separate the issues of typing for storage management and typing for typechecking/security/etc. Followups to comp.lang.misc. ---Dan
chased@rbbb.Eng.Sun.COM (David Chase) (04/18/91)
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >I propose we focus on just one problem: What should a type system look >like in order to allow everything people want to do with types? In other >words, what kinds of constraints should the programmer be able to >express upon objects (in the general sense, not the OO sense) in order >to achieve both safety and generality? > >Let's not argue whether dynamic typing is necessary. Let's just do what >we have to so that it's obviously unnecessary. (Is that too upbeat for >comp.lang.misc readers?) It might be overambitious to make it "unnecessary". "Rarely necessary" should suffice. Unless you want to make a religious issue out of it, you might make some allowances for what could be done by a suitably-designed optimizer (or type-inference engine) instead of unduly burdening the programmer in the name of mandatory efficiency. Anyhow, check out "Type-dependent Parameter Inference" by Cormack and Wright, in the June 1990 SIGPLAN Notices (conference proceedings of PLDI). I read it recently, and found it quite interesting. I didn't understand all of it, but I'm working on it. See also "Typeful Programming" by Cardelli, Digital Systems Research Center report # 45. Their address (since this might not be in your local library) is Digital Systems Research Center 130 Lytton Avenue Palo Alto, CA 94301 You might also look at Report #10, "A Polymorphic \lambda-calculus with Type:Type". Same author. William Cook (formerly of HP, now at Apple) may have published something on this subject in the last couple of years, but the last time I remember asking him about it, he told me to go read Cardelli's papers. Here's a list of the Russell papers: Two by Demers and Donahue in the 1980 POPL. Boehm, Demers & Donahue, "An informal description of Russell", Cornell CS department TR 80-430. Demers & Donahue, "Data Types are Values", Cornell CS dept TR 79-393 (possibly also a Xerox PARC blue&white -- I'm misplaced my copy of this) Demers & Donahue, "The semantics of Russell, an exercise in abstract data types". Cornell CS TR 80-431. Hook, "Understanding Russell - A first attempt", Semantics of Data Types (Proceedings), Springer-Verlag Lectuer Notes in Computer Science #173. Boehm, Demers & Donahue, "A Programmer's Introduction to Russell", Rice CS tech report TR85-16. Boehm, "Type Inference in the Presence of Type Abstraction", SIGPLAN Notices, Jul7 1989 (PLDI Conference proceedings). If you're serious about type systems and type theory, go read these papers. The authors put a lot of thought into them, and the authors are not stupid people. And no, I won't digest them for the net in ten easy installments - I don't have the time, and that's not what I'm paid to do. David Chase Sun
lerman@stpstn.UUCP (Ken Lerman) (04/18/91)
In article <LIBERTE.91Apr16164831@babbage.ncsa.uiuc.edu> liberte@ncsa.uiuc.edu (Daniel LaLiberte) writes: |> |>----- file A.h ---------- |>// interface of class A |> |>#include "B.h" // A needs to include B's interface |> |>@interface A:Object |>{ |> B *b; // instance variable b is a pointer to B |> ... |>} |> |>----- file B.h ---------- |>// interface of class B |> |>#include "A.h" // B needs to include A's interface <-- *ERROR* |> |>@interface B:Object |>{ |> A *a; // instance variable a is a pointer to A |> ... |>} |> |>Dan LaLiberte |>National Center for Supercomputing Applications |>liberte@ncsa.uiuc.edu A (not very nice) way of handling this in Stepstone's Objective-C compiler is to write struct __objcA *a and struct __objcB *b to declare the instance variables. A better way would be for the compiler to permit @interface A {} @end to be an incomplete definition of class F which could be completed later on. A further deficiency of Objective-C regarding automatic and static instances of objects (those that are declared Object foo, instead of Object *foo) is that the compiler doesn't automatically call an initializer for them so that only the isa pointer is set. This implies that the writer of the class must know that it will be used this way. I personally don't use them and suggest that others avoid them, also. Ken Disclaimer: In case you haven't noticed, I work for Stepstone. I don't speak for them in this matter. lerman@stepstone.com
sfk@otter.hpl.hp.com (Steve Knight) (04/18/91)
Dan writes: > I wonder if we can raise the level of the dicussion > on static vs dynamic types to consider more generally where > static typing fails to be sufficient or convenient, and why. My view is that the key issues for convenience are arbitrary unions and, by implication, arbitrary subtyping. The advantage of dynamic typing is essentially that any collection of types is itself a valid type & that any collection of types can be extended so as to become a super-type or restricted to form a sub-type. He also asks: > I am also interested in the various ways languages that support > *optional* static typing limit the extent of static typing one can do, > and why. The obvious cost of this is that type inference is severely restricted. Furthermore, it is not possible (in general) to find a most general type for an expression in such a system. Steve
anw@maths.nott.ac.uk (Dr A. N. Walker) (04/19/91)
In article <8493:Apr1719:30:0791@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >And, as a first sample problem to be solved by a typing system: How do >we write a generic, type-safe sorting routine? What's the best that >available languages can do to make this work? I quote [without permission, and somewhat edited] from Charles Lindsey's paper on modals, Algol Bulletin #37, p28, July 1974: [Start of quote.] Example 1 -- A (very slow) sort routine. proc sort = (mode z, ref [] z vec, proc (ref z, ref z) bool swapped) void: for i from upb vec by -1 to lwb vec + 1 do for j from i to upb vec while swapped (vec[j-1], vec[j]) do skip od od; [For example ...] mode person = struct (string name, int age); proc ageswap = (ref person lo, hi) bool: if age of hi < age of lo then person p := hi; hi := lo; lo := p; true else false fi; whereupon a row of persons could be sorted thus: [1000] person people; read (people); sort (person, people, ageswap) [End of quote.] Re-implementing "sort" as a Quicksort or similar is left as an exercise! Apart from the actual sorting technique, this seems to me to be the same as C's "qsort" library routine, but type-safe in Dan's sense. Any advances in the last 17 years? I don't think there is any theoretical objection to deleting the "swapped" parameter to "sort", replacing the call "while swapped (...)" by "while vec[j-1] < vec[j]", and defining op < = (ref person lo, ho) bool: (...) in which case "sort" would compile for any "z" for which the operator "<" was defined. C could do this if types were allowed to be parameters to functions. -- Andy Walker, Maths Dept., Nott'm Univ., UK. anw@maths.nott.ac.uk
wg@opal.cs.tu-berlin.de (Wolfgang Grieskamp) (04/20/91)
liberte@ncsa.uiuc.edu (Daniel LaLiberte) writes: >I am also interested in the various ways languages that support >*optional* static typing limit the extent of static typing one can do, >and why. First of all which languages support optional static typing? There is an experimental language called AMBER designed by Luca Cardelli, AT&T. (Cardelli signs also for the MODULA-3 Type System). AMBER supports this kind of record inclusion polymorphism, i.e. having a record r={a: Int, b: Int} then both the records r1={a: Int} and r2={b: Int} are implicitely supertypes of r, i.e. every function accepting r1 resp. r2 accepts also objects of type r. AMBER is weak functional like Standard ML, i.e. it supports assignments but updatable objects have to be typed appropriate. The point now according to your question is that AMBER has a type "Dynamic". Every value can be made dynamic using the expression "dynamic e", where e is an expression. The statically known type information belonging to e is paired with the value of e, resulting in an entity of type Dynamic. Dynamic values can be coerced back to a known type using the expression "coerce e to t" which produces a run-time error if e has not type t. There are also possibilities to inspect the type of dynamic values: there is a type "Type" and a function "typeOf: Dynamic -> Type" for this purpose. Objects of type Type are descriptions of the type information usally known to a compiler. The introduction of dynamic types of this kind seem to cause no extraordinary problems in a primarily statically typed environment. Run-time type errors are limited to the construction "coerce e to t". Its a matter of taste to call AMBER dynamically typed since such run-time errors may occure at all. AMBER has been implemented for the Mac, but dont ask me where to get this implementation and the language description. Probably there is a FTP side anywhere. -- Wolfgang Grieskamp wg@opal.cs.tu-berlin.de tub!tubopal!wg wg%opal@DB0TUI11.BITNET
liberte@ncsa.uiuc.edu (Daniel LaLiberte) (04/20/91)
There seems to be a body of literature out there on better static typing systems (referenced by David Chase) so I am hesitant to say much more before studying the issue. (That hasnt stopped alot of people, of course :-) But let me ask a couple more questions. In article <2400044@otter.hpl.hp.com> sfk@otter.hpl.hp.com (Steve Knight) writes: > My view is that the key issues for convenience are arbitrary unions and, > by implication, arbitrary subtyping. Are statically declared, arbitrary (anonymous?) supertyping (same as unions?) and subtyping sufficient? They would certainly be helpful in taking care of many static typing problems, but are they always enough? Suppose that for some data it is *impossible* for the compiler or programmer to know what restrictions (i.e. data type constraints) may be placed on the data until run time. Languages that support reflection would be prone to this class of problems, I imagine. Debuggers (that typically reflect on some other process) seem to require dynamic typing, no? If you dont know what constraints may be placed on some data even at run time, then you are out of luck - you dont know what you can safely do with the data and may risk core dumps. Maybe the data knows what its limits are (the object oriented approach) and maybe exceptions are handled gracefully, but obviously something has gone wrong. Speaking of which, is there some study of using exceptions not as errors but as a control mechanism. Consider that when a message is sent to an object and its class does not understand the message, this is a kind of exception. But rather than stopping there, the message is forwarded to the super class, etc. If no class all the way up the hierarchy understands the message, then we give up. But why stop there? Perhaps the original sender would like to try again with a different message. One final thought concerns another kind of convenience argument sometimes given for using dynamic typing. Even when a type is statically known and the language supports its specification conveniently, is it sometimes (or always?) better to not explicitly specify the type? If the type of a variable changes during the course of development, then other uses (but not necessarily all other uses) of the type will often need to be changed to maintain consistency - and this is extra work, especially during early development when many things are changing. Part of this problem could be solved by a better programming environment that helped the programmer mutate the program consistently. Another tack is languages which infer types statically. But how about an explicit (programmer specified) static typing system in which the types are unlikely to change, except for typos? That assumes that once an variable's type is declared, it is most likely the correct type. This doesnt seem possible. Any ideas? Dan LaLiberte National Center for Supercomputing Applications liberte@ncsa.uiuc.edu
schwartz@groucho.cs.psu.edu (Scott Schwartz) (04/21/91)
anw@maths.nott.ac.uk (Dr A. N. Walker) writes: | Re-implementing "sort" as a Quicksort or similar is left as an exercise! | Apart from the actual sorting technique, this seems to me to be the same | as C's "qsort" library routine, but type-safe in Dan's sense. Any | advances in the last 17 years? This version of quicksort comes with the SML/NJ documentation: % sml - fun quick(le:'a*'a->bool) = = let fun sort [] = [] = | sort [x] = [x] = | sort (a::rest) = (* the head "a" is the pivot *) = let fun split(left,right,[]) = sort left @ (a::sort right) = | split(left,right,x::l) = = if le(x,a) then split(x::left,right,l) = else split(left,x::right,l) = in split([],[],rest) = end = in sort = end; val quick = fn : ('a * 'a -> bool) -> 'a list -> 'a list - quick (op<) ["foo", "bar", "biff"]; val it = ["bar","biff","foo"] : string list - quick (op>) [1,2,3,9,8,7]; val it = [9,8,7,3,2,1] : int list The 17 year advance is type inference, I guess.
Chris.Holt@newcastle.ac.uk (Chris Holt) (04/22/91)
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes: >And, as a first sample problem to be solved by a typing system: How do >we write a generic, type-safe sorting routine? What's the best that >available languages can do to make this work? Remember, type-safe means >that the sort routine will never tell you ``can't apply < to glorps'' at >run-time, as a dynamically typed system would. What would be nice would be something of the form Sort S <= where S : sequence of D D : domain <= : partial_order (D x D -> boolean) where partial_order : transitive & reflexive & antisymmetric Of course, partial_order would be part of the standard toolbox of the programmer; it maps the domain of dyadic D-to-boolean operations to that subset satisfying the given properties. I will dream on... ----------------------------------------------------------------------------- Chris.Holt@newcastle.ac.uk Computing Lab, U of Newcastle upon Tyne, UK ----------------------------------------------------------------------------- "And when they die by thousands why, he laughs like anything." G Chesterton
cs450a03@uc780.umd.edu (04/23/91)
Chris Holt > Dan Bernstein >> >>And, as a first sample problem to be solved by a typing system: How do >>we write a generic, type-safe sorting routine? >What would be nice would be something of the form >Sort S <= > where S : sequence of D > D : domain > <= : partial_order (D x D -> boolean) > where partial_order : transitive & reflexive & antisymmetric >Of course, partial_order would be part of the standard toolbox of the >programmer; it maps the domain of dyadic D-to-boolean operations to >that subset satisfying the given properties. If you're going to sort, why not just go with sorting by a key? If you have some domain which is ordered, but which the builtin <= doesn't understand, why not just map it into some type which the builtin recognizes? (For example build a table of integers, where each integer corresponds to an element in the unsorted sequence.) You would then need to get at the permutation information which "sort" generates, but that need not be difficult. Also, sort had better deal with sorting strings of some sort, but that need not be difficult either. If the mapping from sequence of type X to sequence of builtin type is simple enough, there is no reason that a compiler couldn't optimize it out as a distinct step. I don't see any reason, by the way, to apply sort to a partial order which is not a complete order. Raul Rockwell
new@ee.udel.edu (Darren New) (04/24/91)
In article <3111@opal.cs.tu-berlin.de> wg@opal.cs.tu-berlin.de writes: >The point now according to your question is that AMBER has a type >"Dynamic". Every value can be made dynamic using the expression >"dynamic e", where e is an expression. [...] This sounds essentially like Hermes' "polymorph" type. It seems to do the job for everything I can think of for dynamic typing except dynamic binding (in the OO sense). -- Darren -- --- Darren New --- Grad Student --- CIS --- Univ. of Delaware --- ----- Network Protocols, Graphics, Programming Languages, FDTs ----- +=+=+ My time is very valuable, but unfortunately only to me +=+=+ +=+ Nails work better than screws, when both are driven with screwdrivers +=+
gudeman@cs.arizona.edu (David Gudeman) (04/24/91)
In article <1991Apr22.153149.8082@newcastle.ac.uk> Chris Holt writes:
]What would be nice would be something of the form
]
]Sort S <=
] where
] S : sequence of D
] D : domain
] <= : partial_order (D x D -> boolean)
] where
] partial_order : transitive & reflexive & antisymmetric
I know that is the standard solution, but I wonder if a better
solution wouldn't be a function $ord : D -> Int$ that maps domain
objects into arbitrary integers with the property that if $d1 < d2$
by the partial order, then $ord(d1) < ord(d2)$. ($d1 < d2$ means that
$d1 <= d2$ and $d1 =/= d2$.)
You would lose the partial order with this function (so you might want
to implement <= anyway), but it seems that $ord$ has more general
uses. It works for a wider variety of sorting methods, for example,
and could also be used for a hash function.
--
David Gudeman
gudeman@cs.arizona.edu
noao!arizona!gudeman
boehm@parc.xerox.com (Hans Boehm) (04/24/91)
gudeman@cs.arizona.edu (David Gudeman) writes: >In article <1991Apr22.153149.8082@newcastle.ac.uk> Chris Holt writes: >]What would be nice would be something of the form >] >]Sort S <= >] where >] S : sequence of D >] D : domain >] <= : partial_order (D x D -> boolean) >] where >] partial_order : transitive & reflexive & antisymmetric >I know that is the standard solution, but I wonder if a better >solution wouldn't be a function $ord : D -> Int$ that maps domain >objects into arbitrary integers with the property that if $d1 < d2$ >by the partial order, then $ord(d1) < ord(d2)$. ($d1 < d2$ means that >$d1 <= d2$ and $d1 =/= d2$.) There is no such function mapping character strings to integers. Note that "a" < "aa" < "aaa" < ...infinitely many strings... < "b". No subset of the integers has this property. See any set theory book for a discussion of transfinite ordinals. Hans boehm@xerox.com
cs450a03@uc780.umd.edu (04/24/91)
Hans Boehm > David Gudeman >> Chris Holt [sort taking args: unsorted list and ordering predicate] >>I know that is the standard solution, but I wonder if a better >>solution wouldn't be a function $ord : D -> Int$ that maps domain >>objects into [list which requires the same permutation as D to sort] >There is no such function mapping character strings to integers. I agree with Mr. Gudeman that mapping into some domain for which sort is defined is a good idea. And, I agree with Mr. Boehm that it is important to be able to sort strings. I disagree with the idea of using a predicate to specify the order. If "sort" can order a list of sequences (in addition to a list of atomic numbers), that should be adequate. Then if you have some internal type that sort doesn't know how to deal with you can provide a sort key which maps to sequences which sort can deal with. Why provide sort keys, rather than a predicate? (1) building a key takes O(n) steps, rather than O(n log n) where n is the length of the unsorted sequence. (And often, the predicate will have to extract keys for two arguments each time it is evaluated). (2) I believe key extraction reduces the communication costs on highly parallel machines. [somebody want to verify this or disagree?] (3) It is simple to derive a predicate from the specification of how to build a key (if you choose to have the compiler optimize for minimum dynamic space requirements). It is not so simple to derive a key from the specification of how to build a predicate, even when that is a desireable optimization. (4) Mapping from one domain to another is a classic (and very general) mathematical technique. It is synergistically good to put effort into supporting this sort of mapping. Er.. in other words, you don't need to complicate the implementation of "sort" by trying to generalize the sorting process to all cases. Just optimize the hell out of a handful of cases and let other features of the language deal with generalizing to "user-defined" cases. Finally, note that the function used for mapping data -> key could be dynamically bound if the situation calls for it (providing the same kind of generality as supplying a predicate to sort). Raul Rockwell
Chris.Holt@newcastle.ac.uk (Chris Holt) (04/25/91)
boehm@parc.xerox.com (Hans Boehm) writes: >gudeman@cs.arizona.edu (David Gudeman) writes: >>In article <1991Apr22.153149.8082@newcastle.ac.uk> Chris Holt writes: [about sorting using a partial order relation as a parameter] >>I know that is the standard solution, but I wonder if a better >>solution wouldn't be a function $ord : D -> Int$ that maps domain >>objects into arbitrary integers with the property that if $d1 < d2$ >>by the partial order, then $ord(d1) < ord(d2)$. ($d1 < d2$ means that >>$d1 <= d2$ and $d1 =/= d2$.) >There is no such function mapping character strings to integers. Note >that "a" < "aa" < "aaa" < ...infinitely many strings... < "b". No subset >of the integers has this property. See any set theory book for a >discussion of transfinite ordinals. There can be a such a function for the finite strings; you use the lengths of the strings, so a < b < ... < z < aa < ab < ... az < ba < bb < (rest of 2-sequences) < (3-sequences) < (4-sequences) etc. Agreed the ordering given by Hans Boehm requires a dense codomain, but again the rationals are dense, and so can handle finite and repeating strings, and they can be mapped to the integers (though without the obvious ordering being preserved). A partial order can always be strengthened to a total order, of course; but one thing I wondered as I decided to make my example partial is, must every algorithm for partial order sorting be at least o(n**2)? ----------------------------------------------------------------------------- Chris.Holt@newcastle.ac.uk Computing Lab, U of Newcastle upon Tyne, UK ----------------------------------------------------------------------------- "And when they die by thousands why, he laughs like anything." G Chesterton
boehm@parc.xerox.com (Hans Boehm) (04/26/91)
Chris.Holt@newcastle.ac.uk (Chris Holt) writes: >boehm@parc.xerox.com (Hans Boehm) writes: >>gudeman@cs.arizona.edu (David Gudeman) writes: >>>In article <1991Apr22.153149.8082@newcastle.ac.uk> Chris Holt writes: >[about sorting using a partial order relation as a parameter] >>>I know that is the standard solution, but I wonder if a better >>>solution wouldn't be a function $ord : D -> Int$ that maps domain >>>objects into arbitrary integers with the property that if $d1 < d2$ >>>by the partial order, then $ord(d1) < ord(d2)$. ($d1 < d2$ means that >>>$d1 <= d2$ and $d1 =/= d2$.) >>There is no such function mapping character strings to integers. Note >>that "a" < "aa" < "aaa" < ...infinitely many strings... < "b". No subset >>of the integers has this property. See any set theory book for a >>discussion of transfinite ordinals. >There can be a such a function for the finite strings; you use >the lengths of the strings, so > a < b < ... < z < aa < ab < ... az < ba < bb < (rest of 2-sequences) > < (3-sequences) < (4-sequences) etc. >Agreed the ordering given by Hans Boehm requires a dense codomain, >but again the rationals are dense, and so can handle finite and >repeating strings, and they can be mapped to the integers (though >without the obvious ordering being preserved). I thought the original point of this was to specify an order for sorting. Certainly the set of character strings, rationals, and integers all have the same cardinality. Thus I can find 1-1 mappings between them. But to be useful, they have to preserve the desired ordering. For character strings, this usually means lexicographic ordering, and there's no way to preserve that by mapping to the integers. I agree that taking ord: D -> Rationals would work in all common cases, but it seems a bit expensive in practice. Hans boehm@xerox.com