[comp.object] generic sort orders

knight@mrco.carleton.ca (Alan Knight) (05/23/91)

In article <1991May22.183044.5634@Think.COM> barmar@think.com writes:
>In article <1991May22.012821.12048@tkou02.enet.dec.com> diamond@jit533.enet@tkou02.enet.dec.com (Norman Diamond) writes:
>>Responsibility for specifying a "<" operation belongs to the call site.
>
>Which then requires the call site to know about all the possible types of
>things it will compare, which is contrary to object orientation.
>
....

>One possible way around this would be for the generic comparison function
>to take a third argument, which would be an object that represents the
>style of comparison being done.  Sorry, I haven't really thought this
>through, but it could contain flags such as case-sensitivity, relative
>priority of fields, etc.  Making it truly generic could be tough, though;
>it would probably work best in a language with good relexive capabilities
>(e.g. the field priorities could be passed in as a list of slot names if
>the language permits accessing slots using computed names, a la CLOS's
>SLOT-VALUE function).
>

The way that this is handled in Smalltalk is fairly trivial.  There
are three things involved, the "sortee", the "sorter", and the
"client" (stupid terminology mine).
    The sortee is the kind of object being compared, which must
provide (one or more) comparison operations (i.e.  there must be some
way to compare strings, numbers, etc.)  These operations must be
generic enough to handle the mixes that arise.  If you want to compare
strings to integers, then you must define some function somewhere that
does this.
    The sorter is, for example, a heap, and it performs all of its
comparison operations in terms of a "sort block", a comparison
function which it is given.
    The client is whoever created the sorter, and it must supply a
comparison function for whatever it's going to compare.  Thus if I
want a sorted collection of strings, I can say

    SortedCollection sortBlock: [:a :b |
       a asUppercase <= b asUppercase].

The sort function defaults to <=.

Thus, rather than trying to make some incredibly general object which
represents all possible comparison orders, you just pass in an
arbitrary function to do the comparisons.  If this is an unusual sort,
the function might be created on the fly.  For more common or more
complex comparison orders there is probably a function that implements
it, e.g. 

    SortedCollection sortBlock: [:aPoint :anotherPoint |
	aPoint angleLessThan: anotherPoint relativeTo:
someReferencePoint].


    


-- 
--
  Alan Knight   knight@mrco.carleton.ca  +1 613 788 5783   Support
  Dept. of Mechanical and Aeronautical Engineering         the
  Carleton University, Ottawa, Ontario, Canada, K1S 5B6    LPF

jls@netcom.COM (Jim Showalter) (05/23/91)

>In article <1991May22.183044.5634@Think.COM> barmar@think.com writes:
>>In article <1991May22.012821.12048@tkou02.enet.dec.com> diamond@jit533.enet@tkou02.enet.dec.com (Norman Diamond) writes:
>>>Responsibility for specifying a "<" operation belongs to the call site.
>>
>>Which then requires the call site to know about all the possible types of
>>things it will compare, which is contrary to object orientation.

Not at all true. The generic sort takes in two arguments: the specific
type being acted upon, and the "<" operation for that particular type.
Both are supplied by the client of the generic. In Ada, it looks like
this:

generic
    type Item is private; -- Formal metatype for any item.
    with function "<" (Left_Item  : in Item;
                       Right_Item : in Item) return Boolean;
package Sort_Generic is
    type Stuff_To_Sort is array (<>) of Item;
    procedure Sort (This_Stuff : in out Stuff_To_Sort);
    -- This procedure calls the supplied "<" operation to do the sorting.
end Sort_Generic;

package Integer_Sorting is new Sort_Generic (Item => Integer,
                                             "<"  => "<" -- For integers.);

Then one declares an array of integers and passes them to Integer_Sorting.Sort.
Note the clean separation of concerns: the sort generic has no clue as to
what it might be instantiated with nor to how to compare what it is instantiated
with, but it doesn't matter because the client, of course, knows how to do
both. The client, on the other hand, has no idea how to sort things.

>>One possible way around this would be for the generic comparison function
>>to take a third argument, which would be an object that represents the
>>style of comparison being done.  Sorry, I haven't really thought this
>>through, but it could contain flags such as case-sensitivity, relative
>>priority of fields, etc.  Making it truly generic could be tough, though;
>>it would probably work best in a language with good relexive capabilities
>>(e.g. the field priorities could be passed in as a list of slot names if
>>the language permits accessing slots using computed names, a la CLOS's
>>SLOT-VALUE function).

None of this is necessary, as per above.
-- 
**************** JIM SHOWALTER, jls@netcom.com, (408) 243-0630 ****************
*Proven solutions to software problems. Consulting and training on all aspects*
*of software development. Management/process/methodology. Architecture/design/*
*reuse. Quality/productivity. Risk reduction. EFFECTIVE OO usage. Ada/C++.    *

barmar@think.com (Barry Margolin) (05/23/91)

In article <1991May22.203058.15452@ccs.carleton.ca> knight@mrco.carleton.ca (Alan Knight) writes:
>In article <1991May22.183044.5634@Think.COM> barmar@think.com writes:
>>One possible way around this would be for the generic comparison function
>>to take a third argument, which would be an object that represents the
>>style of comparison being done.
>The way that this is handled in Smalltalk is fairly trivial.  There
>are three things involved, the "sortee", the "sorter", and the
>"client" (stupid terminology mine).
>    The client is whoever created the sorter, and it must supply a
>comparison function for whatever it's going to compare.

Yes, of course the "object that represents the style of comparison being
done" could be a function.  I was trying to suggest a more
"object-oriented" approach to thinking about this problem.  As I said, I
hadn't really designed the details; were I to try, I would probably give up
and just use a function.  Or perhaps a hybrid, e.g. an object that contains
some pretty generic options, along with a function that implements the more
domain-specific behavior.

-- 
Barry Margolin, Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

rockwell@socrates.umd.edu (Raul Rockwell) (05/24/91)

Barry Margolin:
[on using function arguments for generic sorting]

   Yes, of course the "object that represents the style of comparison
   being done" could be a function.  I was trying to suggest a more
   "object-oriented" approach to thinking about this problem.  As I
   said, I hadn't really designed the details; were I to try, I would
   probably give up and just use a function.  Or perhaps a hybrid,
   e.g. an object that contains some pretty generic options, along
   with a function that implements the more domain-specific behavior.

My personal favorite object for sorting is a "list of strings" (where
the strings are anything that the builtin ">" works for).  The trick
is DON'T SORT THE DATA.  Create a "sort message" (e.g. permute this
way...) which can be passed to whatever you like.

Now if you want to sort something that is nasty (like Japanese phone
book entries :-) you can implement it by "sorting" a representation of
the data which has been processed to so that sort will "do the right
thing".  (e.g. re-arrange first and last names, double each letter so
that the first letter of each pair is monocase and likewise preceed
every non-alpha character with one which indicates which "bucket" to
sort with, ...)

Most people, when they see this, can't figure out why you'd want to do
such a thing.  It's very handy though...

If anyone needs a diagram, here's a general case:

           A                  B               C
   object --> representation --> permutation --> sorted object
                                              ^
                                              |
                                            object

B is the "sorting function".  

Note that you don't need to incur the O(n log n) time penalty of
sorting when building your representation, so you can get an
efficiency gain too.  Er.. it's more efficient in time, you do need
space for your intermediate results.

(Incidentally, you'll find it much nicer if the sort guarantees that
"equal" strings maintain their order with respect to each other.)

Raul Rockwell