ok@quintus.UUCP (Richard A. O'Keefe) (02/15/88)
Subject: Treesort (Covington et al again) Newsgroups: comp.lang.prolog Keywords: tutorial If you read my "review" of "Prolog Programming in Depth", you may recall that I praised the authors for having taken the trouble to measure their various sorting routines. Well, now I've taken a closer look at the code, and I got a surprise. Let me first show you a simple quack-sort: quack_sort(RawList, SortedList) :- quack_sort(RawList, SortedList, []). quack_sort([]) --> []. quack_sort([Head|Tail]) --> { partition(Tail, Head, Before, After) }, quack_sort(Before), [ Head ], quack_sort(After). partition([], _, [], []). partition([X|Xs], Head, [X|Left], Right) :- X < Head, !, partition(Xs, Head, Left, Right). partition([X|Xs], Head, Left, [X|Right]) :- partition(Xs, Head, Left, Right). The authors of "Prolog Programming in Depth" compared several versions of quack_sort with two other sorting methods: insertion sort, and a version of tree sort. I'll show you their tree_sort, then a cleaned-up version, and then explain why this comparison was peculiarly favourable to quack_sort. tree_sort(List, NewList) :- list_to_tree(List, Tree), tree_to_list(Tree, NewList). insert(NewItem, empty, tree(NewItem,empty,empty)) :- !. insert(NewItem, tree(Element,Left,Right), tree(Element,NewLeft,Right)) :- NewItem < Element, !, insert(NewItem, Left, NewLeft). insert(NewItem, tree(Element,Left,Right), tree(Element,Left,NewRight)) :- insert(NewItem, Right, NewRight). insert_list([Head|Tail], Tree, NewTree) :- !, insert(Head, Tree, MiddleTree), insert_list(Tail, MiddleTree, NewTree). insert_list([], Tree, Tree). list_to_tree(List, Tree) :- insert_list(List, empty, Tree). tree_to_list(Tree, List) :- tree_to_list_aux(Tree, [], List). tree_to_list_aux(empty, List, List). tree_to_list_aux(tree(Item,Left,Right), OldList, NewList) :- tree_to_list_aux(Right, OldList, List), tree_to_list_aux(Left, [Item|List], NewList). Here is the cleaned up version. tree_sort(RawList, SortedList) :- insert_items(RawList, empty, Tree), gather_items(Tree, SortedList, []). insert_items([], Tree, Tree). insert_items([Item|Items], Tree0, Tree) :- insert_item(Tree0, Item, Tree1), insert_items(Items, Tree1, Tree). insert_item(empty, Item, tree(Item,empty,empty)). insert_item(tree(Label,Left0,Right), Item, tree(Label,Left,Right)) :- Item < Label, !, insert_item(Left0, Item, Left). insert_item(tree(Label,Left,Right0), Item, tree(Label,Left,Right)) :- insert_item(Right0, Item, Right). gather_items(empty) --> []. gather_items(tree(Label,Left,Right)) --> gather_items(Left), [ Label ], gather_items(Right). Three cuts have been reduced to one. If we were using compare/3, even that one cut could profitably be eliminated. I have good reasons for all the name changes, but I'd like to keep this short. Now, why is the comparison between tree sort and quack_sort so peculiarly favourable to quack_sort? Because (this version of) tree_sort ***IS*** quack_sort! Consider the top node of the tree: the first element of the input goes there, and the rest of the input is routed to either the left or the right subtree, and finally we concatenate the elements of the left subtree, the first element, and the elements of the right subtree. But this is just the partitioning, recursive sorting, and concatenation done by quack_sort! You will notice that gather_items//1 is identical to quack_sort//1 except for the call to partition/4, and that partition/4 and insert_item/3 have essentially the same structure. In quack_sort, the tree is implicit in the control structure of the program, and the intermediate Before/After sequences are explicit. In tree_sort, the tree is explicit, and the Before/After sequences are implicit in the control structure of the program. quack_sort and tree_sort should compare the same pairs of elements (though not in the same order) and construct exactly the same unstably sorted result. This is an interesting result in itself, but Covington et al make no mention of the fact that their tree_sort is fundamentally identical to quack_sort. (Which is to say that it has all the properties which make quack_sort a dreadful choice.) I repeat that while the examples are not the best, there are many good things in the text. For instance, I very much like what they say about certainty factors on pp 328-330 (except that they call them "confidence" factors). I don't think you'll find another book this accessible with this coverage. While I intend to keep commenting on the examples for some time, I would still recommend the book. Vulcan may have been a cripple, but he was arguably the second most *useful* of the gods...