[net.lang.c] Sorting linked list

mdr@bentley.UUCP (M. Rossner) (03/27/86)

It seems that there is an obvious recursive solution to this
problem that has not yet been mentioned.  To sort a linked list
(in place, without allocating ANY more memory), I just need to sort
everything except the first node and then insert the first node into
this list.

Note that this assumes a dummy header with a null key so that the
head of the list is constant.

sort (list)
struct foo *list;
{
	struct foo *current;

	if (list->next->next == NULL)       /* down to last node */
		return (list);
	else {
		current = list->next;
		list->next = current->next;
		sort (list);
		insert (list, current);
		return (list);
	}
}


insert (list, node)
struct foo *list, *node;
{
	struct foo *last, *current;

	last = list;
	current = list->next;
	while (node->key > current->key) {
		last = current;
		current = current->next;
	}
	last->next = node;
	node->next = current;
}

Voila! Less than tweny lines of code in its entirety.  Think recursive, man!

             Marc D. Rossner
		-- Hoping never to see you on this newsgroup again --
		-- In Cinema, televisia, et libra speramus       --
		-- To the Walking Lint, have a nice weekend! --

faustus@cad.UUCP (Wayne A. Christopher) (03/28/86)

In article <669@bentley.UUCP>, mdr@bentley.UUCP (M. Rossner) writes:
> It seems that there is an obvious recursive solution to this
> problem that has not yet been mentioned.  To sort a linked list
> (in place, without allocating ANY more memory), I just need to sort
> everything except the first node and then insert the first node into
> this list.

If you're concerned with the amount of space you use, you should be taking
the stack space that your process is using up by all that recursion.  Also
insertion sort isn't the fastest that you can do -- I think making an array,
qsort()'ing it, and re-listing it is the best you can do.  Besides, nowadays
it's not fashonable to be concerned with saving space -- it's all virtual
anyway... :-)

> Voila! Less than tweny lines of code in its entirety.  Think recursive, man!

In C you shouldn't automatically think in terms of recursion, since it is
a lot easier to write fast iterative code.  Recursion is more elegant but
seldom faster.

	Wayne

kwh@bentley.UUCP (KW Heuer) (03/28/86)

In article <669@bentley.UUCP> mdr@bentley.UUCP (M. Rossner) writes:
>It seems that there is an obvious recursive solution to this
>problem that has not yet been mentioned.

A swap-sort is an obvious way to sort an array, too.

>To sort a linked list (in place, without allocating ANY more memory),

Technically it's not an "in place" sort; recursion consumes stack space.

>Note that this assumes a dummy header with a null key so that the
>head of the list is constant.

Why bother with a dummy node?  Consider this algorithm:

struct foo *insert(list, node) struct foo *list, *node; {
	struct foo **p;
	for (p = &list; *p != NULL && (*p)->key < node->key; p = &(*p)->next);
	node->next = *p;
	*p = node;
	return (list);
}
struct foo *sort(list) struct foo *list; {
	return (list == NULL ? NULL : insert(sort(list->next), list));
}

>Voila! Less than tweny lines of code in its entirety.

I got it in five, and only one local variable.  (No, I don't think I crowded
too much on one line.)

For a once-only application I wouldn't mind using this algorithm.  But for
a general-purpose library routine, I wouldn't want a quadratic algorithm;
I'd use that merge-sort that was recently posted here.

>		-- Hoping never to see you on this newsgroup again --

Stay outa my turf, Marc  :-)

>		-- To the Walking Lint, have a nice weekend! --

Thanks.  (Sorry about posting this to the world.)  See you next week!

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint
(Marc and I share an office.  I traded him my C for his K.)

chris@umcp-cs.UUCP (Chris Torek) (03/28/86)

In article <669@bentley.UUCP> mdr@bentley.UUCP writes:
>It seems that there is an obvious recursive solution to this
>problem that has not yet been mentioned.  To sort a linked list
>(in place, without allocating ANY more memory), I just need to sort
>everything except the first node and then insert the first node into
>this list.

This method will work.  Unfortunately, its average performance is
O(n^2).  Given an n-element list, you sort it by sorting the n-1
element list, then inserting the first element in place.  The
insertion will, on the average---assuming a random distribution of
values---search through n/2 list elements before finding the
insertion point.  This gives a recurrence relation as follows:

	T(n) = T(n-1) + n/2

Expanding and using T(0) as the base case we get

	T(n) = 1/2 sum {i from 0 to n} i

	     = 1/2 (n * (n+1) / 2)

	     = 1/4 n^2

which is O(n^2).  Heap and merge sorts are O(n log n).

(Technically, you have neglected stack space as well: you will use
O(n) stack frames in your sort.  The heap and merge sorts use only
O(log n) frames.)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

badri@ur-helheim.UUCP (03/28/86)

In article <669@bentley.UUCP> mdr@bentley.UUCP (M. Rossner) writes:
>It seems that there is an obvious recursive solution to this
>problem that has not yet been mentioned.  To sort a linked list
>(in place, without allocating ANY more memory), I just need to sort
>everything except the first node and then insert the first node into
>this list ..........
>
>Voila! Less than tweny lines of code in its entirety.  Think recursive, man!
>
You do not want to use a recursive solution for the following reasons:

1. It involves loads of stack operations, which slow down the process.
2. For each level of recursion you have to retain a snap shot of the
   variables at that call. Thus in reality you do not save any
   memory; you may end up consuming more!
3. There are very efficient iterative sorting algorithms available -
   in fact most sorting algorithms have iterative implementations
   that are NOT long in terms of lines of code.

In any case, I believe, like many others, that the easiest thing to do
is to make use of qsort. If efficiency is of importance then use a good
iterative algorithm and NOT recursion.

gwyn@BRL.ARPA (03/29/86)

Your method is cute, but it has the following drawbacks:
	It amounts to an insertion sort, with high overhead
	due to the large number of function calls, 2N-1 for
	an N-long list.
	The average number of comparisons for an N-long
	randomly-ordered list is on the order of N(N-1)/4,
	i.e. O(N^2), which is not the best attainable for
	large N.  (At least the function call overhead
	dwindles to insignificance by comparison.)
	Despite your claim, it does require extra space
	(on the run-time stack) proportional to the list size.
This algorithm might be suitable for relatively short lists
where code size is an important factor (e.g. ROM application).

Using recursion like this is good for quick hacks, but for
serious production work nothing beats a good algorithm.
The standard reference for sorting algorithms is Knuth Vol. 3.

P.S.  I did the above math in my head, away from reference
books.  I may be off a factor of two or something similar;
that doesn't affect the argument.

greg@utcsri.UUCP (Gregory Smith) (03/29/86)

In article <669@bentley.UUCP> mdr@bentley.UUCP (M. Rossner) writes:
>It seems that there is an obvious recursive solution to this
>problem that has not yet been mentioned.  To sort a linked list
>(in place, without allocating ANY more memory), I just need to sort
			       ^^^^^^^^^^ or does he? see below.
>everything except the first node and then insert the first node into
>this list.

Deleted code says:
Sort_List( List ){
	if( List is 1 element ) return;		/* actually the code said 0 */
	save = dequeue_from_head( List );
	Sort_List( List );		/* sort rest of list */

	... then insert 'save' into 'List' in its correct position ...
}
This has not been mentioned for good reason:
	(1) It takes time proportional to N*N ( we don't like that type
	     thing..) which admittedly is fine for smallish lists

	(2) It takes N ( count 'em, N ) stack frames, which is REALLY
	    NASTY and NOT RECOMMENDED for anything but really small lists.
	    Your working storage ( i.e. your stack frames ) could well
	    take up more room than the list itself. Haven't allocated any
	    memory? Really?

	Moral:    No, Virginia, function calls are *not* free.

>Voila! Less than tweny lines of code in its entirety.  Think recursive, man!

If you want to do an insertion sort, do it iteratively, like they learned
you in school.  It would be even shorter (6 lines?) , use no working storage
other than a pointer or two, and would probably run three times as fast.

If you want to think recursive[ly], do a quick sort :-).
-- 
"If you aren't making any mistakes, you aren't doing anything".
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg

mccaugh@uiucdcsb.CS.UIUC.EDU (04/02/86)

 If you happen to know the size (= no. of nodes) in the linked list), why not:

 1) Make one pre-pass thru the list, initalizing corresponding slots of an
   "index vector" to the addresses of the nodes; then:

 2) use quicksort thru the index vector to sort the list?

    For a list of n > 2 nodes, expected performance will be O(n lg n).

kwh@bentley.UUCP (KW Heuer) (04/07/86)

In article <139200023@uiucdcsb> mccaugh@uiucdcsb.CS.UIUC.EDU writes:
>If you happen to know the size (= no. of nodes) in the linked list), why not:

Knowing the length of the list isn't critical, since it can be determined
with an additional O(n) pass, which will be dominated by the qsort.

>1) Make one pre-pass thru the list, initalizing corresponding slots of an
>  "index vector" to the addresses of the nodes; then:
>
>2) use quicksort thru the index vector to sort the list?

Several people have made this suggestion, or a similar one using heapsort.
Like the insertion sort, it's easy (given that qsort() has already been
written!); it has the added advantage of being O(n log n) time.

However, it also consumes O(n) space.  This is undesirable for a general-
purpose library routine.  Presumably it will call malloc() for the array.
What does it do if malloc() fails?  Also, one might want to implement it
using a language/environment that doesn't allow dynamic allocation.

Is it possible to implement a quicksort on a linked list directly?  (Just
wondering; I don't think it would be an improvement on the posted merge
sort, which is already O(n log n) time and O(1) space.)

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

greg@utcsri.UUCP (Gregory Smith) (04/10/86)

In article <695@bentley.UUCP> kwh@bentley.UUCP (KW Heuer) writes:
>Is it possible to implement a quicksort on a linked list directly?  (Just
>wondering; I don't think it would be an improvement on the posted merge
>sort, which is already O(n log n) time and O(1) space.)
>
>Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

Good point - it can be done on a doubly-linked list, as long as you have
a 'tail' pointer ( which takes only O(n) if you don't already have one).
The 'qsort' algorithm works its way in from both ends of the list,
swapping things until the list is separated into elements [less than] and
[greater than ] some pivot value. It then recursively qsort's the two
lists independently of each other.
This takes O(n log n) time in the best case and O(log n) space in the
worst case (if you count stack frames).

The rest of this posting is advice for anyone who is thinking of
implementing this ( or any qsort ).

The 'usual' way of picking a pivot is to take the first element in the list.
If the list is sorted already, the two resulting lists are [a single item]
and [the rest of the list]. This is the worst case and it reduces to an
insertion sort, O( n*n ) time. The trick here is to avoid the O(n)
stack usage that would result with a straightforward recursive
implementation. Say qsort(s,f) sorts the list starting at 's' and ending
at 'f'. It ends up with two lists 's1,f1' and 's2,f2'. What it should do
next is to remember the *longest* list in auto space ( done nicely by
setting its own parameters s and f to point to the sublist ) and then call
itself recursively to sort the shortest list. It should then *loop back*
to sort the longest list (if it is more  than one item). Thus a stack frame
is saved while sorting the longer one. Here is a skeleton:

quickie(s,f){
	do{
		...( do partitioning, find s1,f1,s2,f2)...
		...( and sizes n1 and n2 )...

		if( n1 > n2 ){
			s = s1; f = f1;		/* s=s1 normally redudant */
			if( n2>1) quickie(s2,f2);
		}else{
			s = s2; f = f2;		/* f=f2 normally redundant */
			if( n1>1) quickie(s1,f1);
		}
	}while( list_size(s,f) > 1);
}
In the worst case mentioned above, the above code makes no recursive calls.

Also: write this as a separate function which calls itself recursively,
and then call that non-recursively from the actual sort function. That
way things like the 'compare' function address which do not change
during a sort can be stuffed into an extern and will not need to be
passed to each call of quickie ( this can save mucho time and reduces
the stack frame ).  Use as few auto (as opposed to static) variables as
you can get away with in quickie, again to reduce stack space. You
probably only need s and f. Anything which is not needed after the
recursive call can be static.
Finally: if quickie sees a two-item list, it should just compare them,
swap if required, and then return, since the generic qsort algorithm
behaves badly on small lists. This should of course be *after* the
'do{' above.

Why I am I writing all of this? Good question... Mainly because I have
seen quicksort many times in textbooks, but rarely do they discuss the
above points. I guess it is usually 'beyond the scope...'. I even saw
one example which figured out which list was smaller, and then called
itself twice recursively, first sorting the smaller and then the larger.
The reason for this was not given - I assume it was a mistaken attempt
to reduce stack space, based on the way that that is done in the
non-recursive qsort algorithm - but I won't go into that. :-).

Enough Ramblin'...

-- 
"If you aren't making any mistakes, you aren't doing anything".
----------------------------------------------------------------------
Greg Smith     University of Toronto      UUCP: ..utzoo!utcsri!greg

kwh@bentley.UUCP (KW Heuer) (04/11/86)

In article <2516@utcsri.UUCP> utcsri!greg (Gregory Smith) writes:
>What it should do next is .., call itself recursively to sort the shortest
>list.  It should then *loop back* to sort the longest list (if it is more
>than one item).  Thus a stack frame is saved while sorting the longer one.

A good optimizer should compile tail-recursion into a jump anyway.  (But
there are few good optimizers by that standard.)

>Also: write this as a separate function which calls itself recursively,
>and then call that non-recursively from the actual sort function. That
>way things like the 'compare' function address which do not change
>during a sort can be stuffed into an extern and will not need to be
>passed to each call....  Use as few auto (as opposed to static) variables
>as you can get away with....

Although this does reduce the stack space and presumably the execution
time, it also makes it non-reentrant.  I have no objections to anyone who
wants to write this routine as described, but the non-reentrancy should be
mentioned under BUGS or WARNINGS.

Sometimes when I have to write a recursive routine, I code it in assembler
using the primitive procedure-calling mechanism (bsb-rsb on the vax), with
arguments in fixed registers, and enclose it in a C-callable envelope.  That
way it's both reentrant AND fast (and I keep a C version for portability).

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

franka@mmintl.UUCP (Frank Adams) (04/14/86)

In article <695@bentley.UUCP> kwh@bentley.UUCP writes:
>Is it possible to implement a quicksort on a linked list directly?  (Just
>wondering; I don't think it would be an improvement on the posted merge
>sort, which is already O(n log n) time and O(1) space.)

(1) Yes, it is certainly possible.

(2) No, it isn't an improvement over merge sort for linked lists.  It is
    also expensive to do the usual sort of worst-case reduction (by taking
    median of first, last, and middle as the separator value) which is common
    for quick sort.  Without this kind of optimization, quick-sort is
    O(n^2) on lists which are already sorted.  (It remains O(n^2) worst
    case, but the worst case becomes much less common.)

(3) The posted merge sort is O(log n) space.  You have to count stack space,
    and it recurses to depth O(log n).

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108