[net.arch] One really good use for BIG core memory

wa263@sdcc12.UUCP (bookmark) (09/23/86)


	I know one really good use for a BIG core memory.  There is a
simple O(n) sorting algorithm... suppose we want to sort n elements with
keys ranging in value from 1 to k (k >= n).  Just get a big array
of k slots, mark them all with some illegal value, then copy
each of the items to be sorted into the appropriate (key'th) place in the
array (this is called an "address calculation" sort).  Then scan through
the array and copy out the items in order.  The sort phase is O(n), and the
scan phase takes about constant+(n*otherconst) time, so the whole thing is O(n).
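
A minimal sketch of the scheme in C (the key range, sentinel, and names
here are illustrative, and this is the one-item-per-key case):

    #define K     1000000       /* key range 1..K (illustrative) */
    #define EMPTY 0             /* sentinel: assumes 0 is not a legal key */

    static int slot[K + 1];     /* one slot per possible key value */

    /* Sort n items (acting as their own keys, distinct, in 1..K). */
    void address_sort(int item[], int n)
    {
        int i, j;

        for (i = 1; i <= K; i++)        /* mark every slot illegal */
            slot[i] = EMPTY;
        for (i = 0; i < n; i++)         /* place each item at its key: O(n) */
            slot[item[i]] = item[i];
        for (i = 1, j = 0; i <= K; i++) /* scan all K slots in order */
            if (slot[i] != EMPTY)
                item[j++] = slot[i];
    }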

	(Obviously, this simple scheme doesn't deal with multiple items 
having the same key... but there are a variety of simple fixes for that.
If the item is big, so you don't want to copy it, you can just copy
a pointer to it, or its index in the source array, or whatever.  Please...
don't bother posting (or mailing) anything about similar quibbles--if you
can think of a quibble and its solution in less than five minutes,
so can everyone else including me.)
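
One such fix, to be concrete: keep a count per key instead of a sentinel.
This is ordinary counting sort; a minimal sketch under the same
illustrative assumptions as above, sorting bare keys (for big records
you would keep running sums of the counts as output offsets and move
pointers instead):

    static int count[K + 1];    /* occurrences of each key value */

    /* Sort n keys in 1..K in place; duplicate keys allowed. */
    void counting_sort(int item[], int n)
    {
        int i, j, c;

        for (i = 1; i <= K; i++)
            count[i] = 0;
        for (i = 0; i < n; i++)             /* tally each key: O(n) */
            count[item[i]]++;
        for (i = 1, j = 0; i <= K; i++)     /* emit key i count[i] times */
            for (c = 0; c < count[i]; c++)
                item[j++] = i;
    }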

	Now, this is a great sorting scheme.  The reason we worry
all the time about neat things like Quicksort, or some of the time about
horrors like twelve-tape polyphase merge sorts, is because we often don't
have enough memory to do a sort in the simple way detailed above.  However,
the more core you give me, the more able I'll be to use algorithms like this
sorting scheme that trade size for speed-- and the faster my programs will run.


					bookmark@sdcsvax.ARPA

				   ...!sdcsvax!bookmark

desj@brahms.BERKELEY.EDU (David desJardins) (09/25/86)

In article <600@sdcc12.UUCP> wa263@sdcc12.UUCP (bookmark) writes:
>
>	I know one really good use for a BIG core memory.  There is a
>simple O(n) sorting algorithm... suppose we want to sort n elements with
>keys ranging in value from 1 to k (k >= n).  Just get a big array of k
>slots, mark them all with some illegal value, then copy each of the items
>to be sorted into the appropriate (key'th) place in the array (this is
>called an "address calculation" sort).  Then scan through the array and
>copy out the items in order.  The sort phase is O(n), and the scan phase
>takes about constant+(n*otherconst) time, so the whole thing is O(n).

   The 'scan phase' takes ** O(k) ** time, not O(n).  In typical sorting
applications k >>> n.  If k ~ n, then we already have enough memory for
the size k table (if we have enough memory to quicksort the items
themselves!).

   -- David desJardins

jlg@lanl.ARPA (Jim Giles) (09/25/86)

In article <600@sdcc12.UUCP> wa263@sdcc12.UUCP (bookmark) writes:
>	I know one really good use for a BIG core memory.  There is a
>simple O(n) sorting algorithm... suppose we want to sort n elements with
>keys ranging in value from 1 to k (k >= n).  Just get a big array
>of k slots, mark them all with some illegal value, then copy
>each of the items to be sorted into the appropriate (key'th) place in the
>array (this is called an "address calculation" sort).  Then scan through
>the array and copy out the items in order.  The sort phase is O(n), and the
>scan phase takes about constant+(n*otherconst) time, so the whole thing is O(n).

This algorithm is O(k), not O(n).  If you want to sort 10^6 elements (about
2^20) and each element is a 32-bit integer, the scan covers k = 2^32 slots,
2^12 times as many as an O(n) pass would touch.  Of course you could map the
integers into a 20-bit address and then sort the mapped data, but this is
hashing.  Now you have to worry about collisions, efficiency of the hash
function, etc.  In general, quicksort would be faster than the direct O(k)
sort.
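
The mapping and its collision bookkeeping might look like this in C
(table size, sentinel, and hash are illustrative guesses, not anything
from the posting; note too that key % TSIZE preserves no order, so the
table contents still have to be sorted afterward):

    #define TSIZE (1L << 20)        /* 2^20 slots for ~10^6 keys */
    #define FREE  (~0UL)            /* sentinel: assumed not a legal key */

    static unsigned long table[TSIZE];  /* must be preset to FREE */

    /* Insert one 32-bit key, resolving collisions by linear probing. */
    void insert(unsigned long key)
    {
        unsigned long h = key % TSIZE;  /* fold 32 bits into 20 */

        while (table[h] != FREE) {      /* collision: extra probes the */
            h = (h + 1) % TSIZE;        /* direct O(k) scheme never does */
        }
        table[h] = key;
    }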

J. Giles
Los Alamos

ark@alice.UUCP (Andrew Koenig) (09/26/86)

>	I know one really good use for a BIG core memory.  There is a
> simple O(n) sorting algorithm... suppose we want to sort n elements with
> keys ranging in value from 1 to k (k >= n).  Just get a big array
> of k slots, mark them all with some illegal value, then copy
> each of the items to be sorted into the appropriate (key'th) place in the
> array (this is called an "address calculation" sort).  Then scan through
> the array and copy out the items in order.  The sort phase is O(n), and the
> scan phase takes about constant+(n*otherconst) time, so the whole thing is O(n).

Only trouble is that it's O(k), not O(n), because that's how long
it takes to scan all the way through your BIG memory.

When k is much larger than n, you can sort in O(n log(k)/R) time by
doing a radix sort with R-bit digits (radix 2^R).
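
For the curious, a sketch of the R = 8 case in C (LSD radix sort of
32-bit keys, radix 2^8, hence log(k)/R = 4 stable counting passes of
O(n + 256) each; names and the scratch-array interface are illustrative):

    #include <string.h>

    /* Sort n 32-bit keys in a[]; tmp[] is scratch of the same size. */
    void radix_sort(unsigned long a[], unsigned long tmp[], int n)
    {
        int pass, i;

        for (pass = 0; pass < 4; pass++) {
            long count[257];
            int shift = 8 * pass;

            memset(count, 0, sizeof count);
            for (i = 0; i < n; i++)         /* histogram of this byte */
                count[((a[i] >> shift) & 0xFF) + 1]++;
            for (i = 1; i <= 256; i++)      /* running sums = output offsets */
                count[i] += count[i - 1];
            for (i = 0; i < n; i++)         /* stable scatter on this byte */
                tmp[count[(a[i] >> shift) & 0xFF]++] = a[i];
            memcpy(a, tmp, n * sizeof(a[0]));
        }
    }

Because each pass is stable, four passes leave the keys fully ordered,
and the 257-entry count table stands in for the huge k-slot array.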

alverson@decwrl.DEC.COM (Robert Alverson) (09/26/86)

Clearly, this algorithm is O(k), but...usually, k is a linear
function of n, determined before starting the sort.  So it is
O(n) also.

Bob

jqj@gvax.cs.cornell.edu (J Q Johnson) (09/27/86)

wa263@sdcc12.UUCP proposes using large memories to obtain an O(n) sort
by using the obvious open-addressing approach.  Although I think his
general conclusion is valid (that there are many places where space/time
tradeoffs exist in algorithms), the particular case of open-address
sorting is not a very good example.

Note that your O(n) addressing-sort is successful only if the key set is
dense.  Consider sorting a list of n personal names and addresses.
Presumably n<4e10, since we have a bound based on the number of people
in the world, but the data requires, say, 60 chars to represent, so you
need perhaps 300 bits of address, or about 2^300 ~ 1e90 cells.  Your
scan phase is NOT O(n), but rather O(2^keysize).

Conclusion:  rating sorting algorithms on their space and time
requirements as a function only of n (# of elements to be sorted) is
faulty; algorithms should be compared on their performance as a
function of both n and keysize.


Incidentally, there ARE of course sorting algorithms that do better than
O(n*log n) on time performance and don't fall into the keysize trap as
deeply.  Consider all those radix sorts...

radford@calgary.UUCP (Radford Neal) (09/29/86)

RE: Discussion of bucket sorting as a use for huge memories.

Bucket sort is not O(n) in time, nor O(k), but O(n log n).  This is
because the time required to decode the addresses for your huge
memories is logarithmic in the size of the memory.

    Radford Neal
    The University of Calgary

desj@brahms.BERKELEY.EDU (David desJardins) (10/01/86)

In article <403@vaxb.calgary.UUCP> radford@calgary.UUCP (Radford Neal) writes:
>
>Bucket sort is not O(n) in time, nor O(k), but O(n log n).  This is
>because the time required to decode the addresses for your huge
>memories is logarithmic in the size of the memory.

   This doesn't really seem like a fair characterization, because the
logarithmic penalty for the actual comparisons/moves/calculations is
built into all sorting algorithms.  It is just as true that it takes
time O(log n) for a quicksort to exchange two entries.  You have to
specify the units to be used, and the usual choice is single machine
operations such as comparison or exchange, which take logarithmic time.

   -- David desJardins

phil@osiris.UUCP (Philip Kos) (10/03/86)

One thing I don't understand about the address sort... it doesn't seem
like it would be sufficiently general to make a useful utility anyway,
because I can't see how it could possibly handle duplicate keys.  Am I
missing something here?



Phil Kos			  ...!decvax!decuac
The Johns Hopkins Hospital                         > !aplcen!osiris!phil
Baltimore, MD			...!allegra!umcp-cs

"In the end there's still that song,
Comes crying like the wind
Down every lonely street that's ever been." - Robert Hunter

johnl@ima.UUCP (John R. Levine) (10/04/86)

In article <403@vaxb.calgary.UUCP> radford@calgary.UUCP (Radford Neal) writes:
>
>Bucket sort is not O(n) in time, nor O(k), but O(n log n).  This is
>because the time required to decode the addresses for your huge
>memories is logarithmic in the size of the memory.

Not really true -- memory addresses are not decoded by a tree but more
typically by broadcasting the address and having all of the memory units
look for their own addresses in parallel.  More realistically, by the time
you take into account caches, pipelined buses, interleaved memory, and all
the other speed-up tricks commonly done in hardware, the address decode time
is the least of your problems.
-- 
John R. Levine, Javelin Software Corp., Cambridge MA +1 617 494 1400
{ ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.EDU
The opinions expressed herein are solely those of a 12-year-old hacker
who has broken into my account and not those of any person or organization.