[comp.lang.c] Wanted: tricky algorithm

enevill@acorn.co.uk (Edward Nevill) (04/14/89)

I have an array of pointers to strings in a blob of memory.
The array of pointers has just been sorted according to the
strings in this blob of memory. I now want to re-order
the blob of memory according to the array of pointers.

Does anyone have, or can anyone think of an algorithm to
do the above without using an auxiliary array and with a
fairly decent running order (ie < N*N).

Thanks in advance,

Edward Nevill (enevill@acorn.co.uk)

flee@shire.cs.psu.edu (Felix Lee) (04/16/89)

In article <749@acorn.co.uk>,
   enevill@acorn.co.uk (Edward Nevill) wants
to reorder strings in a blob of memory according to a sorted array of
pointers to the strings, using constant memory.

The obvious algorithm (obvious to me, at least) runs in something like
O(n*n + n*m) time, where n is the number of strings and m is the size
of the blob of memory.

Given
  a[0 .. m-1] the blob of memory
  s[0 .. n-1] indexes into a[], referring to the strings
  l[0 .. n-1] the lengths of the strings

  // invariant: a[0 .. p-1] contains the strings s[0 .. i-1]
  //   in the right order
  p := 0
  for i in 0 .. n-1 do
    swap a[p .. s[i]-1], a[s[i] .. s[i]+l[i]]

    for j in i+1 .. n-1 do      // adjust the string pointers
      if s[j] < s[i] then s[j] := s[j] + l[i]

    s[i] := p; p := p + l[i]

Note that the "swap" exchanges unequal-sized segments.  This can be
done in place.

Less obvious algorithms are . . . less obvious.  I can't seem to come
up with any.  The amount of character swapping, O(n*m), feels like it
could be reduced by something suitably clever.  Maybe if the memory
constraint were relaxed a bit. . . .
--
Felix Lee	flee@shire.cs.psu.edu	*!psuvax1!shire!flee

rwhite@nusdhub.UUCP (Robert C. White Jr.) (04/18/89)

in article <749@acorn.co.uk>, enevill@acorn.co.uk (Edward Nevill) says:
> Does anyone have, or can anyone think of an algorithm to
> do the above without using an auxiliary array and with a
> fairly decent running order (ie < N*N).

The clasic "heap sort" is an in-place sort useful for such things,
though it wouoldn't be that useful here becuse you have an index into the
memory in the form of your array of pointers.  If you are going to
do anything in-place with the block of memory itself your strings
would need to be maximally aligned (e.g. alligned w.r.t. the longest
string.) and you would need one temporary holder string; then preform
a modified insertion-exchange: remove the string in lowest memory to
the temporary variable and adjust it's pointer; move the losest value
into the lowest (not vacant) slot; place the temporary back into the
space vacated by the exchange; advance the low water mark; repeat.
"back-pointers" from memory are helpful, but not necessary.

If you can approprate a large enough block of memory a sequential
copy using the indexing array will yeild a Big-O of N.  If you
don't have the memory but you do have the time, you could build a
memory image file (laid out as you would lay out the memory block
exactly) and then bulk-load the file back into place in memory.
This will also produce a big-O of N, but the elementary size of each
cycle is somewhat larger/slower because of the file opp.

You have, however piqued my curisoty, what do you need with the sorted
memory block while you have the pointer array.  I's not that I can't
think of a few reasons myself, but most of the ones that come to mind
first are rahter unlikely.

Rob.

Disclamer:  These opinions belong to no one; they are however 
copyrighted by a small corporation in the caman islands.  The
D.E.A. is investigating...

tps@chem.ucsd.edu (Tom Stockfisch) (04/19/89)

In article <1317@nusdhub.UUCP> rwhite@nusdhub.UUCP (Robert C. White Jr.) writes:
>in article <749@acorn.co.uk>, enevill@acorn.co.uk (Edward Nevill) says:
>> Does anyone have, or can anyone think of an algorithm to
>> do the above
  [basically, sort a bunch of strings w/o allocating any more
   memory, given that you have an array of pointers to the strings]
>>without using an auxiliary array and with a
>> fairly decent running order (ie < N*N).
 
>The clasic "heap sort" is an in-place sort useful for such things,
>though it wouoldn't be that useful here becuse you have an index into the
>memory in the form of your array of pointers....

You've got a bunch of STRINGS, i.e., ascii data.
There in a block of memory but spread all over the place.
You want them sorted and compacted.
You don't want to allocate any more memory.
You don't want to spend a ridiculous amount of time.

The solution:

Sort the pointers in place using qsort().
Write the strings out in order to a temporary file.
Read the temporary file back in starting at the base of your memory
unlink() the temporary.
-- 

|| Tom Stockfisch, UCSD Chemistry	tps@chem.ucsd.edu