[comp.lang.c] Character String Permutation

mcvic@prcrs.UUCP (David J. McVicar) (09/14/89)

I have wrestled with this annoying algorithm long enough.  Can anyone
on the net lend assitance?

Here's what I'm trying to do...

I have written a small C program to list out all the permutations
of a given text string.

(ex. unix unxi uinx uixn uxni uxin nuix nuxi niux nixu nxui nxiu iunx iuxn inux 
     inxu ixun ixnu xuni xuin xnui xniu xiun xinu)

But, when a string containing repeating characters is entered, I would like
to output only unique strings.

(ex.  foo foo ofo oof ofo oof  <=  my current program
      foo ofo oof	       <=  desired output)


An obvious solution for UNIX types is to pipe the output into sort -u, but for 
long character strings this would be prohibitive.  Is there any way, given 
the nature of my solution below, to determine repeats in permutations on the 
fly?

Thanks in advance for any help with this one.

Dave McVicar (uunet!prcrs!mcvic)

======  The program below will print out all permutations, with repeats =====

#include <stdio.h>

#define  COLWID		80
extern char  *malloc();

int      *pa;
char     *str;
char     *outbuf;
int	  size;

int func()
{
	static int    col = 0;
	register int i;

	for(i=0; i < size; ++i)
		outbuf[i] = str[pa[i]];
	outbuf[i] = '\0';
	if ((col += (size + 1)) >= COLWID) {
		printf("\n");
		col = size + 1;
	}
	printf("%s ", outbuf);
	return (1);
} /* end func() subroutine */


int
doperm(offset, func)
register int offset;
int    (*func)();
{
	int               index;
	int               k, skip;

	if (offset == size)
		return( (*func)() );
	for (index = 0; index < size; ++index) {
		for (k = 0; k < offset; ++k)
			if (skip = (index == pa[k]))
				break;
		if (skip)
			continue;
		pa[offset] = index;
		doperm(offset + 1, func);
	}
	return;
}

main(argc, argv)
int 		argc;
char 	      **argv;
{
	register int i,j;

	if (argc == 1) {
		fprintf (stderr, "usage:  permute <string>\n\n");
		exit (2);
	}

	str = argv[1];
	size = strlen(str);
	if (size == 0)
		exit(0);
	else if (size == 1) {
		printf("%s\n", argv[1]);
		exit(0);
	}
	pa = (int *)malloc(size * sizeof(int));
	outbuf = malloc(size + 1);

	for (i = 0; i < size; ++i) {
		pa[0] = i;
                for (j = 1; j < size; ++j)
			pa[j] = 0;
	        doperm(1, func);
	}
	free(pa);
	free(outbuf);
	exit(0);
}

======================

bill@twwells.com (T. William Wells) (09/15/89)

In article <193@prcrs.UUCP> mcvic@prcrs.UUCP (David J. McVicar) writes:
: I have written a small C program to list out all the permutations
: of a given text string.
:
: (ex. unix unxi uinx uixn uxni uxin nuix nuxi niux nixu nxui nxiu iunx iuxn inux
:      inxu ixun ixnu xuni xuin xnui xniu xiun xinu)
:
: But, when a string containing repeating characters is entered, I would like
: to output only unique strings.
:
: (ex.  foo foo ofo oof ofo oof  <=  my current program
:       foo ofo oof            <=  desired output)

Group identical characters of the input string together. Sorting them
will do nicely. Do the standard recursive generation of the
permutations, except that, when replacing a character on a given
level, if it is the same as the old character for that level, skip the
character.

Warning: never compiled C code follows. Not only that, but this naive
algorithm is O(Length**Length) rather than O(Length!). Fixing that is
left as an exercise for the student. :-)

char    Selected[MAX];          /* initially all zero */
char    Perm_vector[MAX+1];     /* bad name, my APL past is showing! */
char    Sorted_vector[MAX];     /* sorted alphabetically, nul not allowed */
int     Depth;                  /* initially zero */
int     Length;                 /* chars Sorted_vector, <= MAX */

void
generate()
{
	int     i;
	int     lastc;

	Perm_vector[Depth] = 0;
	if (Depth == Length) {
		puts(Perm_vector);
		return;
	}
	for (i = 0; i < Length; ++i) {
		if (Selected[i]
		    || Sorted_vector[i] == Perm_vector[Depth]) {
			continue;
		}
		Perm_vector[Depth] = Sorted_vector[i];
		++Depth;
		Selected[i] = 1;
		generate();
		Selected[i] = 0;
	}
}

---
Bill                    { uunet | novavax | ankh | sunvice } !twwells!bill
bill@twwells.com

userAKDU@ualtamts.BITNET (Al Dunbar) (09/18/89)

In article <1989Sep14.184346.8361@twwells.com>, bill@twwells.com (T. William Wells) writes:
>In article <193@prcrs.UUCP> mcvic@prcrs.UUCP (David J. McVicar) writes:
>: I have written a small C program to list out all the permutations
>: of a given text string.
>:
>: (ex. unix unxi uinx uixn uxni uxin nuix nuxi niux nixu nxui nxiu iunx iuxn inux
>:      inxu ixun ixnu xuni xuin xnui xniu xiun xinu)
>:
>: But, when a string containing repeating characters is entered, I would like
>: to output only unique strings.
>:
>: (ex.  foo foo ofo oof ofo oof  <=  my current program
>:       foo ofo oof            <=  desired output)
>
>Group identical characters of the input string together. Sorting them
>will do nicely. Do the standard recursive generation of the
>permutations, except that, when replacing a character on a given
>level, if it is the same as the old character for that level, skip the
>character.
 
That  would  certainly  work,  but  then  David's  unix --> xinu
example would become inux --> xuni. Perhaps he has a reason  for
the  resulting  sequence to start with the original word and end
with it spelled in reverse. If this is  the  case,  change  your
comment  to:  "..  except  that, when replacing a character on a
given level, if it is the same as any character already used for
that  level,  skip  the character". This could easily be done by
strcat'ing each character to be used to an auto string variable,
but  only  if  strchr'ing  for it first indicates that it is not
there yet. The space required for this variable  would  decrease
by  one for each level, requiring a total of ((size+1)*(size)/2)
bytes.
 
/Al Dunbar - Civil Eng. Univ of Alberta, CANADA

doug@ozdaltx.UUCP (doug) (09/20/89)

Much easier to get algorithms from _Collected Algorithms of the ACM_.
All sorts of interesting (and nasty) problems like yours have published
solutions.  I have implemented a version with non-repeating characters
and I believe I remember one with repeating characters.