[comp.lang.c] How to do sets?

taylorj@yvax.byu.edu (03/12/89)

This may be a dumb question, but I haven't been able to find anyone yet who
could answer it...

Is there a way to do sets in C (like in Pascal)?  That is, check for set
inclusion (char in ["0".."9"]), and perform unions and intersections.  And the
answer is not "use bitwise operators," I already know how to do that and would
rather not if there's an easier way, especially with large sets requiring 32
bytes or more.

Thanks.

henry@utzoo.uucp (Henry Spencer) (03/16/89)

In article <518taylorj@yvax.byu.edu> taylorj@yvax.byu.edu writes:
>Is there a way to do sets in C (like in Pascal)?  That is, check for set
>inclusion (char in ["0".."9"]), and perform unions and intersections.  And the
>answer is not "use bitwise operators," I already know how to do that and would
>rather not if there's an easier way, especially with large sets requiring 32
>bytes or more.

Why are you averse to bitwise operators?  That's how Pascal sets are almost
invariably implemented.  One can write C preprocessor macros that cover up
the details in somewhat the same way that Pascal does.
-- 
Welcome to Mars!  Your         |     Henry Spencer at U of Toronto Zoology
passport and visa, comrade?    | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

mike@hpfcdc.HP.COM (Mike McNelly) (03/18/89)

> This may be a dumb question, but I haven't been able to find anyone yet who
> could answer it...

> Is there a way to do sets in C (like in Pascal)?  That is, check for set
> inclusion (char in ["0".."9"]), and perform unions and intersections.  And the
> answer is not "use bitwise operators," I already know how to do that and would
> rather not if there's an easier way, especially with large sets requiring 32
> bytes or more.

> Thanks.

No set operations are inherent in the C language. It's easy enough, however,
to write a library with routines to 
	1) create a set of a specified size
	2) manipulate the set (e.g. intersection)
	3) free a set
That's precisely how set operations are implemented in languages such as 
Pascal.

Sorry, I can't provide such a library since it's part of our proprietary
software.

Mike McNelly	mike%hpfcla@hplabs.hp.com

daveb@geaclib.UUCP (David Collier-Brown) (03/24/89)

From article <5080031@hpfcdc.HP.COM>, by mike@hpfcdc.HP.COM (Mike McNelly):
quotes someone who says:
> Is there a way to do sets in C (like in Pascal)?  That is, check for set
> inclusion (char in ["0".."9"]), and perform unions and intersections. 

Sure, here's a procedural model in an ad-hoc specification language:

define
	create: size of set -> set | error
	assign: set x value -> set | error
	in: set x value -> YES | NO
	and: set x set -> set | error
	... other binary booleans ditto ...
	uncreate: set -> error trap
implement as if
	create(size) ::=
		malloc((size/BITSPERWORD)+2)[0] = signature
					    [1] = size - 2
	in(set,value) ::=
		if set[0] == signature 
			if value <= size
				set[2+value/BITSPERWORD][value%BITSPERWORD]
				(see note below, though)
			else
				NO
		else
			error
	and(set1,set2) ::=
		if set1[0] == set2[0] == signature 
		   && set1[1] == set2[1] 
			for i=0; i < size; i++
				set1[2+i] &= set2[2+i]
		else
			error
	uncreate(set) ::=
		if set[0] == signature
			set[0] = error
		else
			error
   As you can probably see from the shorthand above, a set is a
behavior, implementable as procedures acting on a sequence of
storage units which can be accessed by the bit, with a signature and
a length.
   The signature is just for run-time checking, and the length is
mostly for the "in" operation, so it doesn't lie when you fall off
the end.  In an implementation one could probably use a struct so as
to allow most sanity checks to be compiled to (constant-expression ==
variable)?  Dirty tricks with data encoding can also improve
performance. (eg: make the signature & length a float on one
particular machine: the sanity check becomes an exception mechanism
in the hardware)
  Constantly-varying lengths can be done easily with realloc, by
the way, so you can start with a size hint instead of a value.  The
extra overhead doesn't cost much unless you start using the
variability feature a lot.

Etc, etc, ad infinitum.  Indeed, ad nauseam.  C is a subtle
language, although by no means a sophisticated one.

--dave (a philosopher, not a sophist) c-b
   
-- 
 David Collier-Brown.  | yunexus!lethe!dave
 Interleaf Canada Inc. |
 1550 Enterprise Rd.   | He's so smart he's dumb.
 Mississauga, Ontario  |       --Joyce C-B