[net.lang.c] places where sets are useful

jss@sjuvax.UUCP (J. Shapiro) (02/26/85)

[Pacman's revenge...]

	It seems to me that in compiler writing, the case where one has
multiple transitions on a single input character is an ideal case for sets.
This is of course only an intermediate stage, so perhaps it does not
justify a full blown set type.

	A better example is one of modal software, where one's modes are
defined by a set of their names... For you pascalions:

type
	mymodes = (insert, error, append, ...);
var
	status = set of mymodes;

begin
...
if (status * [insert, append]) <> [] then begin....
...
end
...
end.

Is much more readable and therefore less likely to get mangled in software
rewrites then is the exhaustive "if" phrasing.


Jon Shapiro

kdmoen@watcgl.UUCP (Doug Moen) (02/27/85)

> type
> 	mymodes = (insert, error, append, ...);
> var
> 	status = set of mymodes;
> begin
> ...
> if (status * [insert, append]) <> [] then begin....
> ...
> end
> ...
> end.
> 
> Is much more readable and therefore less likely to get mangled in software
> rewrites then is the exhaustive "if" phrasing.

I agree.  Here's how I would write this in C:

enum {insert, error, append, ... };
#define elem(n) (1 << (int)(n))
short status;
...
	if (status & (elem(insert) | elem(append))) {
		...
	/* add 'error' to set 'status' */
	status |= elem(error);

	/* remove 'error' from set 'status' */
	status &= ~elem(error);

jss@sjuvax.UUCP (J. Shapiro) (03/05/85)

>> type
>> 	mymodes = (insert, error, append, ...);
>> var
>> 	status = set of mymodes;
>> begin
>> ...
>> if (status * [insert, append]) <> [] then begin....
>> ...
>> end
>> ...
>I agree.  Here's how I would write this in C:
>
>enum {insert, error, append, ... };
>#define elem(n) (1 << (int)(n))
>short status;
>...
>	if (status & (elem(insert) | elem(append))) {
>		...
>	/* add 'error' to set 'status' */
>	status |= elem(error);
>
>	/* remove 'error' from set 'status' */
>	status &= ~elem(error);

	As the original poster, I thought I had said that enums are problematic
because many compilers don't support them.  Also, this leaves one with the
limit that the number of elements of the set must be <= number of bits in a
long.

It has been suggested that this is equally much a problem in pascal, but in
reality every compiler I have worked with handles the code extensions
automatically. Why should the programmer be obliged to write his own
bitfield manipulation code when this is something the compiler could (and
in pascal does) do perfectly well for itself.

	I am all in favor of hands on control, but only where doing the work
yourself really buys you something.  Any compiler which generates worse
code than I do by hand to handle large bitsets is making a concerted effort
to generate bad code.

	Incidentally, many people provided macros that handle everything for
sets which can be described with the number of bits in a long. *All* of
them said that "extending this for longer sets is trivial."  No one to date
has bothered to cough up code I believe (with one exception - but I would
need to go searching my mailbox to find out who).

	If C is really so adequate to do the job, would somebody cough up a
working set of macros (or a preprocessor - which I believe is the necessary
item) which:

	a) all of us will agree to use
	b) all of us will be satisfied with as something to augment the
		language.
	c) will be useable with any C compiler (excludes enums...)

I can't think of anything which meets all of these qualifications.... I
hope I am wrong ;-)

Jon Shapiro
happily provided

sasaki@harvard.ARPA (Marty Sasaki) (03/06/85)

I use sets in two places, in recursive decent parsers, and in my chess
playing program. If sets were implemented in C, then certain
optimizations could be done by the compiler. Of couse, both places
require more bits than provided in a long, but both of the Pascal
compilers that I use allow at least 64 items in a set (DEC's Pascal for
VMS, and Oregon Software's compiler for PDP-11's).

The chess program represents chess boards as a set of 64 squares. By
doing set operations on these "bitboards" one can generate moves, check
for attacking pieces, evaluate pawn structure, etc. These operations can
be done very fast.

The recursive decent parser uses sets for error recovery. When a
procedure which deals with a non-terminal is called, it is passed the
set of symbols in the follow set for the non-terminal. If an error
occurs, symbols are skipped until a symbol in the input stream is found
that is in the follow set. The procedure is then returns.
-- 
			Marty Sasaki
			Havard University Science Center
			sasaki@harvard.{arpa,uucp}
			617-495-1270

roger@ll-sst (Roger Hale) (03/06/85)

(From: Brandon Allbery <ncoast!bsa>)
> In a project I'm involved in, I was going to use the following:
> 
> #define PLUS |
> #define MINUS &~
> #define IN &
> 
> #define ELEM_1 01
> ...
> if (ELEM_1 IN bunch) ...
(Forgive me if I've misquoted slightly)

I find this appealing, in a way.  My question for the day:

I can say
	bunch PLUS= ELEM_1;
but not
	bunch MINUS= ELEM_1;
Is this an argument for an
	&~=
assignment operator?  :=)

Yours,		Roger Hale	(roger@ll-sst.arpa)