swarbric@tramp.Colorado.EDU (Frank Swarbrick) (05/07/88)
A few day ago I wrote a messages asking why something like foo({1,2,3,4}) is not allowed. I suppose the reasons make some sense, but I still think that something like foo((int []){1,2,3,4}) or foo((int *){1,2,3,4}) should be allowed. You will see the reason I did not even think of this second way at the time is because my "foo" is actually declaired as foo(void *), so it wouldn't really matter what type of pointer you cast it too. Anyway, the following program is my attempt to allow some type of sets in C. Maybe my Pascal background is obscuring my seeing a better way of doing this, but at least it was a learning experience for me. The program is three files, really, but you'll have to split them up yourself if you want to use them. And you can use them, if you like. The third one is just two "main" functions to make it easier for you to understand how to use the functions. I must admit that I'm not very good at documentation. --------------------------CUT HERE------------------------------------------ /* sets.h Header file for sets.c Written by Frank Swarbrick, April/May 1988 May be used without permission. */ #include <mem.h> #ifndef _BYTE #define _BYTE typedef unsigned char byte; #endif #ifndef _BOOL #define _BOOL typedef enum {false = 0, true = !false} bool; #endif int whereinset(const void *el, const void *set, size_t setsize, size_t typesize); void *setplus(const void *el, void *set, size_t *setsize, size_t typesize); void *setminus(const void *el, void *set, size_t *setsize, size_t typesize); #define inset(a,b,c,d) (whereinset((a),(b),(c),(d)) != -1) /* sets.c Set functions. Written by Frank Swarbrick, April/May 1988 May be used without permission. */ #include "sets.h" #include <stdio.h> #include <alloc.h> #include <process.h> /* whereinset() * * *el points to the element you are searching for. set is what you are * searching. *set may be either a pointer to a scalar type or an array * of a scalar type. setsize is how many elements are in the set, and * typesize is the size of whatever el and set point to. This function * returns the position in set where *el first occurs, or it returns -1 * if *el cannot be found in set. */ int whereinset(const void *el, const void *set, size_t setsize, size_t typesize) { register int i, j; /* loop variables */ register bool in = false; /* go through each element of the set */ for (i = 0; !in && i < setsize * typesize; i += typesize) { in = true; /* compare the set to the element; byte by byte */ for (j = 0; in && j < typesize; j++) in = (*((byte *)el+j) == *((byte *)set+i+j)); } return (in) ? (i/typesize)-1 : -1; } /* setplus() * * set *must* be a pointer to a scalar type. It cannot be an array. el * is the same as in whereinset(). setsize is a *pointer* to the size of * the set. typesize is the same as in whereinset(). This function adds * *el to the set. It returns the set with *el added. *setsize is * incremented. */ void *setplus(const void *el, void *set, size_t *setsize, size_t typesize) { register int i; /* loop variable */ (byte *)set = (*setsize == 0) ? (byte *)malloc(typesize) : (byte *)realloc(set, (*setsize+1)*typesize); if (set == NULL) { fprintf(stderr, "Out of memory\n"); exit(1); } /* add the element to the end of the set; byte by byte */ for (i = 0; i < typesize; i++) *((byte *)set+(*setsize*typesize+i)) = *((byte *)el+i); (*setsize)++; return set; } /* setminus() * * All values are the same as setplus(). This function first makes sure that * *el is in the set. If *el is in the set, it takes it out and returns the * new set. If *el is not in the set it just returns the old set. */ void *setminus(const void *el, void *set, size_t *setsize, size_t typesize) { register int p; /* position of element in set */ if ((p = whereinset(el, set, *setsize, typesize)) != -1) { p *= typesize; /* move remainder of set over the top of removed element */ memmove((byte *)set+p, (byte *)set+p+typesize, (*setsize-1)*typesize-p); (*setsize)--; } return set; } /* settest.c Tests set functions. Link with sets.c. Written by Frank Swarbrick, April/May 1988. */ #include "sets.h" #include <stdio.h> /* uncomment one of these depending on which main() you want to use */ /* #define MAIN1 */ /* #define MAIN2 */ #ifdef MAIN1 void main() { unsigned amt; /* number of elements to be put in set */ long *longset; /* set of long integers */ long el; /* element of set to be added or removed */ unsigned count = 0; /* number of elements actually in set */ register int i; /* loop variable */ longset = NULL; /* must be done to get rid of annoying warning */ printf("How many elements? "); scanf("%d", &amt); /* get elements and add them to the set */ for (i = 0; i < amt; i++) { printf("\n#%d: ", i+1); scanf("%ld", &el); longset = (long *)setplus(&el, longset, &count, sizeof(*longset)); } /* while there are still elements in the set display them and ask what elements to remove */ while (count > 0) { for (i = 0; i < count; i++) printf("%ld ", longset[i]); printf("\nDispose of what number? ", longset); scanf("%ld", &el); longset = (long *)setminus(&el, longset, &count, sizeof(*longset)); } } #endif #ifdef MAIN2 void main() { /* constant set of unsigned integers */ const unsigned intset[] = {0U, 12U, 2112U, 255U, 256U, 32767U, 65535U}; unsigned num; /* number to be searched for in set */ printf("Type a number: "); scanf("%u", &num); if (inset(&num, intset, sizeof(intset)/sizeof(*intset), sizeof(*intset))) printf("%u is in the set\n", num); else printf("%u is not in the set\n", num); } #endif Frank Swarbrick (and his cat) swarbric@tramp.Colorado.EDU ...!{ncar|nbires}!boulder!tramp!swarbric "Live to Rock -- Rock to live"
U23405@UICVM.BITNET (08/12/88)
(Please don't flame at me, I'm just being curious.) :-) In PASCAL, there is a datatype called SET that (obviously) deals with sets. What I would like to know is whether anyone has tried to implement this datatype in C (either by a system header file or by typedefs). Any comments would be helpful. (And I mean *ANY* comments) Thanks, Michael Steiner Email: U23405@UICVM
dawson@fpssun.fps.com (Bryan Dawson ext 1184) (08/15/88)
In article <8808121452.AA14152@ucbvax.berkeley.edu>, U23405@UICVM.BITNET writes: > In PASCAL, there is a datatype called SET that (obviously) deals with sets. > What I would like to know is whether anyone has tried to implement this > datatype in C (either by a system header file or by typedefs). Any comments > would be helpful. (And I mean *ANY* comments) > There is an implementation of SMALL sets (using the bits in a presumably 32 bit unsigned int as a bit array of elements) which is outlined in 'C A Reference Manual' by Harbison & Steele (Pg 174 and 175 in 2nd ed.). Basically, they #define macros which use bitwise operations to implement intersection, union, difference, membership, etc... I have used a refined version of this technique to implement SPECIFIC well defined sets for applications which really needed them. I believe that a well behaved GENERIC set type similar to PASCAL would take quite a bit of effort to implement well in C. Luckily, I find that it is often possible to tailor the requirements to meet the application and thus enable a simple implementation similar to the one in H&S quite workable. -- ============================================================================ = Bryan Dawson tektronix!fpssun!dawson ...The story you have just = = read is true, reality was subsequently changed to protect the innocent. = ============================================================================
swarbric@tramp.Colorado.EDU (Frank Swarbrick) (08/16/88)
Will Jim Reid (I think) send me mail? I lost your letter before I could send you the files you requested. Frank Swarbrick (and, yes, the net.cat) swarbric@tramp.Colorado.EDU ...!{ncar|nbires}!boulder!tramp!swarbric "There's a guy on my block, he lives for Rock; He plays records day and night." --The Kinks
leo@philmds.UUCP (Leo de Wit) (08/17/88)
In article <8808121452.AA14152@ucbvax.berkeley.edu> U23405@UICVM.BITNET writes: >(Please don't flame at me, I'm just being curious.) :-) > >In PASCAL, there is a datatype called SET that (obviously) deals with sets. >What I would like to know is whether anyone has tried to implement this >datatype in C (either by a system header file or by typedefs). Any comments >would be helpful. (And I mean *ANY* comments) An obvious way to implement this in C could be using an (array of) integer(s) and the logical operations &, |, ~, and ^ to perform operations on these values. The bits of the integer(s) should each correspond to a data element of a certain type (the type in: set of type). So the GAME of this implementation is such that each bit of the SET MATCHes a value of the base type (been watching tennis a bit too much lately 8-). Of course this rules out type checking - unless you do something like putting the integer(s) into a struct and adding a type identifier to this struct. The nice thing about using the logical operators on integers is that this opens the possibility to code the stuff using macro's, thus improving speed. The use of bit arrays forbids any large base type (for example: set of integer). Dunno how that's handled in Pascal, but I seem to remember that the language does not pose any restrictions on the base type but implementations do. In this case one will have to resort to linked lists or something alike. Sounds interesting ... might even give it a try myself 8-). Leo.
greim@sbsvax.UUCP (Michael Greim) (08/17/88)
In article <8808121452.AA14152@ucbvax.berkeley.edu> U23405@UICVM.BITNET writes: <In PASCAL, there is a datatype called SET that (obviously) deals with sets. <What I would like to know is whether anyone has tried to implement this <datatype in C (either by a system header file or by typedefs). Any comments <would be helpful. (And I mean *ANY* comments) In article <2732@boulder.Colorado.EDU>, swarbric@tramp.Colorado.EDU (Frank Swarbrick) writes: < Well, I have written three functions called inset(), setplus(), and setminus() < which use arrays of any type as sets. inset() tells you if something is in < your set, setplus() adds something to your set, and setminus() takes something < out. It's not as good as Pascal's real sets, but it's OK. I could send you < them if you like. I have a set of functions too. You can do all the usual set operations on any data type which convertible to integer. This may not be what you want, but if you are interested I can send you a copy. (First I will have to improve the code, though :-) -mg -- UUCP: ...!uunet!unido!sbsvax!greim | Michael T. Greim or greim@sbsvax.UUCP | Universitaet des Saarlandes CSNET: greim%sbsvax.uucp@Germany.CSnet| FB 10 - Informatik (Dept. of CS) ARPA: greim%sbsvax.uucp@uunet.UU.NET | Bau 36, Im Stadtwald 15 voice: +49 681 302 2434 | D-6600 Saarbruecken 11, West Germany # include <disclaimers/std.h>
ben@bosco.UUCP (ben ullrich) (08/18/88)
``Programming in C'' by Lawrence H. Miller & Alexander E. Quilici (New York: John Wiley & Sons, 1986) has a section on sets as an abstract data type. i have found this book to be of immense value for one who is learning c from pascal and other 'usual' languages. ...ben -- ben ullrich sybase, inc. database administrator 6475 christie avenue mis department emeryville, ca 94608 {pyramid,pacbell,sun,mtxinu,capmkt}!sybase!ben 415 - 596 - 3654
robert@pvab.UUCP (Robert Claeson) (08/20/88)
In article <594@philmds.UUCP>, leo@philmds.UUCP (Leo de Wit) writes: > In article <8808121452.AA14152@ucbvax.berkeley.edu> U23405@UICVM.BITNET writes: > >In PASCAL, there is a datatype called SET that (obviously) deals with sets. > >What I would like to know is whether anyone has tried to implement this > >datatype in C (either by a system header file or by typedefs). Any comments > >would be helpful. (And I mean *ANY* comments) > An obvious way to implement this in C could be using an (array of) > integer(s) and the logical operations &, |, ~, and ^ to perform > operations on these values. I recall there was a set datatype for C posted to comp.sources.something <= 1 1/2 year ago.
jonasn@ttds.UUCP (Jonas Nygren) (08/20/88)
Why not K&R Appendix A, 8.5 and 7.8, 7.9, 7.10 ???
eric@snark.UUCP (Eric S. Raymond) (08/22/88)
It's not like this is a difficult problem or anything. /* bitmacros.h -- bitmap-allocation and get/set macros */ #define ALLOC_BITS(n) calloc((unsigned) ALLOC_LEN(n), 1) #define REALLOC_BITS(p, m, n) if (m>n)(void)realloc(p,(unsigned)ALLOC_LEN(n)) #ifndef UNPACKED #define ALLOC_LEN(n) (((n) >> 3) + 1) #define GET_BIT(map, b) ((map)[(b) >> 3] & (1 << ((b) % 8))) #define SET_BIT(map, b) ((map)[(b) >> 3] |= (1 << ((b) % 8))) #define CLEAR_BIT(map, b) ((map)[(b) >> 3] &= ~(1 << ((b) % 8))) #else #define ALLOC_LEN(n) (n) #define GET_BIT(map, b) map[b] #define SET_BIT(map, b) (map[b] = 1) #define CLEAR_BIT(map, b) (map[b] = 0) #endif /* !UNPACKED */ /* bitmacros.h ends here */ Can we drop this subject now? Please? -- Eric S. Raymond (the mad mastermind of TMN-Netnews) UUCP: ..!{uunet,att,rutgers!vu-vlsi}!snark!eric @nets: eric@snark.UUCP Post: 22 South Warren Avenue, Malvern, PA 19355 Phone: (215)-296-5718
rhys@batserver.cs.uq.oz.au (Rhys Weatherley) (06/28/90)
gwyn@smoke.BRL.MIL (Doug Gwyn) writes: >In article <1990Jun26.144523.8804@cs.utk.edu> wozniak@utkux1.utk.edu (Bryon Lape) writes: >> I know that the enum type is simliar to sets, but is there >>anyway to have sets in C like there is in Pascal? >Yes, roll your own. It's not hard. >>What of the set operators (+,-,*,/)? >These are set operators? Yep, In Pascal they are respectively union, difference, intersection, and ... something else I can't remember :-). I replied to Bryon myself, but I'll also post an excerpt from my reply here as well to belay more unnecessary bandwidth wasting discussion on the topic: The easiest way to get a set is to use an 'int' or 'long' to represent it, and deal with bit fields. e.g. the following macros, defs, etc: /* The set type for elements 0..n */ typedef int SetType; /* OR... */ typedef long SetType; /* Convert an element (0..n) into a set */ #define SET_ELEM(x) (1 << (x)) /* Get the union of two sets */ #define UNION(x,y) ((x) | (y)) /* Get the intersection of two sets */ #define INTERSECT(x,y) ((x) & (y)) /* Diference operator */ #define DIFFERENCE(x,y) ((x) & ~(y)) /* See if something (x) is a member of a set (y) */ #define MEMBER(x,y) (SET_ELEM(x) & (y)) /* See if something (x) is a subset of a set (y) */ #define SUBSET(x,y) (((y) & (x)) == (x)) /* See if something (x) is a proper subset of a set (y) */ #define PROPER(x,y) (SUBSET(x,y) && !((x) == (y))) /* .... etc .... */ Be careful of side effects when using the last two (SUBSET and PROPER). If you pull apart the generated code from most Pascal compilers, you usually find it is implemented this way anyway! If you need more arbitrary sets, i.e. elements in the range 100..110, etc, you just need to change SET_ELEM, and away you go. e.g: #define SET_ELEM(x) (1 << ((x) - 100)) Also, by having a different SET_ELEM for each set type you want, you can still use all the other operators with no worries. For "niceness" use the type 'SetType' or an equivalent always, as this makes it clearer as to what you are doing. Totally arbitrary sets are a different matter, and are best implemented with linked lists, binary trees, and other such structures, but for the run-of-the-mill Pascal type sets, this method is adequate. Rhys. +===============================+==============================+ || Rhys Weatherley | University of Queensland, || || rhys@batserver.cs.uq.oz.au | Australia. G'day!! || +===============================+==============================+
gtoal@tharr.UUCP (Graham Toal) (06/29/90)
Archive-name: setdemo.c /* File: setdemo.c Author: Graham Toal <gtoal%uk.ac.ed@nsfnet-relay.ac.uk> Created: 27/06/90 Bryon Lape <wozniak@utkux1.utk.edu> recently asked for C code to implement pascal-like sets. By coincidence I had written this just a few days earlier, as a proof-of-concept test program. Slight lack of comments but should be fathomable. The code in fact implements something stronger than pascal sets; it allows arbitrarily sized sets, and it lets you construct expressions out of those sets *without* using any scratch workspace to evaluate intermediates. (The real use of the code is in free text indexing, where the sets are lists of addresses of words in a large file, and we are performing queries on the file) (c) Graham Toal 1990. I retain copyright on this only to stop anyone else copyrighting it and restricting my use of my own code! Otherwise it is entirely in the public domain. Use it as you wish, commercially or otherwise. Note: the sets are stored as lists of elements; if the input data is always unique and sorted, the output data will be too; a simple proof by induction can show this. So when constructing sets as input to this code, remember to validate them. For example, list1 = [ 1 2 38 3456 100006 ] is ok, but list2 = [ 1 38 2 3456 100006 ] is bad (order), and list3 = [ 1 1 1 2 38 3456 100006 ] is also bad (not unique). */ /*#define DEBUG*/ #include <stdio.h> #define TRUE (0==0) #define FALSE (0!=0) #define DATA int typedef struct PIPEstruct PIPE; typedef int (*UNARYFN)(PIPE *, DATA *); typedef int (*BINFN)(PIPE *, PIPE *, DATA *); /* easily extended to ternary functions for things such as range-checking */ #define MONO 1 #define BINARY 2 #define PENDING (EOF+1) #define UNDEF (PENDING+1) struct PIPEstruct { int fntype; UNARYFN monofun; BINFN bifun; struct PIPEstruct *op1, *op2; int state; /* EOF, PENDING, UNDEF */ DATA lookahead; int *array; /* These two represent data stream for now */ int index; /* DATA offset1, offset2; */ /* These two would be used in real life to point into a file of word indexes */ }; void unread(PIPE *p, DATA d) { p->state = PENDING; p->lookahead = d; } int and(PIPE *left, PIPE *right, DATA *value) { DATA d1, d2; /* Perform an AND operation on two streams */ for (;;) { if (!eval(left, &d1) || !eval(right, &d2)) return(FALSE); while ((d1 < d2) && eval(left, &d1)); while ((d2 < d1) && eval(right, &d2)); if (d1 == d2) {*value = d1; return(TRUE);} } } int or(PIPE *left, PIPE *right, DATA *value) { DATA d1, d2; /* Perform an OR operation on two streams */ if (!eval(left, &d1)) return(eval(right, value)); *value = d1; if (eval(right, &d2)) { if (d1 < d2) unread(right, d2); else if (d2 < d1) {unread(left, d1); *value = d2;} } return(TRUE); } int eval(PIPE *expr, DATA *value) { if (expr->state == PENDING) { expr->state = UNDEF; *value = expr->lookahead; return(TRUE); } switch (expr->fntype) { case MONO: return(expr->monofun(expr->op1, value)); case BINARY: return(expr->bifun(expr->op1, expr->op2, value)); } } int data(PIPE *d, DATA *value) { if (d->state == PENDING) { d->state = UNDEF; *value = d->lookahead; return(TRUE); } if (d->array[d->index] == -1) d->state = EOF; /* Hack for demo -- real code would compare offset1 to offset2 */ if (d->state == EOF) return(FALSE); *value = d->array[d->index++]; d->state = UNDEF; return(TRUE); } PIPE *make_data_stream(int O_List[]) { PIPE *tmp; tmp = (PIPE *)malloc(sizeof(PIPE)); tmp->fntype = MONO; tmp->monofun = data; tmp->array = O_List; tmp->index = 0; tmp->op1 = tmp; /* Actually op1 should be a structure containing array and index from above; this is just laziness */ tmp->state = UNDEF; return(tmp); } PIPE *make_pipe_binop(int (*fn)(PIPE *left, PIPE *right, DATA *value), PIPE *arg1, PIPE *arg2) { PIPE *tmp; tmp = (PIPE *)malloc(sizeof(PIPE)); tmp->bifun = fn; tmp->fntype = BINARY; tmp->op1 = arg1; tmp->op2 = arg2; tmp->state = UNDEF; return(tmp); } int main(int argc, char **argv) { /* Simulated Search Engine -- effectively performing set operations on arbitrarily large sets. */ static int O_List1[16] = { 1, 3, 5, 6, 7, 8, 10, 12, 14, 16, 17, 18, 20, 21, 22, -1}; static int O_List2[7] = {1, 2, 4, 5, 6, 7, -1}; static int O_List3[9] = {14, 15, 16, 17, 18, 19, 20, 21, -1}; static int O_List4[4] = {12, 13, 14, -1}; /* The problem is a free-text database lookup. We have an index which when presented with a keyword, returns a list of identifiers of records which contain that keyword. We wish to use and and or operations on several keywords, eg 'Give me all the records which contain "programmer" and "poor", or "manager" and "rich"' Here is a worked example of the problem. There is a very easy solution possible *iff* all the sets are kept in memory; however they can grow to be excessively large, so the chosen algorithm has to work on small sections of these sets at any one time. Conceptually the simplest way to do this is to define a mechanism where the operator is akin to a unix filter or process, receiving its input data from a stream, and writing its results to a stream. Then the expression evaluation becomes a dataflow problem of connecting these streams together, and reading the results coming out of the topmost filter. One minor implementation note: accessing the elements of each of these lists off CD Rom turn-about would cause lots of disk-head movement; we can optimise this to any degree necessary by having the bottom-level functions buffer their data in blocks in Ram. List1 <- 1 3 5 6 7 8 10 12 14 16 17 18 20 21 22 / OR / \ / List2 <- 1 2 4 5 6 7 Result <- AND \ List3 <- 12 13 14 15 16 17 18 19 \ / OR \ List4 <- 12 13 14 <- 1 3 4 5 6 7 8 10 12 14 16 17 18 20 21 22 / / Result <- AND \ \ <- 12 13 14 15 16 17 18 19 20 21 Result <- 12 14 16 17 18 20 21 */ PIPE *data1, *data2, *data3, *data4, *or1, *or2, *and1; DATA value; /* The Object-lists are simulated by a C array; in real life they would be found in a large index file. (They are actually found by following a 'trie' from a master index) */ data1 = make_data_stream(O_List1); data2 = make_data_stream(O_List2); data3 = make_data_stream(O_List3); data4 = make_data_stream(O_List4); /* Compile our expression */ or1 = make_pipe_binop(or, data1, data2); or2 = make_pipe_binop(or, data3, data4); and1 = make_pipe_binop(and, or1, or2); /* Perform the evaluation of the expression; if you want, you can store the results in a new list; often it is easier to do things with them one at a time as the members of the list are presented */ fprintf(stderr, "The result of list1|list2 & list3|list4 is:\n"); while (eval(and1, &value)) { fprintf(stderr, " %d", value); } fprintf(stderr, "\n"); }
wozniak@utkux1.utk.edu (Bryon Lape) (06/29/90)
Wow!! Finally an answer I can use!! After getting stuff like "You should not post that here," and "Why would you want to do that," I finally got my answer. Thanks very much. I was beginning to wonder about this net. -bryon lape-
mcdaniel@amara.uucp (Tim McDaniel) (06/29/90)
gtoal@tharr.UUCP (Graham Toal) has provided a neat and very useful set of lazy-evaluation list functions. Many thanks! From my limited understanding of copyright law, though, there may be copyright problems. I point them out so that others can perhaps avoid problems in the future. I hope that those more knowledgable will correct me. > (c) Graham Toal 1990. A copyright notice must begin with "Copyright" (upper- or lowercase), or with a "c" in a circle. "(c)" is not "c" in a circle, and thus this statement might not be considered a valid copyright notice. > Otherwise it is entirely in the public domain. Any text is either "public domain" or "copyright", but cannot be both. "Public domain" implies "not copyright". Any reasonable judge would presumably follow what was meant, not what was written. In other words, I wouldn't bet 5 cents on it. 8-) -- "I'm not a nerd -- I'm 'socially challenged'." Tim McDaniel Internet: mcdaniel@adi.com UUCP: {uunet,sharkey}!puffer!mcdaniel