ld231782@longs.LANCE.ColoState.Edu (Lawrence Detweiler) (02/04/90)
----- I'm interested in corresponding about the following research that applies to set theory and optimization. (Pardon me if I use nonstandard terms where standard ones apply.) Suppose we have many sets that are to be combined in union to yield some "resultant" set. Suppose further that we would like to minimize the number of sets involved. In short, the "minimum union problem" is to Find the minimum union of sets that contains the resultant. "Minimum" means that there are no unions of fewer sets that contain the resultant (although perhaps some with the same count). "Contains" means that the resultant is a subset ( either equivalent or not) of the union. (No sets contain repeated elements.) The problem has a wide variety of applications (named and isolated here for this reason), for example logic circuit design: it is the same problem as finding the minimum number of prime implicants that cover all terms in the Quine-McCluskey algorithm for boolean function optimization. That is, their algorithm DOES NOT OPTIMIZE (in general) unless it addresses this problem (some presentations gloss over it). Another application is scheduling, where for example a good time for a meeting must be found that is compatible with everyone's schedules. There are surely many other applications. A greedy algorithm by itself is not enough to guarantee optimization either (although it appears to give a good approximation). EXAMPLE: For a particularly striking example consider how to optimize a stock quotation system where clients are informed of price changes in all stocks they subscribe to. The stock prices are sent in packets over relays. The relays are not mutually exclusive in the clients they reach; that is, the same client may be reached by any number of relays. Clients note the stocks prices they subscribe to (if any) in packets that pass them on the relay. The optimum solution dictates how to inform everyone in the fewest packets and smallest packet size. SOLUTION: First, to minimize the number of packets, we consider all clients who must be notified (those that subscribe to stocks prices that have changed) which comprise the resultant set. View each relay as a set that contains only these subscribers (all others are excluded). Then find the minimum union of relays that reach every affected client. Since a packet will be sent on every chosen relay, it follows that when chosen relays are minimized, packets are also. Next, to minimize the size of the packets, we consider all subscribers per stock. As before view each relay as a set that contains only these subscribers. We find the minimum union of chosen relays that reach all subscribers of that stock (the resultant). The stock price is added to the packet of every relay in the solution. After all stocks are considered the packets are sent. Note that either phase may be eliminated to find the fewest packets only or smallest packet size only. This is the striking nature of the example; the optimum solution involves multiple minimum unions. Set "maps" make a convenient notation for the problem, where each place in a string indicates the presence or absence of an element. For an illustration, suppose relays are denoted by letters and clients by numbers; a global mapping might look like this: a ## # # b ## # c ## # d # # # 123456 which shows, for example, the A relay reaches clients 1, 2, 4, and 6. Using maps, a union is even simpler to find than numerical addition; for each place, by definition if an element is in any set then it is also in the union. (Note that the union of all relays reaches all clients.) Suppose that clients 1, 3, and 6 are affected by price changes in two stocks. Columns that correspond to the other clients can be eliminated: a # # b ## c # 136 where the D relay is eliminated because it reaches none of these clients. The minimum union that reaches all clients is A + B, so only these relays will be used. Now maps are drawn for affected clients on a per stock basis. Suppose clients 1 and 6 subscribe to the first stock. Focusing only on chosen relays and these subscribers gives a ## b # 16 which shows that the minimum union involves only A, so the price is placed in the packet for that relay. Now suppose all (affected) clients subscribe to the second stock: a # # b ## 136 The minimum union requires the maximum of A + B so this price is placed in both packets, completing the optimization. Both stocks are sent in the A relay packet and the second stock in only the B relay packet. In general, for N sets there are 2^N possible combinations of sets. Thus an algorithm to solve (find the optimum for) the minimum union problem in worst case polynomial time (of sets) is probably not possible. However, close approximations in polynomial time are possible with a surprisingly straightforward approach (particularly in the C language). Simple enhancements to the basic algorithm improve both the solution and the time required to determine it. A sensible algorithm for actual optimization is accessable as well. Send email if interested.