[comp.arch] Hypercube partitioning - Wrong question

segall@caip.UUCP (03/14/87)

>	To follow up on Eugene's remarks, trying to map processes to processors
>	so that only neighbour to neighbour communication is required is the
>	subgraph isomorphism problem (NP-complete, see Garey&Johnson) see also


This has been bothering me for some time:

Everytime someone talks about optimizing the partitioning or placement
of processes on processors (especially on a hypercube of processors),
someone else says: "Yeah, but that's NP-complete." According to the 
above quote, and the name "the subgraph ISOMORPHISM problem" you
can see that the result people are referring to is for an EXACT
embedding of one graph in another (i.e. nearest-neighbor communication 
only - no communication paths of length greater than 1).

I would like to pose a question to the net: What about solutions
(approximate or exact) to the static communication/computation
balancing problem, which remove this constraint? [see ref below]

In explanation: Suppose you are willing to have communication paths of
length greater than 1, and you are willing to have 0, 1 or more
processes per processor. Obviously, the minimum communication
embedding would be achieved by locating all processes on one
processor. This solution would have no communication overhead, but
could take far too long to compute! 

In order to avoid this, charge a penalty for unbalancing the
computation load (putting disparate amounts of computation load
on different processors). Also charge penalties for long communication
paths, so that the minimum cost embedding is a combination of
low communication and balanced computation.

Some work has been done in this area. Craig Steele did an M.S. thesis
at Caltech on just this subject ('85, I think). He modeled the problem
to consider memory requirements and I/O, as well. The bulk of his
thesis, as I recall (can't find my copy right now) investigated the
use of simulated annealing to (approximately) solve a restricted, but
useful specialization of this model. He included quite a number of
figures showing the results of embedding a variety graphs in a variety
of networks.

So, you may ask, why am I posting this question if the problem has been
solved? 

(1) I would like to know if the balancing problem, as stated, is
NP-complete. I suspect it is, but that doesn't mean much. I believe
Steele said it was, but I don't think he showed how to map this
problem to one of the 'known' NP-complete problems. Because of the
above-mentioned difference between this and subgraph isomorphism, I'm
interested in futher explanation.

(2) I am very interested in other approximate solutions. Simulated
annealing is quite nice, and easy to program, but it's not the only
way to optimize. In particular, it is not directly parallelizable,
though I believe there has been some work toward parallel versions.
I don't know how well they perform. In addition, some excellent work
has been done on approximate algorithms recently, which give near
optimal or (I think) optimal results _almost_ all the time, and
which can run faster than simulated annealing. I don't know if any of
these solutions can be applied to this problem, but I would certainly
be interested in knowing about any that can.


One caveat: this is a hard problem. I am primarily interested in
finding about well-analyzed solutions. It's easy to think up lots of
'intuitive' solutions to this problem. Many people have, including
myself. What's difficult is figuring out if a solution is any good,
or if it even makes sense. However, all ideas are welcome.

Thanks,

Ed Segall
-- 

uucp:   ...{harvard, seismo, ut-sally, sri-iu, ihnp4!packard}!topaz!caip!segall
arpa:   SEGALL@CAIP.RUTGERS.EDU