stein@pixelpump.osc.edu (Rick 'Transputer' Stein) (06/29/89)
I've been investigating the computational complexity of evaluating the electrical potential around a protein molecule in solution. The Poisson-Boltzmann equation describes this system:

    div( e(x) grad(phi(x)) ) + sinh[phi(x)]/(k(x)*k(x)) + 4*pi*rho(x) = 0

where x is a vector (x,y,z), e(x) is the dielectric function, phi(x) is the field, k(x) is the modified Debye-Huckel parameter, and rho(x) is the charge density. The PB equation is one of your gnarlier non-linear PDEs!

Now, one typically solves this kind of equation via finite differences over some cubic volume of space. So I guesstimated that the HIV genome, which consists of ~4000 bases (or 50K atoms), would fit into a box of, say, 400 angstroms on a side. When you lay a grid over the volume, you want about 4 points/angstrom in the x, y, and z directions. That's 1600 points per side, or about 2^32 (call it 4096e6) points at which you must evaluate the field. To store this quantity of information:

    phi(x):                 2^32 pts @ 24 bytes/pt (double precision) = 98.3 Gbytes
    rho(x), e(x), and k(x): 2^32 pts @  8 bytes each * 3              = 98.3 Gbytes

So, about 200 Gbytes of physical memory just for the data.

I have an iterative matrix solver (Gauss-Seidel with over-relaxation), and I know that for a 2D grid the number of iterations required goes like 3 times the grid size (as a general rule, to obtain convergence to a 1e-6 error tolerance). Since we are now in 3D, the number of iterations goes like 3 times the square of the grid size. On a transputer it takes about 120 microsecs to evaluate one point. So I have the following formula for guesstimating the convergence time (for a single cpu):

    T(days) ~ K * 120 usec * npts * 3 * gridsize^2,    where K = 1.15e-5 days/sec

This works out to be 43.3e6 days! (Assuming a single transputer can address this much, which it can't, but who cares, it's the principle of the thing.) A little program that codes up these estimates, and a toy sketch of the relaxation kernel, appear at the end of this post.

Now, if you build a multicomputer with 64K nodes, assume linear speedup by partitioning the dataset among the 64K nodes, and note that the communications are all nearest-neighbor (which they are in this case), then the time to compute the potential surface drops to about 662 days. The cost of such a machine, with a T800 transputer and 4 Mbytes at each node, is ~$1K/node. So roughly $65M just for iron; chuck in a few million more for wire and cooling fans plus about 1 Terabyte of disk here and there, and you've got something like $70M for the machine.

The problem is that reliability will get you. 1000 T800s in a box will experience a failure about once every 5 years, so 64K of them will fail about once every 3 or 4 weeks -- about 40 times too often if the goal is to get through a 662-day run between failures (though I'll add that very few chips can match this kind of reliability standard).

So, any alternatives? Yes. If I were smart enough to figure out how to use Greengard's algorithm (an O(n) solution, instead of the O(n^4) you get from finite differences), then I could solve the problem on a single transputer in (43.3e6)^0.25, or about 81 days, and a 256-node multicomputer could do it in about 8 hrs or so. This is reasonable.

Oh, I might add that if you swap the transputer's 120 usec for the 0.15 usec of the local supervector ghettoblaster with 200 Gbytes of physical memory and 64 processors, then the time to solve is about

    T(days) ~ K * 0.15 usec * npts * 3 * gridsize^2 / 64

or roughly 1000 days via finite differences, or 5.6 days with an O(n) solution. I guess I like the multicomputer better, because it'll give a cost/performance advantage of oh, at least 1000 to 1.
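As promised, here is the back-of-the-envelope arithmetic coded up in C, in case anyone wants to poke at it. Every constant in it is one of the guesses quoted above (box size, grid resolution, 120 usec/pt, the 3*gridsize^2 iteration rule), not a measurement, and it should land within a percent or two of the ~98 Gbyte, 43.3e6-day, and 662-day figures:

/* pb_estimate.c -- sizing and timing guesstimates for the PB finite-
 * difference run.  All constants are the guesses quoted in the text. */
#include <stdio.h>

int main(void)
{
    double box  = 400.0;            /* box edge, angstroms            */
    double res  = 4.0;              /* grid points per angstrom       */
    double grid = box * res;        /* 1600 points per side           */
    double npts = grid*grid*grid;   /* ~4.096e9 field points          */

    /* memory: phi budgeted at 24 bytes/pt; rho, e, k at 8 bytes each */
    printf("phi:     %.1f Gbytes\n", npts * 24.0 / 1e9);
    printf("rho,e,k: %.1f Gbytes\n", npts * 8.0 * 3.0 / 1e9);

    /* time: 120 usec/pt on a T800, iterations ~ 3 * gridsize^2 in 3D */
    double secs = 120.0e-6 * npts * 3.0 * grid * grid;
    double days = secs / 86400.0;   /* i.e. K = 1.15e-5 days/sec      */
    printf("one T800:  %.1f million days\n", days / 1e6);
    printf("64K nodes: %.0f days (assuming linear speedup)\n", days / 65536.0);
    return 0;
}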
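And for the morbidly curious, here is a toy version of the per-point update that the 120 usec figure refers to. It is only a sketch: a lexicographic Gauss-Seidel sweep with over-relaxation for the *linearized* PB equation (uniform dielectric e = 1, sinh(phi) approximated by phi) on a tiny grid with zero boundaries. The real solver has to carry the full sinh term, the varying e(x), and the decomposition across nodes; the grid size, kappa, and omega values here are made up for illustration.

/* sor_sweep.c -- toy Gauss-Seidel + over-relaxation sweep for the
 * linearized PB equation: laplacian(phi) - kappa^2*phi = -4*pi*rho.
 * Illustrative only; not the production kernel. */
#include <stdio.h>
#include <math.h>

#define N  32                      /* toy grid; the real one is 1600^3 */
#define PI 3.14159265358979324

static double phi[N][N][N], rho[N][N][N];

/* one sweep over the interior; returns the largest correction applied */
static double sor_sweep(double h, double kappa, double omega)
{
    double maxdiff = 0.0;
    for (int i = 1; i < N-1; i++)
        for (int j = 1; j < N-1; j++)
            for (int k = 1; k < N-1; k++) {
                double nbrs = phi[i-1][j][k] + phi[i+1][j][k]
                            + phi[i][j-1][k] + phi[i][j+1][k]
                            + phi[i][j][k-1] + phi[i][j][k+1];
                double upd  = (nbrs + h*h*4.0*PI*rho[i][j][k])
                            / (6.0 + h*h*kappa*kappa);
                double corr = omega * (upd - phi[i][j][k]);
                phi[i][j][k] += corr;
                if (fabs(corr) > maxdiff) maxdiff = fabs(corr);
            }
    return maxdiff;
}

int main(void)
{
    double h = 0.25, kappa = 0.1, omega = 1.8;  /* spacing, screening, relaxation */
    rho[N/2][N/2][N/2] = 1.0;                   /* one point charge in the middle */

    for (int it = 1; it <= 10000; it++)
        if (sor_sweep(h, kappa, omega) < 1e-6) {    /* the 1e-6 tolerance above */
            printf("converged after %d sweeps\n", it);
            break;
        }
    return 0;
}

On a 64K-node machine each node would own a chunk of the grid and swap face values with its six nearest neighbors between sweeps, which is why the nearest-neighbor-only communication pattern matters so much.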
-=- Richard M. Stein (aka Rick 'Transputer' Stein)
    Concurrent Software Specialist @ The Ohio Supercomputer Center
    Trollius: The Official Multicomputer OS of the 1992 Olympic Games
    Internet: stein@pixelpump.osc.edu    Ma Bell Net: 614-292-4122