uucp@ucbvax.BERKELEY.EDU (UNIX Copy) (02/09/86)
================================================== NUMERICAL ANALYSIS TITLES -- Volume 3, Number 2, Part 3 of 3 -- August 27, 1985 ================================================== Please note: Due to its length, this list is being distributed in 3 parts, each part is about 7 pages in length. ##### JOHNS HOPKINS ##### [48] Ralph Byers and Stephen Nash On the Singular "Vectors" of the Lyapunov Operator Tech Rep 438 Mathematical Sciences Department The Johns Hopkins University June 1985 For a real matrix A, the separation of A' and A is sep(A',-A) = min ||A'X + XA||/||X||, where ||.|| represents the Frobenius norm, and A' is the transpose of A. We discuss the conjecture that the minimizer X is symmetric. This conjecture is related to the numerical stability of methods for solving the matrix Lyapunov equation. The quotient is minimized by either a symmetric matrix or a skew symmetric matrix and is maximized by a symmetric matrix. The conjecture is true if A is 2-by-2, if A is normal, if the minimum is zero, or if the real parts of the eigenvalues of A are of one sign. In general the conjecture is false, but counter examples suggest that symmetric matrices are nearly optimal. Submitted by: Steve Nash (sgn@jhu.csnet) Obtainable from: Steve Nash Department of Mathematical Sciences The Johns Hopkins University Baltimore, MD 21218 --------------- ##### MARYLAND ##### [49] Dianne P. O'Leary, Systolic Arrays for Matrix Transpose and Other Reorderings Computer Science Department University of Maryland TR-1481, March, 1985. In this note, a systolic array is described for computing the transpose of an n-by-n matrix in time 3n-1 using n squared switching processors and n squared bit buffers. A one-dimensional implementation is also described. Arrays are also given to take a matrix in by rows and put it out by diagonals, and vice versa. Submitted by: Dianne Oleary (oleary@maryland.arpa) Obtainable from: Dianne Oleary Department of Computer Science University of Maryland College Park, MD 20742 --------------- [50] Dianne P. O'Leary, G. W. Stewart Assignment and Scheduling in Parallel Matrix Factorization Computer Science Department University of Maryland TR-1486, April, 1984. We consider in this paper the problem of factoring a dense n-by-n matrix on a network consisting of P MIMD processors when the net- work is smaller than the number of elements in the matrix 2 (P < n ). The specific example analyzed is a computational net- work that arises in computing the LU, QR, or Cholesky factoriza- tions. We prove that if the nodes of the network are evenly dis- tributed among processors and if computations are scheduled by a round-robin or a least-recently-executed scheduling algorithm, then optimal order of speed-up is achieved. However, such speed- up is not necessarily achieved for other scheduling algorithms or if the computation for the nodes is inappropriately split across processors, and we give examples of these phenomena. Lower bounds on execution time for the algorithm are established for two important node-assignment strategies. Obtainable from: D. Oleary, above. --------------- ##### OAK RIDGE ##### All Oak Ridge reports listed below may be obtained from: Ms. Dawn Human Mathematical Sciences Oak Ridge National Laboratory P. O. Box Y, Bldg. 9207-A Oak Ridge, TN 37831 electronically: human@ornl-msr.arpa ##################### [51] George Ostrouchov ANOVA Model Fitting via Sparse Matrix Computations ORNL Tech. Report, August 1985 The least-squares computational approach is used in fitting analysis of variance models when data are unbalanced or incomplete. For large models containing factors with many levels, standard statistical packages require enormous amounts of time and storage due to the size of the crossproducts matrix. The model matrix X (often called the design matrix) consists of dummy variables and is sparse, thus great reduction in time and storage can be achieved by viewing this as a sparse matrix problem. A method, based on orthogonal Givens factorization of a maximal rank set of sparse columns of X, for computing estimates and residual sums of squares is presented. Selection of a maximal rank set of sparse columns is a key part of this method and is based on the linear dependencies between columns of the model matrix. The linear dependencies are described using a notation based on index sets associated with model terms. Time and storage requirements of the new algorithm are compared to the requirements of GLIM, which uses the same basic numerical algorithm that is used in most statistical packages. Both requirements of the new algorithm are less by up to three orders of magnitude for large models and are competitive for small models. Submitted by: George Ostrouchov -------------- [52] V. Alexiades J.B. Drake G.A. Geist G.E. Giles A.D. Solomon R.F. Wood Mathematical Modeling of Laser-Induced Ultrarapid Melting and Solidification ORNL-6129, July 1985 Mathematical and numerical aspects of laser induced phase change modeling are dicussed. Several numerical formulations, including explicit and implicit enthalpy, adaptive grid finite volume and front tracking are derived and results are presented. In addition, a novel hyperbolic formulation is explored analytically and numerically. The explicit enthalpy approach allows the greatest flexibility and efficiency in modeling problems with complicated phase transitions. Submitted by: Vasili Alexiades -------------- [53] Max Morris Vasili Alexiades Sensitivity Analysis of a Numerically Simulated HgCdTe Solidification Process by Statistical Methods ORNL-6210, August 1985 Statistical response surface methodology is applied to the problem of determining simple approximations for outputs as functions of input parameter values in a numerical simulation of the solidification of a mercury-cadmium-telluride alloy. The approximating polynomials are then differentiated with respect to parameter values to determine sensitivities. A table of percent changes in output values as a result of one per cent changes in parameter values is presented. The empirical procedure used constitutes an unorthodox application of the statistical methodology, but it is easy to use, quite generally applicable and very effective. Submitted by: Vasili Alexiades -------------- [54] V. Alexiades G. A. Geist A. D. Solomon Numerical Simulation of a HgCdTe Solidification Process ORNL-6127, August 1985 The solidification of a cylindrical ingot of mercury-cadmium-telluride is modeled taking into account both heat conduction and solute diffusion. Values of the relevant thermophysical parameters of the pseudo-binary HgTe-CdTe are compiled. The model is implemented numerically by a finite-difference discretization and results of the simulation of a freezing experiment are reported. Submitted by: Vasili Alexiades -------------- [55] George A. Geist Michael T. Heath Parallel Cholesky Factorization on a Hypercube Multiprocessor ORNL-6190, July 1985 Two types of message-passing parallel algorithms are developed for solving symmetric systems of linear equations on a hypercube multiprocessor. One type involves broadcast communication among processors, and the other involves communication along a ring of processors. Details are provided in the form of C programs that implement the algorithms on a hypercube simulator and which should run with little modification on real hypercube hardware. Performance of the various algorithms is demonstrated by means of processor utilization graphs and parallel speedup curves. Submitted by: Mike Heath -------------- [56] Michael T. Heath Parallel Cholesky Factorization in Message-Passing Multiprocessor Environments ORNL-6150, May 1985 Parallel algorithms are presented for computing the Cholesky factorization on multiprocessors having only private local memory. Synchronization of multiple processes is based on message passing. Several possible processor interconnection networks are considered. Submitted by: Mike Heath -------------- [57] George A. Geist ORNL Efficient Parallel LU Factorization with Pivoting on a Hypercube Multiprocessor ORNL-6211, August 1985 A message-passing parallel algorithm is developed for forming the LU factors of general non-singular matrices on a hypercube multiprocessor. Partial pivoting is performed to ensure numerical stability, but the scheduling of tasks is such that the pivot search in the parallel algorithm is completely masked. Empirical tests show that the load imbalance produced by random pivoting causes a 5%-14% increase in execution time. A complementary parallel triangular solution algorithm is also given. Comparisons with the non-pivoting case are used to demonstrate the efficiency of this new algorithm. Submitted by: Al Geist -------------- ##### PITT ##### [58] Rami G. Melhem A Study of Data Interlock in VLSI Computational Networks for Sparse Matrix Multiplication TR-CSD-505 Department of Computer Science Purdue University April 1985 The general question addressed in this study is: Are regular VLSI networks suitable for sparse matrix computations?. More specifically, we consider a special purpose self-timed network that is designed for certain specific dense matrix computation. We add to each cell in the network the capability of recognizing and skipping operations that involve zero operands, and then ask how efficient this resulting network is for sparse matrix computation?. In order to answer this question, it is necessary to study the effect of data interlock on the performance of self-timed networks. For this, the class of pseudo systolic networks is introduced as a hybrid class between systolic and self-timed networks. Networks in this class are easy to analyze, and provide a means for the study of the worst case performance of self-timed networks. The well known concept of computation front is also generalized to include irregular flow of data, and a technique based on the propagation of such computation fronts is suggested for the estimation of the processing time and the communication time of pseudo systolic networks. Submitted by: Rami Melhem (melhem@purdue OR melhem@pitt) Obtainable from: Rami Melhem University of Pittsburgh Department of Mathematics Pittsburgh, PA 15260 --------------- [59] Rami G. Melhem A Modified Frontal Technique Suitable for parallel Systems ICMA-85-84 Department of Mathematics University of Pittsburgh July 1985 Frontal techniques offer the potential for processing the assembly and the factorization phases of finite element analysis in parallel. However, the rows of the stiffness matrix are assembled and factored in different order, thus depriving frontal solvers of the uniformity desirable in parallel processing. On the other hand, band solution techniques handle the factorization phase in a very uniform way but do not interleave assembly and factorization. In this paper, we suggest a technique that borrows from both frontal and band solvers those characteristics that are advantageous for parallel processing. Moreover, book-keeping and data manipulation are simpler in the suggested techniques than in the classical frontal method. This makes the suggested technique also attractive for sequential systems. Obtainable from: Rami Melhem, as above. --------------- ##### STANFORD ##### [60] Gene H. Golub and Carl D. Meyer Simultaneous Computation of Stationary Probabilities with Estimates of Their Sensitivity SIAM ms.no.10958 < No Abstract > Submitted by: Gene H. Golub (golub@su-navajo.arpa) Obtainable from: Gene H. Golub (golub@su-navajo.arpa) Computer Science Department Stanford University Stanford, CA 94305 --------------- [61] Philip E. Gill, Walter Murray, Michael A. Saunders, J. A. Tomlin, and Margaret H. Wright ON PROJECTED NEWTON BARRIER METHODS FOR LINEAR PROGRAMMING AND AN EQUIVALENCE TO KARMARKAR'S PROJECTIVE METHOD We discuss interior-point methods for linear programming derived by applying a logarithmic barrier transformation and performing projected Newton steps for a sequence of barrier parameters. Under certain conditions, one of these ``projected Newton barrier'' methods is shown to be equivalent to Karmarkar's (1984) projective method for linear programming. Details are given of a specific barrier algorithm and its practical implementation. Numerical results are given for several nontrivial test problems. Submitted by: Philip E. Gill (or.gill@su-sierra.arpa) Obtainable from: Philip E. Gill (or.gill@su-sierra.arpa) Systems Optimization Laboratory Operations Research Department Stanford University Stanford, CA 94305 --------------- ##### SUNY/BUFFALO ##### [62] P.J.Eberlein On the Schur Decomposition of a Matrix for Parallel Computation TR 85-689, June 1985, Cornell University An algorithm to solve the eigenproblem for non-symmetric matrices on an N x N array of mesh connected processors, isomorphic to the architecture described by Brent and Luk for symmetric matrices, is presented. This algorithm is a generalization of the classical Jacobi method which holds promise for parallel architectures. Only unitary transformations are used. The rotational parameters are carefully analysed, and many examples of a working program, simulating the parallel architecture, are given. Submitted by: P.J.Eberlein (eberlein%gvax@cornell.arpa) Obtainable from: P.J.Eberlein Dept. of Computer Science SUNY/Buffalo, Bell Hall, Buffalo, N.Y. 14260 --------------- ##### TORONTO ##### [63] D.W. Decker and A. Jepson Convergence cones near bifurcation TR 171/84. < No Abstract > Submitted by: Ken R. Jackson (krj@toronto.csnet) Obtainable from: The Scientific Computing Group, Department of Computer Science, University of Toronto, Toronto, Ontario, Canada M5S 1A4 --------------- [64] Paul Muir Implicit Runge-Kutta methods for two-point boundary value problems TR 175/84. < No Abstract > Obtainable from the above. --------------- [65] Kevin Burrage The order properties of implicit multivalue methods for ordinary differential equations TR 176/84. < No Abstract > Obtainable from the above. --------------- [66] Kevin Burrage Order and stability properties of explicit multivalue methods TR 177/84. < No Abstract > Obtainable from the above. --------------- [67] W.H. Enright, K.R. Jackson, S.P. Norsett, and P.G. Thomsen Interpolants for Runge-Kutta formulas TR 180/85. < No Abstract > Obtainable from the above. --------------- [68] J.M. Fine Low order Runge-Kutta-Nystrom methods with interpolants TR 183/85. < No Abstract > Obtainable from the above. --------------- ##### WASHINGTON STATE ##### [69] ALAN GENZ FULL SYMMETRIC INTERPOLATORY FOR MULTIPLE INTEGRALS CS-85-136 WASHINGTON STATE UNIVERSITY JULY, 1984 A method is given for the direct determination of the weights for fully symmetric integration rules for the hypercube, using multivariable Lagrange interpolation polynomials. The formulas for the weights lead to new classes of efficient rules. Submitted by: Alan Genz ARPANET: na.genz@su-score CSNET: acg@wsu Obtainable from: Alan Genz Computer Science Department Washington State University Pullman, WA 99164-1210 --------------- [70] ALAN GENZ MATRIX METHODS FOR THE FORCED DIFFUSION EQUATION CS-85-137 WASHINGTON STATE UNIVERSITY JULY, 1985 Finite difference methods for the forced diffusion equation with time dependent boundary conditions are developed using matrices to represent both the approximate solution and the difference operations. This technique allows boundary conditions to be included in a natural way and eliminates the need for an analysis of the commutativity of the spatial difference operations. The matrix methods are used to develop algorithms that are second order accurate in space and time for a standard square domain and an annular domain. Obtainable from: Alan Genz, as above. --------------- ##### WATERLOO ##### [71] Senad Busovaca Handling Degeneracy in a Nonlinear L-1 Algorithm Technical Report CS-85-34 Department of Computer Science University of Waterloo This paper is concerned with handling degeneracy in a nonlinear optimization algorithm based on an active set strategy. Although the solution to the problem is given in the context of nonlinear L-1 optimization, the approach is more general and can be used to overcome the non-uniqueness of dual variables in methods that use optimality conditions construc- tively. Submitted by: Richard Bartels Obtainable from: Richard Bartels University of Waterloo Computer Science Department Waterloo, Ontario N2L 3G1 CANADA ---------------