[mod.techreports] Numerical Analysis Reports Part III

daemon@ucbvax.BERKELEY.EDU (The devil himself) (01/29/86)

==================================================
NUMERICAL ANALYSIS TITLES
     -- Volume 3, Number 2, Part 3 of 3
     -- August 27, 1985
==================================================

Please note:  Due to its length, this list is
              being distributed in 3 parts; each
              part is about 7 pages in length.

##### JOHNS HOPKINS #####

[48] Ralph Byers and Stephen Nash

    On the Singular "Vectors" of the Lyapunov Operator

    Tech Rep 438
    Mathematical Sciences Department
    The Johns Hopkins University
    June 1985

For a real matrix A, the separation of A' and A is sep(A',-A) =
min ||A'X + XA||/||X||, where ||.|| represents the Frobenius norm,
and A' is the transpose of A.  We discuss the conjecture that the
minimizer X is symmetric.  This conjecture is related to the numerical
stability of methods for solving the matrix Lyapunov equation.  The
quotient is minimized by either a symmetric matrix or a skew-symmetric
matrix and is maximized by a symmetric matrix.  The conjecture is true
if A is 2-by-2, if A is normal, if the minimum is zero, or if the real
parts of the eigenvalues of A are of one sign.  In general the conjecture
is false, but counterexamples suggest that symmetric matrices are
nearly optimal.
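
As a quick numerical illustration (not part of the report), sep and its
minimizer can be computed from the Kronecker-product form of the Lyapunov
operator; the Python/NumPy sketch below does so for a sample matrix of our
own choosing and measures how far the minimizing X is from symmetric.

    import numpy as np

    def sep_and_minimizer(A):
        """sep(A',-A) and a minimizing X via the Kronecker form:
        vec(A'X + XA) = (I kron A' + A' kron I) vec(X), column-major vec."""
        n = A.shape[0]
        L = np.kron(np.eye(n), A.T) + np.kron(A.T, np.eye(n))
        U, s, Vt = np.linalg.svd(L)
        sep = s[-1]                              # smallest singular value
        X = Vt[-1].reshape(n, n, order='F')      # minimizer, unit Frobenius norm
        return sep, X

    A = np.array([[1.0, 2.0],
                  [0.0, 3.0]])        # eigenvalues 1 and 3: real parts of one sign
    sep, X = sep_and_minimizer(A)
    print(sep, np.linalg.norm(X - X.T))   # symmetry defect should be near zero here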

Submitted by: Steve Nash (sgn@jhu.csnet)
Obtainable from:
  Steve Nash
  Department of Mathematical Sciences
  The Johns Hopkins University
  Baltimore,  MD  21218
---------------

##### MARYLAND #####

[49] Dianne P. O'Leary

    Systolic Arrays for Matrix Transpose and Other Reorderings

    Computer Science Department
    University of Maryland
    TR-1481, March, 1985.

In this note, a systolic array is described for computing the transpose
of an n-by-n matrix in time 3n-1 using n^2 switching processors and n^2
bit buffers.  A one-dimensional implementation is also described.
Arrays are also given to take a matrix in by rows and put it out by
diagonals, and vice versa.

Submitted by: Dianne O'Leary  (oleary@maryland.arpa)
Obtainable from:
  Dianne O'Leary
  Department of Computer Science
  University of Maryland
  College Park, MD  20742
---------------

[50] Dianne P. O'Leary, G. W. Stewart

    Assignment and Scheduling in Parallel Matrix Factorization

    Computer Science Department
    University of Maryland
    TR-1486, April, 1985.

We consider in this paper the problem of factoring a dense n-by-n
matrix on a network consisting of P MIMD processors when the network
is smaller than the number of elements in the matrix (P < n^2).  The
specific example analyzed is a computational network that arises in
computing the LU, QR, or Cholesky factorizations.  We prove that if
the nodes of the network are evenly distributed among processors and
if computations are scheduled by a round-robin or a
least-recently-executed scheduling algorithm, then optimal order of
speed-up is achieved.  However, such speed-up is not necessarily
achieved for other scheduling algorithms or if the computation for the
nodes is inappropriately split across processors, and we give examples
of these phenomena.  Lower bounds on execution time for the algorithm
are established for two important node-assignment strategies.
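
As a small illustration of one of the assignment strategies named above, the
sketch below (ours; it uses integer indices as stand-ins for the nodes of the
computational network) wrap-maps n nodes onto P processors in round-robin
fashion.

    def round_robin_assignment(n, P):
        """Wrap-map nodes 0..n-1 onto P processors: node j goes to j mod P,
        so each processor holds either floor(n/P) or ceil(n/P) nodes."""
        return {p: [j for j in range(n) if j % P == p] for p in range(P)}

    print(round_robin_assignment(8, 3))
    # {0: [0, 3, 6], 1: [1, 4, 7], 2: [2, 5]}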

Obtainable from: D. O'Leary, above.
---------------

##### OAK RIDGE #####
All Oak Ridge reports listed below
may be obtained from:
     Ms. Dawn Human
     Mathematical Sciences 
     Oak Ridge National Laboratory
     P. O. Box Y, Bldg. 9207-A
     Oak Ridge, TN  37831
electronically:
     human@ornl-msr.arpa
#####################

[51]  George Ostrouchov

     ANOVA Model Fitting via Sparse Matrix Computations

     ORNL Tech. Report, August 1985

  The least-squares computational approach is used in fitting
analysis of variance models when data are unbalanced or incomplete.
For large models containing factors with many levels, standard
statistical packages require enormous amounts of time and storage due
to the size of the cross-products matrix.  The model matrix X (often
called the design matrix) consists of dummy variables and is sparse;
thus a great reduction in time and storage can be achieved by viewing
this as a sparse matrix problem.
  A method, based on orthogonal Givens factorization of a maximal
rank set of sparse columns of X, for computing estimates and residual
sums of squares is presented.  Selection of a maximal rank set of
sparse columns is a key part of this method and is based on the linear
dependencies between columns of the model matrix.  The linear
dependencies are described using a notation based on index sets
associated with model terms.
  Time and storage requirements of the new algorithm are compared to
the requirements of GLIM, which uses the same basic numerical algorithm
that is used in most statistical packages.  Both requirements of the
new algorithm are smaller by up to three orders of magnitude for large
models and are competitive for small models.
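
The numerical core of the method, least squares via orthogonal Givens
factorization, can be sketched as follows.  This is a dense, sequential
reference of ours, not the report's algorithm: a sparse code would rotate
only where nonzeros occur, and the model matrix would first be reduced to a
maximal-rank set of columns.

    import numpy as np

    def givens(a, b):
        # rotation G = [[c, s], [-s, c]] with G @ [a, b] = [r, 0]
        r = np.hypot(a, b)
        return (1.0, 0.0) if r == 0 else (a / r, b / r)

    def givens_least_squares(X, y):
        """Triangularize [X | y] by Givens rotations, then back-substitute.
        Assumes X has full column rank (illustration only)."""
        A = np.column_stack([X, y]).astype(float)
        m, n = X.shape
        for j in range(n):
            for i in range(j + 1, m):
                c, s = givens(A[j, j], A[i, j])
                A[[j, i]] = np.array([[c, s], [-s, c]]) @ A[[j, i]]
        beta = np.linalg.solve(A[:n, :n], A[:n, n])   # parameter estimates
        rss = float(np.sum(A[n:, n] ** 2))            # residual sum of squares
        return beta, rss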

    Submitted by: George Ostrouchov
--------------

[52]  V. Alexiades  J.B. Drake  G.A. Geist  G.E. Giles  A.D. Solomon  R.F. Wood

     Mathematical Modeling of Laser-Induced Ultrarapid
     Melting and Solidification

     ORNL-6129, July 1985 

Mathematical and numerical aspects of laser-induced phase change
modeling are discussed.  Several numerical formulations, including
explicit and implicit enthalpy, adaptive-grid finite volume, and
front tracking, are derived and results are presented.  In addition,
a novel hyperbolic formulation is explored analytically and
numerically.  The explicit enthalpy approach allows the greatest
flexibility and efficiency in modeling problems with complicated
phase transitions.
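
To indicate what an explicit enthalpy scheme looks like in its simplest
form, here is a one-dimensional sketch of ours; the material constants,
boundary data, and melt model are invented for illustration and have
nothing to do with the report's laser problem.

    import numpy as np

    nx, dx, dt, nsteps = 50, 0.02, 1.0e-4, 2000
    rho_c, k, latent = 1.0, 1.0, 10.0        # heat capacity, conductivity, latent heat

    def temperature(H):
        """Invert the enthalpy-temperature relation (melting point at T = 0)."""
        T = np.where(H < 0.0, H / rho_c, 0.0)                   # solid
        return np.where(H > latent, (H - latent) / rho_c, T)    # liquid; mushy zone at 0

    H = np.full(nx, -5.0)                    # start solid and subcooled
    for _ in range(nsteps):
        T = temperature(H)
        T[0] = 5.0                           # heated boundary drives melting
        H[1:-1] += dt * k * (T[2:] - 2.0 * T[1:-1] + T[:-2]) / dx**2
    print(temperature(H))                    # temperature profile after melting has begun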

Submitted by: Vasili Alexiades
--------------

[53]  Max Morris   Vasili Alexiades 

     Sensitivity Analysis of a Numerically Simulated
     HgCdTe Solidification Process by Statistical Methods

     ORNL-6210, August 1985

Statistical response surface methodology is applied to the problem of 
determining simple approximations for outputs as functions of input 
parameter values in a numerical simulation of the solidification of 
a mercury-cadmium-telluride alloy.  The approximating polynomials are 
then differentiated with respect to parameter values to determine 
sensitivities.  A table of percent changes in output values as a 
result of one percent changes in parameter values is presented.  
The empirical procedure used constitutes an unorthodox application 
of the statistical methodology, but it is easy to use, quite generally 
applicable, and very effective.
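
The basic device, fitting a low-order polynomial to simulation outputs and
differentiating it to obtain sensitivities, can be sketched in a few lines.
The two-parameter quadratic below is our own illustration, not the report's
design or model; a percent-change table would then scale each derivative by
the nominal parameter value over the nominal output.

    import numpy as np

    def fit_quadratic_surface(params, outputs):
        """Least-squares fit of y ~ b0 + b1*x1 + b2*x2 + b12*x1*x2 + b11*x1^2 + b22*x2^2."""
        x1, x2 = params[:, 0], params[:, 1]
        Z = np.column_stack([np.ones_like(x1), x1, x2, x1 * x2, x1**2, x2**2])
        beta, *_ = np.linalg.lstsq(Z, outputs, rcond=None)
        return beta

    def sensitivities(beta, x):
        """Differentiate the fitted polynomial with respect to each parameter at x."""
        b0, b1, b2, b12, b11, b22 = beta
        return np.array([b1 + b12 * x[1] + 2.0 * b11 * x[0],
                         b2 + b12 * x[0] + 2.0 * b22 * x[1]])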

    Submitted by: Vasili Alexiades
--------------

[54]  V. Alexiades     G. A. Geist      A. D. Solomon

     Numerical Simulation of a HgCdTe Solidification Process

     ORNL-6127, August 1985

The solidification of a cylindrical ingot of mercury-cadmium-telluride is 
modeled taking into account both heat conduction and solute diffusion.  
Values of the relevant thermophysical parameters of the pseudo-binary 
HgTe-CdTe are compiled.  The model is implemented numerically by a 
finite-difference discretization and results of the simulation of a 
freezing experiment are reported.

    Submitted by: Vasili Alexiades
--------------

[55]  George A. Geist        Michael T. Heath

     Parallel Cholesky Factorization on a Hypercube Multiprocessor

     ORNL-6190, July 1985

Two types of message-passing parallel algorithms are developed for
solving symmetric systems of linear equations on a hypercube
multiprocessor.  One type involves broadcast communication among
processors, and the other involves communication along a ring of
processors.  Details are provided in the form of C programs that
implement the algorithms on a hypercube simulator and should run
with little modification on real hypercube hardware.  Performance of
the various algorithms is demonstrated by means of processor
utilization graphs and parallel speedup curves.
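
A serial sketch of the broadcast (fan-out) variant may help fix ideas.  The
code below is ours, not the report's C programs; the "broadcast" appears
only as a comment in this single-process simulation.

    import numpy as np

    def fanout_cholesky(A, P):
        """Column-oriented Cholesky with columns wrap-mapped to P processors.
        After column k is finished, its owner would broadcast it; here the
        communication is only marked by comments."""
        n = A.shape[0]
        L = np.tril(np.array(A, dtype=float))
        owner = [j % P for j in range(n)]
        for k in range(n):
            L[k, k] = np.sqrt(L[k, k])
            L[k+1:, k] /= L[k, k]
            # owner[k] broadcasts column k to the other P-1 processors here
            for j in range(k + 1, n):
                # processor owner[j] updates its column j with the received column k
                L[j:, j] -= L[j:, k] * L[j, k]
        return L

    A = np.array([[4.0, 2.0, 2.0], [2.0, 5.0, 3.0], [2.0, 3.0, 6.0]])
    L = fanout_cholesky(A, 2)
    print(np.allclose(L @ L.T, A))    # True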

    Submitted by: Mike Heath
--------------


[56]  Michael T. Heath

     Parallel Cholesky Factorization in
     Message-Passing Multiprocessor Environments

     ORNL-6150, May 1985

Parallel algorithms are presented for computing the Cholesky
factorization on multiprocessors having only private local memory.
Synchronization of multiple processes is based on message passing.
Several possible processor interconnection networks are considered.

Submitted by: Mike Heath
--------------


[57]  George A. Geist
     ORNL
   
     Efficient Parallel LU Factorization with Pivoting
     on a Hypercube Multiprocessor

     ORNL-6211, August 1985

A message-passing parallel algorithm is developed for forming 
the LU factors of general non-singular matrices on a hypercube
multiprocessor.  Partial pivoting is performed to ensure numerical
stability, but the scheduling of tasks is such that the pivot search 
in the parallel algorithm is completely masked.  Empirical tests show 
that the load imbalance produced by random pivoting causes a 5%-14% 
increase in execution time.  A complementary parallel triangular 
solution algorithm is also given.  Comparisons with the non-pivoting 
case are used to demonstrate the efficiency of this new algorithm.
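
For reference, a sequential LU factorization with partial pivoting is
sketched below.  This is our own reference code; the report's contribution
is the hypercube scheduling that hides the pivot search behind other work,
which a serial sketch cannot show.

    import numpy as np

    def lu_partial_pivoting(A):
        """Return L and U packed in one array plus the row permutation."""
        A = np.array(A, dtype=float)
        n = A.shape[0]
        perm = np.arange(n)
        for k in range(n - 1):
            p = k + int(np.argmax(np.abs(A[k:, k])))    # pivot search down column k
            if p != k:
                A[[k, p]] = A[[p, k]]
                perm[[k, p]] = perm[[p, k]]
            A[k+1:, k] /= A[k, k]                       # multipliers (column of L)
            A[k+1:, k+1:] -= np.outer(A[k+1:, k], A[k, k+1:])   # trailing update
        return A, perm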

Submitted by: Al Geist
--------------

##### PITT #####

[58] Rami G. Melhem

    A Study of Data Interlock in VLSI Computational
    Networks for Sparse Matrix Multiplication

    TR-CSD-505
    Department of Computer Science
    Purdue University
    April 1985

The general question addressed in this study is: are regular VLSI networks
suitable for sparse matrix computations?  More specifically, we consider
a special-purpose self-timed network that is designed for a specific
dense matrix computation.  We add to each cell in the network the capability
of recognizing and skipping operations that involve zero operands, and then
ask how efficient the resulting network is for sparse matrix computation.
In order to answer this question, it is necessary to study the effect of
data interlock on the performance of self-timed networks.  For this, the
class of pseudo-systolic networks is introduced as a hybrid class between
systolic and self-timed networks.  Networks in this class are easy to analyze,
and they provide a means for studying the worst-case performance of self-timed
networks.  The well-known concept of a computation front is also generalized to
include irregular flow of data, and a technique based on the propagation of
such computation fronts is suggested for estimating the processing
time and the communication time of pseudo-systolic networks.
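
The cell modification mentioned above amounts to an inner-product step that
tests for zero operands.  A toy single-cell model of ours, with none of the
paper's timing or interlock behaviour, is:

    def inner_product_cell(a_stream, b_stream):
        """Accumulate sum(a*b) while recognizing and skipping operations that
        involve a zero operand; returns the result and the count of skipped steps."""
        acc, skipped = 0.0, 0
        for a, b in zip(a_stream, b_stream):
            if a == 0 or b == 0:
                skipped += 1
            else:
                acc += a * b
        return acc, skipped

    print(inner_product_cell([1, 0, 3, 0], [4, 5, 0, 0]))   # (4.0, 3)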

Submitted by: Rami Melhem (melhem@purdue OR melhem@pitt)
Obtainable from:
  Rami Melhem
  University of Pittsburgh
  Department of Mathematics
  Pittsburgh, PA  15260
---------------

[59] Rami G. Melhem

    A Modified Frontal Technique Suitable for Parallel Systems

    ICMA-85-84
    Department of Mathematics
    University of Pittsburgh
    July 1985

Frontal techniques offer the potential for processing the assembly and
the factorization phases of finite element analysis in parallel.
However, the rows of the stiffness matrix are assembled and factored
in a different order, thus depriving frontal solvers of the uniformity
desirable in parallel processing.  On the other hand, band solution
techniques handle the factorization phase in a very uniform way but
do not interleave assembly and factorization.  In this paper, we
suggest a technique that borrows from both frontal and band solvers
those characteristics that are advantageous for parallel processing.
Moreover, bookkeeping and data manipulation are simpler in the suggested
technique than in the classical frontal method, which also makes it
attractive for sequential systems.

Obtainable from: Rami Melhem, as above.
---------------

##### STANFORD #####

[60] Gene H. Golub and Carl D. Meyer

    Simultaneous Computation of Stationary Probabilities
    with Estimates of Their Sensitivity

    SIAM ms.no.10958

    < No Abstract >

Submitted by: Gene H. Golub (golub@su-navajo.arpa)
Obtainable from:
    Gene H. Golub (golub@su-navajo.arpa)
    Computer Science Department
    Stanford University
    Stanford,  CA  94305
---------------

[61] Philip E. Gill, Walter Murray, Michael A. Saunders,
    J. A. Tomlin, and Margaret H. Wright

    ON PROJECTED NEWTON BARRIER METHODS FOR LINEAR
    PROGRAMMING AND AN EQUIVALENCE TO KARMARKAR'S PROJECTIVE METHOD

We discuss interior-point methods for linear programming derived by applying
a logarithmic barrier transformation and performing projected Newton steps
for a sequence of barrier parameters.  Under certain conditions,
one of these ``projected Newton barrier'' methods is shown to be
equivalent to Karmarkar's (1984) projective method for linear programming.
Details are given of a specific barrier algorithm and its practical 
implementation.  Numerical results are given for several nontrivial test
problems.
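
To make the idea concrete, here is a small sketch of ours of one projected
Newton step for the barrier subproblem min c'x - mu*sum(log x) subject to
Ax = b, using dense linear algebra and no safeguards; it is an illustration
of the general idea, not the paper's algorithm or implementation.

    import numpy as np

    def newton_barrier_step(A, c, x, mu):
        """One projected Newton step from a strictly feasible x (A x = b, x > 0)."""
        g = c - mu / x                         # gradient of the barrier function
        H = np.diag(mu / x**2)                 # its (diagonal) Hessian
        m = A.shape[0]
        # KKT system: the multiplier block keeps the step in the null space of A
        K = np.block([[H, A.T], [A, np.zeros((m, m))]])
        d = np.linalg.solve(K, np.concatenate([-g, np.zeros(m)]))[:x.size]
        neg = d < 0                            # step length keeping x + alpha*d > 0
        alpha = 1.0 if not neg.any() else min(1.0, 0.9 * np.min(-x[neg] / d[neg]))
        return x + alpha * d

    # tiny example: minimize x1 + 2*x2 subject to x1 + x2 = 1, x >= 0
    A, c = np.array([[1.0, 1.0]]), np.array([1.0, 2.0])
    x = np.array([0.5, 0.5])
    for mu in (1.0, 0.1, 0.01):                # decreasing barrier parameters
        for _ in range(5):
            x = newton_barrier_step(A, c, x, mu)
    print(x)                                   # approaches the optimal vertex (1, 0)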

Submitted by: Philip E. Gill (or.gill@su-sierra.arpa)
Obtainable from:
    Philip E. Gill (or.gill@su-sierra.arpa)
    Systems Optimization Laboratory
    Operations Research Department
    Stanford University
    Stanford,  CA  94305
---------------

##### SUNY/BUFFALO #####

[62] P.J.Eberlein

    On the Schur Decomposition of a Matrix for Parallel Computation

    TR 85-689, June 1985, Cornell University

An algorithm to solve the eigenproblem for non-symmetric matrices on 
an N x N array of mesh-connected processors, isomorphic to the architecture
described by Brent and Luk for symmetric matrices, is presented.  This
algorithm is a generalization of the classical Jacobi method which holds
promise for parallel architectures.  Only unitary transformations are used.
The rotational parameters are carefully analysed, and many examples of a
working program, simulating the parallel architecture, are given.

Submitted by: P.J.Eberlein (eberlein%gvax@cornell.arpa)
Obtainable from:
  P.J.Eberlein
  Dept. of Computer Science
  SUNY/Buffalo, Bell Hall,
  Buffalo, N.Y. 14260
---------------

##### TORONTO #####

[63] D.W. Decker and A. Jepson

    Convergence cones near bifurcation

    TR 171/84.

    < No Abstract >

Submitted by: Ken R. Jackson (krj@toronto.csnet)
Obtainable from:
    The Scientific Computing Group,
    Department of Computer Science,
    University of Toronto,
    Toronto, Ontario,
    Canada   M5S 1A4
---------------

[64] Paul Muir

    Implicit Runge-Kutta methods for
    two-point boundary value problems

    TR 175/84.

    < No Abstract >

Obtainable from the above.
---------------

[65] Kevin Burrage

    The order properties of implicit multivalue
    methods for ordinary differential equations

    TR 176/84.

    < No Abstract >

Obtainable from the above.
---------------

[66] Kevin Burrage

    Order and stability properties of
    explicit multivalue methods

    TR 177/84.

    < No Abstract >

Obtainable from the above.
---------------

[67] W.H. Enright, K.R. Jackson, S.P. Norsett, and P.G. Thomsen

    Interpolants for Runge-Kutta formulas

    TR 180/85.

    < No Abstract >

Obtainable from the above.
---------------

[68] J.M. Fine

    Low order Runge-Kutta-Nystrom methods with interpolants

    TR 183/85.

    < No Abstract >

Obtainable from the above.
---------------

##### WASHINGTON STATE #####

[69] ALAN GENZ

    FULLY SYMMETRIC INTERPOLATORY RULES FOR MULTIPLE INTEGRALS

    CS-85-136
    WASHINGTON STATE UNIVERSITY
    JULY, 1985

A method is given for the direct determination of the weights for fully
symmetric integration rules for the hypercube, using multivariable Lagrange
interpolation polynomials. The formulas for the weights lead to new classes 
of efficient rules.
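
The one-dimensional analogue of determining weights directly from Lagrange
interpolation polynomials (w_i = integral of the i-th basis polynomial over
the region) is easy to write down.  The sketch below is ours and merely
reproduces Simpson's rule on [-1, 1]; the report works with fully symmetric
node sets on the hypercube.

    import numpy as np
    from numpy.polynomial import polynomial as P

    def interpolatory_weights(nodes):
        """w_i = integral over [-1, 1] of the i-th Lagrange basis polynomial."""
        w = []
        for i, xi in enumerate(nodes):
            li = np.array([1.0])                     # coefficients, constant term first
            for j, xj in enumerate(nodes):
                if j != i:
                    li = P.polymul(li, np.array([-xj, 1.0]) / (xi - xj))
            anti = P.polyint(li)                     # antiderivative
            w.append(P.polyval(1.0, anti) - P.polyval(-1.0, anti))
        return np.array(w)

    print(interpolatory_weights(np.array([-1.0, 0.0, 1.0])))   # [1/3, 4/3, 1/3] (Simpson)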

Submitted by: Alan Genz
  ARPANET: na.genz@su-score
  CSNET: acg@wsu
Obtainable from:
  Alan Genz
  Computer Science Department
  Washington State University
  Pullman, WA 99164-1210
---------------
                  

[70] ALAN GENZ

    MATRIX METHODS FOR THE FORCED DIFFUSION EQUATION

    CS-85-137
    WASHINGTON STATE UNIVERSITY
    JULY, 1985

Finite difference methods for the forced diffusion equation with time
dependent boundary conditions are developed using matrices to represent both
the approximate solution and the difference operations. This technique allows
boundary conditions to be included in a natural way and eliminates the need
for an analysis of the commutativity of the spatial difference operations.
The matrix methods are used to develop algorithms that are second order
accurate in space and time for a standard square domain and an annular domain.
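
The flavour of the matrix formulation can be shown with an explicit Euler
step for u_t = u_xx + u_yy + f on the unit square, where the grid function
is stored as a matrix U and the one-dimensional difference operator acts
from the left (x-differences) and the right (y-differences).  This is our
sketch, first order in time; the report's schemes are second order in both
space and time.

    import numpy as np

    def second_difference(n, h):
        """1-D second-difference matrix on n interior points, Dirichlet boundaries."""
        return (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
                + np.diag(np.ones(n - 1), -1)) / h**2

    n = 20
    h, dt = 1.0 / (n + 1), 1.0e-4
    D = second_difference(n, h)
    U = np.zeros((n, n))                       # approximate solution as a matrix
    F = np.ones((n, n))                        # forcing term
    for _ in range(500):
        U = U + dt * (D @ U + U @ D.T + F)     # x-differences from the left, y from the right
    print(U.max())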

Obtainable from: Alan Genz, as above.
---------------

##### WATERLOO #####

[71] Senad Busovaca

    Handling Degeneracy in a Nonlinear L-1 Algorithm

    Technical Report CS-85-34
    Department of Computer Science
    University of Waterloo

This paper is concerned with handling degeneracy in a nonlinear
optimization algorithm based on an active set strategy.  Although the
solution to the problem is given in the context of nonlinear L-1
optimization, the approach is more general and can be used to overcome
the non-uniqueness of dual variables in methods that use optimality
conditions constructively.

Submitted by: Richard Bartels
Obtainable from:
	Richard Bartels
	University of Waterloo
	Computer Science Department
	Waterloo, Ontario  N2L 3G1
	CANADA
---------------