bertoni@june.cs.washington.edu (Jonathan Bertoni) (06/11/91)
ADVANCE PROGRAM

1991 INTERNATIONAL CONFERENCE ON PARALLEL PROCESSING
August 12-16, 1991
Pheasant Run Hotel
St. Charles (West of Chicago), IL 60174

CONFERENCE PROGRAM
------------------
(r) regular paper
(c) concise paper

Tuesday, August 13

SESSION 1A: System Architectures
(r) Multiple Interleaved Bus Architectures
    J. Bertoni and W.-H. Wang
(r) The Performance of Hierarchical Systems with Wiring Constraints
    W. T. Hsu and P.-C. Yew
(c) A Hybrid Architecture and Adaptive Scheduling for Parallel Execution of Logic Programs
    S. Auwatanamongkol and P. Biswas
(c) Bus Conflicts for Logical Memory Banks on a Cray Y-MP Type Processor System
    K. A. Robbins and S. Robbins

SESSION 1B: Resource Allocation
(r) Prime Cube Graph Approach for Processor Allocation in Hypercube Multiprocessors
    H. Wang and Q. Yang
(r) Communication-Efficient Vector Manipulations on Binary N-Cubes
    W. Lin
(c) A General Model for Scheduling of Parallel Computations and its Application to Parallel I/O Operations
    R. Jain, J. Werth, and J. C. Browne
(c) Efficient Interprocessor Communication on Distributed Shared-Memory Multiprocessors
    H.-M. Su and P.-C. Yew

SESSION 1C: Cedar System
(r) The Organization of the Cedar System
    J. Konicek, T. Tilton, A. Veidenbaum, C.-Q. Zhu, E. S. Davidson, R. Downing, M. Haney, M. Sharma, P. C. Yew, P. M. Farmwald, D. Kuck, D. Lavery, R. Lindsey, D. Pointer, J. Andrews, T. Beck, T. Murphy, S. Turner, and N. Warter
(r) Restructuring Fortran Programs for Cedar
    R. Eigenmann, J. Hoeflinger, G. Jaxon, Z. Li, and D. Padua
(c) The Xylem Operating System
    P. A. Emrath, M. S. Anderson, R. R. Barton, and R. E. McGrath
(c) Preliminary Performance Analysis of the Cedar Multiprocessor Memory System
    K. Gallivan, W. Jalby, S. Turner, A. Veidenbaum, and H. Wijshoff

SESSION 1D: Numerical Algorithms
(r) A Parallel Algorithm for Exact Solution of Linear Equations
    C. K. Koc and R. M. Piedra
(r) An Iterative Sparse Linear System Solver on Star Graphs
    K. Kim and V. K. P. Kumar
(c) Shared Memory Parallel Algorithms for Homotopy Curve Tracking
    D. C. S. Allison, K. M. Irani, C. J. Ribbens, and L. T. Watson
(c) Efficient Systolic Array for Matrix Multiplication
    F. Klass and U. Weiser

SESSION 2A: Processor Architectures I
(r) The Concurrent Execution of Multiple Instruction Streams on Superscalar Processors
    G. E. Daddis Jr. and H. C. Torng
(r) A Benchmark Evaluation of a Multi-threaded RISC Processor Architecture
    R. G. Prasadh and C.-L. Wu
(c) Multiple Stream Execution on the DART Processor
    J. Shetler and S. E. Butner
(c) Performance Advantages of Multithreaded Processors
    W. W. Park, D. S. Fussell, and R. M. Jenevein

SESSION 2B: Mapping Algorithms to Parallel Systems
(r) A Mapping Strategy for MIMD Computers
    J. Yang, L. Bic, and A. Nicolau
(r) Impact of Temporal Juxtaposition on the Isolated Phase Optimization Approach to Mapping an Algorithm to Mixed-Mode Architectures
    T. B. Berg, S.-D. Kim, and H. J. Siegel
(c) Manipulation of Parallel Algorithms to Improve Performance
    L. F. Wilson and M. J. Gonzalez
(c) Using Simulated Annealing for Mapping Algorithms onto Data Driven Arrays
    B. Mendelson and I. Koren

SESSION 2C: Performance Evaluation 1
(c) Predicting the Effect of Mapping on the Communication Performance of Large Multicomputers
    Suresh Chittor and Richard J. Enbody
(c) Optimizing the Parallel Execution Time of Homogeneous Random Workloads
    Emile K. Haddad
(r) An Empirical Comparison of Four Synchronization Optimization Techniques
    S.P. Midkiff and D.A. Padua
(r) An Effectiveness Study of Parallelizing Compiler Techniques
    Rudolf Eigenmann and William Blume

SESSION 2D: Reconfigurable Meshes
(r) Selection on the Reconfigurable Mesh
    H. ElGindy and P. Wegrowicz
(r) Reconfigurable Mesh Algorithms for the Hough Transform
    J.-F. Jenq and S. Sahni
(r) Configurational Computation: A New Computation Method on Processor Arrays with Reconfigurable Bus Systems
    B.-F. Wang, G.-H. Chen, and H. Li

SESSION 3A: Processor Architectures II
(r) Microprocessor Architecture with Multi-Bit Scoreboard Concurrency Control
    T. Tran and C.-L. Wu
(r) Performance Analysis of an Address Generation Coprocessor
    P. T. Hulina, L. D. Coraor, and S.-W. Sun
(c) A Percolation Based VLIW Architecture
    A. Abnous, R. Potasman, N. Bagherzadeh, and A. Nicolau
(c) The TMS320C40 and its Application Development Environment: A DSP for Parallel Processing
    R. Simar Jr.

SESSION 3B: Synchronization
(r) Efficient Synchronization Schemes for Large-Scale Shared-Memory Multiprocessors
    K. Ghose and D.-C. Cheng
(r) MISC: A Mechanism for Integrated Synchronization and Communication Using Snoop Caches
    T. Matsumoto, T. Tanaka, T. Moriyama, and S. Uzuhara
(c) Wired-NOR Barrier Synchronization for Designing Large Shared-Memory Multiprocessors
    K. Hwang and S. Shang
(c) A New Synchronization Mechanism
    T. Chen and C.-Q. Zhu

SESSION 3C: Compilers 1
(c) Runtime Compilation Methods for Multicomputers
    Janet Wu, Joel Saltz, Seema Hiranandani and Harry Berryman
(c) An Overview of Compiler Optimization of Interprocess Communication and Synchronization Mechanisms
    Ronald A. Olsson and Carole M. McNamee
(r) The PARULEL Parallel Rule Language
    Salvatore J. Stolfo, Hasanat M. Dewan and Ouri Wolfson
(r) The Tiny Loop Restructuring Research Tool
    Michael Wolfe

SESSION 3D: Sorting and Searching
(r) Breadth-First Traversal of Trees and Integer Sorting in Parallel
    C. C.-Y. Chen and S. K. Das
(c) Bitonic Sort with an Arbitrary Number of Keys
    B.-F. Wang, G.-H. Chen, and C.-C. Hsu
(c) Experiments in Parallel Heuristic Search
    L. de P. S. Freitas and V. C. Barbosa
(c) Efficient Parallel Sorting and Merging Algorithms for Two-Dimensional Mesh-Connected Processor Arrays
    S.-Y. Kuo and S.-C. Liang
(c) Optimal Parallel External Merging under Hardware Constraints
    J.-Y. Fu and F.-C. Lin

SESSION 4A: Memory Systems
(r) Scalar-Vector Memory Interference in Vector Computers
    R. Raghavan and J. P. Hayes
(r) Design and Evaluation of Fault-Tolerant Interleaved Memory Systems
    S.-Y. Kuo, A. Louri, and S.-C. Liang
(r) Memory System for a Statically Scheduled Supercomputer
    C. S. Joshi, B. A. Reger, and J. R. Feehrer

SESSION 4B: Effective Communication
(r) Message Vectorization for Converting Multicomputer Programs to Shared-Memory Multiprocessors
    D. K. Panda and K. Hwang
(r) Allocation and Communication in Distributed Memory Multiprocessors for Periodic Real-Time Applications
    S. B. Shukla and D. P. Agrawal
(c) Broadcast Networks for Fast Synchronization
    C. J. Beckmann and C. D. Polychronopoulos
(c) Arc-Disjoint Spanning Trees on Cube-Connected Cycles Networks
    P. Fraigniaud and C.-T. Ho

SESSION 4C: Tasks 1
(r) Split-Join and Message Passing Programming Models on the BBN TC2000
    Eugene D. Brooks III, Brent C. Gorda, Karen H. Warren and Tammy S. Welcome
(r) On Synchronization Patterns in Parallel Programs
    Jean-Loup Baer and Richard N. Zucker
(r) A Non-Blocking Algorithm for Shared Queues using Compare-and-Swap
    Sundeep Prakash, Yann Hang Lee and Theodore Johnson

SESSION 4D: Complexity and Optimization
(r) Algorithms for Determining Optimal Partitions in Parallel Divide-and-Conquer Computations
    A. Saha and M. D. Wagh
(r) Efficient Parallel Computation of Hamilton Paths and Circuit in Interval Graphs
    M. A. Sridhar and S. Goyal
(c) Solving a Load Balancing Problem Using Boltzmann Machines
    I. Hwang and R. Varadarajan
(c) Study of an Inherently Parallel Heuristic Technique
    I. Pramanick and J. G. Kuhl

EVENING SUPER PANEL: Toward Teraflop Computing
Panelists:
    Chris Hsiung, Cray Research
    Dave Patterson, UC Berkeley
    Justin Rattner, Intel
    Burton Smith, Tera Computer
    Guy Steele, Thinking Machines
    Steve Wallach, Convex Computer

Wednesday, August 14

SESSION 5A: Caches I
(r) Cache Coherence on a Slotted Ring
    L. A. Barroso and M. Dubois
(r) A Solution of Cache Ping-Pong Problem in RISC Based Parallel Processing Systems
    J. Fang and M. Lu
(c) A Lockup-Free Multiprocessor Cache Design
    P. Stenstrom, F. Dahlgren, and L. Lundberg
(c) Streamline: Cache-Based Message Passing in Scalable Multiprocessors
    G. T. Byrd and B. A. Delagi

SESSION 5B: Permuters
(r) Decomposition of Perfect Shuffle Networks
    K. E. Batcher
(r) Fast Self-Routing Permutation Networks
    C. Jan and A. Y. Oruc
(c) On Self-Routable Permutations in Benes Network
    N. Das, K. Mukhopadhyaya, and J. Dattagupta
(c) Realizing Frequently Used Permutations on Syncube
    Z. Liu and J.-H. You

SESSION 5C: Data Issues
(r) Managing Data Synchronization Automatically for Distributed-Memory Architectures
    Ko-Yang Wang
(r) An Integrated Hardware/Software Solution for Effective Local Storage Management
    Elana D. Granston and Alexander V. Veidenbaum
(r) Dynamic Load Sharing in the Presence of Information Obsolescence in Distributed Database Environments
    Avraham Leff and Phillip S. Yu

SESSION 5D: Graphs and Trees I
(r) A Parallel Algorithm For Computing Fourier Transforms On the Star Graph
    P. Fragopoulou and S. G. Akl
(r) A Parallel Algorithm for the PROFIT/COST Problem
    Y. Han
(c) Parallel Algorithms on Outerplanar Graphs
    H.A. Choi and M.J. Chung
(c) Graph-Partitioning on Shared-Memory Multiprocessor Systems
    K.S. Natarajan

SESSION 6A: Caches II
(r) Inter-Section Locality of Shared Data in Parallel Programs
    J.-K. Peir, K. So and J.-H. Tang
(c) Analytical Modeling for Finite Cache Effects
    J. C. Wang, M. Dubois, and F. A. Briggs
(c) The Effects of Network Delays on the Performance of MIN-Based Cache Coherence Protocols
    S. J. Baylor and Y. Hsu
(c) A Cache Coherency Mechanism with Limited Combining Capabilities for MIN-Based Multiprocessors
    K. Ghose and S. Simhadri
(c) Effects of Program Parallelization and Stripmining Transformation on Cache Performance in a Multiprocessor
    M. Gupta and D. A. Padua

SESSION 6B: Reliability and Fault Tolerance of Networks
(r) Fault Tolerant Routing on a Class of Rearrangeable Networks
    Y. M. Yeh and T. Y. Feng
(c) Fault Side-Effects in Fault Tolerant Multistage Interconnection Networks
    T. Schwederski, E. Bernath, G. Roos, W. G. Nation, and H. J. Siegel
(c) New Bounds on the Reliability of Two Augmented Shuffle-Exchange Networks
    B. Menezes, U. Bakhru, and R. Sergent
(c) A Reliable Butterfly Network for Distributed-Memory Multiprocessors
    N.-F. Tzeng
(c) Multiple-Fault Tolerant Cube-Connected Cycles Networks
    C. J. Shih and K. E. Batcher

SESSION 6C: Performance Evaluation 2
(r) Multiprocessor Simulation and Tracing Using Tango
    Helen Davis, Stephen R. Goldschmidt and John Hennessy
(c) Recovering Uncorrupted Event Traces from Corrupted Event Traces in Parallel/Distributed Computing Systems
    Mark S. Andersland and Thomas L. Casavant
(c) Experimental Verification of the Critical Path Simulation of an SIMD/MIMD Parallel Processing System
    Edward C. Bronson and Leah H. Jamieson
(r) Visualizing System Behavior
    Ted Lehr, David Black, Zary Segall and Dalibor Vrsalovic

SESSION 6D: Algorithm Potpourri
(r) Parallel Join Algorithms for SIMD Models
    S. Azadegan and A. Tripathi
(c) Massively Parallel Algorithms for Network Partition Functions
    A. G. Greenberg and I. Mitrani
(c) An Algorithm for Concurrent Search Trees
    A. Colbrook, E. A. Brewer, C. N. Dellarocas, and W. E. Weihl
(c) Optimal Data Parallel Methods for Stochastic Dynamical Programming
    H. H. Xu, F. B. Hanson, and S.-L. Chung
(c) Triangulation, Voronoi Diagram, and Convex Hull in k-Space on Mesh-Connected Arrays and Hypercubes
    J.A. Holey and O.H. Ibarra

SESSION 7A: Data Flow Architectures
(r) An Enhanced Data-Driven Architecture
    H. Ahmed, L.-E. Thorelli, and J. Wennlund
(r) A Fine Grain MIMD System with Hybrid Event-Driven/Dataflow Synchronization for Bit-Oriented Computation
    M. R. Thistle and T. L. Sterling
(c) Liger: A Hybrid Dataflow Architecture Exploiting Data/Control Locality
    T.-C. Chiueh
(c) Data flow Model for a Hypercube Multiprocessing Network
    A. Shaout and D. Smyth

SESSION 7B: Effective Memory Access Methods
(r) Two Techniques to Enhance the Performance of Memory Consistency Models
    K. Gharachorloo, A. Gupta, and J. Hennessy
(c) Efficient Storage Schemes for Arbitrary Size Square Matrices in Parallel Processors with Shuffle-Exchange Networks
    R. V. Boppana and C. S. Raghavendra
(c) Recent Results on the Parallel Access to Tree-Like Data Structures - The Isotropic Approach, I
    R. Creutzburg and L. Andrews
(c) Multiple-Port Memory Access in Decoupled Architecture Processors
    S. Weiss
(c) Eliminating False Sharing
    S. J. Eggers and T. E. Jeremiassen

SESSION 7C: Compilers 2
(r) Automatic Parallel Program Generation and Optimization from Data Decompositions
    Edwin M. Paalvast, Henk J. Sips and A.J. van Gemund
(r) On Loop Transformations for Generalized Cycle Shrinking
    Weijia Shang, Mathew T. O'Keefe and Jose A. B. Fortes
(c) The Effect of Compiler Optimizations on Available Parallelism in Scalar Programs
    Scott A. Mahlke, Nancy J. Warter, William Y. Chen, Pohua P. Chang and Wen-mei W. Hwu
(c) Generalized Unimodular Loop Transformations for Distributed Memory Multiprocessors
    K.G. Kumar, K. Kulkarni and A. Basu

SESSION 7D: Numerical Applications
(r) Parallelizing SPICE2 on Shared-Memory Multiprocessors
    G.-C. Yang
(r) Multifrontal Factorization of Sparse Matrices on Shared-Memory Multiprocessors
    K. Eswar, P. Sadayappan, and V. Visvanathan
(c) An Efficient Arnoldi Method Implemented on Parallel Computers
    S. K. Kim and A. Chronopoulos
(c) Parallel Implementation of Gauss-Seidel Type Algorithms for Power Flow Analysis on a SEQUENT Parallel Computer
    G. Huang and W. Ongsakul

SESSION 8A: Application Specific Architectures
(r) Effective Load and Resource Sharing in Parallel Protocol-Processing Systems
    T. V. Lakshman, D. Ghosal, Y. Huang, and S. K. Tripathi
(c) A Distributed Token-Driven Technique for Parallel Zero-Delay Logic Simulation on Massively Parallel Machines
    M. J. Chung and Y. Chung
(c) Parallel Hough Transform for Image Processing on a Pyramid Architecture
    A. Kavianpour and N. Bagherzadeh
(c) UWGSP4: Merging Parallel and Pipelined Architectures for Imaging and Graphics
    H. W. Park, T. Alexander, K. S. Eo, and Y. Kim
(c) Application of Neural Networks in Handling Large Incomplete Databases: VLSI Design and Performance Analysis
    B. Jin, S. H. Pakzad, and A. R. Hurson

SESSION 8B: Practical Aspects of Network Design
(r) Branch-and-Combine Clocking of Arbitrarily Large Computing Networks
    A. El-Amawy
(c) Emulation of a PRAM on Leveled Networks
    M. Palis, S. Rajasekaran, and D. S. L. Wei
(c) Optimal Nonblocking Networks with Pin Constraints
    M. V. Chien and A. Y. Oruc
(c) Design and Implementation of a Versatile Interconnection Network in the EM-4
    S. Sakai, Y. Kodama, and Y. Yamaguchi
(c) A Massively Parallel Processing Unit with a Reconfigurable Bus System RIPU
    W.-T. Chen, C.-C. Liu, and M.-Y. Fang

SESSION 8C: Operating Systems
(r) RTPSP: A Real-Time Parallel Signal Processing Environment for Fast Homogeneous Message-Passing Multicomputers
    Urban A. Thoeni
(r) OS Experimentation and a User Community Coexist Under the DUnX Kernel
    Richard P. LaRowe Jr. and Carla Schlatter Ellis
(r) Locking and Reference Counting in the Mach Kernel
    David L. Black, Avadis Tevanian Jr., David B. Golub and Michael W. Young

SESSION 8D: Image Processing and Graphics
(r) Parallelization of the EM Algorithm for 3D PET Image Reconstruction: Performance Estimation and Analysis
    C. M. Chen and S.-Y. Lee
(c) On the Complexity of Parallel Image Component Labeling
    V. Chaudhary and J. K. Aggarwal
(c) Detecting Repeated Patterns on Mesh Computers
    R. Miller and S. L. Tanimoto
(c) Parallel Processing of Incremental Ray Tracing on a Multiprocessor Workstation
    S. Horiguchi, A. Katahira, and T. Nakada
(c) Computer Graphics Rendering on a Shared Memory Multiprocessor
    S. Whitman and P. Sadayappan

Thursday, August 15

SESSION 9A: Routing
(r) Performance Evaluation of Multicast Wormhole Routing in 2D-Mesh Multicomputers
    X. Lin, P. K. McKinley, and L. M. Ni
(r) O(LogN/LogLogN) Randomized Routing in Degree-LogN "Hypermeshes"
    T. Szymanski
(c) Randomized Routing of Virtual Connections in Extended Banyans
    T. Szymanski and C. Fang
(c) Fault-Tolerant Message Routing and Error Detection Schemes for the Extended Hypercube
    J. M. Kumar and L. M. Patnaik

SESSION 9B: Multicomputer Networks Design
(r) Two Algorithms for Mutual Exclusion in a Distributed System
    K. Makki, P. Banta, K. Been, and R. Ogawa
(r) A Modular High-Speed Switching Network for Integration of Heterogeneous Computing Resources
    W. E. Chen, Y. M. Kim, and M. T. Liu
(c) Dynamically Reconfigurable Architecture of a Transputer-Based Multicomputer System
    L. Jin, L. Yang, C. Fullmer, and B. Olson
(c) A Cell-Based Data Partitioning Strategy for Efficient Load Balancing in A Distributed Memory Multicomputer Database System
    K. A. Hua, C. Lee, and H. C. Young

SESSION 9C: Tasks 2
(r) The Preprocessed Doacross Loop
    Joel H. Saltz and Ravi Mirchandaney
(r) Directed Taskgraph Scheduling Using Simulated Annealing
    Erik H. D'Hollander and Yves Devis
(r) Intelligent Scheduling AND-and OR-Parallelism in the Parallel Logic Programming System RAP/LOP-PIM
    Gao Yaoqing, Qiu Xiaolin and Hu Shouren

SESSION 9D: Application Potpourri
(r) Parallel Discrete Event Simulation Using Space-Time Memory
    K. Ghosh and R. M. Fujimoto
(c) Partitioning and Mapping Nested Loops on Multiprocessor Systems
    J.-P. Sheu and T.-H. Tai
(c) Finding Optimal Quorum Assignments for Distributed Databases
    D. B. Johnson and L. Raab
(c) Multi-associativity: A Framework for Solving Multiple Non-uniform Problem Instances Simultaneously on SIMD Arrays
    M. C. Herbordt and C. C. Weems
(c) Efficient Parallel Maze Routing Algorithms on a Hypercube Multicomputer
    C. Aykanat and T. M. Kurc

SESSION 10A: Fault Diagnosis, Recovery, and Tolerance
(r) Area Efficient Computing Structures for Concurrent Error Detection in Systolic Architectures
    M. O. Esonu, A. J. Al-Khalili and S. Hariri
(c) Channel Multiplexing in Modular Fault Tolerant Multiprocessors
    M. S. Alam and R. G. Melhem
(c) A Cache-Based Checkpointing Scheme for MIN-Based Multiprocessors
    M. S. Algudady, C. R. Das, and M. J. Thazhuthaveetil
(c) CRAFT: Compiler-Assisted Algorithm-Based Fault Tolerance in Distributed Memory Multiprocessors
    V. Balasubramanian and P. Banerjee
(c) Experimental Evaluation of Multiprocessor Cache-Based Error Recovery
    B. Janssens and W. K. Fuchs

SESSION 10B: Cube Networks
(r) Base-m n-cube: High Performance Interconnection Networks for Highly Parallel Computer PRODIGY
    N. Tanabe, T. Suzuoka, S. Nakamura, Y. Kawakura, and S. Oyanagi
(r) Fault-Tolerant Resource Placement in Hypercube Computers
    H.-L. Chen and N.-F. Tzeng
(c) Multiphase Complete Exchange on a Circuit Switched Hypercube
    S. H. Bokhari
(c) Hierarchical Uni-Directional Hypercubes
    C.-H. Chou and D. H. C. Du

SESSION 10C: Portable Software
(r) Supporting Machine Independent Programming on Diverse Parallel Architectures
    W. Fenton, B. Ramkumar, V.A. Saletore, A.B. Sinha and L.V. Kale
(c) On the Synthesis of Programs for Various Parallel Architectures
    H. Allan Fencl and Chua-Huang Huang
(c) Experience with a Portable Parallelizing Pascal Compiler
    Eran Gabber, Amir Averbuch and Amiram Yehudai
(r) Paragon: A Parallel Programming Environment for Scientific Applications using Communications Structures
    Craig M. Chase, Alex L. Cheung, Anthony P. Reeves and Mark R. Smith

SESSION 10D: Software Development (panel)

SESSION 11A: Non-uniform Traffic Models
(r) Consecutive Requests Traffic Model in Multistage Interconnection Networks
    Y.-H. Lee, S. E. Cheung, and J.-K. Peir
(c) An Approach to the Performance Improvement of Multistage Interconnection Networks with Nonuniform Traffic Spots
    N.-F. Tzeng
(c) Modeling and Analysis of "Hot Spots" in an Asynchronous NxN Crossbar Switch
    E. Pinsky and P. Stirpe
(c) Performance of Multistage Combining Networks
    B.-C. Kang, G. H. Lee, and R. Kain
(c) Effect of Hot Spots on Multiprocessor Systems Using Circuit Switched Interconnection Networks
    L. Kurian and J. Thazhuthaveetil

SESSION 11B: Reconfiguration of Arrays
(r) Reconfiguration of Computational Arrays with Multiple Redundancy
    R. Melhem and J. Ramirez
(c) Reconfigurable Multipipelines
    Y.-H. Choi
(c) Embedding of Multidimensional Meshes on to Faulty Hypercubes
    P.-J. Yang, S.-B. Tien, and C. S. Raghavendra
(c) A Way of Deriving Linear Systolic Arrays from a Mathematical Algorithm Description: Case of the Warshall-Floyd Algorithm
    J. F. Myoupo
(c) B-SYS: A 470-Processor Programmable Systolic Array
    R. Hughey and D. P. Lopresti

SESSION 11C: Programming Models and Paradigms
(r) Communication in Linda/Q - Datatypes and Unification
    Sato Hiroyuki and Shimasaki Masaaki
(c) Object-Oriented Programming for Massively Parallel Machines
    Michael F. Kilian
(c) Experiences Implementing Dataflow on a General-Purpose Parallel Computer
    Ellen Spertus and William J. Dally
(r) Generalised Invariant Semantics of Concurrent Systems
    R. Janicki and M. Koutny

SESSION 11D: Hypercube Computing
(r) On the Performance of a Deadlock-free Routing Algorithm for Boolean n-Cube Interconnection Networks with Finite Buffers
    M.-Y. Horng and L. Kleinrock
(r) A Comparison of SIMD Hypercube Routing Strategies
    M. Fulgham, R. Cypher, and J. Sanz
(c) Fault Tolerant Based Embeddings of Quadtrees into Hypercubes
    N. Krishnakumar, V. Hegde, and S. S. Iyengar
(c) Dilation-6 Embeddings of 3-Dimensional Grids into Optimal Hypercubes
    H. Liu and S. H. S. Huang

SESSION 12A: Performance Evaluation of Networks
(r) General Analytic Models for the Performance Analysis of Unique and Redundant Path Interconnection Networks
    M. Rahman and D. G. Meyer
(r) Performance Evaluation of Multistage Interconnection Networks with Finite Buffers
    J. Ding and L. N. Bhuyan
(c) Effects of Arbitration Protocols on the Performance of Multiple-Bus Multiprocessors
    Q. Yang
(c) Performance Evaluation of Hardware Support for Message Passing in Distributed Memory Multicomputers
    J.-M. Hsu and P. Banerjee

SESSION 12B: Load Balancing
(r) Parallel Iterative Refining A* Search
    G.-J. Li and B. W. Wah
(c) Scheduling in Parallel Database Systems
    S. Dandamudi and C. Y. Chow
(c) Allocating Partitions to Task Precedence Graphs
    B. Narahari and H.-A. Choi
(c) Algorithms for Mapping & Partitioning Chain Structured Parallel Computations
    H.-A. Choi and B. Narahari
(c) Mapping Task Trees onto a Linear Array
    D. Ghosal, A. Mukherjee, R. Thurimella and Y. Yesha

SESSION 12C: Task Scheduling
(r) New "Post-game Analysis" Heuristics for Mapping Parallel Computations to Hypercubes
    Jerry C. Yan
(c) Chain-Based Partitioning and Scheduling of Nested Loops for Multicomputers
    Peiyi Tang and Gavin Michael
(c) Dynamic Loop Scheduling for Shared-Memory Multiprocessors
    Ten H. Tzen and Lionel M. Ni
(r) MTOOL: A Method for Isolating Memory Bottlenecks in Shared Memory Multiprocessor Programs
    Aaron Goldberg and John Hennessy

SESSION 12D: Graphs and Trees II
(r) Efficient Parallel Construction and Manipulation of Quadtrees
    F. Dehne, A. G. Ferreira, and A. Rau-Chaplin
(c) A Fast Parallel Algorithm to Compute Path Functions for Cographs
    R. Lin and S. Olariu
(c) A Fault-Tolerant Routing Algorithm for Star Graph Interconnection Network
    S. Sur and P. K. Srimani
(c) Embedding Binary Trees in Orthogonal Graphs
    I. D. Scherson and C. Huang
(c) Performance Analysis of Layered Task Graphs
    H. Jiang and L. N. Bhuyan

POSTER SESSION
--------------

AREA 1: Parallel Architectures
 1) AP1000 Architecture and Performance of LU Decomposition
    T. Horie, H. Ishihata, T. Shimizu, S. Kato, S. Inano, and M. Ikesaka
 2) Performance Modeling of Quartet Multiprocessor Kernels with Failures
    N. G. Bourbakis and F. N. Barlos
 3) An Architecture for the Rollback Machine
    J. R. Agre and P. A. Tinker
 4) The Architecture of a Multilayer Dynamically Reconfigurable Transputer System
    M. Tudruj and M. Thor
 5) Concurrent Automata and Parallel Architecture
    T. Y. Lin
 6) Towards a General-Purpose Parallel System for Imaging Operations
    H. R. Arabnia
 7) A Fast 1024-Point FFT Architecture
    T. Chen and L. Zhu
 8) A Parallel Architecture for Image Processing
    W.-K. Chou and D. Y. Y. Yun
 9) Improving the Efficiency of Multi-rate Signal Processing Architectures
    S. R. Deshpande & P. P. Jain
10) A Systolic Array Exploiting the Inherent Parallelisms of Artificial Neural Networks
    J. H. Chung, H. Yoon, and S. R. Maeng
11) A Fault-Tolerant Design of CFM VLSI Parallel Array for Yield Enhancement and Reliability Improvement
    W.-G. Che, Y.-D. Li, and Y. Jiao
12) The MULTITOP Parallel Computers for ASDEX-Upgrade
    G. Raupp and H. Richter

AREA 2: Parallel Processing and Systems
 1) Performance Indices for Parallel Marker-Propagation
    R. F. DeMara and D. I. Moldovan
 2) Performance Penalty Communication in Multiprocessor Systems
    W. H. Burkhardt
 3) Simulation Results for a Scalable Coherent Cache System with Incomplete Directory State
    J. E. Hoag and E. D. Brooks III
 4) A Hardware Synchronization Technique for Structured Parallel Computing
    Z. Xu
 5) A Study of Single-Bit Spin-Waiting Alternatives
    D. W. Opitz
 6) Mapping Multiple Problem Instances into a Single Systolic Array with Application to Concurrent Error Detection
    C.-N. Zhang and H. D. Cheng
 7) A Parallel Reconfiguration Algorithm for WSI/VLSI Processor Arrays
    J. H. Kim and P. K. Rhee
 8) Choosing the Right Grains for Data Flow Machines
    R. Hardon and S. S. Pinter
 9) Efficient Production System Execution on the PESA Architecture
    F. Schreiner and G. Zimmermann
10) Implementation of the Hopfield Neural Network on the MasPar System
    J. S. Clary and S. Kothari
11) Loop Staggering, Loop Staggering and Compacting: Restructuring Techniques for Thrashing Problem
    G. Jin, X. Yang, and F. Chen
12) Memory Models, Compiler Optimizations, and P-RISC
    K. Gopinath
13) Inverted Memory
    S. Bhattacharya, C. T. Liang, and W. T. Tsai

AREA 3: Network Topologies
 1) Cartesian Product Networks
    A. Youssef
 2) The VEDIC Network for Multicomputers
    V. Chaudhary, B. Sabata, and J. K. Aggarwal
 3) Block Shift Network: A New Interconnection Network for Efficient Parallel Computation
    Y. Pan and H. Y. H. Chuang
 4) Layered Networks, A New Class of Multistage Interconnection Networks
    D. B. Bennett and S. M. Sohn
 5) On the Construction of Fault-Tolerant Cube-Connected Cycles Networks
    J. Bruck, R. Cypher, and C.-T. Ho
 6) Binary Hypermesh Networks for Parallel Processing
    D.M. Blough and W.-K. Tsai
 7) Hypermesh: A Combined Quad Tree and Mesh Network for Parallel Processing
    W. K. Tsai, N. Bagherzadeh, and Y. C. Kim
 8) C2SC: A Four-Path Fault Tolerant Interconnection Network
    F. N. Sibai and A. Abonamah
 9) On Multistage Networks with Ternary Switching Elements
    K. Padmanabhan

AREA 4: Connection Technology
 1) Self Synchronizing Data Transfer Schemes for Multicomputers
    L. Guoning, J. M. Ostby, O. Sorasen, and Y. Lundh
 2) A Parallel Computer Architecture Based on Optical Shuffle Interprocessor Connections
    B. G. Douglass
 3) Parallel Modular Arithmetic on a Permutation Network
    A. Y. Oruc, V. G. J. Peris, and M. Y. Oruc
 4) Multicasting in Optical Bus Connected Processors Using Coincident Pulse Techniques
    C. Qiao, R. G. Melhem, D. M. Chiarulli, and S. P. Levitan
 5) Fault-Tolerant Ring Embedding in de Bruijn Networks
    R. Rowley and B. Bose
 6) On Embedding Virtual Incomplete Hypercubes into WDM-Based High-Speed Optical Networks
    S. T. Tan and D. H. C. Du
 7) Embedding Large Binary Trees to Hypercube Multiprocessors
    M.-Y. Fang and W.-T. Chen
 8) Simulation of SIMD Algorithms on Faulty Hypercubes
    S.-B. Tien and C. S. Raghavendra
 9) A Batcher Double Omega Network with Combining
    H. Amano and G. Kalidou
10) Stability and Performance of Alternative Two-level Interconnection Networks
    S. Chowdhury and M. A. Holliday
11) Fibonacci Cubes - Properties of an Asymmetric Interconnection Topology
    W.-J. Hsu

AREA 5: Software
 1) Framework: a General Purpose, Heterogeneous, Distributed Concurrent/Parallel Processing Architecture
    Michael J. Andrescavage
 2) A Scheduling Scheme for Efficiently Executing Hybrid Nested Loops
    Chien-Min Wang and Sheng-De Wang
 3) Architecture of an Extensible Parallel Debugger
    C.W. Johnson and P. Mackerras
 4) Message Passing and Dynamic Scheduling on Loosely-coupled Multiprocessor Systems
    Richard P. Ma
 5) FOG: A Distributed Communications Platform
    I.Y. Chiou, W.L. Chan and W.C. Tsai
 6) A Threshold Test for Dynamic Load Balancers
    Milton C. Wikstrom, John L. Gustafson and G.M. Prabhu
 7) Characterization and Measurement of Parallelism in Communications Protocol Software
    Shikharesh Majumdar, C. Murray Woodside and Don Bailey
 8) Sensitivity Analysis and Mapping Programs to Parallel Architectures
    Michael A. Driscoll, Jingsong Fu, Satish Maruti Pai, Chintamani Patwardhan, Liono Setiowijoso, De-Zheng Tang and Kiswanto Thayib
 9) A Heterogeneous Distributed Processing Interface Specification Language
    David D.H. Lin, Behrooz Shirazi and Krishna Kavi
10) The Booster Approach to Annotating Parallel Algorithms
    Leo C. Breebaart, Edwin M. Paalvast and Henk J. Sips
11) Average Waiting Time Profiles of DQDB
    Nageswara Rao, Kurt Maly, Steve Olariu, Liping Zhang and David Game
12) Using Visual Tools for Developing an Asynchronous Parallel Layout Algorithm
    Dror Zernik and Larry Rudolph
13) Fortran-Style Transformations for Functional Programs
    David C. Sehr, Laxmikant V. Kale and David A. Padua
14) Integrating Profiling into Debugging
    Kent Beck, Jon Becher and Zaide Liu
15) Parallel Conceptual Clustering through Message-Driven Computing
    Peter Wohl and Thomas W. Christopher
16) Workload Scheduling: A New Technique for Scheduling Task Graphs with Communication Costs in Parallel Systems
    Yahui Zhu
17) Compiler Support for Parallel I/O Operations
    A.L. Narasimha Reddy, P. Banerjee and D.K. Chen
18) XPAT: An Interactive Graphical Tool for Synthesis of Concurrent Software Using Petri Nets
    Yiannis E. Papelis and Thomas L. Casavant
19) Reducing Latency in DoSpread Loops with Constant Dependence Distances
    Jeffrey D. Martens
20) A Systolizing Compilation Scheme: Abstract
    Michael Barnett and Christian Lengauer
21) SIZEUP: A New Parallel Performance Metric
    Xian-He Sun and John L. Gustafson
22) Efficient Distributed Deadlock Detection and Resolution in Semantic Lock-Based Systems
    Shu-jen Wang, Mukesh Singhal and Ming T. Liu
23) Efficient Detection of Communication Deadlocks in Distributed Systems
    Shu-jen Wang, Mukesh Singhal and Ming T. Liu
24) Parallel Simulation of Fully Associative Caches
    Rabin A. Sugumar and Santosh G. Abraham

AREA 6: Advances in Parallel Algorithms and Applications
 1) Reconfigurable Mesh Algorithms for the Area and Perimeter of Image Components
    J.-F. Jenq and S. Sahni
 2) A Practical Convex Hull Algorithm
    C. Narayanaswami
 3) Ranking, Unranking, and Parallel Enumerating of Topological Orders
    B. Y. Wu and C. Y. Tang
 4) Fast and Efficient Parallel Algorithms for Single Source Lexicographic Depth-First Search, Breadth-First Search and Topological-First Search
    P. de la Torre and C. P. Kruskal
 5) A Parallel Perceptron Learning Algorithm
    T.-P. Hong and S.-S. Tseng
 6) A Multidestination Routing Scheme for Hypercube Multiprocessors
    Z. Li and J. Wu
 7) A Recursive Mutual Exclusion Algorithm for Multiprocessor Systems with Shared Memory
    T.-K. Woo and K. Block
 8) Cost-Optimal Parallel Algorithms for Constructing B-Trees
    B.-F. Wang, G.-H. Chen, and M. S. Yu
 9) An NC Algorithm for Recognizing Strict 2-threshold Graphs
    L. Y. Tseng and W. D. Hao
10) Using Separators Instead of Dynamic Programming in Approximation Algorithms for Planar Graphs
    F. Wan and G. E. Shannon
11) Dynamic Detection of Forest of Tree-Connected Meshes
    E. Jennings, A. Lingas, and L. Motyckova
12) A Parallel Approximation Algorithm for 0/1 Knapsack
    T. E. Gerasch
13) Efficient Execution of Homogeneous Tasks with Unequal Run Times on the Connection Machine
    A. Bestavros and T. Cheatham
14) On a Massively Parallel e-Relaxation Algorithm for Linear Transformation Problems
    X. Li and S. A. Zenios
15) Wild Anomalies in Parallel Branch and Bound
    A. P. Sprague
16) A Combined Clustering and Parallel Optimization Approach to the Traveling Salesman Problem
    B. Freisleben and M. Schulte
17) Timing Analysis of a Parallel Algorithm for Toeplitz Matrices on a MIMD Parallel Machine
    I. Gohberg, I. Koltracht, A. Averbuch, and B. Shoham
18) Large 1-D Fast Fourier Transforms on a Shared Memory System
    Y. Solowiejczyk and J. Petzinger
19) An Overlapped FFT Algorithm for Hypercube Multicomputers
    C. Aykanat and A. Dervis
20) A Massively Parallel Linear System Solver for General and Structural Analysis Uses
    R. C. Shieh and T. Kraay
21) Parallel Computation of the Modified Extended Kalman Filter
    M. Lu, X. Qiao, and G. Chen
22) Parallel Processing of Sparse Matrix Solution Using Fine Grain Tasks on OSCAR
    H. Kasahara, W. Premchaiswadi, M. Tamura, Y. Maekawa, and S. Narita
23) Parallel Performance Evaluation of General Engineering Applications
    T. J. Tautges
24) Parallel Recognition of Two-Dimensional Images
    M. Nivat and A. Saoudi
25) Parallel Incremental LR Parsing
    N. Viswanathan and Y. N. Srikant
26) An Asynchronous Distributed Approach to Test Vector Generation Based on Circuit Partitioning on Parallel Processors
    S. Ghosh and T. J. Chakraborty
27) Granularity Analysis for Parallel 3D Coronary Arteriography
    A. Sarwal, S.G. Tan, F. Ozguner, and D.L. Parker

SPECIAL EVENT
-------------
EVENING Monday, August 12: 20th Ann. Banquet
Speaker: Professor Duncan Lawrie, University of Illinois at Urbana-Champaign

REGISTRATION AND HOTEL RESERVATION
----------------------------------

Conference registration
-----------------------
Conference registration     $240
Each tutorial session       $240

Tutorial session
(on August 12)
    Tutorial 1: parallel computing
    Tutorial 2: Artificial Neural Networks & Parallel Processing
    Tutorial 3: Parallel Programming on Multiprocessors and Multicomputers
    Tutorial 4: Parallel Algorithms & Systems
(on August 16)
    Tutorial 5: System integration with interconnection networks
    Tutorial 6: Parallel architecture for data/knowledge base systems
    Tutorial 7: Modern methodologies for developing parallel processing systems
    Tutorial 8: Parallel programming: Compilers & Operating Systems
------------------------------------------------------------------
(Please print or type)
Mr.  Mrs.  Dr.  Ms.  Prof.  Miss
-----------------------------------------------------------------
(First)                 (Middle)                (Last)
Affiliation------------------------------------------------------
Address----------------------------------------------------------
City, State, Zip-------------------------------------------------
Country----------------------------------------------------------
Business phone-------------------Home phone----------------------

Room Reservation
----------------
Room deposit: One-night deposit payable to Pheasant Run, P.O. Box 64, St. Charles, Illinois 60174, Att. Ms. Carlo Mielczarek

Arrival Date-----------------Departure Date-------------------
Number of----------Adults, ------------Children, -----------Rooms

Room rates:              Single    Double
    Standard room        $63       $73
    Deluxe/Tower room    $83       $93
    Bi-level Suite       $125/Suite (for 3-4 persons)
Check-in time: 4pm

Important Notice
----------------
We must receive TWO checks (one payable to ICPP for conference registration, the other for a one-night deposit payable to Pheasant Run for room reservation) in order to guarantee both registration and room reservation.
They must be received by July 24, 1991.

Send BOTH checks to:
    Pheasant Run
    P.O. Box 64
    St. Charles, IL 60174
    Att. Ms. Carlo Mielczarek

Registration fees must be paid in U.S. dollars by check or money order drawn on a U.S. bank.
Checks must be received with the registration and reservation forms.

Cancellation of registration MUST be made in writing and received by July 24, 1991. Send it to:
    Prof. T. Feng
    EE East Bldg.
    The Pennsylvania State Univ.
    University Park, PA 16802

Further information:
--------------------
Technical contents of the Conference:
    Prof. C. L. Wu
    Dept. of Electrical and Computer Engr.
    Univ. of Texas
    Austin, Texas 78712
    tel: 512-471-4085

Hotel and Conference registration:
    Ms. Carlo Mielczarek
    Pheasant Run
    Phone: 312-584-6300