leff@smu.UUCP (Laurence Leff) (03/17/88)
Naomi Schulman Publications COMPUTER SYSTEMS LABORATORY STANFORD UNIVERSITY Stanford, CA 94305-4055 RECENT REPORTS & NOTES LIST #12 FEBRUARY 1988 To order reports, see end of this announcement. ABSTRACTS CSL-TR-87-333 MANAGING AND MEASURING TWO PARALLEL PROGRAMS ON A MULTIPROCESSOR Jerry C Yan June 1987 71 pages.....$6.05 Research is being conducted to determine how distributed computations can be mapped onto multiprocessors so as to minimize execution time. Instead of employing optimization techniques based on some abstract program/machine models, the approach being investigated here (called "post-game analysis") is based on placement heuristics which utilizes program execution history. Although initial experiments have demonstrated that "post-game analysis" indeed discovered mappings that exhibit significantly shorter execution times than the worst cases for the programs tested, three important issues remain to be addressed: i) the need to evaluate the performance of placement heuristics against the "optimal" speed-up attainable, ii) to find evidence to help explain why these heuristics work and iii) to develop better heuristics by understanding how and why the basic set performed well. Parallel program execution was simulated using "Axe"--an integrated environment for computation model description, processor architecture specification, discrete-time simulation and automated data collection. Five groups of parameters are measured representing different aspects in the concurrent execution environment: (i) overall measurements, (ii) communication parameters, (iii) cpu utilization, (iv) cpu contention and (v) dependencies between players. Two programs were simulated--a "pipe-line" of players and a "divide-and-conquer" program skeleton. The results showed that program execution time indeed correlated well with some of the parameters measured. It was also shown that "post-game" analysis achieved close to 96% optimal speed-up for both programs in most cases. ------------------------------------------------------------------------------- CSL-TR-87-334 TASK SEQUENCING LANGUAGE FOR SPECIFYING DISTRIBUTED ADA SYSTEMS TSL-1 D.C. Luckham, D.P. Helmbold, S. Meldal, D.L. Bryan & M.A. Haberler June 1987 46 pages.....4.80 TSL-1 is a language for specifying sequences of tasking events occuring in the execution of distributed Ada programs. Such specifications are intended primarily for testing and debugging of Ada tasking programs, although they can also be applied in designing programs. TSL-1 specifications are included in an Ada program as formal comments. They express constraints to be satisfied by the sequences of actual tasking events. An Ada program is consistent with its TSL-1 specifications if its runtime behavior always satisfies them. This paper presents an overview of TSL-1. The features of the language are described informally, and examples illustrating the use of TSL-1, both for debugging and for specification of tasking programs, are given. A definition of robust TSL-1 specifications that takes into account uncertainty in runtime observation of behavior of distributed systems is given. A runtime monitor for checking consistency of an Ada program with TSL-1 specifications has been implemented. In the future, constructs for defining abstract tasks will be added to TSL-1, forming a new language, TSL-2, for the specification of distributed systems prior to their implementation in any particular programming language. ------------------------------------------------------------------------------- CSL-TR-87-335 ALLOCATIONS OF OBJECTS CONSIDERED AS NONDETERMINISTIC EXPRESSIONS - TOWARDS A MORE ABSTRACT AXIOMATICS OF ACCESS TYPES Sigurd Meldal September 1987 21 pages.....$3.55 The concept of access ("reference" or "pointer") values is formalized as parametrized abstract data types, using the axiomatic method of Guttag and Horning as extended by Owe. Two formalizations are given. The first is a formalization of the approach used in the definition of a partial correctness system for Pascal by Hoare and Wirth. Its lack of abstraction is pointed out. This is caused by the annotation language being too expressive. An approach is taken which results in a more abstract system: The expressiveness of the annotation language is reduced and the allocation operator is viewed as a nondeterministic expression. This reinterpretation of the program language results in an appropriate level of abstraction of the proof system. An example is given, verification of a package defining a set type. ------------------------------------------------------------------------------- CSL-TR-87-336 BIBLIOGRAPHY OF THE COMPUTER SYSTEMS LABORATORY, 1968 - 1987 Naomi Schulman October 1987 64 pages.....$5.70 This report lists, in chronological order, all technical reports and notes published by the Computer Systems Laboratory of Stanford University, from 1968 to date. Information regarding availability, prices, and alternative sources is included. ------------------------------------------------------------------------------- CSL-TR-87-337 DESIGN OF TESTBED AND EMULATION TOOLS Michael J. Flynn and Stephen Lundstrom October 1987 42 pages.....$4.60 In order to understand how to predict the performance of concurrent computing systems, an experimental environment is needed. The purpose of the research conducted under the grant was to investigate various aspects of this environment. A first performance prediction system was developed and evaluated (by comparison both with simulations and with actual systems). The creation of a second, complementary system is well underway. ------------------------------------------------------------------------------- CSL-TR-87-338 SPARSE, DISTRIBUTED MEMORY PROTOTYPE: PRINCIPLES OF OPERATION M. J. Flynn, P. Kanerva, B. Ahanin, N. Bhadkamkar, P. Flaherty, and P. Hickey October l987 97 pages.....$7.35 Sparse distributed memory is a generalized random-access memory (RAM) for long (e.g., 1,000 bit) binary words. Such words can be written into and read from the memory, and they can also be used to address the memory. The main attribute of the memory is sensitivity to similarity, meaning that a word can be read back not only by giving the original write address but also by giving one close to it as measured by the Hamming distance between addresses. Large memories of this kind are expected to have wide use in speech and scene analysis, in signal detection and verification, and in adaptive control of automated equipment---in general, in dealing with real-world information in real time. The memory can be realized as a simple, massively parallel computer. Digital technology has reached a point where building large memories is becoming practical. This research project is aimed at resolving major design issues that have to be faced in building the memories. This report describes the design of a prototype memory with 256-bit addresses and from 8K to 128K locations for 256-bit words. A key aspect of the design is extensive use of dynamic RAM and other standard components. ------------------------------------------------------------------------------- CSL-TR-87-339 MIPS-X: THE EXTERNAL INTERFACE Arturo Salz, Anant Agarwal, and Paul Chow October 1987 42 pages.....$4.60 MIPS-X is a 20-MIPS-peak VLSI processor designed at Stanford University. This document describes the external interface of MIPS-X and the organization of the MIPS-X processor system, including the external cache and coprocessors. The external interface has been designed to optimize the paths between the processor, the external cache and the coprocessors. The signals used by the processor and their timing are documented here. Signal use and timings during exceptions and cache misses are also shown. ------------------------------------------------------------------------------- CSL-TR-87-340 STORAGE RECLAMATION MODELS FOR ADA PROGRAMS Geoffrey Mendal October 1987 23 pages.....$3.65 Given Ada's semantics regarding dynamically allocated objects, do programmers believe that storage reclamation is impractical? At first glance, it would appear that given these semantics, one cannot derive workable models for reclaiming all unneeded objects. In reality, Ada provides features that allows programmers to define storage reclamation models that operate at close to 100 percent capacity. This paper describes methods by which Ada programs can reclaim objects. Examples of Ada storage reclamation models are presented along with their associated algorithms. A taxonomy of units that perform storage reclamation is also discussed. ------------------------------------------------------------------------------- CSL-TR-87-341 LINKING PROGRAMS INCREMENTALLY Mark A. Linton and Russell W. Quong December 1987 14 pages.....$3.20 Linking is traditionally a batch process that resolves cross references between object modules and run-time libraries to produce a stand-alone executable image. Because most program changes only involve a small part of the program, we have implemented an incremental linker, named inclink, that processes only the changed modules. Inclink generates a new executable in time proportional to the size of a change; in contrast, a batch linker generates an excutable in time proportional to the size of the program. To minimize updates to the executable, inclink allocates extra space for every module. By allocating 25% of the excutable for overflow space, inclink can update a module in place over 95% of the time. Measurements show that inclink is more than an order magnitude faster than the Unix batch linker and that 85% of all links will take less than two seconds on a MicroVAX-2, independent of program size. ------------------------------------------------------------------------------- CSL-TR-87-342 INTERPROCEDURAL ANALYSIS USELESS FOR CODE OPTIMIZATION S. Richardson and M. Ganapathi November 1987 32 pages.....$4.10 The problem of tracking data flow across procedure boundaries has a long history of theoretical study by people who believed that such information would be useful for code optimization. Building upon previous work, we have implemented an algorithm for interprocedural data flow analysis. The algorithm produces three flow-insensitive summary sets: MOD, USE, and ALIASES. The utility of the resulting information was investigated using an optimizing Pascal compiler. Over a sampling of 27 benchmarks, we found that additional optimizations performed as a result of interprocedural summary information contributed almost nothing to program execution speed. ------------------------------------------------------------------------------- Naomi Schulman COMPUTER SYSTEMS LABORATORY Stanford University Stanford, CA 94305-4055 415-723-1430 EM: CSL.Schulman@Sierra.Stanford.Edu Hours: M-Th, 8-1 ORDER FORM To Order Reports: Print or type your name and address in the space provided. Check or circle the report(s) you wish to purchase, whether hardcopy or microfiche. All orders must be PREPAID. CALIFORNIA RESIDENTS must add sales tax of their local county. Return this order form with your check or money order made payable to Stanford University. FOREIGN ORDERS must be paid with an international money order or a check drawn on a U.S. bank, payable in dollars. Please type or print your name and complete address: ______________________________________________ ______________________________________________ ______________________________________________ ______________________________________________ ______________________________________________ ______________________________________________ Report # Hardcopy Microfiche Report # Hardcopy Microfiche -------- -------- ---------- -------- -------- ---------- TR 333 $6.05 $2.00 TR 338 $7.35 $2.00 TR 334 $4.80 $2.00 TR 339 $4.60 $2.00 TR 335 $3.55 $2.00 TR 340 $3.65 $2.00 TR 336 $5.70 $2.00 TR 341 $3.20 $2.00 TR 337 $4.60 $2.00 TR 342 $4.10 $2.00 Subtotal __________ Your Local CA County tax __________ Total __________ -------