mosse@himnariki.cs.umd.edu (Daniel Mosse') (12/15/90)
Here is some information on the MARUTI Real-Time OS. It is being developed at the University of Maryland at College Park. This message gives an overview of the system and some of the future work. The current version is a prototype, intended only as a proof of concept, not as a full system.

-----------------------------------------------------------------

MARUTI: A Hard Real-Time Operating System

The MARUTI operating system is designed to support hard real-time applications on a variety of distributed systems while providing fault-tolerant operation. It has an object-oriented design and provides communication and allocation mechanisms that allow transparent use of the resources of a distributed system.

MARUTI supports guaranteed-service scheduling, in which jobs that are accepted by the system are guaranteed to meet the time constraints of the computation requests under the requested degree of fault tolerance. As a consequence, MARUTI applications can be executed in a predictable, deterministic fashion.

The development of hard real-time applications requires that the analyst estimate the resource requirements for all parts of the computation and that the system ensure availability of resources in a timely manner to meet the timing constraints. Since this is a very cumbersome process, a set of tools has been developed as part of the MARUTI system to support hard real-time applications during the various phases of their life cycle.

Structures

MARUTI is an object-oriented system whose unit entity is an object. While the concepts of object and encapsulation have been used in many systems, some extensions have been made in MARUTI in order to incorporate these concepts in a hard real-time system. Objects in MARUTI consist of two main parts: a control part (or joint), and a set of service access points (SAPs), which are entry points for the services offered by an object. A joint is an auxiliary data structure associated with every object.
Each joint maintains information about the object (e.g., computation time, protection and security information) and its requirements (service and resource requirements). Timing information, also maintained in the joint, is dynamic and includes all the temporal relations among objects. Such information is kept in a calendar, a data structure ordered by time, which contains the names of the services that will be executed and the timing information for each execution.

An application is depicted by a collection of objects gathered in a directed acyclic graph (DAG) called the computation graph. The vertices represent the services, and the arcs represent the data or precedence relationships between pairs of vertices. A job is submitted to the system by naming the root of the computation graph. A job may have timing and fault tolerance requirements associated with it.

Objects communicate with one another via semantic links. These links are called semantic because they also perform type and range checking on the values they carry. This permits semantic checks to be implemented as part of the communication link itself. Objects that reside at different sites need agents as their representatives on a remote site. An agent is responsible not only for the remote transmission of messages but also for the external data translation of these messages. Security functions, such as encryption/decryption, can also be specified for the semantic links.

There are two types of jobs in MARUTI, namely real-time and non-real-time. The distinction is justified by the requirements of reactive systems (i.e., those that accept new jobs while executing already-accepted guaranteed jobs). A real-time job is assumed to have a hard deadline and an earliest start time. For non-real-time jobs, no time constraints are specified and, therefore, jobs are executed on the basis of time and resource availability.
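
As a rough illustration of what "time availability" could mean for a calendar ordered by time, the Python sketch below lists the idle intervals left between reserved real-time executions. It is only a sketch under stated assumptions: the names Entry and idle_slots, the integer clock ticks, and the sample services are hypothetical and are not taken from the MARUTI implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Entry:
    """One calendar entry (hypothetical layout): a service and its reserved interval."""
    service: str
    start: int  # inclusive, in clock ticks
    end: int    # exclusive, in clock ticks


def idle_slots(calendar, horizon_start, horizon_end):
    """Return the (start, end) gaps in [horizon_start, horizon_end) not covered
    by any entry. `calendar` must be sorted by start time with no overlaps."""
    gaps = []
    cursor = horizon_start
    for entry in calendar:
        if entry.end <= cursor:
            continue  # entry lies entirely before the point already examined
        if entry.start >= horizon_end:
            break     # entry lies entirely after the horizon
        if entry.start > cursor:
            gaps.append((cursor, entry.start))
        cursor = max(cursor, entry.end)
    if cursor < horizon_end:
        gaps.append((cursor, horizon_end))
    return gaps


# Sample real-time reservations (made-up services and times).
calendar = [
    Entry("sensor_read", 0, 3),
    Entry("control_law", 5, 9),
    Entry("actuate", 9, 10),
]
print(idle_slots(calendar, 0, 15))  # [(3, 5), (10, 15)]
```

In this sketch, the two gaps returned are the only places where a job without time constraints could run without disturbing the reserved executions.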
Priorities can be easily incorporated, for example, by implementing a scheme for the revocation of jobs or a multi-priority queue. We note that non-real-time jobs are assumed to be preemptable so that their processing requirements can be satisfied in the time slots that are available between the executions of real-time jobs.

MARUTI views the distributed resources as organized in various name domains. A name domain contains a mutually exclusive set of resources. This concept is used in the implementation of fault tolerance using replication. In addition, such a division of resources is useful for distribution, load balancing, fault independence, and the feasibility of fault-tolerant schemes.

Approach and Principles

MARUTI is organized in three distinct levels, namely the kernel, the supervisor, and the application levels. The kernel is a collection of core-resident server objects. It is the minimum set of servers needed at execution time and consists of resource manipulators. The functions of the kernel are dispatching, loading, time manipulation, and communication. The main task of the supervisor-level objects is to prepare the jobs for execution by making reservations through the resource manipulators. The services provided at the supervisor level include allocation, schedule verification, binding, login service, and name service.

The object principle and the use of the joints allow each access to an object to be direct, and the binding philosophy of the operating system supports it uniformly. Access to an executing object is an invocation of a particular service of that object. For timing reasons, all access rights are checked at pre-runtime. The resources needed for the execution of the applications are reserved through the resource verifiers at the supervisor level, prior to the start time of the application.
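
The idea of reserving a resource before the start time, and accepting a request only if it can be guaranteed, might be sketched in Python as follows. This is an illustrative first-fit check for a single resource, not the MARUTI allocation algorithm; the Calendar class, its reserve method, and the tick-based times are assumptions made for the example.

```python
import bisect


class Calendar:
    """Time-ordered, non-overlapping reservations for one resource (hypothetical)."""

    def __init__(self):
        self._entries = []  # sorted list of (start, end, service)

    def reserve(self, service, earliest_start, deadline, duration):
        """Reserve the earliest free slot of `duration` ticks that lies within
        [earliest_start, deadline]. Return (start, end), or None to reject."""
        start = earliest_start
        for s, e, _ in self._entries:
            if e <= start:
                continue      # reservation ends before the candidate slot
            if s >= start + duration:
                break         # candidate slot fits before this reservation
            start = e         # overlap: retry right after this reservation
        if start + duration > deadline:
            return None       # the deadline cannot be met, so do not accept
        bisect.insort(self._entries, (start, start + duration, service))
        return (start, start + duration)


cpu = Calendar()
print(cpu.reserve("a", 0, 10, 4))  # (0, 4)
print(cpu.reserve("b", 2, 10, 3))  # (4, 7)
print(cpu.reserve("c", 0, 8, 2))   # None
```

Because a request is either given an exclusive interval or rejected outright, an accepted request never competes for the resource at run time, which is the property the reservation step is meant to provide.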
The communication channels, CPU, memory, disks, and all other necessary resources are made available so that contention is ruled out and timing guarantees can be issued by the system.

The use of joints, and specifically of calendars, allows verification of schedulability, reservation of guaranteed services, and synchronization. Projection mechanisms support propagation of time constraints between different localities. These projections maintain the required event ordering and assure the satisfaction of the timing constraints. Furthermore, these data structures facilitate the management of mutual exclusion and protection mechanisms.

Exceptions and validity tests are reduced after a semantic link is established. The links are created by the binding and loading processes, and the protection mechanisms are activated and authorizations established prior to run-time, to allow direct access afterwards. Semantic links to remote objects are established through their respective agents.

Fault tolerance is an integral part of the design of MARUTI. The joint of each object may implement fault detection, monitoring, local recovery, and reporting. Initially, each joint contains a consistency control mechanism to manage alternatives (redundant objects with state information) or replicas (redundant stateless objects). The resource allocation algorithm supports a user-defined level of fault tolerance, where redundancy can be established temporally (execute again) or physically (parallel execution of alternatives).

A capability-based approach is used for protection and security. This system is completely predefined prior to execution of the jobs. The necessary information for the capabilities is stored in the joint, and the capability itself is furnished by the user.

Jobs in MARUTI are viewed by the system as computation graphs. Tools are provided to assist in the design and verification of applications.
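
To make the computation-graph view concrete, here is a small Python sketch that treats a job as a DAG of services and derives one execution order that respects every precedence arc, starting from the root that names the job. The dictionary representation, the function name execution_order, and the sample services are hypothetical; the sketch uses Kahn's topological sort as a stand-in and does not describe the MARUTI tools themselves.

```python
from collections import deque


def execution_order(graph, root):
    """Return the services reachable from `root`, ordered so that each service
    appears after all of its predecessors.

    `graph` maps each service to the list of services that depend on it.
    Raises ValueError if the reachable part of the graph contains a cycle.
    """
    # Collect the vertices reachable from the root.
    reachable, stack = set(), [root]
    while stack:
        v = stack.pop()
        if v not in reachable:
            reachable.add(v)
            stack.extend(graph.get(v, []))

    # Count incoming arcs within the reachable subgraph.
    indegree = {v: 0 for v in reachable}
    for v in reachable:
        for w in graph.get(v, []):
            indegree[w] += 1

    # Kahn's algorithm: repeatedly emit a vertex with no unmet predecessors.
    ready = deque(sorted(v for v in reachable if indegree[v] == 0))
    order = []
    while ready:
        v = ready.popleft()
        order.append(v)
        for w in graph.get(v, []):
            indegree[w] -= 1
            if indegree[w] == 0:
                ready.append(w)

    if len(order) != len(reachable):
        raise ValueError("computation graph contains a cycle")
    return order


# A made-up job: the arcs point from a service to the services that follow it.
job = {
    "acquire": ["filter", "log"],
    "filter": ["control"],
    "log": [],
    "control": ["actuate"],
    "actuate": [],
}
print(execution_order(job, "acquire"))
# ['acquire', 'filter', 'log', 'control', 'actuate']
```

The cycle check matters because a precedence cycle would make the job impossible to schedule at all, so a verification tool would presumably want to reject it before any reservation is attempted.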
To use the primitives and tools developed, a set of language extensions is required. For that reason, a precompiler is used to convert the MARUTI program into a standard programming language; it also automatically generates the joints. While MARUTI supports many different fault tolerance mechanisms, applications can be written without knowledge of the policy used.

STATUS

A prototype is running at the University of Maryland, College Park. This version supports a distributed environment, with a user-defined level of fault tolerance. The prototype runs using a virtual clock and is dependent on the underlying system calls. Scheduling tools have been developed that show the calendar utilization in a system-wide calendar as well as in local calendars. An interactive display of resource requirements and the computation graph allows users to modify the resource usage for each service. All graphical interfaces are implemented using the X Window System.

CONTACTS:

Olafur Gudmundsson          Internet: ogud@cs.umd.edu
Daniel Mosse'                         mosse@cs.umd.edu
Dept. of Computer Science   Tel:      (301)-405-2767
University of Maryland      FAX:      (301)-405-6707
College Park MD. 20742

REFERENCES, as of November 1990

- MARUTI: A Hard Real-Time Operating System. O. Gudmundsson, D. Mosse, A. K. Agrawala, S. K. Tripathi, Proceedings of Second IEEE Workshop on Experimental Distributed Systems, Huntsville, Oct 1990.
- Invisible Resource Usage in UNIX. O. Gudmundsson, D. Sanghi, A. Agrawala, A. Thareja, CS-TR-2509, 7/90.
- Mission Critical Planning: AI on the MARUTI Real-Time Operating System. J. Hendler, A. Agrawala, CS-TR-2486, 6/90.
- Language Support for Maruti Real-Time System. V. Nirkhe, S. Tripathi, A. Agrawala, CS-TR-2481, 5/90 and 11th IEEE RTSS.
- Evaluation of a Decomposition Approach for Real-Time Scheduling Using a Stochastic Model. X. Yuan, A. Agrawala, CS-TR-2462, 4/90.
- Real Time System Design. A. Agrawala, S. Levi, McGraw Hill, New York, 1990.
- Scheduling Real-Time Tasks in Single Schedule Subsets. X. Yuan, A. Agrawala, CS-TR-2347, 11/89.
- Decomposition with a Strongly-Leading Relation for Hard Real-Time Scheduling. X. Yuan, A. Agrawala, CS-TR-2346, 11/89.
- A Decomposition Approach to Nonpreemptive Real-Time Scheduling. X. Yuan, A. Agrawala, CS-TR-2345, 12/89.
- Resource Allocation for Fault Tolerant Systems Using External Backups. Y. Huang, S. Tripathi, CS-TR-2343, 11/89.
- Mission Critical Operating Systems Requirements and the MARUTI Project. A. Agrawala, O. Gudmundsson, D. Mosse, CS-TR-2342, 11/89.
- MARUTI: A Platform for Hard Real-Time Applications. A. Agrawala, S. Tripathi, O. Gudmundsson, D. Mosse, K. Ko, 1989 Workshop on Operating Systems for Mission Critical Computing, September 19-21, 1989.
- Synchronization in Hard Real-Time Systems. V. Nirkhe, S. Tripathi, CS-TR-2337, 10/89.
- MARUTI: An Environment for Hard Real-Time Applications. O. Gudmundsson, D. Mosse, K. Ko, A. Agrawala, S. Tripathi, CS-TR-2328, 10/89.
- The MARUTI Hard Real-Time Operating System. A. Agrawala, S. Tripathi, S. Carson, S. Levi, ACM Operating Systems Review, Volume 23, Number 3, July 1989.
- Real-Time Scheduling with Both Preemption and Nonpreemption Requirements. X. Yuan, A. Agrawala, CS-TR-2248, 4/89.
- An Efficient Communication Structure for Decentralized Algorithms with Fault Tolerance. S. Yuan, A. Agrawala, CS-TR-2206, 2/89.
- Allocation of Real-Time Computations under Fault Tolerance Constraints. S. Levi, D. Mosse, A. Agrawala, IEEE Real-Time Systems Symposium, Huntsville, AL, Dec 1988. CS-TR-2018, 5/88.
- A Methodology for Designing Distributed, Fault-Tolerant, Reactive, Real-Time Operating Systems. S. Levi, Ph.D. Dissertation, 1988.
- Objects Architecture for Real-Time Operating Systems. A. Agrawala, S. Levi, IEEE Workshop on Real-Time Operating Systems, pp 142-148, Cambridge, MA, July 1987.
- A Structuring Framework for Distributed Operating Systems. J. Nehmer, CS-TR-2079, 7/88.
- Introducing the MARUTI Hard Real-Time Operating System. S. Levi, A. Agrawala, S. Tripathi, CS-TR-2010, 4/88.
- An Object Architecture for Hard Real-Time Systems. J. Nehmer, CS-TR-2003, 3/88.
- Scheduling Tasks in a Real-Time System. P. Chintamaneni, X. Yuan, S. Tripathi, A. Agrawala, CS-TR-1991, 2/88.
- Scheduling in Real-Time Distributed Systems-A Review. X. Yuan, S. Tripathi, A. Agrawala, CS-TR-1955, 12/87.
- Temporal Relations and Structures in Real-Time Operating Systems. S. Levi, A. Agrawala, CS-TR-1954, 12/87.
- On Fault Tolerance in Manufacturing Systems. Y. Shieh, S. Tripathi, P. Chintamaneni, P. Jalote, CS-TR-1939, 10/89.
- An Analysis of a Buddy System for Fault Tolerance. D. Finkel, S. Tripathi, CS-TR-1924, 8/87.
- Objects Architecture: A Comprehensive Design Approach for Real-Time Distributed, Fault-Tolerant, Reactive Operating Systems. S. Levi, A. Agrawala, CS-TR-1915, 8/87.
- On Real-Time Systems Using Local Area Networks. S. Levi, S. Tripathi, CS-TR-1892, 7/87.
- On Real-Time Operating Systems. S. Levi, A. Agrawala, CS-TR-1838, 4/87.
- Real-Time Programs: Design Implementation and Validation-A Survey. S. Levi, A. Agrawala, CS-TR-1837, 4/87.

--
Daniel Mosse'                 voice:  (301) 405-2723
Dept of Computer Science      e-mail: UUCP:     uunet!mimsy!mosse
University of Maryland                Internet: mosse@cs.umd.edu
College Park, MD 20742