ken@cs.cornell.edu (Ken Birman) (04/16/91)
> From: sc268116@seas.gwu.edu (Kavosh Soltani)
> Subject: ISIS Clarification
> Date: Tue, 16 Apr 91 7:45:04 EDT

> Hi,
> I have sent a copy of the following to comp.sys.isis group; I am not certain
> if you are more equipped to answer some of the questions.

For some reason, the message bounced and I am posting it on your behalf...

> Thank you for the help with installation! I have run the system on 3 SUN 3/80,
> 2 SUN 3/60 and 2 HP 9000 concurrently. It works fine!
> I am preparing a presentation for our advanced operating system course
> at George Washington University. We are evaluating distributed system kits
> available on the workstations and recommending one for future use in the
> institution. I am evaluating ISIS, but have a few questions I could not
> find the answer to in the manual. Could you take a few minutes and fill in
> the answers, if possible? Your help is greatly appreciated....
> PS. We are interested very much in underlying algorithms (not in detail!)
> but it will take more time than I presently have to go through all of
> the source code. You do not need to go into too much detail; for instance
> you could say that you use a UNIX semaphore to implement such and so..

We don't use UNIX semaphores at all. In fact, UNIX doesn't have semaphores.
Pretty much all ISIS uses from UNIX is the ability to send UDP messages and
to find out what time it is!

> - What does ISIS mean? (What's the reason behind the name?)

The name is a reference to an Egyptian myth in which the Goddess Isis
restored Osiris to health after he was torn to pieces by Seth in a battle.
I've posted the whole story in the past -- if you skim old postings to this
group (on ftp.cs.cornell.edu in pub/comp.sys.isis) you should be able to
find it. Basically, we liked the image. Prof. Amr El Abbadi (U.C. Santa
Barbara) suggested the name and gets full credit for a very appropriate
choice!

> - Are the developers of ISIS the same names appearing on the cover page of
> V 2.0 of the manual?

Many people have contributed, but most of the code in the core of the system
was originally written by myself and Tommy Joseph. By now, Tommy has been at
SUN for some time and a lot of his code has been rewritten, so I probably
wrote more than 80% of the actual lines of code in the current V3.0 release.
The rest was written by Robert Cooper, Brad Glade, Frank Schmuck, Alex
Siegel, Mark Steiglitz, Pat Stephenson, Mark Wood, and others.

> - Is the package owned by Cornell?

An early version of ISIS was public domain; not much of the code remains,
but any lines that survive unchanged from ISIS V1.2 are still "public
domain". V1.3 was the first to include a copyright notice.

Version 2.1 of ISIS is the current distribution and is about 75% different
from V1.2 (a rough estimate). For example, all of the bypass code is new,
the entire message library was rewritten, the gbcast protocol was rewritten,
etc. Copyright for ISIS V2.1 is jointly held by the authors; Cornell has
relinquished all rights and the US Government requires only that we
"disclose" our work, which we obviously do.

The new V3.0 release contains a lot of proprietary code that was developed
by ISIS Distributed Systems (IDS) and is not available except under license
from the company. So, IDS "owns" V3.0, or at least the changes and
extensions it made to turn V2.1 into V3.0. These changes are very extensive.

The ISIS Group at Cornell has been granting all reasonable requests to use
ISIS V2.1 in any manner at all, commercial, public, or private. However, we
reserve the right to review any requests that do not fall strictly under the
terms of the copyright notice in the system source. There are no fees for
ISIS V2.1; fees for V3.0 are (we think) reasonable.

> - Does the scheduler take over after isis_start_done?

Actually, the ISIS scheduler takes over whenever all other tasks are blocked.
It also runs when you call isis_accept_events. The code is in clib in
cl_isis.c: run_tasks().

> - Why are these tasks lightweight? (how is overhead small and does UNIX
> provide features for saving the state of a task before switching?)

They share an address space and basically consist of a stack and a set of
registers. UNIX only knows about the "process": even if you have several
tasks running, UNIX will think there is only one program active and won't
distinguish between them. The tasks switch using a co-routine mechanism.
UNIX does provide the feature for saving state: we usually do this with
setjmp/longjmp, which does not involve any system calls on most machines.
So, context switching just involves saving and loading the registers --
roughly the cost of a subroutine call.

> - If our system does pre-emptive processing, and isis grants isis_mutex to
> currently active task, does it mean that, if several instances of an isis
> application (say teller) is run, resources are being wasted?

I don't understand this question.

> - In short, how can you write tasks in UNIX? (What I mean is, does UNIX
> provide system calls so you can tell it which area to use as stack?)

We do this by using malloc to allocate a big stack area, computing an
initial stack frame, using setjmp to figure out the caller's context,
changing the SP and other registers to point to the new frame, then calling
longjmp to jump into the new frame and immediately call the user's top-level
routine. If there is an existing threads package, like cthreads, we just map
our calls into their calls.

Our scheme is such that an active task never "exits". Instead, if the user's
code returns, we put the task on an idle queue to be recycled on the next
request (this saves an extra free/malloc). See cl_task.c for details. It is
really pretty simple code.

> - You mentioned semaphores in introduction to chapter 11 of the manual but
> no further references in the documentation. Do you support them?

ISIS supports a "token" tool, which is described in detail at the end of
Chapter 11. The token tool is basically a distributed binary semaphore. We
don't support general semaphores, but this would make a nice exercise if you
are trying to get the hang of using ISIS.

> - Using bcast, what is the principle of the underlying algorithm that enforces
> virtual synchrony?
> - How is atomicity guarantee implemented (all or nothing delivery)?

You will need to read the papers. The versions of the protocols in the ISIS
protocols servers correspond pretty closely to the paper on "Reliable
communication in the presence of failures" (ACM TOCS 1987). The more recent
work comes from Pat Stephenson's thesis and is based on two papers, one by
Aleta Ricciardi on failure detection using process groups and one by Pat,
myself and Andre Schiper that will appear in ACM TOCS later this year
(Lightweight Process Groups and Group Multicast). In the latter paper the
"principle" is an extension of Lamport's causal clocks, something called a
"vector clock" or "vector timestamp". The atomicity guarantee is implemented
using a flushing technique.

> - What is the exact format of messages? (Are they linked lists or packed
> lists and how do you maintain pointers to each field, name, and type?)

More or less: lists stored within bigger blocks of memory, but no "links".
They look like packed symbol tables: each field has a name, type, and length
and is followed by the data or, if the data is out of line (%*C, for
example), by a pointer to the body. Because messages can share data blocks
and can contain other messages, the implementation is somewhat more complex
than this sounds.

> - We know UNIX calls can block a program. How can we avoid blocking in say
> I/O commands?

You need to call isis_select to find out when I/O is possible, and in this
way avoid making any I/O call that might block.

> - 2-Phase flushing of transactions seem to be interesting. But how do you
> guarantee the data is actually saved?
> (That is, following the second phase,
> one of the processes involved has not crashed before saving.)

We don't. YOU force the data when the prepare-to-commit routine is called.
But note that the transaction tool is not really a "common" interface to
ISIS. Performance is slow compared to non-transactional data management,
because of the log flush the tool does. At any rate, the prepare routine is
supposed to do the flush and then say "yes" if the data is secure and "no"
if not. It is probably a good idea to flush ahead of time, during idle
periods, so that the delay at this point will be as short as possible.

The interface we used is based on something that X/Open is developing,
called XA. We intend to support the XA interfaces, in fact, in a future
release of ISIS.

Actually, I only know of one or two groups that have even played with ISIS
transactions, and although they work, my guess is that IDS will have to
develop a commercial-strength version before many people use ISIS this way,
e.g. to implement replicated databases.

--
Kenneth P. Birman                              E-mail: ken@cs.cornell.edu
4105 Upson Hall, Dept. of Computer Science     TEL: 607 255-9199 (office)
Cornell University Ithaca, NY 14853 (USA)      FAX: 607 255-4428
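
The prepare-to-commit behaviour described in the last answer (flush first,
then vote "yes" only if the data is secure) can be sketched in C roughly as
follows. This is a minimal illustration, not ISIS code: it assumes the
application keeps its own append-only log file, and the names
prepare_to_commit, VOTE_YES and VOTE_NO are hypothetical, not ISIS or XA
identifiers. The actual ISIS callback interface is not shown here.

```c
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

/* Hypothetical vote values; not ISIS or XA identifiers. */
enum vote { VOTE_NO = 0, VOTE_YES = 1 };

/*
 * Hypothetical prepare routine: append one record to the log at `path`
 * and force it to stable storage. Votes "yes" only if every step
 * succeeded; any failure yields a conservative "no".
 */
enum vote prepare_to_commit(const char *path, const char *record)
{
    assert(path != NULL && record != NULL);

    int fd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0600);
    if (fd < 0)
        return VOTE_NO;                 /* cannot open the log at all */

    const char *p = record;
    size_t len = strlen(record);
    while (len > 0) {                   /* cope with short writes */
        ssize_t n = write(fd, p, len);
        if (n <= 0) {                   /* error: data is not secure */
            close(fd);
            return VOTE_NO;
        }
        p += n;
        len -= (size_t)n;
    }

    int synced = (fsync(fd) == 0);      /* the expensive log flush */
    int closed = (close(fd) == 0);
    return (synced && closed) ? VOTE_YES : VOTE_NO;
}
```

Calling fsync on the same log during idle periods, as suggested above, would
leave less unwritten data for the flush at prepare time to force out.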