masticol@cadenza.rutgers.edu (Steve Masticola) (10/25/90)
Can anyone help out with a reference to "firewalls"? A professor here says they're a modularization structure which is intended to stop the spread of error effects within a software system. Unfortunately, he doesn't remember where he saw the reference. Thanks for your help! - Steve (masticol@athos.rutgers.edu)
strom@arnor.uucp (10/26/90)
In article <Oct.24.22.04.05.1990.393@cadenza.rutgers.edu>, masticol@cadenza.rutgers.edu (Steve Masticola) writes:
|> Can anyone help out with a reference to "firewalls"? A professor here
|> says they're a modularization structure which is intended to stop the
|> spread of error effects within a software system. Unfortunately, he
|> doesn't remember where he saw the reference.
|>
|> Thanks for your help!
|>
|> - Steve (masticol@athos.rutgers.edu)

I don't know if this is the reference intended by your professor, but I used
this term in my paper with Shaula Yemini, ``Typestate: A Programming Language
Concept for Enhancing Software Reliability'' (IEEE Trans. Software Eng.,
SE-12, 1, January 1986).

As you point out, a firewall is a form of protection. If programs A and B
are running together on the same machine, and program A contains an error,
it is desirable to confine the effects of the error so that program B
is not affected. Having firewalls improves reliability (since program
B will still work), security (since program A cannot sabotage program B),
and problem determination (if program A misbehaves, I need not search
in B for the possible cause). All other things being equal, the
finer-grained your firewalls, the better.

The most common firewalls are the address spaces provided by operating
systems---e.g. UNIX processes, Mach tasks, VM ``virtual machines''.
These firewalls are effective, but heavyweight. Communication across
address spaces is more expensive than communication within address
spaces. It would not be practical to put each *module* of a large
software system in a separate address space.

We would ideally like firewalls to have a granularity as small as a single
module, but without any performance penalty. Our approach is to
rely on compile-time checking to detect those kinds of programming
errors which, if undetected, would result in undefined,
implementation-dependent side-effects on other modules.

Conventional type-checking is inadequate, since
many program bugs are the result of issuing otherwise correct operations
*in the wrong order* --- e.g. storing into a buffer before it has been
allocated. *Typestate* checking is a dataflow technique which
statically identifies which subset of operations on a particular data
object are legal at which program points. When an operation is
issued from an incorrect context, it is flagged as an error.

We incorporated typestate checking in an experimental language called
NIL (Network Implementation Language), and in Hermes. In these languages,
any module which successfully compiles is guaranteed at execution time
not to corrupt other modules, even if these modules are running in the
same address space. We thus get the effect of fine-grained ``firewalls''
between modules without the performance penalty. Modules belonging
to different applications can safely coexist in one address space,
but can communicate as cheaply as modules of the same application.
As an additional benefit, we catch an additional class of programming
errors at compile-time.

What do you give up to get typestate checking? First, typestate checking
requires that any aliasing of variables be detectable by the compiler.
As has already been discussed on comp.lang.misc, Hermes is a pointer-free
and hence alias-free language, so this requirement was already met.
Other languages would have to be restricted to meet this requirement.

Second, you have to structure your program to avoid ambiguous typestates.
For example, a variable which is initialized on some paths to a
statement and uninitialized on other paths would have to be explicitly
declared as a variant.

In our experience, the potential for
error detection, the gain in efficiency of cross-application communication,
and the security/reliability/debugging advantages of firewalls are
benefits which are well worth the costs.

--
Rob Strom, strom@ibm.com, (914) 784-7641
IBM Research, 30 Saw Mill River Road, P.O. Box 704, Yorktown Heights, NY 10958
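The typestate idea described in the post above can be sketched in a few lines of Python. This is only an illustration of the general technique, not the NIL or Hermes algorithm: the two states, the `RULES` table, and the `check` and `merge` names are all assumptions made for the example. The checker walks a straight-line program without executing it and flags any operation issued in the wrong typestate, such as storing into a buffer before it has been allocated. `merge` shows why a variable that is initialized on one path and not on another has no single typestate at the join.

```python
from enum import Enum


class State(Enum):
    UNALLOCATED = "unallocated"
    ALLOCATED = "allocated"


# Hypothetical rule table: operation -> (typestate required, typestate afterwards).
RULES = {
    "allocate": (State.UNALLOCATED, State.ALLOCATED),
    "store": (State.ALLOCATED, State.ALLOCATED),
    "free": (State.ALLOCATED, State.UNALLOCATED),
}


def check(program):
    """Walk a straight-line program of (operation, variable) pairs without
    running it and return a list of typestate error messages."""
    states = {}
    errors = []
    for lineno, (op, var) in enumerate(program, start=1):
        required, afterwards = RULES[op]
        current = states.get(var, State.UNALLOCATED)
        if current is not required:
            errors.append(
                f"line {lineno}: {op}({var}) needs {required.value}, "
                f"but {var} is {current.value}"
            )
            continue  # a rejected operation leaves the typestate unchanged
        states[var] = afterwards
    return errors


def merge(on_one_path, on_other_path):
    """Typestate where two control-flow paths join; None means ambiguous."""
    return on_one_path if on_one_path is on_other_path else None


if __name__ == "__main__":
    in_order = [("allocate", "buf"), ("store", "buf"), ("free", "buf")]
    out_of_order = [("store", "buf"), ("allocate", "buf")]
    print(check(in_order))
    print(check(out_of_order))
    print(merge(State.ALLOCATED, State.UNALLOCATED))
```

Each operation in the sketch is individually well-typed; only the order is wrong, which is the class of error the post says conventional type-checking misses. A real checker would also need the alias information the post mentions, since two names for one buffer would defeat the per-variable table.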
nick@cs.edinburgh.ac.uk (Nick Rothwell) (10/26/90)
In article <1990Oct25.193935.375@arnor.uucp>, strom@arnor.uucp writes:
> Conventional type-checking is inadequate, since
> many program bugs are the result of issuing otherwise correct operations
> *in the wrong order* --- e.g. storing into a buffer before it has been
> allocated.

Just a minor point here - that's a fault with conventional procedural
languages with assignable variables, and nothing to do with typechecking.
Functional and logic languages don't have this problem at all.

> In our experience, the potential for
> error detection, the gain in efficiency of cross-application communication,
> and the security/reliability/debugging advantages of firewalls are
> benefits which are well worth the costs.

--
Nick Rothwell, Laboratory for Foundations of Computer Science, Edinburgh.
nick@lfcs.ed.ac.uk <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
"Now remember - and this is most important - you must think in Russian."
gudeman@cs.arizona.edu (David Gudeman) (10/27/90)
In article <961@skye.cs.ed.ac.uk> nick@cs.edinburgh.ac.uk (Nick Rothwell) writes:
]In article <1990Oct25.193935.375@arnor.uucp>, strom@arnor.uucp writes:
]> many program bugs are the result of issuing otherwise correct operations
]> *in the wrong order*
]Just a minor point here - that's a fault with conventional procedural
]languages with assignable variables, and nothing to do with typechecking.
]Functional and logic languages don't have this problem at all.

First, some functional languages and the only well-known logic
language (Prolog) certainly _do_ have this problem. If you do things
in the wrong order, you may get peculiar results like non-termination.

Second, the implication that this would show some sort of superiority
for non-procedural languages is just wrong. I could just as well say
"most problems in functional programs are caused by composing the
right functions in the wrong ways -- procedural languages don't have
this problem since you can't compose functions." or "most problems in
logic languages are caused by unifying otherwise correct terms with
the wrong variables -- procedural languages don't have this problem
since they don't have unification".

Basically someone screwed-up somewhere, and changing the paradigm
just changes the nature of the screw-ups, not the quantity or the
quality. This is not to say that changing the language itself can't
reduce errors, just that there is no reason to assume that functional
or logic languages by the fact of being function-based or predicate
logic-based reduce errors.

To be sure, functional and logic languages tend to be higher-level
than procedural languages, and higher-level languages seem to reduce
errors, but this matter of semantic level is independent of whether
the language is based on functions, predicates, procedures, or some
combination of the above.
--
David Gudeman
Department of Computer Science
The University of Arizona gudeman@cs.arizona.edu
Tucson, AZ 85721 noao!arizona!gudeman
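The first point in the post above, that order still matters without assignment, can be shown with a small sketch. It is written in Python rather than Prolog, so it is only an analogy to clause ordering; the function names `member_ok` and `member_bad` are hypothetical. Both functions combine the same two tests and neither assigns to a variable. They differ only in which test is tried first.

```python
def member_ok(x, xs):
    # Emptiness test first: an empty list ends the search before any recursion.
    return bool(xs) and (xs[0] == x or member_ok(x, xs[1:]))


def member_bad(x, xs):
    # Same tests, recursive one first. Slicing an empty list yields another
    # empty list, so the recursion never reaches the emptiness test.
    return member_bad(x, xs[1:]) or (bool(xs) and xs[0] == x)


if __name__ == "__main__":
    print(member_ok(3, [1, 2, 3]))
    try:
        member_bad(3, [1, 2, 3])
    except RecursionError:
        print("member_bad did not terminate normally")
```

In CPython the runaway recursion is cut off by the interpreter's recursion limit and surfaces as `RecursionError`; in a language without such a limit the second definition would simply fail to terminate.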
zed@mdbs.uucp (Bill Smith) (10/28/90)
In article <1990Oct25.193935.375@arnor.uucp> strom@andreadoria.watson.ibm.com (Rob Strom) writes:
>
>In article <Oct.24.22.04.05.1990.393@cadenza.rutgers.edu>, masticol@cadenza.rutgers.edu (Steve Masticola) writes:
>|> Can anyone help out with a reference to "firewalls"? A professor here
>|> says they're a modularization structure which is intended to stop the
>|> spread of error effects within a software system. Unfortunately, he
>|> doesn't remember where he saw the reference.
>|>
>|> Thanks for your help!
>|>
>|> - Steve (masticol@athos.rutgers.edu)
>
>I don't know if this is the reference intended by your professor, but I used
>this term in my paper with Shaula Yemini, ``Typestate: A Programming Language
>Concept for Enhancing Software Reliability'' (IEEE Trans. Software Eng.,
>SE-12, 1, January 1986).
>
>As you point out, a firewall is a form of protection. If programs A and B
>are running together on the same machine, and program A contains an error,
>it is desirable to confine the effects of the error so that program B
>is not affected. Having firewalls improves reliability (since program
>B will still work), security (since program A cannot sabotage program B),
>and problem determination (if program A misbehaves, I need not search
>in B for the possible cause).

I find this an excellent philosophy for life under pressure although I
haven't empirically determined its worth. It might be too rigid for me,
personally. (Of course, I'm too rigid for me, personally, too. ;-)

The idea I mean is that each person has to have their own private self
that they keep alone from anyone else. In this way, they become one
with their private self and are able to keep it in whatever shape they
want to. "Give me some space." is a slang equivalent to "You're trying
to beat on my firewall. Let me keep myself reliable so that I'm able to
be of use to you when you need me. I still love you, but everyone has
their limits too."

>All other things being equal, the
>finer-grained your firewalls, the better.

Each firewall should be set according to the desires of the one they
protect. If a program doesn't work with big, cement firewalls, then it
shouldn't have them. "Tear down the walls." But, the walls have to be
there until everything works the way the program requires.

>We would ideally like firewalls to have a granularity as small as a single
>module, but without any performance penalty. Our approach is to
>rely on compile-time checking to detect those kinds of programming
>errors which, if undetected, would result in undefined,
>implementation-dependent side-effects on other modules.

Well, run time checking will always be necessary. Someone might spill
some Coke on the motherboard. (or spill some coke down your nose... ;-)
You don't know what might happen, so it's hard to be able to have a
program without some run-time firewalls, some boundaries that are always
set to prevent it from hurting itself by reformatting the hardware.
This is the essence of fault tolerance.

>Conventional type-checking is inadequate, since
>many program bugs are the result of issuing otherwise correct operations
>*in the wrong order* --- e.g. storing into a buffer before it has been
>allocated.

A program has to be willing to work with the operating system first,
then with itself, not the other way around. If you don't know your own
software, how can you be sure how you relate to the operating system?
I think I know myself, but do I really? How can I KNOW myself? I don't
know what I'll do next. I pray that it will be good for me and good for
others, but I must take a leap of faith that the OS has been designed by
a good OS development team. Even if there is "proof" that my program
and the OS work together, I'll still have to take a leap of faith to
accept the proof, the proof system or whatever it is that makes me sure.

>*Typestate* checking is a dataflow technique which
>statically identifies which subset of operations on a particular data
>object are legal at which program points.

Personifying this technique, what I'll do is (somehow) find out ahead of
time a set of rules that each person (program) needs to keep his or her
firewalls in good shape. These rules will have to be appropriate for a
particular data object (person) and chosen by that person (program
points). The typestate (person) is up to the situation at a given time
to follow to maintain reality checks (assertions in CS lingo) so that if
a violation occurs somewhere, the firewall may be activated in its pure
force to support a safe landing and prevent program (program) crashes.

>When an operation is
>issued from an incorrect context, it is flagged as an error.

I will complain if you hit me, but is it an error? Only by examining
the context will we be able to know for sure. Until you are able to
read my mind without even looking at me (I hope you will soon) you won't
know my context.

>We incorporated typestate checking in an experimental language called
>NIL (Network Implementation Language), and in Hermes. In these languages,
>any module which successfully compiles is guaranteed at execution time
>not to corrupt other modules, even if these modules are running in the
>same address space. We thus get the effect of fine-grained ``firewalls''
>between modules without the performance penalty. Modules belonging
>to different applications can safely coexist in one address space,
>but can communicate as cheaply as modules of the same application.
>As an additional benefit, we catch an additional class of programming
>errors at compile-time.

Wow! I am impressed with your creativity. Please do not send me any
information about the implementation details of your project so that I
am not obligated to IBM for possible infringement unless you have
patented these ideas.

>What do you give up to get typestate checking? First, typestate checking
>requires that any aliasing of variables be detectable by the compiler.
>As has already been discussed on comp.lang.misc, Hermes is a pointer-free
>and hence alias-free language, so this requirement was already met.
>Other languages would have to be restricted to meet this requirement.

What is required is self-discipline not laws and restrictions.

>Second, you have to structure your program to avoid ambiguous typestates.
>For example, a variable which is initialized on some paths to a
>statement and uninitialized on other paths would have to be explicitly
>declared as a variant.

Life does not seem to have these requirements. All names (aliases)
belong to God who prevents the confusion that could result from literal
interpretation of each sound uttered.

>In our experience, the potential for
>error detection, the gain in efficiency of cross-application communication,
>and the security/reliability/debugging advantages of firewalls are
>benefits which are well worth the costs.

Ambiguity is inherent in life. Life is without price.

>Rob Strom, strom@ibm.com, (914) 784-7641
>IBM Research, 30 Saw Mill River Road, P.O. Box 704, Yorktown Heights, NY 10958

Bill Smith
pur-ee!mdbs!zed
[Specific disclaimer: The use I would like to put these ideas is not part
of any project planned by the management of mdbs Inc.]
nick@cs.edinburgh.ac.uk (Nick Rothwell) (10/29/90)
In article <26865@megaron.cs.arizona.edu>, gudeman@cs.arizona.edu (David Gudeman) writes:
> In article <961@skye.cs.ed.ac.uk> nick@cs.edinburgh.ac.uk (Nick Rothwell) writes:
> ]In article <1990Oct25.193935.375@arnor.uucp>, strom@arnor.uucp writes:
> ]> many program bugs are the result of issuing otherwise correct operations
> ]> *in the wrong order*
>
> ]Just a minor point here - that's a fault with conventional procedural
> ]languages with assignable variables, and nothing to do with typechecking.
> ]Functional and logic languages don't have this problem at all.
>
> First, some functional languages and the only well-known logic
> language (Prolog) certainly _do_ have this problem. If you do things
> in the wrong order, you may get peculiar results like non-termination.

Ok, I stand corrected. The way I read the original article is that there
are problems with referring to variables which are unassigned or which
go out of scope (dangling pointers and the like). Higher-level languages
don't have these problems. But, yes, you still have to "do things in the
right order" in the sense you mean.

> Basically someone screwed-up somewhere, and changing the paradigm
> just changes the nature of the screw-ups, not the quantity or the
> quality. This is not to say that changing the language itself can't
> reduce errors, just that there is no reason to assume that functional
> or logic languages by the fact of being function-based or predicate
> logic-based reduce errors.

True, but I think the fact that assignment is non-existent (or kept
to a minimum), the fact that the languages are garbage-collected and
heap-safe, make them "better" in this sense. That isn't an argument
about which paradigm to use (and that wasn't really the impression I
wanted to give).

> David Gudeman

Nick.
--
Nick Rothwell, Laboratory for Foundations of Computer Science, Edinburgh.
nick@lfcs.ed.ac.uk <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
"Now remember - and this is most important - you must think in Russian."
gudeman@cs.arizona.edu (David Gudeman) (10/30/90)
In article <1087@skye.cs.ed.ac.uk> nick@cs.edinburgh.ac.uk (Nick Rothwell) writes:
]Ok, I stand corrected. The way I read the original article is that there
]are problems with referring to variables which are unassigned or which
]go out of scope (dangling pointers and the like). Higher-level languages
]don't have these problems...
If that's what you meant, then I agree, assuming we are using the same
definition of "higher-level" (but I don't think we are). When you say
]...the fact that assignment is non-existent (or kept
]to a minimum),
I suspect that you have a mental linkage between the term
"higher-level" and the term "applicative". You aren't alone in this,
but I think there are two distinct concepts there, and they should be
kept separate. I _will_ agree with
]the fact that the languages are garbage-collected and
]heap-safe, make them "better" in this sense.
since if you don't have automatic storage management, then you are
extremely limited in the types of first-class objects you can have.
In fact, I am tempted to define "higher-level" in terms of the
built-in data types the language supports.
--
David Gudeman
Department of Computer Science
The University of Arizona gudeman@cs.arizona.edu
Tucson, AZ 85721 noao!arizona!gudeman
nick@cs.edinburgh.ac.uk (Nick Rothwell) (10/30/90)
In article <26931@megaron.cs.arizona.edu>, gudeman@cs.arizona.edu (David Gudeman) writes:
> In article <1087@skye.cs.ed.ac.uk> nick@cs.edinburgh.ac.uk (Nick Rothwell) writes:
> I suspect that you have a mental linkage between the term
> "higher-level" and the term "applicative". You aren't alone in this,
> but I think there are two distinct concepts there, and they should be
> kept separate.

You're probably right; that's because the properties I associate with
higher level languages (less restrictions on built-in datatypes, first-class
status of data objects, extensible types, heap security, abstraction,
interfaces, modularisation and so on) are mostly seen in applicative
languages. I'm sure that a non-applicative language could support these
properties as well, but I'm not aware of one (although Eiffel comes close,
I suppose, and Modula-3, although it's fairly conventional).

> since if you don't have automatic storage management, then you are
> extremely limited in the types of first-class objects you can have.
> In fact, I am tempted to define "higher-level" in terms of the
> built-in data types the language supports.

... and how it allows them to be used (as arguments, results, via
polymorphism, in abstractions, and so on). I'd also judge the level of the
language by the sophistication, flexibility, and *soundness* of the type
system (which excludes a lot of languages).

Note that I've refrained from mentioning "pointers"... :-)

> David Gudeman

--
Nick Rothwell, Laboratory for Foundations of Computer Science, Edinburgh.
nick@lfcs.ed.ac.uk <Atlantic Ocean>!mcsun!ukc!lfcs!nick
~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~ ~~
"Now remember - and this is most important - you must think in Russian."
strom@arnor.uucp (11/01/90)
In article <1132@skye.cs.ed.ac.uk>, nick@cs.edinburgh.ac.uk (Nick Rothwell) writes:
|> In article <26931@megaron.cs.arizona.edu>, gudeman@cs.arizona.edu (David Gudeman) writes:
|> > In article <1087@skye.cs.ed.ac.uk> nick@cs.edinburgh.ac.uk (Nick Rothwell) writes:
|> > I suspect that you have a mental linkage between the term
|> > "higher-level" and the term "applicative". You aren't alone in this,
|> > but I think there are two distinct concepts there, and they should be
|> > kept separate.
|>

I agree with David. Applicative languages are assignment-free languages
based on function application. High-level languages are languages in
which machine representations are hidden from the programmer --- the
compiler is free to choose data representations and to apply
``aggressive optimizations''. Low-level languages retain ``performance
transparency'' --- the property that the reader of the source program
can determine what the implementation will be doing at least to the
degree that performance can be estimated. Applicative languages and
imperative languages can be either high or low level.

|> You're probably right; that's because the properties I associate with
|> higher level languages (less restrictions on built-in datatypes, first-class
|> status of data objects, extensible types, heap security, abstraction,
|> interfaces, modularisation and so on) are mostly seen in applicative
|> languages. I'm sure that a non-applicative language could support these
|> properties as well, but I'm not aware of one (although Eiffel comes close,
|> I suppose, and Modula-3, although it's fairly conventional).
|>

Hermes meets all the requirements that you list.

(1) All datatypes, including builtin datatypes, are machine-independent.
    Word size, structure layout, bit/byte order, etc. do not show
    through. The implementation is free to use clever representations
    (e.g. storing only a single copy of a large structure). I am
    assuming that this is what you meant by "less restrictions on
    builtin datatypes".
(2) All types are first-class. That is, they can be put in tables, sent
    in messages, passed as parameters, etc.
(3) There is a type-definition mechanism and a powerful set of type
    constructors for tuples, tables, call-messages, etc.
(4) Storage leaks are avoided through typestate checking (see my earlier
    posting in this thread). This guarantees that all objects are
    finalized on termination of a process.
(5) Super-lightweight processes provide "abstraction, interfaces,
    modularisation and so on".

Hermes is strictly imperative. Each process has variables and
assignments. These variables are not visible to other processes,
however.

Wirth, who developed Pascal and was a designer and advisor for the
Modula-n efforts, supported performance transparency and opposed complex
compilers. Our research group is exploring the opposite philosophy. We
believe that hiding low-level details makes programming easier, and
programs more portable. We conjecture that starting from a more
abstract model facilitates optimizations that will generate efficient
implementations on diverse target platforms. We're willing to give up
performance transparency (most of the time) in exchange for performance.
We are currently exploring some ``aggressive optimizations'' for
distributed and multithreaded environments. An example is transparent
process replication to increase concurrency and reduce communications
costs of a process which at the source level looks like a performance
bottleneck.

Other imperative languages (e.g. SETL, CLU, APL) are high-level. I
therefore disagree that high-level in practice implies applicative.

|> > since if you don't have automatic storage management, then you are
|> > extremely limited in the types of first-class objects you can have.
|> > In fact, I am tempted to define "higher-level" in terms of the
|> > built-in data types the language supports.
|>
|> ... and how it allows them to be used (as arguments, results, via
|> polymorphism, in abstractions, and so on). I'd also judge the level of the
|> language by the sophistication, flexibility, and *soundness* of the type
|> system (which excludes a lot of languages).
|>

I understand *soundness* for inferencing/checking techniques but can you
elaborate on what it means for the type system itself to be sound?

|> Note that I've refrained from mentioning "pointers"... :-)
|>
|> > David Gudeman
|>
|> --
|> Nick Rothwell, Laboratory for Foundations of Computer Science, Edinburgh.
|> nick@lfcs.ed.ac.uk <Atlantic Ocean>!mcsun!ukc!lfcs!nick

--
Rob Strom, strom@ibm.com, (914) 784-7641
IBM Research, 30 Saw Mill River Road, P.O. Box 704, Yorktown Heights, NY 10958
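The remark in the post above about "storing only a single copy of a large structure" can be made concrete with a copy-on-write sketch. This is one possible hidden representation, written in Python purely as an illustration; nothing here is taken from the Hermes implementation, and the `Table` class and its methods are hypothetical names. To the programmer each `Table` behaves as an independent value; underneath, copies share storage until one of them is modified.

```python
class Table:
    """A list-like value whose copies share storage until one is modified."""

    def __init__(self, items=()):
        # A cell holds [storage, number of Table values currently sharing it].
        self._cell = [list(items), 1]

    def copy(self):
        # Logically a full copy; physically just another reference to the cell.
        clone = Table.__new__(Table)
        clone._cell = self._cell
        self._cell[1] += 1
        return clone

    def get(self, index):
        return self._cell[0][index]

    def set(self, index, value):
        if self._cell[1] > 1:
            # Storage is shared: detach a private copy before writing.
            self._cell[1] -= 1
            self._cell = [list(self._cell[0]), 1]
        self._cell[0][index] = value

    def shares_storage_with(self, other):
        # Exposed only so the sketch can demonstrate the hidden sharing.
        return self._cell is other._cell


if __name__ == "__main__":
    a = Table([1, 2, 3])
    b = a.copy()
    print(a.shares_storage_with(b))
    b.set(0, 99)
    print(a.get(0), b.get(0), a.shares_storage_with(b))
```

The sketch never decrements the share count when a copy is discarded, so at worst it makes an unnecessary private copy. It also illustrates the trade-off the post names: the cost of `set` depends on hidden sharing, so performance transparency is given up in exchange for cheap copies.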