[comp.software-eng] Reuse, Tolerance, and Tradeoffs

cwk@ORCRIST.GANDALF.CS.CMU.EDU (Charles Krueger) (02/06/91)

In article <785@caslon.cs.arizona.edu>, dave@cs.arizona.edu (Dave P. Schaumann) writes:
 
> Ok, so I stand corrected.  There are a lot of applications out there that
> don't need exact answers.  But my question was in the context of code re-use.
> ...
> I can't see how the "tolerance paradigm" could possibly lead to a reasonable
> means of code re-use.

There are some dimensions in reusable artifacts in which tolerance makes
sense.  In particular, there are tradeoffs in many reusable artifacts, and
these tradeoffs can involve tolerance.  As a simple example, a sine routine
is never a pure mathematical sine function.  It's just sort of "siney".
As a reuser, I would like to decide how much computation time I am willing
to spend in order to make it more siney.  I can dedicate arbitrarily large
amounts of time if my tolerance for error is small, or vice versa.
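The sine tradeoff can be made concrete.  A minimal Python sketch (the
function name and term counts are mine, purely illustrative): a truncated
Taylor series gets "more siney" the more terms you are willing to pay for.

```python
import math

def siney(x, terms=5):
    """Truncated Taylor series for sine: more terms cost more
    computation but shrink the error -- tunable 'sineyness'."""
    # Reduce the argument to [-pi, pi] so the series converges quickly.
    x = math.remainder(x, 2.0 * math.pi)
    total, term = 0.0, x
    for n in range(terms):
        total += term
        # Next odd-power term: multiply by -x^2 / ((2n+2)(2n+3)).
        term *= -x * x / ((2 * n + 2) * (2 * n + 3))
    return total
```

With terms=2 the error for siney(1.0) is roughly 8e-3; with terms=6 it
drops below 1e-9.  The reuser picks the point on the time/accuracy curve.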

With your stack example, I can imagine some flexibility in the stack
semantics, based on tradeoffs with other constraints.  For example, maybe
in the "pure" definition of a stack, an error is generated when you try to
pop an empty stack.  However, in my application I can guarantee that an
empty stack will never be popped, so I don't want to pay the overhead of
checking for an empty stack.  By omitting this check, the artifact is no
longer a pure stack, but just sort of stacky.

The key to making this work, of course, is to keep the tolerance and the
variability in the artifacts well defined.  The person reusing the
artifact must clearly know what can be varied and the effect on the
semantics.  Abstraction is the key to presenting this information to the
reuser.  It is a hard problem.

weide@elephant.cis.ohio-state.edu (Bruce Weide) (02/07/91)

In article <11803@pt.cs.cmu.edu> cwk@ORCRIST.GANDALF.CS.CMU.EDU (Charles Krueger) writes:

>With your stack example, I can imagine some flexibility in the stack
>semantics, based on tradeoffs with other constraints.  For example, maybe
>in the "pure" definition of a stack, an error is generated when you try to
>pop an empty stack.  However, in my application I can guarantee that an
>empty stack will never be popped, so I don't want to pay the overhead of
>checking for an empty stack.  By omitting this check, the artifact is no
>longer a pure stack, but just sort of stacky.
>
>The key to making this work, of course, is to keep the tolerance and the
>variability in the artifacts well defined.  The person reusing the
>artifact must clearly know what can be varied and the effect on the
>semantics.  Abstraction is the key to presenting this information to the
>reuser.  It is a hard problem.

Charles, given the second paragraph above (with which I agree if I
understand it) I can't see how you draw the conclusion of the first
one.  In what sense is a stack design, in which no particular error is
generated when an empty stack is popped, no longer a "pure stack"?  As
you point out, this is the RIGHT way to design a most-reusable stack
component.  If some client wants an error message upon popping an
empty stack, he/she can layer a "defensive stack" design on top of
yours.  That way the rest of us don't have to pay for unnecessary
tests for emptiness.  (By the way, Bertrand Meyer and our group have
been saying this for a long time, but most people still don't buy it.
I'm not sure why.)

Maybe this just illustrates the difficulty with ambiguous natural
language descriptions of components.  Charles thinks "pure stack"
means one thing, I think it means something else.  Sheesh, if we can't
even agree on what STACKS are, how likely is it that we will agree on
what more complex components should act like, just from their names?
Guess we'd better consider writing some specifications for these
things...

This reminds me of a paper we got back recently.  One of the reviewers
claimed that our formal specification of a stack component was "wrong"
because it didn't include any a priori bound on how many elements a
stack could contain :-).  Fortunately, the editor did not pay much
attention to this comment.

	-Bruce

nancy@murphy.ICS.UCI.EDU (Nancy Leveson) (02/08/91)

	 In what sense is a stack design, in which no particular error is
	 generated when an empty stack is popped, no longer a "pure stack"?  As
	 you point out, this is the RIGHT way to design a most-reusable stack
	 component.  If some client wants an error message upon popping an
	 empty stack, he/she can layer a "defensive stack" design on top of
	 yours.  That way the rest of us don't have to pay for unnecessary
	 tests for emptiness.  (By the way, Bertrand Meyer and our group have
	 been saying this for a long time, but most people still don't buy it.
	 I'm not sure why.)

Perhaps because many people do not feel that tests for emptiness are
"unnecessary."  Most serious accidents caused by software that I have
heard about result from nonrobust software -- software that makes
assumptions about the environment that turn out to be incorrect, or that
cease to hold while the software is in use.  At the least, the software's
response should be defined (e.g., put out an error message) rather than
random.

This seems to be even MORE important for software that will be reused in
environments where the assumptions about it may be very different from
those made by the developers of the reused code.  Sure, you can write
down all the assumptions made by the developers (e.g., an empty stack
will never be popped) and the user can check for them -- easier for
something like a stack where we have been exploring what those assumptions
are in hundreds of papers for the past 20 years than in other software.

We are not even aware of many (if not most) assumptions we make in
programming.  Formal specifications will help somewhat, but there are
lots of cases of incomplete formal specifications for even simple things
like stacks being written.  We have a paper coming out in the March
issue of IEEE Trans. on Soft. Eng. that attempts to define the
assumptions you need to specify, but it does not completely solve
this problem.  And the amount of specification necessary is VERY large.

When reusing some air traffic control software in Britain that had been
developed in the U.S., it was discovered after it was put into place that
the American developers had not taken 0 degrees longitude into account
(which they did not need to in the U.S.) and also did not warn anyone
about this (perhaps they did not even think about it?).  The software
folded the map of Britain in half along the Greenwich meridian, placing
Warwick on top of Manchester.

If this is the "RIGHT way" to design reusable software, then reusability
is in trouble.

weide@CIS.OHIO-STATE.EDU (Bruce Weide) (02/09/91)

Nancy,

Perhaps my one-sentence summary of the issue was not clear.  There are
two important points I'd like to make in order to clarify the position
that a most-reusable stack abstraction should NOT involve a test for
emptiness during the "pop" operation.

(1) We are working in a system in which components DO have formal
specifications and in which programs are verified.  Of course most
real-world software today is not carefully specified or verified, but
we're arguing that new-generation software SHOULD be.  I guess we
agree on that; or maybe not?

In any case, the formal specification of pop in our stack component
includes an explicit precondition that the stack is not empty.  This
means that the implementer is free to assume that the stack being
popped is not empty -- and therefore need not test for it -- and that
the client is obligated to be able to demonstrate this fact wherever
pop is called.  For instance, consider stack s as used in the
following program fragment (actually, quite a typical use):

	while not is_empty (s) do
		pop (s, x);
		...
	end while;

There is no problem verifying that pop's precondition is satisfied for
this use of pop.  Why, then, should another check for emptiness be
made inside the implementation of pop?  The client program has just
checked that.
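In Python, that contract might be sketched as follows (the class and
names are mine, not the actual component -- a sketch of the design,
assuming verification discharges the precondition): pop performs no
emptiness test of its own; the client's loop guard establishes it.

```python
class Stack:
    """Non-defensive stack: pop's precondition is 'stack not
    empty', and the implementation does NOT re-test it."""

    def __init__(self):
        self._items = []

    def push(self, x):
        self._items.append(x)

    def is_empty(self):
        return not self._items

    def pop(self):
        # Precondition (client's obligation): not self.is_empty().
        # No emptiness check here -- the client has already shown it.
        return self._items.pop()

# The client fragment above, in the same style: the loop guard
# establishes pop's precondition on every iteration.
s = Stack()
for v in (1, 2, 3):
    s.push(v)

drained = []
while not s.is_empty():
    drained.append(s.pop())   # drained ends up [3, 2, 1]
```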

(2) In no way did I mean to imply that client programs should not
check for errors, when it is appropriate.  I am quite fond of your
work on software safety and I appreciate your examples and the points
you make about it.  My point does not conflict in any way.  It is that
reusable software components should be designed to permit maximum
reuse, through judicious use of layering.  One obvious way to reuse
the stack component I proposed is to make it a building block for a
"safe" or "defensive" stack component in which pop does something
quite predictable if the stack is empty.  (The specification of this
version of pop includes the desired behavior for an empty stack in its
postcondition, and its precondition is simply "true.")

Now you have at least two choices: you can layer the non-defensive
stack component on top of the defensive one, or vice versa.  The
problem with putting the defensive design in the lower layer is that a
client (such as the one above) who can PROVE that pop will never get
an empty stack must pay for a redundant test for emptiness.  Now this
may not seem like a big problem here because stacks are simple and
it's presumably easy and fast to test this condition.  However, there
are other cases where that is not true, and substantial performance
penalties are incurred because of the insistence that EVERY reusable
module (even the lowest level ones) be completely defensive.
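The layering argument can be sketched in Python (class names are
hypothetical, not Bruce's actual components): the defensive version sits
on top of the non-defensive core, so only clients who want the emptiness
test pay for it.

```python
class BareStack:
    """Lower layer, non-defensive: pop assumes a non-empty stack."""
    def __init__(self):
        self._items = []
    def push(self, x):
        self._items.append(x)
    def is_empty(self):
        return not self._items
    def pop(self):
        return self._items.pop()      # no emptiness test

class DefensiveStack:
    """Upper layer: pop's precondition is simply 'true'; its
    postcondition specifies the behavior on an empty stack."""
    def __init__(self):
        self._core = BareStack()
    def push(self, x):
        self._core.push(x)
    def is_empty(self):
        return self._core.is_empty()
    def pop(self):
        if self._core.is_empty():
            raise IndexError("pop from empty stack")   # defined, not random
        return self._core.pop()
```

Layered the other way around, every BareStack client would pay for the
redundant test -- which is exactly the objection in point (2).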


While I'm writing, there is something else you said that puzzles me:

 "Sure, you can write down all the assumptions made by the developers
 (e.g., an empty stack will never be popped) and the user can check
 for them -- easier for something like a stack where we have been
 exploring what those assumptions are in hundreds of papers for the
 past 20 years than in other software...  We are not even aware of
 many (if not most) assumptions we make in programming.  Formal
 specifications will help somewhat, but there are lots of cases of
 incomplete formal specifications for even simple things like stacks
 being written.  We have a paper coming out in the March issue of IEEE
 Trans. on Soft. Eng. that attempts to define the assumptions you need
 to specify, but it does not completely solve this problem."

Apparently your point is that formal specification will be hard for
more complex abstractions than stacks.  Agreed.  Not impossible (we've
got dozens of examples to prove it), but harder than for stacks.
That's why we're working on it: it's an interesting research problem.
But if your conclusion is true, how do you propose to write software
that TESTS that these assumptions are satisfied if you can't even
write down what the assumptions are?  If you can write them down,
great, put them in the specification.  If not, seems like you're
doomed to have software that crashes sometimes.  I don't see how this
is an attack on the idea of formal specification, or on our arguments
about design-for-reuse.


In fact, if I might be allowed to speculate a bit about some of the
difficulties with real-time programs that constitute many of your
examples of problem software... I do have some experience in this
area.  It is precisely a concern with efficiency (even constant
factors) that leads many real-time systems folks to eschew well-tested
and possibly verified reusable software components in favor of
one-time, ad hoc components that are far more likely to contain errors
in design or implementation.  Reusable components MUST be efficient,
as well as correct, if we really expect them to be reused.  To deny
that things like unnecessary checks of preconditions have any
influence over how real-time software is actually written seems fairly
implausible to me.

	-Bruce

eichmann@cs.wvu.wvnet.edu (David Eichmann) (02/09/91)

weide@CIS.OHIO-STATE.EDU (Bruce Weide) writes:
...
>In any case, the formal specification of pop in our stack component
>includes an explicit precondition that the stack is not empty.  This
>means that the implementer is free to assume that the stack being
>popped is not empty -- and therefore need not test for it -- and that
>the client is obligated to be able to demonstrate this fact wherever
>pop is called.  For instance, consider stack s as used in the
>following program fragment (actually, quite a typical use):

>	while not is_empty (s) do
>		pop (s, x);
>		...
>	end while;

>There is no problem verifying that pop's precondition is satisfied for
>this use of pop.  Why, then, should another check for emptiness be
>made inside the implementation of pop?  The client program has just
>checked that.
...

   This is actually a good argument for a form of "second-order"
compiler optimization.  We have a pre-condition asserted both in the
client code and the server code - when only one is needed.  I *do*
take the stance from pure paranoia that one or the other must
verify the pre-condition.

   However, I'm also sorry to say that I can see two immediate
dilemmas with such a second-order optimization:

   1) optimizing the is_empty away from the loop predicate obviously
screws up the implicit semantics of the original client program
text, but maintains the original intermodule dependencies.

   2) optimizing the is_empty away from the body of the client
maintains the implicit semantics of the client program, but introduces
a new form of intermodule dependency between the client and the
server - and forces deferral of the optimization of the server body
until after the compilation of the client.

- Dave
======
David Eichmann
Dept. of Statistics and Computer Science
West Virginia University                  Phone: (304) 293-3607
Morgantown, WV  26506                     Email: eichmann@a.cs.wvu.wvnet.edu

cwk@ORCRIST.GANDALF.CS.CMU.EDU (Charles Krueger) (02/10/91)

In article <88102@tut.cis.ohio-state.edu>, weide@elephant.cis.ohio-state.edu (Bruce Weide) writes:

> In what sense is a stack design, in which no particular error is
> generated when an empty stack is popped, no longer a "pure stack"?  

> Maybe this just illustrates the difficulty with ambiguous natural
> language descriptions of components.  Charles thinks "pure stack"
> means one thing, I think it means something else.  

I doubt if there is such a thing as a "pure stack" specification.
However, to illustrate how tolerance plays a role in reuse technology, my
stack example was intended to show how a stack implementation might have
some variance in its specification.  That is, the stack specification
might say: "This reusable artifact satisfies the ANSI/Weide Pure Stack
Standard postconditions under the following preconditions."  I can reuse
this artifact in my application if my application can satisfy the
precondition requirements of the artifact and if the postconditions
satisfy the requirements of my application.  Since the preconditions and
postconditions represent a range of conditions under which the stack
artifact will operate correctly, the preconditions and postconditions
correspond to the "tolerance" in the stack implementation.

In some reuse technologies it is possible to vary the preconditions and
postconditions of an artifact in order to optimize its performance in the
application.  In these cases the reuser can vary the tolerance.  
Consider again the stack example that has two variants: one that tests for
an empty stack before popping and one that doesn't.

  1. Has empty/pop check.  Average time of execution: UVW
     Preconditions: XYZ

  2. No empty/pop check.  Average time of execution: UVW - 10
     Preconditions: XYZ + <empty stack must not be popped>.

If the application provides tighter preconditions, then the artifact can
operate faster.  In general, a reusable artifact should take advantage of
all of the preconditions that the application has to offer, and provide
the minimum postconditions required by the application.
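One way to picture a reuse technology that lets the reuser vary the
tolerance (a hypothetical sketch, not any real system; UVW and XYZ above
are placeholders and stay abstract here): select the pop variant whose
precondition the application can actually discharge.

```python
def make_pop(caller_guarantees_nonempty):
    """Return the pop variant matching the preconditions the
    application offers -- corresponding to variants 1 and 2 above."""
    if caller_guarantees_nonempty:
        # Stronger precondition, cheaper body: no emptiness check.
        def pop(items):
            return items.pop()
    else:
        # Weaker precondition ('true'), defined behavior on empty.
        def pop(items):
            if not items:
                raise IndexError("pop from empty stack")
            return items.pop()
    return pop
```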

If I were to try to concisely summarize the concept of "tolerance in
reusable artifacts", I would say that the preconditions of an artifact
correspond to the degree of tolerance that the artifact has for a new
application, and that the postconditions of an artifact correspond to the
degree of tolerance that the new application must have for the artifact.