[net.lang] and if you put this in your language ...

franka@mmintl.UUCP (03/08/86)

There is a fairly common, relatively simple programming structure which
cannot be easily coded using standard control statements (not gotos) in
any language I know of.  In flow-chart form, the control structure is as
follows:

         |
       A:/\
     +---\/---+
    +-+       |
    |X|       |
    +-+       |
     |        |
   B:/\       |
  +--\/-------+
 +-+         +-+
 |Y|         |Z|
 +-+         +-+
  +-----+-----+
        |

One often winds up writing this in the form:

   if (A) then X;
   if (A && B) then Y; else Z;

However, this fails if executing X changes the value of A (or if A has side
effects); and if evaluating A is expensive, the additional evaluation may be
undesirable.  In languages where the order of execution is not specified for
the 'and' operator, the evaluation of (A && B) may fail if evaluation of B
is not valid unless A is true and/or X has been executed (which is fairly
common).

In C, one often sees something like:

   if (A &&  (X, B)) then Y; else Z;

However, this only works if X can be written in-line.

A solution in full generality requires defining a boolean variable, say L,
and writing:

   if (A) then {X; L = B;} else L = FALSE;
   if (L) then Y; else Z;

I am proposing that a new clause be added to the 'if' statement, which will
represent this function directly, and in a way I think is more readable than
any of these options.  This would be the 'and if' clause (or 'andif',
depending on the language).  It would be used as follows:

   if (A) then X; and if (B) then Y; else Z;

Naturally, one would be able to use multiple and ifs in a single if statement.

Having added an and if, it seems natural to add an or if, as well.  This
would be syntactically identical to the and if, and would represent the
following control structure:

         |
       A:/\
     +---\/---+
    +-+     B:/\
    |X|       \/-+
    +-+       |  |
     +---+----+  |
        +-+     +-+
        |Y|     |Z|
        +-+     +-+
         +---+---+
             |

This seems to me considerably less useful than the and if; I don't
particularly remember ever needing this structure.  There is also the
problem of mixing and ifs and or ifs; the effect of doing this is likely
to being confusing.  However, I don't see any real disadvantages to
including an or if clause in the language, at least if mixed ands and ors
are not allowed; the meaning is quite intuitive.

For languages which don't have and and or operators which specify the order
of evaluation, providing both and if and or if clauses deals with most cases
where such is needed.  In particular, if the statement labelled 'X' in each
example is null, the and if is just an &&, and the or if is an ||.

Has anyone seen constructs like these implemented anywhere?  Does anyone
else think they are a good idea?  Arguments both pro and con are welcomed.

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108

bobc@tikal.UUCP (Bob Campbell) (03/11/86)

In article <1187@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>There is a fairly common, relatively simple programming structure which
>cannot be easily coded using standard control statements (not gotos) in
>any language I know of.  In flow-chart form, the control structure is as
>follows:

I must beg to differ your flow-chart can be easily converted to a less
complex diagram (uses only two of the three basic units) as follows.
(Yours is on the Left :-)

         |                            |
       A:/\                         A:/\
     +---\/---+                   +---\/---+
    +-+       |                  +-+       |
    |X|       |                  |X|       |
    +-+       |                  +-+       |
     |        |                   |        |
   B:/\       |                 B:/\       |
  +--\/-------+                +--\/--+    |
 +-+         +-+              +-+    +-+  +-+
 |Y|         |Z|              |Y|    |Z|  |Z|
 +-+         +-+              +-+    +-+  +-+
  +-----+-----+                +--+---+    +
        |                         |        |
	|                         +---+----+
				      |

My referance for this piece of work is "The Programming Language
Landscape" by H. Ledgard and M. Marcotty. (A rather interesting look at
programming language constructs).  Any way the book refers to the
"Fundamental Control Structure Theorem" and credits Boehm and Jacopini
[Communications of the ACM, 1966] with the theorem, and Mills [IBM
Corporation Report FSC 71-6012, 1972] with the formal proof.

It basicly shows that this and the other example you had of the "orif"
can also be transformed in a simular fashion.

One premise for the use of control structures is that it keeps the flow
of control clear, therefore reducing the chance of overlooking something.
It is my opinion that the "andif" and "orif" will add unneeded confusion.

Oh by the was don't bother to tell me that Z appears twice, I know that
it does, and in most cases the amount of code space wasted is not worth
saving (and in those cases where the code space is needed use a GOTO).

Bob Campbell
Teltone Corp.
{amc,dataio,fluke,uw-beaver}!tikal!bobc

bright@dataioDataio.UUCP (Walter Bright) (03/11/86)

In article <367@tikal.UUCP> bobc@tikal.UUCP (Bob Campbell) writes:
>I must beg to differ your flow-chart can be easily converted to a less
>complex diagram (uses only two of the three basic units) as follows.
>(Yours is on the Left :-)
>
>         |                            |
>       A:/\                         A:/\
>     +---\/---+                   +---\/---+
>    +-+       |                  +-+       |
>    |X|       |                  |X|       |
>    +-+       |                  +-+       |
>     |        |                   |        |
>   B:/\       |                 B:/\       |
>  +--\/-------+                +--\/--+    |
> +-+         +-+              +-+    +-+  +-+
> |Y|         |Z|              |Y|    |Z|  |Z|
> +-+         +-+              +-+    +-+  +-+
>  +-----+-----+                +--+---+    +
>        |                         |        |
>	|                         +---+----+
>				      |
>
>Oh by the was don't bother to tell me that Z appears twice, I know that
>it does, and in most cases the amount of code space wasted is not worth
>saving (and in those cases where the code space is needed use a GOTO).

A good compiler will transform the case on the right to the case on the
left, the technique is called 'tail-merging'.

taylor@glasgow.glasgow.UUCP (Jem Taylor) (03/13/86)

In article <1187@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>One often winds up writing this in the form:
>   if (A) then X;
>   if (A && B) then Y; else Z;
>However, this fails if executing X changes the value of A (or if A has side
>effects); and if evaluating A is expensive, the additional evaluation may be
>undesirable.  In languages where the order of execution is not specified for
>the 'and' operator, the evaluation of (A && B) may fail if evaluation of B
>is not valid unless A is true and/or X has been executed (which is fairly
>common).
>
>In C, one often sees something like:
>
>   if (A &&  (X, B)) then Y; else Z;
>
>However, this only works if X can be written in-line.
>
	if (A)
	{	X;
		if (B)	Y;
		else	Z;
	}
	else	Z;

If Z is so large that it is not reasonable to have two copies of it's code, it's
time to make it a separate procedure anyway.

-Jem.

g-rh@cca.UUCP (Richard Harter) (03/16/86)

In article <> taylor@glasgow.UUCP (Jem Taylor) writes:
>	..................
>	if (A)
>	{	X;
>		if (B)	Y;
>		else	Z;
>	}
>	else	Z;
>
>If Z is so large that it is not reasonable to have two copies of
>it's code, it's time to make it a separate procedure anyway.
>
	Sorry Jem, the code you suggest is wrong in principle
even though it may be the right thing to do in practice as a
matter of convenience, the principle bei:

	AVOID USING MULTIPLE COPIES OF THE SAME CODE

Procedures are one way to avoid multiple copies.  In any particular
instance, however, creating another procedure may well be the wrong
thing to do.  The most general solution is to use a flag (which, as
I recall, people were objecting to originally), e.g.

	flag = 0;
	if (A) {
	   X;
	   if (B) Y;
	   else flag = 1;
	   }
	else flag = 1;
	if (flag) Z;

[I'm not in love with this code either.]  If X, Y, and Z are large
blocks of code one might also consider:

	xflag = 0;
	yflag = 0;
	zflag = 0;
	if (A) xflag = 1;
	else   zflag = 1;
        if (xflag) {
	   X;
	   if (B) yflag = 1;
	   else   zflag = 1;
	   }
	if (yflag) Y;
	if (zflag) Z;

This is more long winded in schematic form.  However it will be clearer
if X, Y, and Z are complicated pieces of code.

[All complaints about the above style of indentation will be forwarded
to /dev/null or to Rich Rosen.]

	Richard Harter, SMDS Inc.

taylor@glasgow.glasgow.UUCP (Jem Taylor) (03/19/86)

In article <6699@cca.UUCP> g-rh@cca.UUCP (Richard Harter) writes:
>In article <> taylor@glasgow.UUCP (Jem Taylor) writes:
>>	..................
>>	if (A)
>>	{	X;
>>		if (B)	Y;
>>		else	Z;
>>	}
>>	else	Z;
>>
>>If Z is so large that it is not reasonable to have two copies of
>>it's code, it's time to make it a separate procedure anyway.
>>
>	Sorry Jem, the code you suggest is wrong in principle
>even though it may be the right thing to do in practice as a
>matter of convenience, the principle bei:
>
>	AVOID USING MULTIPLE COPIES OF THE SAME CODE
>
>Procedures are one way to avoid multiple copies.  In any particular

As someone else has pointed out, the above creates two source-code sections
which need to be kept in step during program development. The solution is to
use macros ( or #defines as C calles them ) to locate the text of Z in one place
and use the pre-processor to place multiple copies in the construct above.

Regardless of the programmer's style, the compiler should detect that two
or more identical code sections are specified, and generate the apropriate
object code, with BRANCH and/or SUBROUTINE or whatever, to make whatever
tradeoff between code size and speed is required. The programmer doesn't
need to complicate his problem by using *unnecessary* constructs as has been
proposed.

>thing to do.  The most general solution is to use a flag (which, as
>I recall, people were objecting to originally), e.g.
>
>	flag = 0;
>	if (A) {
>	   X;
>	   if (B) Y;
>	   else flag = 1;
>	   }
>	else flag = 1;
>	if (flag) Z;
>
>[I'm not in love with this code either.]  If X, Y, and Z are large
>blocks of code one might also consider:
>
>	xflag = 0;
>	yflag = 0;
>	zflag = 0;
>	if (A) xflag = 1;
>	else   zflag = 1;
>        if (xflag) {
>	   X;
>	   if (B) yflag = 1;
>	   else   zflag = 1;
>	   }
>	if (yflag) Y;
>	if (zflag) Z;
>
>This is more long winded in schematic form.  However it will be clearer
>if X, Y, and Z are complicated pieces of code.

I dont find this clearer than what I (and others) proposed. Therefore, for
my purposes, it is not what is required. [ I want other people to understand
what the code (a) is meant to do (b) does, when I am not there to work it
out myself. I certainly dont intend to try and remember myself :-) ].

-Jem.

P.S. no-one here agrees with my indenting style either.

anw@nott-cs.UUCP (03/20/86)

	The Algol solution would be

		if if not A
		   then true
		   elif X; B
		   then Y; false
		   else true
		   fi
		then Z
		fi

[even if X, Y, Z are large, and with very minor mods if A and B are].  This
may not be elegant, but it is no less so than the problem.

							-- Andy Walker

g-rh@cca.UUCP (Richard Harter) (03/23/86)

[.... The continuing saga of the coding problem that created
	the line eating bug .....]

In article <> taylor@glasgow.UUCP (Jem Taylor) writes:
>In article <6699@cca.UUCP> g-rh@cca.UUCP (Richard Harter) writes:
>>In article <> taylor@glasgow.UUCP (Jem Taylor) writes:
>>>	..................
>>>	if (A)
>>>	{	X;
>>>		if (B)	Y;
>>>		else	Z;
>>>	}
>>>	else	Z;
>>>
>>>If Z is so large that it is not reasonable to have two copies of
>>>it's code, it's time to make it a separate procedure anyway.
>>>
>>	Sorry Jem, the code you suggest is wrong in principle
>>even though it may be the right thing to do in practice as a
>>matter of convenience, the principle bei:
>>
>>	AVOID USING MULTIPLE COPIES OF THE SAME CODE
>>
>>Procedures are one way to avoid multiple copies.  In any particular
>
>As someone else has pointed out, the above creates two source-code sections
>which need to be kept in step during program development. The solution is to
>use macros ( or #defines as C calles them ) to locate the text of Z in one
>place and use the pre-processor to place multiple copies in the construct
>above.
>
	I went on to outline a schematic suggestion for using flags and
Jem wasn't happy with that.  Frankly, I'm not enthusiastic about flags
either.  The real difficulty is that there is not good solution to the
problem which can be roughly stated as follows: Given a net of if-then-else
logic in which a functional block of code occurs more than once, how do
you structure the code?  Some possible solutions are:

(a)	Duplicate the block
(b)	Replace the block by a procedure
(c)	Replace the block by a macro
(d)	Separate the logic structure and the executable structure,
	using flags to control the execution
(e)	Programming tricks based on language syntax

	All of these solutions have merits and defects.  If the
duplicated block is short, alternative (a) is often preferable even
though it involves duplication of code.  (Suppose Z is 'i++;'.)
Alternative (b) can be a big win; it loses [For Strunk and White's
sake, the word is 'lose' and not 'loose'] if you have to export part
of the environment that the block is embedded in.  Alternative (d)
is useful if you have complicated logic, but is not desirable in
the cited example because flag setting and executable code are
intermingled.  Alternative (e) can be the method of choice if
everything is simple and there are no side effects.  Finally there
are macros (c).  Sometimes macros are the appropriate and elegant
solution; sometimes they lead to the most godawful messes.  Finally
there is:

(f)	Reformulate the problem.

	We can't do this in the cited example because we don't
know what X, Y, and Z are.  In practice we do and quite often we
find that the problem can be decomposed in a different way that
has cleaner logic.  There are deep problems with reformulation
also; it leads to only doing those things which are easy to do
in the language of choice.

	Richard Harter, SMDS Inc.

lambert@boring.uucp (Lambert Meertens) (03/24/86)

In article <800009@ccvaxa> aglew@ccvaxa.UUCP writes:

>                                     Finally, something I'm not sure whether
> it's my idea, or Knuth's, or somebody else's, there is the concept of an
> ASIDE. An ASIDE is a labelled block of code that can only be entered via
> a goto to the label at the top of the aside (and can only be exited in one of
> the normal ways to exit a block). Code that would seemm to fall through to
> an aside branches around it. So above you really have
> 	goto END
> 	AB: begin Z; goto END end
> 	END: ...
> The aside has saved you a goto and a label - it makes it seem just a bit less
> like spaghetti.
> [...]
> but if Z is not simply a function call then you have to make sure that two
> copies of code get updated. That's what macros are for [...]
> A big #define in line looks ugly, and is really just like an aside, except
> that it hides a bit of the concept of doing work to get simpler cases.

Some languages have "refinements", which are somewhat like macros but they
really are part of the language and do not get expanded by a preprocessor.
Mainly useful for programming in a relaxed top-down style, but also good to
prevent code duplication.  The languages I know of that support refinements
are (in the chronological order in which the refinements entered the design)
ELAN, ABC (the upcoming revision of our B) and SETL.  In ABC, a possible way
of expressing the program is

       SELECT:
          a:
             X
             SELECT:
                b: Y
                ELSE: Z
          ELSE: Z
       ...
    X: <definition of X>
    Y: <definition of Y>
    Z: <definition of Z>

X and Y may be expanded in place, but for Z this gives code duplication.
Tests may also be refined, so if the evaluation of a and b is fast enough,
you can also write

       SELECT:
          a AND b:
             X
             Y
          a AND NOT b:
             X
             Z
          NOT a:
             Z

I wrote an article to make propaganda for refinements: Program Text and
Program Structure, in: Constructing Quality Software (P.G. Hibbard and
S.A. Schuman, eds), 271-281, North-Holland, 1978.

If any other languages than the three mentioned above sport refinements, I
would like to hear about them.

-- 

     Lambert Meertens
     ...!{seismo,okstate,garfield,decvax,philabs}!lambert@mcvax.UUCP
     CWI (Centre for Mathematics and Computer Science), Amsterdam

db@cstvax.UUCP (Dave Berry) (03/25/86)

In article <6843@boring.UUCP> lambert@boring.UUCP (Lambert Meertens) writes:
>In article <800009@ccvaxa> aglew@ccvaxa.UUCP writes:
>> but if Z is not simply a function call then you have to make sure that two
>> copies of code get updated. That's what macros are for [...]
>
>Some languages have "refinements", which are somewhat like macros but they
>really are part of the language and do not get expanded by a preprocessor.

Isn't it simpler to have a parameterless procedure with no local variables
and a compiler that recognises such things?  Maybe with pragmas giving the
user control over the optimisation if you want it?  

Procedures handle sequential abstraction fine.  Want you seem to be after
is greater control over the way they're implemented.  You don't have to
allocate a new frame on the stack just because you call a procedure, if
there is nothing to put in it!

Maybe I've spent too long programming in functional languages, where you
have to do things like this (and tail recursion optimisation similarly)
to get decent performance.

(Lambert's example, to refresh your memories.  This is the last I say).
>
>       SELECT:
>          a:
>             X
>             SELECT:
>                b: Y
>                ELSE: Z
>          ELSE: Z
>       ...
>    X: <definition of X>
>    Y: <definition of Y>
>    Z: <definition of Z>
>
>X and Y may be expanded in place, but for Z this gives code duplication.
-- 
	Dave Berry. CS postgrad, Univ. of Edinburgh		
					...mcvax!ukc!cstvax!db

lambert@boring.uucp (Lambert Meertens) (03/27/86)

In article <87@cstvax.UUCP> db@cstvax.UUCP (Dave Berry) writes:

> In article <6843@boring.UUCP> lambert@boring.UUCP (Lambert Meertens) writes:
>> ...
>> Some languages have "refinements", which are somewhat like macros but they
>> really are part of the language and do not get expanded by a preprocessor.
> 
> Isn't it simpler to have a parameterless procedure with no local variables
> and a compiler that recognises such things?  Maybe with pragmas giving the
> user control over the optimisation if you want it?  

Parameterless procedures are fine provided that:
1) The language allows to define procedures that are local to a given
   procedure and which inherit the environment;
2) You can postpone the definitions of the local procedures.

Under these conditions, there is in fact no discernible difference between
refinements and these procedures, except maybe for a lighter syntax for
refinements.

Condition 1 is vital, but is not fulfilled by ELAN, ABC or SETL (if I
remember correctly), nor by C and many other languages.  All procedure
definitions are on the same, global level.  Since the refining procedure
has to be able to modify the environment, it cannot then be parameterless.
(Unless procedure variables have no local scope, but that is too awful.)
Using parameters would be too awkward, especially since later data
refinement in the top-down style can change the parameters to be passed.

Condition 2 is not vital; however, it is awkward if you have to return to
the top of the body for declaring the procedure.  The same is true for
languages that require introductory declarations for variables, but in that
case programmers have to put up with it since they have no choice.

> Procedures handle sequential abstraction fine.  Want you seem to be after
> is greater control over the way they're implemented.  You don't have to
> allocate a new frame on the stack just because you call a procedure, if
> there is nothing to put in it!
> 
> Maybe I've spent too long programming in functional languages, where you
> have to do things like this (and tail recursion optimisation similarly)
> to get decent performance.

As a language *designer*, I am not particularly interested in how
procedures are implemented.  As a language *user*, I mainly wish more
compiler writers would recognize the importance for people trying to
program in a decent style of optimizations like replacing tail recursion by
a jump, and in general of fast procedure entry/exit.  Given the usual heavy
overhead, the only control over the implementation I have sometimes longed
for was an option advising the compiler to generate inline code for a
procedure.

-- 

     Lambert Meertens
     ...!{seismo,okstate,garfield,decvax,philabs}!lambert@mcvax.UUCP
     CWI (Centre for Mathematics and Computer Science), Amsterdam

chan@hpfcmt (04/02/86)

>	BLOCK label_name {
>	  .....
>	  }

>	LOOP label_name {
>	  .....
>	  }

>	FRAGMENT label_name {
>	  .....
>	  }

>A FRAGMENT is essentially an internal procedure without arguments;
>a BLOCK is a labelled block of code; and a LOOP is a labelled (forever)
>loop.  The command statement that I want is:

>	escape [(block/loop)_label_name] [via fragment_name]

>This statement transfers control to the next statement after the
>block (loop) but first executes the named fragment.  If there is
>no block/loop label name the escape leaves the current block/loop.
>If there is no via clause control is transferred directly to the
>next statement.  Every once in a while I find that I would really
>like to have something like this available.

Common Lisp (as well as other dialects) has a capability similar to
what you desire, that is CATCH and THROW, and UNWIND-PROTECT.

(CATCH <label> <form>*) says evaluate the <form>s and if any throws to the
same label occur during that evaluation, return the thrown value as the
value of the CATCH. There may however be intervening catches with the
same label that will "intercept" the throw. A throw looks like
(THROW <label> <value>).

If there is cleanup code that needs to be executed, you need to set up
an UNWIND-PROTECT.

(UNWIND-PROTECT <protected-form> <cleanup-form>*) "guarantees" that the
cleanup forms will be evaluated, even if there is a non-local exit such
as a THROW from the protected form. This is useful for instance for
making sure that a file that was opened in the protected form gets closed.

			-- Chan Benson
			{ihnp4 | hplabs}!hpfcla!chan
			Hewlett-Packard Company
			Fort Collins, CO

As usual, HP has nothing to do with what I say here.

g-rh@cca.UUCP (Richard Harter) (04/08/86)

In article <> chan@hpfcmt writes:
>
>Common Lisp (as well as other dialects) has a capability similar to
>what you desire, that is CATCH and THROW, and UNWIND-PROTECT.
>
>(CATCH <label> <form>*) says evaluate the <form>s and if any throws to the
>same label occur during that evaluation, return the thrown value as the
>value of the CATCH. There may however be intervening catches with the
>same label that will "intercept" the throw. A throw looks like
>(THROW <label> <value>).
>
>If there is cleanup code that needs to be executed, you need to set up
>an UNWIND-PROTECT.
>
>(UNWIND-PROTECT <protected-form> <cleanup-form>*) "guarantees" that the
>cleanup forms will be evaluated, even if there is a non-local exit such
>as a THROW from the protected form. This is useful for instance for
>making sure that a file that was opened in the protected form gets closed.
>
	Sorry, this is a little obscure to the non-Lisp user.  I follow
the idea of the THROW; however what happens to the value of the CATCH.
How do I tell one CATCH from another?  Is (CATCH ...) an expression 
that returns a value in other expressions?  If so, what does it return
if nothing is thrown to it?  The (UNWIND-PROTECT ...) looks like a
demon, i.e. it executes whenever there is an exit from the protected
form as an interjection.  Also, what happens when there is a THROW?
Does flow control transfer to the CATCH or does it continue?  These
may seem like dumb questions, but if one is not a LISP programmer it
seems a little obscure.

		Richard Harter, SMDS Inc.

chan@hpfcmt (04/14/86)

> How do I tell one CATCH from another?  Is (CATCH ...) an expression 
> that returns a value in other expressions?  If so, what does it return
> if nothing is thrown to it?

The tag is the only thing that distinguishes catchers. Remember though,
that a particular CATCH can only be thrown to while its forms are being
evaluated. If no THROW occurs, the CATCH returns whatever its last form
returned. If a THROW does occur, the CATCH returns the value specified
in the THROW.

> The (UNWIND-PROTECT ...) looks like a
> demon, i.e. it executes whenever there is an exit from the protected
> form as an interjection.  Also, what happens when there is a THROW?
> Does flow control transfer to the CATCH or does it continue?  

I'm fuzzy as to what you mean here, but I'll try to explain with an example.

(catch 'foo
  (unwind-protect (if (= x 1)
		      (throw 'foo "equal"))
		  ;; Protect forms
		  (print "Cleaning up...")
		  )
  "unequal"
  )

If x is 1, then this form returns "equal" after printing "Cleaning up...".
This is the case where the catch returns what it is thrown. Note that
the protect form(s) are evaluated, but the second <form> of the catch 
("unequal") is thrown over and does not get evaluated.

If x is not 1, then this form returns "unequal" after printing
"Cleaning up...". This is the case where the catch returns
whatever the last <form> returns.

> These may seem like dumb questions, but if one is not a LISP programmer
> it seems a little obscure.

They don't seem dumb to me, thanks for showing some interest. My original
posting was kept intentionally short and cryptic.

			-- Chan Benson
			{ihnp4 | hplabs}!hpfcla!chan
			Hewlett-Packard Company
			Fort Collins, CO

As usual, HP has nothing to do with what I say here.