[net.lang] Varieties of conditional statement

jer@peora.UUCP (J. Eric Roskos) (10/09/85)

>   The above difference of opinion has in the past been called a religious
> argument, with good reason.

Back in the old days, before religion was declared unconstitutional,
religious arguments were logical (albeit usually modal).  Presently
"religious argument" seems to imply "irrational", but actually reason
underlies most things, except where, through habit and ignorance,
irregularities prevail (and these are the things people argue over
endlessly).

The different brace styles actually have their basis in a much more
fundamental, and presently very important, matter having to do with the
design of languages.

Some languages, such as Pascal and C, define conditional statements (this
includes ifs, whiles, etc) like this:

	<conditional> <statement>

e.g.,
	if (x) <statement>

You then have <statement> defined recursively as either of a simple
statement, or as a sequence of statements delimited by a begin/end pair.

However, this style has recently fallen into disfavor.  Thus we have the
sort of definition that is in Ada, in which all conditional statements
can contain one or more statements, like this:

	<conditional-begin-type-1> {<statement>} <conditional-end-type-1>

where the conditional itself is imbedded in the conditional begin.

Now, I would argue that the latter is a "weaker" construct than the former.
The reason is (among other things) that the <statement> which can be a
compound statement can also have private data (and thus is a "block"),
whereas the latter approach requires a separate "block" construct to be
defined if you want to have a local scope of reference (and note that
syntactically, defining a local scope of reference within each type of
conditional is harder since the declarations have to be duplicated
in each conditional type).
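
A minimal sketch of this point in C, assuming nothing beyond the language
itself (the function names are hypothetical, not from the posting): the
same compound statement serves as the body of a conditional and as a
free-standing block, and in both places it can hold private data.

```c
#include <assert.h>

/* Hypothetical helper: returns |x| using a temporary that exists only
   inside the compound statement controlled by the if. */
int abs_via_block(int x)
{
    int result = x;

    if (x < 0)
    {
        int negated = -x;   /* private to this compound statement */
        result = negated;
    }
    /* 'negated' is out of scope here */
    return result;
}

/* The same construct used as a bare block, with no conditional at all. */
int shadow_demo(void)
{
    int v = 1;
    {
        int v = 2;          /* shadows the outer v inside the block only */
        assert(v == 2);
    }
    return v;               /* the outer v is untouched: 1 */
}
```

No separate "block" construct had to be defined for the second function;
the grammar rule for <statement> already covers it.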

To expand on this, I would argue that a language construct "a" is
stronger than two language constructs "b" and "c" if "a" can express
everything that both "b" and "c" can express, but "b" cannot express some
things "c" can express, or vice versa.  I would further argue that a
language having two identical ways of expressing the same language construct
is weaker than one having only one unique way of expressing the same
construct.

Clearly, the compound statement can express both the body of a conditional,
and a "block" (where, for this discussion, let us define a "block" as a
sequence of statements having its own private scope of reference).  However,
the body of a conditional cannot be a block: because if it is, then it is
equivalent to a compound statement; and in this latter case, the language
is weaker than a language with only one form of compound statement,
because several variations of this equivalent compound statement exist
(corresponding to the numbered <conditional-begin-types> above).  Likewise,
a block cannot be the body of a compound statement, unless again there are
several synonymous ways of expressing the block, which once again makes
the language weaker.

Thus I would argue that a language with a unique compound statement, and
conditionals of the form "<conditional> <statement>", is a stronger
language than one with the Ada-style of conditionals.

Now, this brings us to C.  It is my feeling (which is open to interpretation,
since it is a sort of meta-semantic one) that writing

	if (bool) {
		stmts;
	}

is in a general semantic sense (i.e., looking at the semantics of the
coding style, as opposed to those of the language itself) equivalent to
writing

	if (bool)
		stmts;
	endif

or, alternately, that it is stylistically equivalent to writing

	c = a + (c
	    + b);

as opposed to writing

	c = a +
	    (c + b);

Because, in the latter interpretation, you are writing two distinct
syntactic components (the <conditional> and part of the <statement>) on the
same line, but then breaking the latter one (the <statement>) across lines;
and in the former, you are "hiding" the opening brace.  Neither way seems
as stylistically satisfactory as writing the conditional on one line, and
then writing the compound statement on the lines below it; it's just a
matter of breaking the statement at a point that is most reasonable.

As far as whether to indent the braces to the level of the statements that
make up the compound statement, or only to the level of the conditional,
now THAT is truly a matter of opinion.  (Though I think not indenting them
aids in finding them).

Now, we could go on to argue whether or not data declarations should be
indented, and how, but now that I have expressed the above opinion, I will
not comment any more; I try to stay out of these debates, due to the
tendency of people to write "flames" without reading the original posting
they are following up on thoroughly.  Besides, my office has meanwhile
filled up with 3 assembly-language programmers who are furiously arguing
about how they should allocate some registers, and thus it's hard enough to
proofread what I have written already. :-)

Beware: since the above was more language-theoretic than C-theoretic, I have
directed followups to net.lang; you should change this if you are going to
discuss only C braces.
-- 
Shyy-Anzr:  J. Eric Roskos
UUCP: Ofc:  ..!{decvax,ucbvax,ihnp4}!vax135!petsd!peora!jer
     Home:  ..!{decvax,ucbvax,ihnp4}!vax135!petsd!peora!jerpc!jer
  US Mail:  MS 795; Perkin-Elmer SDC;
	    2486 Sand Lake Road, Orlando, FL 32809-7642

kurt@fluke.UUCP (Kurt Guntheroth) (10/10/85)

Mr. Roskos uses his own measure of what makes a language 'weaker' to
validate his basically religious argument that languages should be built
with Pascal/C style conditional statements rather than Ada 'comb' style
statements.

There are other arguments.  For instance, the Pascal style if statement is
well known to result in ambiguities in the construction of reasonable
parsers.  (I believe there is a grossly complex mechanism that can be
applied to disambiguate the if statement; see SIMULA BEGIN by Birtwistle et
al.)  One could argue on this basis that such languages are syntactically
weak, and thus that Ada style conditional constructs are superior.

Mr. Roskos also argues that having only one kind of block containing local
declarations is a property only of Pascal style conditionals.  This is not
true, as it depends on the way block is defined in a language.  I also note
that the two blocks in an if-then-else statement will necessarily be
distinct under either style, so there is no clear superiority in this area
either.

I can't resist giving my own opinion here:  I like the Ada style 'comb'
structured statements.  The syntax is clean, and furthermore they result in
a more obvious canonical indentation, making everybody's programs more
similar.  You can also define the list of statements enclosed in the parts
of a conditional or loop statement to be a block if the rest of your syntax
allows you to tell the difference between a declaration and a statement.

-- 
Kurt Guntheroth
John Fluke Mfg. Co., Inc.
{uw-beaver,decvax!microsof,ucbvax!lbl-csam,allegra,ssc-vax}!fluke!kurt

tynor@gitpyr.UUCP (Steve Tynor) (10/11/85)

In article <1703@peora.UUCP> jer@peora.UUCP (J. Eric Roskos) writes:

>The different brace styles actually have their basis in a much more
>fundamental, and presently very important, matter having to do with the
>design of languages.
>
>Some languages, such as Pascal and C, define conditional statements (this
>includes ifs, whiles, etc) like this:
>
>	<conditional> <statement>
>
>e.g.,
>	if (x) <statement>
>
>You then have <statement> defined recursively as either of a simple
>statement, or as a sequence of statements delimited by a begin/end pair.
>
>However, this style has recently fallen into disfavor.  Thus we have the
>sort of definition that is in Ada, in which all conditional statements
>can contain one or more statements, like this:
>
>	<conditional-begin-type-1> {<statement>} <conditional-end-type-1>
>
>where the conditional itself is imbedded in the conditional begin.
>
>Now, I would argue that the latter is a "weaker" construct than the former.
>The reason is (among other things) that the <statement> which can be a
>compound statement can also have private data (and thus is a "block"),
>whereas the latter approach requires a separate "block" construct to be
>defined if you want to have a local scope of reference (and note that
>syntactically, defining a local scope of reference within each type of
>conditional is harder since the declarations have to be duplicated
>in each conditional type).

You're really talking about the semantics of the particular construct,
not its syntax.  The syntax of: IF c THEN s ENDIF merely emphasizes what
kind of a statement sequence we're closing.  It does not imply any different
semantics (such as an inability to declare local variables within the block).

Syntactically, there's no reason why we couldn't have something like:
    IF c THEN
	 DECLARE a : INTEGER;
	 ...
    ENDIF

And personally, I'd prefer it to :
    IF c 
    {
       DECLARE a : INTEGER;
       ...
    }

since nested blocks can really make matching brackets annoyingly difficult. 

BTW, both Ada and Modula-2 require special end-markers at the end of
procedure and module(package) bodies too. (end procedure_name;)  This is a
great help in program readability, just as the end-if, end-while are.

=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
No problem is so formidable that you can't just walk away from it.
                     
    Steve Tynor
    Georgia Institute of Technology

 ...{akgua, allegra, amd, harpo, hplabs,
     ihnp4, masscomp, ut-ngp, rlgvax, sb1,
     uf-cgrl, unmvax, ut-sally}  !gatech!gitpyr!tynor

franka@mmintl.UUCP (Frank Adams) (10/14/85)

In article <1703@peora.UUCP> jer@peora.UUCP (J. Eric Roskos) writes:
>Some languages, such as Pascal and C, define conditional statements (this
>includes ifs, whiles, etc) like this:
>
>	<conditional> <statement>
>
>e.g.,
>	if (x) <statement>
>
>You then have <statement> defined recursively as either of a simple
>statement, or as a sequence of statements delimited by a begin/end pair.
>
>However, this style has recently fallen into disfavor.  Thus we have the
>sort of definition that is in Ada, in which all conditional statements
>can contain one or more statements, like this:
>
>	<conditional-begin-type-1> {<statement>} <conditional-end-type-1>
>
>where the conditional itself is imbedded in the conditional begin.
>
>Now, I would argue that the latter is a "weaker" construct than the former.
>The reason is (among other things) that the <statement> which can be a
>compound statement can also have private data (and thus is a "block"),
>whereas the latter approach requires a separate "block" construct to be
>defined if you want to have a local scope of reference (and note that
>syntactically, defining a local scope of reference within each type of
>conditional is harder since the declarations have to be duplicated
>in each conditional type).

With the if ... endif type of syntax, one doesn't really need the concept
of a "statement".  The syntax is if <condition> <block of code> ... endif.
One never has a context where a single statement only is required.

One no more has to duplicate the declarations to permit a local scope of
reference in each conditional type than one has to duplicate the declaration
for an expression to permit expressions in, say, conditionals of if clauses
as well as in assignments.  Personally, I would favor doing so universally;
every block of code should be permitted to have local declarations.

(The big disadvantage of having if's control single statements is that
"if ... then if ... then ... else ..." is ambiguous; which if does the else
go with?  This is well known, of course; I just thought I'd repeat it in case
anyone has forgotten.)
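
For C specifically, the ambiguity is resolved by pairing each else with
the nearest if that lacks one.  A small sketch (hypothetical function
names, written for illustration only) contrasts that default with the
reading forced by braces:

```c
#include <assert.h>

/* The indentation suggests the else pairs with 'if (a)', but C pairs
   it with the nearest unmatched if, which is 'if (b)'. */
int dangling(int a, int b)
{
    int r = 0;
    if (a)
        if (b)
            r = 2;
    else
        r = 1;
    return r;
}

/* Braces force the other reading: the else now pairs with 'if (a)'. */
int braced(int a, int b)
{
    int r = 0;
    if (a) {
        if (b)
            r = 2;
    } else {
        r = 1;
    }
    return r;
}

/* The two readings agree only when both a and b are true. */
void check_binding(void)
{
    assert(dangling(1, 0) == 1);  /* else ran: it belongs to 'if (b)' */
    assert(dangling(0, 1) == 0);  /* nothing ran: 'if (a)' has no else */
    assert(braced(1, 0) == 0);    /* inner if failed and has no else */
    assert(braced(0, 1) == 1);    /* else ran: it belongs to 'if (a)' */
}
```

A compiler may warn about the first function; the warning concerns the
misleading layout, not the legality of the code.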

Dealing specifically with C, I think you are right that the
   if (condition) {
      ...
   }
syntax is based on the if ... endif idea.  But the appropriate indentation
for the if statement and compound statement interpretation (which is the
language definition) is
   if (condition)
      {
         ...
      };
which almost nobody likes.  From this point of view, we are trading off
the number of constructs in the language definition against the number of
constructs required in a particular piece of code.  No one proposes that a
language should maximize the number of constructs in its definition; but
if you want to minimize them, write in assembler (or FORTRAN, if you want
portability).

I have sent this article to both net.lang and net.lang.c, but directed
followups to net.lang only.  Change this if your response fits better in
net.lang.c.

Frank Adams                           ihpn4!philabs!pwa-b!mmintl!franka
Multimate International    52 Oakland Ave North    E. Hartford, CT 06108

mac@uvacs.UUCP (Alex Colvin) (10/15/85)

> Syntactically, there's no reason why we couldn't have something like:
>     IF c THEN DECLARE a : INTEGER; ... ENDIF

As, in fact, you do in Algol68.

    IF c THEN INT a; ... FI

rcd@nbires.UUCP (Dick Dunn) (10/15/85)

Without entering the general fray here, there's one matter which needs to
be corrected on parsing:
> There are other arguments.  For instance, the Pascal style if statement is
> well known to result in ambiguities in the construction of reasonable
> parsers.  (I believe there is a grossly complex mechanism that can be
> applied to disambiguate the if statement; see SIMULA BEGIN by Birtwistle et
> al.)  One could argue on this basis that such languages are syntactically
> weak, and thus that Ada style conditional constructs are superior.

Ambiguities exist as language attributes, quite independently of the
construction of parsers.  The "Pascal style" if has existed at least since
Algol 60.  Any reasonable mechanism for constructing parsers must be able
to deal with it.  No grossly complex mechanism is required.  Rather, in a
hand-coded parser the ambiguity is resolved by the only natural choice in
the code for the parser.  In LR parsers, the solution is to indicate which
action is to be taken in the event of a shift-reduce conflict.  In LL
parsers, the solution is to indicate which alternative is assumed in the
event of a director-set overlap.  Very little complexity is added to the
parser-generation system.  No complexity is added to the generated parser
itself.
-- 
Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
   ...Simpler is better.

peter@graffiti.UUCP (Peter da Silva) (10/15/85)

This is a matter of personal opinion.

And now for mine: I prefer the 'C'/Pascal form for several reasons, but
only one is important...

	if (expression)
		statement

when statement is *not* a block is much neater than...

	if (expression)
		statement
	endif

Of course the multiple priorities & esthetic decisions that resolve into
the word "neat" are also a matter of personal opinion.

gvcormack@watmum.UUCP (Gordon V. Cormack) (10/18/85)

>  ...  discussion of parsing dangling ELSE clauses
> In LR parsers, the solution is to indicate which
> action is to be taken in the event of a shift-reduce conflict.  In LL
> parsers, the solution is to indicate which alternative is assumed in the
> event of a director-set overlap.  Very little complexity is added to the
> Dick Dunn	{hao,ucbvax,allegra}!nbires!rcd		(303)444-5710 x3086
>    ...Simpler is better.

Using LR parsers, there is absolutely no need to screw around
with shift-reduce conflicts.  An LR grammar can easily be written
that parses Pascal if statements correctly.