[comp.lang.c] Optional semi-colons

chase@Ozona.orc.olivetti.com (David Chase) (04/28/89)

In article <12716@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>Yes. For example, why does C treat a carriage return as whitespace?
>Nobody programs like that.  Most people put _one_ statement per line,
>so the use of _both_ semicolon and carriage return as statement terminators
>seems redundant.  Why does C choose to ignore the "wrong" one?

In article <10134@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>Many programming environments work best when source file line lengths
>are limited, typically to no more than 80 characters (the exact useful
>limit varies).  ...  Therefore, allowing simple statements to span
>multiple lines is a practical convenience for the programmer.

I don't think that's a very good answer, because:

It's a solved problem, and it was solved before C was around.  I've
programmed a good bit in such a language, and found that it worked
quite well.  To quote [Richards & Whitby-Strevens]:

  BCPL programs are written in free format.  You can put several
  statements on a single line, or use several lines for a single
  statement.  Semicolons must be used to separate statements on a
  single line to resolve ambiguity and can also be included for
  greater clarity.  `end of line' has the effect of terminating a
  statement if syntactically this is possible.  So if you want to
  split a statement over two lines, then the split may be at any
  point where the statement could not be terminated, for example
  after a + or -.
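
The rule quoted above can be approximated in a few lines of C. This is
only a sketch, not BCPL's actual grammar: the helper name
newline_ends_statement is hypothetical, and it assumes that a line whose
last non-blank character is a binary operator, a comma, or an opening
parenthesis cannot be a complete statement.

```c
#include <ctype.h>
#include <string.h>

/* Hypothetical helper: returns 1 if a newline after `line` may end the
 * statement, 0 if the statement must continue on the next line.
 * Assumption: a line ending in one of the characters listed below
 * cannot be a complete statement. */
int newline_ends_statement(const char *line)
{
    size_t n = strlen(line);

    /* Ignore trailing whitespace. */
    while (n > 0 && isspace((unsigned char)line[n - 1]))
        n--;
    if (n == 0)
        return 0;               /* blank line: nothing to terminate */

    /* Characters after which a statement cannot stop. */
    return strchr("+-*/=,(<>&|", line[n - 1]) == NULL;
}
```

A real compiler would make this decision in the parser rather than by
inspecting one character, but the effect is the same: splitting after
a + or - continues the statement, and splitting anywhere else ends it.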

I should also note that the systems on which I used BCPL were of the
dreaded fixed-line-length variety.  It was, in fact, the language of
choice for me (over Fortran, Pascal, and PL/1) on these systems,
though its lack of a type system sometimes made life interesting.

This just makes the "why does C do it the other way" question more
curious; the only explanation that comes to mind is that it makes it
easier to write programs that generate programs (lex and yacc, e.g.).
Of course, hacking around the output to ensure non-ambiguity is not
too hard, especially when compared with the other problems solved by
these programs (besides, there's always filters, right?).

David

desnoyer@Apple.COM (Peter Desnoyers) (04/28/89)

In article <41117@oliveb.olivetti.com> chase@Ozona.UUCP (David Chase) writes:
>In article <12716@lanl.gov> jlg@lanl.gov (Jim Giles) writes:
>>so the use of _both_ semicolon and carriage return as statement terminators
>>seems redundant. 
>
> [BCPL - ';' OR newline]
>This just makes the "why does C do it the other way" question more
>curious; the only explanation that comes to mind is that it makes it
>easier to write programs that generate programs (lex and yacc, e.g.).

Comments are redundant. Variable names longer than about 3 characters
(>62^3 possible names) are redundant. The purpose of a programming
language is to introduce redundancy when it helps humans, and
eliminate it when it hurts. (e.g. macros and functions)

Along these lines I have heard that people make fewer errors with
C-style semicolons - {statement;statement;} - than with Pascal-style
ones - {statement;statement}. Is this true?

A final comment - I spent a lot of time programming in CLU one
semester. In CLU, the block structure is unambiguous, and there is no
need for statement terminators. The end effect was that the compiler
would come up with an error many statements after the incorrect line.
It was a royal pain in the butt.

				Peter Desnoyers

jlg@lanl.gov (Jim Giles) (04/29/89)

From article <29785@apple.Apple.COM>, by desnoyer@Apple.COM (Peter Desnoyers):
> A final comment - I spent a lot of time programming in CLU one
> semester. In CLU, the block structure is unambiguous, and there is no
> need for statement terminators. The end effect was that the compiler
> would come up with an error many statements after the incorrect line.
> It was a royal pain in the butt.

The same thing happens with C "}" marks - or with Pascal (et al.) "BEGIN"
"END" pairs.  It is partly for this reason that most more modern languages
don't use the 'compound statement' model for flow control constructs.  For
a complete discussion of this issue, check out the famous 'Hare' experiments
done in the mid 70's and referenced in a number of language design books
and journal articles.  (I don't have the paper right in front of me, or
I would give the reference here.)

mike@arizona.edu (Mike Coffin) (04/29/89)

From article <12856@lanl.gov>, by jlg@lanl.gov (Jim Giles):
> From article <29785@apple.Apple.COM> (Peter Desnoyers):
>> A final comment - I spent a lot of time programming in CLU one
>> semester. In CLU, the block structure is unambiguous, and there is no
>> need for statement terminators. The end effect was that the compiler
>> would come up with an error many statements after the incorrect line.
>> It was a royal pain in the butt.
> 
> The same thing happens with C "}" marks - or with Pascal (et al.) "BEGIN"
> "END" pairs.  It is partly for this reason that most more modern languages
> don't use the 'compound statement' model for flow control constructs.

I don't think so.  The compound statement tends to be easier for
humans to screw up, but it doesn't make error reporting any more
difficult.  The problem with CLU (and SR, which I'm more familiar
with) is that error detection and recovery are much more difficult
without semicolons, even though they're technically redundant.
Both languages allow things like
	a := b + c + (d*e)
		+ f
	/* six pages of comments */
		- g

This is fine as long as there are no errors.  The problem arises when
a line contains a programming mistake, but there is SOME continuation
that would form a syntactically valid statement.  (This happens more
often than you might expect.)  The compiler can't report the error
until it sees that the continuation isn't valid, so it continues to
eat tokens.  When it finally reaches a token that won't fit, it
reports the error --- probably on a line that is perfectly ok.  (The
SR compiler has some messy little hacks to avoid doing this too often,
but it isn't perfect.)
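
The effect described above can be reproduced with a toy recognizer. This
is a sketch under stated assumptions, not the CLU or SR grammar: the
language is limited to statements of the form `ident := ident { + ident }`,
newlines are plain whitespace, and the names next_token and
first_error_line are hypothetical.

```c
#include <ctype.h>

/* Kinds of token in the toy language. */
enum tok { T_END, T_IDENT, T_ASSIGN, T_PLUS, T_BAD };

struct lexer {
    const char *p;   /* next unread character */
    int line;        /* 1-based line of the most recent token */
};

/* Return the next token; a newline only bumps the line counter. */
static enum tok next_token(struct lexer *lx)
{
    while (isspace((unsigned char)*lx->p)) {
        if (*lx->p == '\n')
            lx->line++;
        lx->p++;
    }
    if (*lx->p == '\0')
        return T_END;
    if (isalpha((unsigned char)*lx->p)) {
        while (isalnum((unsigned char)*lx->p))
            lx->p++;
        return T_IDENT;
    }
    if (lx->p[0] == ':' && lx->p[1] == '=') {
        lx->p += 2;
        return T_ASSIGN;
    }
    if (*lx->p == '+') {
        lx->p++;
        return T_PLUS;
    }
    lx->p++;
    return T_BAD;
}

/* Parse a sequence of statements with no terminators.  Returns 0 if the
 * text is valid, otherwise the line of the first token that does not fit. */
int first_error_line(const char *src)
{
    struct lexer lx = { src, 1 };
    enum tok t = next_token(&lx);

    while (t != T_END) {
        if (t != T_IDENT)
            return lx.line;                 /* assignment target */
        if (next_token(&lx) != T_ASSIGN)
            return lx.line;
        if (next_token(&lx) != T_IDENT)
            return lx.line;                 /* first operand */
        while ((t = next_token(&lx)) == T_PLUS) {
            if (next_token(&lx) != T_IDENT)
                return lx.line;             /* operand after + */
        }
    }
    return 0;
}
```

Given "a := b +" on line 1 and "c := d" on line 5, the recognizer accepts
`a := b + c` and only complains at the `:=` on line 5, although the
mistake is the dangling + on line 1.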

I think the moral is that a little redundancy is a good thing.  If you
don't like semicolons, add redundancy somewhere else.  For instance,
have newlines mark the end of statements unless an explicit
continuation mark appears, as in Fortran. Personally, I think that
"..." on the end of a continued statement would look nice. 

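As a rough illustration of the explicit-continuation idea, the following
C sketch counts logical statements when a trailing "..." continues a
line. The function names are hypothetical, and the sketch assumes the
mark is only recognized at the very end of a line (apart from trailing
whitespace).

```c
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Length of `line` without its trailing whitespace. */
static size_t trimmed_length(const char *line)
{
    size_t n = strlen(line);

    while (n > 0 && isspace((unsigned char)line[n - 1]))
        n--;
    return n;
}

/* Returns 1 if `line` ends with the continuation mark "...". */
int is_continued(const char *line)
{
    size_t n = trimmed_length(line);

    return n >= 3 && strncmp(line + n - 3, "...", 3) == 0;
}

/* Counts logical statements: every non-blank line ends a statement
 * unless it carries the continuation mark. */
size_t count_statements(const char *const lines[], size_t nlines)
{
    size_t count = 0;

    for (size_t i = 0; i < nlines; i++) {
        if (trimmed_length(lines[i]) == 0)
            continue;           /* blank line: no statement */
        if (!is_continued(lines[i]))
            count++;
    }
    return count;
}
```

With this rule a missing operand is caught at the end of its own line,
because the parser never has to look past an unmarked newline.
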
-- 
Mike Coffin				mike@arizona.edu
Univ. of Ariz. Dept. of Comp. Sci.	{allegra,cmcl2}!arizona!mike
Tucson, AZ  85721			(602)621-2858

gwyn@smoke.BRL.MIL (Doug Gwyn) (04/29/89)

In article <29785@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes:
>Along these lines I have heard that people make fewer errors with
>C-style semicolons - {statement;statement;} - than with Pascal-style
>ones - {statement;statement}. Is this true?

I seem to recall the Ada designers saying that statement terminators
had been shown to be less error-prone than statement separators.

trebor@biar.UUCP (Robert J Woodhead) (04/29/89)

In article <29785@apple.Apple.COM> desnoyer@Apple.COM (Peter Desnoyers) writes:
>A final comment - I spent a lot of time programming in CLU one
>semester. In CLU, the block structure is unambiguous, and there is no
>need for statement terminators. The end effect was that the compiler
>would come up with an error many statements after the incorrect line.

You think this was bad?  Back when I was a Teaching Assistant at the
Cornell School for Hotel Administration, I was assigned the dubious
honor of debugging some Hotel Management programs written in "MOBAL",
or, as we referred to it, "The Language of Kings."

MOBAL ran on microNOVA microminicomputers, and was a "compiler" that
was actually a multipass macro expansion system that ended up generating
assembly language source code.

The problem was, it had an arcane syntax and if you made a single "dot i
or cross t" type of typing error, you got (and this is not an exaggeration)
47 pages of multipass recursive error messages, none of which had anything
to do with the original error.  It got to the point where I threw some
darts at the source code in an effort to find the bugs.

The day that the proud author of MOBAL visited us was probably the only
day in my life where I contemplated torturing another human being.  Death
was too good for this turkey.  Unfortunately, the worst tortures my
fever'd brain could devise all involved programming in MOBAL.

 
-- 
Robert J Woodhead, Biar Games, Inc.  ...!uunet!biar!trebor | trebor@biar.UUCP
"The NY Times is read by the people who run the country.  The Washington Post
is read by the people who think they run the country.   The National Enquirer
is read by the people who think Elvis is alive and running the country..."

wietse@wzv.UUCP (Wietse Z. Venema) (04/29/89)

In article <41117@oliveb.olivetti.com> chase@Ozona.UUCP (David Chase) writes:
>
>  BCPL programs are written in free format. ... `end of line' has the 
>  effect of terminating a statement if syntactically this is possible.  
>
>This just makes the "why does C do it the other way" question more
>curious; the only explanation that comes to mind is that it makes it
>easier to write programs that generate programs (lex and yacc, e.g.).

For example, the C preprocessor. Having a unique statement terminator frees
the programmer from worrying about where newlines might show up after macro
expansion.
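
A small C illustration of the point, offered as a sketch: the macro
SQUARE and the function square_plus_one are made up for this example.
The macro call and the rest of the expression are spread over several
lines, and only the semicolon ends the statement.

```c
#define SQUARE(x) ((x) * (x))

/* Hypothetical example: the newlines inside and after the macro call
 * are plain whitespace, so the whole initializer is one expression. */
int square_plus_one(int n)
{
    int r = SQUARE(
                n
            )
            + 1;
    return r;
}
```

Under a rule where a newline ends the statement wherever the syntax
permits, the initializer could presumably stop after the closing
parenthesis, leaving `+ 1` as a separate statement.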
-- 
work:	wswietse@eutrc3.uucp	| Eindhoven University of Technology
work:	wswietse@heitue5.bitnet	| Mathematics and Computing Science
home:	wietse@wzv.uucp		| Eindhoven, The Netherlands

mcdonald@uxe.cso.uiuc.edu (04/30/89)

>For example, the C preprocessor. Having a unique statement terminator frees
>the programmer from worrying about where newlines might show up after macro
>expansion.

Really? What happens if macro expansion results in a string longer
than 509 characters, or an expression several thousands of bytes long?
I have had both of those happen. Serious worries result.

Doug McDonald

cik@l.cc.purdue.edu (Herman Rubin) (05/01/89)

In article <225800165@uxe.cso.uiuc.edu>, mcdonald@uxe.cso.uiuc.edu writes:
> 
> 
> >For example, the C preprocessor. Having a unique statement terminator frees
> >the programmer from worrying about where newlines might show up after macro
> >expansion.
> 
> Really? What happens if macro expansion results in a string longer
> than 509 characters, or an expression several thousands of bytes long?
> I have had both of those happen. Serious worries result.
> 

In C, there is already a situation in which semicolons are not used for
terminators, newlines are, and there is provision for multiple line statements.

This is in #defines, where semicolons frequently occur.  It is necessary to
escape a newline to prevent termination.  Why not do this in the language?
It handles the situation of multi-line statements without requiring a statement
termination character.  One could still allow semicolons to end statements, so
existing code could be reasonably edited.
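
For readers who have not written a multi-line macro, this is what the
existing #define convention looks like. The macro SWAP_INT and the
function sort_pair are hypothetical names chosen for this sketch.

```c
/* A #define ends at the first unescaped newline, so every line but the
 * last carries a trailing backslash.  The semicolons inside the body
 * separate statements; they do not end the directive. */
#define SWAP_INT(a, b)      \
    do {                    \
        int tmp_ = (a);     \
        (a) = (b);          \
        (b) = tmp_;         \
    } while (0)

/* Hypothetical helper that uses the macro: puts the smaller value
 * in *lo and the larger in *hi. */
void sort_pair(int *lo, int *hi)
{
    if (*lo > *hi)
        SWAP_INT(*lo, *hi);
}
```

The proposal above would apply the same convention to ordinary
statements: a newline ends the statement unless it is escaped.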

-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN 47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

jlg@lanl.gov (Jim Giles) (05/01/89)

From article <10531@megaron.arizona.edu>, by mike@arizona.edu (Mike Coffin):
> [...]
> I think the moral is that a little redundancy is a good thing.  If you
> don't like semicolons, add redundancy somewhere else.  For instance,
> have newlines mark the end of statements unless an explicit
> continuation mark appears, as in Fortran. Personally, I think that
> "..." on the end of a continued statement would look nice.

This is what I have been recommending all along!  Actually, I don't want
to eliminate semicolons - I just want the end-of-line to be an alias for
semicolons.  That way, the semicolons are optional after the last (and
usually only) statement of each line.

wietse@wzv.UUCP (Wietse Z. Venema) (05/02/89)

Arguing why \n should not terminate C statements where syntax permits, I wrote:
>For example, the C preprocessor. Having a unique statement terminator frees
>the programmer from worrying about where newlines might show up after macro
>expansion.

mcdonald@uxe.cso.uiuc.edu (Doug McDonald) writes:
>Really? What happens if macro expansion results in a string longer
>than 509 characters, or an expression several thousands of bytes long?
>I have had both of those happen. Serious worries result.

You will have to insert some extra newlines into the source.  My point
is that you won't have to worry about prematurely terminated statements.
-- 
work:	wswietse@eutrc3.uucp	| Eindhoven University of Technology
work:	wswietse@heitue5.bitnet	| Mathematics and Computing Science
home:	wietse@wzv.uucp		| Eindhoven, The Netherlands