[net.lang.c] goto's in 'C'

breuel@harvard.ARPA (Thomas M. Breuel) (01/20/85)

>[Rob Warnock:]
>The incidence of the "goto" in C code is so rare anyway, I dare say
>we could abolish it altogether (replacing it with BLISS's "leave")
>and not miss the loss. (I certainly never missed it in BLISS.)

I would miss it. I have never used it for breaking out of loops or
making loops in the conventional sense. It is extremely useful, though,
for 'hardcoding' finite state machines. If they contain more than say 3
or 4 states, the use of structured constructs is awkward.  One example
where I have used 'goto' is in a non-recursive threading garbage
collector.

Indeed, rather than abolishing it, what about permitting indirect
goto's (something which, so rumour has it, existed in some older
versions of the 'C' compiler). Why? Well, it is useful if you have to
maintain your own evaluation stack or avoid explicit recursion.

						Thomas.

steiny@scc.UUCP (Don Steiny) (01/22/85)

>
> >The incidence of the "goto" in C code is so rare anyway, I dare say
> >we could abolish it altogether (replacing it with BLISS's "leave")
> >and not miss the loss. (I certainly never missed it in BLISS.)
> 
> It is extremely useful, though,
> for 'hardcoding' finite state machines. 

	Boy ain't that the truth.  I assign the task of rewriting
lex and yacc so that their output does not contain goto's to the
person who suggested they be abolished.    A typical yacc
output has hundreds of goto's.  It is not a bad thing when
is is done automatically.

-- 
scc!steiny
Don Steiny - Personetics @ (408) 425-0382
109 Torrey Pine Terr.
Santa Cruz, Calif. 95060
ihnp4!pesnta  -\
fortune!idsvax -> scc!steiny
ucbvax!twg    -/

rpw3@redwood.UUCP (Rob Warnock) (01/23/85)

+---------------
| >[rpw3] The incidence of the "goto" in C code is so rare anyway, I dare say
| >we could abolish it altogether (replacing it with BLISS's "leave")
| >and not miss the loss. (I certainly never missed it in BLISS.)
| I would miss it. I have never used it for breaking out of loops or
| making loops in the conventional sense. It is extremely useful, though,
| for 'hardcoding' finite state machines. If they contain more than say 3
| or 4 states, the use of structured constructs is awkward...
| 						Thomas.
+---------------

Well, that depends on what "structured construct" you use. The standard
technique (criticized by Wulf in "Global Variable Considered Harmful") is
to use a (global) state variable which is really the "PC" of your state
machine, with assignments to "state" instead of "goto"s:

	state = INITIAL;
	while (state != TERMINATION) switch (state) {
	...
	case ENABLED:	...
			if (...) state = ACTIVE;
			break;
	case ACTIVE:	...
			break;
	...
	}

This can be fairly clean if you never allow "fallthrough" from one state
to another (each case ends in "break"), and to my taste is somewhat
easier to follow than the equivalent mass of "goto"s.

If your machine is of the form INPUT X STATE ==> (OUTPUT, NEXT-STATE)
it is sometimes more useful to have the state variable be a pointer to
the processing routine for the current state, which is called for each
"event" (input). This way, the size of each state "module" can be kept
manageable. This approach was used in a disk driver I know of, with
great success (the controller never told you WHY it was interrupting,
you had to know what you'd asked it do do):

	diskintr(p)			/* called on interrupt from disk */
	register struct DISKDDB *p;	/* any of several controllers */
	{
		while ( (*(p->state))(p) )	/* return 0 to dismiss */
			;
	}

This also works nicely for lexical analysis routines which are driven off
of each character "event".

In any case, however, one must be aware that one is dealing with potentially
explosive complexity on the order of the number of input values times the
number of states (and if next-state depends on things other than the "input",
that can be inputs times number of states squared, or worse). As Dijkstra says
[EWD512, in "Selected Writings..."]:

	"But programming, when stripped of all its circumstantial irrelevancies,
	boils down to no more and no less than very effective thinking so as to
	avoid unmastered complexity, to very rigorous separation of your many
	different concerns... We have to fight chaos, and the most effective
	way of doing that is to prevent its emergence. We have to learn to
	avoid all forms of combinatorial complexity generators that, when
	active, rapidly tax our ability to carry out a case-analysis far beyond
	the limits of our power of reasoning."

When forced into a "state machine" approach, wherever possible we must factor
the problem so that the EVENTS x STATES decisions are not more than we can
individually examine for correctness (hopefully much less than a dozen cases).
There are two useful tools for this: factoring the events and factoring the
states.

Factoring the input events can often be done by noting when we are really only
interested in equivalence classes. For example, in lexical analyzers, when in
the "idle" state we usually only care whether the character is a <number>, a
<letter>, some <space>, or an <operator>. (Of course, once we are in "number"
state, we care about each of the possible values of a <number>.)

Likewise, many state machines can be partitioned into either disjoint or
hierarchically nested sub-machines. This often has the consequent desirable
effect of allowing further reduction in the size of the input space. The
entire state "number" above could be considered (from the "top" level) to
be a sub-machine that is interested only in events of the form <number>
and <not-number>. Likewise, states such as "operator" are only concerned
with handling <operator> and <non-operator> inputs (although internally it
must distinguish among all the individual operators).

Whether we use the "goto", "while/switch", routine dispatch, threaded code,
or any other coding style, state machines can be complicated beasts if we
are not extremely careful.


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404

breuel@harvard.ARPA (Thomas M. Breuel) (01/24/85)

|From rpw3@redwood.UUCP (Rob Warnock) Tue Jan 22 21:17:38 1985
|
|+---------------
|| >[rpw3] The incidence of the "goto" in C code is so rare anyway, I dare say
|| >we could abolish it altogether (replacing it with BLISS's "leave")
|| >and not miss the loss. (I certainly never missed it in BLISS.)
|| I would miss it. I have never used it for breaking out of loops or
|| making loops in the conventional sense. It is extremely useful, though,
|| for 'hardcoding' finite state machines. If they contain more than say 3
|| or 4 states, the use of structured constructs is awkward...
|| 						Thomas.
|+---------------
|
|Well, that depends on what "structured construct" you use. The standard
|technique (criticized by Wulf in "Global Variable Considered Harmful") is
|to use a (global) state variable which is really the "PC" of your state
|machine, with assignments to "state" instead of "goto"s:

I would not refer to a state machine with an explicit state variable
as 'hard coded' (sorry about the fuzzy terminology). It remains that in
many cases, state machines implemented with 'goto' are at least as
readable as those implemented using a state variable and a while loop. In
addition, they don't require you to define your states in a separate
place and they are more efficient.

'goto' is useful when used with care, and not providing it in a
systems programming language like 'C' would probably be a mistake.

						Thomas.

jc@mit-athena.ARPA (John Chambers) (01/25/85)

> The incidence of the "goto" in C code is so rare anyway, I dare say
> we could abolish it altogether (replacing it with BLISS's "leave")
> and not miss the loss. 

Don't you dare!  There are some of us who need to write "pre-processors"
to translate some funny (usually special-purpose) language into something
compilable.  C is a popular target for such translators.  The main
reason is that C's grammar is flexible enough that it's relatively
easy to generate it.  Without gotos, it would be much harder.  Have
you ever tried to write a routine to generate properly structured
code when the semantics of the languages don't quite line up?  The
only practical way I know of is with labels and gotos.

Granted, gotos may be undesirable with human-generated code.  I use
them, but normally only to exit from deep nesting, and even this can
be done with only a small loss of performance by writing subroutines 
and using return instead of goto.  But generating structured code
automatically?  Does anyone know a clean way to do this?

If you want a test case to play with, consider translating a C for
loop into the goto-less language of your choice.  Take an example
like:
    for (i=0,p=&x[j]; *p && p->f && p->f[i]; i++, p=p->next) {
       ...
       if (p->x >= p->f[i]) break;
       ...
    }
Note that I'm not asking how you would express this in your goto-less
language.  That's easy.  I'm asking how you would write a translator
that takes this and produces code that is semantically exactly the
same in your goto-less language.  I contend that, with gotos it is
in general easy; without gotos it is in general difficult.

			John Chambers

henry@utzoo.UUCP (Henry Spencer) (01/30/85)

This is a specific case of a general principle:  mechanically-generated
input generally stresses a translator much more severely than human-
generated input.  In any case, since DMR put goto into C right at the
start, we're stuck with it.
-- 
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry

rpw3@redwood.UUCP (Rob Warnock) (01/30/85)

+---------------
| | | ...[goto] is extremely useful, though,
| | | for 'hardcoding' finite state machines...  || 	Thomas.
| | ...The standard technique ... is
| | to use a (global) state variable which is really the "PC" of your state
| | machine, with assignments to "state" instead of "goto"s...	-Rob
| I would not refer to a state machine with an explicit state variable
| as 'hard coded' (sorry about the fuzzy terminology)...  | 	Thomas.
+---------------

Sorry if *I* was confusing... by "hardcoded" I meant the same thing I
thought you did: the state transitions are coded into explicit C code, such
as "if (...) goto state43" or "if (...) state = STATE43", not whether the
"goto" was explicit or implicit in the loop. Both methods have a "goto";
that's what Wulf's "Global Variable Considered Harmful" was pointing out.

To me, non-"hardcoded" state machines use a table-lookup method (e.g. Floyd
Production Languages) to map {current-state, event} ==> {action, nextstate},
where "action" can produce output and/or push/pop an evaluation stack. The
only explicit C code is a small interpreter to obey the table, plus the
(small number of fixed) action routines. All of the complexity of the state
transitions is explicit in the table. See Gries' book "Compiler Construction
For Digital Computers" (John Wiley 1971), Chapter 7: "Production Language"
(pp 154-169).


Rob Warnock
Systems Architecture Consultant

UUCP:	{ihnp4,ucbvax!dual}!fortune!redwood!rpw3
DDD:	(415)572-2607
USPS:	510 Trinidad Lane, Foster City, CA  94404


p.s. As a side note, anyone interested in making an improved "connect"
function for "cu" or "uucp" or similar tasks might want to consider some
form of Production Language for the "syntax analysis" of the dialog. Hint:
invent a pseudo-input metaclass "<timeout>" as one of the possible "inputs".