[comp.lang.c++] Improved switch statement

campbell@redsox.UUCP (Larry Campbell) (12/16/88)

In article <1907@ogccse.ogc.edu> wm@ogccse.UUCP (Wm Leler) writes:
}I don't think you could extend switch statements to work on strings,
}since there is some ambiguity between whether you want to compare
}the strings by comparing their contents (using strcmp) or by comparing
}their pointers (a very reasonable thing to do).

You leave that decision to the person who implemented == for the class.
I think it would be eminently sensible for the switch statement to operate
on any type that has an == operator defined; of course the case labels would
have to be the correct type (or coerced thereto).

A common and useful idiom I learned in BLISS which is, unfortunately, not
possible in C or C++ today, is the following code fragment (transliterated
from BLISS into pseudo-C), which checks a number of boolean flags:

	bool flag1, flag2, flag3;
	switch (TRUE)
	    {
	    case flag1:
		// stuff
	    case flag2:
		// stuff
	    case flag3:
		// stuff
	    default:
		// whatever
	    }

I think this is much clearer and neater and far less prone to error during
maintenance than the current alternative:

	bool flag1, flag2, flag3;
	if (flag1)
	then
	    // stuff
	else
	    if (flag2)
	    then
		// stuff
	    else
		if (flag3)
		then
		    // stuff
		else
		    // whatever
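
For comparison, here is a minimal sketch of how close present-day C++ can
get to the BLISS idiom.  Case labels must be integral constant expressions,
so "case flag1:" is rejected; the sketch folds the flags into a small
integer first.  The names first_set and dispatch are hypothetical and do
not come from the article.

```cpp
#include <string>

// Returns 1, 2 or 3 for the first flag that is set, or 0 if none is set.
// (Hypothetical helper, introduced only for this illustration.)
int first_set(bool flag1, bool flag2, bool flag3) {
    return flag1 ? 1 : flag2 ? 2 : flag3 ? 3 : 0;
}

// Dispatches on the first set flag using an ordinary integer switch.
std::string dispatch(bool flag1, bool flag2, bool flag3) {
    switch (first_set(flag1, flag2, flag3)) {
    case 1:  return "flag1";     // stuff for flag1
    case 2:  return "flag2";     // stuff for flag2
    case 3:  return "flag3";     // stuff for flag3
    default: return "whatever";  // no flag set
    }
}
```

The priority order of the flags lives in first_set, so the switch itself
stays a plain choice among distinct constants.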

-- 
Larry Campbell                          The Boston Software Works, Inc.
campbell@bsw.com                        120 Fulton Street
wjh12!redsox!campbell                   Boston, MA 02146

english@stromboli.usc.edu (Joe English) (12/17/88)

In article <574@redsox.UUCP> campbell@redsox.UUCP (Larry Campbell) writes:
>In article <1907@ogccse.ogc.edu> wm@ogccse.UUCP (Wm Leler) writes:
>}I don't think you could extend switch statements to work on strings,
>}since there is some ambiguity between whether you want to compare
>}the strings by comparing their contents (using strcmp) or by comparing
>}their pointers (a very reasonable thing to do).
>
>You leave that decision to the person who implemented == for the class.
>I think it would be eminently sensible for the switch statement to operate
>on any type that has an == operator defined; of course the case labels would
>have to be the correct type (or coerced thereto).
>
>A common and useful idiom I learned in BLISS which is, unfortunately, not
>possible in C or C++ today, is the following code fragment (transliterated
>from BLISS into pseudo-C), which checks a number of boolean flags:
>
[code fragment using switch(), equivalent to if (flag1)...else if (flag2)... ]

This is useful only as syntactic sugar:  the switch statement 
in C/C++ does *not* compile into a series of

if (expr == case1) ... else if (expr == case2) ...

statements, as your suggestion implies.  Instead, (in all the
implementations I've seen, anyway) it compiles into a vectored jump or,
if the (case_i) values are not close together (e.g., case 10: ... case
100:... case 1000:...), into a table lookup with a vectored jump, both
of which are much more efficient than a series of if () tests.  The
switch statement you suggest would (of necessity) compile into a bunch
of if()s, which is not really what switch was designed to do...


      /|/| "How do you convince your dermatologist 
-----< | |                      that you're being sexually responsible?
  O   \|\| english%lipari@oberon.usc.edu

campbell@redsox.UUCP (Larry Campbell) (12/18/88)

In article <14104@oberon.USC.EDU> english@stromboli.usc.edu (Joe English) writes:
}[code fragment using switch(), equivalent to if (flag1)...else if (flag2)... ]
}
}This is useful only as syntactic sugar:  the switch statement 
}in C/C++ does *not* compile into a series of
}
}if (expr == case1) ... else if (expr == case2) ...
}
}statements, as your suggestion implies.  Instead, (in all the
}implementations I've seen, anyway) it compiles into a vectored jump or,
}if the (case_i) values are not close together (e.g., case 10: ... case
}100:... case 1000:...), into a table lookup with a vectored jump ...

So what?  The compiler could still generate the vectored jump or table
lookup if the type of the argument was an integer or enum, and you'd still
have your precious efficiency.  I don't think it's reasonable to hamstring
the language in order to make it easier for people to hack compilers
together.

My point, by which I stand, is that the new behavior is:

    1)	Consistent with the old behavior

    2)	Completely backward compatible

    3)	Provides a useful facility which can clarify code significantly

    4)	Can generate the same efficient code in the old (integer) case

    5)	Is not even a new feature, per se, but merely the relaxation of
	an annoying restriction, which restriction was probably initially
	made in order to simplify compiler construction

Remember, we're talking about C++, not C, so we're not -- yet -- I hope --
constrained by the weight of years of tradition, inertia, and laziness.
-- 
Larry Campbell                          The Boston Software Works, Inc.
campbell@bsw.com                        120 Fulton Street
wjh12!redsox!campbell                   Boston, MA 02146

guy@auspex.UUCP (Guy Harris) (12/18/88)

>This is useful only as syntactic sugar:  the switch statement 
>in C/C++ does *not* compile into a series of
>
>if (expr == case1) ... else if (expr == case2) ...
>
>statements, as your suggestion implies.  Instead, (in all the
>implementations I've seen, anyway)

Sorry, I've seen C implementations where the compiler chooses one of:

	1) an "if" chain

	2) a vectored jump

	3) a table lookup with a vectored jump

depending on the number of cases.  One such implementation is PCC, upon
which a boatload of UNIX C compilers are based.

>The switch statement you suggest would (of necessity) compile into a bunch
>of if()s, which is not really what switch was designed to do...

"switch" was designed to switch among a set of several alternatives; if
that can only be done by an "if" chain, well, that's life....
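
A small sketch of the label shapes under discussion.  Which strategy a
given compiler picks for each is not specified by the language and is not
claimed here; the function names are illustrative only.

```cpp
// Dense labels 0..3: the kind of switch for which a vectored jump
// (jump table) is a natural choice.
int dense(int n) {
    switch (n) {
    case 0:  return 10;
    case 1:  return 11;
    case 2:  return 12;
    case 3:  return 13;
    default: return -1;
    }
}

// Sparse labels: a compiler might use a lookup table or a few comparisons.
int sparse(int n) {
    switch (n) {
    case 10:   return 1;
    case 100:  return 2;
    case 1000: return 3;
    default:   return -1;
    }
}

// A hand-written "if" chain with the same meaning as sparse().
int sparse_chain(int n) {
    if (n == 10)        return 1;
    else if (n == 100)  return 2;
    else if (n == 1000) return 3;
    else                return -1;
}
```

All three forms express the same selection; only the generated code may
differ.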

henry@utzoo.uucp (Henry Spencer) (12/18/88)

In article <574@redsox.UUCP> campbell@redsox.UUCP (Larry Campbell) writes:
>I think this is much clearer and neater and far less prone to error during
>maintenance than the current alternative:
>
>	bool flag1, flag2, flag3;
>	if (flag1)
>	then
>	    // stuff
>	else
>	    if (flag2)
>	    then
>		// stuff
>	    else
>		if (flag3)
>		...

Indenting should reflect the structure of code, not some arbitrary rule
that says that the bodies of ifs should always be indented one stop beyond
the if.  Like this:

	bool flag1, flag2, flag3;
	if (flag1)
	then
		// stuff
	else if (flag2)
	then
		// stuff
	else if (flag3)
	then
		...

This particular situation is common enough that enlightened indenting
standards recognize it as a special case.  More generally, it is sometimes
the case that one can make code clearer by violating a style standard.
(Note, I am not speaking of defining one's own standard, but of a single
violation in otherwise-conforming code.)  This is not something to do
without careful thought, and experienced judgement is required to decide
when it is really in order (i.e., novice programmers should probably be
required to do things "by the book"), but once in a long while it helps.

For example, aside from its tendency to run into the right margin, code
with many indenting levels can be hard to follow because complex nesting
is not easy for the human mind to deal with.  One can sometimes make an
important improvement with a "compound" control structure, e.g.:

	/* NOTE NONSTANDARD INDENTING */
	for (c = getchar(); c != '\n'; c = getchar()) switch (c) {
	case 'a':
		...
	case 'b':
		...
	...
	}

This can be especially significant if there are many cases, meaning that
the beginning and end of the indenting level(s) are far apart, which makes
it particularly hard to match them up.

This sort of thing needs to be done *VERY* sparingly, restricted to places
where it makes a big difference, and commented well.  But it can help.
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

campbell@redsox.UUCP (Larry Campbell) (12/19/88)

In article <1988Dec18.010151.27461@utzoo.uucp> henry@utzoo.uucp (Henry
Spencer) objects to the indentation I used in my example of the "if"
statement present-day C++ forces you to use when you really want a "switch"
statement but the argument isn't an integer or the case labels aren't
constant expressions.

When I wrote my article, I _knew_ someone would flame the indentation, but
with the ghost of the immortal Will Strunk peering over my shoulder, I
decided not to address that issue.

While Henry's suggested indentation is an improvement, I still claim that no
matter how you indent it, the "if" statement isn't as clear as the "switch"
statement; and that a trivial relaxation of an unnecessary restriction,
which does no violence to the language, can make the "switch" statement
generally useful, which today it is not.

(Henry's other point, about the occasional necessity of violating coding
standards in certain special cases, is well taken.)
-- 
Larry Campbell                          The Boston Software Works, Inc.
campbell@bsw.com                        120 Fulton Street
wjh12!redsox!campbell                   Boston, MA 02146

mat@mole-end.UUCP (Mark A Terribile) (12/19/88)

> I think [the BLISS switch statement] is much clearer and neater and far
> less prone to error during maintenance than the current alternative:
> 
> 	bool flag1, flag2, flag3;
> 	if (flag1)
> 	then
> 	    // stuff
> 	else
> 	    if (flag2)
> 	    then
> 		// stuff
> 	    else
> 		if (flag3)
> 		then

Indeed, but it's not clearer than the linearized form preferred by
Kernighan and Plauger:

	if( ... )
		...
	else if( ... )
		...
	else if( ... )
		...
	else if( ... )
		...
	else
		...

I prefer to go a step further and break the else-if across two lines.  I'm
not afraid of white space and I like to be able to see the shapes of the
program text and know what's happening without having to read the individual
statements.  Oh, and I use an editor that makes it easy to slide up and down
a couple of lines rather than whole pages.

One very useful programming technique I call ``reduction by trivial case.''
Consider a function that must operate on a linked list; its job is to ensure
that the current node is preceded by a ``marker node,'' creating it if
necessary and in any case returning a pointer to the marker node.  If we
do this without allowing early return and without ordering our cases, we
get:

	if( node_p )
	{
		if( node_p->previous )
		{
			if( node_p->previous->type == MARKER )
				result = node_p->previous;
			else
			{
				ins_node( node_p, new_marker() );
				result = node_p->previous;
			}
		}
		else
		{
			node_p->previous = new_marker();
			result = node_p->previous;
		}
	}
	else
		result = 0;

	return result;

Among this fragment's various faults is that the compiler will have to check
very carefully to ensure that ``result'' is set in every path through.  It's
not impossible, and not even ``difficult,'' but it needlessly forces the
programmer to rely on a sophisticated checking mechanism where a simple one
would do.
Omnidijkstratization doesn't always simplify things.

This really ought to be a member function, but member functions shouldn't be
in the business of checking that their >this< pointers are null.  Assuming
though that it *is* a separate function, we write it as below.  By using early
returns, we reduce the checking that the compiler has to do from ``used before
set/multipath'' to ``return vs. return e''.  It's easier for the compiler
to check, and easier for the reader to check as well.

node*
marknode( node* node_p )
{
	if( ! node_p )		// Note that this test and return removes the
		return 0;	// need for a level of nesting.  It also
				// introduces an assumption that can safely
				// be made by subsequent code--the pointer is
				// non-null and should be presumed valid.
				// By removing this trivial case, we have
				// changed the problem and exposed the next
				// trivial case.

	if( ! node_p->previous )
		return ( node_p->previous = new_marker() );
				// We've stripped off another troublesome but
				// trivial case, and changed the problem to
				// expose the next trivial case.

	if( node_p->previous->type != MARKER )
		ins_node( node_p, new_marker() );
				// We've now reduced the two remaining cases
				// to one, introducing the safe assumption
				// that the previous node is a marker
				// without needlessly introducing extra
				// marker nodes.

	return node_p->previous;
}
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile

day@grand.UUCP (Dave Yost) (12/20/88)

In article <577@redsox.UUCP> campbell@redsox.UUCP (Larry Campbell) writes:
>In article <1988Dec18.010151.27461@utzoo.uucp> henry@utzoo.uucp (Henry
>Spencer) objects to the indentation I used in my example ...
>
>When I wrote my article, I _knew_ someone would flame the indentation, but ..

I hereby coin a new term for this situation:

 flame bait

 --dave yost

nevin1@ihlpb.ATT.COM (Liber) (12/22/88)

In article <574@redsox.UUCP> campbell@redsox.UUCP (Larry Campbell) writes:

|I think it would be eminently sensible for the switch statement to operate
|on any type that has an == operator defined; of course the case labels would
|have to be the correct type (or coerced thereto).

You have to be far more restrictive than that.  Not only must
operator== be defined, but it must be defined to return a scalar (since
the controlling expression of an if statement must have scalar type).
Note:  you can't easily use implicit type conversion, since there is no
way of telling which of the scalar types (pointer, float, int, etc.)
the return value should meaningfully be converted to.

Also, it makes a big difference on whether

	switch (x)
	{
		case y:
	...
	}

is translated to

	if ((x) == (y))

or

	if ((y) == (x))

since there is no guarantee by the language (nor should there be, even
if it were feasible) that operator== is commutative with respect to its
arguments, even if the arguments are the same type (let alone if the
arguments are of different types).  In effect, operator== must
*semantically* mean 'equal-to', in order for this extension to switch
to not appear to be kludgey.  This proposal doesn't fit in with the
rest of the language.
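
To make the ordering concern concrete, here is a minimal sketch of an
operator== that is legal C++ but not commutative.  The Prefix type and its
"begins with" meaning are assumptions made up for this illustration; they
do not appear in the proposal.

```cpp
#include <string>

// Hypothetical type: "a == b" is true when b.text begins with a.text.
struct Prefix {
    std::string text;
};

bool operator==(const Prefix& a, const Prefix& b) {
    return b.text.compare(0, a.text.size(), a.text) == 0;
}

// The two candidate rewritings of:  switch (x) { case y: ... }
bool as_x_eq_y(const Prefix& x, const Prefix& y) { return x == y; }
bool as_y_eq_x(const Prefix& x, const Prefix& y) { return y == x; }
```

With x holding "switch" and y holding "sw", as_x_eq_y(x, y) is false while
as_y_eq_x(x, y) is true, so the case would be taken under one rewriting
and skipped under the other.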

|A common and useful idiom I learned in BLISS which is, unfortunately, not
|possible in C or C++ today, is the following code fragment (transliterated
|from BLISS into pseudo-C), which checks a number of boolean flags:

|	bool flag1, flag2, flag3;
|	switch (TRUE)
|	    {
|	    case flag1:
|		// stuff
|	    case flag2:
|		// stuff
|	    case flag3:
|		// stuff
|	    default:
|		// whatever
|	    }

|I think this is much clearer and neater and far less prone to error during
|maintenance than the current alternative:

|	bool flag1, flag2, flag3;
|	if (flag1)
|	then
|	    // stuff
|	else
|	    if (flag2)
|	    then
|		// stuff
|	    else
|		if (flag3)
|		then
|		    // stuff
|		else
|		    // whatever

I might agree, except that your pseudo-C does not translate into the
if-then-else chain that you wrote here.  You have assumed that each of
the cases has an implicit "break" statement, which is something you
cannot do in the generalized extension (if you don't want the extension
to look kludgy, that is).  What do you plan on doing about fallthrough?
Although you can do it using temp variables as flags, the resultant
code will probably be less efficient than hand-coding the if-then-else
chain (your flow of control would probably be different, since you
weren't trying to squeeze it in to the switch statement format).
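
A short sketch of the fallthrough difference, using an ordinary integer
switch so that it compiles today.  The function names are illustrative
only.

```cpp
#include <string>

// No "break" statements: control falls through into the following cases.
std::string with_fallthrough(int n) {
    std::string out;
    switch (n) {
    case 1:
        out += "one ";   // falls through
    case 2:
        out += "two ";   // falls through
    default:
        out += "rest";
    }
    return out;
}

// An if-then-else chain runs exactly one branch.
std::string with_if_chain(bool flag1, bool flag2) {
    std::string out;
    if (flag1)
        out += "one ";
    else if (flag2)
        out += "two ";
    else
        out += "rest";
    return out;
}
```

with_fallthrough(1) yields "one two rest", whereas with_if_chain(true, true)
yields only "one ", which is why the two fragments in the article are not
equivalent unless every case ends in a break.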


IMHO, this proposal is very kludgy, at best.
-- 
NEVIN ":-)" LIBER  AT&T Bell Laboratories  nevin1@ihlpb.ATT.COM  (312) 979-4751

guy@auspex.UUCP (Guy Harris) (12/22/88)

 >Not only must operator== be defined, but it must be defined to return
 >a scalar...

In fact, it had better return a Boolean value, or at least 0 for "not
equal" and non-0 for "equal"...

 >since there is no guarantee by the language (nor should there be, even
 >if it were feasible) that operator== is commutative with respect to its
 >arguments, even if the arguments are the same type (let alone if the
 >arguments are of different types).  In effect, operator== must
 >*semantically* mean 'equal-to', in order for this extension to switch
 >to not appear to be kludgey.  This proposal doesn't fit in with the
 >rest of the language.

Well, offhand I'd think a program defining an "==" operator that didn't act
like an equality operator would be a good candidate for an "Obfuscated
C++ Code" contest; I'd *strongly* suggest they pick some other operator.
Is there some good reason why one would want to define "==" as an
operator that didn't behave like a "comparison for equality" operator? 
If not, the feature might just have in its documentation

	*WARNING: this feature uses the class's "==" operator to do the
	comparisons; if this isn't an appropriate operator, don't use
	this feature.

and rely on the sanity and common sense of the programmer (a dangerous
assumption at times, admittedly).

nevin1@ihlpb.ATT.COM (Liber) (12/23/88)

In article <781@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:

>Well, offhand I'd think a program defining an "==" operator that didn't act
>like an equality operator would be a good candidate for an "Obfuscated
>C++ Code" contest; I'd *strongly* suggest they pick some other operator.

But part of the beauty of C++ is that you don't have to define
operator== as the *C* equality operator!

>Is there some good reason why one would want to define "==" as an
>operator that didn't behave like a "comparison for equality" operator? 

Suppose I defined operator== for a class (call it ihlpb) that, when
comparing two instances of ihlpb (call them nevin and liber) in a
"nevin == liber" manner, returns a NULL pointer if false, or a pointer
to liber if true.  This equality operator does not have the property of
being commutative, and is consequently not the same as the C equality
operator.
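
A minimal sketch of such an operator, assuming a class with a single id
member (the member and the comparison rule are invented for illustration;
only the class and object names come from the paragraph above).

```cpp
// Hypothetical class; "id" is an assumed member used for the comparison.
struct ihlpb {
    int id;
};

// Returns a null pointer for "not equal", or a pointer to the right-hand
// operand for "equal".  The result depends on operand order.
const ihlpb* operator==(const ihlpb& lhs, const ihlpb& rhs) {
    return lhs.id == rhs.id ? &rhs : nullptr;
}

// The result is still usable as a condition, because a pointer converts
// to true exactly when it is non-null.
bool usable_in_if(const ihlpb& nevin, const ihlpb& liber) {
    if (nevin == liber)
        return true;
    return false;
}
```

Here "nevin == liber" yields the address of liber and "liber == nevin"
yields the address of nevin, so the two expressions agree as truth values
but not as values.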

Since C++ makes no guarantee on how a given operator can be defined, it
would be very bad to add a selection statement to the language which is
*implicitly* dependent on the semantics of an operator.  One of the
strong points of C++ is that the code is more self-documenting;
generalizing the switch statement in this manner would go against this
goal.
-- 
NEVIN ":-)" LIBER  AT&T Bell Laboratories  nevin1@ihlpb.ATT.COM  (312) 979-4751

campbell@redsox.UUCP (Larry Campbell) (12/24/88)

In article <9264@ihlpb.ATT.COM> nevin1@ihlpb.UUCP (55528-Liber,N.J.) writes:
}In article <781@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:
}
}>Well, offhand I'd think a program defining an "==" operator that didn't act
}>like an equality operator would be a good candidate for an "Obfuscated
}>C++ Code" contest; I'd *strongly* suggest they pick some other operator.
}
}But part of the beauty of C++ is that you don't have to define
}operator== as the *C* equality operator!

... followed by an example of a non-commutative == operator ...

It is an EXTREMELY bad engineering practice to use deceptive or misleading
names in programs.  Well-written programs inform and educate the reader --
they do not surprise or deceive him.

Remember, operators are just function names.  It is possible, but foolish,
to define a function called "sqrt" that returns, say, the cube root of its
argument.  It is also possible, but equally foolish, to define an "=="
operator that does something radically different than what most people
expect from an equality operator (like not being commutative, not returning
a boolean value, etc.)

Just because C++ lets you do it doesn't mean it's a good idea!
-- 
Larry Campbell                          The Boston Software Works, Inc.
campbell@bsw.com                        120 Fulton Street
wjh12!redsox!campbell                   Boston, MA 02146

shawn@pnet51.cts.com (Shawn Stanley) (12/28/88)

nevin1@ihlpb.ATT.COM (Liber) writes:
>In article <574@redsox.UUCP> campbell@redsox.UUCP (Larry Campbell) writes:
>
>|I think it would be eminently sensible for the switch statement to operate
>|on any type that has an == operator defined; of course the case labels would
>|have to be the correct type (or coerced thereto).
>
>You have to be far more restrictive than that.  Not only must
>operator== be defined, but it must be defined to return a scalar (since
>the controlling expression of an if statement must have scalar type).
>Note:  you can't easily use implicit type conversion, since there is no
>way of telling which of the scalar types (pointer, float, int, etc.)
>the return value should meaningfully be converted to.

I would think that either the switch() operand's type or the case operand's
type would be used.  Most probably case, since that would allow greater
flexibility.  This would imply:

  switch(x)
  {
    case y:
    ...
  }

In this instance, the above would be the same as (y == x), which would use the
type of (y) and convert (x) to that type for the compare.


UUCP: {rosevax, crash}!orbit!pnet51!shawn
INET: shawn@pnet51.cts.com

nevin1@ihlpb.ATT.COM (Liber) (12/28/88)

In article <581@redsox.UUCP> campbell@redsox.UUCP (Larry Campbell) writes:

>It is an EXTREMELY bad engineering practice to use deceptive or misleading
>names in programs.  Well-written programs inform and educate the reader --
>they do not surprise or deceive him.

I agree with you here.  However, it is very hard to take a function name
out of its environment (i.e., the context of the program it appears in)
and then say that it is badly named.  My example was to show what is
possible, legal, and in some circumstances, 'good'.

>Remember, operators are just function names.  It is possible, but foolish,
>to define a function called "sqrt" that returns, say, the cube root of its
>argument.  It is also possible, but equally foolish, to define an "=="
>operator that does something radically different than what most people
>expect from an equality operator (like not being commutative, not returning
>a boolean value, etc.)

I suppose you would say that it is also possible, but equally foolish,
to define a "<<" operator which does something radically different
than what most people expect from a bitwise shift operator, like putting
something out to a stream.  Or to define a "*" operator which is not
commutative, as in the case of matrix multiplication.  Or having a "+"
operator which is not associative, which most languages that implement
floating point have; or not commutative, as in the case of string
concatenation.
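
Two of these cases can be checked directly.  The sketch below assumes
IEEE-style double arithmetic without aggressive floating-point
optimization; the function names are illustrative.

```cpp
#include <string>

// Same three operands, grouped differently.
double sum_left(double a, double b, double c)  { return (a + b) + c; }
double sum_right(double a, double b, double c) { return a + (b + c); }

// String concatenation with "+": operand order matters.
std::string concat(const std::string& a, const std::string& b) {
    return a + b;
}
```

With a = 1e30, b = -1e30 and c = 1.0, sum_left gives 1.0 but sum_right
gives 0.0, because 1.0 is lost when added to -1e30.  Likewise
concat("ab", "cd") is "abcd" while concat("cd", "ab") is "cdab".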

I agree that "==" should mean "equality"; however, I do not agree that
"==" should (necessarily) mean "equality that is commutative and always
returns a 0 or a 1", especially in cases where this meaning is not
appropriate for the given class.  The language does not and cannot
enforce these properties; it is dangerous to assume them true and
develop new *language* features based upon them.

BTW, all of this is irrelevant to the original discussion.  The point
is that, whether or not we like it, C++ allows you to define an operator
just about any way you want.  The only useful way to use the proposed
change to the switch statement requires that I know exactly how it gets
rewritten as an if-then-else chain, in which case I would rather use the
if-then-else chain, since it is more 'self-documenting'.


Obligatory C++ feature proposal:  I would like a *complete* set of compound
assignment operators.  I propose that ===, !==, ||=, &&=, =< (<= is
already taken), =<=, =>, =>= be added to C++.  Any takers? :-)
-- 
NEVIN ":-)" LIBER  AT&T Bell Laboratories  nevin1@ihlpb.ATT.COM  (312) 979-4751

johnson@p.cs.uiuc.edu (12/29/88)

This discussion about modifying the switch statement in C++ is pretty
amusing.  Case statements are bad programming style in object-oriented
programs.  Instead, you should use virtual functions.  Don't say that
you want to distinguish the values of integers and virtual functions
won't work in that context.  That just reveals that you are using
integers where you should be using something else.

There are certainly places where one should use switch statements, mainly
when you are converting from a non-object-oriented part of the system
to the clean, elegant part.  However, I put switch statements in the same
category with goto statements: sometimes necessary but never elegant.

Virtual functions are even faster than switch statements.  For proof,
see the paper by Russo and Kaplan in the latest C++ conference.  There
is no reason to use switch statements when you can use virtual functions.
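
For readers unfamiliar with the two styles being contrasted, here is a
minimal side-by-side sketch.  The shape classes are invented for
illustration, and nothing here measures or demonstrates the speed claim.

```cpp
// Style 1: an explicit type field and a switch.
enum ShapeKind { CIRCLE, SQUARE };

struct TaggedShape {
    ShapeKind kind;
};

int corners_by_switch(const TaggedShape& s) {
    switch (s.kind) {
    case CIRCLE: return 0;
    case SQUARE: return 4;
    }
    return -1;  // only reached if kind holds an invalid value
}

// Style 2: a virtual function; each class answers for itself.
struct Shape {
    virtual ~Shape() {}
    virtual int corners() const = 0;
};

struct Circle : Shape {
    int corners() const override { return 0; }
};

struct Square : Shape {
    int corners() const override { return 4; }
};

int corners_by_virtual(const Shape& s) {
    return s.corners();
}
```

Adding a new shape means editing every switch in the first style, but only
adding one class in the second.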

phcoates@uqvax.decnet.uq.oz (12/31/88)

In article <9276@ihlpb.ATT.COM> nevin1@ihlpb.ATT.COM(Nevin Liber) writes:
 
|BTW, all of this is irrelevant to the original discussion.  The point
|is that, whether or not we like it, C++ allows you to define an operator
|just about any way you want.  The only useful way to use the proposed
|change to the switch statement requires that I know exactly how it gets
|rewritten as an if-then-else chain, in which case I would rather use the
|if-then-else chain, since it is more 'self-documenting'.

I like code to be consistent.  I could quite easily live with a 'switch'
statement that used the current definition of '=='.  However I understand that,
to account for those individuals who might use the above operator in some new
and unknown way (and *then* think that they can get away with using a 'switch'
with an instance of their new Mickey Mouse class as the selector or as a case
label), a 'switch' structure based on '==' could be inferior to using

			if (...) {
			} else if (...) {
			/* etc. */
			} else {
			}

If so, then why bother to keep the switch statement at all?  Must C++ be chained
to its ancestry if it is better that it fly free?  Surely if the 'if' is better
then any program would be more 'self-documenting' if all such selections were
treated using 'if' constructs.

It comes down to this - either have a 'switch' that uses '==', or get rid of
'switch' altogether.  You can't stand with one foot on either side of a
barbed-wire fence.

Tony Coates.

mat@mole-end.UUCP (Mark A Terribile) (01/02/89)

(Warning: Herein I shamelessly mix quotes from two authors.  Don't bother
trying to make me feel guilty about it; I won't.)

There's been some discussion on the switch() statement in C++.  Should we
redefine it?  Do away with it?  Pillory and scourge it?

> ... why bother to keep the switch statement at all?  Must C++ be chained to
> its ancestry if it is better that it fly free? ...

Yes, C++ is most certainly chained to its ancestry.

	``Clearly some problems could be avoided if some of the
	C heritage was rejected ....  This was not done because
	(1) there are millions of lines of C code that might
	benefit from C++, provided that a complete rewrite ...
	were unnecessary; ... ''
					--Stroustrup, *The C++ Programming
					Language*, in the ``Notes to the
					Reader''

(This is known as proof by Appeal To Authority.)  At this point, all the
rest of the discussion becomes, in one sense, irrelevant.  Or does it?  Most
of us agree that using the goto statement is usually bad practice, but C has
it and I would not suggest removing it.  Is it possible that the switch()
statement could be relegated to ``bad practice almost all the time''?

> ...  Surely if the 'if' is better then any program would be more
> 'self-documenting' if all such selections were treated using 'if' constructs.

Well, is a Ferrari better than a pickup truck?  Answer: ``Better FOR WHAT?''

The author also presumes that ``self-documenting'' and ``better'' are the
same.  Perhaps they are and perhaps they are not.  Perhaps if all other things
were equal, self-documenting would always be better than non-self documenting.
But all things being equal, all things are never equal and there may be good
reasons for choosing the second-of-N-most-self-documenting methods when that
method brings other big advantages.  Or, as Chesterton said, ``We must first
establish what we mean by the word `Good.'  If a man were to shoot his
grandmother with a rifle at a range of five hundred yards, I should call him a
good shot, but not necessarily a good man.''
 
>There are certainly places where one should use switch statements, mainly
>when you are converting from a non-object-oriented part of the system
>to the clean, elegant part.  However, I put switch statements in the same
>category with goto statements: sometimes necessary but never elegant.

This too makes some presumptions: that OOP is necessarily more elegant than
non-OOP and that more-elegant is better than not-more-elegant.  The latter
is dangerous; ``elegant'' and ``tricksy'' are too easily confused.

That non-OOP paradigms sometimes work better than OOP is pretty well
established.  Arithmetic types certainly seem to require less motion when
their implementation mixes the OOP and data abstraction paradigms.  Bjarne
has given a paper called *What Is Object Oriented Programming?* that
addresses some of this.  (Now where's my copy ...?)

>...  Case statements are bad programming style in object-oriented programs.
>Instead, you should use virtual functions.  Don't say that you want to
>distinguish the values of integers and virtual functions won't work in that
>context.  That just reveals that you are using integers where you should be
>using something else.

> There is no reason to use switch statements when you can use virtual
> functions.

False.  It is sometimes ``better'' to store an explicit type field.  If the
type field can be stored in a byte or in two or three bits, and if there are
several thousand of the objects and you are short on memory, the explicit type
field may be better.

False again in that the switch() nicely complements enumerated types.
Unless you wish to go through the rhetoric of making state constants, etc.,
into objects (and there is sometimes good reason) the switch-on-enumerated-
type offers at least the protection that the compiler can (and *should*) warn
if a case is left uncovered by a change.  Also, when implementing one object
type, it is not always terrible to leave the object paradigm internally.
Again, sometimes the rhetoric of the object paradigm outweighs the benefits
of the paradigm.
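
A minimal sketch of both points, using invented names (Color, Cell).  The
switch deliberately has no "default:" label, so that a compiler which
checks enumerated switches (for example GCC or Clang with -Wswitch) can
report an enumerator added later without a matching case.

```cpp
#include <string>

enum Color { RED, GREEN, BLUE };

// One case per enumerator and no "default:".
std::string color_name(Color c) {
    switch (c) {
    case RED:   return "red";
    case GREEN: return "green";
    case BLUE:  return "blue";
    }
    return "?";  // only reached if c holds a value outside the enumeration
}

// An explicit type field stored in two bits next to six bits of payload.
// (Hypothetical layout; the actual object size is implementation-defined.)
struct Cell {
    unsigned char color   : 2;
    unsigned char payload : 6;
};

std::string cell_color(const Cell& cell) {
    return color_name(static_cast<Color>(cell.color));
}
```

A virtual-function version of Cell would need a pointer-sized hidden field
per object on typical implementations, which is the memory cost being
weighed above.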

False again in that a system which admits to input and output must at some
point acknowledge things like characters, which are NOT objects.

> |... C++ allows you to define an operator just about any way you want.  The
> |only useful way to use the proposed change to the switch statement requires
> |that I know exactly how it gets rewritten as an if-then-else chain, in which
> |case I would rather use the ... chain ...

I'm inclined to agree, simply because the switch() is an at-most-1-of-N branch
which depends on--and guarantees--the cases being mutually exclusive.  The
if-then-else chain neither requires it nor guarantees it.  Where the switch()
can be used, it immediately tells the reader something that the if-then-else
chain only reveals upon close examination of each test.
 
> It comes down to this - either have a 'switch' that uses '==', or get rid of
> 'switch' altogether.  You can't stand with one foot on either side of a
> barbed-wire fence.

I disagree.  The switch() guarantees exclusivity and it works because of the
property of the sets (integer types, enumerated types) which are its domain:
the compiler can check exclusivity and can guarantee, based on the properties
of sets and numbers, that only one case *could* fire, no matter in what order
the test was done.
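
A small sketch of the contrast, with illustrative function names.  The
tests in an if chain may overlap, so their order decides the outcome; case
labels must be distinct constants, so their order cannot.

```cpp
#include <string>

// The second test overlaps the first, so "large" can never be returned.
// Nothing in the language requires the tests to be exclusive.
std::string classify_chain(int n) {
    if (n > 0)
        return "positive";
    else if (n > 10)
        return "large";
    else
        return "other";
}

// Each label is a distinct constant, so at most one case can match.
std::string classify_switch(int n) {
    switch (n) {
    case 1:  return "one";
    case 2:  return "two";
    // A second "case 1:" here would be rejected at compile time.
    default: return "other";
    }
}
```

classify_chain(50) returns "positive", not "large"; reordering the tests
would change that, while reordering the case labels in classify_switch
would change nothing.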
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile

geoff@tom.harvard.edu (Geoff Clemm) (01/03/89)

In article <1031@uqvax.decnet.uq.oz> phcoates@uqvax.decnet.uq.oz writes:
>In article <9276@ihlpb.ATT.COM> nevin1@ihlpb.ATT.COM(Nevin Liber) writes:
>|...  The only useful way to use the proposed
>|change to the switch statement requires that I know exactly how it gets
>|rewritten as an if-then-else chain, in which case I would rather use the
>|if-then-else chain, since it is more 'self-documenting'.
> ...
>If so, then why bother to keep the switch statement at all?  Must C++ be chained
>to its ancestry if it is better that it fly free?  Surely if the 'if' is better
>then any program would be more 'self-documenting' if all such selections were
>treated using 'if' constructs.

The switch statement is used as a statement to the compiler that the following
alternatives consist of mutually exclusive cases of a discrete type (such
as integers and characters), which therefore can be implemented via a jump
table.  One advantage of a switch statement is that a compiler can tell you
if you erroneously have two alternatives labelled with the same value.

<flame-on>
All of this silliness about "extending" the switch statement so that it becomes
identical to an if-elseif chain, and then the compounded silliness of asking
why then we have a switch statement is becoming excessive.
<flame-off>

Geoff Clemm

haahr@phoenix.Princeton.EDU (Paul Gluckauf Haahr) (01/03/89)

in article <77300018@p.cs.uiuc.edu> johnson@p.cs.uiuc.edu writes:
> This discussion about modifying the switch statement in C++ is pretty
> amusing.  Case statements are bad programming style in object-oriented
> programs.  Instead, you should use virtual functions.  Don't say that
> you want to distinguish the values of integers and virtual functions
> won't work in that context.  That just reveals that you are using
> integers where you should be using something else.

the switch statement is very useful for some programs;  it does a large
part of the job when you want to model a finite state automaton.  for
example, a switch construct is normally the best thing for implementing
a lexical analyzer.  many program generators rely on switch; classic
examples are lex and yacc.  i know of a c compiler where the code
generator is a (machine generated) 3000+ line switch statement that
does dag rewriting.  switch (and even goto) are constructs that come in
handy for programs that write programs.
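
a minimal sketch of the idea, far smaller than anything lex would emit: a hand-written three-state scanner that counts words and numbers.  the state names, the Counts struct and scan() are invented for illustration.

```cpp
#include <cctype>
#include <string>

// Hypothetical scanner states, invented for illustration.
enum State { Start, InWord, InNumber };

struct Counts { int words; int numbers; };

// One switch on the current state, executed once per input character.
// A character that ends a token is consumed as a separator.
Counts scan(const std::string& text)
{
    Counts c = {0, 0};
    State state = Start;
    for (std::string::size_type i = 0; i < text.size(); ++i) {
        unsigned char ch = static_cast<unsigned char>(text[i]);
        switch (state) {
        case Start:
            if (std::isalpha(ch))      { state = InWord;   ++c.words; }
            else if (std::isdigit(ch)) { state = InNumber; ++c.numbers; }
            break;
        case InWord:
            if (!std::isalnum(ch)) state = Start;
            break;
        case InNumber:
            if (!std::isdigit(ch)) state = Start;
            break;
        }
    }
    return c;
}
```

a generated scanner has the same shape, only with far more states and a table or switch produced by the tool rather than by hand.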

> There are certainly places where one should use switch statements, mainly
> when you are converting from a non-object-oriented part of the system
> to the clean, elegant part.  However, I put switch statements in the same
> category with goto statements: sometimes necessary but never elegant.

maybe a lexical analyzer or parser is a "non-object-oriented part of
[a] system," but that doesn't mean it should be discounted.

i put switch and goto in the same category:  the appropriate constructs
to model some actions, typically a finite state machine which is the
output of a program.

paul haahr		haahr@princeton.edu	princeton!haahr

jbn@glacier.STANFORD.EDU (John B. Nagle) (01/03/89)

     Dijkstra looked at this issue some years ago, and came up with
the notion of "guarded commands".  See his "Algorithms + Data Structures
= Programs".  The idea is interesting, but in some ways too cute.  
Nevertheless, his thinking on this subject is the clearest to date.

					John Nagle

mac3n@babbage.acc.virginia.edu (Alex Colvin) (01/03/89)

>      Dijkstra looked at this issue some years ago, and came up with
> the notion of "guarded commands".  See his "Algorithms + Data Structures
> = Programs".

Actually, A+D=P is Wirth. "Discipline of Programming" is Dijkstra's guarded
command book.

I believe the switch statement with case labels is Hoare's, as an
alternative to the Algolish label array.

Parnas combined the IF and DO guarded command sets into an IT statement that
required explicit looping or exit.

As to switch, if I'd wanted a long chain of tests, I could have written it
that way myself.

johnson@p.cs.uiuc.edu (01/05/89)

I love a good fight.  Thanks, Mark!

When I say that virtual functions are "better" than switch statements,
I mean that they result in programs that are easier to understand and
maintain.  I also argued that they were just as fast, at least with
the computers and compilers we are using.  There are certainly reasons
why one might want to use tags and switches: space efficiency (as Mark
suggested), compatibility with other programs (my "outside world"
comment).  If it turns out that a parser generator needs to generate
switch statements and gotos, I don't mind at all, unless you are going
to force me to read the program that gets produced.  

>This too makes some presumptions: that OOP is necessarily more elegant than
>non-OOP and that more-elegant is better than not-more-elegant.  The latter
>is dangerous; ``elegant'' and ``tricksy'' are too easily confused.

I disagree that elegant and tricksy are at all similar.  In my book,
elegant means "simple and easy to understand".  It does not mean fast,
small, or anything else.  My experience is that if you work long enough
you can usually get a simple solution to a problem that is also fast and
small, though there are certainly an infinite number of exceptions.

One of the "laws of computing" that I saw once was "All non-trivial
programs have bugs in them."  A corollary is "If your program has no
bugs, it is trivial."  There is certainly a lot of truth to this.  I
want to write bug-free programs, which means that they must be trivial.
If it is possible to write a bug-free operating system, elegance is
a prerequisite.

There are certainly non-OOP paradigms that work better than OOP in many
cases, but 90% of programmers never use them, either.  My experience
is that OO solutions are almost always better than traditional "structured"
methods, which is what most programmers use.  Functional and logic
programming are better solutions for many problems, but they have always
had small followings.

>Unless you wish to go through the rhetoric of making state constants, etc.,
>into objects (and there is sometimes good reason) the switch-on-enumerated-
>type offers at least the protection that the compiler can (and *should*) warn
>if a case is left uncovered by a change.

The only reason not to make state constants objects in C++ is space.
If you are only checking the state once then it is probably not worth
the trouble of making separate classes for each state.  (But then you
are probably doing something else wrong, or you are talking to the
outside world (see below) or this is a rare case.)  However, let us
suppose that we have an enumerated type with 7 cases and we check
for a tag of that type 11 times.  In my book, there should be an abstract
class that defines 11 virtual functions and 7 concrete subclasses.  If you
want to add a new case in the C-like version then you will have to go find
all 11 places you had a switch statement and fix them up.  This code might
not even belong to you (i.e. it might be object code that the vendor
provided).  However, in the OO version you just create a new subclass.
If it is a minor variation on an existing class then you can subclass one
of the concrete classes and so do little work.
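
A scaled-down sketch of that arrangement, with two virtual functions and two concrete states instead of 11 and 7.  All class and function names are invented for illustration, and the code assumes a compiler that accepts the override keyword:

```cpp
#include <string>

// Abstract class: one virtual function per place that would otherwise
// hold a switch on the state tag.
class State {
public:
    virtual ~State() {}
    virtual std::string name() const = 0;
    virtual bool accepts_input() const = 0;
};

class Idle : public State {
public:
    std::string name() const override { return "idle"; }
    bool accepts_input() const override { return true; }
};

class Busy : public State {
public:
    std::string name() const override { return "busy"; }
    bool accepts_input() const override { return false; }
};

// A minor variation: subclass a concrete class and override only what
// differs.
class BusyButInterruptible : public Busy {
public:
    bool accepts_input() const override { return true; }
};

// Client code is written once against the abstract class; a new state
// is a new subclass, and this function does not change.
std::string report(const State& s)
{
    return s.name() + (s.accepts_input() ? ": ready" : ": blocked");
}
```

BusyButInterruptible was added without touching report(), which is the point being made about not having to revisit every switch.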

One of the advantages of this style of programming is that it helps you
to discover objects.  People always ask "how do you find the objects?"
I don't know what your objects should be, but I know how to find them,
and building lots of little classes like this is one of the important
steps.  Sometimes I overdo it, in which case I will back up and merge
classes.  However, most of the time this leads to new class hierarchies.

>False again in that a system which admits to input and output must at some
>point acknowledge things like characters, which are NOT objects.

I said that in my first message.  There are lots of examples like this:
network protocols, robotics, disk drivers.  This just shows that the
world is not as object-oriented as some people claim.  I always encapsulate
the I/O in an object, hide the switch statements there, and from then on
have a consistent OO view of the world.

dw@demon.siemens.com (Dan Wolfson) (01/05/89)

A few days ago I mentioned some problems we were having with the OASYS C++
compiler on VMS.  For those interested or amused (or who perhaps even have
constructive comments) here is the result of trying to compile our bison based
parser.  This parser does compile successfully with G++ on a sun.


ccxx +s -c parse.cxx

	Glockenspiel C pre-processor ( System V compatible )
	Copyright 1986, Glockenspiel Ltd.
Warning in File "vaxc$include:stdio.hxx" at line 37 :- the identifier "TRUE" has been defined already
Warning in File "vaxc$include:stdio.hxx" at line 38 :- the identifier "FALSE" has been defined already
Warning in File "mpl.h" at line 11 :- the identifier "NULL" has been defined already
	AT&T C++ Translator, release 1.1V2
	Copyright 1984 AT&T, Inc. and 1986 Glockenspiel Ltd.
 sizes and alignments

	size	align	largest
char	1	1
short	2	2
int	4	4	2147483647
long	4	4
float	4	4
double	8	4
bptr	4	4
wptr	4	4
struct	4	4
struct2	0	1
8 bits in a byte, 32 bits in a word, 4 bytes in a word
Dynamic Memory starts at: 398K
Memory at 407K,	Total allocated is 17K,	Input buffer 23
Memory at 430K,	Total allocated is 41K,	Input buffer 99
Memory at 448K,	Total allocated is 58K,	Input buffer 401
Memory at 473K,	Total allocated is 83K,	Input buffer 401
Memory at 506K,	Total allocated is 117K,	Input buffer 464
Memory at 522K,	Total allocated is 133K,	Input buffer 464
Memory at 542K,	Total allocated is 152K,	Input buffer 464
Memory at 563K,	Total allocated is 174K,	Input buffer 511
Memory at 587K,	Total allocated is 197K,	Input buffer 1069
Memory at 604K,	Total allocated is 215K,	Input buffer 1304
Memory at 634K,	Total allocated is 244K,	Input buffer 1304
Memory at 650K,	Total allocated is 261K,	Input buffer 1304
Memory at 679K,	Total allocated is 290K,	Input buffer 1304
Memory at 703K,	Total allocated is 314K,	Input buffer 1304
Memory at 720K,	Total allocated is 331K,	Input buffer 1304
Memory at 744K,	Total allocated is 354K,	Input buffer 1304
Memory at 761K,	Total allocated is 372K,	Input buffer 1304
Memory at 787K,	Total allocated is 398K,	Input buffer 1304
Memory at 816K,	Total allocated is 427K,	Input buffer 1304
Memory at 833K,	Total allocated is 443K,	Input buffer 1304
Memory at 849K,	Total allocated is 460K,	Input buffer 1304
Memory at 870K,	Total allocated is 481K,	Input buffer 1304
Memory at 898K,	Total allocated is 509K,	Input buffer 1304
Memory at 914K,	Total allocated is 525K,	Input buffer 1304
Memory at 946K,	Total allocated is 556K,	Input buffer 1304
Memory at 970K,	Total allocated is 581K,	Input buffer 1304
Memory at 987K,	Total allocated is 598K,	Input buffer 1304
Memory at 1012K,	Total allocated is 622K,	Input buffer 1304
Memory at 1036K,	Total allocated is 647K,	Input buffer 1304
Memory at 1055K,	Total allocated is 666K,	Input buffer 1304
Memory at 1080K,	Total allocated is 691K,	Input buffer 1304
Memory at 1104K,	Total allocated is 715K,	Input buffer 1304
Memory at 1150K,	Total allocated is 761K,	Input buffer 1304
Memory at 1185K,	Total allocated is 796K,	Input buffer 1304
Memory at 1211K,	Total allocated is 822K,	Input buffer 1304
Memory at 1227K,	Total allocated is 838K,	Input buffer 1304
Memory at 1243K,	Total allocated is 854K,	Input buffer 1304
Memory at 1267K,	Total allocated is 878K,	Input buffer 1304
Memory at 1291K,	Total allocated is 902K,	Input buffer 1304
Memory at 1308K,	Total allocated is 919K,	Input buffer 1304
Memory at 1329K,	Total allocated is 940K,	Input buffer 1304
Memory at 1346K,	Total allocated is 956K,	Input buffer 1304
Memory at 1370K,	Total allocated is 980K,	Input buffer 1304
Memory at 1388K,	Total allocated is 999K,	Input buffer 1304
Memory at 1420K,	Total allocated is 1031K,	Input buffer 1304
Memory at 1436K,	Total allocated is 1047K,	Input buffer 1530
Memory at 1458K,	Total allocated is 1069K,	Input buffer 3054
Memory at 1474K,	Total allocated is 1085K,	Input buffer 4246
Memory at 1490K,	Total allocated is 1101K,	Input buffer 5467
Memory at 1506K,	Total allocated is 1117K,	Input buffer 6653
Memory at 1522K,	Total allocated is 1133K,	Input buffer 8053
Memory at 1538K,	Total allocated is 1149K,	Input buffer 9287
Memory at 1554K,	Total allocated is 1165K,	Input buffer 9287
Memory at 1570K,	Total allocated is 1181K,	Input buffer 9979
"parse.y", line 1064: internal <<cfront 05/20/86>> error: input buffer overflow


Any ideas?

Dan

mat@mole-end.UUCP (Mark A Terribile) (01/10/89)

> When I say that virtual functions are "better" than switch statements,
> I mean that they result in programs that are easier to understand and
> maintain.  ...

I agree, for those places where the paradigm is appropriate.  Bjarne writes
that if you can think of ``it'' as an idea, make it a class (a type); if you
can think of ``it'' as an entity, make it an object.

BUT ... where the ``it'' is wholly abstract, for instance a finite set of
some sort, and where the ``it'' is part of a larger abstraction (directions
of cursor movement in a terminal window package or the state of a call
appearance at a telephone set) it may make more sense to represent it as an
enumerated type.  The compiler can check that the switch statements cover
all of the cases; the grunge is limited to one class, and should you need to
pass the info around you can create a package class, all private with your
main class as a friend, the package class containing just one member of
the set-member type.
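
One possible reading of that arrangement, sketched with the cursor-movement example.  Window, Direction and their members are hypothetical names, not taken from any real package:

```cpp
class Window;   // the "main" class

// The "package" class: every member is private and Window is the only
// friend, so other code can copy and pass a Direction around but cannot
// look inside it.
class Direction {
    friend class Window;
    enum Kind { Up, Down, Left, Right };
    Kind kind;
    explicit Direction(Kind k) : kind(k) {}
};

class Window {
public:
    Window() : row_(0), col_(0) {}

    // Only Window can manufacture Direction values.
    static Direction up()    { return Direction(Direction::Up); }
    static Direction down()  { return Direction(Direction::Down); }
    static Direction left()  { return Direction(Direction::Left); }
    static Direction right() { return Direction(Direction::Right); }

    // The switch on the enumerated type stays inside this one class.
    void move(Direction d)
    {
        switch (d.kind) {
        case Direction::Up:    --row_; break;
        case Direction::Down:  ++row_; break;
        case Direction::Left:  --col_; break;
        case Direction::Right: ++col_; break;
        }
    }

    int row() const { return row_; }
    int col() const { return col_; }

private:
    int row_, col_;
};
```

A caller writes w.move(Window::down()) and never sees the enumeration, so the switch and its coverage checking remain a private matter of Window.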

If you insist on representing states as objects of a class type, then you
have an explosion of member functions: member function T::a is what S::a
has to call when *it* needs something done depending on the case, T::b1 and
T::b2 are what S::b has to call, etc.

Worse, you will either need to make the members of T friends of the appropriate
members of S, or create yet a new type for transactions between S and T .
This is a path to infinite regress; at *some* point you have to talk about
basic types, at least in C++, and enumerations on states of ONE other type
(perhaps even a base type for other classes) seems like a clear and
economical place to take that step.

In teaching myself C++, I've carried ``classiness'' a little too far on
occasion, and this was one of the paths I wandered down.

> >This too makes some presumptions: that OOP is necessarily more elegant than
> >non-OOP and that more-elegant is better than not-more-elegant.  The latter
> >is dangerous; ``elegant'' and ``tricksy'' are too easily confused.
> 
> I disagree that elegant and tricksy are at all similar.  In my book,
> elegant means "simple and easy to understand".  It does not mean fast,
> small, or anything else.  ...

In the practical world, you do have to shoot for all those goals, no?  And
one path to elegance is to take advantage of some particular data or
knowledge representation which can be interpreted in different ways depending
upon your viewpoint, producing (by coincidence) consistent results as you
move from viewpoint to viewpoint.  Such things aren't bad when they are
encapsulated and described as conversion mechanisms; they are disastrous
when they are scattered throughout code which treats the coincidence of
the viewpoints in cases A through N as a fundamental property of the problem,
breaking the code for cases N through Z.

This confusion does take place in the real world and it is sometimes
exploited to very good end.  And it has come to be one definition of
``elegant.''

I'm pretty well convinced that there is a class of problems which the OOP
paradigm does not handle well, at least as it is implemented in C++, and
that C++ allows these problems to be expressed without preventing the OOP
paradigm from being used on other problems within the same program, even
when the two must communicate and cooperate.

> If it is possible to write a bug-free operating system, elegance is
> a prerequisite.

Which definition of elegance?  I would say that clear thinking is essential.
Clear, comprehensive, and economical expression are essential.  The OOP
paradigm is certainly a powerful tool.  So is a table saw.  But not every
job is best done using a table saw.

Can you demonstrate convincingly (not necessarily to the strict standards
of proof) that in C++ the OOP paradigm is always the ``best'' one to use?

> 			       ...  In my book, there should be an abstract
> class that defines 11 virtual functions and 7 concrete subclasses.  If you
> want to add a new case in the C-like version then you will have to go find
> all 11 places you had a switch statement and fix them up.  ...

You might.  Determining which will be easier and safer to modify is often
an engineering judgement.  (At least if you use enumerated constants and
you avoid default cases in these sets, the language processor can help you
identify the places that need change.)  Engineering judgements often depend
for their soundness upon the soundness of your reading of how the program,
project, or product will be expected to grow in the future.  For the time
being, experience properly used is more valuable than any textbook education
that I know of.

> I love a good fight.  Thanks, Mark!

I'm pleased that I meet your standards for an adversary, but I'm not sure that
I see this discussion in quite that light.
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile