[comp.lang.c] Implementation-DEPENDENT code

nevin1@ihlpf.ATT.COM (00704a-Liber) (04/08/88)

The reason that I do not like the proposed change (on the net, that is) to
strcpy() is because it ENCOURAGES implementation-dependent code, which, to
put it simply, is a *bad* programming paradigm.  We should be abstracting
on at least a procedural level, if not higher.  If we can do that, then
code maintenance becomes easier, integrating new code becomes easier and
programming in general becomes easier.  Programming projects these days are
simply getting too big to be able to just 'hack' together something that
works.

One of the arguments for the change is "I have some code now which relies
on the fact that strcpy() is left-to-right."  Since this code does not
conform to C as it is defined now, I don't think ANSI should say "that's
ok.  We will change the language so that your BAD code becomes GOOD code."
(BTW, I do not want to imply that the committee IS saying this; I am just
trying to make a point.)

And if this gets 'fixed' (like a cat gets fixed--no smiley), then shouldn't
all the bad programming practices get 'fixed', too?  Take the following
code segment for example:

	#include	<stdio.h>

	int main()
	{
		int i = 1;

		while (i < 10)
		{
			int j;

			if (1 == i)
				j = 2;
			printf("%d ", j);
			j += 2;
			i++;
		}
	}

On every implementation of C that I tried this on, it printed out

	2 4 6 8 10 12 14 16 18

Now, a few questions about this program.  Assuming that this program is
designed to print out the even numbers from 2 to 18 inclusive, is this a
*correct* implementation of that algorithm, given the definition of C?
NO!!  Why not??  (Think about it before peeking ahead)

The reason is that j is declared inside a block, and this block is entered
and exited through each iteration of the while loop.  [Note:  if this
block was not entered and exited through each iteration, then changing
'int j;' to 'int j=2;' should not change my output.  But it does change it
to "2 2 2 2 2 2 2 2 2 ".  If you still don't believe it's a block, then put
an extra set of braces around everything from 'int j=2' until just before
but not including 'i++;'.  You still get the same results, and the points I
am about to make still hold.]  As K&R 1 says on page 198:  "... automatic
and register variables which are not initialized are guaranteed to start
off as garbage."  This means that every time through the loop, j should
start off as garbage (and I can construct a slightly different case that
would make j garbage on some implementations) and NOT (necessarily) contain
its previous value.

Now, because I also know a little about how C is implemented, I know why I
got even numbers instead of junk.  In a nutshell, it is because the actual
space for j is never reallocated (there is a lot more to this, but this is
not the issue).  However, there is nothing within C which guarantees that
this space is not reallocated.

SO, I repeat:  is this an example of acceptable code??  NO!!
Is this type of construct in use today?  (Unfortunately) YES!!  (and not
just in old code which nobody ever looks at, either.)  Is this bad code?
YES!  Should C be 'fixed' so that this bad code becomes *good* code?  NO!!!
Fix the bad programs; don't change the language so that these bad programs
become acceptable!
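
[A minimal sketch of one way the example above could be repaired, assuming
the goal is still the even numbers from 2 to 18.  The counter is declared
once, outside the loop body, and initialized, so nothing depends on the
storage for j surviving between iterations.  The name format_evens and the
buffer-based interface are illustrative assumptions, not part of the
original program.]

```c
#include <stdio.h>

/* Hypothetical helper: writes "2 4 6 ... 18 " into buf.
   Returns the number of characters written, excluding the NUL. */
static int format_evens(char *buf, size_t size)
{
    int j = 2;      /* declared once and initialized: no garbage value */
    int len = 0;
    int i;

    for (i = 1; i < 10; i++) {
        int n = snprintf(buf + len, size - len, "%d ", j);

        if (n < 0 || (size_t)n >= size - len)
            break;  /* buffer too small: stop rather than overrun */
        len += n;
        j += 2;
    }
    return len;
}
```

[Calling format_evens(buf, sizeof buf) on a 64-byte buffer and printing buf
should give the same line as before, but now because the language
guarantees it rather than because of where the compiler happened to put j.]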


Our programming languages should encourage us to use good programming
methods.  The proposed change to strcpy() discourages abstraction on a
procedural level, and having 'tightly coupled' code where it is not needed
is simply bad programming.  We should be trying to stay away from locking
into one specific way of implementation, since no one knows what
discoveries the future might hold.  Now is the time to fix up all the *bad*
C programs; after ANSI C is passed, we probably won't get another chance.
The proposed change to strcpy() will have far reaching consequences which
many people simply aren't thinking through.  [Sorry to have been so
philosophical about it, but isn't that what the argument is all about?]

-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

chris@mimsy.UUCP (Chris Torek) (04/08/88)

In article <4331@ihlpf.ATT.COM> nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
>The reason that I do not like the proposed change
[to the proposed dpANS to make strcpy(s,s+n) defined]
>is because it ENCOURAGES implementation-dependent code, which, to
>put it simply, is a *bad* programming paradigm.

Once again, IF THE CHANGE WERE MADE, IT WOULD NOT BE IMPLEMENTATION
DEPENDENT.

You believe that strcpy(s,s+n) is somehow inherently `bad'.  We
disagree.  But you cannot claim that, if this change were made,
such usage would be implementation defined; you can only claim that
in your opinion it would still be bad.  You can (and do) claim that
if the change is NOT made, such calls will be implementation
dependent.  You cannot make any claim beyond opinion about such
calls NOW without defining C-as-it-is.

>One of the arguments for the change is "I have some code now which relies
>on the fact that strcpy() is left-to-right."  Since this code does not
>conform to C as it is defined now, [material deleted]

Where is C defined now?  K&R?  No: their strcpy does not even have a
return value.  The implementation?  No: there are many implementations,
and some of them disagree.  Is it defined by the applications that use
it?  No, for some of them are buggy.  Perhaps the only definition is
that `C is what Dennis Ritchie says it is.'  But I doubt you will
accept that one either.  Well, we do not accept yours any more than you
accept ours.  And therein lies the war.

I agree to let you believe that strcpy(s,s+n) is `inherently bad'.  Will
you agree to let me believe that it is not so, that it is only bad if
we arbitrarily *define* it as bad?
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

barmar@think.COM (Barry Margolin) (04/08/88)

In article <4331@ihlpf.ATT.COM> nevin1@ihlpf.UUCP (00704a-Liber,N.J.) writes:
[Example of code that depends on old value of a variable from the
previous time a block was exited.]
>The reason is that j is declared inside a block, and this block is entered
>and exited through each iteration of the while loop.

I know of one implementation where that example probably won't work,
although I haven't tried it yet.  The Symbolics C compiler initializes
automatic variables (those that do not have an explicit initializer) to a
special "undefined" value, and traps attempts to take the value.

The compiler also does a little flow analysis and warns you if it
looks like your code references a variable before setting it
(unfortunately, it is a bit overeager in this regard, and warns about
some cases that are obviously valid to a human reader (note that this
comment is based on experience with the beta test version of the
compiler, and it may have been improved in the final version)).


Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar

sjs@spectral.ctt.bellcore.com (Stan Switzer) (04/11/88)

This point bears repeating:

> In article <4331@ihlpf.ATT.COM> nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
> >The reason that I do not like the proposed change
> [to the proposed dpANS to make strcpy(s,s+n) defined]
> >is because it ENCOURAGES implementation-dependent code, which, to
> >put it simply, is a *bad* programming paradigm.
  
In article <10969@mimsy.UUCP> chris@mimsy.UUCP writes:
> Once again, IF THE CHANGE WERE MADE, IT WOULD NOT BE IMPLEMENTATION
> DEPENDENT.

Thanks, Chris, for a few sane words.

I think a few observations can be made about this "tempest-in-a-teapot"

  1) Some (most, possibly all) implementations DO implement strcpy
     left-to-right.   (prior-art)
  2) It is (occasionally) a useful thing to guarantee.  (useful feature)
  3) There is a body of code which uses this feature.  (do not break
     existing code)
  4) There has been NO example presented where any other order of copying
     would yield faster code.  (efficiency, spirit-of-C)

Perhaps the Committee has forgotten its charter?

Stan Switzer

scjones@sdrc.UUCP (Larry Jones) (04/14/88)

In article <6670@bellcore.bellcore.com>, sjs@spectral.ctt.bellcore.com (Stan Switzer) writes:
> I think a few observations can be made about this "tempest-in-a-teapot"
> 
>   1) Some (most, possibly all) implementations DO implement strcpy
>      left-to-right.   (prior-art)

Nope.  Practically all VAX implementations use the VAX MOVC3 instruction which
moves EITHER left-to-right OR right-to-left, whichever is needed to get the
"right" answer (an intact copy).

> Perhaps the Committee has forgotten its charter?

No, not so far as I know - it seems to be brought up in committee meetings
almost as often as it's brought up on the net.

----
Larry Jones                         UUCP: uunet!sdrc!scjones
SDRC                                MAIL: 2000 Eastman Dr., Milford, OH  45150
                                    AT&T: (513) 576-2070
"When all else fails, read the directions."

sjs@spectral.ctt.bellcore.com (Stan Switzer) (04/15/88)

I wrote:
> > I think a few observations can be made about this "tempest-in-a-teapot"
> > 
> >   1) Some (most, possibly all) implementations DO implement strcpy
> >      left-to-right.   (prior-art)

In article <258@sdrc.UUCP> scjones@sdrc.UUCP (Larry Jones) writes:
> Nope.  Practically all VAX implementations use the VAX MOVC3 instruction which
> moves EITHER left-to-right OR right-to-left, whichever is needed to get the
> "right" answer (an intact copy).

You are, of course, correct.  I noticed this after I sent it.  What I meant
is that all known implementations implement strcpy(s,d) s.t. when s and d
overlap and s < d, the copy is non-destructive.  This is the essential point.

My point still stands that
  1) in all known implementations it "does the right thing"
  2) changing it can break existing working code (granted that it is based
     on a tacit though universally true assumption).
  3) there is no known reason why on any implementation it could be
     inefficient to require this semantics (and considerable reason to
     doubt that it ever could be more inefficient, given that you would have
     to count 'd' first anyway).

So, where is the justification for not requiring this overlapped copy
to work?

Stan Switzer

jimp@cognos.uucp (Jim Patterson) (05/05/88)

In article <6743@bellcore.bellcore.com> sjs@spectral.UUCP (Stan Switzer) writes: 
>... all known implementations implement strcpy(s,d) s.t. when s and d 
>overlap and s < d, the copy is non-destructive. This is the essential point.  
>My point still stands that 
> 1) in all known implementations it "does the right thing"

The SUN 3 implementation doesn't, or at least not by your definition. I
suspect other implementations on architectures that don't have complex
instructions like MOVC3 won't implement strcpy non-destructively
either. Generally it will be implemented "left-to-right" which of
course is destructive in some instances.
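
[To make "destructive in some instances" concrete, here is a small sketch
of a strictly left-to-right copy.  The helper copy_forward is hypothetical
and takes an explicit byte count so that the overlapping case cannot run
off the end of the buffer; it is not any vendor's strcpy.]

```c
#include <stddef.h>

/* Hypothetical helper: copies n bytes strictly left to right. */
static void copy_forward(char *dst, const char *src, size_t n)
{
    size_t i;

    for (i = 0; i < n; i++)
        dst[i] = src[i];    /* may read a byte this loop already overwrote */
}
```

[Starting from "abcdef", copy_forward(buf, buf + 2, 4) leaves "cdefef": the
destination is below the source, so every byte is read before it is
overwritten.  copy_forward(buf + 2, buf, 4) on a fresh "abcdef" leaves
"ababab", where an intact copy would have given "ababcd", because the
source bytes 'c' and 'd' are clobbered before they are read.]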

>3) there is no known reason why on any implementation it could be 
>inefficient to require this semantics (and considerable reason to 
>doubt that it ever could be more inefficient, given that you would have 
> to count 'd' first anyway).

You DON'T have to count 'd'.  strcpy can just copy until it encounters
the NUL.  Following is a portable (but potentially destructive) strcpy
implementation.  It doesn't know (and doesn't care) what the length of
the string is. Sun's implementation is essentially this and does the
copy via a three instruction loop (it appears that two instructions
would do on the 68000).

/* Copy b to a, left to right, stopping once the NUL has been copied. */
strcpy(a,b) char *a, *b; {
   while (*a++ = *b++)
     ;
}

Implementing strcpy as you suggest would burden all of its users with
the overhead of passing over the source string twice. Except for those
few applications which depend on a non-destructive implementation, one
of these passes is wasted cycles.

The ANSI committee encountered an analogous situation with memcpy
which also is non-destructive on typical VAX implementations but
destructive on many other implementations. It was resolved by adding
another function, memmove, which is guaranteed non-destructive but
will likely be less efficient. memcpy can often be effectively
implemented as inline code (which DG has done in their compiler), but
the VAX is the only architecture I'm aware of where it would be
reasonable to implement memmove inline (DEC didn't bother to, however).
Even systems with microcoded block move instructions don't usually
implement them non-destructively as VAX has (look at IBM 370 or DG MVs
as examples), so you still need to at least compare the source and
target addresses when implementing memmove. The necessarily less
efficient implementation possibilities likely had a bearing on the
ANSI decision in this case.
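
[The address comparison described above can be sketched as follows.  This
is an illustration under stated assumptions, not any library's actual
memmove: the name memmove_sketch is hypothetical, and the pointers are
compared through uintptr_t, which assumes a flat address space.]

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical overlap-safe copy: picks the direction from the addresses. */
static void *memmove_sketch(void *dst, const void *src, size_t n)
{
    unsigned char *d = dst;
    const unsigned char *s = src;

    if ((uintptr_t)d <= (uintptr_t)s) {
        /* Destination at or below source: left to right is safe. */
        while (n--)
            *d++ = *s++;
    } else {
        /* Destination above source: copy right to left instead. */
        d += n;
        s += n;
        while (n--)
            *--d = *--s;
    }
    return dst;
}
```

[The extra cost over a plain forward copy is one comparison and a second
loop body, which is the overhead that a separate memcpy lets callers
avoid when they know the regions do not overlap.]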

Perhaps your problem could be resolved by also adding strmove (and of
course strnmove). Note that the additional cost of memmove over memcpy
is quite a bit less than that of the hypothetical strmove over strcpy,
since strmove must in typically half of the cases compute the length
of its source first.
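
[A rough sketch of what such a strmove might look like.  No function of
this name exists in the draft standard; strmove_sketch is a made-up name,
and comparing the pointers through uintptr_t assumes a flat address space.
It only measures the source when the destination lies above it.]

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical strmove: overlap-safe string copy, not a standard function. */
static char *strmove_sketch(char *dst, const char *src)
{
    char *d = dst;

    if ((uintptr_t)dst <= (uintptr_t)src) {
        /* Forward copy is safe here, so the length is never computed. */
        while ((*d++ = *src++) != '\0')
            ;
    } else {
        /* Destination above source: count first, then copy backward
           via memmove, which handles the overlap. */
        memmove(dst, src, strlen(src) + 1);
    }
    return dst;
}
```

[With char left[] = "hello world", strmove_sketch(left, left + 6) leaves
"world" without ever calling strlen.  With char right[8] = "abc",
strmove_sketch(right + 1, right) leaves "aabc", and that is the case that
pays for the second pass over the source.]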
-- 
Jim Patterson                              Cognos Incorporated
UUCP:decvax!utzoo!dciem!nrcaer!cognos!jimp P.O. BOX 9707    
PHONE:(613)738-1440                        3755 Riverside Drive
                                           Ottawa, Ont  K1G 3Z4