[comp.sys.amiga.programmer] ANOTHER SAS C BUG

dave@csis.dit.csiro.au (David Campbell) (03/31/91)

/*  Here is some source which demonstrates a bug in the SAS C
    compiler */



#include <stdio.h>

char *str="hello\n";

main()
{
	char *ptr;

	ptr = str;
	*ptr = '\001' + *ptr++;
	printf(str);
}


/* Should print "iello" through the incrementing of str[0]

   Instead prints "hillo"

EXTERNAL DEFINITIONS

_main 0000-00    _str 0008-01

SECTION 00 "latbug.c" 00000028 BYTES
       | 0000  4E55 FFF8                      LINK      A5,#FFF8
       | 0004  BFEC  0000-XX.2                CMPA.L    __base(A4),A7
       | 0008  6500  0000-XX.1                BCS.W     __xcovf
       | 000C  2F0B                           MOVE.L    A3,-(A7)
       | 000E  266C  0008-01.2                MOVEA.L   01.00000008(A4),A3
       | 0012  101B                           MOVE.B    (A3)+,D0
       | 0014  5200                           ADDQ.B    #1,D0
       | 0016  1680                           MOVE.B    D0,(A3)
       | 0018  2F2C  0008-01.2                MOVE.L    01.00000008(A4),-(A7)
       | 001C  4EBA  0000-XX.1                JSR       _printf(PC)
       | 0020  266D FFF4                      MOVEA.L   FFF4(A5),A3
       | 0024  4E5D                           UNLK      A5
       | 0026  4E75                           RTS

SECTION 01 "__MERGED" 0000000C BYTES
0000 68 65 6C 6C 6F 0A 00 00 hello...
0008 00000000-01 01.00000000

*/

-- 
dave campbell

mike_s@EBay.Sun.COM (Mike "The Claw" Sullivan) (03/31/91)

In <1991Mar31.035009.13183@csis.dit.csiro.au> dave@csis.dit.csiro.au (David Campbell) writes:

>/*  Here is some source which demonstrates a bug in the SAS C
>    compiler */

>	*ptr = '\001' + *ptr++;

No, the problem is that the evaluation order of the right and left sides
of an assignment expression is unspecified. You are assuming that it
determines the storage location on the left (*ptr), then computes the
right hand side (and advances ptr). That does not happen to be what SAS C
is doing.

You should avoid side-effects like ++ in assignment expressions if you
use the operand on the other side.

	Mike
--
Mike Sullivan                     Internet: msullivan@EBay.Sun.COM
Sun Education                     UUCP:     ..!sun!yavin!msullivan
Software Course Developer         Compuserve: 75365,764
"The old Maxwell Smart silhouette on the window shade trick. That's the

jbickers@templar.actrix.gen.nz (John Bickers) (04/01/91)

Quoted from <1991Mar31.035009.13183@csis.dit.csiro.au> by dave@csis.dit.csiro.au (David Campbell):

> 	*ptr = '\001' + *ptr++;

    Are you sure this is a bug? It looks like one of those compiler
    dependent things to do with sequence points, or whatever. You could
    check with the folks in comp.lang.c, who seem to have copies of
    the ANSI specs on hand and who know about these things.

    Easy enough to change to "(*p++)++;" anyhow, isn't it?

> dave campbell
--
*** John Bickers, TAP, NZAmigaUG.        jbickers@templar.actrix.gen.nz ***
***         "Patterns multiplying, re-direct our view" - Devo.          ***

johnhlee@CS.Cornell.EDU (John H. Lee) (04/01/91)

In article <1991Mar31.035009.13183@csis.dit.csiro.au> dave@csis.dit.csiro.au (David Campbell) writes:
>/*  Here is some source which demonstrates a bug in the SAS C
>    compiler */
>
>#include <stdio.h>
>
>char *str="hello\n";
>
>main()
>{
>	char *ptr;
>
>	ptr = str;
>	*ptr = '\001' + *ptr++;
>	printf(str);
>}
>
>/* Should print "iello" through the incrementing of str[0]
>
>   Instead prints "hillo"
[OMD dump deleted]

Is this really a bug?  I believe this is a classic non-portable example as
far as K&R C goes (ANSI C may say otherwise--I'm not sure.)  The problem is
that we don't know which gets evaluated first:  the lhs or rhs of the
expression.  K&R C leaves this up to the compiler, and it is evaluating
the rhs first so that *ptr on the lhs ends up pointing to str[1].

By the way, your first statement should read:

	char str[]="hello\n";

so that you don't accidentally change program constants (but the amount of
memory allocated is the same.)  A rumored VAX side-effect allowed a program
to change the value of the constant 0 this way.

-------------------------------------------------------------------------------
The DiskDoctor threatens the crew!  Next time on AmigaDos: The Next Generation.
	John Lee		Internet: johnhlee@cs.cornell.edu
The above opinions of those of the user, and not of this machine.

jseymour@medar.com (James Seymour) (04/02/91)

In article <1991Apr1.050544.13878@cs.cornell.edu> johnhlee@cs.cornell.edu (John H. Lee) writes:
>
> [stuff deleted]
>By the way, your first statement should read:
>
>	char str[]="hello\n";
>                                                  ... (but the amount of
>memory allocated is the same.)

John, you are correct in that what the original poster was complaining about
is, indeed, a portability "bug".  You are incorrect regarding your comment
on how the string should be declared however (does this belong in the C
newsgroup?).

char *str = "hello";		/* declare a pointer to a char array,
				   initialize pointer to a string */

char str[] = "hello";		/* declare a character array, initialized
				   in length and contents */

BIG difference.  Especially if a function in another function is told
that str *contains* a pointer to char, when in reality it *is* the start
of the character array itself. [I've bitten myself with that one more times
than I'd care to count :-)].  Lastly, the amount of memory consumed is
not the same for each.  The first declaration takes up <pointer_sized> more
memory than the second (although generally this is an insignificant
point).  Which way it should be declared depends on what the program[mer]
needs.

-- 
Jim Seymour				| Medar, Inc.
...!uunet!medar!jseymour		| 38700 Grand River Ave.
jseymour@medar.com			| Farmington Hills, MI. 48331
CIS: 72730,1166  GEnie: jseymour	| FAX: (313)477-8897

ewing@tortuga.SanDiego.NCR.COM (David Ewing) (04/02/91)

In article <1991Mar31.035009.13183@csis.dit.csiro.au> dave@csis.dit.csiro.au (David Campbell) writes:
>/*  Here is some source which demonstrates a bug in the SAS C compiler */
>
>#include <stdio.h>
>
>char *str="hello\n";
>
>main() {
>	char *ptr;
>	ptr = str;
>	*ptr = '\001' + *ptr++;
>	printf(str);
>}
>/* Should print "iello" through the incrementing of str[0]
>   Instead prints "hillo"

This is not a bug in SAS C, as a careful reading of the ANSI standard
will confirm.  The behavior of your code is clearly undefined.

From the standard :

" 2.1.2.3 Program Execution 
...
Evaluation of an expression may produce side effects.  At certain specified
points in the execution sequence called sequence points, all side effects
of previous evaluations shall be complete and no side effects of subsequent
evaluations shall have taken place...

3.3 Expressions
...
Between the previous and next sequence point an object shall have its stored
value modified at most once by the evaluation of an expression.  Furthermore,
the prior value shall be accessed only to determine the value to be stored.
(footnote 34) ... the order of evaluation of subexpressions and the order in 
which side effects take place are both unspecified....

footnote 34. This paragraph renders undefined statement expressions such as 
   i = ++i + 1; ..."

Also see appendix B on sequence points.

This is nothing new to ANSI C.  Kernighan and Ritchie warn :

   "When side effects (assignment to actual variables) takes place is
    left to the discretion of the compiler, since the best order
    strongly depends on machine architecture."

Furthermore, as another astute reader has already commented,
modifying char *str="hello\n"; is also non-portable and undefined.
This should be char str[] = "hello\n";   -->

"3.5.7 Initialization
...
   char s[] = "abc" ...
The contents of the array are modifiable. On the other hand, the 
declaration
   char *p = "abc";
defines p with type 'pointer to char' that is initialized to point
to an object with type 'array of char' with length 4 whose elements 
are initialized with a character string literal.  If an attempt 
is made to use p to modify the contents of the array, the behavior
is undefined."

Please take the time to THOROUGHLY understand C before you disparage
a compiler.

*********************************************
David A. Ewing
ewing@tortuga.sandiego.ncr.com
*********************************************

p554mve@mpirbn.mpifr-bonn.mpg.de (Michael van Elst) (04/02/91)

In article <1991Mar31.035009.13183@csis.dit.csiro.au> dave@csis.dit.csiro.au (David Campbell) writes:
>/*  Here is some source which demonstrates a bug in the SAS C
>    compiler */
>	*ptr = '\001' + *ptr++;
>
>/* Should print "iello" through the incrementing of str[0]


If you look at the K&R book you can find that this isn't a bug
but that C doesn't define the evaluation sequence of a binary
operator except for &&, ||, ?: and the comma operator.

Regards,
-- 
Michael van Elst
UUCP:     universe!local-cluster!milky-way!sol!earth!uunet!unido!mpirbn!p554mve
Internet: p554mve@mpirbn.mpifr-bonn.mpg.de
                                "A potential Snark may lurk in every tree."

johnhlee@CS.Cornell.EDU (John H. Lee) (04/03/91)

In article <98@hdwr1.medar.com> jseymour@medar.com (James Seymour) writes:
>John, you are correct in that what the original poster was complaining about
>is, indeed, a portability "bug".  You are incorrect regarding your comment
>on how the string should be declared however (does this belong in the C
>newsgroup?).
>
>char *str = "hello";		/* declare a pointer to a char array,
>				   initialize pointer to a string */
>
>char str[] = "hello";		/* declare a character array, initialized
>				   in length and contents */
>
>BIG difference.  Especially if a function in another function is told
>that str *contains* a pointer to char, when in reality it *is* the start
>of the character array itself. [I've bitten myself with that one more times
>than I'd care to count :-)].  Lastly, the amount of memory consumed is
>not the same for each.  The first declaration takes up <pointer_sized> more
>memory than the second (although generally this is an insignificant
>point).  Which way it should be declared depends on what the program[mer]
>needs.

Yikes!  Let me try again...

I meant to say that the declaration should have been something like this:

	char buf[] = "hello";
	char *str = buf;

My point is that the declaration

	char *str = "hello";

declares a pointer to a string and sets its initial value to point to a string
in memory.  The string might be considered a "reusable constant" by the
compiler and a use of the string elsewhere in the module like this:

	printf("hello");

could end up using that very same string.  Thus if you execute this:

	*str = 'i';
	printf("hello");

you'll get "iello" printed out!  This is by no means a certainty;  page 104
of the second edition of K&R C says that the effect is undefined.  By
declaring a character array instead of a pointer, you are guaranteed to
receive a private copy.

As for my comment about the memory allocation being the same, I was off-base
in the stated context, but should be right in the corrected context.

-------------------------------------------------------------------------------
The DiskDoctor threatens the crew!  Next time on AmigaDos: The Next Generation.
	John Lee		Internet: johnhlee@cs.cornell.edu
The above opinions of those of the user, and not of this machine.

cpca@marlin.jcu.edu.au (Colin Adams) (04/04/91)

In article <1991Apr2.194634.27075@cs.cornell.edu> johnhlee@cs.cornell.edu (John H. Lee) writes:
>In article <98@hdwr1.medar.com> jseymour@medar.com (James Seymour) writes:
>>John, you are correct in that what the original poster was complaining about
>>is, indeed, a portability "bug".  You are incorrect regarding your comment
>>on how the string should be declared however (does this belong in the C
>>newsgroup?).
>>
>>char *str = "hello";		/* declare a pointer to a char array,
>>				   initialize pointer to a string */
>>
>>char str[] = "hello";		/* declare a character array, initialized
>>				   in length and contents */
>>
>>BIG difference.  Especially if a function in another function is told
>>that str *contains* a pointer to char, when in reality it *is* the start
>>of the character array itself. [I've bitten myself with that one more times
>>than I'd care to count :-)].  Lastly, the amount of memory consumed is
>>not the same for each.  The first declaration takes up <pointer_sized> more
>>memory than the second (although generally this is an insignificant
>>point).  Which way it should be declared depends on what the program[mer]
>>needs.
>

Here's a real SAS C bug:

void main()
{
	char *p;

	p = "Well lets's create a really big string here, even though"
	"SAS C only supports strings which are 256 bytes long, but"
	" with this neat new string concatenation feature you can put"
	" strings on several lines.  Unfortunately SAS C doesn't check"
	" if you go over the end of the buffer like this string will,"
	" which will cause the compiler to trash memory and "
	"crash somewhere.  Sometimes this stuffs up the lexical "
	"analysis causing 'Guru trapped' requesters! "
	"Well it was just great fun finding this. ";
}

I can't try this from this DECstation, so this exact string mightn't
crash SAS, but you'll get the general idea.  Yes, I do intend to mail
them about it...



-- 
Colin Adams                                  
Computer Science Department                     James Cook University 
Internet : cpca@marlin.jcu.edu.au               North Queensland
'And on the eight day, God created Manchester'

dave@unislc.uucp (Dave Martin) (04/05/91)

From article <1991Mar31.035009.13183@csis.dit.csiro.au>, by dave@csis.dit.csiro.au (David Campbell):
> /*  Here is some source which demonstrates a bug in the SAS C
>     compiler */
> 
> #include <stdio.h>
> 
> char *str="hello\n";
> 
> main()
> {
> 	char *ptr;
> 
> 	ptr = str;
> 	*ptr = '\001' + *ptr++;
> 	printf(str);
> }
> 
> 
> /* Should print "iello" through the incrementing of str[0]
> 
>    Instead prints "hillo"

Why "should" it?

I don't know what K&R say (probably nothing about this) but
ANSI C Draft Dec 7 1988  (I wish I had a more recent version) says

3.3 Expressions
...
Between the previous and next sequence point an object shall have its
stored value modified at most once by the evaluation of an expression.
Furthermore, the prior value shall be accessed only to determine the value
to be stored. (see footnote 31)

footnote 31 says:  This paragraph renders undefined statement expressions
such as

  i = ++i + 1;

while allowing

  i = i + 1;

> 	*ptr = '\001' + *ptr++;

You have asked the compiler to fetch *ptr which is 'h', increment ptr
(it now points to 'e') add '\001' (or 1) to 'h' and store the value
through ptr (which has since been incremented).

This statement isn't guaranteed to work "right" either:

  *(ptr++) = '\001' + *ptr;

You can change this to

  *ptr = '\001' + *ptr;
  ptr++;

The safest thing to do is increment the pointer in a separate statement.

Also, in the Rationale document section 2.1.2.3 program execution:

Because C expressions can contain side effects, issues of sequencing are
important in expression evaluation. (see 3.3)  Most operators IMPOSE NO
SEQUENCING REQUIREMENTS, [emphasis added by me] but a few operators
impose sequence points upon the evaluation:  comma, logical-AND, logical-OR
and conditional.  For example, in the expression (i = 1, a[i] = 0) the
side effect (alteration to storage) specified by i = 1 must be completed
before the expression a[i] = 0 is evaluated.

Note that = (assignment) is NOT one of the listed operators.  The compiler
can perform the ++ in *ptr++ before OR after the assignment.  You are
assuming that it will be completed after the assignment, while the compiler
chooses to complete it right after the value 'h' is retrieved, before the
store.

The compiler (and SAS/C claims to be ANSI compliant) can legally do this
under ANSI-C.  K&R probably said nothing about this and the compiler
can do what it wants.

(sidenote: SAS/C still needs to implement long double and I hope they do it
(at least under the -f8 (use FPU) option) as FPU type X (extended).  This
 is the most efficient mode for the FPU anyway as it does all its math in X
 format.)


If someone can confirm (or "fix") my interpretation of the ANSI standard
here, I would appreciate it.  (this is why we have lawyers... 8-)).
-- 
VAX Headroom	Speaking for myself only... blah blah blahblah blah...
Internet: DMARTIN@CC.WEBER.EDU                 dave@saltlcy-unisys.army.mil
uucp:     dave@unislc.uucp or use the Path: line.
Now was that civilized?  No, clearly not.  Fun, but in no sense civilized.