[comp.lang.c] C Compiler bugs

lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (06/05/88)

Following up from comp.unix.wizards are amusing C Compiler bugs:

Then there was the bug where if you had a structure declaration right
before main and forget to end it with a ; the program would core dump
on exit:

	struct blob {
		int a, b, c;
	} /* missing ; */

	main(argc, argv) ...

This is also on old 3B compilers, fixed on newer ones.

-- 
Larry Cipriani, AT&T Network Systems and Ohio State University
Domain: lvc@tut.cis.ohio-state.edu
Path: ...!cbosgd!osu-cis!tut.cis.ohio-state.edu!lvc (strange but true)

karl@haddock.ISC.COM (Karl Heuer) (06/06/88)

In article <15085@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
>Following up from comp.unix.wizards are amusing C Compiler bugs:
>Then there was the bug where if you had a structure declaration right before
>main and forget to end it with a ; the program would core dump on exit:
>	struct blob { int a, b, c; } /* missing ; */
>	main(argc, argv) ...

Why should it be considered a "compiler bug" when a syntactically correct
program containing a user bug dumps core?  It seems to me that the appropriate
"fix" is to make sure that lint complains about the mismatched declaration.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
Followups to comp.lang.c.

lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (06/07/88)

In article <4421@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <15085@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
	...
**	struct blob { int a, b, c; } /* missing ; */
**	main(argc, argv) ...
*Why should it be considered a "compiler bug" when a syntactically correct
*program containing a user bug dumps core?  It seems to me that the appropriate
*"fix" is to make sure that lint complains about the mismatched declaration.
*Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

It certainly would be appropriate for lint, but don't you think this is
a stupid thing for a compiler to allow?  Maybe not, but I do.  I think
compilers should check semantic correctness when possible as well as for
syntactic correctness.  At least a warning message would be useful.  Also,
not every implementation of C is accompanied by lint.


karl@haddock.ISC.COM (Karl Heuer) (06/07/88)

In article <15202@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
>In article <4421@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>>In article <15085@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
>	...
>>>	struct blob { int a, b, c; } /* missing ; */
>>>	main(argc, argv) ...
>>Why should it be considered a "compiler bug" when a syntactically correct
>>program containing a user bug dumps core?  [Fix it in lint]
>
>It certainly would be appropriate for lint, but don't you think this is
>a stupid thing for a compiler to allow?

Yes, but the general problem of mismatched declarations requires cross-file
checking, which has traditionally been in the jurisdiction of lint.  At least
until type-checking loaders become popular.  Misdeclared main() is just one
instance of the problem.

>... also not every implementation of C is accompanied by lint.

I've said it before (usually in Pascal-vs-C discussions): a C compiler
consists of two parts, traditionally called cc and lint.  A vendor who doesn't
supply a lint equivalent is only selling half a C compiler.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/07/88)

In article <15202@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
>It certainly would be appropriate for lint, but don't you think this is
>a stupid thing for a compiler to allow?  Maybe not, but I do.

I seem to recall you said that this so-called bug had been "fixed".
Was it fixed by building in knowledge that main() returns an int
(in which case how do you disable that "knowledge" for compiling
freestanding applications?), or was it "fixed" by making it
impossible for a function to return a structure (which C allows),
or was it "fixed" by noticing the mismatch between the defined
return type and the value actually returned by "return" statements?

lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (06/08/88)

In article <8036@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
>A freestanding application has every right to define a main() function
>having any valid interface.
Agreed.

>How does one get the compiler to compiler (sic) this code?
As far as the compiler is concerned main is just another subroutine.
A special one, but just another subroutine.  My view of C compilers is
that the name main is not special to the compiler, but when you want to
take .o files and build an executable image there had better be one main
routine somewhere in those files for the startup routine to call.
The compiler will automagically add a call to the startup routine to
your program.  I know, I know, so I've only used UNIX C compilers.

>crt0.o has nothing to do with this.
Why not?  Freestanding applications, e.g. UNIX, have their own magical
way of starting up.  Imagine the chaos that would result if UNIX were
compiled with /lib/crt0.o, yow!  I think the crt0.o file (I spoke of
previously) was recoded to always expect an int from main even
if it returned something else (our *crt.o files handle startup
and termination).  If the application really needs to return something
else, then it isn't hosted and shouldn't use the "normal" crt0.o files.

I guess I don't see what point I am missing.

greggy@infmx.UUCP (greg yachuk) (06/09/88)

In article <15275@tut.cis.ohio-state.edu>, lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
> In article <8030@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
[general discussion about main returning something other than an int,
 whether it should be considered an "error" by the compiler...]

> >I seem to recall you said that this so-called bug had been "fixed".
> >Was it fixed by building in knowledge that main() returns an int
> >(in which case how do you disable that "knowledge" for compiling
> >freestanding applications?),
> I think it was done this way, and the way it is disabled for free-
> standing applications is to not use the supplied /lib/*crt0.o files.

How can you disable a part of the compiler by determining which /lib/*crt0.o
file will be used when the whole muck is finally linked together?  Do you
have a code emitting linker, or am I missing something very basic?

Greg Yachuk		Informix Software Inc., Menlo Park, CA	(415) 322-4100
{uunet,pyramid}!infmx!greggy	 !yes, I chose that login myself, wazit tooya?

And they offered us a roof above our heads
And like fools we believe every last word they said.	-- The Christians

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/09/88)

In article <15298@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
>As far as the compiler is concerned main is just another subroutine.

Ok, then it cannot complain if main is (inadvertently) declared as
returning a structure.

>I think the crt0.o file ... was recoded to always expect an int from
>main even if it returned something else ...

Yes, AT&T crt0.o has been this way for many years and it is necessary
for the ANSI C hosted environment.

>I guess I don't see what point I am missing.

I thought you had said that the compiler had been "fixed" so that
it did not allow the (inadvertent) declaration of main as returning
a structure that led to the run-time core dump previously reported.
From what you've been saying, it appears that that user error could
still lead to exactly the same symptom.  I don't think this needs
"fixing", by the way, but the right way to do it would be to have
the compiler know whether the compilation was for a freestanding or
hosted environment (e.g. "scc" vs. "cc" in the AT&T UNIX world) and
in the latter case have it "know" the only two valid interfaces for
the definition of the main() function.  (Too bad X3J11 decided to
allow two incompatible ones; if there were only one the compiler
could simply predeclare main().)

owen@wrs.UUCP (Owen DeLong) (06/09/88)

In article <4421@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article <15085@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
>>Following up from comp.unix.wizards are amusing C Compiler bugs:
>>Then there was the bug where if you had a structure declaration right before
>>main and forget to end it with a ; the program would core dump on exit:
>>	struct blob { int a, b, c; } /* missing ; */
>>	main(argc, argv) ...
>
>Why should it be considered a "compiler bug" when a syntactically correct
>program containing a user bug dumps core?  It seems to me that the appropriate
>"fix" is to make sure that lint complains about the mismatched declaration.
>
>Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
>Followups to comp.lang.c.

Tell me, Karl, where do you see the bug causing the program to dump core?

I get the impression that the bug is in the compiler, and the compiler which
doesn't need a ; (noted as missing) dumped core upon trying to return from
function main.  I see this as definitely being a compiler bug, particularly
if you consider the code to be correct.  It is conceivable to call the code
incorrect (syntax error due to missing semicolon), but I would say that the
compiler should actually accept the closing brace on a compound statement as
an implied ; afterwards.  If I'm wrong, flame me...I'll learn that way.  If
I'm not, we've all learned something.  I would like to see the rest of the
program which you must have seen to say it was a user bug.

Owen

lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) (06/09/88)

In article <8045@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:

>>As far as the compiler is concerned main is just another subroutine.
>Ok, then it cannot complain if main is (inadvertently) declared as
>returning a structure.

Sure it can, as long as it's only a *warning* and not an *error*.

>I thought you had said that the compiler had been "fixed" so that
>it did not allow the (inadvertent) declaration of main as returning
>a structure that led to the run-time core dump previously reported.

No, the "fixed" compiler would still compile the code, but the
program wouldn't core dump on exit.  I still don't know how this
was accomplished ...


gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/09/88)

In article <15367@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
>The *fixed* crt0.o
>must have been recoded to recognize arbitrary return values.

That would be a pretty neat trick!

Some linkers are able to detect interface type mismatches and
issue a warning, which would catch the kind of error under discussion,
but I haven't yet seen an AT&T CCS (aka SGS) linker that does that.

Perhaps the "fix" was simply that the error did not have catastrophic
results when the way struct values are returned was changed, or
something like that.

mat@emcard.UUCP (Mat Waites) (06/10/88)

In article <532@wrs.UUCP> owen@wrs.UUCP (Owen DeLong) writes:
]In article <4421@haddock.ISC.COM> karl@haddock.ima.isc.com (Karl Heuer) writes:
]>In article <15085@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
]>>Following up from comp.unix.wizards are amusing C Compiler bugs:
]>>Then there was the bug where if you had a structure declaration right before
]>>main and forget to end it with a ; the program would core dump on exit:
]>>	struct blob { int a, b, c; } /* missing ; */
]>>	main(argc, argv) ...
]>
]>Why should it be considered a "compiler bug" when a syntactically correct
]>program containing a user bug dumps core?  It seems to me that the appropriate
]>"fix" is to make sure that lint complains about the mismatched declaration.
]>
]>Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
]>Followups to comp.lang.c.
]
]Tell me, Karl, where do you see the bug causing the program to dump core?
]
]I get the impression that the bug is in the compiler, and the compiler which
]doesn't need a ; (noted as missing) dumped core upon trying to return from
]function main. 

[ I see a clue here! ]

]I see this as definitely being a compiler bug, particularly
]if you consider the code to be correct.  It is conceivable to call the code
]incorrect (syntax error due to missing semicolon), but I would say that the
]compiler should actually accept the closing brace on a compound statement as
]an implied ; afterwards.  If I'm wrong, flame me...I'll learn that way.  If
]I'm not, we've all learned something.  I would like to see the rest of the
]program which you must have seen to say it was a user bug.
]
]Owen


The program is syntactically correct. It declares a function named main
which returns a struct blob. Unfortunately the run-time system is not familiar
with blobs and will probably end up trashing the stack on exit.

If you'd written:

struct blob { int a, b, c; } /* missing ; */
bozo;

you'd have a struct blob called bozo. Why shouldn't main act the same way?
"main" is not a reserved word in general, although some compilers treat
it differently than other function names.

I think "the walking lint" is probably right about lint recognizing
main's appropriate return value type (although it seems to be
somewhat implementation dependent).


Mat

-- 
  W Mat Waites                     |  PHONE:  (404) 727-7197
  Emory Univ Cardiac Data Bank     |  UUCP:   ...!gatech!emcard!mat
  Atlanta, GA 30322                |

karl@haddock.ISC.COM (Karl Heuer) (06/10/88)

(Since the question was directed toward me, I'll take the liberty of replying
in public and hope that nobody else does so.)

In article <532@wrs.UUCP> owen@wrs.UUCP (Owen DeLong) writes:
>Tell me, Karl, where do you see the bug causing the program to dump core?
>
>I get the impression that the bug is in the compiler, and the compiler which
>doesn't need a ; (noted as missing) dumped core upon trying to return from
>function main.  I see this as definitely being a compiler bug, particularly
>if you consider the code to be correct.  It is conceivable to call the code
>incorrect (syntax error due to missing semicolon), ...

The point is that there is NO SYNTAX ERROR in the program.  The code fragment
  struct foo {...} main() {...}
is a perfectly valid declaration of a function named "main" whose return type
is "struct foo {...}".  It happens not to be what the user intended, but it's
still syntactically correct.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

allbery@ncoast.UUCP (Brandon S. Allbery) (06/15/88)

As quoted from <4421@haddock.ISC.COM> by karl@haddock.ISC.COM (Karl Heuer):
+---------------
| In article <15085@tut.cis.ohio-state.edu> lvc@tut.cis.ohio-state.edu (Lawrence V. Cipriani) writes:
| >Then there was the bug where if you had a structure declaration right before
| >main and forget to end it with a ; the program would core dump on exit:
| >	struct blob { int a, b, c; } /* missing ; */
| >	main(argc, argv) ...
| 
| Why should it be considered a "compiler bug" when a syntactically correct
| program containing a user bug dumps core?  It seems to me that the appropriate
| "fix" is to make sure that lint complains about the mismatched declaration.
| 
| Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint
+---------------

(blink)  Whoops!  Mr. Heuer just earned his title.  That *is* a valid
declaration:  it says that the function main() returns a (struct blob), and
declares the (struct blob) at the same time.  Good point.

...but why does the code dump core?  Admitted, the cleanup code in crt0 will
discover a type mismatch, but how many programs exit by return'ing from main?
If it happened during an exit(), it's a legitimate bug somewhere.  (If it
happened in crt0, then it's an artifact of the compiler's method of
returning structs.)
-- 
Brandon S. Allbery			  | "Given its constituency, the only
uunet!marque,sun!mandrill}!ncoast!allbery | thing I expect to be "open" about
Delphi: ALLBERY	       MCI Mail: BALLBERY | [the Open Software Foundation] is
comp.sources.misc: ncoast!sources-misc    | its mouth."  --John Gilmore

gwyn@brl-smoke.ARPA (Doug Gwyn ) (06/16/88)

In article <8001@ncoast.UUCP> allbery@ncoast.UUCP (Brandon S. Allbery) writes:
>...but why does the code dump core?  Admitted, the cleanup code in crt0 will
>discover a type mismatch, but how many programs exit by return'ing from main?

Most of mine do.

>If it happened during an exit(), it's a legitimate bug somewhere.  (If it
>happened in crt0, then it's an artifact of the compiler's method of
>returning structs.)

Yes, the compiler is allowed to use a different stack layout in order
to provide room for the struct to be returned.  If the function definition
does not have the same "shape", it could easily pick up its arguments
from the wrong place, or several other similar things can go wrong.

	"Don't bother to analyze a folly -- merely ask yourself
	what it accomplishes."	- Ellsworth M. Toohey

aronoff@garfield (Avram Aronoff) (06/17/88)

> Many, many people have written...
>| >main and forget to end it with a ; the program would core dump on exit:
>| >	struct blob { int a, b, c; } /* missing ; */
>| >	main(argc, argv) ...
>...but why does the code dump core?

In the typical implementation of structure returns, a function returning a
structure is passed a hidden pointer as its first argument, and uses that
pointer to store the return value. Clearly, crt0.o assumes that main returns
an int, and so passes no pointer. Main is trying to stuff a structure using
argc as a pointer.
This is a programmer error, not a compiler error. In a hosted environment, the
system (albeit implicitly) assumes that main returns an int. Perhaps one of the
ANSI include files should be made to contain a prototype for main.
							Hymie