[net.lang.c] functions that don't return

kwh@bentley.UUCP (KW Heuer) (03/26/86)

The datatype "void" was introduced to distinguish functions which return
an int from those which return no useful value but were commonly declared
int by default.  ("extern f();" still means "extern int f();".)

There are a few functions which don't return to the caller at all:
exit(), _exit(), and longjmp() come to mind.  Many users write their
own (usage(), error()).  These are normally declared void, which is
better than int but still not technically correct.

"void" should mean that the function returns, but has no useful value.
There should be a new keyword, e.g. "flowsink", to describe a function
that never returns at all.

I can see three advantages in this scheme.  The user is saying what he
means, making the code clearer at minimal cost; the compiler would be
able to produce slightly more efficient code ("else" following "exit()"
could be ignored, like "else" following "goto" or "break"); and lint
could dispense with the /* NOTREACHED */ kludge.
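
For illustration only: no such keyword existed when this was posted, but C11 later added the `_Noreturn` function specifier, which fills the role proposed here for "flowsink". A minimal sketch, assuming a C11 compiler; `die` and `checked_length` are hypothetical names, not from any library.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: prints a message and never returns.
 * _Noreturn (C11) stands in for the proposed "flowsink". */
_Noreturn void die(const char *msg)
{
    fprintf(stderr, "fatal: %s\n", msg);
    exit(1);
}

/* Hypothetical caller: no "else" and no NOTREACHED comment is needed,
 * because the declaration says control cannot continue past die(). */
size_t checked_length(const char *s)
{
    if (s == NULL)
        die("null string");
    return strlen(s);
}
```

With the declaration in place, a compiler or lint can treat the call to die() the way it treats "goto" or "break" when analysing the code that follows.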

Comments?

Karl W. Z. Heuer (ihnp4!bentley!kwh), The Walking Lint

nather@utastro.UUCP (Ed Nather) (03/27/86)

In article <665@bentley.UUCP>, kwh@bentley.UUCP (KW Heuer) writes:
> There should be a new keyword, e.g. "flowsink", to describe a function
> that never returns at all.
> Comments?

We could call it "bye" ...

-- 
Ed Nather
Astronomy Dept, U of Texas @ Austin
{allegra,ihnp4}!{noao,ut-sally}!utastro!nather
nather@astro.UTEXAS.EDU

lindsay@tl-vaxa.UUCP (Don Lindsay) (09/20/86)

There has been some discussion about functions that never return (e.g. exit).
It was suggested that if a compiler could be told about this, then it
could generate better code.

In fact, these functions sometimes DO return (and in that case, the compiler
had better have allowed for that). 

For example, a storage allocator may decide to exit, because his caller has
reached some limit. But, the developer of this storage allocator wishes to
write a test program, and she wants the test program to exercise this
feature. The test program can be more powerful, and more convenient, and
easier to document and use, if it can make the "exit" routine return!

Similarly, a program may detect an unusual condition, and call a handler
for it. It is more general if it is the handler which decides whether or
not to return. (Perhaps one would link in different handlers under different
circumstances.) It is quite common for people to write over-specified
programs, where the mainline "knows" what kind of handler is out there.
The existence of a new function type would mostly encourage such limited
thinking.

Don Lindsay

karl@haddock (09/22/86)

tl-vaxa!lindsay (Don Lindsay) writes:
>There has been some discussion about functions that never return (e.g. exit).
>It was suggested that if a compiler could be told about this, then it
>could generate better code.
>
>In fact, these functions sometimes DO return (and in that case, the compiler
>had better have allowed for that).

Hogwash.  "Functions that never return" and "functions that sometimes DO
return" describe disjoint sets.  Functions genuinely in the latter set (e.g.
pre-ANSI abort()) are not under consideration here.

>For example, a storage allocator may decide to exit, because his caller has
>reached some limit.  But, the developer of this storage allocator wishes to
>write a test program, and she wants the test program to exercise this
>feature.  The test program can be more powerful, and more convenient, and
>easier to document and use, if it can make the "exit" routine return!

What if the function that calls exit() doesn't fall into a return?  (E.g.
"if (p == NULL) exit(1); *p = ..." will bomb if exit() doesn't exit.)  And
if you supersede exit() with a function that returns, how do you get out of
the program?  Send yourself a signal?
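
One way a test program might exercise the "allocator gives up" path without making the exit routine return is sketched below: the fatal handler is replaceable, and the test installs one that escapes with longjmp(). This is an assumption-laden illustration, not anything proposed in the posts; `fatal_hook`, `limited_alloc`, `alloc_gives_up`, and `ALLOC_LIMIT` are all hypothetical.

```c
#include <setjmp.h>
#include <stddef.h>
#include <stdlib.h>

#define ALLOC_LIMIT 1024   /* assumed limit, for illustration only */

static void default_fatal(void) { exit(1); }

/* Replaceable handler; the contract is that it must not return. */
static void (*fatal_hook)(void) = default_fatal;

static jmp_buf test_env;
static void test_fatal(void) { longjmp(test_env, 1); }

/* Hypothetical allocator that gives up past ALLOC_LIMIT. */
void *limited_alloc(size_t n)
{
    if (n > ALLOC_LIMIT) {
        fatal_hook();
        abort();   /* only reached if a handler broke the contract */
    }
    return malloc(n);
}

/* Returns 1 if limited_alloc(n) took the fatal path, 0 otherwise. */
int alloc_gives_up(size_t n)
{
    void (*saved)(void) = fatal_hook;
    int gave_up = 0;

    fatal_hook = test_fatal;
    if (setjmp(test_env) == 0)
        free(limited_alloc(n));   /* normal path: allocation succeeded */
    else
        gave_up = 1;              /* arrived here via longjmp() */
    fatal_hook = saved;
    return gave_up;
}
```

The handler still never returns to its caller, so code such as "if (p == NULL) exit(1); *p = ..." stays safe, yet the test regains control.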

>Similarly, a program may detect an unusual condition, and call a handler
>for it. It is more general if it is the handler which decides whether or
>not to return. (Perhaps one would link in different handlers under different
>circumstances.) It is quite common for people to write over-specified
>programs, where the mainline "knows" what kind of handler is out there.
>The existence of a new function type would mostly encourage such limited
>thinking.

On the contrary, it would encourage the author to document (via declaration)
what the specification is.  It's the author's right to insist on a handler
that doesn't return, if there's no appropriate default handler.  (A storage
allocator normally does have an appropriate default handler: "return (NULL)".)

You might as well be arguing that "void" is a bad idea because it encourages
people to write functions that don't return an error check.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

jsdy@hadron.UUCP (Joseph S. D. Yao) (09/29/86)

In article <86900066@haddock> karl@haddock writes:
>if you supersede exit() with a function that returns, how do you get out of
>the program?  Send yourself a signal?

Most crt0.s's or the like contain code similar to:

	store-args	argc, argv, envp
	call-function	_main
	call-function	_exit
	system-call	exit
	halt

If exit() is re-defined to return, the system call will exit after
a return from main().  The 'halt' usually causes a trap if neither
works.

Using 'return' from main() is the preferred way to exit according
to System V lint (which complains otherwise).  I've preferred it
myself, for years before, because I've viewed exit() as a glitch
in a smooth mental model of the invocation and return of routines.
THIS (latter) IS PURELY PERSONAL PREFERENCE, NO FLAMES DESIRED,
but comments on s5lint's and others' mental models welcome.
-- 

	Joe Yao		hadron!jsdy@seismo.{CSS.GOV,ARPA,UUCP}
			jsdy@hadron.COM (not yet domainised)

karl@haddock (09/30/86)

hadron!jsdy (Joe Yao) writes:
>Using 'return' from main() is the preferred way to exit according
>to System V lint (which complains otherwise).

The reason is simply that lint doesn't know about exit() -- it's just another
function as far as it knows.  The warning is not in any way an endorsement
of return over exit, as lint will be equally happy if you put a goto at the
bottom of main() and call exit() in the middle somewhere.

>I've preferred it myself, for years before, because I've viewed exit() as a
>glitch in a smooth mental model of the invocation and return of routines.

I used to prefer return for the same reasons (actually I preferred exit()
before that, because it works even if your local crt0 ignores the result of
main -- as ours did for a while).  Now I've reverted to exit(), because I
consider it a better model for error handling: "fprintf(stderr, usage); \
exit(1);" is more meaningful to me than "... return (1);" and won't break
if the code segment moves into a subroutine.  At the bottom of main(), I now
tend to use exit() again (throwing in a /*NOTREACHED*/ and cussing under my
breath), for reasons I can't explain well.

There are three possible models for the crt0/main interface:  main() could be
int ("exit(main())"), void ("main(); exit(0)"), or dead ("main(); HALT").  I
kind of like the last idea; it requires the user to explicitly call exit(),
but it makes all main programs equivalent.  (Including those which neither
return nor exit, for which the declaration "int main()" looks silly.)
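
The three models can be sketched in C roughly as follows. Each function computes the status the startup code would hand to exit() instead of actually exiting, so the difference can be observed. The type `main_fn`, the function names, and the sample status 64 are illustrative assumptions, not real startup code.

```c
#include <stdlib.h>

typedef int (*main_fn)(void);   /* hypothetical stand-in for main() */

/* Model 1, "exit(main())": the status is whatever main returns. */
int status_int_model(main_fn m)
{
    return m();
}

/* Model 2, "main(); exit(0)": the return value is discarded. */
int status_void_model(main_fn m)
{
    (void)m();
    return 0;
}

/* Model 3, "main(); HALT": returning from main is itself an error. */
void run_dead_model(main_fn m)
{
    (void)m();
    abort();   /* stands in for HALT */
}

/* Sample program body that reports an error by returning 64. */
int sample_main(void)
{
    return 64;
}
```

Under the first model the caller of the program sees 64; under the second it sees 0 no matter what main() returned.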

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

chris@umcp-cs.UUCP (Chris Torek) (10/01/86)

In article <584@hadron.UUCP> jsdy@hadron.UUCP (Joseph S. D. Yao) writes:
>Using 'return' from main() is the preferred way to exit according
>to System V lint (which complains otherwise). ...
>but comments on s5lint's and others' mental models welcome.

Someone at Sun must have had a different model indeed, for alas,
in at least one major release, their startup code amounts to this:

	push argc, argv, envp
	call main
	push 0
	call exit

This means that

    main( ... ) ... return (EX_USAGE); ...

will return not exit code 64, but rather code 0 (success).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

garys@bunker.UUCP (Gary M. Samuelson) (10/01/86)

In article <86900070@haddock> karl@haddock writes:

Re 'exit()' versus 'return' to leave main():
>Now I've reverted to exit(), because I
>consider it a better model for error handling: "fprintf(stderr, usage); \
>exit(1);" is more meaningful to me than "... return (1);" and won't break
>if the code segment moves into a subroutine.

Interesting.  I started using 'return' instead of 'exit()' so that the
code segment wouldn't break if moved into a subroutine -- to be more
precise, so that I could call 'main()' as a subroutine (recursion
being occasionally useful).  Anyone else do this?
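
A small sketch of why 'return' matters when a main-like routine calls itself. The routine `run` and its "not"/"true"/"false" arguments are hypothetical, chosen only to keep the example short.

```c
#include <string.h>

/* Hypothetical main-like routine: status 0 is success, 1 is failure,
 * 2 is a usage error.  A leading "not" runs the remaining arguments
 * recursively and yields 1 if that succeeded, 0 otherwise. */
int run(int argc, char **argv)
{
    if (argc < 2)
        return 2;
    if (strcmp(argv[1], "not") == 0)
        return run(argc - 1, argv + 1) == 0;   /* recurse, then invert */
    if (strcmp(argv[1], "true") == 0)
        return 0;
    if (strcmp(argv[1], "false") == 0)
        return 1;
    return 2;
}
```

Had the inner branches called exit() instead of returning, the recursive call would end the process before the outer call could invert the status.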

>There are three possible models for the crt0/main interface:  main() could be
>int ("exit(main())"), void ("main(); exit(0)"), or dead ("main(); HALT").

I suppose this is an operating system (environment) issue.  Assuming
UNIX, the first model is the only correct one.  If you're not assuming
UNIX, what is 'crt0'?

>I
>kind of like the last idea; it requires the user to explicitly call exit(),
>but it makes all main programs equivalent.

It only makes all main programs equivalent by requiring that they all
be written the same way.  I don't see the advantage.  Assuming a
multiuser environment, it wouldn't be a very good idea at all.  ("Why
is the system hung?"  "Oh, someone ran the 'hello, world' program again.")

>(Including those which neither
>return nor exit, for which the declaration "int main()" looks silly.)

Routines which do not return or exit should be declared 'void'
(or 'dead' -- if that were to be added to the language, which I
wouldn't recommend, but wouldn't fight either).

Gary Samuelson

pedz@bobkat.UUCP (Pedz Thing) (10/03/86)

In article <1216@bunker.UUCP> garys@bunker.UUCP (Gary M. Samuelson) writes:
>In article <86900070@haddock> karl@haddock writes:
>>There are three possible models for the crt0/main interface:  main() could be
>>int ("exit(main())"), void ("main(); exit(0)"), or dead ("main(); HALT").

I think it should be the second choice, main(); exit(0);.  The reason
is twofold.  First, unless a program specifically bombs off, it
should exit with a happy status.  Falling off the bottom of main I
would not consider to be doing anything specific yet the return (and
exit) value will be random.  Second, with a random exit status, it
makes the program much less useful.  Make(1) will sometimes work and
sometimes fail, etc.
-- 
Perry Smith
ctvax ---\
megamax --- bobkat!pedz
pollux---/

guy@sun.UUCP (10/05/86)

> >>There are three possible models for the crt0/main interface:  main()
> >>could be int ("exit(main())"), void ("main(); exit(0)"), or dead
> >>("main(); HALT").
> 
> I think it should be the second choice, main(); exit(0);.  The reason
> is twofold.  First, unless a program specifically bombs off, it
> should exit with a happy status.  Falling off the bottom of main I
> would not consider to be doing anything specific yet the return (and
> exit) value will be random.  Second, with a random exit status, it
> makes the program much less useful.  Make(1) will sometimes work and
> sometimes fail, etc.

This merely argues against the third alternative, which is clearly wrong
(the only program that should exit with a random exit status is the System V
Release 2 game "random", when invoked with the "-e" flag, since under those
circumstances it's *supposed* to exit with a random exit status).  Neither
the first nor the second alternative permits a correct program to exit with a
random status; with the first model, "main" is a function returning "int",
so it should always return an "int" value (the S5R2 "lint" will check for
this).

Since the first alternative is specified by the S5R2 documentation and the
ANSI C standard, and is implemented by most other versions of UNIX, it
should be the choice unless you manage to convince the ANSI C committee
otherwise.
-- 
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com (or guy@sun.arpa)

rbutterworth@watmath.UUCP (Ray Butterworth) (10/06/86)

> I think it should be the second choice, main(); exit(0);.  The reason
> is twofold.  First, unless a program specifically bombs off, it
> should exit with a happy status.  Falling off the bottom of main I
> would not consider to be doing anything specific yet the return (and
> exit) value will be random.  Second, with a random exit status, it
> makes the program much less useful.  Make(1) will sometimes work and
> sometimes fail, etc.

If you change that to "main(); exit(1);" I'll agree with you.
Programs should definitely not exit with a random status,
but I think we should do as much as possible to encourage the programmer
to be aware that it is his responsibility to return a meaningful status.

chris@umcp-cs.UUCP (Chris Torek) (10/07/86)

In article <147@bobkat.UUCP> pedz@bobkat.UUCP (Pedz Thing) writes:
>I think [the support library start-up code] should be the second
>choice, main(); exit(0);.  The reason is twofold.  First, unless
>a program specifically bombs off, it should exit with a happy status.
>Falling off the bottom of main I would not consider to be doing
>anything specific yet the return (and exit) value will be random.

But it is doing something specific: the close-brace, if reachable,
is equivalent to `return;'.  I doubt that anyone would argue that

	double foo() {
		return;
	}

is correct.  Is then

	int main(argc, argv) ... {
		...
		return;
	}

correct?

[The trick, of course, is that I have slipped in the keyword `int',
which cannot be found modifying main() in K&R.  On the other hand,
the use of exit() to enforce clean termination, and (not so
incidentally) make that last close-brace unreachable, is covered.]

>Second, with a random exit status, it makes the program much less
>useful.  Make(1) will sometimes work and sometimes fail, etc.

Indeed.  So fix the program.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

rbutterworth@watmath.UUCP (Ray Butterworth) (10/08/86)

> But it is doing something specific: the close-brace, if reachable,
> is equivalent to `return;'.  I doubt that anyone would argue that
> 
>     double foo() {
>         return;
>     }
> 
> is correct.

The GCOS8 Lint complains that

    Function "foo" has no return value
    Function "foo" is defined to return a value

(I guess it doesn't really "argue" about it though. :-)

If you think the above definition of foo() should be considered
valid, then without looking at the source for foo, consider:

{
    extern double foo();
    auto double x;
    x=foo();
    foo();
    ...
}

The extern declaration is correct, and required.
The first call to foo() looks obviously correct, while the second
looks obviously incorrect (or at least it ignores the returned value).
In fact, since foo() never returns a value, the first is wrong and the
second is correct.

If the function doesn't return a value, why would you want to
state explicitly what type of value it doesn't return?  Either
the return statement or the declaration is wrong, and one of them
should be changed to match the other.  Otherwise you confuse
people who use the function.  They see the "double foo()" in the
source or in the header file containing the extern reference and
expect the function to return a value.
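
A minimal sketch of the two consistent alternatives, written with ANSI-style prototypes: either the function is declared void and returns nothing, or it is declared to return a value and does so on every path. The names `log_reading` and `half` are hypothetical.

```c
#include <stdio.h>

/* No value to return: the declaration says so, and a caller who
 * writes "x = log_reading(1.5);" gets a compile-time diagnostic. */
void log_reading(double x)
{
    printf("reading: %g\n", x);
}

/* A value is promised: every path through the function returns one. */
double half(double x)
{
    return x / 2.0;
}
```

Either way, the declaration seen in a header matches what the function body actually does.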