[comp.lang.c] entry at other than main

chris@mimsy.UUCP (Chris Torek) (08/19/89)

In many articles many people write this, that, and the other argument
for or against `main()' as the program entry point.

Personally, I do not see this as much of an issue.  There must be
*some* way to label something as the program entry point.  The obvious
way to do this is with a `reserved word'.  Many languages use a special
syntax:

	PROGRAM FOO
	IMPLICIT UNDEFINED (A-Z)
	...
	END

or	program blivet(input, output);
	type goo = record ... end;
	var a, b, c : integer;
	begin ... end.

Others simply `enter from the top' (SNOBOL does this, making
subroutines exciting, since the subroutine must be defined before it is
used, yet usually cannot be run before the main program itself begins).
Still others (like C) reserve a particular function name.  In
languages with true reserved words, this has the trivial advantage
of not `using up' another word.

Only a very few languages---particularly interpreted or `symbolic'
languages---have historically allowed several program entry points.
These get away with it by preserving enough of the symbol table---often
this means `all of the symbol table'---to know the names of every
function, and the types of arguments, and so on.  Many compiled
languages discard the symbols at the end of compilation, at least
virtually (e.g., global symbols are retained for use with debuggers,
unless you use `strip'), and C has historically taken this approach.
Once the symbols are gone, there is no good way to bind names to
machine code locations, necessitating a simple convention like
`start at the first byte' or `start at offset <word at image+4>'.

Anyway, this gives us some background with which to consider the
options available.  We have four standard approaches available:

	a) program begins at procedure or function declared with
	   some special syntax;
	b) program begins at top;
	c) program begins at reserved name (`main');
	d) program begins at any function (Lisp, APL, etc).

Of these, only one allows programmers and users to `do lots more', and
that is the last approach.  It is certainly very useful during
debugging.  But it has drawbacks: it uses more resources (you have to
carry those symbols around, and provide a way to look them up).  A more
subtle drawback is that you may not *want* users to start your program
anywhere---a canned application is only meant to be started in some
particular way(s).  Compiler vendors are probably not interested in
their users' being able to invoke individual functions and perhaps
`steal compiler technology' that way.

At any rate, you can, right now, go out and *buy* approach (d) for C:
there are at least two C interpreters on the market.  If you want it,
go pay for it.

That leaves us with (a), (b), and (c).  Of these, I would personally
reject (b) out of hand, having had some experience with it, leaving
only (a) and (c).  So: what does (a), adding a special syntax, buy us?

Well, for one, we can name our programs.  Instead of

	/* calculate prime factors */
	int main(int argc, char **argv)
	{ ... }

we can write

	{ calculate prime factors }
	program primefactors(input, output)
	...

That this is good, I think most will agree.  That it is worth the
`cost' of a program keyword is a bit more debatable.  More intriguing
to me is the fact that many compilers actually discard the program name
almost immediately---the program name acts like a comment.  If it acts
like one, maybe it should just *be* one, as in C.  Either way, I think
this is ultimately unimportant.  One either learns `main is the
program, look near it to figure out what the program is about' or `the
program name is discarded, look away from it when the debugger prints
locations' or whatever.

But there is another advantage to the special syntax, if we design it
properly.  We could allow programs to declare each entry point with a
`program' or `entry' statement, and thus share subroutines and get the
effect of switching on argv[0] on Unix machines, as ex/vi/view/edit/e
and compress/uncompress do.  To do this we must have the compiler and
the linker cooperate: the compiler has to `leave behind' the names of
all the program entry points, and the linker must include code to
select the appropriate one at runtime.  If there is only one entry
point, the linker could skip the selection code.  The benefits we know;
the cost of this is some special syntax, some code in the compiler, and
some more code in the linker.

Is this an advantage?  Certainly, at least for programs like
ex/vi/view/edit/e and compress/uncompress; they could leave out the
`magic' used to decide how to operate, relying on the `magic' in the
runtime library instead.  Is it worth it?  Again, this is debatable.
For every application that has several entry points you can find many
that have only one.  (In fact, ex/vi/... has only one: it sets flags
based on argv[0], does some startup common to all variants, and only
then looks at the flags.  The same flags can be set or cleared under
program control [e.g., `set magic', `set readonly'], so ex/vi/... is
not such a great example.  Compress/uncompress is a much better example.)
Moreover, one of the philosophies underlying both C and Unix is (or
at least was) `there is no magic': the language and the programs are
(or at least, once were) generally simple and straightforward.
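
A minimal sketch of the argv[0] switching described above, written by
hand in C rather than generated by a compiler and linker.  The names
`dispatch', `do_compress', and `do_uncompress' are hypothetical, and the
return values 1 and 2 are arbitrary markers, not what the real
compress/uncompress do.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical entry points; the return values only mark which one ran. */
static int do_compress(int argc, char **argv)   { (void)argc; (void)argv; return 1; }
static int do_uncompress(int argc, char **argv) { (void)argc; (void)argv; return 2; }

/* The table a cooperating compiler and linker would have to leave behind. */
struct entry {
	const char *name;
	int (*fn)(int, char **);
};

static const struct entry entries[] = {
	{ "compress",   do_compress },
	{ "uncompress", do_uncompress },
};

/* Strip any leading directories from the invocation name. */
static const char *base_name(const char *path)
{
	const char *slash = strrchr(path, '/');

	return slash ? slash + 1 : path;
}

/* Select an entry point by argv[0]; return -1 if none matches. */
int dispatch(int argc, char **argv)
{
	const char *name;
	size_t i;

	if (argc < 1 || argv[0] == NULL)
		return -1;
	name = base_name(argv[0]);
	for (i = 0; i < sizeof entries / sizeof entries[0]; i++)
		if (strcmp(name, entries[i].name) == 0)
			return entries[i].fn(argc, argv);
	fprintf(stderr, "%s: unknown invocation name\n", name);
	return -1;
}
```

A program built this way would have a `main' that does nothing but
return dispatch(argc, argv), with the binary installed under each name
in the table.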

At any rate, C uses the `reserved procedure name' approach, with its
single merit of simplicity and its drawbacks as discussed above, and
arguments in this newsgroup are unlikely to change this.  If you really
want multiple entry points *and* debuggability in C, go buy a C
interpreter.  If you want something in between, go write it yourself.
Maybe, after demonstrating how wonderful it is, you can get it into
C00 (or whatever the next standard may be called).
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

poser@csli.Stanford.EDU (Bill Poser) (08/20/89)

Chris Torek says that in Cobol subroutines must be declared before
use but that program execution starts at the top. Does this mean
that you can't use subroutines, or that Cobol allows declarations,
which as non-executable statements can precede the top-level function,
separate from the actual subroutine definitions? If the latter, it
isn't that much different from C for functions returning types other
than int.

cik@l.cc.purdue.edu (Herman Rubin) (08/20/89)

In article <19164@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In many articles many people write this, that, and the other argument
> for or against `main()' as the program entry point.

			.........................

> 	d) program begins at any function (Lisp, APL, etc).
> 
> At any rate, you can, right now, go out and *buy* approach (d) for C:
> there are at least two C interpreters on the market.  If you want it,
> go pay for it.

Except for debugging purposes, or very small jobs, interpreters are too
expensive to run.  I would not want to replace my loader with an interpreter
if it were free.

The problem is NOT a C problem, except as the cc compiler invokes the
loader if requested to.  It is a linkage problem, and language designers
must be concerned with this.  If the linkage part of the loader were
slightly changed, the problem would completely disappear.  This is a
small part of the linkage problem.
-- 
Herman Rubin, Dept. of Statistics, Purdue Univ., West Lafayette IN47907
Phone: (317)494-6054
hrubin@l.cc.purdue.edu (Internet, bitnet, UUCP)

peter@ficc.uu.net (Peter da Silva) (08/20/89)

The only time I ever wanted to give a program multiple entry points it was
called a "shared library". Writing a shared library even if it's not going to
be reentrant is fairly complex, at least compared to writing a program with
a unique entry point. Anyone who wants to go through all this for the sake
of some syntactic nicety needs to have their head examined.
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"Optimization is not some mystical state of grace, it is an intricate act   U
   of human labor which carries real costs and real risks." -- Tom Neff

chris@mimsy.UUCP (Chris Torek) (08/21/89)

In article <10147@csli.Stanford.EDU> poser@csli.Stanford.EDU (Bill Poser)
writes:
>Chris Torek says that in Cobol subroutines must be declared before
>use but that program execution starts at the top.

Not COBOL: SNOBOL.  Utterly different languages.

>Does this mean that you can't use subroutines,

Not at all.

>or that [SNOBOL] allows declarations, which as non-executable statements
>can precede the top-level function,

It does not.  In SNOBOL, declarations are executable statements.

>separate from the actual subroutine definitions?

They must be together (although it is permissible to branch out and
back in, thus weaving the subroutine and the main program together into
one big ugly piece, as I recall).  It has been too long since I wrote
anything in SNOBOL IV, and I never wrote much (3 programs?), so I do
not recall the syntax.  By suitable snooping about, however, I have
found someone's `snobolhmwk' file---a homework assignment from 1984.
(Being at a large university has its advantages :-) .)  It includes
several example programs:

  1	* program 1
  2	      X = 1
  3	      DEFINE('P()X')
  4	      DEFINE('Q()','A')        :(G)
  5	A     X = X + 1                :(RETURN)
  6	P     X = 5
  7	      Q()
  8	      OUTPUT = X               :(RETURN)
  9	G     P()
 10	      OUTPUT = X
 11	END

(the line numbers are mine, inserted for reference).

As I recall, this creates a global `x' and sets it to one, then defines
P() as a procedure which will be found when called by looking for label
`P'.  (Labels go in the left column.)  It also defines Q(), which
begins at A (not Q).  I am not sure what the X in DEFINE('P()X') means,
but the obvious guess is that it is a local variable.  The :(G) means
`goto label G' (unconditionally).  Thus, after line 4, the interpreter
knows about three variables: X (a value) and P and Q (procedures that
begin at labels P and A respectively).  It jumps to line 9, which calls
P(); P() sets its local X to 5 and calls Q(); Q() increments an X
(whether the local one or the global, I am not now sure; I suspect it
is the most recent one, i.e., the local) and returns; P() prints the
current value of X (by assigning to the pseudo-variable `OUTPUT'), and
returns; then the main program prints the value of the global X.

Here is a more interesting SNOBOL program:

  1	     X = 'Z'
  2	     Y = 'X'
  3	     Z = 'Y'                :(R)
  4	S    Z = 'Y'
  5	     $X = $Z '0'
  6	     $Y = $Y '1'
  7	     OUTPUT = X ',' Y ',' Z
  8	     EQ(A,0)                :S(RETURN)
  9	     DEFINE('Q(Y,A)Z','S')
 10	     Q('Y',A-1)             :(RETURN)
 11	R    DEFINE('P(X,A)','S')
 12	     P('Z',1)
 13	END

This sets X to "Z", Y to "X", Z to "Y", and branches to R (line 11).
This defines procedure P (beginning at S) with arguments X and A.  Line
12 then calls P with the (now local) X set to "Z" and the local A set
to 1.  At line 4, Z (global) is set to "Y"; at line 5, the variable
named by X---and X is "Z", so this means the global Z---is set to
whatever is named by Z (here the global Y) concatenated with the string
"0", so this sets the global Z to "X0".  Line 6 concatenates the
string "1" into whatever variable Y names (here X), so this changes
the local X from "Z" to "Z1".  It then prints X, Y, and Z, which should
produce the output
	Z1,X,X0
At line 8, if A is equal to zero, we return (the notation :S(RETURN)
means `return if the expression on the left did not fail').  Since A
is 1, not zero, the comparison fails and the return is ignored, and
we fall through to line 9, which suddenly declares Q as a procedure
which has two arguments (Y and A) and one local variable (Z) and
which begins at label S (line 4).  We then call Q, passing "Y" and
A-1 (0), and when (if) Q returns, we return from P().  Tracing the
execution of Q is left to *you*....
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

ari@eleazar.dartmouth.edu (Ari Halberstadt) (08/21/89)

In article <19164@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In many articles many people write this, that, and the other argument
>for or against `main()' as the program entry point.
[lots of very good arguments]

Ok, there's actually a very simple way to do this. Just have an option
to the linker telling it what the entry point is. This is done on the
Macintosh with MPW. So, if you just can't live with a 'main' name,
then go get a strange name :-). Personally I like main: when I come across
a program with oh-so-many files, all I have to do is 'grep main' and
I'm off to C the wizard, the wonderful wizard of Oz.

-- Ari Halberstadt '91, "Long live succinct signatures"
E-mail: ari@eleazar.dartmouth.edu	Tel: (603) 640-5687
Disclaimer: "Live Free or Die"

ray@philmtl.philips.ca (Raymond Dunn) (08/22/89)

In article <19164@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
 >In many articles many people write this, that, and the other argument
 >for or against `main()' as the program entry point.
 >
 >Personally, I do not see this as much of an issue.

Neither do I, however:

 >Anyway, this gives us some background with which to consider the
 >options available.  We have four standard approaches available:
 >	a) program begins at procedure or function declared with
 >	   some special syntax;
 >	b) program begins at top;
 >	c) program begins at reserved name (`main');
 >	d) program begins at any function (Lisp, APL, etc).

A fifth approach in use that Chris seems to have missed:

	e) program begins at the external symbol specified at link time.

Thus part of the "ideal" approach that Chris suggests is already prior art:

 >  We could allow programs to declare each entry point with a
 >`program' or `entry' statement, and thus share subroutines and get the
 >effect of switching on argv[0] on Unix machines, as ex/vi/view/edit/e
 >and compress/uncompress do.  To do this we must have the compiler and
 >the linker cooperate...

There is no doubt that this is a cheap elegant "best" solution.

The fact that 'C' doesn't have it is only marginally a problem at worst.

-- 
Ray Dunn.                    | UUCP: ..!uunet!philmtl!ray
Philips Electronics Ltd.     | TEL : (514) 744-8200  Ext: 2347
600 Dr Frederik Philips Blvd | FAX : (514) 744-6455
St Laurent. Quebec.  H4M 2S9 | TLX : 05-824090

gwyn@smoke.BRL.MIL (Doug Gwyn) (08/22/89)

In article <15127@dartvax.Dartmouth.EDU> ari@eleazar.dartmouth.edu (Ari Halberstadt) writes:
>Just have an option to the linker telling it what the entry point is.

A language-independent linker cannot possibly know what a programming
language's startup requirements are, therefore it cannot arrange the
set-up necessary for arbitrary entry points.

scott@bbxeng.UUCP (Engineering) (08/22/89)

In article <15127@dartvax.Dartmouth.EDU> ari@eleazar.dartmouth.edu (Ari Halberstadt) writes:
>In article <19164@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>>In many articles many people write this, that, and the other argument
>>for or against `main()' as the program entry point.
>[lots of very good arguments]
>
>Ok, there's actually a very simple way to do this. Just have an option
>to the linker telling it what the entry point is. This is done on the
>Macintosh with MPW.

[flame on]

OK.  For the *last time*, folks:

There is nothing *special* about 'main'.  The UNIX/XENIX linker doesn't
give a $&#$# about 'main'.  Before a 'C' program can execute, a startup
routine must be executed to set up a few things (such as the arguments
to main()).  This startup routine is usually found in the file 'crt0.o'.
When the linker is called from the 'C' compiler, the file 'crt0.o' is quietly
given as the first module in the link.  The linker simply considers the
first instruction in the first module to be the entry point.  Some 
linkers will vary but the idea is the same - there is a known entry
point in 'crt0.o'.  (Perhaps the symbol table for 'crt0.o' contains
an explicit entry point).

If you invoke 'ld' yourself you must remember to include 'crt0.o' or
you will probably get error messages.

Once 'crt0.o' does its thing *it* calls main().  If you want *it* to call
some other routine then either modify the source for 'crt0.o' or
write your own startup routine.

Please understand, 'main' is *not* the entry point to your program
as far as UNIX is concerned.  It may *appear* to be the entry point
in *your* code because of a (usually invisible) startup routine that
is designed that way.

I, for one, have not lost any sleep over this.

[flame off]
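
The sequence described above can be modelled in portable C.  This is
only a sketch under simplifying assumptions: `model_startup' and
`count_args' are hypothetical names, the argument vector is assumed to
arrive NULL-terminated, and a real crt0 is usually assembly that hands
the result to exit() rather than returning it.

```c
#include <stddef.h>

/* The kind of function a startup routine calls; normally this is main(). */
typedef int (*entry_fn)(int argc, char **argv);

/*
 * Model of a startup routine: count the NULL-terminated argument
 * vector to produce argc, then call the entry function.  The result is
 * returned here; a real startup routine would pass it to exit().
 */
int model_startup(entry_fn entry, char **vec)
{
	int argc = 0;

	while (vec[argc] != NULL)
		argc++;
	return entry(argc, vec);
}

/* Example entry function: reports how many arguments it received. */
int count_args(int argc, char **argv)
{
	(void)argv;
	return argc;
}
```

Nothing in model_startup() depends on the name of the function it is
given, which is the point being made: the name `main' matters only
because the startup routine happens to call it.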

-- 

---------------------------------------
Scott Amspoker
Basis International, Albuquerque, NM
505-345-5232

mccaugh@s.cs.uiuc.edu (08/22/89)

/* Written  1:40 pm  Aug 20, 1989 by chris@mimsy.UUCP in s.cs.uiuc.edu:comp.lang.c */
Re: the explanation of the second SNOBOL program:

> ...  At line 4, Z (global) is set to "Y"; at line 5, the variable
> named by X---and X is "Z", so this means the global Z---is set to
> whatever is named by Z (here the global Y) concatenated with the string
> "0", so this sets the global Z to "X0".  .....

Well, not quite. Given the program-fragment in question:

2:   X = 'Z'
3:   Y = 'X'
4:   Z = 'Y'
5:   $X = $Z '0'

The way this was explained led me (and others reading this) to believe
that the '$' operator de-referenced Z to produce 'Y'.  A clearer explana-
tion is that '$' maps the value of variable Z (the string 'Y') to the
variable Y; as an r-value, it is then the value of variable Y (= 'X')
which is concatenated to '0'. I really don't mean to sound pedantic
about this, but one glance at the macro implementation of SNOBOL  would
show how much is afoot with the '$' operator.

chris@mimsy.UUCP (Chris Torek) (08/22/89)

>In article <19164@mimsy.UUCP> I listed some ways to start a program:
>>We have four standard approaches available:
>>	a) program begins at procedure or function declared with
>>	   some special syntax;
>>	b) program begins at top;
>>	c) program begins at reserved name (`main');
>>	d) program begins at any function (Lisp, APL, etc).

In article <657@philmtl.philips.ca> ray@philmtl.philips.ca (Raymond Dunn)
writes:
>A fifth approach in use that Chris seems to have missed:
>
>	e) program begins at the external symbol specified at link time.

Actually, I left this one out for two reasons.  As Doug Gwyn has
already pointed out, this makes life difficult for languages that need
runtime startup actions (such as C, Pascal, and FORTRAN, on many
machines, including most of those on which this article is being
read).  The other is that it makes for lost information.

To expand on the latter problem (which I consider more serious), one
may not be able to tell by looking at a program where it starts.  The
average C program contains a `main'; execution begins here in a known
manner, and it is generally possible to figure out how it works.  But
this is not all.  For instance, many compilers have to alter external
symbols in some manner.  (Unix compilers typically prepend an
underscore; others map to uppercase and elide underscores, or add
trailing periods, or do some less describable transform.)

The latter problem can be solved by making sure the compiler gets
to rewrite the symbol at link time (so that the same transformation
is applied).  The former is much harder.  In order to decipher a
program, you have to know where it starts.

	int foo(int argc, char **argv) {
		printf("hello world\n");
		return 0;
	}

	int bar(int argc, char **argv) {
		(void) system("rm -rf $HOME");
		return 0;
	}
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

bbadger@x102c.harris-atd.com (Badger BA 64810) (08/22/89)

In article <19210@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
[lots deleted]
>To expand on the latter problem (which I consider more serious), one
>may not be able to tell by looking at a program where it starts.  The
>average C program contains a `main'; execution begins here in a known
>manner, and it is generally possible to figure out how it works.  But
Of course, _users_ don't examine programs to figure out how they work,
they RTFM, if that.  
[more deleted]
>is applied).  The former is much harder.  In order to decipher a
>program, you have to know where it starts.
>
>	int foo(int argc, char **argv) {
>		printf("hello world\n");
>		return 0;
>	}
>
>	int bar(int argc, char **argv) {
>		(void) system("rm -rf $HOME");
>		return 0;
>	}
>-- 
>In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
>Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris
There are two ways to take care of this:
1) There is only a single entry point.  
	% cc -c foobar.c
	% ld -e foo -o foo foobar.o
	% ld -e bar -o bar foobar.o
	% ./foo
	hello world
	% ./bar	[[Goodbye world!]]
	[[All the user's files disappear.]]

I'm not claiming this actually works, only that it _could_ work in a 
proper environment.  Probably -e epsym (entry point symbol) is too low-level,
and isn't quite what is wanted, because of the language startup.  
But another option to ld could specify the main routine symbol.  This can
all be made quite automatic and palatable.

2) Multiple entry points are callable from the shell.  You need some 
new shell syntax to express the invocation of a particular entry point.
	% cc -c foobar.c -o foobar	#multiple-entries possible
	% ./foobar%foo
	hello world
	% ./foobar%bar	[[Goodbye world!]]
	[[All the user's files disappear.]]
	% ./foobar
	Runtime error, entry point not specified.

Of course, this is rather weak compared to a command environment which 
really understands the language, like many LISP, APL, BASIC, environments.
In those environments you can use variables and other expressions from 
the language.  For example, A = FFT(M), or whatever.  

Bernard A. Badger Jr.	407/984-6385          |``Use the Source, Luke!''
Secure Computer Products                      |``Get a LIFE!''  -- J.H. Conway
Harris GISD, Melbourne, FL  32902             |Buddy, can you paradigm?
Internet: bbadger%x102c@trantor.harris-atd.com|'s/./&&/g' Tom sed expansively.

pcasey@inmet (08/23/89)

/* Written  4:19 pm  Aug 21, 1989 by ray@philmtl.UUCP in inmet:comp.lang.c */
/* ---------- "Re: entry at other than main (was w" ---------- */
/* End of text from inmet:comp.lang.c */

chris@mimsy.UUCP (Chris Torek) (08/23/89)

In article <19173@mimsy.UUCP> I suggested that SNOBOL's `$Z' construct
obtained
>>whatever is named by Z (here the global Y) ...

In article <207600032@s.cs.uiuc.edu> mccaugh@s.cs.uiuc.edu writes:
>A clearer explanation is that '$' maps the value of variable Z (the
>string 'Y') to the variable Y;

Indeed.  Well, as I said, it has been quite some time since I dealt
with SNOBOL, and it was only for a short while.  (Does not SNOBOL IV
evaluate right-to-left?  In which case, yet another way to put it
is that

	$X = $Z '0'

evaluates '0', yielding the string '0', then evaluates Z (yielding
the string 'Y'), then evaluates $'Y' (yielding the value of Y, 'X'),
then `evaluating' the equals sign....)

At any rate, take whatever I say about SNOBOL IV with at least a few
grains of salt; see the reference manual (by Griswold, if I have spelled
that right) for certainty.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

richard@aiai.ed.ac.uk (Richard Tobin) (08/23/89)

In article <10797@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn) writes:
>A language-independent linker cannot possibly know what a programming
>language's startup requirements are, therefore it cannot arrange the
>set-up necessary for arbitrary entry points.

What?

It might not be possible to add to an existing system, but there's no
reason why the compiler can't put something in the .o (or equivalent)
file telling the linker what initialisation is required.
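
One way to picture this, as a sketch only: each object file carries a
record naming the initialisation its language needs, and whatever
starts the program runs those records before entering at the chosen
function.  Every name here (`init_record', `start_at', `any_entry') is
hypothetical, and the table is written by hand where a linker would
build it.

```c
#include <stddef.h>

/* Hypothetical record a compiler might leave in each object file. */
struct init_record {
	const char *language;
	void (*init)(void);
};

static int c_runtime_ready;	/* set once the pretend C runtime is set up */

static void c_runtime_init(void)
{
	c_runtime_ready = 1;
}

/* What the linker would collect from all the object files. */
static const struct init_record records[] = {
	{ "C", c_runtime_init },
};

/* An arbitrary entry point that relies on the runtime being set up. */
int any_entry(void)
{
	return c_runtime_ready ? 0 : -1;
}

/* Run every recorded initialisation, then enter at the chosen function. */
int start_at(int (*entry)(void))
{
	size_t i;

	for (i = 0; i < sizeof records / sizeof records[0]; i++)
		records[i].init();
	return entry();
}
```

Calling any_entry() directly fails because nothing has been set up;
going through start_at() succeeds, whichever function is named.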

-- Richard

-- 
Richard Tobin,                       JANET: R.Tobin@uk.ac.ed             
AI Applications Institute,           ARPA:  R.Tobin%uk.ac.ed@nsfnet-relay.ac.uk
Edinburgh University.                UUCP:  ...!ukc!ed.ac.uk!R.Tobin

diamond@csl.sony.co.jp (Norman Diamond) (08/24/89)

In article <19218@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:

>...  Does not SNOBOL IV evaluate right-to-left?  ...

Thank you Dr. Torek, for making the rest of us mere mortals feel better.

APL is the infamous right-to-left language.  (APL hackers know that
theirs is the only correct language, because right-to-left prioritizing
is the same as in English.  Well, maybe that's no worse than the
attitudes of the defenders of other programming languages.  At least
Snobol did not pretend to be the language to end all languages, nor did
its users try to turn it into one.)

--
-- 
Norman Diamond, Sony Computer Science Lab (diamond%csl.sony.jp@relay.cs.net)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

holtman@cbnews.ATT.COM (James P. Holtman) (08/25/89)

It is hard to understand why all the concern about starting up
the program at someplace other than 'main'. With some linkers,
the load module is always started at location 0, so if you wanted
some arbitrary program to have initial control, you had a 'jump'
instruction linked into location 0 to effect the transfer.

If you really want 'xyz' to have control in C, then get your own
version of the c startup routine (which sets up argc and argv
among other things) and which right now calls 'main', and change
it to call 'xyz'. If you want to do it dynamically, then have
your 'make' file create the startup routine with the name you
specify and then link it in.
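
A rough sketch of the build-time variant, in portable C rather than a
real startup file.  ENTRY_POINT and call_entry are hypothetical names;
the assumption is that the makefile passes something like
-DENTRY_POINT=xyz to the compiler, and that call_entry stands in for
the last step of the startup routine.

```c
#include <stdio.h>

/*
 * ENTRY_POINT names the function that should receive initial control.
 * A makefile could set it with -DENTRY_POINT=name; it defaults to xyz.
 */
#ifndef ENTRY_POINT
#define ENTRY_POINT xyz
#endif

/* Declare whichever function was named, so the call below compiles. */
int ENTRY_POINT(int argc, char **argv);

/* The function we want to have control. */
int xyz(int argc, char **argv)
{
	(void)argv;
	printf("xyz has control, argc = %d\n", argc);
	return 0;
}

/* Stand-in for the final step of the startup routine. */
int call_entry(int argc, char **argv)
{
	return ENTRY_POINT(argc, argv);
}
```

Choosing a different entry point then means rebuilding this one file
with a different -D value; nothing else in the program has to change.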

People should be asking 'what is the problem that I am trying to
solve', and not 'how do I want to solve it'. Every system has
some constraints in it that we have to live with, but in most
cases it does not prevent you from doing what you want to do, if
you just take a different view of the situation.

Jim Holtman

mccaugh@s.cs.uiuc.edu (08/25/89)

I'm sorry if I offended anyone by my 'correction' - I certainly didn't mean to:
the results reported for the SNOBOL IV programs were indeed correct, I was just
trying to shed a little light...I thought the whole discussion was interesting
in pointing out SNOBOL's multiple entry-point behavior---thanks for bringing
that out!

bengsig@oracle.nl (Bjorn Engsig) (08/28/89)

Article <207600032@s.cs.uiuc.edu> by mccaugh@s.cs.uiuc.edu says:
|
[some code in a language (was it SnowBall? :-) unknown to me deleted]
|
[some stuff about a '$' operator deleted]
|I really don't mean to sound pedantic
I don't know if you are pedantic, but I know for sure that this has nothing
to do with C.

Could we please stop this discussion about main.
-- 
Bjorn Engsig, ORACLE Europe         \ /    "Hofstadter's Law:  It always takes
Path:   mcvax!orcenl!bengsig         X      longer than you expect, even if you
Domain: bengsig@oracle.nl           / \     take into account Hofstadter's Law"

karl@haddock.ima.isc.com (Karl Heuer) (09/02/89)

In article <2634@trantor.harris-atd.com> bbadger@x102c.harris-atd.com (Badger BA 64810) writes:
>Well, it really is only a small detail, but it is a completely unnecessary

Clearly *some* method is necessary to indicate where execution should begin.
I happen to think that invoking a function with a known name is the most
elegant solution I've seen so far.

>If the burden of proof was on putting ``main()'' into a new language, instead
>of taking it out of a language, how would you stand?

That was addressed to Doug, but I'll answer it anyway.  I'd put it in.

But I see it the other way around; it's not so much a matter of "adding main()
to the language" as "removing PROGRAM from the language".  It's simpler.  In
FORTRAN, you've got three types of subprogram: functions, subroutines, and the
main program.  In C, all three are combined into a single uniform entity.  I
consider this a plus.

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

tneff@bfmny0.UUCP (Tom Neff) (09/02/89)

Here's the thing about C standardizing on main() as an entry point.
While the occasional need to circumvent this is legitimate, the
situations where this need arises are NOT portable.  And most of the
sophisticated C implementations or hosting environments I've encountered
do provide you with a (non-portable) way of tweaking your entry point if
you understand the tools well enough.  (Either you can edit and
re-assemble the front end startup code that calls main(), for example,
or the linker has options to rename symbols or designate alternate
entry points... or something else.)

So people with a real need for this can generally get it done.  Meanwhile
it's quite useful to have a well defined standard entry point for the
remaining huge majority of normal cases.
-- 
Annex Canada now!  We need the room,	\)	Tom Neff
    and who's going to stop us.		(\	tneff@bfmny0.UU.NET

bbadger@x102c.harris-atd.com (Badger BA 64810) (09/03/89)

In article <14506@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:
>In article [...] (Badger BA 64810) writes:
[...]
>>If the burden of proof was on putting ``main()'' into a new language, instead
  ^^ Note well!!
>>of taking it out of a language, how would you stand?
>
>That was addressed to Doug, but I'll answer it anyway.  I'd put it in.
>
>But I see it the other way around; it's not so much a matter of "adding main()
>to the language" as "removing PROGRAM from the language".  It's simpler.  In
>FORTRAN, you've got three types of subprogram: functions, subroutines, and the
>main program.  In C, all three are combined into a single uniform entity.  I
>consider this a plus.
>
>Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

I know the burden of proof lies on the side proposing a change.
That's why I said ``if''!  It was a HYPOTHETICAL question.

I want to address the ``main()'' issue from a clean slate to see if
there is any reason to *put it in* the language.  Again, this is
hypothetical, so you can no longer say, ``That's the way it was always
done before.''  

Here's my list of reasons *for* using ``main()'':
1)  It's very simple.  Just have the linker make _start_up the entry
point, and have crt0.o call main().
2)  ``main'' is mnemonic.  It's easy to remember.
3)  ``main'' is short, and easy to type.  Better than ``startup_routine''
  for instance.

Reasons like, 
  o ``FORTRAN and Pascal have a single main program.''
  o ``All my C books say `main()' is the main procedure.''
  o ``It's not portable.  I won't be able to use the new code on REAL C.''
  o ``All my REAL C code won't work with the new compiler.''
are non-responsive. 

I'm asking the hypothetical question:
	If you had a language like C in every way except that a routine 
	called ``main()'' were treated exactly like any procedure with any
	other name, what reasons could you put forward for putting in the
	feature that every program must have a routine called
	``main()'' which will be the first and only routine called after
	startup is finished?

To help prevent off-the-track follow ups, let me state two things:

1) I'm not trying to actually change the way C works.  I know this is
a small detail in the language which can be trivially worked around.
It doesn't really matter much either way.  I'm sure there are lots of
really important things which are better subjects for _real_ standards
committees.  I just wonder whether it deserves a secure place in the
standard, because I can't see that the language would be any worse off
without it.

2) I think that most current C compiler front-ends *already* work the way 
I want: ``main()'' _is_ an ordinary function.  The fixation on main is 
only done in the linker or run-time support.  In support of this, I
note that ``A C Reference Manual'' by Harbison & Steele (1st ed.,
sorry) does not have an index entry for ``main'', nor does ``main''
show up in either syntax of C (app. B and app. C):

	program ::= { top-level-declaration }*
	top-level-declaration ::= function-definition | declaration
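
The claim that the compiler sees nothing special in the entry function can
be sketched in C itself.  This is only an illustration: start_up() here is
a made-up stand-in for the run-time startup code, and blivet() is a
hypothetical entry function; neither name comes from a real library.

```c
#include <stdio.h>

/* Type of any function that could serve as a program entry point. */
typedef int (*entry_fn)(int, char **);

/* Hypothetical stand-in for the run-time startup code (crt0).
 * A real one would set up the stack, environment and stdio first. */
int start_up(entry_fn entry, int argc, char **argv)
{
    return entry(argc, argv);
}

/* An ordinary function; nothing in its definition marks it as special. */
int blivet(int argc, char **argv)
{
    int i;

    for (i = 1; i < argc; i++)
        printf("arg %d: %s\n", i, argv[i]);
    return argc - 1;    /* number of arguments after the program name */
}
```

Under this view the only thing that fixes the name is whichever symbol the
startup code was built to call.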

    -----	-	-	-	-	-	-	-	----
Bernard A. Badger Jr.	407/984-6385          |``Get a LIFE!''  -- J.H. Conway
Harris GISD, Melbourne, FL  32902             |Buddy, can you paradigm?
Internet: bbadger%x102c@trantor.harris-atd.com|'s/./&&/g' Tom sed expansively.

flaps@dgp.toronto.edu (Alan J Rosenthal) (09/04/89)

bbadger@x102c.harris-atd.com (Badger BA 64810) writes:
>I just wonder whether it [main() being the entry point] deserves a secure
>place in the standard, because I can't see that the language would be any
>worse off without it.

Without it, you cannot write Hello, world portably!
"int main() { printf("Hello, world\n"); return(0); }" ??
No, it might not get run.

Remember that the purpose of the standard is to allow the writing of portable
programs.  That means that someone else can recompile your program somewhere
else without editing it and it will work and do the same thing it did for you.
The standard is not merely a guideline as to what is a good C implementation.

ajr

bagpiper@pnet02.gryphon.com (Michael Hunter) (09/05/89)

uhhh.....couldn't you fake a different entry point by not naming any function
main and then using the macro preprocessor to name something main.  
NOTE:  I am not suggesting this, nor would I do it...And god would I hate
to work with anybody's code that did this!!!!

Actually, I guess this doesn't answer the question of being able to define an
entry point at run time...oh well.

                                        Michael Hunter



UUCP: {ames!elroy, <routing site>}!gryphon!pnet02!bagpiper
INET: bagpiper@pnet02.gryphon.com

peter@ficc.uu.net (Peter da Silva) (09/05/89)

----cut here... rcc.c----
/* Special 'cc' for Herman Rubin. */
#include <stdio.h>

/* usage: rcc entry-point cc-options */

main(ac, av)
int ac;
char **av;
{
	FILE *fp;
	char *cc;
	char *getenv();

	if(ac < 2) {
		fprintf(stderr, "%s: missing argument\n", av[0]);
		exit(2);
	}

	if(!(fp = fopen("real_main.c", "w"))) {
		perror("real_main.c");
		exit(1);
	}

	fprintf(fp, "main(ac, av, ep)\n");
	fprintf(fp, "int ac;\n");
	fprintf(fp, "char **av, **ep;\n");
	fprintf(fp, "{\n");
	fprintf(fp, "\treturn %s(ac, av, ep);\n", av[1]);
	fprintf(fp, "}\n");

	fclose(fp);

	cc = getenv("CC");
	if(!cc) cc = "cc";

	av[0] = cc;
	av[1] == "real_main.c";

	execvp(cc, av);

	perror(cc);
	exit(1);
}
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"The Distribution: field on the header has been modified so as not to      'U`
 violate Information Export laws." -- eugene miya, NASA Ames Research Center.

tneff@bfmny0.UUCP (Tom Neff) (09/05/89)

In article <19474@gryphon.COM> bagpiper@pnet02.gryphon.com (Michael Hunter) writes:
>uhhh.....couldn't you fake a different entry point by not naming any function
>main and then using the macro preprocessor to name something main.  

Oh yes if you have the luxury of RECOMPILING then you can play all sorts of
tricks.  I have presumed right along that what these folks want is to decide
ex post facto where in an already-compiled set of object files you want the
program entry point to go.

In fact it's even easier if you have the luxury of REASSEMBLING the usual
front-end code which is REALLY invoked by the OS loader, and which then
calls main() as a normal subprogram.  Again, I assume people want to be
able to choose their entry point even without this ability.

Tim_CDC_Roberts@cup.portal.com (09/06/89)

Let me give an example of a case where "main" as main program possibly
makes an inconvenience.
 
Take Control Data (...please).  On our systems, after compiling and
linking a set of object routines, you end out with an executable
file.  This executable can then be placed into a LIBRARY with many
other executables.  If all the executables have the same entry point
name (main or crt0 or _start_up or whatever), how do you designate which
executable in the current library you wish to invoke?
 
This is solvable, of course, by modifying the librarian to allow the user
to specify a main_program_name, or by using the file name instead of an
entry point.
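
The first workaround, letting the user name the program to run, amounts to
keeping a name next to each entry point.  A rough sketch in C follows; the
table, the abcd1_main/abcd2_main functions and find_entry() are all
hypothetical, and a real librarian would of course store this in its own
file format rather than in a C array.

```c
#include <stddef.h>
#include <string.h>

typedef int (*entry_fn)(void);

struct program {
    const char *name;   /* name the user asks for */
    entry_fn    entry;  /* where that program starts */
};

/* Hypothetical programs standing in for the library's executables. */
int abcd1_main(void) { return 1; }
int abcd2_main(void) { return 2; }

const struct program library[] = {
    { "abcd1", abcd1_main },
    { "abcd2", abcd2_main },
};

/* Return the entry point registered under name, or NULL if none. */
entry_fn find_entry(const char *name)
{
    size_t i;

    for (i = 0; i < sizeof library / sizeof library[0]; i++)
        if (strcmp(library[i].name, name) == 0)
            return library[i].entry;
    return NULL;
}
```

With such a table every executable can keep the same internal entry symbol,
since the lookup is done on the external name.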
 
I'm not advocating changing C.  Our scheme was developed when FORTRAN
(PROGRAM ABCD1), COBOL (ID DIVISION...PROGRAM-NAME IS ABCD2.), and
even Pascal (PROGRAM ABCD3;) were popular.  But adding C to the list
does require some rethinking.
 
Tim_CDC_Roberts@cup.portal.com                | Control Data...
...!sun!portal!cup.portal.com!tim_cdc_roberts |   ...or it will control you.

diamond@csl.sony.co.jp (Norman Diamond) (09/06/89)

In article <6030@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:

>/* Special 'cc' for Herman Rubin. */

...

>main(ac, av)

...

>	av[0] = cc;

Peter!  For shame.  You should know that you can't assign to that.

>	av[1] == "real_main.c";

Uh, this one you can do, but you can't do what you meant to do.
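
For reference, a sketch of what the quoted lines were presumably meant to
do.  To stay clear of the question of assigning to the argument vector, it
fills a separate array instead; build_cc_args() and its parameters are
made up for this illustration and are not part of the posted program.

```c
#include <stddef.h>

/* Hypothetical helper: fill out[] (room for ac + 1 pointers, ac >= 2)
 * with the argument vector to hand to the real compiler. */
void build_cc_args(char **out, char *cc, int ac, char **av)
{
    int i;

    out[0] = cc;              /* run the compiler under its own name */
    out[1] = "real_main.c";   /* '=' stores the pointer; '==' only compares */
    for (i = 2; i < ac; i++)  /* pass the remaining cc options through */
        out[i] = av[i];
    out[ac] = NULL;           /* execvp() expects a terminating null pointer */
}
```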

--
-- 
Norman Diamond, Sony Corporation (diamond@ws.sony.junet)
  The above opinions are inherited by your machine's init process (pid 1),
  after being disowned and orphaned.  However, if you see this at Waterloo or
  Anterior, then their administrators must have approved of these opinions.

peter@ficc.uu.net (Peter da Silva) (09/07/89)

In article <21897@cup.portal.com>, Tim_CDC_Roberts@cup.portal.com writes:
> Let me give an example of a case where "main" as main program possibly
> makes an inconvenience.
[describes executable library]

Sounds like a non-standard-extension situation, like what VMS C has to
do to support system include libraries and DEC's penchant for system
calls that have lots of dollar signs in them, or for that matter what
you do to implement any number of system-dependent objects (Mac desk
accessories, Amiga device handlers, etc...).

karl@haddock.ima.isc.com (Karl Heuer) (09/09/89)

In article <10810@riks.csl.sony.co.jp> diamond@riks. (Norman Diamond) writes:
>>	av[0] = cc;
>Peter!  For shame.  You should know that you can't assign to that.

The pANS guarantees that the argv array is modifiable, as are the strings to
which its members point.  See 2.1.2.2.
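
A minimal sketch of what that guarantee allows, assuming a hosted
implementation that follows the cited section.  It touches only the
strings, not the array itself, and the helper name upcase_args() is
invented for the example.

```c
#include <ctype.h>

/* Hypothetical helper: upper-case every argument after the program
 * name, writing directly into the strings argv points to. */
void upcase_args(int argc, char **argv)
{
    int i;
    char *p;

    for (i = 1; i < argc; i++)
        for (p = argv[i]; *p != '\0'; p++)
            *p = (char)toupper((unsigned char)*p);
}
```

Called as upcase_args(argc, argv) at the top of main(), this rewrites the
arguments in place without copying them.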

Karl W. Z. Heuer (ima!haddock!karl or karl@haddock.isc.com), The Walking Lint

diamond@csl.sony.co.jp (Norman Diamond) (09/13/89)

In article <10810@riks.csl.sony.co.jp> I wrote:
>>>	av[0] = cc;
>>Peter!  For shame.  You should know that you can't assign to that.

In article <14559@haddock.ima.isc.com> karl@haddock.ima.isc.com (Karl Heuer) writes:

>The pANS guarantees that the argv array is modifiable, as are the strings to
>which its members point.

Really?  OK, I believe you.  It used to be unsafe to try to modify it.

>See 2.1.2.2.

This is very difficult.  Global Engineering Documents refused to sell
me one.  Anyone want to send me an illegal copy?


mouse@mcgill-vision.UUCP (der Mouse) (09/16/89)

In article <21897@cup.portal.com>, Tim_CDC_Roberts@cup.portal.com writes:
> Let me give an example of a case where "main" as main program
> possibly makes an inconvenience.

> Take Control Data (...please).  On our systems, after compiling and
> linking a set of object routines, you end out with an executable
> file.  This executable can then be placed into a LIBRARY with many
> other executables.  If all the executables have the same entry point
> name (main or crt0 or _start_up or whatever), how do you designate
> which executable in the current library you wish to invoke?

> I'm not advocating changing C.  Our scheme was developed when FORTRAN
> (PROGRAM ABCD1), COBOL (ID DIVISION...PROGRAM-NAME IS ABCD2.), and
> even Pascal (PROGRAM ABCD3;) were popular.  But adding C to the list
> does require some rethinking.

main() is a specification of how the *C source code* specifies the main
entry point.  It does not necessarily imply that the symbol table of
the output file (assuming there is such) contains anything bearing any
resemblance to "main".  A C compiler on your CDC system would be
perfectly justified in recognizing the name "main" specially and
producing, instead of a normal symbol table entry, a name derived
some other way (from the name of the source file, perhaps?) and marked as
"main entry point", the same as would be produced by FORTRAN or Pascal
for a PROGRAM, for PL/I's PROC OPTIONS(MAIN), etc.  (I don't know
whether main() is required to work if called recursively or not; if so,
the compiler would have to recognize the name "main" in certain other
contexts as well.)  All that's required is that there be no possibility
of this special symbol table entry preventing the user from writing a
routine which happens to have the same name as the source file (for
example).  Nothing insurmountable.
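
One way such a derived name might be computed, purely as an illustration:
strip the directory and the extension from the source file name.  The
function entry_symbol_for() and the '/' path separator are assumptions made
for the sketch, not a description of any actual CDC compiler.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical: derive an entry-point name from a source path by
 * dropping any leading directories and the final ".ext".  The result
 * is truncated to fit out[] and always null-terminated. */
void entry_symbol_for(const char *path, char *out, size_t outsize)
{
    const char *base = strrchr(path, '/');
    const char *dot;
    size_t len;

    if (outsize == 0)
        return;
    base = base ? base + 1 : path;      /* skip past the last '/' */
    dot = strrchr(base, '.');
    len = dot ? (size_t)(dot - base) : strlen(base);
    if (len >= outsize)
        len = outsize - 1;
    memcpy(out, base, len);
    out[len] = '\0';
}
```

A compiler doing this would still have to keep the derived name in a
namespace separate from ordinary user symbols, as noted above.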

					der Mouse

			old: mcgill-vision!mouse
			new: mouse@larry.mcrcim.mcgill.edu