[comp.lang.c] How big is the argv[] array?

hood@osiris.cso.uiuc.edu (10/04/88)

How big is the argv[] array?  Or to ask it another way, how safe is it to
go past the argc-1'th element in the argv[] array?

Isn't it true that the array of pointers (or pointer to pointers, depending
on your point of view) "argv" actually contains argc+1 elements, and not
argc elements?

Consider the following command line, "command this is a test".  Is argv
passed to the program like this:

	argc = 5				argc = 5
	argv[0] = command			argv[0] = command	
	argv[1] = this				argv[1] = this
	argv[2] = is		or is it	argv[2] = is
	argv[3] = a				argv[3] = a
	argv[4] = test				argv[4] = test
	argv[5] = NULL

I had read somewhere that argc was not really required to process command
line arguments because, you could test for the NULL in the array to
signal the end of the arguments.

Consider this code fragment where I "walk off the end" of argv[] if argc
is equal to 1.  Is this guaranteed to work?

main(argc, argv)
int argc;
char *argv[];
{
	int i;
					/* fake an argument */
	if (argc == 1) {
		argv[argc] = "dummy";
		argc++;
	}
					/* process arguments */
	for (i=1; i<argc; i++) {
		.
		.
	}
}

Emmet P. Gray				US Army, HQ III Corps & Fort Hood
...!uunet!uiucuxc!fthood!egray		Attn: AFZF-DE-ENV
					Directorate of Engineering & Housing
					Environmental Management Office
					Fort Hood, TX 76544-5057

ron@ron.rutgers.edu (Ron Natalie) (10/04/88)

Your code will work, albeit a pretty raunchy way of doing things.
Assigning into the argc'th position is safe since it is allocated
(it holds that null, -1 if you're really living in the past).
You just have to be careful that nothing later on is going to check
to see if that null was there (like you pass the whole kit and
kaboodle to execv or something).  Perhaps I can suggest an
alternative:

main(argc, argv)
	int	argc;
	char	**argv;
{
	if(argc == 1)
		process_arg(DEFAULT_ARG);
	else
		while(--argc)
			process_arg(*++argv);
}
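
A minimal, self-contained sketch of the same idea, for readers who want
something they can compile.  process_arg and DEFAULT_ARG are not defined
in the post, so the versions here are hypothetical stand-ins (the default
is arbitrarily "dummy", and process_arg just records what it was given).
Each string is passed as *++argv, and argv[argc] is never written to.

```c
#include <assert.h>
#include <stddef.h>

#define DEFAULT_ARG "dummy"     /* hypothetical default argument */

int handled;                    /* how many arguments have been processed */
const char *last_arg;           /* the most recent one */

/* Hypothetical stand-in for the real per-argument work. */
void process_arg(const char *arg)
{
    handled++;
    last_arg = arg;
}

/* Process argv[1] .. argv[argc-1], or the default if there are none. */
void process_all(int argc, char **argv)
{
    assert(argv != NULL);
    if (argc <= 1)
        process_arg(DEFAULT_ARG);
    else
        while (--argc)
            process_arg(*++argv);
}
```

Calling process_all(argc, argv) from main gives the behaviour asked for
in the original question while leaving the argument vector untouched.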

Almost anything you can kludge, you can spend a minute to do right.

-Ron

henry@utzoo.uucp (Henry Spencer) (10/05/88)

In article <1239500004@osiris.cso.uiuc.edu> hood@osiris.cso.uiuc.edu writes:
>How big is the argv[] array?  Or to ask it another way, how safe is it to
>go past the argc-1'th element in the argv[] array?
>
>Isn't it true that the array of pointers (or pointer to pointers, depending
>on your point of view) "argv" actually contains argc+1 elements, and not
>argc elements?

That is correct.  argv[argc] is guaranteed to exist and be NULL, and thus
argv has argc+1 elements (argc arguments plus the NULL).  Trying to access
argv[argc+1] or higher, however, is not safe.
-- 
The meek can have the Earth;    |    Henry Spencer at U of Toronto Zoology
the rest of us have other plans.|uunet!attcan!utzoo!henry henry@zoo.toronto.edu

gwyn@smoke.ARPA (Doug Gwyn ) (10/05/88)

In article <1239500004@osiris.cso.uiuc.edu> hood@osiris.cso.uiuc.edu writes:
>How big is the argv[] array?  Or to ask it another way, how safe is it to
>go past the argc-1'th element in the argv[] array?

What puzzles me is the line of reasoning you would use to conclude
that accessing nonexistent data could ever be safe.

argv[argc] is supposed to be a null pointer, and as you note you
can test for that instead of using a counter.  (However, getopt()
works best with the counter approach.)  Some older systems did
not have a null pointer in argv[argc], however.  I don't think
you're likely to encounter any such systems today.

argv[argc+1] is simply undefined.  Any use of it would be at your
own risk; in other words: don't do it.
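
A minimal sketch of the sentinel test mentioned above, assuming an
environment where argv[argc] really is a null pointer (on the older
systems described, this loop would run off the end).  The function name
count_args is made up for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Count the entries of a null-terminated argument vector without argc.
 * Only argv[0] up to and including the terminating null pointer are read. */
size_t count_args(char **argv)
{
    size_t n = 0;

    assert(argv != NULL);
    while (argv[n] != NULL)
        n++;
    return n;
}
```

On a conforming system, count_args(argv) called from main should equal
argc; nothing past the terminating null pointer is ever touched.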

gwyn@smoke.ARPA (Doug Gwyn ) (10/05/88)

In article <Oct.4.11.12.37.1988.6064@ron.rutgers.edu> ron@ron.rutgers.edu (Ron Natalie) writes:
>Almost anything you can kludge, you can spend a minute to do right.

Which beats spending an hour later finding and fixing a problem caused
by the kludge.

dhesi@bsu-cs.UUCP (Rahul Dhesi) (10/05/88)

In article <8631@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>argv[argc] is supposed to be a null pointer...
>Some older systems did
>not have a null pointer in argv[argc], however.  I don't think
>you're likely to encounter any such systems today.

Since this is comp.lang.c and not a UNIX newsgroup, the following
footnote from "Advanced UNIX Programming" by Marc J. Rochkind (p 105)
is relevant:

     "The C Programming Language" does not state anywhere that argv[argc]
     is NULL.  Neither do most UNIX manuals.  So it's unwise to assume that
     argv is terminated with NULL, because in non-UNIX systems or in
     UNIX clones it may not be true.

(Does ANSI C make argv[argc] valid?  Even so, ANSI-conformant compilers
will not be universally available for some years.)  There are so many
different non-UNIX C implementations around that a defensive programmer
will avoid assuming too much that is not in K&R.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi

shankar@hpclscu.HP.COM (Shankar Unni) (10/06/88)

Re: Expanding the argv array:

Try:

  argv = (char **) realloc (argv, newsize * sizeof (char *))
     /* newsize is the size of your expanded argv array */
  for (ctr = OLDargc; ctr < newsize; ctr++)
	argv[ctr] = (char *) 0;
  /* now append arguments to your heart's content.. */

	/* NO COMMENTS ABOUT PASCAL'ish CODING!!! :-) */

--
Shankar.

ok@quintus.UUCP (Richard A. O'Keefe) (10/06/88)

In article <4206@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>In article <8631@smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>>argv[argc] is supposed to be a null pointer...
>Since this is comp.lang.c and not a UNIX newsgroup, the following
>footnote from "Advanced UNIX Programming" by Marc J. Rochkind (p 105)
>is relevant:
>
>     "The C Programming Language" does not state anywhere that argv[argc]
>     is NULL.  Neither do most UNIX manuals.

The System V Interface Definition says (Vol 1, Page 71, 1st paragraph)
	"argv is terminated by a null pointer", and
	"envp is terminated by a null-pointer".
The 4.2BSD manual page says
	"the array of pointers is terminated by a null pointer".
The VAX-11 C manual (for VMS) says on page 24-5
	"The last element of [argv] is always the null pointer (0).", and
	"The last element in envp must be the null pointer (0)."
On IBM mainframes, SAS Lattice C (C-101, p 4.6) says that
	"argv[argc] is 0"
and the IBM C Compiler reference manual says on page 8.3 that
there is a null pointer after the arguments in argv[].  (It should be
noted that the SAS manual gives an example of calling C from Fortran
which violates this convention, but that example also violates the convention
that the elements of argv[] are pointers to strings.)

So the assumption that !argv[argc] is portable between System V, 4.2BSD,
VMS, and IBM mainframes.

Which are the implementations of C where argv[] is present but argv[argc]
is not defined?

bill@proxftl.UUCP (T. William Wells) (10/06/88)

In article <4206@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
: (Does ANSI C make argv[argc] valid?

Yes. And argv[argc] == 0.

---
Bill

You can still reach me at proxftl!bill
But I'd rather you send to proxftl!twwells!bill

ron@ron.rutgers.edu (Ron Natalie) (10/07/88)

>  argv = (char **) realloc (argv, newsize * sizeof (char *))

NO, NO, NO!  THIS WILL NEVER WORK!

You can't realloc something that wasn't malloc'd.  You can't make
the assumption that argv was allocated by malloc, because it isn't
true.  ARGV is usually found on the stack, but even that is left
up to the implementation.
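
One way to get a larger argument vector without touching the original
storage is to copy the pointers into memory that does come from malloc.
This is only a sketch; the name append_arg and its interface are
assumptions made for illustration, not something from the posts above.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Return a new heap-allocated vector holding the argc original pointers,
 * then `extra`, then a terminating null pointer.  Only the pointers are
 * copied, not the strings.  Returns NULL if malloc fails.  The caller
 * owns the result and may later free() or realloc() it. */
char **append_arg(int argc, char **argv, char *extra)
{
    char **copy;

    assert(argc >= 0 && argv != NULL);
    copy = malloc(((size_t)argc + 2) * sizeof *copy);
    if (copy == NULL)
        return NULL;
    memcpy(copy, argv, (size_t)argc * sizeof *copy);
    copy[argc] = extra;
    copy[argc + 1] = NULL;
    return copy;
}
```

Because the result was obtained from malloc, growing it further with
realloc is legitimate, and the null terminator is preserved for anything
that later hands the vector to execv.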

-Ron

merlyn@intelob.intel.com (Randal L. Schwartz @ Stonehenge) (10/07/88)

In article <507@quintus.UUCP>, ok@quintus (Richard A. O'Keefe) writes:
| Which are the implementations of C where argv[] is present but argv[argc]
| is not defined?

Not so much undefined, as is different.  Dig out an old V6 or PWB UNIX
manual, if you got'em.  In those beasties (back in the old days),

argv[argc] == -1

In fact, the sentence in the V7 execve(2) manpage that reads something
like "the array passed as argv can be used directly in another execve
call because argv[argc] is 0" was changed only slightly from the same
page in V6 which read "the array... *cannot* be used... argv[argc] is
-1".  It never made a lot of grammatical sense after the change.

BSD (or is it Ultrix?  Don't have access to a BSD manpage anymore) got
rid of that sentence.

Just a bit of history.  If you are writing a program to port between
4.3-tahoe and a V6 system, beware! :-)
-- 
Randal L. Schwartz, Stonehenge Consulting Services (503)777-0095
on contract to BiiN Technical Information Services (for now :-),
in a former Intel building in Hillsboro, Oregon, USA
<merlyn@intelob.intel.com> or ...!tektronix!inteloa[!intelob]!merlyn
Standard disclaimer: I *am* my employer!

swillden@wsccs.UUCP (Shawn Willden) (10/07/88)

In article <1239500004@osiris.cso.uiuc.edu>, hood@osiris.cso.uiuc.edu writes:
> 
> How big is the argv[] array?  Or to ask it another way, how safe is it to
> go past the argc-1'th element in the argv[] array?
> 
> Isn't it true that the array of pointers (or pointer to pointers, depending
> on your point of view) "argv" actually contains argc+1 elements, and not
> argc elements?

No.  At least not in any C I know of.  Argv has argc elements.
It is possible that some C compilers may give argv argc+1 elements but I
know that Borland's Turbo C for PC's and VAX C do not.

swillden@wsccs

gwyn@smoke.ARPA (Doug Gwyn ) (10/07/88)

In article <660020@hpclscu.HP.COM> shankar@hpclscu.HP.COM (Shankar Unni) writes:
>  argv = (char **) realloc (argv, newsize * sizeof (char *))

NO! NO! NO!  Realloc only areas obtained from malloc (or calloc).
You don't know where the argv array came from but the odds are good
that it was NOT malloced by the startup module.

meissner@xyzzy.UUCP (Michael Meissner) (10/09/88)

In article <507@quintus.UUCP>, ok@quintus (Richard A. O'Keefe) writes:
| Which are the implementations of C where argv[] is present but argv[argc]
| is not defined?

In article <2989@mipos3.intel.com> merlyn@intelob.intel.com (Randal L.
Schwartz @ Stonehenge) writes:

| Not so much undefined, as is different.  Dig out an old V6 or PWB UNIX
| manual, if you got'em.  In those beasties (back in the old days),
| 
| argv[argc] == -1
	...
| Just a bit of history.  If you are writing a program to port between
| 4.3-tahoe and a V6 system, beware! :-)

I agree that V6 had this misfeature (and AT&T corrected it in V7, which
is the basis for all modern UNIX systems).  However, if you did find
a V6 PDP-11 to port to, I would wager that this might be the least of
your troubles.  Both C and UNIX have changed a lot from those days.  I
wonder if there are any V6 systems still used for development these
days (I can believe that some are still chugging away on applications
written back then....).
-- 
Michael Meissner, Data General.

Uucp:	...!mcnc!rti!xyzzy!meissner
Arpa:	meissner@dg-rtp.DG.COM   (or) meissner%dg-rtp.DG.COM@relay.cs.net

diamond@csl.sony.JUNET (Norman Diamond) (10/11/88)

In article <660020@hpclscu.HP.COM>, shankar@hpclscu.HP.COM (Shankar Unni) writes:
>   argv = (char **) realloc (argv, newsize * sizeof (char *))
>      /* newsize is the size of your expanded argv array */

What does realloc do when its first argument is a pointer to storage
that was not obtained by malloc?  Maybe you won't crash until your
program exits and the shell reads the next command line?
-- 
-------------------------------------------------------------------------------
  The above opinions are my own.   |   Norman Diamond
  If they're also your opinions,   |   Sony Computer Science Laboratory, Inc.
  you're infringing my copyright.  |   diamond%csl.sony.jp@relay.cs.net

jim.nutt@p11.f15.n114.z1.fidonet.org (jim nutt) (10/29/88)

 ND> In article <660020@hpclscu.HP.COM>, shankar@hpclscu.HP.COM (Shankar 
 ND> Unni) writes:
 ND> >   argv = (char **) realloc (argv, newsize * sizeof (char *))
 ND> >      /* newsize is the size of your expanded argv array */
 ND> 
 ND> What does realloc do when its first argument is a pointer to storage
 ND> that was not obtained by malloc?  Maybe you won't crash until your
 ND> program exits and the shell reads the next command line?

it varies seriously on microcomputer compilers.  borland's turbo c handles it 
least gracefully, with msc a little more friendly...  zortech c waits until 
the program ends to die....  it also depends on the memory model...

jim nutt
'the computer handyman'



--  
St. Joseph's Hospital/Medical Center - Usenet <=> FidoNet Gateway
Uucp: ...{gatech,ames,rutgers}!ncar!noao!asuvax!stjhmc!15.11!jim.nutt
Internet: jim.nutt@p11.f15.n114.z1.fidonet.org

crossgl@ingr.UUCP (Gordon Cross) (11/01/88)

In article <10033@socslgw.csl.sony.JUNET>, diamond@csl.sony.JUNET (Norman Diamond) writes:
> 
> What does realloc do when its first argument is a pointer to storage
> that was not obtained by malloc?  Maybe you won't crash until your
> program exits and the shell reads the next command line?

My experience with these types of problems is that (assuming the memory is
writable) realloc simply assumes that the pointer it has been handed 
(excepting the NULL case which many implementations treat as if malloc had
been called) was previously allocated via malloc.  If the realloc call does
not cause some sort of memory fault (and in many cases it may not) it winds
up corrupting the data structures that malloc uses.  Eventually the program
will likely abort (I don't like the word "crash" - it implies a system crash)
but only during a subsequent call to malloc, realloc, calloc, or free....
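
For contrast, a sketch of realloc used only on pointers it may
legitimately receive: a null pointer, or something a previous malloc or
realloc returned.  ANSI C defines realloc(NULL, n) to behave like
malloc(n); pre-ANSI libraries may differ, as noted above.  The name
push_arg and its interface are hypothetical.

```c
#include <assert.h>
#include <stdlib.h>

/* Append one pointer to a growable heap-allocated list.
 * *list may start out as NULL with *count == 0.
 * Returns 0 on success, -1 if realloc fails (the old list stays valid). */
int push_arg(char ***list, size_t *count, char *arg)
{
    char **grown;

    assert(list != NULL && count != NULL);
    grown = realloc(*list, (*count + 1) * sizeof *grown);
    if (grown == NULL)
        return -1;
    grown[*count] = arg;
    *list = grown;
    (*count)++;
    return 0;
}
```

The result of realloc is kept in a temporary first, so a failed call
does not lose the only pointer to the existing block.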


Gordon Cross
Intergraph Corp.  Huntsville, AL