[comp.lang.c] Is this kosher?

art@dinorah.wustl.edu (Arthur B. Smith) (12/06/89)

Forgive me if this has already been discussed....  I don't remember
seeing it recently, anyway.  Here is a much shortened version of what
I am doing, and I want to know if it's ok.  The particularly
questionable thing is marked.

typedef struct node
  {
  char        * str;
  struct node * next;
  } node;

void f ( )

  {
  node * listhd;

  if ( (listhd = (node *)malloc(sizeof(node))) == NULL )
    {
    cough_up_and_die();	/* Details not important */
    }
  else
    {
    listhd->str = "string literal";   /* This is the questionable line... */
    listhd->next = 0;
    recursive_function(listhd);	      /* ...given this usage */
    }
  return;
  }

    In particular, is it safe to assign the address of the string
literal in the list and assume that the contents of that address do
not change when in another function?

    As I RTFM, string literals have a static storage class, which
means that they are guaranteed to have the same contents inside the
block, but I am not clear on their linkage (which I think determines
whether they have the same contents outside the block as well).  K&R2
(since I don't have an ANSI draft) says:

"A string [literal] has type "array of characters" and storage class
static and is initialized with the given characters." 

and

"Static objects may be local to a block or external to all blocks, but
in either case retain their values across exit from and reentry to
functions and blocks."  This is followed by how to declare the linkage
for objects declared outside blocks using static and extern keywords, etc.

    I cannot find anywhere that indicates the linkage for the string
constant.  If my recursive_function() in the example above uses the
value in listhd->str (say for a strcmp()), can it be sure that the
string contains "string literal"?  If not, how do I assure that?
Do I have to do something like:

typedef struct node
  {
  char        * str;
  struct node * next;
  } node;

char * stick_around = "string literal";	    /* Yucko! */

void f ( )

  {
  node * listhd;

  if ( (listhd = (node *)malloc(sizeof(node))) == NULL )
    {
    cough_up_and_die();	/* Details not important */
    }
  else
    {
    listhd->str = stick_around;	/* What's in stick_around? */
    listhd->next = 0;
    recursive_function(listhd);
    }
  return;
  }

    This is certainly less readable.  I _am_ interested in portability
considerations, both theoretical and real (you can have your way with
the "weird" machines, Chris 8^).  I will try to follow this group, but
personal replies are also welcome.  Thanks in advance!

    art smith  (art@dinorah.wustl.edu  or ...!uunet!wucs1!dinorah!art)

Usual disclaimers apply.

chris@mimsy.umd.edu (Chris Torek) (12/06/89)

In article <1047@dinorah.wustl.edu> art@dinorah.wustl.edu (Arthur B. Smith)
writes:
>    listhd->str = "string literal";   /* This is the questionable line... */
 ...
>    As I RTFM, string literals have a static storage class, which
>means that they are guaranteed to have the same contents inside the
>block, but I am not clear on their linkage (which I think determines
>whether they have the same contents outside the block as well).

The following definitions may help:

  Scope - determines where the identifier's name may be used.  This is
  a matter for the compiler.  (Possibilities: block, file, function
  [labels only], prototype [prototypes only].)

  Linkage - basically the same as scope, but applies to the linker.
  (Possibilities: external [global], internal [file-wide but no more],
  none.)

  Duration (aka Lifetime) - determines when the contents of a variable
  are valid.  (Possibilities: static, automatic.)

For instance, a local array such as the one created by

	f() { char array[12]; ... }

has block scope (the name `array' vanishes at the close brace), has no
linkage (is effectively invisible to the linker), and has automatic
duration (lasts while the particular activation of f() is running).  To
change its duration to static, one declares it static:

	f() { static char array[12]; ... }
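The effect of that one keyword on duration can be sketched as follows. This is a minimal illustration rather than code from the thread; the names count_auto and count_static are made up for the example.

```c
/* count_auto: `calls` has automatic duration, so a fresh copy is
   created and initialized on every activation of the function. */
int count_auto(void)
{
    int calls = 0;

    calls++;
    return calls;           /* always 1 */
}

/* count_static: `calls` has static duration, so it is initialized
   once and keeps its value between calls.  Its scope is still the
   block: the name cannot be used outside count_static(). */
int count_static(void)
{
    static int calls = 0;

    calls++;
    return calls;           /* 1, 2, 3, ... on successive calls */
}
```

Calling count_auto() three times yields 1 each time, while three calls to count_static() yield 1, 2 and 3. In both cases the name `calls' has block scope and no linkage; only the duration differs.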

A string literal is an unnamed static array of char, and hence has:

  scope: none (there is no identifier, since it is unnamed)
  linkage: none
  duration: static

All static variables have static duration, i.e., are valid at all times
at which the program is running.  Some entities---global variables and
all functions---*always* have static duration.  The confusion comes in
here, as the `static' keyword can be applied to them, this time
changing not their duration (which is already static) but rather their
linkage.  Static globals get internal linkage, which means that their
names do not appear to exist outside the one file:

	int func() { ... }
	static int f1() { ... }
	/* func() can be seen outside, but f1() cannot */
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn) (12/06/89)

In article <1047@dinorah.wustl.edu> art@dinorah.wustl.edu (Arthur B. Smith) writes:
>    In particular, is it safe to assign the address of the string
>literal in the list and assume that the contents of that address do
>not change when in another function?

Yes, string literals have static storage duration.  Only register
and auto objects end their lifetimes when leaving the block.

Be careful you don't feed a string literal to free().
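
A minimal sketch of both points, assuming a hosted ANSI C compiler. The helpers make_node, count_matches (standing in for the original recursive_function) and free_list are hypothetical names, and the `const' on str is an addition not in the original post, made because modifying a string literal is undefined.

```c
#include <stdlib.h>
#include <string.h>

typedef struct node {
    const char  *str;   /* may point at a string literal; never freed */
    struct node *next;
} node;

/* make_node: hypothetical helper; returns NULL if malloc() fails. */
node *make_node(const char *s)
{
    node *n = malloc(sizeof *n);

    if (n != NULL) {
        n->str = s;
        n->next = NULL;
    }
    return n;
}

/* count_matches: walks the list recursively and counts the nodes
   whose string compares equal to key. */
int count_matches(const node *n, const char *key)
{
    if (n == NULL)
        return 0;
    return (strcmp(n->str, key) == 0) + count_matches(n->next, key);
}

/* free_list: releases the nodes only.  The strings did not come
   from malloc(), so they must not be handed to free(). */
void free_list(node *n)
{
    while (n != NULL) {
        node *next = n->next;

        free(n);
        n = next;
    }
}
```

A node built with make_node("string literal") can be passed to count_matches() from any function, since the literal's storage lasts for the whole run of the program. If some nodes held malloc()ed strings and others held literals, the list would need a way to tell them apart before freeing.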

>I cannot find anywhere that indicates the linkage for the string constant.

Linkage is not relevant here.  (Linkage basically has to do with
whether or not the identifier is visible to other translation units.)

6600pete@hub.UUCP (12/07/89)

From article <21122@mimsy.umd.edu>, by chris@mimsy.umd.edu (Chris Torek):
> 	f() { char array[12]; ... }

What's the use of declaring such a thing beyond passing a pointer to it to
another module? (BTW, IMHO, this one use of the declaration runs counter to
intuitive programming practices because it peppers the .c file with things
that go in the binary executable image file instead of on the stack.)

> Some entities---global variables and
> all functions---*always* have static duration.  The confusion comes in
> here, as the `static' keyword can be applied to them, this time
> changing not their duration (which is already static) but rather their
> linkage.  Static globals get internal linkage, which means that their
> names do not appear to exist outside the one file...

IMHO, this is a weakness in the standard. Has it been bashed out before?
-------------------------------------------------------------------------------
Pete Gontier   : InterNet: 6600pete@ucsbuxa.ucsb.edu, BitNet: 6600pete@ucsbuxa
Editor, Macker : Online Macintosh Programming Journal; mail for subscription
Hire this kid  : Mac, DOS, C, Pascal, asm, excellent communication skills

chris@mimsy.umd.edu (Chris Torek) (12/07/89)

>In article <21122@mimsy.umd.edu> I wrote:
>> 	f() { char array[12]; ... }

In article <3248@hub.UUCP> 6600pete@hub.UUCP writes:
>What's the use of declaring such a thing beyond passing a pointer to it to
>another module? (BTW, IMHO, this one use of the declaration runs counter to
>intuitive programming practices because it peppers the .c file with things
>that go in the binary executable image file instead of on the stack.)

I do not understand this question.

A local array is exactly as useful as N local variables that can be
addressed randomly (since that is what it is).  It also typically does
go on a stack (C does not say what a `stack' might be; some
implementations might, e.g., put small arrays into registers, if the
machine architecture permits it).

If you mean instead `what is the use of declaring a static array beyond'
etc., you need some sort of stable object if it is intended to last
beyond the activation of the function that contains it.  Some people
disapprove of this practise on principle, since it leads to strange
results in cases like

	printf("time1: %stime2: %s", ctime(&t1), ctime(&t2));

since ctime() tends to use static storage.  (Remember that ctime's
return points to a NUL-terminated string whose last printing character
is a newline, hence no \n in the printf above.)  The alternatives,
however, are sometimes just as bad:

	char *ct1, *ct2;
	ct1 = new_ctime(&t1);
	ct2 = new_ctime(&t2);
	printf("time1: %stime2: %s", ct1, ct2);
	free(ct1);
	free(ct2);

or

	char ct1[CTIME_SIZE], ct2[CTIME_SIZE];
	printf("time1: %stime2: %s", ctime(&t1, ct1), ctime(&t2, ct2));

The former requires heap allocation at runtime, while the latter
requires deciding on a CTIME_SIZE in advance and sticking to it (even
if it later proves to be too small).  (Ctime is a bad example here
since it already promises a fixed format.)
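
The static-storage pitfall and the caller-supplied-buffer alternative can be sketched without relying on the real ctime(). The functions label_static and label_into below are hypothetical stand-ins, and snprintf() assumes a C99 or later library.

```c
#include <stdio.h>
#include <stddef.h>

/* label_static: ctime()-style interface.  Every call formats into
   the same static buffer and returns its address, so a second call
   overwrites the result of the first. */
char *label_static(int n)
{
    static char buf[32];

    snprintf(buf, sizeof buf, "item %d\n", n);
    return buf;
}

/* label_into: the caller supplies the storage, so two results can
   exist at the same time.  Returns buf for convenience. */
char *label_into(int n, char *buf, size_t size)
{
    snprintf(buf, size, "item %d\n", n);
    return buf;
}
```

With these, printf("%s%s", label_static(1), label_static(2)) prints the same line twice, because both arguments are the same pointer; which of the two texts appears depends on the order in which the arguments happen to be evaluated. Passing two separate buffers to label_into() gives both lines, at the cost of choosing the buffer size in advance.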

>>Some entities---global variables and
>>all functions---*always* have static duration.  The confusion comes in
>>here, as the `static' keyword can be applied to them, this time
>>changing not their duration (which is already static) but rather their
>>linkage.  Static globals get internal linkage, which means that their
>>names do not appear to exist outside the one file...

>IMHO, this is a weakness in the standard. Has it been bashed out before?

How can it be a weakness in the standard when it was required by
K&R-1?  It *could* be considered a weakness in C, that this keyword
(`static') was overloaded to also mean `private' in various
circumstances.  However, it is worth pointing out that not all possible
combinations can be had.  In particular, if an object's duration is to
be automatic, then the object must be declared in a block, and cannot have
linkage.  This leaves only static-duration objects to consider.  These
can have scope and linkage.  List all combinations, and you will find
that some are nonsensical:

	scope: (choices are block and file)
	linkage: (choices are none, internal, external)

 if scope = block:

	linkage = none is sensible
	linkage = internal is not (you have to name the object to
		use it, and block scope prevents naming it)
	linkage = external is not (same problem)

 if scope = file:

	linkage = none is not sensible
	linkage = internal is sensible (a private global name)
	linkage = external is sensible (a public global name)

By assigning a `sensible' default action (with no modifier), one can
see that only one modifier is needed for file scope names (to change
private to public or vice versa), and no modifiers are needed for block
scope names.  Thus, one can reuse an existing keyword that is not
applicable to file scope names to be the private/public modifier.
The keyword need not be `static'---e.g., `auto' would have worked,
or even `for' or `goto'.
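
The three `sensible' combinations for static-duration objects can be put side by side in one file. This is only a sketch; the names public_total, private_total, record and private_total_value are invented for the illustration.

```c
/* File scope, external linkage: a public global name.  Another
   file can reach it with `extern int public_total;'. */
int public_total = 0;

/* File scope, internal linkage: a private global name, usable by
   every function in this file but by no other file. */
static int private_total = 0;

/* record: updates all three counters and returns how many times
   it has been called. */
int record(int amount)
{
    /* Block scope, no linkage, static duration: only record() can
       name it, yet it keeps its value between calls. */
    static int calls = 0;

    public_total += amount;
    private_total += 2 * amount;
    calls++;
    return calls;
}

/* private_total_value: the only route by which code in other
   files can read private_total. */
int private_total_value(void)
{
    return private_total;
}
```

Here `static' appears twice with two different effects: on calls it changes the duration, and on private_total it changes the linkage, which is the overloading discussed above.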

This is not to say that avoiding keywords for the sake of avoiding
keywords is necessarily a good thing, or that I agree with the choice
of `static' as a modifier to imply `private'.  (Note that I have NOT
told you what choices I would make!  I say only that one can make
the above argument.  Whether I believe in it is not really a matter
for comp.lang.c.)

6600pete@hub.UUCP (12/07/89)

From article <21144@mimsy.umd.edu>, by chris@mimsy.umd.edu (Chris Torek):
--- f() { char array[12]; ... }
 
-- What's the use of declaring such a thing beyond passing a pointer to it to
-- another module?
 
- ...you need some sort of stable object if it is intended to last
- beyond the activation of the function that contains it.
 
Right. I wasn't thinking that one would do anything but make the declaration
global. I suppose in a large file requiring lots of "stable objects" this
would be necessary, but it didn't occur to me.
 
-----------------
 
-- (BTW, IMHO, this one use of the declaration runs counter to
-- intuitive programming practices because it peppers the .c file with things
-- that go in the binary executable image file instead of on the stack.)
 
- Some people disapprove of this practise on principle, since it leads to
- strange results in cases like
-
-       printf("time1: %stime2: %s", ctime(&t1), ctime(&t2));
-
- since ctime() tends to use static storage.
 
- [ placed a little bit out of context ]
- I do not understand this question.
 
Oh yes you did! :-) That was my idea exactly.
 
------------------
 
--- Some entities---global variables and
--- all functions---*always* have static duration.  The confusion comes in
--- here, as the `static' keyword can be applied to them, this time
--- changing not their duration (which is already static) but rather their
--- linkage.  Static globals get internal linkage, which means that their
--- names do not appear to exist outside the one file...
 
-- IMHO, this is a weakness in the standard. Has it been bashed out before?
 
- How can it be a weakness in the standard when it was required by K&R-1?
 
Well, I don't know all that much about the way the committee operated;
I take it from this sentence that they took K&R as gospel and didn't
mess with it.
 
- It *could* be considered a weakness in C, that this keyword
- (`static') was overloaded to also mean `private' in various
- circumstances.
 
- This is not to say that avoiding keywords for the sake of avoiding
- keywords is necessarily a good thing, or that I agree with the choice
- of `static' as a modifier to imply `private'.
 
That is exactly my point. I am marking this Followup-to: comp.std.c,
which I don't read, because obviously we are in agreement on this issue.
I'm not interested in trying to change the ANSI committee's mind, because
this is the very first thing I have ever found in C that I didn't like.
I _would_ appreciate it if someone would mail me a clue as to why this
keyword was overloaded and no alias for it ("here"? :-) ) was added.