[comp.lang.c] hardcoded constants

henry@utzoo.uucp (Henry Spencer) (12/14/88)

In article <9134@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>>How much hard-coding is too much?
>
>Almost any is too much.  It is proper to use explicit constants when it
>is clear what they mean and that they can never need to be changed.  For
>example, assigning 0 or 1 to initialize a counter is proper.  Assuming
>that 03 is always the right character code for a keyboard interrupt
>character (i.e. ASCII ctrl-C) is not proper.

The policy we try to follow is that if you must hard-code a constant,
then it must be accompanied by a comment explaining why that particular
number is there.
-- 
SunOSish, adj:  requiring      |     Henry Spencer at U of Toronto Zoology
32-bit bug numbers.            | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

dhesi@bsu-cs.UUCP (Rahul Dhesi) (12/15/88)

In article <1988Dec13.172306.16195@utzoo.uucp> henry@utzoo.uucp (Henry
Spencer) writes:
>The policy we try to follow is that if you must hard-code a constant,
>then it must be accompanied by a comment explaining why that particular
>number is there.

A suggestion:  If you want to hard-code a constant, use a #define
anyway:

     check_break() {
     #define   CTRL_C    3        /* ASCII control C */
        if (keyscan() == CTRL_C)
           ...
     }

A well-chosen name will make the code understandable.  Any additional
information about the hard-coded value can be in a comment to the
#define itself.  Scanning the source for all #defines will let you
locate all hard-coded constants with some confidence.

So I find the above code fragment preferable to the following:

     check_break() {
        if (keyscan() == 3)       /* check for ASCII control C */
           ...
     }
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi

barmar@think.COM (Barry Margolin) (12/15/88)

In article <1988Dec13.172306.16195@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>The policy we try to follow is that if you must hard-code a constant,
>then it must be accompanied by a comment explaining why that particular
>number is there.

My rule of thumb is that if it needs a comment, it should be a
manifest constant, not a hardcoded literal.  The name of the constant
then serves as self-documentation.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

djones@megatest.UUCP (Dave Jones) (12/15/88)

From article <33459@think.UUCP>, by barmar@think.COM (Barry Margolin):
...
> 
> My rule of thumb is that if it needs a comment, it should be a
> manifest constant, not a hardcoded literal.  The name of the constant
> then serves as self-documentation.
> 

Your thumb is okay with me, so long as you add the proviso that
if the constant is only used in one place, its scope of definition
is restricted to that place.  I'll try to say that in English.
I don't want the #define to be exiled to some #include file unless
it really is used all over the place.  If it is local, keep it local.
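
A minimal sketch of keeping such a constant local, under stated assumptions: is_break_key is a hypothetical name, and the static_assert assumes an ASCII execution character set. An enum constant declared inside a function has block scope, whereas a #define stays visible to the end of the file unless it is #undef'd.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical helper: reports whether a key code is the break key. */
bool is_break_key(int key)
{
    /* Block-scoped named constant; invisible outside this function. */
    enum { CTRL_C = 3 };    /* ASCII control-C */

    /* Assumes ASCII: a control character is the letter with the
       upper bits cleared. */
    static_assert(CTRL_C == ('C' & 0x1F), "CTRL_C should be ASCII control-C");

    return key == CTRL_C;
}
```

The name documents the value where it is used, and nobody has to look in a header to find it.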

henry@utzoo.uucp (Henry Spencer) (12/16/88)

In article <5146@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>The policy we try to follow is that if you must hard-code a constant,
>>then it must be accompanied by a comment explaining why that particular
>>number is there.
>
>A suggestion:  If you want to hard-code a constant, use a #define
>anyway...

Trouble is, often it's almost impossible to devise a meaningful name.
I'm not talking about hard-coding things like choice of control characters,
but about things like (in a function to concatenate two strings with a
'/' in between):

	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */

Now, what's a good name for that "2", and how does naming it improve
readability?
-- 
"God willing, we will return." |     Henry Spencer at U of Toronto Zoology
-Eugene Cernan, the Moon, 1972 | uunet!attcan!utzoo!henry henry@zoo.toronto.edu

bobmon@iuvax.cs.indiana.edu (RAMontante) (12/16/88)

henry@utzoo.uucp (Henry Spencer) writes:
-
-Trouble is, often it's almost impossible to devise a meaningful name.
-I'm not talking about hard-coding things like choice of control characters,
-but about things like (in a function to concatenate two strings with a
-'/' in between):
-
-	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
-
-Now, what's a good name for that "2", and how does naming it improve
-readability?

#define STRING_TERMINATOR_LENGTH  2

is descriptive, but I think only a fanatic Pascalist could love it.  And
only somebody with a more-than-80-column display would want to use it.

more-fun-with-Henry's-signatures follows:

--
-"God willing, we will return." -Eugene Cernan, the Moon, 1972

I like it, but it's not obvious whether God understands "we" to mean
"Americans".  Neil Armstrong had to go and get ecumenical with his
"...giant leap for mankind".  And now it's those godless communists who
have a space station...

barmar@think.COM (Barry Margolin) (12/17/88)

In article <1988Dec15.190331.2986@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>Trouble is, often it's almost impossible to devise a meaningful name.
>I'm not talking about hard-coding things like choice of control characters,
>but about things like (in a function to concatenate two strings with a
>'/' in between):
>
>	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
>
>Now, what's a good name for that "2", and how does naming it improve
>readability?

I have a few ideas:

1) When I've done this in other languages, I've used something like
strlen("/") instead of the 2.  Unfortunately, in C I'd still have to
say "+1", which I'd then want to comment with /* leave room for the
trailing null */, since I don't think there's an expression that will
return the total space taken up by a string.  I like this because I
would expect a reasonable compiler to constant-fold the expression,
and it says exactly what the extra space is there for.

2) Define constants:

	#define PATH_DELIMITER "/"
	#define ROOM_FOR_PATH_DELIMITER (strlen(PATH_DELIMITER))
	#define ROOM_FOR_TRAILING_NULL 1

then use ROOM_FOR_PATH_DELIMITER+ROOM_FOR_TRAILING_NULL.
ROOM_FOR_TRAILING_NULL would probably be useful in other places if the
program does lots of concatenation like this.  Those names are pretty
meaningful.  If you're worried that strlen("/") won't be
constant-folded, put the 1 in the #define, with the expression in a
comment.

3) Word the comment differently: /* allocate room for the two strings,
a separator and trailing null */.  This way, it doesn't sound as if
you're defining the 2, just explaining what the statement as a whole
is doing.  The 2 is implicit, and doesn't really stand for anything.
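
Ideas 1 and 2 might be combined roughly as follows. This is only a sketch: join_path is a hypothetical helper, not something from the thread, and whether strlen("/") is folded at compile time depends on the compiler.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define PATH_DELIMITER          "/"
#define ROOM_FOR_TRAILING_NULL  1

/* Hypothetical helper: returns a malloc'd "a/b", or NULL on failure.
   The caller frees the result. */
char *join_path(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    /* Each term says what the space is for. */
    size_t room = strlen(a) + strlen(PATH_DELIMITER) + strlen(b)
                + ROOM_FOR_TRAILING_NULL;

    char *result = malloc(room);
    if (result == NULL)
        return NULL;

    strcpy(result, a);
    strcat(result, PATH_DELIMITER);
    strcat(result, b);
    return result;
}
```

Changing PATH_DELIMITER to a longer string keeps the size calculation correct without further edits.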


Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

dhesi@bsu-cs.UUCP (Rahul Dhesi) (12/17/88)

In article <1988Dec15.190331.2986@utzoo.uucp> henry@utzoo.uucp (Henry Spencer)
writes:
     [But how about] things like (in a function to concatenate
     two strings with a '/' in between):

	   foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */

You are right.  This is a valid exception to my suggestion, and it had
not occurred to me.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi

ok@quintus.uucp (Richard A. O'Keefe) (12/17/88)

henry@utzoo.uucp (Henry Spencer) writes:
>-	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
>-Now, what's a good name for that "2", and how does naming it improve
>-readability?

I recently had a very similar problem.  A *superb* "name" for that 2 is
		sizeof "/"

peter@ficc.uu.net (Peter da Silva) (12/17/88)

In article <33604@think.UUCP>, barmar@think.COM (Barry Margolin) writes:
> 1) When I've done this in other languages, I've used something like
> strlen("/") instead of the 2.  Unfortunately, in C I'd still have to

How about sizeof "/"? Or does that return sizeof(char *)?
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Work: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Home: bigtex!texbell!sugar!peter, peter@sugar.uu.net.

guy@auspex.UUCP (Guy Harris) (12/17/88)

>How about sizeof "/"? Or does that return sizeof(char *)?

No, it returns the number of bytes in the anonymous character array that
"/" is.  K&R First Edition says "When applied to an array, the result is
the number of bytes in the array", and also 

	A string is a sequence of characters surrounded by double quotes,
	as in "...".  A string has type "array of characters"...

so "/" is an array, and "sizeof", when applied to it, returns the number
of characters in it.  The dpANS says much the same thing.

Unfortunately or fortunately, depending on how you look at it, 'sizeof
"/"' is 2, since the array in question has *two* characters - the '/'
and the '\0' at the end.  This means that

	strlen(a) + sizeof "/" + strlen(b)

happens to be the minimum number of characters that must be in the array
"buf" to make

	strcpy(buf, a);
	strcat(buf, "/");
	strcat(buf, b);

work, since it counts both the "/" added by the first "strcat" and the
null left at the end; however

	strlen(a) + sizeof "/" + strlen(b) + sizeof "/" + strlen(c)

is one more than the minimum number of characters that must be in the
array "buf" to make

	strcpy(buf, a);
	strcat(buf, "/");
	strcat(buf, b);
	strcat(buf, "/");
	strcat(buf, c);

work.  In this particular case it may not be worth worrying about since
it's only one character....
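
The counting above can be checked with a short sketch. The function names are hypothetical and exist only for illustration.

```c
#include <assert.h>
#include <string.h>

/* A string literal is an array that includes its terminating '\0'. */
static_assert(sizeof "/" == 2, "one '/' plus one terminating null");

/* Exact buffer size for a "/" b: the single sizeof "/" pays for the
   separator and for the final null. */
size_t size_for_two(const char *a, const char *b)
{
    return strlen(a) + sizeof "/" + strlen(b);
}

/* For a "/" b "/" c, using sizeof "/" twice pays for two nulls,
   so the result is one more than needed. */
size_t size_for_three_naive(const char *a, const char *b, const char *c)
{
    return strlen(a) + sizeof "/" + strlen(b) + sizeof "/" + strlen(c);
}

/* The minimum: two separators and exactly one terminating null. */
size_t size_for_three_exact(const char *a, const char *b, const char *c)
{
    return strlen(a) + strlen("/") + strlen(b) + strlen("/") + strlen(c) + 1;
}
```

For "a", "b" and "c" the naive count is 7 and the exact count is 6, matching the one-character surplus described above.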

dhesi@bsu-cs.UUCP (Rahul Dhesi) (12/17/88)

In article <2478@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>How about sizeof "/"? Or does that return sizeof(char *)?

I considered this and rejected it because it is misleading.  We need 2
because one is for a '/' in the middle and one is for a trailing null,
not because we need two for just '/'.  That "/" is a string and needs
two is not really the point, even though it gives the right answer.

The format I prefer most is actually:

     strlen(a) + 1 + strlen(b) + 1

because it arranges the components of the expression in the right
order.  (It also avoids the magic number 2.  However, 1 is a magic
number here even though it isn't usually considered to be one.)  It
would be nice to be able to say

     strlen(a) + sizeof('/') + strlen(b) + 1

but unfortunately sizeof('/') is the same as sizeof(int).

One could also do

     #define   ONE_CHAR   1
     #define   TWO_CHARS  2

and then say things like:

     strlen(a) + ONE_CHAR + strlen(b) + ONE_CHAR
     strlen(a) + strlen(b) + TWO_CHARS

This sounds like the type of error undergraduates make when they are
first asked to use symbolic constants, but upon thinking I realize that
it could actually clarify the code quite a bit.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee}!bsu-cs!dhesi

scs@adam.pika.mit.edu (Steve Summit) (12/18/88)

In article <1988Dec15.190331.2986@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>...things like (in a function to concatenate two strings with a
>'/' in between):
>
>	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */

You're all going to think I'm crazy, but I always write this as

	... malloc(strlen(a)+1+strlen(b)+1) ...

to make it that much more obvious how the length is being
calculated.  (A comment still helps.)  This is merely a strict
application of Kernighan and Plauger's adage, "Let the computer
do the dirty work."  Why make the person reading the code
"decompile" the 2 into its constituents?  (This example is
admittedly trivial, but in more complicated cases it can make a
big difference.)

C compilers can (and do) employ associativity and commutativity
to "fold" the two separate 1's into a single 2, at compile time,
so code like this is in no way less efficient.  (I'm not sure how
much of the compiler's freedom to rearrange expressions is being
abrogated under ANSI C.)
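
As a sketch of this style carried through a whole function (concat_with_slash is a hypothetical name, and the copying code is an illustration rather than anything posted in the thread), the size expression and the copy steps can be kept in the same order:

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: returns a malloc'd "a/b", or NULL on failure.
   The caller frees the result. */
char *concat_with_slash(const char *a, const char *b)
{
    assert(a != NULL && b != NULL);

    size_t alen = strlen(a);
    size_t blen = strlen(b);

    /* In order: a, then '/', then b, then '\0'. */
    char *foo = malloc(alen + 1 + blen + 1);
    if (foo == NULL)
        return NULL;

    char *p = foo;
    memcpy(p, a, alen);  p += alen;   /* a    */
    *p++ = '/';                       /* '/'  */
    memcpy(p, b, blen);  p += blen;   /* b    */
    *p = '\0';                        /* '\0' */
    return foo;
}
```

Each 1 in the malloc argument lines up with one single-character store below it, so a reader can check the arithmetic term by term.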

                                            Steve Summit
                                            scs@adam.pika.mit.edu

peter@ficc.uu.net (Peter da Silva) (12/18/88)

In article <735@auspex.UUCP>, guy@auspex.UUCP (Guy Harris) writes:
> >How about sizeof "/"? Or does that return sizeof(char *)?
> 
> No, it returns the number of bytes in the anonymous character array that
> "/" is.... [Which is 1+#chars, since the null is included]

#define STRLEN(s) (sizeof s - 1)
#define NULLEN 1

> 	strlen(a) + sizeof "/" + strlen(b) + sizeof "/" + strlen(c)

	strlen(a) + STRLEN("/") + strlen(b) + STRLEN("/") +strlen(c) + NULLEN
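
A sketch of these two macros in use. The function name size_for_three is hypothetical, and the macro argument is parenthesized here as a defensive variation on the definition above. STRLEN only makes sense for string literals and char arrays; applied to a pointer it would measure the pointer.

```c
#include <assert.h>
#include <string.h>

/* Only meaningful for string literals and char arrays, never pointers. */
#define STRLEN(s)  (sizeof(s) - 1)
#define NULLEN     1

static_assert(STRLEN("/") == 1, "the literal's own null is not counted");

/* Buffer size for a "/" b "/" c with exactly one terminating null. */
size_t size_for_three(const char *a, const char *b, const char *c)
{
    return strlen(a) + STRLEN("/") + strlen(b) + STRLEN("/") + strlen(c)
         + NULLEN;
}
```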
-- 
Peter da Silva, Xenix Support, Ferranti International Controls Corporation.
Work: uunet.uu.net!ficc!peter, peter@ficc.uu.net, +1 713 274 5180.
Home: bigtex!texbell!sugar!peter, peter@sugar.uu.net.

scs@adam.pika.mit.edu (Steve Summit) (12/18/88)

In article <883@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>henry@utzoo.uucp (Henry Spencer) writes:
>>-	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
>>-Now, what's a good name for that "2", and how does naming it improve
>>-readability?
>
>I recently had a very similar problem.  A *superb* "name" for that 2 is
>		sizeof "/"

Was this suggestion supposed to include a :-) ?  sizeof("/") is a
very poor substitute, in this case: it gets the right answer for
the wrong reason.  (The '\0' the compiler counts in the string
constant "/" has little to do with the one which will be added
to the final, concatenated string.)

I once scratched my head over

	kill(HUP, 1);

in a program which had modified /etc/ttys and wanted to tell
/etc/init (process 1) to re-read it.  (This is more of a
unix-wizards than an info-c topic.)  Funny thing: man 2 kill
says it's kill(pid, signal), and yet the code worked just fine.
(Spoiler: HUP just happens to be 1).

                                            Steve Summit
                                            scs@adam.pika.mit.edu

bill@twwells.uucp (T. William Wells) (12/18/88)

In article <1988Dec13.172306.16195@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
: In article <9134@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
: >>How much hard-coding is too much?
: >
: >Almost any is too much.  It is proper to use explicit constants when it
: >is clear what they mean and that they can never need to be changed.  For
: >example, assigning 0 or 1 to initialize a counter is proper.  Assuming
: >that 03 is always the right character code for a keyboard interrupt
: >character (i.e. ASCII ctrl-C) is not proper.
:
: The policy we try to follow is that if you must hard-code a constant,
: then it must be accompanied by a comment explaining why that particular
: number is there.

My rule is this: any constant which is used, explicitly or implicitly,
more than once gets a define. Any tunable parameter gets a define.
Anything else stays a constant.

The first covers using a constant explicitly, as in defining a[256],
and implicitly as in a[255], when the 255 is there because 255 ==
sizeof(a) - 1. These should be something like a[SIZEA] and a[SIZEA-1],
respectively.  However, it does not cover most uses of zero; zero
usually refers to either the first element of an array, a thing to be
converted to a null pointer, or a null character; all of which are
defined as part of the language and so don't require any definition
from me!

Some examples:

	cnt = 0;
	while (n) {
		cnt += n & 1;
		n >>= 1;
	}

does not get any defines; the 1's are part of the problem definition.
On the other hand,

#define DREG_RDY 0x01

	volatile char *dreg;

	while (*dreg & DREG_RDY)
		;

gets a define, the bit is defined in more than one place: the
hardware and the program.
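
The first rule might look like this in practice. It is a sketch only: the array a and the names set_last, get_last and count_bits are assumptions made for illustration.

```c
#include <assert.h>

#define SIZEA 256   /* used explicitly and implicitly below, so it gets a name */

static unsigned char a[SIZEA];

static_assert(sizeof a / sizeof a[0] == SIZEA, "a has SIZEA elements");

/* The last index is written SIZEA - 1, not 255, so it follows SIZEA. */
void set_last(unsigned char value)
{
    a[SIZEA - 1] = value;
}

unsigned char get_last(void)
{
    return a[SIZEA - 1];
}

/* Counts set bits; the 0 and the 1s are part of the problem definition
   and stay as literals. */
unsigned count_bits(unsigned n)
{
    unsigned cnt = 0;

    while (n) {
        cnt += n & 1;
        n >>= 1;
    }
    return cnt;
}
```

Changing SIZEA in one place updates both the declaration and the last-element index.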

---
Bill
{uunet|novavax}!proxftl!twwells!bill

mat@mole-end.UUCP (Mark A Terribile) (12/19/88)

>      [But how about] things like (in a function to concatenate
>      two strings with a '/' in between):
> 
> 	   foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
> 
> You are right.  This is a valid exception to my suggestion, and it had
> not occurred to me.

Well, there is a way, although it's a little wordy and I don't know if I'd
do it myself.

	foo = malloc( strlen( a ) + strlen( b ) + 2 * sizeof( char ) );

You may still have to note that the chars are '/' and null.
-- 

(This man's opinions are his own.)
From mole-end				Mark Terribile

ok@quintus.uucp (Richard A. O'Keefe) (12/19/88)

In article <8512@bloom-beacon.MIT.EDU> scs@adam.pika.mit.edu (Steve Summit) writes:
>In article <883@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>>henry@utzoo.uucp (Henry Spencer) writes:
>>>-	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
>>>-Now, what's a good name for that "2", and how does naming it improve
>>>-readability?
>>
>>I recently had a very similar problem.  A *superb* "name" for that 2 is
>>		sizeof "/"
>
>Was this suggestion supposed to include a :-) ?  sizeof("/") is a
>very poor substitute, in this case: it gets the right answer for
>the wrong reason.  (The '\0' the compiler counts in the string
>constant "/" has little to do with the one which will be added
>to the final, concatenated string.)

Nope, it is the right answer for the right reason.  The characters in
the byte array are exactly the characters which will be added.  If you
were going to do sprintf(dest, "%s%s%s%s%s", a, "/", b, "/", c),
the size of dest would be strlen(a)+strlen(b)+strlen(c) + sizeof "//".
What I really prefer to do in a case like this is
	#define arglen(x) (strlen(x)-2)	/* strlen(x) - strlen("%s") */

	    static char fmt[] = "%s/%s/%s";
	    ...
	    foo = malloc(sizeof fmt + arglen(a) + arglen(b) + arglen(c));
	    ...
	    sprintf(foo, fmt, a, b, c);

Now if someone changes that to
	    static char fmt[] = "[%s.%s]%s;";
it still works.
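
A self-contained sketch of the same idea. The names join3 and FMT_SPEC_LEN are hypothetical, and snprintf is used here in place of sprintf as an extra guard. The total is computed with a single subtraction rather than per-argument arglen() calls, so that arguments shorter than two characters never produce an intermediate negative count in unsigned arithmetic.

```c
#include <assert.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define FMT_SPEC_LEN 2          /* strlen("%s") */
#define FMT_ARG_COUNT 3         /* number of %s conversions in fmt */

static const char fmt[] = "%s/%s/%s";

/* Hypothetical helper: formats a, b, c through fmt into malloc'd storage.
   Returns NULL on allocation failure; the caller frees the result. */
char *join3(const char *a, const char *b, const char *c)
{
    assert(a != NULL && b != NULL && c != NULL);

    /* sizeof fmt counts the literal text, the three "%s" and the null;
       each "%s" is replaced by the corresponding argument. */
    size_t room = sizeof fmt - FMT_ARG_COUNT * FMT_SPEC_LEN
                + strlen(a) + strlen(b) + strlen(c);

    char *foo = malloc(room);
    if (foo == NULL)
        return NULL;

    snprintf(foo, room, fmt, a, b, c);
    return foo;
}
```

If fmt changes to another layout with three %s conversions, the size calculation needs no edits.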

remmers@m-net.UUCP (John H. Remmers) (12/19/88)

In article <1988Dec15.190331.2986@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) 
writes:
>
>Trouble is, often it's almost impossible to devise a meaningful name.
>I'm not talking about hard-coding things like choice of control characters,
>but about things like (in a function to concatenate two strings with a
>'/' in between):
>
>	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
>
>Now, what's a good name for that "2", and how does naming it improve
>readability?
>
The "2" can be viewed as the incidental result of a bookkeeping operation:
a "+1" for the null terminator in a string concatenation, and a "+1" for
the extra '/' character.  So naturally it's hard to name in an
illuminating way.  

I always tell my students that bookkeeping is best left to the computer; 
to improve readability, name the *concepts* and let the compiler sort out 
the bookkeeping details.  Hence the "2" probably shouldn't be in the 
source code at all, named or otherwise.

The space requirement for a string concatenation is a frequently-needed
value; it's worth having a macro to represent it:

	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)

In terms of this, the malloc() call can be written:

	foo = malloc(Catspace(s,t) + sizeof('/'));

thus making it explicit that you're concatenating two strings and
allocating one more byte for an extra character.  Readability is
improved, and the question of naming the "2" never comes up.

djones@megatest.UUCP (Dave Jones) (12/21/88)

From article <2636@m2-net.UUCP>, by remmers@m-net.UUCP (John H. Remmers):
...
> The space requirement for a string concatenation is a frequently-needed
> value; it's worth having a macro to represent it:
> 
> 	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)
> 

You are a candidate for SLMA (Silly Little Macros Anonymous).

There are no dues.  Just get a sponsor and come to the meetings.
Take it One Day At A Time, and soon you will no longer be
sending innocent programmers poking through .h files in order
to figure out what "Catspace" means.

:-), in case you didn't guess.

But if you don't like the naked "1", how about this?

foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));

Tells it all.  And keeps the reader on track, not grepping around
for macros.

chip@ateng.ateng.com (Chip Salzenberg) (12/22/88)

According to henry@utzoo.uucp (Henry Spencer):
>Trouble is, often it's almost impossible to devise a meaningful name.
>I'm not talking about hard-coding things like choice of control characters,
>but about things like (in a function to concatenate two strings with a
>'/' in between):
>
>	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */

Not hard:

	foo = malloc(strlen(a)+sizeof("/")+strlen(b));

-- 
Chip Salzenberg             <chip@ateng.com> or <uunet!ateng!chip>
A T Engineering             Me?  Speak for my company?  Surely you jest!
	  "It's no good.  They're tapping the lines."

bright@Data-IO.COM (Walter Bright) (12/22/88)

In article <1104@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
<foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));
<Tells it all.  And keeps the reader on track, not grepping around
<for macros.

But doesn't sizeof('\0') return sizeof(int), instead of sizeof(char)?
Remember, the integral promotions are being done.

jeff@amsdsg.UUCP (Jeff Barr) (12/22/88)

In article <883@quintus.UUCP>, ok@quintus.uucp (Richard A. O'Keefe) writes:
> henry@utzoo.uucp (Henry Spencer) writes:
> >-	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
> >-Now, what's a good name for that "2", and how does naming it improve
> >-readability?
> 
> I recently had a very similar problem.  A *superb* "name" for that 2 is
> 		sizeof "/"

(*superb*)?  Personally, I don't see any connection.  I'd rather use 

			(sizeof ("")) 
			
to indicate the overhead (if you will) to store a string.

							Jeff
-- 
	 /-------------------------------------------------------\
	/  Jeff Barr   AMS-DSG   uunet!amsdsg!jeff   800-832-8668 \
	\  American Express: "Don't leave $HOME without it".	  /
	 \-------------------------------------------------------/

tps@chem.ucsd.edu (Tom Stockfisch) (12/22/88)

In article <1988Dec15.190331.2986@utzoo.uucp> henry@utzoo.uucp (Henry Spencer) writes:
>	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
>Now, what's a good name for that "2", and how does naming it improve
>readability?

In this case I usually write

	foo =	malloc( strlen(a) + strlen(b) + sizeof("/") );

What do you think of that?

-- 

|| Tom Stockfisch, UCSD Chemistry	tps@chem.ucsd.edu

nevin1@ihlpb.ATT.COM (Liber) (12/22/88)

In article <1104@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <2636@m2-net.UUCP>, by remmers@m-net.UUCP (John H. Remmers):
 
>> 	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)

>You are a candidate for SLMA (Silly Little Macros Anonymous). [:-)]

>But if you don't like the naked "1", how about this?

>foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));

>Tells it all.  And keeps the reader on track, not grepping around
>for macros.

It also happens to be WRONG!  (Actually, if you are only using it for
malloc(), it will work; you just end up allocating more space than is
actually needed.)  Quoting from dpANS C 10/88 draft, section 3.1.3.4
(character constants), subsection on semantics (p 30):

	"An integer character constant has type int."

This means, among other things, that sizeof('\0') == sizeof(int), and
not sizeof(char).
-- 
NEVIN ":-)" LIBER  AT&T Bell Laboratories  nevin1@ihlpb.ATT.COM  (312) 979-4751

bill@twwells.uucp (T. William Wells) (12/22/88)

In article <1104@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
: But if you don't like the naked "1", how about this?
:
: foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));
:
: Tells it all.  And keeps the reader on track, not grepping around
: for macros.

How about sizeof('\0') is equal to the size of an integer, which is
unlikely to be 1?

Remember, in C, character constants are integer constants.

---
Bill
{uunet|novavax}!proxftl!twwells!bill

stuart@bms-at.UUCP (Stuart Gathman) (12/22/88)

In article <2636@m2-net.UUCP>, remmers@m-net.UUCP (John H. Remmers) writes:

> The space requirement for a string concatenation is a frequently-needed
> value; it's worth having a macro to represent it:

> 	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)

> In terms of this, the malloc() call can be written:

> 	foo = malloc(Catspace(s,t) + sizeof('/'));

> thus making it explicit that you're concatenating two strings and
> allocating one more byte for an extra character.  Readability is
> improved, and the question of naming the "2" never comes up.

Of course, this is incorrect since sizeof('/') is 2 or 4 depending on
your machine.  But the concept is good . . .
-- 
Stuart D. Gathman	<stuart@bms-at.uucp>
			<..!{vrdxhq|daitc}!bms-at!stuart>

dc@gcm (Dave Caswell) (12/22/88)

John H. Remmers writes
>  The space requirement for a string concatenation is a frequently-needed
>  value; it's worth having a macro to represent it:
>  
>  	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)


and Dave Jones rambles
.
.foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));
                                        |
					|--this is four on 32-bit machines

.Tells it all.  And keeps the reader on track, not grepping around
.for macros.

An advantage of functions or macros is that you write it once and debug it;
you cut down the chance of errors.  I wouldn't mind searching for code that
worked.

-- 
Dave Caswell (former EMU student)
Greenwich Capital Markets                             uunet!philabs!gcm!dc

arrom@aplcen.apl.jhu.edu (Ken Arromdee ) (12/23/88)

>How about sizeof('\0') is equal to the size of an integer, which is
>unlikely to be 1?
>Remember, in C, character constants are integer constants.

I was under the impression that "sizeof" is done at compile time and isn't
really a function, so this would correctly return 1.
--
"Thinking small is seeing your bus on the other side of the street, and
           wishing you could teleport across to catch it."

--Kenneth Arromdee (ins_akaa@jhunix.UUCP, arromdee@crabcake.cs.jhu.edu,
	g49i0188@jhuvm.BITNET) (not arrom@aplcen, which is my class account)

vch@attibr.UUCP (Vincent C. Hatem) (12/23/88)

In article <1988Dec21.133910.23182@ateng.ateng.com>, chip@ateng.ateng.com (Chip Salzenberg) writes:
] According to henry@utzoo.uucp (Henry Spencer):
] >Trouble is, often it's almost impossible to devise a meaningful name.
] >I'm not talking about hard-coding things like choice of control characters,
] >but about things like (in a function to concatenate two strings with a
] >'/' in between):
] >
] >	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
] 
] Not hard:
] 
] 	foo = malloc(strlen(a)+sizeof("/")+strlen(b));
] 
] -- 
] Chip Salzenberg             <chip@ateng.com> or <uunet!ateng!chip>
] A T Engineering             Me?  Speak for my company?  Surely you jest!
] 	  "It's no good.  They're tapping the lines."

Right, chip. 

Last I heard sizeof("/") == sizeof(char *) - which is almost never 2, and
NEVER portable.

How about the more accurate:
	foo = malloc(strlen(a)+strlen(b)+(2*sizeof(char)));



-- 
Vincent C. Hatem                            | att ---->\ (available from any
AT&T International                          | ulysses ->\ Action Central site)
International Operations Technical Support  | bellcore ->\___ !attibr!vch
1200 Mt Kemble Ave, Basking Ridge, NJ 07920 | (201) 953-8030

chris@mimsy.UUCP (Chris Torek) (12/23/88)

>>sizeof('\0') is equal to the size of an integer ....
>>Remember, in C, character constants are integer constants.

In article <410@aplcen.apl.jhu.edu> arrom@aplcen.apl.jhu.edu
(Ken Arromdee) writes:
>I was under the impression that "sizeof" is done at compile time and isn't
>really a function, so this would correctly return 1.

sizeof is indeed not a function, yet sizeof('\0') is the same as
sizeof(0) and sizeof(int), which are usually not the same as sizeof(char).
The second `>>' line above is the key: the type of '\0' is int, not char.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

djones@megatest.UUCP (Dave Jones) (12/23/88)

From article <1797@dataio.Data-IO.COM>, by bright@Data-IO.COM (Walter Bright):
> In article <1104@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
> <foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));
> <Tells it all.  And keeps the reader on track, not grepping around
> <for macros.
> 
> But doesn't sizeof('\0') return sizeof(int), instead of sizeof(char)?
> Remember, the integral promotions are being done.


Gack! You are correct. Due to certain intermittent reduced neurological
function (sometimes I can be real dumb), it is hard for me to remember
that in C, "character" constants are of type int, not char. Seems goofy
to me. Can anyone suggest a rationale, so that I can remember this arcane
fact?

Okay. You get bucked off, you climb right back in the saddle, right?


How about,


  foo =  malloc( strlen(s) + strlen(t) + sizeof((char)'\0'));
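
A sketch checking this version, assuming a C11 compiler for static_assert; concat is a hypothetical helper name. The cast makes the operand's type char, and sizeof(char) is 1 by definition, while the uncast constant still has type int.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

static_assert(sizeof('\0') == sizeof(int),
              "in C a character constant has type int");
static_assert(sizeof((char)'\0') == 1,
              "after the cast the operand has type char");

/* Hypothetical helper: returns a malloc'd copy of s followed by t,
   or NULL on failure.  The caller frees the result. */
char *concat(const char *s, const char *t)
{
    assert(s != NULL && t != NULL);

    char *foo = malloc(strlen(s) + strlen(t) + sizeof((char)'\0'));
    if (foo == NULL)
        return NULL;

    strcpy(foo, s);
    strcat(foo, t);
    return foo;
}
```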

djones@megatest.UUCP (Dave Jones) (12/23/88)

From article <77@attibr.UUCP>, by vch@attibr.UUCP (Vincent C. Hatem):

> 
> Last I heard sizeof("/") == sizeof(char *) - which is almost never 2, and
> NEVER portable.
> 

First I heard sizeof("/") == 2 which is sometimes sizeof(char*), and
ALWAYS portable. 

On my machine, a 68020 based Sun3/60, sizeof(char*) seems to be either
four or seven, depending.

Don't feel bad.  I blew sizeof('\0') bigtime.


% cat foo.c
main()
{
  printf("%d %d\n", sizeof(char*), sizeof ("foobar"));
  exit(0);
}
% cc foo.c
% a.out
4 7 
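
The same check in present-day C might be sketched as below, assuming a C99 or later library for the %zu conversion and C11 for static_assert. The function names are hypothetical. sizeof applied to the literal measures the array; sizeof applied to a pointer initialized from the literal measures the pointer.

```c
#include <assert.h>
#include <stdio.h>

static_assert(sizeof "foobar" == 7, "six characters plus the terminating null");
static_assert(sizeof "/" == 2, "one character plus the terminating null");

/* Size observed once the literal has been converted to a pointer. */
size_t size_through_pointer(void)
{
    const char *p = "foobar";   /* array-to-pointer conversion happens here */

    return sizeof p;            /* size of the pointer object itself */
}

/* Prints both sizes; %zu is the conversion for size_t values. */
void print_sizes(void)
{
    printf("%zu %zu\n", sizeof(char *), sizeof "foobar");
}
```

The first number printed depends on the machine; the second is 7 everywhere.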

guy@auspex.UUCP (Guy Harris) (12/23/88)

>Last I heard sizeof("/") == sizeof(char *)

I don't know who you heard that from, but I suggest you take anything
else they have to say about C with a grain of salt; it is not true.  "/"
is an array of "char" with two elements, not a pointer to "char".  In
some contexts, that "array of 'char'" expression gets converted to a
"pointer to 'char'" that points to the array's first member; however,
"argument of the 'sizeof' operator" is not one of those contexts.

bill@twwells.uucp (T. William Wells) (12/23/88)

In article <410@aplcen.apl.jhu.edu> arrom@aplcen.UUCP (Ken Arromdee (600.429)) writes:
: >How about sizeof('\0') is equal to the size of an integer, which is
: >unlikely to be 1?
: >Remember, in C, character constants are integer constants.
:
: I was under the impression that "sizeof" is done at compile time and isn't
: really a function, so this would correctly return 1.

Well, you are right that sizeof is done at compile time, and it isn't
a function. But it won't return 1.

Sizeof is an operator. The thing it is operating on is a character
constant. In C, a character constant *is* an integer. Thus sizeof a
character constant must be the same as the sizeof an integer. (Now,
for the gurus: am I right in saying integer, or should I say `of some
integral type'?)

---
Bill
{uunet|novavax}!proxftl!twwells!bill

remmers@m-net.UUCP (John H. Remmers) (12/24/88)

In article <1104@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <2636@m2-net.UUCP>, by remmers@m-net.UUCP (John H. Remmers):
>...
>> The space requirement for a string concatenation is a frequently-needed
>> value; it's worth having a macro to represent it:
>> 
>> 	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)
>> 
>
>You are a candidate for SLMA (Silly Little Macros Anonymous).
>
The merits are debatable, I suppose, but if an application requires a 
lot of dynamic allocation of space for strings, hiding the 1 for the
null byte in a macro definition is a way to defend against forgetting
it (a common error, in my experience).  The wisdom of this approach
ultimately is determined by frequency of use.  Agreed, inventing
macros for isolated situations is not good practice.  What I was trying
to say, perhaps not too clearly, is that a macro for string concatenation
space might find sufficient use to be worth defining, and that
*if* that is the case, this is a natural place to use it.
>
>But if you don't like the naked "1", how about this?
>
>foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));
>
As a couple of people pointed out to me in mail, '\0' and '/' are
ints, so sizeof('\0') = sizeof(int), usually 2 or 4, and more space
is allocated than needed.

It was not the naked "1" I was questioning so much as the naked "2".
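
For illustration only, here is one way the Catspace macro might be used
so that the 1 for the null byte is written in exactly one place.  The
function name cat_alloc is hypothetical and not from any posting in
this thread.

```c
#include <stdlib.h>
#include <string.h>

/* Bytes needed to hold s followed by t, including the terminating '\0'. */
#define Catspace(s, t)  (strlen(s) + strlen(t) + 1)

/* Hypothetical helper: return a malloc'd concatenation of s and t,
 * or NULL if the allocation fails.  The caller frees the result. */
char *cat_alloc(const char *s, const char *t)
{
    char *p = malloc(Catspace(s, t));

    if (p == NULL)
        return NULL;
    strcpy(p, s);
    strcat(p, t);
    return p;
}
```

Note that the macro evaluates strlen() on each argument, so arguments
with side effects should be avoided.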

remmers@m-net.UUCP (John H. Remmers) (12/24/88)

In article <1104@goofy.megatest.UUCP> djones@megatest.UUCP (Dave Jones) writes:
>From article <2636@m2-net.UUCP>, by remmers@m-net.UUCP (John H. Remmers):
>...
>> The space requirement for a string concatenation is a frequently-needed
>> value; it's worth having a macro to represent it:
>> 
>> 	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)
>> 
>
>You are a candidate for SLMA (Silly Little Macros Anonymous).

Well, maybe, maybe not.  Hiding the 1 for the null byte in a macro
definition is a defense against forgetting it.  Promiscuous ad hoc
invention of macro names is of course Bad Programming Practice;
what I was trying to say, perhaps not too clearly, is that *if* dynamic
allocation of space for strings is a frequently-needed operation in
an application, *then* a macro (or maybe a set of macros) might be
worthwhile, and that the example at hand is a natural place to use it.

>But if you don't like the naked "1", how about this?
>
>foo =  malloc( strlen(s) + strlen(t) + sizeof('\0'));

A couple of people have pointed out to me in mail that 
sizeof(<character-constant>) is wrong, since a character-constant is 
an int.  Hence sizeof('\0') = sizeof(int), usually 2 or 4, and more 
space is allocated than needed.  Actually, I don't mind the naked
"1"; it was the naked "2" I was questioning.


-- 
John H. Remmers               | ...umix!m-net!remmers
Dept. of Computer Science     |---------------------------------------------
Eastern Michigan University   | My opinions and those of my employer are the
Ypsilanti, MI 48197           | same, but my employer doesn't know that yet.

cks@ziebmef.uucp (Chris Siebenmann) (12/26/88)

In article <2636@m2-net.UUCP> remmers@m-net.UUCP (John H. Remmers) writes:
...
>The space requirement for a string concatenation is a frequently-needed
>value; it's worth having a macro to represent it:
>	#define  Catspace(s,t)  (strlen(s) + strlen(t) + 1)
>In terms of this, the malloc() call can be written:
>	foo = malloc(Catspace(s,t) + sizeof('/'));

 For true safety, I'd write this as
 
 	foo = malloc(Catspace(s,t)*sizeof(char) + sizeof('/'));

(or redefine Catspace() to do this itself, depending on whether it
returns a character or byte count), just in case someone produces a
compiler where characters aren't single bytes. 

[Corrections gratefully accepted; I don't think ANSI mandated
sizeof(char) being one, although a lot of code will probably break if
it isn't.]
[Route uucp mail manually until the January uucp maps come out; a
machine on the favorite route to the Ziebmef just went away.]
-- 
	"...in all the history of Earth, there's never been a heaven, never
	    been a house of gods that was not built on human bones."
Chris Siebenmann		uunet!{utgpu!moore,attcan!telly}!ziebmef!cks
cks@ziebmef.UUCP	     or	.....!utgpu!{,ontmoh!,ncrcan!brambo!}cks

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/27/88)

In article <1797@dataio.Data-IO.COM> bright@dataio.Data-IO.COM (Walter Bright) writes:
>But doesn't sizeof('\0') return sizeof(int), instead of sizeof(char)?

Yes ...

>Remember, the integral promotions are being done.

... but integral promotion has nothing to do with it.  '\0' IS an integer
constant of value 0 and type int, not char.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/27/88)

In article <15145@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>sizeof is indeed not a function, yet sizeof('\0') is the same as
>sizeof(0) and sizeof(int), which are usually not the same as sizeof(char).

We should probably explain also that it's the same as sizeof 0 and
sizeof '\0', but not sizeof int (which is illegal).  Since any C
text should explain this, look it up if you don't understand.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/27/88)

In article <789@auspex.UUCP> guy@auspex.UUCP (Guy Harris) writes:
[quoting somebody:]
>>Last I heard sizeof("/") == sizeof(char *)
>I don't know who you heard that from, but I suggest you take anything
>else they have to say about C with a grain of salt; it is not true.

Actually, on at least one release of Gould's UTX-32, apparently the C
compiler maintainer decided to "fix" this and their compiler indeed
interpreted all sizeof"..." constructs as sizeof(char*).  Of course we
screamed about this and it was fixed in a later release.
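
A small sketch of the behaviour a conforming compiler should show: a
string literal is an array of char, so sizeof applied to it counts the
characters plus the terminating null, while sizeof applied to a pointer
variable gives the pointer size.  The function names are hypothetical.

```c
#include <stddef.h>

/* "/" has type char[2]: the '/' itself plus the terminating '\0'. */
size_t literal_size(void)
{
    return sizeof "/";
}

/* Here the operand is a pointer object, so the result is sizeof(char *). */
size_t pointer_size(void)
{
    const char *p = "/";
    return sizeof p;
}
```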

ado@elsie.UUCP (Arthur David Olson) (12/29/88)

> > . . .the malloc() call can be written:
> >	foo = malloc(Catspace(s,t) + sizeof('/'));
> 
>  For true safety, I'd write this as
>  	foo = malloc(Catspace(s,t)*sizeof(char) + sizeof('/'));

What we write in this neck of the woods is
	foo = ecpyalloc(s);
	foo = ecatalloc(foo, t);
where "ecpyalloc" sets foo pointing to an allocated copy of s
(a la "AllocCpy" in the 2.11.14 news software)
and "ecatalloc" sets foo pointing to a reallocation of foo with t catenated.
This avoids the need for a hardcoded constant entirely;
since you don't use one, you can't get its value wrong.
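
The actual ecpyalloc and ecatalloc sources are not shown in this
thread, so the following is only a guess at the general shape of such
helpers.  The names sketch_cpyalloc and sketch_catalloc, and the choice
to return NULL on failure, are assumptions made for illustration.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical stand-in for a copy-allocating helper:
 * returns a malloc'd copy of s, or NULL on failure. */
char *sketch_cpyalloc(const char *s)
{
    size_t n = strlen(s) + 1;   /* the only "+ 1" for this operation */
    char *p = malloc(n);

    if (p != NULL)
        memcpy(p, s, n);
    return p;
}

/* Hypothetical stand-in for a catenating helper: grows old (which must
 * have come from malloc) and appends t.  On failure returns NULL and
 * leaves old untouched and still owned by the caller. */
char *sketch_catalloc(char *old, const char *t)
{
    size_t oldlen = strlen(old);
    size_t tlen = strlen(t);
    char *p = realloc(old, oldlen + tlen + 1);

    if (p != NULL)
        memcpy(p + oldlen, t, tlen + 1);   /* copies t's '\0' as well */
    return p;
}
```

In this sketch the constant does not vanish; it moves inside the
helpers, where it is written once instead of at every call site.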
-- 
	Arthur David Olson    ado@ncifcrf.gov    ADO is a trademark of Ampex.

bill@twwells.uucp (T. William Wells) (12/29/88)

In article <1988Dec26.021757.15813@ziebmef.uucp> cks@ziebmef.UUCP (Chris Siebenmann) writes:
:  For true safety, I'd write this as
:
:       foo = malloc(Catspace(s,t)*sizeof(char) + sizeof('/'));
:
: (or redefine Catspace() to do this itself, depending on whether it
: returns a character or byte count), just in case someone produces a
: compiler where characters aren't single bytes.

Characters are required to be single bytes.  Note that this doesn't
prevent the implementer from using 16 bit bytes....

: [Corrections gratefully accepted; I don't think ANSI mandated
: sizeof(char) being one, although a lot of code will probably break if
: it isn't.]

Be grateful.

From the May 13 draft, section 3.3.3.4:

"When applied to an operand that has type char, unsigned char, or
signed char, (or a qualified version thereof) the result is 1."
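
As a rough check of the rule quoted above, assuming a hosted compiler
that provides <limits.h>: sizeof(char) is 1 by definition, and the
number of bits in that one byte is reported separately by CHAR_BIT.
The function names are hypothetical.

```c
#include <limits.h>
#include <stddef.h>

/* Always 1, whatever the width of a byte on this implementation. */
size_t char_size(void)
{
    return sizeof(char);
}

/* Bits per byte; at least 8, but an implementation may use more. */
int bits_per_byte(void)
{
    return CHAR_BIT;
}
```

So multiplying a character count by sizeof(char) is harmless but never
changes the result.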

---
Bill
{uunet|novavax}!proxftl!twwells!bill

sar@datcon.UUCP (Simon A Reap) (01/03/89)

In article <1988Dec21.133910.23182@ateng.ateng.com> chip@ateng.ateng.com
						    (Chip Salzenberg) writes:
>According to henry@utzoo.uucp (Henry Spencer):
>>Trouble is, often it's almost impossible to devise a meaningful name.
>>I'm not talking about hard-coding things like choice of control characters,
>>but about things like (in a function to concatenate two strings with a
>>'/' in between):
>>	foo = malloc(strlen(a)+strlen(b)+2);	/* 2 for '/' '\0' */
>Not hard:
>	foo = malloc(strlen(a)+sizeof("/")+strlen(b));

Ah, but if we want to concatenate more strings, don't we need something like...

#define TO_CAT_3_STRINGS (-1)
#define TO_CAT_4_STRINGS (-2)
	foo = malloc(strlen(a)+sizeof("/")+strlen(b)+sizeof("/")+
			strlen(c)+TO_CAT_3_STRINGS);
	foo = malloc(strlen(a)+sizeof("/")+strlen(b)+sizeof("/")+
			strlen(c)+sizeof("/")+strlen(d)+TO_CAT_4_STRINGS);

with, of course, negative constants to *really* confuse the beginner :-)
(Yes, I know you could have a single 'sizeof("//")' or sizeof("///") as
required, but that's not really the point, is it?)
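
One possible way around per-count correction constants, offered only
as a sketch: compute the space in a loop, counting one '/' before every
part except the first and one '\0' at the end.  The function name
join_with_slash and its interface are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: join n strings with '/' between them.
 * Returns a malloc'd string, or NULL if the allocation fails. */
char *join_with_slash(const char *const parts[], size_t n)
{
    size_t size = 1;            /* the terminating '\0' */
    size_t i;
    char *out, *p;

    for (i = 0; i < n; i++)
        size += strlen(parts[i]) + (i > 0);  /* '/' before all but the first */

    out = malloc(size);
    if (out == NULL)
        return NULL;

    p = out;
    for (i = 0; i < n; i++) {
        size_t len = strlen(parts[i]);

        if (i > 0)
            *p++ = '/';
        memcpy(p, parts[i], len);
        p += len;
    }
    *p = '\0';
    return out;
}
```

For the parts "a", "bb" and "ccc" this allocates 6 + 2 + 1 = 9 bytes
and produces "a/bb/ccc".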
-- 
Enjoy,
yerluvinunclesimon             Opinions are mine - my cat has her own ideas
Reach me at sar@datcon.co.uk, or ...!mcvax!ukc!pyrltd!datcon!sar