[comp.lang.c] sprintf

charlie@mica.stat.washington.edu (Charlie Geyer) (12/08/88)

It seems that in 4.3 BSD (and before?) sprintf doesn't lint right
except on a VAX.  Consider the following little program.

  #include <stdio.h>

  main()
  {
  char s[20];

  sprintf(s,"foo");
  fprintf(stdout," %s\n",s);
  sprintf(s,"foo");
  fprintf(stdout," %s\n",s);
  }

which compiles and works fine.  It also lints fine on a VAX running
4.3 BSD from mt Xinu.  On a Sun-3 running Sun UNIX 4.2 Release 3.2
lint gives the following rather odd error message

  sprintf value declared inconsistently   llib-lc(512)  ::  goo.c(9)

What's going on?  And why just the error in line 9?  Why not in line 7
as well?  Looking at /usr/lib/lint/llib-lc we see that lint thinks that
sprintf returns a char * which agrees with what the man page says.
But /usr/include/stdio.h has the following interesting item

  #ifdef vax
  char    *sprintf();             /* too painful to do right */
  #endif

so it doesn't define what sprintf returns (and by default C assumes
that sprintf returns an int).

So two questions:

  (1) What is "too painful to do right" and why?

  (2) Why doesn't lint give an error for line 7 as well?  I'm not
      asking for an explanation of why lint does this.  I want a
      justification of why it *should* do this.

This error is not Sun-specific.  On an IBM RT running IBM Academic
Information Systems 4.3 (which somebody told me is approximately 4.3
BSD and looks like it), /usr/include/stdio.h has the same ifdef and
comment about "too painful to do right" and so does not define what
sprintf returns.  Here the man page doesn't specify the type sprintf
returns either (so int is the default), but /usr/lib/lint/llib-lc
says that sprintf returns a char *.  So the RT gives the same error
message

  sprintf value declared inconsistently   llib-lc(438)  ::  goo.c(9)

The RT has source and /usr/src/lib/libc/stdio/sprintf.c says that
sprintf does indeed return a char *, its first argument.  So (if the
source is to be believed) the man page and stdio.h are broken.

chris@mimsy.UUCP (Chris Torek) (12/08/88)

In article <1102@entropy.ms.washington.edu> charlie@mica.stat.washington.edu
(Charlie Geyer) writes:
>But [4.xBSD, x < 3tahoe] /usr/include/stdio.h has the following
>interesting item
>
>  #ifdef vax
>  char    *sprintf();             /* too painful to do right */
>  #endif
>
>so it doesn't define what sprintf returns (and by default C assumes
>that sprintf returns an int).

>  (1) What is "too painful to do right" and why?

sprintf() is supposed to return an int, namely the count of characters
printed.  There were numerous programs that depended on it returning
a `char *'.  These have been fixed; it was only mildly painful.

>  (2) Why doesn't lint give an error for [a call to fprintf] as well?

Your lint library (for Suns and IBM ACIS) does not incorrectly declare
fprintf() as returning `char *'.  Whether your sprintf() actually
returns `char *' (as not-quite-defined in stdio.h) is another question
entirely.  Poking around in the SunOS 3.2 sources, I find both versions.
Which one gets used I have no idea.

This is fixed in 4.3BSD-tahoe, where sprintf() returns int---no options.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/09/88)

In article <1102@entropy.ms.washington.edu> charlie@mica.stat.washington.edu (Charlie Geyer) writes:
>It seems that in 4.3 BSD (and before?) sprintf doesn't lint right
>except on a VAX.

Actually the VAX version is wrong.  sprintf() is supposed to return int.
This wasn't clearly specified in the "good old days", and since one
implementation accidentally returned the buffer address, somebody thought
that was supposed to be the definition.

>  (1) What is "too painful to do right" and why?

Presumably somebody realized that it was wrong but thought there was
too much code that relied on the BSD behavior.  (E.g. "rogue" did this.)

guy@auspex.UUCP (Guy Harris) (12/10/88)

>It seems that in 4.3 BSD (and before?) sprintf doesn't lint right
>except on a VAX.

More correctly, you have to declare "sprintf" yourself, even if you've
included <stdio.h>, except on a VAX.

>What's going on?

Since "sprintf" wasn't declared, when it was first used the compiler
implicitly declared it as returning "int", which disagrees with the
"lint" library.

>And why just the error in line 9.

Beats me.  I tried it on this Sun running 4.0, and it complained about
line 7...

>Why not in line 7 as well?

...and not about line 9.  The 4.0 behavior is arguably correct; it
complains about line 7 because that's the first line on which "sprintf"
is used, and therefore is the line on which it's declared.  Since it's
already been declared by line 9, no complaint is issued for that line.

>  (1) What is "too painful to do right" and why?

It was too painful to change "sprintf" to return the number of
characters it generated, rather than returning its first argument; too
many programs relied on that behavior.

The original V7 "sprintf" returned its first argument.  In System III,
or perhaps earlier, all the "*printf" routines were changed to return
the number of characters generated.  4.xBSD was derived from UNIX 32V,
which was derived from V7, so it had the old-style "sprintf".

>  (2) Why doesn't lint give an error for line 7 as well?  I'm not
>      asking for an explanation of why lint does this.  I want a
>      justification of why it *should* do this.

It only gives an error for line 9 because there's a bug, presumably.  It
*should* only give an error for line 7, for the reasons listed above.

The dpANS specifies that they all return the number of characters
generated, and POSIX inherits this from the dpANS (I think); this will
probably prompt Berkeley to say "change your code or die" and adopt the
new-style behavior (the pain involved notwithstanding).
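
A minimal sketch of the new-style behavior described above, assuming a
library that follows the dpANS definition.  The helper name
format_count is hypothetical and exists only for illustration.

```c
#include <stdio.h>

/* Hypothetical helper: formats a short string and passes back whatever
   sprintf() reported.  Under the dpANS/System V definition that value
   is the number of characters written, not counting the trailing '\0'.
   On an old-style BSD library the value would be the buffer address
   instead, and this function would be wrong. */
int format_count(char *buf)
{
    return sprintf(buf, "%s-%d", "foo", 42);   /* writes "foo-42" */
}
```

With a count-returning library, the value returned for "foo-42" is 6.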

guy@auspex.UUCP (Guy Harris) (12/10/88)

 >Your lint library (for Suns and IBM ACIS) does not incorrectly declare
 >fprintf() as returning `char *'.  Whether your sprintf() actually
 >returns `char *' (as not-quite-defined in stdio.h) is another question
 >entirely.  Poking around in the SunOS 3.2 sources, I find both versions.
 >Which one gets used I have no idea.

One gets used when you compile in the 4BSD environment (using
"(/usr)/bin/cc"); one gets used when you compile in the S5 environment
(using "/usr/5bin/cc").

guy@auspex.UUCP (Guy Harris) (12/10/88)

>Actually the VAX version is wrong.  sprintf() is supposed to return int.
>This wasn't clearly specified in the "good old days", and since one
>implementation accidentally returned the buffer address, somebody thought
>that was supposed to be the definition.

Err, umm, accidentally or deliberately?  It wasn't specified *at all* in
the V7 documentation, as I remember, but at least one piece of *System
III* thought it should return the buffer address (one of the SCCS
commands) - which is kind of amusing, considering the behavior had been
changed by then.

knudsen@ihlpl.ATT.COM (Knudsen) (12/14/88)

Funny you should mention Sun and sprintf().  I have just
learned the hard way that while most (???) C systems
have sprintf return the number of characters written,
the Sun version returns the buffer address!
Let's hear it for STANDARD I/O Libraries!

Both return values are useful, but I'd rather have the integer
number of characters, since otherwise a strlen() call is needed
to get this.
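
The strlen() route might look like the sketch below.  It assumes the
formatted output contains no embedded '\0' characters; the name
format_and_measure is made up for the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical wrapper: ignore whatever sprintf() returns (buffer
   address on old BSD libraries, character count elsewhere) and
   measure the result with strlen() instead.  The caller must supply
   a buffer large enough for the formatted text. */
size_t format_and_measure(char *buf, const char *name, int value)
{
    (void) sprintf(buf, "%s=%d", name, value);
    return strlen(buf);
}
```

This costs an extra pass over the buffer, but it does not depend on
which definition of sprintf() the library provides.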

Funny, most programmers don't even know what the printf family's
return values are supposed to be (thus permitting Sun to
define them as they please?).  You can always declare
	(void) printf(), sprintf(), fprintf();
to shut up lint, but sometimes I want those return values.

Have you tested printf() and fprintf() on Sun to see what THEY return?
On AT&T 3B2 they all return number of characters.
-- 
Mike Knudsen  Bell Labs(AT&T)   att!ihlpl!knudsen
"Lawyers are like nuclear bombs and PClones.  Nobody likes them,
but the other guy's got one, so I better get one too."

ok@quintus.uucp (Richard A. O'Keefe) (12/14/88)

In article <8131@ihlpl.ATT.COM> knudsen@ihlpl.ATT.COM (Knudsen) writes:
>Funny you should mention Sun and sprintf().  I have just
>learned the hard way that while most (???) C systems
>have sprintf return the number of characters written,
>the Sun version returns the buffer address!
>Let's hear it for STANDARD I/O Libraries!

yes, let's.  SunOS was originally based on 4.2BSD, and 4.2BSD sprintf()
returns the buffer address.  Why is that?  Because Berkeley ***DIDN'T***
change it!  AT&T changed the "standard" library!

And while we're bashing Sun, let's note that they are so unprincipled
that they provide the System V behaviour if you want it.  Just compile
your programs with /usr/5bin/cc.  What villains!  (Heavy irony.)

guy@auspex.UUCP (Guy Harris) (12/15/88)

>Funny you should mention Sun and sprintf().  I have just
>learned the hard way that while most (???) C systems
>have sprintf return the number of characters written,
>the Sun version returns the buffer address!
>Let's hear it for STANDARD I/O Libraries!

No, not most.  The Version 7 "sprintf" returned the buffer address,
although this wasn't documented.  It was changed somewhere around System
III to return the number of characters printed (although, amusingly
enough, one piece of the System III SCCS code depended on it returning
the buffer address!).

SunOS started with 4.xBSD, which was derived from UNIX/32V, which was
based (more or less) on V7, so its "sprintf" returns the buffer address
- at least in the 4BSD environment.  The "sprintf" in the System V
environment (compile with "/usr/5bin/cc" rather than "(/usr)/bin/cc")
returns the number of characters printed.

Other systems that started with 4.xBSD probably also return the buffer
address, at least in their 4.xBSD environments.

>Funny, most programmers don't even know what the printf family's
>return values are supposed to be (thus permitting Sun to
>define them as they please?).

No, Sun didn't define it that way - in effect, AT&T did, when they
shipped V7 with "sprintf" returning the buffer pointer.  Sun just
followed Berkeley who followed V7....

>Have you tested printf() and fprintf() on Sun to see what THEY return?
>On AT&T 3B2 they all return number of characters.

In fact, they return the number of characters on the Sun - in both
environments.  In 4.xBSD, and probably in V7, they return, I think, 0 on
success and EOF on error.  The intent in SunOS 2.0 was to change them
all to return the number of characters printed; unfortunately, while
this broke few, if any, programs for "printf" and "fprintf", it broke a
bunch for "sprintf", so that was backed out.

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/15/88)

In article <865@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>yes, let's.  SunOS was originally based on 4.2BSD, and 4.2BSD sprintf()
>returns the buffer address.  Why is that?  Because Berkeley ***DIDN'T***
>change it!  AT&T changed the "standard" library!

You've oversimplified to the point of unacceptable distortion.
Originally, sprintf()'s return value was not specified (see e.g.
UPM 7th Ed.).  In one of several cases of parallel uncoordinated
invention, Berkeley decided it should be the buffer address
(perhaps because that happened to accidentally be in the return
register already in the VAX assembly-language implementation),
while AT&T decided that all the *printf() functions should return
the (much more useful) count of characters transferred.

To make matters even worse, somewhere along the way (perhaps as a
result of the /usr/group 1984 Standard) Berkeley realized that
there was this discrepancy between the two major UNIX variants.
So they altered the declaration in their <stdio.h> to:

#ifdef vax
char	*sprintf();		/* too painful to do right */
#endif

Note that Sun MUST have changed this in order for it to apply to
their (definitely non-VAX) machines.  In fact a lot of vendors
supplying 4BSD-based systems (maybe even all of them) did the
same thing, since there were applications developed on VAX 4BSD
that relied on it.  (Rogue, for example.)

I hope by now all the BSD code that depends on the value of
sprintf() being anything in particular has been changed to use
some other method.  Usually, something like
	buf[0] = '\0';
	(void)sprintf(buf, fmt, args);
	if (buf[0] == '\0')
		/* error */
is sufficient (it works with both definitions of sprintf()).

ok@quintus.uucp (Richard A. O'Keefe) (12/16/88)

In article <9181@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:

>#ifdef vax
>char	*sprintf();		/* too painful to do right */
>#endif
>
>Note that Sun MUST have changed this in order for it to apply to
>their (definitely non-VAX) machines.

In point of fact, at least as of SunOS 3.2, they DIDN'T.
So much for "MUST".  The effect is that in SunOS, if you wanted the
buffer pointer, you had to explicitly declare sprintf() yourself.

>I hope by now all the BSD code that depends on the value of
>sprintf() being anything in particular has been changed to use
>some other method.  Usually, something like
>	buf[0] = '\0';
>	(void)sprintf(buf, fmt, args);
>	if (buf[0] == '\0')
>		/* error */
>is sufficient (it works with both definitions of sprintf()).

Except that it doesn't work with either.  Consider
	sprintf(buf, "%.*s", len, ptr);
when len happens to be 0.  By the way, what _are_ the output error
conditions that *printf() can detect other than by crashing?
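
The zero-precision case can be sketched as follows.  The helper name
looks_like_error is hypothetical; it simply wraps the sentinel test
quoted above.

```c
#include <stdio.h>

/* Hypothetical helper wrapping the sentinel test: returns 1 if the
   buffer is still empty after formatting, 0 otherwise.  With a
   precision of 0, "%.*s" legitimately writes no characters, so the
   test reports an "error" for a call that succeeded. */
int looks_like_error(char *buf, int len, const char *ptr)
{
    buf[0] = '\0';
    (void) sprintf(buf, "%.*s", len, ptr);
    return buf[0] == '\0';
}
```

Calling looks_like_error(buf, 0, "hello") yields 1 even though nothing
went wrong, while looks_like_error(buf, 3, "hello") yields 0.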

gwyn@smoke.BRL.MIL (Doug Gwyn ) (12/17/88)

In article <878@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
-In article <9181@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
->#ifdef vax
->char	*sprintf();		/* too painful to do right */
->#endif
->Note that Sun MUST have changed this in order for it to apply to
->their (definitely non-VAX) machines.
-In point of fact, at least as of SunOS 3.2, they DIDN'T.
-So much for "MUST".  The effect is that in SunOS, if you wanted the
-buffer pointer, you had to explicitly declare sprintf() yourself.

I thought the poster who originated this discussion said that
his version of SunOS had done this.  I'm still covered by the
"in order for" clause in what I said.

->I hope by now all the BSD code that depends on the value of
->sprintf() being anything in particular has been changed to use
->some other method.  Usually, something like
->	buf[0] = '\0';
->	(void)sprintf(buf, fmt, args);
->	if (buf[0] == '\0')
->		/* error */
->is sufficient (it works with both definitions of sprintf()).
-Except that it doesn't work with either.  Consider
-	sprintf(buf, "%.*s", len, ptr);
-when len happens to be 0.  By the way, what _are_ the output error
-conditions that *printf() can detect other than by crashing?

Again, I said "Usually, ..." which implies "not always".  The vast
majority of sprintf() usage can correctly be changed to follow my
suggestion.  I note that you offered no alternative.

There are several possible errors other than I/O errors.  Think
hard enough and you should be able to imagine at least one.

ok@quintus.uucp (Richard A. O'Keefe) (12/19/88)

In article <9205@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <878@quintus.UUCP> ok@quintus.UUCP (Richard A. O'Keefe) writes:
>-In article <9181@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>->I hope by now all the BSD code that depends on the value of
>->sprintf() being anything in particular has been changed to use
>->some other method.  Usually, something like
>->	buf[0] = '\0';
>->	(void)sprintf(buf, fmt, args);
>->	if (buf[0] == '\0')
>->		/* error */
>->is sufficient (it works with both definitions of sprintf()).
>-Except that it doesn't work with either.  Consider
>-	sprintf(buf, "%.*s", len, ptr);
>-when len happens to be 0.

>Again, I said "Usually, ..." which implies "not always".  The vast
>majority of sprintf() usage can correctly be changed to follow my
>suggestion.  I note that you offered no alternative.

I offered no alternative because I wanted to keep the message short.
According to the ULTRIX documentation, sprintf(3s) returns EOF to
indicate an error.  The alternative is thus
	#ifdef	BSD
	#define	BadSprintf (char*)(EOF) == sprintf
	#else /*SYS5*/
	#define	BadSprintf         EOF  == sprintf
	#endif
and then do
	if (BadSprintf(buf, fmt, args)) /* error */;

>-By the way, what _are_ the output error
>-conditions that *printf() can detect other than by crashing?

>There are several possible errors other than I/O errors.  Think
>hard enough and you should be able to imagine at least one.

I can _imagine_ a lot of possible errors, no problem.  The trouble is
that all the ones I've tested either dump core or are quietly accepted.
For example, "?" is not a defined conversion code, but sprintf accepts
"%.*?" as a format conversion, and writes '?'.  Then there are negative
field widths.  One might expect sprintf(buf, "%.*d", -9999, 44) to
either report an error or else act like "%-9999d", but it does neither.
It would be reasonable for sprintf(buf, "(%s)", (char*)NULL) to report
an error, but on my system it writes "((null))".  Another obvious error
is sprintf(buffer, "%4294967299d", 1), but it is not reported.  And so
it goes.

I think the following three points are uncontroversial:
(1) The (4.2BSD, S5R2, S5R3, Ultrix, SunOS) manual pages for printf(3s)
    do not say what any of the conditions which can cause an error return
    are.  (I haven't seen the 4.3BSD manual page.)
(2) There are a lot of errors which sprintf() either converts quietly to
    something plausible or dumps core with, not least buffer overflow.
(3) A portable program cannot rely on sprintf() catching any error at
    all, and thus had better do its own argument validation in any case.
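
Point (3) might be handled along these lines.  This is only a sketch
for one fixed format; the function name and its return convention
(0 for success, -1 for a rejected argument) are assumptions made for
the example.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: writes "(s)" into buf only after checking the
   things sprintf() itself may not check.  Returns 0 on success and
   -1 if an argument is unusable. */
int format_parenthesized(char *buf, size_t bufsize, const char *s)
{
    if (buf == NULL || s == NULL)
        return -1;              /* don't rely on "(null)" or a core dump */

    /* Need room for '(', the string, ')' and the terminating '\0'. */
    if (bufsize < 3 || strlen(s) > bufsize - 3)
        return -1;              /* would overflow the buffer */

    (void) sprintf(buf, "(%s)", s);
    return 0;
}
```

The checks are specific to the format string; a different format needs
its own length calculation.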

It would be interesting to hear from someone who has UNIX sources what
the reported errors are.  

guy@auspex.UUCP (Guy Harris) (12/20/88)

>I think the following three points are uncontroversial:
>(1) The (4.2BSD, S5R2, S5R3, Ultrix, SunOS) manual pages for printf(3s)
>    do not say what any of the conditions which can cause an error return
>    are.  (I haven't seen the 4.3BSD manual page.)

I checked the 4.3-tahoe manual, and a quick look reveals no indication
of which conditions cause an error return; I don't have the 4.3BSD
page handy, but I don't remember it containing any such indication.

>(2) There are a lot of errors which sprintf() either converts quietly to
>    something plausible or dumps core with, not least buffer overflow.

This doesn't have to be true of "sprintf", although for most existing
implementations this is true (buffer overflow is harder to catch in most
C implementations than, say, some format-string syntax errors are).

>(3) A portable program cannot rely on sprintf() catching any error at
>    all, and thus had better do its own argument validation in any
>    case.

This is true.

>It would be interesting to hear from someone who has UNIX sources what
>the reported errors are.  

My quick check of the S5R3 source indicates that "printf" and "fprintf"
return EOF only if "ferror" on the stream being written to indicates an
error, or if "_doprnt" returns EOF, and "sprintf" does so only if
"_doprnt" returns EOF.  "_doprnt", and other I/O functions, set the I/O
error flag on a stream only if a "write" call returns an error
indication.  Since "sprintf" isn't supposed to write anything, this
means it should never return EOF in that implementation.
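
The logic described above can be approximated with standard calls.
This is not the S5R3 source: checked_fprintf is a hypothetical wrapper,
and the standard vfprintf() stands in for "_doprnt".

```c
#include <stdarg.h>
#include <stdio.h>

/* Hypothetical wrapper, not the S5R3 code: report EOF if the
   formatter fails or the stream's error flag is set, otherwise
   return the number of characters written. */
int checked_fprintf(FILE *fp, const char *fmt, ...)
{
    va_list ap;
    int n;

    va_start(ap, fmt);
    n = vfprintf(fp, fmt, ap);  /* stands in for "_doprnt" */
    va_end(ap);

    if (n < 0 || ferror(fp))
        return EOF;
    return n;
}
```

A string-only formatter has no stream whose "write" can fail, which is
why the equivalent test for "sprintf" never fires.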