[comp.std.c] printf zero-pads strings?

williams@beowulf.ucsd.edu (Paul Williamson) (10/22/89)

Has the definition of printf changed since early drafts of the ANSI spec?
In particular, I am interested in the interpretation of 
  printf("%05s", "x");
According to my old draft spec, and several compilers, this should print
"0000x".  That is, it should pad the string on the left with zeroes.  But
K&R2 and several other compilers give "    x", claiming that zero-padding
applies only to numeric values.

I know from my experiments with various compilers that it isn't safe to
use this construct in the real world.  But I can't help but wonder which
compilers are right according to the current pANS.  Language lawyers?

Paul Williamson           williams%cs@ucsd.edu

chris@mimsy.umd.edu (Chris Torek) (10/22/89)

In article <7279@sdcsvax.UCSD.Edu> williams@beowulf.ucsd.edu
(Paul Williamson) writes:
>  printf("%05s", "x");
>According to my old draft spec, and several compilers, this should print
>"0000x".  That is, it should pad the string on the left with zeroes.  But
>K&R2 and several other compilers give "    x", claiming that zero-padding
>applies only to numeric values.

(K&R2 is not a compiler.)

The 4.4BSD doprnt.c (essentially the one I posted to comp.lang.c) prints
"0000x".  This appears to conform to the letter of the standard (we wrote
the thing based on the letter of the standard!).
-- 
`They were supposed to be green.'
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/23/89)

In article <7279@sdcsvax.UCSD.Edu> williams@beowulf.UCSD.EDU (Paul Williamson) writes:
>Has the definition of printf changed since early drafts of the ANSI spec?

Yes, at one point the specs that had been derived from AT&T UNIX
System V specs were replaced with wording supplied by Alan Beale.
When we subsequently found unintended changes in behavior implied
in that version, additional revision occurred.  So far as I can
tell, the final fprintf() spec is exactly what we do intend.

>K&R2 and several other compilers give "    x", claiming that zero-padding
>applies only to numeric values.

And that is what we intend.

henry@utzoo.uucp (Henry Spencer) (10/23/89)

In article <7279@sdcsvax.UCSD.Edu> williams@beowulf.UCSD.EDU (Paul Williamson) writes:
>Has the definition of printf changed since early drafts of the ANSI spec?
>In particular, I am interested in the interpretation of 
>  printf("%05s", "x");
>According to my old draft spec, and several compilers, this should print
>"0000x".  That is, it should pad the string on the left with zeroes.  But
>K&R2 and several other compilers give "    x", claiming that zero-padding
>applies only to numeric values.

The Oct 88 draft (essentially final except for wording changes) says that
the `0' flag in formatting specifications applies only to the numeric
conversions.  Nothing is said about what happens otherwise, i.e. it is
undefined.
-- 
A bit of tolerance is worth a  |     Henry Spencer at U of Toronto Zoology
megabyte of flaming.           | uunet!attcan!utzoo!henry henry@zoo.toronto.edu
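
Since the draft leaves `%05s` undefined, a program that needs a zero-filled string can do the padding itself rather than rely on the `0` flag. The following is a minimal sketch of that idea; `zero_pad_string` is a hypothetical helper name, not something from the standard or from any poster's code, and it assumes the caller supplies a large enough buffer.

```c
#include <stddef.h>
#include <string.h>

/*
 * zero_pad_string is a hypothetical helper, not part of any library.
 * It copies s into out, preceded by enough '0' characters to make the
 * result at least width characters long.  Returns 0 on success, or -1
 * if out (outsize bytes) is too small to hold the result and its
 * terminating null.
 */
static int zero_pad_string(char *out, size_t outsize,
                           const char *s, size_t width)
{
    size_t len = strlen(s);
    size_t pad = (width > len) ? width - len : 0;

    if (pad + len + 1 > outsize)
        return -1;

    memset(out, '0', pad);          /* leading zeros, if any */
    memcpy(out + pad, s, len + 1);  /* the string plus its '\0' */
    return 0;
}
```

Calling `zero_pad_string(buf, sizeof buf, "x", 5)` leaves "0000x" in `buf`, which can then be printed with a plain `%s`. A string already longer than the width is copied unchanged, matching how a field width behaves for `%s`.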

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/24/89)

In article <11390@smoke.BRL.MIL>, gwyn@smoke.BRL.MIL (Doug Gwyn) writes:

|  >K&R2 and several other compilers give "    x", claiming that zero-padding
|  >applies only to numeric values.
|  
|  And that is what we intend.

  And broke every program which used zero fill for strings! I think you
also broke zero fill for left justified data, both strings and numerics.

  Before you write a hot reply, I'm joking. One of the hardest problems
I ever had to track down (in someone else's code) was a case where a
number was zero-filled and left-justified. Naturally we were looking for
something in the code which multiplied by ten every time the value
dropped below 100!

  I actually do have a program which uses zero filled strings and I
will not complain if it breaks, it was a hack and is documented as
such. The program was compiled on VMS2 (yes ten years ago) and has
probably never been recompiled.

-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/25/89)

In article <20327@mimsy.umd.edu>, chris@mimsy.umd.edu (Chris Torek) writes:

|  The 4.4BSD doprnt.c (essentially the one I posted to comp.lang.c) prints
|  "0000x".  This appears to conform to the letter of the standard (we wrote
|  the thing based on the letter of the standard!).

  I would be delighted if it works that way, since I have a program which
uses it (input strings are digit sequences with leading zeros stripped).
Now I ask you: what does "%-05x" give? I won't be on the machine with
your doprnt until Friday, but I thought you might remember.
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

chris@mimsy.umd.edu (Chris Torek) (10/25/89)

In article <1430@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM
(Wm E Davidsen Jr) writes:
>I would be delighted if [printf] works that way ["%05s","x"=>"0000x"],
>having a program which uses it (input strings are digit sequences with
>leading zeros stripped), now I ask you, what does "%-05x" give.

May 13, 1988 draft standard says this about `0' (|xxx| indicates courier
font, for literal C text):

	For |d|, |i|, |o|, |u|, |x|, |X|, |e|, |E|, |f|, |g|, and |G|
	conversions, leading zeros (following any indication of sign
	or base) are used to pad to the field width; no space padding
	is performed.  If the |0| and |-| flags both appear, the |0|
	flag will be ignored.  For |d|, |i|, |o|, |u|, |x|, and |X|
	conversions, if a precision is specified, the |0| flag will
	be ignored.  For other conversions, the behavior is undefined.

We happened to base the %05s zero-fill rule on an earlier draft, which
simply said `zero pad instead of blank pad, but |0| and |-| together
is like |-| alone'.  Hence %-05<any> functions as %-5<any>.
-- 
`They were supposed to be green.'
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris
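
The three rules in the quoted draft text for integer conversions can be tried directly. This is only a sketch: `format_int` and `zero_flag_cases` are hypothetical names made up for the illustration, and the results described afterwards assume a library that follows the quoted wording.

```c
#include <stdio.h>

/*
 * format_int is a hypothetical wrapper used only for this illustration.
 * It formats one int with the given conversion specification and
 * returns whatever snprintf returns.
 */
static int format_int(char *out, size_t outsize, const char *fmt, int value)
{
    return snprintf(out, outsize, fmt, value);
}

/* Fills three caller-supplied buffers, each of at least 6 bytes. */
static void zero_flag_cases(char *zero, char *left, char *prec)
{
    format_int(zero, 6, "%05d", 42);    /* 0 flag alone: pad with zeros */
    format_int(left, 6, "%-05d", 42);   /* 0 with -: the 0 is ignored   */
    format_int(prec, 6, "%05.3d", 42);  /* 0 with precision: ignored    */
}
```

Under the quoted rules the three buffers should hold "00042", "42   " and "  042" respectively. The second case shows `%-05d` behaving as `%-5d`; the third shows the precision supplying the only leading zero, with the rest of the field filled with spaces.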

davidsen@crdos1.crd.ge.COM (Wm E Davidsen Jr) (10/26/89)

In article <20371@mimsy.umd.edu>, chris@mimsy.umd.edu (Chris Torek) writes:
	if a precision is specified, the |0| flag will be ignored.
	(quoted from the standard)

  This certainly doesn't grab me as being 'least astonishment.' I
interpret this to mean that if I say %010.2f the |0| is ignored. Yes? My
copy of the standard is on loan, I can't check that you quoted it
correctly, but the context (from which I extracted it) doesn't seem to
refer to this.

  Could someone explain why this works this way? I certainly find it
easier to explain when leading zero means pad with zeros. Period. What
was the thinking that specifying precision in some way made zero fill
undesirable?

  Adding a leading zero is not the type of thing one does by accident.
It would be nice if it had been defined to either cause leading zeros or
be a runtime error. Ignoring a user request for action is a good way to
create hard to find errors (my opinion).
-- 
bill davidsen	(davidsen@crdos1.crd.GE.COM -or- uunet!crdgw1!crdos1!davidsen)
"The world is filled with fools. They blindly follow their so-called
'reason' in the face of the church and common sense. Any fool can see
that the world is flat!" - anon

chris@mimsy.umd.edu (Chris Torek) (10/27/89)

>In article <20371@mimsy.umd.edu>, chris@mimsy.umd.edu (Chris Torek) writes:
>	if a precision is specified, the |0| flag will be ignored.
>	(quoted from the standard)

In article <1470@crdos1.crd.ge.COM> davidsen@crdos1.crd.ge.COM
(Wm E Davidsen Jr) writes:
>  This certainly doesn't grab me as being 'least astonishment.' I
>interpret this to mean that if I say %010.2f the |0| is ignored. Yes?

No; that clause applies only to [diouxX] formats.  Thus, the 0 is not
ignored for `%010.2f', but it is ignored for `%010.2d'.

Apparently the logic is that the precision here says exactly how many
numeric digits are to appear, hence the precision says how many leading
zeros there can be, and so the 0 flag should be ignored.

>... I certainly find it easier to explain when leading zero means pad
>with zeros. Period.

I happen to agree, but my copy of the draft does not.

Chris
-- 
`They were supposed to be green.'
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris
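
The distinction between `%010.2f` and `%010.2d` can be seen side by side. The sketch below uses a hypothetical function name, `precision_cases`, and sample values chosen only for illustration; the results described afterwards assume a library that implements the draft wording quoted earlier in the thread.

```c
#include <stdio.h>

/*
 * precision_cases is a hypothetical demo function.  out_f and out_d
 * must each have room for at least 11 bytes (10 characters plus '\0').
 */
static void precision_cases(char *out_f, char *out_d)
{
    const char *fmt_f = "%010.2f";  /* precision counts fraction digits */
    const char *fmt_d = "%010.2d";  /* precision is a minimum digit count */

    snprintf(out_f, 11, fmt_f, 3.14159);  /* 0 flag still applies */
    snprintf(out_d, 11, fmt_d, 7);        /* 0 flag is ignored */
}
```

With those rules, `out_f` should come out as "0000003.14" and `out_d` as eight spaces followed by "07". For the `f` conversion the precision says nothing about leading digits, so the `0` flag is free to fill the field; for the `d` conversion the precision already fixes the number of digits, and the remaining width is blank-padded.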