[comp.lang.c] Strings in C

peter@ficc.uu.net (Peter da Silva) (10/25/89)

In article <2522@munnari.oz.au> ok@cs.mu.oz.au (Richard O'Keefe) writes:
> C provides direct
> syntax for literals of only one type, but as I showed above, it isn't
> hard to come up with macros to declare named constants of the other
> types.

In most Macintosh C compilers that I've seen, the syntax !"%p...."! is
used to indicate byte-counted strings. P stands for "pascal".

Was X3J11 aware of this? Or are the Mac compiler vendors going to change?
-- 
Peter da Silva, *NIX support guy @ Ferranti International Controls Corporation.
Biz: peter@ficc.uu.net, +1 713 274 5180. Fun: peter@sugar.hackercorp.com. `-_-'
"That particular mistake will not be repeated.  There are plenty of        'U`
 mistakes left that have not yet been used." -- Andy Tanenbaum (ast@cs.vu.nl)

grogers@sushi.uucp (Geoffrey Rogers) (10/26/89)

In article <6676@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In most Macintosh C compilers that I've seen, the syntax !"%p...."! is
>used to indicate byte-counted strings. P stands for "pascal".
>
>Was X3J11 aware of this? Or are the Mac compiler vendors going to change?

Why should Mac compiler vendors have to change this feature? It is an
extension of the language, which they deem important for their
marketplace.

The only real question is, do they document it as an extension?

Geoffrey C. Rogers				"Whose brain did you get?"
{uunet,sun}!convex!grogers			"Abie Normal!"
grogers@convex.com

6600pete@ucsbuxa.ucsb.edu (10/27/89)

In article <2421@convex.UUCP> grogers@sushi.uucp (Geoffrey Rogers) writes:
> In article <6676@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>>In most Macintosh C compilers that I've seen, the syntax !"%p...."! is

Actually, the syntax is "\p...", but that's only a nit I had to pick.

>>used to indicate byte-counted strings. P stands for "pascal".
>>
>>Was X3J11 aware of this? Or are the Mac compiler vendors going to change?

No, they're not. It's for ROM compatibility and there isn't a work-around.

> Why should Mac compiler vendors have to change this feature? It is an
> extension of the language, which they deem important for their
> marketplace.

Yes. But that seems so obvious as to not be an issue. (Although, in this
group, I should know better.)

> The only real question is, do they document it as an extension?

Yes, they do.
--
  | GurgleKat (Pete Gontier), pete@cavevax.ucsb.edu
  | .UUCP reply addresses bounce; try another path.
  | ...if you'd gone to Dartmouth, you'd not have had to take the math.

peter@ficc.uu.net (Peter da Silva) (10/28/89)

[ !"%p...."! is used to indicate byte-counted strings. P stands for "pascal".
  I asked: "Was X3J11 aware of this? Or are the Mac compiler vendors going
	    to change?" ]

In article <2421@convex.UUCP> grogers@convex.COM (Geoffrey Rogers) writes:
> Why should Mac compiler vendors have to change this feature? It is an
> extension of the language, which they deem important for their
> marketplace.

Well, yes, but for one thing. ANSI seems to have defined !%p! as pointer
format. (so, what does !printf("%04p", ptr);! display on an 8086?)

(or was I asleep when I read that?)

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/28/89)

In article <6676@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>In most Macintosh C compilers that I've seen, the syntax !"%p...."! is
>used to indicate byte-counted strings. P stands for "pascal".
>Was X3J11 aware of this? Or are the Mac compiler vendors going to change?

Yes, at least some of us were quite aware of this.  There are also other
nonstandard vendor extensions to fprintf() format specs in existence.
I've argued numerous times with C implementors that using %p for "Pascal"
(counted) strings encouraged non-portable programming.  Unfortunately
Apple's ToolBox interfaces are designed for Pascal, not C; the same malady
later showed up on the Apple IIGS.  There are better solutions, though,
such as providing Pascal-to-C string translator functions in the C library.
Note that ByteWorks' ORCA/C for the Apple IIGS will remain non-compliant
with the C Standard unless Mike Westerfield reverses his decision about
this.  (ORCA/C does provide conversion functions, so %p=>Pascal is not
really necessary.)
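
As a rough illustration of what such translator functions might look like,
here is a minimal sketch. The names pas_to_c and c_to_pas are made up for
this example and are not taken from ORCA/C or from any Mac compiler's
library; the only assumption about the data is the layout described in this
thread, a length byte followed by up to 255 characters.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical helpers; not part of any vendor library.
   A counted ("Pascal") string keeps its length in byte 0, followed by
   up to 255 characters, with no terminating '\0'. */

/* Copy the counted string src into dst as a NUL-terminated C string.
   dst needs room for src[0] + 1 bytes (256 bytes always suffices). */
char *pas_to_c(char *dst, const unsigned char *src)
{
    size_t len = src[0];

    memcpy(dst, src + 1, len);
    dst[len] = '\0';
    return dst;
}

/* Copy the C string src into dst as a counted string, silently
   truncating anything past 255 characters.  dst needs 256 bytes. */
unsigned char *c_to_pas(unsigned char *dst, const char *src)
{
    size_t len = strlen(src);

    if (len > 255)
        len = 255;
    dst[0] = (unsigned char)len;
    memcpy(dst + 1, src, len);
    return dst;
}
```

With helpers of this kind, portable code can keep ordinary C strings
everywhere and convert only at the point of a toolbox call.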

I have no idea whether or not the Mac compilers will ever become Standard
conforming.  From what I hear, most of them come nowhere close to
providing a full hosted implementation.

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/28/89)

In article <2742@hub.UUCP> pete@cavevax.ucsb.edu writes:
>In article <2421@convex.UUCP> grogers@sushi.uucp (Geoffrey Rogers) writes:
>> In article <6676@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>>>In most Macintosh C compilers that I've seen, the syntax !"%p...."! is
>Actually, the syntax is "\p...", but that's only a nit I had to pick.

Oh, if THAT's the issue, then it's not a problem.  Any program containing
a string literal like "\pXXX" has stepped into the Twilight Zone of
"undefined behavior"; the implementation is free to assign a "counted
string" meaning to this if it happens to be useful to do so.

peter@ficc.uu.net (Peter da Silva) (10/28/89)

I said:
>In most Macintosh C compilers that I've seen, the syntax !"%p...."! is

In article <2742@hub.UUCP> pete@cavevax.ucsb.edu writes:
> Actually, the syntax is "\p...",

Ack. (takes aim at foot)

> but that's only a nit I had to pick.

I'm afraid it's not nitpicking. !\p! is an extension that doesn't conflict
with existing non-Mac code or working ANSI code. It's reasonable. I thought
it was !%p!, which conflicts with "pointer" format in ANSI C.

(*BANG*)

6600pete@ucsbuxa.ucsb.edu (10/28/89)

In article <11428@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
> In article <2742@hub.UUCP> pete@cavevax.ucsb.edu writes:
>>In article <2421@convex.UUCP> grogers@sushi.uucp (Geoffrey Rogers) writes:
>>> In article <6676@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
>>>>In most Macintosh C compilers that I've seen, the syntax !"%p...."! is
>>Actually, the syntax is "\p...", but that's only a nit I had to pick.
> Oh, if THAT's the issue, then it's not a problem.  Any program containing
> a string literal like "\pXXX" has stepped into the Twilight Zone of
> "undefined behavior"; the implementation is free to assign a "counted
> string" meaning to this if it happens to be useful to do so.

Ick. What a mess this makes. Oh well.

Actually, from previous articles I've just read, there is apparently a
"%p...", but I'm not quite sure why this is a bad thing as long as it is
documented correctly. I mean, on a non-Mac compiler, it's not going to
be caught anyway.

And odds are that you're not going to want to port a truly Mac-ish app
anywhere. The first page of Inside Macintosh, the Macker's Bible, says
"Everything you know is wrong..." Porting TO the Mac is conceivable;
most compilers have relatively extensive "UNIX compatibility" libraries.
But code FROM UNIX is not going to have a problem with "\p..." or
"%p..." either, is it? The SunOS 4 man pages don't say they use "%p..."
for anything (which of course is not the final word, but we're dealing
in percentage chances of conflicts, aren't we?)

6600pete@ucsbuxa.ucsb.edu (10/28/89)

In article <6706@ficc.uu.net> peter@ficc.uu.net (Peter da Silva) writes:
> I said:
>>In most Macintosh C compilers that I've seen, the syntax !"%p...."! is
> In article <2742@hub.UUCP> pete@cavevax.ucsb.edu writes:
>> Actually, the syntax is "\p...",
> Ack. (takes aim at foot)
>> but that's only a nit I had to pick.
> I'm afraid it's not nitpicking. !\p! is an extension that doesn't conflict
> with existing non-Mac code or working ANSI code. It's reasonable. I thought
> it was !%p!, which conflicts with "pointer" format in ANSI C.
> (*BANG*)

OK, everybody seems to be shooting themselves in the foot on this one.
Until we get a real Mac C type with the balls to post in this group, I
volunteer to shut up, after the next paragraph.

So far, we've concluded that Mac C compilers generally support BOTH
"\p..." AND "%p" in different contexts, for different things (and if
you think about it a little, I won't have to explain it and take FURTHER
risks of shooting myself in the foot).

Direct pro-Mac-C flames to mail, please. I hack the Mac (Mack), but in
Pascal and assembly, thank you.

gwyn@smoke.BRL.MIL (Doug Gwyn) (10/28/89)

In article <2756@hub.UUCP> pete@cavevax.ucsb.edu writes:
>Actually, from previous articles I've just read, there is apparently a
>"%p...", but I'm not quite sure why this is a bad thing as long as it is
>documented correctly. I mean, on a non-Mac compiler, it's not going to
>be caught anyway.

The problem is, a standard-conforming C compiler is obliged to
treat the %p format spec as meaning "take the void* argument
and print out a representation of the pointer, suitable for
debugging etc.", not to interpret the data at that address as
a counted ("Pascal") string.
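
A minimal sketch of the conforming use, for comparison. The helper name
format_pointer is invented for this example, and snprintf() is a later
(C99) addition used here only to keep the buffer handling safe; the
text that %p produces is implementation-defined, so no particular output
is shown.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the address held in p into buf using the
   standard %p conversion.  Returns what snprintf() returns, i.e. the
   number of characters the full text needs (excluding the '\0'). */
int format_pointer(char *buf, size_t size, const void *p)
{
    /* %p expects a void * argument; other pointer types should be
       converted explicitly before being passed to a variadic function. */
    return snprintf(buf, size, "%p", (void *)p);
}
```

A call such as format_pointer(buf, sizeof buf, &x) prints the address of x
in whatever form the implementation chooses; it never looks at the bytes
stored at that address.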

joe@gistdev.UUCP (Joe Brownlee) (10/29/89)

In lots of articles, lots of people say lots of things about how Macintosh
C compilers handle counted (or Pascal-style, if you will) strings.

In an article someone says (sorry, lost the attribution):
>Until we get a real Mac C type with the balls to post in this group...

Well, judge for yourself :-).  I just hack a bit as well, but I happen to have
the THINK C 4.0 Manual handy, so here goes.  By the way, THINK C 4.0 is
advertised as being very ANSI conformant, and for the most part, it is.  The
libraries seem to be the most conformant feature.  However, this version does
not support "const", "volatile", or "signed", for example.

First, string literals which begin with a "\p" are considered type "Str255",
the type of string used by Pascal and the Macintosh ROM toolbox.  Type Str255
is defined in the header file "MacTypes.h" like so:

   typedef unsigned char Str255[256];

The count is kept in the first byte.

Routines are provided to convert between the two formats of strings.  However,
in my experience, it is better to write parallel versions of the standard "str"
routines which operate on counted strings, since all toolbox calls operate on
them.  Thus you would say:

   Str255 s;
   [...]
   (void)Pstrcpy( s, "\pThis is how to set a counted string." );
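
For illustration, parallel routines of this kind might be sketched as
follows. This is not the THINK C source: the names pstr_copy and pstr_cat
are hypothetical, and the only assumptions are the Str255 typedef quoted
above and the count living in the first byte.

```c
#include <stddef.h>
#include <string.h>

typedef unsigned char Str255[256];  /* same shape as the MacTypes.h typedef */

/* Hypothetical counterpart of strcpy() for counted strings:
   copies the length byte and the characters that follow it. */
unsigned char *pstr_copy(unsigned char *dst, const unsigned char *src)
{
    memcpy(dst, src, (size_t)src[0] + 1);
    return dst;
}

/* Hypothetical counterpart of strcat() for counted strings.
   The result is truncated so that it never exceeds 255 characters. */
unsigned char *pstr_cat(unsigned char *dst, const unsigned char *src)
{
    size_t room = 255u - dst[0];                 /* space left in dst */
    size_t n = src[0] < room ? src[0] : room;    /* characters to append */

    memcpy(dst + 1 + dst[0], src + 1, n);
    dst[0] = (unsigned char)(dst[0] + n);
    return dst;
}
```

On a compiler without the "\p" extension, a counted literal can still be
spelled with an explicit octal length, for example "\005hello", cast to
const unsigned char * at the call.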

Appendix A of the "Standard Libraries Reference" Manual (which deals with
printf() and scanf()) says the following in its table of format characters:

Character   Argument Type   Output
p           void *          An eight-digit hexadecimal number.
s           char *          A string.  [...] Prints a C-style string (an object
                            of type char *).  It prints the string until one of
                            the following happens:

                            .   It encounters a NULL character, which it will
                                not print.
                            .   It prints the maximum number of characters
                                allowed by the precision directive.

                            With the # flag, this specifier prints a Pascal-
                            style string (an object of type Str255).

In other words:

   Str255 s;
   [...]
   (void)printf( "This is a Pascal string: %#s\n", s );

...would be used to display the contents of a counted string.
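
Where the %#s extension is not available, much the same effect can
probably be had with standard conversions only, by passing the count as a
precision. A minimal sketch, with format_pstring as an invented helper name
and snprintf() (C99) used for safe buffer handling:

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper: format the counted string s into buf using only
   standard printf-family features.  Returns what snprintf() returns. */
int format_pstring(char *buf, size_t size, const unsigned char *s)
{
    /* "%.*s" takes an int precision followed by the character pointer:
       s[0] is the count and the characters start at s + 1.
       Caveat: output still stops early at an embedded '\0'. */
    return snprintf(buf, size, "%.*s", (int)s[0], (const char *)(s + 1));
}
```

The same "%.*s" form works directly in printf(), so a counted string can be
displayed without first converting it to a C string.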

As stated earlier, I have never taken a Mac program to UNIX or DOS, but I have
taken UNIX and DOS programs to the Mac, so I do not find the above to be a
problem.

Disclaimer: I am not associated with Symantec/THINK C in any other way than as
a user.  I do not necessarily endorse the above as being ANSI conformant or
even as an allowed extension.  Others in this group can speak to this issue
(please, do).  I am simply posting this information so that any further
discussion can be informed rather than hearsay.

Joe Brownlee               | The best diplomat I know is a fully activated
Global Information Systems | phaser bank.  -- Montgomery Scott
1800 Woodfield Drive       |
Savoy, Illinois 61874	   | Pay attention to what I say.  Start a trend.
(217) 352-1165	           | UUCP: {uunet,pur-ee,convex}!gistdev!joe