bobl@aeolus.UUCP (Bob Lewis) (12/03/84)
I'd like to bring up the matter of using '$' in C identifiers. Here are several points of view I've come across:
    Ritchie's "The C Programming Language -- Reference Manual":
        While he states that '_' counts as a letter, he says nothing about '$'.
    4.2bsd "cc":
        '$' counts as a letter.
    VAX/VMS "CC":
        '$' counts as a letter, but the documentation warns:
            The dollar sign should be used only in identifiers for VAX/VMS global symbols. Identifiers that contain dollar signs may not be portable.
What does the standard say about this? '$' is not used anywhere else in C. I think its use as a letter should be officially permitted. (It makes a nice "package" identifier, cf. VMS.)
I'm conducting an informal poll on what other C compilers do with '$'. If you're interested, please send me your findings and I will summarize. (Deadline: 12/10/84).
- Bob L.
...!tektronix!teklds!bobl
jss@sftri.UUCP (J.S.Schwarz) (12/05/84)
-- > I'd like to bring up the matter of using '$' in C identifiers. Here are > several points of view I've come across: > > What does the standard say about this? '$' is not used anywhere else in C. > I think its use as a letter should be officially permitted. (It makes a > nice "package" identifier, c.f. VMS.) The latest (Oct 31) ANSI draft that I have does not include '$' as a character in identifiers. It should be pointed out that many preprocessors (such as YACC) use '$' in some way to distinguish "special" identifiers. It would cause confusion if '$' was now made legal in ordinary C identifiers. Looking at my keyboard, the characters not currently used by C (outside of comments and strings) are '@', '$', and backquote(`). If you really need a new nonalphabetic character in identifiers I would suggest backquote. Identifiers like xyz`p look nice to me. (But personally, I see no need for such a modification to the language definition.) Jerry Schwarz ihnp4!btlunix!jss
henry@utzoo.UUCP (Henry Spencer) (12/06/84)
'$' is, alas, commonly used as an escape from C, for example in yacc. -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
steveg@hammer.UUCP (Steve Glaser) (12/07/84)
In article <260@sftri.UUCP> jss@sftri.UUCP (J.S.Schwarz) writes:
>--
>
>> I'd like to bring up the matter of using '$' in C identifiers. Here are
>> several points of view I've come across:
>>
>> What does the standard say about this? '$' is not used anywhere else in C.
>> I think its use as a letter should be officially permitted. (It makes a
>> nice "package" identifier, c.f. VMS.)
>
>The latest (Oct 31) ANSI draft that I have does not include '$' as
>a character in identifiers.
>
>It should be pointed out that many preprocessors (such as YACC) use
>'$' in some way to distinguish "special" identifiers. It would
>cause confusion if '$' was now made legal in ordinary C identifiers.
>
>Looking at my keyboard, the characters not currently used by C (outside
>of comments and strings) are '@', '$', and backquote(`). If you really
>need a new nonalphabetic character in identifiers I would suggest backquote.
>Identifiers like xyz`p look nice to me. (But personally, I see no
>need for such a modification to the language definition.)
>
> Jerry Schwarz
> ihnp4!btlunix!jss
Backquote is already used in the GCOS version of C. Try this on your favorite C compiler (4.2 in this case, but it's in every version of PCC where I've looked for it).
    % cat test2.c
    main()
    {
        char c[] = `hi there`;
    }
    % cc test2.c
    "test2.c", line 3: no automatic aggregate initialization
    "test2.c", line 3: BCD constant exceeds 6 characters
    "test2.c", line 3: gcos BCD constant illegal
    "test2.c", line 3: illegal left operand of assignment operator
    %
Seriously though, I can't see adding any more letters to the C language identifier set just "cause it'd be nice and besides VMS does it". If we have to change that area, let it be in the area of somehow allowing the various national variants of the ISO character code (of which US ASCII is just one). That's the only character set issue that I can see that's important enough to warrant changing the language for.
Unfortunately, with the overloading of characters ('|' is a letter in some countries), I don't see an easy solution emerging.
Steve Glaser
steveg.tektronix@csnet-relay tektronix!steveg
smh@mit-eddie.UUCP (Steven M. Haflich) (12/09/84)
Another very ugly but very practical reason for not allowing additional alphameric characters in identifiers is portability. Regardless what the C Standard eventually says, not all machine/OS combinations support all C-ASCII characters in identifiers (especially externals), and some support non-C-ASCII characters. There is little a standards committee or net.lang.c can do about this [except, I suppose, flame :-)]. My intuition is that languages and OS/linkers most commonly allow exactly *one* legal nonalphameric character in identifiers, and this character is most often overloaded as an informal package flag: `_' as an external prefix char in Unix/C, `$' in lots of Big Blue systems, etc. When porting code either direction, the simple one-to-one mapping of these characters saves a lot of grief. Let's not make it tougher to *ex*port Unix/C to other systems by trying to make it very occasionally easier to *im*port foreign code. It is my opinion, by the way, that the traditional availability of these informal package-flag chars inside identifiers was a portability botch, mostly impeding exporting code, not importing. But it is only recently that vendors and their captive language designers have come to realize that exportability of code can sell machines just as well as importability.
henry@utzoo.UUCP (Henry Spencer) (12/11/84)
For what it's worth, the current ANSI C draft (12 Nov 1984) says that dollar signs aren't in C's vocabulary at all (except for the usual exemption for comments and strings), but adds the following in the "Common extensions" discussion in appendix E: E.4.4.1 Specialized identifiers Characters other than the underscore _, letters, and digits, that are not defined in the required source character set (such as the dollar sign $) may appear in an identifier. Note that this is identified as a "common extension", not as part of the standard proper, and "[is] not portable to all implementations". -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
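As an illustration of the "common extension" quoted above, here is a minimal sketch of '$' used as an informal package separator. It assumes a compiler that accepts '$' in identifiers (4.2bsd cc and VAX/VMS CC are reported to in this thread; GCC and Clang also do by default on most targets). The names lib$call_count and lib$next_id are hypothetical, and a strictly conforming compiler may reject the file outright.

```c
#include <assert.h>

/* '$' as a package separator, VMS style.
 * Assumption: the compiler treats '$' as a letter in identifiers.
 * This is an extension and is not portable to all implementations. */
static int lib$call_count = 0;      /* hypothetical "lib" package counter */

/* Return base plus the number of calls made so far (including this one). */
int lib$next_id(int base)
{
    assert(base >= 0);              /* illustrative precondition only */
    lib$call_count++;
    return base + lib$call_count;
}
```

Calling lib$next_id(10) twice should yield 11 and then 12 on a compiler that accepts the extension.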
85488116@sdcc3.UUCP (Oliver Boliver Butt) (12/12/84)
> For what it's worth, the current ANSI C draft (12 Nov 1984) says that
> dollar signs aren't in C's vocabulary at all (except for the usual
> exemption for comments and strings), but adds the following in the
> "Common extensions" discussion in appendix E:
>
> E.4.4.1 Specialized identifiers
>
> Characters other than the underscore _, letters, and digits, that
> are not defined in the required source character set (such as the
> dollar sign $) may appear in an identifier.
* Yacc uses $$ and $n to allow for manipulations on its parsing stack. In this case, $$ and $n are used as pseudo-identifiers for the stack in the C code actions which get executed when a rule is reduced. Therefore, there is an ambiguity problem between yacc and the proposed use of $ when you try to compile the code yacc generates. So you either have to change yacc, or forget about $ in identifiers.
Paul van de Graaf
U. C. San Diego
sdcsvax!sdcc3!85488116
* If you don't know, yacc stands for "yet another compiler compiler". It is a tool which generates a parser given an LALR(1) grammar and some supporting code that the programmer writes.
robert@gitpyr.UUCP (Robert Viduya) (12/14/84)
Perhaps what C really needs is a way to define a separate external name for
an identifier when declaring that identifier. For example, the following:
extern sys_read "SYS$READ" ();
would tell the compiler/linker to call the entrypoint "SYS$READ" whenever
"sys_read" was called in the source. "sys_read" would be the source level
name of the procedure and "SYS$READ" would be the object level name of the
procedure. This should also be applicable to ints and other data
structures. The definition of the object level name should also be
optional.
Of course, the syntax is only a suggestion.
robert
--
Robert Viduya
Office of Computing Services
Georgia Institute of Technology, Atlanta GA 30332
Phone: (404) 894-4669
...!{akgua,allegra,amd,hplabs,ihnp4,masscomp,ut-ngp}!gatech!gitpyr!robert
...!{rlgvax,sb1,uf-cgrl,unmvax,ut-sally}!gatech!gitpyr!robert
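The idea of a source-level name with a separate object-level name can be sketched with a facility that exists today, though not in the syntax proposed above: GCC and Clang accept an "asm label" after a declarator. This is a compiler extension, not ISO C. To keep the sketch linkable on an ordinary system it binds the hypothetical source name string_length to the C library's strlen rather than to "SYS$READ"; it also assumes a target (such as ELF) where C symbols carry no leading underscore.

```c
#include <assert.h>
#include <stddef.h>

/* Source-level name: string_length.  Object-level name: "strlen".
 * The __asm__("...") label is a GCC/Clang extension; the label text is
 * emitted verbatim, so targets that prefix symbols with '_' would need
 * "_strlen" here instead. */
extern size_t string_length(const char *s) __asm__("strlen");

/* Hypothetical wrapper: every call below links against "strlen". */
size_t checked_length(const char *s)
{
    assert(s != NULL);
    return string_length(s);
}
```

With this declaration the source never mentions strlen, yet the object file refers only to that symbol, which is the split between the two names that the post describes.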
geoff@desint.UUCP (Geoff Kuenning) (12/18/84)
In article <422@gitpyr.UUCP> robert@gitpyr.UUCP (Robert Viduya) writes: > extern sys_read "SYS$READ" (); > >would tell the compiler/linker to call the entrypoint "SYS$READ" whenever >"sys_read" was called in the source. Bravo for this idea! The syntax, however, conflicts with the "old-style initializer" syntax. Anybody got ideas for a parseable syntax? -- Geoff Kuenning ...!ihnp4!trwrb!desint!geoff
garys@bunker.UUCP (12/19/84)
> In article <422@gitpyr.UUCP> robert@gitpyr.UUCP (Robert Viduya) writes: > > > extern sys_read "SYS$READ" (); > > > >would tell the compiler/linker to call the entrypoint "SYS$READ" whenever > >"sys_read" was called in the source. > > Bravo for this idea! The syntax, however, conflicts with the "old-style > initializer" syntax. Anybody got ideas for a parseable syntax? > -- > > Geoff Kuenning OK, how about: extern sys_read() = "SYS$READ"; Gary Samuelson
joe@fluke.UUCP (Joe Kelsey) (12/20/84)
>From: henry@utzoo.UUCP (Henry Spencer)
>'$' is, alas, commonly used as an escape from C, for example in yacc.
I ported yacc to VMS, where identifiers can contain $, and encountered absolutely no problems! yacc only uses identifiers of the form $n, where n is some small number, so as long as you stay away from identifiers like that, there is no problem. I see no real reason to lose sleep over the incompatibility with yacc, as there is no problem in practice.
/Joe
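The rule of thumb in this post can be written down as a small check. The sketch below assumes a yacc that rewrites any "$$" or '$'-plus-digit sequence it finds in an action, wherever it occurs; it ignores other forms such as typed references. The function name clashes_with_yacc is hypothetical.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>

/* Return 1 if ident contains "$$" or '$' followed by a digit, i.e. a
 * sequence that yacc (under the assumption above) would rewrite inside
 * an action; return 0 otherwise. */
int clashes_with_yacc(const char *ident)
{
    assert(ident != NULL);
    for (; *ident != '\0'; ident++) {
        if (ident[0] == '$' &&
            (ident[1] == '$' || isdigit((unsigned char)ident[1])))
            return 1;
    }
    return 0;
}
```

Under that assumption a VMS-style name such as sys$read is safe, while names like tmp$2 or a$$b are the ones to stay away from.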
tim@cmu-cs-k.ARPA (Tim Maroney) (12/20/84)
Re the "new" idea for resolving lexical mismatches between an OS and C by introducing some construct such as
    OSid text_limit "text$space$top";
(making the compiler put out the identifier "text$space$top" [illegal in C] in place of all occurrences of "text_limit"), I am very much in favor of this idea. I would like to point out, though, that I suggested this last Spring on this very newsgroup and was met by thundering silence. It is not a "new" idea at all.
-=-
Tim Maroney, Carnegie-Mellon University Computation Center
ARPA: Tim.Maroney@CMU-CS-K uucp: seismo!cmu-cs-k!tim
CompuServe: 74176,1360 audio: shout "Hey, Tim!"
"Remember all ye that existence is pure joy; that all the sorrows are but as shadows; they pass & are done; but there is that which remains." Liber AL, II:9.
sde@Mitre-Bedford (12/21/84)
Actually, HP Pascal has had that feature, with slightly different syntax, for years, and for exactly that reason.
mab@druxp.UUCP (BlandMA) (12/22/84)
There have been several suggestions to resolve the $ identifier problem by adding new constructs to the language. Perhaps a better (at least different) place would be to have the linker/loader (whatever you call it) do the name translation. I seem to recall an IBM link editor capable of something like this. For example, to make a call to the sys$whatever function, you would write your code using a syntactically valid C identifier, such as sys_whatever. The load command would include some option to resolve the symbol "sys_whatever" to "sys$whatever". There would probably be a file on the system somewhere that contains the commonly used translations for that particular system. I would prefer this solution over changing every compiler for every language that currently doesn't allow $ in identifiers. -- Alan Bland {ihnp4, allegra}!druxp!mab AT&T Information Systems Labs, Denver
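The translation step described here amounts to a table lookup performed on each external symbol. A minimal sketch follows, assuming a flat table of (C name, link name) pairs; the struct, the table contents, and the function link_name_for are all hypothetical and are not taken from any real linker.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* One entry of a hypothetical system-wide translation file. */
struct name_map {
    const char *c_name;     /* name as written in the C source */
    const char *link_name;  /* name the linker should resolve it to */
};

static const struct name_map translations[] = {
    { "sys_whatever", "sys$whatever" },
    { "sys_read",     "SYS$READ"     },
};

/* Return the link-level name for c_name, or c_name itself when the
 * table has no entry for it. */
const char *link_name_for(const char *c_name)
{
    size_t i;

    assert(c_name != NULL);
    for (i = 0; i < sizeof translations / sizeof translations[0]; i++) {
        if (strcmp(translations[i].c_name, c_name) == 0)
            return translations[i].link_name;
    }
    return c_name;
}
```

Names without an entry pass through unchanged, so ordinary C externals are unaffected.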
jack@vu44.UUCP (Jack Jansen) (12/26/84)
If a construct like
    extern sys_read "SYS$READ" ();
cannot be implemented as written because of the conflict with old-style initialization, why not use the "entry" keyword? It is still a reserved word (well, according to my K&R, at least), and I've never heard of an implementation using it. Besides that,
    extern sys_read entry "SYS$READ" ();
looks even more intelligible (to me).
--
Jack Jansen, {seismo|philabs|decvax}!mcvax!vu44!jack or ...!vu44!htsa!jack
david@ukma.UUCP (David Herron, NPR Lover) (12/28/84)
I've got some bad news. Using "entry" to mark external identifiers that have bad names seems (on the surface) to be a bad idea. However entry is not in the standard as a reserved word (this was as of Oct. 17). So it would have to be added. I have never heard of a C compiler that uses it. Are there any? David Herron.
kpmartin@watmath.UUCP (Kevin Martin) (12/28/84)
>How about: > > extern sys_read() = "SYS$READ"; > >Gary Samuelson That only works for functions... If the identifier is an object, this looks like an initializer. Kevin Martin, UofW Software Development Group.
kpmartin@watmath.UUCP (Kevin Martin) (12/28/84)
>For example, to make a call to the sys$whatever function, you would
>write your code using a syntactically valid C identifier, such as
>sys_whatever. The load command would include some option to resolve
>the symbol "sys_whatever" to "sys$whatever". There would probably be
>a file on the system somewhere that contains the commonly used translations
>for that particular system.
This still leaves the problem of getting at the UNcommonly-used funny-named variables. Perhaps the user has to supply another translation file for these...
>I would prefer this solution over changing every compiler for
>every language that currently doesn't allow $ in identifiers.
>Alan Bland
>{ihnp4, allegra}!druxp!mab
>AT&T Information Systems Labs, Denver
I would prefer being able to tell what is happening just from reading the C code. Having to search in (several) places to find which names map to what would be an ongoing cost, compared to the fixed cost of having the compiler translate the names. Having many symbols automatically mapped could easily cause external name clashes too...
A C compiler which allows re-naming like this could also be used to port long-name programs to systems with 6-character linkers with minimal effort.
It should be noted that '$' is not the only offending character. '.' is also popular, and there is *no way* of including it in C identifiers.
Kevin Martin, UofW Software Development Group.
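For the 6-character-linker case, a workaround that needs no compiler change is a header of preprocessor renames. It is not the compiler feature proposed in this post, and it cannot produce names containing '$' or '.' on a compiler that rejects them, but it does keep the mapping visible in the C source. A minimal sketch, with all function and macro names hypothetical:

```c
#include <assert.h>

/* Hypothetical rename header: two long source names that agree in their
 * first 6 characters are mapped to distinct short external names. */
#define initialize_symbol_table  isymtb
#define initialize_string_table  istrtb

int initialize_symbol_table(void) { return 1; }  /* linker sees "isymtb" */
int initialize_string_table(void) { return 2; }  /* linker sees "istrtb" */

/* The source keeps using the long, readable names throughout. */
int setup(void)
{
    int tables = initialize_symbol_table() + initialize_string_table();

    assert(tables == 3);    /* sanity check for this sketch only */
    return tables;
}
```

The cost is that a debugger or link map shows the short names, so the header has to stay close to the code it renames.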
henry@utzoo.UUCP (Henry Spencer) (12/30/84)
> ... However entry is not in the standard as > a reserved word (this was as of Oct. 17). So it would > have to be added. > > I have never heard of a C compiler that uses it. Are there > any? Actually, "put back" is more accurate than "added". It has been a "reserved for future use" keyword in C for a long time, but the ANSI committee decided (in my opinion, correctly) that it did not seem to have a future and thus could be deleted. As far as I know, nobody has ever done anything with it. -- Henry Spencer @ U of Toronto Zoology {allegra,ihnp4,linus,decvax}!utzoo!henry
bsa@ncoast.UUCP (Brandon Allbery (the tame hacker on the North Coast)) (01/02/85)
How about this construct: extern sys_read() : "SYS$READ"; About the only possible clash is with bitfields, and this wouldn't be useful where bitfield declarations are legal (i.e. inside structures). It also wouldn't eat up the `entry' keyword. --bsa -- Brandon Allbery @ decvax!cwruecmp!ncoast!bsa (..ncoast!tdi1!bsa business) 6504 Chestnut Road, Independence, Ohio 44131 (216) 524-1416 Who said you had to be (a) a poor programmer or (b) a security hazard to be a hacker?
jack@vu44.UUCP (Jack Jansen) (01/03/85)
> How about this construct:
>
> extern sys_read() : "SYS$READ";
>
I think that it should be closer to the identifier. It is the identifier that is modified, after all. If you don't want to use 'entry', then use ':', but keep it next to the identifier, something like
    extern sys_read:"SYS$READ"();
This is also much more readable in the case of *defining* funny names, which hasn't been discussed yet, but which can be just as useful, for instance when you're writing a library and you want to hide your internal routines.
By the way, I'm still in favor of using 'entry' instead of Yet Another Funny Char. This usage won't even make the entry symbol unusable, since the use of entry for defining entrypoints is presumably somewhere inside the *code*, not the declarations.
--
Jack Jansen, {seismo|philabs|decvax}!mcvax!vu44!jack or ...!vu44!htsa!jack
If *this* is my opinion, I wasn't sober at the time.
elbaum@reed.UUCP (Daniel Elbaum) (01/06/85)
. Actually, OASIS systems use this keyword in their C compilers. It's kind of useful, since you can use it to enter values at compile time, thereby aiding debugging. -Daniel Elbaum {decvax, ucbvax, pur-ee, uw-beaver, masscomp, cbosg, mit-ems, psu-cs, uoregon, orstcs, ihnp4, uf-cgrl}!tektronix teneron----\ ogcvax------+-!reed!elbaum muddcs-----/ cadic-----/ oresoft--/ grpwre--+