[comp.lang.c] MAJOR ANSI C FLAW

mc68020@gilsys.UUCP (Thomas J Keller) (10/06/87)

   Alright, so here I am reading an article in Dr. Dobbs on the new ANSI C
standard.  Things are looking very nice.  Function prototyping looks WONDERFUL,
all identifiers with a leading '_' are reserved, 31 character double case
identifier significance for INTERNAL names...and:

	   ******  S  I  X  ******  character double case external names!!!!!!


    OK, I'll admit, these are the MINIMUM requirements.  Even so, I note that
the minimums for almost everything involved are quite large (31 char internals,
257 cases in switch, 509 chars/logical source line, etc., etc.), while the
external name minimum is the same old brain-damaged, hair-pulling, mind-
wrenching 6.
    
    I want to know which compiler vendor had enough power to force this pile
of BULL**T down the committee's throats??????

    So quite a few vendors opt for the minimums, while others opt for more
reasonable numbers.  There goes your portability!

    Argument:  but if you stick to the minimums, your code will ALWAYS be
portable across ANSI conformant compilers.

    Response:  true enough.  If I stick to a 6 char external minimum, my coding
style and readability will be severely compromised.  Ever try to provide large
numbers of MEANINGFUL identifier names with a 6 char limit????

    I defy ANYONE to offer a rational, realistic explanation for this 
abomination.  This decision was obviously political, as there *IS* no rational
technical reason for such a crock.

    I wish I knew how to contact the committee and propose that they fix their
screwup.  Maybe someone who reads this will know, and pass it on.
-- 
Tom Keller 
VOICE  : + 1 707 575 9493
UUCP   : {ihnp4,ames,sun,amdahl,lll-crg,pyramid}!ptsfa!gilsys!mc68020

mc68020@gilsys.UUCP (Thomas J Keller) (10/08/87)

In article <1132@gilsys.UUCP>, mc68020@gilsys.UUCP (Thomas J Keller) writes:
> 
> 	   ******  S  I  X  ******  character double case external names!!!!!!
> 

   Sorry about that!  Make that ******  S  I  X  ****** character SINGLE case
external names!  (even *MORE* of a botch!)

-- 
Tom Keller 
VOICE  : + 1 707 575 9493
UUCP   : {ihnp4,ames,sun,amdahl,lll-crg,pyramid}!ptsfa!gilsys!mc68020

dhesi@bsu-cs.UUCP (Rahul Dhesi) (10/09/87)

In article <1132@gilsys.UUCP> mc68020@gilsys.UUCP (Thomas J Keller) writes:
>	   ******  S  I  X  ******  character [single] case external names!!!!!!
...
>    I defy ANYONE to offer a rational, realistic explanation for this 
>abomination.  This decision was obviously political, as there *IS* no rational
>technical reason for such a crock.

No modern system requires such short external identifiers.  You can
safely ignore this limit.  Consider 8 significant characters to be a
reasonable limit.  Even that's taking things a bit too far, considering
that for $61 I can get a largely-ANSI-conformant C compiler on my
MS-DOS machine, complete with linker, that supports far longer external
identifiers.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

swh@hpsmtc1.HP.COM (Steve Harrold) (10/10/87)

Re: 6-character external names

If having only the first 6 characters of external names recognized
is too restrictive for your choice of "meaningful" names, 
why not use the #define technique?  For example:

#define    longandverbose  simple

Then everyone can have their cakes and eat them.

---------------------
Steve Harrold			...hplabs!hpsmtc1!swh
				HPG200/13
				(408) 447-5580
---------------------

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/10/87)

In article <1246@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>No modern system requires such short external identifiers.

Excuse me, but if that were true you can bet that X3J11 would not
have imposed the restriction!  We don't like it either, but it IS
necessary for some environments.

>Consider 8 significant characters to be a reasonable limit.

That's one too many for PDP-11 UNIX, or for most pre-flexnames UNIXes.

There is nothing in X3.159-198x that PROHIBITS a programmer from
exploiting support for longer, case-sensitive external names; it
just doesn't guarantee that such code will port painlessly to
all C implementations.  That's simply a fact of life.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/10/87)

In article <1132@gilsys.UUCP> mc68020@gilsys.UUCP (Thomas J Keller) writes:
>    I want to know which compiler vendor had enough power to force this pile
>of BULL**T down the committee's throats??????

What makes you think that's what happened?  When X3J11 realized that
this was a real issue for several systems, they voluntarily, however
reluctantly, acknowledged this very real constraint when setting
implementation minimums.

>    So quite a few vendors opt for the minimums, while others opt for more
>reasonable numbers.  There goes your portability!

Only those vendors who have such a restriction are likely to impose
it.  This includes those who produce C implementations but have no
control over the linker that will be used.

>    Response:  true enough.  If I stick to a 6 char external minimum, my coding
>style and readability will be severely compromised.  Ever try to provide large
>numbers of MEANINGFUL identifier names with a 6 char limit????

First of all, I routinely program within this constraint; it's slightly
annoying but workable.  UNIX programmers had a 7-character external
identifier significance constraint until recently, when Berkeley added
support for "flexnames" that AT&T eventually picked up for System V.

Secondly, there are tricks for mapping longer names to unique shorter
ones when necessary (which will be the case in only a few environments).
Although such mapping generally hampers debugging, in such environments
you probably won't find very good debuggers anyway.

>    I defy ANYONE to offer a rational, realistic explanation for this 
>abomination.  This decision was obviously political, as there *IS* no rational
>technical reason for such a crock.

I don't know what you consider rational.  X3J11 thought it was rational
to acknowledge facts of reality and accommodate them in the standard.

Note that longer external identifier significance is mentioned as a
"common extension".  Also, under "Future Language Directions", the
draft Standard states:

	Restriction of the significance of an external name to fewer
	than 31 characters or to only one case is an obsolescent
	feature that is a concession to existing implementations.

You may not realize it, but that is an extremely strong hint that
the committee deplores the existing situation and is warning that
it had better improve before the next revision of the Standard.
"Obsolescent" has definite meaning according to "standards language";
this allows removal of this constraint in the next revision.

I don't know what more could have realistically been done.

>    I wish I knew how to contact the committee and propose that they fix their
>screwup.  Maybe someone who reads this will know, and pass it on.

This has been discussed by the committee several times and reaffirmed
each time, although the wording has been "toughened up" about this.

P.S.  As usual, the above is all my opinion, not necessarily X3J11's.

dhesi@bsu-cs.UUCP (Rahul Dhesi) (10/10/87)

I wrote, referring to ANSI's 6-character limit on external identifiers
in portable C,

     No modern system requires such short external identifiers

and suggested that programmers use 8 characters as a working maximum.

In article <6543@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) 
writes:
>That's one too many for PDP-11 UNIX, or for most pre-flexnames UNIXes.

Doug Gwyn is factually correct, but my point is important too.  I don't
consider the quoted systems to be modern systems.  As technology moves
on, those with older architectures will be forced to balance the cost
(and benefits) of sticking with what they have against the cost (and
benefits) of investing in an upgrade.  The length of external
identifiers is just one small factor in that.  Programmers must ask
themselves how much programming convenience they want to give up (e.g.
convenience of long identifiers) in order to support users who have
consciously chosen not to move with the times.

It's great to have choices;  and you and I are free to do our own
cost/benefit analysis, and our conclusions may not be the same
as ANSI's.  Why, you could (horrors!) choose to use arrays larger
than 64 kilobytes, and leave us 80286 users in the lurch!  That's
what choice is all about.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

edw@ius1.cs.cmu.edu (Eddie Wyatt) (10/10/87)

In article <1252@bsu-cs.UUCP>, dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
> I wrote, referring to ANSI's 6-character limit on external identifiers
> in portable C,
> 
>      No modern system requires such short external identifiers
> 
> and suggested that programmers use 8 characters as a working maximum.
> 
> In article <6543@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) 
> writes:
> >That's one too many for PDP-11 UNIX, or for most pre-flexnames UNIXes.
> 
> Doug Gwyn is factually correct, but my point is important too.  I don't
> consider the quoted systems to be modern systems.  As technology moves
> on, those with older architectures will be forced to balance the cost
> (and benefits) of sticking with what they have against the cost (and
> benefits) of investing in an upgrade.  The length of external
> identifiers is just one small factor in that.  Programmers must ask
> themselves how much programming convenience they want to give up (e.g.
> convenience of long identifiers) in order to support users who have
> consciously chosen not to move with the times.
> 
> It's great to have choices;  and you and I are free to do our own
> cost/benefit analysis, and our conclusions may not be the same
> as ANSI's.  Why, you could (horrors!) choose to use arrays larger
> than 64 kilobytes, and leave us 80286 users in the lurch!  That's
> what choice is all about.
> -- 
> Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi


  I'm getting tired of your ranting.  Do you understand what PORTABLE
means?  Every C program written in pure ANSI C should
be able to run on any machine with an implementation of ANSI C.
The ANSI committee has shaped the standard so that the
language definition does not preclude it from being implemented
on a particular machine.  Hence, 6-letter ids have been
adopted so that the language can be implemented on some architectures,
in particular PDP-11s, where there is a limitation on the id length, PERIOD.
BTW those archaic beasties, the PDP-11s, still abound.

 You are not restricted to just using 6-character ids, but you
are only guaranteed that the first 6 characters will be
used to uniquely define the id.  

Doug's point still stands: if you want to use longer ids, go ahead;
most C implementations will/do have max id length > 6, but you
are warned that not all implementations do.

Geez, some people are just dense!!!

-- 

					Eddie Wyatt

e-mail: edw@ius1.cs.cmu.edu

rickf@ihlpa.ATT.COM (Fritz) (10/11/87)

In article <1050@ius1.cs.cmu.edu> edw@ius1.cs.cmu.edu (Eddie Wyatt) writes:
>
> You are not restricted to just using 6-characters id, but you
>are only guarenteed that the first 6-characters will be
>used to uniquely defined the id.  

This brings out why I think the issue is not as important as it is being
made out.  Identifiers can be as long as needed for clarity, but they must
be unique within the first six characters.

I teach a lot of intro C courses, and always tell my students to be unique
within the first three characters if possible. Not because of portability,
but simply for readability.  If you have two names in the same prog. that are
not unique within the first few characters there is a much stronger chance
that they will be confused by anyone reading the program.

Long names are fine, but uniqueness in the first N characters is not only
easier on the compiler (writer? ;-|) but on humans reading the program as
well. (Personally I am much more disenchanted with the lack of case
significance)

Rick Frankel
ihnp4!ihlpa!rickf

henry@utzoo.UUCP (Henry Spencer) (10/12/87)

> ... Consider 8 significant characters to be a
> reasonable limit.  Even that's taking things a bit too far, considering
> that for $61 I can get a largely-ANSI-conformant C compiler on my
> MS-DOS machine, complete with linker...

Oh goodie.  Will it also link object modules from XYZ Inc.'s Fortran, not
to mention all those binary-only libraries from ABC Co.?  Bear in mind
that a whole lot of C users are not basement hackers, who can afford to
throw out their existing software to upgrade their programming environment.

Also note that MS/DOS is not a fair example in such cases:  almost anything
is available for MS/DOS simply because of the size of the market.  This does
NOT generalize to other systems.

Please, let us not have another re-run of the six-character debate; the issue
is not new, I assure you.  Almost nobody likes the six-character nonsense,
but it is a political necessity.  Yes, "political":  standards are worthless
if they are not widely accepted, and acceptance is often a political issue.
-- 
"Mir" means "peace", as in           |  Henry Spencer @ U of Toronto Zoology
"the war is over; we've won".        | {allegra,ihnp4,decvax,utai}!utzoo!henry

henry@utzoo.UUCP (Henry Spencer) (10/12/87)

> ... If I stick to a 6 char external minimum, my coding
> style and readability will be severely compromised.  Ever try to provide large
> numbers of MEANINGFUL identifier names with a 6 char limit????

Oh, nonsense.  It's a pain, yes, but it's not that hard, unless you belong
to the school that thinks that "NumberOfCharactersInString" is somehow more
meaningful than "nchars".  Everything I code has external names unique in
the first six characters (note that you *can* use more, you just have to
ensure uniqueness).  Once in a long while it makes me grit my teeth and
use a less-than-ideal name.  Big deal.
-- 
"Mir" means "peace", as in           |  Henry Spencer @ U of Toronto Zoology
"the war is over; we've won".        | {allegra,ihnp4,decvax,utai}!utzoo!henry

meissner@dg-rtp.UUCP (Michael Meissner) (10/12/87)

In article <11480008@hpsmtc1.HP.COM> swh@hpsmtc1.HP.COM (Steve Harrold) writes:
| Re: 6-character external names
| 
| If using names of externals with only the first 6 characters being
| recognized, is too restrictive in your choice of "meaningful" names, 
| why not use the #define technique.  For example:
| 
| #define    longandverbose  simple
| 
| Then everyone can have their cakes and eat them.

And then you run the risk of having the preprocessor run out of memory.  One
way that people haven't mentioned is to group all of the external variables
(and maybe pointers to functions) within a structure.  That way, you group
things together, and you have less clutter in the external name space.
-- 
Michael Meissner, Data General.		Uucp: ...!mcnc!rti!xyzzy!meissner
					Arpa/Csnet:  meissner@dg-rtp.DG.COM

chip@ateng.UUCP (Chip Salzenberg) (10/12/87)

In article <11480008@hpsmtc1.HP.COM> swh@hpsmtc1.HP.COM (Steve Harrold) writes:
>
>#define    longandverbose  simple
>
>Steve Harrold			...hplabs!hpsmtc1!swh

This is a good technique, but it's even better if the #define is conditional:

#ifdef BRAINDAM /* AGED :-) */
#define longandverbose  terse
#endif

Note that this method is still a royal pain when debugging.

-- 
Chip Salzenberg         "chip@ateng.UUCP"  or  "{uunet,usfvax2}!ateng!chip"
A.T. Engineering        My employer's opinions are not mine, but these are.
   "Gentlemen, your work today has been outstanding.  I intend to recommend
   you all for promotion -- in whatever fleet we end up serving."   - JTK

breck@aimt.UUCP (Robert Breckinridge Beatie) (10/13/87)

In article <6543@brl-smoke.ARPA>, gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:
> ... We don't like it either, but it IS
> necessary for some environments.

Hmmm...  why is a 6 character limit necessary in any environment?  I'm not
necessarily disagreeing with you.  But I am curious.

> There is nothing in X3.159-198x that PROHIBITS a programmer from
> exploiting support for longer, case-sensitive external names; it
> just doesn't guarantee that such code will port painlessly to
> all C implementations.  That's simply a fact of life..

Well, isn't portability the reason we need a standard in the first place?
A standard that programmers are *strongly* tempted to disregard seems to be
of little, if any, use.

In addition, this standard actually seems to be a step backward at least in
this regard.  As you pointed out (sorry for deleting the lines) even PDP unix
C compilers supported a 7 character external name.  Why are we now being limited
even more severely?

I have to agree with the original poster on this one.  It's really hard to
pack much meaning into 6 characters of variable name.  Or is this just a way
of getting programmers to restrict their use of global variables?  :-)
-- 
Breck Beatie
uunet!aimt!breck

chris@mimsy.UUCP (Chris Torek) (10/13/87)

In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>Hmmm...  why is a 6 character limit necessary in any environment?

Because some vendors like FORTRAN---or should I say,

	PROGRAM BECAUS
	REAL SOME
	COMMON /VENDOR/ SOME
	...
	INTEGER FUNCTION LIKE(FORTRA)
	INTEGER FORTRA
	...

These vendors refuse to go through (or put their users through) the
trauma of converting compilers, linkers, debuggers, and such.

`If it was good enough for the 1950s, it is good enough for you'

(bleah)
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

rbutterworth@orchid.UUCP (10/13/87)

Yes, the 6 case-insignificant character limit on external names
is a real pain.

However, it is not any compiler-writer that insists on this limit,
it is existing operating systems.  Many systems have been around
for a long time with existing standards for libraries with this
limit built into it.

And while it may seem unreasonable to tell someone that is going
to write a C compiler, C library, or C program that they must
be aware of this limit if they want to run on a particular
operating system, it would be even more unreasonable to tell
the vendor and users of an operating system that they are
going to have to change all their existing compilers, loaders,
libraries, and programs to take a different format of library
simply to accommodate an "improved" version of a C compiler.

I'm sure the typical reaction would be: "Forget it.  We'll
stick with the compilers we have now.  Who needs that crap?"

Imagine you've written a new improved BASIC interpreter and
you want to sell it to IBM.  You go up to them and say:  "Here's
our wonderful new BASIC.  All you have to do if you want to use
it is change your definition of card-image file from 80-columns
to be 96-columns instead, change all your existing programs
and libraries accordingly, and let all your customers know that
they have to change too if they want their programs to continue
working."  Good luck.

reggie@pdn.UUCP (George W. Leach) (10/13/87)

In article <8754@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) writes:
> > ... If I stick to a 6 char external minimum, my coding
> > style and readability will be severely compromised.  Ever try to provide large
> > numbers of MEANINGFUL identifier names with a 6 char limit????
> 
> Oh, nonsense.  It's a pain, yes, but it's not that hard, unless you belong
> to the school that thinks that "NumberOfCharactersInString" is somehow more
> meaningful than "nchars".  Everything I code has external names unique in
> the first six characters (note that you *can* use more, you just have to
> ensure uniqueness).  Once in a long while it makes me grit my teeth and
> use a less-than-ideal name.  Big deal.

       These types of issues tend to polarize many people into the extremist
camps.  On one hand, as Henry pointed out, we have those who want unrestricted
length in naming.  Then we have those who take the standard to a ridiculous
limit, by keeping everything to a six character limit!!!  I have seen it!!!
Basically, people fall back on their old Fortran programming styles for 
variable names.  A little common sense will tell you that a happy medium is
what is called for.  People often react the same way on the issues of style
and comments as well.  Remember, too much is often just as bad as (and usually
worse than) too little.


       Don't complain about externals having to differ within the first six
characters.  I have a case in which the limit is eight, and I have some code
from the net that failed to even ensure externals were unique within the first
eight!!!!  I'm now in the process of renaming some variables to get around
this.  Thank God this is not a huge piece of software.  (Don't get me wrong,
this is still better than sitting down and writing the code from scratch, and
it is only used as a case study.)


-- 
George W. Leach					Paradyne Corporation
{gatech,codas,ucf-cs}!usfvax2!pdn!reggie	Mail stop LF-207
Phone: (813) 530-2376				P.O. Box 2826
						Largo, FL  34649-2826

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/13/87)

In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>Hmmm...  why is a 6 character limit necessary in any environment?

Linker restriction.

daveh@cbmvax.UUCP (Dave Haynie) (10/13/87)

in article <8754@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) says:
> Keywords: external names, length, bogosity to the max
> 
>> ... If I stick to a 6 char external minimum, my coding
>> style and readability will be severely compromised.  Ever try to provide large
>> numbers of MEANINGFUL identifier names with a 6 char limit????
> 
> Oh, nonsense.  It's a pain, yes, but it's not that hard, unless you belong
> to the school that thinks that "NumberOfCharactersInString" is somehow more
> meaningful than "nchars".  Everything I code has external names unique in
> the first six characters (note that you *can* use more, you just have to
> ensure uniqueness).  Once in a long while it makes me grit my teeth and
> use a less-than-ideal name.  Big deal.
> -- 
It's more than just a pain, it's downright undoable in many cases.  Like, for
instance, when your operating system has 500-2000 meaningful names of its
own, and you're expecting all these names to be meaningful AND unique to 6
characters, while at the same time you on your own must come up with new 6
character global names that don't conflict with other names used in the 
system.  Can you say 1975?  No modern computer system imposes such limitations,
and no modern language should impose them.  Especially C, which has problems
in this area that more modern languages don't.  For example, "nchars" may be
just as meaningful as "NumberOfCharactersInString", providing that strings are
the only thing in which the number of characters has any meaning.  But as soon
as you add other data types that also have a number of characters rating, you're
screwed.  C++ could use "nchars" for every data type, but in C you'd have to
do something like snchars, bnchars, lnchars, qnchars, etc. instead of
something more readable like "LengthString", "LengthBuffer", "LengthLine",
"LengthQueue", etc.  And really, if the spec states that 6 chars minimum are
significant, everyone will support whatever they would anyway, and systems that
have a maximum of 6 characters are out of luck.  If they state that 6 
significant characters is the standard, most compilers will default to the
normal maximum of that OS/machine, and offer 6 as a compiler option, just as
many C compilers today offer an 8 significant character mode.  So systems 
that only work with 6 are again out of luck.  Regardless of what politics
says, 6 characters just aren't enough for most systems, and no one's going to
use them unless they have to.  So those political folks are only fooling 
themselves.  A more realistic approach would be to say to these guys, 
"the standard is A_LIVABLE_MINIMUM_LENGTH"; if you don't like it, wake up
and smell the toast burning, pal.  This isn't 1975!

Hell, even my C64 C-linker handles more than 6 significant characters!

> "Mir" means "peace", as in           |  Henry Spencer @ U of Toronto Zoology
> "the war is over; we've won".        | {allegra,ihnp4,decvax,utai}!utzoo!henry
-- 
Dave Haynie     Commodore-Amiga    Usenet: {ihnp4|caip|rutgers}!cbmvax!daveh
   "The B2000 Guy"              PLINK : D-DAVE H             BIX   : hazy
    "Computers are what happen when you give up sleeping" - Iggy the Cat

rjg@ruby.TEK.COM (Rich Greco) (10/13/87)

The real shame here is not that you will restrict your own portability
if you violate the six character limit, but rather what vendors of
software and authors of other standards must do.

Look at ANSI GKS's Fortran binding.  It has a binding to the Fortran
subset, as well as one to the full language.

This heritage is continued with the C binding.  All functions in the C
binding of GKS will have six character names.

This will be continued for CGI, PHIGS, and the Window Management
function, because in authoring a standard you cannot require things
not required by a peer standard (for example 7 character variable
names).

This will also be compounded by software vendors who must determine how
portable they want software libraries sold by their company to be.  A
great number of the people will elect to sell libraries with short
function names.

This is indeed a real shame since descriptive rather than cryptic
variable names is something I have always felt was a requirement for a
high level language.

lwv@n8emr.UUCP (Larry W. Virden) (10/13/87)

Please note that there are at least TWO types of programs which will be
able to be written once ANSI C is approved and implemented; those which are
ANSI compatible and those which are not.  For companies which REQUIRE all
software to be ANSI compatible (note that the Government of the USA will most
likely be one of these types of 'companies') one will be restricted to using
6 character names, even if all machines in the environment to be implemented
can support flexnames.  This is why I personally would have rather seen the
6 character name issue placed somewhere other than in the standard itself.  But
then, perhaps that would have weakened the entire issue too much.  It is a very
real issue for people who intend to write ANSI C compatible programs.

Note: are there going to be folks developing software which will analyze a
piece of C code to indicate whether or not it conforms to ANSI C?  Is this
a required feature of the ANSI C standard (that the compiler indicate all
deviations from the standard)?  Just curious.
-- 
Larry W. Virden	 75046,606 (CIS)
674 Falls Place, Reynoldsburg, OH 43068 (614) 864-8817
cbosgd!n8emr!lwv HAM/SWL BBS (HBBS) 614-457-4227.. 300/1200 bps
We haven't inherited the world from our parents, but borrowed it from our children.

mason@Pescadero.stanford.edu (Tony Mason) (10/14/87)

In article <8992@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>>Hmmm...  why is a 6 character limit necessary in any environment?
>
>`If it was good enough for the 1950s, it is good enough for you'

It wasn't so long ago that that limitation was very real.  In Version 7 (not
really the 50's) the linker limited a symbol name to no more than eight ASCII
characters.  With one for the '\0' at the end, and one for the '_' at the
beginning (remember this was a fix so variable names wouldn't clash with
register names in the UNIX assembler), that leaves six for the name itself.
All on the PDP.
A machine wherein we have *space* optimizing compilers and programs still
slowing down performance (remember the LEX presentation at USENIX last year?)
The folks at Murray Hill decided it wasn't worth the effort to build a
variable-length string handling system (and its associated complications.)

Yet we now rail against them as we work on environments where we have demand-
paged virtual memory machines, with 4MB of memory (or more, like my uVAX
workstation with 16MB.)  Why did the ANSI committee leave this temporary
limitation in?   My guess would be it had little to do with FORTRAN compilers
but rather with machines that are still in use, still running UNIX programs,
and still doing so in a limited address space.  PDP's, for all their
limitations, are still workhorses.  If you don't want to write programs for
those people still using PDP's, then you have a great chance of getting off 
with ignoring this restriction.  Of the machines I work on, I cannot think 
of any besides a PDP that have this limitation (Sun doesn't, my VAX doesn't, 
my PC doesn't, MAC's don't, and Amiga's don't.)

As for capitalization being meaningless, there are (gasp) systems which
ignore case - another hold-over from bygone limitations imposed on
programmers.  In the MS-DOS world, the old linkers were case insensitive.
Many older languages were also case insensitive (Pascal, for example.)  How
many people would have Pascal programs die if case sensitivity were suddenly
forced upon them.  For a vendor with a multi-language product selection, such
a choice would be impossible to balance.  Microsoft has taken a reasonable
approach (use of a switch to turn on case sensitivity on MS-DOS machines.)

To complain to the ANSI standards committee because they adopted (and in a
rather limited fashion as was pointed out earlier) a standard allowing
compatibility with the *original* system on which C was developed seems a bit
pedantic.  I salute them for being able to balance the interests of so many
people at all.

Tony Mason
Distributed Systems Group
Stanford University
mason@pescadero.stanford.edu

dag@chinet.UUCP (Daniel A. Glasser) (10/14/87)

I've run into the opposite problem -- I've taken code from the net and
ported it to systems that used long names (> 8 chars) and discovered
that the code would not work because of variant spellings in the latter
chars.

The moral?  No matter how many characters your system allows, make sure
the names match on any extra chars, since the restriction may be removed.

					Daniel Glasser
		One of those things that goes "BUMP!!!(ouch!)" in the night.

wong@llama.rtech.UUCP (J. Wong) (10/14/87)

In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>
>Hmmm...  why is a 6 character limit necessary in any environment?  I'm not
>necessarily disagreeing with you.  But I am curious.
>

IBM linker/loaders recognize only 6 characters of significance.

>
>Well, isn't portability the reason we need a standard in the first place?
>A standard that programmers are *strongly* tempted to disregard seems to be
>of little, if any, use.
>
>In addition, this standard actually seems to be a step backward at least in
>this regard.  As you pointed out (sorry for deleting the lines) even PDP unix
>C compilers supported a 7 character external name.  Why are we now being limited
>even more severely?
>
The point is that if you expect to port a program to an IBM mainframe,
you'd better have all your external names unique within 6 characters.

				J. Wong		ucbvax!mtxinu!rtech!wong

****************************************************************
You start a conversation, you can't even finish it.
You're talking alot, but you're not saying anything.
When I have nothing to say, my lips are sealed.
Say something once, why say it again.		- David Byrne

peter@sugar.UUCP (Peter da Silva) (10/14/87)

How do you intend to implement longer external names when the system linker
only supports six character single case? You can write your own linker,
I suppose, and watch all your potential customers stay away in droves
because they can't link 'C' to Fortran...
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/14/87)

In article <407@ruby.TEK.COM> rjg@ruby.UUCP (Rich Greco) writes:
>This heritage is continued with the C binding.  All functions in the C
>binding of GKS will have six character names.

This is either
	(a) necessary, because you expect there to be a GKS binding in
	    an environment that really cannot support longer external
	    identifiers;
or
	(b) unnecessary, because you don't want to worry about such
	    environments for GKS.

In the former case, all the wishing in the world is not going to help.

In the latter case, the GKS binding standard could require 31-character
case-sensitive external name uniqueness, if the people drawing up the
GKS binding standard felt like doing so.  This would be a requirement
above and beyond those specified for ANSI C, and it would not conflict
with the requirements for ANSI C.  This sort of thing has been done for
the POSIX (IEEE 1003.1) standard, which adds semantics to some of the
ANSI C-defined library functions.  You just have to be careful not to
require something which breaks conformance to the spec; longer extern
significance does not.

dan@rose3.Rosemount.COM (Dan Messinger) (10/14/87)

(no flames intended, I'm just answering his question)

In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>Hmmm...  why is a 6 character limit necessary in any environment?  I'm not
>necessarily disagreeing with you.  But I am curious.

Because the compiler is not in control of the situation.

Consider what has to be done to a C source file to get it to executable
binary.  What does the average Unix C compiler produce?  Assembly!  So
you need a new assembler that can handle long external names, too.

And what does the assembler produce?  An object file.  You need to redefine
the format of the object files to have space for longer identifiers.

The object files are fed to a linker.  You need a new linker that uses
the new object file format.   The linker also searches a bunch of libraries.
You need all new libraries that are in the object file format.

Now this linker was used with compilers other than the C compiler.  What
happens to those Pascal and Fortran programs?  Better keep the old linker
around for them.  And then there are the little assembly routines that you
wrote that are linked into those Pascal and Fortran programs.  Can't
use the new assembler on those, since the old linker can't handle the new
object file format.  Better keep the old assembler around too.  My, this
is getting messy!

And now it's time to debug your new C program.  Oops!  That fancy symbolic
debugger that you had can't handle these new long names.  And you want
to maintain your own library of routines?  Ar knows a little about the
object format so that it can make symbol tables for the linker (depends
on your Unix version).  The list goes on...

(Question:  The symbol section of the executable will need to be changed,
too.  Is there anything in the kernel that would choke if the format
of the symbol table changed?)

In summary, it takes far more than a new version of cc to get longer
external identifiers.  It's a matter of momentum (and profits).  There is
a LOT of software that needs to be changed on the ol' PDP-11 to increase
the external id size.  Do you know of any software houses that want to
develop all that software, knowing that there are a fixed number of PDP-11s
in the world, and that number is getting smaller?

>Well, isn't portability the reason we need a standard in the first place?
>A standard that programmers are *strongly* tempted to disregard seems to be
>of little, if any, use.

A standard is useless if it can't be attained at all.

Dan Messinger
dan@rose3.rosemount.com

daveb@geac.UUCP (Dave Collier-Brown) (10/14/87)

In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>Hmmm...  why is a 6 character limit necessary in any environment?
>

In article <8992@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
>Because some vendors like FORTRAN---or should I say,
>	PROGRAM BECAUS
>These vendors refuse to go through (or put their users through) the
>trauma of converting compilers, linkers, debuggers, and such.


  Actually this is *real* easy to change, if the supplier stays in
business for a few years:
   First, you change the standard compilers to generate a new linker
record-type, "long name", and the linker to use it.  You don't tell
*ANYONE* about this.
   Then you wait until more than 90% of the sites are using that
recent a release of the system, and announce that the next release
will drop support of the old linkage-record format.
   30% of the customers complain.
   25% are reassured that they're already using the new type.
    3% upgrade to a later linker (not necessarily the latest), and
your sales manager smiles...
   You announce the drop of the old format, but really only insert an
error-message routine to warn of the old format.
    7% complain about the funny messages.
   Your sales manager $miles again.

   You can now drop support for the old record type whenever sales
says it's advisable.

 --dave (this is part of "churning the base") c-b
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

mckeeman@endor.harvard.edu (William McKeeman) (10/14/87)

In article <1132@gilsys.UUCP> mc68020@gilsys.UUCP (Thomas J Keller) writes:
>
>	   ******  S  I  X  ******  character double case external names!!!!!!
>...
>    I wish I knew how to contact the committee and propose that they fix their
>screwup.  Maybe someone who reads this will know, and pass it on.

Yeah, it's a botch.  But it's not clear how x3j11 could have done
otherwise since the culprit is ancient loaders.  How about defensive
programming?  For everything you want to share with outsiders, have
#include "botch.h"
where you place defines like
#define MyLovelyHandpickedName X00001

So long as everyone uses your botch.h only the loader will have to
see those ugly short names.
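
A minimal sketch of the botch.h idea, not the poster's actual header.
The function names and the X0000n aliases are invented for illustration;
the only assumption is that every alias is unique within six monocase
characters.

```c
#include <ctype.h>
#include <stddef.h>

/* botch.h (hypothetical): map readable names onto short linker-safe ones.
 * Every alias below is unique in its first six characters, ignoring case. */
#define StringLengthOf    X00001
#define StringUpcaseCopy  X00002

/* After preprocessing, the linker sees this function as X00001. */
size_t StringLengthOf(const char *s)
{
    size_t n = 0;
    while (s[n] != '\0')
        n++;
    return n;
}

/* After preprocessing, the linker sees this function as X00002. */
void StringUpcaseCopy(char *dst, const char *src)
{
    while (*src != '\0')
        *dst++ = (char)toupper((unsigned char)*src++);
    *dst = '\0';
}
```

Every file that defines or calls these functions has to include the
header, and a link map or debugger will still show X00001 and X00002
rather than the readable names.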
/s/ Bill
W. M. McKeeman      mckeeman@harvard.edu
105 Aiken Computation Laboratory
Harvard University, Cambridge MA   02138

jwf@munsell.UUCP (Jim Franklin) (10/14/87)

The proposed ANSI C limit of 6 characters on external symbols is
completely bogus.  It will make a great deal of existing code
non-complying, and the thought of the federal government mandating
compliance with such brain damage is frightening.

The "justification" given for the limit is that ANSI C needs to be
supported by many vendors in order for it to succeed as a standard, and
there exists a variety of obsolete compilers and linkers that will only
support 6 character externals.

Suppose the ANSI C standard permitted a more "reasonable" limit, say 16
or 32 characters.  Companies with compilers/linkers that couldn't handle
this limit would have to fix them.  So what?  If I had to guess, most of
this obsolete software is owned by very large companies like IBM, DEC,
HP, PRIME, etc.  These companies have large resources ($$$ and people),
and they also sell a lot of systems/software. They therefore have a large
incentive to fix their obsolete software, and the resources to do it.

What is better?  Having a few companies with large resources and a large
potential payback fix a few compilers/linkers, or have the entire rest of
the C community rewrite many millions of lines of C?

Give me a break!  How hard is it to make an existing compiler/linker
accept longer names?  If it isn't just changing the line
        #define MAX_EXTERN_LEN  6
to
        #define MAX_EXTERN_LEN  32
then the company that wrote that software deserves to be screwed, not
the rest of the C community.
-----
{harvard!adelie,{decvax,allegra,talcott}!encore}!munsell!jwf

Jim Franklin, Eikonix Corp., 23 Crosby Drive, Bedford, MA 01730
Phone: (617) 663-2115 x415

guy%gorodish@Sun.COM (Guy Harris) (10/15/87)

> What is better?  Having a few companies with large resources and a large
> potential payback fix a few compilers/linkers, or have the entire rest of
> the C community rewrite many millions of lines of C?

I remain completely unconvinced that the advent of the ANSI C standard will
cause "millions of lines of C to be rewritten" to conform to the 6-character
one-case limitation.  The ANSI C standard *in no way* tightens the requirements
for fully-portable code; in fact, it substantially *loosens* those
requirements!

ANSI C compilers are required to consider internal symbols to be different if
they differ anywhere in the first 31 characters; code written to be portable to
older UNIX compilers will have no trouble with this.  They are required to
offer many library routines that may not be offered with current C
implementations; they are required to support not only "K&R C", but the
extensions subsequently made to it, such as "enum"s, "void", non-unique
structure names, etc., etc..

And, as for the 6-character one-case limits on external identifiers - if you
really want to be portable to *all* C implementations, you have to obey that
restriction *NOW*.

Any organization imposing C coding standards, or standards for C
implementations that they purchase, is completely free to relax restrictions
so that conforming, but not strictly conforming, applications meet the coding
standards, and is completely free to impose additional restrictions on C
implementations over and beyond the minimal restriction that they be minimally
ANSI C conformant.  They could, for example, say "no C implementation shall be
purchased unless it supports at least 7 character external names without case
restrictions", if they don't care about OS/360-and-successors.

As for the claim that "the entire rest of the C community" will end up
rewriting "millions of lines of C", I can cite several organizations of whom
I'd be willing to bet $10,000 *each* that they will not rewrite the bulk of the
C code in their UNIX implementations to conform to the minimal 6-character
one-case external name limitation (in fact, in many cases, they probably will
not rewrite a *single line* of C code to conform to that restriction!):

	American Telephone and Telegraph Corporation
	Apollo Computer
	Digital Equipment Corporation
	Hewlett-Packard
	International Business Machines Corporation
	Sun Microsystems
	...

(If your company isn't on this list, it's just because I didn't want to fill up
half this article with the list; if I thought people would actually take me up
on the bet, I'd lengthen the list considerably and then put in my order for an
F40....)
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/15/87)

In article <1298@wyszecki.munsell.UUCP> jwf@munsell.UUCP (Jim Franklin) writes:
>What is better?  Having a few companies with large resources and a large
>potential payback fix a few compilers/linkers, or have the entire rest of
>the C community rewrite many millions of lines of C?

This constraint on portable programming IS NOT AN INVENTION OF X3J11;
it is the sort of thing a programmer concerned with portability has
always had to take into account.  The half-million lines of portable
C code I've created, plus another half-million lines of less-portable
code that I personally maintain, do not rely on long external name
significance.  That's because I have been concerned with portability
all along, and resorted to practical measures rather than wishful
thinking.

I have absolutely no sympathy for programmers who ignored the clear
warning about this given in K&R and are now shouting that "X3J11 has
broken their (already non-portable) code".  No matter what the ANSI C
standard says, that aspect of your code remains exactly as portable
(or not) as it currently is.  The only practical effect of specifying
longer external name significance is to reduce the number of
implementations that will conform to the standard in the near future.
Because there are intentionally no COBOL-like "levels" to the ANSI C
standard, a programmer is in general unable to determine just WHAT a
non-conforming "C" implementation consists of; it could be
non-conforming in ways much more important than limited external
identifier significance.  X3J11 obviously decided it would be a bigger
disservice to the C programming community to limit the number of
environments that could be advertised as ANSI-conforming due to this
one issue than to support a larger range of conforming environments at
the expense of the portable programmer having to take proper precautions.

Your ideas about what is involved in improving existing linker and
object module format limitations are incredibly naive.  They WILL
eventually improve, but not overnight, and what the ANSI C standard
currently says applies about as much pressure as is realistically
possible in that direction.

breck@aimt.UUCP (Robert Breckinridge Beatie) (10/15/87)

In article <8992@mimsy.UUCP>, chris@mimsy.UUCP (Chris Torek) writes:
> In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
> >Hmmm...  why is a 6 character limit necessary in any environment?
> Because some vendors like FORTRAN---or should I say,

> 	[ Fortran-like explanation deleted]

> These vendors refuse to go through (or put their users through) the
> trauma of converting compilers, linkers, debuggers, and such.
> 
> `If it was good enough for the 1950s, it is good enough for you'
> (bleah)

You mean that this is to maintain something like compatibility with another
language?  And that language is Fortran?

Sigh!

Well, at least this nonsense is destined to be phased out of the ansi
standard.  At least some posters have given that impression.
-- 
Breck Beatie
uunet!aimt!breck

schwartz@gondor.psu.edu (Scott E. Schwartz) (10/15/87)

In article <1298@wyszecki.munsell.UUCP> jwf@munsell.UUCP (Jim Franklin) 
writes about some companies with short external name restrictions:

>HP, PRIME, etc.

Prime, at least for EPF type executables, has already fixed this.
Programs built with BIND support 32 character external names.
(At least it said this in the manual, I didn't bother to try it since our
Suns arrived at about that time :-)

If Prime can fix this, why can't everybody else?


-- Scott Schwartz            schwartz@gondor.psu.edu

arnold@apollo.uucp (Ken Arnold) (10/15/87)

In article <6543@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <1246@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>No modern system requires such short external identifiers.
>
>Excuse me, but if that were true you can bet that X3J11 would not
>have imposed the restriction!  We don't like it either, but it IS
>necessary for some environments.

Let's not overstate it.  It is *easiest* for some environments.  There
is always a work-around for the compiler writer.  It may be an ugly hack
(in fact, it almost certainly is), but it can be done.  It just may mean
that the global labels may not map trivially onto the names used by the
programmer, but, quite frankly, I'd rather the people living with older,
more limited systems have problems than giving the people working with more
modern systems limitations when trying to write portable code.


		Ken Arnold

flaps@utcsri.UUCP (10/16/87)

In article <2481@cbmvax.UUCP> daveh@cbmvax.UUCP (Dave Haynie) writes:
>For example, "nchars" may be
>just as meaningful as "NumberOfCharactersInString", providing that strings are
>the only thing in which the number of characters has any meaning.  But as soon
>as you add other data types that also have a number of characters rating,
>you're screwed...
>you'd have to
>do something like snchars, bnchars, lnchars, qnchars, etc. instead of
>something more readable like "LengthString", "LengthBuffer", "LengthLine",
>"LengthQueue", etc.

Umm... it's much better style in general to use variables that differ
initially, such as "StringLength", "BufferLength", "LineLength",
"QueueLength", etc.  And these are indeed unique to 6 characters.

(Disclaimer: I don't like six char externals, but at least ansi isn't
adding to the existing language here like they did with function
prototypes.)

ajr

rjg@ruby.TEK.COM (Rich Greco) (10/16/87)

In responding to my article gwyn@brl-smoke.ARPA (Doug Gwyn ) writes:

>  You just have to be careful not to
>require something which breaks conformance to the spec; longer extern
>significance does not.

This conflicts with my understanding, and clearly conflicts with the
guidelines followed by the X3H3 bindings committee.  They are unable to
require anything above the minimal conforming language definition for
the bindings which come out of their committee.

Anyone from X3H3 on the net who can clarify this?

daveh@cbmvax.UUCP (Dave Haynie) (10/16/87)

in article <2997@husc6.UUCP>, mckeeman@endor.harvard.edu (William McKeeman) says:
> Keywords: external names, length, bogosity to the max
> 
> In article <1132@gilsys.UUCP> mc68020@gilsys.UUCP (Thomas J Keller) writes:
>>	   ******  S  I  X  ******  character double case external names!!!!!!
>>    I wish I knew how to contact the committee and propose that they fix their
>>screwup.  Maybe someone who reads this will know, and pass it on.
> Yeah, it's a botch.  But it's not clear how x3j11 could have done
> otherwise since the culprit is ancient loaders.  How about defensive
> programming?  For everything you want to share with outsiders, have
> #include "botch.h"
> where you place defines like
> #define MyLovelyHandpickedName X00001

Instead of having to deal with ugly things like this (what happens during
debugging?), I claim they should have set a reasonable minimum number of
characters.  Maybe 32 or so, maybe no limits at all.  We are talking about
a language that doesn't yet exist.  Meaning that everyone who wants to use
it will have to bring up a new|modified compiler on their systems.  While
they're bringing up new software, they have an ideal opportunity to FIX
their brain damaged linkers.  If a $200 home computer can get this right,
they certainly shouldn't have any problem with $500,000 mainframes.

> /s/ Bill
> W. M. McKeeman      mckeeman@harvard.edu
> 105 Aiken Computation Laboratory
> Harvard University, Cambridge MA   02138
-- 
Dave Haynie     Commodore-Amiga    Usenet: {ihnp4|caip|rutgers}!cbmvax!daveh
   "The B2000 Guy"              PLINK : D-DAVE H             BIX   : hazy
    "Computers are what happen when you give up sleeping" - Iggy the Cat

peter@sugar.UUCP (Peter da Silva) (10/16/87)

How would you guys feel about a set of global names in a multi million line
software project with a 4-character limit? 6 characters is abominable but
(alas) unavoidable.
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

peter@sugar.UUCP (Peter da Silva) (10/16/87)

>                                  Of the machines I work on, I cannot think 
> of any besides a PDP that have this limitation (Sun doesn't, my VAX doesn't, 
> my PC doesn't, MAC's don't, and Amiga's don't.)

Any mainframe you can name is likely to have this limitation. Why? It's
simple... you can fit 6 uppercase letters into a 36-bit word... so why not
take advantage of this and stick all your symbols in words (yes, we know why
NOW but these systems were designed in the '60s).

Also, the PDP-11 under UNIX has a *7* character limitation (there was not, as
you implied, a null included at the end). Under RSX it has a 6 character
limitation because they used a character encoding called Radix-40 that allowed
you to fit 6 uppercase letters into a 32-bit dword.
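
A minimal sketch of that packing, assuming the usual PDP-11 Radix-50
character set (space, A-Z, '$', '.', 0-9, forty codes in all) and an
ASCII-like host where the letters are contiguous.  The function names
are invented for illustration.  Three characters fit in one 16-bit word
because 40*40*40 = 64000 is less than 65536, so six characters take two
words.

```c
#include <stdint.h>

/* Map one character to its Radix-50 code (0..39), or -1 if it has none.
 * Code 29 is left unassigned in this sketch. */
static int rad50_code(char c)
{
    if (c == ' ') return 0;
    if (c >= 'A' && c <= 'Z') return c - 'A' + 1;
    if (c == '$') return 27;
    if (c == '.') return 28;
    if (c >= '0' && c <= '9') return c - '0' + 30;
    return -1;
}

/* Pack a name of at most six characters into two 16-bit words.
 * Short names are padded with spaces.  Returns 0 on success, -1 if the
 * name is too long or contains a character with no code (for example
 * any lower-case letter); out[] may be partly written on failure. */
int rad50_pack6(const char *name, uint16_t out[2])
{
    int word, i;

    for (word = 0; word < 2; word++) {
        unsigned value = 0;
        for (i = 0; i < 3; i++) {
            char c = (*name != '\0') ? *name++ : ' ';
            int code = rad50_code(c);
            if (code < 0)
                return -1;
            value = value * 40 + (unsigned)code;
        }
        out[word] = (uint16_t)value;
    }
    return (*name == '\0') ? 0 : -1;
}
```

Packing "PRINTF" this way fills both words exactly; a seventh character,
or any lower-case letter, simply has nowhere to go.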
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

dave@sdeggo.UUCP (David L. Smith) (10/16/87)

In article <292@rose3.Rosemount.COM>, dan@rose3.Rosemount.COM (Dan Messinger) writes:
> In summary, it takes far more than a new version of cc to get longer
> external identifiers.  It's a matter of momentum (and profits).  There is
> a LOT of software that needs to be changed on the ol' PDP-11 to increase
> the external id size.  Do you know of any software houses that want to
> develop all that software, knowing that there are a fixed number of PDP-11s
> in the world, and that number is getting smaller?

Well, if software houses aren't willing to develop software for a PDP-11, who
is going to develop the ANSI standard compilers for them?  It may not be
terribly difficult to port the compiler from, say, a VAX, since they are
fairly similar, but you still have to assign people to maintain it, do
bug fixes, etc.

It seems like a circular argument to me: We have to do this so that the 
compilers which aren't written yet for a dead machine which no one is
willing to write software for can be written.

I'm not in favor of this 6 character external limit.  It may be optional,
but it seems silly to have people saying "Well, our compiler is compliant
except that it produces 32 byte external names."  Let the people with the
ancient hardware and software have the non-compliant compilers!



-- 
David L. Smith
{sdcsvax!man,ihnp4!jack!man, hp-sdd!crash, pyramid}!sdeggo!dave
man!sdeggo!dave@sdcsvax.ucsd.edu 
The net.goddesses made me do it!

henry@utzoo.UUCP (Henry Spencer) (10/16/87)

> ... Companies with compilers/linkers that couldn't handle
> this limit would have to fix them.  So what?  If I had to guess, most of
> this obsolete software is owned by very large companies like IBM, DEC,
> HP, PRIME, etc.  These companies have large resources ($$$ and people),
> and they also sell a lot of systems/software. They therefore have a large
> incentive to fix their obsolete software, and the resources to do it.

Large companies did not get to be large by wasting money.  The first thing
that will occur to people at those companies is "how much will it cost us
if we simply ignore this silly standard?".

Note also that those companies, as major C users, legitimately and properly
have considerable say in the standard's development.

The odds are good that the companies in question will eventually get their
act together and do something about the problem.  However, trying to dragoon
them into doing so as a crash project by writing standards that require it
is very poor strategy; it may even delay the day when the 6-character rule
can go away.  That day will not be soon in any case; there is too much
inertia from massive customer bases.  Very large companies tend to have very
large commitments to support what they've done in the past.

> What is better?  Having a few companies with large resources and a large
> potential payback fix a few compilers/linkers, or have the entire rest of
> the C community rewrite many millions of lines of C?

Rewrite millions of lines of C?!?  What on Earth are you talking about?
C that does not observe the 6-character rule is *already* unportable, which
is about all that violating the standard implies.  None of *my* code needs
rewriting, and not all that much of the more sensibly-written code in the
world needs it either.  In any case, the "rewriting", when needed, can be
entirely mechanized via the preprocessor.

> Give me a break!  How hard is it to make an existing compiler/linker
> accept longer names?  If it isn't just changing the line
>         #define MAX_EXTERN_LEN  6
> to
>         #define MAX_EXTERN_LEN  32
> then the company that wrote that software deserves to be screwed, not
> the rest of the C community.

First you have to change that line in *all* your compilers and *all* your
utilities that manipulate object files, and then recompile them all.  That's
the easy part.  Then you have to make all the object-file-manipulating
programs understand both old and new format, because there will always be
old-format files lurking in some obscure corner of the world.  Then you
have to get those two-format programs out to every last customer who might
ever see a new-format object module.  Then you have to warn the independent
software suppliers and convince them to update *their* programs similarly.
*Then*, finally, you can start shipping the new compilers.  Fifteen or
twenty years later, you can perhaps think about dropping support for the
old format (for which you will be vilified by a substantial number of
customers, by the way, even then).

*FLAME ON*
If your cavalier disregard for compatibility and promises of continued
support, and your total lack of understanding of the difficulty of making
such a change over a large customer base, are at all typical of your
company, remind me never to buy anything from Eikonix.
*FLAME OFF*
-- 
"Mir" means "peace", as in           |  Henry Spencer @ U of Toronto Zoology
"the war is over; we've won".        | {allegra,ihnp4,decvax,utai}!utzoo!henry

rwhite@nusdhub.UUCP (Robert C. White Jr.) (10/16/87)

In article <292@rose3.Rosemount.COM>, dan@rose3.Rosemount.COM (Dan Messinger) writes:
> In summary, it takes far more than a new version of cc to get longer
> external identifiers.  It's a matter of momentum (and profits).  There is
> a LOT of software that needs to be changed on the ol' PDP-11 to increase
> the external id size.  Do you know of any software houses that want to
> develop all that software, knowing that there are a fixed number of PDP-11s
> in the world, and that number is getting smaller?

As long as the numbers only get larger nothing has to be changed.

If an identifier is unique in its first 6 characters, it is unique
in its first 31.  The old stuff will still work fine, some of it
[symbolic debuggers etc] just may not be as useful against the new code.

Few of the apps will actually change no matter what, but the "changing"
standard will require some of our favorite utilities to be "changed"

not just:
cc
asm
ld
but also:
dump
sdb
[all the trace stuff]
[all the archive stuff (ar...)]

I think the "will be obsolete" warning is enough for this revision.
That says this limit won't last long, and it gives all those people
out there some warning that all their tools are about to need some
work.

Rob.

guy%gorodish@Sun.COM (Guy Harris) (10/17/87)

> I'm not in favor of this 6 character external limit.  It may be optional,
> but it seems silly to have people saying "Well, our compiler is compliant
> except that it produces 32 byte external names."  Let the people with the
> ancient hardware and software have the non-compliant compilers!

Oh, good grief.  A compiler that "is compliant except that it produces 32 byte
external names" is COMPLETELY compliant.  There is NOTHING in the ANSI standard
that says that a compiler producing long external names is non-compliant.
Period.  End of discussion.

What the standard says is that programs *written* in C are not *strictly*
conforming, which means you aren't guaranteed to be able to compile and run it
on *every single ANSI-conforming C implementation*.  This doesn't necessarily
mean that it won't compile and run on 90% of the ANSI-conforming C
implementations out there.

The ANSI C standard is not going to force *anybody* to use 6-character one-case
external names, unless they're interested in porting to every single
ANSI-conforming C implementation out there - but if they're *that* interested
in portability, they may already have decided that they have to restrict
themselves to 6-character one-case external names.

Once more: the adoption of the ANSI C standard will not force ANY C
implementations to limit themselves to 6-character one-case external names.
Anybody out there who still believes this should stop doing so.
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

guy%gorodish@Sun.COM (Guy Harris) (10/17/87)

> Few of the apps will actually change no matter what, but the "changing"
> standard will require some of our favorite utilities to be "changed"
> 
> not just:
> cc
> asm
> ld
> but also:
> dump
> sdb
> [all the trace stuff]
> [all the archive stuff (ar...)]

Excuse me, but if these are all UNIX utilities (as I infer they are from the
presence of e.g. "sdb"), then on most machines these days they all support
"flexnames" which means they support external names lots longer than 31
characters; thus, the "'changing' standard" won't require anything to be done
to them in that regard.  Of course, on most UNIX machines they don't support
ANSI C, so that will have to change....
	Guy Harris
	{ihnp4, decvax, seismo, decwrl, ...}!sun!guy
	guy@sun.com

ron@topaz.rutgers.edu (Ron Natalie) (10/17/87)

I should follow up the claim that early (PDP-11) unices only had
6 character C externals.  It was seven.  You didn't need to have a
null in the symbol table since the entries were fixed at eight
characters.  The _ prefix ate the eighth character.

One annoying and totally confusing bug to people who hadn't seen
it before was when your C compile did
	m 232
The C compiler had eight character significance in its symbol
table so it realized that symbols like "abcdefgh" and "abcdefgi"
were different and made assembler code using them.  When prepended
with underscore, the differentiating last character was pushed off
the end and the assembler bitched about multiple symbol definition.
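
A minimal sketch of the effect, assuming only what is described above:
an eight-character name field with the underscore taking the first
slot.  The function names and the NUL padding of short names are
illustrative, not taken from any real assembler.

```c
#include <string.h>

#define SYM_FIELD 8   /* fixed width of the symbol-name field */

/* Build the name as it would be stored: '_' followed by at most seven
 * characters of the C identifier, padded with NULs if it is shorter. */
void make_symbol(const char *ident, char field[SYM_FIELD])
{
    int i;

    field[0] = '_';
    for (i = 1; i < SYM_FIELD; i++)
        field[i] = (*ident != '\0') ? *ident++ : '\0';
}

/* Return 1 if two C identifiers end up as the same stored symbol. */
int symbols_clash(const char *a, const char *b)
{
    char fa[SYM_FIELD], fb[SYM_FIELD];

    make_symbol(a, fa);
    make_symbol(b, fb);
    return memcmp(fa, fb, SYM_FIELD) == 0;
}
```

With this model, symbols_clash("abcdefgh", "abcdefgi") is true: both
are stored as "_abcdefg", which is the multiple-definition complaint.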

-Ron

bts@sas.UUCP (Brian T. Schellenberger) (10/17/87)

The six-character limit on external names in ANSI is due to the fact that
external names must be resolved by the LINKER, not the COMPILER.  Compiler-
writers are forced under ANSI to keep very long names around; ANSI just 
didn't force compiler-writers to write linkers as well.  In fact, there
are systems where the name limit is forced by the standard system 
object code format.

On IBM MVS systems, for example, external names longer than eight characters 
are simply not expressible in the object-file format; in order for a C
compiler to use them, it would have to have an incompatible format and 
the compiler-writers would have to write their own linker, which would 
effectively preclude sharing code with other languages and standard 
packages (not to mention talking to the operating system!).

Sorry, but that's the way it is.
-- 
                                                         --Brian.
(Brian T. Schellenberger)				 ...!mcnc!rti!sas!bts

DISCLAIMER:  Whereas Brian Schellenberger (hereinafter "the party of the first 

mjr@osiris.UUCP (Marcus J. Ranum) (10/18/87)

	'nuff said !
-- 
If they think you're crude, go technical; if they think you're technical,
go crude. I'm a very technical boy. So I get as crude as possible. These
days, though, you have to be pretty technical before you can even aspire
to crudeness...			         -Johnny Mnemonic

crowl@cs.rochester.edu (Lawrence Crowl) (10/18/87)

In article <5532@utcsri.UUCP> flaps@utcsri.UUCP (Alan J Rosenthal) writes:
>In article <2481@cbmvax.UUCP> daveh@cbmvax.UUCP (Dave Haynie) writes:
>>... something more readable like "LengthString", "LengthBuffer",
>>"LengthLine", "LengthQueue", etc.
>
>Umm... it's much better style in general to use variables that differ
>initially, such as "StringLength", "BufferLength", "LineLength",
>"QueueLength", etc.  And these are indeed unique to 6 characters.

Which is just fine until you need "StringConcat", "StringIndex", "StringField",
"StringUpcase", "StringDowncase", etc.  Six characters are simply not enough
to readably encode both abstract data type and operation name.  One is forced
into some "line noise" encoding to fit into six unique characters.
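
The clash can be checked mechanically.  A minimal sketch, assuming a
linker that keeps six characters and ignores case; the function names
are invented for illustration.

```c
#include <ctype.h>

#define SIGNIFICANT 6   /* characters the hypothetical linker keeps */

/* Return 1 if two names look identical to a linker that compares only
 * the first SIGNIFICANT characters and ignores case. */
int same_external(const char *a, const char *b)
{
    int i;

    for (i = 0; i < SIGNIFICANT; i++) {
        int ca = toupper((unsigned char)a[i]);
        int cb = toupper((unsigned char)b[i]);
        if (ca != cb)
            return 0;
        if (ca == '\0')      /* both names ended early and matched */
            return 1;
    }
    return 1;
}

/* Count the pairs of names in the list that such a linker would confuse. */
int count_clashes(const char *const names[], int n)
{
    int i, j, clashes = 0;

    for (i = 0; i < n; i++)
        for (j = i + 1; j < n; j++)
            if (same_external(names[i], names[j]))
                clashes++;
    return clashes;
}
```

Run over the four "...Length" names this reports no clashes; run over
the five "String..." names every pair clashes, since all five reduce
to STRING.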

My personal approach is to use as many characters in external names as I deem
reasonable.  If the code must be ported to a system which only supports six
character external names I'll either buy a REAL compiler/linker or, failing 
that, invoke "s/ReasonableName/resnam/g".

Remember, ANSI may have to support short externals, but you do not.
-- 
  Lawrence Crowl		716-275-9499	University of Rochester
		      crowl@cs.rochester.edu	Computer Science Department
...!{allegra,decvax,rutgers}!rochester!crowl	Rochester, New York,  14627

greg@bass.nosc.MIL (10/18/87)

I think that this issue is important enough to add another message.
There is a way to allow full external name significance without
running afoul of old linkers, etc.

A source to source translator can be provided which reduces the
significance of C external identifiers to six monocase characters.
The design of such a tool is sketched below.  Given such a translator
available in the public domain, the ANSI Standard can require that
implementors either provide full significance throughout their
environment or provide such a translator.  The standard can also
strongly recommend the former as a quality issue.

I was willing to simply accept the restricted external namespace until
the recent message (I forget the author, sorry) explaining what this
will do to the interfaces to large packages, e.g., graphics and
database packages.  Such packages are becoming more and more common
in C programming environments.  This point woke me up to the severe
impact a restricted external namespace will have on the quality of our
programs.  Even a slight increase in maintenance difficulties becomes
a major expense and trial.

As I recall, a software tool to reduce name significance has already
been posted to net.sources.  It probably uses Unix tools, and anyway
it affects internal identifiers too, so it's not suitable.  Here's what
I envision:

The translator would begin by reading a table of external identifiers
already seen and name substitutions already determined (in processing
other modules or entered by hand when C source isn't available for a
module) and then it would translate one or more modules.  New external
identifiers clashing with those previously seen would be given a
unique prefix, so that their full spelling is still available as a
comment and for those tools that can access it.  After translating the
modules, the augmented identifier table would be output.  Programmers
using debuggers, etc. can refer to the identifier table as needed.
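
A rough sketch of the table-driven renaming described above, under stated assumptions: the "Q<n>_" prefix form, the fixed table sizes, and all names (rename_table, rename_extern, linker_key) are invented for illustration and are not part of any posted tool. Reading and writing the table file is left out.

```c
#include <ctype.h>
#include <stdio.h>
#include <string.h>

#define SIG       6                 /* characters the old linker keeps */
#define MAX_NAMES 64                /* table capacity, arbitrary here */
#define MAX_FULL  48                /* longest original name accepted */
#define MAX_MAP   (MAX_FULL + 8)    /* room for a "Q<n>_" prefix */

struct renaming {
    char full[MAX_FULL + 1];        /* spelling used in the source */
    char mapped[MAX_MAP + 1];       /* spelling handed to the linker */
};

struct rename_table {
    struct renaming entry[MAX_NAMES];
    int count;
};

/* Fold the first SIG characters to upper case: what the linker sees. */
static void linker_key(const char *name, char key[SIG + 1])
{
    int i;

    for (i = 0; i < SIG && name[i] != '\0'; i++)
        key[i] = (char)toupper((unsigned char)name[i]);
    key[i] = '\0';
}

/* Is some already-assigned name indistinguishable from this key? */
static int key_in_use(const struct rename_table *t, const char *key)
{
    char other[SIG + 1];
    int i;

    for (i = 0; i < t->count; i++) {
        linker_key(t->entry[i].mapped, other);
        if (strcmp(other, key) == 0)
            return 1;
    }
    return 0;
}

/* Return the name to emit for `full`, recording it if it is new.
 * Returns NULL if the table is full, the name is too long, or no
 * free prefix was found. */
const char *rename_extern(struct rename_table *t, const char *full)
{
    char key[SIG + 1];
    struct renaming *r;
    int i, n;

    for (i = 0; i < t->count; i++)          /* seen before? reuse it */
        if (strcmp(t->entry[i].full, full) == 0)
            return t->entry[i].mapped;

    if (t->count == MAX_NAMES || strlen(full) > MAX_FULL)
        return NULL;

    r = &t->entry[t->count];
    strcpy(r->full, full);
    strcpy(r->mapped, full);                /* keep the name if it is free */
    linker_key(r->mapped, key);
    for (n = 0; key_in_use(t, key) && n < 1000; n++) {
        sprintf(r->mapped, "Q%d_%s", n, full);  /* full spelling survives */
        linker_key(r->mapped, key);
    }
    if (key_in_use(t, key))
        return NULL;
    t->count++;
    return r->mapped;
}
```

Starting from an empty table (count set to 0), "StringConcat" is kept as is, and a later "StringIndex" comes back as "Q0_StringIndex", so the original spelling is still readable after the prefix.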

I do not think such a translator need know much about C syntax.  It's
mostly a matter of ignoring reserved words and noticing whether an
identifier is mentioned outside of curly braces.  A fancier
implementation would avoid renaming typedefs, etc., but there would be
no harm in translating all externally mentioned symbols (other than
reserved words).  I also think that such a tool can be quite fast, not
greatly increasing compilation overhead.

If you're interested in writing such a tool, please send me a note.
I won't be able to take that on for a few weeks, and I think the
sooner such a tool is available the better.  Remember, it must be
public domain and be fully portable.


_Greg


J. Greg Davidson			  Virtual Infinity Systems
+1 (619) 452-8059        6231 Branting St; San Diego, CA 92122 USA
 
greg@vis.uucp				ucbvax--| telesoft--|
greg%vis.uucp@nosc.mil			decvax--+--sdcsvax--+--vis
greg%vis.uucp@sdcsvax.ucsd.edu		 ihnp4--|     nosc--|

karsh@geowhiz.UUCP (Bruce Karsh) (10/19/87)

In article <6562@brl-smoke.ARPA> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>In article <104@aimt.UUCP> breck@aimt.UUCP (Robert Breckinridge Beatie) writes:
>>Hmmm...  why is a 6 character limit necessary in any environment?
>
>Linker restriction.

  Am I missing something in this argument?  Is the argument that vendors
are able to write ANSI C compilers, but they are unable to modify their
linkers to allow long names?  Why is it that they can do all the work needed
to write an ANSI compatible C compiler, but they can't extend the name
length limit in their linkers?

  Are their linkers really that complex?



-- 

Bruce Karsh
U. Wisc. Dept. Geology and Geophysics
1215 W Dayton, Madison, WI 53706
(608) 262-1697
{ihnp4,seismo}!uwvax!geowhiz!karsh

eric@snark.UUCP (Eric S. Raymond) (10/19/87)

In article <9840@brl-adm.ARPA>, vis!greg@bass.nosc.MIL writes:
> There is a way to allow full external name significance without
> running afoul of old linkers, etc.
> 
> A source to source translator can be provided which reduces the
> significance of C external identifiers to six monocase characters.

Three cheers for this man! He has found a way to cut the Gordian knot!

Like him, I've found myself reluctantly siding with the 'conservatives'
on this issue -- it would be a disaster if major vendors were to torpedo
or ignore the X3J11 standard because of conversion costs. And it is true
that the six-character limit is not a new restriction.

On the other hand, the limit really is a sufficiently royal pain that I think
we'd all vastly prefer, given any out, not to have it in X3J11. And Greg's idea
gives us a way to do that that doesn't break old code *and* puts the
conformance burden where it belongs -- on the vendors with the botched archaic
linkers, not the rest of us.

I say: hear! hear! hear! Please, Greg, make this a formal proposal and submit
it to X3J11 (I'd do it, but you deserve the kudos, paeans and glory). And you
official-unofficial X3J11 people out there; if there's something fatally
wrong with the concept, let us know *now* before we start writing the PD
translator.
-- 
      Eric S. Raymond
      UUCP:  {{seismo,ihnp4,rutgers}!cbmvax,sdcrdcf!burdvax,vu-vlsi}!snark!eric
      Post:  22 South Warren Avenue, Malvern, PA 19355    Phone: (215)-296-5718

bhj@bhjat.UUCP (burt janz) (10/20/87)

> Also, the PDP-11 under UNIX has a *7* character limitation (there was not, as
> you implied, a null included at the end). Under RSX it has a 6 character
> limitation because they used a character encoding called Radix-40 that allowed
> you to fit 6 uppercase letters into a 32-bit dword.
> -- 
> -- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
> -- Disclaimer: These U aren't mere opinions... these are *values*.

I'm sure you mean RAD50.  It fits 3 characters into one 16-bit word, thus
6 characters into 1 32-bit dword.  The whole operating system encodes data
this way (filenames, symbol tables, etc.)
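
The packing arithmetic is easy to show in C. This is only a sketch: rad50_pack is a made-up name, and the character table follows the usual PDP-11 ordering (space, A-Z, $, ., one unused code shown here as %, then 0-9), which should be checked against the manuals for any particular system.

```c
#include <string.h>

/* RADIX-50 character set; a character's code is its index here.
 * Code 29 is normally unused and is shown as '%' only as a filler. */
static const char rad50_chars[] =
    " ABCDEFGHIJKLMNOPQRSTUVWXYZ$.%0123456789";

/* Pack up to three characters into one 16-bit value (0..63999).
 * Short input is padded with spaces.  Returns -1 if a character is
 * not in the set (lower case, for example). */
long rad50_pack(const char *s)
{
    long word = 0;
    int i;

    for (i = 0; i < 3; i++) {
        char c = (*s != '\0') ? *s++ : ' ';
        const char *p = strchr(rad50_chars, c);

        if (p == NULL)
            return -1;
        word = word * 40 + (long)(p - rad50_chars);
    }
    return word;
}
```

Since 40 * 40 * 40 = 64000 is just under 65536, three characters fit in one 16-bit word and a six-character symbol takes two words, which is where the limit of six comes from.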

Just a minor correction...

Burt Janz
(still an RSX hack... and PROUD OF IT!!!)

..decvax!bhjat!bhj

daveb@geac.UUCP (10/20/87)

In article <9840@brl-adm.ARPA> greg@bass.nosc.MIL writes:
>I think that this issue is important enough to add another message.
>There is a way to allow full external name significance without
>running afoul of old linkers, etc.
>
>A source to source translator can be provided which reduces the
>significance of C external identifiers to six monocase characters.

  A canonicalization algorithm which works well (with such a tool as
the above) is as follows:

	From the right, delete vowels. Do not delete the first
character.  If the result is within the maximum, stop.
	From the right, delete underscores.  Do not delete the first
character.  If the result is within the maximum, stop.
	Truncate at the maximum.

  This is "good" in a human sense: it tends to generate pronounceable
short-forms of supplied words.
  Please note it need merely check it has achieved uniqueness in a
tool (such as the one referred to) which keeps a history of
abbreviated words.  It need not actually truncate.
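
One possible reading of the three steps above, as a sketch in C. The names abbreviate and delete_from_right are invented here, and "stop" is taken to mean "stop as soon as the name fits in the maximum", which is an assumption about the intended rule. The uniqueness check against a history of abbreviations is not shown.

```c
#include <string.h>

/* Working from the right, delete characters that appear in `set`
 * until the name fits in `max` characters.  The first character is
 * never deleted. */
static void delete_from_right(char *name, const char *set, size_t max)
{
    size_t len = strlen(name);
    size_t i = len;

    while (len > max && i > 1) {
        i--;
        if (strchr(set, name[i]) != NULL) {
            /* close the gap; the count includes the trailing NUL */
            memmove(name + i, name + i + 1, len - i);
            len--;
        }
    }
}

/* Shorten `name` in place to at most `max` characters. */
void abbreviate(char *name, size_t max)
{
    delete_from_right(name, "aeiouAEIOU", max);   /* step 1: vowels */
    delete_from_right(name, "_", max);            /* step 2: underscores */
    if (strlen(name) > max)                       /* step 3: truncate */
        name[max] = '\0';
}
```

With a maximum of six, "line_count" becomes "ln_cnt" and "StringLength" becomes "StrngL".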

 --dave (yes, I've used such a thing. ugly...) c-b

-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

mjr@osiris.UUCP (Marcus J. Ranum) (10/20/87)

s'ok - as long as it fits on the 80-column virtual punchcards that my
machine is built around....

(snicker) (snort)

--mjr();
-- 
If they think you're crude, go technical; if they think you're technical,
go crude. I'm a very technical boy. So I get as crude as possible. These
days, though, you have to be pretty technical before you can even aspire
to crudeness...			         -Johnny Mnemonic

dag@chinet.UUCP (Daniel A. Glasser) (10/20/87)

Oh come on now!  It has been said before, and will be said again...
I have a copy of the ANSI document in front of me, and the six character
limit is not a fixed number -- It is a MINIMUM for conforming compilers.
Those of you who don't want to make your programs run on these older
systems with this old limitation on object format will still be writing
standard ANSI C, you just won't be maximally portable.  It is outside of
the ANSI C standardization effort to specify link environments or formats.
The reason that compilers for $500 machines can support long names yet
$100,000 machines can't is that the $500 machines are NEW.  The compiler
writers knew they wanted/needed long external names and guided the
creation of the linkers on those machines.  The program translator proposal
is okay for some, but the resulting code would be less maintainable since
well chosen short names are easier to recognise.  I have worked for both
sides of the dotted line.  I used to work for a major computer manufacturer
and now I work for a C compiler writing company.  From the former, I learned
just how difficult it is to change software tools (which, for non-UNIX systems
are usually not written in C, and which often pre-date UNIX anyway) and from
the latter, I have learned just how much hostility you run into when you try
to replace what has been entrenched as the "standard" for that system, even
if your utility is far better (versatility, speed, features, robustness, etc.)
even if you supply utilities for converting both individual objects and entire
libraries to your new format.  Don't go damning the compiler/computer vendors
for this 6 character minimum, the product I am working on intends to be
ANSI compatible and support flex-name style identifiers (these are not
contradictory) but will allow the user to enable name collision checking
(runtime switch) for maximal portability.  The programmer who has written
and debugged an application with long names needs only log the output
from the compiler using this switch and create a list of "#define"s
to make shorter names to be put in some common include file for the
application.
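
As an illustration of the "#define" approach, here is a small sketch. The header contents and the short names slngth and sindex are hypothetical, chosen only to differ within six monocase characters; a real list would come from the compiler's collision report.

```c
#include <string.h>

/* Contents of a hypothetical common include file, seen first by every
 * source file of the application.  Long names on the left, the short
 * names the linker will actually see on the right. */
#define StringLength  slngth
#define StringIndex   sindex

/* The source keeps its readable names; after preprocessing these
 * definitions are really for slngth() and sindex(). */
size_t StringLength(const char *s)
{
    return strlen(s);
}

/* Index of the first occurrence of c in s, or -1 if absent. */
int StringIndex(const char *s, int c)
{
    const char *p = strchr(s, c);

    return (p != NULL) ? (int)(p - s) : -1;
}
```

Callers keep writing StringLength("abc"); only the object file carries the short spelling, so a debugger or link map on the old system will show slngth instead.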

Request:	Would someone at DEC who is, or knows someone who is,
		connected with the RSX group please post some estimation
		of the size of the TKB source and the "language" it is
		written in?  Do you know anyone who understands the whole
		thing?  (MULP may, Covert may, Maybe Day?)

Disclaimer: My employer neither has knowledge of nor condones my posting
	    in this subject.  I will not name them, but if you know,
	    don't bother them with it.  I take all responsibility for
	    my comments.

	    Send replies/flames to ...!{cbmvax|ihnp4}!mwc!gorgon!dag 
	    rather than the originating machine for this posting.  I
	    get the mail there more regularly.
-- 
					Daniel A. Glasser
					...!ihnp4!chinet!dag
					...!ihnp4!mwc!dag
					...!ihnp4!mwc!gorgon!dag
	One of those things that goes "BUMP!!! (ouch!)" in the night.

john@frog.UUCP (John Woods, Software) (10/20/87)

In article <1298@wyszecki.munsell.UUCP>, jwf@munsell.UUCP (Jim Franklin) writes:
> The proposed ANSI C limit of 6 characters on external symbols is
> completely bogus. ...  If I had to guess, most of
> this obsolete software is owned by very large companies like IBM, DEC,
> HP, PRIME, etc.  These companies have large resources ($$$ and people),
> and they also sell a lot of systems/software. They therefore have a large
> incentive to fix their obsolete software, and the resources to do it.
> 
No, they have a lot of ability to kill the C standard dead, dead, dead, by
simply telling the Government "It will cost you one million billion skillion
dollars to upgrade every linker/assembler/compiler on every system you
currently have.  Have a nice day."

As long as there are DP managers who have lots of bucks and don't want to
be bothered with change ("Gimme 1401 Assembler, none of this weeny C
nonsense!"), and as long as we tasteful C programmers want to separate them
from some of that money, we are going to have to wear the straightjacket of
6 characters/monocase.

--
John Woods, Charles River Data Systems, Framingham MA, (617) 626-1101
...!decvax!frog!john, ...!mit-eddie!jfw, jfw@eddie.mit.edu

"Cutting the space budget really restores my faith in humanity.  It
eliminates dreams, goals, and ideals and lets us get straight to the
business of hate, debauchery, and self-annihilation."
		-- Johnny Hart

ron@topaz.rutgers.edu (Ron Natalie) (10/21/87)

Are linkers all that complex?

Yes!  C'mon.  For most real applications, you would like the C
suite on a machine to be compatible with the non-C parts of the
system.  This means that you can use all the nice debuggers
and other system features that may be provided.

Of course, there are a few machines where the linker environment
is awful.  Cray COS for example.  SEGLDR is garbage.  It doesn't
even know libraries.  An example of a good loader is VMS.  The VMS
C compiler would be no fun at all if it didn't work with their
splendid library facility (I don't know if it does or not, I hate
VMS).

-Ron

farren@gethen.UUCP (10/22/87)

In article <1748@chinet.UUCP> dag@chinet.UUCP (Daniel A. Glasser) writes:
>Oh come on now!  It has been said before, and will be said again...
>I have a copy of the ANSI document in front of me, and the six character
>limit is not a fixed number -- It is a MINIMUM for conforming compilers.

Not only that, but utilities exist, and have for some time, which will
translate long identifiers into short identifiers + an include file full
of #defines, taking care of the problem quite handily.  Including such
a facility into the preprocessor, invoked with an optional command line
argument, should take little effort.  This doesn't, of course, fix
everything, as there are still issues regarding special cases like source
level debuggers, but it would certainly make the process of coping with
obsolescent compilers/linkers/loaders transparent in most cases where
portability is an issue.

-- 
----------------
Michael J. Farren      "... if the church put in half the time on covetousness
unisoft!gethen!farren   that it does on lust, this would be a better world ..."
gethen!farren@lll-winken.arpa             Garrison Keillor, "Lake Wobegon Days"

rwhite@nusdhub.UUCP (10/23/87)

In article <1748@chinet.UUCP>, dag@chinet.UUCP (Daniel A. Glasser) writes:
> Oh come on now!  It has been said before, and will be said again...
> I have a copy of the ANSI document in front of me, and the six character
> limit is not a fixed number -- It is a MINIMUM for conforming compilers.

This is a plain English paraphrase of the above posting, and some
	of the other pertinent postings.

1)  If your C compiler produces 31 char dual case externals, it conforms to
	the ANSI draft.  If it only produces 5 char externals, it does not.

2)  NO SOURCE CODE (anywhere!) will need to be rewritten to be conformant.
	 If it already uses 1, 2, 3, 4, 5, 6, 24, or whatever significant
	 multi-case externals in its environment, it conforms within its
	 system.  It is only the way the compilers handle their symbol
	 tables that is at issue here.

3)  The whole bit about the linkers is as follows:
	a)  Linkers are essentially the "center" of a compilation/software
	development environment.
	b)  Linkers must be able to deal with every compiler and assembler
	on a given system.
	c)  If the "minimum conformance" on a C compiler were jacked up
	to 31 dual case externals, _somebody_ somewhere would have to
	eat the cost of updating all the linkers that are already in place.
	d)  The same people who update the installed base of linkers
	would also have to eat the cost of updating any and all compilers
	on the same systems. [if they use Fortran, for instance]

4)  By leaving the minimum conformance at 6 mono-case externals, a
	vendor which needs to produce code compatible with their
	installed base of other software, debuggers, compilers,
	linkers, archives [as in unix "ar"], and utilities like
	crash, dump, etc [ad nauseam] can produce such a compiler
	and STILL be conformant.

5)  Most likely checking for extremely small mono-case externals, or
	possibly NOT checking the above, will be implemented as
	preprocessor switches [or some such] and all the new C
	compilers will be capable of dealing with large dual-case
	externals.

6) NOBODY ANYWHERE SAID: six, only six. No more, no less.  The wording
	is the way it is to prevent thousands of installations around
	the world from being left out in the cold with only a $22,000.00
	upgrade bill for company.

7) If your system takes 'em big, use 'em big!!

Rob.

dag@chinet.UUCP (Daniel A. Glasser) (10/23/87)

Actually, RAD50, Radix-40, RAD40, Rad-40, etc., are all the same thing.
Rad-50 is Rad-40 in Octal!  Some RSTS-E documentation called it rad-40.
There are, in fact, only 40 characters in the RAD50 character set.
-- 
					Daniel A. Glasser
					...!ihnp4!chinet!dag
					...!ihnp4!mwc!dag
					...!ihnp4!mwc!gorgon!dag
	One of those things that goes "BUMP!!! (ouch!)" in the night.

jss@hector.UUCP (Jerry Schwarz) (10/23/87)

I don't like a minmax of six significant characters, but in practice
I ignore it.  A minmax of 32 would be much worse, because 32 is still
too short but some system developers would take such a number in the
ANSI C Standard as endorsement of this limit.  They might even "fix"
their systems to ignore characters after the thirty-second.

I know that 32 sounds like enough when you are thinking of human
generated code, but when you start thinking about machine generated
code (e.g. intermediate code of the C++ compiler) you realize that 32
can easily be exceeded.

Jerry Schwarz 
Bell Labs

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/27/87)

In article <234@snark.UUCP> eric@snark.UUCP (Eric S. Raymond) writes:
>I say: hear! hear! hear! Please, Greg, make this a formal proposal and submit
>it to X3J11 (I'd do it, but you deserve the kudos, paeans and glory). And you
>official-unofficial X3J11 people out there; if there's something fatally
>wrong with the concept, let us know *now* before we start writing the PD
>translator.

It's not clear what good it would do to send this to X3J11.

There is nothing wrong with the idea, but here are some snags
you may run into in practice:

No simple scheme such as taking the first four and last two
characters of longer identifiers will suffice, since many long
identifiers will map into the same short version.
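
A quick sketch of why the "first four and last two" scheme falls over. The function name four_plus_two is invented for this example.

```c
#include <string.h>

/* Map a name to its first four plus last two characters.  Names of
 * six characters or fewer are copied unchanged.  `out` must have
 * room for seven bytes. */
void four_plus_two(const char *name, char out[7])
{
    size_t len = strlen(name);

    if (len <= 6) {
        strcpy(out, name);
        return;
    }
    memcpy(out, name, 4);                 /* first four characters */
    memcpy(out + 4, name + len - 2, 2);   /* last two characters */
    out[6] = '\0';
}
```

Both "StringLength" and "StringWidth" come out as "Strith", so two perfectly reasonable names from the same package already collide.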

Since information is being lost, in general you can also get
collisions between short version of names across multiple
translation units (object modules), since they're compiled
entirely separately so that no single hash table can guarantee
uniqueness (unless you want to maintain a system-wide single
hash table for all C external identifiers ever seen).

If the short names do not bear notable resemblance to the
original longer names, it makes debugging even harder than
it already is.

I think it's easier to address this problem before coding
than to try to solve it after the fact.

jpdres10@usl-pc.UUCP (Green Eric Lee) (10/27/87)

In message <8781@utzoo.UUCP>, henry@utzoo.UUCP (Henry Spencer) says:
>> Give me a break!  How hard is it to make an existing compiler/linker
>> accept longer names?  If it isn't just changing the line
>>         #define MAX_EXTERN_LEN  6
>> to
>>         #define MAX_EXTERN_LEN  32
>> then the company that wrote that software deserves to be screwed, not
>> the rest of the C community.
>
>First you have to change that line in *all* your compilers and *all* your
>utilities that manipulate object files, and then recompile them all.  That's
>the easy part.  Then you have to make all the object-file-manipulating
>programs understand both old and new format, because there will always be
>old-format files lurking in some obscure corner of the world.  Then you
>have to get those two-format programs out to every last customer who might
>ever see a new-format object module.  Then you have to warn the independent
>software suppliers and convince them to update *their* programs similarly.
>*Then*, finally, you can start shipping the new compilers.  Fifteen or
>twenty years later, you can perhaps think about dropping support for the
>old format (for which you will be vilified by a substantial number of
>customers, by the way, even then).

Wow. That sure is a lot of trouble. I'd hate to do all that work, just
to support a silly "C" compiler. Hmm. Hey. Wait. Why do I have to use
IBM's linker to support my "C" compiler? Why don't I just supply a
linker with my "C" compiler that can bind my "C" sources with external
sources? Gee, yeah, that's the ticket, now I don't have to re-write my
old linker that dates back to the 1950's, which would upset all my
customers who are using the Autocode system on their 1700 emulators!
Or, since there's lots of unused character values, why don't I just
provide a mapping function that maps my, say, 12-character external
names, into 6-character external names? I once used a system that
allowed 6-character names.  Period. Total. Uniqueness wasn't enough. A
couple of years later, some enterprising hackers in Florida had
patched the thing to do exactly what I just mentioned above (mapping
12-character names into 6-character external names that contained
non-alphabetic characters). Worked, too, since the linker really
didn't care what was in the 6-character name field...

In summary, I don't think that supporting ancient linkers is worth the
trouble of restricting us to 6-character uniqueness. If someone has
gone to the trouble of writing a "C" compiler, the least they can do
is go through the trouble of writing a linker to go with that "C"
compiler, which is capable of binding both "C" objects and other
objects. Yeah, same thing you talked about, but it really isn't as
expensive as you imply.  Talking about the expense of this linker is a
bit ridiculous, considering how easy a task a linker is, compared to
the difficulty of writing a compiler.

Of course then you get to the point where you buy an Ada compiler for
your computer which has its own linker, and a "C" compiler with its
own linker, and want to link objects from each together. I'll leave
this one up to the marketplace (hint: publishing your object/linker
format as a "standard" is a way of both assuring compatibility with
future products, and getting free advertising for your product to
boot).

Note what I'm saying. There's no reason for IBM et al. to be involved
with a "C" compiler project from a 3rd-party vendor (unless they
really want to!). Hardware vendors wouldn't have to do a darned thing
if ANSI required longer-length external labels (like, say, 8-12
characters). Any compatibility problems would be with the compiler
manufacturer, not with the hardware vendor.

--
Eric Green  elg@usl.CSNET       from BEYOND nowhere:
{ihnp4,cbosgd}!killer!elg,      P.O. Box 92191, Lafayette, LA 70509
{ut-sally,killer}!usl!elg     "there's someone in my head, but it's not me..."

dag@chinet.UUCP (Daniel A. Glasser) (10/27/87)

Once again, people seem to be missing the point about old systems with
old object formats and old linkers which only support short mono-case
identifiers.  Sure, the compiler writer can write a pre-linker or a
whole linker that can be used to link both old and new object formats
together.  This is a LOT of work.  The pre-linker would prohibit the
use of some features provided by some linkers in the final link, like
overlays between user-supplied modules.  Remember that not all machines
have huge linear address spaces.  Not even some so-called "modern" ones.
Let's look at a simple example:  RT-11 linker replacement.  The standard
DEC PDP-11 object format stores labels in RAD-40/RAD-50, which has no
lower case, so name smashing is out.  If the compiler vendor chooses to
write a linker and have that linker accepted by the user community, the
following features must be supported:
	code overlays (RT-11 style)
	PID only (.SAV)
	Relocatable (.REL,.SYS)
Now, let's move on to TKB, the RSX-11M/M+ task builder.  This linker
requires support of PLAS overlays, memory resident/disk resident
overlays, clustered libraries, supervisor libraries, shared code,
Psect extension, shared libraries, etc. etc. etc..  Just TRY and read
the TKB manual some time.  It is not worth the compiler vendor's while
to try and write a replacement for a very powerful (if difficult to use)
tool unless the market is being limited to simple users.
If you don't want to write for these machines anyway, why do you care
at all about the six character limit?  Most Unix systems support names
at least 16 characters long, and so do most other new systems.

My point is that you should stop all this noise about the six monocase
character external identifier MINIMUM maximum.  The arguments are going
round and round, and it does not make one bit of difference to the
majority of programmers, since they don't have to deal with the low
end implementation.  A compiler that supports 32767 character externals
using the full ISO latin 1 character set can still be fully conforming --
a compiler that supports only 5 character mono-case character externals
cannot.  That is all that there is to it.  Now, let us get on to some
more controversial topics, y'know, things that matter.  Huh?
-- 
					Daniel A. Glasser
					...!ihnp4!chinet!dag
					...!ihnp4!mwc!dag
					...!ihnp4!mwc!gorgon!dag
	One of those things that goes "BUMP!!! (ouch!)" in the night.

peter@sugar.UUCP (Peter da Silva) (10/28/87)

In article <234@snark.UUCP>, eric@snark.UUCP (Eric S. Raymond) writes:
> In article <9840@brl-adm.ARPA>, vis!greg@bass.nosc.MIL writes:
> > A source to source translator can be provided which reduces the
> > significance of C external identifiers to six monocase characters.
> Three cheers for this man! He has found a way to cut the Gordian knot!

I'm afraid not. This may be a way to compile non-conformant programs on
6-character systems. It is *not* a way to get away with requiring longer
names. What happens when you want to call routines in languages other
than 'C' that have not gone through this mapping?
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

peter@sugar.UUCP (10/28/87)

> > Radix-40
> I'm sure you mean RAD50

Well, it was called both, depending on whether or not the radix of the
radix was in decimal or octal. 40 decimal = 50 octal, you see. And you
know how DEC feels about octal. I guess it depends on the order you read
the manuals what you ended up calling it.
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

greg@bass.nosc.MIL (10/30/87)

Thanks to Doug and others for responding to my proposal, I'm grateful
for the attention it's getting.  Let me take a few minutes and respond
to some of the points which Doug raised.

	From: Doug Gwyn <gwyn@BRL-SMOKE.arpa>
	Subject: Re: MAJOR ANSI C FLAW (my opinion, of course)
	Date: 26 Oct 87 21:05:11 GMT
	To: info-c@BRL-SMOKE.arpa
	
	It's not clear what good it would do to send this to X3J11.

My understanding is that X3J11's restriction on external names will
cause the writers and standardizers of C bindings for libraries and
packages to use much less mnemonic names than they would wish.  Given
all of the fancy graphics, database, window managing, mathematics,
etc. libraries and packages we're accumulating, this restriction will
significantly reduce the clarity of our code, leading to many errors
during development, and much more trouble and expense during program
maintenance.  On top of that, all of these little unmnemonic names,
and the comments and defines which try to explain them will be ugly.

My impression is that X3J11 appreciates this and has only agreed to
the restriction because no other way was seen to allow C to work with
standard linkers - a necessity, I agree.  My proposal provides a way
to have long identifiers AND work with existing linkers.  It is
intended to allow X3J11 to give external identifiers the same
requirements as for internal ids, without compromising portability or
X3J11's mandate.  As Doug and other members of X3J11 have said, this is
what they want but thought they couldn't have.
	
	Since information is being lost, in general you can also get
	collisions between short version of names across multiple
	translation units (object modules), since they're compiled
	entirely separately so that no single hash table can guarantee
	uniqueness (unless you want to maintain a system-wide single
	hash table for all C external identifiers ever seen).

This is a good point.  It is indeed necessary for a renaming tool to
keep a table of all external identifiers in a program undergoing
renaming.  In the design sketch I presented, an identifier table with
the renamings assigned so far would be constructed from a renamings
file before processing each module, and the renamings file would be
updated afterwards.  The renamings files would have a simple structure
so that they could easily be produced for non-C modules, even by hand.
To have consistent renamings across a set of programs, just reuse the
same renamings file.
	
	If the short names do not bear notable resemblance to the
	original longer names, it makes debugging even harder than
	it already is.

Quite so.  For this reason, I proposed simply adding a short
distinctive prefix to names which collide.  This leaves the original
spelling intact following the prefix.  If the prefix syntax is
reserved, the renamings can be reversed merely by stripping off any
such prefixed identifiers (identifiers which accidentally already had
the prefix would simply get an extra one when renamed).
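
Reversing such a renaming could look something like the sketch below. The reserved prefix syntax "Q<digits>_" and the name strip_prefix are assumptions made up for this example; the proposal itself does not fix a syntax.

```c
#include <ctype.h>

/* If `name` starts with the reserved prefix syntax Q<digits>_ ,
 * return a pointer to the text after one such prefix.  Otherwise
 * return `name` unchanged.  Only a single prefix is removed, so a
 * name that originally began with the prefix syntax, and therefore
 * was given an extra prefix, comes back in its original spelling. */
const char *strip_prefix(const char *name)
{
    const char *p = name;

    if (*p != 'Q')
        return name;
    p++;
    if (!isdigit((unsigned char)*p))
        return name;                /* "Queue" and friends are untouched */
    while (isdigit((unsigned char)*p))
        p++;
    return (*p == '_') ? p + 1 : name;
}
```

For example, "Q0_StringIndex" gives back "StringIndex", and "Q1_Q1_foo" gives back "Q1_foo".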

	I think it's easier to address this problem before coding
	than to try to solve it after the fact.

This point is very hard to take seriously.  In addition to the
mountains of existing code, who knows what libraries and packages new
programs will wind up being linked with during future maintenance?
Practically all questions being debated by X3J11 would be moot if
programmers could somehow do all the right things when writing code.
But remember, the biggest problem I see with X3J11's allowing reduced
external name significance is what it will do to writers of libraries
and packages.  With a renaming tool available, such code can use
natural and mnemonic identifiers.  When a library or package is
installed in a deficient environment, the renaming program will be
run, and all programs which use that library or package will use the
(same set of) renamings which were produced.

I ask that this proposal be considered by X3J11, and that others
who agree with me make the same request.  After allowing for more
comments, it will be time to write the renaming program and to put
it into the public domain.

_Greg


J. Greg Davidson			  Virtual Infinity Systems
+1 (619) 452-8059        6231 Branting St; San Diego, CA 92122 USA
 
greg@vis.uucp				ucbvax--| telesoft--|
greg%vis.uucp@nosc.mil			decvax--+--sdcsvax--+--vis
greg%vis.uucp@sdcsvax.ucsd.edu		 ihnp4--|     nosc--|

gwyn@brl-smoke.ARPA (Doug Gwyn ) (10/30/87)

In article <10065@brl-adm.ARPA> vis!greg@bass.nosc.MIL writes:
>	From: Doug Gwyn <gwyn@BRL-SMOKE.arpa>
>	It's not clear what good it would do to send this to X3J11.
>I ask that this proposal be considered by X3J11, and that others
>who agree with me make the same request.  After allowing for more
>comments, it will be time to write the renaming program and to put
>it into the public domain.

After first making clear that I'm not speaking for X3J11 and want to
encourage you to submit any constructive ideas about the draft Standard
to X3J11 (which is not done by posting to this newsgroup or sending
mail to X3J11 members other than Tom Plum, by the way), I'd like to
make clearer what I think the problem with your proposal is.

X3J11 is not in a position to provide, nor orchestrate the provision
of, any part of a C implementation.  (This is similar to our reason
for not establishing a #pragma clearing house.)  At the last meeting,
we had a presentation of a useful facility in the same general class
as yours, the function of which was to establish inter-translation-unit
function parameter type checking (and some other less important services).
Even though I believe most Committee members would be all for the
promised functionality, we really could not agree to dictate such
facilities, for several reasons.  For example, there exist C environments
on systems like the Apple Macintosh that consist of a closely integrated
suite of editing, compilation, debugging, etc. facilities.  Such an
environment may not have a natural place to insert an "add-on" helper
facility, or it may not need one due to providing better facilities.
It would be asking a lot for implementors to have to provide such a
facility, even if someone made it freely available.  Specifying the
facility in sufficient detail would be a lot of extra work that would
further delay adoption of the Standard.  And so on.

Some of these objections would also apply to the external name-mapper.
Additional problems are brought on by having to maintain a system-wide
name pool (which has security ramifications, as well as being a
bookkeeping nightmare).  For a quality implementation, all the
associated system utilities (debuggers, other language compilers, etc.)
would have to be modified to understand or at least cooperate with the
mapping scheme.  I really don't see how you could convince the
Committee that the benefits would outweigh the disadvantages.

This is not to say that a freely available extern-name-mapper tool
would not be a useful contribution to the C programming community --
I think it would be helpful.  I strongly suspect that many C projects
will simply ignore the 6-character monocase constraint, and they will
port readily to a wide range of existing and future systems without
requiring any name-mapping tool.  You can be sure that every attempt
to port an otherwise strictly conforming application to a system with
extern name limits is likely to result in complaints aimed at that
system vendor.  I suspect they'll change their linkers etc. as soon
as they've accumulated enough complaints!

As to just living with the restriction, I and other well-known C
programmers have already pointed out that we've worked within the
constraints without major problems for quite some time.  There are
techniques for avoiding the worst problems brought on by the extern
name constraint.  The most important one is to limit the number of
externs in your application; most C programmers use too many.  The
next trick is to use "package prefixes"; for example, I recently
completed a set of IPC data-exchange modules, where all externally-
visible significant names began with "Dx".  Other programmers working
on other parts of the application are using names with other prefixes,
so we don't step on each other's name spaces.  Another trick that is
sometimes useful is to enclose extern data in a single struct, so
that only one name (the struct name) is subject to the extern name
constraint.  There are other tricks, but I seldom have to resort to
them.  I agree 100% that it would be "nice" to not have to worry at
all about such matters, but then there are lots of more important
things that would be "nice" if they were other than what they are..
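
A rough sketch of the package-prefix and single-struct techniques
together.  Only the "Dx" prefix comes from the description above; the
names DxGlob, DxInit and DxSend and all the struct members are
hypothetical.

```c
#include <assert.h>
#include <string.h>

/* Members are not external names, so they can be as long and as
 * descriptive as the 31-character internal limit allows. */
struct DxGlobals {
    int  messages_sent;
    int  last_error_code;
    char peer_host_name[64];
};

/* The only extern data name in the package.  All three external
 * names below are unique within 6 monocase characters. */
struct DxGlobals DxGlob;

void DxInit(const char *host)
{
    assert(host != NULL);
    DxGlob.messages_sent = 0;
    DxGlob.last_error_code = 0;
    strncpy(DxGlob.peer_host_name, host, sizeof DxGlob.peer_host_name - 1);
    DxGlob.peer_host_name[sizeof DxGlob.peer_host_name - 1] = '\0';
}

/* Pretend to send nbytes; returns nbytes, or -1 on a bad argument. */
int DxSend(int nbytes)
{
    if (nbytes < 0) {
        DxGlob.last_error_code = 1;
        return -1;
    }
    DxGlob.messages_sent++;
    return nbytes;
}
```

Other packages in the same application would pick a different
two-letter prefix, leaving four significant characters per name.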

stuart@bms-at.UUCP (Stuart D. Gathman) (10/30/87)

Go ahead and use as many characters as you like for external (and internal)
names.  If the target linker (or compiler) chokes, use a handy
utility like "police", posted to the net a while ago.  "police" reads a 'C'
source file and replaces identifiers not unique within N characters with
(hopefully) unique substitutions generated with a CRC of the full name.
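
To illustrate the general idea only, here is a small sketch of
CRC-based name shortening.  It is not the actual "police" source: the
choice of CRC-16 (CCITT polynomial), the "prefix plus four hex digits"
layout, and the function names are all assumptions made for the
example.

```c
#include <assert.h>
#include <ctype.h>
#include <stdio.h>
#include <string.h>

/* Bitwise CRC-16, polynomial 0x1021, initial value 0xFFFF.
 * Which CRC the real utility uses is not stated; this one is assumed. */
static unsigned crc16(const char *s)
{
    unsigned crc = 0xFFFFu;
    int bit;

    while (*s != '\0') {
        crc ^= (unsigned)(unsigned char)*s++ << 8;
        for (bit = 0; bit < 8; bit++)
            crc = (crc & 0x8000u) ? ((crc << 1) ^ 0x1021u) : (crc << 1);
        crc &= 0xFFFFu;
    }
    return crc;
}

/* Build an n-character substitute for name in out (n + 1 bytes):
 * the first n - 4 characters, folded to lower case, followed by four
 * hex digits of the CRC of the full name.  Requires n >= 5 and a
 * name longer than n characters. */
void shorten_name(const char *name, size_t n, char *out)
{
    size_t keep = n - 4;
    size_t i;

    assert(n >= 5 && strlen(name) > n);
    for (i = 0; i < keep; i++)
        out[i] = (char)tolower((unsigned char)name[i]);
    sprintf(out + keep, "%04x", crc16(name));
}
```

Two long names sharing a prefix will usually get different
substitutes, but a CRC can collide, hence the "(hopefully)"; a real
tool would need to check for clashes and re-map when one occurs.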

A shell script substitute for 'cpp' can run it automatically on all your
source by filtering the output of the real 'cpp'.  If your real 'cpp' barfs,
get 'cccp' from GNU.

I can send 'police' to those that didn't get it.
-- 
Stuart D. Gathman	<stuart@bms-at.uucp>
			<..!{vrdxhq|dgis}!bms-at!stuart>

rwhite@nusdhub.UUCP (Robert C. White Jr.) (11/07/87)

In article <10065@brl-adm.ARPA>, vis!greg@bass.nosc.MIL writes:
> My understanding is that X3J11's restriction on external names will
> cause the writers and standardizers of C bindings for libraries and
> packages to use much less mnemonic names than they would wish.  Given
> all of the fancy graphics, database, window managing, mathematics,
> etc. libraries and packages we're accumulating, this restriction will
> significantly reduce the clarity of our code

I would like to point to a windowing package which is fully compliant.
These are actual external function names from the curses package
included with our C Language Utilities set.

slk_noutrefresh();
restartterm();
*boolfnames[];
garbagedlines();
...

	Our system implements external identifiers which are dual case 30+
characters long.  This is compliant with the X3J11 minimum requirement
of 6 significant single-case characters in external names.

You, and others, are repeatedly using the word "restriction", which
is incorrect.  The proposed standard does not require the identifiers
be any particular length, nor that they be unique in 6 characters.
What it _does_ say is that a compiler must supply _at least_ 6
characters to its linking environment, with a warning that this
MINIMUM will most likely be INCREASED to a LARGER MINIMUM at a
later date.

The only way to _not_ be compliant is to write a compiler which would
only supply _4_ mono-case characters to its linking environment.

The extremely low MINIMUM has been retained so as to allow the
older compilers to still claim compliance.  Sort of a warning
period.

The standard does not contain a MAXIMUM in any way [on this issue].

The caveat to this is that _if_ you want 100% portability as of
this instant <according to the standard> you will need to only use
6 mono-case characters, BUT WE ALL ALREADY KNEW THIS DIDN'T WE?
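
For anyone who does want to check code against that minimum, a
minimal sketch of the test a 6-character monocase linker effectively
applies to two external names.  The function name and the macro are
made up for this example.

```c
#include <ctype.h>
#include <stddef.h>

#define EXTERN_SIGNIFICANT 6   /* the minimum discussed above */

/* Return 1 if a linker that keeps only the first 6 characters and
 * ignores case would treat the two names as the same, otherwise 0. */
int extern_names_collide(const char *a, const char *b)
{
    size_t i;

    for (i = 0; i < EXTERN_SIGNIFICANT; i++) {
        int ca = tolower((unsigned char)a[i]);
        int cb = tolower((unsigned char)b[i]);

        if (ca != cb)
            return 0;          /* differ within the significant part */
        if (ca == '\0')
            return 1;          /* both names ended: identical */
    }
    return 1;                  /* first 6 characters all matched */
}
```

By this test "restartterm" and "restart_all" collide, while
"slk_noutrefresh" and "slk_refresh" do not, since they already differ
at the fifth character.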

In summary:
	YES, your Microsoft C [with 40+ dual-case externals] is compliant.
	YES, your local-hack C compiler with 128 dual-case is compliant.
	YES, your VAX C compiler with only 6 mono_case is compliant.
	BUT that VAX C compiler probably won't be next year!!!!!

	Just write code that fits your most limiting "required"
environment and stop worrying.  ALL your old code will work in
whatever later expansion is done to the standard as far as "will
it compile and link correctly?"  If it is a debugger that is
limited to debugging 6-mono-case externals, you might as well
re-write it now and beat the rush!!

Rob.

Disclaimer:  "I don't care any more...."  - P. Collins.

TGMIKEY%CALSTATE.BITNET@wiscvm.wisc.EDU (Account Manager) (11/12/87)

Received: by CALSTATE via BITNet with NJF for TGMIKEY@CALSTATE;
Comment: 10 Nov 87 02:49:36 PST
Received: by BYUADMIN (Mailer X1.24) id 1737; Tue, 10 Nov 87 03:48:35 MST
Date:     30 Oct 87 19:48:16 GMT
Reply-To: Info-C@BRL.ARPA
Sender:   INFO-C@NDSUVM1
From:     stuart@bms-at.uucp
Subject:  Re: MAJOR ANSI C FLAW (my opinion, of course)
Comments: To: info-c@BRL-SMOKE.arpa
To:       TGMIKEY@CCS.CSUSCC.CALSTATE.EDU


Go ahead and use as many characters as you like for external (and internal)
names.  If the target linker (or compiler) chokes, use a handy
utility like "police", posted to the net a while ago.  "police" reads a 'C'
source file and replaces identifiers not unique within N characters with
(hopefully) unique substitutions generated with a CRC of the full name.

A shell script substitute for 'cpp' can run it automatically on all your
source by filtering the output of the real 'cpp'.  If your real 'cpp' barfs,
get 'cccp' from GNU.

I can send 'police' to those that didn't get it.
--
Stuart D. Gathman    <stuart@bms-at.uucp>
            <..!{vrdxhq|dgis}!bms-at!stuart>

===== Reply from Mike Khosraviani <TGMIKEY> ==========================

Would you, please, send me police?  Does it cost me anything?  If so, I think
I'll pass on it!!!!   I really appreciate it. Thank you.


Mike