[comp.std.c] External Linkage in dpANS C

bitbug@vicom.COM (James Buster) (01/04/89)

I saw mentioned in passing in one of these newsgroups (comp.std.c, comp.lang.c)
that a variable with external linkage in dpANS C is resolved with 6 significant
characters and no case sensitivity. I have some questions regarding this:

1. Is this true?
2. Is six significant monocase characters the *minimal implementation*,
   or *required*?
3. If required, why should a case sensitive language like C
   use a case insensitive linker?
4. If required, why should I damage my flexnames linker?
5. If required, why should anybody want to use such a brain damaged
   implementation of C?
5. If required, how can only 6 significant characters be portable?
6. If required, why should companies with ancient linker technology
   force me to use such ancient technology, or why can't they use 80s
   linker technology?
7. I presume I don't have to explain the number of programs
   that would break because of this behavior (in particular,
   the external identifiers _printw and _printf conflict).
   Also, creating a function Write to interface to the system
   write function (along with some extra stuff) is relatively
   common practice.
8. In general, aaaarrrrggghhh!

--------------------------------------------
	     James Buster
	Mad Hacker Extraordinaire
    	...!ames!vsi1!bitbug
	   bitbug@vicom.com
--------------------------------------------

gwyn@smoke.BRL.MIL (Doug Gwyn) (01/04/89)

In article <1339@vsi1.COM> bitbug@vicom.COM (James Buster) writes:
>I saw mentioned in passing in one of these newsgroups (comp.std.c, comp.lang.c)
>that a variable with external linkage in dpANS C is resolved with 6 significant
>characters and no case sensitivity. I have some questions regarding this:
>1. Is this true?

Only for a few implementations.  Most will do considerably better.

>2. Is six significant monocase characters the *minimal implementation*,
>   or *required*?

Valid C external object/function identifiers that differ in their
first 6 characters other than in case must be treated as distinct.
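
The rule can be read as a predicate on pairs of names. What follows is a minimal sketch, not text from the draft: the helper name ext_names_may_clash and the macro EXT_SIG_CHARS are made up for illustration, and the comparison assumes a linker that keeps exactly the first six characters and folds case.

```c
#include <ctype.h>
#include <stddef.h>

/* Minimum significance discussed in this thread: 6 characters, one case. */
#define EXT_SIG_CHARS 6

/* Hypothetical helper: returns 1 if a linker that keeps only the first
   EXT_SIG_CHARS characters and ignores case could treat a and b as the
   same external name, 0 if they are guaranteed to be distinct. */
int ext_names_may_clash(const char *a, const char *b)
{
    size_t i;

    for (i = 0; i < EXT_SIG_CHARS; i++) {
        int ca = toupper((unsigned char)a[i]);
        int cb = toupper((unsigned char)b[i]);

        if (ca != cb)
            return 0;   /* differ within the significant prefix */
        if (ca == '\0')
            return 1;   /* both names ended: identical after case folding */
    }
    return 1;           /* first six characters match, ignoring case */
}
```

With this reading, _printw and _printf may clash (both reduce to _PRINT), as may Write and write, while open and close are safely distinct.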

>3. If required, why should a case sensitive language like C
>   use a case insensitive linker?

Compiler implementors don't always have control over the linker.
Many older linkers provided only what Fortran required.  Even
the PDP-11 UNIX linker had short externs (7 C source characters),
although no case folding.

>4. If required, why should I damage my flexnames linker?

Flexnames linkers are permitted.

>5. If required, why should anybody want to use such a brain damaged
>   implementation of C?

Would you rather use Fortran or Cobol?

>5. If required, how can only 6 significant characters be portable?

Portable programs should not rely on more than that.

>6. If required, why should companies with ancient linker technology
>   force me to use such ancient technology, or why can't they use 80s
>   linker technology?

It's not necessarily the compiler vendors who are responsible
for the restrictions.  Some old operating systems are too hard
(read: traumatic) to change, and until there are enough "modern"
users of those systems, the operating system vendor cannot
justify making the effort.  To do so simply to advertise
conformance to the C standard would not be considered good
enough reason, and requiring more in the standard would simply
result in fewer conforming implementations -- not better linkers.
That would be a disservice to C programmers.

>7. I presume I don't have to explain the number of programs
>   that would break because of this behavior (in particular,
>   the external identifiers _printw and _printf conflict).
>   Also, creating a function Write to interface to the system
>   write function (along with some extra stuff) is relatively
>   common practice.

No doubt about it, some programmers have been blissfully
unaware of the realities of portable programming.  There
is no change in this area due to the proposed C standard.

>8. In general, aaaarrrrggghhh!

Complain to your linker vendor if you run into this problem.
I wish you luck.

w-colinp@microsoft.UUCP (Colin Plumb) (01/04/89)

In article <1339@vsi1.COM> bitbug@vicom.COM (James Buster) writes:
>I saw mentioned in passing in one of these newsgroups (comp.std.c, comp.lang.c)
>that a variable with external linkage in dpANS C is resolved with 6 significant
>characters and no case sensitivity. I have some questions regarding this:
>
>1. Is this true?

Yes.  Unfortunately, in many systems, the object file format is frozen
and writing a more flexible linker is not an option available to compiler
writers.

>2. Is six significant monocase characters the *minimal implementation*,
>   or *required*?

It is minimal.  It is clearly marked as obsolescent, and 31 mixed-case is
both strongly recommended and expected to be required in the next C
standard.  Anyone who writes a *new* linker with this sort of restriction
should be put to death.  Slowly.

>8. In general, aaaarrrrggghhh!

This was a bitter pill to swallow, but Fortran linkers are with us for a
few more years.  I, however, don't plan on paying much attention to this
limitation in code I write (sorry, Henry).
-- 
	-Colin (uunet!microsof!w-colinp)

davidsen@steinmetz.ge.com (William E. Davidsen Jr) (01/04/89)

  Since I was still on the committee when this came up (the first time)
I can state that this was a political decision rather than a technical
decision. In spite of that I guess it's correct at this time.

  There are several companies which have linkers with the limitations
you so justly criticized. There was a feeling (there may have been an
outright statement) that these companies would not produce a vendor
supplied C compiler if their existing linker couldn't be conforming, and
that they wouldn't use C as a library implementation language if they
couldn't link it with existing languages.

  Because these companies have a financial stake in a standard which
they could offer, it was felt that rather than chance non-support or
even opposition for this version of the standard, it was a necessary
compromise.

  Two companies which have limited linkers in at least some of their
operating systems are Honeywell and IBM.

  This standard is supposed to "codify existing practice," and in many
cases that implies compromise to avoid breaking existing
implementations. Hopefully there will be a new standards committee in a
few years with a goal of producing a new version with significant
enhancements and extensions, and allowed to make changes needed for a
more consistent language. I *don't* mean "codify C++" which is a
language in its own right.

  In spite of my criticism of the committee at times I think they have
done a fine job within the limitations of compromise, and I am happy
that I was able to participate for a few years.
-- 
	bill davidsen		(wedu@ge-crd.arpa)
  {uunet | philabs}!steinmetz!crdos1!davidsen
"Stupidity, like virtue, is its own reward" -me

donn@hpfcdc.HP.COM (Donn Terry) (01/05/89)

You are not alone in your dislike of the 6 character monocase limit:

From IEEE 1003.2 Draft 8. (Currently in balloting.)

	9.1.11 Limits (This section is not part of IEEE std 1003.2)

	The C compiler and link editor shall support external symbols
	with a length of at least 31 bytes; symbols exceeding this length
	shall be truncated and a diagnostic message written to standard
	error.

        [The same statement appears in 10.1.11: the Fortran compiler.]

(Before you flame...)  I believe that the location of this statement (in
a non-required "Limits" section) was not the intent, and will be changed
in balloting to a required section.  (I think this was the result of the
mechanics of editing).  I also believe the "shall be truncated" will be
changed, possibly to "may be truncated, and if this is done a diagnostic
message...".  ("Shall" wording in such a section is a bit strange...)
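
The behaviour in the quoted draft text (keep at least 31 bytes, truncate anything longer, write a diagnostic to standard error) might look like the sketch below. It is only an illustration: store_symbol and SYM_MAX are hypothetical names, and the wording of the message is invented, since the draft does not specify it.

```c
#include <stdio.h>
#include <string.h>

#define SYM_MAX 31  /* minimum supported symbol length quoted above */

/* Hypothetical helper: copies name into out, keeping at most SYM_MAX
   bytes.  If the name had to be truncated, writes a diagnostic to
   standard error and returns 1; otherwise returns 0. */
int store_symbol(char out[SYM_MAX + 1], const char *name)
{
    size_t len = strlen(name);

    if (len > SYM_MAX) {
        memcpy(out, name, SYM_MAX);
        out[SYM_MAX] = '\0';
        fprintf(stderr, "warning: symbol '%s' truncated to %d bytes\n",
                name, SYM_MAX);
        return 1;
    }
    memcpy(out, name, len + 1);     /* includes the terminating '\0' */
    return 0;
}
```

Under the "may be truncated" wording suggested above, an implementation that keeps longer names would simply never take the truncating branch.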

Assuming my conjectures above are correct, at least POSIX systems and
ones like it will not have that problem.

It is also probable that "market pressure" will help force the issue.

Donn Terry
HP Ft. Collins

barmar@think.COM (Barry Margolin) (01/05/89)

One point in favor of the six-character case-insensitive limitation on
external linkage is that it doesn't break any existing portable
programs.  Prior to the ANSI work there existed a number of C
compilers on systems with such limitations, so any C program that
depends on more than this is not portable to those systems.  If you
aren't worried about porting to those systems now, why do you think
you would be once the standard is published?  The only problem is if
you want to advertise that your program is strictly conforming to ANSI
C (i.e. that it will run on ANY ANSI C implementation); if you depend
on long, case-sensitive external names you'll have to mention that
exception.

BTW, I've heard about programs that will go through a collection of
source files finding all the external names that clash in the first
six case-insensitive characters and rename them uniquely.  This will
let you develop programs on a flexname system and easily convert them
to fully portable versions.
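
The detection half of such a tool is small. The following is a rough sketch, not any of the actual programs mentioned: short_key and report_clashes are hypothetical names, the list of external names is assumed to have been extracted from the sources already, and the renaming step is left out.

```c
#include <ctype.h>
#include <stddef.h>
#include <stdio.h>
#include <string.h>

#define EXT_SIG_CHARS 6

/* Build the name a 6-character, case-folding linker would see. */
static void short_key(const char *name, char key[EXT_SIG_CHARS + 1])
{
    size_t i;

    for (i = 0; i < EXT_SIG_CHARS && name[i] != '\0'; i++)
        key[i] = (char)toupper((unsigned char)name[i]);
    key[i] = '\0';
}

/* Print every pair of names that share a key and return the number of
   clashing pairs.  Quadratic, which is fine for a sketch. */
int report_clashes(const char *const names[], size_t n)
{
    int clashes = 0;
    size_t i, j;

    for (i = 0; i < n; i++) {
        char ki[EXT_SIG_CHARS + 1];

        short_key(names[i], ki);
        for (j = i + 1; j < n; j++) {
            char kj[EXT_SIG_CHARS + 1];

            short_key(names[j], kj);
            if (strcmp(ki, kj) == 0) {
                printf("%s and %s both reduce to %s\n",
                       names[i], names[j], ki);
                clashes++;
            }
        }
    }
    return clashes;
}
```

Given the names OpenNewWindowA, OpenNewWindowB, Write, write and main, this reports two clashing pairs: the two OpenNewWindow names (OPENNE) and Write/write (WRITE).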

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

anw@nott-cs.UUCP (01/05/89)

In article <9274@smoke.BRL.MIL> gwyn@brl.arpa (Doug Gwyn (VLD/VMB) <gwyn>) writes:
>[...]
>Many older linkers provided only what Fortran required.  Even
>the PDP-11 UNIX linker had short externs (7 C source characters),
     ^^^^^^^^^^^^^^^^^^^^^^
>although no case folding.

	Make that "has"*!  Actually, it causes us surprisingly few problems;
I've installed megabytes of stuff from the Net quite easily.  The occasional
program that insists on using identifiers like "OpenNewWindow[A|B|C]" [Ick!]
soon succumbs to a nifty edit -- I've never even bothered with that program
("shortc" was it?) that abbreviated long identifiers automatically.
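
One way to do that kind of edit without touching every call site is to let the preprocessor do the renaming. This is a sketch under assumptions, not a description of what shortc actually does: the short names ONWinA, ONWinB and ONWinC are invented, and it relies on the compiler itself distinguishing the long names (the draft requires 31 significant characters for internal identifiers and macro names, but an older compiler may keep fewer).

```c
/* Hypothetical mapping, normally kept in a header included by every
   source file.  Each short name is unique in its first six characters
   even after case folding: ONWINA, ONWINB, ONWINC. */
#define OpenNewWindowA ONWinA
#define OpenNewWindowB ONWinB
#define OpenNewWindowC ONWinC

/* The source keeps its readable names; after preprocessing, the
   compiler and linker see only ONWinA, ONWinB and ONWinC. */
int OpenNewWindowA(void) { return 1; }
int OpenNewWindowB(void) { return 2; }
int OpenNewWindowC(void) { return 3; }

/* Callers are rewritten by the same macros, so they still link. */
int open_all(void)
{
    return OpenNewWindowA() + OpenNewWindowB() + OpenNewWindowC();
}
```

The cost is that debuggers and link maps show the short names, so the mapping header doubles as the lookup table.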

---------
* Not much longer, alas.  Once Sun UK work out how to install it, we'll
  have a shiny new machine to play with, and the long era of PDP 11's
  at Nott'm will be drawing to a close.  :-(.

-- 
Andy Walker, Maths Dept., Nott'm Univ., UK.
anw@maths.nott.ac.uk

paul@hcr.UUCP (Paul Jackson) (01/06/89)

In article <1339@vsi1.COM> bitbug@vicom.COM (James Buster) writes:
>I saw mentioned in passing in one of these newsgroups (comp.std.c, comp.lang.c)
>that a variable with external linkage in dpANS C is resolved with 6 significant
>characters and no case sensitivity. I have some questions regarding this:
>
>1. Is this true?
Yes, in that a strictly conforming program cannot rely on better linker
and/or assembler behaviour.
>2. Is six significant monocase characters the *minimal implementation*,
>   or *required*?
Minimal.  NOT NOT NOT NOT required.  An implementation also does NOT have to
warn you if you exceed this limit (although some probably will).
>3. If required, why should a case sensitive language like C
>   use a case insensitive linker?
>4. If required, why should I damage my flexnames linker?
>5. If required, why should anybody want to use such a brain damaged
>   implementation of C?
>5. If required, how can only 6 significant characters be portable?
>6. If required, why should companies with ancient linker technology
>   force me to use such ancient technology, or why can't they use 80s
>   linker technology?
Although not required, the answer is simple.  Some companies with a large
C base claim that they cannot update their assembler/linker.  This, in
my opinion, constitutes the biggest kludge in the standard AND is the one
place where politics really got the best of the committee.  Note that others
would disagree with this statement.
>7. I presume I don't have to explain the number of programs
>   that would break because of this behavior (in particular,
>   the external identifiers _printw and _printf conflict).
>   Also, creating a function Write to interface to the system
>   write function (along with some extra stuff) is relatively
>   common practice.
No, most of us realize that these programs will break.  Note that once again
this is a case of the committee acknowledging existing practice.  These programs
will ALREADY break if moved to the appropriate systems.  To that extent they
are not currently portable.  At least now everybody knows that these are the
limits one can use portably, and maybe you'll be able to get warning messages
from your favourite compiler (or lint).  In my personal opinion this was a
clear and substantial flaw in existing practice and should have been rectified,
but IT WAS existing practice.
>8. In general, aaaarrrrggghhh!
>	     James Buster

jeffrey@algor2.UUCP (jeffrey) (01/06/89)

Those manufacturers whose existing linkers made them feel compelled to insist
that identifiers must be unique when truncated to 6 case indifferent characters
may have done themselves the greatest damage.

Any ANSI C program written on their systems will be portable to all other
systems.  The difficulty will be raised in porting code with longer
identifiers to their systems.  Hence, code on their systems will easily be
ported to their competitors, but code written on their competitors' systems may
not port to them.  I wish I could express great amounts of sympathy over the
business they might lose.

I believe in honesty, decency and portable C code, but the temptation to
deviate from these standards is often overwhelming.  And in an environment
where I expect almost all the major vendors will support 31 characters
everywhere, the trade-off of readability versus portability makes the
temptation extremely strong.

-- 

Jeffrey Kegler, President, Algorists, uunet!jeffrey@algor2.UU.NET
1788 Wainwright DR, Reston VA 22090, 703-471-1378

daveb@geaclib.UUCP (David Collier-Brown) (01/06/89)

In article <1339@vsi1.COM> bitbug@vicom.COM (James Buster) writes:
|6. If required, why should companies with ancient linker technology
|   force me to use such ancient technology, or why can't they use 80s
|   linker technology?

From article <9274@smoke.BRL.MIL>, by gwyn@smoke.BRL.MIL (Doug Gwyn):
| It's not necessarily the compiler vendors who are responsible
| for the restrictions.  Some old operating systems are too hard
| (read: traumatic) to change, and until there are enough "modern"
| users of those systems, the operating system vendor cannot
| justify making the effort.  To do so simply to advertise
| conformance to the C standard would not be considered good
| enough reason, and requiring more in the standard would simply
| result in fewer conforming implementations -- not better linkers.
| That would be a disservice to C programmers.

  True, and very annoying to clients of the linker-suppliers!

  However, even **very** old linkers are relatively easy to bring up
to date as long as one can still find the source. One defines a new
record format (I wasn't kidding when I said very old) for long
names, and makes it optional in release N of the operating system.
In release N+2, you produce warnings about each use of the old
format. In N+3 you produce error messages. In N+5 you drop support.
Please note that by N+5, version N of the OS is out of support too!

  This takes about 3-4 years, you realize, but it does work.  One of
my previous employers did it with near-nil angst from their users.
The chap who had to figure out how the linker worked hated the idea,
though (1 man, about 3 weeks).
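
  The schedule above amounts to a small policy table for how the linker treats old-format (short name) records. A sketch, with assumptions stated: the enum and function names are hypothetical, releases N+1 and N+4 are not mentioned above and are assumed to behave like the release before them, and "drop support" is modelled as a distinct result rather than as an error message.

```c
/* What the linker does when it meets an old-format record. */
enum old_format_action {
    OLD_ACCEPT,         /* old format accepted silently           */
    OLD_WARN,           /* accepted, with a warning per use       */
    OLD_ERROR,          /* rejected with an error message         */
    OLD_UNSUPPORTED     /* old format no longer recognised at all */
};

/* releases_since_n is 0 for release N, 2 for release N+2, and so on. */
enum old_format_action old_format_policy(int releases_since_n)
{
    if (releases_since_n >= 5)
        return OLD_UNSUPPORTED;     /* N+5: support dropped */
    if (releases_since_n >= 3)
        return OLD_ERROR;           /* N+3 (and, assumed, N+4) */
    if (releases_since_n >= 2)
        return OLD_WARN;            /* N+2 */
    return OLD_ACCEPT;              /* N (and, assumed, N+1) */
}
```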

--dave (I should put this in my "common answers" database (;-)) c-b
-- 
 David Collier-Brown.  | yunexus!lethe!dave
 Interleaf Canada Inc. |
 1550 Enterprise Rd.   | He's so smart he's dumb.
 Mississauga, Ontario  |       --Joyce C-B

zeeff@b-tech.ann-arbor.mi.us (Jon Zeeff) (01/10/89)

In article <196@algor2.UUCP> jeffrey@algor2.UUCP (jeffrey) writes:
>that identifiers must be unique when truncated to 6 case indifferent characters

>Any ANSI C program written on their systems will be portable to all other
>systems.  The difficulty will be raised in porting code with longer

I have this problem now.  Does anyone have a program that will munge through
some source and create unique short identifiers?

-- 
  Jon Zeeff			zeeff@b-tech.ann-arbor.mi.us
  Support ISO 8859/1		zeeff%b-tech.uucp@umix.cc.umich.edu
  Ann Arbor, MI			umix!b-tech!zeeff