[comp.lang.c++] Let's write a new linker!

gwu@quark.tcs.com (George Wu) (01/26/91)

In article <70185@microsoft.UUCP>, jimad@microsoft.UUCP (Jim ADCOCK) writes:
|> It would seem that we need linkers that allow renaming of classes etc
|> at link time.  One way to "automate" this would be if you could tell
|> the linker to imply a libname:: extension to the classes implemented
|> in that lib.  Thus, assuming two libraries aren't named the same
|> [and you _can_ rename them by changing the name of the file containing
|> the library] you could resolve the conflicts.

     What, with name mangling and symbol clashes, what we really need is a
new linker.  The whole concept of name mangling just to be able to use the
venerable UNIX linker should be tossed.  Yeah, and I can hear the screams of
agony as some people would switch to such a linker before others.  Still,
I've long thought name mangling is revolting.

							George

----
George J Wu                           | gwu@tcs.com or uunet!tcs!gwu
Software Engineer                     | 2121 Allston Way, Berkeley, CA, 94704
Teknekron Communications Systems, Inc.| (415) 649-3752

linton@sgi.com (Mark Linton) (01/29/91)

I don't see how the global-class-name problem can be solved at link time.
Suppose you have two class libraries, L1 and L2.  Suppose that they both have
String classes.  Suppose that your file a.c wants to use class A from L1 and
class B from L2.  Suppose both A and B have member functions that take
L1::String and L2::String parameters, respectively.  How is the compiler
to understand what is going on?  This problem arises regardless of whether
a.c uses the member functions in question.  Seems to me that class scoping has to be
solved during compilation.  Once it is fixed there, I think fixing link-time
is straightforward.
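
A minimal sketch of the situation described above, assuming each library's
names are wrapped in an enclosing scope.  The struct wrappers, the member
layouts, and total_length() are hypothetical and appear in neither library;
they only show that the compiler can keep the two String classes apart once
each name is qualified:

```cpp
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical stand-in for library L1: its classes live inside scope L1.
struct L1 {
    struct String { const char* text; };
    struct A {
        // Takes L1's String, never L2's.
        static std::size_t length(const String& s) { return std::strlen(s.text); }
    };
};

// Hypothetical stand-in for library L2, with a differently shaped String.
struct L2 {
    struct String { const char* bytes; std::size_t count; };
    struct B {
        // Takes L2's String, never L1's.
        static std::size_t length(const String& s) { return s.count; }
    };
};

// What a.c might contain: both String types are visible at once, and the
// qualified names tell the compiler which one each call refers to.
std::size_t total_length() {
    L1::String s1 = { "abc" };
    L2::String s2 = { "hello", 5 };
    return L1::A::length(s1) + L2::B::length(s2);
}
```

With the scoping settled in the source, the names that reach the linker are
already distinct, which is the sense in which the link-time half becomes the
easy part.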

mat@mole-end.UUCP (Mark A Terribile) (02/03/91)

>      What, with name mangling and symbol clashes, what we really need is a
> new linker.  The whole concept of name mangling just to be able to use the
> venerable UNIX linker should be tossed.  Yeah, and I can hear the screams of
> agony as some people would switch to such a linker before others.  Still,
> I've long thought name mangling is revolting.

I'm inclined to agree that we need better linkers, but I don't know how we
can get them.  It's hard to force linkers on operating systems, especially
since there are a few machines (HP3000, for example) whose architectures
require that the linker be a trusted program to ensure the security of the
operating system.

Probably the best way to coax the world into this change is to design a
new set of linker capabilities which can be implemented either by a new
linker or by a preprocessor plus the `real' linker.  You may also have to
modify assemblers, etc.  Come to think of it, the preprocessors might just
have to resort to name mangling.  (ugh!)
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile

rfg@NCD.COM (Ron Guilmette) (02/08/91)

In article <474@mole-end.UUCP> mat@mole-end.UUCP (Mark A Terribile) writes:
+>      What, with name mangling and symbol clashes, what we really need is a
+> new linker.  The whole concept of name mangling just to be able to use the
+> venerable UNIX linker should be tossed.  Yeah, and I can hear the screams of
+> agony as some people would switch to such a linker before others.  Still,
+> I've long thought name mangling is revolting.
+
+I'm inclined to agree that we need better linkers, but I don't know how we
+can get them.  It's hard to force linkers on operating systems, especially
+since there are a few machines (HP3000, for example) whose architectures
+require that the linker be a trusted program to ensure the security of the
+operating system.

I'm inclined to disagree that we need new linkers before we can rid
ourselves of the various link-time problems caused by "name mangling".

In fact, what is needed is *not* new linkers but rather new assemblers
and C compilers.

Name mangling is used in the cfront translator because it generates C and
because there are no C compilers which accept "void func (int, int)" as
an identifier.  Likewise, g++ (which generates assembly code directly)
must use name mangling because no existing assembler accepts the string
"void func (int, int)" as an identifier.

By the time the "name" strings get down into the object files, it no longer
really matters whether or not they contain "special" characters (or even
blanks).  By that time, the only tools that will be doing anything with the
"name" strings are the linker and possibly some object-file-dump program.
Those programs probably don't give a damn what characters appear in the
strings in the linker symbol table.

Just considering *real* C++ compilers for a moment (e.g. g++) we could
rid ourselves of the linking problems that name mangling causes if we
had assemblers which allowed something like:

	.alias	func__FiT1,"void func (int, int)"

This directive would allow the (mangled) name "func__FiT1" to be used 
throughout the generated assembly code, but when it came time for the
assembler to generate the object file, it would place the string
"void func (int, int)" into the object file symbol table instead of
"func__FiT1".

For C++ translators (like cfront), a similar facility at the level of the
C compiler would likewise be useful.
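
The effect of the proposed .alias directive can be sketched roughly as
follows.  This is only an illustration of the renaming step, not how any
existing assembler works; AliasTable and emit_symbol_names() are hypothetical
names invented for the sketch:

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

// Hypothetical table built from directives such as
//   .alias func__FiT1,"void func (int, int)"
// Key: the mangled name used in the assembly text.
// Value: the full declaration to place in the object file.
typedef std::map<std::string, std::string> AliasTable;

// Produce the names that would be written to the object file symbol table.
// Aliased names are replaced by their declaration; all others pass through.
std::vector<std::string> emit_symbol_names(const std::vector<std::string>& used,
                                           const AliasTable& aliases) {
    std::vector<std::string> out;
    for (std::size_t i = 0; i < used.size(); ++i) {
        AliasTable::const_iterator hit = aliases.find(used[i]);
        out.push_back(hit != aliases.end() ? hit->second : used[i]);
    }
    return out;
}
```

Given an alias from "func__FiT1" to "void func (int, int)" and the used names
"func__FiT1" and "main", the result holds "void func (int, int)" followed by
"main"; the mangled spelling never leaves the assembler.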

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.

sarima@tdatirv.UUCP (Stanley Friesen) (02/10/91)

In article <3782@lupine.NCD.COM> rfg@NCD.COM (Ron Guilmette) writes:
|Name mangling is used in the cfront translator because it generates C and
|because there are no C compilers which accept "void func (int, int)" as
|an identifier.  Likewise, g++ (which generates assembly code directly)
|must use name mangling because no existing assembler accepts the string
|"void func (int, int)" as an identifier.
|
|By the time the "name" strings get down into the object files, it no longer
|really matters whether or not they contain "special" characters (or even
|blanks).  By that time, the only tools that will be doing anything with the
|"name" strings are the linker and possibly some object-file-dump program.
|Those programs probably don't give a damn what characters appear in the
|strings in the linker symbol table.

This sounds very nice; however, these tools *do* sometimes care what characters
appear in the name strings.  In particular, the way debugging information is
maintained for sdb and dbx under UNIX is to append a type descriptor string to
the end of each symbol name.  This means that even the linker must care
about the contents of the name string, since it must link only on the 'real'
part of the name, not the type extension.  These type extensions are introduced
by a ':', which makes the natural C++ name for a member function unusable.
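
A rough sketch of why the ':' is a problem, assuming a symbol entry laid out
as "<linkage name>:<type descriptor>".  The function linkage_part() and the
descriptor strings used with it ("G1", "F1") are placeholders made up for the
illustration, not the actual sdb or dbx encoding:

```cpp
#include <cassert>
#include <string>

// Assumed layout, for illustration only: "<linkage name>:<type descriptor>".
// A linker following this convention matches on the text before the first ':'.
std::string linkage_part(const std::string& entry) {
    std::string::size_type colon = entry.find(':');
    return colon == std::string::npos ? entry : entry.substr(0, colon);
}
```

For an entry such as "count:G1" this yields "count", as intended.  For
"String::length:F1" it yields just "String", so every member function of
String would collapse onto the same linkage name.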
-- 
---------------
uunet!tdatirv!sarima				(Stanley Friesen)

mat@mole-end.UUCP (Mark A Terribile) (02/14/91)

> In article <474@mole-end.UUCP> mat@mole-end.UUCP (Mark A Terribile) writes:
> +>      What, with name mangling and symbol clashes, what we really need is a
> +> new linker.  ...

> +I'm inclined to agree that we need better linkers, but I don't know how we
> +can get them.  It's hard to force linkers on operating systems, ...

> In fact, what is needed is *not* new linkers but rather new assemblers
> and C compilers.
 
> Name mangling is used in the cfront translator because it generates C and
> because there are no C compilers which accept "void func (int, int)" as
> an identifier.  Likewise, g++ (which generates assembly code directly)
> must use name mangling because no existing assembler accepts the string
> "void func (int, int)" as an identifier.

Ron, you are absolutely right.  It's enough to have a way to get the
symbols into the object file (assuming that there aren't peculiar bugs in
the linker's algorithms that will cause it to address Never-Never Land when
punctuation is seen).  Why not post-process the object files either before
or after linking?  (I think I did hint at this somewhere recently.)  But
again, you couldn't do this on the HP3000 under MPE; the assembler too is
a trusted program.

Depending on the organization of the linker's symbol table (hash, B-tree,
whatever) teaching the linker about name spaces belonging to classes or
modules might improve performance somewhat, especially if it improves cache
or paging performance.

There's really no excuse for g++, though; GNU writes its own assembler, as
I recall.
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile

rfg@NCD.COM (Ron Guilmette) (02/17/91)

In article <1991Feb7.152041.25151@clear.com> rmartin@clear.com (Bob Martin) writes:
>
>In the future we will likely see more true C++ compilers.  These programs
>will not have to depend on the C object format.  Thus the authors could 
>generate relocatable object files of a format which specified the types
>and classes of objects.

>[... text deleted...]

>So let me encourage compiler writers everywhere to _think_ about an
>alternate format for the object files which could eventually lead us
>away from name mangling, and towards a new linker.

Two points:

First, you note that it would be good if object file formats included some
mechanism by which "typing" information about objects could be expressed.

Have you ever heard of symbolic debugging information?  It generally
includes very complete typing information and it goes into the object
file.  I think it fits the bill pretty nicely.

Now what do you want to do with that information?  (Perhaps you want a
linker that will pay some attention to that information.  If so, I'm
with you 100%).

Second point.  You encourage compiler writers to *think* about object
file formats.  Well, let me tell you that they do.  Some even do more
than that.  They join organizations (such as the recently formed UNIX
International Programming Languages Special Interest Group) to discuss
just this type of issue and to try to evolve commonly used formats (such
as ELF & DWARF) to meet the challenges of the 90's (including C++).

As a member of the UI/PLSIG myself, I encourage everyone who has an
interest in such issues to seek out such groups and to participate
in their activities and discussions.

P.S.  Membership in the UI/PLSIG (and/or its mailing list) is *not*
limited to UI members.  Membership is open to the public.  Contact
the chairman, Dan Oldman <oldman@dg-rtp.dg.com> for additional
information.

-- 

// Ron Guilmette  -  C++ Entomologist
// Internet: rfg@ncd.com      uucp: ...uunet!lupine!rfg
// Motto:  If it sticks, force it.  If it breaks, it needed replacing anyway.