vaughan%cadillac.cad.mcc.com@MCC.COM (Paul Vaughan) (04/02/91)
Several people, including myself, have asked for a name mangler. Michael Tiemann has replied that such a thing exists in the g++ file cplus-method.c, but I haven't figured out any way to use it just yet. I've been looking into the problem a bit and have come to a sticky question: exactly what input strings should a name mangler accept?
Michael's mangling function is called as part of the process of compiling code and is oriented very strongly toward that purpose. It appears to accept any legal function declaration (or the start of a definition) in the current context, in parse tree form. Any types mentioned must have been declared; default argument expressions, explicit type declarations (e.g. void foo(struct Bar);), etc. are all allowed. This makes for a very complex grammar for function declarations--for instance, it subsumes the expression grammar of C++.
I looked into the way that gdb handles name mangling. It avoids the issue by doing only name demangling. That is, when you type in a C++ function name (not a complete declaration) like "foo" to set a breakpoint, it looks through all symbols that start with foo, demangles any matches, and compares the demangled base name against the given base name. (The demangler has an option to return only the base name, instead of a full declaration with argument specs.) As an aside, this code doesn't quite work for ordinary functions in gdb 3.6, and I don't understand how it is intended to work when a function is overloaded.
The reason I wanted a name mangler was in connection with dynamic linking. I'd like to be able to specify a full declaration (but not in any context of typedefed names, and without the return type) and get out the mangled symbol. For instance, "foo(Foo, Bar*, int, int)" would give "_foo__FG3FooP3Barii" for g++-1.39. I'm wondering, is this even feasible? Would it be necessary to have built up the context of typedef'ed names?
Suppose that certain restrictions on the input format of unmangled names, such as prohibiting foo(Foo, struct Bar*, int, int), were in effect. Would there then exist a 1:1 mapping between legal mangled and unmangled names? Does anyone have a specification for the mangler that is simpler than ferreting it out of the demangler in cplus-dem.cc? How many people would be interested in having a bison grammar based mangler and demangler?
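To make the restricted input format concrete, here is a minimal sketch of a mangler for such a "function specification". The encoding rules are inferred only from the single g++-1.39 example quoted in this message ("_foo__FG3FooP3Barii"): "i" for int, "P" for a pointer, a length-prefixed name for a class, and a "G" prefix when the class is passed by value. These rules are assumptions, not a specification of what g++ actually does, and the names mangle, encode_type, and trim are hypothetical.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Remove leading and trailing whitespace.
static std::string trim(const std::string& s) {
    std::size_t b = 0, e = s.size();
    while (b < e && std::isspace(static_cast<unsigned char>(s[b]))) ++b;
    while (e > b && std::isspace(static_cast<unsigned char>(s[e - 1]))) --e;
    return s.substr(b, e - b);
}

// Encode one parameter type. `by_value` is true only at the top level,
// where a class name gets the "G" prefix seen in the quoted example.
// ASSUMPTION: rules are guessed from one example, not from the g++ sources.
static std::string encode_type(std::string t, bool by_value = true) {
    t = trim(t);
    if (t.empty()) throw std::invalid_argument("empty type");
    if (t.back() == '*') {                 // pointer: "P" + pointee
        t.pop_back();
        return "P" + encode_type(t, false);
    }
    if (t == "int") return "i";
    // Anything else must be a plain identifier naming a user-defined type.
    // This rejects "struct Bar", typedef-style declarators, and so on.
    if (std::isdigit(static_cast<unsigned char>(t[0])))
        throw std::invalid_argument("not an identifier: " + t);
    for (char c : t)
        if (!std::isalnum(static_cast<unsigned char>(c)) && c != '_')
            throw std::invalid_argument("not an identifier: " + t);
    return (by_value ? "G" : "") + std::to_string(t.size()) + t;
}

// Mangle a specification such as "foo(Foo, Bar*, int, int)".
// Empty parameter lists are rejected because the thread does not show
// how they would be encoded.
std::string mangle(const std::string& spec) {
    std::size_t open = spec.find('(');
    std::size_t close = spec.rfind(')');
    if (open == std::string::npos || close == std::string::npos || close < open)
        throw std::invalid_argument("expected name(args)");
    std::string out = "_" + trim(spec.substr(0, open)) + "__F";
    std::string args = spec.substr(open + 1, close - open - 1);
    std::size_t start = 0;
    while (true) {
        std::size_t comma = args.find(',', start);
        std::size_t len = (comma == std::string::npos) ? std::string::npos
                                                       : comma - start;
        out += encode_type(args.substr(start, len));
        if (comma == std::string::npos) break;
        start = comma + 1;
    }
    return out;
}
```

With these assumed rules, mangle("foo(Foo, Bar*, int, int)") reproduces the symbol quoted above, and the prohibited form "foo(Foo, struct Bar*, int, int)" is rejected with std::invalid_argument. Basic types other than int, references, const, and member functions are left out because the thread gives no examples of their encodings.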
tiemann@CYGNUS.COM (Michael Tiemann) (04/02/91)
Does anyone have a specification for the mangler that is simpler than ferreting it out of the demangler in cplus-dem.cc?
You need to do the whole job because of typedefs. I.e.,
typedef int foo;
typedef int bar;
foo f (bar);
mangles to the same thing that
bar f (foo);
mangles to.
Michael
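The typedef point can be checked at the language level: a typedef introduces an alias, not a new type, so both declarations name the same function type and must therefore share one mangled name. The sketch below uses std::is_same from C++11, which postdates this thread; the helper typedefs_are_transparent is a hypothetical name added for illustration.

```cpp
#include <cassert>
#include <type_traits>

typedef int foo;
typedef int bar;

// These are two declarations of the SAME function, not two overloads,
// because foo and bar are both aliases for int.
foo f(bar);
bar f(foo);

// The function types are identical once the typedefs are resolved.
static_assert(std::is_same<foo(bar), bar(foo)>::value,
              "foo f(bar) and bar f(foo) have the same type");
static_assert(std::is_same<foo(bar), int(int)>::value,
              "both are simply int(int)");

// Run-time form of the same check (hypothetical helper).
bool typedefs_are_transparent() {
    return std::is_same<foo(bar), bar(foo)>::value &&
           std::is_same<foo(bar), int(int)>::value;
}
```

A mangler that sees only the text "f(bar)" cannot know that bar means int unless it has the typedef context, which is why a text-only tool has to either track typedefs or forbid them in its input.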
vaughan%cadillac.cad.mcc.com@MCC.COM (Paul Vaughan) (04/03/91)
You need to do the whole job because of typedefs. I.e.,
typedef int foo;
typedef int bar;
foo f (bar);
I was thinking that in the "function specification language", typedefs
in this sense would not be allowed. Even though foo might be declared
foo f(bar);
in some source code, it would have to be declared
int f(int)
in a function specification to be accepted by the mangler. Note that
there are other differences between this function specification
language and C++. For instance,
class Foo {
int foo(Foo*);
};
int Foo::foo(Foo*);
isn't a valid declaration in C++. (Oooh, speaking of valid C++, note
that the above is accepted by g++-1.39 but not by cfront 2.0--bug?)
I was thinking that any identifier (name other than reserved words,
symbols, or basic types) would be assumed to directly name a user
defined type. Typedefed aliases or full anonymous struct definitions
would not be accepted.
It seems clear that one requirement would be that the mangler be able
to accept any output generated by the demangler and vice versa. I
think the simplifications I'm making are consistent with that. However
it's not clear what other requirements exist for a useful tool. For
instance, these specs would not necessarily let you directly use
pieces of source code or output from the compiler as input to the
mangler. I don't see any way of creating such a mangler without
analyzing the entire source code for a module and that's significantly
more work than I want the mangler to do. Aside from compilation, it
seems the reason most people have cited for wanting a mangler is for
dynamic loading. I'm not sure if these simplifications would
adequately support that.
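The round-trip requirement can be sketched for the restricted specification language as a small demangler. As before, the encoding ("i" for int, "P" for pointer, an optional "G" before a length-prefixed class name) is an assumption inferred from the one g++-1.39 symbol quoted earlier in the thread, and the names demangle and decode_type are hypothetical. The base_only flag mimics the base-name option that the gdb demangler is described as having.

```cpp
#include <cassert>
#include <cctype>
#include <stdexcept>
#include <string>

// Decode one type starting at sym[pos] and advance pos past it.
// ASSUMPTION: only the codes seen in "_foo__FG3FooP3Barii" are handled.
static std::string decode_type(const std::string& sym, std::size_t& pos) {
    if (pos >= sym.size()) throw std::invalid_argument("truncated symbol");
    if (sym[pos] == 'i') { ++pos; return "int"; }
    if (sym[pos] == 'P') { ++pos; return decode_type(sym, pos) + "*"; }
    if (sym[pos] == 'G') ++pos;            // class passed by value
    if (pos < sym.size() && std::isdigit(static_cast<unsigned char>(sym[pos]))) {
        std::size_t len = 0;               // read the decimal length prefix
        while (pos < sym.size() &&
               std::isdigit(static_cast<unsigned char>(sym[pos])))
            len = len * 10 + static_cast<std::size_t>(sym[pos++] - '0');
        if (len == 0 || len > sym.size() - pos)
            throw std::invalid_argument("bad class name length");
        std::string name = sym.substr(pos, len);
        pos += len;
        return name;
    }
    throw std::invalid_argument("unsupported type code");
}

// Turn "_foo__FG3FooP3Barii" back into "foo(Foo, Bar*, int, int)".
// With base_only set, return just "foo".
std::string demangle(const std::string& sym, bool base_only = false) {
    std::size_t sep = sym.find("__F");
    if (sym.empty() || sym[0] != '_' || sep == std::string::npos || sep < 2)
        throw std::invalid_argument("not a mangled function symbol");
    std::string out = sym.substr(1, sep - 1);
    if (base_only) return out;
    std::size_t pos = sep + 3;
    if (pos == sym.size())
        throw std::invalid_argument("empty parameter list not supported");
    out += "(";
    bool first = true;
    while (pos < sym.size()) {
        if (!first) out += ", ";
        out += decode_type(sym, pos);
        first = false;
    }
    return out + ")";
}
```

Under these assumptions, demangle("_foo__FG3FooP3Barii") yields "foo(Foo, Bar*, int, int)", which is exactly the form the proposed mangler would accept as input. A function whose own name contains "__F" would confuse this sketch, which hints at why a real specification is needed before a 1:1 mapping can be claimed.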
tiemann@CYGNUS.COM (Michael Tiemann) (04/03/91)
I think the way to handle dynamic loading is related to the way that parameterized types must be handled. I would like to see discussion about how the linker and compiler should communicate to handle both jobs with equal facility. If we can get this working, then using the name mangler that comes with the compiler will be a simple application of software reuse.
Michael