[comp.lang.c++] Naming Conventions

jimad@microsoft.UUCP (Jim ADCOCK) (01/09/91)

As the library committee begins to propose classes and libraries to
standardize, I wish to mention a subject that will probably provoke
howls of anguish from all involved, namely: Naming Conventions.

My suggestion is that [besides being almost impossible to agree on]
naming conventions need to be agreed upon in order to have standardized
software, and that the exact names chosen for standard libraries and
classes are important to their reusability.  Some names suggested for
use in standard libraries to date include, I believe:

complex
iostream
istream
ostream
String

which seem to represent at least three different styles of naming.

---

A humble suggestion: "simple" standard classes, that follow value 
semantics, and are intended as extensions to the built-in primitive
types of the language, should be named in a manner as similar as possible
to the built-in types of the language.  "Good" names for "simple" standard
classes then, might include:

complex
stream
string

Preferred names for these "not-built-in primitive-like" classes, then,
would be lower case, contain no underbars, and be one word or one
abbreviated word.  The extended set of primitive and primitive-like
types available to C++ programmers would then include:

int 
long 
char
double
unsigned
complex
stream
string

---

I do not here suggest what proper naming conventions should be for
standard classes that are _not_ intended as "simple" extensions to
the built-in primitive types, but rather are intended to be used
in an "object oriented" manner, accessed via pointer or reference,
in a polymorphic manner.  The de facto standard for such classes seems
to me to be:  whole words, first letter capitalized, word breaks 
capitalized, no underbars:

Object
Set
Bag
MultipleWordName

I believe if the committee did decide that "simple" value semantics standard
classes should be all lower case like built-in types, whereas "complicated"
standard classes intended to be used with derivation, polymorphism, etc,
start with a capital, then this would be _a_ good choice, helpful to 
C++ programmers by clarifying the difference in uses of these two 
types of classes.
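
[A minimal sketch of the distinction, for illustration only.  The
classes `rational`, `Object`, and `Bag` and all of their members are
hypothetical; they are not proposed standard classes.  The lower-case
class is used by value like a built-in type, while the capitalized
classes are used through a reference in a polymorphic manner.]

```cpp
#include <string>

// "Simple" value-semantics class: lower case, like a built-in type.
// Hypothetical; no reduction to lowest terms, to keep the sketch short.
class rational {
public:
    rational(long n = 0, long d = 1) : num_(n), den_(d) {}
    long numerator() const { return num_; }
    long denominator() const { return den_; }
private:
    long num_, den_;
};

// Values are copied and combined directly, as with int or double.
inline rational operator+(const rational& a, const rational& b) {
    return rational(a.numerator() * b.denominator() +
                    b.numerator() * a.denominator(),
                    a.denominator() * b.denominator());
}

// "Complicated" classes meant for derivation and polymorphism: capitalized.
class Object {
public:
    virtual ~Object() {}
    virtual std::string className() const { return "Object"; }
};

class Bag : public Object {
public:
    virtual std::string className() const { return "Bag"; }
};

// Polymorphic objects are reached through a reference or pointer.
inline std::string describe(const Object& o) { return o.className(); }
```

With these definitions, `rational(1, 2) + rational(1, 3)` yields a new
value, while `describe()` dispatches on the dynamic type of its argument.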

jimad@microsoft.UUCP (Jim ADCOCK) (01/09/91)

A second, harder problem in naming conventions is how to choose the syntax
and names to be used in protocols.  Consider the "simple" problem of 
"setting" or "getting" a subpart of an object.  Below I list a few of
the design choices I have seen to implement these actions:


Set part of a thing to something:

thing.part = something;
thing.part() = something;
thing->part = something;
thing->part() = something;
thing.setPart(something);
thing->setPart(something);
thing.SetPart(something);
thing->SetPart(something);
thing.set_part(something);
thing->set_part(something);
thing.part.set(something);
thing->part.set(something);
thing->part->set(something);
thing.part = *something;
thing.part() = *something;
thing->part = *something;
thing->part() = *something;
thing.setPart(*something);
thing->setPart(*something);
thing.SetPart(*something);
thing->SetPart(*something);
thing.set_part(*something);
thing->set_part(*something);
thing.part.set(*something);
thing->part.set(*something);
thing->part->set(*something);
thing.part = &something;
thing.part() = &something;
thing->part = &something;
thing->part() = &something;
thing.setPart(&something);
thing->setPart(&something);
thing.SetPart(&something);
thing->SetPart(&something);
thing.set_part(&something);
thing->set_part(&something);
thing.part.set(&something);
thing->part.set(&something);
thing->part->set(&something);
thing.part(something);
thing.part(&something);
thing.part(*something);
thing->part(something);
thing->part(&something);
thing->part(*something);
something = thing->QueryPart();
something = thing->PartQuery();


Get part of a thing:

something = thing.part;
something = thing.part();
something = thing->part;
something = thing->part();
something = thing.getPart();
something = thing->getPart();
something = thing.GetPart();
something = thing->GetPart();
something = thing.get_part();
something = thing->get_part();
something = thing.part.get();
something = thing->part.get();
something = thing->part->get();
something = *thing.part;
something = *thing.part();
something = *thing->part;
something = *thing->part();
something = *thing.getPart();
something = *thing->getPart();
something = *thing.GetPart();
something = *thing->GetPart();
something = *thing.get_part();
something = *thing->get_part();
something = *thing.part.get();
something = *thing->part.get();
something = *thing->part->get();
something = &thing.part;
something = &thing.part();
something = &thing->part;
something = &thing->part();
something = &thing.getPart();
something = &thing->getPart();
something = &thing.GetPart();
something = &thing->GetPart();
something = &thing.get_part();
something = &thing->get_part();
something = &thing.part.get();
something = &thing->part.get();
something = &thing->part->get();
something = thing.part();
something = &thing.part();
something = *thing.part();
something = thing->part();
something = &thing->part();
something = *thing->part();
thing.getPart(something);
thing->getPart(something);
thing.getPart(&something);
thing->getPart(&something);
thing.getPart(*something);
thing->getPart(*something);

-----

A humble partial suggestion to help restrict the variety of syntax choices 
possible in implementing standard library protocols:

_Please_ don't use the get, set, or let "noise words."

-----


PS:

my personal favorites [for what _that's_ worth] are:

// to "set" a part

thing.part(something);

// to "get" a part

something = thing.part();
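
[One way this pair of calls could be supported is by overloading a
single member name, sketched below.  The class `thing`, its `part`
member, and the choice of std::string for the part are all assumptions
made for the sake of the example.]

```cpp
#include <string>

// Hypothetical class: one overloaded member name serves as both the
// reader and the writer, with no get/set "noise words".
class thing {
public:
    thing() : part_() {}

    // to "get" a part
    const std::string& part() const { return part_; }

    // to "set" a part
    void part(const std::string& something) { part_ = something; }

private:
    std::string part_;
};
```

The compiler picks the overload from the argument list, so the reader
and the writer cannot be confused at the call site.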

davidm@uunet.UU.NET (David S. Masterson) (01/10/91)

>>>>> On 8 Jan 91 20:28:25 GMT, jimad@microsoft.UUCP (Jim ADCOCK) said:

Jim> As the library committee begins to propose classes and libraries to
Jim> standardize, I wish to mention a subject that will probably provoke howls
Jim> of anguish from all involved, namely: Naming Conventions.

True, everyone seems to have their own convention -- sometimes influenced by
utilities like indent, cref, cfunc, etc.

Jim> My suggestion is that [besides being almost impossible to agree on]
Jim> naming conventions need to be agreed upon in order to have standardized
Jim> software, and that the exact names chosen for standard libraries and
Jim> classes are important to their reusability.

Knowing the names of objects promotes their reuse.  Without some convention,
finding an object to reuse in a series of massive libraries is like trying to
find the meaning of a word in a dictionary when you don't know how to spell
the word.

Jim> I believe if the committee did decide that "simple" value semantics
Jim> standard classes should be all lower case like built-in types, whereas
Jim> "complicated" standard classes intended to be used with derivation,
Jim> polymorphism, etc, start with a capital, then this would be _a_ good
Jim> choice, helpful to C++ programmers by clarifying the difference in uses
Jim> of these two types of classes.

Yet another humble suggestion:

No matter what naming convention is chosen, someone (probably a lot of
someones) somewhere will not understand the convention and, therefore, have
trouble (re)using the library because of the dictionary principle above.  This
was recognized as unavoidable with dictionaries and, so, thesauruses and the
like came into being (not a perfect solution, but it definitely helps).

Therefore, I suggest adopting a thesaurus principle in the naming of objects
and, I believe, that the C Preprocessor is the mechanism to implement it.  In
laying out a C++ library of objects for others to (re)use, each object will be
declared somewhere in some include file.  Users of the library will include
this declaration file and then begin making use of these objects.  If the
include file has provisions for a C preprocessor "thesaurus", then programs
akin to Cref can be developed that can not only produce a list of object
declarations, but a cross-referenced thesaurus of possible other names for the
objects.

For instance:

================
// File - complex.h
#ifndef COMPLEX_H
#define COMPLEX_H

#include "complex.ths"

class complex {
	// ...
};
#endif
================
// File - complex.ths
#ifndef THESAURUS_OFF
#ifndef COMPLEX_THS
#define COMPLEX_THS

#define Complex complex
#define COMPLEX complex
#define ThisIsComplex complex
// ...

#endif
#endif
================

With this convention and a proper tool for browsing these definitions, users
of the library should be able to find an object regardless of the convention
they prefer (assuming the thesaurus has been extended to include their
particular convention).  This convention also has the potential to allow for
an open libraries convention (preventing library vendor lock-in) as it allows
a user to use libraries with different naming conventions but similar
functionality simply by adjusting the thesaurus.

The C preprocessor doesn't go far enough in this scheme because for this
scheme to be truly useful, it should allow the renaming of member functions
and data elements within the object.  The C preprocessor has no way of
determining the difference between an add() function that is a member of the
complex object and an add() function that is a member of the real object.
This may be an area that the standards committee could address.
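
[The limitation can be sketched as follows.  The classes `complex_num`
and `real_num` and the alias `combine` are hypothetical names invented
for this example.  The point is only that a #define is applied to every
matching token, whatever class it belongs to.]

```cpp
// Hypothetical classes: two libraries spell "addition" differently.
struct complex_num { int add(int a, int b) const { return a + b; } };
struct real_num    { int sum(int a, int b) const { return a + b; } };

// Thesaurus entry: let users write combine() for complex_num::add().
#define combine add

inline int via_complex() {
    complex_num c;
    return c.combine(1, 2);      // expands to c.add(1, 2)
}

inline int via_real() {
    real_num r;
    // r.combine(1, 2) would also expand to r.add(1, 2) and fail to
    // compile: the macro cannot be told to mean sum() for real_num.
    return r.sum(1, 2);
}

#undef combine
```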

Comments?
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

chip@tct.uucp (Chip Salzenberg) (01/11/91)

According to cimshop!davidm@uunet.UU.NET (David S. Masterson):
>// File - complex.ths
>#ifndef THESAURUS_OFF
>#ifndef COMPLEX_THS
>#define COMPLEX_THS
>
>#define Complex complex
>#define COMPLEX complex
>#define ThisIsComplex complex
>// ...
>
>#endif
>#endif

Aargh!  It's hard enough to read a mix of other people's code, without
making one class available under <n> names.
-- 
Chip Salzenberg at Teltronics/TCT     <chip@tct.uucp>, <uunet!pdn!tct!chip>
       "If Usenet exists, then what is its mailing address?"  -- me
             "c/o The Daily Planet, Metropolis."  -- Jeff Daiell

mat@mole-end.UUCP (Mark A Terribile) (01/11/91)

> Jim> As the library committee begins to propose classes and libraries to
> Jim> standardize, I wish to mention a subject that will probably provoke howls
> Jim> of anguish from all involved, namely: Naming Conventions.
  . . .
> Jim> My suggestion is that [besides being almost impossible to agree on]
> Jim> naming conventions need to be agreed upon in order to have standardized
> Jim> software, and that the exact names chosen for standard libraries and
> Jim> classes are important to their reusability.
 
> Knowing the names of objects promotes their reuse.  Without some convention,
> finding an object to reuse in a series of massive libraries is like trying to
> find the meaning of a word in a dictionary when you don't know how to spell
> the word.

It is my belief, and perhaps mine alone, that if we are working with the
Standard, then just about everything in the `library' ought to be provided
by parameterization.  Even when it's not quite clear what the ideal
parameterization is, the matter should be considered carefully.

Then the problem is not what the types are named, but what the templates are
named.  Of course, the types must be created, too, perhaps by a standard--but
excludable--header of some sort.

My own preference for user-defined types is to capitalize the initial letter
or abbreviation, and no more.  I don't yet know what to make of templates;
they are likely to be interesting no matter what.

For objects, if they must be created visible to the user, I strongly prefer
lower case unless there is some good reason for the upper case (e.g. the
`word' is an acronym).
-- 

 (This man's opinions are his own.)
 From mole-end				Mark Terribile

davidm@uunet.UU.NET (David S. Masterson) (01/12/91)

>>>>> On 11 Jan 91 01:39:51 GMT, chip@tct.uucp (Chip Salzenberg) said:

Chip> According to cimshop!davidm@uunet.UU.NET (David S. Masterson):

David> // File - complex.ths 
David> #ifndef THESAURUS_OFF
David> #ifndef COMPLEX_THS
David> #define COMPLEX_THS
David>
David> #define Complex complex
David> #define COMPLEX complex
David> #define ThisIsComplex complex
David> // ...
David>
David> #endif
David> #endif

Chip> Aargh!  It's hard enough to read a mix of other people's code, without
Chip> making one class available under <n> names.

But, if you had to maintain a system of a million LOC and were changing
libraries somewhere along the line, it would be very nice to change old
references to the new style in one fell swoop rather than hunt through the
code.

One thing I'm allowing for here is the potential that somewhere along the line
in the development of C++, vendors are going to begin vending C++ objects that
will act as building blocks for code that application developers develop.  If
the application has a long lifetime, it's likely that the libraries that the
application depends on will change (not only version changes to one vendor's
library, but also changing vendors).  Allowing for this may mean coping with
different vendors' naming strategies.  Even with an extremely stringent naming
convention, it's likely that different vendors will interpret it different ways
(can you say 'vendor lock-in'?).  Therefore, I'm in favor of adopting a
strategy that allows this and yet allows the user to correct the differences
if need be.

I also think, though, that the thesaurus approach allows users of repositories
of objects (many, many libraries) to more easily find candidate objects for
reuse in their application.  Initially, the thesaurus may be very lean in
alternate phrases for an idea, but, as the use of the library grows, the
thesaurus can be filled in.  Some tool could be developed to even automate
this.
--
====================================================================
David Masterson					Consilium, Inc.
(415) 691-6311					640 Clyde Ct.
uunet!cimshop!davidm				Mtn. View, CA  94043
====================================================================
"If someone thinks they know what I said, then I didn't say it!"

mjv@objects.mv.com (Michael J. Vilot) (01/13/91)

Jim Adcock suggested X3J16 adopt a consistent approach to naming the classes in
the standard library for C++.  He also suggests a naming convention for the 
public member functions of these classes.

We discussed exactly this topic within the library group at November's meeting.
I agree with Jim that the naming is an important aspect of the usability of the
classes.  I even agree with the proposed conventions he suggested, but others
may differ.

There are other library issues to discuss, such as how we can tell (in the 
presence of inheritance and access specifiers) when a vendor's implementation
``conforms'' to the standard; and just how much of the extant I/O streams code
represents ``existing practice'' that should be supported; and how to apply
templates and exceptions to the standard library.

I hope folks will contribute their thoughts to the discussion.  comp.std.c++
seems to be a relevant forum for the discussion.

--
Mike Vilot,  ObjectWare Inc, Nashua NH
mjv@objects.mv.com  (UUCP:  ...!decvax!zinn!objects!mjv)

keffert@jacobs.CS.ORST.EDU (Thomas Keffer) (01/13/91)

In article <278D1767.505@tct.uucp> chip@tct.uucp (Chip Salzenberg) writes:
>According to cimshop!davidm@uunet.UU.NET (David S. Masterson):
>>// File - complex.ths
>>#ifndef THESAURUS_OFF
>>#ifndef COMPLEX_THS
>>#define COMPLEX_THS
>>
>>#define Complex complex
>>#define COMPLEX complex
>>#define ThisIsComplex complex
>>// ...
>>
>>#endif
>>#endif
>
>Aargh!  It's hard enough to read a mix of other people's code, without
>making one class available under <n> names.

Indeed!  In fact, there are more subtle differences between the various types
of complex than their names: some pass arguments by reference into functions,
some by value, etc.   This matters when one must pass a pointer to the
function for use by "forEach" functions iterating over vector elements, etc.
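
[A sketch of why the calling convention matters here.  The type `cplx`,
the typedef `cplx_visitor`, and this `forEach` are hypothetical; they do
not reproduce any vendor's actual interface.]

```cpp
// Hypothetical stand-in for a vendor's complex type.
struct cplx { double re, im; };

// The iterator fixes one calling convention in its function-pointer type.
typedef void (*cplx_visitor)(const cplx&);

inline void forEach(const cplx* v, int n, cplx_visitor f) {
    for (int i = 0; i < n; ++i) f(v[i]);
}

double total_re = 0.0;

// Matches cplx_visitor: argument passed by reference.
void addRe(const cplx& c) { total_re += c.re; }

// Same job, argument passed by value: a different function type.
void addReByValue(cplx c) { total_re += c.re; }

// forEach(v, n, addReByValue);  // would not compile: void (*)(cplx)
//                               // does not convert to cplx_visitor
```

Renaming one class to another does nothing about this mismatch; code
written against the by-value library would need new callback functions.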

-tk
___
Thomas Keffer
Rogue Wave
PO Box 2328; Corvallis, OR 97330
(503) 745-5908

jimad@microsoft.UUCP (Jim ADCOCK) (01/15/91)

In article <CIMSHOP!DAVIDM.91Jan9105408@uunet.UU.NET> cimshop!davidm@uunet.UU.NET (David S. Masterson) writes:
....
|Therefore, I suggest adopting a thesaurus principle in the naming of objects
|and, I believe, that the C Preprocessor is the mechanism to implement it.  
....
|Comments?

Supposedly the committee is working on ways to get the CPP out of standard
header files as much as possible -- a good idea, in my opinion.

jhc@irwin.uucp (James H. Coombs) (01/18/91)

This posting did not make it out when I first wrote it, so it may not
fit clearly into current discussion.  Nonetheless...

In article <60352@microsoft.UUCP> you write:
>As the library committee begins to propose classes and libraries to
>standardize, I wish to mention a subject that will probably provoke
>howls of anguish from all involved, namely: Naming Conventions.

This is extremely important.  I want to stress the need for names that
minimize the likelihood of collisions.

I have lost a lot of time because one developer or another found it
convenient to do things like:

        #define left someVal

Recently, I had a class with something like

        static const u_short kOrig;

Since the constant was known only within the scope of my class, I
thought that it was as well protected from collisions as possible.  But
it turned out that a developer had done something like:

        #define kOrig 8

I now feel that my names are safer if I use a package prefix.  Others
are less likely to use the name 'kFsOrig', for example.  Certainly, the
problem has to do with names chosen, not just with the practice of
using the preprocessor to define constants.

>I do not here suggest what proper naming conventions should be for
>standard classes that are _not_ intended as "simple" extensions to
>the built-in primitive types, but rather are intended to be used
>in an "object oriented" manner, accessed via pointer or reference,
>in a polymorphic manner.  The de facto standard for such classes seems
>to me to be:  whole words, first letter capitalized, word breaks
>capitalized, no underbars:
>
>Object
>Set
>Bag
>MultipleWordName

I recently renamed every class in our development tree because we
started linking to a Motif library that defines Object.  That's one of
those names that everyone wants to use for their base class, and it
broke our code.

These capitalization conventions are not sufficient.  We have a
long-standing tradition of using initial letters to help identify the type
of entity that is being named.  This comes from Apple's MacApp, I
believe.  For example:

        constants begin with 'k'
        globals begin with 'g'
        statics begin with 'q'
        classes begin with 'T'

Even this is not enough.  Many people work with files, and we can
predict that many people will find the class name 'TFile' appropriate.

We have established a convention of packages with prefixes.  For
example, our file system monitor has the package prefix 'Fs'.  Our
Envoy project has the package prefix 'Env'.  All names in the global
name space must begin with the package prefix.  Also, I switched to 'c'
for class names because 'T' is a capital letter and not as
mnemonic.  So, now the file classes are:

        cFsFile
        cEnvFile

This significantly reduces the likelihood of conflict both in house and
with externally developed code.
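
[A small sketch of how such prefixed names might look in source.  Only
the names cFsFile, cEnvFile, and kFsOrig come from the text above; the
member functions, the global, and the values are made up for
illustration.]

```cpp
#include <string>

// Package "Fs" (file system monitor): every global name starts with Fs
// after the entity-kind letter.
const int kFsOrig = 8;                 // constant: 'k' + package prefix
int gFsOpenCount = 0;                  // global:   'g' + package prefix

class cFsFile {                        // class:    'c' + package prefix
public:
    std::string package() const { return "Fs"; }
};

// Package "Env" (Envoy) can define its own file class without a clash.
class cEnvFile {
public:
    std::string package() const { return "Env"; }
};
```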

I believe that this sort of naming should be enforced with a heavy
hand.  It is extremely expensive to chase down problems that are caused
by naming conflicts.  Commercial vendors should make a special effort
to address these issues.

--Jim

jimad@microsoft.UUCP (Jim ADCOCK) (01/18/91)

In article <1991Jan12.225851.6764@usenet@scion.CS.ORST.EDU> keffert@jacobs.CS.ORST.EDU (Thomas Keffer) writes:

|Indeed!  In fact, there are more subtle differences between the various types
|of complex than their names: some pass arguments by reference into functions,
|some by value, etc.   This matters when one must pass a pointer to the
|function for use by "forEach" functions iterating over vector elements, etc.

Good point, I had not thought of this problem.  In any case, if we are
to have standardized libraries, then they represent in some sense extensions
to the language, and should have agreed upon names and interfaces.  If
we agree not to agree on these things, well, then we have an agreement --
but not a standard.

My interest in naming conventions came from considering what design 
decisions _cannot_ be encapsulated within a class, and the answer seems
to be mainly class names and interface protocols, though exception
handling and memory management issues are sure to be involved too.

I don't think people should under-estimate the difficulty in resolving
these issues.  It seems like every different organization chooses a 
different style of programming in C++ -- and swears that their approach
is _the one true way_.

steve@taumet.com (Stephen Clamage) (01/19/91)

jhc@irwin.uucp (James H. Coombs) writes:

>Recently, I had a class with something like

>        static const u_short kOrig;

>Since the constant was known only within the scope of my class, I
>thought that it was as well protected from collisions as possible.  But
>it turned out that a developer had done something like:

>        #define kOrig 8

This is a primary reason not to use #define's in C++ code.  Constants
and inline functions give you what you need for anything but conditional
compilation.
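
[A minimal sketch of the replacement, assuming nothing beyond the
language itself.  The function `square` and the class `Record` are
hypothetical names chosen for the example.]

```cpp
// Instead of:  #define kOrig 8
const unsigned short kOrig = 8;              // typed, obeys scope rules

// Instead of:  #define SQUARE(x) ((x) * (x))
inline int square(int x) { return x * x; }   // argument evaluated once

// A class-scoped constant with the same name does not clash with
// ::kOrig, because neither one is a preprocessor symbol.
class Record {                               // hypothetical class
public:
    enum { kOrig = 16 };
};
```

Unlike the macro form, `square(i++)` increments `i` only once, and
`Record::kOrig` remains distinct from the global `kOrig`.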
-- 

Steve Clamage, TauMetric Corp, steve@taumet.com

sdm@cs.brown.edu (Scott Meyers) (01/23/91)

In article <1568@tcs.tcs.com> gwu@nujoizey.tcs.com (George Wu) writes:
>     Unfortunately, it is often the case that one is using existing C
>libraries from new C++ code.  For example, I'm at this instant trying to
>rationalize no less than three definitions of a boolean type: HP OpenViews,
>Motif, and our own internal library.  It isn't too bad, since only the
>OpenViews code and our own library have an actual symbol clash.  They both
>define the types "boolean," whereas Motif defines "Boolean."
>
>     So in my case, it's just a matter of changing our code.  If it'd been
>Motif and OpenViews code which conflicted, I wouldn't have been able to
>modify the code, so there'd be #defines wrapping the include statements of
>their header files.  Yeah, I know, pretty gross.  Has anyone thought of a
>better solution for this problem?

I don't know how to solve this problem, but I think when nested classes are
more widely available, big libraries should come out as a single class
(used to demarcate a name space) with all the "real" classes nested inside:

    class Motif {
      public:
        class realMotifClass1 { ... };
        class realMotifClass2 { ... };
        ...
    };

    class OpenViews {
      public:
        class realOpenViewsClass1 { ... };
        class realOpenViewsClass2 { ... };
        ...
    };

This should reduce name conflicts substantially: Motif::boolean is quite
different than OpenViews::boolean.  On the other hand, this makes
coding applications using the libraries more cumbersome:

    Motif::realMotifClass1 object1(x, y, z);
    object1.function1(Motif::true);

    OpenViews::realOpenViewsClass1 object2;
    object2.function2(OpenViews::true);
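
[A compilable sketch of the same idea, under the assumption that the
compiler supports nested classes and nested typedefs.  The outer class
names follow the example above, but every member shown is invented; none
of this is the real Motif or OpenViews interface.]

```cpp
// Hypothetical stand-ins for two vendor libraries.  Each outer class
// exists only to demarcate a name space.
class Motif {
public:
    typedef unsigned char boolean;
    class Widget {
    public:
        Widget() : visible(1) {}
        boolean visible;
    };
};

class OpenViews {
public:
    typedef int boolean;
    class Widget {
    public:
        Widget() : visible(0) {}
        boolean visible;
    };
};

// A typedef can shorten the qualified name at the point of use.
typedef Motif::Widget MWidget;
```

The two `boolean` types and the two `Widget` classes coexist because
each is qualified by its enclosing class.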

It would be nice if we could use this mechanism right now to solve the
problems that have been mentioned:

    class MotifClasses {
      public:
      #include "motif.h"
    };

    class OpenViewsClasses {
      public:
      #include "openviews.h"
    };

But then the mangled names generated from the .h files wouldn't match the
mangled names present in the .o files.  Sigh.

Scott


-------------------------------------------------------------------------------
What do you say to a convicted felon in Providence?  "Hello, Mr. Mayor."

jimad@microsoft.UUCP (Jim ADCOCK) (01/24/91)

In article <62211@brunix.UUCP> sdm@cs.brown.edu (Scott Meyers) writes:
|It would be nice if we could use this mechanism right now to solve the
|problems that have been mentioned:
|
|    class MotifClasses {
|      public:
|      #include "motif.h"
|    };
|
|    class OpenViewsClasses {
|      public:
|      #include "openviews.h"
|    };
|
|But then the mangled names generated from the .h files wouldn't match the
|mangled names present in the .o files.  Sigh.

It would seem that we need linkers that allow renaming of classes etc
at link time.  One way to "automate" this would be if you could tell
the linker to imply a libname:: extension to the classes implemented
in that lib.  Thus, assuming two libraries aren't named the same
[and you _can_ rename them by changing the name of the file containing
the library] you could resolve the conflicts.

But, name conflicts are only a small part of the conflicts that occur
when trying to use libraries from multiple vendors.  A bigger problem
is the discord that results from trying to use libraries of different
programming style and intent.  Also memory management schemes tend to
conflict.

I agree that a good, simple way to resolve name conflicts is going to
be necessary.  But, we also need to figure out some way to get to a 
consensus of a good, standard, base set of C++ libraries for people
to build on.  [However, I can imagine no way to do this.]

imp@Solbourne.COM (Warner Losh) (01/25/91)

In article <70185@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
->I agree that a good, simple way to resolve name conflicts is going to
->be necessary.  But, we also need to figure out some way to get to a 
->consensus of a good, standard, base set of C++ libraries for people
->to build on.  [However, I can imagine no way to do this.]


Our X toolkit (OI) prefixes all of the identifiers with either an oi_
or an OI_.  This seems to avoid many of the problems that people have
with 2 libraries defining stuff like Base.

Warner
-- 
Warner Losh		imp@Solbourne.COM
We sing about Beauty and we sing about Truth at $10,000 a show.

jimad@microsoft.UUCP (Jim ADCOCK) (01/30/91)

In article <1991Jan25.005440.27042@Solbourne.COM> imp@Solbourne.COM (Warner Losh) writes:
|In article <70185@microsoft.UUCP> jimad@microsoft.UUCP (Jim ADCOCK) writes:
|->I agree that a good, simple way to resolve name conflicts is going to
|->be necessary.  But, we also need to figure out some way to get to a 
|->consensus of a good, standard, base set of C++ libraries for people
|->to build on.  [However, I can imagine no way to do this.]
|
|
|Our X toolkit (OI) prefixes all of the identifiers with either an oi_
|or an OI_.  This seems to avoid many of the problems that people have
|with 2 libraries defining stuff like Base.

Yes, programmer supplied prefixes suffice in practical terms to avoid
name collisions across vendors -- assuming vendors write using the same
memory models, don't override global new, delete, etc, then presumably a 
library user has a vague possibility of getting libraries from two vendors
to coexist -- separately -- in one program.  Still, it seems silly for
programmers to have to manually provide name mangling in the program source
while simultaneously the compiler is providing automatic name mangling
to avoid collisions for the kinds of name overloading C++ supports.  Why
not just support module names in C++ as an additional factor in name 
overloading, and let the compiler do _all_ the name mangling?  The same
module::doSomething() syntax already used for nested classes should also 
suffice for module name disambiguation -- assuming a class and a module 
[library] can't have the same name.
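
[The C++ discussed in this thread has no such module construct.  The
`namespace` feature that was later added to the language is close to
what is described, so it is used below purely as an illustration of the
module::doSomething() syntax.  The names `vendor_a`, `vendor_b`, `Base`,
and `owner` are hypothetical.]

```cpp
#include <string>

// Two vendors both define Base and doSomething(); the enclosing name
// becomes part of the mangled name, so the compiler keeps them apart.
namespace vendor_a {
    class Base {
    public:
        std::string owner() const { return "vendor_a"; }
    };
    inline std::string doSomething() { return "a"; }
}

namespace vendor_b {
    class Base {
    public:
        std::string owner() const { return "vendor_b"; }
    };
    inline std::string doSomething() { return "b"; }
}
```

No hand-written oi_ style prefix is needed; the qualifier is written
once per use, as in `vendor_a::doSomething()`.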

However, if we are to have standard libraries, ratified by the ANSI-C++
committee, then for those libraries, the naming problem cannot be 
skirted.  Standard libraries must have agreed upon standard names, 
standard syntax, and standard semantics.  Today, C++ programmers cannot
seem to agree on even one of these issues.  Thus, I don't know how the
ANSI-C++ committee is going to get agreement on all three issues.