[comp.lang.misc] Readable names

pd@sics.se (Per Danielsson) (03/14/88)

In article <2835@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>More seriously I think the use of mixed case in identifier names is a
>virtue. Why call something averylongnamewhichyoubarelycannotread when
>you can call it AVeryLongNameWhichIsYetEasyToReadAtFirstGlance? OK, you
>may use underscores instead of case shifts. However, there is a drawback
>if the compiler has a low limit of significant characters.

Why not call it A very long name which is yet easy to read at first
glance?
Much easier to read than your version. There are programming languages
which allow blanks in identifiers, and rightly so, since there's no
reason not to.
-- 
Per Danielsson		UUCP: pd@sics.se (or {mcvax,decvax}!enea!sics!pd)
Swedish Institute of Computer Science
PO Box 1263, S-164 28 KISTA, SWEDEN
"No wife, no horse, no moustache."

g-rh@cca.CCA.COM (Richard Harter) (03/15/88)

In article <2835@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>More seriously I think the use of mixed case in identifier names is a
>virtue. Why call something averylongnamewhichyoubarelycannotread when
>you can call it AVeryLongNameWhichIsYetEasyToReadAtFirstGlance? OK, you
>may use underscores instead of case shifts. However, there is a drawback
>if the compiler has a low limit of significant characters.

Just a note -- it is my experience that long (readable) identifier names
should not be used.  They appear to make a program more readable and
self documenting, and so on.  However there are two disadvantages:
The first is that they make code very bulky; secondly it is harder to
keep track of names.  I.e. you have to remember, at each writing, the
exact words used in the name.
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

sommar@enea.se (Erland Sommarskog) (03/16/88)

Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
>Just a note -- it is my experience that long (readable) identifier names
>should not be used.  They appear to make a program more readable and
>self documenting, and so on.  However there are two disadvantages:
>The first is that they make code very bulky; secondly it is harder to
>keep track of names.  I.e. you have to remember, at each writing, the
>exact words used in the name.

The examples I had in my article were of course extreme examples to
illustrate my point about case shifts. If your note was directed against 
such extremes, I agree with you. 
  But if you mean to imply that NumOfAcc would be better than 
NumberOfAccidents, I have to object. NumOfAcc is harder to remember, 
since I have to remember the exact abbreviation too. (Your first argument 
may still be valid, though. Particularly if it is a frequently used name.)
  As a whole, choosing names that are easy to remember and to understand
is not always that easy.
-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Si tu crois l'amour tabou...
                            Regarde bien, les yeux d'un fou!!!" -- Ange

g-rh@cca.CCA.COM (Richard Harter) (03/18/88)

In article <2857@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:

	... My note arguing against long names deleted
>
>The examples I had in my article were of course extreme examples to
>illustrate my point about case shifts. If your note was directed against 
>such extremes, I agree with you. 

	Not really, although I think we can both agree that 30 character
	names are undesirable.

> But if you mean to imply that NumOfAcc would be better than 
>NumberOfAccidents, I have to object. NumOfAcc is harder to remember, 
>since I have to remember the exact abbreviation too. (Your first argument 
>may still be valid, though. Particularly if it is a frequently used name.)

	I agree that NumOfAcc is worse than NumberOfAccidents.  There
are human factor studies that back this up.  Ordinarily.  There are a
few shops that use a standard dictionary of abbreviations that is used
as a coding standard.  In such a shop you would know that NumOfAcc is
NumberOfAccidents because Num is the standard abbreviation for Number and
Acc is the standard abbreviation for Accidents.  Standardized abbreviations
are a big win.  (This has been backed up with studies.)
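
A minimal sketch, in Python, of the standard-dictionary idea described
above. The dictionary entries and the helper name expand_name are
assumptions made for illustration (only Num and Acc come from the post);
a real shop standard would be far larger.

```python
import re

# Hypothetical shop-wide abbreviation dictionary. Only these two entries
# are taken from the post; everything else here is illustrative.
STANDARD_ABBREVIATIONS = {
    "Num": "Number",
    "Acc": "Accidents",
}

def expand_name(name):
    """Expand each capitalized word of a mixed-case name via the dictionary."""
    # Split a name such as NumOfAcc into its capitalized words.
    words = re.findall(r"[A-Z][a-z0-9]*", name)
    # Words without a dictionary entry are kept unchanged.
    return "".join(STANDARD_ABBREVIATIONS.get(word, word) for word in words)

print(expand_name("NumOfAcc"))
```

With a single agreed dictionary the mapping is mechanical in both
directions, which is presumably why such standards reduce the "which
abbreviation did I use?" problem.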

	However that was not what I meant -- I meant na or nac instead
of NumberOfAccidents, coupled, of course, with clear documentation of
what na, et al. mean.  I will grant that na is not as mnemonic as
NumberOfAccidents.  However I have tried it both ways and my conclusion
is that the short form works better (others may feel differently).  The
name is a symbol; I only need to find out what it means once; I don't
need to be told what it means every time that I see it.  The long name
is, in effect, a redundancy that increases the noise level in a program.

	As I say, this is just my experience.  I went through a "long
descriptive names are beautiful" phase, and came out saying, no they're
not.  I can well believe that they work well for some people.  
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

mce@tc.fluke.COM (Brian McElhinney) (03/19/88)

In article <1810@sics.se> pd@sics.UUCP (Per Danielsson) writes:

> There are programming languages which allow blanks in identifiers,
> and rightly so, since there's no reason not to.

Well, some might argue that the resulting code is not as readable (since
whitespace is often used to visually separate syntax elements).

The only language I am aware of that allows embedded blanks is CORAL.
It has the mis-feature of allowing any number of blanks ("the horror...
the horror...").

Are there languages other than CORAL that allow blanks in symbol names?

Brian McElhinney
mce@tc.fluke.com

franka@mmintl.UUCP (Frank Adams) (03/19/88)

In article <2857@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>  But if you mean to imply that NumOfAcc would be better than 
>NumberOfAccidents, I have to object.

I don't much care for either of these alternatives.  Looking at the words
here, 'Number' is convention and should be abbreviated, 'Of' is
superfluous, and 'Accidents' contains the meat of the name.  (Just to keep
things down, I would omit the final 's' here, since it can be inferred from
the 'Number' prefix.)  This gives us NumAccident or nAccident, depending on
which convention you use for 'Number'.
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

carroll@snail.CS.UIUC.EDU (03/20/88)

	If I remember correctly, the canonical bug in FORTRAN is caused
by the joint property of implicit variable declarations and allowing space
in identifiers. My understanding is that the early compilers simply
removed all spaces before the parser ever saw it, so that spaces were
"allowed" in a sense.
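
A rough sketch, in Python, of how the two properties combine. As the
story is usually told, the statement was meant to be DO 10 I = 1,10 but
a period was typed for the comma. The classifier below is an assumption
for illustration only; it is not how any real FORTRAN compiler parses.

```python
import re

def strip_blanks(statement):
    """Remove every blank, as early compilers are said to have done."""
    return statement.replace(" ", "")

def classify(statement):
    """Very rough classifier for the two statement shapes in this example."""
    text = strip_blanks(statement)
    # DO <label> <variable> = <start>,<end>  (the comma is what makes it a loop)
    if re.fullmatch(r"DO\d+[A-Z][A-Z0-9]*=[^,]+,.+", text):
        return "DO loop"
    # Otherwise <name> = <expression> is an ordinary assignment, and with
    # implicit declarations the name needs no prior declaration.
    match = re.fullmatch(r"([A-Z][A-Z0-9]*)=(.+)", text)
    if match:
        return "assignment to " + match.group(1)
    return "unknown"

print(classify("DO 10 I = 1,10"))
print(classify("DO 10 I = 1.10"))
```

Once the blanks are gone, the mistyped line is a legal assignment to an
implicitly declared variable DO10I, so nothing is reported.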

sommar@enea.se (Erland Sommarskog) (03/20/88)

Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
>	I agree that NumOfAcc is worse than NumberOfAccidents.  There
>are human factor studies that back this up.  Ordinarily.  
> ...
>	However that was not what I meant -- I meant na or nac instead
>of NumberOfAccidents, coupled, of course, with clear documentation of
>what na, et al. mean.  I will grant that na is not as mnemonic as
>NumberOfAccidents.  ... The
>name is a symbol; I only need to find out what it means once; I don't
>need to be told what it means every time that I see it.  The long name
>is, in effect, a redundancy that increases the noise level in a program.

I must admit that I detest such naming conventions. Yes, if "na" is
very frequent and appears everywhere, and is not accompanied by "lp", 
"oi" etc, well OK. But if I have to look up the definition each time I 
see it, it's a big loss. This is exactly why I dislike things like Unix
and nroff with their short, seldom-mnemonic names. I find it very
tiresome to have to look up the same thing more than once. 
  You write that this is your personal preference. I don't agree here.
Most likely someone else will have to work with your code some day.
I'm quite sure it will take him a longer time to understand your "na" 
than NumberOfAccidents.

I can agree with you so far as using abbreviations for frequently
occurring entities goes. But just as too many long names increase the
noise level, too many abbreviations do as well. 

-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Si tu crois l'amour tabou...
                            Regarde bien, les yeux d'un fou!!!" -- Ange

nick@ccicpg.UUCP (Nick Crossley) (03/21/88)

In article <3156@fluke.COM> mce@tc.fluke.COM (Brian McElhinney) writes:
>The only language I am aware of that allows embedded blanks is CORAL.
>It has the mis-feature of allowing any number of blanks ("the horror...
>the horror...").
>
>Are there languages other than CORAL that allow blanks in symbol names?
>
>
>Brian McElhinney
>mce@tc.fluke.com	

Yes, Algol68 allows spaces in identifiers.  It can do this because there
are two different alphabets used, one for keywords, operators and type names,
and another for identifiers, etc.  These are normally upper and lower
case, respectively.  Thus one can write, without ambiguity but in obviously
poor style:
	BEGIN
		INT int := begin;
		REAL this is a real identifier, and here is another;

		this is a real identifier := begin + int * 2.0;
		...etc...
	END

The spaces in identifiers are not significant.  Used correctly, this feature
does make code very readable.

-- 

<<< standard disclaimers >>>
Nick Crossley, CCI, 9801 Muirlands, Irvine, CA 92718-2521, USA
Tel. (714) 458-7282,  uucp: ...!uunet!ccicpg!nick

g-rh@cca.CCA.COM (Richard Harter) (03/21/88)

In article <2883@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
>
>I must admit that I detest such naming conventions. Yes, if "na" is
>very frequent and appears everywhere, and is not accompanied by "lp", 
>"oi" etc, well OK. But if I have to look up the definition each time I 
>see it, it's a big loss. This is exactly why I dislike things like Unix
>and nroff with their short, seldom-mnemonic names. I find it very
>tiresome to have to look up the same thing more than once. 

	Good point.  I can bear cp (rather than copy) precisely because
I use it frequently.  And rm is no worse than wondering if the command
is purge, flush, kill, or delete.  

>You write that this is your personal preference. I don't agree here.

	Now really, how can you say this?  If anyone is an expert on
my personal preferences, I am. :-)

>Most likely someone else will have to work with your code some day.
>I'm quite sure it will take him a longer time to understand your "na" 
>than NumberOfAccidents.

	I will have to disagree here.  As a caveat, I will immodestly
say that [some of] my code has been pointed out as exemplars of clear,
well written, well documented code.  More to the point, I have maintained
code written in both styles, written by both myself, and others, and I
find that documented code with concise variable names is easier to read
and maintain than code with large names, whether or not it is documented.

	Long names take longer to read and to write.  In particular, it
is harder to understand complicated statements.  Let me illustrate with
a ridiculous example, in no particular language:

	c = sqrt(a**2 + b**2);	/* Calculate hypotenuse of triangle a,b,c */

versus

hypotenuse_of_triangle_abc = sqrt(side_a_of_triangle_abc ** 2 +
	side_b_of_triangle_abc);

Notice the typo in the second version?  You did notice it, didn't you?
The length of the names didn't obscure the formula at all, or did they?

	I will grant that this is a contrived, and unfair example,
and that the names in my second example are not good ones.  Still, I
think the point is clear -- something that is easy to read (and write)
using short names, is actually less easy to read and write using long
names.
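
A small runnable sketch of the same comparison, in Python. The function
names are assumptions for illustration, and the second function
deliberately reproduces the dropped exponent from the long-name version
above.

```python
import math

def hypotenuse(a, b):
    """Short-name version: c = sqrt(a**2 + b**2)."""
    return math.sqrt(a**2 + b**2)

def hypotenuse_with_typo(side_a_of_triangle_abc, side_b_of_triangle_abc):
    """Long-name version, with the second '** 2' left out as in the post."""
    return math.sqrt(side_a_of_triangle_abc ** 2 +
                     side_b_of_triangle_abc)

# For a 3-4-5 triangle the two versions disagree.
print(hypotenuse(3, 4))
print(hypotenuse_with_typo(3, 4))
```

Both versions run without complaint, so only a reader (or a test) can
catch the missing exponent.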

	Something that may be relevant is that I have been trained as
a mathematician.  I am much more comfortable with "let x be ..., y be
..., etc" and a body of work that uses x, y, etc, than with a style that
uses wordy names instead of x, y, etc.  However I do insist on the definitions.
CODE THAT USES UNEXPLAINED AND UNDOCUMENTED SHORT NAMES IS HORRIBLE AND
UNMAINTAINABLE.  Another point is that the sort of code that I write and
prefer to maintain is written in small modules -- short names are much
more supportable if their scope is small.
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

pd@sics.se (Per Danielsson) (03/21/88)

In article <3156@fluke.COM> mce@tc.fluke.COM (Brian McElhinney) writes:
>The only language I am aware of that allows embedded blanks is CORAL.
>It has the mis-feature of allowing any number of blanks ("the horror...
>the horror...").
>
>Are there languages other than CORAL that allow blanks in symbol names?

Yes, Algol-68 allows whitespace in symbol names. Whitespace has no
significance at all in Algol-68, which means that it is *not* a part
of the symbol name, it's simply ignored. Of course, Algol-68 uses a
different typeface for other syntactic elements than symbol names, which
you might call cheating. :-)
In a real implementation capital letters are used instead of the
separate typeface, leaving lower case letters for identifiers. The
code becomes very readable if you remember to put in a few blanks here
and there in your symbol names.
-- 
Per Danielsson		UUCP: pd@sics.se (or {mcvax,decvax}!enea!sics!pd)
Swedish Institute of Computer Science
PO Box 1263, S-164 28 KISTA, SWEDEN
"No wife, no horse, no moustache."

schooler@oak.bbn.com (Richard Schooler) (03/22/88)

In article <3156@fluke.COM>, mce@tc (Brian McElhinney) writes:
>
>Are there languages other than CORAL that allow blanks in symbol names?
>

Fortran, of course, since blanks are insignificant except inside
string literals.  In some ways this is kind of nice: one can write one
million as "1 000 000", and one does not have to remember whether to
say "GO TO" or "GOTO", or for more modern code "END DO" or "ENDDO".
This feature can be abused, of course, and it makes some aspects of
parsing needlessly complex.
				-- Richard Schooler

dhesi@bsu-cs.UUCP (Rahul Dhesi) (03/22/88)

In article <3156@fluke.COM> mce@tc.fluke.COM (Brian McElhinney) writes:
>Are there languages other than CORAL that allow blanks in symbol names?

S ur e,FORT RANallow sblank sanywhere.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

hirchert@uxe.cso.uiuc.edu (03/22/88)

>In article <1810@sics.se> pd@sics.UUCP (Per Danielsson) writes:

>> There are programming languages which allow blanks in identifiers,
>> and rightly so, since there's no reason not to.

>Well, some might argue that the resulting code is not as readable (since
>whitespace is often used to visually separate syntax elements).

>The only language I am aware of that allows embedded blanks is CORAL.
>It has the mis-feature of allowing any number of blanks ("the horror...
>the horror...").

>Are there languages other than CORAL that allow blanks in symbol names?

>Brian McElhinney
>mce@tc.fluke.com	

Although it's not quite the same thing, FORTRAN allows symbolic names to be
written with embedded blanks.  The blanks are ignored and thus are not
significant in the interpretation of the name.  Thus, in a FORTRAN that allows
variable names longer than 6 characters, DO GRATE and DOG RATE are the same
identifier.  There is an ongoing debate within the FORTRAN community and the
FORTRAN standards committee in particular about whether this "feature" should
be phased out of the language.
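
A minimal sketch, in Python, of what "the blanks are ignored" means for
identifier identity. The helper name canonical_name is an assumption for
illustration; it models only the blank-insensitivity described above, not
any real compiler's symbol table.

```python
def canonical_name(written):
    """Return the name as the compiler would see it: blanks removed."""
    return written.replace(" ", "")

# Both spellings collapse to the same seven-character identifier.
print(canonical_name("DO GRATE"))
print(canonical_name("DOG RATE"))
print(canonical_name("DO GRATE") == canonical_name("DOG RATE"))
```

Two spellings that read as different phrases therefore name the same
variable.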

Kurt W. Hirchert     National Center for Supercomputing Applications

barmar@think.COM (Barry Margolin) (03/22/88)

In article <2779@mmintl.UUCP> franka@mmintl.UUCP (Frank Adams) writes:
>In article <2857@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>>  But if you mean to imply that NumOfAcc would be better than 
>>NumberOfAccidents, I have to object.
>
>I don't much care for either of these alternatives.  Looking at the words
>here, 'Number' is convention and should be abbreviated, 'Of' is
>superfluous, and 'Accidents' contains the meat of the name.  (Just to keep
>things down, I would omit the final 's' here, since it can be inferred from
>the 'Number' prefix.)  This gives us NumAccident or nAccident, depending on
>which convention you use for 'Number'.

Another thing to take into account is context.  If this variable is
being used in a program that deals with insurance claims, the
abbreviation "Acc" for "Accident" would be pretty clear from the
context.  In another program, it might be necessary to spell it out to
avoid ambiguity.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
uunet!think!barmar

msf@amelia.nas.nasa.gov (Michael S. Fischbein) (03/22/88)

In article <3156@fluke.COM> mce@tc.fluke.COM (Brian McElhinney) writes:
>In article <1810@sics.se> pd@sics.UUCP (Per Danielsson) writes:
>> There are programming languages which allow blanks in identifiers,
>Are there languages other than CORAL that allow blanks in symbol names?

Sure. FORTRAN.
        
	mike

-- 
Michael Fischbein                 msf@ames-nas.arpa
                                  ...!seismo!decuac!csmunix!icase!msf
These are my opinions and not necessarily official views of any
organization.

sommar@enea.se (Erland Sommarskog) (03/23/88)

Frank Adams (franka@mmintl.UUCP) writes:
>I don't much care for either of these alternatives.  Looking at the words
>here, 'Number' is convention and should be abbreviated, 'Of' is
>superfluous, and 'Accidents' contains the meat of the name.  (Just to keep
>things down, I would omit the final 's' here, since it can be inferred from
>the 'Number' prefix.)  This gives us NumAccident or nAccident, depending on
>which convention you use for 'Number'.

You don't practice what you preach, I see. Or else why do you use so many
superfluous words and letters? Let's take the first sentence. It's quite
clear that you are the one who doesn't care, so drop "I". "Do" is just a
dummy, away with it. "For" and "of" are just prepositions and syntactic
sugar. The "s" in "Alternatives" is clear from "these" and besides "Alt"
is the standard abbreviation, and should be used. And what on earth does
"either" tell us? Nothing! You should have written "Not care much these alt".
Much clearer, more concise and faster for us to read. 
-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Si tu crois l'amour tabou...
                            Regarde bien, les yeux d'un fou!!!" -- Ange

esh@otter.hple.hp.com (Sean Hayes) (03/23/88)

>Are there languages other than CORAL that allow blanks in symbol names?
Algol 68
 _________________________________________________________________________
 |Sean Hayes,          Hewlett Packard Laboratories,      Bristol,  England|
 |net: esh@hplb.uucp   esh%shayes@hplabs.HP.COM       ..!mcvax!ukc!hplb!esh|

sommar@enea.se (Erland Sommarskog) (03/23/88)

Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
>	I will have to disagree here.  As a caveat, I will immodestly
>say that [some of] my code has been pointed out as exemplars of clear,
>well written, well documented code.  

Well, I haven't seen it, so I can't judge. You have to post some of that
wonderful code. :-)

>	Long names take longer to read and to write.  In particular, it
>is harder to understand complicated statements.  Let me illustrate with
>a ridiculous example, in no particular language:

I am beginning to feel that I am repeating myself. But it takes longer to read
"na" than "NumberOfAccidents", unless you have managed to burn in a hard link
that gives an immediate understanding of "na". Otherwise you have to waste time
looking it up or recalling what it means. The same may hold for typing: "What
abbreviation did I use?"

The use of your example (deleted for brevity) is that it shows that names
*can* be too long sometimes, and that in certain circumstances 
short names are preferable. But we have already agreed on that, haven't we? 

>Another point is that the sort of code that I write and
>prefer to maintain is written in small modules -- short names are much
>more supportable if their scope is small.

Yes, short names are more acceptable in a small scope. But even if your
modules are small, you have a lot of global names: procedures, types,
constants, enumeration values and field elements in records. You use
your short names for these too?

-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Si tu crois l'amour tabou...
                            Regarde bien, les yeux d'un fou!!!" -- Ange

throopw@xyzzy.UUCP (Wayne A. Throop) (03/24/88)

> g-rh@cca.CCA.COM (Richard Harter)
> 	Something that may be relevant is that I have been trained as
> a mathematician.  I am much more comfortable with "let x be ..., y be
> ..., etc" and a body of work that uses x, y, etc, than with a style that
> uses wordy names instead of x, y, etc.

I straddle the long-names/short-names camps, and Richard's comment here
is relevant to my rationale.  I use long names for data that has
extensive scope, and short names for data of local scope.  That is, I
might have a global entity number_of_errors, but if I were manipulating
it extensively, to calculate statistics or whatnot, I'd enclose the
relevant calculations in a scope that coins the nickname "ne".

My greatest gripe for most programming languages is that they don't
provide methods of coining restricted scope names by by-reference
binding (with optimization, of course).  My general programming style
has evolved so that I often write code consisting of a large windup
phase of let x ..., y ..., z ... and then conclude with a small, often
trivial evaluation (most of the work having been done in the "let"s).

Thus: I define the long names to the compiler, so that I'm not hiding
important information in comments, and then use what name-coining
facilities I can to make the expression of the needed calculations
compact and readily comprehensible, to both me and the compiler instead
of just to me.

> Another point is that the sort of code that I write and
> prefer to maintain is written in small modules -- short names are much
> more supportable if their scope is small.

Exactly.  I've found that my method allows the blending of small scopes
with small names with larger scopes with more mnemonic names cleanly.
Or, so it seems to me.
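
A minimal sketch of this style in Python, where binding a local name to a
mutable object already behaves by reference. The list
number_of_errors_per_run and its values are hypothetical, introduced only
to have something to compute statistics on; "ne" is the nickname from the
post.

```python
# Hypothetical global with a long, descriptive name.
number_of_errors_per_run = [3, 0, 7, 2]

def error_statistics():
    """Return (mean, variance) of the global error counts."""
    # Windup: coin short nicknames whose scope is this function only.
    ne = number_of_errors_per_run   # same list object, not a copy
    n = len(ne)
    mean = sum(ne) / n
    # Conclusion: the actual calculation is short once the names are.
    variance = sum((x - mean) ** 2 for x in ne) / n
    return mean, variance

print(error_statistics())
```

The long name stays visible to every other reader of the program, while
the nickname never leaves the few lines where it is defined.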

--
A program without a loop and a structured variable isn't worth writing.
                                        --- Alan J. Perlis
-- 
Wayne Throop      <the-known-world>!mcnc!rti!xyzzy!throopw

biagioni@mckinley.cs.unc.edu (Edoardo Biagioni) (03/24/88)

In article <2899@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>Frank Adams (franka@mmintl.UUCP) writes:
>>I don't much care for either of these alternatives.  Looking at the words
>>here, 'Number' is convention and should be abbreviated, 'Of' is
>>...
>>the 'Number' prefix.)  This gives us NumAccident or nAccident, depending on
>
>You don't practice what you preach, I see. Or else why do you use so many
>superfluous words and letters? Let's take the first sentence.
>...
>... You should have written "Not care much these alt". 
>Much clearer, more concise and faster for us to read. 
>-- 
>Erland Sommarskog       
>ENEA Data, Stockholm        
>sommar@enea.UUCP           "Si tu crois l'amour tabou...
>                            Regarde bien, les yeux d'un fou!!!" -- Ange

NO!!!!!!

Natural language needs its redundancy, to avoid misunderstandings. Yes,
there are many more compact ways of writing our thoughts down, but very
few better ways of communicating our thoughts. For example, I find it
quite tiring to read postings where the spelling of some words is
phonetic (e.g. thru or shur). Don't let anyone think the above example
is an example of "good communication"!

Ed Biagioni	biagioni@cs.unc.edu 		Department of Computer Science
		seismo!mcnc!unc!biagioni	Chapel Hill, N.C. 27514, USA

As to variable names, I try to find a concise name that expresses what
I want, but that's my own choice. I will use long names to express
abstruse usages, and I am glad to report that most of my variable names
turn out to be short!

g-rh@cca.CCA.COM (Richard Harter) (03/24/88)

In article <2902@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
>Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
>>	I will have to disagree here.  As a caveat, I will immodestly
>>say that [some of] my code has been pointed out as exemplars of clear,
>>well written, well documented code.  

>Well, I haven't seen it, so I can't judge. You have to post some of that
>wonderful code. :-)

	Not on your life!  Praising your own accomplishments works much
better if you don't embarrass yourself by making them public.  :-)

>I am beginning to feel that I am repeating myself. But it takes longer to read
>"na" than "NumberOfAccidents", unless you have managed to burn in a hard link
>that gives an immediate understanding of "na". Otherwise you have to waste time
>looking it up or recalling what it means. The same may hold for typing: "What
>abbreviation did I use?"

	Oh, I see your point alright.  I find NumberOfAccidents hard
to read and unpleasant, whereas Number_of_Accidents is more pleasing to
mine eye.  In either case it is quite clear what the variable is, whereas
na is simply cryptic.  Or is it?  Suppose we have several types of accidents,
and partial sums, and so on.  Even so, the short names remain cryptic.

	However the issue is not, as I see it, one of reading variables,
it is one of reading lines of code and blocks of code.  Suppose, for example,
that you use names 15-20 characters in length and that the statement has half
a dozen names in it  -- you now have statements ~100 characters in length.
Moreover, your code, depending on the task, may have big blocks of this
long winded code.  Another consideration is that variables typically do
not occur as isolated instances; they tend to appear in closely joined
statements.  If you see na once and Number_of_Accidents once, the former
is cryptic and the latter is self documenting.  If you see them ten times
it becomes worth your while to understand what na means, and the self
documenting feature of Number_of_Accidents is redundant.

	However, if one uses short names, documentation becomes very
important.  As an example, I recently came across a listing of a scratch
test program I wrote about 15 years ago.  This program used one and two
letter names, and had no comments whatsoever.  [Not to sniff and hold your
nose up; it was a one time test program of about 200 lines; it was written
with the intent of being used once and being thrown away; the saving of
the listing was pure chance.]  Now, when I wrote it, I knew exactly what
it did.  From the listing it is nearly impossible to understand what it
does, what the algorithm being tested was (or how it worked), and what the
output represented.

>>Another point is that the sort of code that I write and
>>prefer to maintain is written in small modules -- short names are much
>>more supportable if their scope is small.

>Yes, short names are more acceptable in a small scope. But even if your
>modules are small, you have a lot of global names: procedures, types,
>constants, enumeration values and field elements in records. You use
>your short names for these too?

Depends -- procedure names are usually 6-8 characters to avoid portability
problems in the software I am working with now.  Also, cross reference
listings are more readable if all names are about the same length.  The same
is true of global variables (all code currently is C).  These names are
all compressed names with standardized abbreviations, e.g. cmpdir is
'compress directory', assion is 'assign insertion order number', dspbib
is 'display bibliography'.

Names global to a file are short if they are common, longer if they are
not.  Field elements are almost always short.  Flags and enumeration values
are usually full words.  One letter names almost always have a standard
meaning, e.g. i is a loop index, n is the number of values of something.
Types are treated much the same as globals.

Also relevant is the fact that I annotate every line, a la assembly language
code, unless it is too long to fit a comment field in.  Normal format is
a code field of about 40 chars wide and a comment field of about 35 chars
wide with all comments aligned.  As you may imagine, short names are more
attractive with these constraints :-).  I have been using this style for
about 6 years now, having used others (including long names).
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

jefu@pawl17.pawl.rpi.edu (Jeffrey Putnam) (03/24/88)

In article <1816@sics.se> pd@sics.UUCP (Per Danielsson) writes:
:In article <3156@fluke.COM> mce@tc.fluke.COM (Brian McElhinney) writes:
::The only language I am aware of that allows embedded blanks is CORAL.
::It has the mis-feature of allowing any number of blanks ("the horror...
::the horror...").
::
::Are there languages other than CORAL that allow blanks in symbol names?
:
:Yes, Algol-68 allows whitespace in symbol names. Whitespace has no
:significance at all in Algol-68, which means that it is *not* a part
:of the symbol name, it's simply ignored. Of course, Algol-68 uses a
:different typeface for other syntactic elements than symbol names, which
:you might call cheating. :-)

Just out of curiosity (as I couldn't fit more than about another copy of
my .signature into my disk quota), does anyone know of an Algol 68 
implementation for Suns?  Or even for PCs? 
jeff putnam  
{jefu%pawl -or- jeff_putnam%rpitsmts}@itsgw.rpi.edu
"People would rather believe a simple lie than the complex truth."

jonathan@pitt.UUCP (Jonathan Eunice) (03/24/88)

Erland Sommarskog wages a reduction-to-absurdity argument against
heavily truncated and abbreviated names of programming objects.  Ed
Biagioni replies "natural language needs its redundancy, to avoid
misunderstandings," and more to the point, "there are many more compact
ways of writing our thoughts down, but very few better ways of
communicating our thoughts."   

Exactly!

If your aim is to be precise, symbolic notation wins every time.  But
when your aim is to communicate, to be understood, then including the
redundancy is a Good Idea.  Not just in natural language, mind you, but
in programming, mathematics, logic, and so on.  The shortest statements
can take the longest time to read, because the reader must figure out
what they mean, must translate or interpret them at read-time.  Isn't this
why math, logic, and some C programs are so hard to figure out?  It's
the classic space-time tradeoff: represent more compactly, but then you
must spend more time generating the desired information or details upon
retrieval.  Efficiency of reading should be measured by how long it
takes to understand the material, not how long it takes the eyes to
scan it.  When evaluating a notational scheme, such as a programming
language, it's the time-to-understanding figure that's important.

Writing programs is a form of communication, with the machine somewhat,
but with yourself and other programmers, more so.  You must understand
your own code/theorems/methods to use/revise/prove/improve them, and
this is not as easy as it seems.  That's often why bugs creep in --
don't tell me you don't have bugs, even in the most precise symbolic
notation -- we often do not fully understand the
effects/side-effects/implications of our own code/decisions/methods.

Of course there are things like COBOL at the other end of the world,
where the information content (ratio of information to bulk) is so low
that it is difficult to put the information together in a coherent
fashion.  (Please, no COBOL-IS-OK-FLAMES.)

You can't have Too Much or Too Little, it's got to be Just Right.  This
is somewhat a stylistic issue, so not something that's really
"solvable" in an objective sense, but it's important to see the
tradeoffs involved, not just run a "my style is better" flame war.

------------------------------------------------------------------------------
      Jonathan S. Eunice                    ARPA: jonathan%pitt@relay.cs.net
   University of Pittsburgh          UUCP, CSNET: jonathan@pitt
       Computer Science                   BITNET: jonathan@pittvms
        (412) 624-8836

firth@sei.cmu.edu (Robert Firth) (03/25/88)

In article <3156@fluke.COM> mce@tc.fluke.COM (Brian McElhinney) writes:

>Are there languages other than CORAL that allow blanks in symbol names?

If by blanks you mean "white space", then this is allowed by two of
the oldest & best known languages: Algol-60 and Fortran.  The ways
of designing the language to permit white space in identifiers are
very well known; so well known that I believe the decision not to
allow white space is always a conscious one.
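
A minimal sketch of the idea in Python, not of how any real Algol-60 or
Fortran compiler is written: if blanks are insignificant, two spellings
name the same object whenever they agree once the blanks are removed.
The function names here are made up for the illustration.

```python
def canonical_name(spelling: str) -> str:
    """Drop all white space so differently spaced spellings compare equal.

    Hypothetical helper: models a language in which blanks inside an
    identifier are not significant.
    """
    return "".join(spelling.split())


def same_identifier(a: str, b: str) -> bool:
    # Two spellings denote the same identifier if they match once blanks are gone.
    return canonical_name(a) == canonical_name(b)
```

Under this rule "number of accidents" and "numberofaccidents" are one
name.  The price is that a blank can no longer be relied on to separate
tokens, which is presumably part of what makes the choice a conscious one.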

ken@cs.rochester.edu (Ken Yap) (03/25/88)

Readable identifiers are a good thing, if we can decide on how long the
name should be :-).

But to me, clarity and correctness of the algorithm is just as, or even
more important. I would rather have a simple, efficient, reliable
algorithm that has short identifiers than a baroque, marginal hack with
lots of special cases that don't work, even if the programmer wrote an
essay on his algorithm.

I can edit the short names with global substitutions but I don't want
to understand or fix broken algorithms if I can avoid them. My
experience has been that it was usually the contortions in dusty deck
FORTRAN codes that frustrated me rather than the 6 max character
identifiers.

	Ken

faustus@ic.Berkeley.EDU (Wayne A. Christopher) (03/25/88)

Really the worst thing is when people draw a picture of what they want,
with labels like "a" and "b" attached to the relevant things in the
diagram so that it's obvious how the algorithm works, and code it
using the same labels.  Then they lose the diagram...  Personally I
always put the PostScript source to any explanatory diagrams like this
in the comments of my code... :-)

	Wayne

peter@sugar.UUCP (Peter da Silva) (03/25/88)

In article <2857@enea.se>, sommar@enea.se (Erland Sommarskog) writes:
> Richard Harter (g-rh@CCA.CCA.COM.UUCP) writes:
> >Just a note -- it is my experience that long (readable) identifier names
> >should not be used....
> >I.e. you have to remember, at each writing, the
> >exact words used in the name.

This is not a readability argument. This is a memorability argument. In any
large program you're not going to remember the meanings of all your variables
without either finding the context they're used in, or popping back up
to where they're declared and reading your comments. Long, short, and fancy
variable names don't help much in writing the code. Long names do help a lot
in reading the code.

>   But if you mean to imply that NumOfAcc would be better than 
> NumberOfAccidents, I have to object. NumOfAcc is harder to remember, 
> since I have to remember the exact abbreviation too. (Your first argument 
> may still be valid, though. Particularly if it is a frequently used name.)

Gee, what did I call the accident count? AccidentCount? NumberOfAccidents?
CrashAndBurn? Um...

If you're going to be reading code more than writing it, long variable names
are a win. If you're going to be writing more code (say, it's APL or a
command language), short ones are a win.

If you ever used VAX/VMS, and can't remember whether you get the file sizes
with a DIR/LENGTH or a DIR/SIZE or whatever, you'll know what I mean.
(I'm surprised they didn't make that "SHOW FILES" -- everything else in
the system seems to be an option on SHOW).

Frequently used names should be short: you're not going to forget "mv" or "rm",
but a lot of people can't remember whether to use "fsck" or "fsdb".
(Speaking of which, I think doubling the size of every directory on the system
wouldn't be too high a price to pay for 30 character file names in UNIX).

>   As a whole, choosing names that are easy to remember and to understand
> is not always that easy.

You win a No-Prize for the understatement of the year!
-- 
-- Peter da Silva  `-_-'  ...!hoptoad!academ!uhnix1!sugar!peter
-- Disclaimer: These U aren't mere opinions... these are *values*.

mls@whutt.UUCP (SIEMON) (03/26/88)

In article <3125@pitt.UUCP>, jonathan@pitt.UUCP (Jonathan Eunice) writes:
> 
> If your aim is to be precise, symbolic notation wins every time.  But
> when your aim is to communicate, to be understood, then including the
> redundancy is a Good Idea.  Not just in natural language, mind you, but
> in programming, mathematics, logic, and so on.  The shortest statements
> can take the longest time to read, because the reader must figure out
> what it means, must translate or interpret it at read-time.  Isn't this
> why math, logic, and some C programs are so hard to figure out?

	Well, ...  A lot of math is "hard" because it isn't compact enough.
[Classic example, integration a la Archimedes vs. a la Newton.  Or more
recently, the progress from tensors with scads of indices, through the
Einstein summation convention, to Cartan's elimination of coordinates
altogether.  By the time you get your symbols REALLY compact, you reduce
everything to category theory, which is trivial! :-) :-) :-) :-) :-) :-)]

	Besides their use in algebraic manipulation, the point of "formulas"
is that by encompassing a full context in one visual scope, they give a full
"picture" of the problem and its solution.  Richard's point about short names
used frequently in a limited local scope is exactly analogous to the (good)
use of symbols in math.  (Yes, there ARE bad uses of symbols in math!)

 	The search for "one-liner" solutions to problems is normally bad,
being usually an excuse for "obfuscating" code; but the motivation is often
excellent -- you can comprehend, totally, what is going on if it's short
enough; otherwise, you flounder.  Yes, it may take time, but it's POSSIBLE!
The longer something is, the less likely that it can be understood except in
a vague (and hence for programming inadequate) way.  And short in this case
means a measure like page area for reproducing the algorithm.  The explicatory
"legend of symbols" must be easily seen as well, preferably in the same visual
field (as on a cartographer's map).  Only if the algorithmic content is almost 
nil (as in _some_ accounting routines) does it make sense for a computer
procedure to be "conversational" in its use of language.

Michael Siemon
contracted to AT&T Bell Laboratories
ihnp4!mhuxu!mls
standard disclaimer

sommar@enea.se (Erland Sommarskog) (03/27/88)

Me and Richard Harter have been exchanging points of view on identifier
names for a while. First I thought I should give Richard the last word
here, but then I changed my mind and decided to add some short notes
to his last article. It seems to me that we have the same basic ideas,
but we put the limit differently. While I'm getting doubts for names of
15-20 characters, Richard gets worried at 6-8.

>	Not on your life!  Praising your own accomplishments works much
>better if you don't embarrass yourself by making them public.  :-)

Coward! :-)

>	Oh, I see your point alright.  I find NumberOfAccidents hard
>to read and unpleasant, whereas Number_of_Accidents is more pleasing to
>mine eye.  

Underscore or case shifts doesn't really matter to me, as long as it is 
not "numberofaccidents". It was there my discussion started. These days 
I prefer underscores, before I used case shifts. Underscores are not
standard Pascal, that's why.

>	However the issue is not, as I see it, one of reading variables,
>it is one of reading lines of code and blocks of code.  Suppose, for example,
>that you use names 15-20 characters in length and that the statement has half
>a dozen names in it  -- you now have statements ~100 characters in length.

Sure names can get too long. I dislike having to split a statement in two 
lines. It's fairly OK when an assignment looks like:
   Some_variable := Quite_long_expression + 
                    Another_long_expression;
But when it declines to
   Record_field_with_long_prefix_or_a_slice := 
                                  Another_long_one_that_does_not_fit_the_line;
I'm not too amused. However, to cure these problems I would rather split
statements in two, or something like that, than cut down the names.
   
>Depends -- procedure names are usually 6-8 characters to avoid portability
>problems in the software I am working with now.  Also, cross reference

If you have a problem with such a limitation, you have a valid argument.
I wouldn't say I'm envious of you...

>all compressed names with standardized abbreviations, e.g. cmpdir is
>'compress directory', assion is 'assign insertion order number', 'dspbib'

Oh, I would have guessed "compare". As you may guess I would have
called it Compress_directory.

>Also relevant is the fact that I annotate every line, ala assembly language
>code, unless it is too long to fit a comment field in.  Normal format is
>a code field of about 40 chars wide and a comment field of about 35 chars
>wide with all comments aligned.  As you may imagine, short names are more

Good habit. You need a lot of discipline to keep them updated. 
  Mine is about the opposite. I usually put a descriptive comment in the 
procedure header, explaining parameters if their names don't say it all. 
For local variables I add a descriptive comment, except for the most trivial. 
  The code itself only contains comments if I do some special trick. 
I don't know if this is a good habit, but I tend to feel that comments 
make code unnecessarily verbose. And since I am using longer names 
than Richard I have to add extra lines for them, which makes it harder to 
keep the procedure within the desired one-page limit.
  (And besides, I don't write in obscure languages like C, so I don't
have to write that many comments :-) :-) )


-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP

g-rh@cca.CCA.COM (Richard Harter) (03/28/88)

Erland has wrapped things up nicely.  I couldn't resist putting in one
final final word:


>  (And besides, I don't write in obscure languages like C, so I don't
>have to write that many comments :-) :-) )

	Coward!  :-) :-)
-- 

In the fields of Hell where the grass grows high
Are the graves of dreams allowed to die.
	Richard Harter, SMDS  Inc.

franka@mmintl.UUCP (Frank Adams) (03/29/88)

[Followups directed to sci.lang; edit as appropriate.]

In article <2899@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:
|Frank Adams (franka@mmintl.UUCP) writes:
|>[Re: variable names NumberOfAccidents vs NumOfAcc]
|>I don't much care for either of these alternatives. ... This gives us
|>NumAccident or nAccident, depending on which convention you use for
|>'Number'.
|
|... You should have written "Not care much these alt".

Quite funny.  English not programming language.  Redundancy role oral,
unneeded written.  Eliminate?  Confusion: two language.  Transition.  May
lose meaning, not careful: example, keep "for"; could be "not care much
between these alt".  Don't need "these": "Not care much for alt".
-- 

Frank Adams                           ihnp4!philabs!pwa-b!mmintl!franka
Ashton-Tate          52 Oakland Ave North         E. Hartford, CT 06108

roberts@cognos.uucp (Robert Stanley) (04/03/88)

In article <2921@enea.se> sommar@enea.UUCP(Erland Sommarskog) writes:

>Me and Richard Harter have been exchanging points of view on identifier
>names for a while.
>
>>	Oh, I see your point alright.  I find NumberOfAccidents hard
>>to read and unpleasant, whereas Number_of_Accidents is more pleasing to
>>mine eye.  
>
>Underscore or case shifts doesn't really matter to me, as long as it is 
>not "numberofaccidents". It was there my discussion started. These days 
>I prefer underscores, before I used case shifts. Underscores are not
>standard Pascal, that's why.

The above exchange roughly summarizes what has been a largely inconclusive
discussion, to wit: everyone diving into a large chunk of unremembered or
alien code finds it much easier to read if there are fully descriptive
names, while anyone actively writing code inevitably minimizes names for the
sake of mechanical expediency.

So what?  We are all entitled to our own views and idiosyncrasies, but we
are typically paid to produce working, understandable, flexible,
maintainable, etc. code.  If in doubt, make the computer pick up the slack.
It doesn't take much effort to adapt one of today's editors/browsers to
using a name alias table, or to make global changes to names in a compile
unit via a name substitution list.  Like everyone else, I have developed a
set of coding habits over the years, and every so often I find that my
style is at odds with local convention/standards.  When this happens, I can
either attempt to change my habits, or I can build a transformer which will
massage my (consistent) code into the required local format.  I long ago
learned that the second approach is much more successful (amazing how set
in one's ways it is possible to get!), and have seldom encountered any
problems finding a suitable engine with which to perform the transforms.
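
A name substitution list of the kind described above can be sketched in
a few lines of Python.  This is only an illustration under stated
assumptions: the function name is invented, and the approach is purely
textual, so it knows nothing about strings, comments or scoping.

```python
import re


def apply_name_substitutions(source: str, substitutions: dict) -> str:
    """Rename whole identifiers in `source` according to `substitutions`.

    Hypothetical helper: a sketch of the idea, not a safe refactoring tool.
    """
    if not substitutions:
        return source
    # Longest names first so the alternation never settles for a shorter prefix.
    names = sorted(substitutions, key=len, reverse=True)
    pattern = re.compile(
        r"(?<![A-Za-z0-9_])("
        + "|".join(re.escape(name) for name in names)
        + r")(?![A-Za-z0-9_])"
    )
    # One pass over the text, so a pair of names can even be swapped.
    return pattern.sub(lambda m: substitutions[m.group(1)], source)
```

Given the list {"cmpdir": "Compress_directory"}, the call rewrites
cmpdir(dir) but leaves a longer name such as cmpdirs alone, because only
whole identifiers are matched.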

I am a short(ish) name, underscore separator, case-insensitive programmer
from habit (PL/1 origins, you know), but I have discovered that a simple
comment attached to each identifier declaration (I always explicitly declare
all identifiers) makes future understanding of code substantially easier.  I
can also use said descriptive comment to hold a longer descriptive name in
a form suitable for automatic extraction and substitution.
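
What "a form suitable for automatic extraction" looks like is not
spelled out above, so the following Python sketch assumes a convention
of its own: a Pascal-like declaration whose comment carries the longer
name after a "long:" tag.  The tag, the pattern and the function name
are all hypothetical.

```python
import re

# Assumed convention (not from the post): a declaration line such as
#     nacc : integer;  { long: Number_of_accidents }
# carries the longer name after a "long:" tag inside the comment.
DECLARATION = re.compile(
    r"^\s*(?P<short>[A-Za-z_][A-Za-z0-9_]*)\s*:[^{]*"
    r"\{\s*long:\s*(?P<long>[A-Za-z_][A-Za-z0-9_]*)"
)


def extract_long_names(source: str) -> dict:
    """Build a short-name -> long-name table from tagged declaration comments."""
    table = {}
    for line in source.splitlines():
        match = DECLARATION.match(line)
        if match:
            table[match.group("short")] = match.group("long")
    return table
```

Declarations without a tagged comment are simply skipped, so the table
only ever contains names the programmer chose to document this way.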

For those of you who are interested in the cognitive psychology aspects of
understanding case-sensitive text, a substantial amount of work has been
done in this field.  It seems that single case becomes a problem only when
the length of text becomes significant (useful word that, which I won't
explicate here :-) ), or when proportional fonts are in use.  E-mail me if
you have difficulty finding references on your own.

Totally subjective observation on the subject:  case sensitivity is
undoubtedly useful, but it sure introduces a whole lot of problems.

Ummm, don't you think we've about bashed this subject to death?

Robert_S
-- 
Robert Stanley - Cognos Incorporated: P.O. Box 9707, 3755 Riverside Drive
Compuserve: 76174,3024		      Ottawa, Ontario  K1G 3Z4, CANADA
uucp: decvax!utzoo!dciem!nrcaer!cognos!roberts  Voice: (613)738-1440(Research)
arpa/internet: roberts%cognos.uucp@uunet.uu.net   FAX: (613)738-0002

ralphw@IUS3.IUS.CS.CMU.EDU (Ralph Hyre) (04/08/88)

In article <2640@cognos.UUCP> roberts@cognos.UUCP (Robert Stanley) writes:
>So what?  We are all entitled to our own views and idiosyncrasies, but we
>are typically paid to produce working, understandable, flexible,
>maintainable, etc. code.  If in doubt, make the computer pick up the slack.
>It doesn't take much effort to adapt one of today's editors/browsers to
>using a name alias table, or to make global changes to names in a compile
>unit via a name substitution list....
exactly
>... every so often I find that my style is at odds with local 
>convention/standards.  When this happens, I can .. build a transformer
>which will massage my (consistent) code into the required local format.
Better yet for the programming environment to handle the transformation.
...
>Ummm, don't you think we've about bashed this subject to death?
Not yet.  A parallel discussion has been going on in comp.lang.smalltalk,
and I posted the following note: (with minor reformatting changes --  text
that missed the first posting is added between {})

Path: PT.CS.CMU.EDU!IUS3.IUS.CS.CMU.EDU!ralphw
From: ralphw@IUS3.IUS.CS.CMU.EDU (Ralph Hyre)
Newsgroups: comp.lang.smalltalk
Subject: Re: embeddedCapitals
Message-ID: <1229@PT.CS.CMU.EDU>
Date: 25 Mar 88 21:15:59 GMT
References: <2472@pdn.UUCP> <12100008@osiris.cso.uiuc.edu>
Sender: netnews@PT.CS.CMU.EDU
Organization: Carnegie-Mellon University, CS/RI
Lines: 31

In article <12100008@osiris.cso.uiuc.edu> goldfain@osiris.cso.uiuc.edu writes:
>
>Now that I've thought about it for a bit, I think there is no solution to the
>embedded-capitals-causes-naming-errors problem.  Anything you do to improve
>readability will cause possible name-matching  errors, whether you separate
>parts of long names with spaces, underscores, capitals, etc.
This is exactly why the name separator stuff might best be handled by a {good}
programming environment, not the compiler/interpreter itself.  A potential
drawback is that {in a naive implementation} you might end up limiting {the
number of usable characters by reserving too many characters as attribute 
descriptors, like how Scribe tends to reserve @ and (, and Tex \}
For example, if uppercase is 'reserved' by the programming environment as a
possible 'long-word-separator' attribute, then you may lose the ability to use
uppercase in {representing} program symbols. [but maybe these should be 
represented internally by something other than character strings anyway, like
a tagged number (so that the interpreter/compiler doesn't have to do parsing in
addition to its other duties).]

Ideally, the user preference could be whatever they're most comfortable with,
so a symbol displays as AVeryLongWord to users who prefer that style, or 
a_very_long_word, or just averylongword for those with a lot of patience.
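
As a rough sketch of that in Python, assuming the environment stores
each symbol as a sequence of lowercase words and only picks a spelling
at display time (the style names below are invented for the example):

```python
def render_symbol(words, style):
    """Render one symbol, stored as a sequence of lowercase words, in a display style.

    The style names are hypothetical; a real environment would define its own.
    """
    if style == "capitals":
        return "".join(word.capitalize() for word in words)
    if style == "underscores":
        return "_".join(words)
    if style == "run_together":
        return "".join(words)
    raise ValueError(f"unknown style: {style}")
```

The same stored symbol ("a", "very", "long", "word") then displays as
AVeryLongWord, a_very_long_word or averylongword, and name matching is
done on the stored words rather than on any one spelling.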

Comments?
--
					- Ralph W. Hyre, Jr.

Internet: ralphw@ius2.cs.cmu.edu    Phone:(412)268-{2847,3275} CMU-{BUGS,DARK}
Amateur Packet Radio: N3FGW@W2XO, or c/o W3VC, CMU Radio Club, Pittsburgh, PA



peter@sugar.UUCP (Peter da Silva) (04/14/88)

ARTICLE: DO;
	EVERY$BODY$KNOWS$THAT(PLM$STYLE$IDENTIFIERS,ARE,EASIER$TO$READ);
END;

END PLM86 3 LINES 5 IDENTIFIERS

	ARTICLE
	EVERYBODYKNOWSTHAT
	PLMSTYLEIDENTIFIERS
	ARE
	EASIERTOREAD
-- 
-- Peter da Silva      `-_-'      ...!hoptoad!academ!uhnix1!sugar!peter
-- "Have you hugged your U wolf today?" ...!bellcore!tness1!sugar!peter
-- Disclaimer: These aren't mere opinions, these are *values*.