[comp.lang.misc] case sensitivity

dhesi@bsu-cs.UUCP (Rahul Dhesi) (03/11/88)

Just an opinion:  case sensitivity in a programming language is not in
itself a bad thing.  It is how it is used that can cause problems.

In Modula-2 reserved words are required to be in uppercase while names
of standard procedures are in mixed case.  Thus one must be constantly
using the shift key.  This is painful.

In C reserved words and standard functions are in lowercase.  By simply
not using uppercase letters one gets all the advantages of a
case-insensitive language without any hassles.  There are just a very
few words that need to be typed in uppercase (e.g. FILE and NULL).

Thus case-sensitivity in Modula-2 is a bad thing but in C it is nearly
irrelevant to those who don't care for it.

Those who enjoy hassling users with names like WriteString (or
WrItEsTrInG etc.) can still use them in C, though the availability of
printf() and putstr() makes them superfluous.

So, if the case-sensitivity of Modula-2 is a drag, it is so because of
some bad language design decisions, not because case-sensitivity is in
itself bad.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

kasper@csli.STANFORD.EDU (Kasper Osterbye) (03/11/88)

In article <2318@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>Just an opinion:  case sensitivity in a programming language is not in
>itself a bad thing.  It is how it is used that can cause problems.
> --> [Arguments]
>Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

I disagree. Case sensitivity is always a drag. Someone always will use
the TWO variables i and I, and I will get quite confused. I like a language
to treat WriteString as the same identifier as writestring, but I like
that I can write it both ways.

wsmith@uiucdcsm.cs.uiuc.edu (03/12/88)

>In article <2318@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>Just an opinion:  case sensitivity in a programming language is not in
>>itself a bad thing.  It is how it is used that can cause problems.
		       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>> --> [Arguments]
>>Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
>
>I disagree. Case sensitivity is always a drag. Someone always will use
>the TWO variables i and I, and I will get quite confused. I like a language
>to treat WriteString as the same identifier as writestring, but I like
>that I can write it both ways.

It is how it is used that can cause problems.  If someone names two variables
I and i, he should expect that you, reading his program later,  would get 
quite confused.  >>Naming conventions<< let you use the power of the case 
sensitivity safely.

If I have a naming convention that types are declared with an uppercase
first letter and lower case for the rest of the letters, that macros
should be all upper case and variable names should be all lower case and
other rules for procedures, functions and constants, then as long as a 
programmer follows the rules no one will get confused.  BUT, if
the language is case insensitive, the programmer will start ignoring the 
naming conventions because the compiler lets her, and the situation is just as 
bad as a case sensitive language with no conventions and variables I and i.

The case usage of a token can be used to convey information that is available
elsewhere, but is more convenient to redundantly convey as upper/lower
case letters.  The name WriteString is obviously a procedure (based on
most naming conventions).  What is writestring?  Is it a variable
containing the string that is going to be written out?  Is it a boolean
deciding if a string should be written?    The language should let you
pick naming conventions and then help enforce them.  A case insensitive
language cannot help enforce naming conventions like the ones I describe.
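
A minimal sketch of what "help enforce them" might look like as a
lint-style check, written in Python for brevity.  The three rules mirror
the convention described above (types capitalized, macros all upper
case, variables all lower case).  The names CONVENTIONS, check_name and
violations are hypothetical and not taken from any real compiler.

```python
import re

# Hypothetical rules mirroring the convention described above.
CONVENTIONS = {
    "type": re.compile(r"[A-Z][a-z0-9]*"),       # e.g. Stack
    "macro": re.compile(r"[A-Z][A-Z0-9_]*"),     # e.g. MAXLEN
    "variable": re.compile(r"[a-z][a-z0-9_]*"),  # e.g. stack
}


def check_name(kind, name):
    """Return True if `name` follows the convention for `kind`."""
    return CONVENTIONS[kind].fullmatch(name) is not None


def violations(declarations):
    """List the (kind, name) pairs that break the convention."""
    return [(kind, name) for kind, name in declarations
            if not check_name(kind, name)]
```

A check like this only has something to work with if the tool sees the
case the programmer typed, which is the point being argued here.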

Bill Smith
pur-ee!uiucdcs!wsmith
wsmith@a.cs.uiuc.edu

dhesi@bsu-cs.UUCP (Rahul Dhesi) (03/12/88)

In article <2758@csli.STANFORD.EDU> kasper@csli.UUCP (Kasper Osterbye) writes:
>Case sensitivity is always a drag. Someone always will use
>the TWO variables i and I, and I will get quite confused.

Well, I almost agree with this.  Using both i and I could be bad
practice, but not if one is consistently using i, j, k etc. to mean
something and I, J, K, etc. to mean something related.  If
mathematicians can do it (and have done it) for centuries, it can't be
that confusing.

The programmer who is committed to writing bad code is hard to ST0P.
(Or was that STOP?  Darned language allows both ST0P and STOP to be
used.  They really ought to make O and 0 equivalent...and what's this
nonsense about "ioctl" and "ioctrl" and "iocntl" and "iocntrl" being
different?  A properly-designed language would do a Soundex encoding on
variables so we would not accidentally use two variables that could be
confused with each other. And while we're at it, we really ought to
require each numeric literal to have a check-digit, so if a programmer
mistypes 356 as 365, the error will be detected at compile time...
unless, of course, if the programmer miscalculates the check digit...so
better make that three check digits, so if they are miscalculated, the
error will be immediately detected.)

>I like a language
>to treat WriteString as the same identifier as writestring, but I like
>that I can write it both ways.

On the contrary, if somebody uses both WriteString and writestring,
that is clearly a typing error that should be detected by the compiler.
-- 
Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi

sommar@enea.se (Erland Sommarskog) (03/12/88)

Rahul Dhesi (dhesi@bsu-cs.UUCP) writes:
>In Modula-2 reserved words are required to be in uppercase while names
>of standard procedures are in mixed case.  Thus one must be constantly
>using the shift key.  This is painful.
>
>In C reserved words and standard functions are in lowercase.  By simply
>not using uppercase letters one gets all the advantages of a
>case-insensitive language without any hassles.  

So the argument against uppercase letters is the use of the
shift key? But then English must be a pain as well. Not to mention any
programming language that uses operators like !"#$%&/() for which you
need the shift key.

More seriously, I think the use of mixed case in identifier names is a
virtue. Why call something averylongnamewhichyoubarelycannotread when
you can call it AVeryLongNameWhichIsYetEasyToReadAtFirstGlance? OK, you
may use underscores instead of case shifts. However, that has a drawback
if the compiler has a low limit on significant characters.

The good point about having the reserved words in uppercase, as in
Modula-2, is that they stand out much more and highlight the structure.
(Provided of course that you don't use uppercase in your identifiers!)
Personally I find programs written entirely in lowercase quite unreadable.

I am a little ambivalent about case sensitivity as such, but I tend
to be in favour of it. (In languages, that is. In Unix it's a pain and
nothing else.) The reason is that it is quite confusing to see a variable
sometimes called "Count" and sometimes "count". Did he mean anything by
that? Also I recall spending time hunting a bug in the following
scenario:
   Program ....
   Const  WORDLENGTH = 80;      (* Which *is* one word in Swedish *)
   .....
      Procedure ....
      var WordLength : integer;
      ... 
         If WordLength < 20 then
For me "WORDLENGTH" and "WordLength" were two different entities so
I just couldn't understand why the machine thought all words were longer
than 20 letters. Took quite a while until I realized what happened.
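
A toy model of the scenario above, sketched in Python.  It assumes a
chain of nested scopes and compares the two lookup rules.  The Scope
class, the build helper and the value 7 are all hypothetical stand-ins;
the elided parts of the original program are not reconstructed here.

```python
class Scope:
    """A toy symbol table with an optional enclosing scope."""

    def __init__(self, parent=None, case_sensitive=True):
        self.parent = parent
        self.case_sensitive = case_sensitive
        self.symbols = {}

    def _key(self, name):
        # A case-insensitive language effectively folds every name.
        return name if self.case_sensitive else name.lower()

    def declare(self, name, value):
        self.symbols[self._key(name)] = value

    def lookup(self, name):
        # Walk outwards from the innermost scope.
        scope = self
        while scope is not None:
            key = scope._key(name)
            if key in scope.symbols:
                return scope.symbols[key]
            scope = scope.parent
        raise KeyError(name)


def build(case_sensitive):
    outer = Scope(case_sensitive=case_sensitive)
    outer.declare("WORDLENGTH", 80)   # Const WORDLENGTH = 80
    inner = Scope(parent=outer, case_sensitive=case_sensitive)
    inner.declare("WordLength", 7)    # var WordLength (7 is arbitrary)
    return inner
```

With build(True) the two spellings resolve to 80 and 7 respectively.
With build(False) both spellings reach the inner declaration, and no
diagnostic is produced, so the two "different" entities silently become
one.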

Finally on reserved words: Ideally I think they should be accepted in
any case, so as not to bind the programmer too much to a style he may
dislike. And anyway it's quite a folly to declare a variable called "else".
-- 
Erland Sommarskog       
ENEA Data, Stockholm        
sommar@enea.UUCP           "Si tu crois l'amour tabou...
                            Regarde bien, les yeux d'un fou!!!" -- Ange

kers@otter.hple.hp.com (Christopher Dollin) (03/14/88)

"kasper@csli.STANFORD.EDU (Kasper Osterbye)" says:

|I disagree. Case sensitivity is always a drag. Someone always will use
|the TWO variables i and I, and I will get quite confused. 

But you don't object to the two names "ii", "ij"? Or "Payment_1" and 
"Payment_2"? Or using "i" ("x", "z", "a" ...) as anything other than a *very*
local name?

Anyone who uses both "i" and "I" in the same region of text (never mind scope!)
is inviting the same kinds of problem as any confusing use of names. The case
structure of the names has little to do with it.

[Hm, how about the names "i" and "eye"? or "x" and "eggs"? Good thing I never
really believed in the tellingbone test .......]



Regards,
Kers                                    | "Why Lisp if you can talk Poperly?"

John_M@spectrix.UUCP (John Macdonald) (03/15/88)

In article <2758@csli.STANFORD.EDU> kasper@csli.UUCP (Kasper Osterbye) writes:
>In article <2318@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>Just an opinion:  case sensitivity in a programming language is not in
>>itself a bad thing.  It is how it is used that can cause problems.
>> --> [Arguments]
>>Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
>
>I disagree. Case sensitivity is always a drag. Someone always will use
>the TWO variables i and I, and I will get quite confused. I like a language
>to treat WriteString as the same identifier as writestring, but I like
>that I can write it both ways.

I would rather you used write_string or WriteString, but not writestring.
Whenever you are working in a language which does not permit _ (or . or
some other separator) as a legitimate character in an identifier, then
using capitalization to indicate internal breaks is necessary, and case
insensitivity is DANGEROUS.  In the following (contrived) declarations:

integer     ForEd, ForSusan, ForWard;

integer     Backward, Forward;

you certainly get a shock when your two identifiers ForWard and Forward
conflict.  If the second is declared in a nested scope, there isn't
any error in many languages - just a nasty debugging problem.  Even if
your language does tell you about the conflict, how do you avoid such
conflicts?  In a large program, you end up having to define an official
standard naming convention used by everyone on the project with central
control of the top level of the naming scheme to ensure that there is
no POSSIBLE overlap.

This sort of unexpected equivalence is especially prone to occur when
acronyms or abbreviations are used as a portion of an identifier name.
For example, suppose Wildenwooly Research Institute designed an automotive
analysis program, containing the routines wriTestPiston, wriTestSpark,
wriTestValve, and wriTestRing.  The "wri" prefix distinguishes all of the
routines in their library, so that you are unlikely to get conflicts in
your own programs when you use this (they distributed it over the net, of
course).  You would get an interesting result if you included that module
in your program and tried to also use a call to that previously mentioned
WriteString.
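
A small Python sketch of the information that case carries in these
names.  It assumes identifiers made of letters only; split_words and
fold are hypothetical helper names.  Splitting at the capitals recovers
the intended words, while folding to one case discards them.

```python
import re


def split_words(identifier):
    """Split a letters-only identifier into its capitalised words."""
    return re.findall(r"[A-Z]?[a-z]+|[A-Z]+(?![a-z])", identifier)


def fold(identifier):
    """What a case-insensitive compiler effectively compares."""
    return identifier.lower()


for name in ("wriTestRing", "WriteString", "ForWard", "Forward"):
    print(name, split_words(name), fold(name))
```

Here wriTestRing splits into wri/Test/Ring and WriteString into
Write/String, yet both fold to the same string, as do ForWard and
Forward.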

Certainly someone can (mis-)use case sensitivity to avoid thinking of
descriptive meaningful names.  However, if both cases are being used,
then the person who wrote the code thinks it is sensitive (and sensible)
to use that specific case.  Why should a compiler throw away information
that someone had to deliberately provide?

If holding down the shift key is too much of "a drag", then convert
the programs which you are using to lower case (and live with any
conflicts you have thereby created).  Don't ask for compilers to
destroy useful distinctions.
-- 
John Macdonald   UUCP:    {mnetor,utzoo}             !spectrix!jmm

d25001@mic.UUCP (03/15/88)

>>In article <2318@bsu-cs.UUCP> dhesi@bsu-cs.UUCP (Rahul Dhesi) writes:
>>>Just an opinion:  case sensitivity in a programming language is not in
>>>itself a bad thing.  It is how it is used that can cause problems.
>		       ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>> --> [Arguments]
>>>Rahul Dhesi         UUCP:  <backbones>!{iuvax,pur-ee,uunet}!bsu-cs!dhesi
>>
>>I disagree. Case sensitivity is always a drag. Someone always will use
>>the TWO variables i and I, and I will get quite confused. I like a language
>>to treat WriteString as the same identifier as writestring, but I like
>>that I can write it both ways.
>
>It is how it is used that can cause problems.  If someone names two variables
>I and i, he should expect that you, reading his program later,  would get
>quite confused.  >>Naming conventions<< let you use the power of the case
>sensitivity safely.
>
>If I have a naming convention that ...
>                                                ..., then as long as a
>programmer follows the rules no one will get confused.  BUT, if
>the language is case insensitive, the programmer will start ignoring the
>naming conventions because the compiler lets her, and the situation is just as
>bad as a case sensitive with no conventions and variables I and i.
>
>The case usage of a token can be used to convey information that is available
>elsewhere, but is more convenient to redundantly convey as upper/lower
>case letters.  The name WriteString is obviously a procedure (based on
>most naming conventions).  What is writestring?  Is it a variable
>containing the string that is going to be written out?  Is it a boolean
>deciding if a string should be written?    The language should let you
>pick naming conventions and then help enforce them.  A case insensitive
>language cannot help enforce naming conventions like the ones I describe.
>
>Bill Smith
>pur-ee!uiucdcs!wsmith
>wsmith@a.cs.uiuc.edu

     This posting seems to confuse two issues.  1) The desirability
of a set of conventions for encoding extra meaning into token names by
means of upper and lower case distinctions.  2) Whether case sensitive
languages are a useful tool for enforcing such conventions.
     I am ambivalent about 1).  I have used case conventions on a few
personal projects and am not sure of the results.  But even if we --
for the sake of argument -- grant the first point, I am doubtful of the
second.
     The _ONLY_ useful tool for enforcing case naming conventions (or
any other programming conventions) is the code walk-thru or code
review.  Case sensitive languages (CSLs) are at best a mixed blessing in
this regard.  The CSL can help to enforce the group's standards ONLY with
regard to external names and names that occur in included code
fragments (if the language supports such).  The CSL will enforce that
WriteString <> writestring <> WRITESTRING, but for local names it will
be as happy to use one as another.
     The poster claims that WriteString is "obviously" a procedure
name.  He may work in an area where case conventions make it obvious,
but such conditions are far from universal.  To me WriteString is
'obviously' a procedure name only because I have dabbled in Modula-2
where it is the name of a 'standard' procedure (in the InOut module).
I have three books on Modula-2 (Wirth, Gleaves, and Beidler &
Jackowitz), and from none of them can I infer any convention by which I
could deduce that WriteString is a procedure.  Indeed, the code
fragments and examples in these three books show no convention
whatsoever in the use of case; I can find example procedure names in
mixed case, all lower case and all upper.  If "WriteString" is
'obviously' a procedure name, what about "ALLOCATE"?  It too is a
'standard' Modula-2 procedure!
     This, of course, is the down side of attempting to use the case
sensitivity of the language to enforce your own coding standards.  The
rest of the world -- including those supplying your language's standard
library -- may have different conventions than you do.  Your
conventions may say the procedure ought to be called "Allocate", but
the language will enforce "ALLOCATE".  (Unless, of course, you want to
rewrite all of the standard library -- and any other 'foreign' code --
that does not conform.)
    With the case insensitive language you don't have this particular
problem.  You can call "Allocate" and "DeAllocate" without regard to
the fact that your vendor thought of them as "ALLOCATE" and
"DEALLOCATE".
     Thus, the CSL cannot enforce your case conventions for local
names.  It can (sort of) enforce them for external names, but this
seems to be a mixed blessing at best.  As you are going to need code
reviews to enforce your conventions for local variables, maybe you
really would be happier with a case insensitive language after all.

Carrington Dixon
UUCP: {convex, killer} mic!d25001

kasper@csli.STANFORD.EDU (Kasper Osterbye) (03/15/88)

In a previous posting I said that I would like to be able to use both
cases (upper and lower) in my writing and reading, but the compiler
should not be able to tell the difference between i and I. Several
people have made such good remarks on this that I feel convinced that
I was wrong. Proper use of ``cased'' names might be a sure way to
improve readability, and my example of using I and i was badly picked.
I was especially convinced by the example of "cntl","cntrl","contrl".
Confusion of names is not limited to case, and as long as we do not
have a way to let the compiler warn us about the "control" case, I
feel now that we must make the best of the "case". Also some Modula
programs are more readable because of the casing.

For example: "stack" of type Stack, where "stack" is used locally, and no
other variables of type Stack are used in the immediate surroundings.

I hereby retract from the scene, a bit wiser.

regards,

Kasper

yuval@taux01.UUCP (Gideon Yuval) (03/17/88)

An obvious fix is a compiler warning (NOT error) message: "the following
pairs of names differ only in the case of 1 or more letters:".
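
A minimal sketch of such a warning pass in Python, assuming the compiler
can hand over the list of declared names.  The function names
case_only_pairs and warning are hypothetical.

```python
from collections import defaultdict
from itertools import combinations


def case_only_pairs(names):
    """Return sorted pairs of distinct names that differ only in case."""
    groups = defaultdict(set)
    for name in names:
        groups[name.lower()].add(name)
    pairs = []
    for spellings in groups.values():
        pairs.extend(combinations(sorted(spellings), 2))
    return sorted(pairs)


def warning(names):
    """Format the warning text, or return None if nothing to report."""
    pairs = case_only_pairs(names)
    if not pairs:
        return None
    lines = ["warning: the following pairs of names differ only in the "
             "case of 1 or more letters:"]
    lines += [f"  {a} / {b}" for a, b in pairs]
    return "\n".join(lines)
```

Because this is a warning and not an error, the names stay distinct and
the program still compiles; the pass only reports the suspicious pairs.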
-- 
Gideon Yuval, +972-52-522255 (work), -2-690992 (home), yuval@taux02.nsc.com
 Paper-mail: National Semiconductor, 6 Maskit St., Herzliyah, Israel

nevin1@ihlpf.ATT.COM (00704a-Liber) (03/19/88)

In article <2832@csli.STANFORD.EDU> kasper@csli.UUCP (Kasper Osterbye) writes:
>In a previous posting I said that I would like to be able to use both
>cases (upper and lower) in my writing and reading, but the compiler
>should not be able to tell the difference between i and I.
>
>I hereby retract from the scene, a bit wiser.


Don't retract so fast :-)!  Following this discussion, there seem to be
only two possibilities:

A)	Variables such as 'CaseVar' and 'casevar' should be considered the same.
	(Case-insensitive)

B)	Variables such as 'CaseVar' and 'casevar' should be considered distinct.
	(Case-sensitive)

There is a third possibility.  Here is my proposal:

C)	If two or more variables, such as 'CaseVar' and 'casevar', only
	differ with respect to case, only one of the variable declarations
	is allowed and the other declarations result in an error (or
	alternatively, the variables are considered distinct but the second
	and subsequent declarations produce warnings).

This seems to fix the problems with case-sensitive and case-insensitive
languages.  Comments??
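
One way proposal C could be sketched, in Python.  The SymbolTable class,
the strict flag and CaseConflictError are hypothetical names; strict
mode gives the error variant and non-strict mode gives the warning
variant in which the variables stay distinct.

```python
class CaseConflictError(Exception):
    """A declaration differs from an earlier one only in case."""


class SymbolTable:
    def __init__(self, strict=True):
        self.strict = strict    # True: error; False: warn, keep both
        self.by_folded = {}     # folded name -> declared spellings
        self.warnings = []

    def declare(self, name):
        earlier = self.by_folded.setdefault(name.lower(), [])
        clashes = [e for e in earlier if e != name]
        if clashes:
            message = f"'{name}' differs only in case from '{clashes[0]}'"
            if self.strict:
                raise CaseConflictError(message)
            self.warnings.append(message)
        if name not in earlier:
            earlier.append(name)
```

Declaring 'CaseVar' and then 'casevar' raises in strict mode; in
non-strict mode both spellings are recorded and one warning is queued.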


-- 
 _ __			NEVIN J. LIBER	..!ihnp4!ihlpf!nevin1	(312) 510-6194
' )  )				"The secret compartment of my ring I fill
 /  / _ , __o  ____		 with an Underdog super-energy pill."
/  (_</_\/ <__/ / <_	These are solely MY opinions, not AT&T's, blah blah blah

chris@mimsy.UUCP (Chris Torek) (03/21/88)

In article <2758@csli.STANFORD.EDU> kasper@csli.STANFORD.EDU (Kasper
Osterbye) writes:
>Case sensitivity is always a drag.

NATUraLLy, since as yOU CAn see bY ReADINg this, peoPLe aRE nOT CAse
seNsiTIve thEMSelVEs.

That (unless you read it on an uppercase-only system) should make
it clear that people ARE case-sensitive to some degree.  Since I
am case sensitive, it has never bothered me that many of the languages
I use are case sensitive as well.  To be fair, it has never bothered
me that some are case insensitive.  What DOES bother me is languages
that constantly force me to shift cases:

	FOR primecand := 1 TO max DO BEGIN
		IF is_prime(primecand) THEN
			op1(primecand) ELSE op2(primecand); ...

All this does is add noise, much like my first paragraph above.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@mimsy.umd.edu	Path:	uunet!mimsy!chris

ok@quintus.UUCP (Richard A. O'Keefe) (03/21/88)

In article <4049@ihlpf.ATT.COM>, nevin1@ihlpf.ATT.COM (00704a-Liber) writes:
: There is a third possibility.  Here is my proposal:
: C)	If two or more variables, such as 'CaseVar' and 'casevar', only
: 	differ with respect to case, only one of the variable declarations
: 	is allowed and the other declarations result in an error (or
: 	alternatively, the variables are consdered distinct but the second
: 	and subsequent declarations produce warnings).
: 
: This seems to fix the problems with case-sensitive and case-insensitive
: languages.  Comments??

Comment:  Euclid already does this.
Comment:  Some Xerox file-systems retain the case specified when a file was
	  created but accept either case when a file is looked up; attempting
	  to create another file with the same name but different case pattern
	  can warn you of the existing file.

rcd@ico.ISC.COM (Dick Dunn) (03/24/88)

> >Just an opinion:  case sensitivity in a programming language is not in
> >itself a bad thing.  It is how it is used that can cause problems.
...
> I disagree. Case sensitivity is always a drag. Someone always will use
> the TWO variables i and I, and I will get quite confused...

I've found this (mixing case to get separate identifiers) only a few times
in code I've seen in the past ten years or so, and only once was it
confusing.  In the confusing case, it was used to try to plaster over a bug
which involved using one variable, r, to mean two different things.  When a
second variable, R, was introduced, the confusion was to be expected.

In the other cases, the different variable names were precisely
those used in a formula in non-computer usage.  For example, in simple
physics, g and G are both common.  Both are named with the same letter
because both refer to gravitation, but they are numerically quite different
and, in fact, dimensionally different.  People work with these two
constants, represented by letters differing only in case, with no
particular difficulty.

One useful convention I came across was names like Alpha and alpha,
referring respectively to upper and lower case Greek letters.  I'd hate
to pull that out from under someone in a language design.

(And speaking from a personal view, since my first name is Dick I happen to
like case-sensitivity.)
-- 
Dick Dunn      UUCP: {hao,nbires,cbosgd}!ico!rcd       (303)449-2870
   ...Simpler is better.

timd@cognos.uucp (Tim Dudley) (03/29/88)

In article <10742@mimsy.UUCP> chris@mimsy.UUCP (Chris Torek) writes:
-In article <2758@csli.STANFORD.EDU> kasper@csli.STANFORD.EDU (Kasper
-Osterbye) writes:
--Case sensitivity is always a drag.
-
-NATUraLLy, since as yOU CAn see bY ReADINg this, peoPLe aRE nOT CAse
   etc.
-... clear that people ARE case-sensitive to some degree.  Since I
-am case sensitive, it has never bothered me that many of the languages
-I use are case sensitive as well.  To be fair, it has never bothered
-me that some are case insensitive.  What DOES bother me is languages
-that constantly force me to shift cases:
-

Sounds to me like you guys need a good CASE tool.

-- 
Tim Dudley         Cognos Incorporated
(613) 738-1440     3755 Riverside Drive, Ottawa, Ontario, CANADA  K1G 3N3
decvax!utzoo!dciem!nrcaer!cognos!timd

barmar@think.COM (Barry Margolin) (04/25/89)

This discussion is losing its specificity to C, so I've directed
followups to comp.lang.misc.

In article <10182@socslgw.csl.sony.JUNET> diamond@csl.sony.junet (Norman Diamond) writes:
>Come on Henry, you wouldn't want to have to distinguish identifiers named
>myFunc and myfunc, when reading someone else's code.  If you don't want to
>have myFunc map onto myfunc (i.e. not be synonymous) then suggest a require-
>ment that all occurences of an identifier be consistent in case, but it is
>silly to permit two distinct identifiers to differ only in case.

So long as such things are not used haphazardly, case differences can
be useful.  For instance, in a language that doesn't permit some kind
of delimiter (e.g. underscores or hyphens) in identifiers, it's
possible that there could be two procedures named ReAdjustOneToken and
ReadJustOneToken (I know, I'm stretching); case is being used as a
delimiter, and it's completely obvious which one is which (but
readjustonetoken would be a poor name for either).

At a previous place of employment, case distinctions were often part
of some developers' or groups' programming conventions.  For instance,
parameters to top-level procedures might be named P_xxx, while
parameters to internal procedures would be named p_xxx.  Another group
used uppercase prefixes and suffixes to indicate data type; e.g. fooP
would be a pointer to a foo (we used PL/I, which doesn't have typed
pointers), but foop would be something else (and foopP would be a
pointer to a foop).

I've used both case-sensitive languages and case-insensitive
languages.  They both have their advantages and disadvantages.  Case
distinction, if used carefully, can be a good tool.  It's easy to
ignore when you aren't concerned about the attribute it represents,
and easy to notice when you are.

Barry Margolin
Thinking Machines Corp.

barmar@think.com
{uunet,harvard}!think!barmar

flee@shire.cs.psu.edu (Felix Lee) (04/25/89)

Case sensitivity is useful.  For those opposed, I propose "case
tolerance":  if a case sensitive scan of the symbol table fails, do a
case insensitive scan and spit out a "wrong case" warning if it
succeeds.
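
A minimal sketch of "case tolerance" in Python, assuming the symbol
table is a plain dictionary keyed by the declared spelling.  The name
tolerant_lookup and the warn callback are hypothetical.

```python
def tolerant_lookup(table, name, warn=print):
    """Exact lookup first; on failure, try a case-insensitive scan."""
    if name in table:
        return table[name]
    for declared, value in table.items():
        if declared.lower() == name.lower():
            warn(f"wrong case: '{name}' should be '{declared}'")
            return value
    raise KeyError(name)
```

If several declared names match after folding, this sketch simply takes
the first one it finds; a real implementation would have to decide how
to handle that ambiguity.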
--
Felix Lee	flee@shire.cs.psu.edu	*!psuvax1!shire!flee