[comp.protocols.tcp-ip.domains] Case-sensitive Name Service Routines

jbeck@hpindwa.HP.COM (John Beck) (09/28/90)

I have noticed that gethostbyname matches host names in a case-insensitive
manner, whereas the other name service routines (getnetbyname, getprotobyname,
getservbyname) are case-sensitive. In addition, getservbyname and getservbyport
also take a protocol parameter, which they match in a case-sensitive manner.

The reason gethostbyname is case-insensitive is that the RFCs for the
Domain Name System (1034 & 1035) prescribe that domain names must be
case-insensitive. The others, however, are unspecified.

It seems to me that the other routines should also match names in a case-
insensitive manner. Should it matter whether you call getservbyname with a
parameter of "ftp" vs. "FTP"? I think both should work, although currently
the former will return a pointer to a proper servent structure, whereas the
latter will only return a null pointer.

How do others feel about this issue? Is it worth pursuing to get a change
effected? Or is it too trivial to bother with? Responses encouraged.

Thanks. :-)

-- John Beck
   jbeck@hpda.hp.com

Dan@dna.lth.se (Dan Oscarsson) (09/29/90)

In article <40500002@hpindwa.HP.COM> jbeck@hpindwa.HP.COM (John Beck) writes:
>I have noticed that gethostbyname matches host names in a case-insensitive
>manner, whereas the other name service routines (getnetbyname, getprotobyname,
>getservbyname) are case-sensitive. In addition, getservbyname and getservbyport
>also take a protocol parameter, which they match in a case-sensitive manner.
>
>The reason gethostbyname is case-insensitive is that the RFCs for the
>Domain Name System (1034 & 1035) prescribe that domain names must be
>case-insensitive. The others, however, are unspecified.
>
>It seems to me that the other routines should also match names in a case-
>insensitive manner. Should it matter whether you call getservbyname with a
>parameter of "ftp" vs. "FTP"? I think both should work, although currently
>the former will return a pointer to a proper servent structure, whereas the
>latter will only return a null pointer.
>
>How do others feel about this issue? Is it worth pursuing to get a change
>effected? Or is it too trivial to bother with? Responses encouraged.
>
This is one area where Unix is a mess. For most normal people (non-Unix)
a letter A is an A independent of case. I think most of you do not think
an A and an a read in a book have different meanings. Therefore I feel
it would be much better if as much as possible were case-insensitive:
at least all routines concerning host names, protocols, services, user names and
group names (and some more I cannot think of just now).
Even file names should be looked up case-insensitively (but named case-sensitively),
but this may be too much for some Unix hackers.

   Dan

-- 
Dan Oscarsson                              Department of Computer Science
                                           Lund Institute of Technology
e-mail:  Dan@dna.lth.se                    Box 118
                                           S-221 00 Lund, Sweden

rickert@mp.cs.niu.edu (Neil Rickert) (09/29/90)

In article <40500002@hpindwa.HP.COM> jbeck@hpindwa.HP.COM (John Beck) writes:
>I have noticed that gethostbyname matches host names in a case-insensitive
>manner, whereas the other name service routines (getnetbyname, getprotobyname,
>getservbyname) are case-sensitive. In addition, getservbyname and getservbyport
>also take a protocol parameter, which they match in a case-sensitive manner.
>
>The reason gethostbyname is case-insensitive is that the RFCs for the
>Domain Name System (1034 & 1035) prescribe that domain names must be
>case-insensitive. The others, however, are unspecified.
>(...)
>How do others feel about this issue? Is it worth pursuing to get a change
>effected? Or is it too trivial to bother with? Responses encouraged.

 This is not worth worrying about.  There is an important reason for treating
'gethostbyname' differently.  The data used by 'getnetbyname()' etc is
maintained on the local host, so the local administrator can ensure that it
is all lower case.  The data searched for 'gethostbyname()' is part of
a distributed database that is not under local control, so software cannot
make assumptions about case.
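
If the local files really are kept all lower case, a caller who wants
caseless behaviour can fold the arguments before the lookup.  The following
is only a rough sketch under that assumption: fold_lower() and
getservbyname_folded() are hypothetical helper names, the buffer sizes are
arbitrary, and only getservbyname() itself is a real library routine.

```c
#include <assert.h>
#include <ctype.h>
#include <netdb.h>
#include <stddef.h>
#include <string.h>

/* Copy src into dst (capacity cap, counting the terminating NUL), folding
 * letters to lower case.  Returns 0 on success, -1 if src does not fit. */
int fold_lower(char *dst, size_t cap, const char *src)
{
    size_t i;

    assert(dst != NULL && src != NULL);
    if (cap == 0)
        return -1;
    for (i = 0; src[i] != '\0'; i++) {
        if (i + 1 >= cap)
            return -1;
        dst[i] = (char)tolower((unsigned char)src[i]);
    }
    dst[i] = '\0';
    return 0;
}

/* Hypothetical wrapper: fold both arguments to lower case, then call the
 * real getservbyname().  This only helps if the local services data is
 * itself stored in lower case (the assumption stated above). */
struct servent *getservbyname_folded(const char *name, const char *proto)
{
    char lname[64], lproto[32];   /* arbitrary sizes chosen for the sketch */

    if (fold_lower(lname, sizeof lname, name) != 0)
        return NULL;
    if (proto == NULL)
        return getservbyname(lname, NULL);
    if (fold_lower(lproto, sizeof lproto, proto) != 0)
        return NULL;
    return getservbyname(lname, lproto);
}
```

With this, getservbyname_folded("FTP", "TCP") asks the library for "ftp"
and "tcp"; whether an entry comes back still depends on the local data.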

 If it ain't broken, don't fix it.
-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115.                                  +1-815-753-6940

jacob@gore.com (Jacob Gore) (09/29/90)

/ comp.protocols.tcp-ip.domains / Dan@dna.lth.se (Dan Oscarsson) / Sep 29 '90 /
> For most normal people (non-Unix)
> a letter A is an A independent of case. I think most of you do not think
> an A and an a read in a book have different meanings.

That's not always true.  The following phrases have case-sensitive meanings:

	the President		the president
	Mother			mother
	Little Rock		little rock

etc.

Jacob
--
Jacob Gore		Jacob@Gore.Com			boulder!gore!jacob

wb8foz@mthvax.cs.miami.edu (David Lesher) (09/30/90)

>That's not always true.  The following phrases have case-sensitive meanings:

>	the President		the president
>	Mother			mother
>	Little Rock		little rock

The ultimate:
	Polish			polish

How do you talk about waxing cars in Warsaw?

-- 
A host is a host from coast to coast.....wb8foz@mthvax.cs.miami.edu 
& no one will talk to a host that's close............(305) 255-RTFM
Unless the host (that isn't close)......................pob 570-335
is busy, hung or dead....................................33257-0335

david@twg.com (David S. Herron) (09/30/90)

In article <1990Sep29.091829.7625@lth.se> Dan@dna.lth.se (Dan Oscarsson) writes:
>In article <40500002@hpindwa.HP.COM> jbeck@hpindwa.HP.COM (John Beck) writes:

[John Beck notices that gethostbyname() does caseless matching
 but that other things such as getservbyname() do not...]

>>How do others feel about this issue? Is it worth pursuing to get a change
>>effected? Or is it too trivial to bother with? Responses encouraged.
>>
>This is one area where Unix is a mess. For most normal people (non-Unix)
>a letter A is an A independent of case. I think most of you do not think
>an A and an a read in a book have different meanings. Therefore I feel
>it would be much better if as much as possible were case-insensitive:
>at least all routines concerning host names, protocols, services, user names and
>group names (and some more I cannot think of just now).
>Even file names should be looked up case-insensitively (but named case-sensitively),
>but this may be too much for some Unix hackers.

yah.. that's "too much" for this Unix hacker.

XtAddCallback() is a very different looking creature from
xtaddcallback().  Ergo .. one case where case changes are useful.

In any case .. gethostbyname() ultimately uses strcasecmp() to do
caseless string comparisons.  I'm sure it would be trivial to make
getservbyname() and the others do caseless string comparisons too.
Since the semantics of network, host and service names say that
case is insignificant, this is a reasonable change.
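
To give an idea of how small such a change could be, here is a sketch of a
caseless service lookup built on the standard enumeration calls
(setservent(), getservent(), endservent()) and strcasecmp().  The names
servent_name_matches() and getservbyname_nocase() are made up for
illustration; this is not taken from any vendor's library source.

```c
#include <assert.h>
#include <netdb.h>
#include <stddef.h>
#include <strings.h>   /* strcasecmp() */

/* Return nonzero if `name` equals the entry's official name or one of
 * its aliases, ignoring case. */
int servent_name_matches(const struct servent *sp, const char *name)
{
    char **alias;

    assert(sp != NULL && name != NULL);
    if (sp->s_name != NULL && strcasecmp(sp->s_name, name) == 0)
        return 1;
    if (sp->s_aliases != NULL)
        for (alias = sp->s_aliases; *alias != NULL; alias++)
            if (strcasecmp(*alias, name) == 0)
                return 1;
    return 0;
}

/* Hypothetical caseless variant of getservbyname(): walk the services
 * database and compare both the name and the protocol without regard to
 * case.  A NULL proto matches any protocol.  As with getservent(), the
 * result points at static storage. */
struct servent *getservbyname_nocase(const char *name, const char *proto)
{
    struct servent *sp;

    setservent(0);
    while ((sp = getservent()) != NULL) {
        if (proto != NULL && strcasecmp(sp->s_proto, proto) != 0)
            continue;
        if (servent_name_matches(sp, name))
            break;
    }
    endservent();
    return sp;
}
```

A linear scan like this is the simplest thing that could work; a library
that hashes or caches its lookups would need the same caseless comparison
applied to its keys instead.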

But this does not mean that the semantics of different cases are
always unimportant!

Just because *you* can't think of any use for enforcing casefull-ness
in file names doesn't mean that there isn't a use.

A for-instance I have in mind is a database being stored using the
file system as the hashing/indexing scheme.  There are a number of ways
of doing this, and one of the early Unix authors did a comparison
paper showing that if the Unix file system were a database then the
indexing overhead is a h*ll of a lot lower than in more traditional
databases.

Er, back to the subject -- casefullness would help by increasing
the set of characters that can be used in hash strings.  This would
tend to decrease the number of levels of indexing required to get
to a particular data item.
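
A back-of-the-envelope way to see the effect, assuming (purely for
illustration) that each directory level consumes one character of the hash
string: with an alphabet of k characters, reaching N items needs the
smallest number of levels L with k^L >= N.  levels_needed() below is a
hypothetical helper that computes that L.

```c
#include <assert.h>
#include <limits.h>

/* Smallest number of one-character directory levels such that
 * alphabet^levels >= items.  Assumes one hash character per level. */
unsigned levels_needed(unsigned long long items, unsigned alphabet)
{
    unsigned levels = 0;
    unsigned long long capacity = 1;

    assert(alphabet >= 2);
    while (capacity < items) {
        /* If the next multiply would overflow, the true capacity already
         * exceeds anything representable, so one more level suffices. */
        if (capacity > ULLONG_MAX / alphabet) {
            levels++;
            break;
        }
        capacity *= alphabet;
        levels++;
    }
    return levels;
}
```

For a million items, levels_needed(1000000, 26) is 5 (26^4 = 456,976 falls
short), while levels_needed(1000000, 52) is 4 (52^4 = 7,311,616), so
doubling the alphabet by keeping case saves one level in this example.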

Even in written communication case differences are important.  For
instance, it grates on the nerves to see "UNIX" or "unix"; the
spelling is properly "Unix".  Notice our two sentences above ...

At the most -- to help humans who are using these systems -- there
could be some kind of environmental option one could set to influence
the kind of matching used in the file system.  It would be a pretty
simple hack to have namei() use a caseless compare given some global
flag, in either the "proc" or "user" structures.  The hard part is
getting that global option into whichever of those structures is
the appropriate place.  There isn't, currently, either a system call
or a user command for manipulating that kind of environment option.

I do think it would be a good idea ..

I don't think that traditional "environment variables" are the right
place since this particular thing should be implemented inside the
kernel and the kernel doesn't really have access to environment variables.
Rather .. my memory is that environment variables are purely an invention
of shell authors and the closest the OS gets to being involved
with them is to pass them along during exec()'s.


-- 
<- David Herron, an MMDF & WIN/MHS guy, <david@twg.com>
<- Formerly: David Herron -- NonResident E-Mail Hack <david@ms.uky.edu>
<-
<- Sign me up for one "I survived Jaka's Story" T-shirt!

eric@sunic.sunet.se (Eric Thomas SUNET) (10/01/90)

In article <700001@gore.com> jacob@gore.com (Jacob Gore) writes:
>/ comp.protocols.tcp-ip.domains / Dan@dna.lth.se (Dan Oscarsson) / Sep 29 '90 /
>> For most normal people (non-Unix) a letter A is an A independent of case.
>> I think most of you do not think an A and an a read in a book have different
>> meanings.
>That's not always true.  The following phrases have case-sensitive meanings:
>
>	the President		the president

If I send snail-mail to the 'president' instead of 'President', does it get
delivered? If I send e-mail to him and he's on a Unix host, will it get
delivered?

  Eric

So, I don't have a .sig, I'm opposed to the concept. But some stupid piece of
software decided that you can't make a posting where you quote more than you
say, even though the total article size is less than 15 lines. I guess it is
deemed appropriate for me to waste more bandwidth and bore you with my whining.

rickert@mp.cs.niu.edu (Neil Rickert) (10/01/90)

In article <2172@sunic.sunet.se> eric@sunic.sunet.se (Eric Thomas SUNET) writes:
>In article <700001@gore.com> jacob@gore.com (Jacob Gore) writes:
>>/ comp.protocols.tcp-ip.domains / Dan@dna.lth.se (Dan Oscarsson) / Sep 29 '90 /
>>> For most normal people (non-Unix) a letter A is an A independent of case.
>>> I think most of you do not think an A and an a read in a book have different
>>> meanings.
>>That's not always true.  The following phrases have case-sensitive meanings:
>>
>>	the President		the president
>
>If I send snail-mail to the 'president' instead of 'President', does it get

 Most likely the capitalization will have no influence on whether it is
delivered.

>delivered? If I send e-mail to him and he's on a Unix host, will it get
>delivered?

 This depends on the mailer flags for the local mailer.  If the mailer
flags specify changing names to lower case, and the correct name is
'President', nothing will ever be delivered, since the lower case version
will never match the correct name.

 With the same flags, but with the login id 'president', it will be
delivered regardless of capitalization.  If the mailer flags preserve case,
then the capitalization must be exact for delivery.
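
 The three cases can be modelled in a few lines.  This is only a sketch of
the matching rule as described here, not the mailer's actual code;
local_part_matches() and its fold_case argument are hypothetical names
standing in for the mailer flag.

```c
#include <assert.h>
#include <ctype.h>
#include <stddef.h>
#include <string.h>

/* Return nonzero if mail addressed to `recipient` would match the login
 * id `login`.  fold_case models a mailer flag that changes the recipient
 * name to lower case before the lookup. */
int local_part_matches(const char *recipient, const char *login, int fold_case)
{
    size_t i;

    assert(recipient != NULL && login != NULL);
    if (!fold_case)
        return strcmp(recipient, login) == 0;   /* case preserved: exact match */

    /* Only the recipient is folded; the login id is compared as stored,
     * so a login id containing upper case can never match. */
    for (i = 0; recipient[i] != '\0'; i++)
        if ((char)tolower((unsigned char)recipient[i]) != login[i])
            return 0;
    return login[i] == '\0';
}
```

 With fold_case set, "President" and "PRESIDENT" both reach the login id
'president', but nothing reaches a login id of 'President'; with it clear,
only an exact match is delivered.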

-- 
=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=*=
  Neil W. Rickert, Computer Science               <rickert@cs.niu.edu>
  Northern Illinois Univ.
  DeKalb, IL 60115.                                  +1-815-753-6940