[mod.std.unix] node name

std-unix@ut-sally.UUCP (Moderator, John Quarterman) (01/08/87)

From: cbosgd!mark@seismo.css.gov (Mark Horton)
Date: 4 Jan 87 21:19:54 GMT
Organization: AT&T Bell Laboratories, Columbus, Oh

I was just going through POSIX and noticed that the only mechanism
for determining the node name is uname (4.4.1.)  I think it's clear
that, while uname is adequate for UUCP, the 8 character limit on
the node name is inadequate for other networks, especially domain
networks such as the ARPANET, CSNET, and the UUCP Zone.  It won't
be adequate for OSI, either, although it isn't currently clear what
would be, since host names may not even be character strings in OSI.

While P.1003 does not restrict implementations to SYS_NMLN=9 (including
the null) it requires that all 5 fields support the full length.
I don't know of any way to increase SYS_NMLN while maintaining binary
compatibility with older programs, which is a typical requirement.

I am also unaware of any application that makes use of the other four
fields.  (Of course, as soon as I say that, several people will point
some out, but I don't know of a runtime use for those fields that is
sufficiently motivating to be included in POSIX.)  A similar feature
would be useful at compile time (predefined preprocessor variables to
allow conditional compilation based on the version) but the typical
program needs to make these decisions at compile time, not runtime.

Wouldn't it make more sense to standardize on a simple long character
string for the node name?  Assuming that OSI names can somehow be
encoded as character strings (a fairly safe assumption, I think)
this ought to handle all the cases.  The 4.2BSD gethostname function,
which passes the length of the buffer:
	gethostname(buffer, bufferlen)
	char *buffer;
	int bufferlen;
seems perfectly suited to this problem.
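
A minimal sketch of how a caller might use this interface, assuming a
present-day system where gethostname is declared in <unistd.h> and takes
a size_t length. The helper name fetch_node_name is hypothetical and is
not part of 4.2BSD or POSIX.

```c
#ifndef _POSIX_C_SOURCE
#define _POSIX_C_SOURCE 200112L   /* ask for the gethostname declaration */
#endif

#include <assert.h>
#include <stddef.h>
#include <unistd.h>

/*
 * Hypothetical helper: fetch the node name into a buffer whose size the
 * caller chooses at runtime.  Returns 0 on success, -1 on failure.
 */
int fetch_node_name(char *buffer, size_t bufferlen)
{
    assert(buffer != NULL);

    if (bufferlen == 0)
        return -1;
    if (gethostname(buffer, bufferlen) != 0)
        return -1;

    /* A truncated name is not guaranteed to be null-terminated,
     * so terminate the buffer explicitly. */
    buffer[bufferlen - 1] = '\0';
    return 0;
}
```

Because the length travels with the call, the library can grow the
maximum name length later without changing the calling convention.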

I believe that uname will have to be phased out in favor of a more
general mechanism over the next few years.  Why is it in the standard?

	Mark

Volume-Number: Volume 9, Number 12

std-unix@ut-sally.UUCP (01/28/87)

From: seismo!gatech!hpcnof!hpfcla!hpfcdc!rml (Bob Lenk)
Date: Thu, 8 Jan 87 20:20:47 mst

> From: cbosgd!mark@seismo.css.gov (Mark Horton)
> 
> While P.1003 does not restrict implementations to SYS_NMLN=9 (including
> the null) it requires that all 5 fields support the full length.
> I don't know of any way to increase SYS_NMLN while maintaining binary
> compatibility with older programs, which is a typical requirement.

Since the publication of the trial use standard, the working group has
agreed to drop the constant SYS_NMLN and any requirement that all
five fields be the same length.  All fields are now specified simply
as null-terminated character arrays.  Increasing the length can still
cause binary compatibility problems, but there are (ugly) ways of
dealing with binary compatibility.

> I am also unaware of any application that makes use of the other four
> fields.

I can imagine applications using the fields in some type of reports,
but I don't know of any portable applications which use them, or of
any strong reason why they are needed.

> Wouldn't it make more sense to standardize on a simple long character
> string for the node name?  Assuming that OSI names can somehow be
> encoded as character strings (a fairly safe assumption, I think)
> this ought to handle all the cases.  The 4.2BSD gethostname function,
> which passes the length of the buffer:
> 	gethostname(buffer, bufferlen)
> 	char *buffer;
> 	int bufferlen;
> seems perfectly suited to this problem.

If we use such an approach, we still need to specify a symbolic constant
(in <limits.h>) for the maximum length of a hostname on an
implementation, so that applications don't need to deal with having
truncated names returned to them.  Uname handles this by the inclusion
of the string within a structure.  Given that, the only difference from
uname is the existence of other fields.  For binary compatibility, I
don't see much difference between an implementation having two calls
both called "uname" or one called "uname" and the other called
"gethostname", which return names of different lengths.

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml

Volume-Number: Volume 9, Number 30

std-unix@ut-sally.UUCP (01/30/87)

From: guy%gorodish@Sun.COM (Guy Harris)
Date: 29 Jan 87 06:53:29 GMT
Reply-To: guy@sun.UUCP (Guy Harris)
Organization: Sun Microsystems, Mountain View

>Increasing the length can still cause binary compatibility problems, but
>there are (ugly) ways of dealing with binary compatibility.

Not even that ugly; yes, they leave loose bits of crud floating
around in your kernel, but most UNIX distributions these days have
lots of this sort of loose crud.  It's aesthetically unpleasant, but
it beats the hell out of supporting aesthetically- and
technically-unpleasant interfaces because you can't declare a flag day
and nuke those interfaces.

Most, if not all, implementations based on UNIX could just assign a
new system call number to a new improved "uname" and leave the old
one around with its old number for binary compatibility.  You can
write a library that contains a "uname" that uses the old call, or
uses the new call and throws away the extra characters.
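
The second option (a library "uname" that uses the new call and throws
away the extra characters) might look roughly like the sketch below.
The names old_utsname, old_uname and OLD_NMLN are hypothetical, and the
system's current uname stands in for the "new improved" call.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

#define OLD_NMLN 9   /* 8 characters plus the terminating null */

/* Hypothetical layout that old programs were compiled against. */
struct old_utsname {
    char sysname[OLD_NMLN];
    char nodename[OLD_NMLN];
    char release[OLD_NMLN];
    char version[OLD_NMLN];
    char machine[OLD_NMLN];
};

/* Copy src into a 9-byte field, discarding anything past 8 characters. */
static void copy_truncated(char *dst, const char *src)
{
    strncpy(dst, src, OLD_NMLN - 1);
    dst[OLD_NMLN - 1] = '\0';
}

/* Compatibility wrapper: call the long-name uname, then truncate. */
int old_uname(struct old_utsname *out)
{
    struct utsname full;

    assert(out != NULL);
    if (uname(&full) < 0)
        return -1;

    copy_truncated(out->sysname, full.sysname);
    copy_truncated(out->nodename, full.nodename);
    copy_truncated(out->release, full.release);
    copy_truncated(out->version, full.version);
    copy_truncated(out->machine, full.machine);
    return 0;
}
```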

Volume-Number: Volume 9, Number 37

std-unix@ut-sally.UUCP (02/01/87)

>From: seismo!gatech!hpcnof!hpfcla!hpfcdc!rml (Bob Lenk)
>
>> From: cbosgd!mark@seismo.css.gov (Mark Horton)
>> 
>> While P.1003 does not restrict implementations to SYS_NMLN=9 (including
>> the null) it requires that all 5 fields support the full length.
>> I don't know of any way to increase SYS_NMLN while maintaining binary
>> compatibility with older programs, which is a typical requirement.
>
>Since the publication of the trial use standard, the working group has
>agreed to drop the constant SYS_NMLN and any requirement that all
>five fields be the same length.  All fields are now specified simply
>as null-terminated character arrays.

Does the new standard still allow an implementation whose nodename field
holds only 9 characters?  If so, I predict that the binary compatibility
issue will be so overwhelming that vendors will not increase this size,
and systems will continue to be unable to handle networks properly.

> Increasing the length can still cause binary compatibility problems, but
> there are (ugly) ways of dealing with binary compatibility.

> From: guy%gorodish@Sun.COM (Guy Harris)

> Most, if not all, implementations based on UNIX could just assign a
> new system call number to a new improved "uname" and leave the old
> one around with its old number for binary compatibility.  You can
> write a library that contains a "uname" that uses the old call, or
> uses the new call and throws away the extra characters.

Generally, you have to be upward compatible in three ways:
(1) source code compatible: easy, just fix <utsname.h>
(2) binary a.out compatible: ugly but easy, change the system call number
(3) binary .o compatible: oops - how do you handle this one?
    An existing library libfoo.a can call uname.  You relink the
    old library with the new libc, getting the new system call
    number with the old include file.  I don't see any way to tell
    old .o's from new .o's, since uname does not pass the size
    of the structure or any other distinguishing information.
    (You could change the .o format/version, and teach the linker to know
    about uname and which .o format/version gets which version of uname,
    but that's a pretty horrible thought.)
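
To make case (3) concrete, here is a rough sketch of the two layouts
side by side.  Both structures are hypothetical, and the 257-byte
nodename is only an assumed figure for illustration.  An old .o reserves
space for the small structure, so a new uname that fills in the large
one would write past the end of the caller's buffer.

```c
#include <assert.h>
#include <stddef.h>

#define OLD_NMLN    9     /* 8 characters plus the null */
#define NEW_NODELEN 257   /* assumed: 256 characters plus the null */

/* Layout compiled into old .o files (hypothetical). */
struct old_utsname {
    char sysname[OLD_NMLN];
    char nodename[OLD_NMLN];
    char release[OLD_NMLN];
    char version[OLD_NMLN];
    char machine[OLD_NMLN];
};

/* Layout a new libc would fill in (hypothetical). */
struct new_utsname {
    char sysname[OLD_NMLN];
    char nodename[NEW_NODELEN];
    char release[OLD_NMLN];
    char version[OLD_NMLN];
    char machine[OLD_NMLN];
};

/* Bytes a new uname would write beyond a buffer sized by an old .o. */
size_t overrun_bytes(void)
{
    assert(sizeof(struct new_utsname) >= sizeof(struct old_utsname));
    return sizeof(struct new_utsname) - sizeof(struct old_utsname);
}

/* How far the release field moves between the two layouts. */
size_t release_shift(void)
{
    return offsetof(struct new_utsname, release)
         - offsetof(struct old_utsname, release);
}
```

Nothing in the call itself tells the library which of the two layouts
the caller was compiled with, which is the heart of the problem.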

>From: seismo!gatech!hpcnof!hpfcla!hpfcdc!rml (Bob Lenk)

>If we use such an approach, we still need to specify a symbolic constant
>(in <limits.h>) for the maximum length of a hostname on an
>implementation, so that applications don't need to deal with having
>truncated names returned to them.

Of course - like any array, you should specify a minimum maximum,
and put the size in <limits.h>.  (Although Bob Lenk's note sounds like
the new version of uname doesn't have sizes for the 5 arrays, I hope
I just misunderstand what it really says.)  One nice thing about
gethostname, however, is that, since it passes the size at runtime,
it doesn't really matter what the minimum maximum is, except that if
the system or user specifies a small number like 8, you'll lose information.

>Uname handles this by the inclusion
>of the string within a structure.  Given that, the only difference from
>uname is the existence of other fields.  For binary compatibilty, I
>don't see much difference between an implementation having two calls
>both called "uname" or one called "uname" and the other called
>"gethostname", which return names of different lengths.

There are several differences:

(1) uname has 4 other fields, of marginal use for inclusion in POSIX.
    I doubt any implementation would provide a call called "uname"
    that supports only one field, even if POSIX allowed it.

(2) uname does not pass the size of the structure as another parameter.

(3) the traditional (and easily compatible) implementation of uname
    only allows 8 chars in the node name.  Since SYS_NMLN is new to
    POSIX, there is a lot of code out there that has the number 8
    hardwired into it, especially in buffers used to store the name.
    My SVr3 manual still tells me that the fields are 9 bytes long,
    and I'll bet lots of programmers believe that instead of checking
    the SVID and POSIX standards.

(4) Because of (1) and (2), there is no easy way to grow the length
    of any one field without superhuman binary compatibility efforts.
    Ditto for adding new fields.  Any multi-field table lookup system
    call ought to be extensible, which means it ought to pass info at
    runtime about which items it wants, and the sizes of the buffers
    provided to copy these items into.
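
As a sketch of what point (4) asks for, the hypothetical call below
takes an item selector and the size of the caller's buffer at runtime.
The names sys_item and sys_item_get are invented for illustration and
belong to no real system; the data simply comes from the existing uname.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/* Hypothetical selectors; new items could be appended later. */
enum sys_item {
    SYS_ITEM_SYSNAME,
    SYS_ITEM_NODENAME,
    SYS_ITEM_RELEASE,
    SYS_ITEM_VERSION,
    SYS_ITEM_MACHINE
};

/*
 * Copy the requested item into buf, using at most buflen bytes including
 * the null.  Returns the full length of the item (excluding the null),
 * or -1 on error.  A return value >= buflen means the copy was truncated.
 */
long sys_item_get(enum sys_item item, char *buf, size_t buflen)
{
    struct utsname u;
    const char *src;
    size_t full, copy;

    assert(buf != NULL || buflen == 0);
    if (uname(&u) < 0)
        return -1;

    switch (item) {
    case SYS_ITEM_SYSNAME:  src = u.sysname;  break;
    case SYS_ITEM_NODENAME: src = u.nodename; break;
    case SYS_ITEM_RELEASE:  src = u.release;  break;
    case SYS_ITEM_VERSION:  src = u.version;  break;
    case SYS_ITEM_MACHINE:  src = u.machine;  break;
    default:                return -1;   /* unknown item */
    }

    full = strlen(src);
    if (buflen > 0) {
        copy = full < buflen - 1 ? full : buflen - 1;
        memcpy(buf, src, copy);
        buf[copy] = '\0';
    }
    return (long)full;
}
```

Since the caller states both what it wants and how much room it has,
fields can grow or be added without disturbing existing binaries.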

I as a user would be satisfied if you were to require that the uname
call support at least 256 characters of node name (and, of course, that
the actual size be in <limits.h>.)  I almost said 64 characters, but
then I thought of OSI and wanted to be safe. I could immediately write
code to implement gethostname in terms of uname.  But the result would
be awfully unclean for the users (having to declare a structure and copy,
having to fix existing code not to know the number 8) and would be an
incredible mess for the people stuck supporting binary compatibility.  (Let's
see now, the C compiler is unbundled from the kernel, so we have to make
sure we put out a new ld in the right places, and have to ensure that
<utsname.h> is the new version if you have the new loader and the new
kernel, and ...  Do we want to require this in a standard without someone
implementing it first to find the gotchas?)
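
For reference, implementing gethostname in terms of uname could look
something like the sketch below.  The name my_gethostname is
hypothetical (chosen to avoid clashing with a system gethostname), and
the size_t length parameter is an assumption based on current practice.

```c
#include <assert.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * gethostname-style call built on uname: declare the structure, call
 * uname, and copy out as much of nodename as the caller has room for.
 * Returns 0 on success, -1 on failure.
 */
int my_gethostname(char *buffer, size_t bufferlen)
{
    struct utsname u;

    assert(buffer != NULL);
    if (bufferlen == 0)
        return -1;
    if (uname(&u) < 0)
        return -1;

    /* Copy at most bufferlen - 1 characters and always terminate. */
    strncpy(buffer, u.nodename, bufferlen - 1);
    buffer[bufferlen - 1] = '\0';
    return 0;
}
```

The result can be no longer than whatever the nodename field holds, so
this only helps if that field is itself large enough.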

The current uname is inadequate for modern networks.  There is no way
to make it adequate without requiring that nodename be made bigger.
There is no way to make the nodename bigger without considerable
ugliness and kludging, some of which will be visible to the users.

It would be far cleaner and simpler, with far less upheaval among
implementors and users, to put in gethostname, which does exactly
what is needed, and is already present in 4.2BSD and AT&T's WIN/3B
TCP/IP package.  uname could continue to exist, in its old form, for
upward compatibility, but it would return a truncated host name
(or else the above superhuman efforts could be undertaken by the
system developers to return a full host name.)  I see no reason to
require these superhuman efforts with ugly results in POSIX.

	Mark Horton

Volume-Number: Volume 9, Number 42