std-unix@ut-sally.UUCP (Moderator, John Quarterman) (01/08/87)
From: cbosgd!mark@seismo.css.gov (Mark Horton)
Date: 4 Jan 87 21:19:54 GMT
Organization: AT&T Bell Laboratories, Columbus, Oh

I was just going through POSIX and noticed that the only mechanism for determining the node name is uname (4.4.1). I think it's clear that, while uname is adequate for UUCP, the 8 character limit on the node name is inadequate for other networks, especially domain networks such as the ARPANET, CSNET, and the UUCP Zone. It won't be adequate for OSI, either, although it isn't currently clear what would be, since host names may not even be character strings in OSI.

While P.1003 does not restrict implementations to SYS_NMLN=9 (including the null) it requires that all 5 fields support the full length. I don't know of any way to increase SYS_NMLN while maintaining binary compatibility with older programs, which is a typical requirement.

I am also unaware of any application that makes use of the other four fields. (Of course, as soon as I say that, several people will point some out, but I don't know of a runtime use for those fields that is sufficiently motivating to be included in POSIX.) A similar feature would be useful at compile time (predefined preprocessor variables to allow conditional compilation based on the version) but the typical program needs to make these decisions at compile time, not runtime.

Wouldn't it make more sense to standardize on a simple long character string for the node name? Assuming that OSI names can somehow be encoded as character strings (a fairly safe assumption, I think) this ought to handle all the cases. The 4.2BSD gethostname function, which passes the length of the buffer:

	gethostname(buffer, bufferlen)
	char *buffer;
	int bufferlen;

seems perfectly suited to this problem.

I believe that uname will have to be phased out in favor of a more general mechanism over the next few years. Why is it in the standard?

	Mark

Volume-Number: Volume 9, Number 12
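A minimal sketch of the gethostname interface described in the posting above, in which the caller supplies the buffer and its length. The helper name fetch_node_name is hypothetical, and the sketch assumes a current system where gethostname is declared in <unistd.h> with a size_t length rather than the int shown in the 4.2BSD declaration. Because a name that does not fit may come back without a terminating null on some systems, the helper forces one.

```c
#include <assert.h>
#include <stddef.h>
#include <unistd.h>   /* gethostname() on BSD-derived and current POSIX systems */

/*
 * Hypothetical helper (not from the posting): fetch the node name into a
 * caller-supplied buffer of len bytes and guarantee null termination.
 * Returns 0 on success, -1 if gethostname() reports an error.
 */
int fetch_node_name(char *buf, size_t len)
{
    assert(buf != NULL && len > 0);

    if (gethostname(buf, len) != 0)
        return -1;

    /* A truncated name is not guaranteed to be terminated, so do it here. */
    buf[len - 1] = '\0';
    return 0;
}
```

A caller would declare something like char name[256] and call fetch_node_name(name, sizeof name); the size travels with the call at runtime, which is the property the posting is asking for.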
std-unix@ut-sally.UUCP (01/28/87)
From: seismo!gatech!hpcnof!hpfcla!hpfcdc!rml (Bob Lenk)
Date: Thu, 8 Jan 87 20:20:47 mst

> From: cbosgd!mark@seismo.css.gov (Mark Horton)
>
> While P.1003 does not restrict implementations to SYS_NMLN=9 (including
> the null) it requires that all 5 fields support the full length.
> I don't know of any way to increase SYS_NMLN while maintaining binary
> compatibility with older programs, which is a typical requirement.

Since the publication of the trial use standard, the working group has agreed to drop the constant SYS_NMLN and any requirement that all five fields be the same length. All fields are now specified simply as null-terminated character arrays. Increasing the length can still cause binary compatibility problems, but there are (ugly) ways of dealing with binary compatibility.

> I am also unaware of any application that makes use of the other four
> fields.

I can imagine applications using the fields in some type of reports, but I don't know of any portable applications which use them, or of any strong reason why they are needed.

> Wouldn't it make more sense to standardize on a simple long character
> string for the node name? Assuming that OSI names can somehow be
> encoded as character strings (a fairly safe assumption, I think)
> this ought to handle all the cases. The 4.2BSD gethostname function,
> which passes the length of the buffer:
>	gethostname(buffer, bufferlen)
>	char *buffer;
>	int bufferlen;
> seems perfectly suited to this problem.

If we use such an approach, we still need to specify a symbolic constant (in <limits.h>) for the maximum length of a hostname on an implementation, so that applications don't need to deal with having truncated names returned to them. Uname handles this by the inclusion of the string within a structure. Given that, the only difference from uname is the existence of other fields. For binary compatibility, I don't see much difference between an implementation having two calls both called "uname" or one called "uname" and the other called "gethostname", which return names of different lengths.

		Bob Lenk
		{ihnp4, hplabs}!hpfcla!rml

Volume-Number: Volume 9, Number 30
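The point that uname carries the size of the name "by the inclusion of the string within a structure" can be illustrated with a short sketch. It assumes a current system where the structure is declared in <sys/utsname.h> (the thread refers to <utsname.h>); the helper name node_name_info is hypothetical. The capacity of the field is whatever the header compiled into the program says it is, which is why changing it later affects existing binaries.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical helper: report the length of the node name returned by
 * uname() and the capacity of the field that holds it.
 * Returns 0 on success, -1 if uname() fails.
 */
int node_name_info(size_t *name_len, size_t *capacity)
{
    struct utsname u;

    assert(name_len != NULL && capacity != NULL);

    if (uname(&u) < 0)
        return -1;

    *capacity = sizeof u.nodename;    /* fixed when the caller was compiled */
    *name_len = strlen(u.nodename);   /* the field is null-terminated */
    return 0;
}
```

No separate length constant is needed by the caller here, since sizeof on the field gives the maximum directly; the cost is that the size is frozen into every object file that includes the header.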
std-unix@ut-sally.UUCP (01/30/87)
From: guy%gorodish@Sun.COM (Guy Harris)
Date: 29 Jan 87 06:53:29 GMT
Reply-To: guy@sun.UUCP (Guy Harris)
Organization: Sun Microsystems, Mountain View

>Increasing the length can still cause binary compatibility problems, but
>there are (ugly) ways of dealing with binary compatibility.

Not even that ugly; yes, they leave loose bits of crud floating around in your kernel, but most UNIX distributions these days have lots of this sort of loose crud. It's aesthetically unpleasant, but it beats the hell out of supporting aesthetically- and technically-unpleasant interfaces because you can't declare a flag day and nuke those interfaces.

Most, if not all, implementations based on UNIX could just assign a new system call number to a new improved "uname" and leave the old one around with its old number for binary compatibility. You can write a library that contains a "uname" that uses the old call, or uses the new call and throws away the extra characters.

Volume-Number: Volume 9, Number 37
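The library approach suggested above, a "uname" that "uses the new call and throws away the extra characters", might look roughly like the following sketch. The names old_utsname, OLD_NMLN, and old_uname are hypothetical, and the 9-byte layout is only an illustration of the traditional structure discussed in this thread; the system's current uname from <sys/utsname.h> stands in for the "new improved" call.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

#define OLD_NMLN 9   /* 8 characters plus the null, as in the old layout */

/* Hypothetical name for the old fixed-size structure. */
struct old_utsname {
    char sysname[OLD_NMLN];
    char nodename[OLD_NMLN];
    char release[OLD_NMLN];
    char version[OLD_NMLN];
    char machine[OLD_NMLN];
};

/* Copy src into a 9-byte field, discarding whatever does not fit. */
static void copy_truncated(char dst[OLD_NMLN], const char *src)
{
    size_t n = strlen(src);

    if (n > OLD_NMLN - 1)
        n = OLD_NMLN - 1;
    memcpy(dst, src, n);
    dst[n] = '\0';
}

/*
 * Compatibility wrapper: call the current uname() and squeeze each field
 * into the old layout. Returns 0 on success, -1 if uname() fails.
 */
int old_uname(struct old_utsname *out)
{
    struct utsname full;

    assert(out != NULL);

    if (uname(&full) < 0)
        return -1;

    copy_truncated(out->sysname, full.sysname);
    copy_truncated(out->nodename, full.nodename);
    copy_truncated(out->release, full.release);
    copy_truncated(out->version, full.version);
    copy_truncated(out->machine, full.machine);
    return 0;
}
```

Old source keeps working against such a wrapper, but any node name longer than 8 characters is silently cut short.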
std-unix@ut-sally.UUCP (02/01/87)
>From: seismo!gatech!hpcnof!hpfcla!hpfcdc!rml (Bob Lenk)
>
>> From: cbosgd!mark@seismo.css.gov (Mark Horton)
>>
>> While P.1003 does not restrict implementations to SYS_NMLN=9 (including
>> the null) it requires that all 5 fields support the full length.
>> I don't know of any way to increase SYS_NMLN while maintaining binary
>> compatibility with older programs, which is a typical requirement.
>
>Since the publication of the trial use standard, the working group has
>agreed to drop the constant SYS_NMLN and any requirement that all
>five fields be the same length. All fields are now specified simply
>as null-terminated character arrays.

Does the new standard still allow an implementation whose nodename field holds only 9 characters? If so, I predict that the binary compatibility issue will be so overwhelming that vendors will not increase this size, and systems will continue to be unable to handle networks properly.

> Increasing the length can still cause binary compatibility problems, but
> there are (ugly) ways of dealing with binary compatibility.

> From: guy%gorodish@Sun.COM (Guy Harris)
> Most, if not all, implementations based on UNIX could just assign a
> new system call number to a new improved "uname" and leave the old
> one around with its old number for binary compatibility. You can
> write a library that contains a "uname" that uses the old call, or
> uses the new call and throws away the extra characters.

Generally, you have to be upward compatible in three ways:
(1) source code compatible: easy, just fix <utsname.h>
(2) binary a.out compatible: ugly but easy, change the system call number
(3) binary .o compatible: oops - how do you handle this one?

An existing library libfoo.a can call uname. You relink the old library with the new libc, getting the new system call number with the old include file. I don't see any way to tell old .o's from new .o's, since uname does not pass the size of the structure or any other distinguishing information.
(You could change the .o format/version, and teach the linker to know about uname and which .o format/version gets which version of uname, but that's a pretty horrible thought.)

>From: seismo!gatech!hpcnof!hpfcla!hpfcdc!rml (Bob Lenk)
>If we use such an approach, we still need to specify a symbolic constant
>(in <limits.h>) for the maximum length of a hostname on an
>implementation, so that applications don't need to deal with having
>truncated names returned to them.

Of course - like any array, you should specify a minimum maximum, and put the size in <limits.h>. (Although Bob Lenk's note sounds like the new version of uname doesn't have sizes for the 5 arrays, I hope I just misunderstand what it really says.) One nice thing about gethostname, however, is that, since it passes the size at runtime, it doesn't really matter what the minimum maximum is, except that if the system or user specifies a small number like 8, you'll lose information.

>Uname handles this by the inclusion
>of the string within a structure. Given that, the only difference from
>uname is the existence of other fields. For binary compatibility, I
>don't see much difference between an implementation having two calls
>both called "uname" or one called "uname" and the other called
>"gethostname", which return names of different lengths.

There are several differences:

(1) uname has 4 other fields, of marginal use for inclusion in POSIX. I doubt any implementation would provide a call called "uname" that supports only one field, even if POSIX allowed it.

(2) uname does not pass the size of the structure as another parameter.

(3) the traditional (and easily compatible) implementation of uname only allows 8 chars in the node name. Since SYS_NMLN is new to POSIX, there is a lot of code out there that has the number 8 hardwired into it, especially in buffers used to store the name.
My SVr3 manual still tells me that the fields are 9 bytes long, and I'll bet lots of programmers believe that instead of checking the SVID and POSIX standards.

(4) Because of (1) and (2), there is no easy way to grow the length of any one field without superhuman binary compatibility efforts. Ditto for adding new fields. Any multi-field table lookup system call ought to be extensible, which means it ought to pass info at runtime about which items it wants, and the sizes of the buffers provided to copy these items into.

I as a user would be satisfied if you were to require that the uname call support at least 256 characters of node name (and, of course, that the actual size be in <limits.h>.) I almost said 64 characters, but then I thought of OSI and wanted to be safe. I could immediately write code to implement gethostname in terms of uname. But the result would be awfully unclean for the users (having to declare a structure and copy, having to fix existing code not to know the number 8) and would be an incredible mess for the people stuck supporting binary compatibility.

(Let's see now, the C compiler is unbundled from the kernel, so we have to make sure we put out a new ld in the right places, and have to ensure that <utsname.h> is the new version if you have the new loader and the new kernel, and ... Do we want to require this in a standard without someone implementing it first to find the gotchas?)

The current uname is inadequate for modern networks. There is no way to make it adequate without requiring that nodename be made bigger. There is no way to make the nodename bigger without considerable ugliness and kludging, some of which will be visible to the users. It would be far cleaner and simpler, with far less upheaval among implementors and users, to put in gethostname, which does exactly what is needed, and is already present in 4.2BSD and AT&T's WIN/3B TCP/IP package.
uname could continue to exist, in its old form, for upward compatibility, but it would return a truncated host name (or else the above superhuman efforts could be undertaken by the system developers to return a full host name.) I see no reason to require these superhuman efforts with ugly results in POSIX.

	Mark Horton

Volume-Number: Volume 9, Number 42
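The remark that gethostname could be written "in terms of uname" can be sketched as follows. The name my_gethostname is hypothetical (chosen to avoid clashing with the system's own gethostname), and the sketch assumes the structure is declared in <sys/utsname.h> as on current systems. It shows the "declare a structure and copy" step the posting calls unclean, and it can return no more of the name than the nodename field is able to hold.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>
#include <sys/utsname.h>

/*
 * Hypothetical gethostname-style call layered on uname(): copy as much
 * of the node name as fits into the caller's buffer and always
 * null-terminate. Returns 0 on success, -1 on a zero-length buffer or
 * if uname() fails.
 */
int my_gethostname(char *buffer, size_t bufferlen)
{
    struct utsname u;
    size_t n;

    assert(buffer != NULL);

    if (bufferlen == 0 || uname(&u) < 0)
        return -1;

    n = strlen(u.nodename);
    if (n > bufferlen - 1)
        n = bufferlen - 1;    /* truncate to what the caller can hold */

    memcpy(buffer, u.nodename, n);
    buffer[n] = '\0';
    return 0;
}
```

The buffer size is supplied by the caller at runtime, so the same compiled caller keeps working if a later system reports longer names; the limit that remains is the size of nodename inside the structure.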