[comp.std.unix] Changed names in POSIX directory access library

gnu@hoptoad.UUCP (John Gilmore) (08/08/87)

Cc: gwyn@brl.arpa, utzoo.UUCP!henry@cgl.ucsf.edu
From: gnu@hoptoad.UUCP (John Gilmore)

I just ran across the first use of the Posix directory access library
and was disappointed that it is not source-code compatible with the
Unix BSD library, or with the public domain directory access library
that has been widely used on non-BSD systems.

For some reason the ***&%^$ standard changed the names of the include
file and the struct!

The old #include <sys/dir.h>	is now	#include <dirent.h>
  "	struct direct			struct dirent

My problem arose when Richard Todd ported my PD tar to Minix.  This
tar program has been ported and run on many, many Unix variants.  All
these versions compile from the same sources.  Minix lacks a directory
access library (and a lot of other library routines), so Richard used
the package posted to mod.sources by Doug Gwyn.  Unfortunately, it's
not compatible with Unix, and I wouldn't change tar; I'd rather have it
break on Richard's modified Minix than break on every Unix in existence.
So his Minix tar sources must remain slightly different from the master,
portable sources I keep.

Somehow when you read these standards on paper, they don't mean much.
When people implement them and your 'proven portable' programs break, the
little meddling changes get a lot more important.

From the NOTES file in Doug's package:
>One annoying compatibility problem has arisen along the way, namely that the
>original Berkeley interface used the same name, struct direct, for the new data
>structure as had been used for the original UNIX filesystem directory record
>structure.  This name was changed by the IEEE 1003.1 (POSIX) Working Group to
>"struct dirent" and was picked up for SVR3 under the new name; it is also the
>name used in this portable package.  I believe it is necessary to bite the
>bullet and adopt the new non-conflicting name.  Code using a 4.2BSD-compatible
>package needs to be slightly revised to work with this new package...
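
As a rough illustration of the "slight revision" the NOTES file
describes, a program might select the header and struct name at compile
time.  This is only a sketch, not code from tar or from Doug's package:
the USE_SYS_DIR_H flag, the dir_entry typedef, and the dir_has_entry()
helper are all hypothetical names.

```c
/* Hypothetical flag: define USE_SYS_DIR_H when building on a system
 * that ships only the 4.2BSD-style header. */
#ifdef USE_SYS_DIR_H
#include <sys/types.h>
#include <sys/dir.h>
typedef struct direct dir_entry;    /* old Berkeley name */
#else
#include <dirent.h>
typedef struct dirent dir_entry;    /* POSIX name */
#endif
#include <string.h>

/* Return 1 if directory `path` has an entry called `name`,
 * 0 if it does not, and -1 if the directory cannot be opened. */
int dir_has_entry(const char *path, const char *name)
{
    DIR *dp = opendir(path);
    dir_entry *ent;
    int found = 0;

    if (dp == NULL)
        return -1;
    while ((ent = readdir(dp)) != NULL) {
        if (strcmp(ent->d_name, name) == 0) {
            found = 1;
            break;
        }
    }
    closedir(dp);
    return found;
}
```

Only the #ifdef block differs between the two interfaces; the
opendir(), readdir(), and closedir() calls are spelled the same way
under either name.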

Why POSIX changed it I'll probably never know, but it's clear why SVR3
adopted the change.  They have been resisting incorporating the
directory library for years and years, apparently because it was a good
idea.  When POSIX mangled it, AT&T jumped at the chance to adopt a
version incompatible with all the code written for BSD Unix.  So call
me a cynic.

No portable application program should be including a struct defining
the internal format of old Unix file system entries, so I can't see any
theoretical problem created by using the name "struct direct" for the
result of Posix readdir().  There is certainly no practical problem
with it either, since we have been doing it for the last 5 years.

I'm curious why Doug thinks "it is necessary to bite the bullet".  How about
the obvious alternative of changing the standard to match all existing
applications?

Volume-Number: Volume 12, Number 5

gwyn@brl.arpa (VLD/VMB) (08/08/87)

To: John Gilmore <hoptoad.UUCP!gnu@cgl.ucsf.edu>
Cc: std-unix@sally.utexas.edu, utzoo.UUCP!henry@cgl.ucsf.edu
From: Doug Gwyn (VLD/VMB) <gwyn@brl.arpa>

I explained what the problem was with McKusick's implementation
and why both AT&T and IEEE 1003.1 decided to do it differently.
I also provided a public-domain package that can be built for
use on Berkeley-based systems, so the POSIX interface is
available there, too.  However, it is simply not practical for
most vendors to provide the Berkeley interface in a true System
V environment (more on this later).

You keep making noises that sound like you think there is a
conspiracy to reject correct Berkeley designs for various things
in favor of inferior approaches.  I've been at least peripherally
involved with almost all those situations, and that simply is not
what happens.  The real problem is that Berkeley keeps coming
out with stuff that has design or compatibility problems that
other people have to figure out how to fix later.  Such fixes
are not gratuitous changes to perfect designs!

You probably don't notice the problems because you work in a
Berkeley-based environment.  Try building some of your code on
real SVR2 systems some time and see what you break when you
change SVR2 to use the Berkeley <sys/dir.h> -- your code would
then work but there are a LOT of existing programs affected by
that incompatible change.  Real-world vendors have to accommodate
existing (pre-BFS/NFS) code that did the best it could under
prevailing circumstances while providing improved, portable
support for future applications.

I ran head-on into this directory issue when I ported UNIX
System V (user mode) onto 4.2BSD.  My solution was to throw out
<sys/dir.h> altogether, and provide (McKusick-compatible)
directory access functions for the System V environment.  The
reason I could do that was that I was in total charge of all
software to run in that environment.  When we started importing
code, the absence of <sys/dir.h> was the only thing that saved us
from wasting valuable time tracking down obscure bugs.  Vendors
that might have tried such an approach would have upset their
existing customer base.

My older package served as the basis for the one in SVR3, which
was modified by AT&T to conform to POSIX, which by then had
already identified the problem with using <sys/dir.h> and/or
(struct direct) for the portable directory interface.  I fully
support the POSIX choice of names as a correct decision.

Until we get 4.4BSD (or whatever) into POSIX compliance, simply
build my PD package for BFS and use it with your portable code.

Volume-Number: Volume 12, Number 6

henry@utzoo.uucp (08/09/87)

Cc: std-unix@sally.utexas.edu
Cc: gwyn@brl.arpa
From: henry@utzoo.uucp

> I just ran across the first use of the Posix directory access library
> and was disappointed that it is not source-code compatible...
> 
> For some reason the ***&%^$ standard changed the names of the include
> file and the struct!

For an excellent reason:  Berkeley's imbecilic decision to use the same
names in the high-level library as in the implementation within the
kernel causes endless trouble on non-Berkeley systems.

> The old #include <sys/dir.h>	is now	#include <dirent.h>

This is a godsend for those of us who have a *different* directory structure
in our sys/dir.h, namely that used in the kernel.  User-level libraries
have no business parking their include files in the sys subdirectory!

>   "	struct direct			struct dirent

And this is a godsend for those of us who want to include *other* kernel
include files in the same program as the directory library.  The kernel
include files are so interrelated that it is impossible to pick up, say,
user.h without also picking up dir.h.  Which means that no program that
uses any of the kernel include files can use the directory library, because
the definitions of "struct direct" clash.  Yes, there are legitimate reasons
for using some of the other include files, at least in non-Berkeley systems.
Arguably these reasons themselves indicate problems that really ought to be
fixed -- but we were talking about compatibility with existing practice!
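
The clash can be pictured with a sketch that keeps the old fixed-size
record under a private name next to the library's struct dirent.  The
v7_direct tag, V7_DIRSIZ, and same_name() are hypothetical names, and
the 14-byte name field with a 16-bit inode number is assumed from the
classic pre-BSD directory format.  With distinct tags both types can
appear in one source file; if both were called struct direct, the two
definitions would conflict.

```c
#include <dirent.h>
#include <string.h>

#define V7_DIRSIZ 14                /* assumed length of the old name field */

/* Old on-disk directory record, given a private tag for this sketch. */
struct v7_direct {
    unsigned short d_ino;           /* 0 marks an unused slot */
    char d_name[V7_DIRSIZ];         /* NUL-padded; no NUL when all 14 are used */
};

/* Return 1 if an old-style record and a readdir() result carry the
 * same name, 0 otherwise. */
int same_name(const struct v7_direct *rec, const struct dirent *ent)
{
    return strlen(ent->d_name) <= V7_DIRSIZ &&
           strncmp(rec->d_name, ent->d_name, V7_DIRSIZ) == 0;
}
```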

In short, I can assure John that there *are* practical problems with using
the Berkeley naming on non-Berkeley systems.  I have run into them too
often myself to do anything but applaud this "incompatibility"; non-Berklix
systems already find it necessary to engage in incompatible kludges to use
the otherwise-praiseworthy directory library.  I am one of the (doubtless
numerous) people who strongly urged P1003 to fix this bug.

				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,decvax,pyramid}!utzoo!henry

Volume-Number: Volume 12, Number 7

jsq@longway.tic.com (John S. Quarterman) (08/09/87)

To: gnu@sun.com, sun.com!utzoo!henry@uunet.UU.NET
Cc: gwyn@BRL.ARPA, std-unix@sally.utexas.edu
From: jsq@longway.tic.com (John S. Quarterman)

I'd like to remind all of you to avoid words like ``imbecilic.''
I'm interested in posting technical discussions, not ad-hominem attacks.

That said, mea culpa, mea maxima culpa for not getting the real
reason for the name change into the POSIX Rationale.

Volume-Number: Volume 12, Number 8

mckusick%okeeffe@berkeley.edu (Kirk McKusick) (08/11/87)

Cc: hoptoad!gnu@cgl.ucsf.edu (John Gilmore), gwyn@brl.arpa (Doug Gwyn),
        utzoo!henry@sun.com (Henry Spencer),
        longway!jsq@uunet.UU.NET (John S. Quarterman),
        karels@okeeffe.berkeley.edu (Mike Karels)
From: mckusick%okeeffe@berkeley.edu (Kirk McKusick)

When I wrote the directory access routines I made a mistake in 
connecting them with the underlying implementation of the file
system. They clearly should work on an abstract directory
definition which may coincidentally be the same as the underlying
directory structure (as it is in 4.3BSD). As such, I fully agree
with the decision in the POSIX standard to create a new header
file <dirent.h> rather than using <sys/dir.h>. It is the intention
of CSRG to change the source code at Berkeley to <dirent.h>.

I do take exception to the apparently gratuitous change of renaming
`d_ino' to `d_fileno' in the System V interface, though this can be
worked around with a #define.
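
A minimal sketch of that kind of #define workaround, assuming the local
<dirent.h> provides d_ino and the program wants to write d_fileno.
HAVE_D_FILENO and entry_fileno() are hypothetical names; the macro
would point the other way on a system whose header uses the other
spelling.

```c
#include <dirent.h>

/* Hypothetical configuration flag: define HAVE_D_FILENO when the
 * header already has a member called d_fileno.  The extra check skips
 * the macro when the header supplies the same alias itself. */
#if !defined(HAVE_D_FILENO) && !defined(d_fileno)
#define d_fileno d_ino
#endif

/* Return the file serial number stored in a directory entry,
 * using the d_fileno spelling throughout the program. */
unsigned long entry_fileno(const struct dirent *ent)
{
    return (unsigned long) ent->d_fileno;
}
```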

	Kirk McKusick
	mckusick@berkeley.edu
	ucbvax!mckusick

Volume-Number: Volume 12, Number 9