jdc@naucse.UUCP (John Campbell) (06/25/89)
A favorite program of mine, that works with Doug Gwyn's dirent routines,
was reposted and when built yielded the following:

RE - can't read RE
Ma - can't read Ma

with each file name truncated to two letters. It turns out that this
was due to the fact that Doug Gwyn's dirent.h file (at least my version)
refused to put a 14 character limit (or any limit) on file names.
Instead he defined the following (/usr/include/sys/dirent.h):

======start of doug's dirent.h======
/*
    <sys/dirent.h> -- file system independent directory entry (SVR3)

    last edit:      25-Apr-1987     D A Gwyn

    prerequisite:   <sys/types.h>
*/

struct dirent                           /* data from getdents()/readdir() */
    {
    long            d_ino;              /* inode number of entry */
    off_t           d_off;              /* offset of disk directory entry */
    unsigned short  d_reclen;           /* length of this record */
    char            d_name[1];          /* name of file */  /* non-POSIX */
    };

/* The following nonportable ugliness could have been avoided by defining
   DIRENTSIZ and DIRENTBASESIZ to also have (struct dirent *) arguments. */
#define DIRENTBASESIZ           (((struct dirent *)0)->d_name \
                                - (char *)&((struct dirent *)0)->d_ino)
#define DIRENTSIZ( namlen )     ((DIRENTBASESIZ + sizeof(long) + (namlen)) \
                                / sizeof(long) * sizeof(long))

/* DAG -- the following was moved from <dirent.h>, which was the wrong place */
#define MAXNAMLEN       512             /* maximum filename length */

#ifndef NAME_MAX
#define NAME_MAX        (MAXNAMLEN - 1) /* DAG -- added for POSIX */
#endif
=========end of doug's dirent.h========

What this means is that memcpy (x, y, sizeof (struct dirent)) was only
copying two bytes of the name (structure alignment). Anyway, I made
patches based on DIRENTSIZ and DIRENTBASESIZ to work around this--but
it added a mess to the code and some #define DIRENTSIZ 0 lines for
other unix implementations.

My question (great, now he gets to it) is, "Is this the best way to
build a portable opendir(), readdir(), etc. package?" Handling
arbitrary length file names is always a bit more work.

Also, I run on a 3b1 SYSV (sort of) machine and I have another
opendir() package by Scott Hazen Muellyer (scott@zorch.uucp) that would
have worked with the original code since it assumes a fixed size for
each directory entry (ndir.h):

======start of scott's ndir.h======
/*      @(#)ndir.h      1.7     10/7/87 */
#if defined(HP9K5)
/* He should have included it instead of this, but prevent confusion */
#include <ndir.h>
#else /* other */
#ifndef DEV_BSIZE
#define DEV_BSIZE       512
#endif
#define DIRBLKSIZ       DEV_BSIZE
#define MAXNAMLEN       255

struct directy {
    long    d_ino;                  /* inode number of entry */
    short   d_reclen;               /* length of this record */
    short   d_namlen;               /* length of string in d_name */
    char    d_name[MAXNAMLEN + 1];  /* name must be no longer than this */
};

/*
 * The DIRSIZ macro gives the minimum record length which will hold
 * the directory entry.  This requires the amount of space in struct directy
 * without the d_name field, plus enough space for the name with a terminating
 * null byte (dp->d_namlen+1), rounded up to a 4 byte boundary.
 */
#ifdef DIRSIZ
#undef DIRSIZ
#endif /* DIRSIZ */
#define DIRSIZ(dp) \
    ((sizeof (struct directy) - (MAXNAMLEN+1)) + (((dp)->d_namlen+1 + 3) &~ 3))

/*
 * Definitions for library routines operating on directories.
 */
typedef struct _dirdesc {
    int     dd_fd;
    long    dd_loc;
    long    dd_size;
    char    dd_buf[DIRBLKSIZ];
} DIR;
#ifndef NULL
#define NULL 0
#endif
extern  DIR *opendir();
extern  struct directy *readdir();
extern  void closedir();

#define rewinddir(dirp) seekdir((dirp), (long)0)
#endif /* other */
======end of scott's ndir.h======

What is the consensus? Which way should the package work? I know
Doug's stuff is widespread (and good), but is there a reason to change
its implementation? What does POSIX say?
--
John Campbell ...!arizona!naucse!jdc
CAMPBELL@NAUVAX.bitnet
unix? Sure send me a dozen, all different colors.
gwyn@smoke.BRL.MIL (Doug Gwyn) (06/27/89)
In article <1509@naucse.UUCP> jdc@naucse.UUCP (John Campbell) writes:
>======start of doug's dirent.h======

Here is the current version:

/*
    <sys/dirent.h> -- file system independent directory entry (SVR3)

    last edit:      27-Oct-1988     D A Gwyn

    prerequisite:   <sys/types.h>
*/

struct dirent                           /* data from getdents()/readdir() */
    {
    long            d_ino;              /* inode number of entry */
    off_t           d_off;              /* offset of disk directory entry */
    unsigned short  d_reclen;           /* length of this record */
    char            d_name[1];          /* name of file */  /* non-ANSI */
    };

#ifdef BSD_SYSV                         /* (e.g., when compiling getdents.c) */
extern struct dirent    __dirent;       /* (not actually used) */
/* The following is portable, although rather silly. */
#define DIRENTBASESIZ           (__dirent.d_name - (char *)&__dirent.d_ino)
#else
/* The following nonportable ugliness could have been avoided by defining
   DIRENTSIZ and DIRENTBASESIZ to also have (struct dirent *) arguments.
   There shouldn't be any problem if you avoid using the DIRENTSIZ() macro. */
#define DIRENTBASESIZ           (((struct dirent *)0)->d_name \
                                - (char *)&((struct dirent *)0)->d_ino)
#endif

#define DIRENTSIZ( namlen )     ((DIRENTBASESIZ + sizeof(long) + (namlen)) \
                                / sizeof(long) * sizeof(long))

/* DAG -- the following was moved from <dirent.h>, which was the wrong place */
#define MAXNAMLEN       512             /* maximum filename length */

#ifndef NAME_MAX
#define NAME_MAX        (MAXNAMLEN - 1) /* DAG -- added for POSIX */
#endif

>What this means is that memcpy (x, y, sizeof (struct dirent)) ...

It is strongly implied by the POSIX spec that readdir() "owns" the
contents of this struct; an attempt to keep it from getting overwritten
by making a copy of it is doomed, because there is no portable way to
know how big the actual allocation for the struct dirent is; IEEE
1003.1 specifically states that the character array d_name is of
unspecified size, and this was done deliberately to allow
implementations such as mine.

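A minimal sketch of why a sizeof-based copy truncates the name, assuming
a stand-in struct with the same member order as the header above. The
names demo_dirent, demo_base_size() and demo_name_bytes_copied() are
hypothetical, and `long` stands in for off_t so that no system header
is needed. It also shows ANSI C's offsetof() computing the same quantity
as the DIRENTBASESIZ pointer arithmetic.

```c
#include <assert.h>
#include <stddef.h>     /* offsetof, size_t */

/* Hypothetical stand-in for the posted struct dirent. */
struct demo_dirent {
    long           d_ino;       /* inode number of entry */
    long           d_off;       /* stand-in for off_t (assumption) */
    unsigned short d_reclen;    /* length of this record */
    char           d_name[1];   /* the real name runs past the struct */
};

/* Bytes that precede d_name: what DIRENTBASESIZ computes, since d_ino
   is the first member. */
size_t demo_base_size(void)
{
    return offsetof(struct demo_dirent, d_name);
}

/* Name bytes that memcpy(x, y, sizeof(struct demo_dirent)) would carry:
   d_name[0] plus whatever trailing padding the compiler adds. */
size_t demo_name_bytes_copied(void)
{
    size_t base = demo_base_size();

    assert(base < sizeof(struct demo_dirent));
    return sizeof(struct demo_dirent) - base;
}
```

With a 4-byte long and a 2-byte short, d_name sits at offset 10 and the
struct is padded to 12 bytes, so only 2 name bytes are copied, which
matches the two-letter names in the original report. Other type sizes
give other small counts.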
In some drafts we had required d_name to be a char* rather than an
array (thus the "non-POSIX" comment in the version you posted). After
Section 5.1 had gotten straightened out, taking into account my
feedback, some more comments (from Berkeley, I think) were received and
further changes were made, unfortunately with no opportunity for
further review. (This was a generic problem with the 1003.1 balloting
process.)

There are a lot of things that needed to be more clearly specified.
For example, can a DIR be copied to make a separate but equal handle on
a directory stream? (The answer is "no", but it's not specified in
IEEE Std 1003.1.)

The way to save a directory entry is to either copy the name string,
using strlen() to determine the proper size, or to use the (non-POSIX)
telldir() function to obtain a position for a later seekdir(). I
recommend not using telldir()/seekdir() for reasons other than their
being nonstandard.

Note that d_name is the only member of a struct dirent that POSIX
mentions; therefore it's the only part you can portably use anyway.

>My question (great, now he gets to it) is, "Is this the best way to
>build a portable opendir(), readdir(), etc. package?"

Certainly I think so. The only system dependency (apart from
stretching the limits of the C language) is isolated in the getdents()
function, which is either a system call (SVR3) or an emulation of one.
Thus porting the package to a previously unsupported environment
consists almost entirely of devising a working getdents() emulation.

By the way, the reason for my using DIRENTBASESIZ etc. as you see them
is that the SVR3 implementors had done so, and my package is intended
to be usable as a direct replacement for SVR3's. The comment in the
code explains how it could be done better were SVR3 compatibility not a
requirement. (Actually, much of the SVR3 implementation appears to
have been based on an earlier version of my package. Very strange
feedback loops we have operating here!)
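The advice above about saving an entry by copying only the name string
can be sketched as follows. This is an illustration rather than code
from either package: save_name() and first_entry_name() are hypothetical
helpers, and the sketch assumes a POSIX-style <dirent.h> and touches no
member other than d_name.

```c
#include <assert.h>
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* Copy a name into freshly allocated storage sized with strlen().
   Returns NULL if allocation fails; the caller frees the result. */
char *save_name(const char *name)
{
    size_t len;
    char *copy;

    assert(name != NULL);
    len = strlen(name);
    copy = malloc(len + 1);
    if (copy != NULL)
        memcpy(copy, name, len + 1);    /* include the terminating NUL */
    return copy;
}

/* Return a saved copy of the first entry in `path` other than "." and
   "..", or NULL if the directory cannot be opened or has no such entry. */
char *first_entry_name(const char *path)
{
    DIR *dirp = opendir(path);
    struct dirent *ep;
    char *saved = NULL;

    if (dirp == NULL)
        return NULL;
    while ((ep = readdir(dirp)) != NULL) {
        if (strcmp(ep->d_name, ".") == 0 || strcmp(ep->d_name, "..") == 0)
            continue;
        /* Copy now: readdir() owns *ep and may overwrite it later. */
        saved = save_name(ep->d_name);
        break;
    }
    closedir(dirp);
    return saved;   /* still valid after closedir(); caller frees */
}
```

Because only the string is copied, the code never needs to know how
much storage the implementation allocated behind the struct dirent.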