dave@galaxia.Newport.RI.US (David H. Brierley) (02/16/89)
Since there have been several programs posted both here and in the comp.sources groups lately that require some form of directory reading routines, and since the directory reading routines are generally a good thing to have if you are going to be writing programs, here is a copy of the current version of the directory reading package written by Doug Gwyn. This version was received from Doug on Feb 13, 1989 and is reposted here with his permission.

-------- cut here and feed to /bin/sh -------
#!/bin/sh
# Self-unpacking archive format. To unbundle, sh this file.
echo 'NOTES' 1>&2
cat >'NOTES' <<'END OF NOTES'
    NOTES FOR POSIX-COMPATIBLE C LIBRARY DIRECTORY-ACCESS ROUTINES

Older UNIX C libraries lacked support for reading directories, so historically programs had knowledge of UNIX directory structure hard-coded into them. When Berkeley changed the format of directories for 4.2BSD, it became necessary to change programs to work with the new structure.

Fortunately, Berkeley designed a small set of directory access routines to encapsulate knowledge of the new directory format so that user programs could deal with directory entries as an abstract data type. (Unfortunately, they didn't get it quite right.) The interface to these routines was nearly independent of the particular implementation of directories on any given UNIX system; this has become a particularly important requirement with the advent of heterogeneous network filesystems such as NFS.

It has consequently become possible to write portable applications that search directories by restricting all directory access to use these new interface routines.

The sources supplied here are a total rewrite of Berkeley's code, incorporating ideas from a variety of sources and conforming as closely to published standards as possible, and are in the PUBLIC DOMAIN to encourage their widespread adoption.
They support four methods of access to system directories: the original UNIX filesystem via read(), the 4.2BSD filesystem via read(), NFS and native filesystems via getdirentries(), and SVR3 getdents(). The other three types are accomplished by appropriate emulation of the SVR3 getdents() system call, which attains portability at the cost of slightly more data movement than absolutely necessary for some systems.

These routines should be added to the standard C library on all UNIX systems, and all existing and future applications should be changed to use this interface. Once this is done, there should be no portability problems due to differences in underlying directory structures among UNIX systems. (When porting your applications to other UNIX systems, you can always carry this package around with you.)

An additional benefit of these routines is that they buffer directory input, which provides improved access speed over raw read()s of one entry at a time.

One annoying compatibility problem has arisen along the way, namely that the original Berkeley interface used the same name, struct direct, for the new data structure as had been used for the original UNIX filesystem directory record structure. This name was changed by the IEEE 1003.1 (POSIX) Working Group to "struct dirent" and was picked up for SVR3 under the new name; it is also the name used in this portable package. I believe it is necessary to bite the bullet and adopt the new non-conflicting name.
Code using a 4.2BSD-compatible package needs to be slightly revised to work with this new package, as follows:

    Change
        #include <ndir.h>       /* Ninth Edition UNIX */
    or
        #include <sys/dir.h>    /* 4.2BSD */
    or
        #include <dir.h>        /* old BRL System V emulation */
    to
        #include <sys/types.h>  /* if not already #included */
        #include <dirent.h>

    Change
        struct direct
    to
        struct dirent

    Change
        (anything)->d_namlen
    to
        strlen( (anything)->d_name )

There is a minor compatibility problem in that the closedir() function was originally defined to have type void, but IEEE 1003.1 changed this to type int, which is what this implementation supports (even though I disagree with the change). However, the difference does not affect most applications.

Error handling is not completely satisfactory, due to the variety of possible failure modes in a general setting. For example, the rewinddir() function might fail, but there is no good way to indicate this. I have tried to follow the specifications in IEEE 1003.1 and the SVID as closely as possible, but there are minor deviations in this regard. Applications should not rely too heavily on exact failure mode semantics.

Please do not change the new standard interface in any way, as that would defeat the major purpose of this package! (It's okay to alter the internal implementation if you really have to, although I tried to make this unnecessary for the vast majority of UNIX variants.)

Installation instructions can be found in the file named INSTALL.

This implementation is provided by:

    Douglas A. Gwyn
    U.S. Army Ballistic Research Laboratory
    SLCBR-VL-V
    Aberdeen Proving Ground, MD 21005-5066

    (301)278-6647
    Gwyn@BRL.MIL

This is UNSUPPORTED, use-at-your-own-risk, free software in the public domain. However, I would appreciate hearing of any actual bugs you find in this implementation and/or any improvements you come up with.
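
As an illustration of the conversion steps listed above, here is a minimal sketch of a routine as it might look after conversion. It is not part of the package: the function name longest_name is hypothetical, and the sketch assumes a compiler that accepts ANSI prototypes and a system that already provides <dirent.h>. The "was:" comments mark the three substitutions.

```c
#include <sys/types.h>   /* if not already #included */
#include <dirent.h>      /* was: <ndir.h>, <sys/dir.h>, or <dir.h> */
#include <string.h>

/* Hypothetical helper, for illustration only: returns the length of the
   longest entry name in directory `path`, or -1 if it cannot be opened. */
long longest_name(const char *path)
{
    DIR *dirp = opendir(path);
    struct dirent *dp;              /* was: struct direct *dp; */
    long longest = 0;

    if (dirp == NULL)
        return -1;                  /* errno set by opendir() */

    while ((dp = readdir(dirp)) != NULL) {
        long len = (long)strlen(dp->d_name);   /* was: dp->d_namlen */

        if (len > longest)
            longest = len;
    }

    (void)closedir(dirp);           /* type int under IEEE 1003.1 */
    return longest;
}
```

Discarding the closedir() result with a (void) cast keeps the call valid whether the library declares it as int or as void returning nothing useful, which is why the difference rarely matters to applications.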
END OF NOTES
echo 'INSTALL' 1>&2
cat >'INSTALL' <<'END OF INSTALL'
    INSTALLATION INSTRUCTIONS

The following instructions are for systems resembling Ninth Edition UNIX, with hints about dealing with variations you may encounter for your specific system. Installation should be done only by someone who is comfortable with modifying the standard C library and header files.

If your system already includes directory access routines, you should replace them with this package. We're trying to get this standardized; see the discussion in the NOTES file.

I have tried to make the source code as generic as possible, but if your system predates Seventh Edition UNIX you will have problems.

DISCLAIMER: Although I believe the code and procedures described here to be correct, I make no warranty of any kind, and you are advised to perform your own careful testing before making any substantial change like this to your programming environment.

0)  For antique systems that do not support C's "void" data type, edit the file sys.dirent.h to add the following:

        typedef int void;   /* good enough for govt work */

    If for some reason your <sys/types.h> doesn't define them, add the following to sys.dirent.h:

        typedef unsigned short ino_t;   /* (assuming original UFS) */
        typedef long off_t;             /* long is forced by lseek() */

    None of this should be necessary for any modern UNIX system.

1)  Copy the file dirent.h to /usr/include/dirent.h and copy the file sys.dirent.h to /usr/include/sys/dirent.h. (The file sys._dir.h is also provided for the BRL UNIX System V emulation for 4.nBSD. That environment uses different directory names for everything.)

2)  Copy the file directory.3c to /usr/man/man3/directory.3 and copy the file dirent.4 to /usr/man/man5/dirent.5; edit the new file /usr/man/man3/directory.3 to change the "SEE ALSO" reference from dirent(4) to dirent(5) and to change the 3C on the first line to 3; edit the new file /usr/man/man5/dirent.5 to change the 4 on the first line to 5; then print the manual pages via the command

        man directory dirent

    to see what the new routines are like. (If you have a "catman" style of on-line manual, adapt these instructions accordingly. Manual entries are kept in directories with other names on some systems such as UNIX System V. On systems that already had a directory library documented in some other manual entry, remove the superseded manual entry; if the description of the native filesystem directory format found by "man dir" refers to a directory library, modify it to simply refer to the entry for "dirent".)

3)  Copy the files closedir.c, opendir.c, readdir.c, rewinddir.c, seekdir.c, and telldir.c to the "gen" or "port/gen" subdirectory of your C library source directory. If you do not have a getdents() system call, copy the file getdents.c to the "sys" or "port/sys" subdirectory and copy the file getdents.2 to /usr/man/man2/getdents.2 (actually you may prefer to put this file in section 3 and adjust the references in the other manual entries accordingly; also adjust the references to dirent(4) to be to dirent(5) if that's where the entry is). Edit the C library makefile(s) to include the new object modules in the C library. (See the comments at the beginning of getdents.c for symbols that must be defined to configure getdents.c.) Then remake and reinstall the C library. Alternatively, you can just compile the new sources and insert their objects near the front of the C library /lib/libc.a using the "ar" utility (seekdir.o should precede readdir.o, which in turn should precede getdents.o).
    On some systems you then need to use the "ranlib" utility to update the archive symbol table.

4)  After the C library has been updated, delete /usr/include/ndir.h or any other header used with a previous directory library to prevent inadvertent use of the superseded directory access interface. Also delete any corresponding library such as /usr/lib/libndir.a.

5)  To verify installation, try compiling, linking, and running the program testdir.c. This program searches the current directory "." for each file named as a program argument and prints `"FOO" found.' or `"FOO" not found.' where FOO is of course replaced by the name being sought in the directory. Try something like

        cd /usr/bin     # a multi-block directory
        $WHEREVER/testdir FOO lint BAR f77 XYZZY

    which should produce the output

        "FOO" not found.
        "lint" found.
        "BAR" not found.
        "f77" found.
        "XYZZY" not found.

    A more thorough test would be

        cd /usr/bin     # a multi-block directory
        $WHEREVER/testdir `ls -a` | grep 'not found'

    This program does not test the seekdir() and telldir() functions.

6)  Notify your programmers that all directory access must be made through the new interface, and that documentation is available via

        man directory dirent

    Make the NOTES file available to those programmers who might want to understand what this is all about.

7)  Change all system sources that were accessing directories to use the new routines. Nearly all such sources contain the line

        #include <sys/dir.h>

    or

        #include <ndir.h>

    so they should be easy to find. (If you earlier removed some other header file, that is, if this package superseded an earlier version of the directory access library, look for its name too. See the conversion instructions in the NOTES file.)
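
For readers who want to see the kind of check step 5 performs, the following is a rough sketch of the lookup and report logic. It is not the package's testdir.c: the function names dir_contains and report are hypothetical, and the sketch assumes ANSI prototypes and an installed <dirent.h>. The message format follows the sample output quoted in step 5.

```c
#include <sys/types.h>
#include <dirent.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper: returns 1 if `name` is an entry of directory `dir`,
   0 if it is not, and -1 if `dir` cannot be opened. */
int dir_contains(const char *dir, const char *name)
{
    DIR *dirp = opendir(dir);
    struct dirent *dp;
    int found = 0;

    if (dirp == NULL)
        return -1;

    while (!found && (dp = readdir(dirp)) != NULL)
        if (strcmp(dp->d_name, name) == 0)
            found = 1;

    (void)closedir(dirp);
    return found;
}

/* Hypothetical helper: prints one line in the format shown in step 5
   and returns the result of the lookup. */
int report(const char *dir, const char *name)
{
    int found = dir_contains(dir, name);

    if (found < 0)
        perror(dir);
    else
        printf("\"%s\" %s.\n", name, found ? "found" : "not found");
    return found;
}
```

A driver would call report( ".", argv[i] ) for each program argument. Like the real test program, this exercises only opendir(), readdir(), and closedir(); it says nothing about seekdir() or telldir().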
END OF INSTALL
echo 'directory.3c' 1>&2
cat >'directory.3c' <<'END OF directory.3c'
.TH DIRECTORY 3C "Standard Extension"
.SH NAME
opendir, readdir, telldir, seekdir, rewinddir, closedir \- directory operations
.SH SYNOPSIS
.B "#include <sys/types.h>"
.br
.B "#include <dirent.h>"
.P
.B "DIR \(**opendir (dirname)"
.br
.B "char \(**dirname;"
.P
.B "struct dirent \(**readdir (dirp)"
.br
.B "DIR \(**dirp;"
.P
.B "off_t telldir (dirp)"
.br
.B "DIR \(**dirp;"
.P
.B "void seekdir (dirp, loc)"
.br
.B "DIR \(**dirp;"
.br
.B "off_t loc;"
.P
.B "void rewinddir (dirp)"
.br
.B "DIR \(**dirp;"
.P
.B "int closedir (dirp)"
.br
.B "DIR \(**dirp;"
.SH DESCRIPTION
.I Opendir
establishes a connection between the directory named by
.I dirname
and a unique object of type
.SM DIR
known as a
.I "directory stream"
that it creates.
.I Opendir
returns a pointer to be used to identify the directory stream in subsequent operations.
A
.SM NULL
pointer is returned if
.I dirname
cannot be accessed or is not a directory, or if
.I opendir
is unable to create the
.SM DIR
object (perhaps due to insufficient memory).
.P
.I Readdir
returns a pointer to an internal structure containing information about the next active directory entry.
No inactive entries are reported.
The internal structure may be overwritten by another operation on the same directory stream; the amount of storage needed to hold a copy of the internal structure is given by the value of a macro,
.IR DIRENTSIZ(strlen(direntp\->d_name)) ,
not by
.I "sizeof(struct\ dirent)"
as one might expect.
A
.SM NULL
pointer is returned upon reaching the end of the directory, upon detecting an invalid location in the directory, or upon occurrence of an error while reading the directory.
.P
.I Telldir
returns the current position associated with the named directory stream for later use as an argument to
.IR seekdir .
.P
.I Seekdir
sets the position of the next
.I readdir
operation on the named directory stream.
The new position reverts to the one associated with the directory stream when the
.I telldir
operation from which
.I loc
was obtained was performed.
.P
.I Rewinddir
resets the position of the named directory stream to the beginning of the directory.
All buffered data for the directory stream is discarded, thereby guaranteeing that the actual file system directory will be referred to for the next
.I readdir
on the directory stream.
.P
.I Closedir
closes the named directory stream; internal resources used for the directory stream are liberated, and subsequent use of the associated
.SM DIR
object is no longer valid.
.I Closedir
returns a value of zero if no error occurs, \-1 otherwise.
.P
There are several possible errors that can occur as a result of these operations; the external integer variable
.I errno
is set to indicate the specific error.
.RI ( Readdir 's
detection of the normal end of a directory is not considered to be an error.)
.SH EXAMPLE
Sample code which searches the current working directory for entry
.IR name :
.P
.ft B
dirp = opendir( "." );
.br
while ( (dp = readdir( dirp )) != NULL )
.br
if ( strcmp( dp\->d_name, name ) == 0 )
.br
{
.br
(void) closedir( dirp );
.br
return FOUND;
.br
}
.br
(void) closedir( dirp );
.br
return NOT_FOUND;
.ft P
.SH "SEE ALSO"
getdents(2), dirent(4).
.SH WARNINGS
Entries for "." and ".." may not be reported for some file system types.
.P
The value returned by
.I telldir
need not have any simple interpretation and should only be used as an argument to
.IR seekdir .
Similarly, the
.I loc
argument to
.I seekdir
must be obtained from a previous
.I telldir
operation on the same directory stream.
.P
.I Telldir
and
.I seekdir
are unreliable when used in conjunction with file systems that perform directory compaction or expansion or when the directory stream has been closed and reopened.
It is best to avoid using
.I telldir
and
.I seekdir
altogether.
.P
The exact set of
.I errno
values and meanings may vary among implementations.
.P
Because directory entries can dynamically appear and disappear, and because directory contents are buffered by these routines, an application may need to continually rescan a directory to maintain an accurate picture of its active entries.
END OF directory.3c
echo 'dirent.4' 1>&2
cat >'dirent.4' <<'END OF dirent.4'
.TH DIRENT 4 "Standard Extension"
.SH NAME
dirent \- file system independent directory entry
.SH SYNOPSIS
.B "#include <sys/types.h>"
.br
.B "#include <sys/dirent.h>"
.SH DESCRIPTION
Different file system types may have different directory entries.
The
.I dirent
structure defines a file system independent directory entry, which contains information common to directory entries in different file system types.
A set of these structures is returned by the
.IR getdents (2)
system call.
.P
The
.I dirent
structure is defined below.
.br
struct dirent {
.br
long d_ino;
.br
off_t d_off;
.br
unsigned short d_reclen;
.br
char d_name[1];
.br
};
.P
The field
.I d_ino
is a number which is unique for each file in the file system.
The field
.I d_off\^
represents an offset of that directory entry in the actual file system directory.
The field
.I d_name
is the beginning of the character array giving the name of the directory entry.
This name is null terminated and may have at most
.SM NAME_MAX
characters in addition to the null terminator.
This results in file system independent directory entries being variable-length entities.
The value of
.I d_reclen
is the record length of this entry.
This length is defined to be the number of bytes between the beginning of the current entry and the next one, adjusted so that the next entry will start on a long boundary.
.SH FILES
/usr/include/sys/dirent.h
.SH "SEE ALSO"
getdents(2).
.SH WARNING
The field
.I d_off\^
does not have a simple interpretation for some file system types and should not be used directly by applications.
END OF dirent.4
echo 'getdents.2' 1>&2
cat >'getdents.2' <<'END OF getdents.2'
.TH GETDENTS 2 "Standard Extension"
.SH NAME
getdents \- get directory entries in a file system independent format
.SH SYNOPSIS
.B "#include <sys/types.h>"
.br
.B "#include <sys/dirent.h>"
.P
.B "int getdents (fildes, buf, nbyte)"
.br
.B "int fildes;"
.br
.B "char \(**buf;"
.br
.B "unsigned nbyte;"
.SH DESCRIPTION
.I Fildes
is a file descriptor obtained from an
.IR open (2)
or
.IR dup (2)
system call.
.P
.I Getdents
attempts to read
.I nbyte
bytes from the directory associated with
.I fildes
and to format them as file system independent entries in the buffer pointed to by
.IR buf .
Since the file system independent directory entries are of variable length, in most cases the actual number of bytes returned will be less than
.IR nbyte .
.P
The file system independent directory entry is specified by the
.I dirent
structure.
For a description of this see
.IR dirent (4).
.P
On devices capable of seeking,
.I getdents
starts at a position in the file given by the file pointer associated with
.IR fildes .
Upon return from
.IR getdents ,
the file pointer has been incremented to point to the next directory entry.
.P
This system call was developed in order to implement the
.I readdir
routine [for a description see
.IR directory (3C)]
and should not be used for other purposes.
.SH "SEE ALSO"
directory(3C), dirent(4).
.SH DIAGNOSTICS
Upon successful completion a non-negative integer is returned indicating the number of bytes of
.I buf\^
actually filled.
(This need not be the number actually used in the actual directory file.)\|\|
A value of zero indicates the end of the directory has been reached.
If
.I getdents
fails for any other reason, a value of \-1 is returned and the external integer variable
.I errno
is set to indicate the error.
.SH WARNINGS
Entries for "." and ".." may not be reported for some file system types.
.P
The exact set of
.I errno
values and meanings may vary among implementations.
END OF getdents.2
echo 'dirent.h' 1>&2
cat >'dirent.h' <<'END OF dirent.h'
/*
    <dirent.h> -- definitions for SVR3 directory access routines

    last edit: 25-Apr-1987 D A Gwyn

    Prerequisite: <sys/types.h>
*/

#include <sys/dirent.h>

#define DIRBUF 8192     /* buffer size for fs-indep. dirs */
    /* must in general be larger than the filesystem buffer size */

typedef struct
    {
    int dd_fd;          /* file descriptor */
    int dd_loc;         /* offset in block */
    int dd_size;        /* amount of valid data */
    char *dd_buf;       /* -> directory block */
    } DIR;              /* stream data from opendir() */

extern DIR *opendir();
extern struct dirent *readdir();
extern off_t telldir();
extern void seekdir();
extern void rewinddir();
extern int closedir();

#ifndef NULL
#define NULL 0          /* DAG -- added for convenience */
#endif
END OF dirent.h
echo 'sys.dirent.h' 1>&2
cat >'sys.dirent.h' <<'END OF sys.dirent.h'
/*
    <sys/dirent.h> -- file system independent directory entry (SVR3)

    last edit: 27-Oct-1988 D A Gwyn

    prerequisite: <sys/types.h>
*/

struct dirent                   /* data from getdents()/readdir() */
    {
    long d_ino;                 /* inode number of entry */
    off_t d_off;                /* offset of disk directory entry */
    unsigned short d_reclen;    /* length of this record */
    char d_name[1];             /* name of file */ /* non-ANSI */
    };

#ifdef BSD_SYSV /* (e.g., when compiling getdents.c) */
extern struct dirent __dirent;  /* (not actually used) */
/* The following is portable, although rather silly. */
#define DIRENTBASESIZ (__dirent.d_name - (char *)&__dirent.d_ino)
#else
/* The following nonportable ugliness could have been avoided by defining
   DIRENTSIZ and DIRENTBASESIZ to also have (struct dirent *) arguments.
   There shouldn't be any problem if you avoid using the DIRENTSIZ() macro.
*/
#define DIRENTBASESIZ (((struct dirent *)0)->d_name \
                      - (char *)&((struct dirent *)0)->d_ino)
#endif

#define DIRENTSIZ( namlen ) ((DIRENTBASESIZ + sizeof(long) + (namlen)) \
                            / sizeof(long) * sizeof(long))

/* DAG -- the following was moved from <dirent.h>, which was the wrong place */
#define MAXNAMLEN 512           /* maximum filename length */

#ifndef NAME_MAX
#define NAME_MAX (MAXNAMLEN - 1) /* DAG -- added for POSIX */
#endif
END OF sys.dirent.h
echo 'sys._dir.h' 1>&2
cat >'sys._dir.h' <<'END OF sys._dir.h'
/*
    <sys/_dir.h> -- definitions for 4.2,4.3BSD directories

    last edit: 25-Apr-1987 D A Gwyn

    A directory consists of some number of blocks of DIRBLKSIZ bytes each,
    where DIRBLKSIZ is chosen such that it can be transferred to disk in a
    single atomic operation (e.g., 512 bytes on most machines).

    Each DIRBLKSIZ-byte block contains some number of directory entry
    structures, which are of variable length. Each directory entry has the
    beginning of a (struct direct) at the front of it, containing its
    filesystem-unique ident number, the length of the entry, and the length
    of the name contained in the entry. These are followed by the
    NUL-terminated name padded to a (long) boundary with 0 bytes. The
    maximum length of a name in a directory is MAXNAMELEN.

    The macro DIRSIZ(dp) gives the amount of space required to represent a
    directory entry. Free space in a directory is represented by entries
    that have dp->d_reclen > DIRSIZ(dp). All DIRBLKSIZ bytes in a
    directory block are claimed by the directory entries; this usually
    results in the last entry in a directory having a large dp->d_reclen.
    When entries are deleted from a directory, the space is returned to the
    previous entry in the same directory block by increasing its
    dp->d_reclen. If the first entry of a directory block is free, then
    its dp->d_fileno is set to 0; entries other than the first in a
    directory do not normally have dp->d_fileno set to 0.

    prerequisite: <sys/types.h>
*/

#if defined(accel) || defined(sun) || defined(vax)
#define DIRBLKSIZ 512           /* size of directory block */
#else
#ifdef alliant
#define DIRBLKSIZ 4096          /* size of directory block */
#else
#ifdef gould
#define DIRBLKSIZ 1024          /* size of directory block */
#else
#ifdef ns32000 /* Dynix System V */
#define DIRBLKSIZ 2600          /* size of directory block */
#else /* be conservative; multiple blocks are okay but fractions are not */
#define DIRBLKSIZ 4096          /* size of directory block */
#endif
#endif
#endif
#endif

#define MAXNAMELEN 255          /* maximum filename length */
/* NOTE: not MAXNAMLEN, which has been preempted by SVR3 <dirent.h> */

struct direct                   /* data from read()/_getdirentries() */
    {
    unsigned long d_fileno;     /* unique ident of entry */
    unsigned short d_reclen;    /* length of this record */
    unsigned short d_namlen;    /* length of string in d_name */
    char d_name[MAXNAMELEN+1];  /* NUL-terminated filename */
                                /* typically shorter */
    };

/*
    The DIRSIZ macro gives the minimum record length which will hold the
    directory entry. This requires the amount of space in a (struct
    direct) without the d_name field, plus enough space for the name with a
    terminating NUL character, rounded up to a (long) boundary.

    (Note that Berkeley didn't properly compensate for struct padding,
    but we nevertheless have to use the same size as the actual system.)
*/

#define DIRSIZ( dp ) ((sizeof(struct direct) - (MAXNAMELEN+1) \
                     + sizeof(long) + (dp)->d_namlen) \
                     / sizeof(long) * sizeof(long))
END OF sys._dir.h
echo 'opendir.c' 1>&2
cat >'opendir.c' <<'END OF opendir.c'
/*
    opendir -- open a directory stream

    last edit: 27-Oct-1988 D A Gwyn
*/

#include <sys/errno.h>
#include <sys/types.h>
#include <sys/stat.h>
#include <dirent.h>

#ifdef BSD_SYSV
#define open _open              /* avoid emulation overhead */
#endif

typedef char *pointer;          /* (void *) if you have it */

extern void free();
extern pointer malloc();
extern int open(), close(), fstat();

extern int errno;

#ifndef NULL
#define NULL 0
#endif

#ifndef O_RDONLY
#define O_RDONLY 0
#endif

#ifndef S_ISDIR                 /* macro to test for directory file */
#define S_ISDIR( mode ) (((mode) & S_IFMT) == S_IFDIR)
#endif

DIR *
opendir( dirname )
    char *dirname;              /* name of directory */
    {
    register DIR *dirp;         /* -> malloc'ed storage */
    register int fd;            /* file descriptor for read */
    /* The following is static just to keep the stack small.
    */
    static struct stat sbuf;    /* result of fstat() */

    if ( (fd = open( dirname, O_RDONLY )) < 0 )
        return NULL;            /* errno set by open() */

    if ( fstat( fd, &sbuf ) != 0 || !S_ISDIR( sbuf.st_mode ) )
        {
        (void)close( fd );
        errno = ENOTDIR;
        return NULL;            /* not a directory */
        }

    if ( (dirp = (DIR *)malloc( sizeof(DIR) )) == NULL
      || (dirp->dd_buf = (char *)malloc( (unsigned)DIRBUF )) == NULL
       )
        {
        register int serrno = errno;
                                /* errno set to ENOMEM by sbrk() */

        if ( dirp != NULL )
            free( (pointer)dirp );

        (void)close( fd );
        errno = serrno;
        return NULL;            /* not enough memory */
        }

    dirp->dd_fd = fd;
    dirp->dd_loc = dirp->dd_size = 0;   /* refill needed */

    return dirp;
    }
END OF opendir.c
echo 'readdir.c' 1>&2
cat >'readdir.c' <<'END OF readdir.c'
/*
    readdir -- read next entry from a directory stream

    last edit: 25-Apr-1987 D A Gwyn
*/

#include <sys/errno.h>
#include <sys/types.h>
#include <dirent.h>

extern int getdents();          /* SVR3 system call, or emulation */

extern int errno;

#ifndef NULL
#define NULL 0
#endif

struct dirent *
readdir( dirp )
    register DIR *dirp;         /* stream from opendir() */
    {
    register struct dirent *dp; /* -> directory data */

    if ( dirp == NULL || dirp->dd_buf == NULL )
        {
        errno = EFAULT;
        return NULL;            /* invalid pointer */
        }

    do
        {
        if ( dirp->dd_loc >= dirp->dd_size )    /* empty or obsolete */
            dirp->dd_loc = dirp->dd_size = 0;

        if ( dirp->dd_size == 0 /* need to refill buffer */
          && (dirp->dd_size =
                getdents( dirp->dd_fd, dirp->dd_buf, (unsigned)DIRBUF )
             ) <= 0
           )
            return NULL;        /* EOF or error */

        dp = (struct dirent *)&dirp->dd_buf[dirp->dd_loc];
        dirp->dd_loc += dp->d_reclen;
        }
    while ( dp->d_ino == 0L );  /* don't rely on getdents() */

    return dp;
    }
END OF readdir.c
echo 'telldir.c' 1>&2
cat >'telldir.c' <<'END OF telldir.c'
/*
    telldir -- report directory stream position

    last edit: 25-Apr-1987 D A Gwyn

    NOTE: 4.nBSD directory compaction makes seekdir() & telldir()
    practically impossible to do right. Avoid using them!
*/

#include <sys/errno.h>
#include <sys/types.h>
#include <dirent.h>

extern off_t lseek();

extern int errno;

#ifndef SEEK_CUR
#define SEEK_CUR 1
#endif

off_t
telldir( dirp )                 /* return offset of next entry */
    DIR *dirp;                  /* stream from opendir() */
    {
    if ( dirp == NULL || dirp->dd_buf == NULL )
        {
        errno = EFAULT;
        return -1;              /* invalid pointer */
        }

    if ( dirp->dd_loc < dirp->dd_size ) /* valid index */
        return ((struct dirent *)&dirp->dd_buf[dirp->dd_loc])->d_off;
    else                        /* beginning of next directory block */
        return lseek( dirp->dd_fd, (off_t)0, SEEK_CUR );
    }
END OF telldir.c
echo 'seekdir.c' 1>&2
cat >'seekdir.c' <<'END OF seekdir.c'
/*
    seekdir -- reposition a directory stream

    last edit: 24-May-1987 D A Gwyn

    An unsuccessful seekdir() will in general alter the current
    directory position; beware.

    NOTE: 4.nBSD directory compaction makes seekdir() & telldir()
    practically impossible to do right. Avoid using them!
*/

#include <sys/errno.h>
#include <sys/types.h>
#include <dirent.h>

extern off_t lseek();

extern int errno;

#ifndef NULL
#define NULL 0
#endif

#ifndef SEEK_SET
#define SEEK_SET 0
#endif

typedef int bool;               /* Boolean data type */
#define false 0
#define true 1

void
seekdir( dirp, loc )
    register DIR *dirp;         /* stream from opendir() */
    register off_t loc;         /* position from telldir() */
    {
    register bool rewind;       /* "start over when stymied" flag */

    if ( dirp == NULL || dirp->dd_buf == NULL )
        {
        errno = EFAULT;
        return;                 /* invalid pointer */
        }

    /* A (struct dirent)'s d_off is an invented quantity on 4.nBSD
       NFS-supporting systems, so it is not safe to lseek() to it. */

    /* Monotonicity of d_off is heavily exploited in the following. */

    /* This algorithm is tuned for modest directory sizes. For
       huge directories, it might be more efficient to read blocks
       until the first d_off is too large, then back up one block,
       or even to use binary search on the directory blocks. I
       doubt that the extra code for that would be worthwhile.
    */

    if ( dirp->dd_loc >= dirp->dd_size  /* invalid index */
      || ((struct dirent *)&dirp->dd_buf[dirp->dd_loc])->d_off > loc
                                /* too far along in buffer */
       )
        dirp->dd_loc = 0;       /* reset to beginning of buffer */
    /* else save time by starting at current dirp->dd_loc */

    for ( rewind = true; ; )
        {
        register struct dirent *dp;

        /* See whether the matching entry is in the current buffer. */

        if ( (dirp->dd_loc < dirp->dd_size      /* valid index */
           || readdir( dirp ) != NULL   /* next buffer read */
           && (dirp->dd_loc = 0, true)  /* beginning of buffer set */
             )
          && (dp = (struct dirent *)&dirp->dd_buf[dirp->dd_loc])->d_off
             <= loc             /* match possible in this buffer */
           )
            {
            for ( /* dp initialized above */ ;
                  (char *)dp < &dirp->dd_buf[dirp->dd_size];
                  dp = (struct dirent *)((char *)dp + dp->d_reclen)
                )
                if ( dp->d_off == loc )
                    {           /* found it! */
                    dirp->dd_loc = (char *)dp - dirp->dd_buf;
                    return;
                    }

            rewind = false;     /* no point in backing up later */
            dirp->dd_loc = dirp->dd_size;       /* set end of buffer */
            }
        else                    /* whole buffer past matching entry */
            if ( !rewind )
                {               /* no point in searching further */
                errno = EINVAL;
                return;         /* no entry at specified loc */
                }
            else
                {               /* rewind directory and start over */
                rewind = false; /* but only once! */

                dirp->dd_loc = dirp->dd_size = 0;

                if ( lseek( dirp->dd_fd, (off_t)0, SEEK_SET ) != 0 )
                    return;     /* errno already set (EBADF) */

                if ( loc == 0 )
                    return;     /* save time */
                }
        }
    }
END OF seekdir.c
echo 'rewinddir.c' 1>&2
cat >'rewinddir.c' <<'END OF rewinddir.c'
/*
    rewinddir -- rewind a directory stream

    last edit: 25-Apr-1987 D A Gwyn

    This is not simply a call to seekdir(), because seekdir()
    will use the current buffer whenever possible and we need
    rewinddir() to forget about buffered data.
*/

#include <sys/errno.h>
#include <sys/types.h>
#include <dirent.h>

extern off_t lseek();

extern int errno;

#ifndef NULL
#define NULL 0
#endif

#ifndef SEEK_SET
#define SEEK_SET 0
#endif

void
rewinddir( dirp )
    register DIR *dirp;         /* stream from opendir() */
    {
    if ( dirp == NULL || dirp->dd_buf == NULL )
        {
        errno = EFAULT;
        return;                 /* invalid pointer */
        }

    dirp->dd_loc = dirp->dd_size = 0;   /* invalidate buffer */
    (void)lseek( dirp->dd_fd, (off_t)0, SEEK_SET );     /* may set errno */
    }
END OF rewinddir.c
echo 'closedir.c' 1>&2
cat >'closedir.c' <<'END OF closedir.c'
/*
    closedir -- close a directory stream

    last edit: 11-Nov-1988 D A Gwyn
*/

#include <sys/errno.h>
#include <sys/types.h>
#include <dirent.h>

typedef char *pointer;          /* (void *) if you have it */

extern void free();
extern int close();

extern int errno;

#ifndef NULL
#define NULL 0
#endif

int
closedir( dirp )
    register DIR *dirp;         /* stream from opendir() */
    {
    register int fd;

    if ( dirp == NULL || dirp->dd_buf == NULL )
        {
        errno = EFAULT;
        return -1;              /* invalid pointer */
        }

    fd = dirp->dd_fd;           /* bug fix thanks to R. Salz */
    free( (pointer)dirp->dd_buf );
    free( (pointer)dirp );
    return close( fd );
    }
END OF closedir.c
echo 'getdents.c' 1>&2
cat >'getdents.c' <<'END OF getdents.c'
/*
    getdents -- get directory entries in a file system independent format
                (SVR3 system call emulation)

    last edit: 27-Oct-1988 D A Gwyn

    This single source file supports several different methods of
    getting directory entries from the operating system.
    Define whichever one of the following describes your system:

    UFS original UNIX filesystem (14-character name limit)
    BFS 4.2BSD (also 4.3BSD) native filesystem (long names)
    NFS getdirentries() system call

    Also define any of the following flags that are pertinent:

    ATT_SPEC check user buffer address for longword alignment
    BSD_SYSV BRL UNIX System V emulation environment on 4.nBSD
    INT_SIGS <signal.h> thinks that signal handlers have
             return type int (rather than the standard void)
    NEG_DELS deleted entries have inode number -1 rather than 0
    UNK      have _getdents() system call, but kernel may not
             support it

    If your C library has a getdents() system call interface, but you
    can't count on all kernels on which your application binaries may
    run to support it, change the system call interface name to
    _getdents() and define "UNK" to enable the system-call validity
    test in this "wrapper" around _getdents().

    If your system has a getdents() system call that is guaranteed
    to always work, you shouldn't be using this source file at all.
*/

#include <sys/errno.h>
#include <sys/types.h>
#ifdef BSD_SYSV
#include <sys/_dir.h>           /* BSD flavor, not System V */
#else
#include <sys/dir.h>
#undef MAXNAMLEN                /* avoid conflict with SVR3 */
    /* Good thing we don't need to use the DIRSIZ() macro! */
#ifdef d_ino                    /* 4.3BSD/NFS using d_fileno */
#undef d_ino                    /* (not absolutely necessary) */
#else
#define d_fileno d_ino          /* (struct direct) member */
#endif
#endif
#include <sys/dirent.h>
#include <sys/stat.h>
#ifdef UNK
#ifndef UFS
#include "***** ERROR ***** UNK applies only to UFS"
/* One could do something similar for getdirentries(), but I didn't bother.
*/
#endif
#include <signal.h>
#endif

#if defined(UFS) + defined(BFS) + defined(NFS) != 1	/* sanity check */
#include "***** ERROR ***** exactly one of UFS, BFS, or NFS must be defined"
#endif

#ifdef BSD_SYSV
struct dirent	__dirent;	/* (just for the DIRENTBASESIZ macro) */
#endif

#ifdef UFS
#define RecLen( dp ) (sizeof(struct direct))	/* fixed-length entries */
#else	/* BFS || NFS */
#define RecLen( dp ) ((dp)->d_reclen)	/* variable-length entries */
#endif

#ifdef NFS
#ifdef BSD_SYSV
#define getdirentries _getdirentries	/* package hides this system call */
#endif
extern int	getdirentries();
static long	dummy;		/* getdirentries() needs basep */
#define GetBlock( fd, buf, n ) getdirentries( fd, buf, (unsigned)n, &dummy )
#else	/* UFS || BFS */
#ifdef BSD_SYSV
#define read _read		/* avoid emulation overhead */
#endif
extern int	read();
#define GetBlock( fd, buf, n ) read( fd, buf, (unsigned)n )
#endif

#ifdef UNK
extern int	_getdents();	/* actual system call */
#endif

extern char	*strncpy();
extern int	fstat();
extern off_t	lseek();

extern int	errno;

#ifdef NEG_DELS
#define DELETED (-1)
#else
#define DELETED 0
#endif

#ifndef DIRBLKSIZ
#define DIRBLKSIZ 4096		/* directory file read buffer size */
#endif

#ifndef NULL
#define NULL 0
#endif

#ifndef SEEK_CUR
#define SEEK_CUR 1
#endif

#ifndef S_ISDIR			/* macro to test for directory file */
#define S_ISDIR( mode ) (((mode) & S_IFMT) == S_IFDIR)
#endif

#ifdef UFS

/*
	The following routine is necessary to handle DIRSIZ-long entry names.
	Thanks to Richard Todd for pointing this out.
*/

static int
NameLen( name )			/* return # chars in embedded name */
	char	name[];		/* -> name embedded in struct direct */
	{
	register char	*s;	/* -> name[.]
				*/
	register char	*stop = &name[DIRSIZ];	/* -> past end of name field */

	for ( s = &name[1];	/* (empty names are impossible) */
	      *s != '\0'	/* not NUL terminator */
	   && ++s < stop;	/* < DIRSIZ characters scanned */
	    )
		;

	return s - name;	/* # valid characters in name */
	}

#else	/* BFS || NFS */

extern int	strlen();

#define NameLen( name ) strlen( name )	/* names are always NUL-terminated */

#endif

#ifdef UNK
static enum { maybe, no, yes }	state = maybe;	/* does _getdents() work? */

#ifdef INT_SIGS
#define RET_SIG int
#else
#define RET_SIG void
#endif

/*ARGSUSED*/
static RET_SIG
sig_catch( sig )
	int	sig;		/* must be SIGSYS */
	{
	state = no;		/* attempted _getdents() faulted */
#ifdef INT_SIGS
	return 0;		/* telling lies */
#endif
	}
#endif	/* UNK */

int
getdents( fildes, buf, nbyte )	/* returns # bytes read;
				   0 on EOF, -1 on error */
	int		fildes;	/* directory file descriptor */
	char		*buf;	/* where to put the (struct dirent)s */
	unsigned	nbyte;	/* size of buf[] */
	{
	int		serrno;	/* entry errno */
	off_t		offset;	/* initial directory file offset */
	/* The following are static just to keep the stack small. */
	static struct stat	statb;	/* fstat() info */
	static union
		{
		char	dblk[DIRBLKSIZ
#ifdef UFS
			     +1	/* for last entry name terminator */
#endif
			    ];	/* directory file block buffer */
		struct direct	dummy;	/* just for alignment */
		}	u;	/* (avoids having to malloc()) */
	register struct direct	*dp;	/* -> u.dblk[.] */
	register struct dirent	*bp;	/* -> buf[.]
					*/

#ifdef UNK
	if ( state == yes )	/* _getdents() is known to work */
		return _getdents( fildes, buf, nbyte );

	if ( state == maybe )	/* first time only */
		{
		RET_SIG		(*shdlr)();	/* entry SIGSYS handler */
		register int	retval;	/* return from _getdents() if any */

		shdlr = signal( SIGSYS, sig_catch );
		retval = _getdents( fildes, buf, nbyte );	/* try it */
		(void)signal( SIGSYS, shdlr );

		if ( state == maybe )	/* SIGSYS did not occur */
			{
			state = yes;	/* so _getdents() must have worked */
			return retval;
			}
		}

	/* state == no; perform emulation */
#endif

	if ( buf == NULL
#ifdef ATT_SPEC
	  || (unsigned long)buf % sizeof(long) != 0	/* ugh */
#endif
	   )
		{
		errno = EFAULT;	/* invalid pointer */
		return -1;
		}

	if ( fstat( fildes, &statb ) != 0 )
		return -1;	/* errno set by fstat() */

	if ( !S_ISDIR( statb.st_mode ) )
		{
		errno = ENOTDIR;	/* not a directory */
		return -1;
		}

	if ( (offset = lseek( fildes, (off_t)0, SEEK_CUR )) < 0 )
		return -1;	/* errno set by lseek() */

#ifdef BFS			/* no telling what remote hosts do */
	if ( (unsigned long)offset % DIRBLKSIZ != 0 )
		{
		errno = ENOENT;	/* file pointer probably misaligned */
		return -1;
		}
#endif

	serrno = errno;		/* save entry errno */

	for ( bp = (struct dirent *)buf; bp == (struct dirent *)buf; )
		{		/* convert next directory block */
		int	size;

		do
			size = GetBlock( fildes, u.dblk, DIRBLKSIZ );
		while ( size == -1 && errno == EINTR );

		if ( size <= 0 )
			return size;	/* EOF or error (EBADF) */

		for ( dp = (struct direct *)u.dblk;
		      (char *)dp < &u.dblk[size];
		      dp = (struct direct *)((char *)dp + RecLen( dp ))
		    )
			{
#ifndef UFS
			if ( dp->d_reclen <= 0 )
				{
				errno = EIO;	/* corrupted directory */
				return -1;
				}
#endif

			if ( dp->d_fileno != DELETED )
				{	/* non-empty; copy to user buffer */
				register int	reclen =
					DIRENTSIZ( NameLen( dp->d_name ) );

				if ( (char *)bp + reclen > &buf[nbyte] )
					{
					errno = EINVAL;
					return -1;	/* buf too small */
					}

				bp->d_ino = dp->d_fileno;
				bp->d_off = offset + ((char *)dp - u.dblk);
				bp->d_reclen = reclen;

				{
#ifdef UFS
				/* Is the following kludge ugly?
				   You bet. */

				register char	save = dp->d_name[DIRSIZ];
						/* save original data */

				dp->d_name[DIRSIZ] = '\0';
						/* ensure NUL termination */
#endif
				(void)strncpy( bp->d_name, dp->d_name,
					       reclen - DIRENTBASESIZ
					     );	/* adds NUL padding */
#ifdef UFS
				dp->d_name[DIRSIZ] = save;
						/* restore original data */
#endif
				}

				bp = (struct dirent *)((char *)bp + reclen);
				}
			}

#if !(defined(BFS) || defined(sun))	/* 4.2BSD screwed up; fixed in 4.3BSD */
		if ( (char *)dp > &u.dblk[size] )
			{
			errno = EIO;	/* corrupted directory */
			return -1;
			}
#endif
		}

	errno = serrno;		/* restore entry errno */
	return (char *)bp - buf;	/* return # bytes read */
	}
END OF getdents.c
echo 'testdir.c' 1>&2
cat >'testdir.c' <<'END OF testdir.c'
/*
	testdir -- basic test for C library directory access routines

	last edit:	25-Apr-1987	D A Gwyn
*/

#include <sys/types.h>
#include <stdio.h>
#include <dirent.h>

extern void	exit();
extern int	strcmp();

main( argc, argv )
	int		argc;
	register char	**argv;
	{
	register DIR		*dirp;
	register struct dirent	*dp;
	int			nerrs = 0;	/* total not found */

	if ( (dirp = opendir( "." )) == NULL )
		{
		(void)fprintf( stderr, "Cannot open \".\" directory\n" );
		exit( 1 );
		}

	while ( --argc > 0 )
		{
		++argv;

		while ( (dp = readdir( dirp )) != NULL )
			if ( strcmp( dp->d_name, *argv ) == 0 )
				{
				(void)printf( "\"%s\" found.\n", *argv );
				break;
				}

		if ( dp == NULL )
			{
			(void)printf( "\"%s\" not found.\n", *argv );
			++nerrs;
			}

		rewinddir( dirp );
		}

	(void)closedir( dirp );
	exit( nerrs );
	}
END OF testdir.c
exit 0
--
David H. Brierley
Home: dave@galaxia.Newport.RI.US {rayssd,xanth,lazlo,jclyde}!galaxia!dave
Work: dhb@rayssd.ray.com {sun,decuac,gatech,necntc,ukma}!rayssd!dhb