[unix-pc.sources] Doug Gwyn directory reading routines

dave@galaxia.Newport.RI.US (David H. Brierley) (02/16/89)

Several programs posted recently, both here and in the comp.sources
groups, require some form of directory reading routines, and such
routines are generally worth having if you write programs.  Here is a
copy of the current version of the directory reading package written by
Doug Gwyn.  This version was received from Doug on Feb 13, 1989 and is
reposted here with his permission.

-------- cut here and feed to /bin/sh -------
#!/bin/sh
# Self-unpacking archive format.  To unbundle, sh this file.
echo 'NOTES' 1>&2
cat >'NOTES' <<'END OF NOTES'


	NOTES FOR POSIX-COMPATIBLE C LIBRARY DIRECTORY-ACCESS ROUTINES


Older UNIX C libraries lacked support for reading directories, so historically
programs had knowledge of UNIX directory structure hard-coded into them.  When
Berkeley changed the format of directories for 4.2BSD, it became necessary to
change programs to work with the new structure.  Fortunately, Berkeley designed
a small set of directory access routines to encapsulate knowledge of the new
directory format so that user programs could deal with directory entries as an
abstract data type.  (Unfortunately, they didn't get it quite right.)  The
interface to these routines was nearly independent of the particular
implementation of directories on any given UNIX system; this has become a
particularly important requirement with the advent of heterogeneous network
filesystems such as NFS.

It has consequently become possible to write portable applications that search
directories by restricting all directory access to use these new interface
routines.  The sources supplied here are a total rewrite of Berkeley's code,
incorporating ideas from a variety of sources and conforming as closely to
published standards as possible, and are in the PUBLIC DOMAIN to encourage
their widespread adoption.  They support four methods of access to system
directories: the original UNIX filesystem via read(), the 4.2BSD filesystem via
read(), NFS and native filesystems via getdirentries(), and SVR3 getdents().
The first three methods are accomplished by appropriate emulation of the SVR3
getdents() system call, which attains portability at the cost of slightly more
data movement than absolutely necessary for some systems.  These routines
should be added to the standard C library on all UNIX systems, and all existing
and future applications should be changed to use this interface.  Once this is
done, there should be no portability problems due to differences in underlying
directory structures among UNIX systems.  (When porting your applications to
other UNIX systems, you can always carry this package around with you.)

An additional benefit of these routines is that they buffer directory input,
which provides improved access speed over raw read()s of one entry at a time.

One annoying compatibility problem has arisen along the way, namely that the
original Berkeley interface used the same name, struct direct, for the new data
structure as had been used for the original UNIX filesystem directory record
structure.  This name was changed by the IEEE 1003.1 (POSIX) Working Group to
"struct dirent" and was picked up for SVR3 under the new name; it is also the
name used in this portable package.  I believe it is necessary to bite the
bullet and adopt the new non-conflicting name.  Code using a 4.2BSD-compatible
package needs to be slightly revised to work with this new package, as follows:
	Change
		#include <ndir.h>	/* Ninth Edition UNIX */
	or
		#include <sys/dir.h>	/* 4.2BSD */
	or
		#include <dir.h>	/* old BRL System V emulation */
	to
		#include <sys/types.h>	/* if not already #included */
		#include <dirent.h>

	Change
		struct direct
	to
		struct dirent

	Change
		(anything)->d_namlen
	to
		strlen( (anything)->d_name )
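
As a minimal sketch of the converted style (not part of this package), a
lookup written against the new interface might read as follows.  The function
name find_entry and its return convention are assumptions made purely for
illustration, and the sketch uses ANSI prototypes rather than the K&R style of
the package sources.

```c
#include <sys/types.h>	/* prerequisite on older systems */
#include <dirent.h>
#include <string.h>

/* find_entry is a hypothetical helper, not part of this package.
   Returns 1 if "name" is an entry of directory "dir", 0 if it is
   not, and -1 if the directory cannot be opened. */
int
find_entry(const char *dir, const char *name)
{
	DIR *dirp = opendir(dir);
	struct dirent *dp;		/* was: struct direct *dp */
	size_t want = strlen(name);
	int found = 0;

	if (dirp == NULL)
		return -1;

	while ((dp = readdir(dirp)) != NULL) {
		/* was: dp->d_namlen == want */
		if (strlen(dp->d_name) == want
		    && strcmp(dp->d_name, name) == 0) {
			found = 1;
			break;
		}
	}
	(void) closedir(dirp);
	return found;
}
```

A caller might write find_entry( ".", "core" ) and treat a negative result as
"could not read the directory" rather than "not found".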

There is a minor compatibility problem in that the closedir() function was
originally defined to have type void, but IEEE 1003.1 changed this to type int,
which is what this implementation supports (even though I disagree with the
change).  However, the difference does not affect most applications.

Error handling is not completely satisfactory, due to the variety of possible
failure modes in a general setting.  For example, the rewinddir() function
might fail, but there is no good way to indicate this.  I have tried to
follow the specifications in IEEE 1003.1 and the SVID as closely as possible,
but there are minor deviations in this regard.  Applications should not rely
too heavily on exact failure mode semantics.

Please do not change the new standard interface in any way, as that would
defeat the major purpose of this package!  (It's okay to alter the internal
implementation if you really have to, although I tried to make this unnecessary
for the vast majority of UNIX variants.)

Installation instructions can be found in the file named INSTALL.

This implementation is provided by:

	Douglas A. Gwyn
	U.S. Army Ballistic Research Laboratory
	SLCBR-VL-V
	Aberdeen Proving Ground, MD 21005-5066

	(301)278-6647

	Gwyn@BRL.MIL

This is UNSUPPORTED, use-at-your-own-risk, free software in the public domain.
However, I would appreciate hearing of any actual bugs you find in this
implementation and/or any improvements you come up with.
END OF NOTES
echo 'INSTALL' 1>&2
cat >'INSTALL' <<'END OF INSTALL'


			INSTALLATION INSTRUCTIONS


The following instructions are for systems resembling Ninth Edition UNIX, with
hints about dealing with variations you may encounter for your specific system.
Installation should be done only by someone who is comfortable with modifying
the standard C library and header files.

If your system already includes directory access routines, you should replace
them with this package.  We're trying to get this standardized; see the
discussion in the NOTES file.

I have tried to make the source code as generic as possible, but if your system
predates Seventh Edition UNIX you will have problems.

DISCLAIMER:  Although I believe the code and procedures described here to be
correct, I make no warranty of any kind, and you are advised to perform your
own careful testing before making any substantial change like this to your
programming environment.


0)  For antique systems that do not support C's "void" data type, edit the file
    sys.dirent.h to add the following:

	typedef int		void;	/* good enough for govt work */

    If for some reason your <sys/types.h> doesn't define them, add the
    following to sys.dirent.h:

	typedef unsigned short	ino_t;	/* (assuming original UFS) */
	typedef long		off_t;	/* long is forced by lseek() */

    None of this should be necessary for any modern UNIX system.

1)  Copy the file dirent.h to /usr/include/dirent.h and copy the file
    sys.dirent.h to /usr/include/sys/dirent.h.  (The file sys._dir.h is also
    provided for the BRL UNIX System V emulation for 4.nBSD.  That environment
    uses different directory names for everything.)

2)  Copy the file directory.3c to /usr/man/man3/directory.3 and copy the file
    dirent.4 to /usr/man/man5/dirent.5; edit the new file
    /usr/man/man3/directory.3 to change the "SEE ALSO" reference from dirent(4)
    to dirent(5) and to change the 3C on the first line to 3; edit the new file
    /usr/man/man5/dirent.5 to change the 4 on the first line to 5; then print
    the manual pages via the command

	man directory dirent

    to see what the new routines are like.  (If you have a "catman" style of
    on-line manual, adapt these instructions accordingly.  Manual entries are
    kept in directories with other names on some systems such as UNIX System V.
    On systems that already had a directory library documented in some other
    manual entry, remove the superseded manual entry; if the description of the
    native filesystem directory format found by "man dir" refers to a directory
    library, modify it to simply refer to the entry for "dirent".)

3)  Copy the files closedir.c, opendir.c, readdir.c, rewinddir.c, seekdir.c,
    and telldir.c to the "gen" or "port/gen" subdirectory of your C library
    source directory.  If you do not have a getdents() system call, copy the
    file getdents.c to the "sys" or "port/sys" subdirectory and copy the file
    getdents.2 to /usr/man/man2/getdents.2 (actually you may prefer to put this
    file in section 3 and adjust the references in the other manual entries
    accordingly; also adjust the references to dirent(4) to be to dirent(5) if
    that's where the entry is).  Edit the C library makefile(s) to include the
    new object modules in the C library.  (See the comments at the beginning of
    getdents.c for symbols that must be defined to configure getdents.c.)  Then
    remake and reinstall the C library.  Alternatively, you can just compile
    the new sources and insert their objects near the front of the C library
    /lib/libc.a using the "ar" utility (seekdir.o should precede readdir.o,
    which in turn should precede getdents.o).  On some systems you then need to
    use the "ranlib" utility to update the archive symbol table.

4)  After the C library has been updated, delete /usr/include/ndir.h or any
    other header used with a previous directory library to prevent inadvertent
    use of the superseded directory access interface.  Also delete any
    corresponding library such as /usr/lib/libndir.a.

5)  To verify installation, try compiling, linking, and running the program
    testdir.c.  This program searches the current directory "." for each file
    named as a program argument and prints `"FOO" found.' or `"FOO" not found.'
    where FOO is of course replaced by the name being sought in the directory.
    Try something like

	cd /usr/bin			# a multi-block directory
	$WHEREVER/testdir FOO lint BAR f77 XYZZY

    which should produce the output

	"FOO" not found.
	"lint" found.
	"BAR" not found.
	"f77" found.
	"XYZZY" not found.

    A more thorough test would be

	cd /usr/bin			# a multi-block directory
	$WHEREVER/testdir `ls -a` | grep 'not found'

    This program does not test the seekdir() and telldir() functions.

6)  Notify your programmers that all directory access must be made through the
    new interface, and that documentation is available via

	man directory dirent

    Make the NOTES file available to those programmers who might want to
    understand what this is all about.

7)  Change all system sources that were accessing directories to use the new
    routines.  Nearly all such sources contain the line

	#include <sys/dir.h>
    or
	#include <ndir.h>

    so they should be easy to find.  (If you earlier removed some other header
    file, that is, if this package superseded an earlier version of the
    directory access library, look for its name too.  See the conversion
    instructions in the NOTES file.)
END OF INSTALL
echo 'directory.3c' 1>&2
cat >'directory.3c' <<'END OF directory.3c'
.TH DIRECTORY 3C "Standard Extension"
.SH NAME
opendir, readdir, telldir, seekdir, rewinddir, closedir \- directory operations
.SH SYNOPSIS
.B "#include <sys/types.h>"
.br
.B "#include <dirent.h>"
.P
.B "DIR \(**opendir (dirname)"
.br
.B "char \(**dirname;"
.P
.B "struct dirent \(**readdir (dirp)"
.br
.B "DIR \(**dirp;"
.P
.B "off_t telldir (dirp)"
.br
.B "DIR \(**dirp;"
.P
.B "void seekdir (dirp, loc)"
.br
.B "DIR \(**dirp;"
.br
.B "off_t loc;"
.P
.B "void rewinddir (dirp)"
.br
.B "DIR \(**dirp;"
.P
.B "int closedir (dirp)"
.br
.B "DIR \(**dirp;"
.SH DESCRIPTION
.I Opendir
establishes a connection between
the directory named by
.I dirname
and a unique object of type
.SM DIR
known as a
.I "directory stream"
that it creates.
.I Opendir
returns a pointer to be used to identify the
directory stream
in subsequent operations.
A
.SM NULL
pointer is returned if
.I dirname
cannot be accessed or is not a directory,
or if
.I opendir
is unable to create the
.SM DIR
object
(perhaps due to insufficient memory).
.P
.I Readdir
returns a pointer to an internal structure
containing information about the next active directory entry.
No inactive entries are reported.
The internal structure may be overwritten by
another operation on the same
directory stream;
the amount of storage needed to hold a copy
of the internal structure is given by the value of a macro,
.IR DIRENTSIZ(strlen(direntp\->d_name)) ,
not by
.I "sizeof(struct\ dirent)"
as one might expect.
A
.SM NULL
pointer is returned
upon reaching the end of the directory,
upon detecting an invalid location in the directory,
or upon occurrence of an error while reading the directory.
.P
.I Telldir
returns the current position associated with the named
directory stream
for later use as an argument to
.IR seekdir .
.P
.I Seekdir
sets the position of the next
.I readdir
operation on the named
directory stream.
The new position reverts to the one associated with the
directory stream
when the
.I telldir
operation from which
.I loc
was obtained was performed.
.P
.I Rewinddir
resets the position of the named
directory stream
to the beginning of the directory.
All buffered data for the directory stream is discarded,
thereby guaranteeing that the actual
file system directory will be referred to for the next
.I readdir
on the
directory stream.
.P
.I Closedir
closes the named
directory stream;
internal resources used for the
directory stream are liberated,
and subsequent use of the associated
.SM DIR
object is no longer valid.
.I Closedir
returns a value of zero if no error occurs,
\-1 otherwise.
.P
There are several possible errors that can occur
as a result of these operations;
the external integer variable
.I errno
is set to indicate the specific error.
.RI ( Readdir 's
detection of the normal end of a directory
is not considered to be an error.)
.SH EXAMPLE
Sample code which searches the current working directory for entry
.IR name :
.P
.ft B
	dirp = opendir( "." );
.br
	while ( (dp = readdir( dirp )) != NULL )
.br
		if ( strcmp( dp\->d_name, name ) == 0 )
.br
			{
.br
			(void) closedir( dirp );
.br
			return FOUND;
.br
			}
.br
	(void) closedir( dirp );
.br
	return NOT_FOUND;
.ft P
.SH "SEE ALSO"
getdents(2), dirent(4).
.SH WARNINGS
Entries for "." and ".."
may not be reported for some file system types.
.P
The value returned by
.I telldir
need not have any simple interpretation
and should only be used as an argument to
.IR seekdir .
Similarly,
the
.I loc
argument to
.I seekdir
must be obtained from a previous
.I telldir
operation on the same
directory stream.
.P
.I Telldir
and
.I seekdir
are unreliable when used in conjunction with
file systems that perform directory compaction or expansion
or when the directory stream has been closed and reopened.
It is best to avoid using
.I telldir
and
.I seekdir
altogether.
.P
The exact set of
.I errno
values and meanings may vary among implementations.
.P
Because directory entries can dynamically
appear and disappear,
and because directory contents are buffered
by these routines,
an application may need to continually rescan
a directory to maintain an accurate picture
of its active entries.
END OF directory.3c
echo 'dirent.4' 1>&2
cat >'dirent.4' <<'END OF dirent.4'
.TH DIRENT 4 "Standard Extension"
.SH NAME
dirent \- file system independent directory entry
.SH SYNOPSIS
.B "#include <sys/types.h>"
.br
.B "#include <sys/dirent.h>"
.SH DESCRIPTION
Different file system types
may have different directory entries.
The
.I dirent
structure defines a
file system independent directory entry,
which contains information common to
directory entries in different file system types.
A set of these structures is returned by the
.IR getdents (2)
system call.
.P
The
.I dirent
structure is defined below.
.br
struct	dirent	{
.br
			long			d_ino;
.br
			off_t			d_off;
.br
			unsigned short		d_reclen;
.br
			char			d_name[1];
.br
		};
.P
The field
.I d_ino
is a number which is unique
for each file in the file system.
The field
.I d_off\^
represents an offset of that directory entry
in the actual file system directory.
The field
.I d_name
is the beginning of the character array
giving the name of the directory entry.
This name is null terminated
and may have at most
.SM NAME_MAX
characters in addition to the null terminator.
This results in file system independent directory entries
being variable-length entities.
The value of
.I d_reclen
is the record length of this entry.
This length is defined to be the number of bytes
between the beginning of the current entry and the next one,
adjusted so that the next entry
will start on a long boundary.
.SH FILES
/usr/include/sys/dirent.h
.SH "SEE ALSO"
getdents(2).
.SH WARNING
The field
.I d_off\^
does not have a simple interpretation
for some file system types
and should not be used directly by applications.
END OF dirent.4
echo 'getdents.2' 1>&2
cat >'getdents.2' <<'END OF getdents.2'
.TH GETDENTS 2 "Standard Extension"
.SH NAME
getdents \- get directory entries in a file system independent format
.SH SYNOPSIS
.B "#include <sys/types.h>"
.br
.B "#include <sys/dirent.h>"
.P
.B "int getdents (fildes, buf, nbyte)"
.br
.B "int fildes;"
.br
.B "char \(**buf;"
.br
.B "unsigned nbyte;"
.SH DESCRIPTION
.I Fildes
is a file descriptor obtained from an
.IR open (2)
or
.IR dup (2)
system call.
.P
.I Getdents
attempts to read
.I nbyte
bytes from the directory associated with
.I fildes
and to format them as
file system independent entries
in the buffer pointed to by
.IR buf .
Since the file system independent directory entries
are of variable length,
in most cases the actual number of bytes returned
will be less than
.IR nbyte .
.P
The file system independent directory entry is specified by the
.I dirent
structure.
For a description of this see
.IR dirent (4).
.P
On devices capable of seeking,
.I getdents
starts at a position in the file given by
the file pointer associated with
.IR fildes .
Upon return from
.IR getdents ,
the file pointer has been incremented
to point to the next directory entry.
.P
This system call was developed in order to implement the
.I readdir
routine
[for a description see
.IR directory (3C)]
and should not be used for other purposes.
.SH "SEE ALSO"
directory(3C), dirent(4).
.SH DIAGNOSTICS
Upon successful completion
a non-negative integer is returned
indicating the number of bytes of
.I buf\^
actually filled.
(This need not be the number actually used
in the actual directory file.)\|\|
A value of zero
indicates the end of the directory has been reached.
If
.I getdents
fails for any other reason,
a value of \-1 is returned and
the external integer variable
.I errno
is set to indicate the error.
.SH WARNINGS
Entries for "." and ".."
may not be reported for some file system types.
.P
The exact set of
.I errno
values and meanings may vary among implementations.
END OF getdents.2
echo 'dirent.h' 1>&2
cat >'dirent.h' <<'END OF dirent.h'
/*
	<dirent.h> -- definitions for SVR3 directory access routines

	last edit:	25-Apr-1987	D A Gwyn

	Prerequisite:	<sys/types.h>
*/

#include	<sys/dirent.h>

#define	DIRBUF		8192		/* buffer size for fs-indep. dirs */
	/* must in general be larger than the filesystem buffer size */

typedef struct
	{
	int	dd_fd;			/* file descriptor */
	int	dd_loc;			/* offset in block */
	int	dd_size;		/* amount of valid data */
	char	*dd_buf;		/* -> directory block */
	}	DIR;			/* stream data from opendir() */

extern DIR		*opendir();
extern struct dirent	*readdir();
extern off_t		telldir();
extern void		seekdir();
extern void		rewinddir();
extern int		closedir();

#ifndef NULL
#define	NULL	0			/* DAG -- added for convenience */
#endif
END OF dirent.h
echo 'sys.dirent.h' 1>&2
cat >'sys.dirent.h' <<'END OF sys.dirent.h'
/*
	<sys/dirent.h> -- file system independent directory entry (SVR3)

	last edit:	27-Oct-1988	D A Gwyn

	prerequisite:	<sys/types.h>
*/

struct dirent				/* data from getdents()/readdir() */
	{
	long		d_ino;		/* inode number of entry */
	off_t		d_off;		/* offset of disk directory entry */
	unsigned short	d_reclen;	/* length of this record */
	char		d_name[1];	/* name of file */	/* non-ANSI */
	};

#ifdef BSD_SYSV				/* (e.g., when compiling getdents.c) */
extern struct dirent	__dirent;	/* (not actually used) */
/* The following is portable, although rather silly. */
#define	DIRENTBASESIZ		(__dirent.d_name - (char *)&__dirent.d_ino)

#else
/* The following nonportable ugliness could have been avoided by defining
   DIRENTSIZ and DIRENTBASESIZ to also have (struct dirent *) arguments.
   There shouldn't be any problem if you avoid using the DIRENTSIZ() macro. */

#define	DIRENTBASESIZ		(((struct dirent *)0)->d_name \
				- (char *)&((struct dirent *)0)->d_ino)
#endif

#define	DIRENTSIZ( namlen )	((DIRENTBASESIZ + sizeof(long) + (namlen)) \
				/ sizeof(long) * sizeof(long))

/* DAG -- the following was moved from <dirent.h>, which was the wrong place */
#define	MAXNAMLEN	512		/* maximum filename length */

#ifndef NAME_MAX
#define	NAME_MAX	(MAXNAMLEN - 1)	/* DAG -- added for POSIX */
#endif
END OF sys.dirent.h
echo 'sys._dir.h' 1>&2
cat >'sys._dir.h' <<'END OF sys._dir.h'
/*
	<sys/_dir.h> -- definitions for 4.2,4.3BSD directories

	last edit:	25-Apr-1987	D A Gwyn

	A directory consists of some number of blocks of DIRBLKSIZ bytes each,
	where DIRBLKSIZ is chosen such that it can be transferred to disk in a
	single atomic operation (e.g., 512 bytes on most machines).

	Each DIRBLKSIZ-byte block contains some number of directory entry
	structures, which are of variable length.  Each directory entry has the
	beginning of a (struct direct) at the front of it, containing its
	filesystem-unique ident number, the length of the entry, and the length
	of the name contained in the entry.  These are followed by the NUL-
	terminated name padded to a (long) boundary with 0 bytes.  The maximum
	length of a name in a directory is MAXNAMELEN.

	The macro DIRSIZ(dp) gives the amount of space required to represent a
	directory entry.  Free space in a directory is represented by entries
	that have dp->d_reclen > DIRSIZ(dp).  All DIRBLKSIZ bytes in a
	directory block are claimed by the directory entries; this usually
	results in the last entry in a directory having a large dp->d_reclen.
	When entries are deleted from a directory, the space is returned to the
	previous entry in the same directory block by increasing its
	dp->d_reclen.  If the first entry of a directory block is free, then
	its dp->d_fileno is set to 0; entries other than the first in a
	directory do not normally have dp->d_fileno set to 0.

	prerequisite:	<sys/types.h>
*/

#if defined(accel) || defined(sun) || defined(vax)
#define	DIRBLKSIZ	512		/* size of directory block */
#else
#ifdef alliant
#define	DIRBLKSIZ	4096		/* size of directory block */
#else
#ifdef gould
#define	DIRBLKSIZ	1024		/* size of directory block */
#else
#ifdef ns32000	/* Dynix System V */
#define	DIRBLKSIZ	2600		/* size of directory block */
#else	/* be conservative; multiple blocks are okay but fractions are not */
#define	DIRBLKSIZ	4096		/* size of directory block */
#endif
#endif
#endif
#endif

#define	MAXNAMELEN	255		/* maximum filename length */
/* NOTE:  not MAXNAMLEN, which has been preempted by SVR3 <dirent.h> */

struct direct				/* data from read()/_getdirentries() */
	{
	unsigned long	d_fileno;	/* unique ident of entry */
	unsigned short	d_reclen;	/* length of this record */
	unsigned short	d_namlen;	/* length of string in d_name */
	char		d_name[MAXNAMELEN+1];	/* NUL-terminated filename */
	/* typically shorter */
	};

/*
	The DIRSIZ macro gives the minimum record length which will hold the
	directory entry.  This requires the amount of space in a (struct
	direct) without the d_name field, plus enough space for the name with a
	terminating NUL character, rounded up to a (long) boundary.

	(Note that Berkeley didn't properly compensate for struct padding,
	but we nevertheless have to use the same size as the actual system.)
*/

#define	DIRSIZ( dp )	((sizeof(struct direct) - (MAXNAMELEN+1) \
			+ sizeof(long) + (dp)->d_namlen) \
			/ sizeof(long) * sizeof(long))
END OF sys._dir.h
echo 'opendir.c' 1>&2
cat >'opendir.c' <<'END OF opendir.c'
/*
	opendir -- open a directory stream

	last edit:	27-Oct-1988	D A Gwyn
*/

#include	<sys/errno.h>
#include	<sys/types.h>
#include	<sys/stat.h>
#include	<dirent.h>

#ifdef BSD_SYSV
#define open	_open			/* avoid emulation overhead */
#endif

typedef char	*pointer;		/* (void *) if you have it */

extern void	free();
extern pointer	malloc();
extern int	open(), close(), fstat();

extern int	errno;

#ifndef NULL
#define	NULL	0
#endif

#ifndef O_RDONLY
#define	O_RDONLY	0
#endif

#ifndef S_ISDIR				/* macro to test for directory file */
#define	S_ISDIR( mode )		(((mode) & S_IFMT) == S_IFDIR)
#endif

DIR *
opendir( dirname )
	char			*dirname;	/* name of directory */
	{
	register DIR		*dirp;	/* -> malloc'ed storage */
	register int		fd;	/* file descriptor for read */
	/* The following is static just to keep the stack small. */
	static struct stat	sbuf;	/* result of fstat() */

	if ( (fd = open( dirname, O_RDONLY )) < 0 )
		return NULL;		/* errno set by open() */

	if ( fstat( fd, &sbuf ) != 0 || !S_ISDIR( sbuf.st_mode ) )
		{
		(void)close( fd );
		errno = ENOTDIR;
		return NULL;		/* not a directory */
		}

	if ( (dirp = (DIR *)malloc( sizeof(DIR) )) == NULL
	  || (dirp->dd_buf = (char *)malloc( (unsigned)DIRBUF )) == NULL
	   )	{
		register int	serrno = errno;
					/* errno set to ENOMEM by sbrk() */

		if ( dirp != NULL )
			free( (pointer)dirp );

		(void)close( fd );
		errno = serrno;
		return NULL;		/* not enough memory */
		}

	dirp->dd_fd = fd;
	dirp->dd_loc = dirp->dd_size = 0;	/* refill needed */

	return dirp;
	}
END OF opendir.c
echo 'readdir.c' 1>&2
cat >'readdir.c' <<'END OF readdir.c'
/*
	readdir -- read next entry from a directory stream

	last edit:	25-Apr-1987	D A Gwyn
*/

#include	<sys/errno.h>
#include	<sys/types.h>
#include	<dirent.h>

extern int	getdents();		/* SVR3 system call, or emulation */

extern int	errno;

#ifndef NULL
#define	NULL	0
#endif

struct dirent *
readdir( dirp )
	register DIR		*dirp;	/* stream from opendir() */
	{
	register struct dirent	*dp;	/* -> directory data */

	if ( dirp == NULL || dirp->dd_buf == NULL )
		{
		errno = EFAULT;
		return NULL;		/* invalid pointer */
		}

	do	{
		if ( dirp->dd_loc >= dirp->dd_size )	/* empty or obsolete */
			dirp->dd_loc = dirp->dd_size = 0;

		if ( dirp->dd_size == 0	/* need to refill buffer */
		  && (dirp->dd_size =
			getdents( dirp->dd_fd, dirp->dd_buf, (unsigned)DIRBUF )
		     ) <= 0
		   )
			return NULL;	/* EOF or error */

		dp = (struct dirent *)&dirp->dd_buf[dirp->dd_loc];
		dirp->dd_loc += dp->d_reclen;
		}
	while ( dp->d_ino == 0L );	/* don't rely on getdents() */

	return dp;
	}
END OF readdir.c
echo 'telldir.c' 1>&2
cat >'telldir.c' <<'END OF telldir.c'
/*
	telldir -- report directory stream position

	last edit:	25-Apr-1987	D A Gwyn

	NOTE:	4.nBSD directory compaction makes seekdir() & telldir()
		practically impossible to do right.  Avoid using them!
*/

#include	<sys/errno.h>
#include	<sys/types.h>
#include	<dirent.h>

extern off_t	lseek();

extern int	errno;

#ifndef SEEK_CUR
#define	SEEK_CUR	1
#endif

off_t
telldir( dirp )				/* return offset of next entry */
	DIR	*dirp;			/* stream from opendir() */
	{
	if ( dirp == NULL || dirp->dd_buf == NULL )
		{
		errno = EFAULT;
		return -1;		/* invalid pointer */
		}

	if ( dirp->dd_loc < dirp->dd_size )	/* valid index */
		return ((struct dirent *)&dirp->dd_buf[dirp->dd_loc])->d_off;
	else				/* beginning of next directory block */
		return lseek( dirp->dd_fd, (off_t)0, SEEK_CUR );
	}
END OF telldir.c
echo 'seekdir.c' 1>&2
cat >'seekdir.c' <<'END OF seekdir.c'
/*
	seekdir -- reposition a directory stream

	last edit:	24-May-1987	D A Gwyn

	An unsuccessful seekdir() will in general alter the current
	directory position; beware.

	NOTE:	4.nBSD directory compaction makes seekdir() & telldir()
		practically impossible to do right.  Avoid using them!
*/

#include	<sys/errno.h>
#include	<sys/types.h>
#include	<dirent.h>

extern off_t	lseek();

extern int	errno;

#ifndef NULL
#define	NULL	0
#endif

#ifndef SEEK_SET
#define	SEEK_SET	0
#endif

typedef int	bool;			/* Boolean data type */
#define	false	0
#define	true	1

void
seekdir( dirp, loc )
	register DIR	*dirp;		/* stream from opendir() */
	register off_t	loc;		/* position from telldir() */
	{
	register bool	rewind;		/* "start over when stymied" flag */

	if ( dirp == NULL || dirp->dd_buf == NULL )
		{
		errno = EFAULT;
		return;			/* invalid pointer */
		}

	/* A (struct dirent)'s d_off is an invented quantity on 4.nBSD
	   NFS-supporting systems, so it is not safe to lseek() to it. */

	/* Monotonicity of d_off is heavily exploited in the following. */

	/* This algorithm is tuned for modest directory sizes.  For
	   huge directories, it might be more efficient to read blocks
	   until the first d_off is too large, then back up one block,
	   or even to use binary search on the directory blocks.  I
	   doubt that the extra code for that would be worthwhile. */

	if ( dirp->dd_loc >= dirp->dd_size	/* invalid index */
	  || ((struct dirent *)&dirp->dd_buf[dirp->dd_loc])->d_off > loc
					/* too far along in buffer */
	   )
		dirp->dd_loc = 0;	/* reset to beginning of buffer */
	/* else save time by starting at current dirp->dd_loc */

	for ( rewind = true; ; )
		{
		register struct dirent	*dp;

		/* See whether the matching entry is in the current buffer. */

		if ( (dirp->dd_loc < dirp->dd_size	/* valid index */
		   || readdir( dirp ) != NULL	/* next buffer read */
		   && (dirp->dd_loc = 0, true)	/* beginning of buffer set */
		     )
		  && (dp = (struct dirent *)&dirp->dd_buf[dirp->dd_loc])->d_off
			<= loc		/* match possible in this buffer */
		   )	{
			for ( /* dp initialized above */ ;
			      (char *)dp < &dirp->dd_buf[dirp->dd_size];
			      dp = (struct dirent *)((char *)dp + dp->d_reclen)
			    )
				if ( dp->d_off == loc )
					{	/* found it! */
					dirp->dd_loc =
						(char *)dp - dirp->dd_buf;
					return;
					}

			rewind = false;	/* no point in backing up later */
			dirp->dd_loc = dirp->dd_size;	/* set end of buffer */
			}
		else			/* whole buffer past matching entry */
			if ( !rewind )
				{	/* no point in searching further */
				errno = EINVAL;
				return;	/* no entry at specified loc */
				}
			else	{	/* rewind directory and start over */
				rewind = false;	/* but only once! */

				dirp->dd_loc = dirp->dd_size = 0;

				if ( lseek( dirp->dd_fd, (off_t)0, SEEK_SET )
					!= 0
				   )
					return;	/* errno already set (EBADF) */

				if ( loc == 0 )
					return; /* save time */
				}
		}
	}
END OF seekdir.c
echo 'rewinddir.c' 1>&2
cat >'rewinddir.c' <<'END OF rewinddir.c'
/*
	rewinddir -- rewind a directory stream

	last edit:	25-Apr-1987	D A Gwyn

	This is not simply a call to seekdir(), because seekdir()
	will use the current buffer whenever possible and we need
	rewinddir() to forget about buffered data.
*/

#include	<sys/errno.h>
#include	<sys/types.h>
#include	<dirent.h>

extern off_t	lseek();

extern int	errno;

#ifndef NULL
#define	NULL	0
#endif

#ifndef SEEK_SET
#define	SEEK_SET	0
#endif

void
rewinddir( dirp )
	register DIR		*dirp;	/* stream from opendir() */
	{
	if ( dirp == NULL || dirp->dd_buf == NULL )
		{
		errno = EFAULT;
		return;			/* invalid pointer */
		}

	dirp->dd_loc = dirp->dd_size = 0;	/* invalidate buffer */
	(void)lseek( dirp->dd_fd, (off_t)0, SEEK_SET );	/* may set errno */
	}
END OF rewinddir.c
echo 'closedir.c' 1>&2
cat >'closedir.c' <<'END OF closedir.c'
/*
	closedir -- close a directory stream

	last edit:	11-Nov-1988	D A Gwyn
*/

#include	<sys/errno.h>
#include	<sys/types.h>
#include	<dirent.h>

typedef char	*pointer;		/* (void *) if you have it */

extern void	free();
extern int	close();

extern int	errno;

#ifndef NULL
#define	NULL	0
#endif

int
closedir( dirp )
	register DIR	*dirp;		/* stream from opendir() */
	{
	register int	fd;

	if ( dirp == NULL || dirp->dd_buf == NULL )
		{
		errno = EFAULT;
		return -1;		/* invalid pointer */
		}

	fd = dirp->dd_fd;		/* bug fix thanks to R. Salz */
	free( (pointer)dirp->dd_buf );
	free( (pointer)dirp );
	return close( fd );
	}
END OF closedir.c
echo 'getdents.c' 1>&2
cat >'getdents.c' <<'END OF getdents.c'
/*
	getdents -- get directory entries in a file system independent format
			(SVR3 system call emulation)

	last edit:	27-Oct-1988	D A Gwyn

	This single source file supports several different methods of
	getting directory entries from the operating system.  Define
	whichever one of the following describes your system:

	UFS	original UNIX filesystem (14-character name limit)
	BFS	4.2BSD (also 4.3BSD) native filesystem (long names)
	NFS	getdirentries() system call

	Also define any of the following flags that are pertinent:

	ATT_SPEC	check user buffer address for longword alignment
	BSD_SYSV	BRL UNIX System V emulation environment on 4.nBSD
	INT_SIGS	<signal.h> thinks that signal handlers have
			return type int (rather than the standard void)
	NEG_DELS	deleted entries have inode number -1 rather than 0
	UNK		have _getdents() system call, but kernel may not
			support it

	If your C library has a getdents() system call interface, but you
	can't count on all kernels on which your application binaries may
	run to support it, change the system call interface name to
	_getdents() and define "UNK" to enable the system-call validity
	test in this "wrapper" around _getdents().

	If your system has a getdents() system call that is guaranteed
	to always work, you shouldn't be using this source file at all.
*/

#include	<sys/errno.h>
#include	<sys/types.h>
#ifdef BSD_SYSV
#include	<sys/_dir.h>		/* BSD flavor, not System V */
#else
#include	<sys/dir.h>
#undef	MAXNAMLEN			/* avoid conflict with SVR3 */
	/* Good thing we don't need to use the DIRSIZ() macro! */
#ifdef d_ino				/* 4.3BSD/NFS using d_fileno */
#undef	d_ino				/* (not absolutely necessary) */
#else
#define	d_fileno	d_ino		/* (struct direct) member */
#endif
#endif
#include	<sys/dirent.h>
#include	<sys/stat.h>
#ifdef UNK
#ifndef UFS
#include "***** ERROR ***** UNK applies only to UFS"
/* One could do something similar for getdirentries(), but I didn't bother. */
#endif
#include	<signal.h>
#endif

#if defined(UFS) + defined(BFS) + defined(NFS) != 1	/* sanity check */
#include "***** ERROR ***** exactly one of UFS, BFS, or NFS must be defined"
#endif

#ifdef BSD_SYSV
struct dirent	__dirent;		/* (just for the DIRENTBASESIZ macro) */
#endif

#ifdef UFS
#define	RecLen( dp )	(sizeof(struct direct))	/* fixed-length entries */
#else	/* BFS || NFS */
#define	RecLen( dp )	((dp)->d_reclen)	/* variable-length entries */
#endif

#ifdef NFS
#ifdef BSD_SYSV
#define	getdirentries	_getdirentries	/* package hides this system call */
#endif
extern int	getdirentries();
static long	dummy;			/* getdirentries() needs basep */
#define	GetBlock( fd, buf, n )	getdirentries( fd, buf, (unsigned)n, &dummy )
#else	/* UFS || BFS */
#ifdef BSD_SYSV
#define read	_read			/* avoid emulation overhead */
#endif
extern int	read();
#define	GetBlock( fd, buf, n )	read( fd, buf, (unsigned)n )
#endif

#ifdef UNK
extern int	_getdents();		/* actual system call */
#endif

extern char	*strncpy();
extern int	fstat();
extern off_t	lseek();

extern int	errno;

#ifdef NEG_DELS
#define	DELETED	(-1)
#else
#define	DELETED	0
#endif

#ifndef DIRBLKSIZ
#define	DIRBLKSIZ	4096		/* directory file read buffer size */
#endif

#ifndef NULL
#define	NULL	0
#endif

#ifndef SEEK_CUR
#define	SEEK_CUR	1
#endif

#ifndef S_ISDIR				/* macro to test for directory file */
#define	S_ISDIR( mode )		(((mode) & S_IFMT) == S_IFDIR)
#endif

#ifdef UFS

/*
	The following routine is necessary to handle DIRSIZ-long entry names.
	Thanks to Richard Todd for pointing this out.
*/

static int
NameLen( name )				/* return # chars in embedded name */
	char		name[];		/* -> name embedded in struct direct */
	{
	register char	*s;		/* -> name[.] */
	register char	*stop = &name[DIRSIZ];	/* -> past end of name field */

	for ( s = &name[1];		/* (empty names are impossible) */
	      *s != '\0'		/* not NUL terminator */
	   && ++s < stop;		/* < DIRSIZ characters scanned */
	    )
		;

	return s - name;		/* # valid characters in name */
	}

#else	/* BFS || NFS */

extern int	strlen();

#define	NameLen( name )	strlen( name )	/* names are always NUL-terminated */

#endif

#ifdef UNK
static enum	{ maybe, no, yes }	state = maybe;
					/* does _getdents() work? */

#ifdef INT_SIGS
#define	RET_SIG	int
#else
#define	RET_SIG	void
#endif

/*ARGSUSED*/
static RET_SIG
sig_catch( sig )
	int	sig;			/* must be SIGSYS */
	{
	state = no;			/* attempted _getdents() faulted */
#ifdef INT_SIGS
	return 0;			/* telling lies */
#endif
	}
#endif	/* UNK */

int
getdents( fildes, buf, nbyte )		/* returns # bytes read;
					   0 on EOF, -1 on error */
	int			fildes;	/* directory file descriptor */
	char			*buf;	/* where to put the (struct dirent)s */
	unsigned		nbyte;	/* size of buf[] */
	{
	int			serrno;	/* entry errno */
	off_t			offset;	/* initial directory file offset */
	/* The following are static just to keep the stack small. */
	static struct stat	statb;	/* fstat() info */
	static union
		{
		char		dblk[DIRBLKSIZ
#ifdef UFS
				     +1	/* for last entry name terminator */
#endif
				    ];
					/* directory file block buffer */
		struct direct	dummy;	/* just for alignment */
		}	u;		/* (avoids having to malloc()) */
	register struct direct	*dp;	/* -> u.dblk[.] */
	register struct dirent	*bp;	/* -> buf[.] */

#ifdef UNK
	if ( state == yes )		/* _getdents() is known to work */
		return _getdents( fildes, buf, nbyte );

	if ( state == maybe )		/* first time only */
		{
		RET_SIG		(*shdlr)();	/* entry SIGSYS handler */
		register int	retval;	/* return from _getdents() if any */

		shdlr = signal( SIGSYS, sig_catch );
		retval = _getdents( fildes, buf, nbyte );	/* try it */
		(void)signal( SIGSYS, shdlr );

		if ( state == maybe )	/* SIGSYS did not occur */
			{
			state = yes;	/* so _getdents() must have worked */
			return retval;
			}
		}

	/* state == no; perform emulation */
#endif

	if ( buf == NULL
#ifdef ATT_SPEC
	  || (unsigned long)buf % sizeof(long) != 0	/* ugh */
#endif
	   )	{
		errno = EFAULT;		/* invalid pointer */
		return -1;
		}

	if ( fstat( fildes, &statb ) != 0 )
		return -1;		/* errno set by fstat() */

	if ( !S_ISDIR( statb.st_mode ) )
		{
		errno = ENOTDIR;	/* not a directory */
		return -1;
		}

	if ( (offset = lseek( fildes, (off_t)0, SEEK_CUR )) < 0 )
		return -1;		/* errno set by lseek() */

#ifdef BFS				/* no telling what remote hosts do */
	if ( (unsigned long)offset % DIRBLKSIZ != 0 )
		{
		errno = ENOENT;		/* file pointer probably misaligned */
		return -1;
		}
#endif

	serrno = errno;			/* save entry errno */

	for ( bp = (struct dirent *)buf; bp == (struct dirent *)buf; )
		{			/* convert next directory block */
		int	size;

		do	size = GetBlock( fildes, u.dblk, DIRBLKSIZ );
		while ( size == -1 && errno == EINTR );

		if ( size <= 0 )
			return size;	/* EOF or error (EBADF) */

		for ( dp = (struct direct *)u.dblk;
		      (char *)dp < &u.dblk[size];
		      dp = (struct direct *)((char *)dp + RecLen( dp ))
		    )	{
#ifndef UFS
			if ( dp->d_reclen <= 0 )
				{
				errno = EIO;	/* corrupted directory */
				return -1;
				}
#endif

			if ( dp->d_fileno != DELETED )
				{	/* non-empty; copy to user buffer */
				register int	reclen =
					DIRENTSIZ( NameLen( dp->d_name ) );

				if ( (char *)bp + reclen > &buf[nbyte] )
					{
					errno = EINVAL;
					return -1;	/* buf too small */
					}

				bp->d_ino = dp->d_fileno;
				bp->d_off = offset + ((char *)dp - u.dblk);
				bp->d_reclen = reclen;

				{
#ifdef UFS
				/* Is the following kludge ugly?  You bet. */

				register char	save = dp->d_name[DIRSIZ];
					/* save original data */

				dp->d_name[DIRSIZ] = '\0';
					/* ensure NUL termination */
#endif
				(void)strncpy( bp->d_name, dp->d_name,
					       reclen - DIRENTBASESIZ
					     );	/* adds NUL padding */
#ifdef UFS
				dp->d_name[DIRSIZ] = save;
					/* restore original data */
#endif
				}

				bp = (struct dirent *)((char *)bp + reclen);
				}
			}

#if !(defined(BFS) || defined(sun))	/* 4.2BSD screwed up; fixed in 4.3BSD */
		if ( (char *)dp > &u.dblk[size] )
			{
			errno = EIO;	/* corrupted directory */
			return -1;
			}
#endif
		}

	errno = serrno;			/* restore entry errno */
	return (char *)bp - buf;	/* return # bytes read */
	}
END OF getdents.c
echo 'testdir.c' 1>&2
cat >'testdir.c' <<'END OF testdir.c'
/*
	testdir -- basic test for C library directory access routines

	last edit:	25-Apr-1987	D A Gwyn
*/

#include	<sys/types.h>
#include	<stdio.h>
#include	<dirent.h>

extern void	exit();
extern int	strcmp();

int
main( argc, argv )
	int			argc;
	register char		**argv;
	{
	register DIR		*dirp;
	register struct dirent	*dp;
	int			nerrs = 0;	/* total not found */

	if ( (dirp = opendir( "." )) == NULL )
		{
		(void)fprintf( stderr, "Cannot open \".\" directory\n" );
		exit( 1 );
		}

	while ( --argc > 0 )
		{
		++argv;

		while ( (dp = readdir( dirp )) != NULL )
			if ( strcmp( dp->d_name, *argv ) == 0 )
				{
				(void)printf( "\"%s\" found.\n", *argv );
				break;
				}

		if ( dp == NULL )
			{
			(void)printf( "\"%s\" not found.\n", *argv );
			++nerrs;
			}

		rewinddir( dirp );
		}

	(void)closedir( dirp );
	exit( nerrs );
	}
END OF testdir.c
exit 0
-- 
David H. Brierley
Home: dave@galaxia.Newport.RI.US   {rayssd,xanth,lazlo,jclyde}!galaxia!dave
Work: dhb@rayssd.ray.com           {sun,decuac,gatech,necntc,ukma}!rayssd!dhb