[comp.sys.hp] DIRECTORY

burzio@mmlai.UUCP (Tony Burzio) (11/24/89)

System: HP-UX 6.5 B 9000/370

While trying out the C DIRECTORY(3C) routines, I noticed
some very strange behavior.  When you do a readdir,
the files are returned in what seems to be random
order.  At first it appeared that they return in the
order that the files are created, but new files pop
up wherever.  The DIR pointer is not closed until
NULL is returned from readdir.  Has anybody else used
this code and been able to extract file names in
alphabetical order?  Where should I look for more info
besides the HP-UX reference?  On a related note, is it
possible to run a sort on a tmpfile() created file?  I
really need the filenames in order...  Thanks in advance.

*********************************************************************
Tony Burzio               *  SNOW!  On Thanksgiving?  Ski time!
Martin Marietta Labs      *
mmlai!burzio@uunet.uu.net *
*********************************************************************

donn@hpfcdc.HP.COM (Donn Terry) (11/25/89)

(Almost?) all UNIX systems, not just HP-UX, return the directory entries in
unspecified order.  (The actual order is a function of the order
in which the entries were created (and removed) and of the length of
the entries if you are using long file names.)

The simplest (although not the most efficient in terms of system
resources) solution to get what you want would be

	foo = popen("ls", "r");

and do freads from foo to get the names.
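
A minimal sketch of the popen() approach, assuming a POSIX-style libc.
The helper name read_names and the NAME_MAX_LEN limit are made up for
illustration; fgets() is used rather than fread() so that each name
arrives as one line.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <string.h>

#define NAME_MAX_LEN 256   /* hypothetical per-name buffer size */

/* Run `cmd` (for example "ls") and store up to `max` lines of its output
   in names[].  Returns the number of lines read, or -1 if popen() fails. */
int read_names(const char *cmd, char names[][NAME_MAX_LEN], int max)
{
    FILE *fp = popen(cmd, "r");
    int n = 0;

    if (fp == NULL)
        return -1;
    while (n < max && fgets(names[n], NAME_MAX_LEN, fp) != NULL) {
        names[n][strcspn(names[n], "\n")] = '\0';   /* strip the newline */
        n++;
    }
    pclose(fp);
    return n;
}
```

Calling read_names("ls", names, max) lists the current directory in
whatever order ls prints it, which is normally sorted.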

You could read them all with readdir() and use qsort(), or copy them
to an ordinary file and use the sort command via system().
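
A rough sketch of the readdir() plus qsort() route.  It assumes the POSIX
<dirent.h> interface (struct dirent); older systems may need <sys/dir.h>
and struct direct instead.  The function name sorted_names is invented
for this example.

```c
#include <dirent.h>
#include <stdlib.h>
#include <string.h>

/* qsort() comparator for an array of char pointers. */
static int cmp_names(const void *a, const void *b)
{
    return strcmp(*(char *const *)a, *(char *const *)b);
}

/* Read every entry of `path`, copy the names, and sort them.
   Returns a malloc'd array (count stored in *count), or NULL if the
   directory cannot be opened.  If memory runs out part way through,
   the names gathered so far are returned. */
char **sorted_names(const char *path, size_t *count)
{
    DIR *dp = opendir(path);
    struct dirent *ent;
    char **names = NULL;
    size_t n = 0, cap = 0;

    *count = 0;
    if (dp == NULL)
        return NULL;
    while ((ent = readdir(dp)) != NULL) {
        if (n == cap) {                       /* grow the pointer array */
            size_t newcap = cap ? cap * 2 : 16;
            char **tmp = realloc(names, newcap * sizeof *tmp);
            if (tmp == NULL)
                break;
            names = tmp;
            cap = newcap;
        }
        /* d_name may be overwritten by the next readdir(), so copy it */
        names[n] = malloc(strlen(ent->d_name) + 1);
        if (names[n] == NULL)
            break;
        strcpy(names[n], ent->d_name);
        n++;
    }
    closedir(dp);
    if (n > 1)
        qsort(names, n, sizeof *names, cmp_names);
    *count = n;
    return names;
}
```

The caller frees each names[i] and then names itself.  The "." and ".."
entries are included, as readdir() reports them.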

Donn Terry
HP Ft. Collins
(No, I'm not speaking officially for HP, or anyone else for that matter.)

bae@hp-lsd.COS.HP.COM (Bruce Erickson) (11/26/89)

>While trying out the C DIRECTORY(3C) routines, I noticed
>some very strange behavior.  When you do a readdir,
>the files are returned in what seems to be random
>order.  At first it appeared that they return in the
>order that the files are created, but new files pop
>up wherever.

I'm not an expert, but I believe that 'readdir' returns the files in the
order that they are in the directory inode, which I think is essentially
an array of file names, which are placed in the first unused slot.  So, when
a directory is first created the files are in first-come first-slot basis;
after a file is removed, however, I believe that the next created file gets
the slot freed up by the removed file.

(I am remembering this from an O/S class 8 years ago, so I may be mis-
remembering this!)

If you need the files in alphabetical order, read them in but stick them
in a binary tree (use tsearch(3C) for ease of implementation) then read
them back from the binary tree....
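
A small sketch of the binary-tree idea, assuming the System V <search.h>
routines are available.  tsearch() builds the tree and twalk() reads it
back in order (bsearch() only searches an array that is already sorted).
The helper names and the fixed-size `ordered` buffer are hypothetical.

```c
#include <search.h>
#include <string.h>

#define MAX_NAMES 64                      /* hypothetical output limit */

static void *root = NULL;                 /* tree root kept by tsearch() */
static const char *ordered[MAX_NAMES];    /* names in sorted order */
static int n_ordered = 0;

static int cmp(const void *a, const void *b)
{
    return strcmp((const char *)a, (const char *)b);
}

/* twalk() callback: visiting at postorder/leaf yields an in-order walk. */
static void collect(const void *nodep, VISIT which, int depth)
{
    (void)depth;
    if ((which == postorder || which == leaf) && n_ordered < MAX_NAMES)
        ordered[n_ordered++] = *(const char *const *)nodep;
}

/* Insert a name.  The tree stores the pointer itself, so a name taken
   from readdir() must be copied before it is passed in. */
int add_name(const char *name)
{
    return tsearch(name, &root, cmp) != NULL;
}

/* Fill ordered[] from the tree; returns the number of names stored. */
int names_in_order(void)
{
    n_ordered = 0;
    twalk(root, collect);
    return n_ordered;
}
```

twalk() takes no user-data argument, which is why the sketch collects
results through file-scope variables.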


                              Bruce Erickson
                              hp-lsd!bae

markf@hpupnja.HP.COM (Mark Fresolone) (11/26/89)

The 'ls' command has us all spoiled!  It turns out that directory entries
are not actually stored in any order.  The directory(3) routines simply
report 'em as they see 'em.

Entries are kept rather primitively in a static array.  At the beginning of
a directory's life, files appear in creation order.  However, when a file or
directory is "removed", its entry is simply marked unused (traditionally,
inode == 0).  The next file created fills the first available unused slot,
allocating a new block of entries when necessary.

To visualize the behavior, create a directory, and then create and delete
files, using "strings ." or "od -c ." between operations.
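
The slot reuse described above can be mimicked with a toy model.  This is
only an illustration of the first-unused-slot rule, not real filesystem
code: the struct layout, NSLOTS, and the function names are all invented,
and the toy simply reports "full" instead of allocating a new block.

```c
#include <string.h>

#define NSLOTS  8            /* hypothetical fixed directory size */
#define NAMELEN 14           /* classic short file name limit */

struct slot {
    int  ino;                /* 0 marks the slot as unused */
    char name[NAMELEN + 1];
};

static struct slot dir[NSLOTS];
static int next_ino = 1;

/* Create: take the first unused slot.  Returns its index, or -1 if full. */
int toy_create(const char *name)
{
    int i;

    for (i = 0; i < NSLOTS; i++) {
        if (dir[i].ino == 0) {
            dir[i].ino = next_ino++;
            strncpy(dir[i].name, name, NAMELEN);
            dir[i].name[NAMELEN] = '\0';
            return i;
        }
    }
    return -1;
}

/* Remove: mark the slot unused.  Returns its index, or -1 if not found. */
int toy_remove(const char *name)
{
    int i;

    for (i = 0; i < NSLOTS; i++) {
        if (dir[i].ino != 0 && strcmp(dir[i].name, name) == 0) {
            dir[i].ino = 0;
            return i;
        }
    }
    return -1;
}
```

Creating a, b, c, then removing b and creating d leaves the slots holding
a, d, c, which is the order a readdir-style scan of the toy array yields.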

Good coding!

#include <disclaimer.h>
Mark Fresolone
hplabs!hpfcse!hpupnja!markf

fkittred@bbn.com (Fletcher Kittredge) (11/26/89)

In article <616@mmlai.UUCP> burzio@mmlai.UUCP (Tony Burzio) writes:
>System: HP-UX 6.5 B 9000/370
>
>While trying out the C DIRECTORY(3C) routines, I noticed
>some very strange behavior.  When you do a readdir,
>the files are returned in what seems to be random
>order.  At first it appeared that they return in the
>order that the files are created, but new files pop
>up wherever.  The DIR pointer is not closed until
>NULL is returned from readdir.  Has anybody else used
>this code and been able to extract file names in
>alphabetical order?  Where should I look for more info
>besides the HP-UX reference?  On a related note, is it
>possible to run a sort on a tmpfile() created file?  I
>really need the filenames in order...  Thanks in advance.

Tony;

	If you intend to do programming on a Unix system, I strongly recommend
you buy "The Unix Programming Environment" (Kernighan and Pike, 1984),
and "The Design of the Unix Operating System" (Bach).  The basic answer to
your question is that you are assuming that directories are stored in
sorted order.  They are not; there is no reason for them to be stored
sorted.  You can look at the order they are stored in by entering the
command:

% strings <directory> |more 

For a justification of this design decision, any good intro to
algorithms would help; my favorites are "Data Structures and Algorithms" 
(Aho, Hopcroft and Ullman, 1983), and Sedgewick's "Algorithms" (1988?).
These would also be good places to look for sort algorithms to get
your file names in order!

happy hunting,
fletcher

P.S. If I read the manual correctly, you won't be able to sort a file
created by tmpfile(), since it disappears from the file system immediately
after it is created.
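
One possible workaround, sketched under the assumption that POSIX
mkstemp() and a sort(1) that accepts -o are available: write the names to
a temporary file that does have a name, then sort that file in place via
system().  The function name write_sorted is made up for this example.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>

/* Write n names to a new temp file and sort it in place with sort(1).
   `path` must be a writable template ending in "XXXXXX"; mkstemp() fills
   in the real name.  Returns 0 on success, -1 on failure.  The caller
   removes the file with unlink() when done. */
int write_sorted(char *path, const char *const *names, int n)
{
    char cmd[512];
    FILE *fp;
    int i, fd = mkstemp(path);

    if (fd < 0)
        return -1;
    fp = fdopen(fd, "w");
    if (fp == NULL) {
        close(fd);
        unlink(path);
        return -1;
    }
    for (i = 0; i < n; i++)
        fprintf(fp, "%s\n", names[i]);
    if (fclose(fp) != 0) {
        unlink(path);
        return -1;
    }
    /* the template is chosen by the caller, so simple quoting is enough */
    snprintf(cmd, sizeof cmd, "sort -o '%s' '%s'", path, path);
    return system(cmd) == 0 ? 0 : -1;
}
```

Unlike a tmpfile() file, this one stays visible in the file system until
it is unlinked, so an external command can find it by name.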
Fletcher E. Kittredge  fkittred@bbn.com

rer@hpfcdc.HP.COM (Rob Robason) (11/30/89)

> System: HP-UX 6.5 B 9000/370

> Has anybody else used this code and been able to extract file names in
> alphabetical order?

The 6.5 release has an undocumented function in libc called scandir and
another called alphasort.  These derive from BSD and will read an entire
directory and sort it alphabetically.  Note that use of scandir and
alphasort is not yet supported on the 800, not for any particular reason
-- just that it got into the 300 more by accident than plan.  Attached
is the reference page for scandir.

HP hasn't tested the scandir function in 6.5, and since it's not
documented it is officially "unsupported", but the sample application
shown at the bottom of this response does seem to work.

> I really need the filenames in order...  Thanks in advance.

> Tony Burzio               *  SNOW!  On Thanksgiving?  Ski time!

Rob Robason		* Went skiing yesterday at Keystone -- GREAT!!
Not speaking for HP, just trying to be helpful
########################################################################

     SCANDIR(3)	    (September 17, 1985)	    SCANDIR(3)

     NAME
	  scandir, alphasort - scan a directory

     SYNOPSIS
	  #include <sys/types.h>
	  #include <sys/dir.h>

	  scandir(dirname, namelist, select, compar)
	  char *dirname;
	  struct direct *(*namelist[]);
	  int (*select)();
	  int (*compar)();

	  alphasort(d1, d2)
	  struct direct **d1, **d2;

     DESCRIPTION
	  Scandir reads the directory dirname and builds an array of
	  pointers to directory entries using malloc(3).  It returns
	  the number of entries in the array and a pointer to the
	  array through namelist.

	  The select parameter is a pointer to a user supplied
	  subroutine which is called by scandir to select which
	  entries are to be included in the array.  The select routine
	  is passed a pointer to a directory entry and should return a
	  non-zero value if the directory entry is to be included in
	  the array.  If select is null, then all the directory
	  entries will be included.

	  The compar parameter is a pointer to a user supplied
	  subroutine which is passed to qsort(3) to sort the completed
	  array. If this pointer is null, the array is not sorted.
	  Alphasort is a routine which can be used for the compar
	  parameter to sort the array alphabetically.

	  The memory allocated for the array can be deallocated with
	  free (see malloc(3)) by freeing each pointer in the array
	  and the array itself.

     SEE ALSO
	  directory(3), malloc(3), qsort(3), dir(5)

     DIAGNOSTICS
	  Returns -1 if the directory cannot be opened for reading or
	  if malloc(3) cannot allocate enough memory to hold all the
	  data structures.

     Hewlett-Packard Company	   - 1 -		  Nov 29, 1989

##########################################################################
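#include <stdio.h>
#include <sys/types.h>
#include <sys/dir.h>

int alphasort();

int main()
{
	struct direct **namelist;
	int i, n;

	/* scandir returns the number of entries; the array it builds
	   is not guaranteed to end with a null pointer */
	n = scandir(".", &namelist, (int (*)()) 0, alphasort);
	if (n < 0) {
		perror("scandir");
		return 1;
	}
	for (i = 0; i < n; i++)
		printf("%s\n", namelist[i]->d_name);
	return 0;
}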
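
The select hook from the reference page can be sketched as follows.  This
version assumes the later POSIX form of scandir() (<dirent.h>, struct
dirent, prototyped callbacks) rather than the 6.5 one; the names
skip_hidden and print_visible are invented for illustration.

```c
#define _POSIX_C_SOURCE 200809L
#include <dirent.h>
#include <stdio.h>
#include <stdlib.h>

/* select routine: non-zero keeps the entry; here dot files are dropped. */
static int skip_hidden(const struct dirent *e)
{
    return e->d_name[0] != '.';
}

/* Print the non-hidden names in `path`, sorted.  Returns the number of
   names printed, or -1 if scandir() fails. */
int print_visible(const char *path)
{
    struct dirent **list;
    int i, n = scandir(path, &list, skip_hidden, alphasort);

    if (n < 0)
        return -1;
    for (i = 0; i < n; i++) {
        printf("%s\n", list[i]->d_name);
        free(list[i]);          /* free each entry ... */
    }
    free(list);                 /* ... and then the array itself */
    return n;
}
```

The two free() steps follow the reference page: each pointer in the array
is released first, then the array.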