[comp.sys.amiga] Fast Directory program

RFORSTER@UNCAEDU.BITNET (05/20/87)

From: <RFORSTER%UNCAEDU.BITNET@wiscvm.wisc.edu>

        Steve Vermeulen asked me to post this to the net as he does
        not have write access to comp.sys.amiga.  Note if you don't get
        a .signature at the end, you didn't get the whole thing.

        /Russ

---- cut here ----- cut here ----- cut here ----- cut here ----- cut here ----
/************************************************************
  fd.c        - fast directory

  By Stephen Vermeulen

  3635 Utah Dr., N.W.,
  Calgary, Alberta, CANADA.
  (403) 282-7990

  This code adds to Leo Schwab's attempt at making a faster
  dir command with "eless".

  What I have done is to use Leo's program to track the way
  the directory blocks are scattered about the disk and to
  realize that with his near-optimum approach there will
  still be several read attempts on each track.  This is a
  result of the hash chains.  For example if track A contains
  8 directory blocks, then at the first level of the hash
  chain structure only 3 of these blocks may be read.  Then
  at another level or two in the hash chain the other 5
  directory blocks are accessed.  This results in many extra
  seeks and reads of the same track.

  So what I have done is to process the ENTIRE track (both
  surfaces or 22 blocks) once it is addressed by a hash
  chain.  When I process a track I save the needed information
  in an instance of the "disk_block" structure and chain it
  onto the appropriate track (actually cylinder) pointer.
  Only blocks which represent UserDirectories or FileHeaders
  are recorded, the others are ignored.  Since there could be
  blocks from several directories intermixed on the same track
  we save only the information about blocks which belong to
  our directory.  We do this by checking that DOS's parent
  directory back pointer points to the initial block of our
  directory.

  Then when I detect a read of a sector on a track that has
  already been processed I just look it up in this structure
  rather than having to seek back to the track.  This results
  in about a 30-50% improvement in speed (over eless) on large directories.
  For example, a directory with 223 files in it takes:

         fd      22 seconds
         eless   30 seconds
         dir     81 seconds.

  This routine may actually be slower on small directories.

  Additional improvements in speed may be obtained by going for the
  track disk's buffer directly when determining which of the blocks
  actually need to be recorded for possible future use.  Also, it
  should be possible to pre-read the next track while processing the
  current track.

  Thanks Leo, sorry I messed with your TABBING, I hate tabs...
  (It still compiles under MANX 3.4A no sweat.)

  Will someone add the additional generality so this program can
  print the directories of NON-BLOCK devices too?

  KNOWN BUG:

     It seems if you try to read blocks on a track that have never
     been used before you get a System Requester with the message:

         Volume
         test
         has a read/write error.

     Unfortnately, one does not seem to be able to turn off these
     error messages (even using the Process structure's WindowPtr
     field).  This is not fatal, but is dammed inconvenient; you
     just have to keep clicking cancel to jump those blocks.
     To see this happen, format a disk then copy one file onto it.
     Then do a "fd df0:" of the disk.  You will probably get a few
     of these requesters.  Some of the Fish Disk also have a few
     blocks that have never been used either.  Note: DiskEd gives
     two different types upon reading one of these blocks:
     Corrupt or Deleted Corrupt when the info (i) command is used
     to examine the block.

******************************************************************/


/* :ts=8 bk=0
 *
 * eless.c:     An attempt at a reasonable-speed 'ls' program by fiddling
 *              with the DOS.
 *
 *              Compiles under Manx 3.20 and patched 3.40.
 *              cc eless.c
 *              ln eless.c -lc -o eless
 *
 * Leo L. Schwab                8704.26         (415)-456-6565
 */

#include <exec/types.h>
#include <exec/memory.h>
#include <libraries/dosextens.h>
#include <stdio.h>

#define BLOCKSIZE       128
#define STARTTAB        6
#define ENDTAB          (BLOCKSIZE-51)
#define TABSIZ          (BLOCKSIZE-56)
#define FILENAME        (BLOCKSIZE-20)
#define HASHCHAIN       (BLOCKSIZE-4)
#define SECONDARY       (BLOCKSIZE-1)
#define PARENT          (BLOCKSIZE-3)
#define ST_DIR          2
#define ST_FILE         -3
#define ST_SHORT        2

struct disk_block
{
  struct disk_block *db;
  long type;
  char name[64];
  long chain;
  long number;
};

/***************************************************
  The following array holds the base pointers to the
  blocks that have been read from each track.  At the
  start this will be set to all NULLs, then as each
  track is read in the pointer to the list of disk
  blocks that contain relevant information for that
  track is added into the array.  In this way we
  seek to, and read each track only once.  Hence, if
  we want a single block on a new track we scan all
  the blocks on that track, and record any of the blocks
  that are UserDirectory or FileHeader type blocks
  in that track in this list.
*****************************************************/

struct disk_block *tracks[80];

/*
 * Note:  I usually declare Amiga functions this way to disable the compiler's
 * type checking.  This allows me to assign values to variables without
 * explicitly casting them to the appropriate type.  In reality, these
 * functions return things entirely different from the way I've declared
 * them.  For example, CreatePort() should really be declared as:
 *
 * struct MsgPort *CreatePort();
 *
 * Caveat Programmer.
 */
extern void     *AllocMem(), *CreatePort(), *RemTail(), *FindTask();
extern long     Lock(), Examine();

long            dopacket();
int             longcmp(), namecmp();


struct StandardPacket   *packet;
struct FileInfoBlock    *fib;
struct FileLock         *lock;
struct InfoData         *id;
struct MsgPort          *dosreply;
BPTR                    lok;
long                    *buf, hashtab[TABSIZ];


struct disk_block *free_disk_block(db)
struct disk_block *db;
{
  struct disk_block *next_db;

  if (!db) return(NULL);
  next_db = db->db;
  FreeMem(db, (long) sizeof(struct disk_block));
  return(next_db);
}

main (ac, av)
char **av;
{
        int flag;
        struct Process *my_proc;
        APTR w_save;

        openstuff ();

        my_proc = (struct Process *) FindTask(0L);

        /****************************************
          The following is used to suppress error
          messages that can occur when we attempt
          to read sectors that have never been
          used on the disk.
        *****************************************/

        w_save = my_proc->pr_WindowPtr;
        my_proc->pr_WindowPtr = (APTR) (-1);
        if (ac < 2)             /*  No arguments; do current directory  */
                printdir (my_proc -> pr_CurrentDir);

        else {                  /*  Arguments; treat as directories  */
                flag = (ac > 2);
                while (++av, --ac) {
                        if (!(lok = Lock (*av, ACCESS_READ))) {
                                printf ("%s: Not found.\n", *av);
                                if (ac > 1)
                                        putchar ('\n');
                                continue;
                        }

                        if (!Examine (lok, fib))
                                die ("Examine failed.");
                        if (fib -> fib_DirEntryType < 0) {
                                printf ("%s: Not a directory.\n", *av);
                                if (ac > 1)
                                        putchar ('\n');
                                continue;
                        }

                        if (flag)
                                printf ("%s:\n", *av);
                        printdir (lok);
                        if (ac > 1)
                                putchar ('\n');
                        UnLock (lok);  lok = NULL;
                }
        }

        closestuff ();
        my_proc->pr_WindowPtr = w_save;
}


/*
 * This function prints the directory in a nice, pretty columnar format.
 */
printdir (dirlock)
BPTR dirlock;
{
        struct List             namelist;
        register struct Node    *name, **namearray;
        long                    args[2], start, track_no, parent_block;
        register int            i, n, j;
        int                     len = 0, ncols, nlines, nfiles = 0;
        char                    fmt1[16], fmt2[16];
        char                    *fn = (char *) (&buf[FILENAME]) + 1;
        struct disk_block *pdb, *ndb, *db;

        for (i = 0; i < 80; ++i)
          tracks[i] = NULL;

        lock = (struct FileLock *) (dirlock << 2);
        args[0] = parent_block = lock -> fl_Key;        /*  Block number  */
        args[1] = (long) buf >> 2;      /*  BPTR to buffer  */

        /*  Attempt to get raw directory block.  */
        if (!dopacket (lock -> fl_Task, ACTION_GET_BLOCK, args, 2)) {
                if (packet -> sp_Pkt.dp_Res2 == ERROR_ACTION_NOT_KNOWN)
                        printf ("Not a block device.\n");
                else
                        printf ("Error %ld while getting dirblock.\n",
                                packet -> sp_Pkt.dp_Res2);
                return;
        }

        /*  Copy DOS's hash table into our array.  */
        for (i=0; i<TABSIZ; i++)
                hashtab[i] = buf[i+STARTTAB];

        NewList (&namelist);
        while (1) {
                /*  Sort hash table.  */
                qsort (hashtab, TABSIZ, sizeof (long), longcmp);
                if (!hashtab[0])        /*  Empty hash table  */
                        break;

                /*
                 * Retrieve disk blocks according to sorted hash table and
                 * collect filenames.
                 */
                for (i=0; hashtab[i] && i<TABSIZ; i++)
                {
                  track_no = hashtab[i]/22;
                  if (db = tracks[track_no])
                  {
                    /*********************************
                      We have already hit this track
                      so get the information from
                      our list of other blocks.
                      Doing this prevents seeking
                      several times to the same track.

                      first find the block in the list
                    **********************************/

                    while (db->number != hashtab[i])  db = db->db;
                    if (!(name = AllocMem (sizeof (*name) +
                                           db->name[0] + 1L,
                                           MEMF_CLEAR)))
                    {
                      puts ("Node memory allocation failure.");
                      goto bombout;
                    }

                    name -> ln_Name = (char *) name + sizeof (*name);
                    name -> ln_Type = (db->type == ST_DIR);
                    j = db->name[0];
                    name->ln_Name[j] = 0;
                    while (j)
                    {
                      name->ln_Name[j-1] = db->name[j];
                      --j;
                    }

                    AddTail (&namelist, name);
                    nfiles++;   /*  Number of files found  */
                    hashtab[i] = db->chain;
                  }
                  else
                  {
                    /*********************************
                      We have never visited this track
                      before, so we read all the blocks
                      from it and stash away all those
                      that are directory blocks, by doing
                      this we don't ever have to go back
                      to this track.  We don't record the
                      contents of the block that caused
                      us to process this track, instead
                      we add it to the final name list
                      directly.
                    *************************************/

                    pdb = (struct disk_block *) &tracks[track_no];
                    start = track_no*22;
                    for (j = 0; j < 22; ++j)
                    {
                      args[0] = start+j;

                      /**************************************
                        Make sure we never try to read the
                        0 or 1 blocks as these are never used
                        and will cause a GURU!
                      ***************************************/

                      if ((args[0] == 0L) || (args[0] == 1L)) continue;
                      dopacket (lock -> fl_Task, ACTION_GET_BLOCK, args, 2);
                      if ((start+j)  == hashtab[i])
                      {
                        if (!(name = AllocMem (sizeof (*name) + *(fn-1) + 1L,
                                               MEMF_CLEAR)))
                        {
                          puts ("Node memory allocation failure.");
                          goto bombout;
                        }

                        name -> ln_Name = (char *) name + sizeof (*name);
                        name -> ln_Type = (buf[SECONDARY] == ST_DIR);
                        fn[*(fn-1)] = '\0';     /*  Force null-termination  */
                        strcpy (name -> ln_Name, fn);

                        AddTail (&namelist, name);
                        nfiles++;       /*  Number of files found  */
                        hashtab[i] = buf[HASHCHAIN];
                      }
                      else
                          /**********************************
                            Note: if blocks are not ST_SHORT
                            then they do not contain user
                            directories or file names.  Also
                            to be a member of the directory
                            we are scanning they must have
                            the same parent block.
                          ***********************************/
                      {
                        if ((buf[0] == ST_SHORT) &&
                            (parent_block == buf[PARENT]))
                        {
                          /*********************************
                            This block contains information
                            which might be of future use so
                            record it.
                          **********************************/
                          if (ndb = (struct disk_block *)
                              AllocMem((long) sizeof(struct disk_block), 0L))
                          {
                            pdb->db = ndb;
                            ndb->db = NULL;
                            ndb->type = buf[SECONDARY];
                            movmem(&buf[BLOCKSIZE-20], &ndb->name, 64);
                            ndb->chain = buf[HASHCHAIN];
                            ndb->number = start+j;
                            pdb = ndb;
                          }
                          else
                            goto bombout;
                        }
                      }
                    }  /* end for 22 blocks */
                  }   /* end if new or old track */
                }    /* end scan the hash table */
        }
        if (!nfiles)    /*  No files  */
                return;

        /*  Create array that qsort can deal with.  */
        if (!(namearray = AllocMem
                           ((long) nfiles * sizeof (char *), MEMF_CLEAR))) {
                puts ("Name array allocation failure.");
                goto bombout;
        }

        /*  Load up the array.  */
        for (name = namelist.lh_Head, i=0;
             name -> ln_Succ;
             name = name -> ln_Succ, i++)
                namearray[i] = name;

        /*  Alphabetize filenames.  */
        qsort (namearray, nfiles, sizeof (struct Node *), namecmp);

        /*  Find longest string so we can format intelligently.  */
        for (i=0; i<nfiles; i++)
                if ((n = strlen (namearray[i] -> ln_Name)) > len)
                        len = n;
        len += 2;       /*  Inter-name spacing  */

        /*  Print them suckers out  */
        ncols = 77 / len;               /*  Assume CON: is 77 columns  */
        nlines = nfiles/ncols + (nfiles%ncols != 0);
        sprintf (fmt1, "%%-%ds", len);          /*  Normal format string  */
        sprintf (fmt2, "\033[33m%%-%ds\033[m", len);    /* For directories */
        for (i=0; i<nlines; i++) {
                for (n=i; n<nfiles; n+=nlines)
                        printf (namearray[n] -> ln_Type ? fmt2 : fmt1,
                                namearray[n] -> ln_Name);
                putchar ('\n');
        }

        /*  Phew!  Now free all the stuff we allocated.  */
bombout:
        freenamelist (&namelist);
        if (namearray)
                FreeMem (namearray, (long) nfiles * sizeof (struct Node *));

  /*****************************
    free up any recorded disk
    blocks from the track array
  ******************************/

  for (i = 0; i < 80; ++i)
  {
    if (tracks[i])
    {
      db = tracks[i];
      while( db = free_disk_block(db) ) ;
    }
  }
}

freenamelist (list)
struct List *list;
{
        register struct Node *now;

        while (now = RemTail (list))
                FreeMem (now, sizeof (*now) + strlen (now -> ln_Name) + 1L);
}


/*
 * This function assumes the existence of a properly initialized
 * standardpacket structure called "packet" and a reply port called
 * "dosreply".  A more general function would not do this.
 *
 * This function is synchronous i.e. it doesn't return until the specified
 * action has been performed.
 */
long
dopacket (proc, action, args, nargs)
struct MsgPort *proc;
long action;
register long *args;
register int nargs;
{
        register long *argp = &packet -> sp_Pkt.dp_Arg1;

        for (; nargs>0; nargs--)
                *argp++ = *args++;

        /*
         * AmigaDOS scribbles over the reply port, so we have to initialize
         * it every time we send a packet.
         */
        packet -> sp_Pkt.dp_Port = dosreply;
        packet -> sp_Pkt.dp_Type = action;

        PutMsg (proc, packet);
        WaitPort (dosreply);
        GetMsg (dosreply);

        return (packet -> sp_Pkt.dp_Res1);
}


/*
 * These functions are called by qsort().  The sense of camparisons is
 * reversed in longcmp to get a reverse sort (largest elements first).
 */
longcmp (foo, bar)
long *foo, *bar;
{
        /*
         * We have to do it this way because qsort() expects an int to be
         * returned.  Subtracting longs directly might overflow a 16-bit int.
         */
        return ((*foo < *bar) - (*foo > *bar));
}

namecmp (foo, bar)
struct Node **foo, **bar;
{
        return (strcmp ((*foo) -> ln_Name, (*bar) -> ln_Name));
}


/*
 * Resource allocation/deallocation routines.
 */
openstuff ()
{
        if (!(packet = AllocMem ((long) sizeof (*packet), MEMF_CLEAR)))
                die ("Can't allocate packet space.");

        if (!(dosreply = CreatePort (NULL, NULL)))
                die ("Dos reply port create failed.");
        packet -> sp_Msg.mn_Node.ln_Name        = (char *) &packet -> sp_Pkt;
        packet -> sp_Pkt.dp_Link                = &packet -> sp_Msg;

        if (!(fib = AllocMem ((long) sizeof (*fib), MEMF_CLEAR)))
                die ("Can't allocate FileInfoBlock.");

        /*
         * Note:  This allocation may not work with DMA hard disks.
         */
        if (!(buf = AllocMem (BLOCKSIZE * 4L, MEMF_CHIP | MEMF_PUBLIC)))
                die ("Couldn't allocate sector buffer.");
}

closestuff ()
{
        if (lok)                UnLock (lok);
        if (buf)                FreeMem (buf, BLOCKSIZE * 4L);
        if (fib)                FreeMem (fib, (long) sizeof (*fib));
        if (dosreply)           DeletePort (dosreply);
        if (packet)             FreeMem (packet, (long) sizeof (*packet));
}

die (str)
char *str;
{
        puts (str);
        closestuff ();
        exit (1);
}
--
        "We must acknowledge, once and for all, that the
        purpose of diplomacy is to prolong a crisis."
                                                 - Spock

Russell M Forster (Russ)
Mount Royal College                      BITnet: RForster@UNCAEDU.BITnet
4825 Richard Rd. S.W.                    ARPA:   OC.Russ@CU20B.Columbia.Edu
Calgary, Alberta                         Voice:  (403) 240-6052
Canada, T3E 6K6

Disclaimer: I deny everything!

perry@well.UUCP (05/26/87)

It is really good to see a number of fine performance boosters coming
along for doing dir's on floppies.  I  want to add that Facc, the new
floppy  performance  booster  from  ASDG, will work perfectly well in
conjunction with these dir  accelerators. This is because Facc accel-
erates floppy operations at a lower level  thus  will view floppy re-
quests from accelerated dir programs as simply more floppy requests.

I would be interested to learn from an owner of Facc who also has the
dir hot rods what the combined results are.

Perry S. Kivolowitz
ASDG Incorporated
(201) 563-0529