[net.micro.amiga] Wild Card Filenames for Amiga Dos

star@fluke.UUCP (David Whitlock) (01/05/86)

Does anyone have a Lattice C compatible function that creates a list of
expanded wildcard filenames compatible with Unix-style syntax, i.e.
*, ? and []?  The folks at Tardis Software posted an example called
'echox.c', but somehow they neglected to post the source for the
functions it calls, in "expand.c".

Regards,

Dave Whitlock

sgt@alice.UucP (Steve Tell) (01/12/86)

>Does anyone have a Lattice C compatible function that creates a list of
>expanded wildcard filenames compatible with Unix-style syntax, i.e.
>*, ? and []?
>....

The above function should exist only in the shell, and do its work
there.  I hope the folks out there writing alternatives to the
CLI do this right.

wagner@utcs.uucp (Michael Wagner) (01/14/86)

In article <4795@alice.UUCP> sgt@alice.UucP (Steve Tell) writes:
>>Does anyone have a Lattice C compatible function that creates a list of
>>expanded wildcard filenames compatible with Unix-style syntax, i.e.
>>*, ? and []?
>>....
>
>The above function should exist only in the shell, and do its work
>there.  I hope the folks out there writing alternatives to the
>CLI do this right.


Actually, putting this function in the CLI (and its clones) would make for
a *very* slow feature until someone fixes a few disk-performance
"hot-spots" that also slow down the search for .info files when you are
opening a window.  That's not to say that developers shouldn't put
wildcards into their shells...  rather, the people working on
AmigaDOS should see the light and improve performance in this area
so that the file system doesn't get in the way of this obvious need.

Basically, if I read the AmigaDOS manuals properly (and I was only able
to borrow them for a day or two), the names of the files are not kept
in the directory itself, but rather in the header blocks for the
individual files.  Therefore, resolving wildcard filenames means
seeking to each file and reading the first block of each.  An
intelligent re-copying utility *could* cluster all the header blocks
together, but the natural tendency is for them to be all over the place
(and that's the way they are distributed on the Workbench disks, if I
understand the output from LIST).  The track-buffer *might* help a bit,
but a best-case scenario would only give you 10 other header blocks on
the same track.  The best case is actually very unlikely; you're
probably lucky if you get two or three.
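
For concreteness, here is a minimal, untested sketch (Lattice-style K&R
C, with a made-up function name, and assuming the usual startup code has
opened dos.library) of how a program walks a directory today, via the
dos.library Lock()/Examine()/ExNext() calls.  The point is that every
ExNext() goes off and reads another file header block:

#include <exec/types.h>
#include <exec/memory.h>
#include <libraries/dos.h>

extern BPTR Lock();
extern LONG Examine(), ExNext();
extern char *AllocMem();

void listdir(dirname)
char *dirname;
{
    BPTR lock;
    struct FileInfoBlock *fib;

    if ((lock = Lock(dirname, (LONG) ACCESS_READ)) == 0)
        return;

    /* a FileInfoBlock must be longword aligned, so get it from AllocMem() */
    fib = (struct FileInfoBlock *)
          AllocMem((LONG) sizeof(struct FileInfoBlock), (LONG) MEMF_CLEAR);
    if (fib) {
        if (Examine(lock, fib)) {         /* describes the directory itself */
            while (ExNext(lock, fib))     /* each call reads another file's */
                puts(fib->fib_FileName);  /* header block off the disk      */
        }
        FreeMem((char *) fib, (LONG) sizeof(struct FileInfoBlock));
    }
    UnLock(lock);
}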

It's not clear (at least from the little browsing I was able to do) how
the hashing works, but it would have to be a pretty inspired hashing
scheme before you could use it to limit the search with wildcards.  I
imagine that you really have to follow *all* the hash-chains in order
to find all the files in a directory.

If I understand properly, the directory block actually contains very
little beyond the hash.  A hash is an admirable idea for very large,
flat file systems.  It seems less useful in a hierarchical file
system.  It's true that one can get very large directories in a
hierarchical file system, but not very likely in this one...you can
only get about 880 files on the whole disk (each file consists of 1
header block and one or more data blocks).
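
(Assuming the standard geometry of 80 cylinders, 2 heads and 11 sectors
of 512 bytes per track, the 880 figure works out as

    80 \times 2 \times 11 = 1760 \text{ blocks}, \qquad
    1760 / 2 = 880 \text{ files,}

since every file needs at least one header block and one data block.)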

I would have thought that it would be more useful to put the names,
packed cheek-to-jowl, into the directory.  Then the various file names
that you need to select for wild-carding would all be there in one
place, and once you had read in the track where the directory had
started, it would merely be a storage compare operation to find the
files you wanted.
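
Just to make the idea concrete, a hypothetical layout (my own invention,
not anything AmigaDOS actually stores) might look like:

/* Hypothetical packed directory entry: the names live in the directory
 * block itself, so a wildcard scan becomes a string compare over one or
 * two blocks instead of a seek per file.
 */
struct packed_dirent {
    long          pd_header;    /* block number of the file's header block */
    long          pd_size;      /* file size in bytes                      */
    unsigned char pd_namelen;   /* length of the name that follows         */
    char          pd_name[1];   /* name bytes, packed cheek-to-jowl        */
};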

It may be that the originators of the DOS hadn't expected to do much in
the way of wild-card operations...but then, what is it you have to do
to open a window in the Workbench?  Open, in turn, all the .info files
and put the ICONS up on the screen.  Basically a wild-card
operation.  It takes about 12 seconds on my system to open the disk
ICON for the Workbench disk.  I've put in about 10 ICONS altogether
(I was able to get it down to about 9 seconds with judicious copying
and placement, but it's not clear it was worth the work).  There are
about 300 bytes per ICON, so 10 ICONS of 1 block each should come into
storage, in a well-designed system, in one revolution of the disk to
read the directory, plus one or two revolutions of the disk to read all
the ICONS (if we assume we didn't track-align them).  It
ought to be possible to open a window in three revolutions of the disk,
then.
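
For scale: assuming the drive spins at the standard 300 rpm, one
revolution takes 0.2 seconds, so three revolutions is

    3 \times \frac{60 \text{ s}}{300 \text{ rev}} = 0.6 \text{ s},

which is a far cry from the 12 seconds I see in practice.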

Having put the names into the directory, it would not take much more to
put most of the rest of what's useful from the file header block into
the directory too, and thereby make it possible to have 1-block files.
Or, you could push the rest of the header stuff into the first half of
the header block, and leave the second half of the header block for
user data.  An ICON based system has lots of little files...by their
nature, you want them to be fast and not take up much space.  This, of
course, is not directly a performance issue, although seeking all over
the disk to find these things costs a lot.  If the whole file were
wrapped up in the one block, then once you knew this was the file you
wanted, the data would already be there (basically, almost for free).

The second problem with the Amiga disk system is that the track buffer
is really there to serve the purposes of the hardware design; it isn't
really a performance boost.  It works great for sequential reads, but
then, no one ever really bought a disk drive when they wanted a tape
drive.  Locality of reference is seldom that good in a UNIX-like file
system.  Performance of UNIX and similar systems is generally abysmal
until you give the disk cache roughly 2 buffers per active process.
The number 2 (originally arrived at empirically, I think) is explained
as one cache buffer for the working directory, and one for everything
else.  When you go below 2, processes tend to steal their own working
directory's buffer when they do any other file I/O, AND THEN THEY
ALMOST IMMEDIATELY NEED THE WORKING DIRECTORY BACK.  So the second
problem really amounts to the need for a real disk cache instead of
(or as well as) the track buffer currently in there.

In summary, then, my wish from Amiga Claus would be a redesign of
AmigaDOS so that wild-card operations, so vital to so much of the
power of UNIX and UNIX-like systems, would not take the heads all
over the disk, and the addition of a disk cache to the kernel so that
the disk system would degrade a little more gracefully in the face
of seeking (which, after all, one expects a disk system to be able
to handle gracefully).

Michael Wagner, Computing Services, University of Toronto
(utcs!wagner)

P.S.  I know it's late for Christmas wishes.

tim@ism780c.UUCP (Tim Smith) (01/15/86)

In article <4795@alice.UUCP> sgt@alice.UucP (Steve Tell) writes:
>>
>> [wants wildcard expansion functions in C]
>
>The above function should exist only in the shell, and do its work
>there.  I hope the folks out there writing alternatives to the
>CLI do this right.

Actually, they should exist as a C library, so that anyone writing a shell
can easily give it wildcard handling.  Also, any program that for some
reason wants to get a file name from the user can then handle wildcards
easily this way.
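
For what it's worth, here is a rough, untested sketch of the kind of
routine in question (wildmatch() is just a made-up name, and to keep it
short the [...] handling skips ranges like [a-z] and negation):

/* Return 1 if "name" matches "pat", where the pattern may use the
 * Unix-style wildcards *, ? and [...].
 */
int wildmatch(pat, name)
char *pat, *name;
{
    for (;;) {
        switch (*pat) {
        case '\0':                  /* end of pattern: need end of name */
            return *name == '\0';
        case '*':                   /* try the rest at every tail of name */
            for (pat++; ; name++) {
                if (wildmatch(pat, name))
                    return 1;
                if (*name == '\0')
                    return 0;
            }
        case '?':                   /* any single character */
            if (*name == '\0')
                return 0;
            pat++, name++;
            break;
        case '[': {                 /* any one of the listed characters */
            int found = 0;
            for (pat++; *pat && *pat != ']'; pat++)
                if (*pat == *name)
                    found = 1;
            if (!found)
                return 0;
            if (*pat == ']')
                pat++;
            name++;
            break;
        }
        default:                    /* ordinary character: exact match */
            if (*pat != *name)
                return 0;
            pat++, name++;
            break;
        }
    }
}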

--
Tim Smith       sdcrdcf!ism780c!tim || ima!ism780!tim || ihnp4!cithep!tim

rokicki@Navajo.ARPA (01/16/86)

> Basically, if I read the AmigaDOS manuals properly (and I was only able
> to borrow them for a day or two), the names of the files are not kept
> in the directory itself, but rather in the header blocks for the
> individual files.  Therefore, resolving wildcard filenames means
...
> I would have thought that it would be more useful to put the names,
> packed cheek-to-jowl, into the directory.  Then the various file names
...
> user data.  An ICON based system has lots of little files...by their
> nature, you want them to be fast and not take up much space.  This, of
...
> Michael Wagner, Computing Services, University of Toronto
> (utcs!wagner)

Very good comments, valid on every point.  I cannot believe the directory
structure was defined the way it was!  Let's get this fixed, and soon; the
delay times are outrageous at the moment.  OS-9 68K looks better and better.

-tom rokicki

[[  I want my Manx compiler!  ]]

gnu@hoptoad.uucp (John Gilmore) (01/16/86)

In article <248@ism780c.UUCP>, tim@ism780c.UUCP (Tim Smith) writes:
> In article <4795@alice.UUCP> sgt@alice.UucP (Steve Tell) writes:
> >> [wants wildcard expansion functions in C]
> >The above function should exist only in the shell, and do its work
> >there.
> Actually, they should exist as a C library, ...

*Actually*, the first C library routine to write is opendir()/readdir().
Since this is the standard Unix method of reading directories,
programs will port to and from Unix without having
to hack up the directory-access part of the program.

*Then*, if you want to write a regular expression library routine for file
name expansion, you can write one that's portable to Unix too [gasp!].
You could even debug it there, where bugs produce core dumps instead
of koans, and where there are reasonable debuggers.
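
A rough sketch (again untested, with made-up names) of how a portable
expand routine might sit on top of opendir()/readdir(), filtering the
entries with a matcher like the wildmatch() sketched earlier in this
thread.  It's written against the <dirent.h> flavour of the interface;
older BSD systems spell the entry "struct direct" and keep it in
<sys/dir.h>.

#include <stdio.h>
#include <dirent.h>

extern int wildmatch();         /* pattern matcher sketched earlier */

/* Print every name in "dirname" that matches the wildcard pattern. */
void expand(dirname, pat)
char *dirname, *pat;
{
    DIR *dp;
    struct dirent *de;

    if ((dp = opendir(dirname)) == NULL)
        return;
    while ((de = readdir(dp)) != NULL)
        if (wildmatch(pat, de->d_name))
            printf("%s\n", de->d_name);
    closedir(dp);
}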

If somebody is into this, I can send you a PD version of opendir/readdir
for Unix systems, and the "man page" description of the routines.
-- 
# I resisted cluttering my mail with signatures for years, but the mail relay
# situation has gotten to where people can't reach me without it.  Dammit!
# John Gilmore  {sun,ptsfa,lll-crg,nsc}!hoptoad!gnu    jgilmore@lll-crg.arpa
#					^^^^^^^ Hoptoad used to be L5.