[comp.sys.mac.programmer] The Beauty of HFS

heksterb@Apple.COM (Ben Hekster) (01/30/91)

In <1991Jan29.074646.7218@actrix.gen.nz> Bruce.Hoult@bbs.actrix.gen.nz writes:

> What's so ugly about HFS?  I don't see any problems with the programming
> interface to it, and it looks to be a pretty good implementation of a
> high-performance file system -- the use of a volume allocation bitmap, and
> file extent and directory b*-trees in particular enable much better
> performance than the schemes in many other operating systems (such as
> MS-DOS and Unix, for example).

jmunkki@hila.hut.fi (Juri Munkki) writes:

> HFS performs OK, but it has a really nasty programming interface. Unless
> you are just using SFGetFile and SFPutFile, you need to know about
> directory id numbers, working directory reference numbers and all the
> garbage that goes with them.

It's not all that difficult.  There are basically two strategies you can
pursue: either you keep the working directory reference number around and
stick to the MFS calls, or you call _GetWDInfo immediately whenever you get a
working directory reference number and from then on work purely in terms of
volume reference numbers and directory IDs.
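
In C, the second strategy amounts to no more than this (a rough sketch,
not compiled; GetWDInfo is the high-level glue for _GetWDInfo):

    #include <Files.h>
    #include <StandardFile.h>

    /* Resolve the working directory reference number that Standard File
       hands back into a real volume reference number and directory ID,
       then forget the WD entirely. */
    static OSErr ResolveReply(const SFReply *reply, short *vRefNum, long *dirID)
    {
        long procID;    /* creator of the WD; we don't care */

        return GetWDInfo(reply->vRefNum, vRefNum, dirID, &procID);
    }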

> Assume for instance that the user opens a document. To do this, a file
> is selected and the program now has a working directory number and a file
> name. The program opens, reads and closes the file (no problem here). The user
> then edits the file and wants to save. The program still only has the
> working directory reference number, which at this time might no longer
> be valid. (I might be wrong here, so please correct me. I've never really
> understood when and how the numbers are allocated and deallocated. I know
> they will not be deallocated, if a file is open, but I think the latest
> guidelines say that files should not be left open.)

I'd think we can safely assume the working directory reference number will
remain valid while the application is running...  but again, if you were
really worried about it, why not convert the working directory reference
numbers and forget about them altogether?

> Another interesting scenario involves copying a file. The only true way
> to do it involves opening and reading the resource and data forks and
> the Finder info. It's a lot of work if you want to do it reliably.

Again, it's really not all that difficult.  The basic fork-copying algorithm
involves allocating a buffer and looping _Read and _Write calls, which works
for every case.  You get to use the same code for both forks.  For really
efficient file copying, you can vary the size of the buffer according to the
amount of memory you have.
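
Something like this handles both forks (a sketch, not compiled):

    /* The basic fork copier.  'srcRef' and 'dstRef' are open access
       paths to the same fork of the source and destination files; the
       caller supplies the buffer. */
    static OSErr CopyFork(short srcRef, short dstRef, Ptr buffer, long bufSize)
    {
        OSErr   err, werr;
        long    count;

        do {
            count = bufSize;
            err = FSRead(srcRef, &count, buffer);
            if (err != noErr && err != eofErr)
                return err;             /* a real read error */
            if (count > 0) {            /* write whatever we got */
                werr = FSWrite(dstRef, &count, buffer);
                if (werr != noErr)
                    return werr;
            }
        } while (err != eofErr);        /* eofErr just ends the loop */

        return noErr;
    }

For the buffer, MaxBlock() will tell you the largest block the heap
could give you, so you can size the buffer from that and fall back to
something small when memory is tight.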

[_CopyFile]
> This trap of course works only on AppleShare volumes. Apple didn't
> bother implementing it on local disks (at least I never got it to
> work). Another silly limitation is that you can only copy or move a
> file within a single volume.

Using _CopyFile on shared volumes is useful because it avoids the redundant
network traffic of the application echoing the entire file back to the shared
volume; the server can handle the whole call by itself.  In fact, _CopyFile
does work across different volumes when they are both on the same server.
For it to work between servers, one server would have to communicate with the
other, which wouldn't eliminate the network traffic of the file copy, so it
would give you little (if any) saving over the basic copying loop.
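
For what it's worth, the call itself is straightforward where the volume
does support it.  A sketch (not compiled; the fields are the CopyParam
variant of the HParamBlockRec):

    /* Ask the server to copy a file entirely on its end.  In my
       experience a volume that doesn't support _CopyFile just hands
       back an error, so check the result. */
    static OSErr ServerCopyFile(short srcVRefNum, long srcDirID,
                                StringPtr name,
                                short dstVRefNum, long dstDirID)
    {
        HParamBlockRec  pb;

        pb.copyParam.ioCompletion = NULL;
        pb.copyParam.ioNamePtr    = name;        /* source file name */
        pb.copyParam.ioVRefNum    = srcVRefNum;  /* source volume */
        pb.copyParam.ioDirID      = srcDirID;    /* source directory */
        pb.copyParam.ioDstVRefNum = dstVRefNum;  /* destination volume */
        pb.copyParam.ioNewDirID   = dstDirID;    /* destination directory */
        pb.copyParam.ioNewName    = NULL;        /* no partial pathname */
        pb.copyParam.ioCopyName   = NULL;        /* keep the same name */
        return PBHCopyFile(&pb, false);          /* synchronous */
    }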

_______________________________________________________________________________
Ben Hekster                           | "I've had my fun
Student intern                        |  but now it's time
AppleLink:   heksterb                 |  to serve your conscience
Internet:    heksterb@apple.com       |  overseas"
BITNET:      heksterb@henut5          | --Orange Crush, R.E.M. (Green)

time@tbomb.ice.com (Tim Endres) (01/31/91)

In article <48627@apple.Apple.COM>, heksterb@Apple.COM (Ben Hekster) writes:
> > What's so ugly about HFS?  I don't see any problems with the programming
> > interface to it, and it looks to be a pretty good implementation of a
> > high-performance file system -- the use of a volume allocation bitmap, and
> > file extent and directory b*-trees in particular enable much better
> > performance than the schemes in many other operating systems (such as
> > MS-DOS and Unix, for example).

Performance may be better, I have no data on that, but I do know that
the Mac HFS starts to really choke when a directory begins to contain
hundreds of files. I have not had time to research things, but we
have seen a couple of cases where the Mac was disabled because it
could not seem to deal with the enormous number of files in a directory
(several hundred if I remember correctly).

tim.

-------------------------------------------------------------
Tim Endres                |  time@ice.com
ICE Engineering           |  uupsi!ice.com!time
8840 Main Street          |  Voice            FAX
Whitmore Lake MI. 48189   |  (313) 449 8288   (313) 449 9208

jmunkki@hila.hut.fi (Juri Munkki) (01/31/91)

In article <48627@apple.Apple.COM> heksterb@Apple.COM (Ben Hekster) writes:
>jmunkki@hila.hut.fi (Juri Munkki) writes:
>[_CopyFile]
>> This trap of course works only on AppleShare volumes. Apple didn't
>> bother implementing it on local disks (at least I never got it to
>> work). Another silly limitation is that you can only copy or move a
>> file within a single volume.
>
>Using _CopyFile on shared volumes is useful because it avoids the redundant
>network traffic of the application echoing the entire file back to the shared
>volume; the server can handle the whole call by itself.  In fact, _CopyFile
>does work across different volumes when they are both on the same server.
>For it to work between servers, one server would have to communicate with the
>other, which wouldn't eliminate the network traffic of the file copy, so it
>would give you little (if any) saving over the basic copying loop.

My point was that it shouldn't be the programmer's job to check for a file
server and use the trap if it's available and different code if it isn't. The
file system knows about these things much better than the poor programmer
who is struggling to understand HFS. So what if there is no saving? There's
no harm done either, if the trap knows how to do a simple copy. Of course
this would also allow Apple to upgrade the server code so that when a file is
copied from one file server to another, the client node is not involved.
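
As things stand, every application has to carry its own wrapper around
the trap, something like this sketch (not compiled).  CopyFileByHand is
a stand-in for whatever read/write loop you already have, and the
paramErr check is only what my own hard disk gave me, so treat that as
an assumption:

    extern OSErr CopyFileByHand(short srcVRefNum, long srcDirID,
                                StringPtr name,
                                short dstVRefNum, long dstDirID);

    static OSErr CopyAnywhere(short srcVRefNum, long srcDirID,
                              StringPtr name,
                              short dstVRefNum, long dstDirID)
    {
        HParamBlockRec  pb;
        OSErr           err;

        pb.copyParam.ioCompletion = NULL;
        pb.copyParam.ioNamePtr    = name;
        pb.copyParam.ioVRefNum    = srcVRefNum;
        pb.copyParam.ioDirID      = srcDirID;
        pb.copyParam.ioDstVRefNum = dstVRefNum;
        pb.copyParam.ioNewDirID   = dstDirID;
        pb.copyParam.ioNewName    = NULL;
        pb.copyParam.ioCopyName   = NULL;        /* keep the name */
        err = PBHCopyFile(&pb, false);
        if (err == paramErr)                     /* no _CopyFile here */
            err = CopyFileByHand(srcVRefNum, srcDirID, name,
                                 dstVRefNum, dstDirID);
        return err;
    }

This is exactly the code the File Manager could perfectly well carry
itself.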

The weakness of WDRefNums and directory IDs is that the documentation
first talks about one and then about the other. If I'm scanning
through IM IV, I probably only want to know about one or the other. I
get the same feeling I got when I tried to learn Apple II BASIC but
didn't notice that I had been given two BASIC books: one about Applesoft
and the other about Integer BASIC. It was and is confusing.

I'm not saying that I can't handle the File Manager. I'm only saying that
I didn't have any problems with MFS calls (back in the days when there
wasn't anything else) and I had a lot more trouble when HFS arrived.

Recently I wanted a program that automatically copies files that have
changed, and I tried using the new file copying call. Of course I had
AppleShare installed, but I only learned that it didn't work on my local
hard disk after an hour or two of struggling with the parameter block.

I've heard that System 7.0 will improve things. I hope it does.
I'll get IM-VI when it is published. I used to care enough to
get beta documentation, but nowadays I don't have the time
to bother with that.

I think that the File Manager should be as intelligent and object
oriented as possible. By object oriented I mean that the same calls
should apply to files, folders, and volumes. The same file copying
command could quite well support all those objects.

   ____________________________________________________________________________
  / Juri Munkki	    /  Helsinki University of Technology   /  Wind  / Project /
 / jmunkki@hut.fi  /  Computing Center Macintosh Support  /  Surf  /  STORM  /
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

ldo@waikato.ac.nz (Lawrence D'Oliveiro, Waikato University) (01/31/91)

In article <1CE00001.2uro9z@tbomb.ice.com>, time@tbomb.ice.com (Tim Endres)
claims cases where a Mac was "disabled because it could not seem to deal
with the enormous number of files in a directory".

Are you sure this wasn't a problem with the well-known limitations of
the resource-based Desktop database? If so, switching to the Desktop
Manager should fix the problem.

Lawrence D'Oliveiro                       fone: +64-71-562-889
Computer Services Dept                     fax: +64-71-384-066
University of Waikato            electric mail: ldo@waikato.ac.nz
Hamilton, New Zealand    37^ 47' 26" S, 175^ 19' 7" E, GMT+13:00

rang@cs.wisc.edu (Anton Rang) (02/01/91)

In article <1CE00001.2uro9z@tbomb.ice.com> time@tbomb.ice.com (Tim Endres) writes:
>Performance may be better, I have no data on that, but I do know that
>the Mac HFS starts to really choke when a directory begins to contain
>hundreds of files.

  Are you sure?  The *Finder* slows down quite a bit.  HFS itself
doesn't, at least as far as I've seen, and I've played around with
directories with several thousand files before (purely to experiment
with its limits, if any).

  The HFS catalog is stored as a B*-tree; the key is a directory ID
and the file name.  Since every file on the volume lives in that one
tree, having one large directory with 700 files, say, carries nearly
the same performance hit as having 600 files scattered amongst 100
directories.
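
  (For reference, the catalog key record from IM IV looks roughly like
this; because the parent directory ID leads the key, one directory's
entries are just adjacent keys in the single volume-wide tree.)

    struct CatKeyRec {
        char    ckrKeyLen;      /* key length */
        char    ckrResrv1;      /* reserved */
        long    ckrParID;       /* parent directory ID */
        Str31   ckrCName;       /* file or folder name */
    };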

  I don't think the Finder was designed to handle large numbers of
files in one directory, though.

	Anton
   
+---------------------------+------------------+-------------+
| Anton Rang (grad student) | rang@cs.wisc.edu | UW--Madison |
+---------------------------+------------------+-------------+

andyp@treehouse.UUCP (Andy Peterman) (02/01/91)

In article <1CE00001.2uro9z@tbomb.ice.com> time@ice.com writes:
>
>In article <48627@apple.Apple.COM>, heksterb@Apple.COM (Ben Hekster) writes:
>> > What's so ugly about HFS?  I don't see any problems with the programming
>> > ...etc...
>
>Performance may be better, I have no data on that, but I do know that
>the Mac HFS starts to really choke when a directory begins to contain
>hundreds of files. I have not had time to research things, but we
>have seen a couple of cases where the Mac was disabled because it
>could not seem to deal with the enormous number of files in a directory
>(several hundred if I remember correctly).

I don't think that it's HFS that's choking, but rather the Finder that
can't deal with the large number of files.  I have a CD-ROM (The Right
Stuffed) that has some folders with over 500 files and have no problem
with them when using a Finder alternative (MacTree Plus).  I don't
remember where, but I think I've seen folders with over 1000 files and
haven't had any problems with them.  I believe HFS will handle a very
large (< 16000?) number of files per folder without any problem.

The new Finder in System 7.0, along with use of the Desktop Manager,
should be able to deal with these kinds of folders a lot better.

-- 
Andy Peterman                       |   Opinions expressed
treehouse!andyp@gvgpsa.gvg.tek.com  | are definitely those of
(916) 273-4569                      |      my employer!

calvin@portia.stanford.edu (02/01/91)

In article <1991Jan31.184558.2874@waikato.ac.nz> ldo@waikato.ac.nz (Lawrence
D'Oliveiro, Waikato University) writes:
>In article <1CE00001.2uro9z@tbomb.ice.com>, time@tbomb.ice.com (Tim Endres)
>claims cases where a Mac was "disabled because it could not seem to deal
>with the enormous number of files in a directory".
>
>Are you sure this wasn't a problem with the well-known limitations on
>the resource-based Desktop database? If so, switching to Desktop Manager
>should fix the problem.

	What is the Desktop Manager? I've heard of it before, but I have never
seen such a thing. If it is so useful, where can I get such a beast?

Peter Chang

unierik@uts.uni-c.dk (Erik Bertelsen) (02/01/91)

>Performance may be better, I have no data on that, but I do know that
>the Mac HFS starts to really choke when a directory begins to contain
>hundreds of files. I have not had time to research things, but we
>have seen a couple of cases where the Mac was disabled because it
>could not seem to deal with the enormous number of files in a directory
>(several hundred if I remember correctly).

Well -- I don't think HFS is the problem here. I have created a folder
with about 25000 PICT files in it. My application has no problem
opening these files by name, but the Finder won't display the contents
of this folder -- and I don't blame it too much for that ... :-)

regards
Erik Bertelsen
UNI-C, The Danish Computing Centre for Research and Education.

time@tbomb.ice.com (Tim Endres) (02/02/91)

OK, well, since I stirred up such a debate, I took the time to go back
and investigate what had happened. Sure enough, creating a directory
with thousands of files will kill the Finder. With software like DiskTop,
or via SFGetFile(), the directory was handled fine. As for the software
that croaked, it died in a memory allocation that exhausted memory.

Thus, as many have stated, HFS appears to have no problem with a large
number of files in a directory.

tim.

-------------------------------------------------------------
Tim Endres                |  time@ice.com
ICE Engineering           |  uupsi!ice.com!time
8840 Main Street          |  Voice            FAX
Whitmore Lake MI. 48189   |  (313) 449 8288   (313) 449 9208

gurgle@well.sf.ca.us (Pete Gontier) (02/05/91)

In article <48627@apple.Apple.COM> heksterb@Apple.COM (Ben Hekster) writes:
>I'd think we can safely assume the working directory reference number will
>remain valid while the application is running...  but again, if you were
>really worried about it, why not convert the working directory reference
>numbers and forget about them altogether?

Nope, working directories certainly don't stick around. In most cases, they
stick around just long enough for you to use them once reliably, i.e. to
open a fork in response to Standard File or to SetVol to a directory via a
WD. Otherwise, there's no telling when they'll drop out from under you. I
usually assume that a call to GNE/WNE means they've become invalid, unless
you've done something like open a file access path or SetVol to them.
I've been bitten by this once, and it bit a friend of mine whose application
started crashing when the IIci came out; he traced the problem to an invalid
WD using MacsBug, but he no longer worked for the company and so had no way
to fix it.
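
The defensive pattern, then, is to convert the WD the moment Standard
File returns, before anything that can call GNE/WNE, and do all later
work with the stable volume reference number and directory ID.  A
sketch (not compiled; HOpen is the HFS glue, if your interfaces have
it):

    SFReply reply;
    short   vRefNum, refNum;
    long    dirID, procID;

    /* ... SFGetFile has just filled in 'reply' ... */
    if (reply.good) {
        /* Convert right away; reply.vRefNum is really a WDRefNum. */
        GetWDInfo(reply.vRefNum, &vRefNum, &dirID, &procID);

        /* Much later, after any number of GNE/WNE calls, the
           vRefNum/dirID pair is still good for opening the file: */
        if (HOpen(vRefNum, dirID, reply.fName, fsRdWrPerm, &refNum) == noErr) {
            /* read or write the document here */
            FSClose(refNum);
        }
    }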

As far as System 7 goes, I don't know. There is a new explanation of the
way all these various file system identifiers work in the File Manager
chapter of Inside Mac VI. Although the WD is still explained, it is described
in the context of providing compatibility between HFS and MFS, and I don't
know whether its presence in the volume indicates continued support from
Apple, or whether its treatment suggests that it may go away some time in
the future.
-- 
 Pete Gontier, gurgle@well.sf.ca.us
 Software Imagineer, Kiwi Software, Inc.