[comp.protocols.appletalk] Mac File Representation Update

tom@CITI.UMICH.EDU (09/08/87)

Rich,

Here is my response to your latest posting.  Most of my comments follow
from the fact that in MacNFS *all* processing is done on the Mac side;
there is *no possibility* of doing any work on the remote side (except
for i/o).

> 5)  We will "suggest" that creators of AppleSingle and AppleDouble Header files
> keep info like file attributes, real file name, Finder Info, etc. as close to
> the header as possible, but there is no way to enforce it.

If it is possible for people to put the crucial information off in some
odd place in the file (away from the header) someone will.  Rather than
add code to MacNFS to deal with those special cases I am inclined to
reject those files that do not meet the guidelines.  Still, it is extra
code to test for compliance and reject files.  I would like to see the
file info (finder info and file attributes) and dates in fixed fields
in the header.
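
To illustrate the fixed-field idea, here is a minimal sketch.  The
layout below (32 bytes of Finder info, then attributes, creation date
and modification date as 32-bit big-endian integers) is a hypothetical
one chosen for the example; it is not the AppleSingle or AppleDouble
layout, and the names FIXED_HEADER, pack_file_info and read_file_info
are made up.  The point is only that with fixed fields the client needs
one read and one unpack, and the compliance test shrinks to a length
check.

```python
import struct

# Hypothetical fixed layout (an assumption for illustration only):
#   32s  Finder info
#   I    file attributes
#   I    creation date
#   I    modification date
# ">" means big-endian with no padding between fields.
FIXED_HEADER = struct.Struct(">32sIII")


def pack_file_info(finder_info, attributes, created, modified):
    """Pack the file info into the fixed-size block (short Finder info is zero-padded)."""
    return FIXED_HEADER.pack(finder_info, attributes, created, modified)


def read_file_info(buf):
    """Read the file info from the start of buf; reject buffers that are too short."""
    if len(buf) < FIXED_HEADER.size:
        raise ValueError("header too short for the fixed fields")
    finder_info, attributes, created, modified = FIXED_HEADER.unpack_from(buf, 0)
    return {
        "finder_info": finder_info,
        "attributes": attributes,
        "created": created,
        "modified": modified,
    }
```

Because every field sits at a known offset, there is no descriptor table
to walk and no "odd place in the file" to chase.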

> 7)  We dislike the idea of breaking resource forks out into yet another file,

I don't know all the implications of keeping the resource fork in a
file with the file info but I believe that it is workable given the
following guidelines:  The resource fork must be the last item in the
file and its descriptor must be near the header such that we can read
it when we read the header information (preferably a fixed field).  I
still prefer breaking resource forks into separate files.

> 11)  If the Macintosh file does not contain a resource fork, then the
> AppleSingle or AppleDouble file may contain a resource ID entry of
> length zero bytes, or it may contain no entry descriptor at all.

Again, a special case to handle when creating a resource fork.  If
there is no descriptor I have to create one.  If there is no space for
a new descriptor I have to munge the file to make space.

> 12)  We are currently rethinking our restriction about not using subdirectories.

I used to think putting all the files in one directory was the better
approach until I actually used it.  Now I prefer subdirectories.  Both,
however, are workable, but UNIX people are fairly insistent that the
clutter resulting from not using subdirectories is a major headache.


General observations:

There are two problems you are trying to solve:  archiving files on a
foreign file system, and storing Mac files on a foreign file system for
use with an external file system.  To solve the first problem we need a
general extensible format that makes few assumptions about the file
formats, which AppleDouble provides.  To solve the second problem we need a
macintosh specific format that allows for quick storage and retrieval
of Macintosh file information.  Few assumptions about services provided
by the foreign file system should be made.  AppleSingle is not the
solution that I would like to see.  It retains un-needed generality,
allowing too many variations in the file format.  Each variation is a
special case that must be dealt with.  This introduces additional
complex code that is ripe for bugs.  A simple, fixed, macintosh
specific format would sidestep all these problems.

MacNFS has tighter size constraints than Charlie's aufs server:  our
memory comes from the application space.  Adding code to deal with
special cases in the file format further reduces the memory available
for applications.


Tom Unger
tom@citi.umich.edu
University of Michigan, CITI

cck@CUNIXC.COLUMBIA.EDU (Charlie C. Kim) (09/10/87)

Following is a summary of the response we sent to Apple for the
original posting and a copy of the message we sent in response to the
update.

A brief summary of the message we sent in response to the original posting
(but not posted to info-appletalk for a number of reasons) is:
 * proposal is basically okay, but have following mods that make it
   more palatable
     - re: file naming
        o subdirectories are a way of naming files conveniently
          (os dep, but the naming is already os dep so who cares?)
        o prepending or suffixing chars to distinguish is more expensive
          for servers
     - combining resource & finderinfo in appledouble
        o causes problems with semantics on file operations
        o proposed using a tag in the resource field that marks whether
          resource file is included in the appledouble file
     - holes are problematic since they get filled (on copies)
     - need basic information about file near top of file so we can get
       it in fewer reads
     - want file protections in afp client (if server says it handles them)
       (unlikely)

I've added some corrections in "[]"'s.

Date: Fri, 4 Sep 87 11:13:52 EDT
From: Charlie C. Kim <cck>
To: andrews@apple.UUCP
Subject: Re: Mac File Representation Update
Newsgroups: comp.protocols.appletalk
In-Reply-To: <6131@apple.UUCP>
Organization: Columbia University Center for Computing Activities
Cc: tom@citi.umich.edu, cck, bill

>2)  I think this means that we must define two different Home File System
>values for unix, since it runs on processors that use both ordering schemes.

This is sufficient for some binary data, but not for the general case
since different versions of unix [on different machines] have
different magic numbers, different sizes for floats, ints, etc.  The
issue is two-fold: files have some os dependent features (obviously)
and data encoded in files have machine dependenet [dependent]
features. You would be better served by adding a "machine type" that
has a map of "machine attributes" (like the Adobe AFD files for
PostScript machines) if you want to attempt to services [service] this
particular requirement.  In addition, there is a (lesser) need for
such a map for ["for" should be "of"] OS dependent features that
contain information about file name representations, end of line
mappings, etc.
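
As a rough sketch of the "machine type" idea: the table below maps a
machine type to the attributes needed to decode an integer.  The machine
names, the MACHINE_ATTRIBUTES table and both helper functions are
hypothetical, invented for this example; a real map would also have to
cover floats, magic numbers and so on.

```python
import struct

# Hypothetical machine-attribute map (names and fields are assumptions).
# byte_order uses the struct module's prefixes: ">" big-endian, "<" little-endian.
# int_format is the struct code for that machine's int: "i" = 4 bytes, "h" = 2 bytes.
MACHINE_ATTRIBUTES = {
    "big-endian-32": {"byte_order": ">", "int_format": "i"},
    "little-endian-32": {"byte_order": "<", "int_format": "i"},
    "little-endian-16": {"byte_order": "<", "int_format": "h"},
}


def _int_struct(machine):
    """Build the struct layout for one int on the given machine type."""
    attrs = MACHINE_ATTRIBUTES[machine]
    return struct.Struct(attrs["byte_order"] + attrs["int_format"])


def encode_int(value, machine):
    """Encode value the way the given machine type would store it."""
    return _int_struct(machine).pack(value)


def decode_int(raw, machine):
    """Decode bytes that were written by the given machine type."""
    return _int_struct(machine).unpack(raw)[0]


# The same bytes decode to different values under different byte orders.
raw = encode_int(1, "big-endian-32")
print(decode_int(raw, "big-endian-32"))     # 1
print(decode_int(raw, "little-endian-32"))  # 16777216
```

Two Home File System values would tell the two 32-bit cases apart, but
not the 16-bit one; that is the kind of difference a per-machine map
would capture.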

>3)  Someone claimed that "holes" in a file are not useful since someone may
>copy the file and not preserve the holes.  Our feeling is that it is more 
>general to allow them, and nothing is gained by prohibiting them.
Just to set the record straight.  No complaints about allowing holes -
just noting the problem with them - the real problem under Unix is
that it is not possible to determine if a file has a hole in it and
the entire paradigm is somewhat contradictory to the stream of bytes
philosophy of Unix files.  It's a particular problem if someone blithely
leaves a lot of space to allow for expansion.  Since the file formats
are designed for files that will be copied around, this should be at
least noted in the specification since the document suggests the use
of holes.
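
A small sketch of the copy problem, assuming a Unix-like system.  The
function names are made up for the example.  Whether the original file
actually occupies less disk space depends on the file system; what the
sketch shows is that a plain read returns the hole as ordinary zero
bytes, so a straightforward copy writes every one of them out.

```python
import os
import tempfile


def make_file_with_hole(path, hole_size):
    """Write 4 bytes, seek forward without writing, then write 4 more bytes."""
    with open(path, "wb") as f:
        f.write(b"HEAD")
        f.seek(hole_size, os.SEEK_CUR)  # skip ahead; nothing is written here
        f.write(b"TAIL")


def naive_copy(src, dst):
    """Copy byte for byte, the way a simple copy program would."""
    written = 0
    with open(src, "rb") as fin, open(dst, "wb") as fout:
        while True:
            chunk = fin.read(4096)
            if not chunk:
                break
            fout.write(chunk)  # zero bytes from the hole are written explicitly
            written += len(chunk)
    return written


def demo(hole_size=65536):
    """Return (bytes written by the copy, True if the hole read back as zeros)."""
    with tempfile.TemporaryDirectory() as d:
        src = os.path.join(d, "with_hole")
        dst = os.path.join(d, "copy")
        make_file_with_hole(src, hole_size)
        written = naive_copy(src, dst)
        with open(dst, "rb") as f:
            data = f.read()
        return written, data[4:4 + hole_size] == bytes(hole_size)
```

The copy program never sees anything that marks the gap as a hole, which
is why space left "for expansion" becomes real data on the first copy.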

>5)  We will "suggest" that creators of AppleSingle and AppleDouble Header files
>keep info like file attributes, real file name, Finder Info, etc. as close to
>the header as possible, but there is no way to enforce it.
Yes there is.  Make them required entries that are defined as part of the
header - e.g. make them fit into fixed positions.  The "strong"
suggestion is good enough for me though.  This is simply an
issue of efficiency and ease of programming. [Still prefer the "hard"
approach, it makes life much less complicated]

>7)  We dislike the idea of breaking resource forks out into yet another file,
>since two files is enough to have to worry about.  Shifting the resource fork
>entry to the end of the file to allow easy munging should be easy.
Yes, the key word is munging.  You have to play a lot of games to make
this work.  Every fork operation becomes a special case - some even
problematic (e.g. seek from end backwards or byte range lock entire
file).  You may not think this is a problem, but it is for us.  Aufs is
already over 11,000 lines of code (too much) and the biggest single
segment of this code (~4,000 lines) is that which maps the unix
to/from AFP semantics.  If you are going to allow holes, you might as
well allow split resource files.  [Okay, so this really isn't an
argument] The tradeoff is a time-space one, but in our case it is a
big time gain in relation to a small space loss.  If I'm willing to
deal with three files, why don't you let me?  As long as there is a
consistent method for figuring out where the files are, this is no
problem. [e.g. os dep. naming policy]
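
A sketch of the kind of special casing involved when a fork lives inside
a larger file.  The function and its parameters (fork_start, fork_len,
fork_pos) are hypothetical and are not taken from Aufs; they only show
that every fork-relative seek has to be translated and bounds-checked,
where a fork in its own file could pass the seek straight through.

```python
import os


def fork_to_file_offset(fork_start, fork_len, offset,
                        whence=os.SEEK_SET, fork_pos=0):
    """Translate a seek within an embedded fork into an absolute file offset.

    fork_start: where the fork begins inside the containing file
    fork_len:   current length of the fork
    fork_pos:   current position within the fork (used for SEEK_CUR)
    """
    if whence == os.SEEK_SET:
        pos = offset
    elif whence == os.SEEK_CUR:
        pos = fork_pos + offset
    elif whence == os.SEEK_END:
        # "End" must mean the end of the fork, not the end of the file.
        pos = fork_len + offset
    else:
        raise ValueError("unknown whence value")
    if pos < 0:
        raise ValueError("seek before the start of the fork")
    return fork_start + pos
```

For example, with a fork of 50 bytes starting at file offset 100,
seeking 10 bytes back from the end of the fork gives
fork_to_file_offset(100, 50, -10, os.SEEK_END) == 140.  A byte range
lock on the "entire fork" needs the same translation for both ends of
the range.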

In addition, by using a "tag" that indicates that the fork exists
outside this file, you will have compressed the AppleSingle and
AppleDouble formats into an integrated format.  Also, while there is
no single standard that will encompass all file naming schemes
required, there is no reason not to define some number of them and
encode them as the "tag" values valid for one or more OSes.  This
would allow different file name representations to co-exists [sorry,
"co-exist"] on a particular systems [woops, "system"].

Charlie C. Kim
User Services
Columbia University