[comp.protocols.appletalk] Remote file representation

joel%beowulf@palomar.UUCP (08/25/87)

First, I'd like to thank Apple for sharing this before it's set
in stone.  There are many Mac programmers on USENET (and also on
Delphi and Bix, which receive edited copies of USENET news)
who can participate.

The spec seems well thought-out, but I have a few suggestions:

1. Home File System should include:
	'vms ' or $766D7320
	'os2 ' or $6F733220
   I'm not sure if 'unix' should be treated as a monolithic
   whole, as the syntax of file names, at least, differs significantly
   from System V (14 chars max) to BSD.
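
A four-character identifier and its hex form are the same four bytes,
so either spelling can be derived from the other.  The sketch below
shows the conversion; the function names are made up for illustration
and the MacRoman encoding of the characters is an assumption.

```python
def ostype_to_int(code: str) -> int:
    """Pack a four-character code into a 32-bit big-endian integer."""
    raw = code.encode("mac_roman")  # assumption: codes are MacRoman text
    if len(raw) != 4:
        raise ValueError("a four-character code must be exactly four bytes")
    return int.from_bytes(raw, "big")


def ostype_to_hex(code: str) -> str:
    """Render the code in the $XXXXXXXX notation used above."""
    return "$%08X" % ostype_to_int(code)


print(ostype_to_hex("vms "))  # prints $766D7320
```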

2. The AppleDouble format does not address one important issue.
   The data fork of a Macintosh file is not normally readable
   by another file system.  To contrast it with three with
   which I am familiar:
	*) Mac: CR at end of each record
	A) UNIX: LF at end of each record
	B) MS-DOS: CR-LF at end of each record
	C) VMS: preceded by a binary length
   If the data fork is to be readable, then it must be translated
   into the native format, and its original format saved.

   I think there needs to be a line-translation entry to indicate
   the original line delimiter so it can be restored from the
   native format
	Record Delim 9  1 or more delimiter characters
   For example, a Mac file stored on MS-DOS would be stored with CR-LF
   delimiters but with Record Delim = $0D.  (on VMS, you could use
   RAT=CR, which would be an exact copy of the Mac data fork).

   The AFP or some other protocol also needs either record-
   at-a-time i/o or a "What is this file's delimiter" query, so
   that a Mac program can read a file directly from a UNIX
   server (in LF-delimited form) and vice versa.
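
A minimal sketch of what the proposed Record Delim entry would buy,
assuming the entry simply holds the original delimiter bytes.  The
function names are hypothetical, and the VMS length-prefixed case is
not covered.

```python
CR, LF, CRLF = b"\r", b"\n", b"\r\n"


def to_native(data: bytes, record_delim: bytes, native_delim: bytes) -> bytes:
    """Rewrite every original delimiter as the host's native delimiter."""
    return native_delim.join(data.split(record_delim))


def from_native(data: bytes, record_delim: bytes, native_delim: bytes) -> bytes:
    """Restore the delimiter recorded in the (proposed) Record Delim entry."""
    return record_delim.join(data.split(native_delim))


mac_fork = b"line one\rline two\r"
on_dos = to_native(mac_fork, CR, CRLF)        # stored with CR-LF delimiters
restored = from_native(on_dos, CR, CRLF)      # Record Delim = $0D
```

The round trip is only exact when the original data does not already
contain the native delimiter (a Mac file with embedded LFs stored on
UNIX would come back with those LFs turned into CRs).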

3. The use of the icon ought to be better defined, or this will
   vary all over the place.  For example, the spec ought to say
   something like:
	Applications normally include their icons, but documents
	do not.
   (or maybe all always include their icon)

4. Again, there ought to be file naming guidelines for mapping
   (recommendations for consistency, not absolute rules) so that 
   there is similarity between various systems.  For example,
     A) Case is stripped on systems that don't support case, so
	Foo.C becomes FOO.C
     B)	Spaces are mapped to "_" if available, otherwise to
	another available non-alphanumeric special character.
     C)	Standard mappings for Macintosh extended characters.
	For example, all accented letters are stripped to their
	IUMagString equivalents,
		`e becomes e
		c, becomes c
	etc.  Characters that have no equivalent are removed
	((R), TM, (C), etc.) since they are likely to be noise
	characters.
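
Guidelines A to C could look roughly like the sketch below.  It is not
IUMagString: it substitutes Unicode decomposition for the Mac's own
comparison tables, and the function name and parameters are
hypothetical.

```python
import unicodedata


def map_name(mac_name: str, case_sensitive: bool = False,
             space_sub: str = "_") -> str:
    """Map a Mac file name onto a more restrictive host name."""
    # C) Decompose accented letters and drop the combining marks.
    decomposed = unicodedata.normalize("NFD", mac_name)
    base = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # C) Remove what is still outside ASCII: (R), TM, (C) and the like.
    ascii_only = "".join(ch for ch in base if ord(ch) < 128)
    # B) Spaces become "_" (or whatever the host allows instead).
    mapped = ascii_only.replace(" ", space_sub)
    # A) Strip case where the host does not support it.
    return mapped if case_sensitive else mapped.upper()


print(map_name("Foo.C"))                         # FOO.C
print(map_name("R\u00e9sum\u00e9 \u00a91987"))   # RESUME_1987
```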

5. Presumably if a new file is being created in this directory
   and it truncates to an existing AppleSingle or AppleDouble
   file, the program will check for the actual name and, if
   different, choose a new truncated name to save the file.

   For example,

	'AntiDisestablish Text' and 'AntiDisestablish Text 2'
   might be stored as
	ANTIDIST. and ANTIDIS2.
   on an MS-DOS system.
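
One way the collision check might work, as a sketch only.  The
`directory` dict stands in for reading the real name out of each
existing AppleSingle or AppleDouble file, and the truncation rule is
an assumption (it gives ANTIDISE rather than ANTIDIST for the first
file).

```python
def short_name(mac_name: str, directory: dict, limit: int = 8) -> str:
    """Return a unique truncated name; directory maps short -> Mac name."""
    base = "".join(ch for ch in mac_name.upper() if ch.isalnum())
    candidate = base[:limit]
    counter = 2
    # A clash only matters if the stored file has a different actual name.
    while candidate in directory and directory[candidate] != mac_name:
        suffix = str(counter)
        candidate = base[:limit - len(suffix)] + suffix
        counter += 1
    directory[candidate] = mac_name
    return candidate


names = {}
first = short_name("AntiDisestablish Text", names)     # ANTIDISE
second = short_name("AntiDisestablish Text 2", names)  # ANTIDIS2
```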

This is obviously an area where a standard is sorely needed,
and as an A/UX user, I hope A/UX 1.0 implements the final 
AppleDouble proposal.

	Joel West		
	Palomar Software, Inc., POB 2635, Vista, CA  92083
	joel%palomar.uucp@beowulf.ucsd.edu
	ihnp4!crash!palomar!joel	joel@palomar.cts.com
	AppleLink: D0619			MCI: Palomar

elwell%tut.cis.ohio-state.edu@osu-eddie.UUCP (Clayton Elwell) (08/26/87)

In article <8708250004.AA03964@beowulf.UCSD.EDU>
joel%beowulf@palomar.UUCP writes:
>[...]
>
>2. The AppleDouble format does not address one important issue.
>   The data fork of a Macintosh file is not normally readable
>   by another file system.  To contrast it with three with
>   which I am familiar:
>	*) Mac: CR at end of each record
>	A) UNIX: LF at end of each record
>	B) MS-DOS: CR-LF at end of each record
>	C) VMS: preceded by a binary length
>   If the data fork is to be readable, then it must be translated
>   into the native format, and its original format saved.
>
>   I think there needs to be a line-translation entry to indicate
>   the original line delimiter so it can be restored from the
>   native format

We talked about this issue at the file representation session before
Macworld Expo.  The problem is that there are text files, and there
are other files (I, for example, routinely move TeX DVI files back and
forth between my Mac and our Pyramid 98x).  Unfortunately, it's not
as simple as putting in a delimiter specification (see below for my
solution (hack?)).

The only way I could see to do this generally would be to define some
sort of translation script language, but this seems to fall outside of
the file format itself.

>	Record Delim 9  1 or more delimiter characters
>   For example, a Mac file stored on MS-DOS would be stored with CR-LF
>   delimiters but with Record Delim = $0D.  (on VMS, you could use
>   RAT=CR, which would be an exact copy of the Mac data fork).
>
>   The AFP or some other protocol also needs either record-
>   at-a-time i/o or a "What is this file's delimiter" query, so
>   that a Mac program can read a file directly from a UNIX
>   server (in LF-delimited form) and vice versa.

Coming from the point of view of writing a server, I translate on the
fly between native text format and Mac text format.  This does assume
that you're not going to seek around in text files randomly (although
in my case even this works, since my host system is UNIX).  If you're
using it for archival purposes, a little utility to extract or replace
a text file wouldn't be hard, and is usable in practice (it's what
I've been doing for the last 6 months or so).  The combination of the
file system identifier and file type should be enough information.
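
Seeking presumably keeps working on a UNIX host because CR to LF is a
one-byte-for-one-byte swap, so file offsets do not move.  The sketch
below (illustrative function names) contrasts that with a CR-LF host,
where every line end shifts the offsets that follow.

```python
def mac_to_unix(chunk: bytes) -> bytes:
    """CR -> LF: same length, so offsets are preserved."""
    return chunk.replace(b"\r", b"\n")


def mac_to_dos(chunk: bytes) -> bytes:
    """CR -> CR-LF: one extra byte per line, so offsets drift."""
    return chunk.replace(b"\r", b"\r\n")


mac = b"alpha\rbeta\rgamma\r"
offset = mac.index(b"gamma")                      # 11 on the Mac side
same_on_unix = mac_to_unix(mac)[offset:offset + 5] == b"gamma"
same_on_dos = mac_to_dos(mac)[offset:offset + 5] == b"gamma"
```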

>3. The use of the icon ought to be better defined, or this will
>   vary all over the place.  For example, the spec ought to say
>   something like:
>	Applications normally include their icons, but documents
>	do not.
>   (or maybe all always include their icon)

I don't see how this is really a problem.  Some things will have
icons, and some won't.  In a heterogeneous environment, I don't see a
simple way to avoid that.  You will fairly often have documents
isolated from their associated applications (if any), but you still
want them to have appropriate icons.  Then again, you don't want to
worry about icons for MS-DOS files.

>4. Again, there ought to be file naming guidelines for mapping
>   (recommendations for consistency, not absolute rules) so that 
>   there is similarity between various systems.  For example,
>     A) Case is stripped on systems that don't support case, so
>	Foo.C becomes FOO.C
>     B)	Spaces are mapped to "_" if available, otherwise to
>	another available non-alphanumeric special character.
>     C)	Standard mappings for Macintosh extended characters.
>	For example, all accented letters are stripped to their
>	IUMagString equivalents,
>		`e becomes e
>		c, becomes c
>	etc.  Characters that have no equivalent are removed
>	((R), TM, (C), etc.) since they are likely to be noise
>	characters.

This is getting pretty file system dependent, which as I understand it
was why it was left open.  For most systems, a simple scheme can be
put together based on the naming restrictions.  Since the native
(Macintosh) name is stored in the file, you don't really lose
anything.  Once again, this is more of an issue of application than
formatting.

>5. Presumably if a new file is being created in this directory
>   and it truncates to an existing AppleSingle or AppleDouble
>   file, the program will check for the actual name and, if
>   different, choose a new truncated name to save the file.
>
>   For example,
>
>	'AntiDisestablish Text' and 'AntiDisestablish Text 2'
>   might be stored as
>	ANTIDIST. and ANTIDIS2.
>   on an MS-DOS system.

See above.

>This is obviously an area where a standard is sorely needed,
>and as an A/UX user, I hope A/UX 1.0 implements the final 
>AppleDouble proposal.
>
>	Joel West		
>	Palomar Software, Inc., POB 2635, Vista, CA  92083
>	joel%palomar.uucp@beowulf.ucsd.edu
>	ihnp4!crash!palomar!joel	joel@palomar.cts.com
>	AppleLink: D0619			MCI: Palomar

Remember that this is a draft, and thus doesn't have all of the
examples and recommendations that we are used to in Apple docs (final
ones, anyway :-)).

I've been playing with AppleDouble in my server, and it seems to be
quite usable.  It's sure a lot better than, say, MacBinary...  I
second the motion to put it in the A/UX toolbox (as if they needed
more things to work on :-)).

-=-

Clayton Elwell

Arpa/CSNet:	Elwell@Ohio-State.ARPA
UUCP:		...!cbosgd!osu-eddie!elwell
Voice:		(614) 292-6546

jww@sdcsvax.UCSD.EDU (Joel West) (08/26/87)

In article <4035@osu-eddie.UUCP>, elwell%tut.cis.ohio-state.edu@osu-eddie.UUCP (Clayton Elwell) writes:
. We talked about this issue at the file representation session before
. Macworld Expo.  The problem is that there are text files, and there
. are other files (I, for example, routinely move TeX DVI files back and
. forth between my Mac and our Pyramid 98x).  Unfortunately, it's not
. as simple as putting in a delimiter specification (see below for my
. solution (hack?)).
....
. Coming from the point of view of writing a server, I translate on the
. fly between native text format and Mac text format.  This does assume
. that you're not going to seek around in text files randomly (although
. in my case even this works, since my host system is UNIX).  If you're
. using it for archival purposes, a little utility to extract or replace
. a text file wouldn't be hard, and is usable in practice (it's what
. I've been doing for the last 6 months or so).  The combination of the
. file system identifier and file type should be enough information.

Once the data fork in AppleDouble is fully converted to the native
format, translating on the fly seems the only way to go.  The main
problem seems to be exactly the issue Clayton raises: which files
should be translated and which should not?

There are at least two solutions I can see:
    1.	The remote file system has an (administrator-editable) list 
	of file types that are translatable, presumably including TEXT
	and EPSF but not WDBN or 'DVI '; or
    2.	The server always does TEXT (for upward compatibility)
	and then there's another file bit somewhere added to
	Mac files indicating that this is CR-delimited text,
	thus allowing the server to know which ones are translated.
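
Both options fit in a few lines.  This is a sketch under assumptions:
the one-type-per-line list format with '#' comments is invented here,
and `text_flag` stands in for the extra file bit of option 2, which
does not exist in any current spec.

```python
DEFAULT_TRANSLATABLE = {"TEXT", "EPSF"}


def load_translatable(lines):
    """Parse an administrator-edited list: one file type per line."""
    types = set()
    for line in lines:
        entry = line.split("#", 1)[0].strip()
        if entry:
            types.add(entry.ljust(4)[:4])  # pad 'DVI' to 'DVI '
    return types


def should_translate(file_type, translatable=DEFAULT_TRANSLATABLE,
                     text_flag=False):
    """Option 1: type is on the list.  Option 2: the file bit is set."""
    return file_type in translatable or text_flag


admin_list = load_translatable(["TEXT\n", "EPSF  # PostScript\n", "# none\n"])
```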

Otherwise I agree with Clayton: it's reasonable to expect the final
spec will include recommendations and examples that will nudge 
implementors in generally the same direction, just like the HIG, 
Inside Mac and the Tech Notes.

elwell%tut.cis.ohio-state.edu@osu-eddie.UUCP (Clayton Elwell) (08/27/87)

In article <3725@sdcsvax.UCSD.EDU> jww@sdcsvax.UCSD.EDU (Joel West) writes:
>
>Once the data fork in AppleDouble is fully converted to the native
>format, translating on the fly seems the only way to go.  The main
>problem seems to be exactly the issue Clayton raises: which files
>should be translated and which should not?
>
>There are at least two solutions I can see:
>    1.	The remote file system has an (administrator-editable) list 
>	of file types that are translatable, presumably including TEXT
>	and EPSF but not WDBN or 'DVI '; or

Great minds think alike :-).  I have this list hardwired into the code,
since I am doing a little more than just CR-LF mapping (see below).

>    2.	The server always does TEXT (for upward compatibility)
>	and then there's another file bit somewhere added to
>	Mac files indicating that this is CR-delimited text,
>	thus allowing the server to know which ones are translated.

This is getting a little off of the subject, but I'll bring it up anyway.
There are two basic cases:  the host OS is looking at a file created
by a Mac program, and the Mac is looking at a file created by a host
program.  Here's how I do it now:

Case 1: Mac-created file.

If the file is of type TEXT, translate on the fly to host format as it
is being written out (Use AppleDouble).  Otherwise, check the type
against a magic list of types that the host understands.  If it's in
the list, use AppleDouble (but no translation on the data fork).  If
it's not, use AppleSingle.
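
Written out as a decision function, Case 1 is roughly the following.
The contents of the "magic list" are not given here, so the set below
is a placeholder, and the function name is illustrative.

```python
HOST_UNDERSTOOD = {"EPSF", "DVIF"}  # placeholder for the magic list


def plan_mac_file(file_type, host_types=HOST_UNDERSTOOD):
    """Return (container, translate_data_fork) for a Mac-created file."""
    if file_type == "TEXT":
        return ("AppleDouble", True)    # translate while writing out
    if file_type in host_types:
        return ("AppleDouble", False)   # host can read the data fork as-is
    return ("AppleSingle", False)       # opaque to the host


print(plan_mac_file("TEXT"))  # ('AppleDouble', True)
```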

Case 2: Host-created file.

This is where life gets interesting.  The basic idea is to leave the
file alone as much as possible.  The way I do this is to pretend the
file is in AppleDouble, but create the resource/info information on
the fly WITHOUT STORING IT BACK unless the Mac explicitly changes it.
The type of file is inferred by looking at the magic number and/or the
first K or so of the file.  The creator gets set to 'UNIX', and the file
type gets set as follows:

	ASCII text: 'TEXT'
		This includes PostScript and shell scripts.
		Translation is done for these files.
	executable (with execute permission set): 'AOUT'
	executable (without execute permission set): 'UOBJ'
	archive: 'ARCV'
	DVI file: 'DVIF' (matches MacTeX from FTL Systems)

This list could be extended as far as is needed.  The server is also
set up so that there are entries for these types in the Desktop
database, so they each pop up with an appropriate icon.
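
A rough sketch of the Case 2 inference, not the server's actual code.
The ar and DVI signatures are the standard ones; the executable magic
numbers vary by machine, so EXEC_MAGICS is a placeholder, and what an
unrecognized file should become is left open.

```python
AR_MAGIC = b"!<arch>\n"             # UNIX ar archive header
DVI_PREAMBLE = bytes([247, 2])      # DVI 'pre' opcode plus format id
EXEC_MAGICS = (b"\x01\x07", b"\x01\x08", b"\x01\x0b")  # placeholder values


def looks_like_text(head: bytes) -> bool:
    """True if every byte is printable ASCII or common whitespace."""
    return all(b in b"\t\n\x0c\r" or 32 <= b < 127 for b in head)


def infer_type(head: bytes, executable_bit: bool):
    """Return (creator, file_type, translate) from the first K or so."""
    if head.startswith(AR_MAGIC):
        ftype = "ARCV"
    elif head.startswith(DVI_PREAMBLE):
        ftype = "DVIF"
    elif head.startswith(EXEC_MAGICS):
        ftype = "AOUT" if executable_bit else "UOBJ"
    elif looks_like_text(head):
        ftype = "TEXT"              # includes PostScript and shell scripts
    else:
        ftype = None                # not specified above
    return ("UNIX", ftype, ftype == "TEXT")


print(infer_type(b"#!/bin/sh\necho hi\n", True))  # ('UNIX', 'TEXT', True)
```

Nothing here is stored back; the result would be computed on each
lookup until the Mac explicitly changes the type or creator.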


	
-=-

Clayton Elwell

Arpa/CSNet:	Elwell@Ohio-State.ARPA
UUCP:		...!cbosgd!osu-eddie!elwell
Voice:		(614) 292-6546