[comp.sys.apple] Transfer of files by E-mail

delaney%wnre.aecl.cdn%ubc.CSNET@RELAY.CS.NET.UUCP (03/03/87)

The following article appears on a number of commercial information systems
and may be of interest to all connected to info-apple.  I have both programs 
mentioned and will put them on the system if enough interest is expressed
I will put them though to Info-APPLE. 

****************************************************************************

                         Binary II
                               ---------
                      Apple II Binary File Format
                             developed by
                            Gary B. Little
Version History
---------------
November 24, 1986  : Initial release.
Background
----------
Transferring Apple II files in binary form to commercial information 
services like CompuServe, Delphi, GEnie, and The Source is, to put it 
mildly, a frustrating exercise. (For convenience, I'll refer to such 
services, and any other non-Apple II systems, as "hosts.") Although 
most hosts are able to receive a file's *data* in binary form (using 
the Xmodem protocol, for example), they don't receive the file's all-
important attribute bytes. All the common Apple II operating systems, 
notably ProDOS, store the attributes inside the disk directory, not 
inside the file itself. 
The ProDOS attributes are the access code, file type code, auxiliary 
type code, storage type code, date of creation and last modification, 
time of creation and last modification, the file size, and the name of 
the file itself. (All these terms are defined in Apple's "ProDOS 
Technical Reference Manual" or in the book "Apple ProDOS: Advanced 
Features for Programmers" by Gary Little.) It is usually not possible 
to use a ProDOS file's data without knowing what the file's attributes 
are (particularly the file type code, auxiliary type code, and size). 
This means ProDOS files uploaded in binary form to a host are useless 
to those who download them. The same is true for DOS 3.3 and Pascal 
files. 
Most Apple II communications programs use special protocols for 
transferring file attributes during a binary file transfer, but none 
of these protocols have been implemented by hosts. These programs are 
only useful for exchanging files with another Apple II running the 
same program. 
At present, the only acceptable way to transfer an Apple II file to a 
host is to convert it into lines of text and send it as a textfile. 
Such a textfile would contain a listing of an Applesoft program, or a 
series of Apple II system monitor "enter" commands (e.g., 0300:A4 32 
etc.). Someone downloading such a file can convert it to binary form 
using the Applesoft EXEC command. 
The main disadvantage of this technique is that the text version of 
the file is over three times the size of the original binary file, 
making it expensive (in terms of time and $$$) to upload and download. 
It is also awkward, and sometimes impossible, to perform the binary-
to-text or text-to-binary conversion. 
The solution to the problem is to upload an encoded binary file which 
contains not just the file's data, but the file's attributes as well. 
Someone downloading such a file, say using Xmodem, can then use a 
conversion program to strip the attributes from the file and create a 
file with the required attributes. 
To make this technique truly useful, however, the Apple II community 
must agree on a format for this encoded binary file. A variety of 
incompatible formats, all achieving the same general result, cannot be 
allowed to appear. 
It is proposed that the Binary II format described in this document be 
adopted. What follows is a description of the Binary II format in 
sufficient detail to allow software developers to implement it in 
Apple II communications programs. 
The Binary II File Format
-------------------------
The Binary II form of a standard file consists of a 128-byte file 
information header followed by the file's data. The data portion of 
the file is padded with nulls ($00 bytes), if necessary, to ensure the 
data length is an even multiple of 128. As a result, the Binary II 
form of a file is never more than 255 bytes longer than the original 
file. 
The file information header contains four ID bytes, the attributes of 
the file (in ProDOS 8 form), and some control information. Here is the 
structure of the header: 
      Offset  Length                  Contents
      ------  ------   ---------------------------------------      
       +0       1      ID byte: always $0A
       +1       1      ID byte: always $47
       +2       1      ID byte: always $4C
       +3       1      access code
       +4       1      file type code
       +5       2      auxiliary type code
       +7       1      storage type code
       +8       2      size of file in 512-byte blocks
       +10      2      date of modification
       +12      2      time of modification
       +14      2      date of creation
       +16      2      time of creation
       +18      1      ID byte: always $02
       +19      1      [reserved]
       +20      3      end-of-file (EOF) position
       +23      1      length of filename/partial pathname
       +24      64     ASCII filename or partial pathname
       +88      23     [reserved, must be zero]
       +111     1      ProDOS 16 access code (high)
       +112     1      ProDOS 16 file type code (high)
       +113     1      ProDOS 16 storage type code (high)
       +114     2      ProDOS 16 size of file in blocks (high)
       +116     1      ProDOS 16 end-of-file position (high)
       +117     4      disk space needed
       +121     1      operating system type
       +122     2      native file type code
       +124     1      phantom file flag
       +125     1      data flags
       +126     1      Binary II version number
       +127     1      number of files to follow
Multi-byte numeric quantities are stored with their low-order bytes 
first, the same order expected by ProDOS. All reserved bytes must be 
set to zero; they may be used in future versions of the protocol.  
To determine the values of the attributes to be put into a file 
information header for a ProDOS file, you can use the ProDOS 
GET_FILE_INFO and GET_EOF MLI commands. 
   Note: Some file attributes returned by ProDOS 16 commands
         are one or two bytes longer than the attributes
         returned by the corresponding ProDOS 8 commands. At
         present, these extra bytes are always zero, and
         probably will remain zero forever. In any event,
         place the extra bytes returned by ProDOS 16 in the
         header at +114 to +119. ProDOS 8 communications
         programs should zero these header locations. 
The "disk space needed" bytes contain the number of 512-byte disk 
blocks the files inside the Binary II file will occupy after they've 
been removed from the Binary II file. (The format of a Binary II file 
containing multiple files is described below.) If the number is zero, 
the person uploading the file did not bother to calculate the space 
needed. The "disk space needed" must be placed in the file information 
header for the first file inside the Binary II file; it can be set to 
zero in subsequent headers. A downloading program can inspect "disk 
space needed" and abort the transfer immediately if there isn't enough 
disk free space. 
The value of the "operating system type" byte indicates the native 
operating system of the file: 
        $00 = ProDOS 8, ProDOS 16, or SOS
        $01 = DOS 3.3
        $02 = Pascal
        $03 = CP/M
        $04 = MS-DOS
Note that even if a file is not a ProDOS file, the attributes in the 
file information header, including the name, must be inserted in 
ProDOS form. Instructions on how to do this for DOS 3.3 files are 
given later in this document. Similar considerations apply for the 
files of other operating systems. 
The "native file type code" has meaning only if the "operating system 
type" is non-zero. It is set to the actual file type code assigned to 
the file by its native operating system. (Some operating systems, such 
as CP/M and MS-DOS, do not use file type codes, however.) Contrast 
this with the file type code at +4, which is the closest equivalent 
ProDOS file type code. The "native file type code" is needed to 
distinguish files which have the same *ProDOS* file type, but which 
may have different file types in their native operating system. Note 
that if the file type code is only byte long (the usual case), the 
high-order byte of "native file type code" is set to zero. 
The "phantom file flag" byte indicates whether a receiver of the 
Binary II file should save the file which follows (flag is zero) or 
ignore it (flag is non-zero). It is anticipated that some 
communications programs will use phantom files to pass non-essential 
explanatory notes or encoded information which would be understood 
only by a receiver using the same communications program. Such 
programs must not rely on receiving a phantom file, however, since 
this would mean they couldn't handle Binary II files created by other 
communications programs. 
The first two bytes in a phantom file *must* contain an ID code unique 
to the communications program. Developers must obtain ID codes from 
Gary Little to ensure uniqueness (see below for his address). Here is 
a current list of approved ID codes for phantom files used by Apple II 
communications programs: 
        $00 $00  =  [generic]
        $00 $01  =  Point-to-Point
        $00 $02  =  Tele-Master Communications System
Developers of communications programs are responsible for defining and 
publishing the structures of their phantom files. 
The ID bytes appear in the first two bytes of the phantom file. 
Phantom files having a generic ID code of zero must contain lines of 
text terminated by a $00 byte. The text must begin at the third byte 
in the file. 
The "data flags" byte is a bit vector indicating whether the data 
portion of the Binary II file has been compressed, encrypted, or 
packed. If bit 7 (the high-order bit) is set to 1, the file is 
compressed. If bit 6 is 1, the file is encrypted. If bit 0 is 1, the 
file is a sparse file that is packed. A Binary II downloading program 
can examine this byte and warn the user, when necessary, that the file 
must be expanded, decrypted, or unpacked. The person uploading a 
Binary II file may use any convenient method for compressing, 
encrypting, or packing the file but is responsible for providing 
instructions on how to restore the file to its original state. 
This initial release of Binary II has a "Binary II version number" of 
$00. 
Handling Multiple Files
-----------------------
An appealing feature of Binary II is that a single Binary II file can 
hold multiple disk files, making it easy to keep a group of related 
files "glued" together when they're sent to a host. 
The structure of a Binary II file containing multiple disk files is 
what you might expect: it is a series of images of individual Binary 
II files. For example, here is the general structure of a Binary II 
file containing three disk files: 
 start                                                           end
 -------------------------------------------------------------------
 | Header #1 | #1 Data | Header #2 | #2 Data | Header #3 | #3 Data |
 -------------------------------------------------------------------
   +127 = 2              +127 = 1              +127 = 0
The data areas following each header end on a 128-byte boundary. 
The "number of files to follow" byte (at offset 127) in the file 
information header for each disk file contains the number of disk 
files that follow it in the Binary II file. It will be zero in the 
header for the last disk file in the group. 
Filenames and Partial Pathnames
-------------------------------
Notice that you can put a standard ProDOS filename or a partial 
pathname in the file information header (but never a complete 
pathname). *Beware!* Don't use a partial pathname unless you've 
included, earlier on in the Binary II file, file information headers 
for each of the directories referred to in the partial pathname. Such 
a header must have its "end of file position" bytes set to zero, and 
no data blocks for the subdirectory file must follow it. 
For example, if you want to send a file whose partial pathname is 
HELP/GS/READ.ME, first send a file information header defining the 
HELP/ subdirectory, then one defining the HELP/GS/ subdirectory. If 
you don't, someone downloading the Binary II file won't be able to 
convert it because the necessary subdirectories will not exist. 
Filename Convention
-------------------
Whenever a file is sent to a host, the host asks the sender to provide 
a name for it. If it's a Binary II file, the name provided should end 
in .BNY so that its special form will be apparent to anyone viewing a 
list of filenames. 
Identifying Binary II Files
---------------------------
You can determine if a file is in Binary II form by examining the ID 
bytes at offsets +0, +1, +2, and +18 from the beginning of the file. 
They must be $0A, $47, $4C, and $02, respectively. 
Once you identify a Binary II file, you can use the data in the file 
information header to create and open a ProDOS file with the correct 
name and attributes (using the CREATE, OPEN and SET_FILE_INFO 
commands), transfer the file data in the Binary II file to the ProDOS 
file, set the ProDOS file size (with SET_EOF), then close the ProDOS 
file. You would repeat this for each file contained inside the Binary 
II file. 
   Note: The number of 128-byte data blocks following the
         file information header must be derived from the
         "end-of-file position" attribute (EOF) not the "size
         of file in blocks" attribute. Calculate the number
         by dividing EOF by 128 and adding one to the result
         if EOF is not 0 or an exact multiple of 128. 
Exception: If the file information header defines a subdirectory (the 
file type code is 15), simply CREATE the subdirectory file. Do not 
OPEN it and do not set its size with SET_EOF. 
Ideally, all this conversion work will be done automatically by a 
communications program during an Xmodem (or other binary protocol) 
download. If not, a separate conversion program will have to be run 
after the Binary II file has been received and saved to disk. Gary 
Little has published a public domain program, called BINARY.DWN, that 
will do this for you. (A related program, BINARY.UP, combines multiple 
ProDOS files into one Binary II file which can then be uploaded to a 
host.) 
DOS 3.3 Considerations
----------------------
With a little extra effort, you can also convert DOS 3.3 files to 
Binary II form. This involves translating the DOS 3.3 file attributes 
to the corresponding ProDOS attributes so that you can build a proper 
file information header. Here is how to do this: 
   (1) Set the name to one that adheres to the stricter ProDOS naming 
       rules. 
   (2) Set the ProDOS file type code, auxiliary type code, and access 
       code to values which correspond to the DOS 3.3 file type: 
          DOS 3.3  |   ProDOS     ProDOS    ProDOS
         file type | file type   aux type   access
        -----------|----------- ---------- --------
         $00 ( T)  | $04 (TXT)    $0000      $E3
         $80 (*T)  | $04 (TXT)    $0000      $21
         $01 ( I)  | $FA (INT)    $0C00      $E3
         $81 (*I)  | $FA (INT)    $0C00      $21
         $02 ( A)  | $FC (BAS)    $0801      $E3
         $82 (*A)  | $FC (BAS)    $0801      $21
         $04 ( B)  | $06 (BIN)     (*)       $E3
         $84 (*B)  | $06 (BIN)     (*)       $21
         $08 ( S)  | $06 (BIN)    $0000      $E3
         $88 (*S)  | $06 (BIN)    $0000      $21
         $10 ( R)  | $FE (REL)    $0000      $E3
         $90 (*R)  | $FE (REL)    $0000      $21
         $20 ( A)  | $06 (BIN)    $0000      $E3
         $A0 (*A)  | $06 (BIN)    $0000      $21
         $40 ( B)  | $06 (BIN)    $0000      $E3
         $C0 (*B)  | $06 (BIN)    $0000      $21
         (*) Set the aux type for a B file to the
             value stored in the first two bytes
             of the file (this is the default load
             address). 
    (3) Set the storage type code to $01.
    (4) Set the size of file in blocks, date of creation, date of 
        modification, time of creation, and time of modification to 
        $0000. 
    (5) Set the end-of-file position to the length of the DOS 3.3 
        file, in bytes. For a B file (code $04 or $84), this number is 
        stored in the third and fourth bytes of the file. For an I 
        file (code $01 or $81) or an A file (code $02 or $82), this 
        number is stored in the first and second bytes of the file. 
    (6) Set the operating system type to $01.
    (7) Set the native file type code to the value of the DOS 3.3 file 
        type code. 
Attribute bytes inside a DOS 3.3 file (if any) must *not* be included 
in the data portion of the Binary II file. This includes the first 
four bytes of a B (Binary) file, and the first two bytes of an A 
(Applesoft) or I (Integer BASIC) file. 
Acknowledgements
----------------
Thanks to Glen Bredon for suggesting that partial pathnames be allowed 
in file information headers. Thanks also to Shawn Quick for suggesting 
the "phantom file" byte, to Scott McMahan for suggesting the 
compression and encryption bits in the "data flags" byte, and to 
William Bond for suggesting the "disk space needed" bytes. Finally, a 
big thank you to Neil Shapiro, Chief Sysop of MAUG, for supporting the 
development of the Binary II format and helping it become a true 
standard. 
Feedback and Support
--------------------
Send any comments or questions concerning the Binary II file format 
to: 
   Gary B. Little
   #210 - 131 Water Street
   Vancouver, British Columbia
   Canada  V6B 4M3
   (604) 681-3371 
   CompuServe : 70135,1007
   Delphi     : GBL
   MCI Mail   : 658L6
Gary developed the Point-to-Point telecommunications program published 
by Pinpoint Publishing. He has also written several books on how to 
program Apple computers: "Inside the Apple IIe," "Inside the Apple 
IIc," "Apple ProDOS: Advanced Features for Programmers," and "Mac 
Assembly Language: A Guide for Programmers." He is currently a 
Contributing Editor for A+ magazine and writes A+'s monthly Rescue 
Squad column. Gary has also published articles in Nibble, Micro, Call 
-A.P.P.L.E, and Softalk.