jdb@mordor.UUCP (John Bruner) (11/19/84)
The S-1 Project at the Lawrence Livermore National Laboratory is porting UNIX to our own machine, the S-1 Mark IIA. One problem that we're currently trying to solve is the implementation of "tar". The crucial facts are:

1) The S-1 memory is organized into 36-bit words (addressable in 9-bit quarterwords). **Sigh.**
2) On the S-1, characters are nine bits and are stored one per quarterword.
3) UNIX does not distinguish file types (e.g. character vs. binary).

The problem is this: we want to be able to read/write "tar" tapes containing ASCII text files on both the VAX and the S-1. The "obvious" mapping is for the S-1 to associate each 8-bit byte with the low-order 8 bits of a 9-bit quarterword, discarding or zero-filling the uppermost bit in the quarterword as appropriate. A different mapping is required for binary files (because the ninth bit is significant): the S-1 packs 9-bit quarterwords into 8-bit bytes. (There is hardware support for this conversion operation.)

The issue is that, in order for the VAX to read S-1 text files and vice versa, text files must be stored using a different representation than binary files. There is no reliable way to determine whether a file should be "text" or "binary" when the tape is written, and no field in the "tar" header for recording this information even if the writer could reliably figure it out. If all files on the "tar" tape are stored with 9-bit quarterwords packed into 8-bit bytes, text files on the "tar" tape are unusable on the VAX. (Of course, we have programs which will pack/unpack them, but this must be done manually and it is a real hassle.)

I don't want to define an incompatible "tar" format for the S-1. I have used UNIX systems for M68000's which write tapes with byte reversal problems so that I could not read them directly on our VAX (it was necessary to pipe the input through "dd conv=swab"), and I feel that the intent of "tar" format is to provide a standard means for information exchange.
At this point, though, I can't think of any alternatives to this approach.

P.S. Our next machine will have 32-bit words, but it will also have hardware tags. An image copy of a file on tape will include both the 32-bit data and a 4-bit tag (probably stored in a fifth byte). While the 9/8-bit packing problems will go away, the key problem still remains: a "tar" text file should contain only characters (not tags), so binary files and text files must be stored in a different format. I don't see how to do this with the current "tar" definition.

--
John Bruner (S-1 Project, Lawrence Livermore National Laboratory)
MILNET: jdb@mordor.ARPA [jdb@s1-c]  (415) 422-0758
UUCP: ...!ucbvax!dual!mordor!jdb  ...!decvax!decwrl!mordor!jdb
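
The "obvious" text mapping described in the post above can be illustrated with a short Python sketch. It is not S-1 code: quarterwords are modelled as Python integers in the range 0..511, and the function names are hypothetical, chosen only for illustration. It shows why the mapping is safe for ASCII text but destroys binary data whose ninth bit is set.

```python
BYTE_MASK = 0xFF  # the low-order 8 bits of a 9-bit quarterword

def quarterwords_to_text_bytes(quarterwords):
    """Writing a tape: keep the low 8 bits of each quarterword, discard the ninth."""
    return bytes(q & BYTE_MASK for q in quarterwords)

def text_bytes_to_quarterwords(data):
    """Reading a tape: each 8-bit byte becomes one quarterword, ninth bit zero-filled."""
    return list(data)

# ASCII text never uses the ninth bit, so it survives the round trip.
text = [ord(c) for c in "tar"]
text_round_trip = text_bytes_to_quarterwords(quarterwords_to_text_bytes(text))

# Binary data may use the ninth bit (octal 400), which is silently lost.
binary = [0o777, 0o400, 0o001]
binary_round_trip = text_bytes_to_quarterwords(quarterwords_to_text_bytes(binary))
```

Here `text_round_trip` equals `text`, while `binary_round_trip` differs from `binary`. That loss is the reason binary files need the separate packed representation.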
sdo@u1100a.UUCP (Scott Orshan) (11/26/84)
The Sperry 1100 has 36-bit words and 9-bit bytes (quarter-words), just like the S-1. When we make tar tapes we give the user the option of specifying whether it is a Sperry-to-Sperry tape or a normal tape.

The normal tape is a text tape - only the low-order 8 bits of each byte are stored. It is transportable to a VAX, etc., and can be read back on the 1100 as well.

The Sperry-to-Sperry tape uses the 9-to-8 bit packing scheme to put 8 quarter-words into 9 tape bytes. It allows a tape to be made with binary files in it, but it can only be used on another 1100 (or maybe an S-1).

--
Scott Orshan
Bell Communications Research
201-981-3064
{ihnp4,allegra,pyuxww}!u1100a!sdo
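
The 9-to-8 bit packing mentioned in both posts (8 quarterwords of 9 bits fill exactly 9 tape bytes, 72 bits either way) might look like the following Python sketch. Neither post specifies the bit order used by the S-1 or Sperry hardware, so the most-significant-bit-first order here is an assumption, and the function names are hypothetical.

```python
def pack_quarterwords(quarterwords):
    """Pack 9-bit values into 8-bit bytes, most significant bit first (assumed order)."""
    if len(quarterwords) % 8 != 0:
        raise ValueError("expected a multiple of 8 quarterwords")
    out = bytearray()
    acc = 0    # bit accumulator
    nbits = 0  # number of not-yet-emitted bits held in acc
    for q in quarterwords:
        if not 0 <= q <= 0x1FF:
            raise ValueError("quarterword out of 9-bit range")
        acc = (acc << 9) | q
        nbits += 9
        while nbits >= 8:
            nbits -= 8
            out.append((acc >> nbits) & 0xFF)
        acc &= (1 << nbits) - 1  # keep only the leftover bits
    return bytes(out)

def unpack_quarterwords(data):
    """Inverse of pack_quarterwords: 9 bytes become 8 quarterwords."""
    if len(data) % 9 != 0:
        raise ValueError("expected a multiple of 9 bytes")
    out = []
    acc = 0
    nbits = 0
    for b in data:
        acc = (acc << 8) | b
        nbits += 8
        while nbits >= 9:
            nbits -= 9
            out.append((acc >> nbits) & 0x1FF)
        acc &= (1 << nbits) - 1
    return out

words = [0o777, 0o000, 0o400, 0o001, 0o123, 0o456, 0o701, 0o252]
packed = pack_quarterwords(words)

# The same packing applied to ASCII text no longer lines up on byte boundaries.
packed_text = pack_quarterwords([ord(c) for c in "abcdefgh"])
```

The round trip through `unpack_quarterwords(packed)` preserves every ninth bit, but `packed_text` is not the byte string `b"abcdefgh"`, which is why a machine with 8-bit bytes cannot read text stored this way without unpacking it first.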