dpz@klinzhai.RUTGERS.EDU (David P. Zimmerman) (01/16/87)
Hello all,

Here is shar #2 of 2. Enjoy!

dpz

# This is a shell archive.
# Remove everything above and including the cut line.
# Then run the rest of the file through sh.
#----cut here-----cut here-----cut here-----cut here----#
#!/bin/sh
# shar:	Shell Archiver
# Run the following text with /bin/sh to create:
#	Makefile
#	PORTING
#	README
#	TODO
#	tar.1
#	tar.5
# This archive created: Thu Jan 15 21:01:40 1987
cat << \SHAR_EOF > Makefile
# Makefile for public domain tar program.
# @(#)Makefile 1.13 86/10/29

# Berserkeley version
DEFS = -DBSD42
LIBS =
LINTFLAGS = -abchx
DEF_AR_FILE = \"/dev/rmt8\"
DEFBLOCKING = 20

# USG version
#DEFS = -DUSG
#LIBS = -lndir
#LINTFLAGS = -bx
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20

# UniSoft's Uniplus SVR2 with NFS
#DEFS = -DUSG -DUNIPLUS -DNFS -DSVR2
#LIBS = -lndir
#LINTFLAGS = -bx
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20

# V7 version
#DEFS = -DV7 -Dvoid=int
#LIBS = -lndir
#LINTFLAGS = -abchx
#DEF_AR_FILE = \"/dev/rmt8\"
#DEFBLOCKING = 20

CFLAGS = $(COPTS) $(DEFS) \
	-DDEF_AR_FILE=$(DEF_AR_FILE) \
	-DDEFBLOCKING=$(DEFBLOCKING)
# next line for Debugging
#COPTS = -g
# next line for Production
COPTS = -O

# Add things here like getopt, readdir, etc that aren't in your
# standard libraries.
SUBSRC=
SUBOBJ=

SRCS =	tar.c create.c extract.c buffer.c getoldopt.c list.c names.c \
	port.c $(SUBSRC)
OBJS =	tar.o create.o extract.o buffer.o getoldopt.o list.o names.o \
	port.o $(SUBOBJ)
AUX =	README PORTING Makefile TODO tar.1 tar.5 tar.h port.h

tar:	$(OBJS)
	cc -o tar $(COPTS) $(OBJS) $(LIBS)

lint:	$(SRCS)
	lint $(LINTFLAGS) $(CFLAGS) $(SRCS)

clean:
	rm -f errs *.o tar

tar.shar: $(SRCS) $(AUX)
	shar >tar.shar $(AUX) $(SRCS)

tar.tar.Z: $(SRCS) $(AUX)
	/bin/tar cf - $(AUX) $(SRCS) | compress -v >tar.tar.Z

$(OBJS): tar.h port.h
SHAR_EOF
cat << \SHAR_EOF > PORTING
	Porting hints for public domain tar
	John Gilmore, ihnp4!hoptoad!gnu
	@(#)PORTING 1.5 86/10/29

The Makefile should be edited to comment out all the undesired
versions, and create the following configuration lines for the system
you are compiling it on:

DEFS =		the proper #define's to conditionally compile for
		your system.
LIBS =		the system libraries and/or object modules to link
		with the program.
LINTFLAGS =	a good strong way to invoke 'lint' on your system.
DEF_AR_FILE =	the name of the default archive file on your system.
		It should be enclosed in quoted quotes,
		e.g. \"/dev/foo\" .
DEFBLOCKING =	the default blocking factor on your system.

A copy of "getopt", the standard argument parser, is required. It's in
libc on Missed'em V systems and 4.3BSD; on most other systems, you'll
need a copy of a public domain getopt, available through mod.sources or
the AT&T Toolchest if you can't find it elsewhere.

A copy of the Berkeley directory access routines is also required.
These are in libc and <sys/dir.h> on Berkeley systems. A public domain
version is available through mod.sources. There is an #include you
have to change in create.c for this, to set the name of the include
file you have. Some systems have the include file in <sys/ndir.h>.
You'll have to find it on your system, or get the public domain one and
place it somewhere.

Grep for FIXME to find places that aren't finished or which have
portability problems.
Also see the file TODO.
SHAR_EOF
cat << \SHAR_EOF > README
This is a public domain tar(1) replacement. It implements the 'c',
'x', and 't' commands of Unix tar, and many of the options. It creates
P1003 "Unix Standard" [draft 6] tapes by default, and can read and
write both old and new formats. It can decompress tar archives when
reading them from disk files (using the 'z' option), but cannot do so
when writing, or when reading from a tape drive. Its verbose output
looks more like "ls -l" than the Unix tar, and even lines up the
columns. It is a little better at reading damaged tapes than Unix tar.

It is designed to be a lot more efficient than the standard Unix tar;
it does as little bcopy-ing as possible, and does file I/O in large
blocks. On the other hand, it has not been timed or performance-tuned;
it's just *designed* to be faster.

On the Sun, the tar archives it creates under the 'old' option are
byte-for-byte the same as those created by /bin/tar, except the trash
at the end of each file and at the end of the archive.

It was written and initially debugged on a Sun Workstation running
4.2BSD. It has been run on Xenix, Unisoft, Vax 4.2BSD, V7, and USG
systems. I'm interested in finding people who will port it to other
types of (Unix and non-Unix) systems, use it, and send back the
changes; and people who will add the obscure tar options that they
happen to use and I don't. In particular, VMS, MSDOS, Mac, Atari and
Amiga versions would be handy.

It still has a number of loose ends, marked by "FIXME" comments in the
source. For example, it does not chown() extracted files. Fixes to
these things are also welcome.

I am the author of all the code in this program. I hereby place it in
the public domain. If you modify it, or port it to another system,
please send me back a copy, so I can keep a master source.

	John Gilmore
	Nebula Consultants
	1504 Golden Gate Ave.
	San Francisco, California, USA 94115
	+1 415 931 4667		voice
	hoptoad!gnu		data
	jgilmore@lll-crg.arpa	data

Hoptoad talks to sun, ptsfa, well, lll-crg, ihnp4, cbosgd, ucsfcgl,
pyramid.

	@(#)README 1.5 86/10/29
SHAR_EOF
cat << \SHAR_EOF > TODO
@(#) TODO 1.6 86/10/29

Install new mkdir from the net for non-Berkeley systems.

Look at SUID, SGID; look at -p and -m options. (test them).

Handle owner/group on extraction.

creation of links and symlinks doesn't follow the -k (f_keep)
guidelines; if the file already exists, it is not replaced, even
though no -k.

Check stderr and stdout for errors after writing, and quit if so.

Compression option to automatically pipe thru compress (both
input&output). (Need a 3rd process to reblock compress's output for
output case, and when reading from tape drives.)

Preliminary design of Multifile option to handle EOFs on input and
output. Multifile can just write EOF when it hits end of archive, and
ask for archive to be changed. Start off 2nd archive medium with odd
header block, duplicating original, but with offset to start of data
spec'd. Reading such a header causes tar non-'M' to complain while
extracting (but to seek there and do it anyway!) Big win -- this works
on cartridge tapes, should work on floppies, might work on magtape. It
would encourage the *&%#$ systems programmers to fix their drivers,
too!

Profile it and see where the time, call counts, etc are going.

Test reading compressed tapes with odd blocksizes. (real tape drives,
that is...) (may need buffer proc no matter what.)

Fix directory timestamps after inserting files into them. Wait til
next file that's not in the directory. Need a stack of them.

Add option to delete N matching(?) chars from the front of a file to
be extracted/listed. Great for reading tapes written with names
starting from "/"...

Option to seek the input file (in skip_file) rather than reading and
tossing it? (Could just jump in buffer if stuff is in core.)
Could misalign archive reads versus filesys and slow it down, who
knows?

Add -C option for creating from odd directories a la 4.2BSD?

Break out odd bits of code into separate support modules.

Add the r, u, X, l, F, C, and digit options of Unix tar.
SHAR_EOF
cat << \SHAR_EOF > tar.1
.TH TAR 1 "31 October 1986"
.SH NAME
tar \- tape (or other media) file archiver
.SH SYNOPSIS
\fBtar\fP \-[\fBBcDhikmopstvxzZ\fP]
[\fB\-b\fP \fIN\fP]
[\fB\-f\fP \fIF\fP]
[\fB\-T\fP \fIF\fP]
[ \fIfilename\fP\| .\|.\|. ]
.SH DESCRIPTION
\fItar\fP provides a way to store many files into a single archive,
which can be kept in another Unix file, stored on an I/O device such as
tape, floppy, cartridge, or disk, or piped to another program. It is
useful for making backup copies, or for packaging up a set of files to
move them to another system.
.LP
\fItar\fP has existed since Version 7 Unix with very little change. It
has been proposed as the standard format for interchange of files among
systems that conform to the P1003 ``Portable Operating System''
standard.
.LP
This version of \fItar\fP supports the extensions which were proposed
in the P1003 draft standards, including owner and group names, and
support for named pipes, fifos, and block and character devices.
.LP
When reading an archive, this version of \fItar\fP continues after
finding an error. Previous versions required the `i' option to ignore
checksum errors.
.SH OPTIONS
\fItar\fP options can be specified in either of two ways. The usual
Unix conventions can be used: each option is preceded by `\-';
arguments directly follow each option; multiple options can be combined
behind one `\-' as long as they take no arguments. For compatibility
with the Unix \fItar\fP program, the options may also be specified as
``keyletters,'' wherein all the option letters occur in the first
argument to \fItar\fP, with no `\-', and their arguments, if any, occur
in the second, third, ... arguments.
Examples:
.LP
Normal: tar -f arcname -cv file1 file2
.LP
Old: tar fcv arcname file1 file2
.LP
At least one of the \fB\-c\fP, \fB\-t\fP, or \fB\-x\fP options must be
included. The rest are optional.
.LP
Files to be operated upon are specified by a list of file names, which
follows the option specifications (or can be read from a file by the
\fB\-T\fP option). Specifying a directory name causes that directory
and all the files it contains to be (recursively) processed. In
general, specifying full path names when creating an archive is a bad
idea, since when the files are extracted, they will have to be
extracted into exactly where they were dumped from. Instead, \fIcd\fP
to the root directory and use relative file names.
.IP "\fB\-b\fP \fIN\fP"
Specify a blocking factor for the archive. The block size will be
\fIN\fP x 512 bytes. Larger blocks typically run faster and let you fit
more data on a tape. The default blocking factor is set when \fItar\fP
is compiled, and is typically 20. There is no limit to the maximum
block size, as long as enough memory can be allocated for it, and as
long as the device containing the archive can read or write that block
size.
.IP \fB\-B\fP
When reading an archive, reblock it as we read it. Normally, \fItar\fP
reads each block with a single \fIread(2)\fP system call. This does not
work when reading from a pipe or network socket under Berkeley Unix.
With this option, it will do multiple \fIread(2)\fPs until it gets
enough data to fill the specified block size. \fB\-B\fP can also be
used to speed up the reading of tapes that were written with small
blocking factors, by specifying a large blocking factor with \fB\-b\fP
and having \fItar\fP read many small blocks into memory before it tries
to process them.
.IP \fB\-c\fP
Create an archive from a list of files.
.IP \fB\-D\fP
With each message that \fItar\fP produces, print the record number
within the archive where the message occurred.
This option is especially useful when reading damaged archives, since
it helps to pinpoint the damaged section.
.IP "\fB\-f\fP \fIF\fP"
Specify the filename of the archive. If the specified filename is
``\-'', the archive is read from the standard input or written to the
standard output. If this option is not used, a default archive name
(which was picked when tar was compiled) is used. The default is
normally set to the ``first'' tape drive or other transportable I/O
medium on the system.
.IP \fB\-h\fP
When creating an archive, if a symbolic link is encountered, dump the
file or directory to which it points, rather than dumping it as a
symbolic link.
.IP \fB\-i\fP
When reading an archive, ignore blocks of zeros in the archive.
Normally a block of zeros indicates the end of the archive, but in a
damaged archive, or one which was created by appending several
archives, this option allows \fItar\fP to continue. It is not on by
default because there is garbage written after the zeroed blocks by the
Unix \fItar\fP program.
.IP \fB\-k\fP
When extracting files from an archive, keep existing files, rather than
overwriting them with the version from the archive.
.IP \fB\-m\fP
When extracting files from an archive, set each file's modified
timestamp to the current time, rather than extracting each file's
modified timestamp from the archive.
.IP \fB\-o\fP
When creating an archive, write an old format archive, which does not
include information about directories, pipes, or device files, and
specifies file ownership by uid's and gid's rather than by user names
and group names. In most cases, a ``new'' format archive can be read by
an ``old'' tar program without serious trouble, so this option should
seldom be needed.
.IP \fB\-p\fP
When extracting files from an archive, restore them to the same
permissions that they had in the archive. If \fB\-p\fP is not
specified, the current umask limits the permissions of the extracted
files. See \fIumask(2)\fP.
.IP \fB\-t\fP
List a table of contents of an existing archive. If file names are
specified, just list files matching the specified names.
.IP \fB\-s\fP
When specifying a list of filenames to be listed or extracted from an
archive, the \fB\-s\fP flag specifies that the list is sorted into the
same order as the tape. This allows a large list to be used, even on
small machines, because the entire list need not be read into memory at
once. Such a sorted list can easily be created by running ``tar \-t''
on the archive and editing its output.
.IP "\fB\-T\fP \fIF\fP"
Rather than specifying the file names to operate on as arguments to the
\fItar\fP command, this option specifies that the file names should be
read from the file \fIF\fP, one per line. If the file name specified is
``\-'', the list is read from the standard input. This option, in
conjunction with the \fB\-s\fP option, allows an arbitrarily large list
of files to be processed, and allows the list to be piped to \fItar\fP.
.IP \fB\-v\fP
Be verbose about the files that are being processed or listed.
Normally, archive creation or file extraction are silent, and archive
listing just gives file names. The \fB\-v\fP option causes an
``ls \-l''\-like listing to be produced.
.IP \fB\-x\fP
Extract files from an existing archive. If file names are specified,
just extract files matching the specified names, otherwise extract all
the files in the archive.
.IP "\fB\-z\fP or \fB\-Z\fP"
When extracting or listing an archive, these options specify that the
archive should be decompressed while it is read, using the \-d option
of the \fIcompress(1)\fP program. The archive itself is not modified.
.SH "SEE ALSO"
shar(1), tar(5), compress(1), ar(1), arc(1), cpio(1), dump(8),
restore(8), restor(8)
.SH BUGS
The \fBr, u, w, X, l, F, C\fP, and \fIdigit\fP options of Unix
\fItar\fP are not supported.
.LP
It should be possible to create a compressed archive with the
\fB\-z\fP option.
.LP
SHAR_EOF
cat << \SHAR_EOF > tar.5
.TH TAR 5 "31 October 1986"
.SH NAME
tar \- tape (or other media) archive file format
.SH DESCRIPTION
A ``tar tape'' or file contains a series of records. Each record
contains RECORDSIZE bytes (see below). Although this format may be
thought of as being on magnetic tape, other media are often used.

Each file archived is represented by a header record which describes
the file, followed by zero or more records which give the contents of
the file. At the end of the archive file there may be a record filled
with binary zeros as an end-of-file indicator. A conforming system must
write a record of zeros at the end, but must not assume that an
end-of-file record exists when reading an archive.

The records may be blocked for physical I/O operations. Each block of
\fIN\fP records (where \fIN\fP is set by the \fB\-b\fP option to
\fItar\fP) is written with a single write() operation. On magnetic
tapes, the result of such a write is a single tape record. When writing
an archive, the last block of records shall be written at the full
size, with records after the zero record containing undefined data.
When reading an archive, a conforming system shall properly handle an
archive whose last block is shorter than the rest.

The header record is defined in the header file <tar.h> as follows:
.nf
.sp .5v
.DT

/*
 * Standard Archive Format - Standard TAR - USTAR
 */
#define	RECORDSIZE	512
#define	NAMSIZ		100
#define	TUNMLEN		32
#define	TGNMLEN		32

union record {
	char		charptr[RECORDSIZE];
	struct header {
		char	name[NAMSIZ];
		char	mode[8];
		char	uid[8];
		char	gid[8];
		char	size[12];
		char	mtime[12];
		char	chksum[8];
		char	linkflag;
		char	linkname[NAMSIZ];
		char	magic[8];
		char	uname[TUNMLEN];
		char	gname[TGNMLEN];
		char	devmajor[8];
		char	devminor[8];
	} header;
};

/* The checksum field is filled with this while the checksum is computed. */
#define	CHKBLANKS	"        "	/* 8 blanks, no null */

/* The magic field is filled with this if uname and gname are valid.
*/
#define	TMAGIC		"ustar  "	/* 7 chars and a null */

/* The linkflag defines the type of file */
#define	LF_OLDNORMAL	'\\0'		/* Normal disk file, Unix compatible */
#define	LF_NORMAL	'0'		/* Normal disk file */
#define	LF_LINK		'1'		/* Link to previously dumped file */
#define	LF_SYMLINK	'2'		/* Symbolic link */
#define	LF_CHR		'3'		/* Character special file */
#define	LF_BLK		'4'		/* Block special file */
#define	LF_DIR		'5'		/* Directory */
#define	LF_FIFO		'6'		/* FIFO special file */
#define	LF_CONTIG	'7'		/* Contiguous file */
/* Further link types may be defined later. */

/* Bits used in the mode field - values in octal */
#define	TSUID	04000		/* Set UID on execution */
#define	TSGID	02000		/* Set GID on execution */
#define	TSVTX	01000		/* Save text (sticky bit) */

/* File permissions */
#define	TUREAD	00400		/* read by owner */
#define	TUWRITE	00200		/* write by owner */
#define	TUEXEC	00100		/* execute/search by owner */
#define	TGREAD	00040		/* read by group */
#define	TGWRITE	00020		/* write by group */
#define	TGEXEC	00010		/* execute/search by group */
#define	TOREAD	00004		/* read by other */
#define	TOWRITE	00002		/* write by other */
#define	TOEXEC	00001		/* execute/search by other */
.fi
.LP
All characters in header records are represented using 8-bit characters
in the local variant of ASCII. Each field within the structure is
contiguous; that is, there is no padding used within the structure.
Each character on the archive medium is stored contiguously.

Bytes representing the contents of files (after the header record of
each file) are not translated in any way and are not constrained to
represent characters or to be in any character set. The \fItar\fP(5)
format does not distinguish text files from binary files, and no
translation of file contents should be performed.

The fields \fIname, linkname, magic, uname\fP, and \fIgname\fP are
null-terminated character strings. All other fields are zero-filled
octal numbers in ASCII.
Each numeric field (of width \fIw\fP) contains \fIw\fP-2 digits, a
space, and a null, except \fIsize\fP and \fImtime\fP, which do not
contain the trailing null.

The \fIname\fP field is the pathname of the file, with directory names
(if any) preceding the file name, separated by slashes.

The \fImode\fP field provides nine bits specifying file permissions and
three bits to specify the Set UID, Set GID and Save Text (TSVTX) modes.
Values for these bits are defined above. When special permissions are
required to create a file with a given mode, and the user restoring
files from the archive does not hold such permissions, the mode bit(s)
specifying those special permissions are ignored. Modes which are not
supported by the operating system restoring files from the archive will
be ignored. Unsupported modes should be faked up when creating an
archive; e.g. the group permission could be copied from the `other'
permission.

The \fIuid\fP and \fIgid\fP fields are the user and group ID of the
file owners, respectively.

The \fIsize\fP field is the size of the file in bytes; linked files are
archived with this field specified as zero.

The \fImtime\fP field is the modification time of the file at the time
it was archived. It is the ASCII representation of the octal value of
the last time the file was modified, represented as an integer number
of seconds since January 1, 1970, 00:00 Coordinated Universal Time.

The \fIchksum\fP field is the ASCII representation of the octal value
of the simple sum of all bytes in the header record. Each 8-bit byte in
the header is treated as an unsigned value. These values are added to
an unsigned integer, initialized to zero, the precision of which shall
be no less than seventeen bits. When calculating the checksum, the
\fIchksum\fP field is treated as if it were all blanks.

The \fIlinkflag\fP field specifies the type of file archived.
If a particular implementation does not recognize or permit the
specified type, the file will be extracted as if it were a regular
file. As this action occurs, \fItar\fP issues a warning to the standard
error.
.IP "LF_NORMAL or LF_OLDNORMAL"
represents a regular file. For backward compatibility, a
\fIlinkflag\fP value of LF_OLDNORMAL should be silently recognized as a
regular file. New archives should be created using LF_NORMAL. Also, for
backward compatibility, \fItar\fP treats a regular file whose name ends
with a slash as a directory.
.IP LF_LINK
represents a file linked to another file, of any type, previously
archived. Such files are identified (in Unix) by each file having the
same device and inode number. The linked-to name is specified in the
\fIlinkname\fP field with a trailing null.
.IP LF_SYMLINK
represents a symbolic link to another file. The linked-to name is
specified in the \fIlinkname\fP field with a trailing null.
.IP "LF_CHR or LF_BLK"
represent character special files and block special files respectively.
In this case the \fIdevmajor\fP and \fIdevminor\fP fields will contain
the major and minor device numbers respectively. Operating systems may
map the device specifications to their own local specification, or may
ignore the entry.
.IP LF_DIR
specifies a directory or sub-directory. The directory name in the
\fIname\fP field should end with a slash. On systems where disk
allocation is performed on a directory basis the \fIsize\fP field will
contain the maximum number of bytes (which may be rounded to the
nearest disk block allocation unit) which the directory may hold. A
\fIsize\fP field of zero indicates no such limiting. Systems which do
not support limiting in this manner should ignore the \fIsize\fP field.
.IP LF_FIFO
specifies a FIFO special file. Note that the archiving of a FIFO file
archives the existence of this file and not its contents.
.IP LF_CONTIG
specifies a contiguous file, which is the same as a normal file except
that, in operating systems which support it, all its space is allocated
contiguously on the disk. Operating systems which do not allow
contiguous allocation should silently treat this type as a normal file.
.IP "`A' \- `Z'"
are reserved for custom implementations. None are used by this version
of the \fItar\fP program.
.IP \fIother\fP
values are reserved for specification in future revisions of the P1003
standard, and should not be used by any \fItar\fP program.
.LP
The \fImagic\fP field indicates that this archive was output in the
P1003 archive format. If this field contains TMAGIC, then the
\fIuname\fP and \fIgname\fP fields will contain the ASCII
representation of the owner and group of the file respectively. If
found, the user and group ID represented by these names will be used
rather than the values contained within the \fIuid\fP and \fIgid\fP
fields. User names longer than TUNMLEN-1 or group names longer than
TGNMLEN-1 characters will be truncated.
.SH "SEE ALSO"
tar(1), ar(5), cpio(5)
.SH BUGS
Names or link names longer than NAMSIZ-1 characters cannot be archived.
This format does not address multi-volume archives.
.SH NOTES
This manual page was adapted by John Gilmore from Draft 6 of the P1003
specification.
SHAR_EOF
#	End of shell archive
exit 0
-- 
David P. Zimmerman	"When I'm having fun, the world doesn't exist."

Arpa: dpz@rutgers.rutgers.edu
Uucp: ...{harvard | seismo | pyramid}!rutgers!dpz
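
A note on the header arithmetic described in tar.5 above. What follows is a minimal sketch in C of the checksum rule and the numeric field layout as the manual page words them. It is not taken from the posted tar sources. The function names (header_checksum, format_octal, parse_octal, store_checksum) and the CHKSUM_OFFSET constant are made up for illustration. The offset 148 is simply the sum of the field widths that precede chksum in the struct (100+8+8+8+12+12), assuming no padding, as the man page states.

```c
#include <assert.h>
#include <string.h>

#define RECORDSIZE    512
#define CHKSUM_OFFSET 148	/* hypothetical name: bytes before chksum[] */
#define CHKSUM_WIDTH  8

/* Sum every byte of a header record as an unsigned value,
 * counting the chksum field as if it were all blanks. */
unsigned long header_checksum(const unsigned char *rec)
{
	unsigned long sum = 0;
	int i;

	for (i = 0; i < RECORDSIZE; i++) {
		if (i >= CHKSUM_OFFSET && i < CHKSUM_OFFSET + CHKSUM_WIDTH)
			sum += ' ';
		else
			sum += rec[i];
	}
	return sum;
}

/* Write value as width-2 zero-filled octal digits, a space, and a null.
 * High-order digits that do not fit are silently dropped. */
void format_octal(char *field, int width, unsigned long value)
{
	int i;

	assert(width >= 3);
	for (i = width - 3; i >= 0; i--) {
		field[i] = (char)('0' + (value & 7));
		value >>= 3;
	}
	field[width - 2] = ' ';
	field[width - 1] = '\0';
}

/* Read an octal field: skip leading blanks, stop at the first
 * character that is not an octal digit (space, null, or garbage). */
unsigned long parse_octal(const char *field, int width)
{
	unsigned long value = 0;
	int i = 0;

	while (i < width && field[i] == ' ')
		i++;
	while (i < width && field[i] >= '0' && field[i] <= '7') {
		value = value * 8 + (unsigned long)(field[i] - '0');
		i++;
	}
	return value;
}

/* Blank the chksum field, then store the computed sum in it. */
void store_checksum(unsigned char *rec)
{
	memset(rec + CHKSUM_OFFSET, ' ', CHKSUM_WIDTH);
	format_octal((char *)rec + CHKSUM_OFFSET, CHKSUM_WIDTH,
	    header_checksum(rec));
}
```

A reader would presumably verify a header by comparing parse_octal() of the stored chksum field against header_checksum() of the same record. The largest possible sum, 512 x 255, fits in six octal digits, which is consistent with the seventeen-bit precision the man page asks for.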