[comp.sources.unix] v10i090: Copytape, a magtape xerox

rs@uunet.UU.NET (Rich Salz) (08/04/87)

Submitted-by: David S. Hayes <sundc!hqda-ai!merlin>
Posting-number: Volume 10, Issue 90
Archive-name: copytape

[  This is a magtape copy program, similar to what can be bought from
   SMI consulting services.  It won't work on System V until someone
   changes the ioctl() calls.  --r$  ]

	This program is offered without support.  If you find a
bug, feel free to post the fix, but I don't intend to keep a
centralized patch registry.  I find it to be reasonably bug-free,
but please read the man pages.  Good health,

	David S. Hayes, The Merlin of Avalon
	PhoneNet:	(202) 694-6900
	ARPA:		merlin%hqda-ai.uucp@brl.arpa
	UUCP:		...!seismo!sundc!hqda-ai!merlin
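
The intermediate data file format documented in copytape.5 (in the
archive below) is simple enough to read and write outside of copytape
itself.  What follows is only a minimal sketch of that format in ANSI C,
not the code from copytape.c.  The cptp_* function names and CPTP_*
constants are hypothetical, chosen for this illustration; the record
layout and the 32768-byte block limit are taken from the man page and
from BUFLEN in the source.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical names for this sketch; values mirror copytape.c. */
#define CPTP_MAXBLOCK    32768    /* same limit as BUFLEN */
#define CPTP_TAPE_MARK   (-100)
#define CPTP_END_OF_TAPE (-101)
#define CPTP_BAD         (-102)   /* damaged or truncated data file */

/* Write one data block: header line, raw bytes, trailing newline. */
static int cptp_put_block(FILE *fp, const char *data, int len)
{
    if (len <= 0 || len > CPTP_MAXBLOCK)
        return -1;
    fprintf(fp, "CPTP:BLOCK %05d\n", len);
    fwrite(data, 1, (size_t)len, fp);
    putc('\n', fp);
    return ferror(fp) ? -1 : 0;
}

/* Write a TAPE_MARK record, or END_OF_TAPE if end_of_tape is nonzero. */
static int cptp_put_mark(FILE *fp, int end_of_tape)
{
    fputs(end_of_tape ? "CPTP:END_OF_TAPE\n" : "CPTP:TAPE_MARK\n", fp);
    return ferror(fp) ? -1 : 0;
}

/*
 * Read the next record.  Returns the block length (data is left in
 * buf), or one of the negative codes above.  The header is read as a
 * whole line so that white space at the start of the data is never
 * mistaken for part of the header.
 */
static int cptp_get(FILE *fp, char *buf, int buflen)
{
    char line[64];
    int len;

    if (fgets(line, sizeof line, fp) == NULL)
        return CPTP_BAD;
    if (strcmp(line, "CPTP:TAPE_MARK\n") == 0)
        return CPTP_TAPE_MARK;
    if (strcmp(line, "CPTP:END_OF_TAPE\n") == 0)
        return CPTP_END_OF_TAPE;
    if (sscanf(line, "CPTP:BLOCK %d", &len) != 1 ||
        len <= 0 || len > buflen)
        return CPTP_BAD;
    if (fread(buf, 1, (size_t)len, fp) != (size_t)len)
        return CPTP_BAD;
    if (getc(fp) != '\n')         /* newline that follows the data */
        return CPTP_BAD;
    return len;
}
```

A caller would loop on cptp_get() until it sees CPTP_END_OF_TAPE or
CPTP_BAD, treating every positive return value as one tape block.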

========== SHAR ARCHIVE STARTS HERE ==========
# This is a shell archive.  Remove anything before this line, then
# unpack it by saving it in a file and typing "sh file".  (Files
# unpacked will be owned by you and have default permissions.)
#
# This archive contains:
# Makefile copytape.1 copytape.5 copytape.c
#
#!/bin/sh
echo x - Makefile
cat > "Makefile" << '//E*O*F Makefile//'
MAN1 =	/usr/man/man1
MAN5 =	/usr/man/man5
BIN =	/usr/local/bin

all:	copytape man

install:	copytape
	install -s -m 0511 copytape ${BIN}

man:	man1 man5

man1:	${MAN1}/copytape.1

man5:	${MAN5}/copytape.5

${MAN1}/copytape.1:	copytape.1
	cp copytape.1 ${MAN1}

${MAN5}/copytape.5:	copytape.5
	cp copytape.5 ${MAN5}

copytape:	copytape.c
	cc -O -o copytape copytape.c
//E*O*F Makefile//

echo x - copytape.1
cat > "copytape.1" << '//E*O*F copytape.1//'
.TH COPYTAPE 1 "25 June 1986"
.\"@(#)copytape.1 1.0 86/07/08 AICenter; by David S. Hayes
.SH NAME
copytape \- duplicate magtapes
.SH SYNOPSIS
.B copytape
\[\-f\]
\[\-t\]
\[\-s\fInnn\fP\]
\[\-l\fInnn\fP\]
\[\-v\]
.I
\[input \[output\]\]
.SH DESCRIPTION
.LP
.I copytape
duplicates magtapes.  It is intended for duplication of
bootable or other non-file-structured (non-tar-structured)
magtapes on systems with only one tape drive.
.I copytape
is blissfully ignorant of tape formats.  It merely makes
a bit-for-bit copy of its input.
.PP
In normal use,
.I copytape
would be run twice.  First, a boot tape is copied to an
intermediate disk file.  The file is in a special format that
preserves the record boundaries and tape marks.  On the second
run, 
.I copytape
reads this file and generates a new tape.  The second step
may be repeated if multiple copies are required.  The typical
process would look like this:
.sp
.RS +.5i
tutorial% copytape /dev/rmt8 tape.tmp
.br
tutorial% copytape tape.tmp /dev/rmt8
.br
tutorial% rm tape.tmp
.RE
.PP
.I copytape
copies from the standard input to the standard output, unless
input and output arguments are provided.  It will automatically
determine whether its input and output are physical tapes, or
data files.  Data files are encoded in a special (human-readable)
format.
.PP
Since
.I copytape
will automatically determine what sort of thing its input
and output are, a twin-drive system can duplicate a tape in
one pass.  The command would be
.RS +.5i
tutorial% copytape /dev/rmt8 /dev/rmt9
.RE
.SH OPTIONS
.TP 3
.RI \-s nnn
Skip tape marks.  The specified number of tape marks are skipped
on the input tape, before the copy begins.  By default, nothing is
skipped, resulting in a copy of the complete input tape.  Multiple
tar(1) and dump(1) archives on a single tape are normally
separated by a single tape mark.  On ANSI or IBM labelled tapes,
each file has three associated tape marks.  Count carefully.
.TP 3
.RI \-l nnn
Limit.  Only nnn files (data followed by a tape mark), at most,
are copied.  This can be used to terminate a copy early.  If the
skip option is also specified, the files skipped do not count
against the limit.
.TP 3
\-f
From tape.  The input is treated as though it were a physical
tape, even if it is a data file.  This option can be used
to copy block-structured device files other than magtapes.
.TP 3
\-t
To tape.  The output is treated as though it were a physical
tape, even if it is a data file.  Normally, data files mark
physical tape blocks with a (human\-readable) header describing
the block.  If the \-t option is used when the output is
actually a disk file, these headers will not be written.
This will extract all the information from the tape, but
.I copytape
will not be able to duplicate the original tape based on
the resulting data file.
.TP 3
\-v
Verbose.
.I copytape
does not normally produce any output on the control terminal.
The verbose option will identify the input and output files,
tell whether they are physical tapes or data files, and
announce the size of each block copied.  This can produce
a lot of output on even relatively short tapes.  It is
intended mostly for diagnostic work.
.SH FILES
/dev/rmt*
.SH "SEE ALSO"
ansitape(1), dd(1), tar(1), mtio(4), copytape(5)
.SH AUTHOR
David S. Hayes, Site Manager, US Army Artificial Intelligence Center.
Originally developed September 1984 at Rensselaer Polytechnic Institute,
Troy, New York.
Revised July 1986.  This software is in the public domain.
.SH BUGS
.LP
.I copytape
treats two successive file marks as logical end-of-tape.
.LP
The intermediate data file can consume huge amounts of
disk space.  A full 2400-foot reel can burn 50 megabytes.
This is not strictly speaking a bug, but users should
be aware of the possibility.  Check disk space with
.I df(1)
before starting
.IR copytape .
Caveat Emptor!
//E*O*F copytape.1//

echo x - copytape.5
cat > "copytape.5" << '//E*O*F copytape.5//'
.TH COPYTAPE 5  "8 August 1986"
.SH NAME
copytape \- copytape intermediate data file format
.SH DESCRIPTION
.I copytape
duplicates magtapes on single\-tape systems by making
an intermediate copy of the tape in a disk file.
This disk file has a special format that preserves
the block boundaries and tape marks of the original
physical tape.
.PP
Each block is preceded by a header identifying what
sort of block it is.  In the case of data blocks,
the length of the data is also given.  Each header is
on a separate text line, followed by a newline character.
.sp
.TP 3
CPTP:BLOCK \fInnnnn\fP
.ti -3
\fIdata\fP\\n
.sp
A data block is identified by the keyword
.IR BLOCK .
The length of the block is given in a five\-character
numeric field.  The field is zero\-padded on the left if
less than five characters are needed.  The header is
followed by a newline character.
The original data follows.  The data may have any characters
in it, since
.I copytape
uses an fread(3) to extract it.
The data is followed by a newline, to make the file easy
to view with an editor.
.TP 3
CPTP:TAPE_MARK
A tape mark was encountered in the original tape.
.TP 3
CPTP:END_OF_TAPE
When two consecutive tape marks are encountered,
.I copytape
treats the second as a logical end\-of\-tape.  On
output, both TAPE_MARK and END_OF_TAPE generate
a physical tape mark.
.I copytape
stops processing after copying an END_OF_TAPE.
.SH "SEE ALSO"
mtio(4)
.SH BUGS
Some weird tapes may not use two consecutive tape marks
as logical end\-of\-tape.
//E*O*F copytape.5//

echo x - copytape.c
cat > "copytape.c" << '//E*O*F copytape.c//'

/*
 * COPYTAPE.C
 *
 * This program duplicates magnetic tapes, preserving the
 * blocking structure and placement of tape marks.
 *
 * This program was updated at
 *
 *	U.S. Army Artificial Intelligence Center
 *	HQDA (Attn: DACS-DMA)
 *	Pentagon
 *	Washington, DC  20310-0200
 *
 *	Phone: (202) 694-6900
 *
 **************************************************
 *
 *	THIS PROGRAM IS IN THE PUBLIC DOMAIN
 *
 **************************************************
 *
 * July 1986		David S. Hayes
 *		Made data file format human-readable.
 *
 * April 1985		David S. Hayes
 *		Original Version.
 */


#include <stdio.h>
#include <sys/types.h>
#include <sys/ioctl.h>
#include <sys/mtio.h>
#include <sys/file.h>

extern int      errno;

#define BUFLEN		32768	/* max tape block size */
#define TAPE_MARK	-100	/* return record length if we read a
				 * tape mark */
#define END_OF_TAPE	-101	/* 2 consecutive tape marks */
#define FORMAT_ERROR	-102	/* data file munged */

int             totape = 0,	/* treat destination as a tape drive */
                fromtape = 0;	/* treat source as a tape drive */

int             verbose = 0;	/* tell what we're up to */

char           *source = "stdin",
               *dest = "stdout";

char            inbuf[BUFLEN],
                outbuf[BUFLEN],
                tapebuf[BUFLEN];

main(argc, argv)
    int             argc;
    char           *argv[];
{
    FILE           *from = stdin,
                   *to = stdout;/* our input and output */
    int             len = 0;	/* number of bytes in record */
    int             skip = 0;	/* number of files to skip before
				 * copying */
    unsigned int    limit = 0xffffffff;
    int             i;
    struct mtget    status;

    for (i = 1; i < argc && argv[i][0] == '-'; i++) {
	switch (argv[i][1]) {
	  case 's':		/* skip option */
	    skip = atoi(&argv[i][2]);
	    break;

	  case 'l':
	    limit = atoi(&argv[i][2]);
	    break;

	  case 'f':		/* from tape option */
	    fromtape = 1;
	    break;

	  case 't':		/* to tape option */
	    totape = 1;
	    break;

	  case 'v':		/* be wordy */
	    verbose = 1;
	    break;

	  default:
	    fprintf(stderr, "usage: copytape [-f] [-t] [-snnn] [-lnnn] [-v] [input [output]]\n");
	    exit(-1);
	}
    }

    if (i < argc) {
	from = freopen(argv[i], "r", stdin);
	source = argv[i];
	if (from == NULL) {
	    perror("copytape: input open failed");
	    exit(-1);
	}
	i++;
    }
    if (i < argc) {
	to = freopen(argv[i], "w", stdout);
	dest = argv[i];
	if (to == NULL) {
	    perror("copytape: output open failed");
	    exit(-1);
	}
	i++;
    }
    if (i < argc)
	fprintf(stderr, "copytape: extra arguments ignored\n");

    /*
     * Determine if source and/or destination is a tape device. Try to
     * issue a magtape ioctl to it.  If it doesn't error, then it was a
     * magtape. 
     */

    fromtape |= (ioctl(fileno(from), MTIOCGET, &status) == 0);
    totape |= (ioctl(fileno(to), MTIOCGET, &status) == 0);

    if (verbose) {
	fprintf(stderr, "copytape: from %s (%s)\n",
		source, fromtape ? "tape" : "data");
	fprintf(stderr, "          to %s (%s)\n",
		dest, totape ? "tape" : "data");
    }
    /* Establish buffering for our IO. */

    if (fromtape)
	setbuffer(from, inbuf, BUFLEN);
    if (totape)
	setbuffer(to, outbuf, BUFLEN);


    /*
     * Skip number of files, specified by -snnn, given on the command
     * line. This is used to copy second and subsequent files on the
     * tape. 
     */

    if (verbose && skip) {
	fprintf(stderr, "copytape: skipping %d input files\n", skip);
    }
    for (i = 0; i < skip; i++) {
	do {
	    len = input(from);
	} while (len != TAPE_MARK && len != END_OF_TAPE);
	if (len == END_OF_TAPE) {
	    fprintf(stderr, "copytape: only %d files in input\n", i);
	    exit(-1);
	}
    }

    /*
     * Do the copy. 
     */

    while (limit && len != END_OF_TAPE) {
	do {
	    do {
		len = input(from);
		if (len == FORMAT_ERROR)
		    fprintf(stderr, "copytape: data format error - block ignored\n");
	    } while (len == FORMAT_ERROR);

	    output(to, len);

	    if (verbose) {
		switch (len) {
		  case TAPE_MARK:
		    fprintf(stderr, "  copied TAPE_MARK\n");
		    break;

		  case END_OF_TAPE:
		    fprintf(stderr, "  copied END_OF_TAPE\n");
		    break;

		  default:
		    fprintf(stderr, "  copied %d bytes\n", len);
		}
	    }
	} while (len > 0);
	limit--;
    }
    exit(0);
}


/*
 * Input up to 32K from a file or tape. If input file is a tape, then
 * do markcount stuff.  Input record length will be supplied by the
 * operating system. 
 *
 * If input file is not a tape, then read a "CPTP:" header line from the
 * input.  A BLOCK header gives the number of bytes that comprise the
 * following record, not counting the header itself. 
 *
 * A tape read of zero bytes corresponds to a tape mark. We return the
 * number of bytes read, or TAPE_MARK.  If we hit logical EOT (two tape
 * marks in a row), then return END_OF_TAPE.  A damaged data file
 * yields FORMAT_ERROR. 
 */

input(fp)
    FILE           *fp;
{
    static          markcount = 0;	/* number of consecutive tape
					 * marks */
    int             len,
                    l2;
    char            header[80],
                    flag[20];

    if (fromtape) {
	len = read(fileno(fp), tapebuf, BUFLEN);
	if (len < 0) {		/* hard read error: give up */
	    perror("copytape: tape read failed");
	    exit(-1);
	}
	if (len == 0) {
	    if (++markcount == 2)
		return END_OF_TAPE;
	    else
		return TAPE_MARK;
	}
	markcount = 0;		/* reset tape mark count */
	return len;
    }
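    l2 = fscanf(fp, "%4s:%79s", flag, header);
    if (l2 == EOF)		/* data file ended without END_OF_TAPE */
	return END_OF_TAPE;
    if (l2 < 2 || strcmp(flag, "CPTP") != 0)
	return FORMAT_ERROR;

    if (strcmp(header, "BLOCK") == 0) {
	/* Read the length, then exactly one newline; a "\n" in the
	 * format would also eat leading white space in the data. */
	if (fscanf(fp, "%d", &len) != 1 || getc(fp) != '\n')
	    return FORMAT_ERROR;
	if (len <= 0 || len > BUFLEN)
	    return FORMAT_ERROR;
	l2 = fread(tapebuf, sizeof(char), len, fp);
	if (l2 < len)		/* did we get it all? */
	    return FORMAT_ERROR;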
    } else if (strcmp(header, "TAPE_MARK") == 0)
	return TAPE_MARK;
    else if (strcmp(header, "END_OF_TAPE") == 0)
	return END_OF_TAPE;
    else
	return FORMAT_ERROR;

    return len;
}


/*
 * Copy a buffer out to a file or tape. 
 *
 * If output is a tape, write the record.  A length of TAPE_MARK or
 * END_OF_TAPE indicates that a tape mark should be written. 
 *
 * If not a tape, write a "CPTP:" header line to the output file, then
 * (for data blocks) the buffer and a newline.  
 */

output(fp, len)
    FILE           *fp;
    int             len;
{
    struct mtop     op;

    if (totape && (len == TAPE_MARK || len == END_OF_TAPE)) {
	op.mt_op = MTWEOF;
	op.mt_count = 1;
	ioctl(fileno(fp), MTIOCTOP, &op);
	return;
    }
    if (!totape) {
	switch (len) {
	  case TAPE_MARK:
	    fprintf(fp, "CPTP:TAPE_MARK\n");
	    break;

	  case END_OF_TAPE:
	    fprintf(fp, "CPTP:END_OF_TAPE\n");
	    break;

	  case FORMAT_ERROR:
	    break;

	  default:
	    fprintf(fp, "CPTP:BLOCK %05d\n", len);
	    fwrite(tapebuf, sizeof(char), len, fp);
	    putc('\n', fp);
	}
    } else
	write(fileno(fp), tapebuf, len);
}
//E*O*F copytape.c//

echo Possible errors detected by \'wc\' [hopefully none]:
temp=/tmp/shar$$
trap "rm -f $temp; exit" 0 1 2 3 15
cat > $temp <<\!!!
      19      40     303 Makefile
     120     643    3860 copytape.1
      50     259    1515 copytape.5
     297     989    6795 copytape.c
     486    1931   12473 total
!!!
wc  Makefile copytape.1 copytape.5 copytape.c | sed 's=[^ ]*/==' | diff -b $temp -
exit 0
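
For anyone porting copytape to a system without the mtio ioctl()s, the
logical end-of-tape rule from the man pages (two consecutive tape marks)
can be isolated from the device handling.  The fragment below is a small
sketch of that rule only, assuming as copytape.c does that a read() of
zero bytes from the tape means a tape mark.  The function name
classify_read is hypothetical; TAPE_MARK and END_OF_TAPE carry the same
values as in copytape.c.

```c
#define TAPE_MARK    (-100)   /* a single tape mark was read */
#define END_OF_TAPE  (-101)   /* second of two consecutive tape marks */

/*
 * Classify the byte count returned by one read() on the tape.
 * *markcount holds the number of tape marks seen in a row and must
 * start at zero; the caller keeps it between calls.
 */
static int classify_read(int *markcount, int nread)
{
    if (nread == 0) {
        if (++*markcount == 2)
            return END_OF_TAPE;
        return TAPE_MARK;
    }
    *markcount = 0;           /* a data block breaks the run of marks */
    return nread;
}
```

Keeping the counter in the caller rather than in a static variable makes
the rule easy to exercise with a canned list of read lengths.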
-- 
	David S. Hayes, The Merlin of Avalon
	PhoneNet:	(202) 694-6900
	ARPA:		merlin%hqda-ai.uucp@brl.arpa
	UUCP:		...!seismo!sundc!hqda-ai!merlin

-- 

Rich $alz			"Anger is an energy"
Cronus Project, BBN Labs	rsalz@bbn.com
Moderator, comp.sources.unix	sources@uunet