jc@cdx39.UUCP (John Chambers) (12/24/86)
Well, so many people said "No, I don't know how, but I want the code when
you get it working" that I did it and I'm posting it.  This is a first
try, and has only been tested on some SYS/V machines, so it probably won't
quite work everywhere.

The problem is:  Copy a whole hierarchy of files from one machine to
another, using limited-capacity file-transfer utilities like UUCP.
Simple variants could handle other file-transfer utilities like kermit
or xmodem.

The hard part is arranging for all the multiply-linked files to get
there correctly, multiply linked in the same way.  UUCP doesn't like
to do this.

What I've done is write something that creates a lot of file lists
(called "list_*"), such that multiply-linked files are all in the same
list, and the total size of the files in each list is less than the
environment variable UUCPMAX.  The lists are used to generate cpio
archives ("cpio_*"), which are then uucp'd to the destination.

Try it out, and tell me what's wrong with it.

- - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
: This is a shar archive.  Extract with sh, not csh
echo file: Makefile
cat > Makefile << '\!Funky\!Stuff\!'
D=/usr/lib/uucp
L=ln
all: filefacts filelists
install:$D/filefacts $D/filelists $D/uucptree $D/uucptree.sed
$D/filefacts: filefacts; $L filefacts $D
$D/filelists: filelists; $L filelists $D
$D/uucptree: uucptree; $L uucptree $D
$D/uucptree.sed:uucptree.sed; $L uucptree.sed $D
filefacts: filefacts.c; cc -o filefacts filefacts.c
filelists: filelists.c; cc -o filelists filelists.c
S=Makefile README RUN_ME filefacts.c filelists.c uucptree uucptree.sed uucptree.1
uucptree.shar: $S; shar $S >uucptree.shar
clean: ; rm filefacts filelists
\!Funky\!Stuff\!
echo file: README
cat > README << '\!Funky\!Stuff\!'
This directory contains the 'uucptree' script and associated programs.
The purpose of this script is to generate a set of reasonably small cpio
archives, and uucp them somewhere.
They may then be unpacked at the receiving end, to reconstruct a set
of file trees.

The uucptree script is called as:
    uucptree directory... destination
where one or more directories (or files) may be listed, and the
destination is a uucp path to a directory.  The result will be a set
of cpio archives at the destination which, when unpacked, will
reconstruct the original directories and all their contents.

There are two C programs used:  filefacts and filelists.  The first
takes a list of files (generated by 'find') and produces the same list
with information about their device, inode, and size.  This list is
then sorted, resulting in multiply-linked files ending up together.
The sorted list is fed to filelists, which produces a series of lists:
    list_1, list_2, ....
Each list is filled until adding another file would push its total
past $UUCPMAX, an environment variable that defaults to '1M', or
1 Megabyte.  The resulting lists are fed to cpio, to produce the files:
    cpio_1, cpio_2, ....
Each of these is then uucp'd to the specified destination.

Note that this script leaves behind a set of files in the current
directory named "list_*" and "cpio_*".  You might wish to delete them
when you have verified that the uucps have completed, since they will
occupy a fair amount of space.

The Makefile is set up so that you can just type:
    make install
and everything will be compiled and installed in a default directory
(/usr/lib/uucp).  You might want to examine the Makefile and the *.c
files first, to see if there's anything you want to change for your
system.

To test it, try typing a command like:
    setenv UUCPMAX 50000
    sh -x uucptree p sh csh i somewhere!~someone >& audit &
When this terminates, you should have a lot of "cpio_*" files that are
mostly around 50K bytes, and a lot of uucp copies in the hopper for
somewhere!~someone; log into somewhere and see if they all get there
OK, then unpack them with cpio.
Go back to the first system and type:
    rm list_* cpio_*

This code was developed and tested on a reasonably generic Unix SYS/V
system.  If you have problems with non-portable code, you might send
patches to the author:

    John M Chambers    Phone: 617/364-2000x7304
Email: ...{adelie,bu-cs,harvax,inmet,mcsbos,mit-eddie,mot[bos]}!cdx39!{jc,news,root,usenet,uucp}
Smail: Codex Corporation; Mailstop C1-30; 20 Cabot Blvd; Mansfield MA 02048-1193
Clever-Saying: If we can't fix it, it ain't broke.
\!Funky\!Stuff\!
echo file: RUN_ME
cat > RUN_ME << '\!Funky\!Stuff\!'
UL=/usr/lib/uucp
make all
make install
\!Funky\!Stuff\!
echo file: filefacts.c
cat > filefacts.c << '\!Funky\!Stuff\!'
/* filefacts <filelist [-option]...
**
** The standard input should be a list of file names.
** For each file, a line of output is produced in the form:
**     DDDD IIII size filename
** (device number, inode number, size in bytes, name).
**
** This data is intended to be used with filelists(1), to
** chop a single list of files into a lot of little lists,
** each of which totals less than N bytes.
**
** BUGS: files which have disappeared are ignored, and
** their names are not written to the output file.
**
** Directories are not treated specially; perhaps they should be.
*/
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>

#define D1 if(debug>=1)pmsg
#define D2 if(debug>=2)pmsg
#define D3 if(debug>=3)pmsg
#define D4 if(debug>=4)pmsg
#define D5 if(debug>=5)pmsg
#define D6 if(debug>=6)pmsg
#define D7 if(debug>=7)pmsg
#define D8 if(debug>=8)pmsg
#define D9 if(debug>=9)pmsg
#define E pmsg
#define NMAX 1000           /* Longest filename we can handle */

int   debug = 1;
char  nbuf[NMAX+1];         /* Place to build file names */
char *na = nbuf;            /* Start of name buffer */
char *np = nbuf;            /* Next char in name buffer */
char *nz = nbuf+NMAX;       /* End of name buffer */
int   outf = -1;            /* File number of output file */
char *progname = "?";       /* This program's name */
long  total = 0;            /* Number of blocks so far */

main(ac,av)
    int    ac;
    char **av;
{   int   a, args, c, i;
    char *cp;

    progname = av[0];
    args = 0;
    for (a=1; a<ac; a++) {
        switch (c = av[a][0]) {
        case '-':           /* -option */
            D4("main:option \"%s\"",av[a]);
            switch (av[a][1]) {
            case 'v': case 'V':
            case 'd': case 'D':
                i = sscanf(av[a]+2,"%d",&debug);
                if (i < 1) debug = 2;
                break;
            default:
                E("Unknown option \"%s\" ignored.",av[a]);
            }
            break;
        default:
            E("Extra arg \"%s\" ignored.",av[a]);
        }
    }
    np = na;                /* Start of name buffer */
    c = ' ';
    while (c != EOF) {
        c = getchar();
        switch(c) {         /* What sort of char is it? */
        default:            /* Most are just part of filename */
            if (np < nz) {
                *np++ = c;
                *np = 0;
                D9("main: name=\"%s\"",na);
            } else {
                fprintf(stderr,"Name too long: \"%s%c",na,c);
                while ((c = getchar()) != EOF && c != '\n')
                    putc(c,stderr);
                putc('\n',stderr);
                fflush(stderr);
                np = na;    /* Discard the overlong name before going on */
                continue;
            }
            break;
        case EOF:           /* List of possible filename terminators */
        case ' ':
        case '\t':
        case '\n':
        case '\r':
        case '\0':
            if (np <= na) {
                D3("Null name ignored.");
                break;
            }
            *np = 0;
            D6("before onefile(\"%s\")",na);
            onefile(na);
            D6("after onefile(\"%s\")",na);
            np = na;
        }
    }
    exit(0);
}
help()
{
    fprintf(stderr,"Usage: %s <filelist\n",progname);
}
/* Given one file name, figure out whether it will fit onto the current
** dump tape.
If not, go on to the output file for the next tape.
*/
onefile(name)
    char *name;
{   struct stat status;
    long size;
    int  i;
    int  dev, ino;

    D5("onefile(\"%s\")",name);
    if (stat(name,&status) < 0) {
        E(" Can't access \"%s\" [errno=%d]",name,errno);
        return 0;
    }
    size = status.st_size;
    dev  = status.st_dev ;
    ino  = status.st_ino ;
    printf("%5d %5d %8ld %s\n",dev,ino,size,name);
}
/* Messages go to stderr here, because stdout carries the file list
** that gets piped on to sort and filelists.
*/
pmsg(fmt,x0,x1,x2,x3,x4,x5,x6,x7,x8,x9)
    char *fmt;
{
    fprintf(stderr,"%s:",progname);
    fprintf(stderr,fmt,x0,x1,x2,x3,x4,x5,x6,x7,x8,x9);
    fprintf(stderr,"\n");
    fflush(stderr);
}
\!Funky\!Stuff\!
echo file: filelists.c
cat > filelists.c << '\!Funky\!Stuff\!'
/* filelists <filelist [prefix] [-limit] [-option]...
**
** The standard input should be a list of file names.
** The list is divided up into sublists, each of less
** than 25Mbytes blocks total, and written to tape1, ...,
** to produce lists for a set of 1-tape dumps.  The
** return value is the number of tape* files written.
**
** To ensure dump tapes without initial '/' in the names,
** this program strips off any initial '/' it sees.
**
** BUGS: files which have disappeared are ignored, and
** their names are not written to the output file.
*/
#include <stdio.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/stat.h>

#define FUDGEA 512          /* Extra space needed per archive by cpio */
#define FUDGEF 128          /* Extra space needed per file by cpio */
#define NMAX 128            /* Longest filename cpio can handle */

#define D1 if(debug>=1)pmsg
#define D2 if(debug>=2)pmsg
#define D3 if(debug>=3)pmsg
#define D4 if(debug>=4)pmsg
#define D5 if(debug>=5)pmsg
#define D6 if(debug>=6)pmsg
#define D7 if(debug>=7)pmsg
#define D8 if(debug>=8)pmsg
#define D9 if(debug>=9)pmsg
#define E pmsg

long  limit = 1000000;      /* Max chars defaults to 1Mbytes */
int   debug = 1;
int   dev = -1;             /* Device number of current file */
char  fbuf[NMAX+1];         /* Place to build output file name */
int   files = 0;            /* Count of files in the current list */
int   filnum = 0;           /* Current output file number */
extern char*getenv();       /* For extracting UUCPMAX from environment */
int   ino = -1;             /* Inode number of current file */
char  nbuf[NMAX+1];         /* Place to build file names */
char *na = nbuf;            /* Start of name buffer */
char *np = nbuf;            /* Next char in name buffer */
char *nz = nbuf+NMAX;       /* End of name buffer */
int   olddev = -1;          /* Device number of previous file */
int   oldino = -1;          /* Inode number of previous file */
int   outf = -1;            /* File number of output file */
char *prefix = "list_";     /* Output file names start with this */
char *progname = "?";       /* This program's name */
long  total = 0;            /* Number of blocks so far */

main(ac,av)
    int    ac;
    char **av;
{   int   a, args, c, i;
    char *cp;
    long  siz;

    progname = av[0];
    if (cp = getenv("UUCPMAX")) {
        getlimit(cp);
        D1("UUCPMAX=%ld",limit);
    }
    args = 0;
    for (a=1; a<ac; a++) {
        switch (c = av[a][0]) {
        case '-':           /* -option */
            D4("main:option \"%s\"",av[a]);
            switch (av[a][1]) {
            case 'v': case 'V':
            case 'd': case 'D':
                i = sscanf(av[a]+2,"%d",&debug);
                if (i < 1) debug = 2;
                break;
            default:
                E("Unknown option \"%s\" ignored.",av[a]);
                break;
            }
            break;
        case '0': case '1': case '2': case '3':
        case '4': case '5': case '6': case
        '7': case '8': case '9':
            cp = av[a];
            getlimit(av[a]);
            break;
        default:            /* Arg without '-' or digit is prefix */
            switch(args++) {    /* We only want one of them */
            case 0:
                prefix = av[a];
                D3("main:prefix=\"%s\"",prefix);
                break;
            default:
                E("Extra arg \"%s\" ignored.",av[a]);
            }
        }
    }
    D2("limit=%ld prefix=\"%s\"",limit,prefix);
    D6("main:before newoutfile()");
    newoutfile();
    D6("main: after newoutfile()");
    np = na;
    c = ' ';
    olddev = -1;
    oldino = -1;
    while ((i = scanf("%d %d %ld %s",&dev,&ino,&siz,nbuf)) > 0) {
        if (i == 4) {
            D5("dev=%d ino=%5d size=%6ld '%s'",dev,ino,siz,nbuf);
            D6("before onefile(%ld,\"%s\") total=%ld",siz,na,total);
            onefile(siz,na);
            D6("after onefile(%ld,\"%s\") total=%ld",siz,na,total);
            np = na;
        } else {
            E("Invalid line in input, only %d fields.",i);
        }
    }
    D1("main: Total=%ld limit=%ld; finished list %d.",total,limit,filnum);
    exit(filnum);           /* Return the number of tapes required */
}
help()
{
    fprintf(stderr,"Usage: %s <filelist [blocklimit [prefix]]\n",progname);
}
/* Close the current output file and start a new one.
*/
newoutfile()
{   int i;

    D5("newoutfile()");
    ++filnum;
    files = 0;
    sprintf(fbuf,"%s%d",prefix,filnum);
    D2("New output file %d = \"%s\"",filnum,fbuf);
    D6("newoutfile:before close(%d)",outf);
    i = close(outf);
    D6("newoutfile: after close(%d)=%d\t[errno=%d]",outf,i,errno);
    D6("newoutfile:before creat(\"%s\",0%o)",fbuf,0666);
    outf = creat(fbuf,0666);
    D6("newoutfile: after creat(\"%s\",0%o)=%d\t[errno=%d]",fbuf,0666,outf,errno);
    total = FUDGEA;
    return outf;
}
/* Given one file name, figure out whether it will fit onto the current
** dump tape.  If not, go on to the output file for the next tape.  Note
** the 'files' variable, to get around a logical problem:  if a file is
** listed which is bigger than limit, we would produce an infinite number
** of empty lists.  If such a file occurs, it is allowed as the first
** name in a list.  The eventual result will be a 1-file cpio archive.
*/
onefile(siz,name)
    long  siz;
    char *name;
{   struct stat status;
    long size;
    int  i;
    char newfl;

    D5("onefile(\"%s\")",name);
    if (stat(name,&status) < 0) {   /* Paranoia: validate the size */
        E(" Can't access \"%s\" [errno=%d]",name,errno);
        return 0;
    }
    size = status.st_size;
    if (siz != size)
        E("Size changed from %ld to %ld for '%s'",siz,size,name);
    newfl = 1;
    if (dev == olddev && ino == oldino) {
        newfl = 0;
        D2("Link: dev=%4d ino=%5d '%s'",dev,ino,name);
        /*
        ** There are versions of cpio that don't fully understand
        ** multiply-linked files.  These versions will include many
        ** copies of the linked file in the archive, although only
        ** one copy is necessary.  If your cpio behaves this way
        ** (which may be determined by making a toy archive from
        ** two linked files and examining a dump of the result),
        ** you should comment out the following line.  If your
        ** cpio produces only the link names, use this command.
        */
/*      size = FUDGEF;      /* Treat links as special */
    }
    size += strlen(name) + FUDGEF;
    total += size;
    D2("size=%5ld total=%8ld limit=%8ld name='%s'",size,total,limit,name);
    if (newfl && total > limit && files > 0) {
        D1("onefile: Total=%ld > limit=%ld; finishing list %d.",total,limit,filnum);
        D6("onefile:before newoutfile()");
        i = newoutfile();
        D6("onefile: after newoutfile()=%d",i);
        total += size;      /* Charge this file to the new list */
        D2("size=%5ld total=%8ld limit=%8ld name=%s",size,total,limit,name);
    }
    write(outf,name,strlen(name));
    write(outf,"\n",1);
    ++files;                /* File counter to prevent infinite loops */
    olddev = dev;
    oldino = ino;
}
pmsg(fmt,x0,x1,x2,x3,x4,x5,x6,x7,x8,x9)
    char *fmt;
{
    fprintf(stdout,"%s:",progname);
    fprintf(stdout,fmt,x0,x1,x2,x3,x4,x5,x6,x7,x8,x9);
    fprintf(stdout,"\n");
    fflush(stdout);
}
getlimit(cp)
    char *cp;
{   int c;

    limit = 0L;
    while (c = *cp++) {
        switch(c) {
        case '0': case '1': case '2': case '3': case '4':
        case '5': case '6': case '7': case '8': case '9':
            limit = (limit * 10) + (c - '0');
            break;
        case 'k': case 'K':
            limit *= 1000;
            break;
        case 'm': case 'M':
            limit *= 1000000;
            break;
        default:
            E("Invalid
char '%c' in limit",c);
            help();
            break;
        case ',': case '_':
            break;
        }
    }
}
\!Funky\!Stuff\!
echo file: uucptree
cat > uucptree << '\!Funky\!Stuff\!'
:
#   uucptree [dir]... dest
#
# This script generates a list of all the files in the
# named directories, runs them through some filtering
# programs to divide them up into lists, each of whose
# total size is less than a threshold, creates a set
# of cpio archives, and uucps them to the destination.
#
# When this script ends, the current directory will contain
# files called "list_*" and "cpio_*", which are the file
# lists and cpio archives; uucp commands will have been
# submitted to copy the archives to 'dest'.
#
# The 'uucptree.sed' script is used to edit the list;
# a default is kept in /usr/lib/uucp, but one in the
# current directory or $HOME will be used first.
#
# The destination must be a valid uucp path to a directory;
# all the cpio archives will be put into that directory,
# which must be writable by uucp.
#
CD=`pwd`
UL=/usr/lib/uucp
T=/tmp
if [ $UUCPMAX'.' = '.' ] ; then UUCPMAX=1M ; fi
echo "|-------UUCPMAX:" $UUCPMAX
#
# Generate set of files list_* that contain the names
# of all the files under the named directories, divided
# up so that the total size of the files in each list
# is less than $UUCPMAX [default = 1Mbyte].  The files
# will be sorted by device and inode number.
#
# The uucptree.sed script may be used to delete unwanted
# files, such as *.bak, core, etc.
if   [ -f $CD/uucptree.sed ] ; then S=$CD/uucptree.sed
elif [ -f $HOME/uucptree.sed ] ; then S=$HOME/uucptree.sed
elif [ -f /usr/local/uucptree.sed ] ; then S=/usr/local/uucptree.sed
elif [ -f $UL/uucptree.sed ] ; then S=$UL/uucptree.sed
else S=/dev/null    # No sed script.
fi
echo "|-------------S:" $S
if [ $# -lt 2 ]
then echo "Usage:" $0 "directory... destination"
     exit 1
fi
echo "|----------Args:" $*
echo "|-----------Env:" ;env
echo Creating the list...
> $T/$$_A
while [ $# -gt 1 ]
do  find $1 -print | filefacts >> $T/$$_A
    shift
done
echo Editing the list...
sed < $T/$$_A -f $S \
| sort \
| uniq > $T/$$_D
rm $T/$$_A
#
# Finally, we can chop the list up into bite-sized portions.
echo "Dividing into "$UUCPMAX"-byte lists..."
filelists <$T/$$_D -d2 $UUCPMAX
N=$?
echo "Note: " $N "file list(s)."
rm $T/$$_D
#
# Given a set of file lists "list_*", this
# script packages each with 'cpio'.
#
echo Building cpio archives...
I=0
while [ $I -lt $N ]
do  I=`expr $I + 1`
    cpio -oa <list_$I >cpio_$I
done
#
# Send our cpio files to specified user on another system.
echo Sending cpio archives...
I=0
while [ $I -lt $N ]
do  I=`expr $I + 1`
    uucp cpio_$I $1
done
echo $0 done.
exit 0
\!Funky\!Stuff\!
echo file: uucptree.sed
cat > uucptree.sed << '\!Funky\!Stuff\!'
\|/core$|d
\|\.bak$|d
\|\.ckp$|d
\|-$|d
\|/dev/|d
\|\#|d
\|^/*|s///
\!Funky\!Stuff\!
echo file: uucptree.1
cat > uucptree.1 << '\!Funky\!Stuff\!'
.TH UUCPTREE 1
.VE 0
.SH NAME
uucptree \- copy directories via uucp
.SH SYNOPSIS
.B uucptree
.I dir... uucppath
.SH DESCRIPTION
.PP
.I Uucptree
copies the named directories with all their files
and subdirectories to the
.I uucppath ,
which should be a directory.
.PP
The file list is sorted and broken into sublists such
that multiply-linked files are in the same sublist.
The sublists are then used to generate a set of cpio
archives, each of which is sent to the destination
with a separate uucp command.
.FI
This script creates files list_1, list_2, ..., which contain
the file names for each uucp.  Then it creates cpio_1, cpio_2, ...,
which are the cpio archives.  When the uucps have finished,
you should remove these files.
.PP
The upper bound on the size of each cpio archive is limited
by the environment variable UUCPMAX, which defaults to 1 Mbyte.
.SH SEE ALSO
tar(1), cpio(1), cptree(1)
.BU
Multiply-linked files may result in a cpio archive that
is bigger than the UUCPMAX limit.
Furthermore, some versions of cpio put multiple copies of
linked files into the archives.
.PP
File ownership on the receiving end is a difficult question.
\!Funky\!Stuff\!
--
    John M Chambers    Phone: 617/364-2000x7304
Email: ...{adelie,bu-cs,harvax,inmet,mcsbos,mit-eddie,mot[bos]}!cdx39!{jc,news,root,usenet,uucp}
Smail: Codex Corporation; Mailstop C1-30; 20 Cabot Blvd; Mansfield MA 02048-1193
Clever-Saying: If we can't fix it, it ain't broke.
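
For readers who want the idea without the C: the list-splitting step described in the README can be sketched in a few lines of sh and awk. This is only an illustration of the grouping rule (keep lines with the same device and inode together, start a new list when the running size would pass the limit, never leave a list empty). The function name split_lists and its limit argument are made up for this sketch; it ignores the FUDGEA/FUDGEF overhead that filelists adds, and like the posted code it assumes file names contain no white space.

```shell
#!/bin/sh
# Sketch only: split_lists is a hypothetical helper, not part of the posting.
# stdin : lines of "dev inode size name", already sorted by dev and inode
# stdout: lines of "listnumber name"
# $1    : size limit in bytes for each list
split_lists() {
    awk -v limit="$1" '
    {
        # A line is a link if it repeats the previous dev/inode pair.
        link = ($1 == dev && $2 == ino)
        # Open a new list only at a non-link file, and only if the
        # current list already holds something (avoids empty lists).
        if (!link && files > 0 && total + $3 > limit) {
            list++; total = 0; files = 0
        }
        total += $3; files++
        dev = $1; ino = $2
        print list + 1, $4
    }'
}
```

Fed the four lines "1 10 400 a", "1 11 400 b", "1 11 400 c", "1 12 400 d" with a limit of 1000, the two names sharing inode 11 stay in list 1 even though that pushes the list over the limit, and d opens list 2. That is the same trade-off the BUGS section of the man page mentions.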
ksh@scampi.UUCP (Kent S. Harris) (01/07/87)
In article <532@cdx39.UUCP>, jc@cdx39.UUCP (John Chambers) writes:
> Well, so many people said "No, I don't know how, but I want the code when
> you get it working" that I did it and I'm posting it.  This is a first
> try, and has only been tested on some SYS/V machines, so it probably won't
> quite work everywhere.

What about (for example):

    tar cf ~uucp/foo ./*
    uucp ~uucp/foo target!~uucp

If you need to uuencode the file, fine.  At the other end you use
tar again for the extract.
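
A minimal sketch of the tar approach suggested above, assuming a tar that accepts "-" for standard input and output. The function names pack_tree and unpack_tree are made up for illustration, and the uucp step is left as a comment because it depends on the sites involved. A single tar archive keeps hard links together, but its size is whatever the tree's size is, so it does not address the per-transfer size limit that the original posting is built around.

```shell
#!/bin/sh
# Sketch only: pack_tree and unpack_tree are hypothetical helper names.

# pack_tree SRCDIR ARCHIVE
# Archive everything under SRCDIR using relative names.
pack_tree() {
    ( cd "$1" && tar cf - . ) > "$2"
}

# unpack_tree ARCHIVE DESTDIR
# Run on the receiving side once the archive has arrived.
unpack_tree() {
    mkdir -p "$2" && ( cd "$2" && tar xf - ) < "$1"
}

# Between the two steps the archive has to be shipped, for example with
# the command quoted in the article:
#   uucp ~uucp/foo target!~uucp
```

Usage would be along the lines of `pack_tree /some/tree /tmp/foo` on the sending machine and `unpack_tree foo /some/tree` on the receiving one; both paths here are placeholders.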