greim@sbsvax.UUCP (Michael Greim) (05/17/88)
Hello netland,

Some months ago a program called 'slice' was posted to comp.sources.unix.
We tried it here, but found a bug almost instantly. Our system administrator
wanted to use slice to extract tarmail pieces from a mailbox, so I made
2 extensions to slice for him.
I sent the following to Rich Salz, suggesting a reposting, but did not hear
anything from him. So I assume it's ok if I present my changes.

Here are
1.) the BUG
2.) 2 extensions to slice, description
3.) context diff of slice.c (a cure for the ailment)
4.) context diff of slice.1
5.) the tarmail extracting script

1.) the BUG
1.1) Symptoms
    I tried "slice -f file -n100 A#n" and expected slice to produce
    some file Ann. But to my surprise it said:
    "can not use -n option together with pattern" or some such.
1.2) Diagnosis
    The first command line argument not starting with '-' was considered
    a pattern, regardless of the other options specified.
1.3) Therapy
    Apply the context diff in 3.

2.) 2 extensions to slice, description
    The extensions are in the substitution ability.
    #0nn : with this format you can specify up to 99 parameters instead
           of only 9. We needed this!
    #-nn : take the nn'th parameter from the last. nn=0 means the last
           parameter, so #-00 is equal to #$ when you have fewer than
           99 parameters.
    NOTE: To make this work properly, set MAXPARM in opts.h to 99.
    (no context diff included because of {what-you-like} :-)
    Apply the context diff in 3.

3.)
context diff of slice.c (a cure for the ailment)

*** slice.c.old Wed Mar 23 18:41:41 1988
--- slice.c Wed Mar 23 18:40:42 1988
***************
*** 43,48 ****
--- 44,52 ----
  bool exclude = FALSE; /* exclude matched line from o/p files */
  bool split_after = FALSE; /* split after matched line */
  bool m_flag = FALSE; /* was -m option used */
+ bool s_flag = FALSE; /* was -s option used */
+ bool n_flag = FALSE; /* was -n option used */
+ bool e_flag = FALSE; /* was -e option used */
  
  FILE *output = (FILE *) NULL; /* fd of current output file */
  FILE *rejectfd = (FILE *) NULL; /* fd of reject file */
***************
*** 105,110 ****
--- 109,115 ----
                usage(1);
            }
            pattern = *argv;
+           e_flag = TRUE;
            break;
        }
        case 'm': { /* mailbox pattern */
***************
*** 113,119 ****
            break;
        }
        case 's': { /* shell pattern */
!           pattern = "^#! *\/bin\/sh";
            break;
        }
        case 'n': { /* -n n_lines -- split every n lines */
--- 118,125 ----
            break;
        }
        case 's': { /* shell pattern */
!           pattern = "^#! *\\/bin\\/sh";
!           s_flag = TRUE;
            break;
        }
        case 'n': { /* -n n_lines -- split every n lines */
***************
*** 123,128 ****
--- 129,135 ----
                error("-n: number must be at least 1\n");
                exit(EXIT_SYNTAX);
            }
+           n_flag = TRUE;
            break;
        }
        case 'f': {
***************
*** 163,179 ****
            }
        } /* end switch */
    } else {
!       if (!pattern) pattern = *argv; /* first non-flag is pattern */
        else break; /* break while loop */
    } /* end if */
  } /* end while */
  
  if (!argc) {
!   if (m_flag) {
        format = mboxformat;
!   } else {
        format = defaultfmt;
-   }
    n_format = 1;
  } else {
    format = argv;
--- 170,195 ----
            }
        } /* end switch */
    } else {
!       /*
!        * mg, 22.mar.88
!        * the first non-flag is pattern, if not one of -s -n or -m
!        * was specified or -e pattern
!        */
!       if (!pattern && !m_flag && !s_flag && !n_flag)
!           pattern = *argv; /* first non-flag is pattern */
        else break; /* break while loop */
    } /* end if */
  } /* end while */
+ if (e_flag && (m_flag || s_flag || n_flag)) {
+   error("don't use -e together with -m, -s or -n flags\n");
+   usage(EXIT_SEMANT);
+ }
  
  if (!argc) {
!   if (m_flag)
        format = mboxformat;
!   else
        format = defaultfmt;
    n_format = 1;
  } else {
    format = argv;
***************
*** 486,491 ****
--- 506,539 ----
        q += strlen(tempbuf);
        break;
    }
+   /*
+    * mg, 18.mar.88
+    * - use #0nn to specify parameter numbers greater than 9
+    * - use #-nn to select the nn'th parameter from the last
+    *   #-00 is equivalent to #$
+    */
+   case '-':
+   case '0':
+       if (!isdigit(*(p+1)) || !isdigit(*(p+2))) {
+           error("Invalid use of #%cnn format in '%s'\n", *p, *format);
+           exit(EXIT_RUNERR);
+       }
+       i = (*(p+1) - '0') * 10 + *(p+2) - '0';
+       if (i > MAXPARM) {
+           error("Number of parameter (%1d) exceeds max (%1d)\n", i, MAXPARM);
+           exit(EXIT_RUNERR);
+       }
+       if (*p == '-') {
+           j = lastparm ();
+           if (j < i) {
+               error ("Not enough parameters to take difference.\n");
+               exit (EXIT_RUNERR);
+           }
+           i = j - i;
+       } else
+           i--;
+       p += 2;
+       goto do_form;
    case '1':
    case '2':
    case '3':
***************
*** 501,506 ****
--- 549,555 ----
        } else {
            i = (*p) - '1';
        }
+ do_form:
        if (*(p+1) == '%') {
            p++;
            fmtcode = getfmt(fmt,p);

4.) context diff of slice.1

*** slice.1.old Wed Mar 23 18:42:28 1988
--- slice.1 Wed Mar 23 18:40:42 1988
***************
*** 38,45 ****
  into one or more output files. The output files are named according
  to the \fIformat\fR strings provided. The input file is split whenever
  a pattern is matched or every \fIn\fR lines, depending on the
! options selected. Because some of the options are mutually exclusive,
  there are three forms of the command.
  .LP
  Whenever a pattern match is used to slice the file, lines occurring
  before the first match are sent to the \fIreject\fR file (which is
--- 38,47 ----
  into one or more output files. The output files are named according
  to the \fIformat\fR strings provided. The input file is split whenever
  a pattern is matched or every \fIn\fR lines, depending on the
! options selected.
! Because some of the options are mutually exclusive,
  there are three forms of the command.
+ It is an error to specify a pattern together with options -m, -s or -n.
  .LP
  Whenever a pattern match is used to slice the file, lines occurring
  before the first match are sent to the \fIreject\fR file (which is
***************
*** 111,119 ****
  output file produced by the current output format. When an output format
  produces the same name twice, a new format is selected and numbering begins
  again with the initial value.
! .IP "#\&1, #\&2 ..."
! Parameters of the form #\&1, #\&2, ... #\&9 are replaced by corresponding
  tokens drawn from the source line which matched the slice pattern.
  For example, if each procedure in a C program began with a comment
  line of the following form:
  .sp
--- 113,129 ----
  output file produced by the current output format. When an output format
  produces the same name twice, a new format is selected and numbering begins
  again with the initial value.
! .IP "#\&1, #\&2 ..., #\&0nn, #\&-nn"
! Parameters of the form #\&1, #\&2, ... #\&9 or #\&0nn, where 'nn' is
! a 2 digit number are replaced by corresponding
  tokens drawn from the source line which matched the slice pattern.
+ If you specify #\&-nn, you can select a parameter relative from
+ the last token on the line. #\&-00 is the last token on the line,
+ #\&-01 the last but one, ...
+ .br
+ Note that it is an error to not specify two digits when using #\&0nn
+ or #\&-nn.
+ .br
  For example, if each procedure in a C program began with a comment
  line of the following form:
  .sp
***************
*** 131,136 ****
--- 141,149 ----
  \ \ \ \ \From garyp@cognos Tue Sep 15 15:08:23 EDT 1987
  .sp
  then "#$" would select "1987", the last token on the line.
+ .br
+ Currently there are 99 addressable tokens on an input line. If a line
+ is split in more tokens, #$ will hold the last one.
  .SH FORMAT SPEC's
  .LP
  Substitution parameters can be followed by an optional
***************
*** 240,245 ****
--- 253,264 ----
  generate the correct filenames, either slice has to lookahead to find the
  next match line or it has to direct lines for the current slice into a
  temporary file until it finds the line matching the pattern.
+ .IP c) 4
+ When you use slice on machines with a filesystem which allows you
+ only a (usually small) amount of characters for filenames (i.e. 14),
+ slice might not detect that it is overwriting a file and/or
+ its diagnostic output is false. Especially filenames generated by the -m
+ option are too long. Just specify a format when slicing a mailbox.
  .SH DIAGNOSTICS
  ``Internal Error'' indicates a bug in \fIslice\fR, and should be reported.
  Exit status 1 indicates an error parsing options \- for example, if an unknown
***************
*** 249,254 ****
--- 268,279 ----
  be opened.
  .LP
  If a reject file is not provided, a count of rejected lines is reported.
+ .SH "AUTHOR"
+ Originally written by Russell Quinn as "mailsplit".
+ .sp
+ Revised and extended by Gary Puckering <cognos!garyp>.
+ .sp
+ Extended some more by Michael Greim.
  .SH "SEE ALSO"
  .I cat
  (1),
  .I ed
  (1),

5.) the tarmail extracting script

The author, Bernard Sieloff (bs@sbsvax.UUCP), says it could be improved, but
it is already 30% faster than the version using csplit.

#! /bin/sh
# @(#)untarpack 2.1 (UniSB[bs]) 88/03/20
PATH=/usr/ucb:/bin:/usr/bin:/usr/local
if [ $# -lt 1 -o $# -gt 2 ]; then
    echo "Usage: untarpack \"subject-string\"[ your-tarmailbox]"
    exit 1
fi
trap 'echo "untarpack: cancelled"; exit 9' 1 2 3 15
TS=$1;
if [ $# -eq 2 ]; then
    MB=$2
else
    MB=/usr/spool/mail/$USER
fi
if [ ! -s $MB ]; then
    echo "untarpack: no such file: $MB"
    exit 1
fi
rm -f utm.boxfile.???-of-???
echo "starting unpacking now---please wait..."
sed -n -e "/^Subject: $TS - part/,/^---end beef/p" $MB | slice "^Subject: $TS - part" 'utm.boxfile.#-02%03d-of-#$%03d'
if [ $? -ne 0 ]; then
    echo "untarpack: slice error"
    exit 2
fi
if [ ! -s utm.boxfile.001-of-??? ]; then
    echo "untarpack: can't find subjects \"$TS\" in file \"$MB\""
    exit 3
fi
FOUND=`ls utm.boxfile.???-of-??? | wc -l`
PACKS=`expr substr utm.boxfile.001-of-??? 20 3`
if [ $FOUND -lt $PACKS ]; then
    FOUND=`expr $FOUND + 0`
    PACKS=`expr $PACKS + 0`
    echo "untarpack: lack of tarmail packets ($FOUND instead of $PACKS)"
    exit 4
elif [ $FOUND -gt $PACKS ]; then
    FOUND=`expr $FOUND + 0`
    PACKS=`expr $PACKS + 0`
    echo "untarpack: packet overrun?!? ($FOUND instead of $PACKS)"
    exit 5
fi
echo '---end beef' > utm.boxfile.000-of-$PACKS
echo -n "Done---do you want to UNTARMAIL the tarmail? [y/n]:"
read answer junk
answer=${answer}x
if expr $answer : '[yY].*x'>/dev/null; then
    echo "OK---UNTARMAILing your tarmail..."
    exec untarmail utm.boxfile.???-of-???
else
    echo 'Use "untarmail utm.boxfile.???-of-???" to reconstruct the TARMAIL'
fi
exit 0

Absorb, apply and enjoy,
    Michael
--
snail-mail : Michael Greim, Universitaet des Saarlandes, FB 10 - Informatik
             (Dept. of CS), Bau 36, Im Stadtwald 15, D-6600 Saarbruecken 11,
             West Germany
E-mail     : greim@sbsvax.UUCP
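
Appendix (illustration only): the #0nn / #-nn index arithmetic from the
slice.c diff can be looked at in isolation. The following is a minimal
stand-alone sketch, not part of slice.c. The function name parm_index and
its parameter last are hypothetical; last stands in for what lastparm()
appears to return in the diff, namely the 0-based index of the last token.
MAXPARM is assumed to be 99, as the NOTE in section 2 asks. Errors are
reported by returning -1 instead of calling error() and exit().

```c
#include <ctype.h>

#define MAXPARM 99              /* assumed value, see the NOTE on opts.h */

/*
 * Hypothetical helper mirroring the arithmetic of the '-' / '0' case.
 * p points at the character following '#', i.e. at '0' or '-'.
 * last is the 0-based index of the last token on the matched line.
 * Returns the 0-based token index, or -1 where the patch would
 * report an error.
 */
static int parm_index(const char *p, int last)
{
    int nn;

    if (*p != '0' && *p != '-')
        return -1;
    /* exactly two digits must follow, as the manual page diff states */
    if (!isdigit((unsigned char)p[1]) || !isdigit((unsigned char)p[2]))
        return -1;
    nn = (p[1] - '0') * 10 + (p[2] - '0');
    if (nn > MAXPARM)
        return -1;
    if (*p == '-') {
        if (last < nn)          /* not enough tokens to count back */
            return -1;
        return last - nn;       /* #-00 is the last token, like #$ */
    }
    return nn - 1;              /* #001 is the first token (index 0) */
}
```

With seven tokens on the matched line (last index 6), "-02" maps to index 4,
the third token counted from the end; this is the form the untarpack script
uses in 'utm.boxfile.#-02%03d-of-#$%03d'. Note that "000" maps to -1 in this
sketch, because the arithmetic shown in the diff decrements without checking
for zero.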