greim@sbsvax.UUCP (Michael Greim) (05/17/88)
Hello netland,
Some months ago a program called 'slice' was posted to comp.sources.unix.
We tried it here and found a bug almost immediately. Our system
administrator wanted to use slice to extract tarmail pieces from
a mailbox, so I added two extensions to slice for him.
I sent the following to Rich Salz, suggesting a reposting, but did
not hear anything from him, so I assume it is OK to present my changes here.
Here are
1.) the BUG
2.) 2 extensions to slice, description
3.) context diff of slice.c (a cure for the ailment)
4.) context diff of slice.1
5.) the tarmail extracting script
1.) the BUG
1.1) Symptoms
I tried "slice -f file -n100 A#n" and expected slice to produce
files named Ann. To my surprise it said: "can not use -n option
together with pattern", or something to that effect.
1.2) Diagnosis
The first command line option not starting with '-' was considered
a pattern regardless of other options specified.
1.3) Therapy
Apply the context diff in 3.
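
For those who want the gist without reading the diff, here is a minimal
sketch of the decision that was wrong. It is not code from slice.c: the
struct and the function names old_rule/new_rule are made up for
illustration, and only the flag names follow the patch.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical summary of the option state; not a type from slice.c. */
struct slice_opts {
    const char *pattern;  /* set by -e, or by the first non-flag argument */
    bool m_flag;          /* -m seen */
    bool s_flag;          /* -s seen */
    bool n_flag;          /* -n seen */
};

/* Old rule: any first non-flag argument becomes the pattern. */
bool old_rule(const struct slice_opts *o)
{
    return o->pattern == NULL;
}

/* Patched rule: only when none of -m, -s, -n has been given. */
bool new_rule(const struct slice_opts *o)
{
    return o->pattern == NULL && !o->m_flag && !o->s_flag && !o->n_flag;
}
```

With -n100 given and no pattern set, old_rule() says "take A#n as the
pattern", which is what triggered the error message; new_rule() leaves
A#n alone so that it is used as the output format.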
2.) 2 extensions to slice, description
The extensions are in the substitution ability.
#0nn : with this format you can specify up to 99 parameters instead
of only 9. We needed this!
#-nn : take the nn'th parameter counting back from the last one. #-00 is
the last parameter, which is equal to #$ when you have fewer than 99
parameters. In both forms nn must be written with exactly two digits.
NOTE:
To make this work properly set MAXPARM in opts.h to 99.
(no context diff included because of {what-you-like} :-)
Apply the context diff in 3.
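
As a rough illustration of how the two new forms map to a token, here is
a small standalone sketch. It follows the arithmetic of the patch but is
not the patch itself: resolve_param() is a made-up name, "last" stands
for what I assume lastparm() returns (the zero-based index of the last
token), and the rejection of #000 is a check added in this sketch only.

```c
#include <ctype.h>

#define MAXPARM 99  /* the value recommended above for opts.h */

/*
 * Resolve a "0nn" or "-nn" reference (the text following '#') to a
 * zero-based token index. Returns -1 if the reference is malformed
 * or points outside the available tokens.
 */
int resolve_param(const char *spec, int last)
{
    int nn;

    if (spec[0] != '0' && spec[0] != '-')
        return -1;
    /* exactly two digits are required after the '0' or '-' */
    if (!isdigit((unsigned char)spec[1]) || !isdigit((unsigned char)spec[2]))
        return -1;
    nn = (spec[1] - '0') * 10 + (spec[2] - '0');
    if (nn > MAXPARM)
        return -1;
    if (spec[0] == '-') {
        if (last < nn)          /* not enough tokens to count back */
            return -1;
        return last - nn;       /* "-00" is the last token */
    }
    if (nn == 0)                /* "000" names no token (sketch-only check) */
        return -1;
    return nn - 1;              /* "012" is the 12th token, index 11 */
}
```

So on a line with six tokens (last == 5), "-00" gives index 5 and "-02"
gives index 3, the third token from the end. The script in section 5
uses #-02 and #$ in the same way to build its file names.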
3.) context diff of slice.c (a cure for the ailment)
*** slice.c.old Wed Mar 23 18:41:41 1988
--- slice.c Wed Mar 23 18:40:42 1988
***************
*** 43,48 ****
--- 44,52 ----
bool exclude = FALSE; /* exclude matched line from o/p files */
bool split_after = FALSE; /* split after matched line */
bool m_flag = FALSE; /* was -m option used */
+ bool s_flag = FALSE; /* was -s option used */
+ bool n_flag = FALSE; /* was -n option used */
+ bool e_flag = FALSE; /* was -e option used */
FILE *output = (FILE *) NULL; /* fd of current output file */
FILE *rejectfd = (FILE *) NULL; /* fd of reject file */
***************
*** 105,110 ****
--- 109,115 ----
usage(1);
}
pattern = *argv;
+ e_flag = TRUE;
break;
}
case 'm': { /* mailbox pattern */
***************
*** 113,119 ****
break;
}
case 's': { /* shell pattern */
! pattern = "^#! *\/bin\/sh";
break;
}
case 'n': { /* -n n_lines -- split every n lines */
--- 118,125 ----
break;
}
case 's': { /* shell pattern */
! pattern = "^#! *\\/bin\\/sh";
! s_flag = TRUE;
break;
}
case 'n': { /* -n n_lines -- split every n lines */
***************
*** 123,128 ****
--- 129,135 ----
error("-n: number must be at least 1\n");
exit(EXIT_SYNTAX);
}
+ n_flag = TRUE;
break;
}
case 'f': {
***************
*** 163,179 ****
}
} /* end switch */
} else {
! if (!pattern) pattern = *argv; /* first non-flag is pattern */
else break; /* break while loop */
} /* end if */
} /* end while */
if (!argc) {
! if (m_flag) {
format = mboxformat;
! } else {
format = defaultfmt;
- }
n_format = 1;
} else {
format = argv;
--- 170,195 ----
}
} /* end switch */
} else {
! /*
! * mg, 22.mar.88
! * the first non-flag is pattern, if not one of -s -n or -m
! * was specified or -e pattern
! */
! if (!pattern && !m_flag && !s_flag && !n_flag)
! pattern = *argv; /* first non-flag is pattern */
else break; /* break while loop */
} /* end if */
} /* end while */
+ if (e_flag && (m_flag || s_flag || n_flag)) {
+ error("don't use -e together with -m, -s or -n flags\n");
+ usage(EXIT_SEMANT);
+ }
if (!argc) {
! if (m_flag)
format = mboxformat;
! else
format = defaultfmt;
n_format = 1;
} else {
format = argv;
***************
*** 486,491 ****
--- 506,539 ----
q += strlen(tempbuf);
break;
}
+ /*
+ * mg, 18.mar.88
+ * - use #0nn to specify parameter numbers greater than 9
+ * - use #-nn to select the nn'th parameter from the last
+ * #-00 is equivalent to #$
+ */
+ case '-':
+ case '0':
+ if (!isdigit(*(p+1)) || !isdigit(*(p+2))) {
+ error("Invalid use of #%cnn format in '%s'\n", *p, *format);
+ exit(EXIT_RUNERR);
+ }
+ i = (*(p+1) - '0') * 10 + *(p+2) - '0';
+ if (i > MAXPARM) {
+ error("Number of parameter (%1d) exceeds max (%1d)\n", i, MAXPARM);
+ exit(EXIT_RUNERR);
+ }
+ if (*p == '-') {
+ j = lastparm ();
+ if (j < i) {
+ error ("Not enough parameters to take difference.\n");
+ exit (EXIT_RUNERR);
+ }
+ i = j - i;
+ } else
+ i--;
+ p += 2;
+ goto do_form;
case '1':
case '2':
case '3':
***************
*** 501,506 ****
--- 549,555 ----
} else {
i = (*p) - '1';
}
+ do_form:
if (*(p+1) == '%') {
p++;
fmtcode = getfmt(fmt,p);
4.) context diff of slice.1
*** slice.1.old Wed Mar 23 18:42:28 1988
--- slice.1 Wed Mar 23 18:40:42 1988
***************
*** 38,45 ****
into one or more output files. The output files are named according
to the \fIformat\fR strings provided. The input file is split
whenever a pattern is matched or every \fIn\fR lines, depending on the
! options selected. Because some of the options are mutually exclusive,
there are three forms of the command.
.LP
Whenever a pattern match is used to slice the file, lines occurring
before the first match are sent to the \fIreject\fR file (which is
--- 38,47 ----
into one or more output files. The output files are named according
to the \fIformat\fR strings provided. The input file is split
whenever a pattern is matched or every \fIn\fR lines, depending on the
! options selected.
! Because some of the options are mutually exclusive,
there are three forms of the command.
+ It is an error to specify a pattern together with options -m, -s or -n.
.LP
Whenever a pattern match is used to slice the file, lines occurring
before the first match are sent to the \fIreject\fR file (which is
***************
*** 111,119 ****
output file produced by the current output format. When an output
format produces the same name twice, a new format is selected and
numbering begins again with the initial value.
! .IP "#\&1, #\&2 ..."
! Parameters of the form #\&1, #\&2, ... #\&9 are replaced by corresponding
tokens drawn from the source line which matched the slice pattern.
For example, if each procedure in a C program began with a comment
line of the following form:
.sp
--- 113,129 ----
output file produced by the current output format. When an output
format produces the same name twice, a new format is selected and
numbering begins again with the initial value.
! .IP "#\&1, #\&2 ..., #\&0nn, #\&-nn"
! Parameters of the form #\&1, #\&2, ... #\&9 or #\&0nn, where 'nn' is
! a 2 digit number are replaced by corresponding
tokens drawn from the source line which matched the slice pattern.
+ If you specify #\&-nn, you can select a parameter relative from
+ the last token on the line. #\&-00 is the last token on the line,
+ #\&-01 the last but one, ...
+ .br
+ Note that it is an error to not specify two digits when using #\&0nn
+ or #\&-nn.
+ .br
For example, if each procedure in a C program began with a comment
line of the following form:
.sp
***************
*** 131,136 ****
--- 141,149 ----
\ \ \ \ \From garyp@cognos Tue Sep 15 15:08:23 EDT 1987
.sp
then "#$" would select "1987", the last token on the line.
+ .br
+ Currently there are 99 addressable tokens on an input line. If a line
+ is split in more tokens, #$ will hold the last one.
.SH FORMAT SPEC's
.LP
Substitution parameters can be followed by an optional
***************
*** 240,245 ****
--- 253,264 ----
generate the correct filenames, either slice has to lookahead to find
the next match line or it has to direct lines for the current slice
into a temporary file until it finds the line matching the pattern.
+ .IP c) 4
+ When you use slice on machines with a filesystem which allowes you
+ only a (usually small) amount of characters for filenames (i.e. 14),
+ slice might not detect that it is overwriting a file and/or
+ its diagnostic output is false. Especially filenames generated by the -m
+ option are too long. Just specify a format when slicing a mailbox.
.SH DIAGNOSTICS
``Internal Error'' indicates a bug in \fIslice\fR, and should be reported.
Exit status 1 indicates an error parsing options \- for example, if an unknown
***************
*** 249,254 ****
--- 268,279 ----
be opened.
.LP
If a reject file is not provided, a count of rejected lines is reported.
+ .SH "AUTHOR"
+ Originally written by Russell Quinn as "mailsplit".
+ .sp
+ Revised and extended by Gary Puckering <cognos!garyp>.
+ .sp
+ Extended some more by Michael Greim.
.SH "SEE ALSO"
.I cat (1),
.I ed (1),
5.) the tarmail extracting script
The author, Bernard Sieloff (bs@sbsvax.UUCP), says it could be improved,
but it is already 30% faster than the version using csplit.
#! /bin/sh
# @(#)untarpack 2.1 (UniSB[bs]) 88/03/20
PATH=/usr/ucb:/bin:/usr/bin:/usr/local
if [ $# -lt 1 -o $# -gt 2 ]; then
    echo "Usage: untarpack \"subject-string\"[ your-tarmailbox]"
    exit 1
fi
trap 'echo "untarpack: cancelled"; exit 9' 1 2 3 15
TS=$1;
if [ $# -eq 2 ]; then
    MB=$2
else
    MB=/usr/spool/mail/$USER
fi
if [ ! -s $MB ]; then
    echo "untarpack: no such file: $MB"
    exit 1
fi
rm -f utm.boxfile.???-of-???
echo "starting unpacking now---please wait..."
sed -n -e "/^Subject: $TS - part/,/^---end beef/p" $MB |
    slice "^Subject: $TS - part" 'utm.boxfile.#-02%03d-of-#$%03d'
if [ $? -ne 0 ]; then
    echo "untarpack: slice error"
    exit 2
fi
if [ ! -s utm.boxfile.001-of-??? ]; then
    echo "untarpack: can't find subjects \"$TS\" in file \"$MB\""
    exit 3
fi
FOUND=`ls utm.boxfile.???-of-??? | wc -l`
PACKS=`expr substr utm.boxfile.001-of-??? 20 3`
if [ $FOUND -lt $PACKS ]; then
    FOUND=`expr $FOUND + 0`
    PACKS=`expr $PACKS + 0`
    echo "untarpack: lack of tarmail packets ($FOUND instead of $PACKS)"
    exit 4
elif [ $FOUND -gt $PACKS ]; then
    FOUND=`expr $FOUND + 0`
    PACKS=`expr $PACKS + 0`
    echo "untarpack: packet overrun?!? ($FOUND instead of $PACKS)"
    exit 5
fi
echo '---end beef' > utm.boxfile.000-of-$PACKS
echo -n "Done---do you want to UNTARMAIL the tarmail? [y/n]:"
read answer junk
answer=${answer}x
if expr $answer : '[yY].*x'>/dev/null; then
    echo "OK---UNTARMAILing your tarmail..."
    exec untarmail utm.boxfile.???-of-???
else
    echo 'Use "untarmail utm.boxfile.???-of-???" to reconstruct the TARMAIL'
fi
exit 0
Absorb, apply and enjoy,
Michael
--
snail-mail : Michael Greim,
Universitaet des Saarlandes, FB 10 - Informatik (Dept. of CS),
Bau 36, Im Stadtwald 15, D-6600 Saarbruecken 11, West Germany
E-mail : greim@sbsvax.UUCP