[comp.sources.d] File headers on shar files

chip@ateng.UUCP (Chip Salzenberg) (12/04/87)

In article <393@ddsw1.UUCP> karl@ddsw1.UUCP (Karl Denninger) writes:
>
>First, the map files had no headers on them beyond the normal Usenet
>headers.  [...]
>
>Now, the significance of this is that I can easily feed this article, sans
>headers, to 'sh' for immediate unpacking.

It's easy to unshar any sharchive article -- even one with leading text --
with the following script:

-----------8<----cut here-----8<----------
: unshar
# Extract modules from a shar archive

SEDCMD="1,/^[#:]/d"

case $# in
0)
	sed -e "$SEDCMD" | sh
	;;

*)
	for f
	do
	    sed -e "$SEDCMD" "$f" | sh
	done
	;;
esac
-----------8<----cut here-----8<----------

In rn, just type `|unshar' and you're in business. (Unless the poster had
the bad manners to put a line starting with # or : before the sharchive.)
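
To see what the sed range in the script above does, here is a minimal
sketch. The function names strip_leader and sample_article and the sample
text are made up for illustration. The range deletes everything from line 1
through the first line that begins with # or :, so that marker line is
dropped too and only what follows it reaches sh.

```shell
#!/bin/sh
# strip_leader: hypothetical helper using the same sed range as unshar.
# Deletes line 1 through the first line that starts with '#' or ':'.
strip_leader() {
    sed -e '1,/^[#:]/d' "$@"
}

# A made-up article: two lines of chatter, a marker line, then commands.
sample_article() {
    printf '%s\n' \
        'Here is my program.' \
        'Enjoy!' \
        '#!/bin/sh' \
        'echo unpacked'
}

# Show what would be handed to sh.
sample_article | strip_leader
```

One thing to watch for: sed starts looking for the end of a range on the
line after the one that matched its start. If line 1 itself begins with
# or :, the deletion runs on to the next such line. From rn this should not
come up, since the article headers come first.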

-- 
Chip Salzenberg         "chip@ateng.UUCP"  or  "{codas,uunet}!ateng!chip"
A T Engineering         My employer's opinions are not mine, but these are.
   "Gentlemen, your work today has been outstanding.  I intend to recommend
   you all for promotion -- in whatever fleet we end up serving."   - JTK

rsalz@bbn.com (Rich Salz) (12/04/87)

In turgid prose, someone writes:
#I therefore propose that in the future, when sources (or any other 'shar'
#file) is posted, that the powers that be place any needed 'introduction'
#that precedes the shar into the file in such a way that it will not
#interfere with the attempts of myself (and I bet others as well) to automate
#or simplify this unpacking.  This can be assured by simply insuring that the
#first character of each line is a "#", making it a shell comment line.

#Comments?

Don't look for it in comp.sources.unix, good luck in alt.sources, and I doubt
any of the other moderators will follow.  Here's why:

First, it's a pain to do.  It's not feasible to put the # in while writing
the text (things like "!}fmt -75" won't work) and it's a real pain to
have to put it in before posting -- one more thing to remember...

Second, from the mail I've got, people really want to see introductory info
to see if the source is worth saving, and they want to know *quickly*:-)
The "#" is hard to read -- I quoted the article above, just to make the
point.

Third, there have been a few "unshar" programs posted to the net, including
uuhosts and others for automatically unpacking all the UUCP maps
(safely, via chroot and a feed in your news/sys file), and a couple of
heuristic programs for the general case (including a revision I wrote
for part of a general shar/unshar package).

Finally, there are compatibility problems, when source is shipped
world-wide:  not all shells understand "#" comments (V7 sh), and I've
gotten mail from BITNET where all # lines were lost, including #define
and #include.

I suppose it's time to repost my shar/makekit/unshar utilities... look
for two sets of them in early January (please don't send me mail asking
for it).
	/r$
-- 
For comp.sources.unix stuff, mail to sources@uunet.uu.net.

billw@killer.UUCP (12/05/87)

If you have problems unpacking shar archives, don't like saving them, editing,
whatever, do your very, very best to get a copy of Rich $alz's "unshar"
program. It's packaged with his "shar" programs (arguably the best shar package
around) and automatically does the job of getting rid of headers and
extraneous lines. I can't live without it.
-- 
Bill Wisner, HASA "A" Division		..{codas,ihnp4}!killer!billw
Immanentizing the Eschaton at a site near you.

lwv@n8emr.UUCP (Larry W. Virden) (12/05/87)

I would like to suggest one better!  If you have commentary / descriptions /
etc place it in the shar in such a way that it goes into a file!  I find
that often there is useful info in that part that I don't get to see if
I blindly unshar the program.

As for the referenced article's proposal, if one is fortunate to be on
a Unix-alike system, it is easy to write up a sed | sh pipe to skip 
everything up to the first line with a # and feed the rest to sh.  Thus,
the newly proposed format isn't needed for us.  As for non-unix sites, I would
assume that they already have a C program to unshar the files, since sed
is a product of someone or other attached to ATT/Bell Labs/etc.  

Perhaps the easiest thing to do is for someone to post a copy of a C unshar
that all could use, making the whole discussion a moot point.  If you have
such a creature, would you at least consider putting in an option to save
off the header and other info into a SHAR.INFO file or something?  That way
I keep the address of the moderator/poster, any pre-shar comments, etc.
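
A minimal sketch of that idea in sh and awk rather than C, under stated
assumptions: the function name split_shar is made up, and the archive is
taken to start at the first line beginning with # or :, as in the sed
scripts discussed in this thread. Lines before that point go to the info
file; the archive itself, marker line included, goes to stdout.

```shell
#!/bin/sh
# split_shar: hypothetical helper. Reads an article on stdin, saves the
# lines before the first '#' or ':' line in an info file (default
# SHAR.INFO), and writes the archive, marker line included, to stdout.
split_shar() {
    info=${1:-SHAR.INFO}
    awk -v info="$info" '
        !found && /^[#:]/ { found = 1 }       # first marker line seen
        found             { print; next }     # archive text goes to stdout
                          { print > info }    # preamble goes to the info file
    '
}

# Typical use, left commented out because it runs the archive:
#   split_shar SHAR.INFO < article | sh
```

If the article has no leading text at all, no info file is written.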

Thanks!
-- 
Larry W. Virden	 75046,606 (CIS)
674 Falls Place, Reynoldsburg, OH 43068 (614) 864-8817
cbosgd!n8emr!lwv (UUCP) 	cbosgd!n8emr!lwv@ucbvax.Berkeley.EDU (BITNET)
We haven't inherited the world from our parents, but borrowed it from our children.

brad@cayman.COM (Brad Parker) (12/05/87)

rsalz@bbn.com (Rich Salz):
> In turgid prose, someone writes:
> #I therefore propose that in the future, when sources (or any other 'shar'
> #file) is posted, that the powers that be place any needed 'introduction'
...
> #first character of each line is a "#", making it a shell comment line.
> #Comments?
> Don't look for it in comp.sources.unix, good luck in alt.sources, and I doubt
> any of the other moderators will follow.  Here's why:


I changed a copy of generic "unshar" to automagically handle concatenated
shar files. I can not believe I am the first to want to do this. Are there
any other "already available" shars which already do this? (or should I
post mine?) ps: I'm about to run out and grab a copy of uuhosts. I was
wondering what it did ;-)

pps: Am I the only one who thinks inews has gotten a bit fascist? It rejected
my posting claiming there were more included lines than new lines. yowza.
-- 

Brad Parker
Cayman Systems			"Mama's little baby likes violent sex..."
harvard!cayman!brad			   - from a song I heard on the radio.

caf@omen.UUCP (Chuck Forsberg WA7KGX) (12/07/87)

I've found the following shell script able to "crack" most incoming
shar-bearing news articles.  It's called "unr" and from rn it is:
"w|unr dir" or "69-77w|unr dir".  Every time a moderator comes up with
a new permutation on the shar format I've tweaked unr to keep up with
it, and I haven't had any articles that have failed to unpack lately.
Of course, the usual caveats about unsharing news articles which might
contain nasties still apply.

#!/bin/sh
# to extract, remove the header and type "sh filename"
if test ! -s ./unr
then
echo "Writing ./unr"
cat > ./unr << '\Rogue\Monster\'
pattern="^[:#]"
readme="READ..ME"
mkdir $1
cd $1
sed -e '1,/^[:#][ !]/d' -e 's/^chdir /cd /' >/tmp/unrn.$$
ksh /tmp/unrn.$$
sed "/$pattern/q" </tmp/unrn.$$ >>$readme
pwd;ls -lt|head -15;du
case $0 in
	unr) rm /tmp/unrn.$$
esac
exit

Notes: Create a temp file minus the initial chatter which tends to
bollix shells and unshar programs.  "unr" removes the temp file,
"unrn" keeps it; unrn is a link to this file called unr.
\Rogue\Monster\
else
  echo "will not over write ./unr"
fi
if [ `wc -c ./unr | awk '{printf $1}'` -ne 422 ]
then
echo `wc -c ./unr | awk '{print "Got " $1 ", Expected " 422}'`
fi
echo "Finished archive 1 of 1"
exit

Chuck Forsberg WA7KGX Author of YMODEM, ZMODEM, Professional-YAM, ZCOMM, and DSZ
...!tektronix!reed!omen!caf  Omen Technology Inc "The High Reliability Software"
17505-V Northwest Sauvie Island Road Portland OR 97231  VOICE:503-621-3406:VOICE
    TeleGodzilla BBS: 621-3746 19200/2400/1200  CIS:70007,2304  Genie:CAF
  omen Any ACU 2400 1-503-621-3746 se:--se: link ord: Giznoid in:--in: uucp
  omen!/usr/spool/uucppublic/FILES lists all uucp-able files, updated hourly

bd@hpsemc.UUCP (bob desinger) (12/08/87)

> I changed a copy of generic "unshar" to automagically handle concatenated
> shar files. I can not believe I am the first to want to do this. Are there
> any other "already available" shars which already do this?

Do you mean concatenated shars (created by cat'ing several shar
bundles together) or recursively-bundled shars (created by shar'ing
a shar)?

Concatenated shars can be unpacked by editing out the "exit 0" line
emitted by better shars.  Yes, it's a stupid step that has to be done
by hand, and the creator of the shar bundle should have done that
for you.  Complain fiercely if you get a lot of these.
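
The hand edit can be sketched with sed. This is a blunt illustration, not
anyone's posted tool: the function names strip_exits and two_shars are made
up, and the pattern removes every line that is exactly "exit 0", including
one that happens to sit inside a packed file.

```shell
#!/bin/sh
# strip_exits: hypothetical helper; deletes lines that are exactly "exit 0"
# so that sh keeps reading past the end of each concatenated archive.
strip_exits() {
    sed -e '/^exit 0$/d' "$@"
}

# Two made-up, tiny archives cat'ed together.
two_shars() {
    printf '%s\n' \
        '# archive one' \
        'echo one' \
        'exit 0' \
        '# archive two' \
        'echo two' \
        'exit 0'
}

# Without strip_exits, sh would stop after the first archive.
two_shars | strip_exits | sh
```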

Recursive shars indeed need special handling.  But I'm confused by
your reference to `unshar' in the first line and `shar' in the last
line.  Did you modify the unpacking program `unshar', or the packing
program `shar'?  The Jack Applin shar and the Connoisseur's shar
already do the right thing when you shar a shar file.  If Rich has
adopted my mods, his shar is a superset of the Connoisseur's shar and
Jack Applin's shar.

bob desinger
shar hacker and historian

wohler@milk1.istc.sri.com (Bill Wohler) (12/08/87)

Karl Denninger writes:
>I therefore propose that in the future, when sources (or any other 'shar'
>file) is posted, that the powers that be place any needed 'introduction'
>that precedes the shar into the file in such a way that it will not
>interfere with the attempts of myself (and I bet others as well) to automate
>or simplify this unpacking.  This can be assured by simply insuring that the
>first character of each line is a "#", making it a shell comment line.

  also, if "tarmail" of the compress family had been made the de-facto
  standard of shipping sources, then you wouldn't have to have even
  written your script--atob would ignore all the leading garbage.  

  for those not in the know, tarmail is a filter which essentially does:

	tar | compress | btoa | mail

  compress makes the distributions smaller, which would make uucp sites
  much happier (often cuts source by almost one half). [ab]to[ba] does a 
  better job than uu{en,de}code, and is not setuid to uucp which makes it 
  easier to debinary things.

  if there aren't any machines that compress doesn't run on, perhaps
  shar could be phased out...

  if interested, compress.shar, which includes atob, btoa, tarmail, 
  and compressdir, can be ftped from spam.istc.sri.com. 
  i'd be happy to mail it too.  i encourage other folks who maintain
  compress.shar in their pub directories to update their version (like 
  j.cc.purdue.edu).

						--bw
						wohler@istc.sri.com

allbery@ncoast.UUCP (Brandon Allbery) (12/08/87)

As quoted from <393@ddsw1.UUCP> by karl@ddsw1.UUCP (Karl Denninger):
+---------------
| I therefore propose that in the future, when sources (or any other 'shar'
| file) is posted, that the powers that be place any needed 'introduction'
| that precedes the shar into the file in such a way that it will not
| interfere with the attempts of myself (and I bet others as well) to automate
| or simplify this unpacking.  This can be assured by simply insuring that the
| first character of each line is a "#", making it a shell comment line.
| 
| Thanks for listening! (comments?)
+---------------

Comment 1:  Go ahead, make the Net's day... force me to edit EVERY submission
I get.

Comment (of sorts) 2:

: uns -- unpack shell archives (named or on stdin)
if test -z "$*"; then set x -; shift 1; fi
for arg in "$@"; do
	echo "shar: unpacking ${arg}"
	sed -e '1,/^[:#]/d' "$arg" | sh
done

It never met a "shar" it didn't like.  ;-)
-- 
Brandon S. Allbery		      necntc!ncoast!allbery@harvard.harvard.edu
 {hoptoad,harvard!necntc,cbosgd,sun!mandrill!hal,uunet!hnsurg3}!ncoast!allbery
			Moderator of comp.sources.misc

stephen@alberta.UUCP (Stephen Samuel) (12/09/87)

In article <252@papaya.bbn.com>, rsalz@bbn.com (Rich Salz) writes:
> In turgid prose, someone writes:
 ....
> #that precedes the shar into the file in such a way that it will not
> #interfere with the attempts of myself (and I bet others as well) to automate
> #or simplify this unpacking.  This can be assured by simply insuring that the
> #first character of each line is a "#", making it a shell comment line.
> 
 The following is a quick file to do un-sharing of messages and articles.
It simply eats everything up to (but not including) the first comment 
line (I have yet to see a shar file that doesn't have at least one). 
The rest gets stuffed thru your favorite shell.
 Although the default program to do the screening is sed I have included
an awk version because it seems that the Convergent Mightyframe has some sort
of bug with sed that caused it to EAT CPU TIME (90% system cpu) FOR LUNCH!!!
(hint, hint, CT...).
# -----------------  CUT HERE ----------------- 
# 
# file bin/toshar 
sed -e 's/^>//' -e '/<$/s/<$//' << "++ENDFILE++" > bin/toshar
>PROG=sh
>for i in "$@"
>do
>echo start $i
># awk  -e '/^[#:]/,0==1'  $i | $PROG
>sed  -n -e '/^[#:]/,$p'  $i | $PROG
>echo done $i
>done
++ENDFILE++

dag@chinet.UUCP (Daniel A. Glasser) (12/10/87)

In article <10908@sri-spam.istc.sri.com> wohler@milk1.istc.sri.com.UUCP (Bill Wohler) writes:
+[some stuff deleted]
+  also, if "tarmail" of the compress family had been made the de-facto
+  standard of shipping sources, then you wouldn't have to have even
+  written your script--atob would ignore all the leading garbage.  
+
+  for those not in the know, tarmail is a filter which essentially does:
+
+	tar | compress | btoa | mail
+
+  compress makes the distributions smaller, which would make uucp sites
+  much happier (often cuts source by almost one half). [ab]to[ba] does a 
+  better job than uu{en,de}code, and is not setuid to uucp which makes it 
+  easier to debinary things.
+
+  if there aren't any machines that compress doesn't run on, perhaps
+  shar could be phased out...
+[more stuff deleted]
+						--bw
+						wohler@istc.sri.com

The main problem with this suggestion is that compress does not work
the same on all systems.  I have a UNIX system that only allows up to
12 bits for compression.  Many systems support more bits.  The default
is the max, thus if I get a compressed file from, say, a 3B2, I cannot
decompress it.  If somebody could post a version of compress that would
give the same number of bits for all systems, at least for decompression,
this problem would be lessened.

(My UNIX box is a Z8001 based system, so doing huge model is not an
option, also, PDP-11's have the same trouble.)


-- 
					Daniel A. Glasser
					...!ihnp4!chinet!dag
					...!ihnp4!mwc!dag
					...!ihnp4!mwc!gorgon!dag
	One of those things that goes "BUMP!!! (ouch!)" in the night.

ewiles@netxcom.UUCP (Edwin Wiles) (12/10/87)

In article <10908@sri-spam.istc.sri.com> (Bill Wohler) writes:
>  also, if "tarmail" of the compress family had been made the de-facto
>  standard of shipping sources, then you wouldn't have to have even
>  written your script--atob would ignore all the leading garbage.  
>
>  for those not in the know, tarmail is a filter which essentially does:
>
>	tar | compress | btoa | mail
>
	I'm sorry, but I pray to god that this does NOT become the standard.
	I prefer to be able to look the stuff over BEFORE I decide to use
	up my disk space on it.  Compress, or anything else which mucks the
	source into an unreadable form, is completely unacceptable.

>  compress makes the distributions smaller, which would make uucp sites
>  much happier (often cuts source by almost one half).

	Sorry to inform you, but compressed text, which is then recompressed
	usually ends up MUCH LARGER.  (I've checked.)  Thus you're better off
	not compressing it since news, etc. is usually traded between
	systems in a compressed format anyway.  (It is on our system.)
	And something that will *really* tick people off is if you start
	running up their communications costs.

>						wohler@istc.sri.com
-- 
...!hadron\   "Who?... Me?... WHAT opinions?!?" | Edwin Wiles
  ...!sundc\   Schedule: (n.) An ever changing	| NetExpress Comm., Inc.
   ...!pyrdc\			  nightmare.	| 1953 Gallows Rd. Suite 300
    ...!uunet!netxcom!ewiles			| Vienna, VA 22180

allbery@ncoast.UUCP (Phil Smith) (12/10/87)

As quoted from <505@cayman.COM> by brad@cayman.COM (Brad Parker):
+---------------
| pps: Am I the only one who thinks inews has gotten a bit fascist? It rejected
| my posting claiming there were more included lines than new lines. yowza.
+---------------

Sites running ncoast's variant of 2.11 only get the fascist stuff if they
put a line of the form

g/\/\* #define ART_FASCIST/s/.../

in localize.sh.  Need I say any more?
-- 
Brandon S. Allbery		      necntc!ncoast!allbery@harvard.harvard.edu
 {hoptoad,harvard!necntc,cbosgd,sun!mandrill!hal,uunet!hnsurg3}!ncoast!allbery
			Moderator of comp.sources.misc

rick@pcrat.UUCP (Rick Richardson) (12/10/87)

In article <966@pembina.UUCP> obed!stephen@alberta.UUCP (Stephen Samuel) writes:
>It simply eats everything up to (but not including) the first comment 
>line (I have yet to see a shar file that doesn't have at least one). 

They are rare, but yes, the occasional shar has no '#' to key on.

I'd suggest that the posting software ensure that there's at least a line
starting with '#' before allowing the posting to a sources group.
Then, these simple 'sed' scripts (which I've been using for <too long> myself)
will suffice for unsharing from <favorite news reader>.
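
A check of that kind could be as small as the sketch below. The function
name has_marker is made up, and nothing here is taken from any actual
posting software; it only tests whether some line of the submission starts
with '#'.

```shell
#!/bin/sh
# has_marker: hypothetical pre-posting check. Succeeds (exit status 0) if
# at least one line of the input starts with '#', fails otherwise.
has_marker() {
    grep -q '^#' "$@"
}

# Typical use, with a made-up file name:
#   has_marker submission.shar || echo "no '#' line; sed unshar will fail"
```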

But, what I'd really like, is to say YES, OH, YES.  And YES I said,
YES, please put that whole thing over on floppy someday, but not
today, 'cause I'm sick and tired of 'unshar; format -n /dev/fx0; tar ...',
but only if I've gotten all the pieces. P.S. Mr. Santa, decide what
size floppy while you're at it.

I want a 'later' script.  That snags the N files (when they finally
arrive) and reminds me to just stick a lousy floppy into the
drive... it'll do the rest.  One program, one disk.
-- 
	Rick Richardson, President, PC Research, Inc.
(201) 542-3734 (voice, nights)   OR   (201) 834-1378 (voice, days)
		seismo!uunet!pcrat!rick

daveb@geac.UUCP (David Collier-Brown) (12/10/87)

In turgid prose, someone writes:
#I therefore propose that in the future, when sources (or any other 'shar'
#file) is posted, that the powers that be place any needed 'introduction'
#that precedes the shar into the file in such a way that it will not
#interfere with the attempts of myself (and I bet others as well) to automate
#or simplify this unpacking.  This can be assured by simply insuring that the
#first character of each line is a "#", making it a shell comment line.

#Comments?

	Write a "domail" program, which discards lines until it hits
/^#/ or /^#! *\//
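
A minimal sketch of such a filter, written as a sed one-liner rather than a
separate program (that substitution is mine). Only the /^#/ pattern is
used, since any line matching /^#! *\// also matches /^#/. Unlike a
'1,/^#/d' range, printing from the first match keeps the marker line
itself and still works when that line is line 1.

```shell
#!/bin/sh
# domail: sketch of the filter described above. Discards lines until the
# first one that starts with '#', then prints that line and all the rest.
domail() {
    sed -n -e '/^#/,$p' "$@"
}

# Typical use, left commented out because it runs the archive:
#   domail < article | sh
```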

 --dave (it all fits on one screen) c-b
-- 
 David Collier-Brown.                 {mnetor|yetti|utgpu}!geac!daveb
 Geac Computers International Inc.,   |  Computer Science loses its
 350 Steelcase Road,Markham, Ontario, |  memory (if not its mind)
 CANADA, L3R 1B3 (416) 475-0525 x3279 |  every 6 months.

rmtodd@uokmax.UUCP (Richard Michael Todd) (12/12/87)

In article <1955@chinet.UUCP> dag@chinet.UUCP (Daniel A. Glasser) writes:
>The main problem with this suggestion is that compress does not work
>the same on all systems.  I have a UNIX system that only allows up to
>12 bits for compression.  Many systems support more bits.  The default
>is the max, thus if I get a compressed file from, say, a 3B2, I cannot
>decompress it.  If somebody could post a version of compress that would
>give the same number of bits for all systems, at least for decompression,
>this problem would be lessened.
The max number of bits that a version of compress can handle is limited by
how much memory you have available.  However, a version capable of 16-bit
can read compressed files created by versions capable of 12-bit only.  The
problem is how to keep the versions of compress capable of better than
12-bit performance from using it and confusing the restricted versions. 
Fortunately compress has an option -b<n> which allows you to specify 
the maximum bit code compress will use.  Thus compressing a file with the
-b12 option should allow the compressed file to be read on any machine. 
(I know of no machine that is restricted to 11-bits, so 12-bit compression
should be universally acceptable.)
>(My UNIX box is a Z8001 based system, so doing huge model is not an
>option, also, PDP-11's have the same trouble.)
Huge model can't be used by my PC under MINIX, either.  But as long as I
remember to use a suitable -b option when compressing files on uokmax, I
have no problems transporting compressed files in either direction.  
--------------------------------------------------------------------------
Richard Todd
USSnail:820 Annie Court,Norman OK 73069
UUCP: {allegra!cbosgd|ihnp4}!occrsh!uokmax!rmtodd

dudek@ubglue.ksr.com (Glen Dudek) (12/13/87)

I, too, have found 'shar's which do not start with a '#' comment,
but have caught all of those so far by also checking for 'echo'
at the beginning of a line.  As for concatenated shars, I usually end
up with these in a mailbox, so I check for 'exit' or a good old Unix-style
'From ' line to terminate the current shar.  Works great.
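
Those heuristics can be sketched in awk as follows. This is a guess at the
shape of such a filter, not the actual script: the function names
shar_lines and sample_mbox and the sample mailbox are made up. A shar
starts at the first line beginning with #, : or echo, and ends at a line
beginning with exit or at a 'From ' line. The exit lines are dropped so
that everything printed can go through a single sh.

```shell
#!/bin/sh
# shar_lines: hypothetical filter. Prints only the lines that fall inside
# a shar, judged by the heuristics described above.
shar_lines() {
    awk '
        /^From /                  { inside = 0; next }  # mbox separator ends a shar
        inside && /^exit/         { inside = 0; next }  # exit ends it and is dropped
        !inside && /^([#:]|echo)/ { inside = 1 }        # comment or echo starts one
        inside                    { print }
    ' "$@"
}

# A made-up mailbox holding two tiny shars with chatter around them.
sample_mbox() {
    printf '%s\n' \
        'From alice Mon Dec  7 10:00:00 1987' \
        'Subject: part one' \
        '' \
        'Some chatter.' \
        'echo x - one' \
        'echo one' \
        'exit 0' \
        'trailing signature' \
        'From bob Tue Dec  8 11:00:00 1987' \
        '' \
        '# This is a shell archive' \
        'echo two'
}

sample_mbox | shar_lines | sh
```

Being a heuristic, it can be fooled by prose that happens to start a line
with "echo", or by a packed file containing a line that starts with "exit".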

	Glen Dudek
	ksr!dudek@harvard.harvard.edu

rustcat@russell.STANFORD.EDU (Vallury Prabhakar) (12/22/87)

I know of a certain program called "unshar" which tries to extract files
from shar archives intelligently.  I'm reasonably sure that it's public-
domain, and am willing to post it if people would like me to.


								-- V

--
> (get life)				E-Mail :  rustcat@russell.stanford.edu
>>Error: LIFE has no global value		  

:C    Try evaluating LIFE again
->