[net.followup] !FUNKY!STUFF!

jim@uw-beaver.UUCP (06/24/83)

I'm sure you are tired of hearing about this by now, but since I
originated the use of the phrase in shell archives, I thought I'd try
to shed some light on its origin.

It comes from the "Kool and the Gang" song by the same name ("Can't Get
Enough of that Funky Stuff").  After exhaustive research I determined
that this phrase was the least likely of all English language phrases
to appear in Unix source code.  This has proven to be a safe choice, as
there have been no reported cases of archives failing because of the
appearance of this phrase in the code.

Hope this helps to clarify this knotty issue.

smh@mit-eddi.UUCP (Steven M. Haflich) (06/28/83)

However, the extensive use of
!FUNKY!STUFF!
in all the mail recently floating around the net assures that it is
no longer the least likely English (sic ?) phrase to appear in Unix
source code, or actually, Unix archives.  It's the age-old computer
problem of creating meta-tokens to handle meta-tokens ...

dann@wxlvax.UUCP (06/28/83)

What happens to all the netnews archives now that there
are all these articles with !FUNKY!STUFF! in them?

dee@cca.UUCP (06/28/83)

Re meta-tokens to handle meta-tokens, you would at least be better off
if your software constructed your own private meta-token convolving
some obscureness plus your user id and site name.  At least that way
no practical finite message could screw everyone up ...

leichter@yale-com.UUCP (06/29/83)

This is getting VERY deep..."private meta-token [by] convolving some
obscureness plus your user id and site name."  If you really want a workable
meta-token, make it depend on the input!  For example, it's very unlikely that
every possible sequence of three upper-case letters occurs in the text.
That's 17576 combinations.  Set up a table with one bit per combo - only
2197 bytes long - and read the text, ignoring all consecutive triples with
at least one non-upper-case letter, and turning on the right bit for any
found triples of upper-case letters.  When you are done, any zero bit gives
you a useable three-letter meta-token.  (In the very unlikely case that all
the bits are on, you can always go to 3-digit strings, or three lower-case
letters, or, if all else fails, move on to 4-digit strings, etc.  In fact,
for speed you probably should start with 2-digit strings, which will cover
99.9999% of the cases in about the time it takes just to read through the
text once.)
						-- Jerry
				decvax!yale-comix!leichter leichter@yale
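
The extra pass Jerry describes can be sketched in shell, with awk doing the
scan.  This is only an illustration, not anyone's posted code: the function
name pick_delim is made up here, and an awk associative array stands in for
the 2197-byte bit table.  It marks every run of three upper-case letters
found within a line and prints the first triple, in alphabetical order, that
never occurred.

```shell
# pick_delim FILE
# Print a three-upper-case-letter string that occurs nowhere in FILE.
# Returns non-zero in the unlikely case that all 17576 triples are used.
pick_delim() {
    LC_ALL=C awk '
    {
        # Slide a three-character window along the line and record
        # every window made up entirely of upper-case letters.
        n = length($0)
        for (i = 1; i + 2 <= n; i++) {
            t = substr($0, i, 3)
            if (t ~ /^[A-Z][A-Z][A-Z]$/)
                seen[t] = 1
        }
    }
    END {
        # Walk AAA, AAB, ... ZZZ and stop at the first unused triple.
        letters = "ABCDEFGHIJKLMNOPQRSTUVWXYZ"
        for (a = 1; a <= 26; a++)
            for (b = 1; b <= 26; b++)
                for (c = 1; c <= 26; c++) {
                    t = substr(letters, a, 1) substr(letters, b, 1) substr(letters, c, 1)
                    if (!(t in seen)) {
                        print t
                        exit
                    }
                }
        exit 1
    }' "$1"
}
```

For a file whose only upper-case triple is AAA, this prints AAB.  That is
safe because the whole text was scanned before the choice was made.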

dee@cca.UUCP (06/29/83)

I can't see how merely processing the input helps at all.  Do you mean
that if you come across a triple of capital letters that has not occurred
earlier in the message, you interpret that as the meta-token (perhaps
also requiring it be the first in some ordering of thus far unused
triples)?  That obviously does not work for the message

	"The address of the Boston office of the AAA (American Automobile
Association) is ..."AAB

Where AAB was meant to be an input dependent meta-token terminating
the message.

Of course the problem is trivially solvable with more complex schemes
such as two strings, one of which is an end-of-message and the other
of which quotes the immediately following character regardless of what
that character is.
						dee@cca
						decvax!cca!dee

leichter@yale-com.UUCP (06/29/83)

Re:  cca!dee's comments:

The whole point of this exercise was to find a text string that did not occur
anywhere in a text string to be delimited.  I was suggesting an extra pass over
the text to be delimited in order to choose a delimiter good FOR THAT WHOLE TEXT
STRING.  I was NOT suggesting choosing, or changing, the delimiter on the fly.
								-- Jerry
					decvax!yale-comix!leichter leichter@yale

jim@uw-beaver.UUCP (06/30/83)

I have risen to the challenge, and created a version of shar which I
think solves the problem of the magic word appearing in the text of
archived files, for example when archiving archives.  The magic word is
generated by combining the node name with the date and time of day,
producing an identifier that is unique for all time.  Portability
suffers somewhat, because not all Unixes have the "uuname -l" command
(which just returns your own system name).

Here it is:

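# Build a terminator from this node's name plus the date and time,
# mapping spaces to '_' and colons and newlines to '-'.
MAGIC=`(uuname -l; date) | tr ' :\12' '_--'`
AR=$1	# first argument: the archive to write (appended to)

shift	# the remaining arguments are the files to archive
echo "# The rest of this file is a shell script which will extract:" >>"$AR"
echo "# $*" >>"$AR"
for i do
	echo "a - $i"
	echo "echo x - $i" >>"$AR"
	# Quoting the terminator stops sh from expanding the file's text.
	echo "cat >$i <<'$MAGIC'" >>"$AR"
	cat "$i" >>"$AR"
	echo "$MAGIC" >>"$AR"
done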
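
A node name plus a timestamp is extremely unlikely to appear in a file, but
a script can also check rather than assume.  The following sketch uses grep
to test whether any file contains a line exactly equal to the proposed
terminator.  magic_is_safe is a hypothetical helper, not part of the script
above, and it assumes a grep that accepts -F, -x and -q (as POSIX grep does).

```shell
# magic_is_safe MAGIC FILE...
# Succeed only if no FILE contains a line exactly equal to MAGIC.
magic_is_safe() {
    magic=$1
    shift
    for f do
        # -F: fixed string, -x: match the whole line, -q: no output
        if grep -F -x -q -e "$magic" "$f"; then
            return 1
        fi
    done
    return 0
}

# Possible use, after the archive name has been shifted off:
#   magic_is_safe "$MAGIC" "$@" || { echo "terminator collides" >&2; exit 1; }
```

Only whole-line matches are tested, since a here-document ends only on a
line consisting of the terminator and nothing else.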

chris@umcp-cs.UUCP (06/30/83)

Well, uw-beaver!jim's latest is pretty good.  I still prefer using
sed over cat; it gets those '.'s past those mailers.  Why, just a
few months ago I tried to send someone a file with '.' in it, and
they kept getting only the first couple of lines.  I finally ran
the thing through ``makescript'' [don't you get net.sources?  Don't
you wonder why I didn't use my own program in the first place?] and
it got there with no problems.
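
Chris's makescript is not shown in this thread, so the following is only a
guess at the general technique: prefix every archived line with a character
on the way in, and strip it with sed on the way out.  The function name
emit_member and the choice of X as the prefix are assumptions made for this
sketch.

```shell
# emit_member FILE MAGIC
# Write to standard output the archive fragment that recreates FILE.
# MAGIC is the here-document terminator; it must not begin with X.
emit_member() {
    file=$1
    magic=$2
    echo "echo x - $file"
    # On extraction, sed strips the leading X from each line.
    echo "sed 's/^X//' >$file <<'$magic'"
    # On archiving, every line of the file gets an X in front.
    sed 's/^/X/' "$file"
    echo "$magic"
}
```

Since every payload line then begins with X, no line in the archive can be
a lone '.', and none can equal a terminator that does not itself begin
with X.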

One other thing: the script you make should say "Run with <foo>".
The hereis-document terminators \\\work differently/// in the two
shells.  For example:

	/bin/cat << 'EOF'
	This message will appear
	when this script is run.
	'EOF'
	ls -l /dev/tty

If you run that with 'sh', you'll get on your screen:

	This message will appear
	when this script is run.
	'EOF'
	ls -l /dev/tty

If you run it through 'csh', you'll get:

	This message will appear
	when this script is run.
	crw-rw-rw- 1 root      2,  0 Jun 30 02:12 /dev/tty

I *wish* the authors of the shells had not made quoted hereis documents
act differently.  But there you have it, another typical Unix*
black-magic-confuse-the-non-wizards trick....
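
For sh alone, the effect of quoting the word after << can be seen in a short
sketch (heredoc_demo and the variable who are names invented for it).  The
body is ended by the bare word in both cases, and the quotes only decide
whether $who is expanded.  It makes no attempt to show csh.

```shell
# heredoc_demo
# Print one unquoted and one quoted here-document under sh.
heredoc_demo() {
    who=world
    # Unquoted word: the body undergoes variable expansion.
    cat <<EOF
unquoted: hello $who
EOF
    # Quoted word: the body is taken literally, yet the terminator
    # line is still the bare word EOF, without the quotes.
    cat <<'EOF'
quoted: hello $who
EOF
}
```

Run under sh, heredoc_demo prints "unquoted: hello world" followed by
"quoted: hello $who".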

				- Chris

*Unix is a trademark of Bell, who didn't want it until half the country
was running it. :-)
-- 
UUCP:	{seismo,allegra,brl-bmd}!umcp-cs!chris
CSNet:	chris@umcp-cs
ARPA:	chris.umcp-cs@UDel-Relay