jim@uw-beaver.UUCP (06/24/83)
I'm sure you are tired of hearing about this by now, but since I originated the use of the phrase in shell archives, I thought I'd try to shed some light on its origin. It comes from the "Kool and the Gang" song by the same name ("Can't Get Enough of that Funky Stuff"). After exhaustive research I determined that this phrase was the least likely of all English language phrases to appear in Unix source code. This has proven to be a safe choice, as there have been no reported cases of archives failing because of the appearance of this phrase in the code. Hope this helps to clarify this knotty issue.
smh@mit-eddi.UUCP (Steven M. Haflich) (06/28/83)
However, the extensive use of !FUNKY!STUFF! in all the mail recently floating around the net assures that it is no longer the least likely English (sic ?) phrase to appear in Unix source code, or actually, Unix archives. It's the age-old computer problem of creating meta-tokens to handle meta-tokens ...
dann@wxlvax.UUCP (06/28/83)
What happens to all the netnews archives now that there are all these articles with !FUNKY!STUFF! in them?
dee@cca.UUCP (06/28/83)
Re meta-tokens to handle meta-tokens, you would at least be better off if your software constructed your own private meta-token convolving some obscureness plus your user id and site name. At least that way no practical finite message could screw everyone up ...
leichter@yale-com.UUCP (06/29/83)
This is getting VERY deep..."private meta-token [by] convolving some obscureness plus your user id and site name." If you really want a workable meta-token, make it depend on the input! For example, it's very unlikely that every possible sequence of three upper-case letters occurs in the text. That's 17576 combinations. Set up a table with one bit per combo - only 2197 bytes long - and read the text, ignoring all consecutive triples with at least one non-upper-case letter, and turning on the right bit for any found triples of upper-case letters. When you are done, any zero bit gives you a usable three-letter meta-token. (In the very unlikely case that all the bits are on, you can always go to 3-digit strings, or three lower-case letters, or, if all else fails, move on to 4-digit strings, etc. In fact, for speed you probably should start with 2-digit strings, which will cover 99.9999% of the cases in about the time it takes just to read through the text once.) -- Jerry decvax!yale-comix!leichter leichter@yale
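The extra pass described above might be sketched roughly as follows. This is only an illustration, not code from the thread: `pick_token` is a made-up name, and an awk associative array stands in for the 2197-byte bit table. It slides a three-character window over each input line, records every all-upper-case triple it sees, and then prints the first triple in AAA..ZZZ order that never occurred.

```shell
#!/bin/sh
# pick_token (hypothetical helper, not from the thread): read text on stdin
# and print the first three-upper-case-letter string, in AAA..ZZZ order,
# that occurs nowhere in it. Exits non-zero if all 17576 triples are used.
pick_token() {
    awk '
    BEGIN { L = "ABCDEFGHIJKLMNOPQRSTUVWXYZ" }
    {
        # Slide a 3-character window across the line; triples containing
        # any character that is not an upper-case letter are ignored.
        n = length($0)
        for (i = 1; i + 2 <= n; i++) {
            a = substr($0, i, 1)
            b = substr($0, i + 1, 1)
            c = substr($0, i + 2, 1)
            if (index(L, a) && index(L, b) && index(L, c))
                seen[a b c] = 1
        }
    }
    END {
        # Walk the candidates in order and stop at the first unused one.
        for (i = 1; i <= 26; i++)
            for (j = 1; j <= 26; j++)
                for (k = 1; k <= 26; k++) {
                    t = substr(L, i, 1) substr(L, j, 1) substr(L, k, 1)
                    if (!(t in seen)) { print t; exit 0 }
                }
        exit 1
    }'
}
```

Something like `MAGIC=$(pick_token <somefile)` would then choose a delimiter for that file before a second pass writes it out. The fallbacks to digits or lower-case letters mentioned above are left out of this sketch.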
dee@cca.UUCP (06/29/83)
I can't see how merely processing the input helps at all. Do you mean that if you come across a triple of capital letters that has not occurred earlier in the message, you interpret that as the meta-token (perhaps also requiring it be the first in some ordering of thus far unused triples)? That obviously does not work for the message

    "The address of the Boston office of the AAA (American Automobile Association) is ..."AAB

where AAB was meant to be an input-dependent meta-token terminating the message. Of course the problem is trivially solvable with more complex schemes such as two strings, one of which is an end-of-message and the other of which quotes the immediately following character regardless of what that character is. dee@cca decvax!cca!dee
leichter@yale-com.UUCP (06/29/83)
Re: cca!dee's comments: The whole point of this exercise was to find a text string that did not occur anywhere in a text string to be delimited. I was suggesting an extra pass over the text to be delimited in order to choose a delimiter good FOR THAT WHOLE TEXT STRING. I was NOT suggesting choosing, or changing, the delimiter on the fly. -- Jerry decvax!yale-comix!leichter leichter@yale
jim@uw-beaver.UUCP (06/30/83)
I have risen to the challenge, and created a version of shar which I think solves the problem of the magic word appearing in the text of archived files, for example when archiving archives. The magic word is generated by combining the node name with the date and time of day, producing an identifier that is unique for all time. Portability suffers somewhat, because not all Unixes have the "uuname -l" command (which just returns your own system name). Here it is:

    MAGIC=`(uuname -l; date) | tr ' :\12' '_--'`
    AR=$1
    shift
    echo "# The rest of this file is a shell script which will extract:" >>$AR
    echo "# $*" >>$AR
    for i
    do
        echo a - $i
        echo "echo x - $i" >>$AR
        echo "cat >$i <<'$MAGIC'" >>$AR
        cat $i >>$AR
        echo "$MAGIC" >>$AR
    done
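On the portability point, one possible workaround (an assumption on my part, not something proposed in the thread) is to fall back to `uname -n` where `uuname -l` is missing. The helper name `make_magic` is made up for this sketch, and it joins the node name and date with an underscore rather than relying on tr to translate the newline.

```shell
#!/bin/sh
# make_magic (hypothetical helper): build a here-document terminator from
# the node name plus the current date and time.
make_magic() {
    # Try "uuname -l" first, as in the post; if it is missing or prints
    # nothing, fall back to "uname -n" (an assumption, not from the post).
    node=$(uuname -l 2>/dev/null)
    [ -n "$node" ] || node=$(uname -n)
    # Turn spaces into "_" and colons into "-" so the result is one word.
    printf '%s_%s\n' "$node" "$(date)" | tr ' :' '_-'
}
```

In the script above, ``MAGIC=`make_magic` `` would take the place of the first line. As with the original, uniqueness rests on the node name being unique and on no two archives being made on the same node within the same second.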
chris@umcp-cs.UUCP (06/30/83)
Well, uw-beaver!jim's latest is pretty good. I still prefer using sed over cat; it gets those '.'s past those mailers. Why, just a few months ago I tried to send someone a file with '.' in it, and they kept getting only the first couple of lines. I finally ran the thing through ``makescript'' [don't you get net.sources? Don't you wonder why I didn't use my own program in the first place?] and it got there with no problems.

One other thing: the script you make should say "Run with <foo>". The hereis-document terminators \\\work differently/// in the two shells. For example:

    /bin/cat << 'EOF'
    This message will appear when this script is run.
    'EOF'
    ls -l /dev/tty

If you run that with 'sh', you'll get on your screen:

    This message will appear when this script is run.
    'EOF'
    ls -l /dev/tty

If you run it through 'csh', you'll get:

    This message will appear when this script is run.
    crw-rw-rw- 1 root 2, 0 Jun 30 02:12 /dev/tty

I *wish* the authors of the shells had not made quoted hereis documents act differently. But there you have it, another typical Unix* black-magic-confuse-the-non-wizards trick....

- Chris

*Unix is a trademark of Bell, who didn't want it until half the country was running it. :-)

-- UUCP: {seismo,allegra,brl-bmd}!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris.umcp-cs@UDel-Relay
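The post does not show the sed commands themselves, so the following is only a guess at the general idea: prefix every archived line with a fixed character when packing, and strip it with sed when extracting. No line in the archive can then be a lone '.' or match the terminator. The `X` prefix and the names `protect` and `unprotect` are assumptions made for this sketch.

```shell
#!/bin/sh
# Hypothetical helpers; the X prefix is an assumption, not quoted from the post.

# protect: prefix every input line with X, so that a lone "." becomes "X."
# and a line equal to the terminator becomes "X<terminator>".
protect() { sed 's/^/X/'; }

# unprotect: strip exactly one leading X from each line, restoring the input.
unprotect() { sed 's/^X//'; }
```

An archive built this way would presumably emit something like `sed 's/^X//' >file <<'MAGIC'` in place of the `cat >file <<'MAGIC'` line, with the file's contents passed through `protect` first. Lines that already begin with X survive the round trip, because only one X is added and only one is removed.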