ajc (02/14/83)
Every once in a while some turkey program comes along and leaves a file with some unprintable name. You know it's there because when you do ls you see a file with a name that prints (for instance) ABC?. Yet when you do rm ABC? or even stranger rm ABC*, rm responds "file nonexistent". If the damn file is nonexistent, why does ls list it?? And if ls thinks it's there, why can't rm remove it?? Or as Judy Garland put it so eloquently so many years ago, Birds fly over the rainbow... Why then, oh why can't I. ~e ~e ~e ~e ~e ~e (oh, well, I guess ~e doesn't work either. Guess it's just one of those days!)
guy (02/15/83)
The problem is that the unprintable character has its uppermost bit turned on. "ls" has no trouble because it reads the directory directly and immediately stuffs the pathname through "stat" (and, with the Berkeley "ls" - but NOT other "ls"es - turns the unprintable to "?"), but the "rm ABC*" passes through the shell's star-convention expansion mechanism. If I remember correctly, the Bourne shell (and probably the C shell, and most other UNIX shells) use the uppermost bit to indicate that the character has been quoted (with "", '', or \), and thus have to strip that bit off when they hand the filename to the command. Therefore, "rm" is trying to remove "ABC\001" instead of "ABC\201"; the former does not exist, so it complains. Guy Harris RLG Corporation (...!decvax!duke!mcnc!rlgvax!guy) P.S. The only way I know of getting rid of that file is to write your own little program which executes unlink("ABC\201");.
mcm (02/15/83)
Since many people seem to be interested in the problem of removing files with unprintable characters, I'm sending this through news rather than mail. To remove a file with an unprintable character, we use a program called 'dired'. This program went down net.sources a while ago. It was written by Stuart Cracraft (mclure@sri-unix) with modifications by J. Lepreau (University of Utah). It is a visual editor of sorts
alb (02/16/83)
To get rid of files with unpleasant characters in them, try: rm -ri . This just goes through and asks you if you want to remove each file, one by one, from . on down (takes a while if you are in a top directory)
ken (02/16/83)
The file you have a hard time removing (ABC?) has some kind of unprintable character in it; ls senses that it isn't printable and puts a '?' in place of the unprintable character. On earlier UNIX systems, there used to be a program called "dsw" (a program with questionable etymology), which allowed you to interactively delete files in the current directory. "rm -i" is an attempt to consolidate all the different types of removal programs, but it doesn't fully take the place of dsw, because it doesn't change the unprintable characters to a '?'. Therefore, when you say "rm -i *", you never see "ABC?". One suggestion is to do an "od -c .", and look for the file with the unprintable character; then you can use one of various quoting mechanisms to specifically delete the offending file. Another is to find some sort of pattern that can uniquely describe the file, such as "*BC*". Try it out with "ls" first, then with "rm".
trt (02/16/83)
Strangely named files have resulted in many wasted hours, much confusion, and many uninteresting net articles (such as this one). So please consider the following proposal: UNIX should accept only filenames with printable characters. More specifically, the kernel's namei should verify that all characters in the filename are 'isprint', as defined by <ctype.h>. In my experience, filenames with weird characters are *always* mistakes. (Hmm, sometimes they have been intentional ... and malicious.) I am going to change some local namei()s as an experiment. But, really, what problems could there be compared to the huge nuisance we have currently? Tom Truscott
mat (02/16/83)
Yes, dsw(i) is the way to do it under generic UN*X (i.e. v6). Dsw at one time read the console switches to determine whether or not you wanted a file destroyed! Rather unportable, I'd say. As an aside, dsw comes from a telegrapher's abbreviation for the Russian word for ``so long'', which transliterates as ``dossv'donye'' or ``DoSsW'donye''. This according to a person who got it from a person who overheard it from someone who was reported to be a close friend of ... from someone who claimed to actually have heard it from Ken.
guy (02/17/83)
I agree with you that non-printable characters in filenames are generally useless. I believe 4.2BSD prohibits them; some group of Interlisp hackers or somesuch got upset because they used non-printable characters to simulate the TENEX/TOPS-20/VMS file version number conventions - 4.2BSD with its long pathnames permits you to stick ";nnn" at the end for a version number, but their code had to change. As far as I am concerned, space through tilde is completely adequate. If people get upset about not being able to have a file name ^G^G^G\277xxx, they should consider all the people on systems where filenames can only consist of 9 or 12 upper-case alphanumerics, with a period before the last three characters... Guy Harris RLG Corporation (decvax!duke!mcnc!rlgvax!guy)
grunwald (02/18/83)
#R:rlgvax:-105100:uiucdcs:13700024:000:285 uiucdcs!grunwald Feb 16 11:41:00 1983 Instead of writing a program to do unlink("ABC\201"), you could do a "rm -i *" in the directory and answer no to all the files except the ABC? file. Of course if you have a lot of files in your directory, this could be kind of slow. dirk grunwald university of illinois
dave (02/18/83)
Why not use dsw? That will let you remove any file in the directory. What's that you say? You don't have version 6 UNIX any more? And you thought v7 was progress?? Not afraid to continue running v6 (on our aging 11/45) Dave Sherman, Toronto utcsrgv!dave
guy (02/19/83)
Close, but no cigar, as we say. The UNIX shell will strip off the 8th bit of the file name when it expands the "*", and as such rm will get called by the equivalent of: execl("/bin/rm", "rm", "-i", "ABC\001", (char *)NULL); and will reply "ABC\001 non-existent". Unless the C shell or whatever other shell you are running doesn't strip off the \200, you can't GET a string containing a character with its 8th bit on to a program through the shell. Guy Harris RLG Corporation ...!decvax!mcnc!rlgvax!guy
goldfarb (02/20/83)
I guess the net runs in a pretty tight cycle, since this subject was discussed last summer. In any case, I adopted the following method based on that previous discussion:
1) Do ls -i to find the inode number of the weird file.
2) Do a find to get rid of it, e.g.: find . -inum nnnn -exec rm -f {} \;
Ben Goldfarb Univ. of Cent. Fla. ...!duke!ucf-cs!goldfarb
jcz (02/20/83)
References: rti.1021 My favorite weird filename was 'file1\tfile2' (where file1 and file2 were forgotten but reasonable filenames.) One of my users kept complaining "I can't get rid of file1." (sigh) A vote for trt's fix to namei. --jcz (Why doesn't my news have an editor escape?)
sdo (02/24/83)
Along the lines of files with strange characters in them, a funny thing happened to a member of my group today (at least we thought it was funny). He had a file name with an ampersand in it. He tried to remove it by saying rm *&* If you can't figure out what happens, try it in an unimportant directory. Scott Orshan Bell Labs Piscataway 201-981-3064 houxm!u1100a!sdo
clark.wbst@PARC-MAXC.ARPA (03/23/83)
I don't like the idea... it seems 'anti-UNIX'. I always wrote a one line C program to unlink it. If a global solution is needed, how about a program sort of like dsw... You say "rmi inode-number", and it looks in the current directory for inode-number, gets the string, and deletes it. The user gets the inode number from an ls -i. If you call it with no argument, it could look through the directory for non-typeable file names and ask the user if that one was it, printing it in hex and as a string. This does not kludge up UNIX, and solves the problem. --Ray
cak@Purdue (03/23/83)
From: Christopher A Kent <cak@Purdue> The Unix systems at Purdue EE have had just such a "turkey mode", that prevents you from creating files with unprintables, for years. It's implemented as a stty option! chris
gamma@EDN-UNIX (03/23/83)
From: W. J. Showalter <gamma@EDN-UNIX>
We are still running V6 Unix at EDN so this may not work for other releases. However, I wrote a small program some time ago to take care of removing unprintable characters. I called it "ctrl" and use it as a filter.

ls produces

    prog1.c
    prog2.c
    prog3.c

ls | ctrl will produce

    p r o g 1 . c
    p o r ^H ^H r o g 2 . c
    p r o g 3 . c

The second file name was one of those which had a couple of backspaces embedded in the name. This enables the user to reproduce exactly the name of the file ( assuming of course that the embedded control/non-printable character will not cause adverse reaction by the system ). An example of such a reaction is that of Control P on the DEC 11/70 console which throws the terminal into the soft console mode. Other possibilities are embedded blanks. This can be taken care of by simply enclosing the culprit in quotes ( e.g. rm "file one" "file two" ) Some characters may have to be escaped ( e.g. rm "file\@" ) to remove "file@", assuming that '\' is the escape character.

Here is the ctrl filter.

    main ()
    {
            char C;
            int i;

            putchar (' ');
            while ((C = getchar ()) != 0) {
                    if (C < 040 ) {
                            putchar ('^');
                            putchar (C + 0100);
                            if ( C == 015 || C == 012 )
                                    putchar(C);
                    }
                    else
                            putchar (C);
                    putchar (' ');
            };
            exit();
    }

Comments are welcome. Jim Showalter GAMMA@EDN-UNIX
ras (04/08/83)
Forgive me if I am mis-interpreting the problem, but if it is how to remove a file with a weird (binary,high-bit,control,or other) character in its name, I have been able to deal with things like that with a standard "rm" command. The procedure is to use "rm -ir" on the parent directory of the strangely-named file, answering "no" for each non-offending file. When it comes around to the file under question, then answer "y" and the file will be removed. Granted, this could potentially remove files erroneously (a slip of the finger), or could involve answering a large number of "rm" questions, but for the number of times that it comes up,.... Ralph Shaw decvax!brunix!rayssd!ras
kar (04/08/83)
Rather than write a new program to remove garbage files, we modified "rm" to print control characters in \ddd format. Then, to get rid of a junk name, you need only type "rm -ri .", and answer "n" until the bogus name is reached. This solves the problem without adding yarm (yet another rm), and has the additional advantage of taking a long time if you have a broad directory organization, encouraging users to split files among many directories like they should. Ken Reek, Rochester Institute of Technology ucbvax!allegra!rochester!ritcv!kar
iy47ab (04/11/83)
An addition to the program posted a few days back that could be used with 'ls' to print out escaped characters: adding a #include <stdio.h> and changing the getchar() != 0 to getchar() != EOF will make the program portable to any system. Try it. Lady A