[net.unix-wizards] rm abc*

ajc (02/14/83)

Every once in a while some turkey program comes along and leaves
a file with some unprintable name.

You know it's there because when you do ls you see a file with a
name that prints (for instance) ABC?.

Yet when you do rm ABC? or even stranger rm ABC*, rm responds
"file nonexistent".

If the damn file is nonexistent, why does ls list it??  And if ls
thinks it's there, why can't rm remove it??  Or as Judy Garland
put it so eloquently so many years ago,
   Birds fly over the rainbow...
   Why then, oh why can't I.
~e
~e
~e
                     
~e
~e
~e
(oh, well, I guess ~e doesn't work either.  Guess it's just one
of those days!)

guy (02/15/83)

The problem is that the unprintable character has its uppermost bit turned
on.  "ls" has no trouble because it reads the directory directly and immediately
stuffs the pathname through "stat" (and, with the Berkeley "ls" - but NOT other
"ls"es - turns the unprintable to "?"), but the "rm ABC*" passes through the
shell's star-convention expansion mechanism.  If I remember correctly, the
Bourne shell (and probably the C shell, and most other UNIX shells) uses the
uppermost bit to indicate that the character has been quoted (with "", '', or
\), and thus have to strip that bit off when they hand the filename to the
command.  Therefore, "rm" is trying to remove "ABC\001" instead of "ABC\201";
the former does not exist, so it complains.
					Guy Harris
					RLG Corporation
				(...!decvax!duke!mcnc!rlgvax!guy)
P.S. The only way I know of getting rid of that file is to write your own
little program which executes unlink("ABC\201");.

mcm (02/15/83)

Since many people seem to be interested in the problem of removing
files with unprintable characters, I'm sending this through news
rather than mail.

To remove a file with an unprintable character, we use a program called
'dired'.   This program went down net.sources a while ago.  It was
written by Stuart Cracraft (mclure@sri-unix) with modifications by
J. Lepreau (University of Utah).  It is a visual editor of sorts

alb (02/16/83)

To get rid of files with unpleasant characters in them, try:

rm -ri .

This just goes through and asks you whether you want to remove each file,
one by one, from . on down (takes a while if you are in a top directory)

ken (02/16/83)

The file you have a hard time removing (ABC?) has some kind of unprintable
character in it; ls senses that it isn't printable and puts a '?' in place of
the unprintable character.  On earlier UNIX systems, there used to be a program
called "dsw" (a program with questionable etymology), which allowed you to
interactively delete files in the current directory.  "rm -i" is an attempt
to consolidate all the different types of removal programs, but it doesn't
fully take the place of dsw, because it doesn't change the unprintable
characters to a '?'.  Therefore, when you say "rm -i *", you never see "ABC?".

One suggestion is to do an "od -c .", and look for the file with the
unprintable character; then you can use one of various quoting mechanisms
to specifically delete the offending file.

Another is to find some sort of pattern that can uniquely describe the file,
such as "*BC*".  Try it out with "ls" first, then with "rm".

trt (02/16/83)

Strangely named files have resulted in many wasted hours, much confusion,
and many uninteresting net articles (such as this one).
So please consider the following proposal:

	UNIX should accept only filenames with printable characters.

More specifically, the kernel's namei should verify that all characters
in the filename are 'isprint', as defined by <ctype.h>.
In my experience, filenames with weird characters are *always* mistakes.
(Hmm, sometimes they have been intentional ... and malicious.)

I am going to change some local namei()s as an experiment.
But, really, what problems could there be compared
to the huge nuisance we have currently?
	Tom Truscott
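
The check proposed above can be sketched in user space as follows. This is
only an illustration, not kernel code: name_is_printable is a hypothetical
function name, and the sketch assumes the default "C" locale, where isprint
accepts exactly space through tilde.

```c
#include <ctype.h>

/* Hypothetical user-space version of the proposed check: return 1 if
 * every byte of the name satisfies isprint(), 0 otherwise.  In the "C"
 * locale this rejects control characters and bytes with the high bit set. */
int name_is_printable(const char *name)
{
    const unsigned char *p;

    for (p = (const unsigned char *)name; *p != '\0'; p++) {
        if (!isprint(*p))
            return 0;
    }
    return 1;
}
```

The cast to unsigned char matters: passing a negative char value to
isprint is undefined, and high-bit bytes are exactly the case of interest.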

mat (02/16/83)

Yes, dsw(i) is the way to do it under generic UN*X (i.e. v6).  Dsw at one time read the
console switches to determine whether or not you wanted a file destroyed!
Rather unportable, I'd say.
As an aside, dsw comes from a telegrapher's abbreviation for the Russian word
for ``so long'', which transliterates ``dossv'donye'' or ``DoSsW'donye''.
This according to a person who got it from a person who overheard it from
someone who was reported to be a close friend of ... from someone who
claimed to actually have heard it from Ken.

guy (02/17/83)

I agree with you that non-printable characters in filenames are generally
useless.  I believe 4.2BSD prohibits them; some group of Interlisp hackers
or somesuch got upset because they used non-printable characters to simulate
the TENEX/TOPS-20/VMS file version number conventions - 4.2BSD with its
long pathnames permits you to stick ";nnn" at the end for a version number,
but their code had to change.

As far as I am concerned, space through tilde is completely adequate.  If
people get upset about not being able to have a file name ^G^G^G\277xxx,
they should consider all the people on systems where filenames can only
consist of 9 or 12 upper-case alphanumerics, with a period before the last
three characters...

					Guy Harris
					RLG Corporation
				(decvax!duke!mcnc!rlgvax!guy)

grunwald (02/18/83)

uiucdcs!grunwald    Feb 16 11:41:00 1983

   Instead of writing a program to do unlink("ABC\201"), you could do a
"rm -i *" in the directory and answer no to all the files except the
ABC? file. Of course if you have a lot of files in your directory, this
could be kind of slow.

					dirk grunwald
					university of illinois

dave (02/18/83)

Why not use dsw? That will let you remove any file in the directory.
What's that you say? You don't have version 6 UNIX any more?
And you thought v7 was progress??

			Not afraid to continue running v6
				(on our aging 11/45)
					Dave Sherman, Toronto
					utcsrgv!dave

guy (02/19/83)

Close, but no cigar, as we say.  The UNIX shell will strip off the 8th bit
of the file name when it expands the "*", and as such rm will get called
by the equivalent of:

	execl("/bin/rm", "rm", "-i", "ABC\001", (char *)NULL);

and will reply "ABC\001 non-existent".  Unless the C shell or whatever other
shell you are running doesn't strip off the \200, you can't GET a string
containing a character with its 8th bit on to a program through the shell.

					Guy Harris
					RLG Corporation
					...!decvax!mcnc!rlgvax!guy
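
The effect described above can be reproduced with a few lines of C. This is
a sketch under the assumption that the shell simply masks every byte with
0177 before passing it on; strip_high_bits is a hypothetical name used only
for illustration.

```c
#include <stddef.h>

/* Hypothetical illustration of the stripping described above: clear the
 * 0200 bit of every byte in place, as the shell is said to do before it
 * hands a filename to a command. */
void strip_high_bits(char *s)
{
    size_t i;

    for (i = 0; s[i] != '\0'; i++)
        s[i] = (char)((unsigned char)s[i] & 0177);
}
```

Applied to "ABC\201" this yields "ABC\001", which is the name rm ends up
looking for and failing to find.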

goldfarb (02/20/83)

I guess the net runs in a pretty tight cycle, since this subject
was discussed last summer.  In any case, I adopted the following
method based on that previous discussion:

	1)  Do ls -i to find the inode number of the weird file.
	2)  Do a find to get rid of it, e.g.:
	       find . -inum nnnn -exec rm -f {} \;

				Ben Goldfarb
				Univ. of Cent. Fla.
				...!duke!ucf-cs!goldfarb

jcz (02/20/83)

References: rti.1021


My favorite weird filename was 'file1\tfile2' (where file1 and file2 were
forgotten but reasonable filenames.)    One of my users kept complaining
"I can't get rid of file1."    (sigh)   A vote for trt's fix to namei.

--jcz
(Why doesn't my news have an editor escape?)

sdo (02/24/83)

Along the lines of files with strange characters in them, a funny thing
happened to a member of my group today (at least we thought it was
funny).  He had a file name with an ampersand in it.  He tried to remove
it by saying    rm *&*

If you can't figure out what happens, try it in an unimportant
directory.

			Scott Orshan
			Bell Labs Piscataway
			201-981-3064
			houxm!u1100a!sdo

clark.wbst@PARC-MAXC.ARPA (03/23/83)

I don't like the idea... it seems 'anti-UNIX'. 

I always wrote a one line C program to unlink it.

If a global solution is needed, how about a program sort of like dsw...
You say "rmi inode-number", and it looks in the current directory for
inode-number, gets the string, and deletes it.  The user gets the inode
number from an ls -i.  If you call it with no argument, it could look
through the directory for non-typeable file names and ask the user if
that one was it, printing it in hex and as a string.  This does not
kludge up UNIX, and solves the problem.

					--Ray

cak@Purdue (03/23/83)

From:  Christopher A Kent <cak@Purdue>

The Unix systems at Purdue EE have had just such a "turkey mode", which
prevents you from creating files with unprintables, for years. It's
implemented as a stty option!

chris

gamma@EDN-UNIX (03/23/83)

From:  W. J. Showalter <gamma@EDN-UNIX>

We are still running V6 Unix at EDN so this may not work for other releases.
However, I wrote a small program some time ago to take care of removing
unprintable characters.  I called it "ctrl" and use it as a filter.

ls

produces

prog1.c
prog2.c
prog3.c


ls | ctrl

will produce

p r o g 1 . c
p o r ^H ^H r o g 2 . c
p r o g 3 . c

The second file name was one of those which had a couple of backspaces
embedded in the name.  This enables the user to reproduce the name of the
file exactly (assuming, of course, that the embedded control/non-printable
character will not cause an adverse reaction by the system).  An example of
such a reaction is that of Control P on the DEC 11/70 console, which
throws the terminal into the soft console mode.

Other possibilities are embedded blanks.  These can be taken care of by
simply enclosing the culprit in quotes (e.g. rm "file one" "file two").

Some characters may have to be escaped ( e.g. rm "file\@" ) to remove "file@",
assuming that '\' is the escape character.


Here is the ctrl filter.

main ()
{
    char    C;
    int     i;
    putchar (' ');
    while ((C = getchar ()) != 0)
    {
	if (C < 040 )
	{
	    putchar ('^');
	    putchar (C + 0100);
	    if ( C == 015 || C == 012 )
		putchar(C);
	}
	else
	    putchar (C);
	putchar (' ');
    };
    exit();
}


Comments are welcome.

Jim Showalter   GAMMA@EDN-UNIX

ras (04/08/83)

Forgive me if I am mis-interpreting the problem, but if it
is how to remove a file with a weird (binary,high-bit,control,or other)
character in its name, I have been able to deal with things like
that with a standard "rm" command.
	The procedure is to use "rm -ir" on the parent directory
of the strangely-named file, answering "no" for each non-offending
file.  When it comes around to the file under question, then answer "y"
and the file will be removed.
	Granted, this could potentially remove files erroneously (a
slip of the finger), or could involve answering a large number of
"rm" questions, but for the number of times that it comes up,....

Ralph Shaw			decvax!brunix!rayssd!ras

kar (04/08/83)

	Rather than write a new program to remove garbage files, we modified
"rm" to print control characters in \ddd format.  Then, to get rid of a junk
name, you need only type "rm -ri .", and answer "n" until the bogus name is
reached.

	This solves the problem without adding yarm (yet another rm), and has
the additional advantage of taking a long time if you have a broad directory
organization, encouraging users to split files among many directories like
they should.

	Ken Reek, Rochester Institute of Technology
	ucbvax!allegra!rochester!ritcv!kar
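
The modified rm itself was not posted, but printing a name in \ddd format
can be sketched along these lines. print_escaped is a hypothetical helper
name, and the choice to escape every byte outside printable ASCII is an
assumption about what "control characters" covered.

```c
#include <ctype.h>
#include <stdio.h>

/* Hypothetical helper: write `name` to `out`, replacing every byte that
 * is not printable ASCII with a backslash and three octal digits. */
void print_escaped(FILE *out, const char *name)
{
    const unsigned char *p;

    for (p = (const unsigned char *)name; *p != '\0'; p++) {
        if (*p < 0200 && isprint(*p))
            putc(*p, out);
        else
            fprintf(out, "\\%03o", (unsigned)*p);
    }
}
```

With this, the troublesome name from the start of the thread would be shown
as ABC\201 in the rm -i prompt instead of being sent raw to the terminal.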

iy47ab (04/11/83)

An addition to the program posted a few days back that could be used
with 'ls' to print out escaped characters:

adding a 
#include <stdio.h>

and changing the
getchar() != 0
to 
getchar() != EOF

will make the program portable to any system.

Try it.

Lady A
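
With both changes applied, the filter might look like the sketch below. One
further change is an assumption of this sketch rather than part of either
post: the character is held in an int, since a char cannot reliably be
compared against EOF. The logic is wrapped in a hypothetical function,
ctrl_filter, so it can be used on any pair of streams.

```c
#include <stdio.h>

/* Sketch of the "ctrl" filter with the suggested changes applied.
 * Control characters (below 040) are shown as ^X, and every character
 * is followed by a space, as in the original program. */
void ctrl_filter(FILE *in, FILE *out)
{
    int c;      /* int, not char, so EOF stays distinct from any byte */

    putc(' ', out);
    while ((c = getc(in)) != EOF) {
        if (c < 040) {
            putc('^', out);
            putc(c + 0100, out);        /* e.g. 010 (backspace) -> 'H' */
            if (c == 015 || c == 012)
                putc(c, out);           /* keep line breaks as line breaks */
        } else {
            putc(c, out);
        }
        putc(' ', out);
    }
}
```

A main() that calls ctrl_filter(stdin, stdout) gives the same usage as
before, ls | ctrl.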