stockett@larry.UUCP (Jeff Stockett) (04/30/91)
Greetings!

I'm looking for a version of rm (or a script) that will move deleted
files to a temporary location like .wastebasket, so that novice users
who accidentally delete files can redeem themselves.  I've considered
writing a script to do this, but I thought one might already exist.

Thanks in advance,

Jeffrey M. Stockett
Tensleep Design, Inc.
UUCP: ..cs.utexas.edu!ut-emx!jeanluc!larry!stockett
Internet: stockett@titan.tsd.arlut.utexas.edu
mcf@statware.UUCP (Mathieu Federspiel) (05/03/91)
In article <144@larry.UUCP> stockett@larry.UUCP (Jeff Stockett) writes:
>
>I'm looking for a version of rm (or a script) that will move deleted files to a
>temporary location like .wastebasket, so that novice users who accidentally
>delete files, can redeem themselves. I've considered writing a script to
>do this, but I thought one might already exist.
>

  Following are Bourne shell scripts I implemented on our systems.  I
install the scripts in /usr/local/bin, and then give everyone an alias
of "rm" to this script.

  What happens is, say, you "rm testfile".  The script moves "testfile"
to ".#testfile".  You then have a period of time to "unrm testfile" to
get the file back.  The period of time is determined by the system
administrator, who sets up a job to run periodically to remove all
files with names starting with ".#".

  For this removing process, the administrator must, of course, warn
users not to give files names starting with ".#".  Since such files are
hidden, there should be no problem.  Note that this scheme preserves
the directory structure of removed files, which makes life easier than
moving everything to ".wastebasket".  Also note that directories will
be moved, and special handling of directories in your removing job may
be required.

  Enjoy!
--
Mathieu Federspiel                  mcf%statware.uucp@cs.orst.edu
Statware                            orstcs!statware!mcf
260 SW Madison Avenue, Suite 109    503-753-5382
Corvallis OR 97333 USA              503-758-4666 FAX

#---------------------------------- cut here ----------------------------------
# This is a shell archive.  Remove anything before this line,
# then unpack it by saving it in a file and typing "sh file".
#
# Wrapped by Mathieu Federspiel <mcf@statware> on Tue Jun 12 17:17:26 1990
#
# This archive contains:
#	rm#.1	rmrm#.1	rm	lsrm	rmrm	unrm
#
# Modification/access file times will be preserved.
# Error checking via wc(1) will be performed.
unset LANG
echo x - rm\#.1
cat >rm\#.1 <<'@EOF'
.TH RM 1 "LOCAL"
.tr ^
.SH NAME
rm, lsrm, unrm, rmrm \- temporary file removal system
.SH SYNOPSIS
.B rm
[rm options] files
.B lsrm
[ls options]
.B unrm
files
.B rmrm
directory
.SH DESCRIPTION
The temporary file removal system will use
.B mv(1)
to move the specified \fBfiles\fR to the same name with the prefix
\fB.#\fR.
Files with this prefix are deleted from the system after they are one
day old.
.B Unrm
may be used to restore temporarily removed files, if they have not
been deleted.
.B Lsrm
is used to list files in the current directory which are temporarily
removed.
.B Rmrm
is used to remove all files which begin with \fB.#\fR from the
directory which is specified, and all directories thereunder.
.B Find(1)
is used to identify and remove those files.
.SH CAVEATS
The use of options with rm will cause different results.
The rm(1) option -i is recognized and the script will prompt for a
"y" or "Y" response before moving the file specified.
Any other response will not move the file.
Options other than -i are not recognized by rm.
The use of any option other than -i will result in the option and the
list of files being passed to rm(1) unchanged.
Only the first item in the command list is checked for "-i".
While \fBlsrm\fR will correctly use \fBls(1)\fR options, it will only
list temporarily removed files in the current directory.
.SH "SEE ALSO"
rm(1), ls(1), mv(1)
.SH AUTHOR
Mathieu Federspiel, Statware.
@EOF
set `wc -lwc <rm\#.1`
if test $1$2$3 != 572511434
then echo ERROR: wc results of rm\#.1 are $* should be 57 251 1434
fi
touch -m 1219172089 rm\#.1
touch -a 0605040190 rm\#.1
chmod 664 rm\#.1
echo x - rmrm\#.1
cat >rmrm\#.1 <<'@EOF'
.TH RM 1 "LOCAL"
.tr ^
.SH NAME
rm, lsrm, unrm, rmrm \- temporary file removal system
.SH SYNOPSIS
.B rm
[rm options] files
.B lsrm
[ls options]
.B unrm
files
.B rmrm
directory
.SH DESCRIPTION
The temporary file removal system will use
.B mv(1)
to move the specified \fBfiles\fR to the same name with the prefix
\fB.#\fR.
Files with this prefix are deleted from the system after they are one
day old.
.B Unrm
may be used to restore temporarily removed files, if they have not
been deleted.
.B Lsrm
is used to list files in the current directory which are temporarily
removed.
.B Rmrm
is used to remove all files which begin with \fB.#\fR from the
directory which is specified, and all directories thereunder.
.B Find(1)
is used to identify and remove those files.
.SH CAVEATS
The use of options with rm will cause different results.
The rm(1) option -i is recognized and the script will prompt for a
"y" or "Y" response before moving the file specified.
Any other response will not move the file.
Options other than -i are not recognized by rm.
The use of any option other than -i will result in the option and the
list of files being passed to rm(1) unchanged.
Only the first item in the command list is checked for "-i".
While \fBlsrm\fR will correctly use \fBls(1)\fR options, it will only
list temporarily removed files in the current directory.
.SH "SEE ALSO"
rm(1), ls(1), mv(1)
.SH AUTHOR
Mathieu Federspiel, Statware.
@EOF
set `wc -lwc <rmrm\#.1`
if test $1$2$3 != 572511434
then echo ERROR: wc results of rmrm\#.1 are $* should be 57 251 1434
fi
touch -m 1219172089 rmrm\#.1
touch -a 0612171790 rmrm\#.1
chmod 664 rmrm\#.1
echo x - rm
cat >rm <<'@EOF'
#!/bin/sh
# rm temporary
# script to do an mv rather than rm to .# file
# By Mathieu Federspiel, 1987
#
# recognizes -i option
#
# Modified to print if file deleted/not with -i. --- MCF, Jan 1989
# Modified to test for write permission. --- MCF, Aug 1989
# Modified to touch saved file.  This helps with backups. --- MCF, Aug 1989
case "$1" in
-i) shift
	for arg in $*
	do
	if [ \( -f $arg -o -d $arg \) -a -w $arg ]
	then
		echo "$arg [yn](n) ? \c"
		read yesno
		if [ "$yesno" = "y" -o "$yesno" = "Y" ]
		then
			base=`basename $arg`
			dir=`dirname $arg`
			mv $arg ${dir}/.#$base && touch ${dir}/.#$base
			echo "$arg removed."
		else
			echo "$arg not removed."
		fi
	else
		echo "$arg: write permission denied"
	fi
	done
	;;
-*) /bin/rm "$@"
	;;
*)	for arg
	do
	if [ -w $arg ]
	then
		base=`basename $arg`
		dir=`dirname $arg`
		mv $arg ${dir}/.#$base && touch ${dir}/.#$base
	else
		echo "$arg: write permission denied"
	fi
	done
	;;
esac
@EOF
set `wc -lwc <rm`
if test $1$2$3 != 45169984
then echo ERROR: wc results of rm are $* should be 45 169 984
fi
touch -m 0824171389 rm
touch -a 0612163190 rm
chmod 775 rm
echo x - lsrm
cat >lsrm <<'@EOF'
ls -a $* .#*
@EOF
set `wc -lwc <lsrm`
if test $1$2$3 != 1413
then echo ERROR: wc results of lsrm are $* should be 1 4 13
fi
touch -m 0825120387 lsrm
touch -a 0611201390 lsrm
chmod 775 lsrm
echo x - rmrm
cat >rmrm <<'@EOF'
#
# rmrm#: to rm files moved with rm#

USAGE="Usage: $0 <directory>"

case $# in
1 ) ;;
* ) echo $USAGE >&2 ; exit 1 ;;
esac

find $1 -name '.#*' -exec /bin/rm -f {} \;
@EOF
set `wc -lwc <rmrm`
if test $1$2$3 != 1137172
then echo ERROR: wc results of rmrm are $* should be 11 37 172
fi
touch -m 0415135888 rmrm
touch -a 0604074090 rmrm
chmod 775 rmrm
echo x - unrm
cat >unrm <<'@EOF'
for arg
do
base=`basename $arg`
dir=`dirname $arg`
mv ${dir}/.#$base $arg
done
@EOF
set `wc -lwc <unrm`
if test $1$2$3 != 61196
then echo ERROR: wc results of unrm are $* should be 6 11 96
fi
touch -m 0825120587 unrm
touch -a 0517152390 unrm
chmod 775 unrm
exit 0
--
Mathieu Federspiel                  mcf%statware.uucp@cs.orst.edu
Statware                            orstcs!statware!mcf
260 SW Madison Avenue, Suite 109    503-753-5382
Corvallis OR 97333 USA              503-758-4666 FAX
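The archive's rmrm reaps one directory tree on demand; the periodic
system-wide job is left to the administrator.  A minimal sketch of that
job, assuming a one-day grace period as in the man page (the -mtime +1
cutoff, the backdating with touch -t, and the scratch directory standing
in for users' home directories are my assumptions, not part of the
archive):

```shell
# Sketch of the administrator's reaping job: remove every ".#" file
# more than one day old.  Demonstrated on a scratch tree instead of
# the real filesystem; a crontab entry would run the same find over
# the users' directories nightly.
scratch=$(mktemp -d)
mkdir -p "$scratch/user/bin"
touch "$scratch/user/.#old" "$scratch/user/bin/.#deep" "$scratch/user/keep"

# Backdate the "removed" files so the age test matches them.
touch -t 202001010000 "$scratch/user/.#old" "$scratch/user/bin/.#deep"

# The reap itself: same find(1) idiom as rmrm, plus an age test.
find "$scratch" -name '.#*' -mtime +1 -exec rm -f {} \;

ls -a "$scratch/user"    # "keep" survives; the .# files are gone
```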
navarra@casbah.acns.nwu.edu (John 'tms' Navarra) (05/04/91)
In article <11283@statware.UUCP> mcf@statware.UUCP (Mathieu Federspiel) writes:
>
>  Following are Bourne shell scripts I implemented on our systems.
>I install the scripts in /usr/local/bin, and then give everyone an
>alias of "rm" to this script.
>  What happens is, say, you "rm testfile". The script moves
>"testfile" to ".#testfile". You then have a period of time to
>"unrm testfile" to get the file back. The period of time is
>determined by the system administrator, who sets up a job to run
>periodically to remove all files with names starting with ".#".
>  For this removing process, the administrator must, of course,
>warn users not to name files as ".#". Since this is a hidden file,
>there should be no problem. Note that this preserves the directory
>structure of files, which makes life easier than moving everything
>to ".wastebasket". Also note that directories will be moved, and
>special handling of directories in your removing job may be
>required.
>  Enjoy!

     I am not too sure about this one.  Why would you want a script
that forbids users to name a file .#something, when you can just make
a script that puts ALL removed files into a directory
/var/preserve/username and removes all files in that directory older
than two days?  Then you can tell users that they can go into that
directory and get a copy of the file they just removed -- no matter
what its name is.
     Also, whatever script you write that searches through EVERYONE's
directories looking for files beginning with .# will be MUCH slower
than doing a find -mtime on a previously specified directory like
/var/preserve and then removing those files older than 2 days.
     Also, when you remove a file from, say, your home directory, is a
.#file made in your home dir?  And if you are in your bin directory,
is a .#file made there?  That means, of course, that whatever script
you write to remove these files has to traverse EVERY damn directory
on the planet looking for .# files!
     Also, when you say hidden, you mean hidden from ls, not from
ls -las.  Well, I do an ls -las all the time, and I wouldn't want a
whole bunch of .# files looking me in the face when I ls my
directories.
     This is what I do: I have a program called rm that moves all
files I remove into $HOME/tmp.  Then I have a program called
night-clean, run from crontab, that looks SPECIFICALLY in $HOME/tmp
and removes files older than 2 days.  Night-clean reports what files
it removes to $HOME/adm/rmlog, so I can look periodically at what
files crontab has removed in case I forget or something.  Of course,
rmlog grows to a considerable size after a while, so I have another
program called skim which I run to make sure it is not too big :-)
     Note, though, that this is MUCH more efficient than looking
through GOD knows how many directories for .# files.

>
>--
>Mathieu Federspiel                  mcf%statware.uucp@cs.orst.edu
>Statware                            orstcs!statware!mcf
>260 SW Madison Avenue, Suite 109    503-753-5382
>Corvallis OR 97333 USA              503-758-4666 FAX
>

--
From the Lab of the MaD ScIenTiST:

navarra@casbah.acns.nwu.edu
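The $HOME/tmp scheme just described can be sketched in Bourne shell.
The post names the programs rm and night-clean; the function names
below, the backdating used to simulate a two-day-old file, and the
scratch HOME are illustrative assumptions, not Navarra's actual code:

```shell
# Scratch HOME so the sketch is self-contained.
HOME=$(mktemp -d)
mkdir -p "$HOME/tmp" "$HOME/adm"

# "rm" replacement: move arguments into $HOME/tmp instead of unlinking.
saferm() {
	for f in "$@"
	do
		mv "$f" "$HOME/tmp/" || echo "saferm: cannot move $f" >&2
	done
}

# night-clean: reap files older than 2 days, logging names to rmlog.
nightclean() {
	find "$HOME/tmp" -type f -mtime +2 -print -exec rm -f {} \; \
		>> "$HOME/adm/rmlog"
}

cd "$HOME"
echo data > victim
saferm victim                           # victim now lives in $HOME/tmp
touch -t 202001010000 "$HOME/tmp/victim"   # pretend it is old
nightclean                              # reaps it, records it in rmlog
```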
jik@athena.mit.edu (Jonathan I. Kamens) (05/06/91)
  John Navarra suggests a non-destructive version of 'rm' that either
moves the deleted file into a directory such as /var/preserve/username,
which is periodically reaped by the system, and from which the user can
retrieve accidentally deleted files, or uses a directory $HOME/tmp and
does a similar thing.

  He points out two drawbacks of the approach of putting the deleted
file in the same directory it was deleted from.  First, this requires
that the entire directory tree be searched in order to reap deleted
files, which is slower than searching one directory.  Second, the files
show up when the "-a" or "-A" flag to ls is used to list the files in a
directory.

  A design similar to his was considered when we set about designing
the non-destructive rm currently in use (as "delete") at Project Athena
and available in the comp.sources.misc archives.  There were several
reasons why we chose the approach of leaving files in the same
directory, rather than Navarra's approach.  They include:

1. In a distributed computing environment, it is not practical to
   assume that a world-writeable directory such as /var/preserve will
   exist on all workstations and be accessible identically from all
   workstations (i.e. if I delete a file on one workstation, I must be
   able to undelete it on any other workstation; one of the tenets of
   Project Athena's services is that, as much as possible, they must
   not differ when a user moves from one workstation to another).
   Furthermore, the "delete" program cannot run setuid in order to
   have access to the directory, both because setuid programs are a
   bad idea in general, and because setuid has problems in remote
   filesystem environments (such as Athena's).  Using $HOME/tmp
   alleviates this problem, but there are others....

2. (This is a big one.)  We wanted to insure that the interface for
   delete would be as close as possible to that of rm, including
   recursive deletion and other stuff like that.
   Furthermore, we wanted to insure that undelete's interface would be
   as close to delete's and as functional.  If I do "delete -r" on a
   directory tree, then "undelete -r" on that same filename should
   restore it, as it was, in its original location.

   Navarra's scheme cannot do that -- his script stores no information
   about where files lived originally, so users must undelete files by
   hand.  If he were to attempt to modify it to store such
   information, he would have to either (a) copy entire directory
   trees to other locations in order to store their directory tree
   state, or (b) munge the filenames in the deleted file directory in
   order to indicate their original location, and search for
   appropriate patterns in filenames when undeleting, or (c) keep a
   record file in the deleted file directory of where all the files
   came from.

   Each of these approaches has problems.  (a) is slow, and can be
   unreliable.  (b) might break in the case of funny filenames that
   confuse the parser in undelete, and undelete is slow because it has
   to do pattern matching on every filename when doing recursive
   undeletes, rather than just opening and reading directories.  (c)
   introduces all kinds of locking problems -- what if two processes
   try to delete files at the same time?

3. If all of the deleted files are kept in one directory, the
   directory gets very large.  This makes searching it slower, and
   wastes space (since the directory will not shrink when the files
   are reaped from it or undeleted).

4. My home directory is mounted automatically under /mit/jik, but
   someone else may choose to mount it on /mnt, or I may choose to do
   so.  The undeletion process must be independent of mount point, and
   therefore storing original paths of filenames when deleting them
   will fail if a different mount point is later used.  Using the
   filesystem hierarchy itself is the only way to insure mount-point
   independent operation of the system.

5.
   It is not expensive to scan the entire tree for deleted files to
   reap, since most systems already run such scans every night,
   looking for core files, *~ files, etc.  In fact, many Unix systems
   come bundled with a crontab that searches for # and .# files every
   night by default.

6. If I delete a file in our source tree, why should the deleted
   version take up space in my home directory, rather than in the
   source tree?  Furthermore, if the source tree is on a different
   filesystem, the file can't simply be rename()d to put it into my
   deleted file directory; it has to be copied.  That's slow.  Again,
   using the filesystem hierarchy avoids these problems, since
   rename() within a directory always works (although I believe
   renaming a non-empty directory might fail on some systems, they
   deserve to have their vendors shot :-).

7. Similarly, if I delete a file in a project source tree that many
   people work on, then other people should be able to undelete the
   file if necessary.  If it's been put into my home directory, in a
   temporary location which presumably is not world-readable, they
   can't.  They probably don't even know who deleted it.

Jonathan Kamens			      USnail:
MIT Project Athena			11 Ashford Terrace
jik@Athena.MIT.EDU			Allston, MA  02134
Office: 617-253-8085			Home: 617-782-0710
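Points 2 and 4 above rest on the observation that an in-place scheme
needs no record of original paths: undeletion is a single rename in the
same directory.  A shell sketch of that idea (the real Athena delete
and undelete are separate programs; the function names and scratch
directory here are illustrative assumptions):

```shell
# In-place deletion: a file never leaves its directory, so restoring
# it is one rename and is independent of mount point.
work=$(mktemp -d)
mkdir -p "$work/src"
echo hello > "$work/src/prog.c"

delete_() {              # mark deleted: prog.c -> .#prog.c
	d=`dirname "$1"`; b=`basename "$1"`
	mv "$1" "$d/.#$b"
}

undelete_() {            # restore: .#prog.c -> prog.c
	d=`dirname "$1"`; b=`basename "$1"`
	mv "$d/.#$b" "$1"
}

delete_ "$work/src/prog.c"     # $work/src now holds only .#prog.c
undelete_ "$work/src/prog.c"   # back where it was, contents intact
cat "$work/src/prog.c"         # prints: hello
```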
navarra@casbah.acns.nwu.edu (John 'tms' Navarra) (05/06/91)
In article <JIK.91May6001507@pit-manager.mit.edu> jik@athena.mit.edu (Jonathan I. Kamens) writes:
>
>  John Navarra suggests a non-destructive version of 'rm' that either
>moves the deleted file into a directory such as
>/var/preserve/username, which is periodically reaped by the system,
>and from which the user can retrieve accidentally deleted files, or
>uses a directory $HOME/tmp and does a similar thing.
>
>  He points out two drawbacks with the approach of putting the deleted
>file in the same directory as before it was deleted.  First of all,
>this requires that the entire directory tree be searched in order to
>reap deleted files, and this is slower than just having to search one
>directory.  Second, the files show up when the "-a" or "A" flag to ls
>is used to list the files in a directory.
>
>  A design similar to his was considered when we set about designing
>the non-destructive rm currently in use (as "delete") at Project
>Athena and available in the comp.sources.misc archives.  There were
>several reasons why we chose the approach of leaving files in the same
>directory, rather than Navarra's approach.  They include:
>
>1. In a distributed computing environment, it is not practical to
>   assume that a world-writeable directory such as /var/preserve will
>   exist on all workstations, and be accessible identically from all
>   workstations (i.e. if I delete a file on one workstation, I must be
>   able to undelete it on any other workstation; one of the tenet's of
>   Project Athena's services is that, as much as possible, they must
>   not differ when a user moves from one workstation to another).
>   Furthermore, the "delete" program cannot run setuid in order to
>   have access to the directory, both because setuid programs are a
>   bad idea in general, and because setuid has problems in remote
>   filesystem environments (such as Athena's).  Using $HOME/tmp
>   alleviates this problem, but there are others....
     The fact that among Athena's 'tenets' is that of similarity from
workstation to workstation is both good and bad in my opinion.  True,
it is reasonable to expect that Unix will behave the same on similar
workstations, but one of the fundamental benefits of Unix is that the
user gets to create his own environment.  Thus, we can argue the
advantages and disadvantages of using an undelete utility, but you
seem to be of the opinion that non-standard changes are not
beneficial, and I argue that most users don't use a large number of
different workstations and that we shouldn't reject a better method
just because it isn't standard.
     I don't understand your setuid argument.  All you do is have a
directory called /var/preserve/navarra and make each person's
directory inaccessible to others (or possibly have the sticky bit set
too) so that only the owner of a file can undelete it.

>
>2. (This is a big one.)  We wanted to insure that the interface for
>   delete would be as close as possible to that of rm, including
>   recursive deletion and other stuff like that.  Furthermore, we
>   wanted to insure that undelete's interface would be close to
>   delete's and as functional.  If I do "delete -r" on a directory
>   tree, then "undelete -r" on that same filename should restore it,
>   as it was, in its original location.
>
>   Navarra's scheme cannot do that -- his script stores no information
>   about where files lived originally, so users must undelete files by
>   hand.  If he were to attempt to modify it to store such
>   information, he would have to either (a) copy entire directory
>   trees to other locations in order to store their directory tree
>   state, or (b) munge the filenames in the deleted file directory in
>   order to indicate their original locationa, and search for
>   appropriate patterns in filenames when undeleting, or (c) keep a
>   record file in the deleted file directory of where all the files
>   came from.

     Ahh, we can improve that.
     I can write a program called undelete that will look at the
filename argument and by default undelete it to $HOME, but can also
take a second argument -- a directory -- to move the undeleted
material to.  I am pretty sure I (or some better programmer than I)
could get it to move more than one file at a time, or even do
something like: undelete *.c $HOME/src, and move all files in
/var/preserve/username with .c extensions to your src dir.  And if you
don't have an src dir, it will make one for you.  Now this, if done
right, shouldn't take much longer than removing a directory structure.
So rm *.c on a dir should be only a tiny bit faster than undelete *.c
$HOME/src.  I think the wait is worth it, though -- especially if you
consider the consequences of looking through a tape backup, or gee, a
total loss of your files!
     As far as rm -r and undelete -r go, perhaps the best way to
handle this is, when the -r option is called, the whole dir in which
you are removing files is just moved to /preserve.  And then an
undelete -r dir dir2, where dir2 is a destination dir, would restore
all those files.  However, you would run into problems if /preserve is
not mounted on the same tree as the dir you wanted to remove.  This
can be resolved by allowing undelete to run suid, but I agree that is
not wise.  You wouldn't want users being able to mount and unmount
filesystems they had remove privileges on -- perhaps there is another
solution that I am overlooking, but there are limits to any program.
Just because there might not be any information about where the files
originally were is not good enough reason to axe its use.

>
>   Each of these approaches has problems.  (a) is slow, and can be
>   unreliable.  (b) might break in the case of funny filenames that
>   confuse the parser in undelete, and undelete is slow because it has
>   to do pattern matching on every filename when doing recursive
>   undeletes, rather than just opening and reading directories.
>   (c)
>   introduces all kinds of locking problems -- what if two processes
>   try to delete files at the same time.

     Assuming I can write a program which could look through this
preserve dir and grab a file(s) that matches the argument, undelete
would be slow if there were a vast number of files in there.  However,
assuming you don't remove HUGE numbers of files over a two day period
(the period after which the files would be deleted), I bet that would
be faster than undeleting a file in a number of directories that have
a .# extension, because many directories would be bigger than the
/preserve dir, in which case you would have to be digging through a
bigger list of files.
     Here are some more problems.  Like rm, undelete would operate by
looking through /preserve.  But if rm did not store files in that dir,
but instead stored them as .# in the current directory, then undelete
would likewise have to start looking in the current dir and work its
way through the directory structure looking for .# files that matched
a filename argument, UNLESS you gave it a starting directory as an
argument, in which case it would start there.  That seems like a lot
of hassle to me.  As far as funny filenames and such -- that I am not
sure about, but it seems like it could be worked out.

>
>3. If all of the deleted files are kept in one directory, the
>   directory gets very large.  This makes searching it slower, and
>   wastes space (since the directory will not shrink when the files
>   are reaped from it or undeleted).

     You get a two day grace period -- then they are GONE!  This is
still faster than searching through the current directory (in many
cases) looking for .# files to undelete.

>
>4. My home directory is mounted automatically under /mit/jik, but
>   someone else may choose to mount it on /mnt, or I may choose to do
>   so.  The undeletion process must be independent of mount point, and
>   therefore storing original paths of filenames when deleting them
>   will fail if a different mount point is later used.
>   Using the
>   filesystem hierarchy itself is the only way to insure mount-point
>   independent operation of the system.
>
>5. It is not expensive to scan the entire tree for deleted files to
>   reap, since most systems already run such scans every night,
>   looking for core files *~ files, etc.  In fact, many Unix systems
>   come bundled with a crontab that searches for # and .# files every
>   night by default.

     If that is the case -- fine -- you got me there.  Do it from
crontab and remove them every few days.  I just think it is a waste to
infest many directories with *~ and # and .# files when 99% of the
time when someone does rm filename -- THEY WANT IT REMOVED AND NEVER
WANT TO SEE IT AGAIN!  So now when I do an ls -las -- guess what!
There they are again!  Well, you tell me "John, don't do an ls -las"
-- well, how about having to wait longer on various ls's because my
directory size is bigger now?  Say I did delete a whole mess of files;
now I have all those files in my current dir, and now I want to see
all my .files as well.  So I do an ls -las, and when I come back from
lunch I might see them -- ever try to ls -las /dev!?

>
>6. If I delete a file in our source tree, why should the deleted
>   version take up space in my home directory, rather than in the
>   source tree?  Furthermore, if the source tree is on a different
>   filesystem, the file can't simply be rename()d to put it into my
>   deleted file directory, it has to be copied.  That's slow.  Again,
>   using the filesystem hierarchy avoids these problems, since
>   rename() within a directory always works (although I believe
>   renaming a non-empty directory might fail on some systems, they
>   deserve to have their vendors shot :-).
>
>7. Similarly, if I delete a file in a project source tree that many
>   people work on, then other people should be able to undelete the
>   file if necessary.  If it's been put into my home directory, in a
>   temporary location which presumably is not world-readable, they
>   can't.
>   They probably don't even know who delete it.

     I admit you have pointed out some flaws, some of which can be
corrected, others you just have to live with.  I have made a few
suggestions to improve the program.  In the end, though, I think the
one /preserve directory is much better.  But here is another
suggestion which you might like: make a shell variable RMPATH which
you can set to whatever path you want.  The default will be
/var/preserve, but you can set it to $HOME/tmp, or perhaps it could
work like the PS1 variable and take a $PWD option, in which case it is
set to your current directory.  Then when you rm something or undelete
something, RMPATH will be checked.

>
>Jonathan Kamens			      USnail:
>MIT Project Athena			11 Ashford Terrace
>jik@Athena.MIT.EDU			Allston, MA  02134
>Office: 617-253-8085			Home: 617-782-0710
>

--
From the Lab of the MaD ScIenTiST:

navarra@casbah.acns.nwu.edu
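The undelete proposed above (a pattern plus an optional destination
directory, created if missing, with RMPATH selecting the preserve
location) can be sketched in shell.  Everything here -- the function,
its single-pattern limitation, and the scratch directories standing in
for /var/preserve/$USER -- is an illustrative assumption, not a posted
implementation:

```shell
# RMPATH defaults to the per-user preserve directory, as suggested;
# a scratch directory stands in for it here.
RMPATH=$(mktemp -d)
dest=$(mktemp -d)/src        # destination need not exist yet

# undelete PATTERN DESTDIR: restore files from $RMPATH matching
# PATTERN into DESTDIR, creating DESTDIR if necessary.
undelete() {
	pat=$1; destdir=$2
	mkdir -p "$destdir"
	for f in "$RMPATH"/$pat
	do
		[ -e "$f" ] && mv "$f" "$destdir/"
	done
}

touch "$RMPATH/a.c" "$RMPATH/b.c" "$RMPATH/notes.txt"
undelete '*.c' "$dest"       # a.c and b.c move; notes.txt stays put
ls "$dest"
```

Note the pattern is quoted so the shell expands it against $RMPATH
inside the function, not against the caller's current directory.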
asg@sage.cc.purdue.edu (The Grand Master) (05/06/91)
In article <1991May6.072447.21943@casbah.acns.nwu.edu> navarra@casbah.acns.nwu.edu (John 'tms' Navarra) writes:
}In article <JIK.91May6001507@pit-manager.mit.edu> jik@athena.mit.edu (Jonathan I. Kamens) writes:
}> [First a brief history]
}>  John Navarra suggests a non-destructive version of 'rm' that either
}>moves the deleted file into a directory such as
}>/var/preserve/username, which is periodically reaped by the system, or
}>uses a directory $HOME/tmp and does a similar thing.
}>
}>  He points out two drawbacks with the approach of putting the deleted
}>file in the same directory as before it was deleted.  First of all,
}>this requires that the entire directory tree be searched in order to
}>reap deleted files, and this is slower than just having to search one
}>directory.  Second, the files show up when the "-a" or "A" flag to ls
}>is used to list the files in a directory.
}>
}>  A design similar to his was considered when we set about designing
}>the non-destructive rm currently in use (as "delete") at Project Athena
}>1. In a distributed computing environment, it is not practical to
}>   assume that a world-writeable directory such as /var/preserve will
}>   exist on all workstations, and be accessible identically from all
}>   workstations (i.e. if I delete a file on one workstation, I must be
}>   able to undelete it on any other workstation; one of the tenet's of
}>   Project Athena's services is that, as much as possible, they must
}>   not differ when a user moves from one workstation to another).

Explain something to me, Jon: first you say that /var/preserve will
not exist on all workstations, then you say you want a non-differing
environment on all workstations.  If so, /var/preserve SHOULD exist on
all workstations if it exists on any.  Maybe you should make sure it
does.
}>   Furthermore, the "delete" program cannot run setuid in order to
}>   have access to the directory, both because setuid programs are a
}>   bad idea in general, and because setuid has problems in remote
}>   filesystem environments (such as Athena's).  Using $HOME/tmp
}>   alleviates this problem, but there are others....

Doesn't need to run suid.  Try this:

$ ls -ld /var/preserve
rwxrwxrwt  preserve preserve  /var/preserve
$ ls -l /var/preserve
rwx------  navarra  navarra   /var/preserve/navarra
rwx------  jik      jik       /var/preserve/jik

hmm, doesn't look like you need anything suid for that!

}
}     The fact that among Athena's 'tenets' is that of similarity from
}workstation to workstation is both good and bad in my opinion. True, it
}is reasonable to expect that Unix will behave the same on similar
}workstations but one of the fundamental benifits of Unix is that the
}user gets to create his own environment. Thus, we can argue the
}advantages and disadvantages of using an undelete utililty but you
}seem to be of the opinion that non-standard changes are not beneficial
}and I argue that most users don't use a large number of different
}workstations and that we shouldn't reject a better method just because
}it isn't standard.

It is bad in no way at all.  It is reasonable for me to expect that my
personal environment, and the shared system environment, will be the
same on different workstations.  And many users at a university site
use several different workstations (I do).  I like to know that I can
do things the same way no matter where I am when I log in.

}>2. (This is a big one.)  We wanted to insure that the interface for
}>   delete would be as close as possible to that of rm, including
}>   recursive deletion and other stuff like that.  Furthermore, we
}>   wanted to insure that undelete's interface would be close to
}>   delete's and as functional.
}>   If I do "delete -r" on a directory
}>   tree, then "undelete -r" on that same filename should restore it,
}>   as it was, in its original location.

There is not a large problem with this either.  Info could be added to
the file, or a small record book could be kept.  And /userf/jik could
be converted to $HOME in the process to avoid problems with different
mount points.

}>
}>   Navarra's scheme cannot do that -- his script stores no information
}>   about where files lived originally, so users must undelete files by
}>   hand.  If he were to attempt to modify it to store such
}>   information, he would have to either (a) copy entire directory
}>   trees to other locations in order to store their directory tree

What about $HOME/tmp???? - Then you would only have to mv it.

}>   state, or (b) munge the filenames in the deleted file directory in
}>   order to indicate their original locationa, and search for
}>   appropriate patterns in filenames when undeleting, or (c) keep a
}>   record file in the deleted file directory of where all the files
}>   came from.

Again - these last two are no problem at all.

}
}     Ahh, we can improve that. I can write a program called undelete that
}will look at the filename argument and by default undelete it to $HOME
}but can also include a second argument -- a directory -- to move the
}undeleted material. I am pretty sure I could (or some better programmer
}than I) could get it to move more than one file at a time or even be
}able to do something like: undelete *.c $HOME/src and move all files
}in /var/preserve/username with .c extensions to your src dir.
}And if you don't have an src dir -- it will make one for you. Now this
}if done right, shouldn't take much longer than removing a directory
}structure. So rm *.c on a dir should be only a tiny bit faster than
}undelete *.c $HOME/src. I think the wait is worth it though -- esp
}if you consider the consequnces of looking thru a tape backup or gee
}a total loss of your files!
This is not what Jon wants though. He does not want the user to have to remember where in the directory tree the file was deleted from. However, what Jon fails to point out is that one must remember where they deleted a file from with his method too. Say, for example, I do the following:

$ cd $HOME/src/zsh2.00/man
$ delete zsh.1

Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in the directory it was deleted from. Or does your undelete program also search the entire damn directory structure of the system?

} As far as rm -r and undelete -r go, perhaps the best way to handle
} this is when the -r option is called, the whole dir in which you are
} removing files is just moved to /preserve.

And then an undelete -r dir dir2, where dir2 is a destination dir, would restore all those files.

} However, you would run into
} problems if /preserve is not mounted on the same tree as the dir you wanted

Again, that is why you should use $HOME/tmp.

}> 3. If all of the deleted files are kept in one directory, the
}>    directory gets very large. This makes searching it slower, and
}>    wastes space (since the directory will not shrink when the files
}>    are reaped from it or undeleted).

This is much better than letting EVERY DAMN DIRECTORY ON THE SYSTEM GET LARGER THAN IT NEEDS TO BE!! Say I do this:

$ ls -las
14055 -rw------- 1 wines 14334432 May 6 11:31 file12.dat
21433 -rw------- 1 wines 21860172 May 6 09:09 file14.dat
$ rm file*.dat
$ cp ~/new_data/file*.dat .

[ note at this point, my directory will probably grow to a bigger size since there is now a full 70 Meg in one directory as opposed to the 35 Meg that should be there using John Navarra's method]

[work deleted]

$ rm file*.dat

(hmm, I want that older file12 back - BUT I CANNOT GET IT!)

} You get a two day grace period -- then they are GONE!
} This is still faster
} than searching thru the current directory (in many cases) looking for .# files
} to undelete.

You are correct, sir.

}> 4. My home directory is mounted automatically under /mit/jik, but
}>    someone else may choose to mount it on /mnt, or I may choose to do
}>    so. The undeletion process must be independent of mount point, and
}>    therefore storing original paths of filenames when deleting them
}>    will fail if a different mount point is later used. Using the
}>    filesystem hierarchy itself is the only way to insure mount-point
}>    independent operation of the system.

Well, most of us try not to go mounting filesystems all over the place. Who would be mounting your home dir on /mnt?? AND WHY???

} if that is the case -- fine -- you got me there. Do it from crontab
} and remove them every few days. I just think it is a waste to infest
} many directories with *~ and # and .# files when 99% of the time when someone
} does rm filename -- THEY WANT IT REMOVED AND NEVER WANT TO SEE IT AGAIN!
} SO now when I do an ls -las -- guess what! There they are again!

Well John, how about trying (you use bash right?) ;-)

bash$ ls() {
> command ls "$@" | grep -v '\.#'
> }

} you tell me "John, don't do an ls -las" -- well how bout having
} to wait longer on various ls's because my directory size is bigger now.

This point is still valid, however, because there will be overhead associated with piping billions of files starting with .# through grep -v (as well as the billions of files NOT starting with .# that must be piped through).

}> 6. If I delete a file in our source tree, why should the deleted
}>    version take up space in my home directory, rather than in the
}>    source tree? Furthermore, if the source tree is on a different
}>    filesystem, the file can't simply be rename()d to put it into my
}>    deleted file directory, it has to be copied. That's slow.
}> Again,
}> using the filesystem hierarchy avoids these problems, since
}> rename() within a directory always works (although I believe
}> renaming a non-empty directory might fail on some systems, they
}> deserve to have their vendors shot :-).

Is this system source code? If so, I really don't think you should be deleting it with your own account. But if that is what you wish, how about a test for whether you are in your own directory? If yes, it moves the deleted file to $HOME/tmp; if not, it moves it to ./tmp (or ./delete, or ./wastebasket, or whatever).

}> 7. Similarly, if I delete a file in a project source tree that many
}>    people work on, then other people should be able to undelete the
}>    file if necessary. If it's been put into my home directory, in a
}>    temporary location which presumably is not world-readable, they
}>    can't. They probably don't even know who deleted it.

Shouldn't need to be world readable (that is assuming that to have permission to delete source you have to be in a special group - or can just anyone on your system delete source?).

} I admit you have pointed out some flaws. Some of which can be corrected,
} others you just have to live with. I have made a few suggestions to improve
} the program. In the end though, I think the one /preserve directory is
} much better. But here is another suggestion which you might like:

}> Jonathan Kamens USnail:

Well Jon, I have a better solution for you - ready?

rm:
# Safe rm script
at -o 2.0.0 rm $*

That seems to be what you want. Look - there is no perfect method for doing this. But the best way seems to me to be the following:

1) Move files in the $HOME tree to $HOME/tmp.

2) Totally delete files in /tmp.

3) Copy personally owned files from anywhere other than $HOME or /tmp to $HOME/tmp (with a -r if necessary). Do this in the background.
Then remove them of course: (cp -r $dir $HOME/tmp ; rm -r $dir) &

4) If a non-personally owned file is deleted, place it in ./.delete, and place a notification in the file as to who deleted it and when. Then spawn an at job to delete the file in 2 days, and the notification in whatever number of days you wish.

An example of 4:

jik> ls -las
drwxrwxr-x source source 1024 .
-rwxrwxr-x source source 5935 fun.c
jik> rm fun.c
jik> ls -las
drwxrwxr-x source source 1024 .
drwxrwxr-x source source 1024 .delete
-rwxrwxr-x source source   69 fun.c
jik> cat fun.c
File: fun.c
Deleted at: Mon May 6 12:41:31 EDT 1991
Deleted by: jik

Another possibility for 4: I assume that the source tree is all one filesystem, no? If so, then have files removed in the source tree moved to /src/.delete. Have a notification then placed in fun.c and spawn an at job to delete it, or place the notification in fun.c_delete and have the src tree searched for *_delete files (or whatever you wanna call them).

}From the Lab of the MaD ScIenTiST:
}
}navarra@casbah.acns.nwu.edu

Have fun. Oh, and by the way - I think doing this with a shell script is a complete waste of resources. You could easily make mods to the actual code of rm to do this, or use the PUCC entombing library and not even have to change the code to rm (just link to the aforementioned PUCC entombing library when compiling rm).

culater
Bruce Varney
---------
### ## Courtesy of Bruce Varney ### # aka -> The Grand Master # asg@sage.cc.purdue.edu ### ##### # PUCC ### # ;-) # # ;'> # ##
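Sketched as a Bourne shell wrapper, the rules above might look like the following. This is an assumption-laden illustration only: the .delete name and the notification format come from the example transcript, the choice to hard-delete only files directly in /tmp is a guess at the intent, and rule 3's background copy and the at(1) reaping jobs are omitted.

```shell
#!/bin/sh
# Illustrative sketch of the "safe rm" rules described above.
saferm() {
    for f in "$@"; do
        dir=$(cd "$(dirname "$f")" && pwd)   # absolute directory of the file
        case "$dir" in
            "$HOME"|"$HOME"/*)               # rule 1: under $HOME -> $HOME/tmp
                mkdir -p "$HOME/tmp"
                mv "$f" "$HOME/tmp/" ;;
            /tmp)                            # rule 2: files in /tmp really go
                rm -rf "$f" ;;
            *)                               # rule 4: elsewhere -> ./.delete,
                mkdir -p "$dir/.delete"      # leaving a notification behind
                mv "$f" "$dir/.delete/"
                { echo "File: $(basename "$f")"
                  echo "Deleted at: $(date)"
                  echo "Deleted by: $(id -un)"
                } > "$dir/$(basename "$f")" ;;
        esac
    done
}
```

A real version would also spawn the at jobs to reap ./.delete and the notification files after the grace period, and would need to cope with filenames that collide inside .delete.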
jik@athena.mit.edu (Jonathan I. Kamens) (05/07/91)
In article <1991May6.072447.21943@casbah.acns.nwu.edu>, navarra@casbah.acns.nwu.edu (John 'tms' Navarra) writes:

|> The fact that among Athena's 'tenets' is that of similarity from
|> workstation to workstation is both good and bad in my opinion. True, it
|> is reasonable to expect that Unix will behave the same on similar workstations,
|> but one of the fundamental benefits of Unix is that the user gets to create
|> his own environment.

Our approach in no way prevents the user from creating his own environment.

|> Thus, we can argue the advantages and disadvantages of
|> using an undelete utility, but you seem to be of the opinion that non-
|> standard changes are not beneficial

No. What I am arguing is that users should have *access* to a similar environment on all workstations. They can do with that environment whatever the hell they want when they log in. They can use X, or not use X. They can use mwm, or twm, or uwm, or gwm, or whatever-the-hell-wm they want. They can use /bin/csh, or /bin/sh, or (more recently) zsh, or a shell installed in a contributed software locker or in their home directory. They can configure their accounts as much as anyone at any Unix site, if not more.

|> and I argue that most users don't use
|> a large number of different workstations

There are over 1000 workstations at Project Athena. Most users will log into a different workstation every time they log in. The biggest cluster has almost 100 workstations in it. Please remember that your environment is not everyone's environment. I am trying to explain why the design chosen by Project Athena was appropriate for Project Athena's environment; your solution may be appropriate for your environment (although I still believe that it does have problems).

Furthermore, I still believe that Project Athena's approach is more generalized than yours, for the simple reason that our approach will work in your environment, but your approach will not work in our environment.
|> and that we shouldn't reject a |> better method just because it isn't standard. The term "standard" has no meaning here, since we're talking about implementing something that doesn't come "standard" with Unix. |> I don't understand your setuid argument. All you do is have a directory |> called /var/preserve/navarra and have each persons directory unaccessible to |> others (or possibily have the sticky bit set on too) so that only a the owner |> of the file can undelete it. In order to be accessible from multiple workstations, the /var/preserve filesystem has to be a remote filesystem (e.g. NFS or AFS) mounted on each workstation. Mounting one filesystem, from one fileserver, on over 1000 workstations is not practical. Furthermore, it does not scale (e.g. what if there are 10000 workstations rather than 1000?), and another of Project Athena's main design goals was scalability. Finally, since all of the remote file access at Athena is authenticated using Kerberos (because both NFS and AFS are insecure when public workstations can be rebooted by users without something like Kerberos), all users would have to authenticate themselves to /var/preserve's fileserver in order to access it (to delete or undelete files). Storing authentication for every user currently logged in is quite difficult for one fileserver to deal with. We have over 10000 users at Project Athena. This means that either (a) there will have to be over 10000 subdirectories of /var/preserve, or (b) the directories will have to be created as they are needed, which means either a world-writeable /var/preserve or a setuid program that can create directories in a non-world-writeable directory. And setuid programs don't work with authenticated remote filesystems, which was my original point. Yes, many of these concerns are specific to Project Athena. 
But, as I said, what I'm trying to explain is not why all of the problems with your scheme I mentioned are problems everywhere (although some of them are), but rather why all of them are problems at Project Athena.

|> Ahh, we can improve that. I can write a program called undelete that
|> will look at the filename argument and by default undelete it to $HOME
|> but can also include a second argument -- a directory -- to move the
|> undeleted material. I am pretty sure I (or some better programmer
|> than I) could get it to move more than one file at a time or even be
|> able to do something like: undelete *.c $HOME/src and move all files
|> in /var/preserve/username with .c extensions to your src dir.
|> And if you don't have an src dir -- it will make one for you.

I'm sorry, but this does nothing to address my concerns. Leaving the files in the directory in which they were deleted preserves the state indicating where they were originally, so that they can be restored to exactly that location without the user having to specify it. Your way of accomplishing the same thing is a kludge at best and does *not* accomplish the same thing, but rather a crude imitation of it.

|> As far as rm -r and undelete -r go, perhaps the best way to handle
|> this is when the -r option is called, the whole dir in which you are
|> removing files is just moved to /preserve. And then an undelete -r dir
|> dir2 where dir2 is a destination dir, would restore all those files.

What if I do "delete -r foo" and then realize that I want to restore the file "foo/bar/baz/frelt" without restoring anything else? My "delete" deletes a directory recursively by renaming the directory and all of its contents with ".#" prefixes, recursively. Undeleting a specific file several levels deep is therefore trivial, and my delete does it using only rename() calls, which are quite fast.
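The rename-only scheme described here can be sketched roughly as follows. This is an illustration of the idea, not Athena's actual delete/undelete; the function names are invented, and error handling and option parsing are omitted.

```shell
#!/bin/sh
# delete, sketched: recursively rename each entry with a ".#" prefix.
# Children are renamed before their parent, so every step is a rename()
# within one directory -- fast, and never crosses a filesystem.
mark() {
    if [ -d "$1" ]; then
        for child in "$1"/*; do
            [ -e "$child" ] && mark "$child"
        done
    fi
    mv "$1" "$(dirname "$1")/.#$(basename "$1")"
}

# undelete one path, sketched: rename each ".#" component back, left to
# right. Restoring foo/bar/baz/frelt re-creates foo, foo/bar, and
# foo/bar/baz by renaming them, while any other deleted entries inside
# them stay ".#"-prefixed (i.e. still deleted).
unmark() {
    built=""
    oldIFS=$IFS; IFS=/
    for comp in $1; do
        IFS=$oldIFS
        [ -e "$built$comp" ] || mv "$built.#$comp" "$built$comp"
        built="$built$comp/"
    done
}
```

The point of the sketch is that both directions are pure rename() traffic; the reaping pass (a nightly find for ".#*" names) is the only part that walks the filesystem.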
Once again your system runs into the problem of /preserve being on a different filesystem (if it can't be, then you have restricted all of your files to reside on one filesystem), in which case copying directory structures is slow as hell and can be unreliable. Since my system does no inter-filesystem copying, it is fast (which was another requirement of the design -- delete cannot be significantly slower than /bin/rm).

Let's see what your system has to do to undelete "foo/bar/baz/frelt". First, it has to create the undeleted directory "foo". It has to give it the same permissions as the deleted "foo", but it can't just rename() the "foo" in /preserve, since that might be across filesystems and since it doesn't want all of the *other* deleted files in /preserve/foo to show up undeleted. Then, it has to do the same thing with "foo/bar" and "foo/bar/baz". Then, it has to put "foo/bar/baz/frelt" back, copying it (slowly).

It seems to me that your system can reap deleted files quickly, but can delete or undelete files rather slowly. My system reaps files slowly (using a nightly "find" that many Unix sites already run), but runs very quickly from the user's point of view. Tell me, whose time is more important at your site, the user's or the computer's (late at night)?

|> Here are some more problems. Like rm, undelete would operate by looking
|> thru /preserve. But if rm did not store files in that dir but instead stored
|> them as .# in the current directory, then undelete would likewise have to
|> start looking in the current dir and work its way thru the directory structure
|> looking for .# files that matched a filename argument UNLESS you gave it
|> a starting directory as an argument in which case it would start there. That
|> seems like a lot of hassle to me.

Um, "undelete" takes exactly the same syntax as "delete". If you give it an absolute pathname, it looks in that pathname. If you don't, it looks relative to the current path.
If it can't find a file in the current directory, then the file cannot be undeleted. This functionality is identical to the functionality of virtually every other Unix file utility. The system is not expected to be able to find a file in an entire filesystem, given just its name. The user is expected to know where the file is. That's how Unix works. Furthermore, the state is in the filesystem, so that if the user forgets where something is, he can use "find" or something to find it. It seems to me that Athena's design conforms more to the Unix paradigm than yours.

|> You get a two day grace period -- then they are GONE! This is still faster
|> than searching thru the current directory (in many cases) looking for .# files
|> to undelete.

The speed of searching is negligible. The speed of copying the file, possibly very large, from another filesystem, is not. My program will *always* run in negligible time; yours will not.

|> SO now when I do an ls -las -- guess what!

You are one of the few people who has ever told me that he regularly uses the "-a" flag to ls. Most people don't -- that's why ls doesn't display dotfiles by default. Renaming files with a ".#" prefix to indicate that they can be removed and to hide them is older than Athena's delete program; that's why many Unix sites already search for ".#" files. If you use "ls -a" so often that it is a problem for you, *and* if you delete so many files that you will often see deleted files when you do "ls -a", then don't use delete. You can't please all of the people all of the time. But I would venture to say that new users, inexperienced users, the users that "delete" is (for the most part) intended to protect, are not going to have your problems.

|> make a shell variable RMPATH and you can set it to whatever PATH
|> you want.
|> The default will be /var/preserve but you can set it to $HOME/tmp
|> or maybe perhaps it could work like the PS1 variable and have a $PWD
|> option in which case it is set to your current directory. Then when you
|> rm something or undelete something, the RMPATH will be checked.

This solves pretty much none of the problems I mentioned, and introduces others. What if you delete something in one of your accounts that has a weird RMPATH, and then want to undelete it later and can't remember who you were logged in as when you deleted it? You've then got deleted files scattered all over your filespace, and in fact they can be in places totally unrelated to where they were originally. It makes much more sense to leave them where they were when they were deleted -- if you know what the file is about, you probably know in general where to look for it.

--
Jonathan Kamens              USnail:
MIT Project Athena           11 Ashford Terrace
jik@Athena.MIT.EDU           Allston, MA 02134
Office: 617-253-8085         Home: 617-782-0710
jik@athena.mit.edu (Jonathan I. Kamens) (05/07/91)
(I have addressed some of Bruce's points in my last posting, so I will not repeat here any point I have made there.) In article <11941@mentor.cc.purdue.edu>, asg@sage.cc.purdue.edu (The Grand Master) writes: |> Explain something to me Jon - first you say that /var/preserve will not |> exist on all workstations, then you say you want a non-differing |> environment on all workstations. If so, /var/preserve SHOULD |> exist on all workstations if it exists on any. Maybe you should make |> sure it does. The idea of mounting one filesystem from one fileserver (which is what /var/preserve would have to be, if it were to look the same from any workstation so that any file could be recovered from any workstation) on all workstations in a distributed environment does not scale well to even 100 workstations, let alone the over 1000 workstations that we have, and our environment was designed to scale well to as many as 10000 workstations or more. If it doesn't scale, then it doesn't work in our environment. So we can't "make sure" that /var/preserve appears on all workstations. |> However, what Jon fails to point out is that one must remember |> where they deleted a file from with his method too. Say for example I do |> the following. |> $ cd $HOME/src/zsh2.00/man |> $ delete zsh.1 |> Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES |> to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I |> DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in |> the directory it was deleted from. Or does your undelete program also |> search the entire damn directory structure of the system? Um, the whole idea of Unix is that the user knows what's in the file hierarchy. *All* Unix file utilities expect the user to remember where files are. This is not something new, nor (in my opinion) is it bad. I will not debate that issue here; if you wish to discuss it, start another thread. 
I will only say that our "delete" was designed in conformance with the Unix paradigm, so if you wish to criticize this particular design decision, you must be prepared to criticize and defend your criticism of every other Unix utility which accepts the same design criterion.

|> This is much better than letting EVERY DAMN DIRECTORY ON THE SYSTEM
|> GET LARGER THAN IT NEEDS TO BE!!

How many deleted files do you normally have in a directory in any three-day period, or seven-day period, or whatever?

|> Say I do this
|> $ ls -las
|> 14055 -rw------- 1 wines 14334432 May 6 11:31 file12.dat
|> 21433 -rw------- 1 wines 21860172 May 6 09:09 file14.dat
|> $ rm file*.dat
|> $ cp ~/new_data/file*.dat .
|> [ note at this point, my directory will probably grow to a bigger
|> size since there is now a full 70 Meg in one directory as opposed
|> to the 35 Meg that should be there using John Navarra's method]

First of all, the size of a directory has nothing to do with the size of the files in it. Only with the number of files in it. Two extra file entries in a directory increase its size negligibly, if at all (since directories are sized in block increments).

Second, using John Navarra's method, assuming a separate partition for deleted files, I could do this:

1. Copy 300meg of GIF files into /tmp.

2. "rm" them all.

3. Every day or so, "undelete" them into /tmp, touch them to update the modification time, and then delete them.

Now I'm getting away with using the preservation area as my own personal file space, quite possibly preventing other people from deleting files. Using $HOME/tmp avoids this problem, but (as I pointed out in my first message in this thread), you can't always use $HOME/tmp, so there is probably going to be a way for a user to spoof the program into putting the files somewhere nifty.

You could put quotas on the preserve directory.
But the user's home directory already has a quota on it (if you're using quotas), so why not just leave the file in whatever filesystem it was in originally? Better yet, in the degenerate case, just leave it in the same directory it was in originally, with the same owner, thus guaranteeing it will be counted under the correct quota until it is permanently removed! That's a design consideration I neglected to mention in my previous messages....

|> [work deleted]
|> $ rm file*.dat
|> (hmm, I want that older file12 back - BUT I CANNOT GET IT!)

You can't get it back in the other system suggested either. I have been considering adding "version control" to my package for a while now. I haven't gotten around to it. It would not be difficult. But the issue of version control is equivalent in both suggested solutions, and is therefore not an issue.

|> Well most of us try not to go mounting filesystems all over the place.
|> Who would be mounting your home dir on /mnt?? AND WHY???

In a distributed environment of over 1000 workstations, where the vast majority of file space is on remote filesystems, virtually all file access happens on mounted filesystems. A generalized solution to this problem must therefore be able to cope with filesystems mounted in arbitrary locations.

For example, let's say I have an NFS home directory that usually mounts on /mit/jik. But then I log into one of my development machines in which I have a local directory in /mit/jik, with my NFS home directory mounted on /mit/jik/nfs. This *happens* in our environment. A solution that does not deal with this situation is not acceptable in our environment (and will probably run into problems in other environments as well).

|> Is this system source code? If so, I really don't think you should be
|> deleting it with your own account.

First of all, it is not your prerogative to question the source-code access policies at this site.
For your information, however, everyone who has write access to the "system source code" must authenticate that access using a separate Kerberos principal with a separate password. I hope that meets with your approval. Second, this is irrelevant.

|> But if that is what you wish, how about
|> a test for if you are in your own directory. If yes, it moves the
|> deleted file to $HOME/tmp, if not, it moves it to ./tmp (or ./delete, or
|> ./wastebasket or whatever)

How do you propose a 100% foolproof test of this sort? What if I have a source filesystem mounted under my home directory? For all intents and purposes, it will appear to be in my home directory.

What if I have a source tree in my home directory, and I delete a file in it, then tar up the source directory and move it into another project directory, and then realize a couple of days later that I need to undelete the file, but it's not there anymore because it was deleted in my home directory and not in the project directory? How do you propose to move state about deleted files when hierarchies are moved in that manner?

Your suggested alternate solutions to this problem, which I have omitted, all save state in a way that degenerates into saving the state in each directory by leaving the files there. Furthermore, something that has not yet been mentioned, the implementation of a set of utilities which leaves the files in place is far less complex than any other implementation. And the less complex an implementation is, the easier it is to get it right (and optimize it, and fix any bugs that do pop up, etc.).

--
Jonathan Kamens              USnail:
MIT Project Athena           11 Ashford Terrace
jik@Athena.MIT.EDU           Allston, MA 02134
Office: 617-253-8085         Home: 617-782-0710
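For concreteness, the "am I in my own directory" test under debate would presumably be a path-prefix check along these lines (a hypothetical sketch, not anyone's actual code). As argued in the message above, a string comparison cannot see mount points, so a source filesystem mounted under $HOME still passes the test:

```shell
#!/bin/sh
# Hypothetical "is this file in my own directory?" test: resolve the
# file's directory to an absolute path and string-match it against
# $HOME. A filesystem mounted under $HOME defeats this, since its
# paths still begin with $HOME even though the space isn't the user's.
in_home() {
    case "$(cd "$(dirname "$1")" && pwd)" in
        "$HOME"|"$HOME"/*) return 0 ;;
        *)                 return 1 ;;
    esac
}
```

A safe-rm wrapper could call this to choose between $HOME/tmp and a per-directory trash location, which is exactly the decision the quoted suggestion wants to automate.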
drd@siia.mv.com (David Dick) (05/07/91)
In <1991May3.212619.21119@casbah.acns.nwu.edu> navarra@casbah.acns.nwu.edu (John 'tms' Navarra) writes: >In article <11283@statware.UUCP> mcf@statware.UUCP ( Mathieu Federspiel) writes: [description of a renaming scheme for tentative file removal elided] [desc. of tentative-removal directory scheme elided] These two schemes seem to have their own advantages and disadvantages. Renaming (prefixing filenames with something special-- ".#" was suggested) has the advantage of leaving files in place in their filesystem hierarchy, but reserves a class of names in the namespace, and makes scavenging for too-old files slow (because the whole filesystem must be searched). Moving (copying the files with their original names into a fixed directory) has the advantage of preserving original names and not cluttering up the namespace, but full-path information is lost and collisions of filenames can still occur. How about a .deleted sub-directory in any directory where one of these commands has been used? Then a tentatively-deleted file can be moved there (very efficient, since only linking and unlinking is necessary), the original name can be used, and full-path information is preserved. The scavenger still needs to do a full filesystem search, but I don't think it should be continuously running, anyway. One additional thing that these schemes need, however hard it may be to provide, is emergency deletion. That is, just as Macintosh wastebasket contents get reclaimed if more blocks are needed, it would be really nice if the same thing could happen, automatically, if a filesystem ran out of space. On most bigger machines, this is of little concern. But, for individual systems, barely scraping by, this could be a real life-saver. David Dick Software Innovations, Inc. [the Software Moving Company (sm)]
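A minimal sketch of the .deleted scheme just proposed (the function names are illustrative): the move is a rename within the same directory, hence the same filesystem; the file keeps its own name; and its original location is implied by where the .deleted directory sits.

```shell
#!/bin/sh
# Tentative removal into a per-directory .deleted subdirectory.
trash() {
    for f in "$@"; do
        d="$(dirname "$f")/.deleted"
        mkdir -p "$d"
        mv "$f" "$d/"        # same-filesystem rename: just a link and
    done                     # an unlink, no data copied
}

# Undo: move the file back out of its directory's .deleted subdirectory.
untrash() {
    for f in "$@"; do
        mv "$(dirname "$f")/.deleted/$(basename "$f")" "$f"
    done
}
```

The scavenger mentioned above might then be a nightly cron job using something like `find / -type d -name .deleted` to locate the tombs and reap their too-old contents; as noted, that full-filesystem search need not run continuously.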
asg@sage.cc.purdue.edu (The Grand Master) (05/08/91)
In article <1991May7.095912.17509@athena.mit.edu> jik@athena.mit.edu (Jonathan I. Kamens) writes:
}
} (I have addressed some of Bruce's points in my last posting, so I will not
} repeat here any point I have made there.)
}
} In article <11941@mentor.cc.purdue.edu>, asg@sage.cc.purdue.edu (The Grand Master) writes:
}|> environment on all workstations. If so, /var/preserve SHOULD
}|> exist on all workstations if it exists on any. Maybe you should make
} The idea of mounting one filesystem from one fileserver (which is what
} /var/preserve would have to be, if it were to look the same from any

Are you telling me that when you log in, you have to wait for your home directory to be mounted on the workstation you log in on? - This is absolutely horrid!!

However, my suggestion of PUCC's entomb (with a coupla mods) is very useful. Here it goes. First, you have a user named charon (yeah, the boatkeeper) which will be in control of deleted files. Next, at the top level of each filesystem you put a directory named tomb - in other words, instead of the jik directory being the only directory on your partition, there are two - jik and tomb. Next, you use PUCC's entomb library. You will need to make a few mods, but there should be little problem with that. The entomb library is full of functions (named unlink, link, etc.) which will be called from rm instead of the "real" unlink, etc., and which will, if necessary, call the real unlink, etc. What these functions will actually do is call on a process entombd (which runs suid to root - ohmigod) to move your files to the tomb directory. The current library does not retain directory structure, but that is little problem to fix. The important thing is that things are moved to the tomb directory that is on the same filesystem as the directory from which they are deleted. tomb is owned by charon, and is 700.
The companion program unrm can restore the files to their original location (note, in this case you do not necessarily have to be in the same directory from which you deleted them - though the files WILL be returned to the directory from which you deleted them). Unrm will only let you restore a file if you have read permission on the file, and write permission on the directory to which it will be restored. Just as important, since the ownership, permissions, and directory structure of the files will be kept, you still will not be able to look at files you are not authorized to look at. You no longer have to worry about moving files to a new filesystem. You no longer have to worry about looking at stupid .# files. And since preend(1) also takes care of cleaning out the tomb directories, you no longer need to search for them. Another nice thing is that preend is capable of specifying different times for different files. A few quotes from the PUCC man page on entomb:

    You can control whether or not your files get entombed with the
    ENTOMB environment variable:

    variable setting      action
    -------------------   ------------------------------------------------
    "no"                  no files are entombed
    "yes" (the default)   all files are entombed
    "yes:pattern"         only files matching pattern are entombed
    "no:pattern"          all files except those matching pattern are entombed

    .......

    If the file to be entombed is NFS mounted from a remote host, the
    entomb program would be unable to move it to the tomb because of the
    mapping of root (UID 0) to nobody (UID -2). Instead, it uses the RPC
    mechanism to call the entombd server on the remote host, which does
    the work of entombing.

    ..........

    Files destroyed by the library calls in the entombing library,
    libtomb.a, are placed in subdirectories on each filesystem.
    The preening daemon, preend, removes old files from these tombs. If
    the filesystem in question is less than 90% full, files are left in
    the tomb for 24 hours, minus one second for each two bytes of the
    file. If the filesystem is between 90 and 95% full, files last 6
    hours, again adjusted for file size. If the filesystem is between 95
    and 100% full, files last 15 minutes. If the filesystem is more than
    100% full, all files are removed at about 5 minute intervals. An
    exception is made for files named "a.out" or "core" and filenames
    beginning with a "#" or ending in ".o" or "~", which are left in the
    tomb for at most 15 minutes.

    ........

    The entombing library, libtomb.a, contains routines named creat,
    open, rename, truncate, and unlink that are call-compatible with the
    system calls of the same names, but which as a side effect may
    execute /usr/local/lib/entomb to arrange for the file in question to
    be entombed.

    The user can control whether or not his files get entombed with the
    ENTOMB environment variable. If there is no ENTOMB environment
    variable or if it is set to "yes", all files destroyed by rm, cp,
    and mv are saved. If the ENTOMB environment variable is set to "no",
    no files are ever entombed. In addition, a colon-separated list of
    glob patterns can be given in the ENTOMB environment variable after
    the initial "yes" or "no". A glob pattern uses the special
    characters `*', `?', and `[' to generate lists of files. See the
    manual page for sh(1) under the heading "Filename Generation" for an
    explanation of glob patterns.
          variable setting      action
          ________________________________________________________
          "no"                  no files are entombed
          "yes" (the default)   all files are entombed
          "yes:pattern"         only files matching pattern are
                                entombed
          "no:pattern"          all files except those matching
                                pattern are entombed

     If the ENTOMB environment variable indicates that the file
     should not be entombed, or if there is no tomb directory on
     the filesystem that contains the given file, the routines in
     this library simply invoke the corresponding system call.

---------------------------------

If this is not a full enough explanation, please contact me via email and
I will try to be more thorough.

}
}|> However, what Jon fails to point out is that one must remember
}|> where they deleted a file from with his method too. Say for example I do
}|> the following.
}|> $ cd $HOME/src/zsh2.00/man
}|> $ delete zsh.1
}|> Now later, when I want to retrieve zsh.1 - I MUST CHANGE DIRECTORIES
}|> to $HOME/src/zsh2.00/man. I STILL HAVE TO REMEMBER WHAT DIRECTORY I
}|> DELETED THE FILE FROM!!!! So you gain NOTHING by keeping the file in
}|> the directory it was deleted from. Or does your undelete program also
}|> search the entire damn directory structure of the system?
}
} Um, the whole idea of Unix is that the user knows what's in the file
}hierarchy. *All* Unix file utilities expect the user to remember where files

Not exactly true. Note this is the reason for the PATH variable, so that
you do not have to remember where every God-blessed command resides.

}
} How many deleted files do you normally have in a directory in any three-day
}period, or seven-day period, or whatever?

Often many - it depends on the day

}
}|> Say I do this
}|> $ ls -las
}|> 14055 -rw------- 1 wines 14334432 May  6 11:31 file12.dat
}|> 21433 -rw------- 1 wines 21860172 May  6 09:09 file14.dat
}|> $ rm file*.dat
}|> $ cp ~/new_data/file*.dat .
}|> [ note at this point, my directory will probably grow to a bigger
}|> size since there is now a full 70 Meg in one directory as opposed
}|> to the 35 Meg that should be there using John Navarra's method]
}
} First of all, the size of a directory has nothing to do with the size of the
}files in it. Only with the number of files in it. Two extra file entries in

Ok, you are right - I wasn't thinking here

}
}1. Copy 300meg of GIF files into /tmp.
}
}2. "rm" them all.
}
}3. Every day or so, "undelete" them into /tmp, touch them to update the
}   modification time, and then delete them.
}
}Now I'm getting away with using the preservation area as my own personal file
}space, quite possibly preventing other people from deleting files.

Well, I could copy 300meg of GIFs to /tmp and keep touching them every few
hours or so (say with a daemon I run from my crontab) and the effect would
be the same.

}
} Using $HOME/tmp avoids this problem, but (as I pointed out in my first

Yes it does, as does using filesystemroot:/tomb

}
} You could put quotas on the preserve directory. But the user's home
}directory already has a quota on it (if you're using quotas), so why not just
}leave the file in whatever filesystem it was in originally? Better yet, in

That is what entomb does!

} You can't get it back in the other system suggested either.

Some kind of revision control (though I am not sure how it works) is also
present with entomb.

}|> Well most of us try not to go mounting filesystems all over the place.
}|> Who would be mounting your home dir on /mnt?? AND WHY???
}
} In a distributed environment of over 1000 workstations, where the vast
}majority of file space is on remote filesystems, virtually all file access
}happens on mounted filesystems. A generalized solution to this problem must
}therefore be able to cope with filesystems mounted in arbitrary locations.

Well, then this is an absolute kludge.
How ridiculous to have to mount and unmount everyone's directory when they
log in/out. ABSURD! You would be better off to have a few powerful
centralized systems with Xwindow terminals instead of separate workstations.
In fact, what you have apparently makes it impossible for me to access any
other user's files that he might have purposefully left accessible unless
he is logged into the same workstation. Even if he puts some files in /tmp
for me, I HAVE TO LOG INTO THE SAME WORKSTATION HE WAS ON TO GET THEM!!
And if I am working on a workstation and 10 people happen to rlogin to it
at the same time, boy are my processes gonna keep smokin. No, the idea of
an Xterminal with a small processor to handle the Xwindows, and a large
system to handle the rest is MUCH MUCH more reasonable and functional.

}
} For example, let's say I have a NFS home directory that usually mounts on
}/mit/jik. But then I log into one of my development machines in which I have
}a local directory in /mit/jik, with my NFS home directory mounted on
}/mit/jik/nfs. This *happens* in our environment. A solution that does not
}deal with this situation is not acceptable in our environment (and will
}probably run into problems in other environments as well).

Well, in most environments (as far as I know) the average user is not
allowed to mount file systems.

}
}|> Is this system source code? If so, I really don't think you should be
}|> deleting it with your own account.
}
} First of all, it is not your prerogative to question the source-code access
}policies at this site. For your information, however, everyone who has write
}access to the "system source code" must authenticate that access using a
}separate Kerberos principal with a separate password. I hope that meets with
}your approval.

It is my prerogative to announce my opinion on whatever the hell I choose,
and it is not yours to tell me I cannot. Again this seems like a worthless
stupid kludge. What is next - a password so that you can execute ls?
}
}--
}Jonathan Kamens                           USnail:
}MIT Project Athena                        11 Ashford Terrace
}jik@Athena.MIT.EDU                        Allston, MA  02134
}Office: 617-253-8085                      Home: 617-782-0710

While I understand the merits of your system, I still argue that it is NOT
a particularly good one. I remove things so that I do not have to look at
them anymore. And despite your ravings at John, ls -a is not at all
uncommon. In fact I believe it is the default if you are root, is it not?
Most people I know DO use -a most of the time; in fact most have
alias ls 'ls -laF' or something of the like. And I do not like being
restricted from ever naming a file .#jikisdumb or whatever I wanna name it.

                        As Always,
                        The Grand Master
---------
Courtesy of Bruce Varney
aka -> The Grand Master
asg@sage.cc.purdue.edu
PUCC
;-)  ;'>
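The semantics of the ENTOMB settings table quoted above can be sketched in
Bourne shell. This is only an illustration of what the table says --
should_entomb and its parsing are invented for this sketch, not code from
the entomb package (the real check lives in libtomb.a, in C):

```shell
# Illustrative only: interpret an ENTOMB setting the way the PUCC
# man page describes.  Prints "yes" if the named file would be
# entombed, "no" otherwise.
should_entomb() {
    file=$1
    setting=${ENTOMB-yes}           # unset defaults to "yes"
    case "$setting" in
        yes) echo yes; return ;;
        no)  echo no;  return ;;
    esac
    verb=`echo "$setting" | sed 's/:.*//'`
    pats=`echo "$setting" | sed 's/^[^:]*://' | tr ':' ' '`
    match=no
    set -f                          # don't glob-expand the patterns
    for p in $pats; do
        case "$file" in             # case does the glob matching
            $p) match=yes ;;
        esac
    done
    set +f
    if [ "$verb" = yes ]; then      # "yes:pattern": entomb only matches
        echo $match
    else                            # "no:pattern": entomb all but matches
        if [ "$match" = yes ]; then echo no; else echo yes; fi
    fi
}

ENTOMB='no:*.o:core' ; should_entomb main.o    # prints no
ENTOMB='no:*.o:core' ; should_entomb notes.txt # prints yes
```

The set -f / set +f pair matters: without it, a pattern like *.o would be
expanded against the current directory before it ever reached case.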
jik@athena.mit.edu (Jonathan I. Kamens) (05/08/91)
In article <12021@mentor.cc.purdue.edu>, asg@sage.cc.purdue.edu (The Grand
Master) writes:

|> Are you telling me that when you log in, you have to wait for your home
|> directory to be mounted on the workstation you log in on?

Yes.

|> - This is
|> absolutely Horid!!

I would suggest, Bruce, that you refrain from commenting about things about
which you know very little. Your entire posting is filled with jibes about
the way Project Athena does things, when you appear to know very little
about *how* we do things or about the history of Project Athena.

I doubt that DEC and IBM would have given Athena millions of dollars over
more than seven years if they thought it was a "kludge". I doubt that
universities and companies all over the world would be adopting portions
of the Project Athena environment if they thought it was a "kludge". I
doubt DEC would be selling a bundled "Project Athena workstation" product
if they thought it was a "kludge". I doubt the OSF would have accepted
major portions of the Project Athena environment in their DCE if they
thought it was a "kludge".

You have the right to express your opinion about Project Athena. However,
when your opinion is based on almost zero actual knowledge, you just end
up making yourself look like a fool. Before allowing that to happen any
more, I suggest you try to find out more about Athena. There have been
several articles about it published over the years, in journals such as
the CACM.

You also seem to be quite in the dark about the future of distributed
computing. The computer industry has recognized for years that personal
workstations in distributed environments are becoming more popular. I
have more computing power under my desk right now than an entire machine
room could hold ten years ago.
With the entire computing industry moving towards distributed environments,
you assert that Project Athena, the first successful large-scale
distributed computing environment in the world, would be better off "to
have a few powerful centralized systems with Xwindow terminals instead of
separate workstations." Whatever you say, Bruce; perhaps you should try to
convince DEC, IBM, Sun, HP, etc. to stop selling workstations, since the
people buying them would obviously be better off with a few powerful
centralized systems.

|> Next, at the top level of each filesystem you put a directory named
|> tomb - in other words, instead of the jik directory being the only
|> directory on your partition, there are two - jik and tomb.

"Filesystems" are arbitrary in our environment. I can mount any AFS
directory as a "filesystem" (although AFS mounts are achieved using
symbolic links, the filesystem abstraction is how we keep NFS and AFS
filesystems parallel to each other). Furthermore, I can mount any
*subdirectory* of any NFS filesystem as a filesystem on a workstation, and
the workstation has no way of knowing whether that directory really is the
top of a filesystem on the remote host, or of getting to the "tomb"
directory you propose.

As I think I've already pointed out now twice, we considered what you're
proposing when we designed Athena's "delete". But we also realized that in
a generalized environment that allows arbitrary mounting of filesystems,
top-level "tomb" or ".delete" or whatever directories just don't work, and
they degenerate into storing deleted files in each directory. If your site
uses "a few powerful centralized systems" and does not allow mounting as
we do, then your site can use the entomb stuff. But it just doesn't cut it
in a large-scale distributed environment, which is the point I've tried to
make in my previous two postings (and in this one).
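For concreteness, the per-directory scheme the thread keeps coming back to
-- renaming a "deleted" file to a hidden .# name in place, as the shell
scripts at the top of the thread do -- amounts to something like this
(del and undel are illustrative names, not the posted scripts):

```shell
# Sketch of the per-directory scheme: "delete" renames a file to a
# hidden .# name in the same directory; "undelete" renames it back.
# A periodic administrator job would later remove .#* files for real.
del() {
    for f in "$@"; do
        d=`dirname "$f"` ; b=`basename "$f"`
        mv "$f" "$d/.#$b"
    done
}

undel() {
    for f in "$@"; do
        d=`dirname "$f"` ; b=`basename "$f"`
        mv "$d/.#$b" "$f"
    done
}
```

Note that this inherits exactly the property under dispute: the deleted
copy stays on the same filesystem, in the same directory, so restoring a
file requires remembering where it was deleted from.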
In any case, mounting user home directories on login takes almost no time
at all; I just mounted a random user directory via NFS and it took 4.2
seconds. That 4.2 seconds is well worth it, considering that they can
access their home directory on any of over 1000 workstations, any of which
is probably as powerful as one of your "powerful centralized systems."

|> What these functions will actually do is
|> call on a process entombd (which runs suid to root - ohmigod) to move
|> your files to the tomb directory.

One more time -- setuid does not work with authenticated filesystems, even
when moving files on the same filesystem. Your solution will not work in
our environment. I do not know how many times I am going to have to repeat
it before you understand it.

|> variable setting      action
|> "no"                  no files are entombed
|> "yes" (the default)   all files are entombed
|> "yes:pattern"         only files matching pattern are entombed
|> "no:pattern"          all files except those matching pattern
|>                       are entombed

Very nice. I could implement this in delete if I wanted to; this does not
seem specific to the issues we are discussing (although it's a neat
feature, and I'll have to consider it when I have time to spend on
developing delete).

|> If the file to be entombed is NFS mounted from a remote
|> host, the entomb program would be unable to move it to the
|> tomb because of the mapping of root (UID 0) to nobody (UID
|> -2). Instead, it uses the RPC mechanism to call the entombd
|> server on the remote host, which does the work of entombing.

We considered this too, and it was rejected because of the complexity
argument I mentioned in my last posting.
Your daemon has to be able to figure out what filesystem to call via RPC,
using gross stuff to figure out mount points. Even if you get it to work
for NFS, you've got to be able to do the same thing for AFS, or for RVD,
which is the other file protocol we use. And when you add new file
protocols, your daemon has to be able to understand them to know whom to
direct the remote RPC to. Not generalized. Not scalable.

Furthermore, you have to come up with a protocol for the RPC requests.
Not difficult, but not easy either. Furthermore, the local entombd has to
have some way of authenticating to the remote entombd. In an environment
where root is secure and entombd can just use a reserved port to transmit
the requests, this isn't a problem. But in an environment such as
Athena's, where anyone can hook up a PC or Mac or workstation to the
network and pretend to be root, or even log in as root on one of our
workstations (our public workstation root password is "mroot"; enjoy it),
that kind of authentication is useless.

No, I'm not going to debate with you why people have root access on our
workstations. I've done that flame once, in alt.security shortly after it
was created. I'd be glad to send via E-mail, to anyone who asks, every
posting I made during that discussion. But I will not debate it again
here; in any case, it is tangential to the subject currently being
discussed.

By the way, the more I read about your entomb system, the more I think
that it is a clever solution to the problem it was designed to solve. It
has lots of nice features, too. But it is not appropriate for our
environment.

|> } Um, the whole idea of Unix is that the user knows what's in the file
|> }hierarchy. *All* Unix file utilities expect the user to remember where files
|>
|> Not exactly true. Note this is the reason for the PATH variable, so that
|> you do not have to remember where every God-blessed command resides.

Running commands is different from manipulating files.
There are very few programs which manipulate files that allow the user to
specify a filename and know where to find it automatically. And those
programs that do have this functionality do so by either (a) always
looking in the same place, or (b) looking in a limited path of places
(TEXINPUTS comes to mind). I don't know of any Unix program which, by
default, takes the filename specified by the user and searches the entire
filesystem looking for it. And no, find doesn't count, since that's the
one utility that was specifically designed to do this, since nothing else
does (although even find requires that you give it a directory to start
in).

|> Well, I could copy 300meg of GIFs to /tmp and keep touching them
|> every few hours or so (say with a daemon I run from my crontab) and
|> the effect would be the same.

You could, but I might not keep 300meg of space in my /tmp partition,
whereas I would probably want to keep as much space as possible free in
my entomb partitions, so that deleted files would not be lost prematurely.

|> Well, then this is an absolute kludge. How ridiculous to have to mount and
|> unmount everyone's directory when they log in/out. ABSURD!

See above. What you are calling "ABSURD" is pretty much accepted as the
wave of the future by almost every major workstation manufacturer and OS
developer in the world. Even the Mac supports remote filesystem access at
this point.

How else do you propose a network of 1000 workstations deal with all the
users' home directories? Oh, I forgot, you don't think anyone should need
to have a network of 1000 workstations. Right, Bruce.

|> In fact, what you have apparently makes it impossible for me to access
|> any other user's files that he might have purposefully left accessible
|> unless he is logged into the same workstation.

No, we have not.
As I said above, you don't know what you're talking about, and making
accusations at Project Athena when you haven't even bothered to try to
find out if there is any truth behind the accusations is unwise at best,
and foolish at worst.

Project Athena provides "attach", an interface to mount(2) which allows
users to mount any filesystem they want, anywhere they want (at least,
anywhere that is not disallowed by the configuration file for "attach").
All someone else has to do to get to my home directory is type "attach
jik". Do not assume that Project Athena is like Purdue and then decide
what we do on that basis. Project Athena is unlike almost any other
environment in the world (although there are a few that parallel it, such
as CMU's Andrew system).

|> And if I am working on a workstation and 10 people happen to rlogin
|> to it at the same time, boy are my processes gonna keep smokin.

Workstations on Project Athena are private. One person, one machine
(there are exceptions, but they are just that, exceptions).

|> No the idea of an Xterminal with a small processor to handle the
|> Xwindows, and a large system to handle the rest is MUCH MUCH more reasonable
|> and functional.

You don't know what you're talking about. Project Athena *used to be*
composed of several large systems connected to many terminals. Users
could only log in on the cluster nearest the machine they had an account
on, and near the end of the term, every machine on campus was unusable
because the loads were so high. Now, we can end up with 1000 people
logged in at a time on workstations all over campus, and the performance
is still significantly better than it was before we switched to
workstations.

|> It is my prerogative to announce my opinion on whatever the hell I choose,
|> and it is not yours to tell me I cannot. Again this seems like a worthless
|> stupid kludge. What is next - a password so that you can execute ls?
You asserted that we should not be writing to system source code with our
own accounts. I responded by pointing out that, in effect, we are not. We
simply require separate Kerberos authentication, rather than a completely
separate login, to get source write access. Now you respond by saying
that that authentication is wrong, when it is in fact what you implied we
should be doing in the first place.

--
Jonathan Kamens                           USnail:
MIT Project Athena                        11 Ashford Terrace
jik@Athena.MIT.EDU                        Allston, MA  02134
Office: 617-253-8085                      Home: 617-782-0710
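The preend expiration schedule quoted from the PUCC man page earlier in
the thread reduces to a small table lookup. A sketch follows; lifetime
and its arguments are invented names for illustration -- the real preend
is a C daemon inside the entomb package, not a script:

```shell
# Sketch of preend's expiration rules: print how many seconds a file
# would survive in the tomb, given the filesystem's fullness (the
# percentage df reports), the file's size in bytes, and its basename.
lifetime() {
    pct=$1 ; size=$2 ; name=$3
    case "$name" in
        a.out|core|\#*|*.o|*~)      # junk names: at most 15 minutes
            echo 900 ; return ;;
    esac
    if [ "$pct" -lt 90 ]; then
        base=86400                  # under 90% full: 24 hours
    elif [ "$pct" -lt 95 ]; then
        base=21600                  # 90-95% full: 6 hours
    elif [ "$pct" -le 100 ]; then
        echo 900 ; return           # 95-100% full: 15 minutes
    else
        echo 0 ; return             # over 100%: next 5-minute sweep
    fi
    life=`expr $base - $size / 2`   # minus one second per two bytes
    if [ "$life" -lt 0 ]; then life=0 ; fi
    echo $life
}

lifetime 50 7200 data.txt           # prints 82800 (24h minus 3600s)
```

The "one second for each two bytes" adjustment means big files expire
sooner, which is presumably the point: they are the ones crowding the disk.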
rjc@oghma.ocunix.on.ca (Robert J Carter) (05/08/91)
In article <12021@mentor.cc.purdue.edu> asg@sage.cc.purdue.edu (The Grand
Master) writes:

>} First of all, it is not your prerogative to question the source-code access
>}policies at this site. For your information, however, everyone who has write
>}access to the "system source code" must authenticate that access using a
>}separate Kerberos principal with a separate password. I hope that meets with
>}your approval.
>
>It is my prerogative to announce my opinion on whatever the hell I choose,
>and it is not yours to tell me I cannot. Again this seems like a worthless
>stupid kludge. What is next - a password so that you can execute ls?
>}

This WAS a real interesting thread, but it's going downhill - is there any
chance you two can keep the personalities out of it, and get on with the
discussion?

--
|=================================================================| ttfn!
| Robert J Carter           Oghma Systems        Ottawa, Ontario  |
| Phone: (613) 565-2840                                           |  @ @
| Fax:   (613) 565-2840 (Phone First)   rjc@oghma.ocunix.on.ca    |  * *
|=================================================================| \_____/