dhinds@portia.Stanford.EDU (David Hinds) (12/02/90)
I'm not sure if this is strictly an Irix problem, but I certainly seem to have found a file system bug. After we had to power down our machine a few days ago to restart it, the resulting fsck activity produced a directory entry in /lost+found that does not point to anything. A file called '000258' shows up if I do 'ls', but 'ls -l' complains that the file does not exist. I have done everything I could think of to get rid of this dangling entry - 'rm', 'unlink', etc. all fail. I can't create another file of the same name on top of it.

I am completely at a loss - I suspect that since 'fsck' made it, it won't be able to undo the damage. This is a real pain because any recursive search of the directory tree returns errors - 'find', 'bru', etc. all return failure statuses - which is screwing up all sorts of shell scripts. How do I get rid of this dang thing?

-David Hinds
dhinds@cb-iris.stanford.edu
srp@babar.mmwb.ucsf.edu (Scott R. Presnell) (12/02/90)
dhinds@portia.Stanford.EDU (David Hinds) writes:

>few days ago to restart it, the resulting fsck activity produced a directory
>entry in /lost+found that does not point to anything. A file called '000258'
>shows up if I do 'ls', but 'ls -l' complains that the file does not exist.
>I have done everything I could think of to get rid of this dangling entry -
>'rm', 'unlink', etc. all fail. I can't create another file of the same name

[...]

>How do I get rid of this dang thing?

I had the exact same thing happen to me. While not an optimal fix, another reboot (and implicit fsck) cleared away the reference.

- Scott
--
Scott Presnell                   +1 (415) 476-9890
Pharm. Chem., S-926              Internet: srp@cgl.ucsf.edu
University of California         UUCP: ...ucbvax!ucsfcgl!srp
San Francisco, CA. 94143-0446    Bitnet: srp@ucsfcgl.bitnet
bh@sgi.com (Bent Hagemark) (12/05/90)
In article <srp.660086900@babar.mmwb.ucsf.edu> srp@babar.mmwb.ucsf.edu (Scott R. Presnell) writes:

>dhinds@portia.Stanford.EDU (David Hinds) writes:
>
>>few days ago to restart it, the resulting fsck activity produced a directory
>>entry in /lost+found that does not point to anything. A file called '000258'
>>shows up if I do 'ls', but 'ls -l' complains that the file does not exist.
>>I have done everything I could think of to get rid of this dangling entry -
>>'rm', 'unlink', etc. all fail. I can't create another file of the same name
>
>[...]
>
>>How do I get rid of this dang thing?
>
>I had the exact same thing happen to me. While not an optimal fix, another
>reboot (and implicit fsck) cleared away the reference.
>
> - Scott

Yes, fsck properly clears the reference, and yes nothing else can. The problem is that the directory entry refers to an inode which is deallocated. The kernel EFS code can't even namei() this name much less unlink(2), open(2)...

The bug which creates such errant directory entries has been fixed and is available in The Next Release.

Bent
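The symptom described in this thread (the name shows up in a plain 'ls', but anything that has to look the name up fails) can be checked for mechanically. What follows is only a minimal sketch in Python, not anything posted in the thread: the function name find_dangling_entries and its lstat parameter are made up for illustration, and it assumes that a dangling entry makes lstat() fail with ENOENT, which matches the 'ls -l' complaint reported above.

```python
import os


def find_dangling_entries(directory, lstat=os.lstat):
    """Return names that the directory lists but that cannot be looked up.

    os.listdir() only reads the directory itself, the way a plain 'ls' does.
    lstat() has to resolve each name to an inode, the way 'ls -l' does.
    The lstat parameter is a hypothetical hook so the check can be exercised
    on a healthy file system.
    """
    dangling = []
    for name in sorted(os.listdir(directory)):
        try:
            lstat(os.path.join(directory, name))
        except FileNotFoundError:
            # Assumption: the name is present in the directory, but the
            # kernel reports ENOENT when asked to look it up.
            dangling.append(name)
    return dangling
```

Calling find_dangling_entries("/lost+found") on an affected system would be expected to return names such as '000258'. A script could use the result to skip those names rather than fail outright. It only reports; as noted above, nothing at this level can remove the entry.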
sgf@cs.brown.edu (Sam Fulcomer) (12/05/90)
In article <1990Dec5.032329.5531@odin.corp.sgi.com> bh@sgi.com (Bent Hagemark) writes:

>
>Yes, fsck properly clears the reference, and yes nothing else can.

Well, let's be fair now... It wouldn't be hard to edit the directory (by some means other than fsck) directly through the disk driver. It's just that, since most OS implementations somewhat prudishly disallow write() to dir, no other approach can work. (except perhaps a good, swift kick)

_/**/Sam
vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (12/07/90)
In article <1990Dec5.032329.5531@odin.corp.sgi.com>, bh@sgi.com (Bent Hagemark) writes:

>
> Yes, fsck properly clears the reference, and yes nothing else can.
> The problem is that the directory entry refers to an inode which
> is deallocated. The kernel EFS code can't even namei() this name much
> less unlink(2), open(2)...
>
> The bug which creates such errant directory entries has been fixed and
> is available in The Next Release.

In 3.3.1 and 3.3.2 you can sometimes get bogus files in lost+found that you cannot get rid of, and that fsck refuses to destroy. Just this morning, a couple of previously vital inodes in / were turned into such zombies on my personal workstation by a probable hardware failure in a VME network board. Similar problems have happened to sgi.sgi.com.

My solution is to use explosives. Unlink the node (with unlink not rm), clri the inode (having correctly determined the i-number and special device name), and then reboot. This morning, that did not work because the mini-root kernel would hang trying to update the completely bogus inode, so on the zillionth reboot, I clri'ed them and then pushed the reset button.

Please note that this sort of deletion is effective but NOT recommended. A typo can leave you cursing while you look for backup tapes.

Fsck for The Next Release continues to be improved, so it might be able to kill more such zombies.

Vernon Schryver, vjs@sgi.com
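The clri step above depends on having the right i-number and the right device, and a stat() of the bogus name may not be available to supply them. As a hedged sketch only (the helper names directory_inumbers and device_number are invented here, and nothing in it runs unlink or clri), the i-number can be read from the directory entry itself, much as 'ls -i' reports it, and the device can be taken from the containing directory. This assumes Python's os.scandir(), whose DirEntry.inode() on Unix returns the number stored in the directory entry without a further system call.

```python
import os


def directory_inumbers(directory):
    """Map each name in `directory` to the i-number recorded in its entry.

    The numbers come from the directory entries as read by os.scandir(),
    so no per-name lookup is needed to obtain them.
    """
    with os.scandir(directory) as entries:
        return {entry.name: entry.inode() for entry in entries}


def device_number(directory):
    """Return (major, minor) of the device holding `directory`.

    Matching this pair to a special device name under /dev is left to the
    operator; this sketch does not attempt it.
    """
    st = os.stat(directory)
    return os.major(st.st_dev), os.minor(st.st_dev)
```

Printing both values and checking them by hand against 'ls -i' before going anywhere near clri is the point of the exercise, since a mistyped i-number is exactly the typo warned about above.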