[comp.unix.wizards] find: bad status-- /u/tu

bob@rush.howp.com (Bob Ames) (03/29/89)

Anybody seen this?

3.51# find /u -print
[...]
/u/tutor
/u/tutor/Filecabinet
[...]
/u/tutor/Filecabinet/Profiles/9600bps:A2
find: bad status-- /u/tutor/Filecabinet/practice/sample6.clf
find: bad status-- /u/tutor/Filecabinet/practice/sample5.clf
[...]
find: bad status-- /u/tutor/Filecabinet/practice/sample1.clf
/u/tutor/Wastebasket
/u/tutor/Clipboard
/u/tutor/.history
find: bad status-- /u/hello
find: bad status-- /u/dummy
find: bad status-- /u/someone
[...]
3.51#

The bad status messages continue for the rest of the /u directory.
The only command which seems to be bothered at all with the /u
directory is find.  I can cd, ls, od ., anything.  All the files
are there and intact.  The only reason this error was discovered
is that the Tape Backup software gave the same errors during the
"Checking the file systems" phase of the "Complete Backup".

I did an od -cx /u and the output looked somewhat reasonable for a
directory.  There's plenty of space left on the disk.  I tried
rebooting.  It seems like permissions, but I'm root.

This all started after we brought the machine back from ATT for a
new power supply |-)

Bob

Bob Ames  The National Organization for the Reform of Marijuana Laws, NORML 
"Pot is the world's best source of complete protein, alcohol fuel, and paper,
is the best fire de-erosion seed, and is america's largest cash crop," USDA
bob@rush.cts.com or ncr-sd!rush!bob@nosc.mil or rutgers!ucsd!ncr-sd!rush!bob
619-743-2546 "We each pay a fabulous price for our visions of paradise," Rush

wescott@ncrcae.Columbia.NCR.COM (Mike Wescott) (03/31/89)

In article <948@rush.howp.com> bob@rush.howp.com (Bob Ames) writes:
> /u/tutor/Filecabinet/Profiles/9600bps:A2
> find: bad status-- /u/tutor/Filecabinet/practice/sample6.clf
> /u/tutor/.history
> find: bad status-- /u/hello

> The bad status messages continue for the rest of the /u directory.

One of the directories that find goes through right before the "bad status"
messages start probably has a ".." that is not the parent that find used
to get to the directory.

Try this in /u and some other directories:

	ls -id . */..

Each time you should get the same inode number for each directory.  Any
mismatch causes find to take the wrong path in trying to get back to where
it started from.

You can use /etc/link and /etc/unlink to fix.
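
A minimal sketch of the same check as a POSIX shell function, for testing one directory at a time. The name check_dotdot is made up for this sketch; it assumes only ls and awk, and like the ls -id test it compares inode numbers only.

```shell
#!/bin/sh
# check_dotdot DIR: report each subdirectory of DIR whose ".." does not
# resolve to DIR itself. (Hypothetical helper name, not a system utility.)
check_dotdot() {
    parent=$1
    # Inode number of the parent directory itself.
    want=$(ls -id "$parent/." | awk '{ print $1 }')
    rc=0
    for sub in "$parent"/*/; do
        [ -d "$sub" ] || continue      # the pattern matched nothing
        # Inode number reached by going through the subdirectory's "..".
        got=$(ls -id "${sub}.." | awk '{ print $1 }')
        if [ "$got" != "$want" ]; then
            echo "mismatch: ${sub}.. is inode $got, expected $want"
            rc=1
        fi
    done
    return $rc
}
```

Run as, say, check_dotdot /u: it prints nothing and returns 0 when every subdirectory leads back to /u. A symbolic link to a directory elsewhere is reported as well, since its ".." belongs to the link's target.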

kremer@cs.odu.edu (Lloyd Kremer) (04/01/89)

In article <948@rush.howp.com> bob@rush.howp.com (Bob Ames) writes:

>3.51#
> /u/tutor/Filecabinet/Profiles/9600bps:A2
> find: bad status-- /u/tutor/Filecabinet/practice/sample6.clf
> /u/tutor/.history
> find: bad status-- /u/hello
> The bad status messages continue for the rest of the /u directory.
>This all started after we brought the machine back from ATT for a
>new power supply |-)

In article <1287@jhunix.HCF.JHU.EDU> ins_anmy@jhunix.HCF.JHU.EDU (Norman Yarvin) writes:

>I have exactly the same problem with my filesystem in the neighborhood of
> /usr (in places where I have made changes.)  "find" seems to be the only
>program affected, but I use "find".  I'm also using version 3.51 of the
>operating system.
>Yes, I got a new power supply too. :-?

In article <4341@ncrcae.Columbia.NCR.COM> wescott@ncrcae.Columbia.NCR.COM (Mike Wescott) writes:

>Try this in /u and some other directories:
>	ls -id . */..
>Each time you should get the same inode number for each directory.  Any
>mismatch causes find to take the wrong path in trying to get back to where
>it started from.
>You can use /etc/link and /etc/unlink to fix.

Yes, and after unlinking and linking the relevant directories, and everything
appears to be correct in the 'ls -id' tests, it would be very wise to unmount
and 'fsck -y -D' the affected filesystem.  Your repairs may be incomplete,
or the filesystem may have other problems of which you are not (yet) aware.
It may avoid some nasty surprises in the future.

Also, is there some correlation between power supply replacement and
filesystem corruption?  I sync there might be.  :-)

					Lloyd Kremer
					{uunet,sun,...}!xanth!kremer

cks@ziebmef.uucp (Chris Siebenmann) (04/04/89)

In article <8310@xanth.cs.odu.edu> kremer@cs.odu.edu (Lloyd Kremer) writes:
...
| Yes, and after unlinking and linking the relevant directories, and everything
| appears to be correct in the 'ls -id' tests, it would be very wise to unmount
| and 'fsck -y -D' the affected filesystem.  Your repairs may be incomplete,
| or the filesystem may have other problems of which you are not (yet) aware.
| It may avoid some nasty surprises in the future.

 Please *don't* ever use 'fsck -y' in any shape or form; bad things
can happen (people wanting details of how to take it out of /etc/rc
can send me email). Note that you should pay careful attention to
fsck's error messages, and just because your 3B1 boots doesn't mean
you don't have a scrambled directory; there is at least one
circumstance where fsck can detect a problem and give a message, but
not fix it and not exit with any error status. It happened to me, and
here is the story:

[This happened around the beginning of March, just after Jim Joyce had
given a talk here about data recovery from crashed disks. The Ziebmef
is a 3B1.]

 Early this morning (around 4am) the Ziebmef's disk got corrupted,
followed shortly afterwards by the system crashing. When I discovered
this around 8am, I decided to boot off my floppy boot disk and fsck the
HD manually (just as a precaution, after hearing Jim Joyce's talk about
data recovery).

 Imagine my shock and horror when a stream of 'DUP/BAD INODE' messages
started scrolling across the screen, accompanied by:

  DUP/BAD INODE=xxxxx OWNER=xxxx MODE=10644
  FILE=<something important>
  ...  
  CLEAR?
 
 By answering no instead of yes, I was actually able to salvage most
of the files, and at least see what the other missing ones were (such
things as compress ... bad news for the news unbatcher).

 There were also a lot of lost files; in fact, too many lost files to
all fit into lost+found at the same time. Of course, if fsck runs out
of space in lost+found, its default action is to delete the file;
completely the wrong thing to do in most circumstances (including this
one, as many of the lost files turned out to be expired news articles
that could be safely deleted after being looked at). 

 I managed to recover and clean up most everything by successive cycles
of
	fsck
	mount the HD and poke around inspecting & cleaning up stuff
	unmount drive
	fsck again
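
One pass of that cycle might be sketched as a shell function. This is only an illustration: run, repair_cycle and the DRY_RUN switch are names invented here, the device and mount point are whatever the caller passes in, and by default the commands are printed rather than executed. fsck is deliberately run without -y so that every question is answered by hand.

```shell
#!/bin/sh
# run CMD...: print the command when DRY_RUN is unset or 1, else run it.
# (Hypothetical helper for this sketch.)
run() {
    if [ "${DRY_RUN:-1}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# repair_cycle DEVICE MOUNTPOINT: one check/inspect/check pass.
repair_cycle() {
    dev=$1
    mnt=$2
    run fsck "$dev"             # interactive: answer each question yourself
    run mount "$dev" "$mnt"
    # In real use, stop here and do the inspection by hand before going on.
    echo "inspect and clean up under $mnt, then unmount"
    run umount "$dev"
    run fsck "$dev"             # check again after the cleanup
}
```

The cycle would be repeated until fsck stops finding anything new to complain about.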

 This didn't manage to get everything, though; there were a couple of
directories too scrambled for fsck to deal with that I had to zap with
ncheck and clri. Of course, fsck reported 'success' when these
directories were still scrambled.

 If I had simply hit the hardware reboot switch and let the default
3B1 /etc/rc take over (it does a 'fsck -y' when problems are detected)
I would have
	a. lost some important unrecoverable files claimed to be scrambled,
	b. lost some important executables without knowing about it,
	c. had several important lost files deleted because lost+found
	   was full up with expired news articles,
	d. and wound up with a disk with potentially deadly directory
	   problems that /etc/rc thought was fine.

 Instead I was able to recover with remarkably few things gone for
good; most of what I couldn't save I managed to restore off various
forms of backup and master disks.
 
 Before this, I thought there wasn't much a non-guru could do except
'fsck -y'; now I know exactly how wrong I was. Needless to say, the
Ziebmef's /etc/rc no longer has an 'fsck -y' in it; even if I can't do
anything more than the equivalent of an 'fsck -y', I'll at least find
out what my losses are. 



-- 
"He recognized her; said that he remembered her from when he'd been a
 child; expressed surprise she was still alive; suggested novel ways to 
 remedy that fact..."
Chris Siebenmann		uunet!{utgpu!moore,attcan!telly}!ziebmef!cks
cks@ziebmef.UUCP	     or	.....!utgpu!{,ontmoh!,ncrcan!brambo!}cks

michi@anvil.oz (Michael Henning) (04/07/89)

In article <4341@ncrcae.Columbia.NCR.COM>, wescott@ncrcae.Columbia.NCR.COM (Mike Wescott) writes:
> In article <948@rush.howp.com> bob@rush.howp.com (Bob Ames) writes:
> > /u/tutor/Filecabinet/Profiles/9600bps:A2
> > find: bad status-- /u/tutor/Filecabinet/practice/sample6.clf
> > /u/tutor/.history
> > find: bad status-- /u/hello
> 
> > The bad status messages continue for the rest of the /u directory.
> 
> One of the directories that find goes through right before the "bad status"
> messages start probably has a ".." that is not the parent that find used
> to get to the directory.
> 

This is the case if there is a symbolic link to a directory. Try the
following:

	1) Make a symbolic link to a directory

	2) cd to the directory the symbolic link is pointing to, going
	   through the link

	3) cd ..

If you now do a pwd, it will tell you that you are back where you came
from, but an ls will show that you are now in the directory which is the
parent of the directory the symbolic link is pointing to. I believe that
this problem is caused by the shell which keeps the current directory
in a string and manipulates it by modifying that string (i.e. when you
use cd .., the shell removes the last component of the cwd string) instead
of doing a getcwd() every time.
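
The two views can be seen side by side in a present-day POSIX shell, where cd and pwd accept -L (use the shell's own path string) and -P (use the real directory structure). The sketch below assumes such a shell, not the 1989 shells discussed here; the function name and the directory names parent_a, parent_b, real and link are invented for the demonstration.

```shell
#!/bin/sh
# Build a scratch tree with a symlink, then take ".." both ways.
demo_symlink_dotdot() {
    scratch=$(mktemp -d) || return 1
    mkdir -p "$scratch/parent_a/real" "$scratch/parent_b"
    ln -s "$scratch/parent_a/real" "$scratch/parent_b/link"
    (
        cd "$scratch/parent_b/link" || exit 1
        # Physical: follow the directory's own ".." entry.
        cd -P .. || exit 1
        basename "$(pwd -P)"
    )
    (
        cd "$scratch/parent_b/link" || exit 1
        # Logical: strip the last component of the shell's cwd string.
        cd -L .. || exit 1
        basename "$(pwd -L)"
    )
    rm -rf "$scratch"
}
```

Calling demo_symlink_dotdot prints parent_a and then parent_b: the first is where the directory's ".." really leads, the second is where the shell's string says you are.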

My question is, has that problem been solved on BSD systems, and does the find
command work correctly with symbolic links under BSD ?  We are also plagued
by the bad status messages here, and I would like to know whether I should
get stuck into the vendor for a bug fix (we are running AIX 2.2.1).

					Michi.

-- 
               | The opinions expressed are my own, not those of my employer. |
               |                                                              |
               | Michael (Michi) Henning                                      |
               | - We have three Michaels here, that's why they call me Michi |

guy@auspex.auspex.com (Guy Harris) (04/09/89)

>My question is, has that problem been solved on BSD systems, and does the find
>command work correctly with symbolic links under BSD ?

BSD's "find" will not follow symbolic links, so yes, it works correctly
with symbolic links, in that sense.  S5R4's will probably have an option
indicating whether it should follow symbolic links or not, and will
probably be set up to Do The Right Thing if it follows them.
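
POSIX later standardized this choice as the -H and -L options of find, with not following links as the default. The sketch below assumes a present-day POSIX find, not the 1989 utilities discussed in this thread; the function name and the names tree, outside, dir, file and link are invented for the demonstration.

```shell
#!/bin/sh
# Walk the same tree twice: once without following the link, once with -L.
demo_find_symlinks() {
    scratch=$(mktemp -d) || return 1
    mkdir -p "$scratch/tree" "$scratch/outside/dir"
    : > "$scratch/outside/dir/file"
    ln -s "$scratch/outside/dir" "$scratch/tree/link"
    # Default: the link is listed as an entry but not descended into.
    (cd "$scratch/tree" && find . -print | LC_ALL=C sort)
    echo "--"
    # -L: the link is followed, so the file behind it is listed too.
    (cd "$scratch/tree" && find -L . -print | LC_ALL=C sort)
    rm -rf "$scratch"
}
```

The first listing contains only . and ./link; the second also contains ./link/file.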