jch@devvax.tn.cornell.edu (Jeffrey C Honig) (10/08/90)
I'm having strange behaviour on my SS1+ with external SCSI drives. The system is configured with two internal Sun drives, 3 external HP 660M (formatted), a Sun 150M QIC tape and an Exabyte with a bus terminator. All the external cables are 2' long, keeping us under the maximum length. I am currently not using the internal drives; this SS1+ is playing server until our real SS1+ server arrives.

What happens is that some file gets modified. This usually shows up as an executable that starts dumping core. Comparing it to an identical copy from backups or another system shows definite differences. But the strange part is that eventually the problem goes away, although it may require a reboot, and the file is back to normal. No messages are printed.

I would suspect a parity error, but that should be caught by error checking, shouldn't it? That leaves me to suspect the embedded controller on one of the HP disks, but the problem is not limited to one disk. I guess it could also be software. Does SunOS do much caching of disk data?

Has anyone seen this or a similar problem? Can anyone suggest any solutions?

Thanks.

Jeff
hedrick@athos.rutgers.edu (Charles Hedrick) (11/02/90)
>What happens is that some file gets modified. This usually shows up as an
>executable that starts dumping core. Comparing it to an identical copy
>from backups or another system shows definite differences. But the
>strange problem is that eventually the problem goes away, although it may
>require a reboot, and the file is back to normal.

Sounds to me like the dreaded "confused file problem". One block of the file, typically the first one, turns into the corresponding block of a different file. Typically it happens to files accessed via NFS, but we've seen it in local files too. There is supposedly a fix available from Sun.
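
One way to check whether a damaged file matches the "one block replaced" pattern described above is to compare it block by block against a known-good copy and see which block numbers differ. The following is only a rough sketch, not anything from Sun: the function name `differing_blocks` is made up for illustration, and the 8192-byte block size is an assumption that should be adjusted to match the filesystem in question.

```python
import sys

# Assumption: 8 KB filesystem blocks. Adjust to match the real filesystem.
BLOCK_SIZE = 8192


def differing_blocks(path_a, path_b, block_size=BLOCK_SIZE):
    """Return the zero-based indices of the blocks that differ between two files.

    A block that exists in only one file (because the files have
    different lengths) is counted as differing.
    """
    diffs = []
    with open(path_a, "rb") as fa, open(path_b, "rb") as fb:
        index = 0
        while True:
            a = fa.read(block_size)
            b = fb.read(block_size)
            if not a and not b:
                # Both files are exhausted.
                break
            if a != b:
                diffs.append(index)
            index += 1
    return diffs


if __name__ == "__main__":
    # Usage: python blockcmp.py suspect_file known_good_copy
    if len(sys.argv) != 3:
        sys.exit("usage: blockcmp.py SUSPECT GOOD")
    blocks = differing_blocks(sys.argv[1], sys.argv[2])
    if blocks:
        print("differing blocks:", blocks)
    else:
        print("files are identical")
```

If only block 0 is reported, that would be consistent with the first-block symptom described in the reply; differences scattered through the file would point elsewhere.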