[comp.sys.sun] weird uucico bug

lai%vedge.UUCP@larry.mcrcim.mcgill.edu (David Lai) (11/11/88)

We are running uucp on a sun3, OS 3.5.  It had run fine for about a year
with no major problems... until suddenly, around 6PM Nov 8, uucico refused
to run.  This lasted until about 2PM Nov 9, when everything mysteriously
went back to normal.

As far as I can tell
	- nothing 'strange' happened between 6PM and 2PM next day
		(ie. no changes to L.sys, password files, cron, uucp
		 directory permissions, kernel rebuilds, etc...)
	- all other programs ran fine (including tip to the modems)
	- I even rebooted just to make sure nothing was hanging in
		the kernel.
	- I happen to keep a backup copy of uucico under a different
		name; a 'cmp' showed that the two files were identical!

The symptoms were:

	1) If I execute 'uucico -r1 ...' manually, it returns right
	   away with my shell prompt, but the terminal is left in
	   a 'strange' state that won't accept any characters typed.
	   (Had to hang up.)  I suspect the same thing happened when
	   cron ran the uucp scripts.  Even -x7 debug mode prints nothing.

	2) Outside systems calling in to the uucico login shell just
	   lost carrier after login.

	3) Nothing was added to LOGFILE or SYSLOG between 6PM and 2PM
	   the next day, even when running uucico manually.

	4) Running 'uucp' and 'mail' queued up jobs as normal.

At 2PM Nov 9, it started working again.  The only thing I can recall doing
is making a minor cosmetic change to the /etc/passwd file just before 2PM.
(Afterwards I changed it back and uucico still worked!)

Did anyone else ever have such a problem?  Is it Sun specific?  Is it
specific to just that date and time?

slevy@uf.msc.umn.edu (Stuart Levy) (12/10/88)

We occasionally see something that might cause those symptoms.  Once in a
while, it seems as though the cached text for some binary will be
corrupted.  The *file* will appear fine, but everybody on the system will
find that that binary won't run, dumps core or whatever.

It can be "fixed" by renaming the real binary & copying the renamed binary
to its usual name.  Since the new copy has a different inode, those
cached pages don't apply to it.  After a while (when all the executing
copies exit?) the cached pages go away and the old binary will work too,
so you can rename it back.

	Stuart Levy
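
For anyone who wants to try the rename-and-copy workaround described above,
here is a minimal sketch in Python.  It is not from the original posts: the
function name refresh_binary_inode and the ".old" suffix are made up for
illustration, and the demo runs on a scratch file rather than a real uucico.
It only shows that the copy ends up on a different inode from the original;
whether that clears a stale text cache on a given system is the poster's
observation, not something the script can verify.

```python
import os
import shutil
import tempfile


def refresh_binary_inode(path, suffix=".old"):
    """Rename `path` aside, then copy it back under its usual name.

    Returns (old_inode, new_inode). The ".old" suffix is an arbitrary
    choice for this sketch.
    """
    aside = path + suffix
    old_inode = os.stat(path).st_ino
    os.rename(path, aside)        # the original inode now lives under the new name
    shutil.copy2(aside, path)     # the copy is a new file, hence a new inode
    # Note: copy2 keeps mode bits and timestamps but not ownership, so a
    # setuid binary would need its owner restored by hand afterwards.
    return old_inode, os.stat(path).st_ino


if __name__ == "__main__":
    # Demonstrate on a throwaway file instead of a real system binary.
    with tempfile.TemporaryDirectory() as scratch:
        demo = os.path.join(scratch, "uucico")
        with open(demo, "wb") as f:
            f.write(b"stand-in for a binary\n")
        old, new = refresh_binary_inode(demo)
        print("inode changed:", old != new)
```

Once the old binary works again, the renamed copy can be moved back over
the usual name, as described in the post.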