lai%vedge.UUCP@larry.mcrcim.mcgill.edu (David Lai) (11/11/88)
We are running uucp on a Sun3 under OS 3.5. It has run fine for about a year with no major problems, until suddenly, around 6PM on Nov 8, uucico refused to run. This lasted until about 2PM on Nov 9, when everything mysteriously went back to normal.

As far as I can tell:
- Nothing 'strange' happened between 6PM and 2PM the next day (i.e. no changes to L.sys, password files, cron, uucp directory permissions, kernel rebuilds, etc.).
- All other programs ran fine (including tip to the modems).
- I even rebooted, just to make sure something wasn't hanging in the kernel.
- I happen to keep a backup copy of uucico under a different name; a 'cmp' showed that the two were the same!

The symptoms were:
1) If I execute 'uucico -r1 ...' manually, it returns right away to my shell prompt, but the terminal is left in a 'strange' state that won't accept any typed characters. (I had to hang up.) I suspect the same thing happened when cron ran the uucp scripts. Even -x7 debug mode prints nothing.
2) Outside systems calling in to the uucico login shell just lost carrier after login.
3) Nothing was added to LOGFILE or SYSLOG between 6PM and 2PM the next day, even when running uucico manually.
4) Running 'uucp' and 'mail' queued up jobs as normal.

At 2PM on Nov 9, it started working again. The only thing I may have done is make a minor cosmetic change to the /etc/passwd file just before 2PM. (Afterwards I changed it back and uucico still worked!)

Did anyone else ever have such a problem? Is it Sun specific? Is it specific to just that date and time?
slevy@uf.msc.umn.edu (Stuart Levy) (12/10/88)
We occasionally see something that might cause those symptoms. Once in a while, it seems as though the cached text for some binary gets corrupted. The *file* will appear fine, but everybody on the system will find that the binary won't run, dumps core, or fails in some other way.

It can be "fixed" by renaming the real binary and copying the renamed binary back to its usual name. Since the new copy has a different inode, the corrupted cached pages don't apply to it. After a while (when all the executing copies exit?) the cached pages go away and the old binary will work too, so you can rename it back.

Stuart Levy
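
For anyone who wants to script the rename-and-copy workaround described above, here is a minimal sketch in Python. It is an illustration only, not something from the original posts: the function name recopy_binary and the ".orig" suffix are made up for the example. It assumes both names live on the same filesystem, and it does not restore file ownership, which would matter for a setuid program such as uucico.

```python
import os
import shutil


def recopy_binary(path, suffix=".orig"):
    """Rename `path` aside, then copy it back to its usual name.

    Returns (old_inode, new_inode) so the caller can confirm that the
    usual name now refers to a different inode.
    The ".orig" suffix is a hypothetical naming choice for this sketch.
    """
    aside = path + suffix
    old_inode = os.stat(path).st_ino

    # The rename keeps the same inode; only the name changes.
    os.rename(path, aside)

    # The copy creates a brand-new file (new inode) under the usual name.
    # copy2 carries over permission bits and timestamps, but NOT ownership,
    # so a setuid binary would still need a chown (and a mode check) here.
    shutil.copy2(aside, path)

    new_inode = os.stat(path).st_ino
    return old_inode, new_inode
```

Calling recopy_binary on a binary's path leaves the original under the ".orig" name, so it can be renamed back later once the old cached pages are gone, as described above.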