car@trux.UUCP (Chris Rende) (02/22/90)
(Nixdorf Targon M35/50 TOS 3.2 --> Pyramid 9810 OSx 4.0)

On rare occasions ATTCRON runs things twice, apparently because it does
things early. (System V Release 2 CRON)

Here is a section from my cron's log:

    >  CMD: /etc/dmesg - >>/usr/adm/messages
    >  root 15896 c Sat Feb 10 06:49:59 1990
    <  root 15896 c Sat Feb 10 06:50:00 1990
    >  CMD: /etc/dmesg - >>/usr/adm/messages
    >  root 15898 c Sat Feb 10 06:50:00 1990
    <  root 15898 c Sat Feb 10 06:50:00 1990

Here is the associated crontab entry:

    00,10,20,30,40,50 * * * * /etc/dmesg - >>/usr/adm/messages

What seems to be happening is that CRON is not calculating a long enough
delay time before running an entry. Maybe it's an oddball rounding error
that causes the calculation to come up short...?

If CRON added another .5 seconds to each delay time then there probably
wouldn't be a problem.

Has anyone else observed this behaviour?
Is it a known bug?
Is there a fix?

car.
--
Christopher A. Rende           Central Cartage (Nixdorf/Pyramid/SysVR2/BSD4.3)
uunet!edsews!rphroy!trux!car   Multics,DTSS,Unix,Shortwave,Scanners,StarTrek
trux!car@uunet.uu.net          Minix 1.2,PC/XT,Mac+,TRS-80 Model I,1802 ELF
"I don't ever remember forgetting anything."  - Chris Rende
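The "delay comes up short" hypothesis can be illustrated with a small sketch. This is not taken from cron.c; the function names (next_slot, remaining, may_run) and the use of plain seconds counts are assumptions made for illustration only. It shows a scheduler that re-reads the clock after waking and refuses to run an entry that is not yet due, so waking one second early leads to another short sleep rather than an extra run.

```c
#include <assert.h>

/* Hypothetical helper: the next multiple of `period` seconds strictly
 * after `now`.  For the 10-minute crontab entry, period would be 600. */
long next_slot(long now, long period)
{
    assert(period > 0);
    return now - (now % period) + period;
}

/* Seconds still to wait before `due`; 0 if it is already due. */
long remaining(long now, long due)
{
    return due > now ? due - now : 0L;
}

/* After waking, check the clock again: 1 means the entry may run,
 * 0 means we woke early and should sleep for remaining() seconds. */
int may_run(long now, long due)
{
    return remaining(now, due) == 0L;
}
```

With due = 600, waking at 599 gives may_run() == 0 and remaining() == 1, so the entry would run only once, at 600.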
hedrick@athos.rutgers.edu (Charles Hedrick) (02/26/90)
Are you using a program that does adjtime, i.e. nntp or timed? This is a
Berkeley call to adjust the time slowly. It changes the clock speed.
System V cron gets confused when this is done. We had to fix it on the
Suns. Presumably the same fix will work. If this is your problem, send me
mail and I'll try to find out who has the diffs. If you don't have source
to cron, there's not much I can do for you.
keith@bain3.oz (Keith Brinck) (03/01/90)
In article <362@trux.UUCP>, car@trux.UUCP (Chris Rende) says:
> Xref: bain3 comp.sys.pyramid:582 comp.bugs.sys5:855
>
> (Nixdorf Targon M35/50 TOS 3.2 --> Pyramid 9810 OSx 4.0)
>
> On rare occasions ATTCRON is running things twice apparently because it
> does things early. (System V Release 2 CRON)
>
> [Stuff deleted ....]
>
> Has anyone else observed this behaviour?

We sure have, on the same configuration with the same version of OSx.
It's not so rare either: about once every 1-2 months in our case.

> Is it a known bug?

I have spoken to Pyramid Australia about it and they say it's a known bug
in the AT&T system. They don't know when it will be fixed (although it's
rumoured to be fixed in version 5).

> Is there a fix?

We use a lock file system which catches the bug most of the time. Fairly
recently we had an occurrence which was not caught by the lock file and as
a result screwed up a significant Sybase database. It took us some time to
rebuild said base.

What has always amazed me about this bug is the fact that no-one else
appeared to be particularly worried about it, including the people at
Pyramid in Sydney. Given the rate of incidence of the bug at our site I
would guess that there is a cron job firing off twice somewhere in the
world every hour of the day (or more) and that someone would have been
hurt by it!!

Pyramid's lack of enthusiasm in pursuing a fix for this bug shows up one
of the disadvantages of using unix: one is not dealing directly with the
originator of the OS and it's difficult to get things done as a result.

---------------
I've posted this for someone else - please direct any email replies to
barry@bain3.bain.oz (Barry Allebone)
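For readers wondering what a lock-file guard looks like and why it might catch the bug only "most of the time", here is a minimal sketch. It is not the code used at the site above; the function names (take_lock, release_lock, gap_ok) are hypothetical. The lock relies on open(2) with O_CREAT|O_EXCL, which fails if the file already exists. In the log from the original post the first run had already finished when the second one started, so a lock held only while the job runs would not have blocked it. The gap_ok() check is one possible complement, assuming the job records its last start time somewhere (storage of that time is left out here).

```c
#include <assert.h>
#include <fcntl.h>
#include <unistd.h>

/* Try to take the lock by creating the file exclusively.
 * Returns 0 on success, -1 if the lock exists already or on error. */
int take_lock(const char *path)
{
    int fd;

    assert(path != NULL);
    fd = open(path, O_WRONLY | O_CREAT | O_EXCL, 0644);
    if (fd < 0)
        return -1;
    close(fd);
    return 0;
}

/* Remove the lock file when the job is done. */
void release_lock(const char *path)
{
    assert(path != NULL);
    unlink(path);
}

/* Returns 1 if at least min_gap seconds separate the previous start
 * from now, 0 if this start is suspiciously close to the last one. */
int gap_ok(long last_start, long now, long min_gap)
{
    return (now - last_start) >= min_gap;
}
```

A job scheduled every 10 minutes might refuse to start when gap_ok(last, now, 60) is 0, which would reject a second start one second after the first even if the first run has already released its lock.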
ejp@bohra.cpg.oz (Esmond Pitt) (03/01/90)
In article <362@trux.UUCP>, car@trux.UUCP (Chris Rende) says:
> Xref: bain3 comp.sys.pyramid:582 comp.bugs.sys5:855
>
> (Nixdorf Targon M35/50 TOS 3.2 --> Pyramid 9810 OSx 4.0)
>
> On rare occasions ATTCRON is running things twice apparently because it
> does things early. (System V Release 2 CRON)
>
> [Stuff deleted ....]
>
> Has anyone else observed this behaviour?

Yes. Does running ucb cron solve the problem?

--
Esmond Pitt, Computer Power Group
ejp@bohra.cpg.oz
car@trux.UUCP (Chris Rende) (03/08/90)
Thanks to all those who either posted or Emailed responses regarding the
problem with CRON running things twice.

The bottom line is that there is a bug in the AT&T System V CRON. It may
have been fixed in more recent releases.

The bug manifests itself by running something 1 second early and then
AGAIN at the proper time.

The following are NOT the cause of this particular problem:

- Change of date/time either with date(1) or with BSD's adjtime(2).
  (nada.kth.se!paf)
- Two CRON's running at the same time. (mcorrigan@ucsd.edu)
- NNTP or TIMED (hedrick@athos.rutgers.edu)

A few other notes which people sent to me:

- Even the System V Release 3 CRON is reported to get messed up by
  date/time changes while CRON is running. (motcsd!brian)
- This same bug also exists under SunOS 4.0.3 (bugids 1022379 and
  1027075). (ata!eggert)
- It's a known bug in the AT&T system. (keith@bain3.oz)
- Observed frequency is once every 1-2 months. (keith@bain3.oz)
- It is rumored to be fixed in OSx5. (keith@bain3.oz)
- It is estimated that twice per hour CRON goofs up somewhere in the
  world. (keith@bain3.oz)

Suggested solutions:

- Run the UCB CRON instead of the ATT CRON. (ejp@bohra.cpg.oz)
- Use lock files in your jobs. (ejp@bohra.cpg.oz)

Here is a good summary and a fix from vogon.cetia.fr!philip:

  Most SV Rel. 2 systems share your problem.

  It seems to be that the (twisted) logic of cron takes the time several
  times during execution, and it is very lax in which one of the values
  obtained it actually believes.

  Rather than try to correct the logic, I have used a fix, which cures
  the problem, but has a side effect that *some* commands may be run one
  second late. I find this acceptable, since one second is within the
  normal scheduling tolerances of UNIX.

  I hope you have access to the sources, because here is a context diff
  showing my modification:

*** cron.c	Thu Jan  4 12:26:40 1990
--- cron.c.orig	Tue Mar  6 10:33:59 1990
***************
*** 239,245
  #endif
  		seconds = (ne_time < (long) 0) ? (long) 0 : ne_time;
  		if(ne_time > (long) 0)
! 			idle(seconds == 1L ? 2L : seconds);
  		if(notexpired) {
  			notexpired = 0;
  			last_time = INFINITY;
--- 239,245 -----
  #endif
  		seconds = (ne_time < (long) 0) ? (long) 0 : ne_time;
  		if(ne_time > (long) 0)
! 			idle(seconds);
  		if(notexpired) {
  			notexpired = 0;
  			last_time = INFINITY;

  I suppose that on a really slow system, you may need to change the 2L
  into 3L - but that would be a *slow* machine.

---------------

car.
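The one changed expression in that diff can be looked at in isolation. The sketch below is a standalone illustration, not part of cron.c: padded_delay is a hypothetical name, and cron's own idle() routine is not reproduced. Note that the diff as posted lists the patched cron.c first and cron.c.orig second, so the line containing 2L is the fixed version.

```c
#include <assert.h>

/* Mirror of the patched argument to idle(): a 1-second wait is
 * stretched to 2 seconds, every other delay is passed through.
 * The caller is assumed to hold the same guard as the original
 * code, i.e. it only gets here when the delay is positive. */
long padded_delay(long seconds)
{
    assert(seconds > 0L);
    return seconds == 1L ? 2L : seconds;
}
```

Only the 1-second case changes, which matches the stated side effect that *some* commands may run one second late; a delay of 600 seconds, for example, is passed through untouched.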