system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (01/26/91)
There is a severe bug in SR10.3 on DN2500's (only DN2500's according to the
Hotline) - the system clock goes crazy, gaining 1-10 minutes for every
minute of real time, with the gains being worse the heavier the load
is on the system. The system is semi-useable with DM/pads (except that
vi/page/more/rlogin/telnet only work once), but once
X/xterm/otherXclients are used, you can't even get a character to echo
in the xterm for 2-5 minutes. Moving the mouse causes the load average
to shoot over 15 (which is probably wrong since the clock is screwed),
and all manner of screen events happen - windows cycle automatically,
menus pop up / pull down automatically, areas of text are
marked/yanked/pasted at random.

The Hotline says the patch will be available in March. FANTASTIC!
I guess I'll just watch the xclock hands jump around the dial for a
month or so.

I found this problem within 30 seconds of booting SR10.3 using
X Windows - doesn't anybody at HP/Apollo test anything, not to mention
the beta testers !?!?! Sorry if you feel attacked/insulted, but this is
ridiculous.
--
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094 Fax: (416) 978-8775
kts@quintro.uucp (Kenneth T. Smelcer) (01/27/91)
In article <1991Jan25.160628.10897@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>There is a severe bug in SR10.3 on DN2500's (only DN2500's according to the
>Hotline) - the system clock goes crazy, gaining 1-10 minutes for every
>minute of real time, with the gains being worse the heavier the load
>is on the system. The system is semi-useable with DM/pads (except that
>vi/page/more/rlogin/telnet only work once), but once
>X/xterm/otherXclients are used, you can't even get a character to echo
>in the xterm for 2-5 minutes. Moving the mouse causes the load average
>to shoot over 15 (which is probably wrong since the clock is screwed),
>and all manner of screen events happen - windows cycle automatically,
>menus pop up / pull down automatically, areas of text are
>marked/yanked/pasted at random.
>--
>Mike Peterson, System Administrator, U/Toronto Department of Chemistry
>E-mail: system@alchemy.chem.utoronto.ca
>Tel: (416) 978-7094 Fax: (416) 978-8775

Well, I don't know about anyone else, but I've been running SR10.3
on my DN2500 (16MB, 200M disk) for about three weeks with very few
problems. The system clock is fairly stable (it loses about 8 seconds
a day), and X windows works fine. My standard window system is the
MIT X11R4 server and xdm (no DM running), but I've also used the
HP/Apollo supplied X11R3 server in shared mode. I don't have any
problems with strange window events happening, although I have had a
couple instances where a window would disappear without any apparent
reason.

Did the support line give you a reason for the problems on DN2500 machines?
Is it something that's configuration based or are they talking about kernel
problems?
--
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
Ken Smelcer                   Glenayre Corp.
quintro!kts@lll-winken        Quincy, IL
tiamat!quintro!kts@uunet
system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (01/27/91)
In article <1991Jan26.181425.21685@quintro.uucp> kts@quintro.uucp (Kenneth T. Smelcer) writes:
>In article <1991Jan25.160628.10897@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>>There is a severe bug in SR10.3 on DN2500's (only DN2500's according to the
>>Hotline) - the system clock goes crazy, gaining 1-10 minutes for every
>>minute of real time, etc.
>
> <DN2500 works at SR10.3 text deleted>
>
>Did the support line give you a reason for the problems on DN2500 machines?
>Is it something that's configuration based or are they talking about kernel
>problems?

It is a kernel problem in the timer interrupt routine - a register is
not being saved properly.
--
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094 Fax: (416) 978-8775
hanche@imf.unit.no (Harald Hanche-Olsen) (01/28/91)
In article <1991Jan26.181425.21685@quintro.uucp> kts@quintro.uucp (Kenneth T. Smelcer) writes:

   In article <1991Jan25.160628.10897@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
   >There is a severe bug in SR10.3 on DN2500's (only DN2500's according to the
   >Hotline) - the system clock goes crazy, gaining 1-10 minutes for every
   >minute of real time, with the gains being worse the heavier the load
   >is on the system.
   [...]
   >Moving the mouse causes the load average
   >to shoot over 15 (which is probably wrong since the clock is screwed),
   [...]

   Well, I don't know about anyone else, but I've been running SR10.3
   on my DN2500 (16MB, 200M disk) for about three weeks with very few
   problems. The system clock is fairly stable (it loses about 8 seconds
   a day), and X windows works fine.

We have seen problems similar to those described by Mike. The most
reproducible way to provoke this behaviour is as follows: Someone
logged in on a node other than the 2500 starts compiling a big file in
a directory which is on the 2500. Once every minute or so, the load
jumps sky high, and the clock jumps forward by a couple minutes.
Meanwhile, the poor guy who is trying to use the 2500 screen is stuck,
unable to do a thing.

We "solved" the problem by moving everybody's home directory away from
the node, after which the problem is still present but not nearly so
noticeable. (The guy we hired to take care of our computers was
supposed to follow up on this, but he quit before Christmas and
apparently never got around to it, which means I will have to do it
(sigh)). Anyway, some clock racing was still present, but after we
installed xntpd on all our machines the 2500's xntpd has managed to
keep the clock in line, more or less.

The advice to not install sr10.3 on a 2500 before the patch is out is
probably a good one, although Ken's experience implies that not all
machines will be bitten by this bug. That could explain why neither HP
nor the beta testers ever saw it.

- Harald Hanche-Olsen <hanche@imf.unit.no>
  Division of Mathematical Sciences
  The Norwegian Institute of Technology
  N-7034 Trondheim, NORWAY
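The clock figures quoted in this thread (a loss of about 8 seconds a day on one DN2500, a gain of 1-10 minutes for every minute of real time on another) can be put on a common scale as a drift rate. The following is a minimal Python sketch for doing that arithmetic, not anything supplied by the posters or by HP/Apollo; the function name drift_ppm and the sample values worked out from the posts are illustrative assumptions.

```python
def drift_ppm(clock_elapsed_s, real_elapsed_s):
    """Return clock drift in parts per million.

    clock_elapsed_s: seconds the system clock advanced over the interval
    real_elapsed_s:  seconds of real (reference) time over the same interval
    A positive result means the clock runs fast, a negative one slow.
    """
    if real_elapsed_s <= 0:
        raise ValueError("real_elapsed_s must be positive")
    return (clock_elapsed_s - real_elapsed_s) / real_elapsed_s * 1e6


# "Loses about 8 seconds a day": the clock advances 86392 s in 86400 s.
stable_node = drift_ppm(86400 - 8, 86400)      # roughly -93 ppm

# "Gaining 1-10 minutes for every minute of real time":
# the clock advances 120 s to 660 s for every 60 s of real time.
bad_node_low = drift_ppm(60 + 60, 60)          # 1,000,000 ppm (100% fast)
bad_node_high = drift_ppm(60 + 600, 60)        # 10,000,000 ppm

print(f"stable node: {stable_node:.1f} ppm")
print(f"affected node: {bad_node_low:.0f} to {bad_node_high:.0f} ppm")
```

On these numbers the affected node is off by four to five orders of magnitude more than the stable one, which is one way to see why the two reports describe such different experiences.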
krowitz@RICHTER.MIT.EDU (David Krowitz) (01/28/91)
Hmmm ... your message indicates that you see the error on DN2500's
that are providing file service. The DN2500's we have are all diskless
machines, which may be why we never saw this particular problem during
the beta-test. This strengthens my conviction on the need for a
pre-beta-release testing lab ...

-- David Krowitz

krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
kts@quintro.uucp (Kenneth T. Smelcer) (01/29/91)
In article <HANCHE.91Jan27181040@hufsa.imf.unit.no> hanche@imf.unit.no (Harald Hanche-Olsen) writes:
>In article <1991Jan26.181425.21685@quintro.uucp> kts@quintro.uucp (Kenneth T. Smelcer) writes:
>
> In article <1991Jan25.160628.10897@alchemy.chem.utoronto.ca> system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) writes:
>>>There is a severe bug in SR10.3 on DN2500's (only DN2500's according to the
>>>Hotline) - the system clock goes crazy, gaining 1-10 minutes for every
>>>minute of real time, with the gains being worse the heavier the load
>>>is on the system.
>>
>> Well, I don't know about anyone else, but I've been running SR10.3
>> on my DN2500 (16MB, 200M disk) for about three weeks with very few
>> problems. The system clock is fairly stable (it loses about 8 seconds
>> a day), and X windows works fine.
>
>We have seen problems similar to those described by Mike. The most
>reproducible way to provoke this behaviour is as follows: Someone
>logged in on a node other than the 2500 starts compiling a big file in
>a directory which is on the 2500.
[...]

I think this is why I haven't had any problems with my DN2500. This node
has had some sporadic disk time-out errors, so we haven't loaded anything
on it except the OS. It sounds like the problems are obvious only when
there's heavy usage of the 2500's local disk.

BTW, thanks to Mike for letting us know about this MAJOR problem. I wish
HP/Apollo would let people know about problems like these (at least people
who are registered DN2500 owners.)
--
--=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=--
Ken Smelcer                   Glenayre Corp.
quintro!kts@lll-winken        Quincy, IL
tiamat!quintro!kts@uunet
system@alchemy.chem.utoronto.ca (System Admin (Mike Peterson)) (01/29/91)
In article <1991Jan28.172841.9547@quintro.uucp> kts@quintro.uucp (Kenneth T. Smelcer) writes:
>BTW, thanks to Mike for letting us know about this MAJOR problem. I wish
>HP/Apollo would let people know about problems like these (at least people
>who are registered DN2500 owners.)

You're welcome, and I agree 100% -- this sort of information should be
forwarded IMMEDIATELY to all local offices (who should contact all
their DN2500 customers) and IMMEDIATELY to all known customers, and
posted on comp.sys.apollo by someone from HP. A notice should also be
inserted in all outgoing SR10.3 shipments as of the day the problem
was verified.
--
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094 Fax: (416) 978-8775