buck@drax.gsfc.nasa.gov (Loren (Buck) Buchanan) (10/17/90)
Hi,

We have been having a problem with NeWS getting confused to the point where we have to kill the news_server process (thus killing all processes associated with windows). We had assumed that something in the code we are currently developing caused this. Last week we moved our machine for a demo, and while at the demo site we disabled all of the networking code (inetd, named, nfs, etc.). In 30 runs of our program we never had to kill the news_server once.

The first indication that the news_server is confused is when grcond logs the following (somewhat edited for brevity) to /usr/adm/SYSLOG:

    grcond[980]: CIO: ***ERROR***
    grcond[980]: CIO: Process: 0x100223A4 Error: undefined
    grcond[980]: CIO: Stack:
    grcond[980]: CIO: Executing: input
    grcond[980]: CIO: At: Reading file('%stdio',W,R)
    grcond[980]: CIO: *****
    grcond[980]: CIO: ***ERROR***
    grcond[980]: CIO: Process: 0x100223A4 Error: undefined
    grcond[980]: CIO: Stack:
    grcond[980]: CIO: Executing: queue
    grcond[980]: CIO: At: Reading file('%stdio',W,R)
    grcond[980]: CIO: *****
    grcond[980]: CIO: ***ERROR***
    grcond[980]: CIO: Process: 0x100223A4 Error: undefined
    grcond[980]: CIO: Stack:
    grcond[980]: CIO: Executing: lock
    grcond[980]: CIO: At: Reading file('%stdio',W,R)
    grcond[980]: CIO: *****
    grcond[980]: CIO: ***ERROR***
    grcond[980]: CIO: Process: 0x100223A4 Error: undefined
    grcond[980]: CIO: Stack:
    grcond[980]: CIO: Executing: broken!
    grcond[980]: CIO: At: Reading file('%stdio',W,R)
    grcond[980]: CIO: *****

What strikes me as odd is that the error "input queue lock broken" is supposed to be harmless, but somehow it is being fed to the news_server as a PostScript program to be run. It gets really bad when the error messages grcond is logging are themselves sent to the news_server to be run. This sounds like the start of an infinite loop, but fortunately the logging to SYSLOG does stop after a while. At that point the only recovery is to kill the news_server.
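
The pattern described above, where the words after "Executing:" spell out the error text one token at a time, can be checked mechanically against a log. What follows is only a minimal sketch in Python, not anything supplied with IRIX or NeWS: the sample text is copied from the excerpt above, and the function name executed_tokens and the regular expression are assumptions made for illustration. A real SYSLOG line would probably carry a timestamp and host prefix as well, which the unanchored search tolerates.

```python
import re

# Sample copied from the log excerpt in this post (one line per record field).
SAMPLE = """\
grcond[980]: CIO: ***ERROR***
grcond[980]: CIO: Process: 0x100223A4 Error: undefined
grcond[980]: CIO: Executing: input
grcond[980]: CIO: At: Reading file('%stdio',W,R)
grcond[980]: CIO: *****
grcond[980]: CIO: Executing: queue
grcond[980]: CIO: Executing: lock
grcond[980]: CIO: Executing: broken!
"""

# Hypothetical pattern: capture the single token grcond reports as being executed.
EXECUTING = re.compile(r"grcond\[\d+\]: CIO: Executing: (\S+)")


def executed_tokens(text):
    """Return, in log order, every token that follows 'Executing:'."""
    return EXECUTING.findall(text)


if __name__ == "__main__":
    # Joining the tokens shows whether they form a plain-text message.
    print(" ".join(executed_tokens(SAMPLE)))
```

If the joined tokens read as an ordinary English message rather than as PostScript operators, that is consistent with diagnostic text reaching the interpreter's input stream.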

Killing grcond once the server is in this state is useless, because doing so requires a press of the "reset" button to recover.

We are using a 4D/20 running IRIX 3.2.

Thankx for any help in solving this problem.

B Cing U

Buck

Loren Buchanan     | buck@drax.gsfc.nasa.gov   | #include <std_disclaimer.h>
CSC, 1100 West St. | ...!ames!dftsrv!drax!buck | typedef int by
Laurel, MD 20707   | (301) 497-2531            | void where_prohibited(by law){}
CD International lists over 40,000 pop music CDs, collect the whole set.