cosc4fp@jetson.uh.edu (08/22/90)
On our master server, where we run the registry, the glbd, and the llbd, we are seeing rgyd suddenly start running at 95% CPU. This disrupts mail and printing services, since communication seems to be broken. Has anyone seen this wild rgyd? We are running 10.2.
krowitz@RICHTER.MIT.EDU (David Krowitz) (08/22/90)
Yup, I've seen this before. The most usual cause was a broken, wedged, or outright dead /etc/tcpd. Try using /etc/ping to check whether TCP/IP to the node is working. If it is, then try using "telnet" or "rlogin" to make certain that you're getting to the *correct* machine (we had, at one point, a machine with a screwed-up host table that kept stealing packets meant for the machine with the rgyd). Both registry and printing services use NCS for communications, which in turn uses TCP services.

-- David Krowitz

krowitz@richter.mit.edu (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)
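The "telnet" check described above can be roughly approximated with a short script. This is only a sketch under stated assumptions: the function name tcp_reachable and the host name in the usage comment are hypothetical, and port 23 is simply the standard telnet port. It confirms that a TCP connection can be opened, nothing more. It does not send an ICMP ping, and it cannot tell you whether the machine that answered is the *correct* one.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to (host, port) can be opened.

    Hypothetical helper: a rough stand-in for trying "telnet" by hand.
    """
    try:
        # create_connection resolves the name and completes the TCP handshake.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name lookup failures.
        return False


# Example use (the host name is a placeholder, not one from this thread):
#   tcp_reachable("registry-master.example", 23)
```

A False result only says that no TCP connection could be opened within the timeout. It does not by itself show that tcpd is the culprit.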
thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) (08/22/90)
> On our master server where we have the registry and the glbd and llbd, we are
> seeing that rgyd start running at 95% out of the blue. This causes mail, and
> printing services to be disrupted since communication seems to be broken.
>
> Has anyone seen this wild rgyd? We are running 10.2 ........

Yes. I've also seen it with the llbd and glbd. My guess is that the NCS services aren't as robust as they should (could?) be, and that they start thrashing if/when communications are interrupted/corrupted. I've noticed the llbd running amok fairly often after the tcpd dies or gets aborted. When this occurs, I've just sigp'ed the NCS services and restarted them. Lately, we haven't seen the registry daemon running amok, so I don't know what the scoop is.

John Thompson (jt)
Honeywell, SSEC
Plymouth, MN 55441
thompson@pan.ssec.honeywell.com

As ever, my opinions do not necessarily agree with Honeywell's or reality's.
(Honeywell's do not necessarily agree with mine or reality's, either)
wjw@eba.eb.ele.tue.nl (Willem Jan Withagen) (08/23/90)
In article <9008221628.AA10801@pan.ssec.honeywell.com> thompson@PAN.SSEC.HONEYWELL.COM (John Thompson) writes:
>> On our master server where we have the registry and the glbd and llbd, we are
>> seeing that rgyd start running at 95% out of the blue. This causes mail, and
>> printing services to be disrupted since communication seems to be broken.
>>
>> Has anyone seen this wild rgyd? We are running 10.2 ........

Rumour once had it that the registry services did not work well with a certain number of entries in the accounts or groups. I have been keeping comp.sys.apollo for quite a while, and below is one of the previous discussions. As far as I can see, this one has not yet been fixed.

Regards,
Willem Jan Withagen.

-----------------------------
Well, I found out what caused the problems with my corrupted and missing /etc/{org,group,passwd} files -- many thanks to Betsy Minahan @ apollo for diagnosing it.

It seems that if there are exactly 16 entries in the equivalents of /etc/group or /etc/passwd, some of the 3 are unreadable and others are corrupted. This is a problem that Apollo knows about and is working on. The workaround is to use edrgy and add another account, or person, or something, to bring the number over 16.

It seems I also had to delete `node_data/systmp/{.org,.group,.passwd} as well, so that they could get recreated properly (thanks to Michael Zeleznik for that last tip).
------------------------------

Eindhoven University of Technology    DomainName: wjw@eb.ele.tue.nl
Digital Systems Group, Room EH 10.10  BITNET: ELEBWJ@HEITUE5.BITNET
P.O. 513                              Tel: +31-40-473401
5600 MB Eindhoven
The Netherlands
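The "exactly 16 entries" condition from the quoted message is easy to check mechanically. The following is a minimal sketch, assuming the passwd and group equivalents can be read as ordinary text with one entry per line. That assumption, and the names count_entries, hits_problem_count, and PROBLEM_COUNT, are illustrative and are not confirmed by anything in the thread.

```python
# The entry count that the quoted message reports as troublesome.
PROBLEM_COUNT = 16


def count_entries(text):
    """Count entries in passwd/group-style text.

    Assumes one entry per line; blank lines and lines starting
    with '#' are not counted as entries.
    """
    return sum(
        1
        for line in text.splitlines()
        if line.strip() and not line.lstrip().startswith("#")
    )


def hits_problem_count(text):
    """Return True if the text holds exactly PROBLEM_COUNT entries."""
    return count_entries(text) == PROBLEM_COUNT
```

Reading the contents of /etc/passwd or /etc/group and passing them to hits_problem_count would flag the case where the described workaround (adding one more entry with edrgy) might apply.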