[comp.sys.apollo] Help! My registry won't talk to me!

dj@dorsai.cognet.ucla.edu (David J. Wells) (11/19/88)

Environment:

	DN4000 (dorsai.cognet.ucla.edu)
	SR10.0 Domain/OS BSD 4.3
	running	tcpd, syslogd, inetd, llbd, glbd, rgyd, cron, lpd,
		netman, sendmail.
	shell: /bin/ksh

Description:

	If I have 13 su's (root), then applications cannot communicate
	with the rgyd.  "Why do I want 13 su's?"  Well, that was just a
	way that I could duplicate the problem under DM -- I first saw
	the problem under X11R3 Windows.  Under X, I am allowed 3, maybe
	4 su's before the same thing happens.  Apparently some X programs
	use the same resource that is required to communicate with the
	rgyd.  (xterm is *not* one of these affected programs)

	When this happens, many things break.  Most noticeable are su,
	login, rgy_admin, and sendmail (it doesn't know about any local
	users).
	
	~ $ su
	Su failed
	~ $ rgy_admin
	        Cannot locate a rgy replica which is in service
	        Default object: rgy  default host: unspecified
	rgy_admin: q
	~ $ rsh dorsai		(from dorsai)
	account is invalid or has expired (RGYC/Login)
	Login incorrect
	Connection closed.
	~ $

	Eventually, after I kill an su, everything starts working again.
	Oddly, there is sometimes a period when I can su in one window
	but not in another, even though rgy_admin runs in any window.
	If I wait 5 minutes or so, it all works again.

	~ $ rgy_admin
	        Default object: rgy  default host: dds://fs3
	        State: in service  master  replica list is readonly
	rgy_admin: lr
	(master)  dds://fs3
	          dds://dorsai
	rgy_admin: q
	~ $

Attempted Solutions:

	It was suggested that I might be running out of processes.  Since
	I can start more processes, this is not the case.

	I thought that maybe I need more server processes, so I tried
	netsvc -s 3, but this didn't help.

	My next idea was that I might be running out of a socket-related
	resource.  But I can start rlogin sessions to other machines, so
	that isn't the problem.

	Any other ideas?

	Thanks in advance.
								David J Wells
							       dj@cs.ucla.edu
								w213/206-3960

pato@apollo.COM (Joe Pato) (12/06/88)

In article <18085@shemp.CS.UCLA.EDU> dj@cs.ucla.edu (David J. Wells) writes:
>
>Environment:
>
>	DN4000 (dorsai.cognet.ucla.edu)
>	SR10.0 Domain/OS BSD 4.3
>	running	tcpd, syslogd, inetd, llbd, glbd, rgyd, cron, lpd,
>		netman, sendmail.
>	shell: /bin/ksh
>
>Description:
>
>	If I have 13 su's (root), then applications cannot communicate
>	with the rgyd.  "Why do I want 13 su's?"  Well, that was just a
>	way that I could duplicate the problem under DM -- I first saw
>	the problem under X11R3 Windows.  Under X, I am allowed 3, maybe
>	4 su's before the same thing happens.  Apparently some X programs
>	use the same resource that is required to communicate with the
>	rgyd.  (xterm is *not* one of these affected programs)
>
>	When this happens, many things break.  Most noticeable are su,
>	login, rgy_admin, and sendmail (it doesn't know about any local
>	users).
>	
. . .
>
>Attempted Solutions:
>
>	It was suggested that I might be running out of processes.  Since
>	I can start more processes, this is not the case.
>
>	I thought that maybe I need more server processes, so I tried
>	netsvc -s 3, but this didn't help.
>
>	My next idea was that I might be running out of a socket-related
>	resource.  But I can start rlogin sessions to other machines, so
>	that isn't the problem.
>
>	Any other ideas?
>
>	Thanks in advance.
>								David J Wells
>							       dj@cs.ucla.edu
>								w213/206-3960

There is, in fact, a socket-related resource that is not properly released in
SR10.0.  IP family sockets are not affected, which is why you were able to
continue to create new rlogin sessions.  This problem has been fixed in SR10.1.
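
To illustrate why the rlogin test did not rule out a socket problem: a check
that only opens IP sockets says nothing about other address families.  The
following is a minimal sketch in Python, not something that runs on SR10.0,
and the helper names probe_family and probe_all are made up for this example.
AF_UNIX is used purely as a stand-in for "some non-IP family"; it is an
assumption, not the family involved in the SR10.0 problem.

```python
import socket


def probe_family(family, sock_type=socket.SOCK_DGRAM):
    """Try to create and immediately close one socket of the given family.

    Returns None on success, or the OSError raised on failure.
    (Hypothetical helper, for illustration only.)
    """
    try:
        s = socket.socket(family, sock_type)
    except OSError as exc:
        return exc
    s.close()
    return None


def probe_all(families):
    """Probe each named address family separately and collect the results."""
    return {name: probe_family(fam) for name, fam in families.items()}


if __name__ == "__main__":
    # AF_INET is what an rlogin-style test exercises.  AF_UNIX is only a
    # stand-in (assumption) for a non-IP family with its own resources.
    families = {"AF_INET": socket.AF_INET}
    if hasattr(socket, "AF_UNIX"):
        families["AF_UNIX"] = socket.AF_UNIX

    for name, err in probe_all(families).items():
        print(name, "ok" if err is None else "failed: %s" % err)
```

If one family reports a failure while AF_INET still succeeds, the exhausted
resource is specific to that family, which matches the symptoms described
above.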

  Joe Pato                UUCP: ...{attunix,uw-beaver,brunix}!apollo!pato
  Apollo Computer Inc.  NSFNET: pato@apollo.com