[comp.sys.apollo] Registry/GLBD problems

obrennan@CC3.CC.UMR.EDU (obrennan) (12/27/90)

      >>> Sounds like the glbd is not 1) getting started before rgyd,
      >>> or 2) accepting server requests
      >>
      >>  Any suggestions of how to change "/etc/rc" to handle "1)"? Or
      >>how to determine if the reason is "2)" and if so, any suggestions
      >>on how to find the problem (which logs to look etc?).
      >>
      >>> I would try rebuilding the GLBD with -create -first options ...
      >>
      >>  Hmm.. I have tried to do this a few times. When I do it as root,
      >>like
      >>	/etc/server /etc/ncs/glbd -create -first
      >>  or
      >>	/etc/ncs/glbd -create -first
      >>
      >>  Nothing seems to happen. The man page leaves me with the impression
      >>that even with "-create", glbd would be left running by these
      >>commands, but it isn't. Either way I try, I get no feedback
      >>from the commands, and no glbd is left running.
      >>
      >>  In general the Apollo now behaves rather randomly. When logging
      >>in from the net with telnet, I sometimes get "bind: address in use";
      >>trying to FTP outwards just hangs at the Username prompt, etc., and when
      >>I try again, all goes well. Running lb_admin, I get similar kinds
      >>of inconsistencies: sometimes it complains about comm failures, and
      >>yet the next command may succeed. First I would like to get this into
      >>some simple known initial state.
      >>
      >>   Continuing...
      >>
      >>   I did try to start with "/etc/server -p ...", but it doesn't seem
      >>to help.
      >>
      >>replicas...  Like, I just ran into the section about ns_helper. When do I
      >>need to run that?
      >>
      >>   Apologies for troubling you again, but it seems that there is
      >>something very *basic* wrong in the system. Most of the management
      >>tools (lb_admin, drm_admin, edns ..) complain mostly about
      >>communications problems.
      >>
      >>   From glb_log I can see that "glbd -create -first" fails because the glb
      >>object already exists. How do I delete it? Should I? I have already
      >>tried with drm_admin and delrep, but that attempt just terminated
      >>with the error "communication failure .. NCS/RPC runtime)", as do
      >>most other attempts with this and other tools. (llbd,
      >>glbd and rgyd are all running).
      >> 

It sounds like all of your servers think they have replicas on other Apollo nodes.
Here is what I would do to clean up and restart from scratch:
         
        1) Edit the rc.user file and comment out the ns_helper startup. ns_helper
           is only useful when you have a network of Apollos. It keeps the
           node catalog (the correspondence between //name and node ID) in a
           database so that nodes can resolve conflicts with their local
           caches, which hold node names and node IDs.

        2) Delete these GLBD files:

            /sys/node_data/glb.p
            /sys/node_data/glb.e

        3) Run: /etc/server /etc/ncs/llbd &

        4) Run: /etc/server /etc/ncs/glbd -create -first &

        5) Run: /etc/server -p /etc/rgyd &

        6) Use crf or touch to create the following zero-length files
           so that the rc files start your servers at boot:

                  /sys/node_data/etc/daemons/llbd
                  /sys/node_data/etc/daemons/glbd 
                  /sys/node_data/etc/daemons/rgyd
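
Steps 2 through 6 above can be sketched as one small shell script. This is
only a sketch, assuming a Bourne-style shell: DRY_RUN, ROOT, run, start_bg and
make_markers are names made up for this example and are not part of Domain/OS.
By default it only prints the commands, so the sequence can be reviewed before
anything is deleted or started.

```shell
#!/bin/sh
# Sketch of steps 2-6. DRY_RUN, ROOT, run, start_bg and make_markers are
# hypothetical names introduced for this example only.
DRY_RUN=${DRY_RUN:-1}   # 1 = print the commands only; 0 = execute them
ROOT=${ROOT:-}          # optional path prefix for rehearsing in a scratch tree

run() {                 # echo a command, and run it unless in dry-run mode
    echo "+ $*"
    if [ "$DRY_RUN" = 0 ]; then
        "$@"
    fi
}

start_bg() {            # like run, but leaves the command in the background
    echo "+ $* &"
    if [ "$DRY_RUN" = 0 ]; then
        "$@" &
    fi
}

make_markers() {        # step 6: zero-length files so the rc files start the servers
    for d in llbd glbd rgyd; do
        run touch "$ROOT/sys/node_data/etc/daemons/$d"
    done
}

# Step 2: remove the old GLB database files.
run rm -f "$ROOT/sys/node_data/glb.p" "$ROOT/sys/node_data/glb.e"

# Steps 3-5: start the location brokers first, then the registry daemon.
start_bg /etc/server /etc/ncs/llbd
start_bg /etc/server /etc/ncs/glbd -create -first
start_bg /etc/server -p /etc/rgyd

# Step 6: make the servers start at boot.
make_markers
```

Run it once as is to see the command list, then again as root with DRY_RUN=0
once the list looks right. Leave ROOT empty on the real node.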
                         

If the above does not work, I would check /sys/node_data/system_logs to
diagnose the problem. Hopefully the steps above will clear it up.
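
For the log check, something like the following could show the most recent
entries. It is a rough sketch that assumes /sys/node_data/system_logs is a
directory of plain-text log files and that a Bourne-style shell with ls, head
and tail is available; LOGDIR and latest_logs are names invented here.

```shell
#!/bin/sh
# LOGDIR is the path from the post; latest_logs is a hypothetical helper.
LOGDIR=${LOGDIR:-/sys/node_data/system_logs}

latest_logs() {         # print the names of the N most recently modified entries
    dir=$1
    n=${2:-5}
    ls -t "$dir" 2>/dev/null | head -n "$n"
}

# Show the tail of each of the five newest log files, newest first.
latest_logs "$LOGDIR" 5 | while read -r f; do
    if [ -f "$LOGDIR/$f" ]; then
        echo "==== $LOGDIR/$f"
        tail -n 20 "$LOGDIR/$f"
    fi
done
```

Running it right after a failed glbd or rgyd start should put the relevant
log at the top of the output.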


Gerry O'Brennan
Programmer/Analyst II
Computing Services
University of Missouri - Rolla
------------------------------
obrennan@apollo.cc.umr.edu
c0022@umrvmb.umr.edu          
------------------------------