[comp.sys.apollo] Apollo prmgr troubles

rlee@island.seas.ucla.edu (Robert Lee) (03/07/90)

I'm having quite a bit of trouble getting a serial printer to work with
a 4500 running SR10.1;  the print manager doesn't run.  Prmgr returns
a "can't find print manager" message after invocation, sleeps for
30 seconds or so, and retries.  The hardware is fine; the workstation
and printer talk to each other under the sys5 print model, i.e. lpsched/lp
instead of prf/prmgr/prsvr.

What am I doing wrong?  The only documentation I have handy is what's
provided online, and I have found no explanation there of prmgr error messages.

Robert Lee                (Above opinions are my own and etc, etc, etc...)
InterNet: rlee@island.seas.ucla.edu
UUCP:  ...!(uunet,ucbvax,rutgers)!island.seas.ucla.edu!rlee

krowitz%richter@UMIX.CC.UMICH.EDU (David Krowitz) (03/08/90)

Are you certain that the "prmgr_name" in the configuration file for
the print manager is the same as the "prmgr_site" in the print server
configuration file? (yes, the server configuration file has a different
name for the same parameter!). This is how the manager and the server
identify each other.
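To illustrate the matching requirement, here is a sketch -- the keywords are the ones named above, but the file names, locations, and exact syntax below are hypothetical, so check your own site's prmgr and prsvr configuration files:

```shell
# Hypothetical excerpts -- real file names and layout will differ.

# In the print MANAGER's configuration file:
#     prmgr_name  print_mgr_1

# In the print SERVER's configuration file (a different keyword,
# but its value must match the manager's prmgr_name exactly):
#     prmgr_site  print_mgr_1
```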

Are the NCS brokers running? You must have /etc/ncs/llbd running on the
node which is running the print manager, the node which is running the
print server (which may or may not be the same machine), and the node
which is executing the prf command. "prf", "prsvr", and "prmgr" use NCS
to talk to each other and require "llbd" to find which node each server
is located on. "llbd", in turn, requires that "tcpd" be running on each
node and that /etc/ncs/glbd (the global location broker) be running on
at least one node in the network (this may be any node in the network
which is also running tcpd and llbd, it does not have to be one of the
print server nodes).
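A quick sanity check along these lines might look like the sketch below -- ps options differ between the BSD and SysV environments, and the daemon paths are simply the ones named above:

```shell
# List any NCS and TCP daemons currently running on this node;
# print a warning if none of them show up.
ps -e | grep -E 'llbd|glbd|tcpd' || echo "some NCS daemons are missing"

# If the local location broker is missing, it can be started by
# hand (normally this is done from the node's /etc/rc at boot):
#     /etc/ncs/llbd &
```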


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

krowitz%richter@UMIX.CC.UMICH.EDU (David Krowitz) (03/08/90)

In regards to the problem of the print manager (/sys/hardcopy/prmgr)
and the print server (/sys/hardcopy/prsvr) not talking to each other,
I have come up with yet another point which system managers will need
to check ...

I recently started running multiple global location brokers on my network.
Up to this time I had no problem running the print server on one node (the
node to which the printer was attached, of course), the print manager on
another node, the pre-SR10 print queuing program (/sys/hardcopy/pre10q) on
yet a third node, and using the prf command from anywhere I liked.

After starting up two additional copies of /etc/ncs/glbd (for a total of 3
global location brokers) in order to make certain that the whole net would
be able to continue operating if my master node went down, I found that the
various components of the SR10 printing system could not talk to each other
unless they were running on the same node. The two additional replicas of
the global location broker had been started according to the example given
in the help file, i.e.:

      /etc/ncs/glbd -create -from dds://original_glbd_node

When I ran /etc/ncs/drm_admin to check the state of the glbd databases, I
found that not all of the replicas were known to each other, so I used the
"addrep" command to add the replicas to the list, and then the "merge_all"
command to merge the replicated databases. As soon as this was completed,
the print server and the print manager (which were running on two separate
nodes) saw each other, registered the printer with the manager, and began
operating. Here's the sequence of commands I used (for those who are not
familiar with drm_admin):

$ drm_admin
drm_admin: set -o glb -h dds://original_glbd_node
drm_admin: addrep dds://second_glbd_node
drm_admin: addrep dds://third_glbd_node
drm_admin: merge_all


One interesting note ... the "merge_all" command only worked when I used
the "set" command to operate on the original glbd node. When I tried to
use one of the other nodes as the hub for the merge, drm_admin refused to
do the merge because the system clocks on the three nodes were out of sync
by a minute or two ("skewed" is the term drm_admin uses). When I set
drm_admin to use the original glbd node as the default host to use as the
hub of the merge operation, I still got the messages about the skewed clocks,
but the merge operation completed despite the messages. Any NCS experts
out there who can shed some light on what is going on?


 -- David Krowitz

krowitz@richter.mit.edu   (18.83.0.109)
krowitz%richter.mit.edu@eddie.mit.edu
krowitz%richter.mit.edu@mitvma.bitnet
(in order of decreasing preference)

rlee@ISLAND.SEAS.UCLA.EDU (Robert Lee) (03/10/90)

You're right about the location brokers.  Since I ran drm_admin and
merge_all'd everything, prmgr hasn't complained once.

This is my first time with the sys admin side of things, and there's nothing
turnkey about any of this.  I now have much more respect for our real
systems operations people; thank God I don't do this for a living.

Thanks again.

mk@apollo.HP.COM (Mike Kong) (03/15/90)

In article <9003081502.AA02337@richter.mit.edu>
krowitz%richter@UMIX.CC.UMICH.EDU (David Krowitz) writes:
>...
>I recently started running multiple global location brokers on my network.
>...
>After starting up two addition copies of /etc/ncs/glbd (for a total of 3
>global location brokers) in order to make certain that the whole net would
>be able to continue operating if my master node went down, I found that the
>various components of the SR10 printing system could not talk to each other
>unless they were running on the same node.
>...
>When I ran /etc/ncs/drm_admin to check the state of the glbd databases, I
>found that not all of the replicas were known to each other, so I used the
>"addrep" command to add the replicas to the list, and then the "merge_all"
>command to merge the replicated databases. 
>...
>One interesting note ... the "merge_all" command only worked when I used
>the "set" command to operating on the original glbd node. When I tried to
>use one of the other nodes as the hub for the merge, drm_admin refused to
>do the merge because the system clocks on the three nodes were out of sync

If the clocks at the existing glb site and the new glb site are in synch
(differing by 10 minutes or less), you should not have to add replicas or
merge databases with drm_admin.  If the clocks are out of synch (differing
by more than 10 minutes), the replica creation should fail cleanly.

However, it's possible to create a set of replicas in which sites A and B
are in synch and sites B and C are in synch, but sites A and C are out of
synch.  In this case, database inconsistencies can arise, and a global merge
can work with one site as the hub but not with another site as the hub.  To
avoid these problems, keep the clocks at all sites within 10 minutes of each
other.

At SR10.x, if you have one of the UNIX environments installed, setting
clocks is fairly painless; use the /bin/date utility.  At SR10.2, the
time server daemon /etc/timed is available, but I haven't tried it.
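For instance, a minimal clock check along those lines might look like this sketch -- setting the clock needs root, and the argument format of date(1) varies, so consult your node's manual page:

```shell
# Print the current time on this node; repeat on each glbd node
# and compare -- they should agree to within 10 minutes.
/bin/date

# If a node has drifted, set its clock as root, e.g. with the
# classic SysV MMDDhhmmYY form (hypothetical values shown):
#     /bin/date 0315143090
```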

Mike Kong
Apollo Computer
mk@apollo.hp.com

gjalt@ele.tue.nl (& de Jong) (03/15/90)

In article <493156dd.20b6d@apollo.HP.COM> mk@apollo.HP.COM (Mike Kong) writes:

   At SR10.x, if you have one of the UNIX environments installed, setting
   clocks is fairly painless; use the /bin/date utility.  At SR10.2, the
   time server daemon /etc/timed is available, but I haven't tried it.

Well, we have been running it for a couple of days now, and it works fine!
--
Gjalt G. de Jong,                 | Phone: +(31)40-473345
Eindhoven University of Technology, Dept. of Electr. Eng. (ES/EH 7.26)
P.O. Box 513, 5600 MB Eindhoven, The Netherlands
Email: gjalt@ele.tue.nl