[comp.sys.apollo] create remote process & NCS sockets

root@ICAEN.UIOWA.EDU (Super user) (08/11/90)

In article <1174@fang.dsto.oz> agq@dstos3.dsto.oz (Ashleigh Quick) writes:
>
>
[ description of problems with crp under sr10.1 deleted ]

>Msg> Questions:
>Msg>
>Msg> Is there a limit in DOMAIN/OS on the number of print servers that can
>Msg> be run on a node? (And if so, WHY????)

Yes, there is a limit that is set by the number of NCS sockets that you
can have open. This limit is OS revision dependent. (See next question)

>Msg> Is there a limit on the number of 'sockets' available for NCS type
>Msg> services? (Again - if so why?) If there is a limit - can it be
>Msg> configured in any way??????

Yes, there is an OS-revision-dependent limit on the number of sockets available
for NCS. Back in the "good old days" (pre-NCS) there was no way for a user
to directly consume DDS sockets; users could only run a selected set of
Apollo-supplied tools (such as mbx_helper, netman, spm, etc.) that used them
on the user's behalf. With the advent of NCS, users could write server programs
that directly consumed DDS sockets. Thus there arose more contention for this
scarce resource and things became tight. At sr10.2 the number of sockets was
increased, as noted in the sr10.2 release notes:

                             Software Release 10.2

     1.5.10  Enhancement to Domain/OS Sockets

     The number of user Domain/OS sockets available has been increased at
     SR10.2 from 23 to 64 (for m68k systems).  We now provide sufficient
     user Domain/OS sockets for each user process. This is important as NCS
     applications are becoming more prevalent.

So the answer to your problem is to up-rev your OS ASAP. At sr10.2 it is possible
to run 3 print servers on one node without problems.
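
To make the socket accounting concrete, here is roughly where that one socket
per server goes. The fragment below is only a sketch from memory of the NCS 1.x
NCK interface (rpc_$use_family, rpc_$register, rpc_$listen, the socket_$dds
family constant and the NIDL-generated interface spec are all written from
recollection, and "my_if" is a made-up interface name); check the names and
argument lists against your NCS manuals before you trust them.

/* Sketch of an NCS 1.x server start-up, showing where the one DDS
 * socket per server goes.  All rpc_$/socket_$ names and signatures
 * here are from memory of the NCK interface and may not match your
 * installed headers; Apollo's C compiler accepts '$' in identifiers. */
#include <stdio.h>
#include "rpc.h"        /* NCK headers -- paths are installation dependent */
#include "my_if.h"      /* hypothetical NIDL-generated stubs for "my_if"   */

int main()
{
    socket_$addr_t  saddr;
    unsigned long   slen = sizeof saddr;
    status_$t       st;

    /* This call is what consumes one of the node's DDS sockets
     * (23 at SR10.1, 64 at SR10.2).  If the node is already out of
     * sockets it fails right here with a "can't bind socket" status. */
    rpc_$use_family(socket_$dds, &saddr, &slen, &st);
    if (st.all != status_$ok) {
        fprintf(stderr, "rpc_$use_family failed, status %08lx\n",
                (unsigned long) st.all);
        return 1;
    }

    /* Register the interface and serve calls; the socket stays in use
     * for as long as the server runs, so three print servers plus the
     * llbd, glbd, rgyd, etc. add up quickly against a 23-socket limit. */
    rpc_$register(&my_if_v1$if_spec, (rpc_$epv_t) &my_if_v1$manager_epv, &st);
    rpc_$listen(1, &st);
    return 0;
}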

FYI: At sr10.2 there have been several improvements that make the prsvr printing
system work better. I started writing a print server driver under sr10.1 but gave it up
as I was losing too much hair. Now under sr10.2 I have a reasonable port of my sr9.7
HP LaserJet printer driver working.

Dave Funk

agq@fang.dsto.oz (Ashleigh Quick) (08/14/90)

Thanks to all those who replied via news and e-mail.

Quick summary of the problem:

> We have a node here running SR10.1. When we run three print servers,
> strange things begin to happen... like "prf -list_pr" will fail with a
> message "unable to locate printers for site xxxxx - unable to bind
> socket". When it is sick the /etc/ncs/lb_admin utility will not talk
> to the local location broker (llbd), you cannot CRP off or on the
> node, etc.

A summary of the answers:

Ulf Ekberg
Hewlett-Packard Sverige AB, Response Center (Stockholm, Sweden)
Box 19
164 93 Kista
Sweden
Phone: +46 8 7502417
FAX:   +46 8 7504942
(ULEK%HPUSTOA.HP.COM)   writes:

UE> It is possible that you are simply running out of Domain (DDS) sockets
UE> (not the same thing as BSD sockets). There are 23 such sockets
UE> available on a node at SR10.1, 64 at SR10.2. The socket limit is
UE> documented on the limits(7) man page (limits(5) in SysV).
UE>
UE> Using /etc/ncs/lb_admin to examine the socket usage by NCS servers on
UE> our system shows that the glbd, rgyd, prmgr and prsvr consume one DDS
UE> socket each. In addition, each llbd should use one socket, and of
UE> course any NCS clients communicating over DDS (such as lb_admin) uses
UE> at least one DDS socket. If you are using Omniback, this is a real
UE> socket hog and can consume up to eight (8) sockets on a single node
UE> (see section 4.3 of the Omniback release notes).
UE>
UE> In your case, what is probably happening is that your NCS servers and
UE> clients are consuming all of the 23 available sockets. When you do
UE> "prf -list_printers", prf contacts the prmgrs, and to do this, prf
UE> must allocate a socket for communication first with the glbd (to get
UE> a list of prmgrs) and then with the prmgrs themselves. However, since
UE> there are no free sockets, the prf command fails with "unable to bind
UE> socket".
UE>
UE> If you are indeed running out of sockets, you can do one of the
UE> following things to correct the problem:
UE>
UE> (1) Move some of the NCS servers to other machines.
UE>
UE> (2) Upgrade to SR10.2 (this raises the limit to 64 sockets).
UE>
UE> (3) Install patch 49 (for SR10.1). The release notes for this patch
UE>     (available on patch tapes for August through December 1989) say,
UE>     among other things:
UE>
UE>                    o The number of user sockets was not large enough
UE>                      (APR dd0fb). The number of user sockets was
UE>                      increased to 96. This fix is for sau7 only.  DO
UE>                      NOT use this include file for building other
UE>                      saus; it will cause global B overflow.
UE>
UE>     I should add that I don't know anybody who has actually verified
UE>     that this patch _will_ raise the DDS socket limit.
UE>
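
If you want to confirm that the node really is out of DDS sockets before
picking one of those three fixes, one crude check is to grab sockets in a loop
until the runtime refuses, and count how many you got. As with the sketch in
Dave's reply below, this assumes NCS 1.x NCK calls recalled from memory
(rpc_$use_family, socket_$dds, status_$t); treat the names and signatures as
approximations and check them against the rpc_$ man pages. Note also that
while it runs, the program briefly ties up every free socket on the node.

/* Crude check of how many free DDS sockets are left on this node.
 * rpc_$use_family and socket_$dds are written from memory of the
 * NCS 1.x NCK interface and may need adjusting for your headers.   */
#include <stdio.h>
#include "rpc.h"        /* NCK headers -- path is installation dependent */

int main()
{
    socket_$addr_t  saddr;
    unsigned long   slen;
    status_$t       st;
    int             n = 0;

    /* Keep allocating DDS sockets until the runtime says no.  The
     * sockets should be released when the process exits, but while
     * this runs, nothing else on the node can grab a free socket.   */
    for (;;) {
        slen = sizeof saddr;
        rpc_$use_family(socket_$dds, &saddr, &slen, &st);
        if (st.all != status_$ok)
            break;
        n++;
    }
    printf("allocated %d DDS sockets before running out\n", n);
    return 0;
}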


Dave Funk,
Iowa Computer Aided Engineering Network, University of Iowa
(root@ICAEN.UIOWA.EDU (Super user) )
writes:

DF> Yes, there is a limit that is set by the number of NCS sockets that you
DF> can have open. This limit is OS revision dependent. (See next question)
DF>       [deleted]
DF>
DF> Yes, there is an OS-revision-dependent limit on the number of sockets
DF> available for NCS. Back in the "good old days" (pre-NCS) there was no
DF> way for a user to directly consume DDS sockets; users could only run
DF> a selected set of Apollo-supplied tools (such as mbx_helper, netman,
DF> spm, etc.) that used them on the user's behalf. With the advent of NCS,
DF> users could write server programs that directly consumed DDS sockets.
DF> Thus there arose more contention for this scarce resource and things
DF> became tight. At sr10.2 the number of sockets was increased, as noted
DF> in the sr10.2 release notes:
DF>
DF>                              Software Release 10.2
DF>
DF>      1.5.10  Enhancement to Domain/OS Sockets
DF>
DF>      The number of user Domain/OS sockets available has been increased
DF>      at SR10.2 from 23 to 64 (for m68k systems).  We now provide
DF>      sufficient user Domain/OS sockets for each user process. This is
DF>      important as NCS applications are becoming more prevalent.
DF>
DF> So the answer to your problem is to up-rev your OS ASAP. At sr10.2 it
DF> is possible to run 3 print servers on one node without problems.
DF>
DF> FYI: At sr10.2 there have been several improvements that make the
DF> prsvr printing system work better. I started writing a print server
DF> driver under sr10.1 but gave it up as I was losing too much hair. Now
DF> under sr10.2 I have a reasonable port of my sr9.7 HP LaserJet printer
DF> driver working.
DF>


I also commented that we are held back from going to SR10.2 by running
Mentor products, and a number of people pointed out that they have run
Mentor quite successfully under 10.2, except for Boardstation (which
we are not running on that node anyway).

So there it is!


Thanks to all who replied.

Ashleigh Quick
AGQ@dstos3.dsto.oz.au