[comp.unix.wizards] Insufficient Resource Error on msgsnd Call

noren@dinl.uucp (Charles Noren) (05/24/89)

We have been developing an application that uses System V message queues
(perhaps that's the first mistake :-)) for interprocess communication.
Everything has worked fine until we really wanted to stress test the
application by sending it hundreds of messages at once.  The application
chugs away nicely until it hangs.

Long boring description of the problem follows...

First a model of the application.  It consists of three processes (call them
A, B, and C), and two message queues (call them 1 and 2).  The processes
and queues are organized as:

   +--------+                +--------+                +--------+
   |        |      +-----+   |        |      +-----+   |        |
   | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
   |        |      +-----+   |        |      +-----+   |        |
   +--------+                +--------+                +--------+

Process A generates 200 messages in bursts of about 50 as fast as it
can go (CPU bound) and puts them into Queue 1.  Process B reads the
messages from Queue 1, processes them while looking things up in
an Ingres database (we are using Ingres 5.0).  Process B sends even more
messages to Queue 2 which is read by Process C.
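
For illustration, a minimal sketch of what one pass of a Process-B-style
loop could look like, assuming the modern XSI signatures of msgrcv and
msgsnd.  The names relay_msg, relay_one, RELAY_TEXT_MAX and the fanout
parameter are hypothetical and are not taken from the application; the
Ingres lookup is reduced to a comment.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define RELAY_TEXT_MAX 128      /* hypothetical payload size */

struct relay_msg {
    long mtype;                 /* message type, must be > 0 */
    char mtext[RELAY_TEXT_MAX]; /* payload */
};

/*
 * One pass of a Process-B-style loop (illustrative only):
 * block until a message arrives on q_in, then forward `fanout`
 * copies of it to q_out.  Returns the number of messages
 * forwarded, or -1 on error (errno is left set by the failing call).
 */
int relay_one(int q_in, int q_out, int fanout)
{
    struct relay_msg m;

    /* Flags of 0: sleep here while q_in is empty. */
    ssize_t n = msgrcv(q_in, &m, sizeof m.mtext, 0, 0);
    if (n == -1)
        return -1;

    /* The real Process B would do its database lookup here. */

    for (int i = 0; i < fanout; i++) {
        /* Flags of 0: sleep here while q_out, or a system-wide
         * message limit, is exhausted. */
        if (msgsnd(q_out, &m, (size_t)n, 0) == -1)
            return -1;
    }
    return fanout;
}
```

With a fanout greater than 1, every message taken off Queue 1 puts several
on Queue 2, so the total number of queued messages can grow even while
Process B is keeping up with Process A.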

After Process A sends 150 messages (and Process B delivers more messages),
Process B tries to write to Queue 2 and hangs (it uses a blocking msgsnd
call, i.e. without IPC_NOWAIT).  Queue 2 looks empty because Process C is
blocked on it (using a blocking msgrcv).  Queue 1 appears full because when
I try to write to it with a no-wait (using diagnostic software),
it returns with an errno of 11 (EAGAIN; the Sun 3 manual
indicates this is caused by a fork with process limit exceeded or insufficient
resources).  Trying to write to Queue 2 produces the same error.
Using diagnostic software to read Queue 1 pulls messages off of it, and
reading Queue 2 several times breaks the log jam and the system runs
for a while (trying to read Queue 2 with a no-wait fails returning an errno
of 22 -- Invalid argument, and the arguments have been checked).
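
The no-wait probes described above can be sketched roughly as follows.
This is not the diagnostic software itself: probe_msg, probe_send,
probe_recv and PROBE_TEXT_MAX are hypothetical names, and the modern XSI
signatures of msgsnd and msgrcv are assumed.  Each helper returns 0 on
success or the errno value of the failed call, so a full queue shows up as
EAGAIN from probe_send and an empty one as ENOMSG from probe_recv.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

#define PROBE_TEXT_MAX 64       /* hypothetical payload size */

struct probe_msg {
    long mtype;                 /* message type, must be > 0 */
    char mtext[PROBE_TEXT_MAX]; /* payload */
};

/* Non-blocking send.  Returns 0 on success, else the errno value. */
int probe_send(int qid, const char *text)
{
    struct probe_msg m;

    m.mtype = 1;
    strncpy(m.mtext, text, sizeof m.mtext - 1);
    m.mtext[sizeof m.mtext - 1] = '\0';

    /* The size argument counts only mtext, not the mtype field. */
    if (msgsnd(qid, &m, sizeof m.mtext, IPC_NOWAIT) == -1) {
        int err = errno;
        fprintf(stderr, "msgsnd(qid=%d): errno %d (%s)\n",
                qid, err, strerror(err));
        return err;
    }
    return 0;
}

/* Non-blocking receive of the first message of any type.
 * Returns 0 on success, else the errno value. */
int probe_recv(int qid, char *out, size_t outlen)
{
    struct probe_msg m;

    if (outlen == 0)
        return EINVAL;
    if (msgrcv(qid, &m, sizeof m.mtext, 0, IPC_NOWAIT) == -1) {
        int err = errno;
        fprintf(stderr, "msgrcv(qid=%d): errno %d (%s)\n",
                qid, err, strerror(err));
        return err;
    }
    strncpy(out, m.mtext, outlen - 1);
    out[outlen - 1] = '\0';
    return 0;
}
```

Printing the errno together with its strerror text makes it easier to tell
a full queue (EAGAIN) from a bad queue id or size argument (EINVAL).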

Writing a simplified application with the processes and queues without the
database application flows nicely and NEVER hangs, even with waits in process B
(and process C).

We are running on a Sun 3/260 with 24 MB of RAM and SunOS 4.0.1.

Any suggestions of what could be happening?  Is Ingres using resources
common to Message Queues?  Have I shown a misunderstanding of how message
queues are to be used?

Thanks. 


-- 
Chuck Noren
NET:     ncar!dinl!noren
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone:   (303) 971-7930

noren@dinl.uucp (Charles Noren) (05/25/89)

Investigating my problem some more, I found some interesting things and
I want to bounce some ideas off the net.  In examining the <sys/msg.h>
file I found some interesting comments and definitions.  These seem
to imply:

   1.  Each message queue is limited to a default size set at
       system configuration time.  On our system, this is
       currently set to 2048 bytes (MSGMNB in msg.h).

   2.  While each message queue has a limit, all the message
       queues together are limited to a certain amount of memory.
       On our system, this is currently set to 8k bytes
       (MSGPOOL in msg.h).

   3.  There is also a fixed limit to the number of message
       packets in the message queue system.  This is defined
       in our system as 50 packets (MSGMNI -- number of message
       queue identifiers, MSGTQL -- number of system message
       headers).
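
To see how the three limits above interact, here is a rough
back-of-the-envelope sketch.  It assumes every message has the same size
and ignores any rounding of messages to allocation segments, so it is an
estimate and not how the kernel actually accounts for space.  The function
name max_queued_messages and its parameter names are hypothetical.

```c
/* Smaller of two values. */
static long min_long(long a, long b)
{
    return a < b ? a : b;
}

/*
 * Estimate how many equal-sized messages can be queued at once.
 *   msg_size   - bytes of text per message
 *   n_queues   - number of queues in use
 *   mnb        - per-queue byte limit (MSGMNB, 2048 here)
 *   pool_bytes - total message pool in bytes (8k here)
 *   tql        - system-wide message header limit (MSGTQL, 50 here)
 * Returns the smallest of the three resulting bounds.
 */
long max_queued_messages(long msg_size, long n_queues,
                         long mnb, long pool_bytes, long tql)
{
    if (msg_size <= 0 || n_queues <= 0)
        return 0;

    long by_queue = (mnb / msg_size) * n_queues; /* per-queue byte limit */
    long by_pool  = pool_bytes / msg_size;       /* shared memory pool   */

    return min_long(min_long(by_queue, by_pool), tql);
}
```

For example, max_queued_messages(64, 2, 2048, 8192, 50) gives 50: with
64-byte messages on two queues the byte limits would allow 64 and 128
messages, so the 50-header limit is the one that binds.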

In modifying my debug utilities that access message queue
statistics,  I found that when I was "stuck", none of the
messages in the message queues exceeded the 2048 byte
limit, and the total did not exceed 4k, well within the
8k limit.  However, I found that I had 50 messages queued
to all the queues -- the limit in msg.h for the count of message
queue identifiers.
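
A minimal sketch of how such statistics can be read with msgctl and
IPC_STAT, not the debug utilities themselves.  It uses only the msg_qnum
and msg_qbytes fields of struct msqid_ds; the names queue_usage,
get_queue_usage and total_queued are hypothetical.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct queue_usage {
    unsigned long messages;   /* messages currently on the queue */
    unsigned long byte_limit; /* maximum bytes allowed on the queue */
};

/* Fill *out for one queue.  Returns 0 on success, -1 on error. */
int get_queue_usage(int qid, struct queue_usage *out)
{
    struct msqid_ds ds;

    if (msgctl(qid, IPC_STAT, &ds) == -1)
        return -1;
    out->messages   = (unsigned long)ds.msg_qnum;
    out->byte_limit = (unsigned long)ds.msg_qbytes;
    return 0;
}

/*
 * Add up the messages waiting on n queues, for comparison with a
 * system-wide limit.  Returns the total, or -1 if any queue cannot
 * be examined.
 */
long total_queued(const int *qids, int n)
{
    long total = 0;

    for (int i = 0; i < n; i++) {
        struct queue_usage u;
        if (get_queue_usage(qids[i], &u) == -1)
            return -1;
        total += (long)u.messages;
    }
    return total;
}
```

Calling total_queued on the ids of Queue 1 and Queue 2 while the
application is stuck gives the figure to compare against the 50-message
limit.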

I know how to turn on message queues on the Sun (thanks to answers
from a previous posting), but how do I tune those parameters?
Do I edit msg.h and reconfigure?

Another question:  There is a parameter in msg.h, right below the comment,
"The following parameters are assumed not to require tuning", named
MSGMAP that is the number of entries in the map.  It is set to 100.
If I change the number of packets in the system to 650, will I need to
set this as well?

Finally, am I wrong in my guesses?  Will someone knowledgeable comment on
my guesses?  Also, are there any references on the internal implementation
details of message queues, semaphores, shared memory, sockets, and other
kernel stuff that you would recommend?

Thanks to those who have replied and started pointing me in the right direction.

