[comp.unix.questions] Insufficient Resource Error on msgsnd Call

noren@dinl.uucp (Charles Noren) (05/24/89)

We have been developing an application that uses System V message queues
(perhaps that's the first mistake :-)) for interprocess communication.
Everything has worked fine until we really wanted to stress test the
application by sending it hundreds of messages at once.  The application
chugs away nicely until it hangs.

Long boring description of the problem follows...

First a model of the application.  It consists of three processes (call them
A, B, and C), and two message queues (call them 1 and 2).  The processes
and queues are organized as:

   +--------+                +--------+                +--------+
   |        |      +-----+   |        |      +-----+   |        |
   | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
   |        |      +-----+   |        |      +-----+   |        |
   +--------+                +--------+                +--------+

Process A generates 200 messages in bursts of about 50 as fast as it
can go (CPU bound) and puts them into Queue 1.  Process B reads the
messages from Queue 1, processes them while looking things up in
an Ingres database (we are using Ingres 5.0).  Process B sends even more
messages to Queue 2 which is read by Process C.

After Process A sends 150 messages (and Process B delivers more messages),
Process B tries to write to Queue 2 and hangs (using IPC_WAIT on the msgsnd
call).  Queue 2 looks empty because Process C is blocking on it (using
msgsnd with IPC_WAIT).  Queue 1 appears full because when I try to write
to it with a no-wait (using diagnostic software), it returns with an errno
of 11 (the Sun 3 manual indicates this is caused by a fork with process
limit exceeded or insufficient resources).  Trying to write to Queue 2
produces the same error.  Using diagnostic software to read Queue 1 pulls
messages off of it, and reading Queue 2 several times breaks the log jam
and the system runs for a while (trying to read Queue 2 with a no-wait
fails, returning an errno of 22 -- Invalid argument, and the arguments
have been checked).

Writing a simplified application with the processes and queues without the
database application flows nicely and NEVER hangs, even with waits in process B
(and process C).

We are running on a Sun 3/260 with 24 MB ram and with SunOS4.01.

Any suggestions of what could be happening?  Is Ingres using resources
common to Message Queues?  Have I shown a misunderstanding of how message
queues are to be used?

Thanks. 


-- 
Chuck Noren
NET:     ncar!dinl!noren
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone:   (303) 971-7930

trebor@biar.UUCP (Robert J Woodhead) (05/24/89)

In article <1023@dinl.mmc.UUCP> noren@dinl.UUCP (Chuck Noren) writes:
>Process A generates 200 messages in bursts of about 50 as fast as it
>can go (CPU bound) and puts it into Queue 1.  Process B reads the
>messages from Queue 1, processes them while looking things up in
>an Ingres database (we are using Ingres 5.0).  Process B sends even more
>messages to Queue 2 which is read by Process C.
>
>Any suggestions of what could be happening?  Is Ingres using resources
>common to Message Queues?  Have I shown a misunderstanding of how message
>queues are to be used?

What is happening is that there is a certain amount of memory space set
aside for messages in queues (this is a kernel parameter) and you are
filling it up.

Consider:  Your process A is pumping tons of messages into Queue 1, filling
up queue memory space.  Now process B attempts to write to Queue 2, but
there is not enough space to do this, so it halts waiting for space to
become available.  There are only two ways this can happen; if process C
or process B reads messages out of the queues.  B can't do this; it is
halted trying to write to queue 2.  Now, if C, after reading all the
messages in queue 2, still hasn't freed up enough space for B's write
to complete, your pipeline will freeze.  It is even worse, because as
C reads things, process A will probably snarf this space by sending
messages.  Ooops!

The error message you are getting when you specify IPC_NOWAIT in the msgsnd
indicates this is what is happening.  It's occurring because your pipeline
is poorly designed; the majority of the processing is going on in process B,
and tons of messages are piling up in queue 1.

Note that when you use your debug process to read some messages out of the
queues, the blocked write eventually gets done, and the pipeline unblocks.

Solutions:

1) Increase kernel space allocated for messages so that worst case there
   will always be enough room.  Advantage : quick and easy.  Disadvantage :
   Unless you can determine the maximum size needed, it isn't reliable.

2) Add a queue 3 that connects process C to process A.  Process C sends
   a blank message down queue 3 every N messages it gets from queue 2.
   Process A has a counter it decrements by 1 for each message it sends into
   queue 1, starting at, say 2*N (or M in the general case).  When
   this counter goes to 0, process A reads a message from queue 3 (and
   blocks until it gets one), and then adds N to the counter and
   continues.  This guarantees that there are at most M transactions
   in transit, and allows you to ensure that you don't overflow your
   message space.

3) Restructure your system as a client/server model, where B serves A
   and C serves B.  Since either side can put a message in the queue,
   A sends a message to B, then does a msgrcv.  When B reads the message,
   it sends a dummy message back to A, saying ``I got it, you can now put
   another request in''.  B and C do the same process.  End result is that
   there is never more than one message in each queue at any time.

   By using two queues between each process, you can generalize this to
   never more than N messages in each queue at any time.
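
A minimal sketch of the counter scheme in solution 2, written as a plain
simulation rather than against real queues.  It collapses the pipeline into
one sender and one receiver, and every name in it (sender, receiver,
peak_in_flight) is hypothetical.  The schedule is the worst case: the sender
runs flat out and the receiver only drains when the sender is stalled.

```c
#include <assert.h>

/* Sender side: messages that may still be sent before a credit
 * must be collected from the (hypothetical) return queue. */
struct sender { int credits; };

/* Receiver side: messages consumed since the last credit was issued. */
struct receiver { int since_credit; };

static int sender_may_send(const struct sender *s)
{
    return s->credits > 0;
}

static void sender_sent(struct sender *s)
{
    assert(s->credits > 0);
    s->credits--;
}

static void sender_got_credit(struct sender *s, int n)
{
    s->credits += n;
}

/* Returns 1 when the receiver owes the sender a credit message. */
static int receiver_consumed(struct receiver *r, int n)
{
    if (++r->since_credit == n) {
        r->since_credit = 0;
        return 1;
    }
    return 0;
}

/* Simulate `total` sends with initial credit m and credit size n.
 * Returns the largest number of messages ever in flight at once. */
int peak_in_flight(int m, int n, int total)
{
    struct sender s = { m };
    struct receiver r = { 0 };
    int sent = 0, in_flight = 0, peak = 0, pending_credits = 0;

    assert(n > 0 && m >= n);
    while (sent < total) {
        if (sender_may_send(&s)) {
            sender_sent(&s);
            sent++;
            if (++in_flight > peak)
                peak = in_flight;
        } else if (pending_credits > 0) {
            /* Sender collects one credit message worth n sends. */
            pending_credits--;
            sender_got_credit(&s, n);
        } else {
            /* Sender is blocked; the receiver drains one message. */
            in_flight--;
            if (receiver_consumed(&r, n))
                pending_credits++;
        }
    }
    return peak;
}
```

Under these assumptions the peak never exceeds the initial credit m, so
sizing the message pool for m messages of the largest size is enough to
keep the sender from starving the downstream stages.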

-- 
Robert J Woodhead, Biar Games, Inc.  !uunet!biar!trebor | trebor@biar.UUCP
"The lamb will lie down with the lion, but the lamb won't get much sleep."
     -- Woody Allen.

jay@mips.COM (Jay McCauley) (05/25/89)

In article <1023@dinl.mmc.UUCP> noren@dinl.UUCP (Chuck Noren) writes:
>We have been developing a application that uses System V message queues
>(perhaps thats the first mistake :-)) for interprocess communication.
>Everything has worked fine until we really wanted to stress test the
>application by sending it hundereds of messages at once.  The application
>chugs away nicely until it hangs.

What is probably happening is you have hit the underdocumented "feature" that
System V message queues all share a very small temporary buffer space
for messages in transit.  In some systems I've used, the pool was 8 KB,
apparently the default size in System V Release 2.  It is very easy to
create deadlocks with this implementation.  I'm not sure how the pieces
of Ingres communicate with each other, but if they too use the msg facilities,
bingo, a deadlock.

There is not a really good way out of this.  One way is to use non-blocking
writes, and pause a bit if you get denied.  The implementation where I
first encountered this had a tricky backoff protocol that attempted to
prevent the deadlock.  It worked well in practice, but is not theoretically
perfect.
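
A rough sketch of the non-blocking-write-and-pause idea.  The doubling
delay with a 64 ms cap is an assumption chosen for illustration, not the
backoff protocol mentioned above, and the names backoff_delay_ms and
send_with_backoff are made up.  Like any retry loop, it makes the deadlock
less likely but does not rule it out.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <time.h>

/* Delay before retry number `attempt` (0-based): 1 ms, doubling,
 * capped at 64 ms.  The schedule itself is an arbitrary choice. */
long backoff_delay_ms(int attempt)
{
    long ms = 1;

    while (attempt-- > 0 && ms < 64)
        ms *= 2;
    return ms;
}

/* msg must point to a buffer that starts with a long message type (> 0)
 * followed by len bytes of payload, as msgsnd() requires.
 * Returns 0 on success, 1 if the queue was still full after max_tries
 * attempts, and -1 on any other error (errno is left set by msgsnd). */
int send_with_backoff(int qid, const void *msg, size_t len, int max_tries)
{
    int i;

    assert(max_tries > 0);
    for (i = 0; i < max_tries; i++) {
        struct timespec ts;
        long ms;

        if (msgsnd(qid, msg, len, IPC_NOWAIT) == 0)
            return 0;
        if (errno != EAGAIN)        /* EAGAIN means "no room right now" */
            return -1;

        ms = backoff_delay_ms(i);
        ts.tv_sec = ms / 1000;
        ts.tv_nsec = (ms % 1000) * 1000000L;
        nanosleep(&ts, NULL);
    }
    return 1;
}
```

A caller that gets 1 back still has the message in hand and can decide
whether to drop it, log it, or go and drain its own input queue first.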

Increasing the buffer pool is, typically, just a matter of a binary
reconfig of the kernel.  This can reduce the frequency of the problem,
but cannot eliminate it.

We also felt, but could never prove, that there were flaws in the
standard System V implementation of the message facilities.  Every so often
we would see mysterious confusion in the ipcs reports that would only
clear by rebooting.  This occurred on several different vendors' systems,
so we suspected that there was some sort of subtle synchronization
glitch in the kernel msgop support.

Jay McCauley
MIPS

noren@dinl.uucp (Charles Noren) (05/25/89)

Investigating my problem some more, I found some interesting things and
I want to bounce some ideas off the net.  In examining the <sys/msg.h>
file I found some interesting comments and definitions.  These seem
to imply:

   1.  Each message queue is limited to a default size set at
       system configuration time.  On our system, this is
       currently set to 2048 bytes (MSGMNB in msg.h).

   2.  While each message queue has a limit, all the message
       queues are limited to a certain amount of memory.
       On our system, this is currently set to 8k bytes
       (MSGPOOL in msg.h).

   3.  There is also a fixed limit to the number of message
       packets in the message queue system.  This is defined
       in our system as 50 packets (MSGMNI -- number of message
       queue identifiers, MSGTQL -- number of system message
       headers).

In modifying my debug utilities that access message queue
statistics,  I found that when I was "stuck", none of the
messages in the message queues exceeded the 2048 byte
limit, and the total did not exceed 4k, well within the
8k limit.  However, I found that I had 50 messages queued
to all the queues -- the limit in msg.h for the count of message
queue identifiers.
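
For anyone wanting to reproduce this kind of check, here is a small sketch
of how such a debug utility might total the queued messages.  It relies only
on msgctl() with IPC_STAT and the standard msg_qnum field; the function name
total_queued and the idea of passing in an array of queue ids are
assumptions of the sketch.  The total is the figure to compare against the
system-wide message header limit (MSGTQL).

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Sum the number of messages currently sitting on each of the given
 * queues.  Returns the total, or -1 if any msgctl() call fails. */
long total_queued(const int *qids, int nqueues)
{
    long total = 0;
    int i;

    assert(nqueues >= 0);
    for (i = 0; i < nqueues; i++) {
        struct msqid_ds ds;

        if (msgctl(qids[i], IPC_STAT, &ds) == -1)
            return -1;
        total += (long)ds.msg_qnum;   /* messages on this queue now */
    }
    return total;
}
```

The same msqid_ds structure also carries msg_qbytes, the per-queue byte
limit, which is handy to print alongside the count.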

I know how to turn on message queues on the Sun (thanks to answers
from a previous posting), but how do I tune those parameters?
Do I edit msg.h and reconfigure?

Another question:  There is a parameter in msg.h, right below the comment,
"The following parameters are assumed not to require tuning", named
MSGMAP that is the number of entries in the map.  It is set to 100.
If I change the number of packets in the system to 650, will I need to
set this as well?

Finally, am I wrong in my guesses?  Will someone knowledgeable comment on
my guesses?  Also, are there any references to the inside implementation
details on message queues, semaphores, shared memory, sockets, kernel
stuff that you would recommend?

Thanks to those who have replied and started pointing me in the right direction.


-- 
Chuck Noren
NET:     ncar!dinl!noren
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone:   (303) 971-7930

ka@june.cs.washington.edu (Kenneth Almquist) (05/25/89)

>    +--------+                +--------+                +--------+
>    |        |      +-----+   |        |      +-----+   |        |
>    | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
>    |        |      +-----+   |        |      +-----+   |        |
>    +--------+                +--------+                +--------+

This setup can deadlock if the value of msg_qbytes for queue 1 is too
large.  The sequence of events is:

    0)	Queue 2 is empty.
    1)	Process B reads a message from queue 1.
    2)	Process A writes lots of messages to queue 1, exhausting the global
	message buffer space.
    3)	Process B tries to write a message to queue 2, but can't because
	the global message buffer space is exhausted.

If this is your problem, then you can fix it by limiting the quantity of
messages outstanding on queue 1 to prevent the global message buffer space
from being exhausted.  (This is done by using the IPC_SET command of the
msgctl system call to set msg_qbytes to a smaller value.)
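
A short sketch of that msgctl() sequence, assuming the caller owns the
queue.  The helper name limit_queue_bytes is made up.  Lowering msg_qbytes
is allowed for the queue's owner; raising it generally requires privilege.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Set the byte limit of queue `qid` to max_bytes.
 * Returns 0 on success, -1 on failure (errno set by msgctl). */
int limit_queue_bytes(int qid, unsigned long max_bytes)
{
    struct msqid_ds ds;

    assert(max_bytes > 0);

    /* Fetch the current settings so owner and mode are preserved. */
    if (msgctl(qid, IPC_STAT, &ds) == -1)
        return -1;

    ds.msg_qbytes = max_bytes;
    return msgctl(qid, IPC_SET, &ds);
}
```

Called once on queue 1 right after it is created, this caps how much of the
shared pool process A can occupy before its msgsnd blocks.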

On the other hand, you write:

> Queue 2 looks empty because Process C is blocking on it (using
> msgsnd with IPC_WAIT) ...
> reading Queue 2 several times breaks the log jam and the system runs
> for a while ...

If you can read messages off of queue 2 while process C is blocked on a
read, that indicates a kernel bug.  Probably the msgsnd system call is
failing to call wakeup for some reason.
				Kenneth Almquist

stevens@hsi.UUCP (Richard Stevens) (05/25/89)

In article <20284@winchester.mips.COM>, jay@mips.COM (Jay McCauley) writes:
> 
> We also felt, but could never prove, that there were flaws in the
> standard System V implementation of the message facilities.  Every so often
> we would see mysterious confusion in the ipcs reports that would only
> clear by rebooting.  This occurred on several different vendors' systems,
> so we suspected that there was some sort of subtle synchronization
> glitch in the kernel msgop support.
> 

There is indeed a problem with the SVr2 implementation of message queues.
As I understand it, the problem occurs when data is copied between the
user space and the kernel *and* a page fault occurs.  Supposedly it was
fixed in SVr2.1 or SVr3.0, but I don't have the source for either of
these to check.  I heard they added a set of locks for each queue
so that common data structures aren't walked over if there is a page
fault during an iomove().

	Richard Stevens
	Health Systems International, New Haven, CT
	   stevens@hsi.com
           ... { uunet | yale } ! hsi ! stevens

dave@ucms.UUCP (Dave Settle) (05/25/89)

In article <1023@dinl.mmc.UUCP> noren@dinl.UUCP (Chuck Noren) writes:
>We have been developing an application that uses System V message queues
>(perhaps that's the first mistake :-)) for interprocess communication.
>Everything has worked fine until we really wanted to stress test the
>application by sending it hundreds of messages at once.  The application
>chugs away nicely until it hangs.
>
>
>First a model of the application.  It consists of three processes (call them
>A, B, and C), and two message queues (call them 1 and 2).  The processes
>and queues are organized as:
>
>   +--------+                +--------+                +--------+
>   |        |      +-----+   |        |      +-----+   |        |
>   | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
>   |        |      +-----+   |        |      +-----+   |        |
>   +--------+                +--------+                +--------+
>
>Process A generates 200 messages in bursts of about 50 as fast as it
>can go (CPU bound) and puts it into Queue 1.  Process B reads the
>messages from Queue 1, processes them while looking things up in
>an Ingres database (we are using Ingres 5.0).  Process B sends even more
>messages to Queue 2 which is read by Process C.
>
>After Process A sends 150 messages (and Process B delivers more messages),
>Process B tries to write to Queue 2 and hangs (using IPC_WAIT on the msgsnd
>call).  Queue 2 looks empty because Process C is blocking on it (using
>msgsnd with IPC_WAIT).  Queue 1 appears full because when I try to write
>to it with a no-wait (using diagnostic software), it returns with an errno
>of 11 (the Sun 3 manual indicates this is caused by a fork with process
>limit exceeded or insufficient resources).  Trying to write to Queue 2
>produces the same error.

The error EAGAIN, to which you refer here, is used in a specific manner by
the 'msgsnd' call to mean 'No more space available to store your message'.

>Any suggestions of what could be happening?  Is Ingres using resources
>common to Message Queues?  Have I shown a misunderstanding of how message
>queues are to be used?
>

From what you have written, I suspect that the problem is that you have
exceeded the GLOBAL message space buffer size with writes to Q1. Process B
therefore cannot write to Q2 (no more message space), so nobody can proceed.

You have confused me by stating both that you cannot write to Q2, and that
Process C is sitting trying to read from it, and also that you can cure the
problem for a bit by READING from it - isn't Process C trying to do just that?

But generally, the system has a serious flaw: if Q1 causes the system message
space to fill up, then Process B is stuck - it can't write messages to Q2 'cos
the system is full, but on the other hand, it can't free some space until
it has disposed of the message and read Q1.

I think that you can configure the system (look in /etc/master) so that
you can force the individual queue to be full, before the whole system
is full. I've never done this, though, so I might be wrong.

You might also consider using 'crash' to examine the state of the kernel
message queue structures: it's very useful for things like this.

Cheers,
	Dave
-- 

Dave Settle, Universal (CMS) Ltd, Thames Tower, Burleys Way, Leicester, UK.

dave@ucms.co.uk	 (someday)		...!mcvax!ukc!nott-cs!ucms!dave
dave@ucms.uucp 	  (today)

		<--- This way to point of view --->

uucibg@sw1e.UUCP (3929]) (05/25/89)

In article <1026@dinl.mmc.UUCP> noren@dinl.UUCP (Charles Noren) writes:
>Investigating my problem some more, I found ...
>   1.  Each message queue is limited to a default size set at
>       system configuration time.  On our system, this is
>       currently set to 2048 bytes (MSGMNB in msg.h).
>   2.  While each message queue has a limit, all the message
>       queues are limited to a certain amount of memory.
>       On our system, this is currently set to 8k bytes
>       (MSGPOOL in msg.h).
>   3.  There is also a fixed limit to the number of message
>       packets in the message queue system.  This is defined
>       in our system as 50 packets (MSGMNI -- number of message
>       queue identifiers, MSGTQL -- number of system message
>       headers).
>
>I know how to turn on message queues on the Sun (thanks to answers
>from a previous posting), but how do I tune those parameters?
>Do I edit msg.h and reconfigure?

I'm not sure of the name of the file (I've only watched the sysadmin for the
box change the file), but you definitely edit a file and reconfigure the
kernel.  To be honest, I don't think it's msg.h (though mileage may vary since
you're not working with SysV).

>Another question:  There is a parameter in msg.h, right below the comment,
>"The following parameters are assumed not to require tuning", named
>MSGMAP that is the number of entries in the map.  It is set to 100.
>If I change the number of packets in the system to 650, will I need to
>set this as well?

In the system administrator's guide for the 3B2/600 (don't ask me why we use
them, I don't know), there are descriptions of the dependencies.  There are
some interesting relationships between the different pieces, but unfortunately
I don't know if they apply to Suns and I also don't know what differences there
are between SysV2 and SysV3.

My guess would be to look into the sections on tuning or reconfiguring the kernel
(probably in the system administrator's reference rather than the guide) to
find out exactly how Suns handle these things.


Brian R. Gilstrap                          Southwestern Bell Telephone
One Bell Center Rm 17-G-4                  ...!ames!killer!texbell!sw1e!uucibg
St. Louis, MO 63101                        ...!bellcore!texbell!sw1e!uucibg
(314) 235-3929                             ...!uunet!swbatl!sw1e!uucibg
#include <std_disclaimers.h>

noren@dinl.uucp (Charles Noren) (05/25/89)

In article <8340@june.cs.washington.edu> ka@june.cs.washington.edu (Kenneth Almquist) writes:
]>    +--------+                +--------+                +--------+
]>    |        |      +-----+   |        |      +-----+   |        |
]>    | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
]>    |        |      +-----+   |        |      +-----+   |        |
]>    +--------+                +--------+                +--------+
]
]On the other hand, you write:
]
]> Queue 2 looks empty because Process C is blocking on it (using
]> msgsnd with IPC_WAIT) ...
]> reading Queue 2 several times breaks the log jam and the system runs
           ^^^^^^^
]> for a while ...
]
]If you can read messages off of queue 2 while process C is blocked on a
]read, that indicates a kernel bug.  Probably the msgsnd system call is
]failing to call wakeup for some reason.
]				Kenneth Almquist

Oops, I meant to say Queue 1.  Several have caught me on this one.
I'm glad to see all of the good responses in spite of mis-stating
some of the information.

-- 
Chuck Noren
NET:     ncar!dinl!noren
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone:   (303) 971-7930

noren@dinl.uucp (Charles Noren) (05/26/89)

The problem is solved and corrected.  Thanks to all who have contributed so
many good answers that opened up my understanding of message queues more.
Particular thanks go to Doug Looms and Fred DePalm for pointing out the
correct Sun reference manual that tells how to resize the options, and to
Keith Gregory who gave a brief tutorial on the important "#define" sizing
constants for message queues.  The quality of all the responses was great,
which included some tips on improving my data flow and general information
based on experience on what and what not to do on message queues.

A summary of what I found out is:

First an important Sun manual:  SYSTEM V ENHANCEMENTS OVERVIEW
(Sun part No. 800-1541-03).  For Sun 3 systems, the configuration
file needs to be modified (in directory /sys/sun3/conf) to include
the following lines:

   options   MSGPOOL=xxx    ...where xxx is the size in k bytes
                            of the system V message queue memory
                            pool.  It must be < 255.

   options   MSGMNI=xxx     ...where xxx is the number of possible message
                            queues allowed by the system.

   options   MSGTQL=xxx     ...where xxx is the limit of the number of
                            message packets in the system.

   options   MSGMNB=xxx     ...where xxx is the limit of the number of bytes
                            that can be queued at a message queue.

There are some other parameters described in the SYSTEM V ENHANCEMENTS
OVERVIEW that pertain to the sizing of the message itself.
Since we have the luxury of delivering a stand-alone system, we
configured the message queues based on the minimum size of messages
so that the constraining factor will be MSGMNB (the maximum number of
bytes queued at a message queue).  Since all the processes are actors,
any process that blocks at a queue (the queue being full -- forced by
the definition of the parameters) will have another process at the
other side of the queue that is processing the messages, thus freeing
up space in the queue.
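
The sizing rule can be written down as a small check.  This is a simplified
sketch: the function name per_queue_limit_binds and the 40-byte minimum
message size used in the example are made up, and it ignores the fact that
the pool is handed out in fixed-size segments, so small messages can use
more pool than their byte count suggests.

```c
#include <assert.h>

/* Returns 1 if every queue can fill to msgmnb bytes before the
 * system-wide header count (msgtql) or the shared pool (msgpool_kb,
 * in kilobytes) runs out, and 0 otherwise. */
int per_queue_limit_binds(int nqueues, long msgmnb, long min_msg_bytes,
                          long msgtql, long msgpool_kb)
{
    long max_msgs_per_queue, headers_needed, bytes_needed;

    assert(nqueues > 0 && msgmnb > 0 && min_msg_bytes > 0);

    /* Most messages a single queue can hold if all are minimum size. */
    max_msgs_per_queue = msgmnb / min_msg_bytes;
    headers_needed = nqueues * max_msgs_per_queue;
    bytes_needed = nqueues * msgmnb;

    return headers_needed <= msgtql && bytes_needed <= msgpool_kb * 1024;
}
```

With two queues, MSGMNB=2048, MSGPOOL=8 and an assumed 40-byte minimum
message, MSGTQL=50 fails the check (102 headers would be needed) while
MSGTQL=650 passes.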

Thanks again for all the help.  I was also particularly pleased with
how quickly Sun (Fred DePalm) responded and pursued the problem until it
was resolved.
Thanks again!! 

-- 
Chuck Noren
NET:     ncar!dinl!noren
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone:   (303) 971-7930