noren@dinl.uucp (Charles Noren) (05/24/89)
We have been developing an application that uses System V message queues
(perhaps that's the first mistake :-)) for interprocess communication.
Everything worked fine until we really wanted to stress test the
application by sending it hundreds of messages at once. The application
chugs away nicely until it hangs.

Long, boring description of the problem follows...

First, a model of the application. It consists of three processes (call
them A, B, and C) and two message queues (call them 1 and 2). The
processes and queues are organized as:

  +--------+                +--------+                +--------+
  |        |      +-----+   |        |      +-----+   |        |
  | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
  |        |      +-----+   |        |      +-----+   |        |
  +--------+                +--------+                +--------+

Process A generates 200 messages in bursts of about 50 as fast as it
can go (CPU bound) and puts them into Queue 1. Process B reads the
messages from Queue 1 and processes them while looking things up in
an Ingres database (we are using Ingres 5.0). Process B sends even more
messages to Queue 2, which is read by Process C.

After Process A sends 150 messages (and Process B delivers more messages),
Process B tries to write to Queue 2 and hangs (using IPC_WAIT on the msgsnd
call). Queue 2 looks empty because Process C is blocking on it (using
msgsnd with IPC_WAIT). Queue 1 appears full because when I try to write
to it with a no-wait (using diagnostic software), it returns with an
errno of 11 (the Sun 3 manual indicates this is caused by a fork with
process limit exceeded or insufficient resources). Trying to write to
Queue 2 produces the same error. Using diagnostic software to read Queue 1
pulls messages off of it, and reading Queue 2 several times breaks the
log jam and the system runs for a while (trying to read Queue 2 with a
no-wait fails, returning an errno of 22 -- Invalid argument, and the
arguments have been checked).

A simplified application with the same processes and queues, but without
the database, flows nicely and NEVER hangs, even with waits in process B
(and process C).

We are running on a Sun 3/260 with 24 MB of RAM and SunOS 4.01.

Any suggestions of what could be happening? Is Ingres using resources
common to Message Queues? Have I shown a misunderstanding of how message
queues are to be used?

Thanks.
--
Chuck Noren
NET: ncar!dinl!noren
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone: (303) 971-7930
trebor@biar.UUCP (Robert J Woodhead) (05/24/89)
In article <1023@dinl.mmc.UUCP> noren@dinl.UUCP (Chuck Noren) writes:
>Process A generates 200 messages in bursts of about 50 as fast as it
>can go (CPU bound) and puts them into Queue 1. Process B reads the
>messages from Queue 1, processes them while looking things up in
>an Ingres database (we are using Ingres 5.0). Process B sends even more
>messages to Queue 2 which is read by Process C.
>
>Any suggestions of what could be happening? Is Ingres using resources
>common to Message Queues? Have I shown a misunderstanding of how message
>queues are to be used?

What is happening is that there is a certain amount of memory space set
aside for messages in queues (this is a kernel parameter) and you are
filling it up.

Consider: your process A is pumping tons of messages into Queue 1, filling
up queue memory space. Now process B attempts to write to Queue 2, but
there is not enough space to do this, so it halts waiting for space to
become available. There are only two ways this can happen: process C or
process B reads messages out of the queues. B can't do this; it is halted
trying to write to queue 2. Now, if C, after reading all the messages in
queue 2, still hasn't freed up enough space for B's write to complete,
your pipeline will freeze. It is even worse, because as C reads things,
process A will probably snarf this space by sending messages. Ooops!

The error message you are getting when you specify IPC_NOWAIT in the
msgsnd indicates this is what is happening. It's occurring because your
pipeline is poorly designed; the majority of the processing is going on
in process B, and tons of messages are piling up in queue 1.

Note that when you use your debug process to read some messages out of the
queues, the blocked write eventually gets done, and the pipeline unblocks.

Solutions:

1) Increase kernel space allocated for messages so that worst case there
   will always be enough room.
   Advantage: quick and easy.
   Disadvantage: unless you can determine the maximum size needed, it
   isn't reliable.

2) Add a queue 3 that connects process C to process A. Process C sends
   a blank message down queue 3 every N messages it gets from queue 2.
   Process A has a counter it decrements by 1 for each message it sends
   into queue 1, starting at, say, 2*N (or M in the general case). When
   this counter goes to 0, process A reads a message from queue 3 (and
   blocks until it gets one), then adds N to the counter and continues.
   This guarantees that there are at most M transactions in transit, and
   allows you to ensure that you don't overflow your message space.

3) Restructure your system as a client/server model, where B serves A
   and C serves B. Since either side can put a message in the queue,
   A sends a message to B, then does a msgrcv. When B reads the message,
   it sends a dummy message back to A, saying ``I got it, you can now put
   another request in''. B and C follow the same protocol. The end result
   is that there is never more than one message in each queue at any
   time. By using two queues between each pair of processes, you can
   generalize this to never more than N messages in each queue at any
   time.

--
Robert J Woodhead, Biar Games, Inc. !uunet!biar!trebor | trebor@biar.UUCP
"The lamb will lie down with the lion, but the lamb won't get much sleep."
    -- Woody Allen.
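The counting in solution 2 can be sketched in C roughly as follows. This
is a minimal model of the bookkeeping only, not code from the post: the
names credit_window, ack_counter and the helper functions are hypothetical,
and the comments mark where the real msgsnd/msgrcv calls on queues 1, 2
and 3 would go. It also assumes B puts exactly one message on queue 2 for
each message it reads from queue 1; if B fans out, C's acknowledgements no
longer track A's messages one for one and the bound weakens.

```c
/* Sender side (process A): how many messages may still be sent
 * before A has to wait for an acknowledgement on queue 3. */
struct credit_window {
    long credits;   /* sends remaining; starts at M */
    long grant;     /* N: credits added per acknowledgement */
};

void cw_init(struct credit_window *w, long m, long n)
{
    w->credits = m;
    w->grant = n;
}

/* Nonzero when A must do a blocking msgrcv() on queue 3 first. */
int cw_must_wait(const struct credit_window *w)
{
    return w->credits == 0;
}

/* Call after each successful msgsnd() to queue 1. */
void cw_on_send(struct credit_window *w)
{
    w->credits--;
}

/* Call after each acknowledgement read from queue 3. */
void cw_on_ack(struct credit_window *w)
{
    w->credits += w->grant;
}

/* Receiver side (process C): counts messages taken from queue 2. */
struct ack_counter {
    long seen;      /* messages since the last acknowledgement */
    long every;     /* N */
};

/* Call after each msgrcv() from queue 2; nonzero means C should now
 * send one blank message down queue 3. */
int ack_due(struct ack_counter *c)
{
    if (++c->seen == c->every) {
        c->seen = 0;
        return 1;
    }
    return 0;
}
```

In this model A can run at most M messages ahead of C, so M times the
largest message size is the figure to compare against the kernel's
message pool.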
jay@mips.COM (Jay McCauley) (05/25/89)
In article <1023@dinl.mmc.UUCP> noren@dinl.UUCP (Chuck Noren) writes:
>We have been developing an application that uses System V message queues
>(perhaps that's the first mistake :-)) for interprocess communication.
>Everything has worked fine until we really wanted to stress test the
>application by sending it hundreds of messages at once. The application
>chugs away nicely until it hangs.

What is probably happening is that you have hit the underdocumented
"feature" that System V message queues all share a very small temporary
buffer space for messages in transit. In some systems I've used, the pool
was 8 KB, apparently the default size in System V Release 2. It is very
easy to create deadlocks with this implementation. I'm not sure how the
pieces of Ingres communicate with each other, but if they too use the
msg facilities, bingo, a deadlock.

There is not a real good way out of this. One way is to use non-blocking
writes, and pause a bit if you get denied. The implementation where I
first encountered this had a tricky backoff protocol to try to prevent
the deadlock. It worked well in practice, but is not theoretically
perfect. Increasing the buffer pool is, typically, just a matter of a
binary reconfig of the kernel. This can reduce the frequency of the
problem, but cannot eliminate it.

We also felt, but could never prove, that there were flaws in the
standard System V implementation of the message facilities. Every so often
we would see mysterious confusion in the ipcs reports that would only
clear by rebooting. This occurred on several different vendors' systems,
so we suspected that there was some sort of subtle synchronization
glitch in the kernel msgop support.

Jay McCauley
MIPS
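A non-blocking write that pauses when it is denied might look like the
sketch below. The function name send_with_backoff, the doubling delay and
the retry cap are assumptions made for illustration; this is not the
backoff protocol mentioned above, whose details are not given. It relies
only on msgsnd() with IPC_NOWAIT failing with EAGAIN when the message
cannot be queued for lack of space.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>
#include <time.h>

/* Send one message without blocking in the kernel. On EAGAIN, sleep and
 * retry, doubling the pause each time (capped at 256 ms).
 * msg points at a struct whose first member is a long message type;
 * body_len is the size of the data that follows that long.
 * Returns 0 on success, -1 on failure with errno set; errno is EAGAIN
 * if every attempt found the queue full. */
int send_with_backoff(int qid, const void *msg, size_t body_len,
                      int max_tries)
{
    long delay_ms = 1;

    for (int attempt = 0; attempt < max_tries; attempt++) {
        if (msgsnd(qid, msg, body_len, IPC_NOWAIT) == 0)
            return 0;
        if (errno == EINTR)
            continue;               /* interrupted by a signal: retry */
        if (errno != EAGAIN)
            return -1;              /* a real error, not "no room" */

        struct timespec pause = { 0, delay_ms * 1000000L };
        nanosleep(&pause, NULL);
        if (delay_ms < 256)
            delay_ms *= 2;
    }
    errno = EAGAIN;
    return -1;
}
```

A caller that gets -1 with errno set to EAGAIN can go and drain its own
input queue, or do other work, before trying again, which is what keeps a
process like B from sleeping forever inside msgsnd. As noted above, this
lowers the odds of a deadlock rather than ruling it out.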
noren@dinl.uucp (Charles Noren) (05/25/89)
Investigating my problem some more, I found some interesting things and
I want to bounce some ideas off the net.

In examining the <sys/msg.h> file I found some interesting comments and
definitions. These seem to imply:

  1. Each message queue is limited to a default size set at
     system configuration time. On our system, this is
     currently set to 2048 bytes (MSGMNB in msg.h).
  2. While each message queue has a limit, all the message
     queues together are limited to a certain amount of memory.
     On our system, this is currently set to 8k bytes
     (MSGPOOL in msg.h).
  3. There is also a fixed limit to the number of message
     packets in the message queue system. This is defined
     in our system as 50 packets (MSGMNI -- number of message
     queue identifiers, MSGTQL -- number of system message
     headers).

In modifying my debug utilities that access message queue statistics,
I found that when I was "stuck", none of the message queues exceeded the
2048 byte limit, and the total did not exceed 4k, well within the 8k
limit. However, I found that I had 50 messages queued across all the
queues -- the limit in msg.h for the number of system message headers
(MSGTQL).

I know how to turn on message queues on the Sun (thanks to answers
from a previous posting), but how do I tune those parameters?
Do I edit msg.h and reconfigure?

Another question: There is a parameter in msg.h, right below the comment,
"The following parameters are assumed not to require tuning", named
MSGMAP that is the number of entries in the map. It is set to 100.
If I change the number of packets in the system to 650, will I need to
set this as well?

Finally, am I wrong in my guesses? Will a knowledgeable person comment
on my guesses? Also, are there any references on the inside
implementation details of message queues, semaphores, shared memory,
sockets, and other kernel stuff that you would recommend?

Thanks to those who have replied and started pointing me in the right
direction.
--
Chuck Noren
ka@june.cs.washington.edu (Kenneth Almquist) (05/25/89)
> +--------+                +--------+                +--------+
> |        |      +-----+   |        |      +-----+   |        |
> | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
> |        |      +-----+   |        |      +-----+   |        |
> +--------+                +--------+                +--------+

This setup can deadlock if the value of msg_qbytes for queue 1 is too
large. The sequence of events is:

  0) Queue 2 is empty.
  1) Process B reads a message from queue 1.
  2) Process A writes lots of messages to queue 1, exhausting the
     global message buffer space.
  3) Process B tries to write a message to queue 2, but can't because
     the global message buffer space is exhausted.

If this is your problem, then you can fix it by limiting the quantity
of messages outstanding on queue 1 to prevent the global message buffer
space from being exhausted. (This is done by using the IPC_SET command
of the msgctl system call to set msg_qbytes to a smaller value.)

On the other hand, you write:

> Queue 2 looks empty because Process C is blocking on it (using
> msgsnd with IPC_WAIT) ...
> reading Queue 2 several times breaks the log jam and the system runs
> for a while ...

If you can read messages off of queue 2 while process C is blocked on a
read, that indicates a kernel bug. Probably the msgsnd system call is
failing to call wakeup for some reason.

                                Kenneth Almquist
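Lowering msg_qbytes with IPC_SET, as suggested above, can be sketched like
this. The function name cap_queue_bytes is hypothetical. The sketch reads
the current settings first so that the owner and permission fields handed
back to the kernel are unchanged; lowering the limit needs no special
privilege, while raising it above the system default normally does.

```c
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Set the byte limit of queue qid to max_bytes.
 * Returns 0 on success, -1 with errno set. */
int cap_queue_bytes(int qid, unsigned long max_bytes)
{
    struct msqid_ds ds;

    /* IPC_SET also writes back uid, gid and mode, so start from the
     * values the kernel already holds. */
    if (msgctl(qid, IPC_STAT, &ds) == -1)
        return -1;

    ds.msg_qbytes = (msglen_t)max_bytes;
    return msgctl(qid, IPC_SET, &ds);
}
```

Calling this on queue 1 with a value well under the size of the global
pool means process A blocks on a full queue 1 while there is still room
in the pool for process B's writes to queue 2.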
stevens@hsi.UUCP (Richard Stevens) (05/25/89)
In article <20284@winchester.mips.COM>, jay@mips.COM (Jay McCauley) writes:
>
> We also felt, but could never prove, that there were flaws in the
> standard System V implementation of the message facilities. Every so often
> we would see mysterious confusion in the ipcs reports that would only
> clear by rebooting. This occurred on several different vendors' systems,
> so we suspected that there was some sort of subtle synchronization
> glitch in the kernel msgop support.
>

There is indeed a problem with the SVr2 implementation of message queues.
As I understand it, the problem occurs when data is copied between
user space and the kernel *and* a page fault occurs. Supposedly it was
fixed in SVr2.1 or SVr3.0, but I don't have the source for either of
these to check. I heard they added a set of locks for each queue so that
common data structures aren't walked over if there is a page fault during
an iomove().

        Richard Stevens
        Health Systems International, New Haven, CT
           stevens@hsi.com
           ... { uunet | yale } ! hsi ! stevens
dave@ucms.UUCP (Dave Settle) (05/25/89)
In article <1023@dinl.mmc.UUCP> noren@dinl.UUCP (Chuck Noren) writes:
>We have been developing an application that uses System V message queues
>(perhaps that's the first mistake :-)) for interprocess communication.
>Everything has worked fine until we really wanted to stress test the
>application by sending it hundreds of messages at once. The application
>chugs away nicely until it hangs.
>
>
>First a model of the application. It consists of three processes (call them
>A, B, and C), and two message queues (call them 1 and 2). The processes
>and queues are organized as:
>
> +--------+                +--------+                +--------+
> |        |      +-----+   |        |      +-----+   |        |
> | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
> |        |      +-----+   |        |      +-----+   |        |
> +--------+                +--------+                +--------+
>
>Process A generates 200 messages in bursts of about 50 as fast as it
>can go (CPU bound) and puts them into Queue 1. Process B reads the
>messages from Queue 1, processes them while looking things up in
>an Ingres database (we are using Ingres 5.0). Process B sends even more
>messages to Queue 2 which is read by Process C.
>
>After Process A sends 150 messages (and Process B delivers more messages),
>Process B tries to write to Queue 2 and hangs (using IPC_WAIT on the msgsnd
>call). Queue 2 looks empty because Process C is blocking on it (using
>msgsnd with IPC_WAIT). Queue 1 appears full because when I try to write
>to it with a no-wait (using diagnostic software),
>it returns with an errno of 11 (the Sun 3 manual
>indicates this is caused by a fork with process limit exceeded or insufficient
>resources). Trying to write to Queue 2 produces the same error.

The error EAGAIN, to which you refer here, is used in a specific manner by
the 'msgsnd' call to mean 'No more space available to store your message'.

>Any suggestions of what could be happening? Is Ingres using resources
>common to Message Queues? Have I shown a misunderstanding of how message
>queues are to be used?
>

From what you have written, I suspect that the problem is that you have
exceeded the GLOBAL message space buffer size with writes to Q1. Process B
can therefore not write to Q2 (no more message space), so nobody can
proceed.

You have confused me by stating both that you cannot write to Q2, and that
process C is sitting trying to read from it, and also that you can cure
the problem for a bit by READING from it - isn't process C trying to do
just that?

But generally, the system has a serious flaw: if Q1 causes the system
message space to fill up, then process B is stuck - it can't write
messages to Q2 'cos the system is full, but on the other hand, it can't
free some space until it has disposed of the message and read Q1.

I think that you can configure the system (look in /etc/master) so that
you can force the individual queue to be full before the whole system is
full. I've never done this, though, so I might be wrong.

You might also consider using 'crash' to examine the state of the kernel
message queue structures: it's very useful for things like this.

Cheers,
        Dave
--
Dave Settle, Universal (CMS) Ltd, Thames Tower, Burleys Way, Leicester, UK.
dave@ucms.co.uk (someday)       ...!mcvax!ukc!nott-cs!ucms!dave
dave@ucms.uucp  (today)         <--- This way to point of view --->
uucibg@sw1e.UUCP (3929]) (05/25/89)
In article <1026@dinl.mmc.UUCP> noren@dinl.UUCP (Charles Noren) writes:
>Investigating my problem some more, I found ...
>  1. Each message queue is limited to a default size set at
>     system configuration time. On our system, this is
>     currently set to 2048 bytes (MSGMNB in msg.h).
>  2. While each message queue has a limit, all the message
>     queues are limited to a certain amount of memory.
>     On our system, this is currently set to 8k bytes
>     (MSGPOOL in msg.h).
>  3. There is also a fixed limit to the number of message
>     packets in the message queue system. This is defined
>     in our system as 50 packets (MSGMNI -- number of message
>     queue identifiers, MSGTQL -- number of system message
>     headers).
>
>I know how to turn on message queues on the Sun (thanks to answers
>from a previous posting), but how do I tune those parameters?
>Do I edit msg.h and reconfigure?

I'm not sure of the name of the file (I've only watched the sysadmin for
the box change the file), but you definitely edit a file and reconfigure
the kernel. To be honest, I don't think it's msg.h (though mileage may
vary since you're not working with SysV).

>Another question: There is a parameter in msg.h, right below the comment,
>"The following parameters are assumed not to require tuning", named
>MSGMAP that is the number of entries in the map. It is set to 100.
>If I change the number of packets in the system to 650, will I need to
>set this as well?

In the system administrator's guide for the 3B2/600 (don't ask me why we
use them, I don't know), there are descriptions of the dependencies. There
are some interesting relationships between the different pieces, but
unfortunately I don't know if they apply to Suns, and I also don't know
what differences there are between SysV2 and SysV3. My suggestion would be
to look in the sections on tuning or reconfiguring the kernel (probably in
the system administrator's reference rather than the guide) to find out
exactly how Suns handle these things.

Brian R. Gilstrap          Southwestern Bell Telephone
One Bell Center Rm 17-G-4  ...!ames!killer!texbell!sw1e!uucibg
St. Louis, MO 63101        ...!bellcore!texbell!sw1e!uucibg
(314) 235-3929             ...!uunet!swbatl!sw1e!uucibg
#include <std_disclaimers.h>
noren@dinl.uucp (Charles Noren) (05/25/89)
In article <8340@june.cs.washington.edu> ka@june.cs.washington.edu (Kenneth Almquist) writes:
]> +--------+                +--------+                +--------+
]> |        |      +-----+   |        |      +-----+   |        |
]> | Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
]> |        |      +-----+   |        |      +-----+   |        |
]> +--------+                +--------+                +--------+
]
]On the other hand, you write:
]
]> Queue 2 looks empty because Process C is blocking on it (using
]> msgsnd with IPC_WAIT) ...
]> reading Queue 2 several times breaks the log jam and the system runs
           ^^^^^^^
]> for a while ...
]
]If you can read messages off of queue 2 while process C is blocked on a
]read, that indicates a kernel bug. Probably the msgsnd system call is
]failing to call wakeup for some reason.
] Kenneth Almquist
Oops, I meant to say Queue 1. Several have caught me on this one.
I'm glad to see all of the good responses in spite of mis-stating
some of the information.
--
Chuck Noren
noren@dinl.uucp (Charles Noren) (05/26/89)
The problem is solved and corrected. Thanks to all who contributed so many
good answers and deepened my understanding of message queues. Particular
thanks go to Doug Looms and Fred DePalm for pointing out the correct Sun
reference manual that tells how to resize the options, and to Keith
Gregory, who gave a brief tutorial on the important "#define" sizing
constants for message queues. The quality of all the responses was great;
they included tips on improving my data flow and general information,
based on experience, on what to do and what not to do with message queues.

A summary of what I found out:

First, an important Sun manual: SYSTEM V ENHANCEMENTS OVERVIEW
(Sun part No. 800-1541-03).

For Sun 3 systems, the configuration file (in directory /sys/sun3/conf)
needs to be modified to include the following lines:

    options MSGPOOL=xxx
        ...where xxx is the size in k bytes of the System V message
        queue memory pool. It must be < 255.

    options MSGMNI=xxx
        ...where xxx is the number of possible message queues allowed
        by the system.

    options MSGTQL=xxx
        ...where xxx is the limit on the number of message packets in
        the system.

    options MSGMNB=xxx
        ...where xxx is the limit on the number of bytes that can be
        queued at a message queue.

There are some other parameters described in the SYSTEM V ENHANCEMENTS
OVERVIEW that pertain to the sizing of the message itself.

Since we have the luxury of delivering a stand-alone system, we configured
the message queues based on the minimum size of messages so that the
constraining factor will be MSGMNB (the maximum number of bytes queued at
a message queue). Since all the processes are actors, any process that
blocks at a queue (the queue being full -- forced by the definition of
the parameters) will have another process at the other side of the queue
that is processing the messages, thus freeing up space in the queue.

Thanks again for all the help. I was also particularly pleased with how
quickly Sun (Fred DePalm) responded and pursued the problem until it was
resolved.

Thanks again!!
--
Chuck Noren
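The sizing rule in the summary (make MSGMNB the limit that is hit first)
can be checked with a little arithmetic. The function below is a rough
sketch under stated assumptions, not a formula from the Sun manual: the
name per_queue_limit_binds is hypothetical, it takes the worst case to be
every queue full of minimum-size messages, and it ignores the rounding of
each message up to whole pool segments that the message-sizing parameters
mentioned above would introduce.

```c
/* Returns 1 if every queue can fill to MSGMNB bytes without the
 * system-wide limits being reached first, 0 otherwise.
 *   nqueues        queues in use by the application
 *   msgmnb         per-queue byte limit (MSGMNB)
 *   msgpool_kb     shared pool size in k bytes (MSGPOOL)
 *   msgtql         system-wide message header limit (MSGTQL)
 *   min_msg_bytes  smallest message the application sends
 */
int per_queue_limit_binds(long nqueues, long msgmnb, long msgpool_kb,
                          long msgtql, long min_msg_bytes)
{
    if (nqueues <= 0 || msgmnb <= 0 || min_msg_bytes <= 0)
        return 0;

    /* Most bytes that can be queued at once: every queue full. */
    long worst_bytes = nqueues * msgmnb;

    /* Most messages that can be queued at once: every queue full of
     * the smallest messages. */
    long worst_msgs = nqueues * (msgmnb / min_msg_bytes);

    return worst_bytes <= msgpool_kb * 1024 && worst_msgs <= msgtql;
}
```

With the defaults reported earlier in the thread (MSGMNB 2048, MSGPOOL 8,
MSGTQL 50), two queues and a hypothetical 40-byte minimum message, the
worst case is 102 queued messages, which is more than 50 headers, so the
check fails; raising MSGTQL to 650 makes it pass.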