noren@dinl.uucp (Charles Noren) (05/24/89)
We have been developing an application that uses System V message queues (perhaps that's the first mistake :-)) for interprocess communication. Everything worked fine until we really wanted to stress test the application by sending it hundreds of messages at once. The application chugs away nicely until it hangs. A long, boring description of the problem follows...

First, a model of the application. It consists of three processes (call them A, B, and C) and two message queues (call them 1 and 2). The processes and queues are organized as:

+--------+                +--------+                +--------+
|        |      +-----+   |        |      +-----+   |        |
| Proc A |---->>| Q 1 |-->| Proc B |---->>| Q 2 |-->| Proc C |
|        |      +-----+   |        |      +-----+   |        |
+--------+                +--------+                +--------+

Process A generates 200 messages in bursts of about 50, as fast as it can go (CPU bound), and puts them into Queue 1. Process B reads the messages from Queue 1 and processes them while looking things up in an Ingres database (we are using Ingres 5.0). Process B sends even more messages to Queue 2, which is read by Process C.

After Process A has sent 150 messages (and Process B has delivered more messages than that), Process B tries to write to Queue 2 and hangs (the msgsnd call is blocking, i.e. made without IPC_NOWAIT). Queue 2 looks empty, because Process C is blocked on it (in a blocking msgrcv, again without IPC_NOWAIT).

Queue 1 appears full: when I try to write to it with IPC_NOWAIT (using diagnostic software), the call returns with an errno of 11 (the Sun 3 manual indicates this is caused by a fork with the process limit exceeded, or by insufficient resources). Trying to write to Queue 2 produces the same error. Using the diagnostic software to read Queue 1 pulls messages off of it, and reading Queue 2 several times breaks the log jam, after which the system runs for a while. (Trying to read Queue 2 with IPC_NOWAIT fails with an errno of 22, Invalid argument, and the arguments have been checked.)

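For anyone who wants to reproduce the no-wait reads and writes described above, here is a minimal sketch of that kind of diagnostic probe. It is not our actual diagnostic software: the names probe_msg, probe_send, probe_recv and report_queue and the 64-byte message body are made up for illustration, and the real application's message layout is different. As far as I understand the msgop(2) semantics, a msgsnd with IPC_NOWAIT on a full queue should fail with EAGAIN, and a msgrcv with IPC_NOWAIT on an empty queue should fail with ENOMSG, so each helper returns the errno it saw.

```c
#include <errno.h>
#include <stdio.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical message layout, for illustration only. */
struct probe_msg {
    long mtype;         /* message type, must be > 0 for msgsnd */
    char mtext[64];     /* message body */
};

/* Non-blocking send. Returns 0 on success, otherwise the errno value
 * (EAGAIN is expected when the queue is full). */
int probe_send(int qid)
{
    struct probe_msg m;

    m.mtype = 1;
    strcpy(m.mtext, "probe");
    if (msgsnd(qid, &m, sizeof m.mtext, IPC_NOWAIT) == -1)
        return errno;
    return 0;
}

/* Non-blocking receive of the first message of any type (msgtyp 0).
 * Returns 0 on success, otherwise the errno value
 * (ENOMSG is expected when the queue is empty). */
int probe_recv(int qid)
{
    struct probe_msg m;

    if (msgrcv(qid, &m, sizeof m.mtext, 0, IPC_NOWAIT) == -1)
        return errno;
    return 0;
}

/* Print how many messages are queued and the queue's byte limit.
 * Returns 0 on success, otherwise the errno value from msgctl. */
int report_queue(int qid)
{
    struct msqid_ds ds;

    if (msgctl(qid, IPC_STAT, &ds) == -1)
        return errno;
    printf("qid %d: %lu messages queued, limit %lu bytes\n",
           qid, (unsigned long)ds.msg_qnum, (unsigned long)ds.msg_qbytes);
    return 0;
}
```

Calling report_queue on both queue ids while the application is hung should show whether Queue 1 really holds messages and whether Queue 2 really is empty, which is a more direct check than inferring it from the errno of a probe write.
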
A simplified application with the same processes and queues, but without the database, flows nicely and NEVER hangs, even with waits in Process B (and Process C). We are running on a Sun 3/260 with 24 MB of RAM and SunOS 4.01.

Any suggestions as to what could be happening? Is Ingres using resources common to message queues? Have I shown a misunderstanding of how message queues are to be used?

Thanks.
--
Chuck Noren
NET:     ncar!dinl!noren
US-MAIL: Martin Marietta I&CS, MS XL8058, P.O. Box 1260,
         Denver, CO 80201-1260
Phone:   (303) 971-7930