marc@aplpy.jhuapl.edu (Marcus Gates) (07/19/90)
I have a sunview program which sends messages to and receives messages
from other programs (some sunview, some not). The message passing is
accomplished using shared memory and SIGUSR1. The messages are placed
into shared memory (semaphores are used to lock shared memory so that
only one program can access it at a time) and then a SIGUSR1 signal is
sent to the process which must retrieve the message from its message
queue. For some reason we occasionally lose a message.

The sunview code is something like:

    main()
    {
        notify_set_signal_func(sim_mod, get_input, SIGUSR1, NOTIFY_SYNC);
        window_main_loop();
    }

    get_input()
    {
        mask = sigsetmask(0xffffffff);

        /* done in while loop since other SIGUSR1 signals might come in
           while we're here - so get messages until there are none. */
        while (recv_msg(&msg) != ERROR) {
            printf("msg %d received\n", msg);
        }

        sigsetmask(mask);
    }

    recv_msg(msg)
    {
        lock_semaphore();
        /* get message from queue, adjust queue */
        printf("message %d taken from queue\n", msg);
        unlock_semaphore();
        return();
    }

The printfs were added to trace execution through these routines. What
we see when a message gets lost is:

    message 23 taken from queue
    message 12 taken from queue    /* 2 in a row from recv_msg */
    msg 12 received
    msg 12 received                /* 2 in a row from get_input */

Any ideas how the code in recv_msg() could be executed twice (without
returning first)?

Any help would be greatly appreciated.

marc gates
UUCP: ...bpa!cp1!aplcen!aplvax!marc
Internet: marc@aplpy.jhuapl.edu