[net.unix-wizards] A SYSV MSG question

bzs%bostonu.csnet@CSNET-RELAY.ARPA (Barry Shein) (10/30/85)

Ok, this one has me beat, before I read the sources...

I create 3 msgid's with msgget() for three different
processes, all seems fine ('ipcs -qo' reveals the
three as I would expect them.) Two of these queues
are written to (read later, but not yet, by other
processes.)

One of them has been IPC_SET to be small, but it
works fine so I doubt that is the problem (alone.)
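
For reference, a minimal sketch of the setup described above: create a
queue with msgget() and shrink its byte limit with IPC_SET. The helper
name make_small_queue, the use of IPC_PRIVATE, and the 0600 mode are
assumptions for illustration; the original programs are not shown in
the post and may well use real keys.

```c
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical helper: create a private queue and lower its byte
 * limit (msg_qbytes) to `qbytes`.  Returns the queue id, or -1 on
 * failure.  Lowering the limit needs no special privilege; raising
 * it above the system default usually does. */
int make_small_queue(unsigned long qbytes)
{
    struct msqid_ds ds;
    int id = msgget(IPC_PRIVATE, IPC_CREAT | 0600);

    if (id < 0)
        return -1;

    /* Fetch the current settings first so that IPC_SET only changes
     * msg_qbytes and leaves owner and mode as they were. */
    if (msgctl(id, IPC_STAT, &ds) == 0) {
        ds.msg_qbytes = qbytes;
        if (msgctl(id, IPC_SET, &ds) == 0)
            return id;
    }

    /* Something failed: do not leave a stray queue behind. */
    msgctl(id, IPC_RMID, NULL);
    return -1;
}
```

make_small_queue(40) would give a 40-byte queue; remove it afterwards
with msgctl(id, IPC_RMID, NULL) or ipcrm.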

Now the problem: if the small one blocks on q-full
(which I want it to), the other writing process blocks,
even if its queue is empty. If I kill the process blocked
on the small queue and ipcrm that queue, the strangely
blocked process proceeds as normal (completes its msgsnd()
and continues happily; they are all mumbling at my terminal
what they are doing.)
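
One way to double-check the "its queue is empty" observation from
inside a program, rather than from ipcs, is to ask the kernel with
IPC_STAT. This is only a sketch; report_queue is a hypothetical name
and the output format is made up for illustration.

```c
#include <stdio.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Hypothetical helper: print how full a queue is and return the
 * number of messages currently on it, or -1 if IPC_STAT fails. */
long report_queue(int id)
{
    struct msqid_ds ds;

    if (msgctl(id, IPC_STAT, &ds) < 0) {
        perror("msgctl(IPC_STAT)");
        return -1;
    }
    printf("qid %d: %lu messages queued, limit %lu bytes, "
           "last sender pid %ld\n",
           id,
           (unsigned long)ds.msg_qnum,
           (unsigned long)ds.msg_qbytes,
           (long)ds.msg_lspid);
    return (long)ds.msg_qnum;
}
```

If this reports zero messages on a queue while msgsnd() to that same
queue still blocks, the queue's own msg_qbytes limit is presumably not
what the sender is waiting on.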

Note: these three processes do not share these queues, in
my mind there should be no interaction.

Any ideas? Thanks in advance. (Oh yeah, if I add IPC_NOWAIT
to the strangely blocked process, msgsnd() comes back with <0
and errno == EAGAIN...swell, thanks guys. Note that no matter
why the call would have blocked, with the IPC_NOWAIT flag it
appears you will always get errno == EAGAIN. I think I would
have preferred the original errno, or something to distinguish
why, sigh.)
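
The IPC_NOWAIT behavior complained about above can be seen with a
small wrapper like the following. It is a sketch under assumptions:
send_nowait and struct small_msg are hypothetical names, and the
64-byte payload limit is arbitrary.

```c
#include <errno.h>
#include <string.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

/* Message layout for this sketch: the mandatory long type field
 * followed by up to 64 bytes of payload. */
struct small_msg {
    long mtype;
    char mtext[64];
};

/* Hypothetical helper: send `len` bytes without blocking.
 * Returns 0 on success, otherwise the errno left by msgsnd().
 * When the call would have had to wait, that errno is EAGAIN,
 * with nothing further to say what it would have waited for. */
int send_nowait(int id, const void *data, size_t len)
{
    struct small_msg m;

    if (len > sizeof m.mtext)
        return EINVAL;

    m.mtype = 1;                 /* message type must be > 0 */
    memcpy(m.mtext, data, len);

    if (msgsnd(id, &m, len, IPC_NOWAIT) < 0)
        return errno;
    return 0;
}
```

On a queue whose limit has been set to 8 bytes, a first 8-byte send
should return 0 and a second one EAGAIN.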

	-Barry Shein, Boston University

root%bostonu.csnet@CSNET-RELAY.ARPA (BostonU SysMgr) (10/30/85)

Sorry, I realized just after that flew out of here that I hadn't
said that I think running out of system resources has been ruled out
(tho nothing in my message explained that, sorry.)

Briefly:

	process 1 opens a msgq, sets it down to 40 bytes, and
	loops writing messages to it till it blocks. (It is a
	'unique magic cookie server' that dishes out unique id's
	to other processes that need them for my application. I
	write a bunch till I block just to take advantage of
	buffering, so several quick requests can be serviced w/o
	having to wait for the server process to wake up; it may
	have been dormant for hours and thus swapped out or
	something. It just seemed to make sense.)

	process 2 opens a different msgq w/o diddling anything, and
	writes bufs to it, typically a few hundred bytes long each. ipcs
	indicates the max qbytes to be 16K (the default.)
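
The two-process scenario above can be boiled down to a single test
program, roughly as follows. This is a sketch, not the original code:
the function and struct names, the 4-byte cookie size, and the use of
IPC_PRIVATE queues are all assumptions. It uses IPC_NOWAIT throughout
so that it reports what happened instead of hanging.

```c
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <sys/ipc.h>
#include <sys/msg.h>

struct cookie_msg { long mtype; char mtext[4]; };    /* "process 1" */
struct buf_msg    { long mtype; char mtext[200]; };  /* "process 2" */

/* Queue 4-byte messages without blocking until msgsnd() refuses.
 * Returns how many were queued, or -1 on any error but EAGAIN. */
int fill_until_full(int id)
{
    struct cookie_msg m = { 1, "abc" };
    int n = 0;

    while (msgsnd(id, &m, sizeof m.mtext, IPC_NOWAIT) == 0)
        n++;
    return errno == EAGAIN ? n : -1;
}

/* Fill a 40-byte queue, then try one 200-byte send to a second,
 * untouched queue.  Returns 1 if that send was accepted, 0 if it
 * got EAGAIN, -1 if the setup itself failed. */
int other_queue_accepts(void)
{
    struct msqid_ds ds;
    struct buf_msg b = { 1, "payload" };
    int small = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    int big = msgget(IPC_PRIVATE, IPC_CREAT | 0600);
    int result = -1;

    if (small >= 0 && big >= 0 && msgctl(small, IPC_STAT, &ds) == 0) {
        ds.msg_qbytes = 40;
        if (msgctl(small, IPC_SET, &ds) == 0 &&
            fill_until_full(small) >= 0) {
            if (msgsnd(big, &b, sizeof b.mtext, IPC_NOWAIT) == 0)
                result = 1;
            else if (errno == EAGAIN)
                result = 0;
        }
    }

    /* Clean up both queues whatever happened. */
    if (small >= 0)
        msgctl(small, IPC_RMID, NULL);
    if (big >= 0)
        msgctl(big, IPC_RMID, NULL);
    return result;
}
```

If the queues really are independent, other_queue_accepts() should
return 1. A return of 0 would reproduce the symptom in one process and
would suggest the second sender is waiting on something shared rather
than on its own queue's limit.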

The problem again: as soon as process 1 blocks (with 40 bytes in the q),
process 2 blocks, even if it's the first write. It printfs to me
"Putting buf on q.." and then never proceeds (it would say "Just put
junk on q" if it finished.) Adding IPC_NOWAIT makes it return EAGAIN,
but I kinda knew that, so that is no help.

If I kill process 1 and ipcrm its q, process 2 suddenly wakes up and
proceeds fine, putting lots of messages on the q. If I re-start process 1,
as soon as it blocks, process 2 blocks again.

ipcs reveals only 3 queues active on the system. MSGMNI is 8, but
hitting that should cause a failure from msgget() anyhow, not blocking.

I agree, it sounds like someone ran out of resources, or I misunderstand
something entirely, which is more what I am worried about. Everything
I have read (and reason) indicates that the q's should be independent
things, up to system resources, at which point I can understand a block.

I have to go over all the code with a fine-tooth comb. I was just
hoping someone out there might recognize the problem. If it's not something
you have seen before, don't lose any sleep over it; I'll figure it out
if I have to trace the kernel. Ya know, some weird unspoken rule, like the
kernel blocks all jobs in a pgrp if one is blocked on a msgop (tho I highly
doubt it.)

Thanks in advance.

	-Barry Shein, Boston University