[net.unix-wizards] Pids

rbj@icst-cmr.arpa (root) (04/25/86)

I just finished reading McKusick's paper on `Measuring & Improving
the Performance of 4.2'. One of the things he discusses is searching
the proc table for uniqueness when assigning pids.

I offer the following observations:

1) Pids seem to wrap at 32K. 
2) I have only observed this once, as our machine usually finds
   an excuse to crash, or we reboot it for some reason. Even so, it
   took more than a week.
3) Until the wraparound, *each pid generated (++pid) is guaranteed
   to be unique*!
4) My solution is to have a flag (pidwrap) which is set at the obvious time.
   If the flag is clear, *why search at all*? If not, do it the hard way.
5) Any reason not to use unsigned pids, thus doubling their range?
6) Has my brain been swapped out? Am I missing something?
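A sketch of what (4) might look like, assuming a toy proc table and a
hypothetical `pidinuse()` standing in for the real proc-table search
(all names here are mine, not from any actual kernel):

```c
#include <stdbool.h>

#define MAXPID 30000            /* where the counter wraps */
#define NPROC  64               /* toy proc table size */

static int proc[NPROC];         /* in-use pids; 0 = empty slot */
static int mpid;                /* last pid handed out */
static bool pidwrap;            /* set at the obvious time */

/* hypothetical stand-in for the real proc-table search */
static bool pidinuse(int pid)
{
    for (int i = 0; i < NPROC; i++)
        if (proc[i] == pid)
            return true;
    return false;
}

int nextpid(void)
{
    if (++mpid > MAXPID) {
        mpid = 1;
        pidwrap = true;
    }
    if (!pidwrap)
        return mpid;            /* pre-wrap: unique by construction */
    while (pidinuse(mpid))      /* post-wrap: do it the hard way */
        if (++mpid > MAXPID)
            mpid = 1;
    return mpid;
}
```

The common case (no wrap yet) costs a single flag test per fork; the
catch is that after the first wrap, every allocation pays for a full
scan forever after.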

	(Root Boy) Jim Cottrell		<rbj@cmr>
	"One man gathers what another man spills"

chris@umcp-cs.UUCP (Chris Torek) (04/27/86)

In article <266@brl-smoke.ARPA> rbj@icst-cmr.arpa (root) writes:
>I just finished reading McKusick's paper on `Measuring & Improving
>the Performance of 4.2'. [...]
>
>1) Pids seem to wrap at 32K. 

Actually, at 30000.

>2) I have only observed this once, as our machine usually finds
>   an excuse to crash, or we reboot it for somne reason. Even so, it
>   took more than a week.

You just are not doing enough with your machine :-).  Ours tend to
wrap around every few hours or days.

>3) Until the wrap around, *each pid generated (++pid) is guaranteed
>   to be unique*!
>4) My solution is to have a flag (pidwrap) which is set at the obvious time.
>   If the flag is clear, *why search at all*? If not, do it the hard way.

This is not necessary.  4.3 has a `pidchecked' variable that it
sets to the top of a free range of pids starting at `mpid'.  This
means that one search at boot time sets this to 30000, after which
there are no scans until the first wrap.  It then starts mpid at
100 (not 1, since there are usually all sorts of servers cluttering
up the lowest part of the PID space) and does another scan.  This
usually nets a few hundred free PIDs at the least, and continues
working through any number of wraps.  It depends on only two things:
Process ID space is sparsely populated, and long-lived processes
are rare.
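A sketch of that scheme, with a toy proc table standing in for the real
one (`mpid' and `pidchecked' follow the description above; the scan
logic is my reconstruction, not the actual 4.3 source):

```c
#define PIDMAX  30000
#define PIDBASE 100             /* skip the daemon-cluttered low pids */
#define NPROC   64

static int proc[NPROC];         /* in-use pids; 0 = empty slot */
static int mpid;                /* last pid assigned */
static int pidchecked;          /* pids between mpid and here are free */

int nextpid(void)
{
    if (++mpid >= PIDMAX) {
        mpid = PIDBASE;
        pidchecked = 0;
    }
    if (mpid >= pidchecked) {   /* free range exhausted: rescan */
        int i;
again:
        pidchecked = PIDMAX;
        for (i = 0; i < NPROC; i++) {
            if (proc[i] == mpid) {
                mpid++;         /* collision: bump and rescan */
                goto again;
            }
            if (proc[i] > mpid && proc[i] < pidchecked)
                pidchecked = proc[i];   /* lowest in-use pid above mpid */
        }
    }
    return mpid;                /* no scan needed until pidchecked */
}
```

One scan buys every pid up to `pidchecked'; with a sparse pid space
that is typically hundreds of forks per scan.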

>5) Any reason not to use unsigned pids, this doubling their range?

Fear of breaking old code, perhaps.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1415)
UUCP:	seismo!umcp-cs!chris
CSNet:	chris@umcp-cs		ARPA:	chris@mimsy.umd.edu

tanner@ki4pv.UUCP (Tanner Andrews) (05/02/86)

] 5) Any reason not to use unsigned pids?...

An unsigned pid would really give fits to the old but well-loved test
	if  ((pid = fork()) < 0)
		die(horrible_death);

In fact, with an unsigned pid, you'd have a very hard time
distinguishing between
	(failing fork() == -1)
and
	(working fork == 0xFFFF)	/* may be 0xFFFFFFFF on vaxen */
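The failure is easy to see on paper: with the pid declared unsigned,
the comparison against zero can never fire, and -1 turns into a
legal-looking huge value. A sketch (`fork_failed' is just my name for
the classic test; the conversion of -1 to UINT_MAX is the usual C
unsigned-conversion rule):

```c
#include <limits.h>
#include <stdbool.h>

/* the well-loved error test, transplanted to an unsigned pid:
 * returns true iff `pid < 0' would fire */
static bool fork_failed(unsigned int pid)
{
    return pid < 0;     /* always false: the 0 is converted to 0u */
}
```

Most compilers will even warn that the comparison is always false,
which is the polite way of saying the error return has become
indistinguishable from a real pid.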

-- 
<std dsclm, copies upon request>	   Tanner Andrews

henry@utzoo.UUCP (Henry Spencer) (05/08/86)

Several solutions to the inefficient pid-duplication-search problem exist.
A few of them were described in the Usenix newsletter a while ago.  The
simplest approach is the one Chris Torek has mentioned:  it is trivial to
change the search loop so that instead of looking for an exact match, it
looks for the lowest pid >= the one you're trying.  Then if you don't
get an exact hit (the = part of >=), you've got a range of pids which are
known to be unused, and the search needs to be repeated only when that
range is exhausted.  This idea has been independently re-invented several
times in different places.
-- 
Join STRAW: the Society To	Henry Spencer @ U of Toronto Zoology
Revile Ada Wholeheartedly	{allegra,ihnp4,decvax,pyramid}!utzoo!henry