[net.bugs.4bsd] lprm hangs printer

hubert@entropy.UUCP (Steve Hubert) (03/06/86)

I presume that this is a well-known problem, but I have not
seen a fix or an explanation for it myself.  When the 4bsd (4.[23])
line printer spooler is printing a job and you lprm the active job,
it will hang the printer queue as often as not.  Has anyone looked
into what is happening and perhaps come up with a fix, or am I doing
something wrong that is causing the problem?

Also, when the system crashes while a job is actively printing the
queue will often be hung after reboot.

Steve Hubert
 Dept. of Stat., U. of Wash, Seattle
 {decvax,ihnp4,ucbvax!lbl-csam}!uw-beaver!entropy!hubert
 hubert%entropy@uw-beaver.arpa

rh@cs.paisley.ac.uk (Robert Hamilton) (03/20/86)

In article <256@entropy.UUCP> hubert@entropy.UUCP writes:
>I presume that this is a well-known problem, but I have not
>seen a fix or an explanation for it myself.  When the 4bsd (4.[23])
>line printer spooler is printing a job and you lprm the active job
>it will hang the printer queue as often as not.  Has anyone looked
>into what is happening and perhaps come up with a fix or am I doing
>something wrong that is causing the problem?
>
>Also, when the system crashes while a job is actively printing the
>queue will often be hung after reboot.
>
I don't know if it's the same problem, but we had something similar:
after lprm on an active job, lpd was restarted with the hostname
(in our case, paisley) as the printer name.
I added the alias paisley to the entry in /etc/printcap and it seemed
to work OK.  Thus:

lp|paisley|local line printer:\
	:lp=/dev/lp:sd=/usr/spool/lpd:lf=/usr/adm/lpd-errs:\
	:rf=/usr/ucb/fpr:

Hope this is of some help!

-- 
UUCP:	...!seismo!mcvax!ukc!paisley!rh
DARPA:	rh%cs.paisley.ac.uk		| Post: Paisley College
JANET:	rh@uk.ac.paisley.cs		|	Department of Computing,
Phone:	+44 41 887 1241 Ext. 219	|	High St. Paisley.
					|	Scotland.
					|	PA1 2BE

ables@milano.UUCP (03/25/86)

> After lprm on an active job lpd was restarted with printer {hostname}
> In our case paisley.
> I added the alias paisley in /etc/printcap and it seemed to work ok.
> Thus:
> 
> lp|paisley|local line printer:\
> 	:lp=/dev/lp:sd=/usr/spool/lpd:lf=/usr/adm/lpd-errs:\
>         :rf=/usr/ucb/fpr:

Well, this will work, but it doesn't really fix the problem; it just
patches the symptom.  The problem is in the code for lprm: when it
tries to restart the printer, it passes the variable containing the
hostname INSTEAD of the variable containing the printer name.  It's
pretty obvious when you look through the code.  The workaround above
DOES make the symptom go away, though, so if you're not into patching
code, and if it doesn't cause any other problems, it'll do.

The REAL fix is to change /usr/src/usr.lib/lpr/rmjob.c as follows:
92c92
<       if (assasinated && !startdaemon(host))
---
>       if (assasinated && !startdaemon(printer))
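
To see why both the printcap alias and the one-line patch make the
symptom go away, here is a toy model of the name lookup.  It is only a
sketch: entry_has_name is a hypothetical helper, not a routine from the
lpr sources, and it assumes the daemon simply matches the requested
name against the '|'-separated names at the head of a printcap entry.

```c
#include <assert.h>
#include <string.h>

/*
 * Hypothetical helper (not from the lpr sources): return 1 if `name`
 * is one of the '|'-separated names at the head of a printcap entry,
 * 0 otherwise.  `names` is the text before the first ':' of the entry.
 */
static int entry_has_name(const char *names, const char *name)
{
    size_t len;

    assert(names != NULL && name != NULL);
    len = strlen(name);

    while (*names != '\0') {
        const char *end = strchr(names, '|');
        size_t n = end ? (size_t)(end - names) : strlen(names);

        /* Compare one complete alias, not just a prefix of it. */
        if (n == len && strncmp(names, name, n) == 0)
            return 1;
        if (end == NULL)
            break;
        names = end + 1;    /* step past the '|' to the next alias */
    }
    return 0;
}
```

In this model, asking for "paisley" against "lp|local line printer"
fails, which corresponds to the restart going nowhere.  Asking for
"paisley" against "lp|paisley|local line printer" succeeds (the alias
workaround), and so does asking for "lp" against either entry (the
effect of passing printer instead of host).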

-King
ARPA: ables@mcc.arpa
UUCP: {ihnp4,seismo,ctvax}!ut-sally!im4u!milano!mcc-pp!ables

thorinn@diku.UUCP (Lars Henrik Mathiesen) (03/26/86)

In article <64@paisley.ac.uk> rh@cs.paisley.ac.uk (Robert Hamilton) writes:
>After lprm on an active job lpd was restarted with printer {hostname}
>In our case paisley.

  The obvious fix (if you have source) is to change the call to startdaemon
on line 92 of rmjob.c from
	...startdaemon(host)...
to
	...startdaemon(printer)...
I don't suppose the developer at Berkeley ever tried removing an active print
job when other jobs were in the queue.
--
Lars Mathiesen, DIKU, U. of Copenhagen, Denmark		..mcvax!diku!thorinn

hosking@convexs.UUCP (03/31/86)

> /* Written  4:40 pm  Mar 24, 1986 by ables@milano.UUCP in net.bugs.4bsd */
> The REAL fix, is to /usr/src/usr.lib/lpr/rmjob.c as follows:
> 92c92
> <       if (assasinated && !startdaemon(host))
> ---
> >       if (assasinated && !startdaemon(printer))
> 
> /* End of text from convexs:net.bugs.4bsd */

One of the people here added an additional fix to this code.  There is
apparently a race between killing the old daemon and starting the new
one.  Our version of the end of rmjob.c looks like this:

	/*
	 * Restart the printer daemon if it was killed,
	 * but first wait until the daemon is really dead.
	 */
	if (assasinated) {   /* added by ACS, 11/12/85 fixes lprm bug */
		int lfd;    /* lock file descriptor */

		lfd = open(LO, O_WRONLY|O_CREAT, 0644);
		if (lfd < 0) {
			printf("rmjob: cannot create %s\n", LO);
			exit(1);
		}

		/*
		 * flock() blocks while the dying daemon still holds the
		 * lock; when it succeeds it's ok to restart the daemon.
		 */
		if (flock(lfd, LOCK_EX) < 0) {
			printf("rmjob: cannot lock %s\n", LO);
			exit(1);
		}
		(void)close(lfd); /* implicit unlock */

		if (!startdaemon(printer))
			fatal("cannot restart printer daemon\n");
	}
	exit(0);
}

I've never really looked at the problem in any detail, but since this
change went in, I don't recall ever seeing daemons lost due to lprm...
and it happened quite a bit before the change went in.
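
For anyone who wants to try the waiting trick outside of rmjob.c, here
is a minimal standalone sketch of the same idea.  The function name
wait_for_lock_release is made up for illustration, and the sketch
assumes the process being waited on holds an flock() on the file for as
long as it is alive, which is what the fragment above appears to rely
on.

```c
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <sys/file.h>
#include <unistd.h>

/*
 * Hypothetical helper: block until no other process holds an flock()
 * on `path`, then release our own lock again.
 * Returns 0 on success, -1 if the file cannot be opened or locked.
 */
int wait_for_lock_release(const char *path)
{
    int fd;

    assert(path != NULL);

    fd = open(path, O_WRONLY | O_CREAT, 0644);
    if (fd < 0) {
        perror(path);
        return -1;
    }

    /* Sleeps here for as long as another process holds the lock. */
    if (flock(fd, LOCK_EX) < 0) {
        perror("flock");
        (void)close(fd);
        return -1;
    }

    (void)close(fd);    /* closing the descriptor drops our lock */
    return 0;
}
```

A caller would do something like wait_for_lock_release("lock") from
the spool directory and restart the daemon only when it returns 0.
Note that the lock is given up again before the restart, so this only
narrows the window; it does not stop a second lprm from racing in.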

				Doug Hosking
				Convex Computer Corp.
				Richardson, TX
				{allegra, ihnp4, uiucdcs}!convex!hosking