dick@ahds.UUCP (Dick Heijne CCS/TS) (04/12/91)
Before coming to the real point, some background on our environment. In our firm we mostly work with NCR computers (i.e. Towers, about 330 of them, all SysV versions), but there are also a Sun (old BSD), a few T.I.'s (XenixV) and, for about a year now, some 200 Nixdorf Targons (all SysV boxes). We have had NCRs since about 1984, and most of the software is written on those and ported to the other types.

Up until the moment we started with non-NCR machines we always worked with NCR's lpr(1)/print(1) spooler (a Berkeley thing, as I was told), with which we NEVER had ANY problem. In those days we chose that spooler for its better functionality over lp(1), i.e. forms support, adjustable priority and things like that. BTW, all systems are used in administrative environments, and during the day MUCH print work is done (most systems have AT LEAST two printers connected, including fast (1000 lpm) line printers). I mention all this just to illustrate that lp problems mean SEVERE problems.

When we bought non-NCR systems we were forced to switch to lp(1), and since the introduction of Sys V.3 NCR has decided not to support the lpr/print(1) spooler anymore. So all systems now run lp(1). Here the flame starts burning. Things we discovered were:

1. The serious problem: Print requests get lost, especially when a printer runs out of paper, is switched off-line for changing the ribbon, or even when it is disabled (not to be mixed up with cancelled!). The problem has been investigated and the causes are obvious. NO, they have NOTHING to do with a bad interface program! Basically they point to bad lpsched design: when lp starts, it causes lpsched to fork a copy in order to do the job. lpsched then opens the port (if it can).
Now, when ANYTHING weird occurs (it gets a SIGHUP or SIGTERM, caused by cancel or disable (YES!!), timeout-on-write or timeout-on-open (what lunatic invented these?), setting the printer off-line for changing ribbon/paper/toner, a paper jam or whatever), it first THROWS THE PRINT REQUEST AWAY and next informs the interface, mostly by passing a SIGTERM to it.

Now, from a user's point of view, the very last thing a print scheduler may do is throw requests away, unless the user orders it to (ONLY by a *cancel* request) OR the print has completed successfully (that is, lpsched receives an exit code of 0 (zero) from the interface program). As anyone with experience in administrative environments can tell, this is a very nasty habit, since sometimes entire production runs have to be rerun: many programs update files while printing, and endless paper has not been invented yet. OK, on a paper jam they sometimes have to do that anyway, but that is obvious to the users. I should add that lp is always called with the -c flag, causing the file to be copied into the spool before printing, since files generally need to be available for update immediately after the print request has been made.

All suppliers of the various systems have been informed about the subject, all can reproduce it and all agree that this is very bad design, but since it is an AT&T standard utility, none of them is willing/allowed to dig into it. Besides, a single fix will not be sufficient, since on the next OS update or when receiving a new type of system we are stuck again. This problem has now reached a level of severity where, here and there at management level, the continued use of Unix in these environments has become debatable.

2. The security hole: lp works with a scheduler (lpr didn't), which is suid'd/sgid'd to itself (i.e.
lp/lp or lp/bin, varies per manufacturer), thus arranging that private files CANNOT be printed, and here comes the security hole: you HAVE to make directories/files searchable/readable by OTHERS to be able to print your things. Here and there one really gets the impression that Unix IS a product raised by students and hobbyists, when things like that still live after god-knows-how-long-lp-exists-now. UNBELIEVABLE!!! The problem is that commercial companies seldom have direct access to the original designers (e.g. AT&T), so maybe they never hear about this?

Problem 2 is as bad as problem 1 but less serious: working around it with chmod's, however unprofessional that is of course, at least keeps production procedures going...

Questions (mostly related to problem 1):

1. Who can tell me a way to get the sources of lpsched in order to get rid of at least problem 1.
2. Where to obtain a spooler that prints print requests rather than purging them (this is NOT my favorite option, unless it is supplied with source files, since it should run on many different systems with different OS versions).
3. How to inform/discuss with the RIGHT people at AT&T (or Unix Foundation or so it is called now, I think) to get rid of these problems in the very near future.
4. Just get rid of the problems (please no nonsense about interfaces, that's not where the problems reside).

Many thanks to all you readers, especially those who can contribute something.

Dick.
lml@cbnews.cb.att.com (l.mark.larsen) (04/16/91)
I discovered the bug in lpsched a few years ago and even posted the information and a source code fix to this newsgroup. As you have observed, lpsched has a rather nasty bug that causes files to be dequeued upon termination of the interface script - regardless of success or failure. For those without source code, it is fairly trivial to code the interface script to take into account this "feature" - which is what I did for the machines I administer. If anyone wants a copy of how I did it, I would be happy to send one.

For those with source, here is the diff of the original vs. the fixed versions of lpsched.c:

608c608
< resetstatus(1, 1);
---
> fclose(rfile);
609a610
> resetstatus(0, 1);
616,617c617
< fclose(rfile);
< unlink(rname);
---
> resetstatus(1, 1);
705c705
< * if dflag != 0 then delete outputq entry and remove associated data
---
> * if oflag != 0 then delete outputq entry and remove associated data

Note that the entire LP package was rewritten and expanded in SVR3.2. The lpsched bug was fixed as a side-effect. Three new features were added: access to forms, easier administration of filters and a menu interface for administration. Lpsched is now setuid root but does setuid() before calling the interface script. Lp is no longer setuid/setgid, so your second problem will also disappear.

In the meantime, as someone else suggested, you can put a wrapper around the lp command to make sure protected files are sent to the real lp program via stdin. I did something similar but for a very different reason.

regards,
L. Mark Larsen
lml@atlas.att.com
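For readers without source, one possible shape for an interface-script workaround is sketched below. This is NOT Larsen's script (it is not shown in the thread), only a guess at the general idea: keep a private copy of each request file until printing has succeeded, so that a request dequeued by lpsched after a failure can still be resubmitted by hand. The names print_with_hold, HOLD_DIR and PRINT_CMD are made up for illustration and are not part of lp.

```shell
#!/bin/sh
# Hypothetical helper for an lp interface script (an assumption, not part
# of lp): hold a copy of every request file until printing has succeeded.
HOLD_DIR=${HOLD_DIR:-/tmp/lp-hold}   # illustrative holding directory
PRINT_CMD=${PRINT_CMD:-cat}          # illustrative "send to printer" command

print_with_hold() {
    # $1 = request id; remaining arguments = files to print
    req_id=$1
    shift

    # Save copies first, so they exist whatever happens later.
    mkdir -p "$HOLD_DIR/$req_id" || return 1
    cp "$@" "$HOLD_DIR/$req_id/" || return 1

    for f in "$@"; do
        # Stop at the first failure and leave the held copies in place.
        $PRINT_CMD < "$f" || return 1
    done

    # Only a fully successful run removes the held copies.
    rm -rf "$HOLD_DIR/$req_id"
    return 0
}
```

Assuming the usual System V interface argument order (id, user, title, copies, options, then the files), a call inside an interface script might look like `id=$1; shift 5; print_with_hold "$id" "$@"`. Files with the same base name would overwrite each other in the holding directory; a real script would have to deal with that.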
peter@ficc.ferranti.com (Peter da Silva) (04/16/91)
In article <1950@ahds.UUCP> dick@ahds.UUCP (Dick Heijne CCS/TS) writes:
> lp works with a scheduler (lpr didn't), which is suid'd/sgid'd
> to itself (i.e. lp/lp or lp/bin, varies per manufacturer), thus
> arranging that private files CANNOT be printed,

Problem 1 is a major boner, but this can be handled just by doing:

	cat file | lp

> 1. Who can tell me a way to get the sources of lpsched in order to
> get rid of at least problem 1.

There are a couple of PD, freeware, or GNUware spoolers out there in the various comp.sources.* archives.

> 3. How to inform/discuss with the RIGHT people at AT&T (or Unix
> Foundation or so it is called now, I think) to get rid of these
> problems in the very near future

Ha. ha. ha. ha. ha. They can't even be convinced to get a summer student to run through the sources replacing "cannot open FROBOZZ" with at *least* perror.
-- 
Peter da Silva.  `-_-'  peter@ferranti.com
+1 713 274 5180.  'U`   "Have you hugged your wolf today?"
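The stdin trick mentioned in this thread can be wrapped up in a small shell function. What follows is only a minimal sketch: the function name lp_stdin and the REAL_LP variable are invented for illustration, and the path to the real lp binary is an assumption that varies per system.

```shell
#!/bin/sh
# Sketch of a wrapper that hands a file to lp on standard input, so the
# file is opened with the caller's own permissions rather than lp's.
# REAL_LP is a hypothetical setting; adjust it to wherever lp really lives.
REAL_LP=${REAL_LP:-/usr/bin/lp}

lp_stdin() {
    # Usage: lp_stdin file [lp options...]
    file=$1
    shift

    if [ ! -r "$file" ]; then
        echo "lp_stdin: cannot read $file" >&2
        return 1
    fi

    # The redirection is done by the caller's shell, so a file readable
    # only by its owner can still be printed.
    "$REAL_LP" "$@" < "$file"
}
```

For example, `lp_stdin report.txt -dlaser` would run the real lp with `-dlaser` and `report.txt` on its standard input (the printer name is made up). Since lp only ever sees a stream, the request title will not show the file name.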