geoff@utstat.uucp (Geoff Collyer) (12/20/87)
Henry and I simultaneously found a rather nasty bug, or at least an unportability, in the printf family which can result in corrupted active files. So far we have only seen this problem on PDP-11 V7, but Ian Darwin found a related problem on his Dual System V 68k box two years ago (we provide a workaround for that case).

The affected code is in rnews/active.c, in the function incartnum. It attempts to format an article number as a five-digit string, padded with zeros on the left, using the format "ldzeropad", defined in rnews/zeropad.c. The default ldzeropad in the alpha release is "%0*.*ld". (The ANSI C drafts support this form, though they deprecate it.) The distributed rnews/zeropad.c is a link to rnews/bugs/zeropad/okay/zeropad.c. An alternate zeropad.c for people with certain kinds of broken printf (or new SVID-compliant printf) is rnews/bugs/zeropad/bugged/zeropad.c, which fixes the problem that Ian saw by using "%*.*ld" instead.

On PDP-11 V7 (and possibly other implementations), "%0*.*ld" looks like the correct format, but printf has a bug: if the decimal representation of the number being formatted exactly fills the field width (five in this case), printf emits an extra leading zero. This causes the second field of the active file entry to be an order of magnitude too small(!), since only the leftmost five characters of that field are stored on disk. "%*.*ld" doesn't produce the desired output on that system, but "%0*ld" comes close: it zero-fills to the field width correctly, but can produce more digits than the field width if the number being formatted exceeds 99999, again leading to a too-small active file entry.

I concluded that it is simpler (and more robust!) to provide my own long-to-zero-filled-ASCII output conversion than to try to deal with the growing variety of printf formats needed to do the job, and simpler than distributing my own unfinished, portable printf implementation.
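For readers who want to see what such a conversion might look like, here is a minimal sketch. It is not the code from C news; the name ltozero, its arguments and its return convention are all made up for illustration. It writes exactly `width` digits and reports an error rather than emitting a sixth digit, so neither failure described above (the extra leading zero, or overflow past the field width) can reach the active file.

```c
#include <assert.h>
#include <string.h>

/*
 * ltozero (hypothetical name): store n in buf as exactly `width` decimal
 * digits, zero-filled on the left, followed by a NUL.
 * buf must have room for width + 1 bytes.
 * Returns 0 on success, -1 if n is negative, width is not positive,
 * or n needs more than `width` digits.  On failure the contents of buf
 * should not be written to the active file.
 */
int ltozero(long n, char *buf, int width)
{
    int i;

    assert(buf != NULL);
    if (n < 0 || width <= 0)
        return -1;

    memset(buf, '0', (size_t)width);    /* zero-fill the whole field first */
    buf[width] = '\0';

    /* Lay down digits from the right, stopping when n is used up. */
    for (i = width - 1; i >= 0 && n > 0; i--) {
        buf[i] = (char)('0' + (int)(n % 10));
        n /= 10;
    }

    return n == 0 ? 0 : -1;             /* leftover digits mean overflow */
}
```

A caller in the position of incartnum() would declare a buffer of six bytes, call ltozero(artnum, buf, 5), and refuse to update the entry if the return value is -1, rather than silently storing a truncated number.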
If your printf can't be made to perform the output conversion correctly, you'll need to replace the sprintf call in incartnum() with equivalent code; mine is scattered across several source files and is a little big to include in this message, but doing it yourself is pretty easy.

Has anybody thought about widening the active file fields to more than five digits? utzoo already has a few newsgroups with five-digit numbers (around 11000) in their active file entries, and it's only been about a year since The Great Renaming, when all the entries were created afresh.

Several people have asked why C news does such simple locking, using links and not removing "obviously stale" locks. The superior locking facilities found in most modern Unixes have wildly different interfaces and functionality, so no efficient and portable abstract interface presented itself. Furthermore, some of these facilities do not work over network file systems (e.g. 4.2BSD's flock system call over Sun's NFS). The uucp heuristic of checking, via kill(pid, 0) (or ps), whether the process whose process id is stored in the lock file is still alive also falls down in the presence of network file systems. (I already share /usr/spool/uucp across all our machines, but I must currently execute uucp and uux on the file server which owns /usr/spool/uucp. I wish I could execute them locally, but that would require rethinking uucp's stale-lock removal.)

I also like having a lock be an object in the file system name space, so that it can be manually examined (and possibly removed), rather than a lock being an invisible attribute of an in-core i-node or a peculiar invisible entity accessible only through special system calls (shades of System V IPC!). Perhaps more importantly, the breakdown of a locking protocol may be symptomatic of far greater problems, and just forcing the lock and blindly barging ahead may be the wrong thing to do.
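To make the idea of locking with links concrete, here is a rough sketch of the general technique. It is not the C news source; the function names trylock and unlock and their arguments are invented for illustration, and the caller is assumed to supply a temporary file name that is unique to the process. It relies only on link() failing with EEXIST when the lock name already exists, and it never tries to guess whether an existing lock is stale.

```c
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/*
 * trylock (hypothetical name): make one attempt to take the lock.
 * tmpname is a scratch file unique to this process; lockname is the
 * well-known lock name.  Returns 0 if the lock was taken, -1 otherwise
 * with errno set (EEXIST means somebody else holds the lock).
 */
int trylock(const char *tmpname, const char *lockname)
{
    int fd, status, saved;

    assert(tmpname != NULL && lockname != NULL);

    fd = open(tmpname, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    close(fd);

    /* link() creates lockname only if it does not already exist. */
    status = link(tmpname, lockname);
    saved = errno;

    unlink(tmpname);            /* the scratch name is no longer needed */
    errno = saved;              /* report link()'s error, not unlink()'s */
    return status;
}

/* unlock (hypothetical name): release the lock by removing its name. */
int unlock(const char *lockname)
{
    return unlink(lockname);
}
```

A caller would typically retry trylock() with a sleep between attempts while errno is EEXIST, and give up with a complaint after some number of tries instead of removing the lock. The lock remains an ordinary directory entry, so it can be inspected with ls and removed by hand if that is really called for.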
From a practical standpoint, we have found C news to be quite robust and its components, notably expire and rnews, do not dump core in production, once one has configured and installed them correctly. (Rnews dumped core on one of my machines once in production, during development, over a year ago.) A dedicated opponent possibly could make expire or rnews dump core, but I am bullet-proofing rnews so that ultimately that should not be possible either.
--
Geoff Collyer	utzoo!utstat!geoff, utstat.toronto.{edu,cdn}!geoff