[news.software.b] C News Bulletin #6 - printf vs rnews caveat, locking discussion

geoff@utstat.uucp (Geoff Collyer) (12/20/87)

Henry and I simultaneously found a rather nasty bug, or at least
unportability, in the printf family which can result in corrupted active
files.  So far we have only seen this problem on PDP-11 V7, but Ian
Darwin found a related problem on his Dual System V 68k box two years
ago (we provide a workaround for that case).

The affected code is in rnews/active.c, in the function incartnum.
It attempts to format an article number as a five-digit string, padded
with zeros on the left, using the format "ldzeropad", defined in
rnews/zeropad.c.  The default ldzeropad in the alpha release is
"%0*.*ld".  (The ANSI C drafts support this form, though they deprecate
it.)

The distributed rnews/zeropad.c is a link to
rnews/bugs/zeropad/okay/zeropad.c.  An alternate zeropad.c for people
with certain kinds of broken printf (or new SVID-compliant printf) is
rnews/bugs/zeropad/bugged/zeropad.c, which fixes the problem that Ian
saw: "%*.*ld".

On PDP-11 V7 (and possibly other implementations), "%0*.*ld" looks like
the correct format, but printf has a bug: if the decimal representation
of the number being formatted exactly fills the field width (five in
this case), printf emits an extra leading zero.  This causes the second
field of the active file entry to be an order of magnitude too small(!),
since only the leftmost five characters of that field are stored on
disk.  "%*.*ld" doesn't produce the desired output, but "%0*ld" comes
close: it zero-fills to the field width correctly, but can produce more
digits than the field width if the number being formatted exceeds 99999,
again leading to a too-small active file entry.
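
To make the arithmetic concrete, here is a small sketch of what a reader of
the active file sees once only the leftmost five characters survive.  The
name stored_value and the FIELDWIDTH macro are made up for illustration;
this is not code from rnews.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define FIELDWIDTH 5    /* width of the article-number field, per the text */

/*
 * Hypothetical helper: keep only the leftmost FIELDWIDTH characters of an
 * already-formatted number, as happens when the field is stored, and return
 * the value a later reader of the field would see.
 */
long stored_value(const char *formatted)
{
    char field[FIELDWIDTH + 1];

    assert(formatted != NULL);
    strncpy(field, formatted, FIELDWIDTH);
    field[FIELDWIDTH] = '\0';       /* strncpy need not terminate the copy */
    return strtol(field, (char **)NULL, 10);
}
```

With the extra leading zero, article 12345 is formatted as "012345", and
stored_value("012345") is 1234.  With "%0*ld", article 123456 is formatted
as "123456", and stored_value("123456") is 12345.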

I concluded that it is simpler (and more robust!) to provide my own
long-to-zero-filled-ASCII output conversion than to try to deal with
the growing variety of printf formats needed to do the job, and simpler
than distributing my own unfinished, portable printf implementation.  If
your printf can't be made to perform the output conversion correctly,
you'll need to replace the sprintf call in incartnum() with equivalent
code; mine is scattered across several source files and is a little big
to include in this message, but doing it yourself is pretty easy.
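
For anyone who wants a starting point, one possible shape for such a
conversion is sketched below.  This is not the code from C news: the name
ltozp, its interface, and the choice to refuse numbers that do not fit
(rather than truncate them) are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>

/*
 * Hypothetical stand-in for the sprintf call: write n into buf as exactly
 * `width` decimal digits, zero-filled on the left, followed by a NUL.
 * buf must have room for width + 1 characters.
 * Returns 0 on success, or -1 (leaving buf untouched) if n is negative or
 * needs more than `width` digits.  The failure policy is an assumption.
 */
int ltozp(char *buf, long n, int width)
{
    long t;
    int i, digits = 1;

    assert(buf != NULL && width > 0);
    if (n < 0)
        return -1;
    for (t = n; t >= 10; t /= 10)   /* count the digits n really needs */
        digits++;
    if (digits > width)
        return -1;
    for (i = width - 1; i >= 0; i--) {  /* fill from the right */
        buf[i] = (char)('0' + n % 10);
        n /= 10;
    }
    buf[width] = '\0';
    return 0;
}
```

Called as ltozp(buf, 42L, 5) this leaves "00042" in buf; a number that
exactly fills the field, such as 12345, comes out as "12345" with no extra
zero, and 123456 is rejected instead of silently overflowing the field.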

Has anybody thought about widening the active file fields to more than
five digits?  utzoo already has a few newsgroups with five-digit numbers
(around 11000) in their active file entries, and it's only been about a
year since The Great Renaming, when all the entries were created afresh.


Several people have asked why C news does such simple locking, using
links and not removing "obviously stale" locks.  Superior locking
facilities found in most modern Unixes have wildly different interfaces
and function, so no efficient and portable abstract interface presented
itself.  Furthermore, some of these facilities do not work over network
file systems (e.g. 4.2BSD's flock system call over Sun's NFS).  The uucp
heuristic of checking, via kill(pid, 0) (or ps), whether the process whose
process id is stored in the lock file is still alive also falls down in
the presence of network file systems, since the process may be running
on another machine.  (I already share /usr/spool/uucp across all our
machines, but I must currently execute uucp and uux on the file server
which owns /usr/spool/uucp.  I wish I could execute them locally, but
that would require rethinking uucp's stale-lock removal.)
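
The liveness test in question can be sketched roughly as follows.  The
function name pid_alive is invented for this example and is not taken from
uucp; the point is only that kill() consults the local process table.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <unistd.h>     /* getpid(), used in the usage note below */

/*
 * Return 1 if a process with this id exists on the LOCAL machine, else 0.
 * Signal 0 performs only the existence and permission checks; no signal
 * is actually delivered.  EPERM means the process exists but belongs to
 * someone else.
 */
int pid_alive(pid_t pid)
{
    assert(pid > 0);
    if (kill(pid, 0) == 0)
        return 1;
    return errno == EPERM;
}
```

pid_alive(getpid()) is 1, as one would hope.  If the lock was created by a
process on another machine sharing the spool directory, the answer says
nothing useful: the id may be unused locally, or may belong to some
unrelated local process.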

I also like having a lock be an object in the file system name space, so
that it can be manually examined (and possibly removed), rather than a
lock being an invisible attribute of an in-core i-node or a peculiar
invisible entity accessible only through special system calls (shades of
System V IPC!).
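
For readers unfamiliar with the technique, locking with links can be
sketched along these lines.  This is an illustration, not the C news
source: the function names and the scheme for naming the temporary file
are assumptions.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Try to take the lock named lockname by linking a private temporary file
 * to it.  link() fails if lockname already exists, so at most one process
 * can succeed.  Returns 0 if the lock was acquired, -1 otherwise.
 */
int lock_acquire(const char *lockname)
{
    char tmp[1024];
    int fd, n, ok;

    assert(lockname != NULL);
    /* Hypothetical naming scheme: temporary file beside the lock, tagged by pid. */
    n = snprintf(tmp, sizeof tmp, "%s.%ld", lockname, (long)getpid());
    if (n < 0 || n >= (int)sizeof tmp)
        return -1;
    fd = open(tmp, O_WRONLY | O_CREAT | O_TRUNC, 0644);
    if (fd < 0)
        return -1;
    close(fd);
    ok = link(tmp, lockname) == 0;  /* the atomic test-and-set */
    unlink(tmp);                    /* the temporary name is no longer needed */
    return ok ? 0 : -1;
}

/* Release the lock by removing its name; returns 0 on success, -1 on error. */
int lock_release(const char *lockname)
{
    return unlink(lockname);
}
```

While the lock is held it shows up in ls like any other file, and a human
who has worked out what went wrong can remove it with rm.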

Perhaps more importantly, the breakdown of a locking protocol may be
symptomatic of far greater problems, and just forcing the lock and
blindly barging ahead may be the wrong thing to do.  From a practical
standpoint, we have found C news to be quite robust: its components,
notably expire and rnews, do not dump core in production once they have
been configured and installed correctly.  (Rnews dumped core on one of
my machines once in production, during development, over a year ago.)  A
dedicated opponent possibly could make expire or rnews dump core, but I
am bullet-proofing rnews so that ultimately that should not be possible
either.
-- 
Geoff Collyer	utzoo!utstat!geoff, utstat.toronto.{edu,cdn}!geoff