[net.bugs.uucp] anlwrk.c

ber (06/04/82)

#N:harpo:6300002:000:1327
harpo!ber    Jun  4 07:32:00 1982

***** harpo:net.bugs.v7 / utzoo!henry /  9:49 pm  Jun  3, 1982
Some of the code in the anlwrk routines in uucico looks like it was
written by Neanderthals, or else patched and re-patched by someone
who didn't understand it.

When asked to get a file of work, the routines first check a list of
pending work;  if the list is empty, they refill it by scanning the
directory.  This is fine.  BUT:

1. The in-core list is only 10 items long.
2. The directory-scan loop scans the ENTIRE directory even if the
	list was filled up by the first 10 files.
3. The loop allocates space for the filename, and copies it into said
	space, BEFORE it knows whether there is an empty slot in the
	list for that filename.  If the list is full, that space is
	NEVER FREED!

The combination of these results in EXTREMELY slow operation of uucico
if the directory is big.  In fact, the other end can time out while
uucico is rescanning the directory!  This can make it very hard to
clean out the pileup when one of your neighbors has been down for a
while.

The fixes are quite straightforward.  In anlwrk.c, change the value
of LLEN from 10 to (say) 50.  In anlwrk.c/gtwrk(), add
"&& (last - first) < llen" to the loop condition, so that the scan
stops as soon as the list is full.  The similar check in the if
towards the end of the loop is then redundant and can be taken out.
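
For anyone who wants to see the shape of the fixed loop, here is a
minimal sketch.  It is NOT the actual gtwrk() code: the function name,
its arguments, and the use of an in-memory array of names to stand in
for the directory are all assumptions made so the example is
self-contained.  It shows the two points above: the loop stops once
the list is full, and space is allocated only after a free slot is
known to exist.

```c
#include <stdlib.h>
#include <string.h>

#define LLEN 50   /* list capacity; the post suggests raising it from 10 */

/*
 * Hypothetical stand-in for the refill step in gtwrk().
 * names[]  models the directory entries (an assumption for this sketch),
 * list[]   is the in-core work list with room for llen pointers.
 * Returns the number of slots filled; *scanned (if non-NULL) receives
 * the number of entries examined before the loop stopped.
 */
int refill_work_list(const char *const names[], int nnames,
                     const char *prefix, char *list[], int llen,
                     int *scanned)
{
    size_t plen = strlen(prefix);
    int filled = 0;
    int i;

    /* Stop as soon as the list is full instead of walking every entry. */
    for (i = 0; i < nnames && filled < llen; i++) {
        char *copy;

        if (strncmp(names[i], prefix, plen) != 0)
            continue;           /* not a work file for this system */

        /* The loop condition guarantees a free slot here, so the
         * allocation below always ends up in the list. */
        copy = malloc(strlen(names[i]) + 1);
        if (copy == NULL)
            break;
        strcpy(copy, names[i]);
        list[filled++] = copy;
    }

    if (scanned != NULL)
        *scanned = i;
    return filled;
}
```

A caller would pass a list of LLEN pointers and llen = LLEN, and free
each entry once the corresponding work file has been handled.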
----------

ber (07/16/82)

#N:harpo:6300004:000:2930
harpo!ber    Jul 15 18:04:00 1982

***** harpo:net.unix-wizar / physics!gill /  9:30 pm  Jul 13, 1982
	A bug in anlwrk.c causes uucp to core dump after a perfect
login and startup. This happens when there is an unwritable A.xxx file in the
spool directory. Trouble is, anlwrk() in anlwrk.c doesn't check the
stream it gets from fopen against NULL before trying to do an fprintf
of the command lines completed count onto the file.

	afp = fopen(afile, "w");
	fprintf (afp, "%d", acount);
	fclose(afp);

should be changed to

	if ((afp = fopen(afile, "w")) != NULL)
	{
		fprintf (afp, "%d", acount);
		fclose(afp);
	}

The A.xxx file was owned by root on our system (somehow) and mode
0644, due to WFMASK (in uucp.h) being 0133.
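
The arithmetic behind that mode can be checked with a small sketch.
It assumes WFMASK is applied the way a umask is (bits set in the mask
are cleared from the requested mode), which the post implies but does
not spell out; the function names are made up for illustration.

```c
/* Clear the bits named in mask from the requested permission bits,
 * the way a umask does.  Assumption: WFMASK is used this way. */
unsigned int apply_mask(unsigned int requested, unsigned int mask)
{
    return requested & ~mask & 0777u;
}

/* Nonzero if a process that is neither owner nor in the group
 * may write a file with this mode. */
int other_can_write(unsigned int mode)
{
    return (mode & 0002u) != 0;
}
```

With a mask of 0133, a requested mode of either 0666 or 0777 comes out
as 0644, so a file owned by root is read-only for everyone else.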

The core files landed in /usr/spool/uucp, but were of zero length.
This was an extreme pain in the ass, as the symptom only showed
up when uucico was run from an ordinary uid with the setuid bit
on. The "no core files for setuid programs" restriction in the 4.1
kernel should happen before the if (ip == NULL) ... mknode(0666),
not after. A better idea is to skip this test if the core file did not
exist before and the link count is still one. Anything would be better
than the misleading documentation (only found in sig.c) stating that
core dumps can in fact happen to setuid programs.

On another front, I have repaired a bug with dialup() in conn.c.
Calling multiple phone numbers during a single attempt by uucico
to contact a system didn't work.

This was because the alarm call which interrupted the "waiting for
carrier" open of the dz data line left the DTR bit high. Our DN dialer
refused to dial again on this line, since it thought it was busy. The only
way to lower DTR was for uucico to exit, since there was no file
descriptor to do a close on (open never returned). Luckily, as
there was a tty structure associated with that dz line, the exiting
of uucico caused a call to dzclose, lowering DTR.

I kludged up a solution by creating a child process which probed the DZ
line with its own timeout, and either sent a signal to uucico or died, 
depending on whether or not open returned in time. Upon
receipt of the signal, uucico opened the line for itself and killed the
child. If the wait returned instead, uucico just went on to try again;
the exiting of the child (the only one trying to open the dz line) caused
DTR to go low.
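
Roughly, the scheme looks like the sketch below.  This is not my
dialup() code, just an illustration written against the POSIX signal
calls (sigaction, sigprocmask, sigsuspend), which are an assumption
and are not what 4.1 provides.  The function name, the choice of
SIGUSR1 as the "open returned" signal, and the flags argument are
likewise made up for the example.

```c
#define _POSIX_C_SOURCE 200809L
#include <fcntl.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

static volatile sig_atomic_t line_ready;

static void on_ready(int sig) { (void)sig; line_ready = 1; }
static void on_child(int sig) { (void)sig; }

/* Returns an open descriptor for line, or -1 if the child's open
 * failed or did not return within timeout_secs. */
int open_with_probe(const char *line, int flags, unsigned timeout_secs)
{
    struct sigaction ready = {0}, reaped = {0}, old_usr1, old_chld;
    sigset_t block, old, waitmask;
    pid_t child;
    int fd = -1;

    ready.sa_handler = on_ready;
    sigemptyset(&ready.sa_mask);
    reaped.sa_handler = on_child;
    sigemptyset(&reaped.sa_mask);
    sigaction(SIGUSR1, &ready, &old_usr1);
    sigaction(SIGCHLD, &reaped, &old_chld);

    /* Block both signals so neither is lost before sigsuspend(). */
    sigemptyset(&block);
    sigaddset(&block, SIGUSR1);
    sigaddset(&block, SIGCHLD);
    sigprocmask(SIG_BLOCK, &block, &old);
    waitmask = old;
    sigdelset(&waitmask, SIGUSR1);
    sigdelset(&waitmask, SIGCHLD);
    line_ready = 0;

    child = fork();
    if (child == 0) {
        /* Child: probe the line.  The default SIGALRM action kills the
         * child if open() hangs, and its exit releases the line. */
        signal(SIGALRM, SIG_DFL);
        sigprocmask(SIG_SETMASK, &old, NULL);
        alarm(timeout_secs);
        if (open(line, flags) < 0)
            _exit(1);
        alarm(0);
        kill(getppid(), SIGUSR1);   /* tell the parent open returned */
        for (;;)
            pause();                /* hold the line until killed */
    }

    while (child > 0) {
        if (line_ready) {
            /* The child holds the line open, so this open returns. */
            fd = open(line, flags);
            kill(child, SIGKILL);
            waitpid(child, NULL, 0);
            break;
        }
        if (waitpid(child, NULL, WNOHANG) == child)
            break;                  /* child died: timed out or failed */
        sigsuspend(&waitmask);      /* wait for SIGUSR1 or SIGCHLD */
    }

    sigprocmask(SIG_SETMASK, &old, NULL);
    sigaction(SIGUSR1, &old_usr1, NULL);
    sigaction(SIGCHLD, &old_chld, NULL);
    return fd;
}
```

A caller would loop over the phone numbers, dialing each and calling
something like open_with_probe(line, O_RDWR, timeout) afterwards; a
return of -1 means the child is gone and it is safe to dial again.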

If anyone has had this problem and dreamed up a better solution, please
let me know. BTL UNIX 3.0 and beyond have no-wait opens, which offer a 
much cleaner way of dealing with ACUs and their associated serial lines.
If there aren't any better ways around this in 4.1, I'll be happy
to send my version of dialup() to anyone interested. I've thought
of changing the 4.1 kernel to call dzclose when the open call is
aborted, but have decided that this wouldn't always be the right
thing to do.

	Gill Pratt

	reachable by mail at

	....!alice!researc!physics!gill

	or

	gill@mc
----------