[comp.mail.uucp] core dumps in Xenix 2.3.2 HDB uucp : SUMMARY

corby@netxdev.DHL.COM (Corby Anderson, SOWID) (01/05/90)

> Our mail and news server runs on an HP Vectra 386 machine running Xenix
> 2.3.2.  We're using HDB uucp for our communications stuff, and here's the
> problem.  Every half hour, at 9 after, and at 39 after, uucico makes a
> core dump in /usr/spool/uucp.  So far, SCO hasn't been much help.  Has
> anyone else run into this problem?  If so, then what have you done about
> it?
> 
> Please reply via E-mail, responses will be summarized.
> reply to netxcom!corby

A couple weeks ago I posted this article about troubles I was having with
uucp.  I've gotten several responses, for which I thank all you netters,
and I've narrowed the problem down a bit, and, as promised, I am posting
my findings here:

   First, I should say that the problem is not solved.  Here is exactly
   what happens and what effects it will have:

   Every half hour, at 9 after and 39 after, cron invokes a script in
   /usr/lib/uucp called uudemon.hour.  This script calls uusched with
   an ampersand (&) to put it in the background.  It also calls
   uuxqt.  Both of these programs are also in /usr/lib/uucp.  The purpose
   of uusched, among other things, is to look at the /usr/spool/uucp/system
   directories, check if there are outgoing messages, and initiate a
   /usr/lib/uucp/uucico session for each system with mail to send.
   The counting of outgoing files and systems executes successfully, and
   uucico DOES GET INVOKED for the first system.  When uucico completes,
   and returns to uusched, uusched dumps core immediately.  It looks like
   it cores in the middle of printing a return code message or something.

   Anyway, so that's what the problem is.  Here's what it will do.
   uusched decides (based on scheduling times and submission times)
   what system to call first, and calls it.  Fine.  The uucico session
   completes, and returns to uusched, which dumps core.  If you have
   outgoing files for other systems, then tough luck.  They don't get
   serviced until the next 1/2 hour, when it will happen all over
   again.  It's possible, if you have several systems with a fair
   amount of traffic, that mail may wait all day to go out, or maybe
   never go at all!

I received mail from several people telling me to check various things
such as:
    - getting the SLS update for 2.3.2, which SCO claims is not needed
    - looking for a D file of length 0 in one of the outgoing Q's
      or on one of the machines that you call.  SCO HDB uucp gags on
      0 length files.
    - using fixhdr to increase the stack size for uusched or uucico
    - there is a vague mention in the Xenix documentation about "uucico
      not working" at certain times if there is very high traffic.
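
For the zero-length file check, something along these lines should do.
This is only a sketch: the function name is made up for illustration, and
the default spool path is the one used in this thread, so pass a different
directory if yours differs.

```shell
# List zero-length data (D.*) files under a uucp spool directory.
# $1: spool directory (defaults to the path used in this thread).
find_empty_dfiles() {
    spool=${1:-/usr/spool/uucp}
    # -size 0 matches files that occupy zero blocks, i.e. empty files
    find "$spool" -type f -name 'D.*' -size 0 -print
}
```

Running `find_empty_dfiles` with no argument prints the path of each empty
D. file, one per line, and prints nothing if there are none.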

The response that made the most sense (unfortunately) is this:
> It is a known problem with the 2.3.2 uusched program.  Call SCO tech
> support and tell them that your 2.3.2 uusched is dumping core and ask
> for the 2.3.1 uusched program.

Well, that's what we plan to do about it.  In the interim, I was still
pissed off about it not working, so I wrote a little shell script
to take its place for a while.  I include it here for those of
you who have the same problem, and want to give it a try.

-----------------------------------------------------------
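:
# fake uusched script: call every system that has queued work (C*) files
cd /usr/spool/uucp || exit 1
for dir in *
do
    if test -d "$dir"
    then
        # list this system's work files, if any
        files=`ls "$dir"/C* 2> /dev/null`
        case $files in
            "") ;;    # nothing queued for this system
            *)  /usr/lib/uucp/uucico -r1 -s"$dir" ;;
        esac
    fi
done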
-----------------------------------------------------------

Of course, no guarantees of ANY SORT are made that this will help you
in any way (covering my ass), but it's so simple that... well, what
could go wrong?!  To install this, you should rename the real uusched
to something else, then install this as /usr/lib/uucp/uusched, and set
the group, owner, and permissions to the same things that the real
uusched is set to.  And away you go!
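
The install steps above could be sketched roughly like this.  The function
name and the ".real" suffix are arbitrary choices for illustration, and the
owner, group, and mode are arguments because they should be copied from
whatever `ls -l` shows for the real uusched on your system.

```shell
# Swap a replacement script in for uusched, keeping the original next to it.
# $1: replacement script   $2: directory holding uusched
# $3: owner   $4: group   $5: mode (all three taken from the real uusched)
install_fake_uusched() {
    script=$1
    libdir=${2:-/usr/lib/uucp}
    owner=$3
    group=$4
    mode=$5
    # keep the real program around so it can be put back later
    mv "$libdir/uusched" "$libdir/uusched.real" || return 1
    cp "$script" "$libdir/uusched" || return 1
    # match ownership and permissions of the original
    chown "$owner" "$libdir/uusched" || return 1
    chgrp "$group" "$libdir/uusched" || return 1
    chmod "$mode" "$libdir/uusched" || return 1
}
```

To back the change out, move uusched.real back over uusched.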

I must note, however, that the real uusched does FAR MORE than this little
script.  This should be used as an interim fix ONLY!  Here are some of
the things that uusched does that my script does not do:
   - pay attention to the retry time in the .Status/system files.  If
     there's a problem with a remote system or your dialer or something,
     the retry time keeps getting increased, and will get up to one
     retry a day, in the worst case.  My script just plugs away, every
     time it gets invoked by uudemon.hour.
   - the real uusched will only try to connect to a system 26 times,
     but my script will try FOREVER!
   - the real uusched decides who to call first by looking at the
     time of the mail sent longest ago.  This has the highest priority.
     My script decides by doing it alphabetically.

There's probably a ton of stuff I haven't mentioned, but I've got to stop
sometime!  Thanks again, everyone, for your help, and I hope this
posting helps some of you out there, too!

An honorable mention should go to Tim O'Reilly's "Managing uucp and Usenet"
which is in the Nutshell series.  If you want to get a good book about
uucp, this is it!

Corby Anderson
uunet!netxcom!corby

root@gold.UUCP (Christian Seyb) (01/08/90)

There is a bug in uucico. The file statlog.c (for those with access
to the source code) contains a statement like:

printf(".........",....., bytes1000/millisec);

Guess what happens if millisec is 0? There will be a nice core
dump.
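
The usual cure is to test the divisor before dividing. A minimal sketch of
that guard follows, written in shell to match the script earlier in this
thread rather than in the C of the real source, which is not shown here.
The function name is hypothetical, and reporting 0 for a zero-length
interval is just one possible choice.

```shell
# Bytes per second from a byte count and an elapsed time in milliseconds.
# Reports 0 instead of dividing when the elapsed time is zero or negative.
throughput() {
    bytes=$1
    millisec=$2
    if [ "$millisec" -le 0 ]; then
        echo 0
    else
        echo $(( bytes * 1000 / millisec ))
    fi
}
```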

regards Christian
-- 
-------------------------------------------------------------------------------
Christian Seyb                Mailbox login: bbs        uucp login: nuucp
Unix Mailbox Filderstadt   ...!unido!nadia!gold!cs      Data: +49-711-776494
-------------------------------------------------------------------------------

honey@citi.umich.edu (Peter Honeyman) (01/12/90)

i wrote:

>there is no file called statlog.c in honey danber, nor is there a
>variable named bytes1000 anywhere in honey danber.

and was bit by what rick adams calls the "honey danber is whatever is
running on honey's machine today" phenom.  there was such a file,
variable, and bug, oh so long ago.  don't tell me ... no ...

	peter