[comp.sys.apollo] IDA Sendmail 5.65 on Domain/OS problem

mmuegel@camdev.comm.mot.com (Michael Muegel) (02/15/91)

There has been much talk about sendmail of late in comp.sys.apollo. I had
been looking to replace Apollo's bogus sendmail with a current version so I
decided to take the plunge.

All seems to work great on my SR10.2 DN3500. It accepts SMTP connections
fine even when I torture tested it from another host. On our SR10.3 DSP3500
mail gateway, where I really need it, it has problems. It accepts connections
and delivers the mail but it returns the following message to the sender about
90% of the time:

   ----- Transcript of session follows -----
   451 endmailer mail: wait: No child processes
   554 <mmuegel@mailbox.fwrdc.rtsg.mot.com>... Internal error

Checking the node I see it has very few processes running (< 25) and so it
has definitely not reached the process limit. The version of IDA sendmail is
5.65+IDA-1.4.2. They are both using the SAME sendmail.cf so I know that is
not the problem. I also tried versions of sendmail compiled locally on the
SR10.3 node and one on the SR10.2 node. Nada!
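
For reference, "No child processes" is the text for errno ECHILD, which
wait() sets when the caller has no child left to collect. A minimal sketch of
one well-known way to get there: if SIGCHLD is set to SIG_IGN on a system
with System V style signal semantics, exited children are reaped
automatically and a later wait() finds nothing. Whether this is what happens
on the SR10.3 node is an assumption; the function name below is hypothetical
and nothing here is taken from the sendmail source.

```c
#include <errno.h>
#include <signal.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical demo, not sendmail code: fork a child that exits at once,
 * then wait for it while SIGCHLD is ignored.  Returns the errno left by
 * wait(), 0 if wait() succeeded, or -1 if fork() failed.
 */
int wait_errno_with_sigchld_ignored(void)
{
    void (*old)(int);
    pid_t pid;
    int status;
    int err = 0;

    old = signal(SIGCHLD, SIG_IGN);  /* exited children are discarded */
    pid = fork();
    if (pid < 0) {
        signal(SIGCHLD, old);
        return -1;
    }
    if (pid == 0)
        _exit(0);                    /* child: exit immediately */

    if (wait(&status) < 0)           /* parent: nothing left to collect */
        err = errno;
    signal(SIGCHLD, old);            /* restore the previous disposition */
    return err;
}
```

On a system that discards children this way, the function returns ECHILD,
the same condition the "451 endmailer" line above reports.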

Any ideas? 
-Mike

-- 
+-----------------------------------------------------------------------------+
| Mike Muegel                              | Internet: mmuegel@mot.com        |
| Software Tools Group                     | UUCP:     uunet!motcid!muegel    |
| Fort Worth Research & Development Center | Voice:    (817) 232-6129         |
| Cellular Infrastructure Group            | Fax:      (817) 232-6081         |
| Radio Telephone and Systems Group        | Mail:     5555 North Beach St.   |
| Motorola, Inc.                           |           Fort Worth,  TX 76137  |
+-----------------------------------------------------------------------------+

wjw@ebs.eb.ele.tue.nl (Willem Jan Withagen) (02/18/91)

In article <368@camdev.comm.mot.com?> mmuegel@mot.com (Michael S. Muegel) writes:
=>There has been much talk about sendmail of late in comp.sys.apollo. I had
=>been looking to replace Apollo's bogus sendmail with a current version so I
=>decided to take the plunge.
So did I. :{

=>All seems to work great on my SR10.2 DN3500. It accepts SMTP connections
=>fine even when I torture tested it from another host. On our SR10.3 DSP3500
=>mail gateway, where I really need it, it has problems. It accepts connections
=>and delivers the mail but it returns the following message to the sender about
=>90% of the time:
=>
=>   ----- Transcript of session follows -----
=>   451 endmailer mail: wait: No child processes
=>   554 <mmuegel@mailbox.fwrdc.rtsg.mot.com>... Internal error

My problems are slightly different. ;) I'm running it on a DN4500/sr10.2,
and I seem to remember a warning that it does not have a warranty for this
combination. However, this runs just fine. Except:
    1) 	bounced mail gets returned to the original user with every '\n'
	that stands alone on a line doubled. So empty space is longer. I could
	live with this.
    2)	The forked child dies every time. And a traceback will tell you that
	it can't trace back the process, because it has an invalid stack.
    3)	It seems to lose messages on and off, and that's what made me go back
	to the 5.61 version, until I upgrade the gateway to sr10.3.

Willem Jan.
Eindhoven University of Technology   DomainName:  wjw@eb.ele.tue.nl    
Digital Systems Group, Room EH 10.10 
P.O. 513                             Tel: +31-40-473401
5600 MB Eindhoven                    The Netherlands

jf@ap.co.umist.ac.uk (John Forrest) (02/19/91)

In article <1093@eba.eb.ele.tue.nl>, wjw@ebs.eb.ele.tue.nl (Willem Jan Withagen) writes:
|> In article <368@camdev.comm.mot.com?> mmuegel@mot.com (Michael S. Muegel) writes:
|> =>There has been much talk about sendmail of late in comp.sys.apollo. I had
|> =>been looking to replace Apollo's bogus sendmail with a current version so I
|> =>decided to take the plunge.
|> So did I. :{
|> 
|> =>All seems to work great on my SR10.2 DN3500. It accepts SMTP connections
|> =>fine even when I torture tested it from another host. On our SR10.3 DSP3500
|> =>mail gateway, where I really need it, it has problems. It accepts connections
|> =>and delivers the mail but it returns the following message to the sender about
|> =>90% of the time:
|> =>
|> =>   ----- Transcript of session follows -----
|> =>   451 endmailer mail: wait: No child processes
|> =>   554 <mmuegel@mailbox.fwrdc.rtsg.mot.com>... Internal error
|> 
|> My problems are slightly different. ;) I'm running it on a DN4500/sr10.2,
|> and I seem to remember a warning that it does not have a warranty for this
|> combination. However, this runs just fine. Except:
|>     1) 	bounced mail gets returned to the original user with every '\n'
|> 	that stands alone on a line doubled. So empty space is longer. I could
|> 	live with this.
|>     2)	The forked child dies every time. And a traceback will tell you that
|> 	it can't trace back the process, because it has an invalid stack.
|>     3)	It seems to lose messages on and off, and that's what made me go back
|> 	to the 5.61 version, until I upgrade the gateway to sr10.3.
|> 

Both of these are quite interesting, because they seem to indicate bugs that
might have gone! We run on 10.1 still (we've yet to upgrade, although we have
the tapes now), but I know Paul Pomes tested the latest (5.65a) binary under
10.3. Briefly,

1) This sounds like the putline problem rearing its ugly head. This was
rewritten to be more portable for 5.65, but it depends on some little-used
features of printf, and these don't work on some systems, e.g. OS10. I
understand it has been fixed on 10.3, but have yet to see it. If you have the
latest (?) source, it will come with a directory called uk.extras - our UK
upgrades. This includes our Apollo 10.1 fixes - all switchable, and probably
relevant also for 10.2.

2) Difficult to know about this. We have had two problems around here - the
setproctitle stuff (which never works on Apollos) and also fcntl calls on
forks - that don't work on 10.1. The first can be disabled with a define -
make sure it is - and the second we replaced with close calls (again
this is in our distributed mods).

3) Not sure about this, only to say I don't think we do - although sometimes
things get locked and messages stay in the queue quite a long time.
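
As a rough illustration of the close-call replacement in 2): a minimal
sketch, assuming the fcntl calls in question were there to keep descriptors
from leaking into the mailer after the fork. The helper name and its
arguments are hypothetical; this is not the code from uk.extras.

```c
#include <unistd.h>

/*
 * Hypothetical helper (name is illustrative, not from the sendmail source):
 * close every descriptor from lowfd up to, but not including, maxfd.
 * Meant to be called in the child between fork() and exec() instead of
 * marking descriptors with fcntl().  Returns how many descriptors were
 * actually open and got closed.
 */
int close_fds_from(int lowfd, int maxfd)
{
    int fd;
    int closed = 0;

    for (fd = lowfd; fd < maxfd; fd++) {
        if (close(fd) == 0)  /* close() fails with EBADF on unused slots */
            closed++;
    }
    return closed;
}
```

A child would typically call something like close_fds_from(3, limit) so that
only stdin, stdout and stderr survive into the mailer; the value of limit is
left to the caller here.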

There is, of course, the extra issue about alias files - which varies from
place to place. The way we do it is to allow Apollos to update the alias
files, and then to make sure each has its own directory (/usr/lib/mail is
linked to `node_data/etc/lib.mail or similar). These each contain symbolic
links to the real alias file, but will create their own database files in
their own directories. I don't know if this makes much difference to the rest.

Anybody else had similar experiences?

John Forrest
Dept of Computation
UMIST

sboyle@mentorg.com (Sean Boyle x1542) (02/20/91)

In article <1093@eba.eb.ele.tue.nl> wjw@ebs.eb.ele.tue.nl (Willem Jan Withagen) writes:
>In article <368@camdev.comm.mot.com?> mmuegel@mot.com (Michael S. Muegel) writes:
>=>
>=>   ----- Transcript of session follows -----
>=>   451 endmailer mail: wait: No child processes
>=>   554 <mmuegel@mailbox.fwrdc.rtsg.mot.com>... Internal error
>
Did you set the Ox and OX options to reasonable values for Apollo?  There is a
process limit even under 10.3 which depends on your RAM size...
It is even possible that the getla() stuff needs a little work.  I suspect
that the values returned are too small to be of use.  I was going to multiply
the return values by 10, but ran out of time...
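
A minimal sketch of how the scaling idea interacts with the two thresholds,
assuming Ox is the load average at which sendmail only queues and OX the one
at which it refuses connections. The function, the enum and the scale
argument are hypothetical, and the real checks in sendmail also weigh message
priority, so treat the comparisons as simplified.

```c
/*
 * Hypothetical sketch, not the sendmail source.  raw_la stands for the
 * value getla() returns, scale for a correction factor such as the 10
 * mentioned above; queue_la and refuse_la mirror the Ox and OX options.
 */
enum la_action { LA_DELIVER, LA_QUEUE_ONLY, LA_REFUSE };

enum la_action la_decide(int raw_la, int scale, int queue_la, int refuse_la)
{
    int la = raw_la * scale;  /* corrected load average */

    if (la >= refuse_la)
        return LA_REFUSE;     /* too busy: turn connections away */
    if (la >= queue_la)
        return LA_QUEUE_ONLY; /* busy: accept but defer delivery */
    return LA_DELIVER;        /* quiet enough to deliver now */
}
```

With Ox8 and OX12, a raw value of 1 scaled by 10 lands in the queue-only
band, which shows how much the choice of scale matters.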

>My problems are slightly different. ;) I'm running it on a DN4500/sr10.2,
>and I seem to remember a warning that it does not have a warranty for this
>combination. However, this runs just fine. Except:
>    1) 	bounced mail gets returned to the original user with every '\n'
>	that stands alone on a line doubled. So empty space is longer. I could
>	live with this.
>    2)	The forked child dies every time. And a traceback will tell you that
>	it can't trace back the process, because it has an invalid stack.
They did some clever stuff with changing the argv[0] to show you what sendmail
is doing at any given time.  This clobbers the stack on Apollo.  I #ifdef'd
it out.
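
A minimal sketch of that kind of guard, assuming a compile-time macro
controls the feature. The macro name SETPROCTITLE and the helper set_title()
are assumptions for illustration, not copied from the sendmail source.

```c
#include <string.h>

/* Leave SETPROCTITLE undefined where rewriting argv is unsafe. */
/* #define SETPROCTITLE 1 */

/*
 * Hypothetical helper: copy title over the buffer that argv[0] points to,
 * never touching more than room bytes.  Compiles to a no-op unless
 * SETPROCTITLE is defined.  Returns the number of title bytes written.
 */
size_t set_title(char *argv0, size_t room, const char *title)
{
#ifdef SETPROCTITLE
    size_t n = strlen(title);

    if (room == 0)
        return 0;
    if (n > room - 1)
        n = room - 1;                  /* leave space for the terminator */
    memcpy(argv0, title, n);
    memset(argv0 + n, '\0', room - n); /* clear the rest of the buffer */
    return n;
#else
    (void)argv0;                       /* feature disabled: touch nothing */
    (void)room;
    (void)title;
    return 0;
#endif
}
```

With the macro left undefined, as here, the call returns 0 and the argument
area is never written, which is the effect of #ifdef'ing the feature out.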
-- 
"There is a time to laugh and a time not to laugh and this is not one of them."
                                              Inspector Jacques Clouseau

sboyle@mentor.com				Mentor Graphics Corporation