[comp.sys.apollo] Upgrading to SR10.2

awhitton@bcara132.bnr.ca (Alan Whitton @ BNR) (03/03/90)

Hello,

I am not sure if I am doing something wrong, but I would like to know if
anybody else has seen problems upgrading from SR10.1 to SR10.2.

If you have a system running SR10.1 (just BSD installed) and you install
as an upgrade (you do NOT format the disk or any other destructive thing),
it seems the MH tools break (at least they did for me). Whenever I do
a COMP I can never send anything with it; I get Message Not Sent.

I isolated this to the following call which abends with a Segmentation
Fault:

/usr/new/lib/mh/spost -library //foo/users/awhitton/Mail -verbose \
-watch //foo/users/awhitton/Mail/drafts/8
Segmentation fault                                                              
>tb                                                                           
Process        1278 (parent 1276, group 1278)                                   
Time           90/03/02.07:47(EST)                                              
Program        /bsd4.3/usr/lib/sendmail                                         
Status         00040004: reference to illegal address (OS/MST manager)          
In routine     "bcopy" line 167                                                 
Called from    "rca_$use_known_rgy" line 1056                                   
Called from    "rca_$find_a_candidate_registry" line 1229                       
Called from    "rca_$check_binding" line 1312                                   
Called from    "getpwent" line 129                                              
Called from    "rgy_unix_$getpwnam" line 214                                    
Called from    "rgyc_unix_$getpwnam" line 4409                                  
Called from    "getpwnam" line 213                                              
Called from    "username" line 393                                              
Called from    "setsender" line 443                                             
Called from    "main" line 707                                                  
Called from    "unix_$main" line 114                                            
Called from    "<apollo_c_startup>" line 31999                                  
Called from    "PM_$CALL" line 176                                              
Called from    "pgm_$load_run" line 891                                         
Called from    "pgm_$invoke_uid_pn" line 1112                                   

This is really weird because on my node which was loaded from
tape this does NOT happen (everything works hunky dory). Strangely
MAIL works fine, but if I compile ELM it fails also....
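
Since the traceback above bottoms out in getpwnam(), one way to narrow
this down is to call getpwnam() directly, outside of MH and sendmail.
The following is only a minimal sketch, not something from the original
report: the function name lookup_user and the STANDALONE macro are made
up for illustration, and it assumes the usual BSD <pwd.h> interface.

```c
#include <sys/types.h>
#include <pwd.h>
#include <stdio.h>

/*
 * Look up one user name through getpwnam(), the same library call the
 * traceback shows sendmail making from username().
 * Returns 1 if the name resolves, 0 if getpwnam() returns NULL.
 */
int lookup_user(const char *name)
{
    struct passwd *pw = getpwnam(name);

    if (pw == NULL) {
        printf("%s: unknown user\n", name);
        return 0;
    }
    printf("%s: uid %ld, home %s\n", name, (long)pw->pw_uid, pw->pw_dir);
    return 1;
}

#ifdef STANDALONE
/* Optional driver: build with -DSTANDALONE, pass user names as arguments. */
int main(int argc, char **argv)
{
    int i, missing = 0;

    for (i = 1; i < argc; i++)
        if (!lookup_user(argv[i]))
            missing++;
    return missing != 0;
}
#endif
```

If this faults on the upgraded node but not on the node loaded from
tape, that would point at the registry lookup rather than at MH itself.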

Currently this is the only anomaly I can find (and not many
people use MH here), but it still nags at me that there is
something weird at work. Also anything under /usr/new
is unsupported by Apollo.

Any help, guesses, comments would be appreciated.

Be Seeing You,
Alan 
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
This is ONLY my Opinion          Bell Northern Research
awhitton@bnr.ca          "I am not a number, I am a free man!"

kerr@tron.UUCP (Dave Kerr) (03/05/90)

In article <1062@bnrgate.bnr.ca> awhitton@bcara132.bnr.ca (Alan Whitton @ BNR) writes:
>Hello,

[ Describes problem with mh breaking at sr10.2 ]

>I isolated this to the following call which abends with a Segmentation
>Fault:
>
>/usr/new/lib/mh/spost -library //foo/users/awhitton/Mail -verbose \
>-watch //foo/users/awhitton/Mail/drafts/8
>Segmentation fault                                                              

[ text deleted ]

You might try rebuilding the sendmail alias file (/usr/ucb/newaliases), 
and getting rid of any frozen configuration files (/usr/lib/sendmail.fc).
Hope that helps.


-- 
Dave Kerr (301) 765-4453 (WIN)765-4453
tron::kerr                 Internal WEC vax mail
kerr@tron.bwi.wec.com      from an Internet site
kerr@tron.UUCP             from a smart uucp mailer

markg@CAEN.ENGIN.UMICH.EDU (Mark Giuffrida) (03/05/90)

	/usr/new/lib/mh/spost -library //foo/users/awhitton/Mail -verbose \
	-watch //foo/users/awhitton/Mail/drafts/8
	Segmentation fault                                                              
	>tb                                                                           
	Process        1278 (parent 1276, group 1278)                                   
	Time           90/03/02.07:47(EST)                                              
	Program        /bsd4.3/usr/lib/sendmail                                         
	Status         00040004: reference to illegal address (OS/MST manager)          
	In routine     "bcopy" line 167                                                 
	Called from    "rca_$use_known_rgy" line 1056                                   
	Called from    "rca_$find_a_candidate_registry" line 1229                       
	Called from    "rca_$check_binding" line 1312                                   
	Called from    "getpwent" line 129                                              
	Called from    "rgy_unix_$getpwnam" line 214                                    
	Called from    "rgyc_unix_$getpwnam" line 4409                                  
	Called from    "getpwnam" line 213                                              
	Called from    "username" line 393                                              
	Called from    "setsender" line 443                                             
	Called from    "main" line 707                                                  
	Called from    "unix_$main" line 114                                            
	Called from    "<apollo_c_startup>" line 31999                                  
	Called from    "PM_$CALL" line 176                                              
	Called from    "pgm_$load_run" line 891                                         
	Called from    "pgm_$invoke_uid_pn" line 1112                                   

This is great.  I thought that we were the only ones getting this because
of our large registry size.  Let me explain what I have reported to apollo
so far.

I opened a call about a month ago on this one (#A2004386).  I gave them exactly
the same scenario and the exact same traceback.  The problem was that I could not
get them to reproduce it.  I might add at first they said there wasn't much
they could do since it happened with "unsupported software" (i.e., MH).  I did
manage to get them to file an apr on the problem (apr DDE11).

This is a strange bug, because getpwnam() functions correctly 99.99% of the time
(except when the registry is loaded and it falsely returns "unknown user", but
that is another problem - and another filed apr).  When getpwnam() fails under these
conditions, it consistently fails.  I tried unsuccessfully to reproduce the problem.
I have always felt that this was a combination registry and fork() problem.
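
The registry-plus-fork() suspicion could be probed with something like the
sketch below, which does the getpwnam() lookup in a forked child and reports
whether the child died on a signal.  This is an illustration only: the name
lookup_in_child and its return codes are hypothetical, and a plain
fork-then-lookup may not match what spost and sendmail actually do.

```c
#include <sys/types.h>
#include <sys/wait.h>
#include <pwd.h>
#include <stdio.h>
#include <unistd.h>

/*
 * Call getpwnam(name) in a forked child and report how the child ended.
 * Returns  0 if the child found the user,
 *          1 if the child got NULL back (unknown user),
 *         -1 if fork() or waitpid() failed,
 *         -2 if the child was killed by a signal, which is how a
 *            segmentation fault inside getpwnam() would show up.
 */
int lookup_in_child(const char *name)
{
    int status;
    pid_t pid = fork();

    if (pid < 0)
        return -1;
    if (pid == 0)                          /* child: do the lookup and exit */
        _exit(getpwnam(name) == NULL ? 1 : 0);

    if (waitpid(pid, &status, 0) < 0)      /* parent: collect the result */
        return -1;
    if (WIFSIGNALED(status)) {
        fprintf(stderr, "child killed by signal %d\n", WTERMSIG(status));
        return -2;
    }
    return WEXITSTATUS(status);
}
```

A return of -2 for a user that resolves fine in the parent would be
consistent with the fork() theory; a clean 0 would not rule it out.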

I hope Apollo boosts the priority of this problem as other sites are now experiencing
it.  This problem was not there in sr10.1.  It also cripples any site dependent on
using MH like we are.  BTW, I was able to get a workaround by compiling a shared library
version of MH.  For some reason getpwnam() is stable in that scenario.

Mark Giuffrida
University of Michigan
markg@caen.engin.umich.edu