[comp.sys.apollo] SR10.3.p is big trouble

system@aurum.chem.utoronto.ca (System Admin (Mike Peterson)) (01/19/91)

We upgraded our DN10020 to SR10.3.p 2 days ago, and in general our
experience so far has not been good:

1) vi still hangs in a DM pad after the first use; it will work once
after the pty's are rebuilt, then hangs ever after. Since we run
X Windows, this is not a high priority issue on the DN10000, but we have
2 DN580's that have sat useless for 8 months already because X takes too
long to do anything, yet you can't use vi in a DM pad on SR10.2 either.
If this problem still exists in SR10.3(.m), the 580's might as well be pushed
over the edge of the loading dock into the dumpster (some would argue
that is where they should have gone long ago :-) ).

2) SR10.3.p ran for almost 2 hours before the tcpd hung and started
eating 100% of the cpu time. While we have had SR10.2.p hang after less
than 10 minutes, this was NOT a good sign. After shutting down from
this hang, the front panel LEDs went to 'PFbF' and the keyboard was
dead. Pressing 'reset' didn't help - went back to 'PFbF' after running
the self-tests. Shutting off the power let it reboot.

3) about an hour ago, the whole system just froze, with 'c d.r t.' in
the LEDs - I crashed it and rebooted, but no idea what caused this one
(though it acts just like TCP hanging did at SR10.2.x.x..p).

4) saving the best for last, NONE OF THE WORKSTATIONS SOLUTIONS 9 TRACK
TAPE DRIVE SOFTWARE WORKS ON SR10.3.p - we get 'absolute load address
already occupied (process manager/loader)' errors. This leaves us with
no backup/restore capability, which is not joyful (we have four 760 MB
disks on our DN10000, plus over 1 GB of disk on other nodes,
so c-tape is not a useful alternative).

If anyone has any ideas what causes the 'load address' problem, I would
like to hear about it ASAP - either e-mail, post or phone me PLEASE.
If anyone has any Workstations Solutions stuff that runs on SR10.3.p at
all, I'd like to hear about that too - no one there seems to know what
is wrong, which is somewhat surprising since at least some of them are
ex-Apollo employees.
-- 
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094                  Fax: (416) 978-8775

waldram@WOLF.UWYO.EDU (01/21/91)

RE: Message-Id: <1991Jan18.203853.1893@alchemy.chem.utoronto.ca>
  
>>>>>   We upgraded our DN10020 to SR10.3.p 2 days ago, and in general our
>>>>>   experience so far has not been good:

We upgraded our DN10010 to 10.3.p 20 days ago, and in general our experience
HAS BEEN EXCELLENT!  We went from 10.1.p to 10.3.p.
   
>>>>>   1) vi still hangs in a DM pad after the first use; it will work once
>>>>>   after the pty's are rebuilt, then hangs ever after.

We have not seen this problem at 10.1.p or 10.3.p!  We ARE running in the BSD environment.
   
>>>>>   2) SR10.3.p almost ran for 2 hours before the tcpd hung and started
>>>>>   eating 100% of the cpu time.

WE HAVE SEEN THIS PROBLEM, but the offending process was rgyd (a slave).  800-2APOLLO
explained the problem, but offered only a potentially dangerous fix (see previous
reports here).  We are still pursuing the problem (no APR), but have seen it occur only
3 times since installation (none in the last 10 days).  While any activity was EXTREMELY
HINDERED (slow response), by killing the offending processes, we were able to
do a clean SHUT.
   
>>>>>   4) saving the best for last, NONE OF THE WORKSTATIONS SOLUTIONS 9 TRACK
>>>>>   TAPE DRIVE SOFTWARE WORKS ON SR10.3.p

We found that 'rwmt' access worked with WS TapeAT version 2.2.1 (SR10.1.p & 10.3.p)
BUT NOT WITH 'wbak/rbak' backup utilities.  WS has supplied version 3.1.1 which I
was going to install this weekend (before I read #4).  This version is supposed to
fix the problems I was having!  What version is giving you these problems?

>>>>>   If any one has any Workstations Solutions stuff that runs on SR10.3.p at

I was also hoping to install our WS Exatape on our 10010 this weekend.  I'll keep
you posted on my experiences.

We have seen some problems with /dev/tty0x.  Our applications read a data stream
from tty01, tty02, and tty03 which worked at 10.1.p.  The tty0x's could not be
opened by the applications at 10.3.p.  HOWEVER!!! by modifying the code to allow
the "non-UNIX" /dev/siox to be used, the applications functioned!  We tried remaking
all the devices and rebooting to fix the problem, but the tty's did not return to
functionality.  We investigated the "fix to tty's" where the hardware lines needed to
be at the correct levels, to no avail.  Getty did not care whether we used sio or tty
device files (both worked when the H/W signal levels were correct).  We gave up for the time
being as the applications were critical to daily operation.
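
A minimal sketch of that kind of fallback, for anyone wanting to try the
same workaround: attempt the tty device first and fall back to the sio
device if the open fails. This is illustrative modern Python, not the code
from the applications described above; the function name
open_first_available and the path /dev/sio1 are assumptions made up for
the example.

```python
import os

def open_first_available(paths, flags=os.O_RDONLY):
    """Return (fd, path) for the first path that opens; raise OSError if none do."""
    last_error = None
    for path in paths:
        try:
            return os.open(path, flags), path
        except OSError as exc:
            # Remember why this candidate failed, then try the next one.
            last_error = exc
    raise OSError(f"none of {list(paths)} could be opened") from last_error

if __name__ == "__main__":
    # Hypothetical device names: prefer the tty, fall back to the sio line.
    try:
        fd, used = open_first_available(["/dev/tty01", "/dev/sio1"])
    except OSError as exc:
        print(f"no serial device available: {exc}")
    else:
        print(f"reading from {used}")
        os.close(fd)
```

Trying the candidates in order keeps the tty device as the preferred path,
so the same code works unchanged if the tty's start opening again.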

We have also seen some major performance hits when LARGE applications are running.  We
attribute these to the lack of RAM (we have only 16MB) as we are seeing significant
disk paging request levels (20 to 90 per second).
                       -jjw
Jim Waldram
Department of Atmospheric Science, University of Wyoming
waldram@grizzly.uwyo.edu
jwaldram@outlaw.uwyo.edu
jwaldram@UWYO.BITNET

rand@HWCAE.CFSAT.HONEYWELL.COM (01/21/91)

>>>>>> Mike Peterson (system@alchemy.chem.utoronto.ca) 18/Jan/91 writes:
> We upgraded our DN10020 to SR10.3.p 2 days ago, and in general our
> experience so far has not been good:
> [...]
> ... we get 'absolute load address already occupied (process manager/loader)'
> errors.

I have seen this error message many times before, but never at SR10.3.
At SR10.3 HP/Apollo changed the sizes of the segment(s) for the COFF
files. I first noticed this when I installed C++ 2.0. This was
compiled with the SR10.3 tools at HP/Apollo. And whenever I tried to
run it on my SR10.1 node, I got the same error message. I called up
HP/Apollo, and they said "Oops, load up patch m159." I did and
everything works fine.

Now, I don't know if this helps you with your SR10.3 problem, but I
thought I'd tell you anyway.

--
Douglas Keenan Rand                Honeywell -- Air Transport Systems Division
Phone: +1 602 436 2814               US Snail: P.O. Box 21111 Phoenix AZ 85036
Internet: @cim-vax.honeywell.com:rand@hwcae.cfsat.honeywell.com
UUCP: ...!uunet!asuvax!apciphx!hwcae!rand

system@aurum.chem.utoronto.ca (System Admin (Mike Peterson)) (01/21/91)

A few more details of our problems:

We are attempting to run version 3.1.1 of the Workstation Solutions
tapeAT software - only tar works, and that only works if used as
/bin/tar cf /dev/tapeat8 (or tapeat12), which is NOT one of the
documented ways to execute the WS software. None of rbak/wbak/rwmt
works at all.

P.S. Jim, I can't reach you at U/Wyoming - both e-mail addresses fail.
-- 
Mike Peterson, System Administrator, U/Toronto Department of Chemistry
E-mail: system@alchemy.chem.utoronto.ca
Tel: (416) 978-7094                  Fax: (416) 978-8775