[mod.computers.vax] two VMS4.2 bugs create potential Catch-22 situation

bob@islenet.UUCP (11/04/85)

(synopsis of an SPR of mine, with commentary added)

							947369
Operating system:	VMS
Version:	4.2
System program or document title:	VMS
Version or document part number:	4.2
Date:	1-NOV-1985

Name:	Bob Cunningham
Firm:	Oceanography & Marine Lab
	University of Hawaii

Report type/priority:	problem/error	heavy system impact

Cpu type:	VAX-11/750
Memory size:	4M

Approximately one month after upgrading to VMS 4.2, the system started
crashing at sporadic intervals with the fatal bugcheck "unknown signal in
ACP".  Within a few days, that bugcheck became more frequent, and the system
was also crashing with the fatal bugcheck "illegal page fault with IPL too
high"; occasionally the system would just hang instead.  Currently, the
system crashes with a fatal bugcheck during or immediately after rebooting
from a previous fatal bugcheck.

Local diagnostics as well as remote diagnostics (by DDC at Colorado
Springs) showed no evidence of hardware problems.  DDC indicated this is a
VMS 4.2 problem, and suggested as a work-around that we, on a weekly basis,
completely back up the system disk (and other disks as required), then
rebuild it by restoring from the backup tapes.

Note: SYS$UPDATE:STABACKIT.COM in VMS 4.2 as distributed will not build
a stand-alone backup kit on TU58 cartridges on a VAX-11/750; see my SPR
254133 (customer acknowledgement marked Corporate SPR No. 11-82218).

.....................

Commentary (for info-vax readers).

The fatal bugchecking is apparently a serious VMS 4.2 bug related to disk
fragmentation.  A similar problem may appear in VMS 4.1, although the
symptoms are repeated simple hangs of the operating system, rather than
the fatal bugchecks.  There may be patches available for 4.1; I know of
none currently for 4.2.

The problem supposedly tends to show up about one month after converting
to VMS 4.2, although it can take longer if your system disk stays
relatively unfragmented.  A possible early warning sign is a file not
being found when you do a $ DIRECTORY with a wildcard; repeat the command
and the file appears.

I've talked with several of the DDC people about this, and it seems they
are now seeing this problem about 2-3 times a day, with the rate
increasing as more sites upgrade to VMS 4.2.

The completely unrelated STABACKIT.COM bug creates a potential
Catch-22 situation since the only apparent workaround for the
bugchecking problem requires a current version of stand-alone backup.

Bob Cunningham  {dual|vortex|ihnp4}!islenet!bob
Hawaii Institute of Geophysics, University of Hawaii

jeff@ISI-VAXA.ARPA (Jeffery A. Cavallaro) (11/07/85)

The STABACKIT problem:

(As was reported to me via this net...)

Change the /HEADER=40 on the INITIALIZE command for SYSTEM_2 to
/HEADERS=35.  The MAN update causes the overflow.  This problem appears
to occur only on TU58s.  RX01 kits build fine...
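
A rough sketch of how one might locate the qualifier before making the
change.  The exact text of the INITIALIZE line in STABACKIT.COM is not
quoted in this thread, so the search string below is an assumption based
on the description above; check what the search turns up before editing.

```
$! Find the INITIALIZE command(s) that carry a header-count qualifier.
$! The string "/HEADER" is assumed from the description above.
$ SEARCH SYS$UPDATE:STABACKIT.COM "/HEADER"
$!
$! Edit the line for SYSTEM_2 by hand, changing the count from 40 to 35.
$! Editing writes a new file version; the distributed version remains
$! as the previous version unless you purge it.
$ EDIT SYS$UPDATE:STABACKIT.COM
```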

While we are on the subject of STABACKIT, I wish that it wouldn't refuse
to use a medium just because BAD found a bad block.  The documentation
states that INITIALIZE will take the MBBF and SWBBF to heart when building
BADBLK.SYS.  Of course, too many bad blocks may cause an overflow, but
oh well...

						Jeff