JOHNSON@nuhub.acs.northeastern.edu ("I am only an egg.") (01/09/87)
Following is an outline of a problem I recently encountered here at
Northeastern University. It points up the fact that I am possibly too
trusting in nature. It also shows the level of competence of DEC. I
thought I would let everyone in on it.
On December 30, 1986, Northeastern University did updates
to its single node cluster VAX 8650 VMS system. The
products that were updated include the following:
o ... FORTRAN compiler
o ... PASCAL compiler
o ... COBOL compiler
o ... CDD
o ... TDMS
o ... DATATRIEVE
These products were installed using the procedures
outlined in their various installation notes and cover
letters. The installations proceeded successfully. These
installations were all performed by Chris Johnson, a
Northeastern staff member.
On December 31 a complaint was made by a user to
Northeastern's Academic Computer Services that the PASCAL
compiler didn't work. The error given concerned an invalid
DCL table CLD entry. Mr. Johnson called the DEC Customer
Support Center (DCSC) and, as a start, asked for the
PASCAL team.
The DCSC software engineer in the PASCAL group
determined that the problem was of a system type and that
the VMS team should be called.
On January 5, Mr. Johnson again called DCSC and asked
for the VMS team. The error was described to the DCSC VMS
software engineer. The problem was determined to be the
installation of a SYS$SPECIFIC:[SYSLIB] version of
DCLTABLES.EXE rather than the SYS$COMMON:[SYSLIB] version of
it. It seems that layered product installation procedures
determine whether or not they are on a cluster node and, if
they are, they change the SYS$COMMON:[SYSLIB] version of
DCLTABLES.EXE. For a reason as yet unknown to both DCSC and
Northeastern there was a copy of DCLTABLES.EXE in
SYS$SPECIFIC:[SYSLIB]. When a cluster node is booted, the
SYS$LIBRARY logical name is equated to
SYS$SPECIFIC:,SYS$COMMON: IN THAT ORDER. Thus, if there is
a DCLTABLES.EXE in SYS$SPECIFIC (as there was in this case)
then it will get installed on boot EVEN THOUGH THE
SYS$COMMON version of DCLTABLES.EXE was the one that was
updated by the layered product installation procedures.
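The lookup order described above can be sketched as a toy model. This is only an illustration in Python, not anything VMS itself runs: the function name `resolve` and the directory contents are hypothetical, made up to mirror the situation in this report.

```python
# Toy model of a search-list logical name such as SYS$LIBRARY.
# The directory contents are hypothetical; they only mirror the
# situation described in the post.
SEARCH_LIST = ["SYS$SPECIFIC:[SYSLIB]", "SYS$COMMON:[SYSLIB]"]

DIRECTORIES = {
    "SYS$SPECIFIC:[SYSLIB]": {"DCLTABLES.EXE": "stale copy"},
    "SYS$COMMON:[SYSLIB]": {"DCLTABLES.EXE": "copy updated by the installations"},
}


def resolve(filename, search_list=SEARCH_LIST, directories=DIRECTORIES):
    """Return (directory, contents) for the first directory holding filename."""
    for directory in search_list:
        files = directories.get(directory, {})
        if filename in files:
            return directory, files[filename]
    return None


# The stray SYS$SPECIFIC copy shadows the updated SYS$COMMON copy.
print(resolve("DCLTABLES.EXE"))
```

In this model, removing the SYS$SPECIFIC entry makes the same lookup fall through to the SYS$COMMON copy, which is the behavior the installation procedures were counting on.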
When asked why there were multiple copies of DCLTABLES
in multiple directories, the DCSC engineer was able to offer no
explanation other than "We don't know, it just happens
sometimes." To be kind, this answer was unhelpful. To be
truthful, this answer points up a dreadful lack of knowledge
on the part of DEC and DEC's support staff of their own
installation procedures.
As a solution, the DCSC engineer had Mr. Johnson
install, using the INSTALL utility, the SYS$COMMON:[SYSLIB]
version of DCLTABLES.EXE and then delete, again with the
INSTALL utility, the SYS$SPECIFIC:[SYSLIB] version of
DCLTABLES.EXE. This was very shortly determined to be an
ill-advised procedure. The SYS$SPECIFIC version was of
course still in use and a delete pending flag was raised on
its global sections. This, in turn, prevented anyone else
from logging on to the system since a global section with a
delete pending flag is effectively not usable.
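The delete pending behavior described above can be modeled roughly as follows. This is a sketch of the behavior as reported in this post, not of the actual VMS data structures: the class, the method names, and the section name are all hypothetical.

```python
# Rough model of a global section with a delete pending flag, as the
# behavior is described in the post. All names here are hypothetical.
class SectionUnavailable(Exception):
    """Raised when a process tries to map a section marked delete pending."""


class GlobalSection:
    def __init__(self, name):
        self.name = name
        self.mappers = set()        # processes currently mapped to the section
        self.delete_pending = False
        self.deleted = False

    def map(self, process):
        # New mappers are refused once deletion has been requested.
        if self.deleted or self.delete_pending:
            raise SectionUnavailable(self.name)
        self.mappers.add(process)

    def unmap(self, process):
        self.mappers.discard(process)
        self._reap()

    def request_delete(self):
        self.delete_pending = True
        self._reap()

    def _reap(self):
        # The section only goes away once nobody is mapped to it.
        if self.delete_pending and not self.mappers:
            self.deleted = True


section = GlobalSection("DCLTABLES_SECTION")
section.map("user_already_logged_in")
section.request_delete()
try:
    section.map("new_login")
    new_login_ok = True
except SectionUnavailable:
    new_login_ok = False
print(new_login_ok, section.deleted)
```

In this model the section lingers, unusable by newcomers, for as long as any existing process stays mapped to it. That matches the situation above, where users already logged on kept the old tables in use while new logins failed.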
To solve this new problem, caused by implementing the
above DCSC-advised procedure, the SYS$SPECIFIC version of
DCLTABLES.EXE had to be renamed and the system had to be
rebooted IMMEDIATELY, causing inconvenience to users logged
on at the time.
This delete pending problem is one of two such. The
other is with dismounting disks that have open files on
them. A REVERSE procedure IS NECESSARY for these two delete
pending cases. In this way mistakes made by operators,
systems people AND DCSC would not REQUIRE A VERY
INCONVENIENT system reboot.
Chris Johnson
(more cynical than ever)
Northeastern University
McGuire_Ed@GRINNELL.MAILNET.UUCP (01/15/87)
>It is always bad policy to delete an installed image file. . . . You
>can only safely delete the file if its globals sections no longer show
>up as delete pending in an $ install list/glob.
As long as the file was installed /OPEN (which it was if it was
installed /SHARE) it shouldn't matter if you delete the disk file. The
file ought to be flagged for deletion but not actually deleted until it
is closed, i.e. until the delete pending section is released by all
processes. This has been my assumption so far, and nothing has broken
as a consequence. Please set me straight if anyone has proof that
deleting a disk file corrupts the shared sections.