JOHNSON@nuhub.acs.northeastern.edu ("I am only an egg.") (01/09/87)
Following is an outline of a problem I recently encountered here at Northeastern University. It points up the fact that I am possibly too trusting in nature. It also shows the level of competence of DEC. I thought I would let everyone in on it.

On December 30, 1986, Northeastern University did updates to its single-node cluster VAX 8650 VMS system. The products that were updated include the following:

   o ... FORTRAN compiler
   o ... PASCAL compiler
   o ... COBOL compiler
   o ... CDD
   o ... TDMS
   o ... DATATRIEVE

These products were installed using the procedures outlined in their various installation notes and cover letters. The installations proceeded successfully. These installations were all performed by Chris Johnson, a Northeastern staff member.

On December 31 a user complained to Northeastern's Academic Computer Services that the PASCAL compiler didn't work. The error given concerned an invalid DCL table CLD entry. Mr. Johnson called the DEC Customer Support Center (DCSC) and, as a start, was referred to the PASCAL team. The DCSC software engineer in the PASCAL group determined that the problem was of a system type and that the VMS team should be called.

On January 5, Mr. Johnson again called DCSC and asked for the VMS team. The error was described to the DCSC VMS software engineer. The problem was determined to be the installation of a SYS$SPECIFIC:[SYSLIB] version of DCLTABLES.EXE rather than the SYS$COMMON:[SYSLIB] version of it. It seems that layered product installation procedures determine whether or not they are on a cluster node and, if they are, they change the SYS$COMMON:[SYSLIB] version of DCLTABLES.EXE. For a reason as yet unknown to both DCSC and Northeastern, there was a copy of DCLTABLES.EXE in SYS$SPECIFIC:[SYSLIB]. When a cluster node is booted, the SYS$LIBRARY logical name is equated to SYS$SPECIFIC:,SYS$COMMON: IN THAT ORDER.
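The search order just described can be illustrated with a small model. This is only a sketch in Python, not anything VMS actually runs: the dictionary of directory contents, the `resolve` function, and the file name EXAMPLE.EXE are all hypothetical, and the only behavior taken from the account above is that the first directory in the list that holds the file wins.

```python
from typing import Dict, List, Optional, Set

# Hypothetical directory contents for illustration only.
DIRECTORIES: Dict[str, Set[str]] = {
    # Stray, out-of-date copy that nobody expected to be here.
    "SYS$SPECIFIC:[SYSLIB]": {"DCLTABLES.EXE"},
    # Copy that the layered product installations actually updated.
    "SYS$COMMON:[SYSLIB]": {"DCLTABLES.EXE", "EXAMPLE.EXE"},
}

# SYS$LIBRARY modeled as an ordered search list: specific first, then common.
SYS_LIBRARY: List[str] = ["SYS$SPECIFIC:[SYSLIB]", "SYS$COMMON:[SYSLIB]"]


def resolve(search_list: List[str], filename: str,
            directories: Dict[str, Set[str]]) -> Optional[str]:
    """Return the first directory+file match, scanning the list in order."""
    for directory in search_list:
        if filename in directories.get(directory, set()):
            return directory + filename
    return None  # not found anywhere in the search list


if __name__ == "__main__":
    # The stray SYS$SPECIFIC copy shadows the updated SYS$COMMON copy.
    print(resolve(SYS_LIBRARY, "DCLTABLES.EXE", DIRECTORIES))
    # A file that exists only in SYS$COMMON is still found there.
    print(resolve(SYS_LIBRARY, "EXAMPLE.EXE", DIRECTORIES))
```

In this model, removing or renaming the SYS$SPECIFIC entry is what lets the SYS$COMMON copy be found, which matches the rename that was eventually needed.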
Thus, if there is a DCLTABLES.EXE in SYS$SPECIFIC (as there was in this case) then it will get installed on boot EVEN THOUGH THE SYS$COMMON version of DCLTABLES.EXE was the one that was updated by the layered product installation procedures.

When asked why there were multiple copies of DCLTABLES in multiple directories, the DCSC engineer was able to offer no explanation other than "We don't know, it just happens sometimes." To be kind, this answer was unhelpful. To be truthful, this answer points up a dreadful lack of knowledge on the part of DEC and DEC's support staff of their own installation procedures.

As a solution, the DCSC engineer had Mr. Johnson install, using the INSTALL utility, the SYS$COMMON:[SYSLIB] version of DCLTABLES.EXE and then delete, again with the INSTALL utility, the SYS$SPECIFIC:[SYSLIB] version of DCLTABLES.EXE.

This was very shortly determined to be an ill-advised procedure. The SYS$SPECIFIC version was of course still in use, and a delete pending flag was raised on its global sections. This, in turn, prevented anyone else from logging on to the system, since a global section with a delete pending flag is effectively not usable.

To solve this new problem, caused by implementing the above DCSC-advised procedure, the SYS$SPECIFIC version of DCLTABLES.EXE had to be renamed and the system had to be rebooted IMMEDIATELY, causing inconvenience to users logged on at the time.

This delete pending problem is one of two such. The other is with dismounting disks that have open files on them. A REVERSE procedure IS NECESSARY for these two delete pending cases. In this way mistakes made by operators, systems people AND DCSC would not REQUIRE A VERY INCONVENIENT system reboot.

Chris Johnson (more cynical than ever)
Northeastern University
McGuire_Ed@GRINNELL.MAILNET.UUCP (01/15/87)
>It is always bad policy to delete an installed image file. . . . You
>can only safely delete the file if its globals sections no longer show
>up as delete pending in an $ install list/glob.

As long as the file was installed /OPEN (which it was if it was installed /SHARE) it shouldn't matter if you delete the disk file. The file ought to be flagged for deletion but not actually deleted until it is closed, i.e. until the delete pending section is released by all processes.

This has been my assumption so far, and nothing has broken as a consequence. Please set me straight if anyone has proof that deleting a disk file corrupts the shared sections.
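For anyone trying to reconcile the two accounts in this thread, here is a toy model of "delete pending" written in Python. It is purely an assumption-laden sketch, not how VMS implements global sections: the class `SharedSection` and its methods are hypothetical names. It encodes two behaviors taken from the posts above: deletion is deferred until every current user releases the section, and (as reported in the original incident) no new process can start using a section once it is marked delete pending.

```python
class SharedSection:
    """Toy model of a shared section with deferred ("pending") deletion."""

    def __init__(self, name: str) -> None:
        self.name = name
        self.users = set()           # ids of processes currently mapped
        self.delete_pending = False  # marked for deletion, still in use
        self.deleted = False         # actually gone

    def map(self, pid: int) -> None:
        """A process starts using the section, unless it is going away."""
        if self.deleted or self.delete_pending:
            raise RuntimeError(f"{self.name}: not usable (delete pending)")
        self.users.add(pid)

    def unmap(self, pid: int) -> None:
        """A process releases the section; may trigger the real deletion."""
        self.users.discard(pid)
        self._finish_delete_if_idle()

    def request_delete(self) -> None:
        """Mark for deletion; the real deletion waits for the last user."""
        self.delete_pending = True
        self._finish_delete_if_idle()

    def _finish_delete_if_idle(self) -> None:
        if self.delete_pending and not self.users:
            self.deleted = True


if __name__ == "__main__":
    section = SharedSection("DCLTABLES")
    section.map(1)
    section.map(2)
    section.request_delete()
    print(section.deleted)   # existing users keep it alive
    try:
        section.map(3)       # a new login cannot use it
    except RuntimeError as err:
        print(err)
    section.unmap(1)
    section.unmap(2)
    print(section.deleted)   # last user gone, deletion completes
```

Under these assumptions both observations can hold at once: existing users are not corrupted by the delete, yet new logins fail until a usable replacement is in place.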