ESMP09@SLACTWGM.BITNET.UUCP (04/05/87)
We have been using dual-ported (actually, multi-ported) disks with VMS with reasonable success, but not without a few problems. These are disks on an SI9900 controller. Each disk is mounted read/write from a single CPU, and read-only from the rest. This is NOT part of a VAXcluster, nor are we using any special software such as SILINK.

My friend has a dual-ported DEC RP07, and is interested in using it in a similar mode. I mentioned some of the problems we have observed (described below), and suggested to him that the RP07 dual-port arrangement would probably behave similarly.

My friend called the DEC software support center, described how he intended to use the disk, and asked if there were any special problems to be expected with this mode of operation. The answer was that this was a supported mode of operation and that there should be no problems: simply mount the disk /WRITE from one CPU, /NOWRITE from the other; no need to even turn off caching. (DEC may also have stated that SET DEVICE/DUAL_PORT should be used; presumably this is necessary.)

I would be interested in hearing of the experience of others who have tried using disks in this manner, especially DEC RP07s and disks on an SI9900 controller. (Sites running a VAXcluster or SILINK are, I think, irrelevant to this question.)

The problems which we have seen with disks on the multi-ported SI controller when mounted in this manner are described below. We are currently running VMS 4.5.

Problem 1: Expansion of INDEXF.SYS makes some files inaccessible
---------

If the number of files stored on the disk increases beyond the number which can be accommodated by INDEXF.SYS, that file will be extended. However, files whose headers are located in this expansion area will be inaccessible ('file not found') from a read-only CPU until the disk is dismounted and remounted by that CPU.

Problem 2: Mount verification failures
---------

If a disk goes into 'mount verification' for some reason, a CPU will often fail to remount the disk, the reason reported for the failure being 'wrong volume'. I believe I understand why this is so, although I have not verified all of these details.

Apparently when a disk is mounted read/write, the date/time of mounting is recorded both on the disk itself and in memory. Should the disk go into the mount-verification state and a remount be attempted, it is required that not only the disk label but also this date/time match between what is on the disk and what is in memory--if both do not match, the disk is deemed to be the 'wrong volume'. If a disk is mounted read-only, the date/time which is currently on the disk is recorded in memory for later comparison should a mount verification be necessary.

With this background, it can be seen that there is a potential for a CPU with read-only access to fail mount verification. Consider two scenarios:

Scenario 1 (successful):

  At T0: CPU A mounts DISKX read/write, marking it with time T0
  At T1: CPU B mounts DISKX read-only, noting its mount time of T0
  At T2: DISKX goes into mount verification -- both CPUs successfully
         recover the disk, as they both agree with its recorded mount
         time of T0.

Scenario 2 (unsuccessful):

  At T0: CPU B mounts DISKX read-only, noting its mount time (a time <T0)
  At T1: CPU A mounts DISKX read/write, marking it with time T1
  At T2: DISKX goes into mount verification -- CPU A is successful, as
         the time T1 matches its expectation, but CPU B fails mount
         verification--the time recorded on the disk (T1) is not the
         same as what was there (<T0) when it mounted the disk, hence
         it is deemed to be the 'wrong volume'.

Problem 3: XQPERR bugcheck crashes from read-only CPU
---------

The nature of this problem is less certain than the first two, and we cannot reliably reproduce it.
From discussions with DEC software support, a TENTATIVE interpretation is as follows: the read/write CPU has caused the INDEXF.SYS file to expand; the read-only CPU gets two conflicting pieces of information regarding the length of INDEXF.SYS; taken at face value, these conflicting values imply that INDEXF.SYS has DECREASED in length, a logical impossibility within the rules of the file system, and hence the XQP bails out with a bugcheck (XQPERR). We've seen this problem twice, both times since making the change to VMS 4.5 (with no time spent at 4.4).

Clearly, there is a 'work-around' for two of these problems (1 and 3): make sure that INDEXF.SYS is large enough so that there is never any need to expand it. For the remaining problem (mount verification), I know of no reasonable work-around; the best we could do was to try to reduce the likelihood of events which lead to the need for mount verification.

Ed Miller
ESMP09@SLACTWGM.BITNET
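
P.S. For anyone who wants to reason about Problem 2, here is a small model of the mount-verification check as I understand it. It is only a sketch in Python, not VMS code: the Disk and Cpu classes, their fields, and the run() helper are hypothetical names invented for illustration, and the real check is internal to VMS and unverified by me. The one rule it encodes is that a read/write mount stamps the volume with the current time, a read-only mount merely remembers whatever stamp is already there, and verification succeeds only if label and stamp both still match.

```python
from dataclasses import dataclass


@dataclass
class Disk:
    label: str
    mount_time: int  # stamp written to the volume by the last read/write mount


class Cpu:
    def __init__(self, name):
        self.name = name
        self.expected = None  # (label, mount_time) remembered in memory at mount

    def mount(self, disk, now, write):
        if write:
            disk.mount_time = now  # assumption: read/write mount re-stamps the volume
        # Both kinds of mount remember what is on the volume at this moment.
        self.expected = (disk.label, disk.mount_time)

    def mount_verify(self, disk):
        # 'Wrong volume' (False) unless label and stamp both still match.
        return self.expected == (disk.label, disk.mount_time)


def run(writer_first):
    disk = Disk(label="DISKX", mount_time=-1)  # stamp from some earlier mount (< T0)
    cpu_a, cpu_b = Cpu("A"), Cpu("B")
    if writer_first:  # Scenario 1
        cpu_a.mount(disk, now=0, write=True)
        cpu_b.mount(disk, now=1, write=False)
    else:             # Scenario 2
        cpu_b.mount(disk, now=0, write=False)
        cpu_a.mount(disk, now=1, write=True)
    # At T2 the disk goes into mount verification on both CPUs.
    return {cpu.name: cpu.mount_verify(disk) for cpu in (cpu_a, cpu_b)}


print(run(writer_first=True))   # {'A': True, 'B': True}
print(run(writer_first=False))  # {'A': True, 'B': False}
```

If this model is right, the failure depends only on mount order: any read-only CPU that mounted before the most recent read/write mount holds a stale stamp.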