CLAYTON@XRT.UPENN.EDU ("Clayton, Paul D.") (06/23/88)
Mark Wood of Indiana University has asked the following questions about a past article that I posted to the network on converting to a 'Cluster Common System Disk' (CCSD). The questions are valid, and I will answer them as I am able.

> Thanks to the author of the original posting, for much food for thought.

You're welcome.

> There are a couple of points I'd like to see cleared up, though.
>
> 1) OK, *how* do you make a single-node cluster? When we put in our
> first VAX a year ago, it *wouldn't boot* until we set VAXCLUSTER=0!
> Yes, we have an HSC and CI hardware. What is the magic incantation
> that will satisfy VMS as to the existence of a cluster, when there
> is only one service node?

Okay, the tough question first. Before I get started, I want it understood that there are MANY different ways to accomplish something with VAX/VMS, and setting up a cluster is just one of the areas where this holds true. It is also an area of large concern because it involves the operating system itself, and without a 'stable' platform for the user community, the system is not worth the electricity, let alone the air conditioning, it is using.

It should also be noted that, as far as I am concerned, there is no PERFECT cluster setup. It is purely a matter of weighing the risk factors of the various forms a cluster can take. If enough people want, I can go into further detail on clusters larger than one node, but for now I will stick with a single-node VAXCluster.

Onward with the answer!!!

It is not enough to change the VAXCLUSTER SYSGEN parameter to a NON-ZERO value and say that a VAXCluster exists, as you found out. The items that go into a VAXCluster in the SYSGEN area are the following.

    SCSNODE      - mandatory
    SCSSYSTEMID  - mandatory
    VAXCLUSTER   - mandatory
    VOTES        - mandatory
    QUORUM       - mandatory
    ALLOCLASS    - mandatory
    DISK_QUORUM  - optional
    QDSKVOTES    - optional

Whether an item is mandatory or optional depends on how the cluster is to be set up and on whether there is an HSC controlling the disks. If there is no HSC in the hardware configuration, I would drop the last two parameters from consideration. The SYSGEN parameters listed above are broken down here, and the suggested values for them are detailed.

1. SCSNODE - This should be the same as the DECnet node name, and up to six characters long. Same rules as the DECnet node name. No ifs, ands, or buts.

2. SCSSYSTEMID - This should be set to the value resulting from the equation (1024 * DECnet area number + DECnet node address). No ifs, ands, or buts.

3. VAXCLUSTER - This should be set to 2. Cluster code will always be loaded, unless someone makes changes to the SYS$SYSTEM:STARTUP.COM file and/or puts something besides "" in for the STARTUP_P1 SYSGEN parameter.

4. VOTES - This is the number of votes that this CPU will contribute towards the needed 'cluster quorum', which decides whether processing should continue or the CPU should HANG. I typically set this to '1' (one) for a single-node cluster.

5. QUORUM - This is the number of 'votes' that must be available at boot time, and afterwards, in order for processing to continue. If there are not enough votes, the CPU will HANG. This is the first parameter for which I have two possible values. The equation used for 'quorum' is '(sum of VOTES from all CPUs currently discovered, plus the quorum disk if defined, + 2) / 2', using integer division. If the configuration has no HSC controllers, I would set QUORUM to '1' (one). If there is an HSC in the configuration, I would define the last two optional parameters in my list and set the QUORUM parameter to '2' (two).

6. ALLOCLASS - This is a 'nicety' in the world of clusters.
I am tired of seeing the disks defined as 'node_name$Dxxx:'. I like the form '$x$Dxxx:'. This parameter is what does it. Set it to a numeric value (I typically use '1' (one)), and the disks will come up looking like '$1$DUA0:'. BEWARE, this is the FIRST and ONLY parameter that I know of, apart from changing the DECnet node name, that CAN BREAK CODE. If there are applications that refer to the disks by the physical device name instead of a logical name, THEY WILL BREAK. My feeling is that the applications deserve to be broken for doing a dumb thing like that anyway. That's why GOD created logical names. Note that I am NOT saying that logical names are always good, but they have a place. This is one of those places. Setting ALLOCLASS also has the side effect of getting you and the user community used to the disk naming convention of a multi-node CI/NI based cluster.

7. DISK_QUORUM - This is the disk that is to be used as the 'supplier' of votes that go towards cluster quorum. If the ALLOCLASS parameter is set as suggested, this parameter has the form '$1$DUA0'. I only use this parameter if there is an HSC controller in the configuration, and if so, the disk to be used is one that is off the HSC. I do not use a 'local or MASSBUS/UNIBUS' type disk for this. My intent is to make sure that the CI path is okay and that the cluster will not continue if the connection breaks.

8. QDSKVOTES - This is the number of votes assigned to the quorum disk, which is used to 'mimic' another CPU as far as QUORUM is concerned. My typical value for this is '1' (one) as long as an HSC is in the configuration. Otherwise the parameter is not used.

All these parameter changes should be made in SYS$SYSTEM:MODPARAMS.DAT, and then the command

    $ @SYS$UPDATE:AUTOGEN SAVPARAMS SETPARAMS

should be run. Note that the reboot is not performed. This allows you to verify that all is in place BEFORE shutting down the system and booting up the CCSD.
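
As a rough illustration of the arithmetic behind the suggested values, here is a minimal Python sketch. It is not part of VMS or AUTOGEN. The function names (scssystemid, cluster_quorum, modparams_lines), the node name VAXA, and the DECnet address 1.5 are all made up for the example. The quorum calculation assumes the '(total votes + 2) / 2' form with integer division, and the printed lines are only MODPARAMS.DAT-style text that should be checked against the AUTOGEN documentation before use.

```python
def scssystemid(area: int, node: int) -> int:
    """SCSSYSTEMID from a DECnet address: 1024 * area + node."""
    return 1024 * area + node


def cluster_quorum(cpu_votes, qdsk_votes: int = 0) -> int:
    """Quorum from all CPU votes plus quorum-disk votes (assumed integer division)."""
    total = sum(cpu_votes) + qdsk_votes
    return (total + 2) // 2


def modparams_lines(node_name, area, node, has_hsc, alloclass=1, quorum_disk="DUA0"):
    """Build MODPARAMS.DAT-style lines for a single-node cluster (illustrative only)."""
    votes = 1                            # one vote for the single CPU
    qdsk_votes = 1 if has_hsc else 0     # quorum disk only when an HSC is present
    lines = [
        f'SCSNODE = "{node_name}"',
        f"SCSSYSTEMID = {scssystemid(area, node)}",
        "VAXCLUSTER = 2",
        f"VOTES = {votes}",
        f"QUORUM = {cluster_quorum([votes], qdsk_votes)}",
        f"ALLOCLASS = {alloclass}",
    ]
    if has_hsc:
        # The device name takes the $<alloclass>$<device> form, e.g. $1$DUA0
        lines.append(f'DISK_QUORUM = "${alloclass}${quorum_disk}"')
        lines.append(f"QDSKVOTES = {qdsk_votes}")
    return lines


if __name__ == "__main__":
    # Hypothetical node VAXA at DECnet address 1.5, with an HSC
    for line in modparams_lines("VAXA", area=1, node=5, has_hsc=True):
        print(line)
```

For the hypothetical node at DECnet address 1.5 with an HSC, this gives SCSSYSTEMID 1029 and QUORUM 2. Without the HSC, the two quorum-disk lines are dropped and QUORUM comes out as 1, matching the two cases described above.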

The other mandatory item is a CCSD disk and all the associated [SYS0...] type directory structures. Modify the boot files to boot into the correct 'root directory'; the root is the numeric suffix after the 'SYS' shown above. For example, if you are using a [SYS4...] directory, then the 'root value' for the boot files is '4' (four). This value goes into R5 as the high 4 bits, i.e., '40000000' (hex) for the case of [SYS4...]. The other registers in the boot file remain the same.

Now boot the disk and watch a VAXCluster come to life. Then get used to the 'funny' logicals and other things that are unique to VAXCluster configurations. Note that the first time the system is booted off the CCSD with a quorum disk defined, the process may appear to be in a loop for a little while. Do not be concerned until 10 minutes have passed. Then go looking for a problem in the SYSGEN parameters and other such things. A conversational bootstrap can help in this area.

> 2) "...when putting up V5.0, and it asks you if you want the CCSD...."
> The impression I got from a recent DECUS session is that the V5.0
> installation *won't ask*; from here on *all* system disks will be
> structured as cluster-common. Did I misunderstand something? If
> not, this is yet another reason to become familiar with the CCSD
> structure.

There have been various comments on the net about this topic, and I refer you to those messages. I have no comment at this time.

Hope this helps in getting a single-node VAXCluster up and running. If there are further questions, let me know. :-)

NOTE*** I do not want any FLAME type messages about the recommendations that have been made here. The thoughts here are mine and work on VAX/VMS 4.0 and higher. I am open to further discussion of the topic, however, if needed/warranted.

pdc

Paul D. Clayton
Address - CLAYTON%XRT@CIS.UPENN.EDU
Disclaimer: All thoughts and statements here are my own and NOT those of my employer, and are also not based on, or contain, restricted information.