[comp.os.vms] Thoughts On How To Create A Single Node VAXCluster...

CLAYTON@XRT.UPENN.EDU ("Clayton, Paul D.") (06/23/88)

Mark Wood of Indiana University has asked the following questions in relation
to a past article that I posted to the network about converting to a 'Cluster
Common System Disk' (CCSD). The questions are valid, and I will answer them 
as I am able.

> Thanks to the author of the original posting, for much food for thought.

You're welcome.

> There are a couple of points I'd like to see cleared up, though.

> 1)      OK, *how* do you make a single-node cluster?  When we put in our
>         first VAX a year ago, it *wouldn't boot* until we set VAXCLUSTER=0!
>         Yes, we have an HSC and CI hardware.  What is the magic incantation
>         that will satisfy VMS as to the existence of a cluster, when there
>         is only one service node?

Okay, the tough question first. Before I get started, I want it to be 
understood that there are MANY different ways to accomplish something with 
VAX/VMS, and setting up a cluster is just one of the areas where this holds 
true. It is also an area of great concern because it involves the operating 
system itself, and without a 'stable' platform for the user community to use, 
the system is not worth the electricity, let alone the air conditioning, it is 
using. It should also be noted that as far as I am concerned, there is no 
PERFECT cluster setup. It is purely a matter of weighing the risk factors of 
the various forms a cluster can take. If enough people want, I can go into 
further detail on clusters larger than one node, but for now I will stick with 
a single node VAXCluster. Onward with the answer!!!

It is not enough to change the VAXCLUSTER SYSGEN parameter to a NON-ZERO value 
and say that a VAXCluster exists, as you found out. The SYSGEN parameters that 
go into a VAXCluster are the following.
	SCSNODE			- mandatory
	SCSSYSTEMID		- mandatory
	VAXCLUSTER		- mandatory
	VOTES			- mandatory
	QUORUM			- mandatory
	ALLOCLASS		- mandatory
	DISK_QUORUM		- optional
	QDSKVOTES		- optional
The mandatory or optional status in the above list depends on how the cluster 
is to be set up and whether there is an HSC controlling the disks. If there is 
no HSC in the hardware configuration, then I would drop the last two 
parameters from consideration. The SYSGEN parameters listed above are broken 
down here, with the suggested value for each.

1. SCSNODE - This should be the same as the DECnet node name, which means up 
to six characters long and the same rules as the DECnet node name. No ifs, 
ands, or buts.

2. SCSSYSTEMID - This should be set to the value resulting from the equation 
(1024 * DECnet area number + DECnet node address). No ifs, ands, or buts.

3. VAXCLUSTER - This should be set to 2. Cluster code will always be loaded, 
unless someone makes changes to the SYS$SYSTEM:STARTUP.COM file and/or puts 
something besides "" in for the STARTUP_P1 SYSGEN parameter.

4. VOTES - This is the number of votes that this CPU will put towards the 
needed 'cluster quorum' to see if processing should continue or the CPU should 
HANG. I typically set this to '1' (one) for a single node cluster.

5. QUORUM - This is the number of 'votes' that are expected to be available at 
boot time, and afterwards, in order for processing to continue. If there are 
not enough votes, then the CPU will HANG. This is the first parameter that 
could take either of two values. The equation that is used for 'quorum' is 
'(sum of VOTES from all CPUs currently discovered, plus the quorum disk votes 
if one is defined, + 2) / 2', with any fraction dropped. If the configuration 
has no HSC controllers I would set QUORUM to be '1' (one). If there is an HSC 
in the configuration, I would define the last two optional parameters in my 
list and set the QUORUM parameter to be '2' (two).

6. ALLOCLASS - This is a 'nicety' in the world of clusters. I am tired of 
seeing the disks defined as 'node_name$Dxxx:'. I like the form '$x$Dxxx:'. This 
parameter is what does it. Set it to a numeric value, I typically use '1' 
(one), and the disks will come up looking like '$1$DUA0:'. BEWARE, this is the 
FIRST and ONLY parameter that I know of, unless you change the DECnet 
node name, that CAN BREAK CODE. If there are applications that refer to the 
disks by the physical device name, instead of the logical, THEY WILL BREAK. My 
feeling is that the applications deserve to be broken for doing a dumb thing 
like that anyway. That's why GOD created logical names. Note that I am NOT 
saying that logical names are always good, but they have a place. This is one 
of those places. Setting ALLOCLASS also has the side effect of getting you and 
the user community used to the disk naming convention that is used in a 
multi-node CI/NI based cluster.

7. DISK_QUORUM - This is the disk that is to be used as the 'supplier' of 
votes that go towards cluster quorum. If the ALLOCLASS parameter is set as 
suggested, this parameter has the form '$1$DUA0'. I only use this parameter if 
there is an HSC controller in the configuration, and if so, the disk to be 
used is one that is off the HSC. I do not use a 'local or MASSBUS/UNIBUS' type 
disk for this. My intent is to make sure that the CI path is okay and that the 
cluster will not continue if the connection breaks.

8. QDSKVOTES - This is the number of votes to be assigned to the quorum disk, 
which is used to 'mimic' another CPU as far as QUORUM is concerned. My typical 
value for this is '1' (one) as long as an HSC is in the configuration. 
Otherwise the parameter is not used.
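
The arithmetic in items 2 and 5 can be checked with a few lines of Python. 
This is only a sketch: the function names and the sample DECnet address 1.5 
are made up for illustration, and the quorum formula is read here as 
(votes + 2) / 2 with any fraction dropped, which is an assumption about how 
the division is meant.

```python
def scssystemid(area, node):
    """SCSSYSTEMID from a DECnet address: 1024 * area + node."""
    return 1024 * area + node


def quorum(total_votes):
    """Quorum read as (votes + 2) / 2, fraction dropped (assumed)."""
    return (total_votes + 2) // 2


# Hypothetical DECnet address 1.5 (area 1, node 5).
print(scssystemid(1, 5))  # 1029

# One CPU with VOTES=1 and no quorum disk.
print(quorum(1))          # 1

# One CPU with VOTES=1 plus a quorum disk with QDSKVOTES=1.
print(quorum(1 + 1))      # 2
```

The two quorum results match the values suggested above for the no-HSC and 
HSC cases.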

All these parameters should be set in SYS$SYSTEM:MODPARAMS.DAT, and then the 
command $@SYS$UPDATE:AUTOGEN SAVPARAMS SETPARAMS should be run. Note that the 
reboot is not performed, which allows you to verify that all is in place 
BEFORE shutting down the system and booting up the CCSD.
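
To see the suggested values in one place, here is a rough Python sketch that 
prints the lines one might put in MODPARAMS.DAT. The node name MYVAX and the 
DECnet address 1.5 are made up, the helper name is hypothetical, and the 
'NAME = value' layout with quoted strings is my assumption about the file 
format, so check it against your own MODPARAMS.DAT before using anything it 
prints.

```python
def modparams_lines(node_name, area, node, has_hsc):
    """Build MODPARAMS.DAT style lines from the values suggested above."""
    params = [
        ("SCSNODE", f'"{node_name}"'),        # same as the DECnet node name
        ("SCSSYSTEMID", 1024 * area + node),  # 1024 * area + node address
        ("VAXCLUSTER", 2),
        ("VOTES", 1),
        ("QUORUM", 2 if has_hsc else 1),
        ("ALLOCLASS", 1),
    ]
    if has_hsc:
        # The optional pair, only when the quorum disk sits on an HSC.
        params.append(("DISK_QUORUM", '"$1$DUA0"'))
        params.append(("QDSKVOTES", 1))
    return [f"{name} = {value}" for name, value in params]


# Hypothetical node MYVAX at DECnet address 1.5, with an HSC.
print("\n".join(modparams_lines("MYVAX", 1, 5, has_hsc=True)))
```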

The other mandatory item is a CCSD disk and all the associated [SYS0...] type 
directory structures. Modify the boot files to boot into the correct 'root 
directory'. The root is the numeric suffix after the 'SYS' shown above. For 
example, if there is a [SYS4...] directory that you are using, then the 'root 
value' for the boot files is '4' (four). This value goes into R5 as the high 4 
bits, i.e. '40000000' for the case of [SYS4...]. The other registers in the 
boot file remain the same.
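
The R5 value is just the root number shifted into the top four bits of a 
32-bit longword. A small Python sketch of that arithmetic follows; the 
function name is made up, and the low 28 bits are simply left at zero here.

```python
def r5_for_root(root):
    """Place the system root number in the high 4 bits of a 32-bit value."""
    if not 0 <= root <= 0xF:
        raise ValueError("root must fit in 4 bits (0 through 15)")
    return root << 28


# Root [SYS4...] gives the hex value quoted above.
print(format(r5_for_root(4), "08X"))  # 40000000
```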

Now boot the disk and watch a VAXCluster come to life. Then get used to the 
'funny' logicals and other things that are unique to VAXCluster 
configurations. Note that the first time the system is booted off the CCSD and 
there is a quorum disk defined, the process may appear to be in a loop for a 
little bit. Do not be concerned until after 10 minutes. Then go looking for a 
problem in the SYSGEN parameters and other such things. A conversational 
bootstrap can help in this area.

> 2)      "...when putting up V5.0, and it asks you if you want the CCSD...."
>         The impression I got from a recent DECUS session is that the V5.0
>         installation *won't ask*; from here on *all* system disks will be
>         structured as cluster-common.  Did I misunderstand something?  If
>         not, this is yet another reason to become familiar with the CCSD
>         structure.

There have been various comments on the net about this topic, and I refer you 
to those messages. I have no comment at this time.

Hope this helps in getting a single node VAXCluster up and running. If there 
are further questions, let me know. :-)

NOTE***  I do not want or desire any FLAME type messages about the 
recommendations that have been made here. The thoughts here are mine and work 
on VAX/VMS 4.0 and higher. I am, however, open to further discussion on the 
topic if needed/warranted.

pdc

Paul D. Clayton 
Address - CLAYTON%XRT@CIS.UPENN.EDU

Disclaimer:  All thoughts and statements here are my own and NOT those of my 
employer, and are also not based on, or contain, restricted information.