[comp.os.vms] Speeding Up System Bootup And Recommendations On How To Accomplish.

CLAYTON@XRT.UPENN.EDU ("Clayton, Paul D.") (09/28/87)

Information From TSO Financial - The Saga Continues...
Chapter 27 - September 28, 1987

The following article was posted asking for answers; I have inserted my
replies into the text.

	Speedup VAX-cluster startup-procedure
	-------------------------------------

        Booting our VAX-cluster takes too long. It takes approximately
        20 minutes before we have all systems up and running from a
        coldstart.

        The first bottleneck is the meeting that the 3 VAXes seem to
        have. Perhaps adding a quorum disk can speed up that mumbling
        about entering and leaving the cluster?

Adding a quorum disk will NOT speed things up while the 'meeting' is taking
place. It will also LENGTHEN the time taken during a cluster transition when
a system leaves the cluster for any reason. I am currently engaged in a
serious conversation with my LOCAL and AREA support reps over the way my
cluster is configured. DEC is predicting disaster, and in one case it may
result. But I have lost two HSCs and THEY crapped up 4 disks. You are not
supposed to win.

        The second bottleneck is SYSTARTUP.COM itself. There are a lot
        of software products that have to be handled. Some of them I
        start up by submitting a batch job. I edited one
        command-procedure to start up jnet from batch, and if there is
        a problem I get a notification by MAIL from that
        command-procedure.

It is a good idea to start things in batch if you can. My practice is to NOT
START the batch queues during boot time. The reason is that the battery backup
on the system clocks has failed me at least 3 times. The result is a bad system
time. The worst disaster to date was an 8700 booted with a time two YEARS ahead
of what it should have been. We have 40+ batch jobs that do ALL SORTS of things
depending on the system time. We ended up rebuilding a lot of disk packs to get
the files back to what they should have been. Sigh...
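
One way to guard against a bad clock is to compare the current time with a
timestamp saved while the system was known to be healthy, and to start the
queue only if the two are plausible. The following is a minimal sketch, not
a tested procedure. The file SYS$MANAGER:LAST_GOOD_TIME.DAT, the 30-day
window, and the queue name SYS$BATCH are all assumptions made for
illustration; the file is assumed to hold one line in F$CVTIME comparison
format, rewritten periodically and at shutdown.

```dcl
$! Hypothetical clock sanity check, run before starting any batch queue.
$! LAST_GOOD_TIME.DAT is an assumed file holding "yyyy-mm-dd hh:mm:ss.cc".
$ OPEN/READ/ERROR=hold stamp SYS$MANAGER:LAST_GOOD_TIME.DAT
$ READ/END_OF_FILE=hold_close stamp last_good
$ CLOSE stamp
$ now = F$CVTIME("","COMPARISON")
$ oldest_ok = F$CVTIME("TODAY-30-00:00:00","COMPARISON")
$! Clock is behind the last known good time: do not trust it.
$ IF now .LTS. last_good THEN GOTO hold
$! Last good stamp is over 30 days old: the clock may have jumped ahead.
$ IF last_good .LTS. oldest_ok THEN GOTO hold
$ START/QUEUE SYS$BATCH
$ EXIT
$hold_close:
$ CLOSE stamp
$hold:
$ WRITE SYS$OUTPUT "System time looks wrong; batch queue left stopped."
$ EXIT
```

If the check fails, the queue stays stopped until someone has verified the
time by hand and started it manually.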

        The third bottleneck sometimes is the mount-verification of 10
        RA81s if one of the systems was not shut down properly. I can
        handle that by switching off mount-verification.

Again, I would NOT recommend that Mount-Verification be turned off. You could
end up with scrambled bitmaps in the cache and a lost disk way after any
sensible point of rebuild. I HIGHLY recommend that MVTIMEOUT be set to
something HUGE like 64000 so that an HSC50 with a TU58 can reboot BEFORE the
disks can complete the verification procedure.
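
MVTIMEOUT is a SYSGEN parameter, so one way to apply the value above is
through the SYSGEN utility. This is a sketch of a typical session, assuming
you want the change to persist across reboots; check it against your own
VMS version before using it.

```dcl
$ RUN SYS$SYSTEM:SYSGEN
SYSGEN> USE CURRENT
SYSGEN> SET MVTIMEOUT 64000
SYSGEN> WRITE CURRENT
SYSGEN> EXIT
```

If the site uses AUTOGEN, the same value would also need to go into
SYS$SYSTEM:MODPARAMS.DAT (as the line MVTIMEOUT = 64000), otherwise the next
AUTOGEN run may overwrite it.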

        I am going to rewrite the startup-procedure. I want to
        implement a multi-stage rocket that takes care of dependencies.
        For example, I should first start up DECnet and after that PSI.
        I think of writing one command-procedure which can be called
        with an argument indicating the command-procedure (e.g.
        CMSSTARTUP.COM). There should be some error-checking and
        problems should be reported by MAIL using a distribution list
        pointing to our system-managers.

        SYSTARTUP.COM then could look like:

        $ ...
        $ @SUB_TO_BATCH CMSSTARTUP
        $ @SUB_TO_BATCH ACMSTART
        $ ...

Good idea. Just do this in SYSTARTUP.COM, not STARTUP.COM.
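
A procedure like the proposed SUB_TO_BATCH could be sketched as follows.
This is only an illustration of the idea, not a tested implementation. It
assumes the startup procedures live in SYS$MANAGER, that the batch queue is
named SYS$BATCH, and that a distribution list SYS$MANAGER:SYSMGR.DIS exists;
all three names are hypothetical. The procedure submits itself to batch, and
when it runs as the batch job it calls the named procedure and mails the
list if the final status is a failure.

```dcl
$! SUB_TO_BATCH.COM (sketch).  P1 = name of a startup procedure,
$! e.g. CMSSTARTUP, assumed to be found in SYS$MANAGER.
$ IF P1 .EQS. "" THEN EXIT
$ IF F$MODE() .EQS. "BATCH" THEN GOTO run_it
$! Called from SYSTARTUP.COM: resubmit this same file as a batch job.
$ self = F$ENVIRONMENT("PROCEDURE")
$ SUBMIT/NOPRINT/QUEUE=SYS$BATCH/NAME='P1'/PARAMETERS=('P1') 'self'
$ EXIT
$run_it:
$! Running in batch: keep going after errors so they can be reported.
$ SET NOON
$ @SYS$MANAGER:'P1'
$ status = $STATUS
$ IF status THEN EXIT
$! SYSMGR.DIS is an assumed distribution list of system managers.
$ MAIL/SUBJECT="''P1' failed at startup, status ''status'" NL: "@SYS$MANAGER:SYSMGR.DIS"
$ EXIT
```

Dependencies such as DECnet before PSI are not handled here; the simplest
approach is to have one submitted procedure start both in order. Jobs
submitted this way simply wait in the queue if it has not been started yet.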

	Now for my question:

        Does anybody have some time-saving hints on how to handle all
        this? I know how to do it but every hint might save me some
        time. To save you time reading boring remarks:

We are also currently doing the same type of things. I know of no current
DECUS tape that performs as we want. Have fun with your systems.

Paul D. Clayton - Manager Of Systems
TSO Financial - Horsham, Pa. USA
Address - CLAYTON%XRT@CIS.UPENN.EDU