CLAYTON@XRT.UPENN.EDU ("Clayton, Paul D.") (10/12/87)
Information From TSO Financial - The Saga Continues... Chapter 29 - October 11, 1987

It's been a while since my last blurb, and this edition brings several items to light for contemplation and comment, if needed. Leading the list is networks. My rendition of how computers came into being is that CPU's were created and someone said they were good. The next day, someone wanted to log on to the CPU's, and networks were created. And EVERYTHING went to HELL. Computer systems have not recovered. I say the above for the following reasons.

1. We have an Ethernet backbone shipped around the country via T1 links and Ethernet bridges. On a T1 link, there is a master clock that syncs ALL T1 modems together. Our network is set up to survive a single branch failure on the phone lines. In this blessed event, traffic will 'fail over' to the remaining links and attempt to continue. The rub comes in having one clock do all the work for the T1 modems. Should the POWER to the CLOCK, or the CLOCK itself, fail, all T1 modems are USELESS. For us, the power failed and the T1's went to lunch. Our Ethernet, PBX tie lines, dedicated data circuits and computer systems went to LUNCH with them. In researching this further, it may NOT be possible to have a second clock on the network. The vendor is checking further. Sigh...

2. In reading all the messages about DECnet/Ethernet problems, and the latest bit about the unavailable buffers, I consider networks to be extremely unsophisticated and in DIRE need of new technologies and/or tools. I had to chuckle at the response concerning the use of Ethernim to locate a node pumping out garbage. While we have Ethernim and run it from time to time, its display of a significant number of Broadcast or Multicast messages is almost useless. There is no way to isolate the source, short of trimming the network. How many networks can be trimmed to find the source? Ours can not, because we use it for so many different things at all times of the day and night.
On to other topics. Our Vax Cluster Console (VCS) system has arrived AT LAST. I have/had high hopes for the system, but in light of the network problem listed in item #1, part of the project is being SERIOUSLY reconsidered. Anyway, the boxes all came, all fifteen (15) of them, and nothing was damaged. Ours is a LARGER configuration than is normal for a VCS. We are putting in a separate Ethernet backbone for it, to eliminate the problem in item #1, and adding terminal servers to do some special functions for us. Our VCS system was planned to control 15+ systems, both local and remote. The gotchas started occurring quickly.

1. The CONSOLE port on a MicroVAX II system is NOT a DB25 connector. It's one of those SUB-D types that someone thought was a good idea. WRONG. The fiber optic links are all based on DB25 connectors, one of which has to go onto the back of the processor in place of the console connector. This can NOT be done on the MVAX II systems. Stalemate for the moment; Colorado is working on it.

2. The 85/87/88 PRO consoles ALSO present a problem. The VCS gets plugged into the PRO, and the question is where. The current statement from DEC is that it gets plugged into the RDC port on the PRO, and the MDS01 box gets plugged into the VCS system. This is a bad idea, because there is only one MDS01 per VCS. I have, on several bad days, lost more than one system. How is DEC to perform the RDC function on several systems from one MDS01?? The other alternative is to have many MDS01's on several DHV11 ports. I did NOT order enough interfaces for that solution. My experience with the folks at RDC is that it is sometimes extremely hard for them to get things right and be able to work my system(s) from afar. And to expect the RDC gang to be able to UNDERSTAND and USE VCS type commands is, I feel, asking too much, at least at this point in time. The other purpose we use the MDS01 boxes for is dial-up lines into our systems.
If all the MDS01's get moved to the VCS system, then I have to have a bunch of junk accounts on it for people to log into and then SET HOST to another system. This is also something I DO NOT WANT!!!

3. The fiber optic links also caused problems. I was told ONLY of the power supply on the VCS end of the links. It turns out that there is a power supply on the opposite end as well. This means EXTRA power outlets have to be run for each system that has a VCS tie-in. There is one good point here, in that the power supply on the VCS end is a 16-tap deal with ONE power tail. Someone used their head on that one, at least.

4. On the plug that attaches to the processor being controlled by the VCS, there is a switch that allows output to go to the VCS or to a LOCAL terminal. This would be needed should the VCS system be unavailable for whatever reason. This sounded like a GOOD idea. I then asked about its impact when connected to the HSC console port. The deal here is that should you TURN OFF the console terminal that is connected to the HSC console port, the HSC WILL REBOOT. This is NOT healthy in a production shop, and NOT something you want happening in the midst of LOSING ALL system consoles at one time. I have one HSC50, and a reboot of it would have me down for MINUTES/HOURS while the TU58 whirls!! The question boils down to this: "Does the switching from VCS to LOCAL maintain the signals in the HSC that will PREVENT a HSC reboot?" The answer is UNKNOWN as yet, and Colorado is looking into this also.

All this and the software has NOT been installed yet. Who knows what results I will have when that task is completed. Needless to say, at least ONE more chapter of 'The Saga...' should come out of it.

Also got a UPS system approved, and work has started on the room it will reside in. It is due to arrive by the end of October. Just to prove we were correct in getting one, the power has FAILED four (4) times in the span of three (3) days.
Three of the hits resulted from UNKNOWN power shutdowns by our supplier. Two of those were within 15 minutes of each other - just enough time between hits for the MOUNT commands in the bootup files to be starting on a large disk farm. You better BELIEVE we do the REBUILDS during the MOUNTs. The fourth was a lightning hit. All four lasted less than 10 seconds.

The UPS has batteries good for 10 minutes at 75KVA. Since we only use about 50KVA, we have close to 20 minutes worst case. I shall rest easier when it's in, and my local field service is also glad we are getting one.

That's enough for this edition, so until things break, again, I shall continue my vigil and hope for the best and get the worst. :-)

Paul D. Clayton - Manager Of Systems
TSO Financial - Horsham, Pa. USA
Address - CLAYTON%XRT@CIS.UPENN.EDU
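For the curious, the UPS runtime figure above can be sanity-checked with a back-of-envelope calculation. This is only a sketch of the arithmetic, not a battery model; the load and runtime numbers come from the text, and the linear-scaling assumption is mine (real lead-acid batteries deliver somewhat more than the linear estimate at lighter loads, which is how 10 minutes at 75KVA can stretch toward 20 at 50KVA).

```python
# Back-of-envelope UPS runtime check (illustrative only, not a battery model).
# Rated figures from the text: 10 minutes of battery at the full 75 kVA load.
rated_minutes = 10.0
rated_load_kva = 75.0
actual_load_kva = 50.0  # the shop's typical draw, per the text

# Naive assumption: runtime scales inversely with load.
linear_estimate = rated_minutes * (rated_load_kva / actual_load_kva)
print(linear_estimate)  # 15.0 -- the floor; real batteries do a bit better
```

So 15 minutes is the conservative lower bound under linear scaling, and "close to 20 minutes worst case" is a plausible field estimate on top of that.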