croft@russell.stanford.EDU.UUCP (03/26/87)
I've interspersed some answers to the questions that have come up.

> Date: 25 Mar 87 18:19:25 EST
> From: JOSEPH@RED.RUTGERS.EDU
> Subject: Kbox/Kip problems
> To: info-applebus@C.CS.CMU.EDU
> Cc: croft@RUSSELL.STANFORD.EDU
> Message-Id: <12289303332.51.JOSEPH@RED.RUTGERS.EDU>
>
> ...
> The Kbox seems to crash at random times and will not restart on power
> up. The box will sometimes run for 3-4 days and other times will
> crash after 12-16 hours.

The boxes I am aware of are all staying up for indefinite periods
(months at least). I have seen two instances of kboxes on noisy AC
circuits that have had crashes like this. After moving the power plugs
or installing line filters, the crashes ceased. However, I don't rule
out the possibility that your packet environment may have uncovered
some unknown bug in the gateway. Can you try (1) experimenting with the
power situation, and (2) experimenting with smaller groups of users on
your appletalk cable to see if this is application or protocol
dependent? We have groups here using the kbox and MacServe / AppleShare
without these crashes.

> The 10/86 version of Kip would flood the
> appletalk network with packets when it crashed and would not answer
> pings on the Ethernet side.

The 'flooding' you speak of is typical of the power glitch type of
crash we saw here. The current Kinetics PROM code has a small bug: it
does not disable ethernet interrupts following a brief power glitch.
This same bug surfaces when you tell Prompt to restart or reset an
operating gateway. The 'flood' consists of recursive stack dump packets
from the gateway PROM code as it crashes over and over again on the
unvectored ethernet interrupt. I believe this type of recursive stack
dump can also occur under other software crash circumstances. At any
rate, Kinetics has fixed the PROM bug in their 'next release', which I
assume they will make available for free or a nominal fee.
There is nothing in the KIP code itself (short of a newly discovered
bug) that could start this dumping process.

> This week we have imported the new
> version of Kip and it also crashes. It does not seem to impact the
> Appletalk side the same way when it freaks out, but it also will not
> answer pings on the Ether side. When the gateway is crashed, the
> Kinetics PROMPT program can not find a gateway until I switch it off
> and back on again.

Until the new PROMs are available, it is recommended that you always
power cycle the gateway for a second or so before loading. Otherwise
the ethernet interrupt enable latch doesn't get cleared.

>
> Questions:
>
> 1) if the gateway is running, and I switch it off and back on, should
> it restart again all by itself? I assume it would just reload its
> routing tables via ATALKAD running on our PYRAMID.

Only if the program text/data segments checksum correctly. If the crash
has somehow corrupted the program, then you will have to reload with
Prompt.

>
> 2) If it should but does not, is there any way I can diagnose whether
> the problem is in hardware (the Kbox) or software?

Try the suggestions listed above.

>
> 3) Has anyone else experienced this problem? How stable is this
> software? Should I expect it to run bug free for weeks or is this
> typical performance?
>

It's not typical. I think I would have heard from users if they were
having experiences like this. But again, you might have some unusual
packets in your environment that are triggering this.

> 4) We are running MacServe 2.2 on this network. Could it be
> interacting? Has anyone else put a gateway on a MacServe Appletalk
> network? Are there known incompatabilities?

Not that we have seen.

>
> 5) MacTelnet seems to take an awfully long time to realize that the
> gateway is down. It will often sit for 3-5 minutes before complaining
> that it cannot get an IP address from the gateway. Is there any way
> to shorten this wait?
>

Charlie Kim or Mark Sherman might be able to shorten this timeout
without too much trouble.

> Thank you for your help.
>
> Seymour Joseph
> Systems programmer/Microcomputers
>
> P.S. Is the terminal emulation in MacTelnet being worked on? Its
> vt100 is really marginal.

...

>
> Date: Wed, 25 Mar 87 18:40:10 est
> From: dk1z#@andrew.cmu.edu (David Kovar)
> To: info-applebus@C.CS.CMU.EDU, JOSEPH@RED.RUTGERS.EDU
> Subject: Re: Kbox/Kip problems
> Cc: croft@RUSSELL.STANFORD.EDU
> In-Reply-To: <12289303332.51.JOSEPH@RED.RUTGERS.EDU>
>
> I'm sure that I'll be corrected on a few points, but here are my
> observations.
>
> 1) If you power off the gateway and bring it back up it should come up on its
> own. As I am paranoid, I tend to reload it by hand anyhow.

If it comes up by itself then you can be assured that everything is
fine since the program checksum succeeded.

>
> 2) No informed answer.
>
> 3) I've had no stability problems and I've pounded on it pretty hard at
> times. From my experience, you should be able to count on it for weeks.
>
> 4) We do not have MacServe. We have had problems with Hayes InterBridges,
> though.

I would like to hear of any other Interbridge problems. Brad Parker of
General Computer seems to be using it with his Hayes boxes ok. Have you
verified that the latest release still has Hayes problems?

>
> 5) The timeout in MacTelenet (CMU/Columbia) is quite long. ...
>
> 6) MacTelenet will probably not be getting many improvements. ...
>
> 7) Prompt frequently loses the Kinetics box. When I use Reset, the Kinetics
> box will reset, but Prompt will not find it. Power cycling always works. The
> Prompt screen is also totally useless and the configuration file drives me
> nuts. Having to encode the name and load file in hex is insane.

Probably Kinetics should do a couple things to Prompt: (1) disable or
blank certain dialog boxes when a config file is read.
After all, when you are reading from a config file, you are setting up
all this stuff in advance, so it doesn't require manual dialog editing.
The config files allow operations staff, unfamiliar with appletalk
terminology, to load and start a gateway with minimal fuss. (2) You
DON'T have to encode the gateway/file name in hex. These are only
carryovers from the original config file scheme that Kinetics used. It
would be nice if the config reader in Prompt would at least understand
dot format IP addresses. I thought this change was going to go into the
next release.

> If Kinetics
> would release the source, I'd be happy to upgrade it. If I had the time, I'd
> just write it from scratch, but I'm not that nuts, yet. Once I start handing
> boxes out to local administrators on campus I'm certain that I'm going to
> wish that I had rewritten it.

For large sites, the answer is probably going to be that 2nd PROM
socket inside the kbox. This would allow the boxes to boot themselves
over the ethernet. However, this still leaves the problem of where to
put the Prompt-type configuration information (what's my ether address,
IP address, etc.). Some type of unique serial number / ethernet address
must be present for schemes such as BOOTP to work. I also surely wish
Kinetics had put a small EEPROM onto the board, as I originally
suggested with Seagate. This is the place for Prompt-type config
info...
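
On the dot format point above: the conversion a config reader would
need is small. The following is only a rough sketch in Python, not
anything taken from Prompt. It assumes the hex form is simply the four
address octets written as eight hex digits; the actual Prompt config
file layout is not described here, and the function names and the
sample address are made up for illustration.

```python
import string


def dotted_to_hex(addr: str) -> str:
    """Convert a dot format IP address to eight hex digits.

    Assumption: the hex form is the four octets in network order.
    This is a guess at a plausible encoding, not Prompt's real format.
    """
    parts = addr.strip().split(".")
    if len(parts) != 4:
        raise ValueError(f"expected four octets in {addr!r}")
    octets = []
    for part in parts:
        # Reject empty, signed, or non-decimal fields before converting.
        if not (part.isascii() and part.isdigit()):
            raise ValueError(f"bad octet {part!r} in {addr!r}")
        value = int(part)
        if value > 255:
            raise ValueError(f"octet {value} out of range in {addr!r}")
        octets.append(value)
    return "".join(f"{octet:02X}" for octet in octets)


def hex_to_dotted(hexaddr: str) -> str:
    """Inverse of dotted_to_hex, under the same assumed encoding."""
    if len(hexaddr) != 8 or any(c not in string.hexdigits for c in hexaddr):
        raise ValueError(f"expected eight hex digits, got {hexaddr!r}")
    value = int(hexaddr, 16)
    # Peel off one octet at a time, most significant first.
    return ".".join(str((value >> shift) & 0xFF) for shift in (24, 16, 8, 0))
```

With this sketch, `dotted_to_hex("36.8.0.1")` gives `"24080001"`, and
`hex_to_dotted` maps it back, so a reader could accept either form on
input and keep the hex form internally.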
brad@gcc-milo.UUCP.UUCP (03/26/87)
croft@russell.stanford.EDU (Bill Croft):
> I would like to hear of any other Interbridge problems. Brad Parker of
> General Computer seems to be using it with his Hayes boxes ok. Have you
> verified that the latest release still has Hayes problems?

Er, uh, huh? My ears are burning!

We have a net with two appletalk bridges and one kbox gateway. I think
we're still running the 10/86 code with some minor updates (it's been a
long time since I reloaded it - over a month) and I use it many times a
day to upload cross-compiled applications from our vax.

I run *very* heavy duty file system exercisers on top of our remote
file system (HyperNet 2.0 - coming to a dealer near you) which
literally flood the network with packets. This will always (yes,
always) cause the hayes bridge to lock up, but the kbox code never
falls down (thanks Bill).

I suspect that the hayes code has some small "windows" which can cause
it to hang up (you remember - atomicity!). These windows are small
enough not to be exercised in normal traffic. If, however, you send as
many packets as you can stuff on the network for 12 hours, you will hit
the window. (Having written lots of background appletalk code and
having written code for the kbox, I've run into this before.) The
symptom is that the bridge appears to be alive (responds to NBP) but
refuses to forward packets (wow - I'd love to fire up an emulator to
see what's going on. hi ho.)

Anyway, for what it's worth, the kbox code appears to be solid. The
hayes code, however, needs an update...

-brad
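
For anyone who hasn't been bitten by one of these "windows", here is a
toy illustration in Python of the general kind of atomicity bug
suspected above. It is purely hypothetical: it is not the hayes code
(nobody here has seen that), and the class and method names are made
up. Mainline code does a read-modify-write on a shared packet count,
and a simulated interrupt lands between the read and the write.

```python
class PacketQueue:
    """Toy packet count shared by mainline code and an interrupt handler."""

    def __init__(self, count=0):
        self.count = count    # packets waiting to be forwarded
        self.masked = False   # True while "interrupts" are disabled
        self.pending = 0      # interrupts that arrived while masked

    def interrupt(self):
        """Simulated receive interrupt: one more packet has arrived."""
        if self.masked:
            self.pending += 1     # held off until interrupts are re-enabled
        else:
            self.count += 1

    def dequeue_unsafe(self, fire_in_window=False):
        """Remove one packet with no protection around the update."""
        snapshot = self.count         # read
        if fire_in_window:
            self.interrupt()          # interrupt lands inside the window
        self.count = snapshot - 1     # write clobbers the handler's update

    def dequeue_safe(self, fire_in_window=False):
        """Same update, but with interrupts masked across it."""
        self.masked = True
        snapshot = self.count
        if fire_in_window:
            self.interrupt()          # deferred, so the count is untouched
        self.count = snapshot - 1
        self.masked = False
        while self.pending:           # now service the deferred interrupts
            self.pending -= 1
            self.count += 1
```

Starting from a count of 5, `dequeue_unsafe(fire_in_window=True)`
leaves 4 where 5 is correct: one packet has silently vanished from the
bookkeeping. `dequeue_safe` under the same timing leaves 5. In real
code the window is a few instructions wide, which is why it only shows
up under hours of saturating traffic.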