[mod.protocols.appletalk] kbox/kip

croft@russell.stanford.EDU.UUCP (03/26/87)

I've interspersed some answers to the questions that have come up.

> Date: 25 Mar 87 18:19:25 EST
> From: JOSEPH@RED.RUTGERS.EDU
> Subject: Kbox/Kip problems
> To: info-applebus@C.CS.CMU.EDU
> Cc: croft@RUSSELL.STANFORD.EDU
> Message-Id: <12289303332.51.JOSEPH@RED.RUTGERS.EDU>
> 
> ...
> The Kbox seems to crash at random times and will not restart on power
> up.  The box will sometimes run for 3-4 days and other times will
> crash after 12-16 hours.  

The boxes I am aware of are all staying up for indefinite periods
(months at least).  I have seen two instances of kboxes on noisy AC
circuits that have had crashes like this.  After moving the power plugs
or installing line filters, the crashes ceased.

However, I don't rule out the possibility that your packet environment
may have uncovered some unknown bug in the gateway.  Can you try
(1) experimenting with the power situation, and (2) experimenting with
smaller groups of users on your appletalk cable to see if this is
application or protocol dependent?

We have groups here using the kbox and MacServe / AppleShare without
these crashes.

> The 10/86 version of Kip would flood the
> appletalk network with packets when it crashed and would not answer
> pings on the Ethernet side. 

The 'flooding' you speak of is typical of the power glitch type of
crash we saw here.  The current Kinetics PROM code has a small bug that
does not disable ethernet interrupts following a brief power glitch.
This same bug surfaces when you tell Prompt to restart or reset an
operating gateway.

The 'flood' consists of recursive stack dump packets from the gateway
PROM code as it crashes over and over again with the unvectored
ethernet interrupt.  This type of recursive stack dump I believe can
also occur under other software crash circumstances.  At any rate,
Kinetics has fixed the PROM bug in their 'next release', which I assume
they will make available for free or a nominal fee.

There is nothing in the KIP code itself (short of a newly discovered
bug) that could start this dumping process.

> This week we have imported the new
> version of Kip and it also crashes.   It does not seem to impact the
> Appletalk side the same way when it freaks out, but it also will not
> answer pings on the Ether side.  When the gateway is crashed, the
> Kinetics PROMPT program can not find a gateway until I switch it off
> and back on again.

Until the new PROMs are available, I recommend always power
cycling the gateway for a second or so before loading.  Otherwise the
ethernet interrupt enable latch doesn't get cleared.

> 
> Questions:
> 
> 1) if the gateway is running, and I switch it off and back on, should
> it restart again all by itself?   I assume it would just reload its
> routing tables via ATALKAD running on our PYRAMID.

Only if the program text/data segments checksum correctly.  If the
crash has somehow corrupted the program, then you will have to reload
with Prompt.
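
As a rough illustration of the restart check described above, here is a
minimal sketch.  It is not the actual Kinetics PROM logic: the 16-bit
additive checksum and the names `checksum16` and `can_autorestart` are
assumptions made only for this example, and the segment bytes are made up.

```python
def checksum16(segment: bytes) -> int:
    """Hypothetical 16-bit additive checksum over a memory segment."""
    return sum(segment) & 0xFFFF


def can_autorestart(text: bytes, data: bytes,
                    stored_text_sum: int, stored_data_sum: int) -> bool:
    """Return True only if both segments still match the sums recorded at load time."""
    return (checksum16(text) == stored_text_sum
            and checksum16(data) == stored_data_sum)


# Record the sums when the image is loaded (made-up segment contents).
text_seg = bytes([0x4E, 0x71, 0x4E, 0x75])
data_seg = bytes([0x00, 0x01, 0x02, 0x03])
text_sum = checksum16(text_seg)
data_sum = checksum16(data_seg)

# An intact image passes the check and may restart by itself.
intact = can_autorestart(text_seg, data_seg, text_sum, data_sum)

# A single corrupted byte fails the check, so a reload would be required.
corrupted = bytes([0x4E, 0x71, 0x4E, 0x74])
damaged = can_autorestart(corrupted, data_seg, text_sum, data_sum)
```

A real checksum would likely be stronger than a plain sum; the point is
only that the restart decision is gated on the stored image still matching.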

> 
> 2) If it should but does not, is there any way I can diagnose whether
> the problem is in hardware (the Kbox) or software?

Try the suggestions listed above.

> 
> 3) Has anyone else experienced this problem?  How stable is this
> software?  Should I expect it to run bug free for weeks or is this
> typical performance?
> 

It's not typical.  I think I would have heard from users if they were
having experiences like this.  But again, you might have some unusual
packets in your environment that are triggering this.

> 4) We are running MacServe 2.2 on this network. Could it be
> interacting?  Has anyone else put a gateway on a MacServe Appletalk
> network?  Are there known incompatibilities?

Not that we have seen.

> 
> 5) MacTelnet seems to take an awfully long time to realize that the
> gateway is down.  It will often sit for 3-5 minutes before complaining
> that it cannot get an IP address from the gateway.  Is there any way
> to shorten this wait?
> 

Charlie Kim or Mark Sherman might be able to shorten this timeout
without too much trouble.

> Thank you for your help.
> 
> Seymour Joseph
> Systems programmer/Microcomputers
> 
> P.S.  Is the terminal emulation in MacTelnet being worked on?  Its
> vt100 is really marginal.  ...
> 

> Date: Wed, 25 Mar 87 18:40:10 est
> From: dk1z#@andrew.cmu.edu (David Kovar)
> To: info-applebus@C.CS.CMU.EDU, JOSEPH@RED.RUTGERS.EDU
> Subject: Re: Kbox/Kip problems
> Cc: croft@RUSSELL.STANFORD.EDU
> In-Reply-To: <12289303332.51.JOSEPH@RED.RUTGERS.EDU>
> 
> I'm sure that I'll be corrected on a few points, but here are my
> observations.
> 
> 1) If you power off the gateway and bring it back up it should come up on its
> own. As I am paranoid, I tend to reload it by hand anyhow.

If it comes up by itself then you can be assured that everything is fine
since the program checksum succeeded.

> 
> 2) No informed answer.
> 
> 3) I've had no stability problems and I've pounded on it pretty hard at
> times. From my experience, you should be able to count on it for weeks. 
> 
> 4) We do not have MacServe. We have had problems with Hayes InterBridges,
> though.

I would like to hear of any other Interbridge problems.  Brad Parker of
General Computer seems to be using it with his Hayes boxes ok.  Have you
verified that the latest release still has Hayes problems?

> 
> 5) The timeout in MacTelnet (CMU/Columbia) is quite long. ...
> 
> 6) MacTelnet will probably not be getting many improvements. ...
> 
> 7) Prompt frequently loses the Kinetics box. When I use Reset, the Kinetics
> box will reset, but Prompt will not find it. Power cycling always works. The
> Prompt screen is also totally useless and the configuration file drives me
> nuts. Having to encode the name and load file in hex is insane. 

Probably Kinetics should do a couple things to Prompt:  (1) disable or
blank certain dialog boxes when a config file is read.  After all, when
you are reading from a config file, you are setting up all this stuff in
advance, so it doesn't require manual dialog editing.  The config files
allow operations staff, unfamiliar with appletalk terminology, to load
and start a gateway with minimal fuss.

(2) You DON'T have to encode the gateway/file name in hex.  These are only
carryovers from the original config file scheme that Kinetics used.
It would be nice if the config reader in Prompt would at least understand
dot format IP addresses.  I thought this change was going to go into
the next release.
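
To make the dot-format suggestion concrete, here is a small sketch of a
field parser that accepts either notation.  It is not Prompt's actual
config reader: the function name `parse_ip_field`, the assumption that the
legacy form is exactly eight hex digits, and the sample address are all
hypothetical.

```python
def parse_ip_field(field: str) -> int:
    """Parse a config field as a 32-bit IP address.

    Accepts dotted decimal ("36.8.0.47") or, as an assumed legacy form,
    eight hex digits ("2408002F").
    """
    field = field.strip()
    if "." in field:
        parts = field.split(".")
        if len(parts) != 4:
            raise ValueError(f"expected four octets: {field!r}")
        value = 0
        for part in parts:
            octet = int(part, 10)
            if not 0 <= octet <= 255:
                raise ValueError(f"octet out of range: {part!r}")
            # Shift the running value left one byte and append this octet.
            value = (value << 8) | octet
        return value
    if len(field) != 8:
        raise ValueError(f"expected eight hex digits: {field!r}")
    return int(field, 16)
```

Both `parse_ip_field("36.8.0.47")` and `parse_ip_field("2408002F")` yield
the same 32-bit value, so existing hex config files would keep working.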

> If Kinetics
> would release the source, I'd be happy to upgrade it. If I had the time, I'd
> just write it from scratch, but I'm not that nuts, yet. Once I start handing
> boxes out to local administrators on campus I'm certain that I'm going to
> wish that I had rewritten it.

For large sites, the answer is probably going to be that 2nd PROM socket
inside the kbox.  This would allow the boxes to boot themselves over the
ethernet.  However this still leaves the problem of where to put the
Prompt-type configuration information (what's my ether address, IP address,
etc.)  Some type of unique serial number / ethernet address must be
present for schemes such as BOOTP to work.  I also surely wish Kinetics
had put a small EEPROM onto the board, as I originally suggested with
Seagate.  This is the place for Prompt-type config info...
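
The reason a unique address matters can be sketched as a server-side
lookup: the boot server can only hand back the right configuration if the
box identifies itself by something stable.  This is an illustration of the
idea only, not the BOOTP packet format; the table contents, the boot file
name, and the names `GatewayConfig`, `normalize_mac` and `lookup_config`
are all made up for the example.

```python
from typing import NamedTuple, Optional


class GatewayConfig(NamedTuple):
    ip_address: str
    boot_file: str


# Hypothetical server-side table, keyed by the box's Ethernet address.
BOOT_TABLE = {
    "02:00:00:00:12:34": GatewayConfig("192.0.2.10", "gw.boot"),
}


def normalize_mac(mac: str) -> str:
    """Lower-case and zero-pad so "2:0:0:0:12:34" matches the table key."""
    return ":".join(f"{int(octet, 16):02x}" for octet in mac.split(":"))


def lookup_config(mac: str) -> Optional[GatewayConfig]:
    """Return the stored config for this hardware address, or None if unknown."""
    return BOOT_TABLE.get(normalize_mac(mac))
```

A box whose address is not in the table gets `None` back, which is the
case where someone would still have to configure it by hand.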

brad@gcc-milo.UUCP.UUCP (03/26/87)

croft@russell.stanford.EDU (Bill Croft):
> I would like to hear of any other Interbridge problems.  Brad Parker of
> General Computer seems to be using it with his Hayes boxes ok.  Have you
> verified that the latest release still has Hayes problems?

Er, uh, huh? My ears are burning!

We have a net with two appletalk bridges and one kbox gateway. I think
we're still running the 10/86 code with some minor updates (It's been
a long time since I reloaded it - over a month) and I use it many times
a day to upload cross-compiled applications from our vax.

I run *very* heavy duty file system exercisers on top of our remote
file system (HyperNet 2.0 - coming to a dealer near you) which literally
flood the network with packets. This will always (yes, always) cause the
hayes bridge to lock up but the kbox code never falls down (thanks Bill).

I suspect that the hayes code has some small "windows" which can cause it
to hang up (you remember - atomicity!). These windows are small enough not
to be exercised in normal traffic. If, however, you send as many packets 
as you can stuff on the network for 12 hours, you will hit the window.
(Having written lots of background appletalk code and having written
code for the kbox, I've run into this before)

The symptom is that the bridge appears to be alive (responds to NBP) but
refuses to forward packets (wow - I'd love to fire up an emulator to 
see what's going on. hi ho.)

Anyway, for what it's worth, the kbox code appears to be solid. The
hayes code, however, needs an update...

-brad