[comp.protocols.tcp-ip.ibmpc] PCROUTE?

HOISVE@xanadu.cc.utah.edu (David Hoisve) (05/27/91)

This is probably on the top five list of FAQs, but here goes....

Where can I get a copy of PCROUTE?  I've seen the references, and it looks
very interesting.

One other question for folks using KA9Q as a router --

We have recently encountered router hangs under very heavy load using the
latest releases (4/20 and 5/23) of KA9Q.  The router is a Tandy 386 IS/16
routing between an NE2000 and an IPX encapsulated IP driver (IPXPKT) which
provides IP services to our campus Novell IPX internetwork.  It also appears
that the newer versions are up to 20% slower (FTP performance) than the '89
release we have been using.  The '89 release is rock solid.

Has anyone else encountered this problem?  

Thanks...

-- Dave

========================================
David Hoisve
University of Utah Computer Center
(801) 581-6025

NSFNet:   HOISVE@XANADU.CC.UTAH.EDU
or...     HOISVE@CC.UTAH.EDU
BitNet:   HOISVE@UTAHCCA.BITNET
========================================

ccml@hippo.ru.ac.za (Mike Lawrie) (05/28/91)

In <9105262246.AA00843@fcom.cc.utah.edu> HOISVE@xanadu.cc.utah.edu (David Hoisve) writes:

>Where can I get a copy of PCROUTE?  I've seen the references, and it looks
>very interesting.

Don't get too excited, read on....


>We have recently encountered router hangs under very heavy load using the
>latest releases (4/20 and 5/23) of KA9Q. 

				... because we have PCRoute on about
six trunks at 9600 and 19200. While investigating problems of lost
routes, it was discovered that PCRoute would go off into the woods
when two simultaneous transfers were in progress from each end
of a link. The only way to recover was to reset the PCs. Given
that they are 600 miles apart, this is something of a nuisance.

We don't know the cause, and are in the throes of more testing.
Anyone else with similar heartrending stories on PCRoute? Or better
still, fixes?

Mike
--
Mike Lawrie
Director Computing Services, Rhodes University, South Africa
.....................<ccml@hippo.ru.ac.za>..........................
Rhodes University condemns racism and racial segregation 

mah@wu-wien.ac.at (Michael Haberler) (05/28/91)

|> 				... because we have PCRoute on about
|> six trunks at 9600 and 19200. While investigating problems of lost
|> routes, it was discovered that PCRoute would go off into the woods
|> when two simultaneous transfers were in progress from each end

There was a bug in the SLIP code which caused PCroute to hang under high
traffic. It's been at least a year since this was fixed by the author
of the SLIP code, David Johnson <dave@tacky.cs.olemiss.edu>.

I had a problem similar to the one you describe, and after applying the
fix it went away. No outages in over a year.


I include Dave's message below, but I suggest you instead get a fresh
copy of PCroute from tacky.cs.olemiss.edu with ftp.

- michael
----------------------------------

Subject: Re: SLIP on pcroute
In-Reply-To: Your message of "Sat, 31 Mar 90 19:01:53 EST."
             <9003311807.AA22631@tacky.cs.olemiss.edu>
Date: Sat, 31 Mar 90 17:17:39 -0600
From: David E. Johnson <dave@tacky.cs.olemiss.edu>
 
 
Michael,
 
        For the past several months our router has failed consistently
every few days.  The symptoms are the same as you reported.  The ethernet
side still worked but the SLIP side seemed to forget the Xmit interrupts.
Since this was our connection to the Internet and also connected a PC
(terminal) lab via network to our campus IBM, this problem was significant.
 
        After trying every possible solution, we decided to order some
16550's assuming that the 8250 was at fault.  However, we have not
received these yet.  So I decided to have the router report its status
via syslog every so often so that maybe we could track the problem
down.  (A master's student here has not completed the SNMP
implementation, which would have helped tremendously.)  While going
through the code to place "counters" at strategic locations, I FOUND
THE PROBLEM.  Not in hardware, but in software.
 
        In the routine SLIP_DL_IP_W_ACCESS interrupts are turned off
to avoid a possible race condition with the interrupt handler.  The
problem is that the macro invoked between turning ints off and on can
jump out of this pair.  When the ethernet side overloads the SLIP side
this routine jumps automatically to "no_buffer", thus leaving
interrupts OFF.
 
        I have removed the cli and sti from this routine and the
router has not failed once (about 7 days).  If a race condition still
exists, it hasn't caused a problem thus far.  The cli and sti could be
left and a sti placed in the BUFF_CHECK routine just before jumping to
"no_buffer", but like I said, we have had no problem thus far.
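 
        [Illustration, not part of the PCroute source.]  The failure
described above can be sketched in C.  PCroute itself is assembled, so
this is only a model under stated assumptions: the interrupt flag is
simulated with a variable, and every name here (sim_cli, sim_sti,
tx_queue, QUEUE_SLOTS, write_access_buggy, write_access_fixed) is
hypothetical.  The buggy version leaves through the "no buffer" path
without re-enabling interrupts; the second version shows the
alternative of adding an sti on that path.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Simulated CPU interrupt flag (an assumption; real code uses cli/sti). */
bool interrupts_enabled = true;

void sim_cli(void) { interrupts_enabled = false; }  /* stands in for cli */
void sim_sti(void) { interrupts_enabled = true; }   /* stands in for sti */

#define QUEUE_SLOTS 4          /* hypothetical transmit queue capacity */

struct tx_queue {
    int used;                  /* slots currently occupied */
};

/* Buggy shape: the early return on a full queue skips sim_sti(),
 * so interrupts stay off after an overload. */
int write_access_buggy(struct tx_queue *q)
{
    assert(q != NULL);
    sim_cli();
    if (q->used >= QUEUE_SLOTS)
        return -1;             /* "no_buffer" exit: interrupts left OFF */
    q->used++;
    sim_sti();
    return 0;
}

/* One possible fix: re-enable interrupts on the early exit as well. */
int write_access_fixed(struct tx_queue *q)
{
    assert(q != NULL);
    sim_cli();
    if (q->used >= QUEUE_SLOTS) {
        sim_sti();             /* restore before taking the "no_buffer" exit */
        return -1;
    }
    q->used++;
    sim_sti();
    return 0;
}
```

        Calling write_access_buggy on a full queue returns -1 with
interrupts_enabled still false, which models the SLIP side no longer
seeing its transmit interrupts.  write_access_fixed returns -1 with the
flag restored.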
 
        An update will be available soon, but since you have
re-assembled anyway, you can just take these out.  I would like to
know if this solves your problem.  I would also like to see any
changes you have made or may make for the 16550.  Since we have a few
coming, we might as well take advantage of them.
 
 
David E. Johnson               ** Title:        Systems Programmer
Department of Computer Science ** Telephone:    (601) 232-7396
The University of Mississippi  ** Internet:     dave@cs.olemiss.edu
336 Weir Hall                  **
University, MS  38677          **
-- 
Michael Haberler 		mah@wu-wien.ac.at,  mah@awiwuw11.bitnet
University of Economics and Business Administration
A-1090 Vienna, Augasse 2-6	    Biz:    +43 (1) 31336 x4796 Fax: 347-555
Home: +43 (1) 961-679 (voice & fax) D-Netz: +43 (663) 811-056