djc@duke.cs.duke.edu (David J. Cherveny) (12/08/90)
In the last couple of months, we have started using CUTCP, packet drivers and the BYU Novell-packet driver interface to do TCP/IP Telnet from our Novell workstations. It all works fine... most of the time. Intermittently, a workstation will lose touch with the server and give the good ole IGNORE, RETRY, ABORT message. Retrying does not help.

I've rebuilt the ni5210 driver with the DAN code turned off, as it is known to cause problems with Novell nets. I've heard the "free" version 2.1 BYU code can cause some hangs, but I'm unwilling to pay for the 3.0 version just yet.

Until recently, we thought it was occurring only on workstations using the packet driver setup. However, recently it has occurred on NORMALLY CONFIGURED workstations as well. It seems to happen in rashes: for an hour all hell will break loose, then it will be quiet for days, then it will happen again. It does NOT happen to the STARLAN connected workstations. We've used NETWATCH during these spells but haven't noticed anything unusual. We are connected to a campus "backbone" network with a MAC level bridge.

It is starting to sound like a HW problem to me since unmodified workstations are involved. Does anyone have any suggestions on how to proceed?

HW and SW Details:

  DOS 4.0 on AT&T PCs
  Micom/Interlan NI5210 cards in workstations
  Novell 286 Netware SFT v2.15 rev c
  NI5210 version 7 packet driver with DAN code OFF
  BYU Novell-Packet driver interface V 2.1
  CUTCP 2.2d
  Network HW is Synoptics Lattice Net UTP from a 110 box.

David Cherveny
Duke University Medical Center
djc@hodgkin.mc.duke.edu
(919)684-6804
trier@cwlim.INS.CWRU.Edu (Stephen C. Trier) (12/08/90)
In article <660597854@lear.cs.duke.edu> djc@duke.cs.duke.edu (David J. Cherveny) writes:
>Intermittently, a workstation will lose touch with the server and give
>the good ole IGNORE, RETRY, ABORT message. Retrying does not help.

I know that bug!!! We're having exactly the same problems here. They started happening in August, about the time we started widespread use of the packet drivers.

I spent a week watching our network analyzer for traces of what was going on and found nothing. We've been swapping cards, changing cables for the servers, and everything else we could think of.

The problem seems to affect only computers that haven't sent network packets for ten minutes or so, which leads me to suspect that the server keep-alive packets are somehow getting dropped.

Hardware used: Just about anything that runs MS-DOS 3.0 or higher. Many machines are AT&T 6386's, but we have bunches of Zeniths and PS/2's, too. The file servers are all AT&T 6386's. Ethernet cards are mostly AT&T Starlan-10 Fiber NAU's, but we've also got bunches of 3c503's, 3c523's, and Cabletron 1020 and 1040 cards. The servers use 3c505's.

Software: CWRU-PC/IP (local PC/IP descendant) and BYU's packet driver IPX, version 2.1. The servers are running Advanced Netware 2.15 rev C, 2.15 rev A, and 2.15 rev 0.

It's nice to know that we aren't imagining this. Does anyone have any ideas where to start looking? The failures are random, which makes watching with the net. analyzer a little difficult.

--
Stephen Trier                      Case Western Reserve University
Work: trier@cwlim.ins.cwru.edu     Information Network Services
Home: sct@seldon.clv.oh.us         %% Any opinions above are my own. %%
nelson@sun.soe.clarkson.edu (Russ Nelson) (12/09/90)
In article <1990Dec8.061321.11400@usenet.ins.cwru.edu> trier@cwlim.INS.CWRU.Edu (Stephen C. Trier) writes:

   In article <660597854@lear.cs.duke.edu> djc@duke.cs.duke.edu (David J. Cherveny) writes:
   >Intermittently, a workstation will lose touch with the server and give
   >the good ole IGNORE, RETRY, ABORT message. Retrying does not help.

   I know that bug!!! We're having exactly the same problems here. They
   started happening in August, about the time we started widespread use
   of the packet drivers.

Kelly McDonald, the author of the BYU Packet driver shell says:

   The problem:
   Idle workstations logged into Novell servers periodically come
   up with an error message stating that they have lost a connection
   to their logged in file server and their connection is no longer
   valid.

   The cause:
   Other stations (besides the idle one) that are running the packet
   driver shell sometimes incorrectly respond to the "watchdog" packet
   sent out to the idle station from the server to see if it is still
   alive. The incorrect response causes the server to close the
   connection to the idle station. When the user of the idle station
   tries to access the server again, the error message is generated.
   (As far as we can tell, this only occurs with Netware 286 or
   earlier servers.)

There would seem to be several solutions:

  o License the 3.0 Packet driver shell from Kelly McDonald. He has
    licensed it back from Atlantix for use by degree-granting American
    universities only. He wants several thousand dollars, which I'm sure
    merely reflects *his* cost from Atlantix.

  o Re-implement the packet driver shell and copyleft it. This requires
    the use of Novell's device driver kit, which costs $7,500. Now, that's
    a heap of money. Perhaps we could convince a manufacturer who's already
    bought it to let someone use theirs. That might be difficult, as Novell
    requires a nondisclosure agreement. Perhaps we should form an ad-hoc
    consortium to write a freely copyable packet driver shell?

  o Wait until Novell writes their ODI-over-packet-driver interface.

  o Switch to another LAN operating system that supports the packet
    drivers, you know, ???????. Hmmm... That would seem to be a problem.
    Perhaps we could convince Artisoft or whomever to include packet
    driver support?

--
--russ (nelson@clutx [.bitnet | .clarkson.edu]) FAX 315-268-7600
It's better to get mugged than to live a life of fear -- Freeman Dyson
I joined the League for Programming Freedom, and I hope you'll join too.
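
The failure mode Kelly McDonald describes above can be illustrated with a toy model. This is only a sketch, not the real NetWare watchdog protocol: the class names, the `simulate` helper, the connection numbering, and the rule that a negative reply makes the server clear the connection are all assumptions made for illustration. The thread itself only says that a bystander station answers a watchdog packet meant for the idle station and that the server then closes the idle station's connection.

```python
class Station:
    """A workstation on the wire (hypothetical model, not real shell code)."""

    def __init__(self, address, buggy=False):
        self.address = address
        self.conn = None      # connection number assigned at login
        self.buggy = buggy    # True: answers probes meant for other stations

    def on_watchdog(self, probe_dest, conn):
        """Return True/False ("is this connection mine?") or None if ignored."""
        if probe_dest == self.address or self.buggy:
            return conn == self.conn
        return None           # a healthy bystander stays silent


class Server:
    """Keeps a table of connection number -> station address."""

    def __init__(self):
        self.connections = {}

    def login(self, station):
        conn = len(self.connections) + 1
        self.connections[conn] = station.address
        station.conn = conn

    def probe(self, conn, stations_on_wire):
        """Send a watchdog probe for one connection and process all replies."""
        dest = self.connections[conn]
        for station in stations_on_wire:
            reply = station.on_watchdog(dest, conn)
            if reply is False:
                # Assumed rule: any negative reply clears the connection.
                self.connections.pop(conn, None)

    def request(self, station):
        """True if the server still honours this station's connection."""
        return self.connections.get(station.conn) == station.address


def simulate(buggy_bystander):
    server = Server()
    idle = Station("A")
    other = Station("B", buggy=buggy_bystander)
    server.login(idle)
    server.login(other)
    server.probe(idle.conn, [idle, other])   # idle station is being checked
    return server.request(idle)              # idle user touches the server again


if __name__ == "__main__":
    print("healthy bystander, idle station still connected:", simulate(False))
    print("buggy bystander, idle station still connected:", simulate(True))
```

In this model the idle station itself answers correctly in both runs; the connection is lost only because of what the second station sends, which matches the observation that the error shows up on machines that did nothing wrong.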
Jan.Engvald@ldc.lu.se (Jan Engvald LDC) (12/09/90)
> >Intermittently, a workstation will lose touch with the server and give
> >the good ole IGNORE, RETRY, ABORT message. Retrying does not help.
>
> I know that bug!!! We're having exactly the same problems here. They
> started happening in August, about the time we started widespread use
> of the packet drivers.
>
>Kelly McDonald, the author of the BYU Packet driver shell says:
>
> The problem:
> Idle workstations logged into Novell servers periodically come
> up with an error message stating that they have lost a connection
> to their logged in file server and their connection is no longer
> valid.
>
> The cause:
> Other stations (besides the idle one) that are running the packet
> driver shell sometimes incorrectly respond to the "watchdog" packet
> sent out to the idle station from the server to see if it is still
> alive. The incorrect response causes the server to close the
> connection to the idle station. When the user of the idle station
> tries to access the server again, the error message is generated.
> (As far as we can tell, this only occurs with Netware 286 or
> earlier servers.)

Does anybody have more details on the proposed cause above? Reading between the lines, I get the impression that the bad station sends a response to the server with a from address that is not its own. Is it the Ethernet address or the IPX address or both?

We have been plagued by this aborted communication ever since June. We have been running packet drivers with the BYU driver for several years, so it is hard to believe that either of those is the cause. Late in May, however, we got the Novell 3.01 rev A shells, and I would guess that they have something to do with the error. Rev D of NETx does not seem to help with this error. I have seen rumors of a rev B of the 3.01 IPX; it might help. Is there any anonymous FTP server with IPX 3.01 rev B?

If a new IPX does not help and the problem really is a wrong from address, it is easy, as a temporary fix, to make a special packet driver version that forces the correct from address on every Novell packet.

Jan Engvald, Lund University Computing Center
________________________________________________________________________
Address: Box 783                E-mail: Jan.Engvald@ldc.lu.se
         S-220 07 LUND          Earn/Bitnet: xjeldc@seldc52
         SWEDEN                 (Span/Hepnet: Sweden::Gemini::xjeldc)
Office: Soelvegatan 18          VAXPSI: psi%2403732202020::xjeldc
Telephone: +46 46 107458        (X.400: C=se; A=TeDe; P=Sunet; O=lu;
Telefax: +46 46 138225           OU=ldc; S=Engvald; G=Jan)
Telex: 33533 LUNIVER S
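
The temporary fix Jan Engvald proposes, forcing the correct from address on outgoing Novell packets, might look roughly like the sketch below. It is written in Python purely to show the byte manipulation; a real packet driver would do this in its send path. Several things here are assumptions rather than facts from the thread: that IPX travels in Ethernet II frames with EtherType 0x8137, that both the Ethernet source address and the IPX source node field should be overwritten (the thread leaves open which of the two is wrong), and the function name `force_source_address` itself.

```python
ETH_HEADER_LEN = 14          # dest MAC (6) + source MAC (6) + EtherType (2)
IPX_ETHERTYPE = 0x8137       # assumed framing: IPX over Ethernet II
IPX_HEADER_LEN = 30
IPX_SRC_NODE_OFFSET = 22     # source node field within the IPX header


def force_source_address(frame: bytes, my_addr: bytes) -> bytes:
    """Return the frame with our own address forced into the source fields.

    Frames that are not IPX (under the framing assumption above), or that
    are too short to hold an IPX header, are returned unchanged.
    """
    if len(my_addr) != 6:
        raise ValueError("station address must be 6 bytes")
    if len(frame) < ETH_HEADER_LEN + IPX_HEADER_LEN:
        return frame
    if int.from_bytes(frame[12:14], "big") != IPX_ETHERTYPE:
        return frame

    fixed = bytearray(frame)
    fixed[6:12] = my_addr                           # Ethernet source address
    start = ETH_HEADER_LEN + IPX_SRC_NODE_OFFSET
    fixed[start:start + 6] = my_addr                # IPX source node
    return bytes(fixed)


if __name__ == "__main__":
    mine = bytes.fromhex("02608c000001")            # made-up station address
    wrong = bytes.fromhex("02608c0000ff")           # made-up bad address
    ipx = bytearray(IPX_HEADER_LEN)
    ipx[IPX_SRC_NODE_OFFSET:IPX_SRC_NODE_OFFSET + 6] = wrong
    frame = b"\xff" * 6 + wrong + IPX_ETHERTYPE.to_bytes(2, "big") + bytes(ipx)
    print(force_source_address(frame, mine).hex())
```

Overwriting both fields sidesteps the open question of which address the misbehaving station gets wrong, at the cost of also touching packets that were already correct.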