pane@cat.cmu.edu (John Pane) (07/08/88)
In article <736@cunixc.columbia.edu> cck@cunixc.columbia.edu (Charlie C. Kim) writes:
>In article <3422@ut-emx.UUCP> boerner@ut-emx.UUCP (Brendan B. Boerner) writes:
>>
>>Has anyone out there received the following message:
>>	ae0: overflow NIC reset failed
>>	ae6_intr: Receive overflow warning.
>>....
>
>Yes, I've been waiting to see if anyone else had this problem.  This
>happens every time I leave my mac booted for any period of time.  I
>believe it happens as a result of many closely spaced packets causing the
>board to go into a bad hardware state that the driver cannot reset...
>

I started having this problem when our network was re-arranged here,
and it was so bad that I couldn't do any networking.  The new
configuration had placed me on a very busy portion of the network at
CMU.  Part of the problem was tracked down to broadcasts from my A/UX
machine, which hundreds of machines on campus were answering.

Although this doesn't completely solve the problem, here are the steps
I took, which resulted in a big improvement.

1) In /etc/inittab I turned off nfs0 (the release notes tell you to
   turn this on even if you're not running NFS).  I haven't noticed
   any loss of functionality after turning it off.

2) I created a file /etc/resolv.conf, listing three domain name
   servers, so my machine doesn't broadcast domain name resolution
   requests.  See the manual entry for resolver(4).

3) I changed my broadcast address from 128.2.0.0 to 128.2.255.255
   (most of the machines in the CS department here are still using
   128.2.0.0, but the plan is to move to 128.2.255.255).  This is a
   temporary fix, relying on the fact that fewer machines are
   currently responding to broadcasts on the new address.

So now my machine does less broadcasting and, because of the change
of broadcast address, receives fewer replies when it does broadcast.
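
For anyone unsure how the two broadcast addresses in step 3 relate:
they are the same network with the host bits set to all zeros versus
all ones.  Below is a minimal Python sketch of that arithmetic.  It
assumes the 128.2 network is treated as a single 16-bit-prefix network
with no subnetting, which the addresses above suggest but which is not
stated outright.  The function name broadcast_forms is made up for
illustration.

```python
import ipaddress


def broadcast_forms(network_cidr):
    """Return (all-zeros form, all-ones form) of a network's broadcast address.

    Assumption: the caller supplies the prefix length; "/16" for 128.2
    is a guess based on the two addresses mentioned in the post.
    """
    net = ipaddress.ip_network(network_cidr)
    zeros_form = net.network_address    # host bits all 0
    ones_form = net.broadcast_address   # host bits all 1
    return str(zeros_form), str(ones_form)


if __name__ == "__main__":
    old, new = broadcast_forms("128.2.0.0/16")
    print("all-zeros broadcast:", old)
    print("all-ones broadcast: ", new)
```

A host configured for only one of these forms may not recognize a
packet sent to the other as a broadcast, which is consistent with the
behavior described in this post.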

The only remaining problem, which happens much less frequently, is
that 100+ other machines on the network don't know about the 255.255
broadcast address, and when they receive such a broadcast (from my
machine or others) they respond by ARPing.  This flood of ARPs still
causes my networking to go down.

The fact remains that the hardware/low-level software should be able
to handle this level of traffic.  Does anybody know if the
acknowledged "defect" in the EtherTalk boards could manifest itself in
this way?

John Pane
Department of Computer Science
Carnegie Mellon University
(412)268-5884
pane@cs.cmu.edu