dyer@spdcc.COM (Steve Dyer) (09/08/90)
This isn't specifically Cisco-related, but I thought that the readers here might be familiar with this kind of scenario. I've just installed a 56kb DDS circuit with DSU/CSUs on each end and a Cisco CGS/2 on my end (there's a larger Cisco on the other end). The line itself came up with only a little difficulty, but I am having regular trouble with the serial line staying up for any length of time without the Cisco dropping DTR and resetting the line. The number of carrier transitions reported by "show interface" is almost mindboggling: 10 minutes after powerup, it reports 817 carrier transitions and 8 interface resets. Naturally, this is wreaking havoc with applications, which get ICMP unreachables and close prematurely.

I had the telco run loopback tests on both ends of the line for as long as 15 minutes and they reported that it looked as clean as a whistle, with no data or clocking errors. I'm really at a loss for where to look next. Anyone have any pointers?

--
Steve Dyer
dyer@ursa-major.spdcc.com aka {ima,harvard,rayssd,linus,m2c}!spdcc!dyer
dyer@arktouros.mit.edu, dyer@hstbme.mit.edu
dyer@spdcc.COM (Steve Dyer) (09/09/90)
In article <3951@ursa-major.SPDCC.COM> dyer@ursa-major.spdcc.COM (Steve Dyer) writes:
>[problems of hundreds of carrier transitions and serial line resets]

The problem has disappeared, and the superficial solution was simple (if that was in fact the source of the problem). I'd specified the high-speed serial interface on my CGS/2 even though I was planning to run at 56kb, so that I could upgrade simply to fractional T1 speeds if my downstream link ever upgraded also. During the initial configuration of the serial interface, we neglected to specify the line bandwidth as 56kb (since that was what had been assumed for other CGS/2 models, and was the default for the Cisco on the other end of the DDS line). Apparently, though, the default for the high-speed serial interface is T1 rate. Once I reset this to 56kb, things settled down within minutes, and I've had only a few carrier transitions and no interface resets since.

I suppose that the "bandwidth" field is used for the initial values of the HDLC timers, and having such a mismatch between the two sides (not to mention the actual bandwidth) could cause problems, no?

By the way, I really have to say that I'm impressed by how easy it is to install one of these gateways and have it all work; it's practically a "turnkey" operation. Things have come a long way since the original LSI-11 gateways when I was at BBN... :-)

--
Steve Dyer
dyer@ursa-major.spdcc.com aka {ima,harvard,rayssd,linus,m2c}!spdcc!dyer
dyer@arktouros.mit.edu, dyer@hstbme.mit.edu
pte900@jatz.aarnet.edu.au (Peter Elford) (09/10/90)
In article <3961@ursa-major.SPDCC.COM>, dyer@spdcc.COM (Steve Dyer) writes:
|> In article <3951@ursa-major.SPDCC.COM> dyer@ursa-major.spdcc.COM (Steve Dyer) writes:
|> >[problems of hundreds of carrier transitions and serial line resets]
|>
|> The problem has disappeared, and the superficial solution was simple
|> (if that was in fact the source of the problem.) I'd specified the
|> high-speed serial interface on my CGS/2 even though I was planning to
|> run at 56kb so that I could upgrade simply to fractional T1 speeds if my
|> downstream link ever upgraded also. During the initial configuration
|> of the serial interface, we neglected to specify the line bandwidth as 56kb
|> (since that was what had been assumed for other CGS/2 models, and was
|> the default for the Cisco on the other end of the DDS line. Apparently
|> though, the default for the high-speed serial interface is T1 rate.
|> Once I reset this to 56kb, things settled down within minutes, and
|> I've had only a few carrier transitions and no interface resets since.
|>
|> I suppose that the "bandwidth" field is used for the initial values
|> of the HDLC timers, and having such a mismatch between the two sides
|> (not to mention the actual bandwidth) could cause problems, no?

I understood that the bandwidth sub-command "sets an informational parameter only; you cannot adjust the actual bandwidth of an interface with this command" (Gateway Server Manual p. 4-22). If this is not the case, then I would like to know about it, because on some of our 48K DDS services we see similar very high transition and reset counts.

Peter Elford,                          e-mail: P.Elford@aarnet.edu.au
Network Co-ordinator,                  phone:  +61 6 249 3542
Australian Academic Research Network,  fax:    +61 6 247 3425
c/o, Computer Services Centre,         post:   PO Box 4
Australian National University                 Canberra 2601
Canberra, AUSTRALIA
dyer@spdcc.COM (Steve Dyer) (09/10/90)
In article <25924@boulder.Colorado.EDU> pte900@jatz.aarnet.edu.au (Peter Elford) writes:
>I understood that the bandwidth sub-command "sets an informational parameter
>only; you cannot adjust the actual bandwidth of an interface with this command"
>(Gateway Server Manual p. 4-22). If this is not the case, then I would like
>to know about it, because on some of our 48K DDS services we see similar
>very high transition and reset counts,

I wasn't suggesting that it adjusted the actual bandwidth. I was hypothesizing that the "informational parameter" might have been used for HDLC timers and that a severe mismatch might cause the line to go down or behave erratically. I've since been told by folks at Cisco that the bandwidth parameter is only used within IGRP for routing information.

In any event, the "fix" appears to have been a coincidence. Having moved the equipment to its permanent resting place, the carrier transitions and interface resets (after a stable, peaceful weekend) have returned with a vengeance, even with the parameter set to 56kb. I think I'm in the nether world of flaky cables/modems, and will pursue it on that level.

--
Steve Dyer
dyer@ursa-major.spdcc.com aka {ima,harvard,rayssd,linus,m2c}!spdcc!dyer
dyer@arktouros.mit.edu, dyer@hstbme.mit.edu
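To illustrate why a "bandwidth" value that feeds only IGRP would change route selection but not the behaviour of the line itself, here is a minimal sketch of the IGRP composite metric as it is commonly documented for the default constants (bandwidth and delay terms only). The formula, the default constants, and the 20,000 microsecond delay figure are assumptions brought in for illustration; none of them comes from the posts above.

```python
# Sketch of the IGRP composite metric with default constants, as commonly
# documented: metric = 10^7 / bandwidth_kbps + delay_usec / 10.
# ASSUMPTION: the formula and defaults are not stated in the thread.


def igrp_metric(bandwidth_kbps: int, delay_usec: int) -> int:
    """Return the bandwidth term plus the delay term for a one-hop path."""
    bandwidth_term = 10_000_000 // bandwidth_kbps  # inverse bandwidth, scaled
    delay_term = delay_usec // 10                  # delay in tens of microseconds
    return bandwidth_term + delay_term


# Hypothetical interface delay, chosen only for illustration.
SERIAL_DELAY_USEC = 20_000

metric_t1 = igrp_metric(1544, SERIAL_DELAY_USEC)  # interface left at T1 rate
metric_56k = igrp_metric(56, SERIAL_DELAY_USEC)   # interface set to 56 kbit/s

print("metric with bandwidth 1544:", metric_t1)
print("metric with bandwidth 56:  ", metric_56k)
```

Under these assumptions, leaving the parameter at T1 rate makes the 56kb link look far more attractive to the routing protocol than it really is, which is a routing-quality problem rather than a cause of carrier transitions.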
kannan@osc.edu (Kannan Varadhan) (09/10/90)
Thus spake dyer@ursa-major.spdcc.COM (Steve Dyer)
>In any event, the "fix" appears to have been a coincidence. Having moved the
>equipment to its permanent resting place, the carrier transitions and
>interface resets (after a stable, peaceful weekend) have returned with a
>vengeance, even with the parameter set to 56kb. I think I'm in the
>nether world of flaky cables/modems, and will pursue it on that level.

We have a 9.6kb line on which we see a similar occurrence. We hypothesized that what's happening is that the line is saturating with data, and then some, causing even the router's memory to fill up. Thereafter, once the routers miss 3 HDLC keepalive patterns, they reset the line etc. etc. etc.

The salesperson's suggestion was that we turn off the keepalives on that circuit, so that such interface resets did not happen. You might try that. We did not try it, because we were actually working on a different problem, and thought this was the cause of it, when it wasn't. Having fixed the other problem, we curled into our own cute li'l cubby holes, and went back to hibernation yet again :-).

Check the throughput and errors on the line, and see if the resets occur when there is a large volume of data flowing through it. You could recreate this by, say, flood-pinging the line with ultra-large packets, and watching the status of the line. Track traffic patterns for a while, and see if there is any correlation between these and interface resets.

Best I can think of in a pinch...hope these help,

Kannan
--
Kannan Varadhan, Internet Engineer, OARNet
Ohio Supercomputer Center, Columbus, OH 43212 +1 (614) 292-4137
email: kannan@oar.net | osu-cis!malgudi.oar.net!kannan
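The saturation hypothesis above can be checked with rough arithmetic: how long would a keepalive wait if it were queued behind a backlog of large packets on a 9.6 kbit/s line? The sketch below is only a back-of-the-envelope model. The 10-second keepalive interval, the 1500-byte packet size, and the 40-packet backlog are assumptions chosen for illustration; only the line speed and the "3 missed keepalives" threshold come from the post.

```python
# Rough model of a keepalive stuck behind queued data on a slow serial line.
# ASSUMPTIONS (not from the thread): keepalive interval, packet size, and
# queue depth. From the post: 9.6 kbit/s line, reset after 3 missed keepalives.

KEEPALIVE_INTERVAL_S = 10   # assumed keepalive period
MISSED_BEFORE_RESET = 3     # threshold mentioned in the post
LINE_BPS = 9600             # line speed from the post
PACKET_BYTES = 1500         # hypothetical large packet
QUEUED_PACKETS = 40         # hypothetical output-queue backlog


def serialization_seconds(packet_bytes: int, line_bps: int) -> float:
    """Time needed to clock one packet onto the line."""
    return packet_bytes * 8 / line_bps


def keepalive_wait_seconds(queued: int, packet_bytes: int, line_bps: int) -> float:
    """Time a keepalive waits if it sits behind `queued` packets."""
    return queued * serialization_seconds(packet_bytes, line_bps)


wait_s = keepalive_wait_seconds(QUEUED_PACKETS, PACKET_BYTES, LINE_BPS)
intervals_missed = int(wait_s // KEEPALIVE_INTERVAL_S)
would_reset = intervals_missed >= MISSED_BEFORE_RESET

print(f"keepalive delayed by {wait_s:.1f} s "
      f"({intervals_missed} intervals), reset likely: {would_reset}")
```

With these figures a single full queue is enough to starve the far end of keepalives for longer than the reset threshold, which is consistent with the advice to watch whether resets line up with heavy traffic.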
fortinp@bcars223.bnr.ca (Pierre Fortin) (09/11/90)
In article <3970@ursa-major.SPDCC.COM>, dyer@spdcc.COM (Steve Dyer) writes:
>
> In any event, the "fix" appears to have been a coincidence. Having moved the
> equipment to its permanent resting place, the carrier transitions and
> interface resets (after a stable, peaceful weekend) have returned with a
> vengeance, even with the parameter set to 56kb. I think I'm in the
> nether world of flaky cables/modems, and will pursue it on that level.

Here's a short list of the V.35 problems I've encountered over the last 17 months:

- V.35 applique Rev 3: inverted clocks; mods applied to some boards

- V.35 applique Rev 4: inverted clocks; mods for Rev 3 boards applied inadvertently to these boards (I have personally seen at least three different attempts at clearing up this problem). We even had one unit in which the RxD line was leaking back out over one of the clock leads, giving the appearance of a bad DCE.

- DL551V T1 CSU/DSU: most units bad (power supply drifting, incorrect factory options, bad repairs, poor QC, etc.). All units were to be returned to Digital Link for checkout; don't know current status of this.

- Screws (Yes, SCREWS!): WHY is it that something as simple as a screw (actually screw threads) can cause problems; I guess we need patience testers... V.35 connectors are available with EITHER single- or double-helix retaining screws! Go figure...

- V.35 cables: Nearly all cables we tested were of poor quality. We designed our own cable (verified to 70 feet) where each pair is individually shielded (with the source end only grounded); then an overall shield grounded at both ends via a six-inch pig-tail to a spade-lug. Major improvements!!!

- MCI HDLC controller: The Rockwell HDLC controller chips with a date-code prior to a certain date (contact cisco for date) had bugs when the applied voltage was less than EXACTLY 5.00V. In some literature I received from cisco, there was a small piece of pink paper which said that this should only affect short X.25 packets with odd packet lengths; but, where there's smoke...

- V.35 applique Rev 6: This applique is correct. We are continuing to replace all pre-Rev-6 appliques with these newer ones.

- Operations: Our operations personnel used to make statements like: "The CSU/DSUs never work in loopback"; this has since been corrected. Any comm gear which does not work in loopback when the manufacturer claims it does should be highly suspect. In another instance, two links from different remotes (should have been [A]---[B0&B1]---[C]) were connected incorrectly ([A]---[B1&B0]--[C]). With our fully redundant mesh topology, the ciscos never complained, but you should have seen the highly inefficient routing, BUT IT WORKED from the users' perspective (a tip of the hat to cisco!!).

- "Nah!" ;^) Someone I talked to in another company which shall forever remain nameless managed to find a cable and screw it in. Upon closer inspection, the cable was found to have a female connector. If you fail to spot the humor in this, check the sex of the V.35 appliques on your cisco...

These are the major points I can think of off the top of my head at this time (it's 02:45). The general thrust of my message here is that, for what should be a mature interface, expect problems ANYWHERE. Likely you will find MORE than one problem. Perhaps this has to do with the fact (someone please prove me wrong) that there is no V.35 standard beyond the original which only covers the approx. 40KB speed.

Other things to note about V.35:

- the DCE supplies both clocks SCR and SCT
- the DTE *returns* the Tx clock (SCTE) to the DCE to account for varying cable lengths at higher speeds
- double-check cable polarities on ALL pairs

Otherwise, V.35 was a piece of cake.... in the face!

...and now we're getting ready (hah!!) for the onslaught of T3 equipment...

Huh? Why am I standing on this chair with a V.35 cable around my neck?

>
> --
> Steve Dyer
> dyer@ursa-major.spdcc.com aka {ima,harvard,rayssd,linus,m2c}!spdcc!dyer
> dyer@arktouros.mit.edu, dyer@hstbme.mit.edu

Pierre Fortin
fortinp@bnr.ca

P.S.: If enough people send in their $35 (easy to remember ;^) ), we will seriously consider publishing "Living with V.35" complete with over 400 pages of tests, results, multi-channel scope traces, levels, power supply voltages, pictures of modifications, plus much more... :^) :^) P^) ;^)
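The remark in the post above that the DTE returns the transmit clock (SCTE) "to account for varying cable lengths at higher speeds" can be put in rough numbers. Without a returned clock, the DCE's transmit clock travels down the cable, the DTE clocks data out on it, and the data travels back, so the data arrives late by about one cable round trip relative to the DCE's own clock. The sketch below compares that round trip with the bit period at two line rates. The 1.5 ns/ft propagation figure is an assumed typical value, and driver and receiver delays are ignored; only the 70-foot cable length comes from the post.

```python
# Compare cable round-trip delay with the bit period at two line rates.
# ASSUMPTION: ~1.5 ns of propagation delay per foot of cable (typical for
# twisted pair); driver/receiver delays are ignored in this sketch.

NS_PER_FOOT = 1.5  # assumed propagation delay


def bit_period_ns(rate_bps: float) -> float:
    """Duration of one bit in nanoseconds."""
    return 1e9 / rate_bps


def round_trip_ns(cable_feet: float) -> float:
    """Clock out to the DTE plus data back to the DCE."""
    return 2 * cable_feet * NS_PER_FOOT


def skew_fraction(cable_feet: float, rate_bps: float) -> float:
    """Round-trip skew expressed as a fraction of one bit period."""
    return round_trip_ns(cable_feet) / bit_period_ns(rate_bps)


CABLE_FEET = 70  # the cable length mentioned in the post

for label, rate in (("56 kbit/s", 56_000), ("T1 1.544 Mbit/s", 1_544_000)):
    print(f"{label}: skew is {skew_fraction(CABLE_FEET, rate):.1%} of a bit")
```

Under these assumptions the skew is about one percent of a bit at 56 kbit/s but roughly a third of a bit at T1 rate, which suggests why the returned clock matters mainly at the higher speeds.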
fortinp@bcars223.bnr.ca (Pierre Fortin) (09/11/90)
In article <890@manhandler.osc.edu>, kannan@osc.edu (Kannan Varadhan) writes:
> We hypothesized that what's happening is that the line is saturating
> with data, and then some, causing even the router's memory to fill up.
> Thereafter, once the routers miss 3 HDLC keepalive patterns, they reset
> the line etc. etc. etc.

This reminds me: when using the older CSC-T cards, always "shutdown" any unused interfaces. I don't recall whether the cisco crashed, or the in-service links had problems, but issuing the shutdown command on all unused links cleared up our problems (this was about a year ago). I don't recall any similar problems with the cisco-designed cards (MCI & SCI), but I wasn't about to take a chance, so the rule here is "shutdown".

>
> Best I can think of in a pinch...hope these help,

See my other posting re "V.35 problems"...

>
> Kannan
> --
> Kannan Varadhan, Internet Engineer, OARNet
> Ohio Supercomputer Center, Columbus, OH 43212 +1 (614) 292-4137
> email: kannan@oar.net | osu-cis!malgudi.oar.net!kannan

Pierre Fortin
fortinp@bnr.ca