kevin@kosman.UUCP (Kevin O'Gorman) (06/24/88)
I have been having bad results lately, and don't quite know what to blame it on. I know there has been a lot of discussion on the unix-pc net lately about reliability of various configurations, but I have not been paying enough attention to know whether there's a consensus.

As an ex-beta test site, I have several versions of the software running around here, but I am up to rev at the moment with 3.51 OS and utilities, and 3.51a kernel. I'm running a pretty recent HDB from THE STORE!

I am distressed because my machine keeps freezing on me. It seems pretty clearly related to uucp traffic. Just about 15 minutes ago, it froze for the first time while I was watching it. It was just at the end of an incoming uucp call (my unix-pc feed, to be exact), and it left something for cunbatch in /lost+found. This means that the freezing is costing me dropped news, which is something I have suspected for some time.

Anyway, in addition to the above, I am running a Telebit Trailblazer Plus on /dev/tty002 to handle all data calls. I would even consider getting a third phone line in here and resurrecting the OBM for some stuff if that would help.

What I'm looking for is some opinions from you folks who have been paying attention. I want to know what configuration is the most reliable. I can back off to 3.51 or even 3.50 kernel, I can play with uucp versions, I can even reactivate the OBM for my 1200 baud traffic if I must (and GTE willing: I seem to be wired for 6 pairs in this house, thank goodness).

But I want some informed opinions; I know you're out there, there's lots of talent running around this net. There ought to be some way to figure out a configuration that won't keep barfing in the middle of uucp traffic.

Thanks,

Kevin O'Gorman ( kevin@kosman ) voice: 805-984-8042
Vital Computer Systems, 5115 Beachcomber, Oxnard, CA 93035
ken@maxepr.UUCP (Ken Brassler) (06/26/88)
In article <422@kosman.UUCP> kevin@kosman.UUCP (Kevin O'Gorman) writes:
>I have been having bad results lately, and don't quite know what to blame
>it on.
>I am distressed because my machine keeps freezing on me. It seems pretty
>clearly related to uucp traffic.
>It was just at the end of an
>incoming uucp call (my unix-pc feed, to be exact), and it left something
>for cunbatch in /lost+found.

My machine has run continuously and flawlessly for 2 1/2 years. A few weeks ago I had my first kernel panic, due to a kernel parity error. I used the reset button to recover. Later that same day, my machine crashed (froze, locked up, etc.) twice while receiving news. The second time, I was using 'rn' at the same time, and after reset, I found that my .newsrc file now contained the text of a news article.

Since both crashes occurred while uucico was running, I surmised that the disk image of uucico had probably been partially overwritten with garbage during the first kernel panic. I reloaded new copies of all the uucp executables (uucico, uux, uuxqt, and uucp) from my archives, and the problem immediately disappeared.

Personally, I think that kernel crashes are due to a DRAM address, where the kernel is loaded, missing a refresh cycle, or being hit by a cosmic ray (not completely a joke). If no damage was done to the files on the hard disk during the crash, you can get by with a reset. If crashes increase in frequency, I think it's time to reformat and reload the hard disk.

--
Ken Brassler {ihnp4|qantel|pyramid|lll-crg}!pacbell!maxepr!ken
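Ken's fix above amounts to replacing any uucp executable that no longer matches a known-good copy. A minimal sketch of that comparison follows, written in Python purely for illustration (nothing like it shipped on the Unix PC). The function name and the two directory arguments are hypothetical, and where the installed and archived copies actually live is an assumption left to the reader.

```python
import filecmp
import os


def differing_executables(installed_dir, archive_dir, names):
    """Return the names whose installed copy is missing or differs
    byte-for-byte from the archived copy.

    installed_dir and archive_dir are hypothetical locations; the
    thread does not say where either set of files is kept.
    """
    differing = []
    for name in names:
        installed = os.path.join(installed_dir, name)
        archived = os.path.join(archive_dir, name)
        if not (os.path.isfile(installed) and os.path.isfile(archived)):
            # A missing file on either side cannot be verified.
            differing.append(name)
        elif not filecmp.cmp(installed, archived, shallow=False):
            # shallow=False forces a comparison of file contents,
            # not just size and modification time.
            differing.append(name)
    return differing
```

Called with the four names Ken lists (uucico, uux, uuxqt, uucp), an empty result would suggest the installed binaries still match the archive, in which case corruption of those files is probably not the cause.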
erict@flatline.UUCP (j eric townsend) (07/05/88)
In article <422@kosman.UUCP>, kevin@kosman.UUCP (Kevin O'Gorman) writes:
> I am distressed because my machine keeps freezing on me. It seems pretty
> clearly related to uucp traffic. Just about 15 minutes ago, it froze for
> the first time while I was watching it. It was just at the end of an
> incoming uucp call (my unix-pc feed, to be exact), and it left something
> for cunbatch in /lost+found. This means that the freezing is costing me
> dropped news, which is something I have suspected for some time.
> What I'm looking for is some opinions from you folks who have been paying
> attention. I want to know what configuration is the most reliable. I can
> back off to 3.51 or even 3.50 kernel, I can play with uucp versions, I can

Okedoke. I'm still running 3.0, and I *never* have uucico/uucp problems. Ever, ever ever. Well, I had them twice, actually, but once was because I was playing around with uucico and friends. Bad idea. :-)

The one problem that I had was setgetty not finishing up 100%, i.e. the LCKs would be rm'd, but uucico wouldn't finish up and exit. More uucicos would start throughout the day, and not exit. So... I'd have about 8 uucicos, none of them talking on the line, and no LCK files. I killed all the uucicos, but that didn't help; the problem started again. So I rebooted, and that solved the problem. This is after about 1.5 years of unix-pc uptime and constant usage.

Personally, I'd be 100% content with 3.0 if I could have the 3.5 development set and libraries (flexnames, real curses, etc), but....

An idea: run one machine as the "newsbox" with 3.0 and HDB (HDB under 3.0 is a godsend) and don't use it for much else. Rogue, maybe, and a few other things. :-)

--
Skate UNIX or go home, boogie boy... J. Eric Townsend ->uunet!nuchat!flatline!erict smail:511Parker#2,Hstn,Tx,77007 ..!bellcore!tness1!/
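The symptom Eric describes (several uucico processes alive, yet no LCK files) can be spotted mechanically. The following is only a sketch in Python under stated assumptions: it takes the text of a process listing and a list of file names from the lock directory as plain inputs, and both function names are hypothetical. The exact `ps` column layout and the lock directory location vary by system and are not given in the thread, so the code only looks for a field whose base name is `uucico` and for names beginning with `LCK`.

```python
import os


def count_uucico(ps_output):
    """Count lines of a process listing that mention a uucico command.

    Assumes only that the command appears as a whitespace-separated
    field, with or without a leading path.
    """
    count = 0
    for line in ps_output.splitlines():
        fields = line.split()
        if any(os.path.basename(field) == "uucico" for field in fields):
            count += 1
    return count


def looks_wedged(ps_output, lock_dir_names):
    """Return True when uucico processes exist but no LCK file does."""
    locks = [name for name in lock_dir_names if name.startswith("LCK")]
    return count_uucico(ps_output) > 0 and not locks
```

A True result matches the state described in the post, where killing the processes did not help and a reboot was needed.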
andy@rbdc.UUCP (Andy Pitts) (07/06/88)
This has all been said before but I'll post it again for any new people who may have missed it. The standard things that screw up uucp are:

1) The inittab entry. There MUST be a space preceding the entry for the tty line in inittab. example:
 ph1:2:respawn:....
^ must have leading space. If the space is missing setgetty will lock.

2) There is a bug on some versions of the eia/ram cards that causes the system to crash. The fix is simple, you just replace a chip. If you are using an eia card call the hotline and ask about it. This bug only affected V3.50 and later if my memory serves.

3) Also I seem to remember reading that the device driver for tty000 was brain damaged. As I recall it will not pass a null (so break won't change speed) and hardware flow control did not work. Some or all of this may have been fixed with the fixdisk, but I don't know. Perhaps someone else out there can tell us. The drivers for the eia cards seem to work however (if you change the chip).

I hope this helps.

--
Andy Pitts andy@rbdc.UUCP : "The giant Gorf was hit in one eye by a stone,
att \                     : and that eye turned inward so that it looked
kd4nc !gladys!rbdc!andy   : into his mind and he died of what he saw there."
pacbell/                  : --_The Forgotten Beast of Eld_, McKillip--
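Item 1 above is easy to check by eye, but a small script makes the rule concrete. This is a minimal sketch in Python, assuming only what the post states: an inittab entry is a colon-separated line whose first field is its id, and the tty entries must begin with a space. The function name, the sample text, and the id `000` are hypothetical, and the elided `....` is kept exactly as in the post.

```python
def entries_missing_leading_space(inittab_text, ids):
    """Return the ids (from `ids`) whose inittab line does not
    begin with a space character."""
    missing = []
    for line in inittab_text.splitlines():
        stripped = line.lstrip(" ")
        if not stripped:
            continue  # skip blank lines
        entry_id = stripped.split(":", 1)[0]
        if entry_id in ids and not line.startswith(" "):
            missing.append(entry_id)
    return missing


# Hypothetical sample: the first entry has the leading space,
# the second does not.
sample = " ph1:2:respawn:....\n000:2:respawn:....\n"
print(entries_missing_leading_space(sample, {"ph1", "000"}))
```

On this sample the function reports only the `000` entry, since the `ph1` line already starts with a space.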
rjg@sialis.mn.org (Robert J. Granvin) (07/08/88)
In article <531@rbdc.UUCP> andy@rbdc.UUCP (Andy Pitts) writes:
>This has all been said before but I'll post it again for any new people
>who may have missed it. The standard things that screw up uucp are:
>
>1) The inittab entry. There MUST be a space preceding the entry for
>the tty line in inittab. example:
> ph1:2:respawn:....
>^ must have leading space. If the space is missing setgetty will lock.
>
>2) There is a bug on some versions of the eia/ram cards that causes the
>system to crash. The fix is simple, you just replace a chip. If you are
>using an eia card call the hotline and ask about it. This bug only affected
>V3.50 and later if my memory serves.
>
>3) Also I seem to remember reading that the device driver for tty000 was
>brain damaged. As I recall it will not pass a null (so break won't change
>speed) and hardware flow control did not work. Some or all of this may
>have been fixed with the fixdisk, but I don't know. Perhaps someone else
>out there can tell us. The drivers for the eia cards seem to work however
>(if you change the chip).

The tty000 driver is broken in at least version 3.51. It is also broken in 3.5, if I recall correctly. The report that it cannot pass a NUL (BREAK) is correct. This also has the added benefit that if you pass (or attempt to pass) a BREAK through the OBM, you will, or may, hang any device on tty000. The device on tty000 will sometimes, though not always, recover itself later. 3.51a resolves this problem.

The chip replacement for the EIA/RAM boards works like a charm, but if you've still got an old one, it's in your best interest to get a replacement chip NOW. It's not unlikely that these things will become very scarce in a short time (much like clock replacement batteries are today).

Re: other messages about uucico crashing systems, etc:

Hardware Flow Control works, but is broken. HFC will consistently repeat a block of data in an entirely predictable way. 
The problem has been reported to ATT, and has also been sent up a level. This escalation in priority (which matches that given to the 3.51a kernel problems) means that it's very likely (though not guaranteed) that this bug will be resolved either in the next fixdisk, or a future one.

3.51 has a broken uucico and ph. These are repaired by the 3.51a fixdisk. 3.5 is also broken. The result is that completion of _some_ connections (there is little consistency as to which connection will be affected) will cause a system hang, and often a panic. You are forced to go for the little black reset button. ATT claims that ph is the primary culprit, but experience says that uucico is really the one primarily to blame, although both are broken. Primarily based on the number of calls and demands they were getting, they released the 3.51a uucico on an unannounced request-only basis several months before the 3.51a fixdisks were released. In (nearly) every case, replacing the old uucico with this one completely solved all panic/crash problems.

Now a step further. When the 3.51a fixdisk was released, all these problems were resolved, but a new one was introduced. If you use the OBM for UUCP connections, you will _still_ get occasional system panics with a kernel fault. If you do _not_ use the OBM, this problem goes away. The problem has been directly identified to be in the 3.51a unix kernel. The bug has been identified, and it has been verified that it does not exist in previous kernels. The problem is caused by the OBM not correctly closing its last "physical buffer". It has been fixed, and the new kernel (3.51b) is currently in testing. The fixdisk does not yet exist, so don't call and ask for it. I'll post a note when I know it's available. The 3.51b fixdisk may also resolve several other issues as well. The exact contents are not known.

By the way, re: the OBM. As some have noticed, the OBM firmware will handshake very happily as an MNP modem. 
While it's fully capable of handshaking MNP, it certainly does not communicate MNP. If you're using a Telebit or other MNP capable modem, you must be sure to turn off MNP abilities for any connection to a 3b1, or your connection will fail. By the way, the OBM dialing _out_ will _not_ handshake MNP, so it looks like someone originally designed it to be an MNP modem, then that capability was removed. Unfortunately, it wasn't removed enough. ATT was surprised. -- "I've been trying for some time to Robert J. Granvin develop a life-style that doesn't National Information Systems, Inc. require my presence." rjg@sialis.mn.org -Garry Trudeau ...{{amdahl,hpda}!bungia,rosevax}!sialis!rjg
david@ms.uky.edu (David Herron -- One of the vertebrae) (07/09/88)
In article <635@sialis.mn.org> rjg@sialis.mn.org (Robert J. Granvin) writes: >By the way, the OBM dialing _out_ will _not_ handshake MNP, so >it looks like someone originally designed it to be an MNP modem, then >that capability was removed. Unfortunately, it wasn't removed enough. >ATT was surprised. no, I don't buy that story. The Unix PC was on the market before MNP ever was. MNP is *not* very old! -- <---- David Herron -- The E-Mail guy <david@ms.uky.edu> <---- ska: David le casse\*' {rutgers,uunet}!ukma!david, david@UKMA.BITNET <---- <---- Today is the yesterday you worried about tomorrow.
rjg@sialis.mn.org (Robert J. Granvin) (07/10/88)
>>By the way, the OBM dialing _out_ will _not_ handshake MNP, so
>>it looks like someone originally designed it to be an MNP modem, then
>>that capability was removed. Unfortunately, it wasn't removed enough.
>>ATT was surprised.
>
>no, I don't buy that story. The Unix PC was on the market before
>MNP ever was. MNP is *not* very old!

The actual story isn't the issue, but the observable facts are.

By the way: MNP modems have been on the market for several years (probably longer than many realize), and the 3b1/7300 hasn't necessarily been on the market as long as some would think (for a discontinued machine). There is no reason to believe that the 3b1/7300 could NOT have MNP if they wanted it, unless the timing was just too close between the two.

I have tested dialing inbound to a number of 3b1s and 7300s of different ages and configurations with several modems. Most were capable of MNP class 3, others capable of MNP class 5. They all reported MNP handshake upon connection, whether you were in reliable or autoreliable mode. MNP modems have been on the market for several years, and the design existed long before they were made available (of course).

By comparison, when these modems were set in autoreliable mode, a 3b1 OBM dialing _into_ them would not handshake MNP. If those modems were set to reliable mode, no connection could be made, of course.

Checking status registers during a "live" connection verifies that the modem has indeed successfully and correctly handshaked an MNP connection. The only thing I haven't attempted to discover yet is what level of MNP it will actually handshake at.

The condition was reported to ATT, which they found surprising. They tried it out, and like magic, an MNP modem handshook just like that. Don't ever expect to see a fix, though; it's firmware. Unless they for some reason decide to fix something else in the OBM firmware (highly unlikely), in which case they'll probably forget to address this as well... 
:-)

Now, there is one suggestion for the skeptics: Try it yourself. I'm relatively sure it's not a condition limited only to Minnesota and New York. :-) Your Trailblazer is capable of MNP, so there's a diagnostic tool.

And for those, there are probably two possibilities of why it'll handshake:

1) ATT (and/or Convergent) were possibly working on the protocol together, or they had some sort of agreement with Microcom for its use. There is a long list of possibilities of why the modem might handshake MNP and yet never utilize it. The only people who would know for sure would most likely be anyone directly associated with the original "Safari 4" design team.

2) Pure dumb luck that this chip just happens to understand an MNP handshake request and respond to it accordingly. Not entirely likely. Heck of a coincidence, though! :-)

By the way: I do like to verify my facts before I post them. It makes me feel confident that what I post is as close to fact as possible. When it comes to possibilities, it's nice to use phrases like "could be" or "possibly" or "a possible scenario", etc. When the true story and reasons are important, then someone will do their best to find that out. In this case, _why_ it does it is not important, but the fact that it does, is.

This whole thing comes back to the very same suggestion as was in the previous note: If you are running as an autoreliable dialout MNP modem, make sure you turn _off_ all MNP before you try to connect to a 3b1/7300, or you'll get a connect, but that's it.

'Nuf said (more than :-), I hope.

--
"I've been trying for some time to Robert J. Granvin develop a life-style that doesn't National Information Systems, Inc. require my presence." rjg@sialis.mn.org -Garry Trudeau ...{{amdahl,hpda}!bungia,rosevax}!sialis!rjg
mml@magnus.UUCP (Mike Levin) (07/12/88)
In article <9915@g.ms.uky.edu> david@ms.uky.edu (David Herron -- One of the vertebrae) writes:
>no, I don't buy that story. The Unix PC was on the market before
>MNP ever was. MNP is *not* very old!

I have been using modems from Microcom (who invented Microcom Networking Protocol, also known as MNP) for at *least* 6 years, and they weren't brand-new at that time. So, I have a hunch that MNP predates the unix-pc. Of course, I haven't got any idea how long *it's* been on the market.

--
+---+ P L E A S E R E S P O N D T O: +---+ * * * * * * * * * * | Mike Levin, Silent Radio Los Angeles (magnus)| I never thought I'd be LOOKING | Path {csun|kosman|mtune|srhqla}!magnus!levin | for something to say! ! ! +----------------------------------------------+------------------------------+