jesup@cbmvax.commodore.com (Randell Jesup) (03/24/91)
[ followups to comp.sys.amiga.hardware, .tech is supposed to be gone ]

In article <1991Mar21.180521.9469@swbatl.sbc.com> jburnes@swbatl.sbc.com (Jim Burnes - 235-0709) writes:
>Has anyone noticed any hard drive errors on new A3000's? We've been
>getting CRC errors on our Quantum hard drives. What's really weird is
>that after running continuous overnight diagnostics on the system, the
>errors won't occur. After bringing this machine (and we're not the
>only one in St. Louis that has this problem) ...after bringing it to
>our dealer MULTIPLE TIMES and having every major system component
>swapped out the errors keep occurring. Sometimes they occur when
>running the 2.0 version of DPAINT III. Sometimes we can't even predict
>a crash. I'm beginning to think this is an engineering design problem.
>Some sort of strange bus activity/noise problem that causes the machine
>to fail.

There are other possibilities. You may be getting glitches on your power lines (these can be REALLY confusing), so it might happen only at your home. It can be hard to diagnose unless lights or other equipment show changes (like dimming or brightening). It could be a PD or commercial utility leaving "time bombs" unintentionally. It may be an interaction of two incorrect programs (for example, one writes to location 0, the other reads it and assumes it's 0).

What version of 2.0 are you running? Does it happen if you run 1.3? (You should have 2.02 (version 36.207).) There was a WB disk for 2.02; did you update your system using it?

The sort of thing you mention is a theoretical possibility, but that sort of error is quite unlikely assuming you did swap motherboard/disk/PS. (You did swap those, if I read your message right.)

>My Commodore dealer is stumped because everyone he calls at Commodore
>says that they haven't heard a thing about this. I think they are
>lying. More than one of his customers has had the same exact problem.
>If Commodore won't admit to it, I'm forced to take an informal survey.

I certainly haven't heard anything like this, but then again I'm a software engineer, not hardware, let alone PA or Service.

NOTE: this is (as usual) a personal unofficial comment/opinion, not that of Commodore or any branch thereof.

--
Randell Jesup, Keeper of AmigaDos, Commodore Engineering.
{uunet|rutgers}!cbmvax!jesup, jesup@cbmvax.commodore.com  BIX: rjesup
The compiler runs
Like a swift-flowing river
I wait in silence.  (From "The Zen of Programming") ;-)

daveh@cbmvax.commodore.com (Dave Haynie) (03/26/91)
In article <20071@cbmvax.commodore.com> jesup@cbmvax.commodore.com (Randell Jesup) writes:
>[ followups to comp.sys.amiga.hardware, .tech is supposed to be gone ]

>>I'm beginning to think this is an engineering design problem. Some sort of
>>strange bus activity/noise problem that causes the machine to fail.

> There are other possibilities. You may be getting glitches on your
>power lines (these can be REALLY confusing), so it might happen only at
>your home.

Actually, the power line would be my guess. We had a similar problem back in the early days of the A2620. The systems worked fine in the offices, but failed at my lab station. I was chasing down a failure that kicked in about once every two days. Since I thought there was some obscure timing problem, all eyes were on me, while the UNIX software folks merrily worked along without a glitch. I eventually caught a power-line-induced glitch on my system, which had the effect of freezing everything up. Of course, I wasn't using a hard disk system there, just the CPU running in a constant loop. A line glitch can as easily clobber a hard disk as it can a CPU.

As far as long-term hard disk activity goes, I don't know what QA has done, but I have run tests of my own. I set an A3000 up in the lab with three hard disks: the built-in, an A2091, and a Hardframe. They were each grabbing stuff from each other, deleting, copying, etc. in a tight loop. This ran until I took it down, a little over two weeks, without a single problem. I think most of us using A3000s find similar reliability -- they stay up until you crash them or the power goes out (not totally unheard of around here).

If you're running into this kind of thing, where the machine fails at home but works fine at the repair shop, examine what's being done closely. You of course have to make sure the tests at the shop mimic the failure mode at home -- you're not likely to have a problem in 5 minutes at a shop if you don't see the problem at home for hours.
But if the shop can't make it fail, there's a real good chance it's not your machine. Radio Shack currently sells some kind of power line monitor thing for about $20 or so which may be useful in tracking down glitches (I haven't used the thing; we have an expensive one here at work which prints out fluctuations on paper tape rather than via LEDs).

> It could be a PD or commercial utility leaving "time bombs"
>unintentionally. It may be an interaction of two incorrect programs (for
>example, one write to location 0, the other reads it and assumes it's 0).

Which is why it's so important to duplicate the exact failure mode if you're trying to show it off at the store. If BonzoPaint left a time bomb, you're not likely to get the same failure with IggyCAD no matter how long you play with it.

> The sort of thing you mention is a theoretical possibility, but that
>sort of error is quite unlikely assuming you did swap motherboard/disk/PS.
>(You did swap those, if I read your message right.)

It's theoretically possible, but very unlikely. When solving problems, you eliminate the likely first, before venturing into the unlikely.

>>My Commodore dealer is stumped because everyone he calls at Commodore
>>says that they haven't heard a thing about this.

I personally have heard relatively isolated stories, on occasion, of such random crashes, on every system I have worked on, from the PLUS/4 and C128 on up to the A3000. It's never proven to be a design flaw. Always something else -- bad power line spikes, flaky or misinserted expansion boards, software uglies, etc. Sure, there's always a first time for everything, but a design flaw of any kind kicks up a large number of failures, and we do indeed hear about it. Very loudly, in fact, if it's anything that's to do with engineering, rather than production problems.

>>I think they are lying.

I don't understand this attitude that Commodore is some evil giant intent on taking your money and delivering systems that give you heartburn.
Maybe if we were IBM, I could understand it. But really, there's no heinous cover-up or anything here, and all our index fingers are shorter than our middle fingers. And you're talking here to the engineers, too, not The Great Uninformed.

> I certainly haven't heard anything like this, but then again I'm
>a software engineer, not hardware, let alone PA or Service.

And I can't say for certain that service hasn't isolated some kind of problem either; anything's possible. But they do contact us about problems they can't solve. And I haven't been hiding or anything. So I do suggest investigation of those other areas I have mentioned.

>Randell Jesup, Keeper of AmigaDos, Commodore Engineering.

--
Dave Haynie Commodore-Amiga (Amiga 3000) "The Crew That Never Rests"
{uunet|pyramid|rutgers}!cbmvax!daveh PLINK: hazy BIX: hazy
"That's me in the corner, that's me in the spotlight" -R.E.M.