cloos@acsu.buffalo.edu (James H. Cloos) (02/14/91)
I do not want to alarm anyone, as I have not been able to reproduce this, but Monday morning I was checking out the performance of the SPLINE suite that was posted here, and ran into an interesting problem.

I gave the routines the data [ 1 5 14 30 55 91 125 153 171 175 161 125 90 58 31 11 ] [ 3 18 ] and hit the key. After about 5 minutes, I gave up on the program & tried to break out. ATTN didn't work, so I went so far as to try ON-C. This was also non-functional. (Last April or May I ran into a similar problem when I had cleared all of the key assignments & put myself in USER mode; ON-C did not work even though it is supposed to. The next time I tried it, ON-C did work, but that time it did not. I gave up on reproducing that until now.)

I ended up taking the foot off and defibrillating it. The screen went blank so I hit ON, which turned it back on & gave me a Do You Want To Recover Mem screen. Of course I hit YES, and soon got that screen back, so I hit YES again & was returned 8 directories (D.0[1-8]), of which the first contained all of the other 7 (named D.0[1-7]) and the others were each of the subdirs I had, w/ the exception of the SPLINE dir & its parent (which, interestingly, contained many of those recovered dirs). The '' dir came up as one of the D.... dirs.

I did find it interesting that the 128K RAM card I had MERGEd was still MERGEd and the BAK in it of HOMEDIR was unchanged. The clock was also OK. I was able to recover by RESTOREing the BAK from PORT0.

What is the moral of this story? I don't know, but this does make 3 times now that I have had the calc lock up completely, and the 1st and 3rd were while it was running a strictly userlanguage program. (The 2nd was so bad even ON-A-F didn't work, nor did the defibrillator; only removing the batteries did.)

After the first time, I mentioned in a posting that ON-C could be disabled by clearing the assignment of the ON key--assuming you are in USER mode.
Someone on the design team (maybe Bill) followed-up that nothing could (in userlang anyway) block ON-C. My subsequent trials agreed with that sentiment, but now it has happened to me again, & I KNOW that I was pressing ON and C real well many, many times.

There MAY be an obscure problem, or I may just have gotten a thick spurt of some interference (cosmic rays?) that mucked up RAM just enough to crash. I was given to understand that defibrillating would NOT trash mem, so it looks like trashed mem caused the crash.

I would be curious to learn if anyone else has had unexplainable crashes (w/o the use of non-userlang stuff).

P.S. I'm currently leaning toward some kind of RAM-affecting interference.

P.P.S. FLAMES to /dev/null, bitte; only constructive replies need reply or follow-up.

-JimC
--
James H. Cloos, Jr. Phone: +1 716 673-1250 cloos@ACSU.Buffalo.EDU Snail: PersonalZipCode: 14048-0772, USA cloos@ub.UUCP Quote: <>
frechett@spot.Colorado.EDU (-=Runaway Daemon=-) (02/14/91)
I don't want to quote this whole article, but you said that you were led to believe that the reset button didn't trash memory. I don't think it does. It sounds to me like the memory was trashed before you hit the button. If you consider that the calc was already locked up and nothing else was working, I would have to think that your memory wasn't in very good shape. Try hitting the reset button when the calc is working fine. It seems to be analogous to the hp28s's ON-ENTER-BACKSPACE sequence, which just turns the calc off. You will notice that the reset button always does turn the calc off.

As for the actual process of locking up the machine, it is easy to do. I have crashed my memory many times. The last time was while beta-testing NRTS0.1, and it seemed to have something to do with the input buffer, as everything was hunky dory when I left it and a few minutes later it was displaying Try to recover Memory? I have locked it up badly enough to have to hit the button about 4 times, and in about 3 out of 4, the memory loss was total.

If you think about it, other machines do it too. I used to freak out Apples many years ago. Ever play with screwed up binaries on a Mac? Wasn't it something like "Fatal System error" and it shows a little bomb.... I have locked up my share of PCs, and the best was when, in the time span of about an hour, I managed to crash 5 DECstation 3100s 3 or 4 times each. Our big DEC 5500 and SEQUENT machines freak and die about once a week. The main difference between them and the hp48sx is that the hp48sx just doesn't come back up as easily. It doesn't have a drive.

I am getting a second 128K card so I should be able to use this one for all my backups. Will make hacking a much safer proposition. Fun little machine.

ian
mcgrant@elaine30.stanford.edu (Michael Grant) (02/15/91)
In article <59716@eerie.acsu.Buffalo.EDU> cloos@acsu.buffalo.edu (James H. Cloos) writes:
>There MAY be an obscure
>problem, or I may just have gotten a thick spurt of some interference
>(cosmic rays?) that mucked up RAM just enough to crash.
>
>P.S. I'm currently leaning toward some kind of RAM-affecting interference.

It could be a 'soft error', which is simply the interaction of radiation with the memory, causing a temporary read error. I've worked in memory failure analysis before, and soft errors are enough of a problem that they are used as a measure of the robustness of a new design. They bombard the thing with alpha particles, and see if anything goes wrong. Of course, this means that they get many more errors than anyone else ever will.

Well, to be perfectly honest, there is no way of knowing if this is the reason why the calculator crashed, but, despite all semiconductor companies' efforts to prevent them, they still crop up--especially since memory cell size keeps shrinking (the lower a cell's capacitance, the fewer the number of electrons, the higher the susceptibility).

Hell, for all I know, it could be completely irrelevant to this particular crash--I've never known a bug in my life to be attributable to anything but software error. But, your mention of cosmic rays brought this to mind.

Just a wild-eyed suggestion,
Michael C. Grant
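To put rough numbers on the capacitance argument in the post above, here is a minimal sketch that applies Q = C * V to estimate how many electrons sit on a storage node. The capacitance and voltage values are assumed round numbers chosen only for illustration; they are not figures for the HP-48's memory or for any real part, and `stored_electrons` is a hypothetical helper name.

```python
# Illustrative only: the capacitances and the 5.0 V supply below are assumed
# round numbers, not data for the HP-48 or any specific memory device.
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs per electron


def stored_electrons(capacitance_farads, voltage_volts):
    """Approximate number of electrons on a cell node, from Q = C * V."""
    return capacitance_farads * voltage_volts / ELEMENTARY_CHARGE


if __name__ == "__main__":
    # Shrinking the (assumed) node capacitance shrinks the stored charge
    # in direct proportion, which is the point made in the post.
    for cap_femtofarads in (50, 25, 10):
        count = stored_electrons(cap_femtofarads * 1e-15, 5.0)
        print(f"{cap_femtofarads:>3} fF at 5.0 V -> about {count:,.0f} electrons")
```

The smaller the stored charge, the less charge a particle strike has to deposit to disturb the cell, which is the susceptibility trend described above. The sketch says nothing about how static and dynamic cells differ in their response.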
ervin@pinbot.enet.dec.com (Joseph James Ervin) (02/15/91)
>It could be a 'soft error', which is simply the interaction of radiation
>with the memory, causing a temporary read error. I've worked in memory
>failure analysis before, and soft errors are enough of a problem that they
>are used as a measure of the robustness of a new design.

I believe you can pretty much rule out alpha particles as a source of the errors you've seen. Such "soft" errors are a phenomenon of dynamic memory devices. The memory in the HP48 is static, so alpha particles have a much, much, much smaller chance of doing any bit-flipping than in the case of dynamic memory.

>>>Joe Ervin
mamos@uafhp.uark.edu (Mark _E_ Amos) (02/16/91)
As long as we're on the subject of RAM problems and lockups, I would like to share my latest adventure, in the hopes I can learn just what the H*LL is happening.

-Some guys I know in my department discovered they could use Smith-Corona 32K RAM cards ($23.84, local Wal-Mart) in their 48's. Out of 3 purchased, 1 didn't work, and a quick exchange at the store fixed that. Here comes the fun part: A friend and I figured we'd cash in on this cheap expansion stuff, and after discovering the local Wal-Mart was sold out (word gets around, eh?) we went to one at a neighboring town, finding plenty of new and unopened blister packs of the little jewels. I bought 2 and my friend bought 1. Upon reaching the car, I got my 48 out and plugged 1 of mine in - it worked great! I then plugged the second one in and turned on the power to see random vertical lines, then a blank screen, then a BLACK screen which began bleeding and pulsating from left to right, with NO key sequence having any effect at all. The bleeding screen smacked of overdrive, so I unplugged the second card. The machine then came back after a bit of a wait and key pressing.

-Ok, fine, I happened to get a bad card like the other dude I knew... So, I tried my friend's card - same exact thing. This happened with the two "bad" cards whether I had the "good" one in or not, and it didn't matter which slot. Now what? Try them on my friend's machine - they ALL 3 worked flawlessly, including two at once...

-Hmmm. Well, I had about 28k used on mine, including a library in port 0, so I copied all my stuff to his and duplicated the above sequence. Same result. Next I tried checking ROM versions, etc.-identical (D). I then began grabbing for straws - wiped my machine (ON-A-F, No) and tried again: same thing. OK, he had his old original batteries and I had brand new ones (NOW we're grabbing for shadows of straws) so I swapped them - same result.
(Incidentally, the ON-D, G sequence does NOT seem to have any indication of relative battery strength WHATSOEVER, as I checked this during the process).

-I finally gave up and went back in to exchange one of the "bad" cards, hoping I would get lucky and get one that would work on mine like the one I had that would. No such luck. This one also worked fine in my friend's but did the bleeding screen bizness on mine.

-I have been an electronics tech for upwards of 8 years, and an engineering student for 2 years so far - I consider my methods logical and thorough, yet I can find NO explanation whatsoever for this "phenomenon". I will be sending my machine in to update ROM to E in a few months, but in the mean time, what in the name of Sam Hill is happening? I know, I know, it's not an authorized HP card, etc., etc. - but the fact remains 3 of 4 cards would NOT work on MY machine, yet 7 of 8 work on 3 other machines - the 8th of which I have yet to discover the actual symptoms of (ie. screen bleeding, or just no workum?).

-Conclusion: authorized RAM card or no, there are obviously SOME kind of differences between machines that are ROM independent. OK, so let's talk about the variations in line drivers/buffers, etc.

-I know these differences exist, but what I would like to know most of all is, am I just the fluke or does this kind of thing happen to anyone else out there?

"Curiouser and curiouser..."

==============================================================================
Mark _E_ Amos | University of Arkansas Computer Science Engineering
mamos@uafhp.uark.edu | mea1@engr.uark.edu | (emphasise the Computer Engineering please)
------------------------------------------------------------------------------
"Man's mind, when stretched to a new idea, never goes back to its original dimension." -Oliver Wendell Holmes
==============================================================================
cah@gripl.UUCP (Chris Heitmann) (02/19/91)
Well, I too went out and tried a S.C. ram card from Lechmere ($27.99 :-( ) and it did not work. I observed the same bleeding screen effect that the previous poster saw. I exchanged it for another, and the same thing happened again. I will be trying another soon, but wanted to give it a rest for a day (superstitious I guess...).

One word of caution: when I tried the unauthorized ram cards, my memory was erased. Not just the internal memory but the 128k ram card also. The external memory was merged with the internal, so possibly if it were a backup or something instead, it would not have been erased.

In any case I could not find any serial numbers on the S.C. ram card to compare so as to figure out any differences. I am considering trying the Korg memory cards (128k!) from the local music store ($115.00 if memory serves...oops, no pun intended!).

Chris
cah@gripl.uucp
frechett@spot.Colorado.EDU (-=Runaway Daemon=-) (02/20/91)
From what I can tell... I think the Smith Corona cards are a better deal and possibly better cards... Unfortunately, I don't have the old discussion on Epson RAM cards archived, so I don't remember what all the differences were. Something like, the Epson cards are made to run at 5-5.5V and the HP cards are of course made to work at 4.5V. I don't know how this translates into the weird problems described here. My personal theory is that it could have something to do with the various versions with the dying LCDs. I know that they extended up into the Ds. Any ideas anyone?

ian
--
-=Runaway Daemon=-
TNAN0@CCVAX.IASTATE.EDU (02/20/91)
Chris, I purchased a Smith Corona "DataStore 32K" card today, brought it home, tried it out and it works perfectly. Its model number (I think that's what it is) is S 75531. I have tried it in two HP-48s so far and I haven't yet noticed any odd screen (or memory) effects.

I have: HP-48D (and tried it on an E) Serial #: 3031A00755

I purchased the card at Wal-Mart for $25.03 (after 5% sales tax). I've tried it solo and with the EQLIB card, but not with another RAM card-- could this be the problem?

---Xeno
jcohen@lehi3b15.csee.Lehigh.EDU (Josh Cohen [890918]) (03/13/91)
Try the sc card with NEW batteries.. I have heard that a low batt condition in the hp can cause a crash with the insertion of a sc card.