hull@hao.ucar.edu (Howard Hull) (04/03/89)
Having installed a Quantum 80S hard disk in my A2000, I then had the problem of backing up the hard disk files. Since I have not had time to put very much on the disk, I decided that the best way to back it up for the time being was to transfer the directories one at a time to a ram disk and then use DosKwik to write them from the ram disk to floppy. The use of the ram disk would then allow me to selectively recover files if necessary.

One of the largest directories I have is full of "engineering sketches" in Aegis Draw Plus format. To accommodate this directory required 1.8 megabytes in ram, and two DosKwik disks upon which to store the archive. Imagine my surprise when RiteKwik announced that my ram disk had a read error when I tried to push this large directory to disk.

I decided to investigate this in more detail. I got the MemBoardTest off of the Fred Fish AmigaLibDisk158 and set the low address to 200000 and the high address to 3FFFFE. The program ran through the write cycle on the first pass (green dots) and then started through the read (blue dots). At 2C0194 it started reporting read errors. Errors were reported for the two center address digits in a pattern that went 01, 05, 09, 0D, 11, 21, 25, 29, and 2D. The bad data bit (with the Linear Test) was bit 12 (that's of course on a scale from 00 to 15 decimal).

Well, I guess that explains some of the mysterious crashes I have been having whenever I have had things fairly well loaded up. [I hereby publicly take back half of all the bad things I ever said about the Manx z editor - but not all of them! :-) ] I had been working with a one-meg recoverable ram disk (which occupies 300000 to 3FFFFF), and the AmigaDOS Ram Disk, which usually starts somewhere in the C00000 memory and crosses over the 200000 boundary after I get about a quarter meg of files in it. With that configuration, I would only have 192Kb of Fast Mem left when the crashes hit.
I thus assumed that some of my programs were not getting along with one another with respect to resource management - but I work for a federally funded institution where such behavior is the rule rather than the exception, so I thought nothing much of it, and simply reduced the load.

When using the CLI, the MemBoardTest program is invoked from the MemBoardTest directory on the Fish disk by the command "Run main". It would be well to note that while the MemBoardTest is a wonderful program, it does write to memory much as you ask, so it may very well cream some of the AmigaDOS soft mem pointers, and may then crash either at the end of a cycle through the test or upon exit from the MemBoardTest program. In particular, for the A2052 autolocated at 200000 Hex, avoid testing below 210000 or above 3FF000 if you want to stay alive for a while. Also, on my A2000, MemBoardTest crashes the system if I use the Palette gadget.

With the one-bit error revealed by MemBoardTest, the problem now was to locate the bad memory chip. I looked in my paper file to see what was in the A2052 documentation. Nothing. N O T H I N G. I called my local Amiga dealer. He said "dunno how they're mapped." More calling around yielded zilch about this.

Luckily I had a sales brochure put out by CBM that had a picture of an A2052 and a partially populated A2058 on it. The partial population in sockets on the A2058 occupied eight positions at the upper left on the top row, and eight more at the top right. So I figured that the A2052 chips - which are soldered in place - would have a similar layout. As it turned out, my assumption in this regard was correct, but was not really enough to go on at the start. I was worried that I was going to have to use a logic analyzer to figure it out. That would also mean a lot of physical hassle. As a first attempt, I decided to use an oscilloscope.
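As a rough illustration of how a write-then-read pass pins a fault down to a single data bit, here is a minimal Python sketch. It is not MemBoardTest's actual code: the FaultyRam class, the linear_test function, the 0xFFFF test pattern, and the choice of a bit stuck at 0 are all assumptions made for the example. Only the address 2C0194 and bit 12 come from the post.

```python
# Hypothetical simulation, not taken from MemBoardTest: a 16-bit-wide RAM in
# which one data bit always reads back 0 over a small address range.

class FaultyRam:
    def __init__(self, base, size_words, stuck_bit, bad_lo, bad_hi):
        self.base = base
        self.words = [0] * size_words
        self.stuck_mask = 1 << stuck_bit
        self.bad_lo, self.bad_hi = bad_lo, bad_hi

    def write(self, addr, value):
        self.words[(addr - self.base) // 2] = value & 0xFFFF

    def read(self, addr):
        value = self.words[(addr - self.base) // 2]
        if self.bad_lo <= addr <= self.bad_hi:
            value &= ~self.stuck_mask  # the bad cell loses its 1
        return value


def linear_test(ram, lo, hi, pattern=0xFFFF):
    """Write the pattern to every word, then read back and list bad bits."""
    for addr in range(lo, hi + 1, 2):      # write pass
        ram.write(addr, pattern)
    errors = []
    for addr in range(lo, hi + 1, 2):      # read pass
        diff = ram.read(addr) ^ pattern    # set bits mark mismatched data lines
        if diff:
            bad_bits = [b for b in range(16) if diff & (1 << b)]
            errors.append((addr, bad_bits))
    return errors


# Assumed fault: bit 12 stuck at 0 in two words starting at 2C0194.
ram = FaultyRam(base=0x2C0000, size_words=0x200, stuck_bit=12,
                bad_lo=0x2C0194, bad_hi=0x2C0196)
errors = linear_test(ram, 0x2C0000, 0x2C03FE)
for addr, bits in errors:
    print(f"{addr:06X} bad bits {bits}")
```

XORing the value read back against the value written leaves a 1 in exactly the bit positions that disagree, which is why a single failing chip on a by-one DRAM board shows up as one consistent bit number.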
As I did not have a schematic for the A2052, I decided to look through the A2000 manual to see if I could learn anything from what was done there. This turned out to be a good lesson indeed. From the schematic marked "Memories alone in the moonlight" I was able to determine a typical 512K layout. The first disappointment was the discovery that the write enable (_WE) was set for all four banks any time a write was issued to any bank (4 X 512K is 2 meg, eh?). However, I could also see that column address strobe (CAS) and row address strobe (RAS) were driven by '244 buffers and had "safety resistors" in series with the drive lines to protect against a short in any individual chip.

I therefore decided to gamble by shorting the RAS and CAS pins of various of the chips to ground one at a time while running the MemBoardTest to see which bank crapped out when I crowbarred a strobe line. When I shorted CAS to ground, the machine instantly crashed. The same went for whenever I shorted either RAS or CAS for bank 200000-27FFFF. But when I shorted RAS to ground on the 280001-2FFFFF bank, the entire bank would show massive failures for all bits in the bank. (RAS is pin 15, so be careful and don't nail the +5 on pin 16 with the grounded test lead. Use a "bug clip" and an insulated pin socket that fits the pins on the bug clip.)

From this I determined the following pattern for an A2052 autolocated to the range 200000 to 3FFFFF:

  15 14 13 12 11 10 09 08   15 14 13 12 11 10 09 08   07 06 05 04 03 02 01 00
  15 14 13 12 11 10 09 08   15 14 13 12 11 10 09 08   07 06 05 04 03 02 01 00
           ^
  07 06 05 04 03 02 01 00   07 06 05 04 03 02 01 00

The bank addresses for the above bit enumerations, in the same positions, are (High Bytes - odd addresses, Low Bytes - even addresses):

  200001-27FFFF   300001-37FFFF   200000-27FFFE
  280001-2FFFFF   380001-3FFFFF   280000-2FFFFE
  300000-37FFFE   380000-3FFFFE

The position of my chip having the bad data cells (address incl 2Cnn94) is marked above with a caret.
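The lookup from a failing address and bit number to a bank and chip can be sketched in a few lines of Python. This is only a transcription of the bank table above, using its address labels as given. The function name locate_chip is hypothetical, and counting chips from the left within a group of eight assumes the chips sit in the order the bit numbers are printed in the diagram.

```python
# Sketch of the bank lookup, transcribed from the post's table for an A2052
# autolocated at 200000-3FFFFF. locate_chip is a hypothetical helper name.

BASE = 0x200000
TOP = 0x3FFFFF
SLICE = 0x80000  # each bank label in the table spans a 512K slice


def locate_chip(addr, bit):
    """Return (bank label, chip number counted from the left, 1 to 8)."""
    if not BASE <= addr <= TOP:
        raise ValueError("address outside the A2052 range")
    if not 0 <= bit <= 15:
        raise ValueError("bit must be 0..15")
    slice_base = BASE + ((addr - BASE) // SLICE) * SLICE
    if bit >= 8:
        # Groups numbered 15..08: labeled xx0001-xxFFFF in the table
        lo, hi = slice_base + 1, slice_base + SLICE - 1
        chip = 15 - bit + 1
    else:
        # Groups numbered 07..00: labeled xx0000-xxFFFE in the table
        lo, hi = slice_base, slice_base + SLICE - 2
        chip = 7 - bit + 1
    return f"{lo:06X}-{hi:06X}", chip


bank, chip = locate_chip(0x2C0194, 12)
print(bank, "chip", chip, "from the left")
```

For the failure described here (address 2C0194, bit 12) this picks the 280001-2FFFFF group and the fourth chip in it, which is the position marked with the caret. As a cross-check on the layout, eight groups of eight 256K-by-1 chips come to 2 megabytes, the size of the board.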
To repair the board, I chopped the chip out of the board with a pair of diagonal-cutting mini-pliers, then unsoldered the individual leg stubs one at a time with a temperature controlled soldering iron (750 degrees), gently pulling them out from the component side of the board with a pair of groove-jawed long-nosed mini-pliers while applying the soldering iron. Then, in a separate operation, the excess solder was sucked out of the holes using a solder sucker pop-gun on the circuit side of the board while applying the soldering iron from the component side.

I much prefer the pop-gun to the alternate approach, which is to sop up the solder using a copper-rosin wicking web. The temperatures have to be much higher for the wicking web, and it is likely that the pads may be scuffed and detached from the board while trying to press the wick into the hole. Note that the pin 8 and pin 16 PC board lands are pretty hunky, so it will take more heat to free those pins than is needed on other pins. Watch the work carefully. To stabilize things, I had the board clamped in the slotted rubber jaws of a PC card vise. So one does need some equipment to do this gracefully...

I then put the board back in the Amiga (sans one chip) to verify that I had blown away the correct bit in the correct bank. As that all checked out with a missing bit 12 in the proper bank, I then soldered in a new chip, a 41256-15 supplied by a Denver Amiga dealer at quite a few more bucks than I would have had to pay for it had I waited until Monday and drawn it from federal stocks. [I do use my personal Amiga to get my job done, so the chip did die in honorable federal service, and did deserve a regulation burial and a federal replacement :-) ]. A re-check with the MemBoardTest showed that the board was now working properly.

At this point in time it is evident that CBM has gone to socketed memory boards (yea!). Failed A2052 boards are thus probably regarded as junk by CBM and many repair facilities.
The mandated method of repair may be to buy an A2058 or get 32-bit memory on an A2620 '020 board. In any case, one probably shouldn't put very many A2052s in an A2000, as a matter of power dissipation. The A2058 runs quite a lot cooler, due mainly to the lower power density in the 1-meg DRAMs.

So I offer this information on the repair of A2052 memory cards in the light that they are dinosaurs in this day and age. While basically reliable, 41256-15 mem chips do die, usually at the most inconvenient of times, and it helps to know how to fix the boards on isolated weekends when you want to keep going no matter what...

Howard Hull
hull@hao.ucar.edu