mercurio@crash.CTS.COM (Phil Mercurio) (07/14/87)
The following is a description of a close call I had with my 20 MB Supra hard disk on my Amiga. Warning: those of you with queasy stomachs, who lie awake at night worrying about how thoroughly you've backed up your hard disk, may find the material below distressing. I will attempt to alleviate the suspense, however, by revealing now that there is a happy ending. I should also mention that this is a tad long-winded.

First, I should describe my hardware configuration. I have an old Amiga 1000 (acquired in Sept. '85) with a 20 MB Supradrive, an ASDG Minirack-C containing a 2 MB memory board plugged into the Supra controller's bus connector, and two external 3.5" drives (I have a long desk). All of what I describe occurred under Kickstart/Workbench 1.2.

Last night, a friend and I were attempting to use Carolyn Scheppner's wonderful Cmd program to generate a printout that we would later send to an HP LaserJet. We were using my PagePrint1.3 program (available on DevDisk0005), which pretty-prints C source, and both the C file being printed and the printer image file being generated by Cmd were on my Supra hard disk (in different directories). PagePrint accesses the printer by Open()'ing "prt:" -- nothing magical going on here.

After a few seconds of disk activity, the system froze (no Guru, just an unresponsive system). This didn't really bother us, since a screen grabbing program (Snatch, part of the commercially available WindowPrint) we'd been using earlier had been crashing repeatedly.

We administer the Amiga/Vulcan nerve pinch (CTRL-Ah-Ah) and the system reboots. The first command in my Startup-Sequence is Supramount, as it should be. Upon the attempt to be mounted, the Supra starts blinking its busy light madly and continuously (it normally mounts in a second or so). This went on for several seconds, maybe a minute, before my friend and I realized something was not right. We rebooted again, and again the system seemed to get stuck attempting to mount the drive, blinking that little busy light until the cows come home (there aren't many cows here in San Diego -- I'm from Chicago, where cows have been known to cause trouble in the past).

We attempted to boot off of a standard 1.2 Workbench disk (which does not attempt to mount the drive), and it came up fine. We edited the Startup-Sequence on my Workbench to remove just the Supramount command and booted from that disk -- no problems. Attempting to mount the drive again caused it to busy loop, again.

We rebooted again, this time shutting the power to the entire system off before it knew what hit it. We waited about 10 seconds, then turned the power back on (I have everything plugged into one power strip so I can turn everything on at once -- this has been working fine for months). The system kickstarted properly, but again busy looped at the attempt to mount the drive. This was starting to look serious.

Although I was reasonably well backed up, I had no desire to reformat and reconstruct an 85% full 20 MB disk. We had begun to suspect that the Supra was taking the initiative and attempting the reformatting for me. It was clearly time for action.

We had one observation upon which we could base a decision: the pattern of blinking of the busy light demonstrated a small amount of variability -- signs of life! I consulted with my friend, Phil Cohen, whose wisdom I have sought out and deferred to since we worked together at UCSD 9 years ago. He'd been witness to this entire melodrama. We decided that the drive probably either knew what it was doing, or else all was lost anyway. We decided to attempt to mount the drive again, and this time, to wait until the busy light stopped blinking. So we waited, and we waited, and then we waited some more.

If this were a 1940's black-and-white movie, the hands of an analog clock would be seen advancing at many times their normal rate, then the pages of a calendar would begin to tear off and flutter away. Seasons would change, the frost would melt, springtime birds would begin to chirp ... sorry, I get carried away.

Seriously, I was too distraught to think to time anything, but I would estimate that the light blinked for over 5 minutes. Occasionally, it would pause for a second or two, then continue. And then, it stopped.

I gingerly approached the keyboard to cd to dh0: -- it was there! I checked out a few important directories -- all there. I did an info -- the hard disk was as full as I had expected it to be. We toasted our good fortune and began backing up everything in sight.

And I've had no problems since then. It ran flawlessly for several hours more that night, and is still humming along this morning. The damn thing actually healed itself.

I have no idea why this occurred in the first place, and I have no desire to attempt to replicate it. I don't think the fault lies with either Scheppner's Cmd or with my PagePrint, but rather with the Supra itself. But then, it did fix itself, so it's hard to complain. All in all, I'm pleased with the Supra's performance, and would chalk this up to either good design on Supra's part or just plain dumb luck.

Phil Mercurio
DevWare, Inc.

mercurio@pnet01.CTS.COM Usenet
mercurio PeopleLink or GEnie
bryce@COGSCI.BERKELEY.EDU (Bryce Nesbitt) (07/15/87)
Sounds sort of like the AmigaDOS disk "validator". You described a
system crash, and when things came up again the drive spent 5 minutes
blinking its light.

If it's the validator then what happened was this: The crash prevented
AmigaDOS from performing a final update on the hard disk. Next
time things were booted AmigaDOS noticed something "strange" and decided
to validate the ENTIRE disk.

AmigaDOS keeps a lot of links that can be used to reconstruct most of
even a badly mangled disk. THIS IS A FEATURE! If you happened to zap
the dual copies of the FAT on an IBM disk you'd be "up data creek without
any file links". (Your data would be quite scrambled.)
-----------------------------
|\ /| . Ack! (NAK, EOT, SOH)
{o O} .
( " ) bryce@cogsci.berkeley.EDU -or- ucbvax!cogsci!bryce
U "Success leads to stagnation; stagnation leads to failure."
cmcmanis%pepper@Sun.COM (Chuck McManis) (07/15/87)
Phil, and others who will inevitably have this happen to them: I believe you have witnessed the disk-validator in the flesh. You see, when a file is being written and hasn't been finished yet and the machine crashes, the disk has to be revalidated before AmigaDOS will trust it. If this has happened to you on a full floppy, you know that it took nearly a minute to validate your 880K floppy. If it happens on your 20 meg hard disk, it doesn't take 20 minutes, but it takes a good long time. So yes, you will need to wait a while, cross your fingers, and hope it doesn't say 'Disk structure Corrupt, Use DiskDoctor to correct it.'

--Chuck McManis
uucp: {anywhere}!sun!cmcmanis BIX: cmcmanis ARPAnet: cmcmanis@sun.com
These opinions are my own and no one else's, but you knew that, didn't you.
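
A quick back-of-envelope check on the floppy and hard disk numbers above, assuming (purely for illustration) that validation time scaled linearly with capacity. The one-minute figure for a full 880K floppy is from the post; reading "20 meg" as 20 * 1024 KB and the function name are my assumptions.

```python
# Back-of-envelope only: assumes validation time grows linearly with capacity,
# which a faster drive will beat by a wide margin.
FLOPPY_KB = 880               # capacity of a standard Amiga floppy
FLOPPY_VALIDATE_MIN = 1.0     # "nearly a minute" for a full floppy (from the post)
HARD_DISK_KB = 20 * 1024      # assumption: 20 meg taken as 20 * 1024 KB


def linear_estimate_minutes(capacity_kb,
                            ref_kb=FLOPPY_KB,
                            ref_minutes=FLOPPY_VALIDATE_MIN):
    """Scale the floppy validation time up to a larger volume, floppy speed assumed."""
    return ref_minutes * capacity_kb / ref_kb


estimate = linear_estimate_minutes(HARD_DISK_KB)
print(f"Linear estimate for a 20 MB disk: {estimate:.1f} minutes")
```

The linear guess lands a little over 23 minutes, so a wait of about five minutes on a nearly full 20 MB drive is consistent with "not 20 minutes, but a good long time": the hard disk simply reads blocks much faster than a floppy does.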
higgin@cbmvax.UUCP (Paul Higginbottom SALES) (07/15/87)
In article <1385@crash.CTS.COM> mercurio@crash.CTS.COM (Phil Mercurio) writes:
$The following is a description of a close call I had with my 20 MB
$Supra hard disk on my Amiga.
$[...SNIP...]
$We decided to attempt to mount the drive again, and this time, to
$wait until the busy light stopped blinking. ...I would estimate that
$the light blinked for over 5 minutes. And then, it stopped.
$I gingerly approached the keyboard to cd to dh0: -- it was there!
$I checked out a few important directories -- all there. I did an
$info -- the hard disk was as full as I had expected it to be.
$Phil Mercurio
$DevWare, Inc.
What you had witnessed was simply the hard drive being VALIDATED by
AmigaDOS. Since it was SOOOOO full, it took forever (well.. 5 minutes).
This was caused by the fact that the disk bitmap probably didn't
checksum correctly, and caused AmigaDOS to rebuild it. And the bitmap
didn't checksum because a command had been writing to the disk when
the machine crashed so badly that DOS didn't have a chance to finish
writing the critical information back to the drive. But, thanks to
the redundancy in AmigaDOS, it is able to heal itself.
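
For the curious, here is roughly what "the bitmap didn't checksum" means. As far as I know, blocks in the original AmigaDOS file system are 512 bytes and carry a checksum longword chosen so that all 128 big-endian longwords sum to zero modulo 2^32, with the checksum in the first longword of a bitmap block. Treat those layout details as assumptions rather than a spec; the helper names are made up for this sketch.

```python
import struct

BLOCK_SIZE = 512  # assumption: standard 512-byte blocks


def longword_sum(block: bytes) -> int:
    """Sum every big-endian 32-bit word in the block, modulo 2**32."""
    words = struct.unpack(f">{len(block) // 4}I", block)
    return sum(words) & 0xFFFFFFFF


def checksum_ok(block: bytes) -> bool:
    """A block 'checksums' when all of its longwords add up to zero."""
    return longword_sum(block) == 0


def seal(block: bytes, checksum_index: int = 0) -> bytes:
    """Return a copy whose checksum longword makes the total come out to zero."""
    buf = bytearray(block)
    struct.pack_into(">I", buf, checksum_index * 4, 0)      # clear the old checksum
    total = longword_sum(bytes(buf))
    struct.pack_into(">I", buf, checksum_index * 4, (-total) & 0xFFFFFFFF)
    return bytes(buf)


# A made-up bitmap block: checksum slot first, then "all blocks free" bits.
bitmap = seal(b"\x00" * 4 + b"\xff" * (BLOCK_SIZE - 4))
print("clean bitmap checksums:", checksum_ok(bitmap))

# Simulate a write that never finished: one byte differs from what was sealed.
torn = bitmap[:100] + b"\x00" + bitmap[101:]
print("torn bitmap checksums:", checksum_ok(torn))
```

A block that fails this test cannot be trusted, which is why the file system falls back to rebuilding the bitmap from the directory and file links instead.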
Ya know, some people complain about the poor performance of AmigaDOS,
but ask yourself something - have you ever lost a disk because of
the DOS? In my experience it has ALWAYS been media failure, or
copy protection failure.
Paul Higginbottom.
davidlo@madvax.UUCP (David Lo) (07/16/87)
In article <8707150606.AA03673@cogsci.berkeley.edu>, bryce@COGSCI.BERKELEY.EDU (Bryce Nesbitt) writes:
> If it's the validator then what happened was this: The crash prevented
> AmigaDOS from performing a final update on the hard disk. Next
> time things were booted AmigaDOS noticed something "strange" and decided
> to validate the ENTIRE disk.
>
> AmigaDOS keeps a lot of links that can be used to reconstruct most of
> even a badly mangled disk. THIS IS A FEATURE! If you happened to zap
> the dual copies of the FAT on an IBM disk you'd be "up data creek without
> any file links". (Your data would be quite scrambled.)
>

I've noticed that the first time I boot from an almost full Workbench disk, the Amiga takes quite a while (1 to 3 minutes, or at least it seems that long) to load Workbench. Subsequent reboots take less time. I guess what happens is that the Amiga actually attempts to rearrange the segments of the Workbench disk. Is that true?

--
David Lo (415)939-2400 /\ o Varian Instruments, 2700 Mitchell Drive, Walnut Creek, CA 94598 \/ {ptsfa,lll-crg,zehntel,dual,amd,fortune,ista,rtech,csi,normac}varian!davidlo
rokicki@rocky.STANFORD.EDU (Tomas Rokicki) (07/16/87)
SCSI drives fix themselves. If your box crashes while doing a write to the drive, the SCSI drive will play with itself for a while until it repairs itself. The time this takes is dependent on the size of your disk partition; this is an excellent reason to partition a hard disk. Perhaps a 4MByte development partition where you develop your most crash-prone programs, so the box comes back up quickly.

Sometimes, as happened to my CLtd drive a long time ago, the light never stops flashing. (We are talking days here.) Then it's time to worry. I turned the system off for a week, turned it back on, and everything was back. I sent the drive back to CLtd anyway . . .

-tom
fnf@mcdsun.UUCP (Fred Fish) (07/16/87)
In article <2122@cbmvax.UUCP> higgin@cbmvax.UUCP (Paul Higginbottom SALES) writes:
>Ya know, some people complain about the poor performance of AmigaDOS,
>but ask yourself something - have you ever lost a disk because of
>the DOS? In my experience it has ALWAYS been media failure, or
>copy protection failure.

I suspect this is going to generate *lots* of heat and flames! The DOS seems to be absolutely the most fragile part of an otherwise nicely engineered system (aside from lack of an MMU which is probably my number one gripe).

I have never lost a single floppy to a media failure, after it passed format and verified, though I've run into a few that wouldn't format (far less than 1%). But then I always use top grade DSDD floppies. I have lost a couple that could be attributed to copy protection. But I've lost count of the number of floppies that have bit the dust because the system guru'd while a write to the floppy was in progress. I've also had to completely reformat and reload my hard disk several times because of the same problem.

Because I am probably more paranoid than most people about the filesystem reliability, I've backed up my stuff religiously and have not yet lost anything really important.

-Fred
--
= Drug tests; just say *NO*!
= Fred Fish Motorola Computer Division, 3013 S 52nd St, Tempe, Az 85282 USA
= seismo!noao!mcdsun!fnf (602) 438-5976o
daveh@cbmvax.UUCP (Dave Haynie) (07/16/87)
in article <1385@crash.CTS.COM>, mercurio@crash.CTS.COM (Phil Mercurio) says:
> Keywords: Amiga hard disk Supra horror story
>
> The following is a description of a close call I had with my 20 MB
> Supra hard disk on my Amiga.

Sounds like a long visit from the disk validator. This can happen when the system crashes somehow in the middle of a disk operation, even if the crash occurs at a relatively safe time, like when no disk activity is actually taking place. The disk's bitmap gets marked invalid, some operations take place, and before the disk gets marked valid again, the crash takes place. Next time you start up the disk's handler, it validates the bitmap, which on a nearly full 20Meg hard drive certainly could take some time. The SupraMount command is probably why this happened before you could see the disk; on a HD that gets mounted via BindDrivers, the disk usually shows up as soon as the system comes up, even though the validator may run for some time.

How is the Supra Drive? A friend asked me about a problem he's having with it in combination with a ComSpec memory card; maybe you or someone else out there has some ideas. He's got an Amiga with a 68010 (running DeciGel), the 2 meg ComSpec, and the Supra. When the ComSpec and Supra are used together, the machine gurus in WorkBench. Used separately, they work OK. And he hasn't been able to make them crash together from CLI. I'm not at all familiar with the Supra Drive, but that SupraMount command is the first thing that makes me suspicious.

> (there aren't
> many cows here in San Diego -- I'm from Chicago, where cows have been
> known to cause trouble in the past).

We have lots of cow problems here in PA, so I can sympathize. Can't keep the suckers out of the lab or the computer rooms....

> Phil Mercurio
> DevWare, Inc.
--
Dave Haynie Commodore-Amiga Usenet: {ihnp4|caip|rutgers}!cbmvax!daveh
"The A2000 Guy" PLINK : D-DAVE H BIX : hazy
"Catch a wave and you're sittin' on top of the world" -Beach Boys
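
The sequence described in the post above (mark the bitmap invalid, do the work, mark it valid again, and rebuild on the next mount if the flag is still invalid) can be modeled in a few lines. This is a toy in-memory sketch, not AmigaDOS code: the ToyVolume class, its fields, and the crash_before_commit switch are all invented for illustration, and the real validator walks on-disk directory and file header blocks rather than a Python dict.

```python
class ToyVolume:
    """Toy model of a volume with a free-block bitmap and a 'bitmap valid' flag."""

    def __init__(self, n_blocks):
        self.n_blocks = n_blocks
        self.files = {}                    # file name -> list of block numbers (the "links")
        self.free = set(range(n_blocks))   # stand-in for the on-disk bitmap
        self.bitmap_valid = True

    def write_file(self, name, count, crash_before_commit=False):
        self.bitmap_valid = False          # step 1: mark the bitmap invalid
        blocks = sorted(self.free)[:count]
        self.files[name] = blocks          # step 2: file links reach the disk
        if crash_before_commit:
            return                         # crash: bitmap is stale, flag stays invalid
        self.free -= set(blocks)           # step 3: bitmap updated
        self.bitmap_valid = True           # step 4: mark it valid again

    def validate(self):
        """Rebuild the bitmap by walking every file's block list."""
        used = set()
        for blocks in self.files.values():
            used.update(blocks)
        self.free = set(range(self.n_blocks)) - used
        self.bitmap_valid = True

    def mount(self):
        if not self.bitmap_valid:
            self.validate()                # the long wait with the busy light blinking


vol = ToyVolume(16)
vol.write_file("a", 3)
vol.write_file("b", 4, crash_before_commit=True)
print("valid after crash:", vol.bitmap_valid, "free blocks:", len(vol.free))
vol.mount()
print("valid after mount:", vol.bitmap_valid, "free blocks:", len(vol.free))
```

The rebuild has to visit every file on the volume, so in this model, as on the real drive, the cost grows with how full the disk is.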
mph@rover.UUCP (Mark Huth) (07/16/87)
In article <1385@crash.CTS.COM> mercurio@crash.CTS.COM (Phil Mercurio) writes:
>
>The following is a description of a close call I had with my 20 MB
>Supra hard disk on my Amiga.
> .........
>Open()'ing "prt:" -- nothing magical going on here. After a few seconds of
>disk activity, the system froze (no Guru, just an unresponsive system).
> .........
>We administer the Amiga/Vulcan nerve pinch (CTRL-Ah-Ah) and the system
>reboots. The first command in my Startup-Sequence is Supramount, as
>it should be. Upon the attempt to be mounted, the Supra starts blinking
>its busy light madly and continuously (it normally mounts in a second
>or so). This went on for several seconds, maybe a minute, before my
>friend and I realized something was not right. We rebooted again, and
>

This is normal behavior for a hard disk which has been rebooted with an invalid bit map! Those of you with hard disks, do not panic! The disk validator (I think) is performing its proper function. When the drive is mounted the validator checks out the drive, and discovers that the bit map on the disk is INVALID. This usually happens when the system has crashed with files opened for writing. Well, it takes a long time (5-10 minutes) for the validator to examine EVERY sector on the drive and determine if it is allocated to a valid file or not, and then repair the bit map. LEAVE IT ALONE while this is happening and give thanks to those that designed this program WHICH IS SAVING your hard disk from becoming a useless bit bucket.

I learned this from hard experience. I am affiliated with a company which is developing yet another hard disk for the Amiga. It works through the parallel port (groan - but there are good points to this as well) and is rather inexpensive. I'll give more details of this RSN, as the last bugs are being ironed out. Anyway, during the course of development of drivers and backup utilities, I have left many a hard drive in limbo. The first time that I rebooted and the light came on and stayed on I was quite discouraged - so much so that I couldn't do anything for several minutes. Fortunate indeed, as I discovered what was really happening when the light went out and the startup sequence continued. Well done! (Actually, shades of Un*x after a crash.)

Another thing that I have learned is that the DiskDoctor program does work on properly designed hard drives. Having gotten the drive to work well enough that I refused to work without it, I managed to discover some lurking hardware bugs that occasionally corrupted the data on the disk. After repairing these bugs (not yet having developed the backup program) I was faced with many busted directories. So, with nothing to lose, I ran DiskDoctor. It ran, and ran, and reported its completion. I did a dir on the drive and found to my horror that it was empty! I poked around with a disk dumping routine I've written, and discovered that everything was intact, but disconnected from the root directory. Well, thought I, this can be patched (and if Commodore would get off their duffs and send us the developer kit for which we have paid, I might have used DiskEd). I decided to rerun DiskDoctor, and after some thrashing and the trashing of some unimportant files, it completed. Now the disk was intact!! Someone has a sense of humor, though, because what used to be JE_sys_disk was now called Lazarus. I ran development on it for several more weeks until just this weekend I was able to get both backup and restore to perform useful work.

FLAME ON - full heat to Commodore

Jefferson Enterprises played your silly developer games, and was granted the privilege of sending in our 50 bucks. This procedure took several months, but three months ago, WE SENT IN OUR MONEY. We called a month later and were told that the management shakeup had slowed things down a bit - be patient. Two more months have elapsed, and now we get an answering machine and most recently a recording telling us that the number had been disconnected. We really don't want to turn Commodore in to the Postmaster General for mail fraud, but WHERE IS THE DEVELOPER KIT!!!!!!!!! You took our money, using the US Mail, now I want the goodies! We would really like to make hardware for the expansion port but I find that difficult without the bus timing information that is supposed to be in the developer kit. We got plans for a really spiffy controller that will have performance approximating a RAM:disk but WE NEED INFORMATION!

FLAME OFF - enter keyboard cool down period.

Mark Huth
seismo!nogo!mcdsun!rover!mph

>Phil Mercurio
>DevWare, Inc.
>
>mercurio@pnet01.CTS.COM Usenet
>mercurio PeopleLink or GEnie
peter@sugar.UUCP (Peter da Silva) (07/19/87)
> Ya know, some people complain about the poor performance of AmigaDOS,
> but ask yourself something - have you ever lost a disk because of
> the DOS? In my experience it has ALWAYS been media failure, or
> copy protection failure.
>
> Paul Higginbottom.

So the only criterion we should apply to a file system is whether or not it crashes the disk. Great.

Yes, of course I disagree. Speed is important. It shouldn't take more time to get a directory listing than to read a file of the same size. This means that on the Amiga floppy you should not have to do more than one disk access to get the current directory until the total size of all the file headers in the current directory exceeds 5K.

At the very least you should *attempt* to put the file headers contiguously after the directory, to the point of not allocating any other blocks on a directory track until there simply isn't any more room elsewhere. Ideally you should do a UNIX-style file system with preferential caching of inodes, then directories, then files. It's also probably a good idea not to cache large files that are being read sequentially.

Yeh, and how about sequential block prefetch and disk seek optimisation. Then you get loadseg to request all blocks at once, to cut out the dreaded "run grind" you get after you've run a program (or selected it from the workbench) and then thoughtlessly typed another command that accesses the same disk.

Of course, your mileage may vary.
--
-- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter (I said, NO PHOTOS!)
carolyn@cbmvax.UUCP (Carolyn Scheppner CATS) (07/21/87)
In article <419@rover.UUCP> mph@rover.UUCP (Mark Huth) writes:
>[]
>FLAME ON - full heat to Commodore
>
>Jefferson Enterprises played your silly developer games, and was granted the
>privilege of sending in our 50 bucks. This procedure took several months, but
>three months ago, WE SENT IN OUR MONEY. We called a month later and were told
>that the management shakeup had slowed things down a bit - be patient. Two more
>months have elapsed, and now we get an answering machine and most recently a
>recording telling us that the number had been disconnected. We really don't
>want to turn Commodore in to the Postmaster General for mail fraud, but
>WHERE IS THE DEVELOPER KIT!!!!!!!!! You took our money, using the US Mail, now
>I want the goodies! We would really like to make hardware for the expansion port
>but I find that difficult without the bus timing information that is supposed
>to be in the developer kit. We got plans for a really spiffy controller that
>will have performance approximating a RAM:disk but WE NEED INFORMATION!
>
>FLAME OFF - enter keyboard cool down period.
>
>[]

GOOD NEWS: Lauren says that she has spoken to you about the delay. Your package was mailed about 2 months ago to a Post Office Box which was the wrong address, and we were not aware of this until it was returned to us recently. I am not sure if the PO Box address was supplied on your application or if that was an error at this end. We apologize for the delay. Your package was remailed to your correct address last week and you should be receiving it very soon.

BAD NEWS: The Certified developer package does not include DiskEd (which will be on the Software Tools product), and does not include bus timings (these are in the A1000 Schematics and Expansion Specs). It does include a 1 year Amigamail subscription, $20 BIX discount coupon, 1.2 Enhancer (which may be returned un-opened for any of our $20 support materials), and a discount hardware price list. The hardware discounts alone are worth the $50.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Carolyn Scheppner -- CBM >>Amiga Technical Support<<
UUCP ...{allegra,caip,ihnp4,seismo}!cbmvax!carolyn
PHONE 215-431-9180
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
carlos@io.UUCP (Carlos Smith) (07/22/87)
Dave Haynie had mentioned his suspicion of the "Supramount" command. It so happens that I recently spoke with a tech support guy at Supra, and asked him about it. He said that the supramount command uses the information in the mountlist along with a file called Supra.0 in the devs: directory to mount the drive. All that is in Supra.0 is partitioning information for the drive. He said that if one were to examine the information in Supra.0 and use it to build the additional partitions in the mountlist, one could then just use the normal mount command.

I examined this file, and sure enough, it seemed to mostly be low cylinder and high cylinder numbers for the partitions. Then looking at the mountlist information, it appeared obvious how to explicitly define separate partitions for the hard disk using this information. I will try this next time I repartition the disk, and see if it really works this way.

I guess they do this because they have a nice utility that you can run at disk configuration time that lets you easily define partitions using gadgets for the number and size of partitions. The mountlist contains only information for the dh0: device. I figure that rather than mucking around with the mountlist when you set up partitions, or making the user do it, they set up this file and let the supramount command do it "magically". They also say that this setup causes no problems with the new, hard-disk optimized file system which they say they have been testing.

Anyway, I am quite happy with the Supra. I run it daisy-chained with a CLtd Amega board and have never had problems (the AMega is inboard of the Supra controller). It's also nice to have a clock-calendar built in (though I have had the date trashed by particularly violent crashes - copper going crazy, weird sounds from the audio, etc.).

--
Carlos Smith
uucp:...!harvard!umb!ileaf!carlos
Bix: carlosmith
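
To make the low cylinder / high cylinder idea above concrete, here is a small sketch of how a partition's size falls out of its cylinder range. LowCyl, HighCyl, Surfaces and BlocksPerTrack are the usual mountlist keywords, but every number below is hypothetical: I have not seen the contents of Supra.0, and the geometry is just a plausible shape for a drive of roughly 20 MB.

```python
BYTES_PER_BLOCK = 512      # assumption: standard 512-byte blocks

# Hypothetical drive geometry; real values come from the drive's documentation.
SURFACES = 4
BLOCKS_PER_TRACK = 17

# Hypothetical two-partition layout: name -> (LowCyl, HighCyl), both inclusive.
PARTITIONS = {"dh0": (0, 305), "dh1": (306, 611)}


def partition_bytes(low_cyl, high_cyl,
                    surfaces=SURFACES, blocks_per_track=BLOCKS_PER_TRACK):
    """Size in bytes of a partition spanning low_cyl..high_cyl inclusive."""
    cylinders = high_cyl - low_cyl + 1
    return cylinders * surfaces * blocks_per_track * BYTES_PER_BLOCK


def overlaps(a, b):
    """True if two inclusive (low, high) cylinder ranges share any cylinder."""
    return a[0] <= b[1] and b[0] <= a[1]


for name, (low, high) in PARTITIONS.items():
    size_mb = partition_bytes(low, high) / (1024 * 1024)
    print(f"{name}: cylinders {low}-{high}, {size_mb:.1f} MB")

print("ranges overlap:", overlaps(PARTITIONS["dh0"], PARTITIONS["dh1"]))
```

Whatever the real numbers are, the check that matters when hand-writing extra mountlist entries is the last one: no two partitions may claim the same cylinder.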
charles@hpcvcd.HP (Charles Brown) (07/30/87)
>> Ya know, some people complain about the poor performance of AmigaDOS,
>> but ask yourself something - have you ever lost a disk because of
>> the DOS? In my experience it has ALWAYS been media failure, or
>> copy protection failure.
>> Paul Higginbottom.

I am not sure it is so clear whether a particular failure is caused by the operating system or by the media. Well, maybe the failure really is in the media, but the operating system is not very convenient about recovering the disk. I want to be able to recover as much as possible from a corrupt file. AmigaDo* PROTECTS me from that.

>At the very least you should *attempt* to put the file headers contiguously
>after the directory, to the point of not allocating any other blocks on a
>directory track until there simply isn't any more room elsewhere. Ideally you
>should do a UNIX-style file system with preferential caching of inodes, then
>directories, then files.
>-- Peter da Silva `-_-' ...!seismo!soma!uhnix1!sugar!peter

One of the nice things about Un*x, with its inodes, is linking. Is the Amig* file system able to do this? I sorely miss it.

Charles Brown
hplabs!hp-pcd!charles