guineau@star.enet.dec.com (W. John Guineau) (04/22/91)
Open question with no personal opinions:

Is anyone using SCSI devices in a WARM SWAP mode? By swap, I mean physically removing a device from the bus and replacing it with another.

Some definitions for reference:

COLD SWAP: everything powered off, peripherals swapped, everything powered up.
WARM SWAP: everything powered on, NO activity on bus, peripherals swapped.
HOT SWAP: everything powered on, activity on bus, peripherals swapped.

What are the risks associated with this? Electrical experts? SCSI experts? ANSI?

--
W. John Guineau                  grep meaning life | more
VMS Development
Digital Equipment Corporation
guineau@star.enet.dec.com
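The three definitions above come down to two yes/no questions: is the system powered, and is the bus active? A minimal Python sketch of that mapping follows. The names `SwapMode` and `classify_swap` are illustrative only and do not come from the thread or from any SCSI document.

```python
from enum import Enum


class SwapMode(Enum):
    """The three swap modes as defined in the post above."""
    COLD = "cold"
    WARM = "warm"
    HOT = "hot"


def classify_swap(powered_on: bool, bus_active: bool) -> SwapMode:
    """Map the system state at swap time onto the thread's three terms."""
    if not powered_on:
        # With everything powered off there can be no bus activity,
        # so the bus_active flag is ignored here.
        return SwapMode.COLD
    return SwapMode.HOT if bus_active else SwapMode.WARM
```

For example, `classify_swap(True, False)` is the warm-swap case the question asks about.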
kaufman@neon.Stanford.EDU (Marc T. Kaufman) (04/22/91)
In article <22249@shlump.nac.dec.com> guineau@star.enet.dec.com (W. John Guineau) writes:
>Open question with no personal opinions:
>Is anyone using SCSI devices in a WARM SWAP mode?
>WARM SWAP: everything powered on, NO activity on bus, peripherals swapped.
>What are the risks associated with this?

In my experience, the most common problem is mating the SCSI cable to the connector. Often, people angle the connectors to get them to mate at one end first before seating them. This can cause adjacent lines to short together. In the case of signal lines, the host or a peripheral may see an unwanted command (such as bus reset). In the case of the TERMPWR line, you may blow the fuse (as happens on Macs).

Marc Kaufman (kaufman@Neon.stanford.edu)
scion@cs.utk.edu (Sam C. Nicholson II) (04/23/91)
In article <22249@shlump.nac.dec.com> guineau@star.enet.dec.com (W. John Guineau) writes:
>
>Is anyone using SCSI devices in a WARM SWAP mode?
>
>COLD SWAP: everything powered off, peripherals swapped, everything powered up.
>WARM SWAP: everything powered on, NO activity on bus, peripherals swapped.
>HOT SWAP: everything powered on, activity on bus, peripherals swapped
>
>What are the risks associated with this?
>

I was recently writing drivers for WORM drives and would frequently swap drives around. Can't say as I recommend it as good for the health of your running system. The reconnection is going to generate a SCSI reset. Some OSs just will not tolerate them.*

I think that I have treated devices on a SCSI bus every bit as cavalierly as some C programmers treat pointers and integers and as hardware hackers have treated devices on a UNIBUS. I have also panicked and rebooted often. As a power-cycle time saver, I do feel that I saved more time than I lost. Just remember where your terminators are.

It is difficult to say whether there is activity on the bus or not. I certainly would not trust the absence of LEDs or the sound of swishing heads as a certain indicator of a quiescent bus. If I had an emergency swap (e.g. the tape drive failed and I just had to get a crash dump or a backup without a reboot), I would halt the processor (L1-A on my Sun, ^P on my VAX; your mileage may differ) and feel that that would halt most activity on the bus, do the swap, and continue, hoping for the best.

I don't believe that I have read anything about the SCSI bus that would give me any confidence in the correctness of dis- and re-connecting devices with regard to the electrical connections. I believe that the standards folk would not assume that activity as normal.

For production use, I would *strongly* recommend removable media devices. Sony, LMS, Bernoulli, and Syquest come to mind for disks; Archive, Wangtek and Tandberg likewise for cartridge tapes.

GAWD, I ramble on...

-sam
--------------
* They feel that THEY (being the bus master) have an exclusive domain over the reset line. They are wrong, but they are in use.
ben@epmooch.UUCP (Rev. Ben A. Mesander) (04/23/91)
>In article <hamilton.672457277@kickapoo.cs.iastate.edu> hamilton@kickapoo.cs.iastate.edu (Jon Hamilton) writes:
>I'm amazed that you people would think of plugging / unplugging drives with
>the power on. I assumed the first post was a joke, but I see that I was in
>error. Do y'all plug/unplug boards with the power on too? Are you _really_
>too lazy to restart your machines? Are you _that_ willing to risk your data
>or even your hardware? Weren't you ever taught that you don't do stuff to the
>inside of a puter with it on?!

There are legitimate reasons for such tomfoolery. I used to write firmware for Imprimis. I would take my Wren 5's and plug and unplug them from the bus. Of course I'd also pick them up while running and drop them... I got to know the hardware pretty well, as in "tap that capacitor hard and the drive will generate a servo error." It was a real handy way to test my firmware under adverse conditions. At one time, I was writing a lot of firmware to ensure drive survivability under such conditions.

--
| ben@epmooch.UUCP   (Ben Mesander)       | "Cash is more important than |
| ben%servalan.UUCP@uokmax.ecn.uoknor.edu | your mother." - Al Shugart,  |
| !chinet!uokmax!servalan!epmooch!ben     | CEO, Seagate Technologies    |
david@talgras.UUCP (David Hoopes) (04/23/91)
In article <22249@shlump.nac.dec.com> guineau@star.enet.dec.com (W. John Guineau) writes:
>
>Is anyone using SCSI devices in a WARM SWAP mode?
>WARM SWAP: everything powered on, NO activity on bus, peripherals swapped.

Some of the people around here do this on a regular basis. Every so often they have to replace fuses. You get some really interesting errors when the fuses are blown. If you are not willing to replace fuses fairly frequently then DON'T DO IT.

--
---------------------------------------------------------------------
David Hoopes                       Tallgrass Technologies Inc.
uunet!talgras!david                11100 W 82nd St.
Voice: (913) 492-6002 x323         Lenexa, Ks 66214
acoolidg@wpi.WPI.EDU (Aaron P Coolidge) (04/24/91)
Hi. I just tried a warm swap on my PC (386sx, 4MB, WD7000 FASST2 SCSI, Quantum 105S). I tried adding my spare drive (a Miniscribe *ugh* 20M) while the machine was up and running. Plugged the sucker into the external port, tried a dir of the Quantum, it came up OK, tried a dir of the Miniscribe, got an "invalid drive spec" error. Fine. I <ctrl> <alt> <del> 'd it, and everything came up OK (I could read both drives).

Then I unplugged the Miniscribe, and tried a dir on the Quantum. Nothing- the machine just locked up. Fine. I warm booted it again, and got an "int 19h boot failure". So I reset it- another "int 19h boot failure". Terrific, I thought, I just toasted the WD7000! Power off, wait 1 minute, power on, another "int 19h boot failure".

Wait a minute- maybe the boot block's corrupted! Boot the machine off a floppy. Try a "dir c:". Get: "invalid drive spec". Lovely! Try FDISK. What?! No partitions defined?! YES, IT'S TRUE!!! Ugh! Spend an hour reformatting and reloading from tape.

Can anyone shed any light on what may have happened here? All the partition info had been wiped off the Quantum (the Miniscribe was fine), with the result that I couldn't get my data off! No big deal, but a pain nonetheless. Was this due to the WD7000, or should I just power off before I plug in / remove SCSI devices? I guess I should!

PS. I plug in and unplug SCSIs all the time with my Amiga, with it on, and have had no problems yet.

--
Aaron Coolidge    acoolidg@wpi.wpi.edu    bitnet: sorry, use a gateway.
"I'm always in control of my car. Well, at least 70% of the time."
hamilton@kickapoo.cs.iastate.edu (Jon Hamilton) (04/24/91)
I'm amazed that you people would think of plugging / unplugging drives with the power on. I assumed the first post was a joke, but I see that I was in error. Do y'all plug/unplug boards with the power on too? Are you _really_ too lazy to restart your machines? Are you _that_ willing to risk your data or even your hardware? Weren't you ever taught that you don't do stuff to the inside of a puter with it on?!

--
Jon Hamilton    hamilton@kickapoo.cs.iastate.edu
" I feel a lot more like I do now that I did before I got here " - can't remember who
kaufman@neon.Stanford.EDU (Marc T. Kaufman) (04/24/91)
In article <hamilton.672457277@kickapoo.cs.iastate.edu> hamilton@kickapoo.cs.iastate.edu (Jon Hamilton) writes:
> Weren't you ever taught that you don't do stuff to the
>inside of a puter with it on?!

Depends on the computer. All telephone stuff and some fault-tolerant systems (like Tandem) are designed to have stuff plugged and unplugged with the power on.

It's not overly much work to design PC compatible stuff that won't be injured in a power-on plug-in =provided= you don't short the connector traces. It's too bad that most manufacturers would rather save $0.10 than do it, though.

Things like SCSI busses, if implemented per the standard, are relatively immune from gross hardware catastrophe because the drivers are current limited. Again, it's too bad that some folks who program the device firmware don't do sanity checking if the bus accidentally wiggles. Maybe the SCSI-3 standard should specify a minimum performance standard with respect to bus errors.

Marc Kaufman (kaufman@Neon.stanford.edu)
dtb@adpplz.UUCP (Tom Beach) (04/25/91)
Another point which hasn't been mentioned is that many systems do an autoconfig when powered up, and any device that isn't on the bus at power on isn't EVER on the bus, whether you add it physically or not.

In my testing I frequently swap devices with the SCSI bus powered. My devices under test are external to the host, with separate power supplies, and on a second SCSI bus. I power them down, swap devices, and power them back up. Then I do a dummy access to clear the "ERROR: Device has been reset!" messages, and continue the test. Except for the rare bus power fuse failure, this works great, but on many systems You Can't:

1) Add new devices!
2) Change device categories, e.g. replace a disk with a tape!

As I said, just my two cents!

Tom

------------------------------------------------------------------------
| Tom Beach : Sr Project Engineer : Mass Storage Technology            |
| phone : (503) 294-1541                                               |
| email : uunet : dtb@adpplz.uucp                                      |
| ADP Dealer Services, ADP Plaza, 2525 S.W. 1st Ave, Portland OR, 97201 |
------------------------------------------------------------------------
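The "dummy access" step above can be sketched in Python. After a reset or power cycle, a SCSI target typically fails the first command with sense key UNIT ATTENTION (06h), and a harmless command such as TEST UNIT READY is a common way to consume it. Everything else here is an assumption for illustration: `clear_unit_attention`, the `test_unit_ready` callable, and the `FakeDrive` stand-in are hypothetical and talk to no real hardware or driver API.

```python
NO_SENSE = 0x0        # sense key: command completed normally
UNIT_ATTENTION = 0x6  # sense key: device was reset or power-cycled


def clear_unit_attention(test_unit_ready, max_tries=3):
    """Issue dummy accesses until the device stops reporting UNIT ATTENTION.

    `test_unit_ready` is a hypothetical callable that sends one
    TEST UNIT READY command and returns the resulting sense key.
    Returns the number of commands that were issued.
    """
    for attempt in range(1, max_tries + 1):
        key = test_unit_ready()
        if key == NO_SENSE:
            return attempt
        if key != UNIT_ATTENTION:
            # Anything else is a real error, not a leftover reset notice.
            raise RuntimeError(f"unexpected sense key {key:#x}")
    raise RuntimeError("device still reports UNIT ATTENTION")


class FakeDrive:
    """Stand-in for a freshly power-cycled drive (illustration only)."""

    def __init__(self):
        self.pending_reset = True

    def test_unit_ready(self):
        # The first command after the power cycle reports the reset once.
        if self.pending_reset:
            self.pending_reset = False
            return UNIT_ATTENTION
        return NO_SENSE
```

With the fake drive, the first call to `clear_unit_attention(drive.test_unit_ready)` absorbs the reset notice and later calls succeed on the first command. A real test harness would replace `FakeDrive` with whatever pass-through interface the host's SCSI driver offers.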
chugins@hpcupt1.cup.hp.com (Chris Hugins) (04/27/91)
Some mid-range computers do allow peripherals to be powered on and off, and even replaced, without interrupting other users from performing their tasks. One example is the tape drive, which may be seldom used. Another is that of a disk containing a private data set which may be moved from machine to machine. Not to mention the replacement/repair of a bad peripheral.

Some database and manufacturing shops do not allow any (ANY) downtime. Turning off the computer is not an option. SCSI is not just for desktops anymore. Unfortunately, the design of the Small Computer System Interface (even "2") has not fully become cognizant of this fact.

The ability to power devices on and off without corruption of data across a common bus with other devices is important. It is not completely clear if SCSI allows this, due to "noise" generated on the bus at power-on (and OFF!) of SCSI peripherals. Some are "quieter" than others, dependent upon the method of filtering at the line drivers.

To remove/replace peripherals/cables on SCSI (even on a quiesced bus) is extremely risky. Basically, it's "Do you feel lucky, punk?"

Maybe with SCSI-3....

Chris T. Hugins
chugins@hpisoa2.hp.com
ritchie@hpdmd48.boi.hp.com (David Ritchie) (04/29/91)
Auspex has special gizmos that do this in their file servers, so it can be done.....

--
Dave Ritchie
ritchie@hpdmd48.boi.hp.com
Rob_Steven_Kramarz@cup.portal.com (05/05/91)
Experimentally, I have found that warm swap works consistently. I have never tried a hot swap and do not intend to, since theoretically it is disastrous.

If you do plan to do a warm swap, make sure that the bus truly is inactive (unmount all file systems, or use a driver which allows the on-off status of the bus to be toggled at the driver level).

My expertise in this area derives from our work at 1776, Inc. on disk mirroring and disk array device drivers, where this question is very germane to our customers.
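The "unmount all file systems" check can be sketched as a scan of the mount table before anyone touches a cable. This Python sketch assumes a Linux-style table (`device mountpoint fstype options ...` per line, as in `/proc/mounts`), which is not something the thread itself mentions. The function name `mounts_on_device` is hypothetical, and an empty result only shows that no file system is mounted, not that the bus is electrically quiet.

```python
def mounts_on_device(mount_table, device_prefix):
    """Return the mount points still backed by devices matching the prefix.

    `mount_table` is the text of a Linux-style mount table, one mount per
    line, with the device in the first field and the mount point in the
    second. An empty result means nothing from that device is mounted.
    """
    busy = []
    for line in mount_table.splitlines():
        fields = line.split()
        # Skip blank or malformed lines; compare on the device field only.
        if len(fields) >= 2 and fields[0].startswith(device_prefix):
            busy.append(fields[1])
    return busy
```

On a Linux host this might be called as `mounts_on_device(open("/proc/mounts").read(), "/dev/sdb")`, with the swap going ahead only if the returned list is empty.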