ghg@ea.ecn.purdue.edu (George Goble) (12/13/88)
In April, 1987, we started driver development on the 2nd BETA Exabyte drive on a Sun-2/120. Most of the initial headaches resulted from deficiencies with the Sun-2 SCSI system. For a couple of months work continued on the Sun-2, but was later moved to a Sun-3/50 (SunOS 3.2/3.3/3.4). The Sun-2 could not handle the disconnect, save-data-pointer SCSI commands which Exabyte began to require in the summer of 1987. Also, the Sun-2 was just too slow to stream the drive while data-comparing each byte. Initially, the Sun-3/50 would run the drive, but not in disconnect mode, since the Exabyte would sometimes disconnect on odd byte boundaries. Exabyte finally added a vendor bit to force disconnects on even bytes, which then allowed full disconnect/reconnect operation on the 3/50. Summer of '87 also uncovered a myriad of problems: broken tapes, nasty jams, motor and brake problems, and lots of "hang" conditions, which were all nice to work out before the product was up for sale. I was on the phone several times/day with the person doing the drive firmware at this time. I am very grateful to Exabyte for letting us have BETA drives, which allowed us and a few others to uncover quite a few bugs before the drive hit the market. There were few remaining problems when the drive went up for sale in Sept 1987. Most of the problems one would "expect" from an 8mm Camcorder drive were not there. Tape interchange problems (density was 35 million bits/sq inch), tape "wearing out", being affected by weak magnetic fields, or not being able to read it back after several months, just did not happen. The early prototype drive "P4-6" didn't have firmware (June 87) to spin down the rotary head when the drive was not active, so a human had to be present during testing to unload the tape if the firmware hung, etc. One day, I got called away to fix a crashed system someplace, left the Exabyte test running, and it of course hung, and left the head spinning with the tape loaded... for 45-60 minutes!
To my amazement, the tape was not visibly damaged, and the drive read the area of tape where the head had been left spinning with no errors! Both we and Exabyte tested several brands of tape, and both agreed that the SONY tapes worked the best (we found FUJI a close second, TDK and Maxell to be the worst). Exabyte's lab testing showed tapes would not even begin to "degrade" until around 800-900 passes, but due to their super ECC methods, they were usable well beyond 2000-3000 passes. I don't think anyone puts near this many passes on their backup tapes. We were never able to "wear out" a tape (few hundred passes on a short piece of it). We ran a drive for 24 hours/day for 2 1/2 months. It held up mechanically fine. It was only cleaned once and didn't need it. This "lifetime" test simulated approximately how long a head would last on a conventional tape drive. (You can almost buy a new Exabyte for the price of replacing a head on a 1/2" drive.) As far as reading back old tapes goes, I can still read back, without trouble, tapes I made in July 1987. Anything before that is not compatible with the current Exabyte format. Signal levels at the head preamp output (TP3, rightmost test point on top of R/W card) are comparable to recently made tapes (within 5-10%). The ECC is phenomenal. Tape is written in "stripes" diagonally across the tape. Each stripe contains eight 1024 byte sectors plus 400 bytes of ECC (per sector) plus control and servo information. The drive does full read-after-write checking, and even a single bit error causes the affected block to be rewritten later on the tape in a different relative position in the stripe. All this is sorted out on readback. ECC can correct a 263 byte burst error in a 1024 byte sector plus something like 80 one and two bit errors in the same sector. Compare that to your disks! The fall of 1987 brought the ramp up of production, and the advent of high speed filemark positioning (10X read speed).
As with the ramp up of almost any new product, some problems developed when components were purchased in mass quantities. As usual, we always ended up with drives made from the "bad" batches. The most notable was the "3-1/2 week disease" which started in drives built or upgraded starting Dec 1987. Drives would run, almost to the day, 3-1/2 weeks of power up time and then start doing flakey things, especially having read problems. One could see the head preamp output on a scope fading in and out. This was finally "cured" around May 1988; turned out to be capacitors "leaking" very slightly, hosing up the delicate DC bias in the head preamp circuit. A 6 week disease was then evident, which also caused "flakey" operation, and the appearance of tape interchange problems. It was tracked down to a critical capacitor in the head Synch circuit slowly drifting in value, causing the head synch to drift out of acceptable limits in 6-9 weeks. All 13 of our drives had both 3-1/2 and 6 week diseases (some went back 5 or 6 times). Head synch problems were laid to rest for good during the last week of June 1988 by switching to a very good grade capacitor. There were also scattered problems, like "mode" motor problems, and bad crystal oscillators, but those were fairly rare here. From the last week in June, 1988, until the present, it has been smooth sailing. Only 1 drive out of 13 has failed since end of June (5 months!) , compared with 100% of the drives every 6-9 weeks before that. The one drive which failed just needed a realignment and died in the first couple of days. Maybe it got dropped in shipping. If you have drives which were built or upgraded before 6/29/88, it would be a very good idea to send them back to be upgraded. If your drive does not have a playing-card sized sticker on the top lid, it has probably been built or upgraded before mid June '88 and should be upgraded. 
We break almost everything we get here, and have appreciated the long working relationship with Exabyte in turning the drive into a solid product. Currently we see about 1 unrecovered write error/month (for all 12 drives total). We read back (essentially dd to /dev/null) all tapes made. There was one case (late Aug) where a tape was written ok and would not read back (error at the load point). There was some problem synch'ing up at the "load point" (firmware bug) and this tape was successfully read using MX firmware 4$23 (now released). This only happened once. Once tapes have been written and read back once, there have been no read failures since 6/29/88 (the head sync cap fix). Also, we have only really "lost" one tape. It was heated to between 110 and 115F in the summer heat wave in an unairconditioned apartment (Spec is around 105F I think). It was observed in pre July 88 drives that during the gradual failures, it was possible to write a tape ok, and have it fail on the readback. Theory was that the read-after-write did indeed work, but that after a pause, the drive servo became sloppy (due to something being sick already), and smashed already written and verified data. This has not been observed on drives after July 88. Out of general paranoia, we still read back (dd to /dev/null) EVERYTHING; for an unattended overnight dump, this takes no more (human) time. Around once every week or two, a few drives running MX 4$22 firmware (summer 1988) would just unload the tape for no apparent reason, and Unix would take a "no cartridge loaded" error. This was caused by the servo code being overly paranoid about motor ramp ups, etc. It would occasionally "panic" and emergency unload the tape. Currently released firmware MX-4$23/SV-B017 fixes this. We have run around 200 tapes (approx 60 of them almost full) since July 1988. Bitwise each Exabyte tape is equivalent to 16 reels of 2400 ft 6250 BPI (long records) magtape.
In actual practice, it is more like 22 or 23 reels (doing dumps) since filesystems are usually started on new reels and Exabyte reclaims all that wasted space as well. We have put as many as 250 filesystems (incremental level 9) on a single tape. That takes around 10 hours (due to the network), now we use 3-4 drives and do 50-80 filesystems on each one. The advertised capacity of 2.3GB is only obtainable if the drive is run in streaming mode (must be fed at 15 MB/min) AND your driver knows how to deal with the LEOT warning zone. The LEOT is a "warning" that PEOT (real end of tape) is "near". 230 MB still remain though on a 120 Min tape. The drive issues a SCSI check condition (does partial xfer if in fixed mode) and the transfer must be restarted by the device driver. 4$23 firmware issues a check on every write after the LEOT. The drivers from Perfect Byte (Sun3.4), Ciprico, and possibly Artecon, are the only ones I know of which handle the LEOT (4$22 or earlier) and let you use the last 230 MB on the tape. It is a real pain to make that work correctly. (Let me know if any other drivers exist that can deal with the LEOT). Most drivers just return hard error when hitting the LEOT. Running the drive start/stop (like dump via a network) results in the loss of 200-300MB or so more. 4$23 firmware allows the driver to set the "pad stripes" to be zero (right 3 bits set to 0 of the 2nd vendor uniq byte on a mode select command). This reduces the nonstreaming loss to around 100MB. Most Unix users are probably ending up with a driver which is run non-streaming, and can't handle the LEOT, so they should figure on a capacity of 1.8GB instead of 2.3GB. Anything which uses lots of filemarks (VMS, ANSI formats, etc) loses severely! A filemark consumes 2.2MB of tape (you can put an entire DEC "RK05" disk image inside one Exabyte filemark, for you oldtimers) This summer, Exabyte added a "short" filemark (set "vendor unique" byte 5, bit 7 in the Sun cdb (vu_57) to get a short mark). 
The short filemark only consumes .86" (.488MB) of tape (60 tracks) vs 3.8" (2.2MB) of tape (270 tracks) for a regular filemark. The short filemark cannot be overwritten like the regular one can. Comments in the next paragraph only apply to "regular", not short filemarks. Some companies selling devices and drivers to deal with the VMS world force (or default to) short filemarks. Due to the helical scan and the erase mechanism, there is a writing limitation on Exabyte drives. "tar r" will not work ("tar c" is ok). One can only start writing at 1) beginning of tape, 2) on the end of what was last written, 3) "front" side of a (regular) filemark. Say, you have a tape with 3 tar files on it and want to save the first file, and you want to begin writing over the 2nd file. Normally you would "mt fsf 1; tar cf /dev/nrst0 .." for an Exabyte: "mt fsf 1; mt bsf 1; mt weof 1; tar cf /dev/nrst0 .." will work. The regular filemark consists of an erased zone 3.8" long (needed to begin a write). In this case, the first filemark is rewritten in place, which creates an erased zone AFTER it, clearing the way to write more on the tape. The erase head is not helical. Warning: The above sequences and in general positioning on a "stock" (st.c) SunOS 4.0 and 4.0.1 device driver will not work (Ciprico rfsd.c driver should work fine). st.c driver has no concept of "front" or "back" sides of a filemark. Only "safe" positioning operation for Exabyte (st.c) on SunOS 4.0/4.0.1 is "mt fsf X" FROM THE BOT. Anything else will probably abort or leave the tape positioned incorrectly (I saw a "mt bsf 1" rewind the tape when 10 files out!) One can position a tape to the end of what was last written by reading until a "blank tape" error is returned. Writing can be started at this point. The tape does not become positioned somewhere down the "erased" area as does a conventional magtape. 
One can issue multiple reads at the "blank tape" error, but the Exabyte stays positioned at the beginning of the blank area, ready to accept write commands. File skip operations do not stop at blank tape and will run into old data or run to the end of the tape, so you have to be careful not to "mt fsf toomany". The initial driver development under SunOS 3.[234] ran the drive in "fixed" mode like the Sun carts. Fixed mode looks like a disk; one can use any blocking factor for writing and reading (even a different one for reading). SunOS4.0 came out and made the default to be "variable" mode, which looks like a 1/2" magtape. In general fixed and variable mode tapes are not interchangeable, hence all the grief caused earlier this summer. In variable mode, the size in bytes of the write is remembered on the tape, and must be read back into the same size (or bigger) buffer. Reading into too small a buffer will cause loss of data, unlike fixed mode (buffer size assumed to be 1024 or a multiple thereof). In July 88, we defaulted everything to variable mode, to stay compatible with Sun. A number of people have called me on how to read Exabyte tapes made under SunOS 3.[234] under SunOS4.X. If the tape was written in 1024-byte "fixed" mode (the orig blocking factor does not make any difference) as on most 3.[234] systems, one can try "/etc/restore ifb /dev/nrst0 2" on SunOS4.0. If you use 4.3BSD dump/restore like we do, use "1" instead for the block factor (1024 bytes). It will run VERY slow! Lastly, a final word of warning, stemming from a design limitation of Unix itself: one cannot read/write a single file > 2.0GB. The Unix file table offset overflows (it is a 32 bit signed quantity). If you raw read/write the Exabyte tape, you will run into this. Doing a periodic "lseek(fd, 0, 0)" will reset the file offset without doing anything to the drive, which allows > 2.0GB in a single file on raw tape.
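A rough sketch of that periodic offset reset, written in Python for brevity rather than the C one would normally use against a raw device. The function names, the injected `write` and `seek` parameters, and the record sizes are all illustrative assumptions, not anything from the drivers discussed here. It also assumes, as described above, that seeking to 0 on the raw tape device only resets the kernel's offset and does not move the tape; that is worth confirming on your own driver.

```python
import os

MAX_OFFSET = 2**31 - 1  # largest value a signed 32-bit file offset can hold


def records_before_overflow(record_size):
    """How many whole records fit before a 32-bit signed offset would overflow."""
    return MAX_OFFSET // record_size


def write_records(fd, records, record_size, write=os.write, seek=os.lseek):
    """Write equal-sized records to fd, resetting the file offset before it overflows.

    Returns the number of offset resets performed. The write and seek
    parameters exist only so the loop can be exercised without a tape drive.
    """
    limit = records_before_overflow(record_size)
    since_reset = 0
    resets = 0
    for record in records:
        if since_reset == limit:
            # Assumption: on a raw tape device this resets only the offset
            # counter. On a regular file it would really reposition.
            seek(fd, 0, os.SEEK_SET)
            since_reset = 0
            resets += 1
        write(fd, record)
        since_reset += 1
    return resets
```

With 51200-byte records (the blocking factor of 50 used for dumps in this article), `records_before_overflow(51200)` gives 41943 records between resets, so a reset is only needed once per tape at most.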
We have spent the last year and a half also shaking down an STC 2925 1/2" magtape (on Gould NP1s & Suns). We started with a "production" model, which had been shipping for some time. After finding numerous bugs, having STC out here several times, and running thru approx 40 versions (ROM sets) of the drive firmware, we are now just getting that drive solid as well. We are still seeing a hard write error every 20-30 reels on magtape (this translates into a hard write error once/tape on an Exabyte for dumps). Exabyte is currently doing MUCH better than that (one hard write error every 40 tapes or so; equiv to 800 magtapes). We also have not cleaned any drives in 5 months! (They got cleaned every month at Exabyte before that due to failures.) Exabyte will be introducing a "wet" (Freon-TF) type of cleaning cart shortly, similar to a Geneva. They will also probably recommend a cleaning every 30 hours (once/week) of use. Experience has shown that drives run fine for 100-300 hours between cleanings if kept in clean office/computer room environments. Some people will probably put them in steel mills, hence the 30 hour recommendation. We have some drives at 300 hours now, running fine. One has to lie slightly to make dump give good tape estimates on the length of the tape. Telling the real numbers to dump causes an integer overflow during tape length calculation. In reality, the tape is 348 ft at 550700 "bpi". 550700 bpi is not really a true density, but is the "effective" bpi assuming the Exabyte was a 9-track drive for purposes of calculating dump tape capacity for dump(8). Areal recording density is around 35 million bits per square inch. You get around 500kbytes per inch of tape.
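As a quick check on those "real" numbers, here is a small Python sketch of the arithmetic. It assumes the "effective bpi" can be read as bytes per inch of tape, the way density works out on a 9-track drive; the function and constant names are made up for illustration.

```python
FEET = 348              # real tape length quoted above
EFFECTIVE_BPI = 550700  # "effective" density, read here as bytes per inch


def nominal_capacity_bytes(feet, bytes_per_inch):
    """Tape length in inches times bytes per inch."""
    return feet * 12 * bytes_per_inch


capacity = nominal_capacity_bytes(FEET, EFFECTIVE_BPI)
# capacity is 2299723200 bytes, i.e. about 2.3 GB
capacity_gb = round(capacity / 1e9, 1)
```

Under that reading the product lands on the advertised 2.3GB, which is presumably where the odd-looking 550700 figure comes from.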
Thus, using the real numbers:

    /etc/dump 0fubsd /dev/nrst0 50 348 550700 /filesystem

should work in theory, but will probably cause an overflow during tape length calculations, so we use:

    /etc/dump 0fubsd /dev/nrst0 50 6000 54000 /filesystem

We use 4.3BSD dump on Suns, so you will probably have to change the blocking factor "50" (51200 bytes) to be "100" since Sun uses 512 byte blocks. A fake tape length of 6000 feet and a "bpi" of 54000 seems to give good estimates on sizing the cartridge. These numbers are based on experience and don't match exactly what you would get doing the math (maybe the lack of gaps?). Dumping a 230 MB filesystem yields an estimate of "0.10 tapes".

Exabyte only builds drives, not complete subsystems. Various VARs and OEMs buy their drives, add enclosures, write documentation, and put together complete systems for Suns and other machines. They also provide user support. Some of these companies are Perfect Byte, Delta Microsystems, Artecon, Emerald, Spectrum, Megatape, etc. The trade rags are full of them.

Exabyte is rumored to be working on a drive which is 5GB, and transfers at 30 MB/min (twice the current 15MB/min), and will be able to read/write your old tapes. Beta drive in about a year or so. Also rumored is development of an "X-Y stack loader". The "CHS" (Cartridge Handling System) is purported to fit in a 19" rack, hold 5 drives, and 120 (small model) or 240 (large model) tapes. A robot arm will pick the tape, insert it, and close the door of the drive.

The current drive, the EXB-8200, reads/writes 14-15MB/min (around 240KB/sec), therefore it takes 2 hours 40 mins to read/write the entire tape. Dumps approach this speed when going from a local disk (restores are horrible no matter how or from what they are done, due to directory, indirect block, and inode synchronous writes). A Sun-3/50 can stream 3 drives at once on its SCSI; we have had 7 drives on it at once (you have to change the driver slightly to move bits in the minor dev).
All 7 drives will "run", although not at full speed due to the SCSI running out of time. The Sun-3/180 with a Ciprico RF-3500 was found to also run 7 drives ok, and can stream 4 of them, and almost stream 5. Skipping to filemarks (both forward and reverse) runs at 10X the read speed (~150 MB/min), or about 16 mins to go the whole length of the tape. Rewind is around 70X read speed or 2 mins 35 sec. Possibilities have been discussed about having the next drive be able to position to any file, any block at nearly rewind speed (3-4 mins), but nothing is committed. I may do some experimenting with high speed positioning on the current drive via servo commands issued from the 9600 baud "maint" port on the drive. Until now, most of the work was done on Sun-3/50s running SunOS 3.4. 3/60's have also been run and seem to work ok. A Ciprico RF-3500 controller is running on a 3/180 (that driver was rewritten also, and given back to Ciprico). The next project is to make the SunOS 4.0 driver "work" with the Exabyte. It currently moves the tape, but still has a way to go. To conclude, it looks like the Exabyte drive is finally a solid product. It comes just in time as systems with 1.2GB disks are becoming the commonplace. I wish to thank Exabyte and all their staff for putting up with a year and a half of continuous questions, bug reports, and bitching from me. Also I like to thank Exabyte and Perfect Byte for donating some drives (and Ciprico controller) to make all this possible. --ghg George Goble, Engineering Computer Network, Purdue U, W. Lafayette IN 47907 (317) 494-3545 Arpa: ghg@purdue.edu uucp: {backbone}!pur-ee!ghg
ghg@ucbvax.BERKELEY.EDU (George Goble) (12/20/88)
[note: This article is a combination of two previous articles (the original and the update) I posted to comp.periphs from Purdue: <7378@ea.ecn.purdue.edu> <7546@ea.ecn.purdue.edu> In a period of two weeks, these articles appear to have made it to only a handful of sites due to news problems. I am reposting this from ucbvax, "the gateway to the world". --ghg] In April, 1987, we started driver development on the 2nd BETA Exabyte drive on a Sun-2/120. Most of the initial headaches resulted from deficiencies with the Sun-2 SCSI system. For a couple of months work continued on the Sun-2, but was later moved to a Sun-3/50 (SunOS 3.2/3.3/3.4). The Sun-2 could not handle the disconnect, save-data-pointer SCSI commands which Exabyte began to require in the summer of 1987. Also, the Sun-2 was just too slow to stream the drive while data-comparing each byte. Initially, the Sun-3/50 would run the drive, but not in disconnect mode, since the Exabyte would sometimes disconnect on odd byte boundaries. Exabyte finally added a vendor bit to force disconnects on even bytes, which then allowed full disconnect/reconnect operation on the 3/50. Summer of '87 also uncovered a myriad of problems: broken tapes, nasty jams, motor and brake problems, and lots of "hang" conditions, which were all nice to work out before the product was up for sale. I was on the phone several times/day with the person doing the drive firmware at this time. I am very grateful to Exabyte for letting us have BETA drives, which allowed us and a few others to uncover quite a few bugs before the drive hit the market. There were few remaining problems when the drive went up for sale in Sept 1987. Most of the problems one would "expect" from an 8mm Camcorder drive were not there. Tape interchange problems (density was 35 million bits/sq inch), tape "wearing out", being affected by weak magnetic fields, or not being able to read it back after several months, just did not happen.
The early prototype drive "P4-6" didn't have firmware (June 87) to spin down the rotary head when the drive was not active, so a human had to be present during testing to unload the tape if the firmware hung, etc. One day, I got called away to fix a crashed system someplace, left the Exabyte test running, and it of course hung, and left the head spinning with the tape loaded... for 45-60 minutes! To my amazement, the tape was not visibly damaged, and the drive read the area of tape with no errors where the head had been left spinning! Both us and Exabyte tested several brands of tape, and both agreed that the SONY tapes worked the best (we found FUJI a close second, TDK and Maxell to be the worst). Exabyte's lab testing showed tapes would not even begin to "degrade" until around 800-900 passes, but due to their super ECC methods, they were usable well beyond 2000-3000 passes. I don't think anyone puts near this many passes on their backup tapes. We were never able to "wear out" a tape (few hundred passes on a short piece of it).. We ran a drive for 24 hours/day for 2 1/2 months. It held up mechanically fine. It was only cleaned once and didn't need it. This "lifetime" test simulated approximately how long a head would last on a conventional tape drive. (you can almost buy a new Exabyte for the price of replacing a head on a 1/2" drive) As far as reading back old tapes goes, I can still read back OK, tapes I made in July 1987. Anything before that, is not compatible with the current Exabyte format. Signal levels at the head preamp output (TP3, right most test point on top of R/W card) are comparable to recently made tapes (within 5-10%). The ECC is phenomenal. Tape is written in "stripes" diagonally across the tape. Each stripe contains eight 1024 byte sectors plus 400 bytes of ECC (per sector) plus control and servo information. 
The drive does full read-after-write checking, and even a single bit error causes the affected block to be rewritten later on the tape in a different relative position in the stripe. All this is sorted out on readback. ECC can correct a 263 byte burst error in a 1024 byte sector plus something like 80 one and two bit errors in the same sector. Compare that to your disks! The fall of 1987 brought the ramp up of production, and the advent of high speed filemark positioning (10X read speed). As with the ramp up of almost any new product, some problems developed when components were purchased in mass quantities. As usual, we always ended up with drives made from the "bad" batches. The most notable was the "3-1/2 week disease" which started in drives built or upgraded starting Dec 1987. Drives would run, almost to the day, 3-1/2 weeks of power up time and then start doing flakey things, especially having read problems. One could see the head preamp output on a scope fading in and out. This was finally "cured" around May 1988; turned out to be capacitors "leaking" very slightly, hosing up the delicate DC bias in the head preamp circuit. A 6 week disease was then evident, which also caused "flakey" operation, and the appearance of tape interchange problems. It was tracked down to a critical capacitor in the head Synch circuit slowly drifting in value, causing the head synch to drift out of acceptable limits in 6-9 weeks. All 13 of our drives had both 3-1/2 and 6 week diseases (some went back 5 or 6 times). Head synch problems were laid to rest for good during the last week of June 1988 by switching to a very good grade capacitor. There were also scattered problems, like "mode" motor problems, and bad crystal oscillators, but those were fairly rare here. From the last week in June, 1988, until the present, it has been smooth sailing. Only 1 drive out of 13 has failed since end of June (5 months!) , compared with 100% of the drives every 6-9 weeks before that. 
The one drive which failed just needed a realignment and died in the first couple of days. Maybe it got dropped in shipping. If you have drives which were built or upgraded before 6/29/88, it would be a very good idea to send them back to be upgraded. If your drive does not have a playing-card sized sticker on the top lid, it has probably been built or upgraded before mid June '88 and should be upgraded. We break almost everything we get here, and have appreciated the long working relationship with Exabyte in turning the drive into a solid product. Currently we see about 1 unrecovered write error/month (for all 12 drives total). We read back (essentially dd to /dev/null) all tapes made. There was one case (late Aug) where a tape was written ok and would not read back (error at the load point). There was some problem synch'ing up at the "load point" (firmware bug) and this tape was successfully read using MX firmware 4$23 (now released). This only happened once. Once tapes have been written and read back once, there have been no read failures since 6/29/88 (the head sync cap fix). Also, we have only really "lost" one tape. It was heated to between 110 and 115F in the summer heat wave in an unairconditioned apartment (Spec is around 105F I think). It was observed in pre July 88 drives that during the gradual failures, it was possible to write a tape ok, and have it fail on the readback. Theory was that the read-after-write did indeed work, but that after a pause, the drive servo became sloppy (due to something being sick already), and smashed already written and verified data. This has not been observed on drives after July 88. Out of general paranoia, we still read back (dd to /dev/null) EVERYTHING, for an unattended overnight dump, this takes no more (human) time. We have seen only 4 or 5 "bad" tapes (out of 300-400 tapes), Sony P6-120MP, purchased from "consumer" wholesale houses. The bad tapes had a wrinkle in them and got hard write errors. 
One of them wrote ok, and failed on read back, the wrinkle being eaten away by the rotary heads (another good reason to always read it back). I don't know of anybody who had only 5 bad magtapes out of 6600-8800 1/2 inch reels. Around once every week or two, a few drives running MX 4$22 firmware (summer 1988) would just unload the tape for no apparent reason, and Unix would take a "no cartridge loaded" error. This was caused by the servo code being overly paranoid about motor ramp ups, etc. It would occasionally "panic" and emergency unload the tape. Currently released firmware MX-4$23/SV-B017 fixes this. We have run around 200 tapes (approx 60 of them almost full) since July 1988. Bitwise each Exabyte tape is equivalent to 16 reels of 2400 ft 6250 BPI (long records) magtape. In actual practice, it is more like 22 or 23 reels (doing dumps) since filesystems are usually started on new reels and Exabyte reclaims all that wasted space as well. We have put as many as 250 filesystems (incremental level 9) on a single tape. That takes around 10 hours (due to the network); now we use 3-4 drives and do 50-80 filesystems on each one. The advertised capacity of 2.3GB is only obtainable if the drive is run in streaming mode (must be fed at 15 MB/min) AND your driver knows how to deal with the LEOT warning zone. The LEOT is a "warning" that PEOT (real end of tape) is "near". 230 MB still remain though on a 120 Min tape. The drive issues a SCSI check condition (does partial xfer if in fixed mode) and the transfer must be restarted by the device driver. 4$23 firmware issues a check on every write after the LEOT. The drivers from Perfect Byte (Sun3.4), Ciprico, and possibly Artecon, are the only ones I know of which handle the LEOT (4$22 or earlier) and let you use the last 230 MB on the tape. It is a real pain to make that work correctly. (Let me know if any other drivers exist that can deal with the LEOT). Most drivers just return hard error when hitting the LEOT.
Running the drive start/stop (like dump via a network) results in the loss of 200-300MB or so more. 4$23 firmware allows the driver to set the "pad stripes" to be zero (right 3 bits set to 0 of the 2nd vendor uniq byte on a mode select command). This reduces the nonstreaming loss to around 100MB. Most Unix users are probably ending up with a driver which is run non-streaming, and can't handle the LEOT, so they should figure on a capacity of 1.8GB instead of 2.3GB. Anything which uses lots of filemarks (VMS, ANSI formats, etc) loses severely! A filemark consumes 2.2MB of tape (you can put an entire DEC "RK05" disk image inside one Exabyte filemark, for you oldtimers) This summer, Exabyte added a "short" filemark (set "vendor unique" byte 5, bit 7 in the Sun cdb (vu_57) to get a short mark). The short filemark only consumes .86" (.488MB) of tape (60 tracks) vs 3.8" (2.2MB) of tape (270 tracks) for a regular filemark. The short filemark cannot be overwritten like the regular one can. Comments in the next paragraph only apply to "regular", not short filemarks. Some companies selling devices and drivers to deal with the VMS world force (or default to) short filemarks. Due to the helical scan and the erase mechanism, there is a writing limitation on Exabyte drives. "tar r" or "tar u" will not work ("tar c" is ok). One can only start writing at 1) beginning of tape, 2) on the end of what was last written, 3) "front" side of a (regular) filemark. Say, you have a tape with 3 tar files on it and want to save the first file, and you want to begin writing over the 2nd file. Normally you would "mt fsf 1; tar cf /dev/nrst0 .." for an Exabyte: "mt fsf 1; mt bsf 1; mt weof 1; tar cf /dev/nrst0 .." will work. The regular filemark consists of an erased zone 3.8" long (needed to begin a write). In this case, the first filemark is rewritten in place, which creates an erased zone AFTER it, clearing the way to write more on the tape. The erase head is not helical. 
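To pull the capacity figures above into one place, here is a back-of-the-envelope sketch in Python. It simply subtracts the quoted losses from the advertised capacity; the function name and parameters are invented for illustration, and treating the losses as purely additive is an assumption, not something measured.

```python
ADVERTISED_MB = 2300         # 2.3GB on a 120 min tape
LEOT_ZONE_MB = 230           # unusable if the driver stops at the LEOT
REGULAR_FILEMARK_MB = 2.2    # per regular filemark
SHORT_FILEMARK_MB = 0.488    # per short filemark


def usable_mb(handles_leot, nonstreaming_loss_mb, filemarks, short_marks=False):
    """Estimate usable capacity in MB after the losses described in the text.

    Assumes each loss simply subtracts from the advertised figure.
    """
    mb = ADVERTISED_MB
    if not handles_leot:
        mb -= LEOT_ZONE_MB
    mb -= nonstreaming_loss_mb
    per_mark = SHORT_FILEMARK_MB if short_marks else REGULAR_FILEMARK_MB
    mb -= filemarks * per_mark
    return mb
```

For example, `usable_mb(False, 250, 0)` comes to about 1820 MB, in line with the 1.8GB planning figure, while adding 250 regular filemarks (one per filesystem on a tape holding 250 incrementals) costs roughly another 550 MB.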
Warning: The above sequences, and positioning in general, will not work on a "stock" (st.c) SunOS 4.0 and 4.0.1 device driver (Ciprico rfsd.c driver should work fine). The st.c driver has no concept of "front" or "back" sides of a filemark. The only "safe" positioning operation for Exabyte (st.c) on SunOS 4.0/4.0.1 is "mt fsf X" FROM THE BOT. Anything else will probably abort or leave the tape positioned incorrectly (I saw a "mt bsf 1" rewind the tape when 10 files out!) One can position a tape to the end of what was last written by reading until a "blank tape" error is returned. Writing can be started at this point. The tape does not become positioned somewhere down the "erased" area as does a conventional magtape. One can issue multiple reads at the "blank tape" error, but the Exabyte stays positioned at the beginning of the blank area, ready to accept write commands. File skip operations do not stop at blank tape and will run into old data or run to the end of the tape, so you have to be careful not to "mt fsf toomany". The initial driver development under SunOS 3.[234] ran the drive in "fixed" mode like the Sun carts. Fixed mode looks like a disk; one can use any blocking factor for writing and reading (even a different one for reading). SunOS4.0 came out and made the default to be "variable" mode, which looks like a 1/2" magtape. In general fixed and variable mode tapes are not interchangeable, hence all the grief caused earlier this summer. In variable mode, the size in bytes of the write is remembered on the tape, and must be read back into the same size (or bigger) buffer. Reading into too small a buffer will cause loss of data, unlike fixed mode (buffer size assumed to be 1024 or a multiple thereof). In July 88, we defaulted everything to variable mode, to stay compatible with Sun. A number of people have called me on how to read Exabyte tapes made under SunOS 3.[234] under SunOS4.X.
If the tape was written in 1024-byte "fixed" mode (the orig blocking factor does not make any difference) as on most 3.[234] systems, one can try

        /etc/restore ifb /dev/nrst0 2

on SunOS4.0. If you use 4.3BSD dump/restore like we do, use "1" instead for the block factor (1024 bytes). It will run VERY slow!

Lastly, a final word of warning, this one from a design limitation of Unix itself: one cannot read/write a single file > 2.0GB. The Unix file table offset overflows (it is a 32 bit signed quantity). If you raw read/write the Exabyte tape, you will run into this. Doing a periodic "lseek(fd, 0, 0)" will reset the file offset without doing anything to the drive, which allows > 2.0GB in a single file on raw tape.

We have also spent the last year and a half shaking down an STC 2925 1/2" magtape (on Gould NP1s & Suns). We started with a "production" model, which had been shipping for some time. After finding numerous bugs, having STC out here several times, and running thru approx 40 versions (ROM sets) of the drive firmware, we are now just getting that drive solid as well. We are still seeing a hard write error every 20-30 reels on magtape (this translates into a hard write error once/tape on an Exabyte for dumps). Exabyte is currently doing MUCH better than that (one hard write error every 40 tapes or so; equiv to 800 magtapes). We also have not cleaned any drives in 5 months! (They got cleaned every month at Exabyte before that due to failures.)

Exabyte will be introducing a "wet" (Freon-TF) type of cleaning cart shortly, similar to a Geneva. They will also probably recommend a cleaning for every 30 hours' (once/week) use. Experience has shown that drives run fine for 100-300 hours between cleanings if kept in clean office/computer room environments. Some people will probably put them in steel mills, hence the 30 hour recommendation. We have some drives at 300 hours now, running fine.

One has to lie slightly to make dump give good tape estimates on the length of the tape.
Telling the real numbers to dump causes an integer overflow during tape length calculation. In reality, the tape is 348 ft at 550700 "bpi". 550700 bpi is not really a true density, but is the "effective" bpi assuming the Exabyte was a 9-track drive for purposes of calculating dump tape capacity for dump(8). Areal recording density is around 35 million bits per square inch. You get around 500kbytes per inch of tape. Thus:

        /etc/dump 0fubsd /dev/nrst0 50 348 550700 /filesystem

should work in theory, but will probably cause an overflow during tape length calculations, so we use:

        /etc/dump 0fubsd /dev/nrst0 50 6000 54000 /filesystem

We use 4.3BSD dump on Suns, so you will probably have to change the blocking factor "50" (51200 bytes) to be "100" since Sun uses 512 byte blocks. A fake tape length of 6000 feet and a "bpi" of 54000 seems to give good estimates on sizing the cartridge. These numbers are based on experience and don't match exactly what you would get doing the math (maybe the lack of gaps?). Dumping a 230 MB filesystem yields an estimate of "0.10 tapes".

It is also possible to write a "directory" on the front of a dump tape (also works with most magtapes), AFTER the dump is done. This way one can store the command file which made the dump, along with the actual log of the dumps on that tape. Sometimes one will have some filesystems abort (when dumped over a network) due to machines crashing and/or network problems. Most of the time, these can just be appended on the end of the tape. One initially writes a file of 10 Megabytes of zeros on the front of the tape, then writes the dumps on. One then rewinds, then uses a slightly modified tar command to write on the directory. The mod to tar is just to execute a "mtioctl REWIND" before exiting, so the device driver close routine does not write a filemark after the directory.
If a filemark were written, the tape would have two filemarks with the same sequence number, which would confuse the Exabyte (will still work though). You have room to tar on about 2 MB of whatever you want in the directory area. To access the dump files, one does a "mt fsf 1" to skip over the directory and the blank area. One cannot "read" past the directory, as an erased tape error (blank check) will occur.

The huge capacity of Exabyte also lessens dump hassles in other ways. We do level 0 dumps on large Gould systems every two weeks, and level 9s everyday; no level 1-8s are needed, which makes restores a cinch. These are heavy use (lots of undergrads) machines, which have some filesystems with 50% of their files touched after 2 weeks. I put 4 Gould NP1s, 4 Gould PN9080s, and a CCI 6/32 on 5-1/2 Exabyte tapes, level 0. The above systems' level 9s all fit on 1/2 to 3/4 of one tape (dumped to a Sun 3/50 in my office). For Suns (staff and grad students only), and other research machines, we have noticed the level 0s can be run every 4-6 weeks, with 9s everyday. Restores are simple, just do the level 0, and the most recent level 9 tape.

Exabyte level 9s are highly resistant to "blowouts", where some special research project or massive undergrad final projects, etc, will touch 200 Megabytes or more in one day on a single filesystem. A 200 meg blowout can wreak havoc with conventional backup schedules (requires more tapes, keeping track of them, etc). An Exabyte sails along, taking maybe 15 mins longer to dump the filesystem with the blowout, and it all still fits on one tape.

Exabyte only builds drives, not complete subsystems. Various VARs and OEMs buy their drives, add enclosures, write documentation, and put together complete systems for Suns and other machines. They also provide user support. Some of these companies are Perfect Byte, Delta Microsystems, Artecon, Emerald, Spectrum, Megatape, etc. The trade rags are full of them.
Exabyte is rumored to be working on a drive which is 5GB, and transfers at 30 MB/min (twice the current 15MB/min), and will be able to read/write your old tapes. Beta drive in about a year or so. Also rumored is development of an "X-Y stack loader". The "CHS" (Cartridge Handling System) is purported to fit in a 19" rack, hold 5 drives, and 120 (small model) and 240 (large model) tapes. A robot arm will pick, insert the tape, and close the door of the drive.

The current drive, the EXB-8200, reads/writes 14-15MB/min (around 240KB/sec), therefore it takes 2 hours 40 mins to read/write the entire tape. Dumps approach this speed when going from a local disk (restores are horrible no matter how or from what they are done, due to directory, indirect block, and inode synchronous writes). A Sun-3/50 can stream 3 drives at once on its SCSI; we have had 7 drives on it at once (you have to change the driver slightly to move bits in the minor dev). All 7 drives will "run", although not at full speed due to the SCSI running out of time. The Sun-3/180 with a Ciprico RF-3500 was found to also run 7 drives ok, and can stream 4 of them, and almost stream 5.

Skipping to filemarks (both forward and reverse) runs at 10X the read speed (~150 MB/min), or about 16 mins to go the whole length of the tape. Rewind is around 70X read speed or 2 mins 35 sec. Possibilities have been discussed about having the next drive be able to position to any file, any block at nearly rewind speed (3-4 mins), but nothing is committed. I may do some experimenting with high speed positioning on the current drive via servo commands issued from the 9600 baud "maint" port on the drive.

Until now, most of the work was done on Sun-3/50s running SunOS 3.4. 3/60's have also been run and seem to work ok. A Ciprico RF-3500 controller is running on a 3/180 (that driver was rewritten also, and given back to Ciprico). The next project is to make the SunOS 4.0 driver "work" with the Exabyte.
It currently moves the tape, but still has a way to go.

To conclude, it looks like the Exabyte drive is finally a solid product. It comes just in time, as systems with 1.2GB disks are becoming commonplace. I wish to thank Exabyte and all their staff for putting up with a year and a half of continuous questions, bug reports, and bitching from me. I would also like to thank Exabyte and Perfect Byte for donating some drives (and a Ciprico controller) to make all this possible.

--ghg
George Goble, Engineering Computer Network, Purdue U, W. Lafayette IN 47907 (317) 494-3545
Arpa: ghg@purdue.edu uucp: {backbone}!pur-ee!ghg