[comp.periphs] Exabyte 8mm tape drives come of age

ghg@ea.ecn.purdue.edu (George Goble) (12/13/88)

In April, 1987, we started driver development on the 2nd BETA Exabyte
drive on a Sun-2/120. Most of the initial headaches resulted from
deficiencies with the Sun-2 SCSI system. For a couple of months work
continued on the Sun-2, but was later moved to a Sun-3/50 (SunOS
3.2/3.3/3.4). The Sun-2 could not handle the disconnect,
save-data-pointer SCSI commands which Exabyte began to require in the
summer of 1987.  Also, the Sun-2 was just too slow to stream the drive
while data-comparing each byte.

Initially, the Sun-3/50 would run the drive, but not in disconnect
mode, since the Exabyte would sometimes disconnect on odd byte
boundaries. Exabyte finally added a vendor bit to force disconnects on
even bytes, which then allowed full disconnect/reconnect operation on
the 3/50.

Summer of '87 also uncovered a myriad of problems, broken tapes, nasty
jams, motor and brake problems, lots of "hang" conditions, which were
all nice to work out before the product was up for sale.  I was on the
phone several times/day with the person doing the drive firmware at
this time.  I am very grateful to Exabyte to have allowed us to have
BETA drives which allowed us and a few others to uncover quite a few
bugs before it hit the market.

There were few remaining problems when the drive went up for sale in
Sept 1987.  Most of the problems one would "expect" from an 8mm
Camcorder drive were not there.  Tape interchange problems (density was
35 million bits/sq inch), tape "wearing out", being affected by weak
magnetic fields, or not being able to read it back after several
months, just did not happen. The early prototype drive "P4-6" didn't
have firmware (June 87) to spin down the rotary head when the drive was
not active, so a human had to be present during testing to unload the
tape if the firmware hung, etc.  One day, I got called away to fix a
crashed system someplace, left the Exabyte test running, and it of
course hung, and left the head spinning with the tape loaded... for
45-60 minutes!  To my amazement, the tape was not visibly damaged, and
the drive read the area of tape with no errors where the head had been
left spinning!

Both we and Exabyte tested several brands of tape, and both agreed that
the SONY tapes worked the best (we found FUJI a close second, TDK and
Maxell to be the worst). Exabyte's lab testing showed tapes would not
even begin to "degrade" until around 800-900 passes, but due to their
super ECC methods, they were usable well beyond 2000-3000 passes. I
don't think anyone puts near this many passes on their backup tapes.

We were never able to "wear out" a tape (few hundred passes on a short
piece of it).  We ran a drive for 24 hours/day for 2 1/2 months.  It
held up mechanically fine. It was only cleaned once and didn't need
it.  This "lifetime" test simulated approximately how long a head would
last on a conventional tape drive.  (you can almost buy a new Exabyte
for the price of replacing a head on a 1/2" drive)

As far as reading back old tapes goes, I can still read back OK, tapes
I made in July 1987. Anything before that, is not compatible with the
current Exabyte format. Signal levels at the head preamp output (TP3,
right most test point on top of R/W card) are comparable to recently
made tapes (within 5-10%).

The ECC is phenomenal.  Tape is written in "stripes" diagonally across
the tape. Each stripe contains eight 1024 byte sectors plus 400 bytes
of ECC (per sector) plus control and servo information.  The drive does
full read-after-write checking, and even a single bit error causes the
affected block to be rewritten later on the tape in a different
relative position in the stripe. All this is sorted out on readback.
ECC can correct a 263 byte burst error in a 1024 byte sector plus
something like 80 one and two bit errors in the same sector. Compare
that to your disks!
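
As a rough check on the ECC figures above, the per-sector numbers can
be turned into an overhead percentage.  This is only arithmetic on the
figures quoted in the post; the function names are made up for
illustration, and control and servo information is ignored.

```python
# Per-sector figures quoted in the post.
DATA_BYTES = 1024        # user data per sector
ECC_BYTES = 400          # ECC per sector
SECTORS_PER_STRIPE = 8   # sectors in one diagonal stripe
BURST_BYTES = 263        # longest correctable burst in one sector


def stripe_payload():
    """User data bytes carried by one stripe."""
    return DATA_BYTES * SECTORS_PER_STRIPE


def ecc_overhead():
    """Fraction of the data+ECC bytes in a sector that is ECC."""
    return ECC_BYTES / (DATA_BYTES + ECC_BYTES)


def burst_fraction():
    """Correctable burst length as a fraction of the sector's data."""
    return BURST_BYTES / DATA_BYTES


if __name__ == "__main__":
    print("payload per stripe:", stripe_payload(), "bytes")
    print("ECC overhead: %.1f%%" % (100 * ecc_overhead()))
    print("correctable burst: %.1f%% of a sector" % (100 * burst_fraction()))
```

So a bit over a quarter of what goes on tape in each sector is ECC, and
a burst wiping out roughly a quarter of a sector is still correctable.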

The fall of 1987 brought the ramp up of production, and the advent of
high speed filemark positioning (10X read speed).  As with the ramp up
of almost any new product, some problems developed when components were
purchased in mass quantities.  As usual, we always ended up with drives
made from the "bad" batches. The most notable was the "3-1/2 week
disease" which started in drives built or upgraded starting Dec 1987.
Drives would run, almost to the day, 3-1/2 weeks of power up time and
then start doing flakey things, especially having read problems.  One
could see the head preamp output on a scope fading in and out. This was
finally "cured" around May 1988; turned out to be capacitors "leaking"
very slightly, hosing up the delicate DC bias in the head preamp
circuit.

A 6 week disease was then evident, which also caused "flakey"
operation, and the appearance of tape interchange problems.  It was
tracked down to a critical capacitor in the head Synch circuit slowly
drifting in value, causing the head synch to drift out of acceptable
limits in 6-9 weeks. All 13 of our drives had both 3-1/2 and 6 week
diseases (some went back 5 or 6 times).  Head synch problems were laid
to rest for good during the last week of June 1988 by switching to a
very good grade capacitor.  There were also scattered problems, like
"mode" motor problems, and bad crystal oscillators, but those were
fairly rare here.

From the last week in June, 1988, until the present, it has been smooth
sailing.  Only 1 drive out of 13 has failed since end of June (5
months!), compared with 100% of the drives every 6-9 weeks before
that.  The one drive which failed just needed a realignment and died in
the first couple of days.  Maybe it got dropped in shipping.

If you have drives which were built or upgraded before 6/29/88, it
would be a very good idea to send them back to be upgraded.  If your
drive does not have a playing-card sized sticker on the top lid, it has
probably been built or upgraded before mid June '88 and should be
upgraded.

We break almost everything we get here, and have appreciated the long
working relationship with Exabyte in turning the drive into a solid
product.  Currently we see about 1 unrecovered write error/month (for
all 12 drives total).  We read back (essentially dd to /dev/null) all
tapes made.  There was one case (late Aug) where a tape was written ok
and would not read back (error at the load point). There was some
problem synch'ing up at the "load point" (firmware bug) and this tape
was successfully read using MX firmware 4$23 (now released). This only
happened once.  Once tapes have been written and read back once, there
have been no read failures since 6/29/88 (the head sync cap fix).
Also, we have only really "lost" one tape. It was heated to between 110
and 115F in the summer heat wave in an unairconditioned apartment (Spec
is around 105F I think).

It was observed in pre July 88 drives that during the gradual failures,
it was possible to write a tape ok, and have it fail on the readback.
Theory was that the read-after-write did indeed work, but that after a
pause, the drive servo became sloppy (due to something being sick
already), and smashed already written and verified data.  This has not
been observed on drives after July 88. Out of general paranoia, we
still read back (dd to /dev/null) EVERYTHING; for an unattended
overnight dump, this takes no more (human) time.

Around once every week or two a few drives running MX 4$22 firmware
(summer 1988), would just unload the tape for no apparent reason, and
Unix would take a "no cartridge loaded" error. This was caused by the
servo code being overly paranoid about motor ramp ups, etc. It would
occasionally "panic" and emergency unload the tape. Currently released
firmware MX-4$23/SV-B017 fixes this.

We have run around 200 tapes (approx 60 of them almost full) since July
1988. Bitwise each Exabyte tape is equivalent to 16 reels of 2400 ft
6250 BPI (long records) magtape. In actual practice, it is more like 22
or 23 reels (doing dumps) since filesystems are usually started on new
reels and Exabyte reclaims all that wasted space as well. We have put
as many as 250 filesystems (incremental level 9) on a single tape. That
takes around 10 hours (due to the network), now we use 3-4 drives and
do 50-80 filesystems on each one.
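
The reel equivalence can be roughed out as follows.  This is a sketch
under assumptions that are not in the post: a 0.3 inch inter-record gap
on the 6250 BPI reel, a 10240-byte record standing in for "long
records", no allowance for leader or trailer, and the advertised 2.3GB
taken as 2.3e9 bytes.

```python
REEL_FEET = 2400         # reel length from the post
BPI = 6250               # bytes per inch along a 9-track tape
GAP_INCHES = 0.3         # ASSUMED inter-record gap, not from the post
EXABYTE_BYTES = 2.3e9    # advertised cartridge capacity, taken as decimal


def reel_bytes(record_bytes):
    """Bytes that fit on one reel when written in fixed-size records."""
    tape_inches = REEL_FEET * 12
    inches_per_record = record_bytes / BPI + GAP_INCHES
    return int(tape_inches / inches_per_record) * record_bytes


def reels_per_cartridge(record_bytes):
    """How many such reels one full Exabyte cartridge replaces."""
    return EXABYTE_BYTES / reel_bytes(record_bytes)


if __name__ == "__main__":
    # 10240 bytes is an ASSUMED record size.
    print("%.1f reels" % reels_per_cartridge(10240))
```

With these assumptions the answer lands at about 15 reels, in the same
neighborhood as the 16 quoted above; shorter records push it higher.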

The advertised capacity of 2.3GB is only obtainable if the drive is run
in streaming mode (must be fed at 15 MB/min) AND your driver knows how
to deal with the LEOT warning zone.  The LEOT is a "warning" that PEOT
(real end of tape) is "near".  230 MB still remain though on a 120 Min
tape.  The drive issues a SCSI check condition (does partial xfer if in
fixed mode) and the transfer must be restarted by the device driver.
4$23 firmware issues a check on every write after the LEOT.  The
drivers from Perfect Byte (Sun3.4), Ciprico, and possibly Artecon, are
the only ones I know of which handle the LEOT (4$22 or earlier) and let
you use the last 230 MB on the tape. It is a real pain to make that
work correctly. (Let me know if any other drivers exist that can deal
with the LEOT).  Most drivers just return hard error when hitting the
LEOT. Running the drive start/stop (like dump via a network) results in
the loss of 200-300MB or so more.  4$23 firmware allows the driver to
set the "pad stripes" to be zero (right 3 bits set to 0 of the 2nd
vendor uniq byte on a mode select command). This reduces the
nonstreaming loss to around 100MB. Most Unix users are probably ending
up with a driver which is run non-streaming, and can't handle the LEOT,
so they should figure on a capacity of 1.8GB instead of 2.3GB. Anything
which uses lots of filemarks (VMS, ANSI formats, etc) loses severely!
A filemark consumes 2.2MB of tape (you can put an entire DEC "RK05"
disk image inside one Exabyte filemark, for you oldtimers)
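
The capacity figures above can be put together as a small calculation.
This is a sketch only: the function and its parameters are invented for
illustration, and the 250 MB non-streaming loss is simply the midpoint
of the 200-300MB range quoted.

```python
ADVERTISED_MB = 2300     # advertised capacity of a 120 min tape
LEOT_ZONE_MB = 230       # space remaining after the LEOT warning
NONSTREAM_LOSS_MB = 250  # midpoint of the quoted 200-300MB loss
NONSTREAM_PAD0_MB = 100  # quoted loss with "pad stripes" set to zero


def usable_mb(handles_leot, streaming, pad_stripes_zero=False):
    """Estimate usable capacity in MB for a given driver setup."""
    mb = ADVERTISED_MB
    if not handles_leot:
        # Driver returns a hard error at LEOT, so the tail is lost.
        mb -= LEOT_ZONE_MB
    if not streaming:
        # Start/stop operation wastes tape between writes.
        mb -= NONSTREAM_PAD0_MB if pad_stripes_zero else NONSTREAM_LOSS_MB
    return mb


if __name__ == "__main__":
    print("ideal driver, streaming:   ", usable_mb(True, True))
    print("typical Unix driver:       ", usable_mb(False, False))
    print("same, pad stripes = 0:     ", usable_mb(False, False, True))
```

The "typical Unix driver" case comes out at 1820 MB, which is where the
1.8GB planning figure comes from.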

This summer, Exabyte added a "short" filemark (set "vendor unique" byte
5, bit 7 in the Sun cdb (vu_57) to get a short mark).  The short
filemark only consumes .86" (.488MB) of tape (60 tracks) vs 3.8"
(2.2MB) of tape (270 tracks) for a regular filemark.  The short
filemark cannot be overwritten like the regular one can.  Comments in
the next paragraph only apply to "regular", not short filemarks. Some
companies selling devices and drivers to deal with the VMS world force
(or default to) short filemarks.
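
The two filemark sizes are consistent with about 8 KB per track, and
the cost of many filemarks adds up quickly.  The sketch below assumes
that one "track" here holds the same eight 1024-byte sectors as a
stripe; that equivalence is an assumption, and the names are made up
for illustration.

```python
TRACK_BYTES = 8 * 1024   # ASSUMES one track == one stripe of 8 sectors
REGULAR_TRACKS = 270     # tracks in a regular filemark
SHORT_TRACKS = 60        # tracks in a short filemark


def filemark_mb(tracks):
    """Tape consumed by one filemark, in MB (10**6 bytes)."""
    return tracks * TRACK_BYTES / 1e6


def overhead_mb(n_filemarks, tracks):
    """Tape consumed by n filemarks of the given length, in MB."""
    return n_filemarks * filemark_mb(tracks)


if __name__ == "__main__":
    print("regular filemark: %.2f MB" % filemark_mb(REGULAR_TRACKS))
    print("short filemark:   %.2f MB" % filemark_mb(SHORT_TRACKS))
    # 250 files on one tape, as in the incremental dumps mentioned earlier.
    print("250 regular: %.0f MB" % overhead_mb(250, REGULAR_TRACKS))
    print("250 short:   %.0f MB" % overhead_mb(250, SHORT_TRACKS))
```

At 250 files per tape, regular filemarks alone eat roughly a quarter of
the cartridge, which is why filemark-heavy formats lose so badly.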

Due to the helical scan and the erase mechanism, there is a writing
limitation on Exabyte drives. "tar r" will not work ("tar c" is ok).
One can only start writing at  1) beginning of tape, 2) on the end of
what was last written, 3) "front" side of a (regular) filemark. Say,
you have a tape with 3 tar files on it and want to save the first file,
and you want to begin writing over the 2nd file.

Normally you would "mt fsf 1; tar cf /dev/nrst0 .."

for an Exabyte:

"mt fsf 1; mt bsf 1; mt weof 1; tar cf /dev/nrst0 .."

will work.  The regular filemark consists of an erased zone 3.8" long
(needed to begin a write). In this case, the first filemark is
rewritten in place, which creates an erased zone AFTER it, clearing the
way to write more on the tape. The erase head is not helical.  Warning:
The above sequences and in general positioning on a "stock" (st.c)
SunOS 4.0 and 4.0.1 device driver will not work (Ciprico rfsd.c driver
should work fine). st.c driver has no concept of "front" or "back"
sides of a filemark.  Only "safe" positioning operation for Exabyte
(st.c) on SunOS 4.0/4.0.1 is "mt fsf X" FROM THE BOT. Anything else
will probably abort or leave the tape positioned incorrectly (I saw a
"mt bsf 1" rewind the tape when 10 files out!)

One can position a tape to the end of what was last written by reading
until a "blank tape" error is returned. Writing can be started at this
point. The tape does not become positioned somewhere down the "erased"
area as does a conventional magtape.  One can issue multiple reads at
the "blank tape" error, but the Exabyte stays positioned at the
beginning of the blank area, ready to accept write commands. File skip
operations do not stop at blank tape and will run into old data or run
to the end of the tape, so you have to be careful not to "mt fsf
toomany".

The initial driver development under SunOS 3.[234] ran the drive in
"fixed" mode like the Sun carts. Fixed mode looks like a disk; one can
use any blocking factor for writing and reading (even a different one
for reading). SunOS4.0 came out and made the default to be "variable"
mode, which looks like a 1/2" magtape. In general fixed and variable
mode tapes are not interchangeable, hence all the grief caused earlier
this summer.  In variable mode, the size in bytes of the write is
remembered on the tape, and must be read back into the same size (or
bigger) buffer. Reading into too small a buffer, will cause loss of
data, unlike fixed mode (buffer size assumed to be 1024 or a multiple
thereof). In July 88, we defaulted everything to variable mode, to stay
compatible with Sun.

A number of people have called me on how to read Exabyte tapes made
under SunOS 3.[234] under SunOS4.X. If the tape was written in
1024-byte "fixed" mode (the orig blocking factor does not make any
difference) as on most 3.[234] systems, one can try a

	/etc/restore ifb /dev/nrst0 2

on SunOS4.0.  If you use 4.3BSD dump/restore like we do, use "1"
instead for the block factor (1024 bytes). It will run VERY slow!

Lastly, a final word of warning, this being from a design limitation of
Unix itself, one cannot read/write a single file > 2.0GB.  The Unix
file table offset overflows (it is a 32 bit signed quantity).  If you
raw read/write the Exabyte tape, you will run into this.  Doing a
periodic "lseek(fd, 0, 0)" will reset the file offset but do nothing to
the drive to allow > 2.0GB in a single file on raw tape.
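
The wraparound is easy to simulate.  The sketch below models the file
offset as a 32 bit signed counter and shows the effect of zeroing it
periodically; it does not touch a real tape, and the 51200-byte record
size and the 1 GB reset threshold are arbitrary choices for the
illustration.

```python
import ctypes

INT32_MAX = 2**31 - 1


def advance(offset, nbytes):
    """Advance an offset the way a 32 bit signed counter would."""
    return ctypes.c_int32(offset + nbytes).value


def offset_goes_negative(total_bytes, record=51200, reset_every=None):
    """Simulate raw writes; report whether the offset ever wraps."""
    offset = 0
    written = 0
    while written < total_bytes:
        offset = advance(offset, record)
        written += record
        if offset < 0:
            return True
        if reset_every is not None and offset >= reset_every:
            offset = 0   # stands in for lseek(fd, 0, 0) on the raw device
    return False


def seconds_to_overflow(bytes_per_sec=240 * 1024):
    """Time to write 2**31 bytes at the drive's ~240KB/sec."""
    return 2**31 / bytes_per_sec


if __name__ == "__main__":
    print(offset_goes_negative(2_300_000_000))                     # wraps
    print(offset_goes_negative(2_300_000_000, reset_every=1 << 30))
    print("overflow after about %.0f minutes" % (seconds_to_overflow() / 60))
```

At streaming speed the offset would wrap a little under two and a half
hours into the tape, so a full-cartridge raw transfer will hit it.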

We have also spent the last year and a half shaking down an STC
2925 1/2" magtape (on Gould NP1s & Suns). We started with a
"production" model, which had been shipping for some time.  Finding
numerous bugs, having STC out here several times, and running thru
approx 40 versions (ROM sets) of the drive firmware, we are now just
getting that drive solid as well.  We are still seeing a hard write
error every 20-30 reels on magtape (this translates into a hard write
error once/tape on an Exabyte for dumps). Exabyte is currently doing
MUCH better than that.  (one hard write error every 40 tapes or so;
equiv to 800 magtapes) We also have not cleaned any drives in 5
months!  (they got cleaned every month at Exabyte before that due to
failures).  Exabyte will be introducing a "wet" (Freon-TF) type of
cleaning cart shortly, similar to a Geneva. They will also probably
recommend a cleaning for every 30 hours' (once/week) use. Experience
has shown, that drives run fine for 100-300 hours between cleanings if
kept in clean office/computer room environments. Some people will
probably put them in steel mills, hence the 30 hour recommendation.  We
have some drives at 300 hours now, running fine.

One has to lie slightly to make dump give good tape estimates on the
length of the tape. Telling the real numbers to dump causes an integer
overflow during tape length calculation.  In reality, the tape is 348
ft at 550700 "bpi". 550700 bpi is not really a true density, but is the
"effective" bpi assuming the Exabyte was a 9-track drive for purposes
of calculating dump tape capacity for dump(8).  Areal recording
density is around 35 million bits per square inch. You get around
500kbytes per inch of tape.

Thus:

/etc/dump 0fubsd /dev/nrst0 50 348 550700 /filesystem

Should work in theory, but will probably cause an overflow during tape
length calculations, so

We use:

/etc/dump 0fubsd /dev/nrst0 50 6000 54000 /filesystem

We use 4.3BSD dump on Suns, so you will probably have to change the
blocking factor "50" (51200 bytes) to be "100" since Sun uses 512 byte
blocks. A fake tape length of 6000 feet and a "bpi" of 54000 seems to
give good estimates on sizing the cartridge.  These numbers are based
on experience and don't match exactly what you would get doing the math
(maybe the lack of gaps?).  Dumping a 230 MB filesystem yields an
estimate of "0.10 tapes".
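
To see why the fake numbers can still give sensible estimates, here is
a rough model of a 9-track style tape estimate: tape used is the data
length at the given density plus one fixed gap per record.  The formula
and the 0.7 inch gap are assumptions made for this sketch, not taken
from the post or checked against the dump source, and 230 MB is taken
as 230e6 bytes.

```python
TP_BSIZE = 1024   # bytes per dump block, per the blocking factors above


def raw_capacity_bytes(feet, bpi):
    """Tape length times density, with no gaps at all."""
    return feet * 12 * bpi


def estimated_tapes(fs_bytes, blocking, feet, bpi, gap_tenths=7):
    """ASSUMED model: data length plus a fixed gap per record.

    Lengths are in tenths of an inch; gap_tenths=7 is an assumed
    0.7 inch inter-record gap.
    """
    density = bpi // 10                  # bytes per tenth of an inch
    tape_tenths = feet * 12 * 10
    records = fs_bytes / TP_BSIZE / blocking
    data_tenths = fs_bytes / density
    return (data_tenths + records * gap_tenths) / tape_tenths


if __name__ == "__main__":
    print(raw_capacity_bytes(348, 550700))   # real numbers, about 2.3e9
    print("%.2f tapes" % estimated_tapes(230e6, 50, 6000, 54000))
    print("%.2f tapes" % estimated_tapes(230e6, 50, 348, 550700))
```

Under this model the real length and "bpi" multiply out to the
advertised 2.3GB, and the fake 6000/54000 pair reproduces the "0.10
tapes" figure for a 230 MB filesystem.  With the real numbers the
assumed gap term would swamp the estimate, which fits the guess above
that the mismatch comes from the lack of gaps.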

Exabyte only builds drives, not complete subsystems. Various VARs and
OEMs buy their drives, add enclosures, write documentation, and put
together complete systems for Suns and other machines.  They also
provide user support.  Some of these companies are Perfect Byte, Delta
Microsystems, Artecon, Emerald, Spectrum, Megatape, etc.  The trade
rags are full of them.

Exabyte is rumored to be working on a drive which is 5GB, and transfers
at 30 MB/min (twice the current 15MB/min), and will be able to
read/write your old tapes. Beta drive in about a year or so.  Also
rumored is development of an "X-Y stack loader". The "CHS" (Cartridge
Handling System) is purported to fit in a 19" rack, hold 5 drives, and
120 (small model) and 240 (large model) tapes.  A robot arm will pick,
insert the tape, and close the door of the drive.

The current drive, the EXB-8200, reads/writes 14-15MB/min (around
240KB/sec), therefore it takes 2 hours 40 mins to read/write the entire
tape.  Dumps approach this speed when going from a local disk (restores
are horrible no matter how or from what they are done, due to
directory, indirect block, and inode synchronous writes). A Sun-3/50
can stream 3 drives at once on its SCSI; we have had 7 drives on it at
once (you have to change driver slightly to move bits in minor dev).
All 7 drives will "run", although not at full speed due to the SCSI
running out of time. The Sun-3/180 with a Ciprico RF-3500 was found to
also run 7 drives ok, and can stream 4 of them, and almost stream 5.
Skipping to filemarks (both forward and reverse) runs at 10X the read
speed (~150 MB/min), or about 16 mins to go the whole length of the
tape. Rewind is around 70X read speed or 2 mins 35 sec. Possibilities
have been discussed about having the next drive be able to position to
any file, any block at nearly rewind speed (3-4 mins), but nothing is
committed. I may do some experimenting with high speed positioning on
the current drive via servo commands issued from the 9600 baud "maint"
port on the drive.
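
The timing figures above hang together reasonably well, as this bit of
arithmetic shows.  It assumes a full 2300 MB tape and 14.5 MB/min (the
middle of the quoted 14-15MB/min); the function names are invented for
the illustration.

```python
CAPACITY_MB = 2300       # advertised capacity
READ_MB_PER_MIN = 14.5   # middle of the quoted 14-15MB/min


def minutes_for_full_tape(speed_multiple=1.0):
    """Minutes to traverse the whole tape at a multiple of read speed."""
    return CAPACITY_MB / (READ_MB_PER_MIN * speed_multiple)


def implied_multiple(minutes):
    """Speed multiple implied by a quoted full-tape time in minutes."""
    return minutes_for_full_tape() / minutes


if __name__ == "__main__":
    print("read/write: %.0f min" % minutes_for_full_tape())
    print("filemark skip at 10X: %.1f min" % minutes_for_full_tape(10))
    # Rewind is quoted as 2 min 35 sec.
    print("rewind multiple: %.0fX" % implied_multiple(2 + 35 / 60))
```

Read/write comes out just under 2 hours 40 minutes and the 10X skip at
about 16 minutes.  The quoted 2 min 35 sec rewind works out closer to
60X than 70X, so "around 70X" should be read loosely.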

Until now, most of the work was done on Sun-3/50s running SunOS 3.4.
3/60's have also been run and seem to work ok.  A Ciprico RF-3500
controller is running on a 3/180 (that driver was rewritten also, and
given back to Ciprico). The next project is to make the SunOS 4.0
driver "work" with the Exabyte.  It currently moves the tape, but still
has a way to go.

To conclude, it looks like the Exabyte drive is finally a solid
product. It comes just in time as systems with 1.2GB disks are becoming
commonplace. I wish to thank Exabyte and all their staff for
putting up with a year and a half of continuous questions, bug reports,
and bitching from me. Also I would like to thank Exabyte and Perfect Byte for
donating some drives (and Ciprico controller) to make all this
possible.

--ghg

George Goble, Engineering Computer Network, Purdue U, W. Lafayette IN 47907
(317) 494-3545  Arpa: ghg@purdue.edu  uucp: {backbone}!pur-ee!ghg

ghg@ucbvax.BERKELEY.EDU (George Goble) (12/20/88)

[note: This article is a combination of two previous articles 
(the original and the update) I posted to comp.periphs from Purdue:
<7378@ea.ecn.purdue.edu> <7546@ea.ecn.purdue.edu>
In a period of two weeks, these articles appear to have made it
to only a handful of sites due to news problems. I am reposting
this from ucbvax, "the gateway to the world".  --ghg]

In April, 1987, we started driver development on the 2nd BETA Exabyte
drive on a Sun-2/120. Most of the initial headaches resulted from
deficiencies with the Sun-2 SCSI system. For a couple of months work
continued on the Sun-2, but was later moved to a Sun-3/50 (SunOS
3.2/3.3/3.4). The Sun-2 could not handle the disconnect,
save-data-pointer SCSI commands which Exabyte began to require in the
summer of 1987.  Also, the Sun-2 was just too slow to stream the drive
while data-comparing each byte.

Initially, the Sun-3/50 would run the drive, but not in disconnect
mode, since the Exabyte would sometimes disconnect on odd byte
boundaries. Exabyte finally added a vendor bit to force disconnects on
even bytes, which then allowed full disconnect/reconnect operation on
the 3/50.

Summer of '87 also uncovered a myriad of problems, broken tapes, nasty
jams, motor and brake problems, lots of "hang" conditions, which were
all nice to work out before the product was up for sale.  I was on the
phone several times/day with the person doing the drive firmware at
this time.  I am very grateful to Exabyte to have allowed us to have
BETA drives which allowed us and a few others to uncover quite a few
bugs before it hit the market.

There were few remaining problems when the drive went up for sale in
Sept 1987.  Most of the problems one would "expect" from an 8mm
Camcorder drive were not there.  Tape interchange problems (density was
35 million bits/sq inch), tape "wearing out", being affected by weak
magnetic fields, or not being able to read it back after several
months, just did not happen. The early prototype drive "P4-6" didn't
have firmware (June 87) to spin down the rotary head when the drive was
not active, so a human had to be present during testing to unload the
tape if the firmware hung, etc.  One day, I got called away to fix a
crashed system someplace, left the Exabyte test running, and it of
course hung, and left the head spinning with the tape loaded... for
45-60 minutes!  To my amazement, the tape was not visibly damaged, and
the drive read the area of tape with no errors where the head had been
left spinning!

Both we and Exabyte tested several brands of tape, and both agreed that
the SONY tapes worked the best (we found FUJI a close second, TDK and
Maxell to be the worst). Exabyte's lab testing showed tapes would not
even begin to "degrade" until around 800-900 passes, but due to their
super ECC methods, they were usable well beyond 2000-3000 passes. I
don't think anyone puts near this many passes on their backup tapes.

We were never able to "wear out" a tape (few hundred passes on a short
piece of it).  We ran a drive for 24 hours/day for 2 1/2 months.  It
held up mechanically fine. It was only cleaned once and didn't need
it.  This "lifetime" test simulated approximately how long a head would
last on a conventional tape drive.  (you can almost buy a new Exabyte
for the price of replacing a head on a 1/2" drive)

As far as reading back old tapes goes, I can still read back OK, tapes
I made in July 1987. Anything before that, is not compatible with the
current Exabyte format. Signal levels at the head preamp output (TP3,
right most test point on top of R/W card) are comparable to recently
made tapes (within 5-10%).

The ECC is phenomenal.  Tape is written in "stripes" diagonally across
the tape. Each stripe contains eight 1024 byte sectors plus 400 bytes
of ECC (per sector) plus control and servo information.  The drive does
full read-after-write checking, and even a single bit error causes the
affected block to be rewritten later on the tape in a different
relative position in the stripe. All this is sorted out on readback.
ECC can correct a 263 byte burst error in a 1024 byte sector plus
something like 80 one and two bit errors in the same sector. Compare
that to your disks!

The fall of 1987 brought the ramp up of production, and the advent of
high speed filemark positioning (10X read speed).  As with the ramp up
of almost any new product, some problems developed when components were
purchased in mass quantities.  As usual, we always ended up with drives
made from the "bad" batches. The most notable was the "3-1/2 week
disease" which started in drives built or upgraded starting Dec 1987.
Drives would run, almost to the day, 3-1/2 weeks of power up time and
then start doing flakey things, especially having read problems.  One
could see the head preamp output on a scope fading in and out. This was
finally "cured" around May 1988; turned out to be capacitors "leaking"
very slightly, hosing up the delicate DC bias in the head preamp
circuit.

A 6 week disease was then evident, which also caused "flakey"
operation, and the appearance of tape interchange problems.  It was
tracked down to a critical capacitor in the head Synch circuit slowly
drifting in value, causing the head synch to drift out of acceptable
limits in 6-9 weeks. All 13 of our drives had both 3-1/2 and 6 week
diseases (some went back 5 or 6 times).  Head synch problems were laid
to rest for good during the last week of June 1988 by switching to a
very good grade capacitor.  There were also scattered problems, like
"mode" motor problems, and bad crystal oscillators, but those were
fairly rare here.

From the last week in June, 1988, until the present, it has been smooth
sailing.  Only 1 drive out of 13 has failed since end of June (5
months!), compared with 100% of the drives every 6-9 weeks before
that.  The one drive which failed just needed a realignment and died in
the first couple of days.  Maybe it got dropped in shipping.

If you have drives which were built or upgraded before 6/29/88, it
would be a very good idea to send them back to be upgraded.  If your
drive does not have a playing-card sized sticker on the top lid, it has
probably been built or upgraded before mid June '88 and should be
upgraded.

We break almost everything we get here, and have appreciated the long
working relationship with Exabyte in turning the drive into a solid
product.  Currently we see about 1 unrecovered write error/month (for
all 12 drives total).  We read back (essentially dd to /dev/null) all
tapes made.  There was one case (late Aug) where a tape was written ok
and would not read back (error at the load point). There was some
problem synch'ing up at the "load point" (firmware bug) and this tape
was successfully read using MX firmware 4$23 (now released). This only
happened once.  Once tapes have been written and read back once, there
have been no read failures since 6/29/88 (the head sync cap fix).
Also, we have only really "lost" one tape. It was heated to between 110
and 115F in the summer heat wave in an unairconditioned apartment (Spec
is around 105F I think).

It was observed in pre July 88 drives that during the gradual failures,
it was possible to write a tape ok, and have it fail on the readback.
Theory was that the read-after-write did indeed work, but that after a
pause, the drive servo became sloppy (due to something being sick
already), and smashed already written and verified data.  This has not
been observed on drives after July 88. Out of general paranoia, we
still read back (dd to /dev/null) EVERYTHING; for an unattended
overnight dump, this takes no more (human) time.

We have seen only 4 or 5 "bad" tapes (out of 300-400 tapes), Sony
P6-120MP, purchased from "consumer" wholesale houses.  The bad tapes
had a wrinkle in them and got hard write errors. One of them wrote ok,
and failed on read back, the wrinkle being eaten away by the rotary
heads.  (another good reason to always read it back) I don't know of
anybody who had only 5 bad magtapes out of 6600-8800 1/2 inch reels.

Around once every week or two a few drives running MX 4$22 firmware
(summer 1988), would just unload the tape for no apparent reason, and
Unix would take a "no cartridge loaded" error. This was caused by the
servo code being overly paranoid about motor ramp ups, etc. It would
occasionally "panic" and emergency unload the tape. Currently released
firmware MX-4$23/SV-B017 fixes this.

We have run around 200 tapes (approx 60 of them almost full) since July
1988. Bitwise each Exabyte tape is equivalent to 16 reels of 2400 ft
6250 BPI (long records) magtape. In actual practice, it is more like 22
or 23 reels (doing dumps) since filesystems are usually started on new
reels and Exabyte reclaims all that wasted space as well. We have put
as many as 250 filesystems (incremental level 9) on a single tape. That
takes around 10 hours (due to the network), now we use 3-4 drives and
do 50-80 filesystems on each one.

The advertised capacity of 2.3GB is only obtainable if the drive is run
in streaming mode (must be fed at 15 MB/min) AND your driver knows how
to deal with the LEOT warning zone.  The LEOT is a "warning" that PEOT
(real end of tape) is "near".  230 MB still remain though on a 120 Min
tape.  The drive issues a SCSI check condition (does partial xfer if in
fixed mode) and the transfer must be restarted by the device driver.
4$23 firmware issues a check on every write after the LEOT.  The
drivers from Perfect Byte (Sun3.4), Ciprico, and possibly Artecon, are
the only ones I know of which handle the LEOT (4$22 or earlier) and let
you use the last 230 MB on the tape. It is a real pain to make that
work correctly. (Let me know if any other drivers exist that can deal
with the LEOT).  Most drivers just return hard error when hitting the
LEOT. Running the drive start/stop (like dump via a network) results in
the loss of 200-300MB or so more.  4$23 firmware allows the driver to
set the "pad stripes" to be zero (right 3 bits set to 0 of the 2nd
vendor uniq byte on a mode select command). This reduces the
nonstreaming loss to around 100MB. Most Unix users are probably ending
up with a driver which is run non-streaming, and can't handle the LEOT,
so they should figure on a capacity of 1.8GB instead of 2.3GB. Anything
which uses lots of filemarks (VMS, ANSI formats, etc) loses severely!
A filemark consumes 2.2MB of tape (you can put an entire DEC "RK05"
disk image inside one Exabyte filemark, for you oldtimers)

This summer, Exabyte added a "short" filemark (set "vendor unique" byte
5, bit 7 in the Sun cdb (vu_57) to get a short mark).  The short
filemark only consumes .86" (.488MB) of tape (60 tracks) vs 3.8"
(2.2MB) of tape (270 tracks) for a regular filemark.  The short
filemark cannot be overwritten like the regular one can.  Comments in
the next paragraph only apply to "regular", not short filemarks. Some
companies selling devices and drivers to deal with the VMS world force
(or default to) short filemarks.

Due to the helical scan and the erase mechanism, there is a writing
limitation on Exabyte drives. "tar r" or "tar u" will not work ("tar c"
is ok).  One can only start writing at  1) beginning of tape, 2) on the
end of what was last written, 3) "front" side of a (regular) filemark.
Say, you have a tape with 3 tar files on it and want to save the first
file, and you want to begin writing over the 2nd file.

Normally you would "mt fsf 1; tar cf /dev/nrst0 .."

for an Exabyte:

"mt fsf 1; mt bsf 1; mt weof 1; tar cf /dev/nrst0 .."

will work.  The regular filemark consists of an erased zone 3.8" long
(needed to begin a write). In this case, the first filemark is
rewritten in place, which creates an erased zone AFTER it, clearing the
way to write more on the tape. The erase head is not helical.
Warning:  The above sequences and in general positioning on a "stock"
(st.c) SunOS 4.0 and 4.0.1 device driver will not work (Ciprico rfsd.c
driver should work fine). st.c driver has no concept of "front" or
"back" sides of a filemark.  Only "safe" positioning operation for
Exabyte (st.c) on SunOS 4.0/4.0.1 is "mt fsf X" FROM THE BOT. Anything
else will probably abort or leave the tape positioned incorrectly (I
saw a "mt bsf 1" rewind the tape when 10 files out!)

One can position a tape to the end of what was last written by reading
until a "blank tape" error is returned. Writing can be started at this
point. The tape does not become positioned somewhere down the "erased"
area as does a conventional magtape.  One can issue multiple reads at
the "blank tape" error, but the Exabyte stays positioned at the
beginning of the blank area, ready to accept write commands. File skip
operations do not stop at blank tape and will run into old data or run
to the end of the tape, so you have to be careful not to "mt fsf
toomany".

The initial driver development under SunOS 3.[234] ran the drive in
"fixed" mode like the Sun carts. Fixed mode looks like a disk; one can
use any blocking factor for writing and reading (even a different one
for reading). SunOS 4.0 came out and made "variable" mode the default,
which looks like a 1/2" magtape. In general, fixed and variable mode
tapes are not interchangeable, hence all the grief caused earlier this
summer.  In variable mode, the size in bytes of each write is
remembered on the tape, and the record must be read back into a buffer
of the same size (or bigger). Reading into too small a buffer will
cause loss of data, unlike fixed mode (where the buffer size is assumed
to be 1024 or a multiple thereof). In July 88, we defaulted everything
to variable mode, to stay compatible with Sun.

A number of people have called me asking how to read Exabyte tapes
made under SunOS 3.[234] under SunOS 4.X. If the tape was written in
1024-byte "fixed" mode (the orig blocking factor does not make any
difference), as on most 3.[234] systems, one can try a

	/etc/restore ifb /dev/nrst0 2

on SunOS 4.0.  If you use 4.3BSD dump/restore like we do, use "1"
instead for the block factor (1024 bytes). It will run VERY slow!
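
The two block factors are just the record size divided by the block
unit each program counts in. A minimal Python sketch, assuming (as this
article says elsewhere) 512-byte units for Sun's dump/restore and
1024-byte units for 4.3BSD; the function and constant names are made up
for illustration:

```python
SUN_UNIT = 512     # block unit for Sun dump/restore, per the text
BSD_UNIT = 1024    # block unit for 4.3BSD dump/restore, per the text


def blocking_factor(record_bytes, unit_bytes):
    """Number of `unit_bytes` blocks that make up one tape record."""
    if record_bytes % unit_bytes:
        raise ValueError("record is not a whole number of blocks")
    return record_bytes // unit_bytes


# 1024-byte fixed-mode records, as written under SunOS 3.[234]
print(blocking_factor(1024, SUN_UNIT))   # factor for Sun restore
print(blocking_factor(1024, BSD_UNIT))   # factor for 4.3BSD restore
```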

Lastly, a final word of warning, this one stemming from a design
limitation of Unix itself: one cannot read/write a single file > 2.0GB.
The Unix file table offset overflows (it is a 32 bit signed quantity).
If you raw read/write the Exabyte tape, you will run into this.  Doing
a periodic "lseek(fd, 0, 0)" resets the file offset without doing
anything to the drive, which allows > 2.0GB in a single file on raw
tape.
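
To get a feel for where the limit bites, here is a rough Python
calculation. The 51200-byte record size and the ~240KB/sec rate are
figures quoted elsewhere in this article; taking 1KB as 1024 bytes is
an assumption, and the names are made up for illustration.

```python
OFFSET_BITS = 32                            # signed offset, per the text
MAX_OFFSET = 2 ** (OFFSET_BITS - 1) - 1     # largest positive offset

RECORD_BYTES = 51200                 # 50 x 1024 byte dump records
RATE_BYTES_PER_SEC = 240 * 1024      # ~240KB/sec, assuming 1KB = 1024


def records_before_overflow(record_bytes=RECORD_BYTES):
    """Whole records that fit before the offset would pass MAX_OFFSET."""
    return MAX_OFFSET // record_bytes


def minutes_until_overflow(rate=RATE_BYTES_PER_SEC):
    """Streaming time needed to reach MAX_OFFSET at the given rate."""
    return MAX_OFFSET / rate / 60.0


print(MAX_OFFSET)
print(records_before_overflow())
print(round(minutes_until_overflow(), 1))
```

At full streaming speed the offset runs out after roughly 2 hours 25
minutes, so a raw pass over an entire tape would cross the limit unless
the offset is reset along the way.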

We have also spent the last year and a half shaking down an STC 2925
1/2" magtape (on Gould NP1s & Suns). We started with a "production"
model, which had been shipping for some time.  After finding numerous
bugs, having STC out here several times, and running thru approx 40
versions (ROM sets) of the drive firmware, we are only now getting
that drive solid as well.  We are still seeing a hard write error every
20-30 reels on magtape (this translates into a hard write error
once/tape on an Exabyte for dumps). Exabyte is currently doing MUCH
better than that (one hard write error every 40 tapes or so; equiv to
800 magtapes).  We also have not cleaned any drives in 5 months!  (They
got cleaned every month at Exabyte before that due to failures.)
Exabyte will be introducing a "wet" (Freon-TF) type of cleaning cart
shortly, similar to a Geneva. They will also probably recommend a
cleaning for every 30 hours (once/week) of use. Experience has shown
that drives run fine for 100-300 hours between cleanings if kept in
clean office/computer room environments. Some people will probably put
them in steel mills, hence the 30 hour recommendation.  We have some
drives at 300 hours now, running fine.
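
The error-rate comparison above is easier to see with both drives put
in the same units. A small Python sketch using only the figures quoted
above (the names are made up for illustration):

```python
# "40 tapes ... equiv to 800 magtapes" implies this many reels of
# magtape per Exabyte cartridge.
REELS_PER_CARTRIDGE = 800 / 40


def reels_between_errors(cartridges_between_errors):
    """Express an Exabyte error interval in magtape reels."""
    return cartridges_between_errors * REELS_PER_CARTRIDGE


exabyte_reels = reels_between_errors(40)   # Exabyte error interval
magtape_low, magtape_high = 20, 30         # STC reels per hard error

# How much more data the Exabyte writes between hard write errors
print(exabyte_reels / magtape_high, exabyte_reels / magtape_low)
```

On these numbers the Exabyte writes somewhere between about 27 and 40
times as much data between hard write errors as the magtape drive.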

One has to lie slightly to make dump give good estimates of how much
tape a dump will use. Telling dump the real numbers causes an integer
overflow during the tape length calculation.  In reality, the tape is
348 ft at 550700 "bpi". 550700 bpi is not really a true density, but is
the "effective" bpi, treating the Exabyte as a 9-track drive for the
purpose of calculating tape capacity in dump(8).  Areal recording
density is around 35 million bits per square inch. You get around
500kbytes per inch of tape.

Thus:

/etc/dump 0fubsd /dev/nrst0 50 348 550700 /filesystem

should work in theory, but will probably cause an overflow during the
tape length calculation, so we use:

/etc/dump 0fubsd /dev/nrst0 50 6000 54000 /filesystem

We use 4.3BSD dump on Suns, so with Sun's own dump you will probably
have to change the blocking factor "50" (51200 bytes) to "100", since
Sun uses 512 byte blocks. A fake tape length of 6000 feet and a "bpi"
of 54000 seem to give good estimates for sizing the cartridge.  These
numbers are based on experience and don't match exactly what you would
get doing the math (maybe the lack of gaps?).  Dumping a 230 MB
filesystem yields an estimate of "0.10 tapes".
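
The arithmetic behind these numbers can be checked with a rough Python
sketch. This is NOT dump's actual code: it simply charges each record
its length on tape plus one inter-record gap, as a half-inch tape
estimate would. The 0.7 inch gap is an assumed value that does not
come from this article, "230 MB" is taken as 230 x 10^6 bytes, and the
function names are made up for illustration.

```python
def nominal_bytes(length_ft, bpi):
    """Gapless capacity: feet x 12 inches x bytes per inch."""
    return length_ft * 12 * bpi


def estimate_tapes(fs_bytes, record_bytes, length_ft, bpi, gap_inches):
    """Fraction of a tape used, charging one gap per record."""
    records = fs_bytes / record_bytes
    inches_per_record = record_bytes / bpi + gap_inches
    return records * inches_per_record / (length_ft * 12)


FS_BYTES = 230 * 10**6     # "230 MB", assumed decimal megabytes
RECORD_BYTES = 51200       # blocking factor 50 x 1024 bytes
GAP_INCHES = 0.7           # ASSUMED inter-record gap, not from the text

# Real geometry, no gaps: what fraction of a cartridge is 230 MB?
real_fraction = FS_BYTES / nominal_bytes(348, 550700)

# Fake geometry, with and without a gap charged per record
with_gaps = estimate_tapes(FS_BYTES, RECORD_BYTES, 6000, 54000, GAP_INCHES)
no_gaps = estimate_tapes(FS_BYTES, RECORD_BYTES, 6000, 54000, 0.0)

print(nominal_bytes(348, 550700))
print(round(real_fraction, 2), round(with_gaps, 2), round(no_gaps, 2))
```

Under these assumptions the real geometry puts 230 MB at about 0.10 of
a cartridge, and the fake 6000 ft / 54000 "bpi" numbers also come out
near 0.10 once a gap is charged per record, but only about 0.06
without gaps. That would fit the guess above that gaps account for the
mismatch.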

It is also possible to write a "directory" on the front of a dump tape
(also works with most magtapes), AFTER the dump is done.  This way one
can store the command file which made the dump, along with the actual
log of the dumps on that tape.  Sometimes one will have some
filesystems abort (when dumped over a network) due to machines crashing
and/or network problems.  Most of the time, these can just be appended
on the end of the tape.  One initially writes a file of 10 Megabytes of
zeros on the front of the tape, then writes the dumps on.  One then
rewinds, then uses a slightly modified tar command to write on the
directory. The mod to tar is just to execute a "mtioctl REWIND" before
exiting, so the device driver close routine does not write a filemark
after the directory.  If a filemark were written, the tape would have
two filemarks with the same sequence number, which would confuse the
Exabyte (will still work though). You have room to tar on about 2 MB of
whatever you want in the directory area. To access the dump files, one
does a "mt fsf 1" to skip over the directory and the blank area. One
cannot "read" past the directory, as an erased tape error (blank check)
will occur.

The huge capacity of Exabyte also lessens dump hassles in other ways.
We do level 0 dumps on large Gould systems every two weeks, and level
9s every day; no level 1-8s are needed, which makes restores a cinch.
These are heavy use (lots of undergrads) machines, which have some
filesystems with 50% of their files touched after 2 weeks.  I put 4
Gould NP1s, 4 Gould PN9080s, and a CCI 6/32 on 5-1/2 Exabyte tapes,
level 0.  The above systems' level 9s all fit on 1/2 to 3/4 of one tape
(dumped to a Sun 3/50 in my office).  For Suns (staff and grad students
only) and other research machines, we have noticed the level 0s can be
run every 4-6 weeks, with 9s every day.  Restores are simple: just
restore the level 0, then the most recent level 9 tape.  Exabyte level
9s are highly resistant to "blowouts", where some special research
project or massive undergrad final projects, etc, will touch 200
Megabytes or more in one day on a single filesystem.  A 200 meg blowout
can wreak havoc with conventional backup schedules (requires more
tapes, keeping track of them, etc).  An Exabyte sails along, taking
maybe 15 mins longer to dump the filesystem with the blowout, and it
all still fits on one tape.

Exabyte only builds drives, not complete subsystems. Various VARs and
OEMs buy their drives, add enclosures, write documentation, and put
together complete systems for Suns and other machines.  They also
provide user support.  Some of these companies are Perfect Byte, Delta
Microsystems, Artecon, Emerald, Spectrum, Megatape, etc.  The trade
rags are full of them.

Exabyte is rumored to be working on a drive which is 5GB, and transfers
at 30 MB/min (twice the current 15MB/min), and will be able to
read/write your old tapes. Beta drive in about a year or so.  Also
rumored is development of an "X-Y stack loader". The "CHS" (Cartridge
Handling System) is purported to fit in a 19" rack and hold 5 drives
and 120 (small model) or 240 (large model) tapes.  A robot arm will
pick the tape, insert it, and close the door of the drive.

The current drive, the EXB-8200, reads/writes 14-15MB/min (around
240KB/sec), therefore it takes 2 hours 40 mins to read/write the entire
tape.  Dumps approach this speed when going from a local disk (restores
are horrible no matter how they are done or what they are done from,
due to synchronous directory, indirect block, and inode writes). A
Sun-3/50 can stream 3 drives at once on its SCSI; we have had 7 drives
on it at once (you have to change the driver slightly to move bits in
the minor dev).
All 7 drives will "run", although not at full speed due to the SCSI
running out of time. The Sun-3/180 with a Ciprico RF-3500 was found to
also run 7 drives ok, and can stream 4 of them, and almost stream 5.
Skipping to filemarks (both forward and reverse) runs at 10X the read
speed (~150 MB/min), or about 16 mins to go the whole length of the
tape. Rewind is around 70X read speed or 2 mins 35 sec. Possibilities
have been discussed about having the next drive be able to position to
any file, any block at nearly rewind speed (3-4 mins), but nothing is
committed. I may do some experimenting with high speed positioning on
the current drive via servo commands issued from the 9600 baud "maint"
port on the drive.
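
The timing figures above hang together as simple ratios of the
full-tape read time. A short Python sketch (function names made up for
illustration) that derives the traverse times from the quoted
speedups, and the speedup implied by the quoted rewind time:

```python
FULL_READ_MIN = 2 * 60 + 40     # 2 hours 40 mins for the whole tape


def traverse_minutes(speedup):
    """Minutes to cover the whole tape at `speedup` x read speed."""
    return FULL_READ_MIN / speedup


def implied_speedup(minutes):
    """Speed ratio implied by covering the whole tape in `minutes`."""
    return FULL_READ_MIN / minutes


print(traverse_minutes(10))                    # filemark skip at 10X
print(round(traverse_minutes(70), 2))          # rewind at a nominal 70X
print(round(implied_speedup(2 + 35 / 60), 1))  # ratio for 2 min 35 sec
```

The 10X skip speed gives the 16 minutes quoted. The quoted rewind time
of 2 mins 35 sec works out to a ratio nearer 62X, which is in keeping
with "around 70X".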

Until now, most of the work was done on Sun-3/50s running SunOS 3.4.
3/60's have also been run and seem to work ok.  A Ciprico RF-3500
controller is running on a 3/180 (that driver was rewritten also, and
given back to Ciprico). The next project is to make the SunOS 4.0
driver "work" with the Exabyte.  It currently moves the tape, but still
has a way to go.

To conclude, it looks like the Exabyte drive is finally a solid
product. It comes just in time, as systems with 1.2GB disks are
becoming commonplace. I wish to thank Exabyte and all their staff for
putting up with a year and a half of continuous questions, bug reports,
and bitching from me. I would also like to thank Exabyte and Perfect
Byte for donating some drives (and a Ciprico controller) to make all
this possible.

--ghg

George Goble, Engineering Computer Network, Purdue U, W. Lafayette IN 47907 
(317) 494-3545  Arpa: ghg@purdue.edu  uucp: {backbone}!pur-ee!ghg