sysmark@physics.utoronto.ca (Mark Bartelt) (04/05/91)
OK, back to the well again. Several more questions about exabyte drives
under IRIX. Several weeks ago I wrote

| I have an exabyte drive (from Dilog) on a 4D25. When I recently upgraded
| to IRIX 3.3.1, I was pleased to find the /dev/{r}mt/tps*v devices: It's
| nice to finally be able to read/write arbitrary-length tape records!

to which several people pointed out (more politely than I deserved) that
I didn't RTFM. In fact, the tps*v devices were there all along. However,
this brings up question #1:

It turns out that MAKEDEV will create the tps*v devices *only* if there is
an exabyte drive attached to the SCSI bus and powered on. (This is why I'd
been under the misimpression that the tps*v devices were new: The exabyte
hadn't yet arrived when I installed 3.2.1, but was connected when 3.3.1 got
installed.)

How come? Specifically, why is this particular class of special files treated
differently from nearly everything else? For example, if I do

    MAKEDEV t3270

it just goes ahead and makes the appropriate special files, irrespective of
whether or not I actually have an IBM 3270 interface installed on my IRIS.
This is actually a nice feature, since I can ensure that the special files
I want are there in advance, in anticipation of a device actually getting
installed. Not so, for some reason, with SCSI tapes, at least in certain
circumstances. The tps*v files get created only if "mt -t whatever status"
reports that there's an exabyte there. Why?

Anyway, this brings me to question #2: How exactly are the tps*v forms of
the device handled? Various people, in a previous exabyte discussion, made
some comments about the way that data gets stored in physical records on
tape. For example ...

[ Doug Thompson ]
| Physical records on 1kb records grouped 8 records per helical track,
| 8kb per track.

[ Dave Olson ]
| The 'block size' is written as part of the header info. Larger block
| sizes gain you no capacity or performance.
| You DO want to make sure the
| block sizes is a multiple of 1024 for maximum capacity [ ... ]

So question #2 is, given that the actual on-tape physical records are 1k,
how are variable-length records written? Are logical records somehow packed
into the 1k physical records, incorporating some sort of record structure on
the physical tape records (something similar to the way Fortran unformatted
disk files are implemented under UNIX, for example)? Or is the information
out-of-band somewhere? And, if the latter, where? In the record headers?

I'm not sure knowing the answer to this is really crucial, but it would be
culturally interesting, and possibly useful when understanding portability
problems between IRIX and exabyte drives on other vendors' systems.

Finally, question #3. Does anyone have a clue as to what's going on here?

    % maketape /dev/rmt/tps0d7nsv     # "nsv" device works fine
    5+0 records in
    5+0 records out
    10+0 records in
    10+0 records out
    15+0 records in
    15+0 records out
    %
    % maketape /dev/rmt/tps0d7ns      # "ns" device doesn't
    5+0 records in
    5+0 records out
    dd: write error: I/O error
    3+0 records in
    3+0 records out
    dd: write error: I/O error
    3+0 records in
    3+0 records out
    %

Here's what "maketape" looks like. Note that the dd's bs is 1k, so I'd
expect that the exabyte should be happy, regardless of whether it's the
"nsv" or "ns" device. Changing the bs to, say, 8k, doesn't help (in fact,
the error appears after a smaller number of records!). Also, whenever the
failure occurs, the orange light on the drive comes on for a rather long
time (~30 seconds, though I haven't actually timed it), during which time
all console windows are semi-frozen, in that they echo what's typed, but it
appears that no processes get to run (keyboard input doesn't get processed
until the exabyte stops fussing around). Is the drive locking up the SCSI
bus? Is the IRIX SCSI driver sleeping at some high priority, waiting for
the drive to finish chatting with it?
A combination of both of the above?
Or something else? So, actually, two related questions: (A) why does the
script fail for the "ns" flavour of the device; (B) when it fails, why does
the whole system go out to lunch for a semi-eternity?

---------------------------------------------------------------------------
#!/bin/sh

# script to demonstrate exabyte weirdness

case $# in
1)  tape=$1 ;;
*)  echo "Usage: `basename $0` tapedev" >&2; exit 1 ;;
esac

mt -t $tape rewind

for x in 5 10 15; do
    dd if=/dev/zero bs=1k of=$tape count=$x
done

mt -t $tape rewind
---------------------------------------------------------------------------

As usual, and as many times before, thanks in advance.

Mark Bartelt                          416/978-5619
Canadian Institute for                sysmark@cita.toronto.edu
Theoretical Astrophysics              sysmark@cita.utoronto.ca
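
The maketape script above keeps writing after a dd failure, which is why the transcript shows two separate I/O errors. A minimal sketch of a variant that stops at the first failed write might look like the following. The function name write_files is hypothetical and not part of the original script; the dd arguments are the ones the script already uses.

```sh
#!/bin/sh
# Hypothetical helper (not part of the original maketape script):
# write one file of N 1k blocks per argument, stopping at the first failure.
#
# usage: write_files DEVICE COUNT...
write_files() {
    dev=$1
    shift
    for n in "$@"; do
        # Same dd invocation as the original script, with the exit
        # status checked instead of ignored.
        dd if=/dev/zero bs=1k of="$dev" count="$n" || {
            echo "write of $n blocks to $dev failed" >&2
            return 1
        }
    done
    return 0
}
```

In the original script this would replace the for loop, called as `write_files $tape 5 10 15` between the two `mt -t $tape rewind` commands. Stopping early does not explain the error, but it leaves the drive in the state of the first failure, which may make the console and log messages easier to match to a single write.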
vjs@rhyolite.wpd.sgi.com (Vernon Schryver) (04/05/91)
In article <1991Apr4.202436.25620@helios.physics.utoronto.ca>, sysmark@physics.utoronto.ca (Mark Bartelt) writes:
> ...
> How come? Specifically, why is this particular class of special files treated
> differently from nearly everything else? For example, if I do
>
>     MAKEDEV t3270
>
> it just goes ahead and makes the appropriate special files, irrespective of
> whether or not I actually have an IBM 3270 interface installed on my IRIS.
> This is actually a nice feature, since I can ensure that the special files
> I want are there in advance, in anticipation of a device actually getting
> installed. Not so, for some reason, with SCSI tapes, at least in certain
> circumstances. The tps*v files get created only if "mt -t whatever status"
> reports that there's an exabyte there. Why?

Because /dev is fat, bloated, bulging, and ugly.

Because there are important, hallowed UNIX commands that stat(2) names in
/dev until they find what they want, with obvious performance synergisms
with the current size of /dev.

Because we have not figured out a good mechanism to create the nodes only
for the hardware that actually exists on the system.

Because it's practically impossible to find a device name in /dev by
visually scanning `ls -l /dev`.

(Sorry about the babble. This issue has many old connotations around here.)

Vernon Schryver, vjs@sgi.com
olson@anchor.esd.sgi.com (Dave Olson) (04/05/91)
In <1991Apr4.202436.25620@helios.physics.utoronto.ca> sysmark@physics.utoronto.ca (Mark Bartelt) writes:

| OK, back to the well again. Several more questions about exabyte drives
| under IRIX. Several weeks ago I wrote
|
| | I have an exabyte drive (from Dilog) on a 4D25. When I recently upgraded
| | to IRIX 3.3.1, I was pleased to find the /dev/{r}mt/tps*v devices: It's
| | nice to finally be able to read/write arbitrary-length tape records!
|
| to which several people pointed out (more politely than I deserved) that
| I didn't RTFM. In fact, the tps*v devices were there all along. However,
| this brings up question #1:
|
| It turns out that MAKEDEV will create the tps*v devices *only* if there is
| an exabyte drive attached to the SCSI bus and powered on. (This is why I'd
| been under the misimpression that the tps*v devices were new: The exabyte
| hadn't yet arrived when I installed 3.2.1, but was connected when 3.3.1 got
| installed.)
|
| How come? Specifically, why is this particular class of special files treated
| differently from nearly everything else? For example, if I do

Partly because of the number of devices created, and partly because
of the confusion when folks try to use the variable mode devices
on drives that don't support it. This is even more the case with
the 9 track drives, where you get even more possible types of devices.

As I said in an earlier posting, one of the things you get when you
buy peripherals directly from SGI is a little manual telling you
these kinds of things; 3rd party is cheaper, but you don't usually
get the same level of support and information.

| Anyway, this brings me to question #2: How exactly are the tps*v forms of
| the device handled? Various people, in a previous exabyte discussion, made
| some comments about the way that data gets stored in physical records on
| tape. For example ...
| [ Doug Thompson ]
| | Physical records on 1kb records grouped 8 records per helical track,
| | 8kb per track.
| [ Dave Olson ]
| | The 'block size' is written as part of the header info. Larger block
| | sizes gain you no capacity or performance. You DO want to make sure the
| | block sizes is a multiple of 1024 for maximum capacity [ ... ]
| So question #2 is, given that the actual on-tape physical records are 1k,
| how are variable-length records written? Are logical records somehow packed
| into the 1k physical records, incorporating some sort of record structure on
| the physical tape records (something similar to the way Fortran unformatted
| disk files are implemented under UNIX, for example)? Or is the information
| out-of-band somewhere? And, if the latter, where? In the record headers?

Yes, the records are packed (except possibly the last one before a
FM, or if you can't keep the drive streaming), and the info is out
of band, in that you can't get to it, but the drive can. The net
is that if a vendor supports the variable mode in their driver, it
should be able to read tapes written on any other vendor's exabyte
drive in variable mode. Figuring out the correct size to use when
reading can be an interesting/frustrating process sometimes...

| Here's what "maketape" looks like. Note that the dd's bs is 1k, so I'd
| expect that the exabyte should be happy, regardless of whether it's the
| "nsv" or "ns" device. Changing the bs to, say, 8k, doesn't help (in fact,
| the error appears after a smaller number of records!). Also, whenever the
| failure occurs, the orange light on the drive comes on for a rather long
| time (~30 seconds, though I haven't actually timed it), during which time
| all console windows are semi-frozen, in that they echo what's typed, but it
| appears that no processes get to run (keyboard input doesn't get processed
| until the exabyte stops fussing around). Is the drive locking up the SCSI
| bus? Is the IRIX SCSI driver sleeping at some high priority, waiting for
| the drive to finish chatting with it? A combination of both of the above?
| Or something else? So, actually, two related questions: (A) why does the
| script fail for the "ns" flavour of the device; (B) when it fails, why does
| the whole system go out to lunch for a semi-eternity?

Well, I'll bet that there are some messages in your console window,
and in your SYSLOG. I might be able to tell you what the problem
is, if you email them to me. My guess is that you have a cabling
or termination problem, but there are other possibilities.

(The SCSI bus is in use, and some pages for the window/manager
probably can't get paged in; after 30 seconds or so, the driver
times out, and resets the SCSI bus, after which most things will
recover, but not all.)
--
Dave Olson

Life would be so much easier if we could just look at the source code.
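
For collecting the SYSLOG messages mentioned above before mailing them, a small filter along these lines may help. This is only a sketch: the function name scsi_messages is hypothetical, the default log path /usr/adm/SYSLOG is an assumption about where this IRIX release keeps its log, and the search pattern is a guess at what the relevant lines contain, not a description of the driver's actual message format.

```sh
#!/bin/sh
# Hypothetical helper: print log lines that mention SCSI or a tps device.
# The default path and the pattern are assumptions; adjust both as needed.
#
# usage: scsi_messages [LOGFILE]
scsi_messages() {
    log=${1:-/usr/adm/SYSLOG}
    if [ ! -r "$log" ]; then
        echo "cannot read $log" >&2
        return 1
    fi
    # Case-insensitive match on "scsi" or on names such as tps0d7.
    grep -i -e 'scsi' -e 'tps[0-9]' "$log"
}
```

Running `scsi_messages` right after a failed write to the "ns" device should show whether the driver logged a timeout or a bus reset around the time of the roughly 30-second freeze, assuming such events are logged at all.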