[comp.unix.ultrix] /etc/rmt hangs

robm@ataraxia.Berkeley.EDU (Rob McNicholas) (05/08/91)

We do backups by rsh'ing rdump commands to our workstations from a
DECstation 3100 with an Exabyte 8mm drive.  However, these backups
occasionally hang.  If I kill the rsh process, all the related
processes appear to die, but netstat still shows a connection from
the remote machine. (In this example, bsim is the name
of the remote machine, and ataraxia is the name of the machine with
the tape drive).  From ataraxia, netstat shows:

tcp        0      0  ataraxia.shell         bsim.1019              CLOSE_WAIT

From bsim, it shows:

tcp        0      0  bsim.1019              ataraxia.shell         FIN_WAIT_2
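
A note on reading those two states: CLOSE_WAIT on ataraxia means it has
received bsim's FIN but the local process has not yet closed its end of
the socket, while FIN_WAIT_2 on bsim means it is waiting for that close.
A rough Python sketch for picking such connections out of netstat output
follows.  It assumes the six-column layout shown above, and the name
find_half_closed is made up for illustration.

```python
# States that indicate one side has closed while the other has not.
HALF_CLOSED = {"CLOSE_WAIT", "FIN_WAIT_2"}


def find_half_closed(netstat_text):
    """Return (local, remote, state) tuples for half-closed TCP lines.

    Assumes each connection line has exactly six whitespace-separated
    columns: proto, recv-q, send-q, local address, remote address, state.
    """
    hits = []
    for line in netstat_text.splitlines():
        fields = line.split()
        # Skip headers, blank lines, and anything that is not a TCP entry.
        if len(fields) != 6 or not fields[0].startswith("tcp"):
            continue
        local, remote, state = fields[3], fields[4], fields[5]
        if state in HALF_CLOSED:
            hits.append((local, remote, state))
    return hits


# The two lines quoted in the post, one from each machine.
sample = """\
tcp        0      0  ataraxia.shell         bsim.1019              CLOSE_WAIT
tcp        0      0  bsim.1019              ataraxia.shell         FIN_WAIT_2
"""

for local, remote, state in find_half_closed(sample):
    print(local, remote, state)
```

A CLOSE_WAIT entry that never goes away usually points at a local
process that is still holding the socket open, which fits the stuck
/etc/rmt described here.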


ataraxia has a continuous load of 1, and /etc/rmt is still running.
Any attempt to access the tape drive reports "/dev/nrmt0h: device
busy".

Attempts to kill the /etc/rmt process have no effect.  In ps output,
/etc/rmt shows as:

root     25629  0.0  0.2  128   40 ?  D     0:01 /etc/rmt
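
The D in the state column marks a process in uninterruptible wait
(typically disk or device I/O).  Signals, including SIGKILL, are not
acted on until that wait ends, which would explain why kill has no
effect.  A rough Python sketch for spotting such processes follows.  It
assumes the ten-column ps layout of the line above, and the name
uninterruptible is hypothetical.

```python
def uninterruptible(ps_text):
    """Return (pid, command) for processes whose state starts with 'D'.

    Assumes the column order of the line quoted in the post:
    user, pid, %cpu, %mem, size, rss, tty, state, time, command.
    """
    procs = []
    for line in ps_text.splitlines():
        # Split into at most ten fields so the command keeps its arguments.
        fields = line.split(None, 9)
        # Skip the header row and any short or malformed lines.
        if len(fields) < 10 or not fields[1].isdigit():
            continue
        if fields[7].startswith("D"):
            procs.append((int(fields[1]), fields[9]))
    return procs


# The ps line quoted in the post.
sample = "root     25629  0.0  0.2  128   40 ?  D     0:01 /etc/rmt"

for pid, command in uninterruptible(sample):
    print(pid, command)
```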

Can I tell rmt to close the tape device without writing a custom
program?  Any other suggestions?

Many thanks in advance,

-Rob
--
Rob McNicholas			Computer Systems Support Group, U.C. Berkeley
robm@janus.berkeley.edu		....!ucbvax!janus!robm
Home: 415/339-1514		Work: 415/642-8633

litwack@dccs.upenn.edu (Mark Litwack) (05/25/91)

There is no way to get /etc/rmt out of the I/O wait
once it is stuck (or dump for that matter, if you
happen to be using it locally).

A patch is available from DEC to fix processes
that are hung waiting for I/O on SCSI tape drives.

The patch contains mscp_tape.o, scsi.o, and scsi_asc.o.

mscp_tape.o fixes the hang problem, scsi.o fixes a kernel
panic, and scsi_asc.o improves SCSI tape performance for
the DS5000.

Overall, a good set of patches to have.

-mark