[comp.sys.sun] Problem with lockf on SunOS 3.5

rbj@dsys.ncsl.nist.gov (Root Boy Jim) (06/24/89)

? From:    chuck@morgan.com (Chuck Ocheret)

? I have been able to remove <exiting> processes in the past when the
? scenario is that the process is waiting for an output queue to flush.  A
? short program that flushes the device's queue allows the process to rest
? in peace.  For example for a terminal, an ioctl() call with a TCFLSH
? command (for termio at least) can flush the offending output queue.  The
? zombie problem occurs very often when testing new comm systems (maybe I'm
? a sloppy developer) so I keep an "exorcist" program around for just such
? occasions.

? I would like to see a general purpose "exorcist" which, given a pid, can
? determine what device the process is waiting for and take the appropriate
? action.

? Chuck Ocheret
? Morgan Stanley & Co., Inc.

? 1251 Avenue of the Americas
? New York, N.Y.  10020
? (212)703-4474
? chuck@morgan.com

? [[ When one of our terminal lines got stuck with an <exiting> process, I
? ran a program that called ioctl() with TCFLSH (remembering that this
? sometimes unstuck these beasts).  It had no effect.  So it doesn't always
? work.  --wnl ]]
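For readers who want to see what such an "exorcist" might look like, here is
a minimal sketch of the flush described in the quoted post. It is not Chuck's
program. It assumes a system with POSIX termios and uses tcflush() with
TCOFLUSH in place of the ioctl() with TCFLSH named in the post; the function
name flush_output_queue is hypothetical. As the moderator's note says, a
flush like this does not always free the stuck process.

```c
#include <errno.h>
#include <fcntl.h>
#include <termios.h>
#include <unistd.h>

/*
 * Hypothetical helper: open the named terminal device and discard any
 * output still waiting in its queue.
 * Returns 0 on success, -1 on failure with errno set (ENOTTY if the
 * path does not refer to a terminal).
 */
int flush_output_queue(const char *tty_path)
{
    int fd, rc, saved_errno;

    /* O_NONBLOCK so the open does not wait for carrier;
       O_NOCTTY so the line does not become our controlling terminal. */
    fd = open(tty_path, O_WRONLY | O_NONBLOCK | O_NOCTTY);
    if (fd == -1)
        return -1;

    /* TCOFLUSH discards data written but not yet transmitted. */
    rc = tcflush(fd, TCOFLUSH);

    /* Preserve errno from tcflush() across the close(). */
    saved_errno = errno;
    close(fd);
    errno = saved_errno;

    return rc;
}
```

A caller would pass the device the stuck process has open, for example
flush_output_queue("/dev/ttya"), where the device name is only an
illustration, and report the error with perror() if the call returns -1.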

OK, so TCFLSH is one of them new-fangled termio thingies. Perhaps that
makes a difference. But as we all know, regular terminal I/O is always
interruptible by signals. Why is the tty driver sleeping rather than
waiting? Sounds like a bug somewhere.

	Root Boy Jim is what I am
	Are you what you are or what?

anand@amax.npac.syr.edu (Rangachari Anand) (06/25/89)

The following program works perfectly on SunOS 4.0.  When run on 3.5, two
things happen:

  1. The program hangs. The process cannot be killed, even with
     kill -9.

  2. The daemon lockd dies.

The program is shown below:
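#include <stdio.h>
#include <stdlib.h>
#include <fcntl.h>
#include <sys/file.h>
#include <unistd.h>

int main(void)
{
  int fd, res;

  /* O_CREAT needs a mode argument; without one the new file gets garbage permissions */
  fd = open("foo", O_WRONLY | O_CREAT, 0644);
  if (fd == -1) { perror("t1: open"); exit(1); }

  /* lock the first byte, blocking until the lock is granted */
  res = lockf(fd, F_LOCK, (long) 1);
  if (res == -1) { perror("t1: lockf F_LOCK"); exit(1); }
  printf("Lockf returns %d\n", res);

  /* release the lock */
  res = lockf(fd, F_ULOCK, (long) 1);
  if (res == -1) { perror("t1: lockf F_ULOCK"); exit(1); }
  printf("Lockf returns %d\n", res);

  /* check whether another process holds a lock on that byte */
  res = lockf(fd, F_TEST, (long) 1);
  if (res == -1) { perror("t1: lockf F_TEST"); exit(1); }
  printf("Lockf returns %d\n", res);

  close(fd);
  return 0;
}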

Does anyone know what causes this problem?

                                           R.Anand

Internet:  anand@amax.npac.syr.edu
Bitnet:    ranand@sunrise

guy@uunet.uu.net (Guy Harris) (07/06/89)

>OK, so TCFLSH is one of them new-fangled termio thingies. Perhaps that
>makes a difference. But as we all know, regular terminal I/O is always
>interruptible by signals. Why is the tty driver sleeping rather than
>waiting? Sounds like a bug somewhere.

No, the problem is that the process is in the <exiting> state.  If it's
in that state, it's not accepting signals, so even if it is sleeping at
a priority above PZERO it's still not interruptible.  This is pretty
much inherent in the way UNIX handles the automatic "close" done for an
exiting process - if such a "close" blocks, you're screwed, since
there's no provision for having a device's "close" routine know that
somebody's impatient and that it should therefore shut the device down
even if the event it's waiting for hasn't occurred yet.

You could argue that the lack of such a provision is a bug, but if it
is, it's not a simple bug in a driver, it's a more fundamental
deficiency.
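
The rule described above can be written down as a toy model. This is only a
sketch of the reasoning in the post, not kernel source: the struct, the
function, and the value given to TOY_PZERO are all hypothetical names chosen
for illustration. It shows why a sleep above PZERO, which a signal would
normally break, stays unbroken once the process is exiting.

```c
#include <stdbool.h>

/* Toy model only: these names and values are illustrative and are not
   taken from any SunOS or BSD kernel source. */
enum { TOY_PZERO = 25 };

struct toy_proc {
    int  sleep_pri;    /* priority the process is sleeping at */
    bool exiting;      /* true once the process is in the <exiting> state */
    bool sig_pending;  /* true if a signal has been sent to the process */
};

/* Would a signal wake this sleeping process? */
bool toy_signal_wakes(const struct toy_proc *p)
{
    if (!p->sig_pending)
        return false;   /* nothing to deliver */
    if (p->exiting)
        return false;   /* an exiting process no longer accepts signals */
    /* Only sleeps at a priority above PZERO are interruptible. */
    return p->sleep_pri > TOY_PZERO;
}
```

In this model a process with exiting set is never woken, whatever its sleep
priority, which is the situation a blocked "close" leaves it in.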