[comp.unix.questions] Answer on sockets; Question on screen under SunOS 4.01

primer@math.harvard.edu (Jeremy Primer) (11/13/90)

In article <see References line!> papp@remus.rutgers.edu (papp) writes:

   If anyone out in net-land has a piece of fairly simple
   and straightforward C code that creates the sockets and passes
   information back and forth, I would greatly appreciate a look-see.

An elegantly written and useful piece of code that uses sockets is
screen 2.0 by Oliver Laumann, which implements a detachable window
system for a dumb terminal (and gives the virtual screens ANSI
compatibility).  It is not overly simple (to me), but it is clearly
written.  It is available for anonymous ftp from isy.liu.se (and
perhaps other places).

I have a related question which is really about SunOS: I have been
running screen 2.0 for several months, first under SunOS 4.0 and now
under SunOS 4.1.  Under 4.0, when one suspended the primary program
in a window, it would automatically restart, which was fine.  Now,
under 4.1, any suspended program simply dies.  For example, type
"screen emacs" followed by C-z, and the one window closes, allowing
screen to terminate.  Emacs has also terminated; it is not suspended
somewhere.  What has changed?
--
Jeremy Primer, Department of Mathematics, 1 Oxford Street, Cambridge MA 02138
primer@math.harvard.edu   ...!harvard!zariski!primer    primer@zariski.bitnet

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/14/90)

In article <PRIMER.90Nov12224354@osgood.harvard.edu> primer@math.harvard.edu (Jeremy Primer) writes:
  [ suspended process under screen 2.0 dies under SunOS 4.1 ]
> What has changed?

You're seeing the evil effects of POSIX sessions, the only innovation in
P1003.1 not based upon real-world experience and hence (objectively) the
worst feature of the standard. SunOS 4.1 is a POSIX-based system.

I've upgraded pty to SunOS 4.1. Now if we could only convince Oliver to
support pty under screen along with (or in place of) his old pseudo-tty
allocation code, this wouldn't be a problem. Oh, well.

---Dan

net@opal.cs.tu-berlin.de (Oliver Laumann) (11/15/90)

In article <7122:Nov1408:21:1690@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> In article <PRIMER.90Nov12224354@osgood.harvard.edu> primer@math.harvard.edu (Jeremy Primer) writes:
>   [ suspended process under screen 2.0 dies under SunOS 4.1 ]
> > What has changed?
> 
> I've upgraded pty to SunOS 4.1. Now if we could only convince Oliver to
> support pty under screen along with (or in place of) his old pseudo-tty
> allocation code, this wouldn't be a problem. Oh, well.

Huh?  What does the problem that was described in the original article
have to do with the way pseudo-ttys are allocated in screen?  Am I
missing something?

The problem with screen under SunOS 4.1 obviously is that a process
running under control of screen is killed (i.e. receives a SIGKILL)
when you type ^Z in the respective window.  Since screen doesn't do
this, it looks like the kernel kills the process.

Instead of trying to convince me to change screen's pseudo-tty
allocation code, I would appreciate it if someone could tell me under
exactly what conditions the SunOS kernel kills a process, so that I can
try to put a work-around into screen (unfortunately we don't have the
SunOS sources, so I can't look that up myself).

Thanks,
--
Oliver Laumann     net@TUB.BITNET     net@tub.cs.tu-berlin.de     net@tub.UUCP

gwyn@smoke.brl.mil (Doug Gwyn) (11/16/90)

In article <7122:Nov1408:21:1690@kramden.acf.nyu.edu> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
>You're seeing the evil effects of POSIX sessions, the only innovation in
>P1003.1 not based upon real-world experience and hence (objectively) the
>worst feature of the standard. SunOS 4.1 is a POSIX-based system.

Actually, it was based on real-world experience, although as with many
of the details in such standards the standard specified a slight variant
of what had previously been implemented.  The capsule summary:  BSD job
control is a horrible kludge that never did work right and required
vhangup etc.  HP-UX was based on UNIX System V and when customers wanted
job control some reengineering was necessary in order to make it work in
that environment and also to close several security holes.  POSIX job
control was originally specified along the lines recommended by HP-UX
engineers, but got redesigned during balloting, as did lots of other
stuff.  I think the IEEE balloting procedures suck.

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/16/90)

In article <2212@kraftbus.opal.cs.tu-berlin.de> net@tubopal.UUCP (Oliver Laumann) writes:
> Huh?  What does the problem that was described in the original article
> have to do with the way pseudo-ttys are allocated in screen?

It has everything to do with it. I had many reports of the same problem
with pty. What happens is that the process running under the pseudo-tty
is in what POSIX calls an ``orphaned process group,'' because it's in a
different session from its parent. You know how orphaned processes are
killed when they're stopped? Well, POSIX requires that this also happen
to processes in an orphaned process group, even if the parent is around.
This requirement is unreasonable, useless, and unnecessarily hurts lots
of existing programs, but it's what the standard says.

Now that pty works around the problem, all code using it will be
portable to SunOS 4.1 (and Ultrix 4.0 and Convex UNIX 8.0, which have
the same behavior). This is why encapsulating pseudo-tty management into
a single program is so useful: no other program ever has to worry about
pseudo-tty portability again.

> Instead of trying to convince me to change screen's pseudo-tty
> allocation code, I would appreciate it if someone could tell me under
> exactly what conditions the SunOS kernel kills a process, so that I can
> try to put a work-around into screen (unfortunately we don't have the
> SunOS sources, so I can't look that up myself).

And this is what happens when you refuse to make your programs portable
by taking advantage of what's available.

---Dan

gwc@root.co.uk (Geoff Clare) (11/19/90)

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:

> You know how orphaned processes are
>killed when they're stopped? Well, POSIX requires that this also happen
>to processes in an orphaned process group, even if the parent is around.
>This requirement is unreasonable, useless, and unnecessarily hurts lots
>of existing programs, but it's what the standard says.

Wrong, wrong, wrong.  That is not what the standard says at all.

Here is the relevant text from P1003.1-1990, 3.3.1.3:

	"A process that is a member of an orphaned process group shall
	not be allowed to stop in response to the SIGTSTP, SIGTTIN, or
	SIGTTOU signals.  In cases where delivery of one of these signals
	would stop such a process, the signal shall be discarded."

In other words, POSIX has fixed the revolting BSD behaviour Dan describes.
Instead of getting killed, the process just gets an EIO error from read(),
write(), etc.
-- 
Geoff Clare <gwc@root.co.uk>  (Dumb American mailers: ...!uunet!root.co.uk!gwc)
UniSoft Limited, Hayne Street, London EC1A 9HH, England.   Tel: +44-71-315-6600

brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (11/20/90)

In article <2539@root44.co.uk> gwc@root.co.uk (Geoff Clare) writes:
> brnstnd@kramden.acf.nyu.edu (Dan Bernstein) writes:
> > You know how orphaned processes are
> >killed when they're stopped? Well, POSIX requires that this also happen
> >to processes in an orphaned process group, even if the parent is around.
> >This requirement is unreasonable, useless, and unnecessarily hurts lots
> >of existing programs, but it's what the standard says.
  [ correction ]

Yes. I should have said the following:

  You know how job control doesn't work on orphaned processes? Well,
  POSIX requires that this also happen to processes in an orphaned
  process group, even if the parent is around. This requirement is
  unreasonable, useless, and unnecessarily hurts lots of existing
  programs, but it's what the standard says. 

> In other words, POSIX has fixed the revolting BSD behaviour Dan describes.

No, it has not. Any process with a parent should be allowed to stop.
BSD does that correctly. POSIX doesn't. Give me one good reason that
screen should fail under POSIX.

In the exceptional case when a process doesn't have a parent, there are
arguments for (BSD) killing the process, for stopping the process, and
for (POSIX) ignoring the signal. I favor the POSIX behavior over the BSD
behavior, though I think ignoring orphan status is the best solution. In
any case, this is a minor issue compared to the problem of what POSIX
does to processes that *do* have parents.

---Dan