libes@cme.nist.gov (Don Libes) (08/23/90)
I've recently done some work with ptys and thought I'd share my experiences with you - especially since the manuals weren't very sharing for me. I'm interested in comments.

The Sun 3/60 used is running SunOS 4.1.
The DecStation 3100 used is running V2.1.14.

1) After slave side closes fd, master side read() returns -1.

    master reads X seconds
    after slave close        Sun errno    Dec errno
             0               5 (EIO)      35 (EWOULDBLOCK)
            10               5 (EIO)      5 (EIO)
            20               5 (EIO)      5 (EIO)

I had expected read() to return 0 to indicate EOF. The Sun engineer said the manuals are in error not to document this behavior, but could not explain why the driver was written this way. Can anyone? I didn't bother to call Dec, but I couldn't find that behavior documented either, nor can I rationalize it. (The fd was NOT marked no-delay.)

2) After slave writes data and closes fd, the master side reads:

    master reads X seconds
    after slave close        Sun          Dec
             0               data         data
            10               data         data
            20               -1,EIO       data

The Sun manual actually does document this, but doesn't phrase it quite the way I'd say it. Specifically, it is a byproduct of the underlying streams implementation - "close() waits up to 15 seconds, for each module and driver, for any output to drain before dismantling the stream." In other words, if you don't read your data quickly enough, you lose it!

The Dec behavior is what I would've expected. The Sun engineer could not explain where the number 15 came from, although he was kind enough to point out that Sun Consulting could change it on my machine for a small fee. Otherwise it is not user-settable.

He said no one had ever reported these as bugs before. He added that they might change their implementation [both (1) and (2)] in the future but made no guarantees. He did guarantee to change the manual, however.

Don Libes          libes@cme.nist.gov      ...!uunet!cme-durer!libes
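A master-side program that has to run on both systems can absorb the difference in (1) by treating either indication as end of data. The following is a minimal Python sketch, not code from the thread: the name read_until_slave_closes and the chunk size are made up for illustration, and it assumes only that os.read() raises OSError with errno set when the underlying read() returns -1.

```python
import errno
import os


def read_until_slave_closes(master_fd, chunk_size=1024):
    """Collect bytes from master_fd until the other side has gone away.

    Both end-of-data indications are accepted: a zero-length read,
    or a failed read with errno set to EIO.
    """
    chunks = []
    while True:
        try:
            data = os.read(master_fd, chunk_size)
        except OSError as exc:
            if exc.errno == errno.EIO:
                break  # BSD-style report that nobody holds the slave open
            raise      # anything else is treated as a real failure
        if not data:
            break      # zero-length read: conventional EOF
        chunks.append(data)
    return b"".join(chunks)
```

The EWOULDBLOCK seen on the Dec machine immediately after the close is deliberately not handled here; it would propagate to the caller as an exception.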
guy@auspex.auspex.com (Guy Harris) (08/24/90)
>The Dec behavior is what I would've expected. The Sun engineer could
>not explain where the number 15 came from,

It comes from the AT&T S5R3.x source code. Where it got the number, I don't know.
libes@cme.nist.gov (Don Libes) (08/24/90)
In article <3948@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>>The Dec behavior is what I would've expected. The Sun engineer could
>>not explain where the number 15 came from,
>
>It comes from the AT&T S5R3.x source code. Where it got the number, I
>don't know.

Uhhh, this wasn't quite the answer I was looking for. Let me rephrase the question(s):

Why do pty's return EIO instead of 0 upon EOF? Is SunOS, Ultrix, or neither doing the "right" thing? If Ultrix is BSD-based and SunOS is SV-based, obviously this EIO behavior is common practice, yet I don't find any documentation on it, nor could a company engineer explain it. What's the rationale?

Why do stream's dump their buffers when the writer closes? I would think this could be a problem with other drivers besides ptys. Or am I confused and is this just an error in the way the pty driver uses streams? Is it possible to change this behavior using a flow control option without taking a severe hit in efficiency? (The manual alluded to this but didn't give enough to go on.)

What is it you are expected to do in 15 seconds? It sure seems unusually large for internal system cleanup purposes. Yet it is worthless for user purposes if you can't make it infinite. Why can't you change this number?

And finally, why does everyone answer easy questions with "[long essay deleted] and this should probably be in the FAQ if it isn't already"? That indicates to me that neither the asker nor the answerer has read the FAQ. (This is a rhetorical question at the moment since the FAQ has slipped off the face of the earth. Hey, repost that sucker!)

Don Libes          libes@cme.nist.gov      ...!uunet!cme-durer!libes
guy@auspex.auspex.com (Guy Harris) (08/26/90)
>Why do pty's return EIO instead of 0 upon EOF?

Ask Berkeley. The standard 4.3BSD pseudo-tty driver returns EIO if nobody's holding the slave side open.

>Is SunOS, Ultrix, or neither doing the "right" thing?

Beats me, what's "the 'right' thing?" A program can probably deal with any sort of indication that the slave side is closed, either a zero-length read or -1 and EIO. If 4.3BSD compatibility is considered important, returning -1 and setting "errno" to EIO is "the 'right' thing." If one considers a zero-length read to be philosophically correct, or if it simplifies programs that run on the master side of a pseudo-tty, or something like that, returning -1 and setting "errno" to EIO isn't "the 'right' thing."

>If Ultrix is BSD-based and SunOS is SV-based,

SunOS is based on both BSD and S5, and also includes Sun stuff based on neither of them. SunOS's pseudo-tty driver is more based on the BSD one than on the S5 one, the fact that the slave side (but not the master side) in SunOS 4.x is STREAMS-based notwithstanding. The S5R4 pseudo-tty subsystem (it's more than just a driver, it includes a couple of STREAMS modules) returns an EOF (zero-length read) when the slave side closes, rather than returning EIO.

>obviously this EIO behavior is common practice, yet I don't find any
>documentation on it, nor could a company engineer explain it. What's
>the rationale?

Ask Berkeley, it was their idea. We preserved it at Sun, and I assume DEC did the same thing. Other vendors probably did so as well.

>Why do stream's dump their buffers when the writer closes?

It has to do *something* with them when the queue is deleted, since they're attached to that queue....

>I would think this could be a problem with other drivers besides ptys.

It could be. Unfortunately, just about *any* close behavior is going to screw *somebody*.

Waiting forever for output to drain can lock up a tty port forever if it gets ^S'ed and there's output waiting.
Un-^S-ing when the port is closed screws terminals that depend on strict ^S/^Q behavior (yes, this actually happened). (System V Release "1"-to-3 behavior.)

Giving the port a finite amount of time to drain and then flushing output means you can lose output if the port stays ^S'ed for too long. (SunOS 4.0[.x] and maybe S5R4 behavior; also S5R3 behavior if the vendor has made any streams-based ttys.)

The ideal, at least for tty ports, is *probably* to wait until ^S is received or "carrier" goes away (real carrier in the case of serial ports; on pseudo-ttys, wait until ^S is received or the process on the master side goes away), but I don't guarantee that this won't screw anybody, either. (This is, I think, what 4.3BSD does.)

>Or am I confused and is this just an error in the way the pty driver uses
>streams? Is it possible to change this behavior using a flow control
>option without taking a severe hit in efficiency? (The manual alluded
>to this but didn't give enough to go on.)

I'm not sure where it alludes to this, nor why it does, nor what it means by "a flow control option". You can tweak processes on the slave side to wait for output to drain using the TCSBRK "ioctl", but this means you have to change those processes.

As I remember, we decided to change SunOS 4.1 to, in effect, wait forever for output to drain, by having the "ldterm" streams module do said "ioctl" internally for you as part of its "close" procedure, before its queue gets destroyed and before any of the queues below it get closed.

>What is it you are expected to do in 15 seconds? It sure seems
>unusually large for internal system cleanup purposes.

It's not for internal system cleanup purposes; it's waiting for output to drain.

>Yet it is worthless for user purposes if you can't make it infinite. Why
>can't you change this number?

Ask AT&T, it was their idea. I think I tried to sell them on having an "ioctl" to change it at one point.
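The slave-side workaround mentioned above (wait for output to drain before closing) might look roughly like this on a POSIX system. It is a sketch under assumptions, not the SunOS code: it uses Python's termios.tcdrain(), which as far as I know covers the same ground as the TCSBRK "ioctl" with a non-zero argument, and write_then_drain is a hypothetical helper name.

```python
import os
import termios


def write_then_drain(fd, data):
    """Write all of data to fd, then wait for tty output to drain."""
    view = memoryview(data)
    while view:
        written = os.write(fd, view)  # may be a partial write
        view = view[written:]
    if os.isatty(fd):
        # Blocks until the queued output has been transmitted.
        termios.tcdrain(fd)
    # Non-tty descriptors (pipes, files) have nothing to drain here.
```

A process would call this just before close(). It blocks for as long as output stays stopped, which is the "wait forever" trade-off described above.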
brnstnd@kramden.acf.nyu.edu (Dan Bernstein) (08/27/90)
In article <3954@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
> >I would think this could be a problem with other drivers besides ptys.
> It could be. Unfortunately, just about *any* close behavior is going to
> screw *somebody*.

Not necessarily.

> Waiting forever for output to drain can lock up a tty port forever if it
> gets ^S'ed and there's output waiting.

This is the correct behavior. The difficulties with locking up tty ports are reflections of two different problems: first, that ptys aren't dynamically allocated in 4BSD; and second, that standard ttys exist at all. Hardwired /dev/tty* should be replaced with raw /dev/modem* and so on; *all* tty use should go through a common interface provided by a pseudo-terminal session manager. This would solve many problems at once.

---Dan
boyd@necisa.ho.necisa.oz (Boyd Roberts) (08/27/90)
In article <3954@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>
>It could be. Unfortunately, just about *any* close behavior is going to
>screw *somebody*.
>

This is the _classic_ `virtual circuit problem'. The problem of deciding what is circuit shutdown [error] and what is end of data, and which is appropriate. You've got to make _all_ the right choices, and some of them are _hard_.

The way I like to think about it is the way pipes work. A close on a pipe indicates EOF to the reader. But, a write on a pipe with no one to read it is an error (SIGPIPE/EPIPE). But, to generalise this correctly you need to be able to say `kill this circuit for me because an error's occurred', so that one end can say to the other that something's up.

I say that each protocol layer should be self contained and _clean_. Now, the ISO people are not going to like this, but with virtual circuits you require two ways to shutdown a circuit at the protocol level itself, and not make it the responsibility of the layer above.

I remember all too well the existential horror when I realised (while writing this X.25 `spool across the wire' print server/client) that when I said close() it shut the circuit down -- _right now_!! No waiting for the data to arrive at the other end -- nothing. I had to write this _revolting_ gore, using the Q bit to say:

    X.25 software ABC                      X.25 software DEF

    Client: write(You got that?|Q_BIT)
                                           Server: read() You got that?|Q_BIT
                                           Server: write(I got it|Q_BIT)
                                           Server: hangup circuit after
                                                   `I got it' is delivered
    Client: read() I got it|Q_BIT
    Client: close()                        Server: close()
    Client: exit()                         Server: Loop

I didn't want to write any file transfer protocol -- why should I? After all, I was using a reliable, sequenced, unduplicated, connection based virtual circuit. I just wanted close() to block correctly and for a subsequent server read() to return 0. But, X.25 software ABC had an `interesting' idea about virtual circuits.
So I got to thinking that this was just _wrong_ and that Dennis* did it right.

Boyd Roberts			boyd@necisa.ho.necisa.oz.au

``When the going gets weird, the weird turn pro...''

* pure, vanilla, no foul gore, straight streaming V8 stream code
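The pipe semantics described in this post (a close is EOF to the reader, a write with no reader is an error) can be seen with a short Python sketch. The function name pipe_close_semantics is made up for illustration, and the sketch assumes a stock CPython, which ignores SIGPIPE at startup so that the failed write is reported as EPIPE instead of killing the process.

```python
import errno
import os


def pipe_close_semantics():
    """Return (what the reader sees after the writer closes,
    the errno the writer gets once no reader is left)."""
    # Case 1: writer closes. The reader gets the data, then EOF.
    r, w = os.pipe()
    os.write(w, b"last words")
    os.close(w)
    seen = [os.read(r, 100), os.read(r, 100)]
    os.close(r)

    # Case 2: reader closes. The writer gets an error, not silence.
    r, w = os.pipe()
    os.close(r)
    try:
        os.write(w, b"anyone there?")
        write_errno = 0
    except OSError as exc:
        write_errno = exc.errno
    finally:
        os.close(w)
    return seen, write_errno
```

Note that buffered data survives the writer's close here: the reader still gets "last words" before the zero-length read, which is the behaviour the X.25 software in the story did not provide.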
gwyn@smoke.BRL.MIL (Doug Gwyn) (08/27/90)
In article <6038@muffin.cme.nist.gov> libes@cme.nist.gov (Don Libes) writes:
>Why do pty's return EIO instead of 0 upon EOF?

If they do this, it is clearly wrong and would most likely be due to UNIX development now being done, or at least directed, by people who don't understand UNIX. read() should not return -1 upon encountering normal EOF on ANY object. If the end of the stream is due to a communication link failure, for example, then an error indication would in fact be preferable to an undifferentiated EOF indication.

>Why do stream's dump their buffers when the writer closes?

They're not supposed to; do they really do that? Is USO really screwing up UNIX to such an extent??

>What is it you are expected to do in 15 seconds?

There is no justification for penalizing an application for not consuming all the data buffered in the kernel within 15 seconds. I lost track of the origin of this thread; if this timeout is supposed to be related to the TCP protocol, my guess would be that somebody has yet again tripped up on the "FIN_WAIT_2" issue.

>And finally, why does everyone answer easy questions with "[long essay
>deleted] and this should probably be in the FAQ if it isn't already"?
>That indicates to me that neither the asker nor the answerer has read
>the FAQ.

Not really, usually it indicates that the responder has not memorized the FAQ list but feels that the question should be there and that either the asker has failed to read the FAQ list or else the question wasn't there. (Or, as in your case,
>... at the moment ... the FAQ has slipped off the face of the earth.
)
guy@auspex.auspex.com (Guy Harris) (08/28/90)
>>Why do stream's dump their buffers when the writer closes?
>
>They're not supposed to; do they really do that?

If by "dump their buffers" the original poster meant "throws data away if it isn't sent downstream in 15 seconds", the answer is "yes, they really do that".
guy@auspex.auspex.com (Guy Harris) (08/28/90)
>This is the correct behavior. The difficulties with locking up tty ports
>are reflections of two different problems: first, that ptys aren't
>dynamically allocated in 4BSD; and second, that standard ttys exist at
>all. Hardwired /dev/tty* should be replaced with raw /dev/modem* and so
>on; *all* tty use should go through a common interface provided by a
>pseudo-terminal session manager.

Even "ttys", i.e. serial ports, to which, say, a printer or plotter is attached?

What happens if, for whatever reason, a ^Q sent by said printer or plotter is lost? Is the idea that you detach the printer from the session, attach the session to a regular terminal, and type ^Q at it?
stevea@i88.isc.com (Steve Alexander) (08/28/90)
In article <3954@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>>Yet it is worthless for user purposes if you can't make it infinite. Why
>>can't you change this number?
>
>Ask AT&T, it was their idea. I think I tried to sell them on having an
>"ioctl" to change it at one point.

System V Release 4.0 has the I_SETCLTIME ioctl, which allows one to change the close wait time on the stream. I believe that the time is specified in milliseconds. There is also I_GETCLTIME which does what you'd expect.

I guess Guy should move into sales...

--
Steve Alexander, Software Technologies Group    | stevea@i88.isc.com
INTERACTIVE Systems Corporation, Naperville, IL | ...!{sun,ico}!laidbak!stevea
les@chinet.chi.il.us (Leslie Mikesell) (08/29/90)
In article <13650@smoke.BRL.MIL> gwyn@smoke.BRL.MIL (Doug Gwyn) writes:
>In article <6038@muffin.cme.nist.gov> libes@cme.nist.gov (Don Libes) writes:
>>Why do pty's return EIO instead of 0 upon EOF?
>If they do this, it is clearly wrong and would most likely be due to
>UNIX development now being done, or at least directed, by people who
>don't understand UNIX. read() should not return -1 upon encountering
>normal EOF on ANY object.

Is this meant to imply that the developers of STREAMS don't understand unix? A read on a STREAMS file is documented to return -1 when O_NDELAY is set and there is no data available (which has unfortunately been propagated into the tty emulation of at least some network implementations). Apparently there is some reason to want to know about zero length messages.

Les Mikesell
  les@chinet.chi.il.us
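The point of returning -1 for "no data yet" is that a zero-length read stays free to mean something else. A small Python sketch of the distinction follows. It uses a pipe with O_NONBLOCK on a present-day POSIX system, not a STREAMS file with O_NDELAY, so it illustrates the idea and does not reproduce the System V behaviour; poll_read is a hypothetical name.

```python
import os


def poll_read(fd, size=1024):
    """Classify one read on a non-blocking descriptor.

    Returns ("data", bytes), ("empty", b"") or ("eof", b"").
    """
    try:
        data = os.read(fd, size)
    except BlockingIOError:
        # read() returned -1 with EAGAIN/EWOULDBLOCK: nothing there yet.
        return ("empty", b"")
    if data:
        return ("data", data)
    # Zero-length read: on a pipe this means the writer has closed.
    return ("eof", b"")
```

If "no data" were also reported as a zero-length read, the last two cases could not be told apart by the caller.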
les@chinet.chi.il.us (Leslie Mikesell) (08/29/90)
In article <3964@auspex.auspex.com> guy@auspex.auspex.com (Guy Harris) writes:
>Even "ttys", i.e. serial ports, to which, say, a printer or plotter is
>attached?

>What happens if, for whatever reason, a ^Q sent by said printer or
>plotter is lost? Is the idea that you detach the printer from the
>session, attach the session to a regular terminal, and type ^Q at it?

Most printers will supply a ^Q when powered up, when the lid is closed, when the on-line button is pressed, etc. I'd prefer for the computer to wait for such an occurrence rather than trying to guess when the paper supply has been replenished.

The real problem is when you have placed a long distance call to or from a modem on a unix machine and pick up a ^S from line noise. I've even seen cases where the device driver would lock up so that even a kill -9 wouldn't release the process and there was no way to drop the call without physical access to the modem.

Les Mikesell
  les@chinet.chi.il.us
thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) (09/09/90)
In article <6038@muffin.cme.nist.gov> libes@cme.nist.gov (Don Libes) writes:
>Why do pty's return EIO instead of 0 upon EOF?

In my opinion, there is no such thing as an EOF when reading from a (master) pty. After all, the pty is designed to let a daemon, script program or similar pretend that it is on the outside of the machine looking in through a serial interface, receiving exactly the same bytes as would be passed over a serial line. And, barring wire-cutters and over-voltage, a serial line is very open-ended and EOF-free.

However, a serial interface may have some control lines, such as DTR or RTS. These will typically be asserted when the corresponding UNIX device file is opened; they may be negated when it is closed (but only if the HUPCL flag is set, I think).

So, apart from the HUPCL business, the EIO error on a pty master corresponds to ``DTR not asserted'', not to EOF.

--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark      [uunet!]mcsun!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers.      thorinn@diku.dk