dennis@rlgvax.UUCP (Dennis Bednar) (12/13/85)
multiplexor(MUX), flow control, out of band data

Introduction - Overview

I recently came across a very interesting situation which can
cause deadlocks over an asynchronous serial computer link when
both machines use the xon/xoff protocol, and the traffic is full-duplex
(both machines can send data to each other at the same time).
The deadlock causes both machines to cease transmission - permanently.

The discussion below is dependent on how the tty driver in the kernel
interfaces to the terminal multiplexor (MUX) hardware: the assumption
being that part of the XON/XOFF processing is done in the host, and
part of the XON/XOFF processing is done in the MUX.

The problem described below occurs when the MUX processes XON/XOFF's
received from the network, ie, the host interfaces to a
semi-intelligent MUX. If the MUX is dumb, that is, the host tty driver
has full control of the MUX, then the problem described below *may*
not appear.

Figure 1 below - Overview

Application     host tty driver          tty hardware     asynch serial line
-----------     --------------------     -------------    -----------------
Host-send >---> Send Half tty driver --> MUX Send    ---> to network
                                         tty device
Host-rcv  <---  Rcv Half tty driver  <-- MUX Rcv     <--- from tty

Figure 2 (below) - Data Received Too fast from network

Host-rcv  <---  Rcv Half generates XOFF  <-- MUX Rcv          <-- data too fast
                to send out of MUX           Passes All
                (to net) when rcv queue      Data Up to HOST
                goes above high water
                mark, and generates XON
                to send out of MUX when
                rcv queue goes below
                low water mark

Figure 3 (below) - Data Sent Too fast to network

Host-send >---> Send Half tty driver --> MUX Send         ---> too fast to network
                                         MUX Eats         <--- XOFF
                                         XOFF and does
                                         NOT pass up
                                         to host and now
                                         causes output
                                         to network to
                                         be blocked
                                         .... later       <--- XON
                                                          ---> resume sending

Assumptions:
	- no sophisticated Data Link Control protocol is used.
	- that only the XON/XOFF protocol be used for flow-control.
	- that both machines can send an XON/XOFF (to net), and that
	  both machines can understand an XON/XOFF if received (from net).
	- both machines may decide to send data to each other at the
	  same time (ie full duplex).
	- the model is a sending process that sends data over the link
	  to a receiving process at the other end, and that this is
	  symmetric for both machines.
	- that transmission of the XON/XOFF to the network is done in
	  the *host* receive half of the tty driver, to limit the rate
	  of data received from the network (System 3 IXOFF ioctl()),
	  see figure 2.
	- that the *MUX* stops/starts sending data to the network
	  if XON/XOFF is received from the net (S3 IXON ioctl()).
	  In this case, the MUX discards the XON/XOFF, and never passes
	  the characters up to the host receive half of the tty driver,
	  see figure 3.
	- *KEY POINT* to this argument: that the MUX refuses to send
	  any data character to the network when XOFF was last received
	  from the network. The assumption is that all 256 byte values
	  are "data" to the MUX, *including* XON and XOFF.

Scenario of Problem:
	Both machines begin sending data to each other. Suppose that
	the application receive process on each machine cannot process
	the data received from the other machine fast enough. The
	receive process gets behind, the rcv queue goes above the
	high water mark, which causes the network-receive half of the
	tty driver to transmit an XOFF. Now suppose that both machines
	decide to send XOFF to each other at the same time.

	Later, the receive process on both machines will consume the
	data buffered in the network-receive half of the tty driver,
	which causes the network-receive half of the tty driver to
	send an XON to the MUX (for transmission to net). What if the
	MUX is programmed not to send the "data" character XON, because
	it's "XOFF'ed on output to the network"? Suppose both MUX'es
	are implemented this way; then neither side can transmit an
	XON to the other, so neither side can transmit any data.
Result:
	Both senders and both receiver processes on both machines
	become deadlocked.

Solution to Problem:
	There are two different solutions. The first is to use a dumb
	MUX which can always be controlled by the host CPU. The second
	solution is to keep the "semi-intelligent" MUX, but the host
	should have the ability to send "out-of-band" data to the MUX
	in addition to sending normal data, which can be thought of as
	"high priority" data. The "out-of-band" data is the XON and
	XOFF characters. The MUX sends the "out-of-band" (XON/XOFF)
	data to the network, even when the MUX is blocked on sending
	to the network (ie XOFF last received from net.)

--
Dennis Bednar	Computer Consoles Inc.	Reston VA	703-648-3300
{decvax,ihnp4,harpo,allegra}!seismo!rlgvax!dennis
dennis@rlgvax.UUCP
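
The scenario and the second solution above can be played through with
a toy model. This is only a sketch under the post's stated assumptions,
not any real driver or MUX firmware: the Mux class, the priority_ctl
switch and the crossed_xoff() helper are all hypothetical names made up
for illustration. With priority_ctl off, the MUX treats XON as ordinary
data and holds it while blocked; with it on, XON/XOFF jump the queue.

```python
from dataclasses import dataclass, field

XON, XOFF = 0x11, 0x13  # ASCII DC1 and DC3


@dataclass
class Mux:
    """Toy semi-intelligent MUX that acts on XON/XOFF from the net itself."""
    priority_ctl: bool  # hypothetical switch: send XON/XOFF even while blocked
    blocked: bool = False  # True while XOFF was the last thing heard from the net
    outbox: list = field(default_factory=list)  # bytes queued by the host

    def from_net(self, ch):
        """Take one byte from the line; return it only if it goes up to the host."""
        if ch == XOFF:
            self.blocked = True
            return None  # eaten by the MUX, as in figure 3
        if ch == XON:
            self.blocked = False
            return None
        return ch

    def to_net(self):
        """Return the next byte put on the line, or None if nothing is sent."""
        if self.priority_ctl:
            for i, ch in enumerate(self.outbox):
                if ch in (XON, XOFF):
                    return self.outbox.pop(i)  # control bytes jump the queue
        if self.blocked or not self.outbox:
            return None
        return self.outbox.pop(0)


def crossed_xoff(priority_ctl, steps=10):
    """Run the scenario; return how many data bytes reach either host."""
    a, b = Mux(priority_ctl), Mux(priority_ctl)
    for m in (a, b):
        m.outbox.append(XOFF)
    xa, xb = a.to_net(), b.to_net()  # the two XOFFs cross on the wire
    a.from_net(xb)
    b.from_net(xa)
    for m in (a, b):  # later both receive queues drain
        m.outbox += [XON, ord("d")]  # XON followed by one data byte
    delivered = 0
    for _ in range(steps):
        xa, xb = a.to_net(), b.to_net()
        for ch, dst in ((xa, b), (xb, a)):
            if ch is not None and dst.from_net(ch) is not None:
                delivered += 1
    return delivered


print(crossed_xoff(priority_ctl=False))
print(crossed_xoff(priority_ctl=True))
```

In this model the first call delivers no data at all, since each XON
sits behind a blocked MUX forever, while the second call delivers one
data byte in each direction once the XONs get through.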
jon@altos86.UUCP (Jonathan Stern) (12/16/85)
In article <848@rlgvax.UUCP> dennis@rlgvax.UUCP (Dennis Bednar) writes:
>multiplexor(MUX), flow control, out of band data
>
>
>Introduction - Overview
>
>I recently came across a very interesting situation which can
>cause deadlocks over an asynchronous serial computer link when
>both machines use the xon/xoff protocol, and the traffic is full-duplex
>(both machines can send data to each other at the same time).
>The deadlock causes both machines to cease transmission - permanently.

>Solution to Problem:
>	There are two different solutions. The first is to use a dumb
>	MUX which can always be controlled by the host CPU. The second
>	solution is to keep the "semi-intelligent" MUX, but the host
>	should have the ability to send "out-of-band" data to the MUX
>	in addition to sending normal data, which can be thought of as
>	"high priority" data. The "out-of-band" data is the XON and
>	XOFF characters. The MUX sends the "out-of-band" (XON/XOFF)
>	data to the network, even when the MUX is blocked on sending
>	to the network (ie XOFF last received from net.)

Even this solution *may* not solve the problem. Each machine has presumably
XOFFed the other because it was not ready to receive more data. If one machine
is significantly faster than the other, or one machine is in a state where it
is not unblocking the input data stream, it may actually lose the XON and stay
deadlocked. The solution here is to process the XON/XOFF protocol within the
MUX, but many machines do not do this.

What this underscores is that XON/XOFF is not a practical flow control
method for full duplex, high speed data transfer. I would be very interested
to hear how others have attacked this problem.

-------
Jonathan Stern		Altos Computer Systems
--
ucbvax!dual!vecpyr!altos86!jon
henry@utzoo.UUCP (Henry Spencer) (12/18/85)
> Even this solution *may* not solve the problem. Each machine has presumably
> XOFFed the other because it was not ready to receive more data. If one machine
> is significantly faster than the other or one machine is in a state where it
> is not unblocking the input data stream it may actually lose the XON and stay
> deadlocked...

The solution to this is the "persistent" kind of xon/xoff that some devices,
notably the ones from HP, do. If you sent an XOFF and the other end is still
sending data, send another XOFF after a little while. (A reasonable strategy
is to send one when you're down to [say] 128 characters of buffer, another
at 64, another at 32, etc.) And if you sent an XON and the other end is
silent, send another XON after, say, 15 seconds. And persist.

XON/XOFF is not a very good protocol, but it's the best we have right now,
and "persistent" versions of it are much more robust than simplistic ones.
--
				Henry Spencer @ U of Toronto Zoology
				{allegra,ihnp4,linus,decvax}!utzoo!henry
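
The "persistent" strategy above might be sketched as follows. This is
an illustration only, not HP's or anyone else's actual implementation:
the PersistentFlow class and its method names are hypothetical, and the
128-character first threshold and 15-second retry are just the example
figures from the post. The caller is assumed to supply the free buffer
space and a clock value in seconds.

```python
XON, XOFF = 0x11, 0x13  # ASCII DC1 and DC3


class PersistentFlow:
    """Receiver-side policy: repeat XOFF as space shrinks, repeat XON on silence."""

    def __init__(self, first_threshold=128, xon_retry=15.0):
        self.first_threshold = first_threshold
        self.next_threshold = first_threshold  # free space at which to send XOFF
        self.xon_retry = xon_retry  # seconds of silence before repeating XON
        self.last_xon = None  # time of the last XON still awaiting a reply

    def on_receive(self, free_space):
        """Call per data character received; returns XOFF when one is due."""
        self.last_xon = None  # the peer is talking, so no XON retry is pending
        if self.next_threshold >= 1 and free_space <= self.next_threshold:
            # Halve the threshold until it is below the current free space,
            # so the next XOFF goes out only if the peer keeps sending.
            while self.next_threshold >= 1 and free_space <= self.next_threshold:
                self.next_threshold //= 2
            return XOFF
        return None

    def on_drained(self, now):
        """Call when the buffer has emptied enough to resume; returns XON."""
        self.next_threshold = self.first_threshold
        self.last_xon = now
        return XON

    def on_tick(self, now):
        """Call periodically; returns XON again if the peer has stayed silent."""
        if self.last_xon is not None and now - self.last_xon >= self.xon_retry:
            self.last_xon = now
            return XON
        return None


# Usage: the peer ignores XOFF and keeps sending while free space shrinks.
flow = PersistentFlow()
xoff_points = [free for free in range(200, 0, -1) if flow.on_receive(free) == XOFF]
print(xoff_points)
```

The halving loop also covers the case where free space drops by more
than one character between calls: only one XOFF is sent, and the next
threshold is set below the current free space.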
kbb@faron.UUCP (Kenneth B. Bass) (12/18/85)
In article <176@altos86.UUCP> jon@gateway.UUCP (Jonathan Stern) writes:
>In article <848@rlgvax.UUCP> dennis@rlgvax.UUCP (Dennis Bednar) writes:
>>multiplexor(MUX), flow control, out of band data
>>
>>I recently came across a very interesting situation which can
>>cause deadlocks over an asynchronous serial computer link when
>>both machines use the xon/xoff protocol, and the traffic is full-duplex
>>(both machines can send data to each other at the same time).
>>The deadlock causes both machines to cease transmission - permanently.
>
>                       What this underscores is that XON/XOFF is
>not a practical flow control method for full duplex, high speed, data transfer.
>I would be very interested to hear how others have attacked this problem.
>
>-------
>Jonathan Stern		Altos Computer Systems
>--
>ucbvax!dual!vecpyr!altos86!jon

I have to disagree. XON/XOFF flow control is practical for full duplex,
high speed data transfer. This is assuming, of course, that the XON/XOFF
characters are truly "out-of-band" characters.

It seems that the problem above occurs because this MUX is only
"semi-intelligent". That is, it sounds like the MUX is doing some of the
flow control, but not all of it. The MUX should either:
	1) do full flow control on both sides (to/from network, and
	   to/from host); or
	2) do no flow control at all.

If case 1 is chosen, then the MUX would trap and process XON/XOFF's it
receives from the network, as well as from the host; but it would not
pass these characters through. For the other case, the MUX would be "dumb"
and just pass the XON/XOFF's it receives - either from the host, or from
the network - through to the network or host.

The only major problem I have ever encountered with any type of flow
control is deciding when the receiving end should send the XOFF signal.
In the latter case above, where the MUX is dumb and does not do any flow
control, the XOFF must travel through the local MUX, through the network,
and through the remote MUX to the remote host. The remote host will be
sending data throughout this time, and will not stop until it sees the
XOFF. The problem, then, is how many characters at most the host will
need to be able to buffer AFTER it sends out the XOFF.

"Tell me why"
ken bass
linus!faron!kbb
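
One rough way to put a number on that last question is to multiply the
line rate by the time the remote host keeps sending after the XOFF is
issued. The sketch below assumes 10 bits on the wire per character
(start bit, 8 data bits, stop bit) and takes the delay and any
characters already queued inside the MUXes as inputs the reader must
measure or estimate; the function name and the example figures are
illustrative and do not come from the thread.

```python
def headroom_chars(baud, delay_ms, bits_per_char=10, mux_queued_chars=0):
    """Characters that can still arrive after XOFF is sent.

    baud             -- line speed in bits per second
    delay_ms         -- time until the remote host actually stops, in ms
    bits_per_char    -- bits on the wire per character (assumed 10)
    mux_queued_chars -- characters already buffered in the MUXes (assumed)
    """
    bits_in_flight = baud * delay_ms  # in units of bit-milliseconds
    per_char = bits_per_char * 1000
    in_flight = -(-bits_in_flight // per_char)  # integer ceiling division
    return in_flight + mux_queued_chars


# Example with made-up figures: 9600 baud, remote host stops 50 ms later.
print(headroom_chars(9600, 50))
```

The high water mark then has to sit at least this many characters
below the top of the receive buffer, or data is lost before the XOFF
takes effect.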