rjh@intelob.intel.com (Bob Hathaway) (09/27/89)
I'm implementing TCP urgent data handling for our product line and have discovered some ambiguous semantics. This implementation will support multiple transport layer interfaces including a Unix socket layer and must be able to communicate with other TCP/IP implementations. However, it appears BSD Unix doesn't implement the TCP specifications as I would expect. I'd appreciate hearing from any internet experts or being referred to any existing specifications which can clarify urgent data handling.

TCP RFC 793 and the MIL STD do not offer precise semantics for urgent data handling. Single byte messages are simple but larger messages seem to be poorly defined. For example, Ultrix assumes the first byte of a multi-character message is urgent and 4.3 BSD assumes the last byte. Also, 4.3 breaks large urgent messages into several segments with the URG bit set and the urgent pointer pointing to just past the data *in each segment*. The receiver will believe each segment is an urgent message and each segment will override the last saved urgent byte unless inlining is specified. This implementation seems erroneous.

A more correct interpretation of the TCP specifications for multi-segment urgent messages seems to be setting the URG bit on the first segment only and setting the urgent pointer to one byte past the last byte in the entire multi-segment urgent message. The transport service will consider an entire urgent message as urgent data, allowing the socket layer to extract a single byte from the urgent message if necessary. Future socket implementations will hopefully conform more closely to the TCP specification. With this interpretation, a receiver sets TCB variable RCV.UP = SEG.UP when an URG bit is detected and arriving data up to *RCV.UP* is assumed to be urgent.
For example, this interpretation of the TCP specification will result in:

    URGENT MESSAGE              SEGMENTS
    ==============              ========

                                            URG=1, UP=3000 -+
    m        |-------|          m        |-------|          |
             |       |                   |       |          |
             |       |          m + 1023 |-------|          |
             |       |                                      |
             |       |                      URG=UP=0        |
             ~       ~          m + 1024 |-------|          |
             |       |                   |       |          |
             |       |          m + 2047 |-------|          |
             |       |                                      |
             |       |                      URG=UP=0        |
             |       |          m + 2048 |-------|          |
             |       |                   |       |          |
    m + 2999 |-------|          m + 2999 |-------|          |
                                                   <--------+

PSH will also be set on all outgoing segments.

For reliability, SEG.UP would have to be constraint checked and segments with the URG bit set and a new UP which arrive past the first segment but within the original urgent message would have to be handled, for example when the second or third segments above arrived with URG=1, UP=new value. These updates could be considered errors by sending a RST with logging or could be considered correct by updating RCV.UP. We opt for considering UP updates within messages an error condition and disconnect with a RST because this indicates the peer is out of sync.

This scheme raises some important questions such as compatibility with existing systems and correctness. Are there any existing specifications or internet experts which can clarify this?

Thanks,
Bob Hathaway
rjh@inteloa.intel.com
!tektronix!biin!rjh
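The scheme proposed in this post can be sketched in Python as follows. This is only an illustration under stated assumptions, not code from any TCP implementation: the names `Segment`, `segment_urgent_message`, `StrictReceiver` and `PeerOutOfSync` are hypothetical, the 1024-byte segment size is read off the diagram, segments are assumed to arrive in order, and sequence-number wraparound is ignored.

```python
from dataclasses import dataclass


class PeerOutOfSync(Exception):
    """Stands in for 'send a RST and log' in this sketch."""


@dataclass
class Segment:
    seq: int          # sequence number of the first data byte
    length: int       # number of data bytes carried
    urg: bool = False
    up: int = 0       # urgent pointer as an offset from seq (only if urg)
    psh: bool = True  # the post sets PSH on all outgoing segments


def segment_urgent_message(start_seq, msg_len, mss=1024):
    """Sender side of the proposal: URG only on the first segment, with UP
    one byte past the last byte of the whole multi-segment message."""
    segments = []
    offset = 0
    while offset < msg_len:
        n = min(mss, msg_len - offset)
        first = (offset == 0)
        segments.append(
            Segment(start_seq + offset, n, urg=first, up=msg_len if first else 0)
        )
        offset += n
    return segments


class StrictReceiver:
    """Receiver side of the proposal: remember RCV.UP from the first URG
    segment and treat a different UP inside the message as an error."""

    def __init__(self):
        self.rcv_up = None  # absolute position one past the last urgent byte

    def on_segment(self, seg):
        """Return how many bytes of this segment are urgent."""
        if seg.urg:
            new_up = seg.seq + seg.up
            if self.rcv_up is not None and new_up != self.rcv_up:
                # UP changed while an urgent message is still being received.
                raise PeerOutOfSync(f"UP moved from {self.rcv_up} to {new_up}")
            self.rcv_up = new_up
        if self.rcv_up is None:
            return 0
        end = seg.seq + seg.length
        urgent = max(0, min(end, self.rcv_up) - seg.seq)
        if end >= self.rcv_up:
            self.rcv_up = None  # the whole urgent message has arrived
        return urgent
```

With `m = 0`, `segment_urgent_message(0, 3000)` yields three segments covering bytes 0-1023, 1024-2047 and 2048-2999, with URG and UP=3000 on the first only, which matches the diagram. Raising an exception is just one way to model the "disconnect with a RST" choice; the alternative mentioned in the post would update `rcv_up` instead.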
hedrick@geneva.rutgers.edu (Charles Hedrick) (09/29/89)
The urgent bit is sort of odd. Your questions suggest that you want to be able to tell exactly what data is urgent and what isn't. That's not really what the original intent was, I don't think. There are no "messages" in TCP, nor is there any intent in the spec to define where urgent data begins, only where it ends.

Urgency is intended to indicate a condition that takes effect immediately. Its beginning is not synchronized with the data stream. Thus urgent should be set in the next datagram to be sent, even if it is a retransmission and did not have urgent set last time. (This may be hard to implement of course, and as far as I know is not. But conceptually it would make sense, and it would improve some aspects of telnet and rlogin behavior.) Furthermore, if two sections of "urgency" are adjacent, they may look like one. Both of the uses of urgent data that I know -- telnet and rlogin -- are designed with these concepts in mind.

This means that the concept of "urgent message" is not well defined. Only the concept of "last urgent byte" makes any sense, and with the off by one fiasco, even that is ambiguous. I would suggest that you *not* try to "clarify" urgent, but stick with the 4.3 semantics. Anyone who needs urgent "messages" should arrange to delimit the messages by some sort of marker in the data.
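The suggestion to delimit messages with a marker in the data stream might look roughly like this on the receiving side. This is a minimal sketch, not taken from telnet or rlogin: the function name `skip_to_marker` and the two-byte `MARKER` value are hypothetical, and a real protocol would also need a rule for escaping the marker when it occurs in ordinary data.

```python
MARKER = b"\xff\xf2"  # hypothetical in-band marker chosen for this sketch


def skip_to_marker(buffered, marker=MARKER):
    """After an urgent notification, discard buffered stream data up to and
    including the marker.

    Returns (found, remainder). If the marker has not arrived yet, the
    remainder keeps only the tail that could be the start of a marker split
    across two reads, so the caller can prepend it to the next read.
    """
    idx = buffered.find(marker)
    if idx >= 0:
        return True, buffered[idx + len(marker):]
    keep = len(marker) - 1
    tail = buffered[-keep:] if keep else b""
    return False, tail
```

Here TCP's urgent indication only tells the application that something urgent is pending; where the urgent condition ends in the stream is decided entirely by the marker, which is the division of labour the post argues for.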
guy@guy.uucp (Guy Streeter) (09/29/89)
hedrick@geneva.rutgers.edu (Charles Hedrick) writes:
> ...
>Only the concept of "last urgent byte" makes any sense, and with the
>off by one fiasco, even that is ambiguous. I would suggest that you
>*not* try to "clarify" urgent, but stick with the 4.3 semantics.
> ...

RFC 1011 - Official Internet Protocols, May 1987

    Urgent: Page 17 is wrong. The urgent pointer points to the last octet of urgent data (not to the first octet of non-urgent data).

...

***DRAFT RFC*** TRANSPORT LAYER -- TCP, June 16, 1989

    4.2.2.4 Urgent Pointer: RFC-793 Section 3.1

    The second sentence is in error: the urgent pointer points to the sequence number of the LAST octet (not LAST+1) in a sequence of urgent data.

...

BSD 4.3 sets the urgent pointer to the LAST+1 octet in a sequence of urgent data.

Whenever anyone else violates a protocol, the response in this newsgroup is always "Fix your software!" Should we propagate Berkeley's error in the interest of compatibility, or should we do it right?

Guy Streeter
b11!guy!guy@ingr.com
...uunet!ingr!b11!guy!guy
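The off-by-one disagreement can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names and the `"LAST"` / `"LAST+1"` labels are hypothetical, with `"LAST"` standing for the reading in the RFC excerpts quoted above and `"LAST+1"` for the 4.3 BSD behaviour as described in this post. The urgent pointer is treated as a 16-bit offset from the segment's sequence number, and sequence numbers wrap modulo 2^32.

```python
SEQ_MOD = 2 ** 32
CONVENTIONS = ("LAST", "LAST+1")


def last_urgent_octet(seg_seq, urg_ptr, convention):
    """Sequence number of the last urgent octet, as a receiver using the
    given convention would compute it."""
    if convention not in CONVENTIONS:
        raise ValueError(f"unknown convention: {convention!r}")
    adjust = 0 if convention == "LAST" else -1
    return (seg_seq + urg_ptr + adjust) % SEQ_MOD


def urgent_pointer_for(seg_seq, last_urgent_seq, convention):
    """Urgent pointer a sender using the given convention would place in a
    segment starting at seg_seq."""
    if convention not in CONVENTIONS:
        raise ValueError(f"unknown convention: {convention!r}")
    offset = (last_urgent_seq - seg_seq) % SEQ_MOD
    if convention == "LAST+1":
        offset += 1
    if offset > 0xFFFF:
        raise ValueError("urgent pointer does not fit in 16 bits")
    return offset
```

For example, a `"LAST+1"` sender whose last urgent octet is 1004 in a segment starting at 1000 sends a pointer of 5; a `"LAST"` receiver reads that as octet 1005, one byte too far. That single-byte disagreement is the interoperability problem the post is asking about.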
subbu@hpindda.HP.COM (MCV Subramaniam) (09/30/89)
>multi-character message is urgent and 4.3 BSD assumes the last byte. Also,
>4.3 breaks large urgent messages into several segments with the URG bit
>set and the urgent pointer pointing to just past the data *in each segment*.

4.3 TCP considers only one byte in the message as urgent. Therefore, if you call send() with 2000 bytes, the last byte will be considered urgent, and the SND.UP pointer will be updated to that value. Thereafter, every packet sent will have the URG flag set, but the URP in the packet will be equal to SND.UP. That is, each segment will contain the same value of URP in it.

I believe 4.2 used to set URG flag and URP only in the segments in which the OOB byte is actually transmitted. (Someone correct me if I am wrong).

>The receiver will believe each segment is an urgent message and each segment
>will override the last saved urgent byte unless inlining is specified. This

If, however, you call multiple send()s with MSG_OOB, then each segment sent will contain one urgent byte, and if the user has not received the last OOB byte, it will be overridden with the new one, and RCV.UP will be moved past the new URP received.

The following bugs exist in 4.3 BSD urgent (OOB) data handling:

1. If you send Out of band bytes *too* frequently, i.e. send the next OOB before the first is acknowledged, then 4.3 BSD TCP leads to data corruption (if you don't use OOBINLINE option). [Bug in the sender]

2. Also, if 4.3 BSD TCP transmits (say) 3 segments and the third one had the URG flag set, and the first one got lost, then SND.UP gets messed up when the first segment is retransmitted. This, again, leads to data corruption. [Bug in the sender]

3. If segments containing urgent bytes have to be retransmitted, and get reassembled in the receiver's TCP reassembly queue, data corruption could result. [Bug in the receiver - Reassembly code]

-Subbu
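The sender behaviour described in this post (one send() of 2000 bytes, URG on every segment, all pointing at the same SND.UP) can be modelled as below. This is a sketch of the description only, not of the actual 4.3 BSD source: `segments_with_fixed_snd_up` is a hypothetical name, the 1024-byte segment size is an assumption, SND.UP is taken as one past the last byte (the LAST+1 convention attributed to 4.3 BSD earlier in the thread), and the on-wire URP is computed as an offset from each segment's sequence number, as the TCP header defines it.

```python
def segments_with_fixed_snd_up(start_seq, msg_len, mss=1024):
    """Split one urgent send() into segments that all carry URG and all
    refer to the same absolute urgent position SND.UP."""
    snd_up = start_seq + msg_len  # assumed: one past the last (urgent) byte
    segments = []
    seq = start_seq
    while seq < snd_up:
        n = min(mss, snd_up - seq)
        segments.append({
            "seq": seq,
            "len": n,
            "urg": True,
            "urp": snd_up - seq,  # wire value: offset from this segment's seq
        })
        seq += n
    return segments
```

For a 2000-byte send starting at sequence 0, this gives URP offsets of 2000 and 976. Both name the same absolute position, but in the final segment the pointer falls just past that segment's own data, which may be what the original observation about "each segment" was picking up.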
CERF@A.ISI.EDU (10/01/89)
Bob,

The most sensible implementation of URGENT POINTER is to mark the byte just past the end of the urgent message. If the message is broken into segments, one could continue to set URG=1 and UP="byte number 1 past the last byte of urgent data". Resetting URG and UP is OK, too, so long as the recipient remembers UP until the received in sequence bytes exceed the last byte of urgent data. The implementation which sets UP to just past the data of each segment isn't necessarily broken, but it seems unnecessary to implement in that fashion.

The question of first or last byte of an urgent message caught me by surprise. At the TCP level, the only thing you can specify is where the urgent data ends, not where it starts. The interface between the process wanting to send urgent information and the kernel TCP service needs to have a way for the process to say where the urgent data ENDS, since that is the information that the TCP can convey.

The receiving process will need clues in addition to those provided by TCP to distinguish the urgent from non-urgent information - these semantic and syntactic matters were left to the protocol layer above TCP to deal with.

Vint
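The receiver-side rule described here, remembering UP until the in-sequence data passes the end of the urgent data, can be sketched as follows. The class name `UrgentTracker` is hypothetical, UP is taken as one past the last urgent byte as in this post, only in-order segments are modelled, and sequence-number wraparound is ignored. The `max()` update follows RFC 793's "RCV.UP <- max(RCV.UP,SEG.UP)" step.

```python
class UrgentTracker:
    """Remember where urgent data ends; nothing here says where it starts."""

    def __init__(self, initial_seq=0):
        self.rcv_nxt = initial_seq  # next in-sequence byte expected
        self.rcv_up = initial_seq   # one past the last urgent byte seen so far

    def on_segment(self, seq, length, urg=False, up=0):
        """Process one in-order segment.

        Returns True while urgent data is still outstanding, i.e. while the
        in-sequence bytes received have not yet reached rcv_up.
        """
        if seq != self.rcv_nxt:
            raise ValueError("this sketch only handles in-order segments")
        if urg:
            # The pointer can only move forward.
            self.rcv_up = max(self.rcv_up, seq + up)
        self.rcv_nxt = seq + length
        return self.rcv_nxt < self.rcv_up
```

Under these assumptions, a sender that sets URG only on the first segment and a sender that repeats URG with the same absolute end point on every segment drive the tracker through the same states, which is the sense in which either sender behaviour is acceptable to such a receiver.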