[comp.unix.programmer] inline OOB data with TCP sockets: why is it inserting extra byte?

subbarao@phoenix.Princeton.EDU (Kartik Subbarao) (05/29/91)

In article <1991May18.021148.13730@athena.mit.edu> mlevin@jade.tufts.edu writes:
[lotsa code deleted]

>Now, the problem is this: it sort of works, but when the OOB data
>comes in, the sequence gets disrupted: it seems to insert an extra
>zero byte in the middle of it! Here's the output (of the reader):

[output deleted]

>As you can see, it somehow inserted a \0 between the 3 and the 4 in
>the string 1234 which I sent as a single OOB message. Where did it

It's not inserting a 0 anywhere. The read only returns 3, but since you 
ignore the return value of the read and print 4 things, the 4th one shows
up as a null because you bzero'd the array.
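A minimal sketch of honoring the return value, assuming the goal is simply to print what arrived. The helper name read_string is hypothetical and is not from the original program; it terminates the buffer at the count read() actually returned, so printing it cannot show bytes that were never received.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read at most cap-1 bytes from fd and NUL-terminate
 * the buffer at the count read() actually returned.
 * Returns the read() result (-1 on error, 0 on end of file).
 */
ssize_t read_string(int fd, char *buf, size_t cap)
{
    ssize_t n;

    assert(buf != NULL && cap > 0);
    n = read(fd, buf, cap - 1);
    buf[n > 0 ? n : 0] = '\0';   /* empty string on error or EOF */
    return n;
}
```

With a helper like this, a read that returns 3 yields the string "123" rather than "123" followed by a zero byte left over from the bzero().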

Remember, with sockets, read()s are not guaranteed to return the same
number of bytes that you request. I tried your program with a read that
checked to see how many bytes it got, but still had no luck fixing the
program. I then decided to get really basic and do this:

		for (t = 0; t < 4; t++) {
			read(i, s+t, 1);
			ioctl(i, SIOCATMARK, &o);
			if (o) printf("OOB: %c\n", s[t]);
		}
		if (!o) puts(s);
	}

i.e., read every single character and see whether it's associated with either
Out of Band or a normal message. This is the output I got from running
this:

abcd
....
OOB:3
1234
abcd
....

So it would appear that the SIOCATMARK ioctl() only returned true when the
'3' was read. This consistently happens, which is kinda weird. I was under
the impression that SIOCATMARK would tell you whether the NEXT thing you
are going to read from the socket is OOB or not, but apparently it's the
other way around; you read something, then SIOCATMARK tells you whether it
was OOB or not. I'm not sure. What I don't see is (although I know read()s
aren't guaranteed to return the exact number of bytes you request) why
consistently the read is returning 3, and why the 4 somehow did not get
flagged as an OOB event. For that matter, why aren't the 1 and 2 being
flagged either? Either it's quite strange, or I've really lost it.

I was thinking of approaching this problem from the different direction of
SIGURG or something? Would that work? And what's the deal here?


			-Kartik

--
internet% whoami

subbarao@phoenix.Princeton.EDU -| Internet
kartik@silvertone.Princeton.EDU (NeXT mail)  
SUBBARAO@PUCC.BITNET			          - Bitnet

torek@elf.ee.lbl.gov (Chris Torek) (05/29/91)

[NB: I have deleted a `Distribution: usa' header line.]

In article <1991May18.021148.13730@athena.mit.edu>
mlevin@jade.tufts.edu writes:
>Can someone help me with Out-of-band data with TCP sockets?

The very first thing you need to know is that TCP *does not have* out
of band data.  TCP has `urgent data', which is a different thing entirely.
The BSD socket abstraction provides only out of band data.  Thus,
there is a mismatch; something must give.

> send(he,"1234",4,MSG_OOB);


>As you can see, it somehow inserted a \0 between the 3 and the 4 in
>the string 1234 which I sent as a single OOB message. ...

In article <az8.BZxCGw6pM@idunno.Princeton.EDU> subbarao@phoenix.Princeton.EDU
(Kartik Subbarao) writes:
>It's not inserting a 0 anywhere. The read only returns 3, but since you 
>ignore the return value of the read and print 4 things, the 4th one shows
>up as a null because you bzero'd the array.

In addition, the various send()s might return a short count.  *Always*
put a loop around read and write calls, unless a short count really
means `stop' (e.g., when using select to effect non-blocking I/O).
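The usual shape of such a loop, as a sketch only: read_full is a hypothetical name, and retrying on EINTR is an assumption about what the caller wants. It keeps calling read() until the requested count has arrived, end of file is seen, or a real error occurs.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: read exactly len bytes unless EOF or an error
 * intervenes.  Returns the number of bytes read (less than len only at
 * EOF), or -1 on error.
 */
ssize_t read_full(int fd, void *buf, size_t len)
{
    char *p = buf;
    size_t got = 0;

    assert(buf != NULL);
    while (got < len) {
        ssize_t n = read(fd, p + got, len - got);
        if (n == 0)
            break;                 /* end of file: return the short count */
        if (n < 0) {
            if (errno == EINTR)
                continue;          /* interrupted by a signal: retry */
            return -1;
        }
        got += (size_t)n;
    }
    return (ssize_t)got;
}
```

On a socket with an out of band mark pending, a loop like this will read straight through the mark, so a program that cares where the mark is should test SIOCATMARK between the individual read() calls instead.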

>... I then decided to get really basic and do this:
>
>	for (t = 0; t < 4; t++) {
>		read(i, s+t, 1);
>		ioctl(i, SIOCATMARK, &o);
>		if (o) printf("OOB: %c\n", s[t]);
>	}
>
>i.e., read every single character and see whether it's associated with either
>Out of Band or a normal message. This is the output I got from running
>this: ... OOB:3 ...  So it would appear that the SIOCATMARK ioctl()
>only returned true when the '3' was read. This consistently happens,
>which is kinda weird.

Actually, it is not weird at all:

>I was under the impression that SIOCATMARK would tell you whether the
>NEXT thing you are going to read from the socket is OOB or not,

It does.  The `4' is `the' out of band data, or rather the `urgent data'.

Here is what happens:

The BSD socket code provides for one (1) byte of out of band data.
TCP provides an `urgent pointer'.  As it says in tcp_usrreq.c:

		/*
		 * According to RFC961 (Assigned Protocols),
		 * the urgent pointer points to the last octet
		 * of urgent data.  We continue, however,
		 * to consider it to indicate the first octet
		 * of data past the urgent section.
		 * Otherwise, snd_up should be one lower.
		 */

and in tcp_input.c:

		 * According to RFC961 (Assigned Protocols),
		 * the urgent pointer points to the last octet
		 * of urgent data.  We continue, however,
		 * to consider it to indicate the first octet
		 * of data past the urgent section as the original
		 * spec states (in one of two places).
		 */

What happens, then, is that the send with MSG_OOB puts all four bytes
in the output queue and sets the `urgent pointer' to point just past
the `4'.  The TCP input code sees the corresponding urgent pointer and
says, `aha, the last byte of the stuff I just got must be where the
socket out-of-band mark goes'.
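To make the consequence concrete, here is a toy model of how a MSG_OOB send is divided under the interpretation just described. It is not kernel code; the struct and function names are made up for illustration. Of the bytes handed to one send(), only the last is treated as out of band, and the rest arrive as ordinary data.

```c
#include <assert.h>
#include <stddef.h>

/* Illustrative only: how len bytes sent with MSG_OOB are classified. */
struct oob_split {
    size_t normal_len;   /* leading bytes delivered as ordinary data */
    char   oob_byte;     /* the single byte treated as out of band */
};

struct oob_split split_oob_send(const char *msg, size_t len)
{
    struct oob_split r;

    assert(msg != NULL && len > 0);
    r.normal_len = len - 1;        /* everything before the last byte */
    r.oob_byte = msg[len - 1];     /* the byte just before the urgent pointer */
    return r;
}
```

For the send of "1234" in the original program, this gives three ordinary bytes, "123", and the out of band byte '4'.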

>What I don't see is (although I know read()s aren't guaranteed to return
>the exact number of bytes you request) why consistently the read is
>returning 3,

In the TCP code, the (single) out of band data byte that is present
is either stored in the (single) out of band byte holder---this is the
usual case, when SO_OOBINLINE is not set---or simply left in place.
In addition, the code does the following:

	so->so_oobmark = so->so_rcv.sb_cc + (tp->rcv_up - tp->rcv_nxt) - 1;
	if (so->so_oobmark == 0)
		so->so_state |= SS_RCVATMARK;
	sohasoutofband(so);

This sets the `out of band data offset' (so_oobmark) to the distance
between the `next byte the user will get with read/recv' and the (single)
out of band byte, whether or not that byte is removed from the data
stream (depending on SO_OOBINLINE).  If that offset is zero, it sets
the `at mark' flag.  This flag is what is retrieved by SIOCATMARK.
The flag is set whenever so_oobmark decreases to zero, and cleared
initially on every read/recv.
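A worked instance of that assignment, as a sketch under two assumptions: the receive buffer is empty when the segment carrying "1234" is processed, and the urgent pointer puts rcv_up four bytes past rcv_nxt. The function name oob_mark and the sequence numbers used with it are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/*
 * Model of the quoted assignment.  sb_cc is the count of unread bytes
 * already in the receive buffer; rcv_up and rcv_nxt are TCP sequence
 * numbers, so their difference is taken in 32-bit modular arithmetic.
 */
uint32_t oob_mark(uint32_t sb_cc, uint32_t rcv_up, uint32_t rcv_nxt)
{
    assert((int32_t)(rcv_up - rcv_nxt) > 0);   /* urgent pointer lies ahead */
    return sb_cc + (rcv_up - rcv_nxt) - 1;
}
```

With sb_cc = 0, rcv_nxt = 1000 and rcv_up = 1004, the mark comes out as 0 + 4 - 1 = 3: three bytes ('1', '2', '3') lie between the reader and the out of band byte '4'. If four unread bytes such as "abcd" were already queued, the mark would be 7 instead.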

Since the `4' is the out of band data byte, a read() that plowed on
through and consumed it would zip on past the out of band mark.  It
would then be impossible to figure out where the out of band data was.
Thus, the socket reading code carefully stops whenever it reaches
the out of band mark, so that SIOCATMARK can work.
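The stopping rule can be imitated in user space. What follows is a simplified model, not the kernel's soreceive(); it assumes SO_OOBINLINE is set, so the out of band byte stays in the stream, and it uses oobmark == 0 to mean "no mark pending". All names are invented for the illustration.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

struct sock_model {
    const char *data;    /* unread bytes in the receive buffer */
    size_t cc;           /* number of unread bytes */
    size_t oobmark;      /* bytes before the mark, 0 if none pending */
    int atmark;          /* what SIOCATMARK would report */
};

/* Read up to want bytes, but never cross the mark in a single call. */
size_t model_read(struct sock_model *so, char *buf, size_t want)
{
    size_t n;

    assert(so != NULL && buf != NULL);
    so->atmark = 0;                    /* cleared at the start of every read */
    n = want < so->cc ? want : so->cc;
    if (so->oobmark != 0 && n > so->oobmark)
        n = so->oobmark;               /* stop short at the mark */
    memcpy(buf, so->data, n);
    so->data += n;
    so->cc -= n;
    if (so->oobmark != 0) {
        so->oobmark -= n;
        if (so->oobmark == 0)
            so->atmark = 1;            /* next byte is the out of band byte */
    }
    return n;
}
```

Starting from the data "1234" with the mark at 3, a request for 4 bytes returns 3 and leaves atmark set; the following read returns the '4' and clears the flag. Reading one byte at a time sets the flag after the third read, which is the `OOB:3' line in the quoted output.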

>I was thinking of approaching this problem from the different direction of
>SIGURG or something? Would that work? And what's the deal here?

sohasoutofband() will deliver a SIGURG when there is somewhere to send
it.  At that point, you can set a flag or do SIOCATMARKs and read() calls
until the out of band data is reached.  If subsequent out of band (urgent)
TCP data arrive before you get there, you will only get the last one.
-- 
In-Real-Life: Chris Torek, Lawrence Berkeley Lab CSE/EE (+1 415 486 5427)
Berkeley, CA		Domain:	torek@ee.lbl.gov