[comp.unix.questions] Commonly Asked Socket Questions

penneyj@servio.UUCP (D. Jason Penney) (04/13/90)

In article <28080@ut-emx.UUCP> ycy@walt.cc.utexas.edu (Joseph Yip) writes:
>I am writing some socket programs. Everything works fine except when I
>want my socket to be NON-BLOCKING.
>
>I use,
>
>	/* to read a connected socket */
>	read(socket_id,buf,BUFS);
>	
>Does anyone know how to read from a socket using non-blocking mode? 
Yes.
Oh, you want to know how?  This varies slightly between systems.  In most 
unixes, you turn on the FNDELAY flag  by reading the socket's flags with 
fcntl (F_GETFL), OR'ing in the flag, and then setting the flags (F_SETFL).
Some systems require you to use the FIONBIO command to ioctl().
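
A minimal sketch of both approaches, assuming a POSIX-ish system where
O_NONBLOCK is the modern spelling of FNDELAY.  The helper names
set_nonblocking() and set_nonblocking_ioctl() are made up for illustration:

```c
#include <fcntl.h>
#include <sys/ioctl.h>

/* Hypothetical helper: put descriptor fd into non-blocking mode with fcntl.
 * Returns 0 on success, -1 on failure (errno is set by fcntl). */
int set_nonblocking(int fd)
{
    int flags = fcntl(fd, F_GETFL, 0);      /* read the current flags */
    if (flags < 0)
        return -1;
    /* O_NONBLOCK is the POSIX name; older BSD headers call it FNDELAY. */
    if (fcntl(fd, F_SETFL, flags | O_NONBLOCK) < 0)
        return -1;
    return 0;
}

/* Hypothetical alternative for systems that want the ioctl form. */
int set_nonblocking_ioctl(int fd)
{
    int on = 1;                             /* non-zero turns the mode on */
    return ioctl(fd, FIONBIO, &on);
}
```

Reading the flags first matters: passing only the new flag to F_SETFL would
clear any other status flags already set on the descriptor.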

You didn't ask it, but the other common operation of interest is to enable
asynchronous notification.  To do this, you use fcntl to turn on the FASYNC
flag.  (Other systems require that you use FIOASYNC with ioctl.)  Then you 
must ALSO set the process group owner to be your own pid by using ioctl with the
SIOCSPGRP command.  BTW, SIGIO is pretty brain-dead; if you have multiple 
sockets, you'll still have to call select() to see which one(s) of them caused
the signal.

>recv() system call has a flag that you can PEEK at the socket. 
Not particularly germane to the current discussion, I think.

>Do I have to go through fcntl() call to make it non-blocking?
Yes, if not ioctl().  BSD systems tend to favor the fcntl call.  You may
discover certain SYS-V systems that prefer the ioctl approach.

>Another question is that if I am transferring 512*512 image file using
>socket,
>
>	/* send 512*512 image file */
>	write(socket_id,buf,512*512);
>
Warning: if your socket is non-blocking, it is ALWAYS an error not to check the
value returned from write() -- it could even be 0!

In general, if your socket is non-blocking you need to effectively enclose your
write() in a loop that keeps trying:

numSent = 0;
while (numSent < 512*512) {
  count = write(socket_id, &buf[numSent], 512*512 - numSent);
  if (count < 0) {
    if (errno == EWOULDBLOCK)
      continue; /* outgoing buffer is full, not an error -- try again */
    /* oh wow, socket is broken -- quick, examine errno */
    return;
    }
  numSent += count;
  }

Note that you can use select(2) (or poll(2)) to determine if a socket is ready
for writing, so this doesn't have to be a busy loop (you can insert a select()
call to make you go to sleep).

>	......
>	------------------------------------
>
>	/* at the receiving end */
>	read(socket_id,buf,512*512);
>
>Will the receiving end receive all 512*512 bytes before returning?
Emphatically not!  Supposing that ethernet is actually being used, the
underlying packet size is something like 2K bytes.  Since sockets are STREAMS
(record boundaries are not preserved), the receiving end may receive all of the
bytes at once, or in several smaller pieces, depending on some very
timing-dependent circumstances.  For instance, collision and retry on the ethernet...

N.B. -- It is ALWAYS a mistake, whether or not a socket is blocking, to throw
away the value returned from read(2).  The third argument to read() merely
indicates the MOST bytes that may be returned.  The return value tells you how
many bytes were actually read, or if an error occurred.

BTW, the ONLY time that read() returns 0 is when the stream has effectively
ended -- EOF.

>What is the optimal size to send using socket?
My personal preference is to attempt to send as many bytes as possible in each
call to write().  Think about it:  each call to write(2) has to initiate a
PHYSICAL transfer.  There is no buffering or flushing at this layer.  The fewer
physical transfers initiated, the less work for everyone.  Incidentally,
ethernet has a 2K limitation, but if the socket is a loopback, you may be able
to send and receive more bytes at a pop.

It is probably not relevant, but we have noticed on most of our Unix hosts that
most streams seem to buffer 8K bytes of incoming and another 8K bytes of
outgoing data.  This is definitely not universal; PC-NFS I think has a buffer
size of 2K, for instance.  In any event, we've used this observation to make
sure that the higher layers of our system prepare at least 8K or an entire
message's worth, whichever is less, before calling write().
-- 
D. Jason Penney           Ph: (503) 629-8383
Beaverton, OR 97006       uucp: ...uunet!servio!penneyj (penneyj@slc.com)
"Talking about music is like dancing about architecture." -- Steve Martin

chris@mimsy.umd.edu (Chris Torek) (04/14/90)

In article <412@servio.UUCP> penneyj@servio.UUCP (D. Jason Penney) writes:
[lots of correct stuff...]

>>Will the receiving end [of a stream socket] receive all 512*512 bytes
>>before returning?

>Emphatically not!  Supposing that ethernet is actually being used, the
>underlying packet size is something like 2K bytes.

(well, 1.5K)

>Since sockets are STREAMS (record boundaries are not preserved),

Not all sockets are streams.  Some are `datagram sockets'; these do
preserve record boundaries.  Such sockets generally will refuse to
transmit 512*512 bytes as a single record, however.
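
A small sketch of the boundary-preserving behaviour, assuming a local
(AF_UNIX) datagram socket pair so that nothing is lost or reordered;
datagram_boundaries() is a made-up name:

```c
#include <sys/socket.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical demo: send two records of 3 and 5 bytes over a local
 * datagram socket pair and store what each recv() call returns.
 * Returns 0 on success, -1 if the sockets could not be set up or used. */
int datagram_boundaries(ssize_t sizes[2])
{
    int sv[2];
    char buf[64];
    int ok;

    if (socketpair(AF_UNIX, SOCK_DGRAM, 0, sv) < 0)
        return -1;

    ok = send(sv[0], "abc", 3, 0) == 3 && send(sv[0], "defgh", 5, 0) == 5;
    if (ok) {
        /* Each recv() hands back exactly one record, never a mix. */
        sizes[0] = recv(sv[1], buf, sizeof buf, 0);
        sizes[1] = recv(sv[1], buf, sizeof buf, 0);
    }
    close(sv[0]);
    close(sv[1]);
    return ok ? 0 : -1;
}
```

With SOCK_STREAM in place of SOCK_DGRAM, the first recv() would be free to
return bytes from both sends at once.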

In addition, many implementations provide only two services: reliable
(flow controlled, error checked, sequenced, clean) `streams' and
unreliable (uncontrolled, passes errors on, loses, duplicates,
reorders, and generally mucks up data at times) `datagrams'.  There
*are* such things as reliable datagram sockets; they are merely
relatively rare in current implementations.

[more correct-stuff deleted]

>>What is the optimal size to send using socket?

>My personal preference is to attempt to send as many bytes as possible in each
>call to write().

This is generally the best approach for stream sockets, since the
implementation can break up large messages as appropriate for the media
in use (media, not medium, as there may be more than one).  Datagram
sockets are an exception, again.

In any case, there is no single optimal size.  On BSD VAXen over local
Ethernets, 1K (and multiples thereof) works particularly well, but on
the same machines over slow serial lines smaller packets may work out
better.  This is why it is generally best to let the implementation
break up large writes.  (But see below.)

>It is probably not relevant, but we have noticed on most of our Unix
>hosts that most streams seem to buffer 8K bytes of incoming and
>another 8K bytes of outgoing data.

On BSD boxes and systems with TCPs derived therefrom, TCP stream sockets
have `tcp_sendspace' and `tcp_recvspace' of kernel buffering for outgoing
and incoming data respectively.  Newer systems allow socket options to
change the buffer sizes.  These default to 4K or 8K bytes, typically.
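
On systems that have them, the socket options in question are usually
SO_SNDBUF and SO_RCVBUF.  A sketch for the send side, with made-up helper
names; the kernel may round or otherwise adjust the size you request:

```c
#include <sys/socket.h>

/* Hypothetical helper: return the current send buffer size of socket fd
 * in bytes, or -1 on error. */
int get_send_buffer(int fd)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(fd, SOL_SOCKET, SO_SNDBUF, &size, &len) < 0)
        return -1;
    return size;
}

/* Hypothetical helper: request a send buffer of the given size.
 * Returns 0 on success, -1 on error.  The value later reported by
 * get_send_buffer() need not equal the value requested here. */
int set_send_buffer(int fd, int bytes)
{
    return setsockopt(fd, SOL_SOCKET, SO_SNDBUF, &bytes, sizeof bytes);
}
```

The receive side is the same with SO_RCVBUF in place of SO_SNDBUF.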

Since the outgoing data must be retained until the remote host has
acknowledged correct reception (in case the data are lost or mangled on
the way and must be re-sent), and since the buffer has a limited size,
writes of more than tcp_sendspace will cause the writer to `hang'
(block or wait) by default until the first data have been acknowledged
and the rest have been buffered up.  Thus, the `most efficient' size
for a write() call is `tcp_sendspace - however_much_is_still_waiting',
at least in one sense.

This is where non-blocking write()s can be useful: after setting FIONBIO
mode, a write() of `too many' bytes will place as many in the outgoing
buffer as will fit, and will return the count of bytes placed in that buffer.
This could be anything between 0 and the number of bytes given to write().
Refer to the parent article of this one to see how to manage non-blocking
writes.
-- 
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain:	chris@cs.umd.edu	Path:	uunet!mimsy!chris