freedman@granite.cr.bull.com (Jerome Freedman) (02/22/90)
I have some related questions on sockets.

1) A read (or write) on a socket involves a buffer in which the data to be read/written is contained. This buffer can be adjusted, according to TFM, but what are the limits of the adjustment? What is the default size?

2) Suppose I get more data (TCP socket) than I have buffer for. How can I avoid dropping bytes on the floor when I read?

3) What if I am writing data? (TCP socket - in fact all questions refer to TCP sockets.)

Jerry Freedman, Jr
chris@mimsy.umd.edu (Chris Torek) (02/22/90)
In article <1990Feb21.205708.13533@granite.cr.bull.com> freedman@granite.cr.bull.com (Jerome Freedman) writes:
>1) a read (or write) on a socket involves a buffer in which the data
>to be read/written is contained. This buffer can be adjusted, according
>to TFM, but what are the limits of the adjustment.

This is up to the implementation. Typical limits are 32767 or 65535. The underlying protocol (TCP, AF_UNIX, XNS, Appletalk, etc.) may impose other limitations as well.

>What is the default size?

This is up to the implementation. Typical implementations have secret configuration parameters, such as global variables called `tcp_sendspace', `tcp_recvspace', etc.

>2) Suppose I get more data (TCP socket) then I have buffer for
>how can I avoid dropping bytes on the floor when I read.
>3) What if I am writing data (TCP socket - in fact all questions
>refer to TCP sockets)

TCP is a reliable stream protocol. If you ask to read 10 bytes, and there are 100 bytes, you get 10 of the 100 and the remaining 90 hang around. If you ask to write 1000 bytes and there is room in the outgoing buffer for only 100, your process `hangs' until all the bytes are written 100-at-a-time (or fewer, if the TCP peer does not permit sending a 100 byte segment).

The protocol includes `yucky stuff' that lets the sender and receiver tell each other how much they can handle at any time; any correct implementation will not overrun the space. (An incorrect implementation can, of course, lose data; but then, an incorrect implementation can do *anything*, so why worry about that?)

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
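
For readers who want to see what their own system does, the kernel buffer sizes discussed above can usually be inspected and adjusted through the SO_SNDBUF and SO_RCVBUF socket options. The thread does not name these options; this is a minimal sketch assuming a BSD-style sockets API, and the helper names `get_sockbuf' and `set_sockbuf' are made up for illustration.

```c
#include <sys/types.h>
#include <sys/socket.h>

/* Hypothetical helper (not from the thread): return the current size of
 * one kernel socket buffer. `which' is SO_SNDBUF or SO_RCVBUF.
 * Returns the size in bytes, or -1 on error (errno is set). */
int get_sockbuf(int fd, int which)
{
    int size = 0;
    socklen_t len = sizeof size;

    if (getsockopt(fd, SOL_SOCKET, which, &size, &len) < 0)
        return -1;
    return size;
}

/* Hypothetical helper: ask for a new buffer size, then read back what
 * the system actually granted. Implementations are free to round,
 * clamp, or otherwise adjust the request, so the result may differ
 * from `size'. Returns the granted size, or -1 on error. */
int set_sockbuf(int fd, int which, int size)
{
    if (setsockopt(fd, SOL_SOCKET, which, &size, sizeof size) < 0)
        return -1;
    return get_sockbuf(fd, which);
}
```

Calling get_sockbuf(fd, SO_RCVBUF) on a freshly created socket shows the default for that system; the limits are best found by experiment or from the local documentation, since they are implementation-defined.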
JAZBO@brownvm.brown.edu (James H. Coombs) (02/22/90)
Jerome Freedman writes:
>1) a read (or write) on a socket involves a buffer in which the data
>to be read/written is contained. This buffer can be adjusted, according
>to TFM, but what are the limits of the adjustment. What is the default size?

How can the buffer be adjusted? You supply a buffer in the read/write call, but that is not the same buffer that the transport layer uses. For read/write, you can supply any size buffer you please, just as with any file descriptor read/write.

>2) Suppose I get more data (TCP socket) then I have buffer for
>how can I avoid dropping bytes on the floor when I read.

You can read in a loop. Just as you read a large file in chunks, so you can read from a socket in chunks. You don't have to read all of the pending data at once.

The thing that you have to watch out for is reading before all of the expected data has arrived. If, for example, you know that the writer has sent 1 KB, or is supposed to have sent 1 KB, then you have to stay in the read loop until you get the entire "packet." I handle this by sending a length prefix. I also put the burden on the client to supply an adequately sized buffer to the read routine. This latter constraint, however, requires a clear protocol for the application.

The primary point is that data stays "in the socket" until you read it (or something extraordinary occurs).

>3) What if I am writing data (TCP socket - in fact all questions
>refer to TCP sockets)

write() will send what it can and block until it can send more. You don't have a problem unless you can't afford to block.

Some people prefer to use nonblocking reads and writes and use select() to handle their own more precise blocking. When the select() times out, they decide that there is a probable communications failure. If you don't have complete control over both ends of the communication, this more robust approach is probably appropriate. If you are just getting started with sockets, however, you can postpone the complications.
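
The length-prefix approach described above might look roughly like the following. This is only a sketch under stated assumptions: the 4-byte network-order length header and the names read_full, write_full, send_msg, and recv_msg are invented for illustration and are not the actual routines from my library. Signal handling is left out to keep it short.

```c
#include <stdint.h>
#include <unistd.h>
#include <arpa/inet.h>
#include <sys/types.h>

/* Stay in the read loop until n bytes have arrived.
 * Returns n, a short count if the peer closed early, or -1 on error. */
ssize_t read_full(int fd, void *buf, size_t n)
{
    char *p = buf;
    size_t got = 0;

    while (got < n) {
        ssize_t r = read(fd, p + got, n - got);
        if (r < 0)
            return -1;          /* real error (EINTR not handled here) */
        if (r == 0)
            break;              /* end of stream before n bytes arrived */
        got += (size_t)r;
    }
    return (ssize_t)got;
}

/* Keep writing until all n bytes are accepted. Returns n, or -1. */
ssize_t write_full(int fd, const void *buf, size_t n)
{
    const char *p = buf;
    size_t put = 0;

    while (put < n) {
        ssize_t r = write(fd, p + put, n - put);
        if (r <= 0)
            return -1;
        put += (size_t)r;
    }
    return (ssize_t)put;
}

/* Assumed wire format: 4-byte length in network byte order, then data. */
int send_msg(int fd, const void *data, uint32_t len)
{
    uint32_t hdr = htonl(len);

    if (write_full(fd, &hdr, sizeof hdr) != (ssize_t)sizeof hdr)
        return -1;
    if (write_full(fd, data, len) != (ssize_t)len)
        return -1;
    return 0;
}

/* The caller supplies the buffer and its capacity. Returns the payload
 * length, or -1 on error, early EOF, or a message larger than `cap'. */
long recv_msg(int fd, void *buf, uint32_t cap)
{
    uint32_t hdr, len;

    if (read_full(fd, &hdr, sizeof hdr) != (ssize_t)sizeof hdr)
        return -1;
    len = ntohl(hdr);
    if (len > cap)
        return -1;              /* caller's buffer is too small */
    if (read_full(fd, buf, len) != (ssize_t)len)
        return -1;
    return (long)len;
}
```

Note that when recv_msg rejects an oversized message, the payload is still sitting in the socket, so a real protocol has to decide whether to drain it or drop the connection.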
Another thing to watch out for is the interruption of a read/write (or send/recv) by a signal. The call will return -1, with errno set to EINTR. There has not been a failure in the communications routine, and you should loop on the read/write. In a group development environment, you just have to accept that signals will go off without your knowing or caring what they are. Or, you may later decide that you need a signal for some reason, and then you may find that it breaks your communications library.

Communications library, now there is a good point. If you write your own library to provide a high-level interface to sockets, then you will be a lot happier in the long run. For example, when someone started using setitimer(), I found that I had to check for EINTR. Because I had isolated all socket access, it did not take me long to upgrade for all applications. If I had made direct calls to sockets in the applications, then I would have been off on a long chase. Something similar occurred when I switched from read/write to send/recv so that I could use out-of-band data.

--Jim

Dr. James H. Coombs
Senior Software Engineer, Research
Institute for Research in Information and Scholarship (IRIS)
Brown University, Box 1946
Providence, RI 02912
jazbo@brownvm.bitnet
Acknowledge-To: <JAZBO@BROWNVM>
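
As an illustration of isolating the EINTR check in one place, a pair of wrappers along the following lines would do. The names read_retry and write_retry are hypothetical, and this is a sketch rather than the code from the library mentioned above.

```c
#include <errno.h>
#include <unistd.h>
#include <sys/types.h>

/* Like read(), but if a signal interrupts the call before any data is
 * transferred (-1 with errno == EINTR), simply try again. */
ssize_t read_retry(int fd, void *buf, size_t n)
{
    ssize_t r;

    do {
        r = read(fd, buf, n);
    } while (r < 0 && errno == EINTR);
    return r;
}

/* Same idea for write(). A short count is still possible and is
 * returned to the caller unchanged. */
ssize_t write_retry(int fd, const void *buf, size_t n)
{
    ssize_t r;

    do {
        r = write(fd, buf, n);
    } while (r < 0 && errno == EINTR);
    return r;
}
```

If every application goes through wrappers like these instead of calling read/write directly, a later change (a new signal, or a switch to send/recv) touches only these few lines.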