ables@lot.ACA.MCC.COM (King Ables) (03/06/90)
I've been using read(2) to read data from a socket and am having problems
when the buffers get large. I'm hoping someone has run into this before
and knows what I am doing wrong.

I want to read an arbitrarily large block of data and not have to worry
about message boundaries (hence read/write rather than send/recv). I use
write(2) to send it to the connected socket, and get back a return code
indicating all characters were "written" to the socket. I've been using
the following in my readmsg function:

    int hd;     /* host descriptor (the socket) */
    char *buf;  /* ptr to data buffer */
    int n;      /* max number of characters I want to read */
    int count;  /* number of characters actually read */

    count = read(hd, buf, n);

This works fine on a socket connected between two processes on my Sun up
to a buffer size of 4096. Anything past that doesn't seem to make it. I
get (for example) 8000 back from write(hd, buf, n) where n=8000, but when
I read(hd, buf, 8000) I get 4096, and a subsequent read(2) returns -1. On
sockets connected between two different machines, the successful buffer
size varies between 512 and 2048, but the behavior is the same.

I've fired up an ethernet monitor, and most of the data is appearing in
packets, but not all of it even seems to get sent. However, more shows up
in packets (gets sent) than ends up being read, anyway, so I'm guessing
there's a timing problem. I tried setting SO_LINGER with setsockopt(2) on
both ends, but that didn't seem to have any effect.

I decided to try doing reads the way I've seen in some source code, one
character at a time. When I changed my code to do (approximately) this:

    while (read(hd, &c, 1) == 1)
        buf[count++] = c;

then it gets all 8000 characters! For several reasons, I'd really like to
do it the first way. Namely, I would rather not have to look for some
trailing key character to know when to stop reading (or have to specify a
length at the front end).
Can anybody tell me what I'm missing about buffered reads that's causing
me grief? Or is the answer just to do single-character reads?

Thanks.

King Ables          Micro Electronics and Computer Technology Corp.
ables@mcc.com       3500 W. Balcones Center Drive
+1 512 338 3749     Austin, TX 78759
libes@cme.nist.gov (Don Libes) (03/06/90)
In article <637@lot.ACA.MCC.COM> ables@lot.ACA.MCC.COM (King Ables) writes:
>I want to read an arbitrarily large block of data and not have to worry
>about message boundaries (hence read/write rather than send/recv).
>
>This works fine on a socket connected between two processes on my Sun up
>to a buffer size of 4096. Anything after that doesn't seem to make it.
>I get (for example) 8000 back from write(hd, buf, n) where n=8000, but
>when I read(hd, buf, 8000) I get 4096 and a successive read(2) returns -1.

Well, what's the value of errno?

Note that expecting to be able to read arbitrarily large packets is
unrealistic, due to limited kernel buffering and protocol design. I know
you don't want to encapsulate your I/O, but that is the only solution. I
have some code that does it, which you can anonymously ftp from
durer.cme.nist.gov as pub/sized_io.shar.Z.

I also wrote a paper describing some of these problems. It is
"Packet-Oriented Communications Using a Stream Protocol --or-- Making
TCP/IP on Berkeley UNIX a Little More Pleasant to Use", NISTIR 90-4232,
January 1990, and is available from:

    Mary Lou Fahey
    NIST
    Bldg 220, Rm A-127
    Gaithersburg, MD 20899
    fahey@cme.nist.gov

Don Libes
libes@cme.nist.gov
...!uunet!cme-durer!libes
jnixon@andrew.ATL.GE.COM (John F Nixon) (03/07/90)
ables@lot.ACA.MCC.COM (King Ables) writes:
> I've been using read(2) to read data from a socket and am having
> problems when the buffers get large... I want to read an arbitrarily large
> block of data and not have to worry about message boundaries (hence
> read/write rather than send/recv).

Sorry, but if you are using AF_INET SOCK_STREAM sockets, and from your
problem description it sounds like you are, read/write will not preserve
record boundaries. From "Introductory 4.3BSD IPC":

    Stream communication implies several things. ... as in pipes, no
    record boundaries are kept. Reading from a stream may result in
    reading the data sent from one or several calls to write(), or only
    part of the data from a single call, if there was not enough room
    for the entire message, or if not all the data from a large message
    has been transferred.

So if you want reliability (a stream), you have to manage record
boundaries yourself; if you want record boundaries kept for you, you use
SOCK_DGRAM and give up reliability. I have not used any other types of
sockets... yet.

>I decided to try doing reads like I've seen in some source code, one character
>at a time. When I changed my code to do (approximately) this:
>    while (read(hd, &c, 1) == 1) buf[count++] = c;
>Then it gets all 8000 characters!

This tells me that you are seeing problems due to the lack of record
boundaries (or, stating it another way, not all of your write arrives in
one read). You don't have to do it a character at a time! You do have to
include a record size to keep yourself straight. Once you know the record
size, you can ask for all of the data, accept what you get, and ask for
the rest. Repeat until everything is there:

    while (recordsize > sizehere) {
        bytes = read(soc, buf + sizehere, recordsize - sizehere);
        /* error handling: make sure bytes is positive */
        sizehere += bytes;
    }

The above fragment is more or less it. You can do all the error handling
in one place by making the read call a call to your own routine, which
then calls read.
You can worry about blocking, but at least you will get all of your data.
--
----
jnixon@atl.ge.com        ...steinmetz!atl.decnet!jnixon
aperez@cvbnet.UUCP (Arturo Perez x6739) (03/08/90)
From article <637@lot.ACA.MCC.COM>, by ables@lot.ACA.MCC.COM (King Ables):
> I've been using read(2) to read data from a socket and am having
> problems when the buffers get large. I'm hoping someone has run
> into this before and knows what I am doing wrong.
>
> I want to read an arbitrarily large block of data and not have to worry
> about message boundaries (hence read/write rather than send/recv).

This is one of my pet peeves about BSD sockets. There is no way to read
an arbitrary amount of data from a socket. You have to be aware of the
kernel-level buffering NO MATTER WHAT LEVEL you're writing your code at,
i.e. applications, system, etc.

Why can't the kernel block your process until you get all the data you're
asking for (unless, of course, FIONBIO or O_NDELAY is set)? If I'm
willing to wait, I'm willing to wait. And if the connection goes down
during the transfer, I can live with that, too; just return an error.

Why was such a silly decision made?

Arturo Perez
ComputerVision, a division of Prime
aperez@cvbnet.prime.com
Too much information, like a bullet through my brain -- The Police
ka@cs.washington.edu (Kenneth Almquist) (03/11/90)
aperez@cvbnet.UUCP (Arturo Perez x6739) writes:
> From article <637@lot.ACA.MCC.COM>, by ables@lot.ACA.MCC.COM (King Ables):
>> I've been using read(2) to read data from a socket and am having
>> problems when the buffers get large. I'm hoping someone has run
>> into this before and knows what I am doing wrong.
>>
>> I want to read an arbitrarily large block of data and not have to worry
>> about message boundaries (hence read/write rather than send/recv).
>
> This is one of my pet peeves about BSD sockets. There is no way
> to read an arbitrary amount of data from a socket. You have to be
> aware of the kernel level buffering NO MATTER WHAT LEVEL you're writing
> your code at; i.e. apps, system, etc.
>
> Why can't the kernel block your process until you get all the data you're
> asking for (unless, of course, FIONBIO or O_NDELAY is set)? If I'm
> willing to wait, I'm willing to wait. And if the connection goes down during
> the transfer, I can live with that, too, just return an error.
>
> Why was such a silly decision made?

I presume that the idea of having the read system call return a short
count originally appeared in UNIX to deal with terminal input. When a
program issues a read system call on a terminal, the read call returns as
soon as a line of input is available, even if the number of characters in
the line is smaller than the size of the buffer passed to read. If UNIX
did not work this way, most interactive programs would have to issue a
separate read system call for each character, and stop issuing system
calls when a newline character was read. This would be inefficient.

Similarly, when a program issues a read system call on a pipe, the read
call returns as soon as data is available, even if the number of
characters available is smaller than the size of the buffer passed to
read. If UNIX did not work this way, bc (which opens a pipe to dc) would
not work unless dc inefficiently issued a separate read system call for
every character.
Berkeley sockets intentionally copied the pipe semantics, so that pipes
could be implemented as a special case of sockets. And the semantics of
Berkeley sockets can be justified independently of this: if a read on a
socket worked the way that King suggests, then rlogin would have to issue
a separate read system call for every character received from the remote
host, which would be very inefficient.

If you have to read a specific number of characters under UNIX, there are
two ways to do it. One is to place a loop around the read system call.
The other is to use the fread routine and let the standard I/O library
take care of the buffering.

                                Kenneth Almquist
antony@lbl-csam.arpa (Antony A. Courtney) (03/12/90)
In article <11057@june.cs.washington.edu> ka@cs.washington.edu (Kenneth Almquist) writes:
>aperez@cvbnet.UUCP (Arturo Perez x6739) writes:
>> From article <637@lot.ACA.MCC.COM>, by ables@lot.ACA.MCC.COM (King Ables):
>>> I've been using read(2) to read data from a socket and am having
>>> problems when the buffers get large. I'm hoping someone has run
>>> into this before and knows what I am doing wrong.
>>>
>>> I want to read an arbitrarily large block of data and not have to worry
>>> about message boundaries (hence read/write rather than send/recv).
>>
>> This is one of my pet peeves about BSD sockets. There is no way
>> to read an arbitrary amount of data from a socket. You have to be
>> [...]
>> Why was such a silly decision made?
>
>I presume that the idea of having the read system call return a short
>count originally appeared in UNIX to deal with terminal input.
>[...]
>If UNIX did not work this way, [ lots of stuff would break for lots of
>reasons ]...

Hmmmm. The thought comes to mind: why not just add an ioctl() that lets a
user-level application mark the socket for CTRAN I/O (Complete
TRANsaction)? When so marked, a read() on the socket would only return
once the number of characters asked for has been read into the buffer.

        antony
--
*******************************************************************************
Antony A. Courtney                              antony@lbl.gov
Advanced Development Group                      ucbvax!lbl-csam.arpa!antony
Lawrence Berkeley Laboratory                    AACourtney@lbl.gov
mike@turing.cs.unm.edu (Michael I. Bushnell) (03/17/90)
In article <85@cvbnetPrime.COM> aperez@cvbnet.UUCP (Arturo Perez x6739) writes:

>This is one of my pet peeves about BSD sockets. There is no way
>to read an arbitrary amount of data from a socket. You have to be
>aware of the kernel level buffering NO MATTER WHAT LEVEL you're writing
>your code at; i.e. apps, system, etc.

>Why can't the kernel block your process until you get all the data you're
>asking for (unless, of course, FIONBIO or O_NDELAY is set)? If I'm
>willing to wait, I'm willing to wait. And if the connection goes down during
>the transfer, I can live with that, too, just return an error.

>Why was such a silly decision made?

Nothing new. The same is true of terminal I/O. All you need is:

    int myread(des, buf, buflen)
    int des, buflen;
    char *buf;
    {
        char *bp = buf;
        int nread = 0, nbytes;

        while (nread < buflen) {
            nbytes = read(des, bp, buflen - nread);
            if (nbytes == -1)
                return -1;      /* or whatever else you want */
            if (nbytes == 0)
                break;          /* EOF: return what we have */
            bp += nbytes;
            nread += nbytes;
        }
        return nread;
    }

This will solve your problem quite nicely. Any questions? Now you *don't*
need to know about the low-level buffering.
--
Michael I. Bushnell           \     This above all; to thine own self be true
LIBERTE, EGALITE, FRATERNITE   \    And it must follow, as the night the day,
mike@unmvax.cs.unm.edu         /\   Thou canst not be false to any man.
CARPE DIEM                    /  \  Farewell: my blessing season this in thee!
ables@lot.ACA.MCC.COM (King Ables) (03/17/90)
From article <MIKE.90Mar16113811@turing.cs.unm.edu>, by mike@turing.cs.unm.edu (Michael I. Bushnell):
> In article <85@cvbnetPrime.COM> aperez@cvbnet.UUCP
> (Arturo Perez x6739) writes:
>
>>This is one of my pet peeves about BSD sockets.
>
> Nothing new. The same is true of terminal I/O.

But we're not talking about terminal I/O. And anyway, "it's always been
that way" is no justification for the way something works (or doesn't, as
the case may be).

> All you need is:
> [some code to take care of knowing how much data you got with
> the read() so you can keep read()ing until you get it all.]

Sure, it works. But Arturo's whole point is that you shouldn't *need* to
do that! Everyone who has ever had this problem has had to re-invent the
same wheel. Okay, it's not a complicated wheel, granted. And I understand
why, when terminal I/O gets involved and you're trying to write generic
code to work on all kinds of I/O streams, you want to make
lowest-common-denominator assumptions about capability. But making
everyone in the world write their own version of read() to act like the
real thing for their sockets isn't the answer, either.

Don Libes' little library of socket I/O calls (which he referenced in a
message here when this thread started) does a nice job of "fixing" the
problem... too bad a few calls like this weren't in some BSD library in
the first place.

-king
aperez@cvbnet.UUCP (Arturo Perez x6739) (03/20/90)
It seems that I have generated a little bit of heat (but also quite a bit
of light) with my statement that the buffering on BSD sockets is visible
even at the application level. I have even been accused of telling "a
lie, excuse me, a misleading statement" in a public forum. So now I feel
I must clarify what I meant.

You may or may not recall that I claimed that the buffering on a BSD
socket is visible to applications and sometimes even to users. For
example, here's an excerpt from a Sun 3/60 man page for tar(1):

    B       Force tar to perform multiple reads (if necessary) so as
            to read exactly enough bytes to fill a block. This option
            exists so that tar can work across the Ethernet, since
            pipes and sockets return partial blocks even when more
            data is coming.

That's my best piece of evidence. Now, you and I may know that it's not
strictly necessary to have this option, but there it is.

Arturo Perez
ComputerVision, a division of Prime
aperez@cvbnet.prime.com
Too much information, like a bullet through my brain -- The Police
smb@ulysses.att.com (Steven M. Bellovin) (03/21/90)
In article <129@cvbnetPrime.COM>, aperez@cvbnet.UUCP (Arturo Perez x6739) writes:
> You may or may not recall that I claimed that the buffering on a BSD socket
> is visible to applications and sometimes even users. For example, here's
> an excerpt from a Sun 3/60 man page for tar(1):
>
>     B       Force tar to perform multiple reads (if necessary)
>             so as to read exactly enough bytes to fill a block. This
>             option exists so that tar can work across the Ethernet,
>             since pipes and sockets return partial blocks even
>             when more data is coming.
>
> That's my best piece of evidence. Now, you and I may know that it's not
> strictly necessary to have this option, but there it is.

Without addressing your original claim, Berkeley had no choice on this
one. The relevant factor is the TCP spec -- TCP has no concept of
records, and does not preserve record boundaries. Thus, if BSD was to
implement TCP -- which was the purpose of the DARPA grant that funded
much of its development -- and if it was to support tar across a TCP
connection -- and the tar format antedates 4.2BSD by several years --
they had to do something at the application level. Any other possible
implementation meeting those two constraints would have similar
properties.
amoss@batata.huji.ac.il (amos shapira) (03/24/90)
ka@cs.washington.edu (Kenneth Almquist) writes:
>I presume that the idea of having the read system call return a short
>count originally appeared in UNIX to deal with terminal input.

This isn't the only possible reason; what about reading to the end of a
file? When a process tries to read more bytes than are available in the
file, the same thing happens (i.e. the number of bytes returned is less
than the number requested).

[ stuff deleted ]

>Berkeley sockets intentionally copied the pipe semantics, so that
>pipes could be implemented as a special case of sockets.

I doubt this. The Berkeley sockets were devised solely for the purpose of
letting processes talk to the network devices (that's why they were
financed by DARPA), and they have nothing to do with the way data is
transferred through them. One derivation of the socket mechanism is the
socketpair, which was also copied into the implementation of the pipe()
system call. The read() and write() system calls were changed to support
sockets mainly to let "naive" processes use sockets without knowing about
them; this also seems to me a good example of sticking to the UNIX
discipline that processes shouldn't care too much where their
input/output comes from or goes to.

For more info read "The Design and Implementation of the 4.3BSD UNIX(tm)
Operating System" by Leffler, McKusick, Karels and Quarterman, chapters
10 to 12; note that the description of the socket mechanism is completely
separate from the other related layers.

>If you have to read a specific number of characters under UNIX, there
>are two ways to do it. One is to place a loop around the read system
>call. The other is to use the fread routine and let the standard I/O
>library take care of the buffering.
>                               Kenneth Almquist

There is an ioctl() function called FIONREAD which will return the number
of immediately available bytes to read (if you read the book mentioned
above, this is the so_rcv.sb_cc field).
Note that this is implemented in the socket layer and not in the
protocols layer (sys/sys_socket.c line 75 in 4.3BSD).

Cheers,

- Amos Shapira
amoss@batata.bitnet
amoss@batata.huji.ac.il