buck@siswat.UUCP (A. Lester Buck) (11/17/89)
I just finished writing a raw Ethernet driver for AIX PS/2. During performance testing, I ran across the limit of 16 iovecs in each readv(2) or writev(2) call. This limits the maximum transfer per system call to about 24KB. Since my usual assumption is that Unix does not have many gratuitous restrictions, and this limit also exists on SunOS and 4.3BSD, could someone please explain the reason for this limit? After all, the sum of the byte counts in the 16 iovecs can be anything up to 2^32-1. Thanks a lot!
--
A. Lester Buck ...!texbell!moray!siswat!buck
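For readers who run into the same limit, one user-space workaround is to split a long iovec list into batches and issue one writev(2) per batch. The following is only a minimal sketch: the function name writev_batched and the constant BATCH_IOV are hypothetical, and the value 16 is taken from the post above rather than queried from the system. The "about 24KB" figure presumably corresponds to 16 Ethernet frames of roughly 1500 bytes each.

```c
#include <assert.h>
#include <sys/types.h>
#include <sys/uio.h>
#include <unistd.h>

/* Assumed per-call limit, taken from the post; not queried from the system. */
#define BATCH_IOV 16

/*
 * Hypothetical helper: write iovcnt vectors using at most BATCH_IOV
 * vectors per writev(2) call. Returns the number of bytes written,
 * or -1 if the very first call fails. A short write ends the loop
 * early and the caller sees the partial count.
 */
ssize_t writev_batched(int fd, const struct iovec *iov, int iovcnt)
{
    ssize_t total = 0;

    assert(iov != NULL && iovcnt >= 0);
    while (iovcnt > 0) {
        int n = iovcnt < BATCH_IOV ? iovcnt : BATCH_IOV;
        ssize_t want = 0;

        /* Bytes this batch should transfer if nothing is cut short. */
        for (int i = 0; i < n; i++)
            want += (ssize_t)iov[i].iov_len;

        ssize_t got = writev(fd, iov, n);
        if (got < 0)
            return total > 0 ? total : -1;
        total += got;
        if (got < want)
            break;          /* short write: leave retry policy to the caller */

        iov += n;
        iovcnt -= n;
    }
    return total;
}
```

Note that batching changes the semantics on a device that treats each call as one bundle: 40 vectors become three calls (16 + 16 + 8), not one. Whether that matters depends on the driver.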
chris@mimsy.umd.edu (Chris Torek) (11/23/89)
In article <473@siswat.UUCP> buck@siswat.UUCP (A. Lester Buck) writes:
>[why is there a limit of 16 iovec vectors]?

Since the array of iovec structures must exist in kernel space (for reasons having to do with cleanliness and security in the rest of the kernel), they are `created' on the kernel stack during a sendmsg() or recvmsg() (or readv() or writev()) call, and the user values are copied into this local array. It has a `reasonable' bounded size.

4.4BSD already uses the kernel malloc for readv() and writev(), and allows 8 `free' iovec structures (on the stack) or up to 1024 `expensive' iovec structures (via malloc+io+free). The same could be done for sendmsg and recvmsg, but I suspect there is a bit less incentive.
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7163)
Domain: chris@cs.umd.edu Path: uunet!mimsy!chris
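The "8 free, up to 1024 expensive" scheme described above can be illustrated in user space. This is a sketch only, not the 4.4BSD source: the function copy_and_sum_iov and the constants SMALL_IOV and MAX_IOV are hypothetical names, and memcpy() stands in for the kernel's copy from user space.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>
#include <sys/uio.h>

#define SMALL_IOV 8     /* vectors that fit in the on-stack array */
#define MAX_IOV   1024  /* upper bound for the heap-allocated path */

/*
 * Hypothetical helper: take a private copy of the caller's iovec
 * array, using a small stack array when it fits and malloc()
 * otherwise, then return the total byte count of the copy.
 * Returns -1 if iovcnt is out of range or allocation fails.
 */
long copy_and_sum_iov(const struct iovec *user_iov, int iovcnt)
{
    struct iovec small[SMALL_IOV];
    struct iovec *iov = small;
    long total = 0;

    assert(user_iov != NULL);
    if (iovcnt < 0 || iovcnt > MAX_IOV)
        return -1;

    if (iovcnt > SMALL_IOV) {
        /* The "expensive" path: allocate, use, free. */
        iov = malloc((size_t)iovcnt * sizeof *iov);
        if (iov == NULL)
            return -1;
    }

    /* Stand-in for copying the vectors in from user space. */
    memcpy(iov, user_iov, (size_t)iovcnt * sizeof *iov);

    for (int i = 0; i < iovcnt; i++)
        total += (long)iov[i].iov_len;

    if (iov != small)
        free(iov);
    return total;
}
```

The point of the split is that the common case (a handful of vectors) pays no allocation cost, while the stack array stays small enough to be safe on a limited kernel stack.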
thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) (11/24/89)
buck@siswat.UUCP (A. Lester Buck) writes:
>I ran across the limit on 16 iovec's in each readv(2) or writev(2) call.

The limit is there because the iovecs are copied into an array on the kernel stack; the further processing uses a pointer to this array and a count (contained in a structure which also has total and residual byte counts and some flags).

>This limits the maximum transfer per system call to about 24KB.

In my opinion you should not use the readv/writev mechanism to provide packet delimiters, even for debugging. Unless you are on a machine where memory is tight, I would suggest that your driver treat the user buffer as an array of (e.g.) 2K subbuffers, one for each packet. The last word in each subbuffer could then be a bytecount.
--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark [uunet!]mcvax!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers. thorinn@diku.dk
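The subbuffer layout suggested above might look like the following sketch. The names subbuf_put and subbuf_len are hypothetical, and the post does not specify the word size or byte order; a 32-bit count in host byte order is assumed here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define SUBBUF_SIZE 2048                               /* one slot per packet */
#define SUBBUF_DATA (SUBBUF_SIZE - sizeof(uint32_t))   /* payload capacity */

/*
 * Hypothetical helper: store one packet in slot number `slot` of
 * `buf` and record its length in the last 32-bit word of the slot.
 * Returns 0 on success, -1 if the packet does not fit.
 */
int subbuf_put(unsigned char *buf, size_t slot, const void *pkt, uint32_t len)
{
    unsigned char *sub;

    assert(buf != NULL && pkt != NULL);
    if (len > SUBBUF_DATA)
        return -1;

    sub = buf + slot * SUBBUF_SIZE;
    memcpy(sub, pkt, len);
    memcpy(sub + SUBBUF_DATA, &len, sizeof len);   /* trailing byte count */
    return 0;
}

/* Hypothetical helper: read back the byte count stored in a slot. */
uint32_t subbuf_len(const unsigned char *buf, size_t slot)
{
    uint32_t len;

    assert(buf != NULL);
    memcpy(&len, buf + slot * SUBBUF_SIZE + SUBBUF_DATA, sizeof len);
    return len;
}
```

With this layout a single read(2) or write(2) on one large buffer can carry many packets, at the cost of up to 2K of memory per packet regardless of its actual size.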
buck@siswat.UUCP (A. Lester Buck) (11/26/89)
In article <4996@freja.diku.dk>, thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) writes:
> The limit is there because the iovecs are copied into an array on the
> kernel stack; the further processing uses a pointer to this array and
> a count (contained in a structure which also has total and residual
> byte counts and some flags).

[same answer from Chris Torek. Thanks for the information.]

I still think this is quite small. Why does it need to be on the kernel stack? Why not use a kernel buffer? A 1K buffer would have a limit of 128. This doesn't sound like rocket science...

> >This limits the maximum transfer per system call to about 24KB.
>
> In my opinion you should not use the readv/writev mechanism to provide
> packet delimiters, even for debugging. Unless you are on a machine
> where memory is tight, I would suggest that your driver treat the user
> buffer as an array of (e.g.) 2K subbuffers, one for each packet. The
> last word in each subbuffer could then be a bytecount.

Reasons, please? Works just fine for me.

The original requirements for the driver included the ability to bundle packets. My driver background is from System V, so I was thinking of adding ioctls so as not to mangle the clean semantics of read(2) and write(2). When readv(2) and writev(2) were pointed out, they were exactly what was needed. For the AIX PS/2 driver, it doesn't really make that much difference, but I am told that the bundled interface is very important for decent performance on AIX 370, which is part of the project I am working on.

Why not use readv/writev if they are there?
--
A. Lester Buck ...!texbell!moray!siswat!buck
thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) (11/26/89)
buck@siswat.UUCP (A. Lester Buck) writes:
>In article <4996@freja.diku.dk>, thorinn@skinfaxe.diku.dk (Lars Henrik Mathiesen) writes:
>> In my opinion you should not use the readv/writev mechanism to provide
>> packet delimiters.
>Reasons, please? Works just fine for me.
>Why not use readv/writev if they are there?

Normally, the 4.3 BSD kernel does its best to treat all the iovecs in a single call to readv/writev as one continuous buffer; that includes the PF_REMOTE mode of pseudo-ttys, for instance. That's why I thought that it would be confusing to use other semantics, and that it should therefore be avoided.

But it turns out that I had overlooked the "physical I/O" character devices, such as raw tapes (and disks, and so on). These devices call the physio routine, which in turn calls the driver's strategy routine once (at least) _per_iovec_. On tapes this gives one tape block per iovec (I think).

So, after all, your usage is actually quite consistent with the system. Sorry for the confusion.
--
Lars Mathiesen, DIKU, U of Copenhagen, Denmark [uunet!]mcvax!diku!thorinn
Institute of Datalogy -- we're scientists, not engineers. thorinn@diku.dk
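The per-iovec behaviour described above can be modelled in user space as a loop that hands each vector to a callback separately, so that every vector becomes its own transfer unit (one tape block, or one packet). This is a simplified sketch and not the 4.3BSD physio code: the names strategy_fn, per_iovec_io, block_stats, and count_block are hypothetical, and the splitting of a large vector into several strategy calls (the "at least" in the post) is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <sys/uio.h>

/* Hypothetical callback type standing in for a driver's strategy routine. */
typedef int (*strategy_fn)(void *base, size_t len, void *ctx);

/* Example context: counts calls and bytes seen by the callback. */
struct block_stats {
    int    calls;
    size_t bytes;
};

/* Example strategy: records one "block" per call and always succeeds. */
int count_block(void *base, size_t len, void *ctx)
{
    struct block_stats *st = ctx;

    (void)base;     /* the data itself is not examined in this sketch */
    st->calls++;
    st->bytes += len;
    return 0;
}

/*
 * Hypothetical helper: invoke `strategy` exactly once per iovec, in
 * order. Stops at the first nonzero return and passes it back;
 * returns 0 if every call succeeded.
 */
int per_iovec_io(const struct iovec *iov, int iovcnt,
                 strategy_fn strategy, void *ctx)
{
    assert(iov != NULL && strategy != NULL && iovcnt >= 0);
    for (int i = 0; i < iovcnt; i++) {
        int err = strategy(iov[i].iov_base, iov[i].iov_len, ctx);
        if (err != 0)
            return err;
    }
    return 0;
}
```

Contrast this with the "one continuous buffer" model, where the vector boundaries carry no meaning and only the total byte count matters.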