discolo@ucsbcsl.UUCP (Anthony V. Discolo) (08/24/85)
In the work on the 4.2BSD kernel that I have been doing recently, I have come across a problem that I do not understand. The scenario is this: In a kernel process that I create at boot time (called somedaemon below), I allocate a buffer with geteblk with a size of about 10k, and use this buffer in two ways. First, I use it to receive a udp packet with soreceive(), and then I bcopy out the first 100 bytes or so, and then I copy out the remaining bytes in the message from the buffer. What I get is a spurious protection fault inside uiomove called from soreceive, or at the second bcopy statement. The fault happens after the system has been running a while under some pretty heavy testing.

This is very sketchy, so here is the relevant code. Comments between (* *) describe code that I took out for brevity.

%%% BEGIN CODE %%%
#define MAXRPCSIZ (8096 + 2048)

somedaemon()
{
    int error, bpsize;
    register caddr_t bpaddr;
    struct iovec aiov;
    struct uio auio;
    register struct mbuf *m;
    struct mbuf *nam, *from, *rights;
    struct buf *bp;
    struct in_addr *myaddr;
    register struct rpcframe *rpf;
    struct sockaddr_in rpcdaddr;

    (* creation and binding of socket *)

    bp = geteblk(MAXRPCSIZ);
    if (bp == NULL) {
        printf("somedaemon: geteblk failed\n");
        (* handle error *)
    }
    bpaddr = bp->b_un.b_addr;
    bpsize = bp->b_bufsize;
    for (;;) {
        /*
         * Read in message.
         */
        aiov.iov_base = bpaddr;
        aiov.iov_len = bpsize;
        auio.uio_iov = &aiov;
        auio.uio_iovcnt = 1;
        auio.uio_segflg = 1;
        auio.uio_offset = 0;
        auio.uio_resid = aiov.iov_len;
        if (bp->b_bufsize != bpsize)
            printf("somedaemon: bufsize changed, now %d\n",
                bp->b_bufsize);
        /*
         * WILL SOMETIMES PANIC HERE
         */
        error = soreceive(rpc_so, &from, &auio, 0, &rights);

        (* check error, m_free from and rights *)

        m = m_get(M_WAIT, MT_RPCFRAME);
        if (m == NULL) {
            printf("somedaemon: m_get failed\n");
            continue;
        }
        rpf = mtod(m, struct rpcframe *);
        bcopy(bpaddr, (caddr_t)rpf, sizeof (struct rpcframe));
        rpf->rpc_data = NULL;

        (* various data integrity checks *)

        if (rpf->rpc_datasiz <= 0) {
            printf("somedaemon: datasiz <= 0\n");
            (void) m_free(m);
            continue;
        } else if (rpf->rpc_datasiz > bpsize - sizeof (struct rpcframe)) {
            printf("somedaemon: datasiz too large: was %d\n",
                rpf->rpc_datasiz);
            (void) m_free(m);
            continue;
        }
        /*
         * Copy in the data.
         */
        rpf->rpc_data = geteblk(rpf->rpc_datasiz);
        if (rpf->rpc_data->b_bufsize < rpf->rpc_datasiz) {
            printf("somedaemon: b_bufsize too small: was %d, should be %d\n",
                rpf->rpc_data->b_bufsize, rpf->rpc_datasiz);
            brelse(rpf->rpc_data);
            (void) m_free(m);
            continue;
        }
        /*
         * WILL SOMETIMES PANIC HERE
         */
        bcopy(bpaddr + sizeof (struct rpcframe), rpcdtoc(rpf),
            rpf->rpc_datasiz);

        (* more code - rpf->rpc_data will be brelse()d elsewhere *)
    }
}
%%% END CODE %%%

I never see the output of any of the checks on the size of the message, or the size of the buffer, so I am assuming that it passes them.
Here are two stack traces, one for each place it panics:

    sbr 8002ec64 slr 34dc
    p0br 805d5600 p0lr 0
    p1br 7fdd5a00 p1lr 1ffff8
    *(scb-4)$c
    _boot() from 80026fa6
    _boot(0,0) from 80026fa6
    _panic(8003fd2b) from 80011a6e
    _trap() from 80027618
    _Xtransflt(8065bf8c,68,0,7fffff90) from 80001151
    _soreceive(8065cb0c,7fffff88,7fffff90,0,7fffff84) from 8001d5fb
    _somedaemon() from 8000f25a
    _main(26d) from 80008cc9

    sbr 8002ec64 slr 34dc
    p0br 805d5600 p0lr 0
    p1br 7fdd5a00 p1lr 1ffff8
    *(scb-4)$c
    _boot() from 80026f76
    _boot(0,0) from 80026f76
    _panic(8003fcef) from 80011a3e
    _trap() from 800275e8
    _Xtransflt() from 80001151
    _main(26c) from 80008cb1

In both cases, the pc (from the trap message) points to a movc3 instruction that includes the large buffer that somedaemon uses for receiving the udp message (called bp in the code above).

I have figured out a way to fix this: allocate a character array in place of the buffer. This works fine.

If any of you have any ideas why this doesn't work, some suggestions on where to look, or something to try in adb which might give me a clue, *please* send me mail (or post it if you think it is important).

Thanks in advance.
--
uucp: {ucbvax,cepu}!ucsbcsl!discolo
arpa: ucsbcsl!discolo@BERKELEY
csnet: discolo@ucsb
USMail: U.C. Santa Barbara
        Department of Computer Science
        Santa Barbara, CA 93106
GTE: (805) 961-4178
chris@umcp-cs.UUCP (Chris Torek) (08/28/85)
>In the work on the 4.2BSD kernel that I have been doing recently, I
>have come across a problem that I do not understand. ... I allocate
>a buffer with geteblk with a size of about 10k,

Say no more! That is the problem. You cannot allocate a buffer larger than MAXBSIZE without incurring "mysterious problems", since each buffer has a virtual memory space of MAXBSIZE bytes (note that for this reason MAXBSIZE must be a multiple of CLBYTES).

Had you installed my mass driver---or had Berkeley put a firewall in allocbuf in the first place---you would have found the problem much earlier. Don't be embarrassed, though; I did the same thing the first time in the mass driver, thus the following fix.

For those of you who don't want to install the whole thing, here's just the changes to sys/vax/ufs_machdep.c. Your line numbers may vary:

*** /tmp/,RCSt1004349 Wed Aug 28 12:10:44 1985
--- /tmp/,RCSt2004349 Wed Aug 28 12:10:45 1985
***************
*** 29,32
      sizealloc = roundup(size, CLBYTES);

      /*
       * Buffer size does not change
--- 28,34 -----
      sizealloc = roundup(size, CLBYTES);

+     if (sizealloc > MAXBSIZE)
+         panic("allocbuf");
+
      /*
       * Buffer size does not change
***************
*** 68,72
              &tp->b_un.b_addr[tp->b_bufsize], take);
          tp->b_bufsize += take;
!         bp->b_bufsize = bp->b_bufsize - take;
          if (bp->b_bcount > bp->b_bufsize)
              bp->b_bcount = bp->b_bufsize;
--- 70,74 -----
              &tp->b_un.b_addr[tp->b_bufsize], take);
          tp->b_bufsize += take;
!         bp->b_bufsize -= take;
          if (bp->b_bcount > bp->b_bufsize)
              bp->b_bcount = bp->b_bufsize;
--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 4251)
UUCP: seismo!umcp-cs!chris
CSNet: chris@umcp-cs
ARPA: chris@maryland
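
[Editorial sketch] The firewall in the patch above rounds the request up to a multiple of CLBYTES and refuses anything larger than MAXBSIZE. The user-space model below reproduces only that arithmetic. It is not kernel code: the names demo_roundup, demo_sizealloc and demo_fits_in_buffer are hypothetical, DEMO_MAXBSIZE is set to the 8192 that the thread says most systems use, and the DEMO_CLBYTES value of 1024 is an assumption that may not match a given machine.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed values for illustration only; check your own param.h. */
#define DEMO_MAXBSIZE 8192u
#define DEMO_CLBYTES  1024u   /* hypothetical cluster size */

/* Round x up to the next multiple of y. */
size_t demo_roundup(size_t x, size_t y)
{
    assert(y > 0);
    return ((x + (y - 1)) / y) * y;
}

/* Cluster-rounded size that would be requested for a buffer. */
size_t demo_sizealloc(size_t size)
{
    return demo_roundup(size, DEMO_CLBYTES);
}

/*
 * Returns 1 if the rounded request stays within one buffer's
 * DEMO_MAXBSIZE bytes, 0 if the patched allocbuf would panic.
 */
int demo_fits_in_buffer(size_t size)
{
    return demo_sizealloc(size) <= DEMO_MAXBSIZE;
}
```

Under these assumed values, the MAXRPCSIZ request from the original post (8096 + 2048 bytes) rounds up past DEMO_MAXBSIZE, so demo_fits_in_buffer() reports that it does not fit, while a request of exactly 8192 bytes does.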
thomson@uthub.UUCP (Brian Thomson) (08/29/85)
The file /sys/h/param.h defines the parameter MAXBSIZE, with the comment

    MAXBSIZE primarily determines the size of buffers in the buffer pool.

Your system, like most, probably has MAXBSIZE set to 8192. Although geteblk(n) doesn't check to see if you've exceeded that limit, it is an error to do so.

This happens because the system allocates MAXBSIZE bytes of system virtual space per buffer at startup, then slaps in as many real pages per buffer as you need when you need them. By asking for a 10k buffer, you slop over into the virtual space reserved for the next buffer, and those extra pages may be freed when that next buffer shrinks. The symptom would be a protection trap panic accessing an address more than 8k from the start of the buffer (bp->b_un.b_addr).
--
Brian Thomson, CSRI Univ. of Toronto
{linus,ihnp4,uw-beaver,floyd,utzoo}!utcsrgv!uthub!thomson
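
[Editorial sketch] The layout described above can be modeled as a row of fixed-size virtual slots, one per buffer, each DEMO_MAXBSIZE bytes wide. The sketch below is a user-space illustration under that assumption, not the kernel's data structure: demo_spill_bytes and demo_owner_slot are hypothetical names, and 8192 is the MAXBSIZE value the post calls typical.

```c
#include <assert.h>
#include <stddef.h>

/* Assumed slot width; the post says most systems use 8192. */
#define DEMO_MAXBSIZE 8192u

/* Bytes of a size-byte buffer that land outside its own slot. */
size_t demo_spill_bytes(size_t size)
{
    return size > DEMO_MAXBSIZE ? size - DEMO_MAXBSIZE : 0;
}

/*
 * Index of the slot that really owns the byte at `offset` from the
 * start of slot `slot`, in a pool of `nslots` contiguous slots.
 */
size_t demo_owner_slot(size_t slot, size_t offset, size_t nslots)
{
    size_t owner = slot + offset / DEMO_MAXBSIZE;

    assert(slot < nslots);
    assert(owner < nslots);
    return owner;
}
```

In this model a 10240-byte buffer spills 2048 bytes into its neighbor, and any offset of 8192 or more from the start of slot 0 belongs to slot 1. That matches the symptom in the post: the faulting accesses are the ones more than 8k past bp->b_un.b_addr, whose pages are managed on behalf of the next buffer.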