shannon@sun.uucp (Bill Shannon) (07/28/87)
Index: sys/kern_exec.c 4.3BSD [FIX] Description: vinifod (in vm_subr.c) is called with a pointer to pte's. vinifod calls bmap, which can sleep. While the process is sleeping in bmap, it can be swapped out. When it is swapped back in, the pte's can be allocated at a different kernel virtual address. vinifod will then use the old pte address and scribble on random kernel data. Repeat-By: Lotsa luck. While debugging new hardware we had just the right combination of events to make this occur repeatably. I wouldn't expect it to occur very often in general. Fix: The quick fix is to prevent the process from being swapped during the time the pte's are being initialized. (Sorry I don't have better diff's or something.) In kern_exec.c, change line 453 to u.u_procp->p_flag |= pagi | SKEEP; After line 469 (after the call to vinifod) add: u.u_procp->p_flag &= ~SKEEP;
goshen@ccicpg.UUCP (Shmuel Goshen) (07/28/87)
In article <24281@sun.uucp> shannon@sun.uucp (Bill Shannon) writes: >Description: > vinifod (in vm_subr.c) is called with a pointer to pte's. vinifod > calls bmap, which can sleep. While the process is sleeping in bmap, > it can be swapped out. When it is swapped back in, the pte's can > be allocated at a different kernel virtual address. vinifod will > then use the old pte address and scribble on random kernel data. >Fix: > In kern_exec.c, change line 453 to > > u.u_procp->p_flag |= pagi | SKEEP; > > After line 469 (after the call to vinifod) add: > > u.u_procp->p_flag &= ~SKEEP; The same fix should be made in vm_pt.c around line 120. Set SKEEP before the call to vinifod and reset it after the call. The modified text will look like: if (xp->x_flag & XLOAD) { p->p_flag |= SKEEP; vinifod((struct fpte *)tptopte(p, 0), PG_FTEXT, xp->x_iptr, (daddr_t)1, xp->x_size); p->p_flag &= ~SKEEP; -- Shmuel Goshen (714) 951-8053 Computer Consoles Inc. (714) 458-7282 Irvine, CA. {allegra!hplabs!felix,seismo!rlgvax}!ccicpg!goshen
goshen@ccicpg.UUCP (Shmuel Goshen) (07/28/87)
Ooops, Please disregard my previous posting (additional change to vm_pt.c). That fix is not needed with the fix posted by Bill Shannon. (It was needed only for the way WE, at CCI fixed the same problem in kern_exec.c. -- Shmuel Goshen (714) 951-8053 Computer Consoles Inc. (714) 458-7282 Irvine, CA. {allegra!hplabs!felix,seismo!rlgvax}!ccicpg!goshen
mojo@sun.uucp (Joseph Moran) (07/29/87)
In article <1497@ccicpg.UUCP> goshen@ccicpg.UUCP (Shmuel Goshen) writes: >In article <24281@sun.uucp> shannon@sun.uucp (Bill Shannon) writes: >>Description: >> vinifod (in vm_subr.c) is called with a pointer to pte's. vinifod >> calls bmap, which can sleep. While the process is sleeping in bmap, >> it can be swapped out. When it is swapped back in, the pte's can >> be allocated at a different kernel virtual address. vinifod will >> then use the old pte address and scribble on random kernel data. > >>Fix: >> In kern_exec.c, change line 453 to >> >> u.u_procp->p_flag |= pagi | SKEEP; >> >> After line 469 (after the call to vinifod) add: >> >> u.u_procp->p_flag &= ~SKEEP; > > >The same fix should be made in vm_pt.c around line 120. >Set SKEEP before the call to vinifod and reset it after the >call. The modified text will look like: > > if (xp->x_flag & XLOAD) { > p->p_flag |= SKEEP; > vinifod((struct fpte *)tptopte(p, 0), PG_FTEXT, xp->x_iptr, > (daddr_t)1, xp->x_size); > p->p_flag &= ~SKEEP; > Be careful, this stuff can be very tricky. I believe that the original fix suggested by shannon was correct and the 2nd fix is well intended but might be incorrect depending on other changes made to the system. It was not obvious from the original message, but in the first fix the SKEEP flag is on for the process being exec'ed all during the calls to vgetvm(), xalloc(), and vinifod() from getxfile(). It is my belief that the only time vinitpt() will end up calling vinifod() is when it is called as a result of getxfile() calling xalloc() calling xlink(). If my supposition is correct, then the change to getxfile() for setting the SKEEP flag will prevent the swap operation from occurring for both initialization of the data pte's (vinifod() being called directly from getxfile()) and for the text pte's (vinifod() being called from vinitpt() as a result of the xalloc() call). And as a matter of fact, using the second "fix" listed above will break the first fix since it is assuming the that SKEEP flag is turned off when vinitpt() is was called. Thus if you believe that the vinitpt() code might be called with the SKEEP flag off once the first fix is applied the correct change to vinitpt() would be something along the line of: if (xp->x_flag & XLOAD) { int fixflag; if ((p->p_flag & SKEEP) == 0) { printf("vinitpt: SKEEP not set"); p->p_flag |= SKEEP; fixflag = 1; } else fixflag = 0; vinifod((struct fpte *)tptopte(p, 0), PG_FTEXT, xp->x_iptr, (daddr_t)1, xp->x_size); if (fixflag) p->p_flag &= ~SKEEP; } My belief is that if the original suggested fix is made, the debugging printf should never occur. If it does, then my statement that says the the original fix should be sufficient would be incorrect, and then the code suggest above (minus the printf) could be used. So far in the testing we have done we have not seen vinifod() being called with the SKEEP flag off when using just the original proposed fix to getxfile(). Another way to attack this problem is to have vinifod() itself set and clear the SKEEP flag similar to how it's done above so that it doesn't assume anything about the process flags on entry. But then this assumes that there is no path from the place where the addresses passed in to vinifod() were calculated to the place where vinifod() turns on the SKEEP flag were the system could sleep and the process might get swapped out. It turns out that this is currently valid, as the only 2 calls to vinifod() use a tptopte() or a dptopte() macro inline with the vinifod() call, but this might change in some implementation of the BSD vm system. BTW, the failure mode we saw for this problem was that a user process would once and awhile dump core an a *very* heavily loaded system during some stress testing. Looking at the core file, all the data in the core file which was mapped to a place in the executable file that was beyond the cross-over to indirect blocks in the file (12 blocks) was all zeroes. We eventually tracked it down to the process getting swapped trying to read the indirect block in vinifod(). This caused the pte's for the part of the data segment that was mapped by the indirect block to end up being left marked as PG_FZERO (zero fill on demand) which was set up by vgetvm(). We never did see a kernel problem that resulted from vinifod() filling in pte's at the wrong address in usrpt space.
chris@mimsy.UUCP (Chris Torek) (07/29/87)
>>The same fix should be made in vm_pt.c around line 120. I had wondered about this too. In article <24372@sun.uucp> mojo@sun.uucp (Joseph Moran) writes: -It was not obvious from the original message, but in the first fix the -SKEEP flag is on for the process being exec'ed all during the calls to -vgetvm(), xalloc(), and vinifod() from getxfile(). It is my belief -that the only time vinitpt() will end up calling vinifod() is when it -is called as a result of getxfile() calling xalloc() calling xlink(). This seems to be true. Note that xalloc() can twiddle SKEEP itself, but only if not paging from an inode, in which case XPAGI is not set and hence xlink will not call vinitpt() anyway. In other words, the original change may unnecessarily turn off SKEEP, but so what? -Another way to attack this problem is to have vinifod() itself set and -clear the SKEEP flag similar to how it's done above so that it doesn't -assume anything about the process flags on entry. Or have vinifod() find the ptes itself. It could simply redo the ?ptopte() after each call to bmap(). But is it safe to swap the process before all its ptes are set? -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690) Domain: chris@mimsy.umd.edu Path: seismo!mimsy!chris