astro@princeto.UUCP (06/06/83)
<<FIX THIS BUG IF YOU DO LARGE TRANSFERS ON RAW DMA DEVICES!!!>> There is a paging routine bug in 4.1 BSD that affects the locking of memory for dma on Raw I/O devices. This bug can cause a process to hang at priority -24 (PSWP+1). The problem occurs when an attempt is made to lock a page that is in the process of being swapped out. The call in mlock in pagin() will block if this is the case. However anything can happen during this block. In particular some other process can have grabbed that page. Pagein() really should start the processing of that page fault again from the beginning. Here are the fixes to sys/vmpage.c 119,120c119,129 < if (p->p_flag & SDLYU) < mlock(pte->pg_pfnum); --- > if (p->p_flag & SDLYU) { > /* BUG FIX (WLS) */ > c = &cmap[pgtocm(pte->pg_pfnum)]; > if (c->c_lock) { > c->c_want = 1 ; > sleep( (caddr_t)c, PSWP+1); > goto restart; > } > c->c_lock = 1; > /* END BUG FIX (WLS) */ > } 150,151c159,169 < if (p->p_flag & SDLYU) < mlock(pte->pg_pfnum); --- > if (p->p_flag & SDLYU) { > /* BUG FIX (WLS) */ > c = &cmap[pgtocm(pte->pg_pfnum)]; > if (c->c_lock) { > c->c_want = 1 ; > sleep( (caddr_t)c, PSWP+1); > goto restart; > } > c->c_lock = 1; > /* END BUG FIX (WLS) */ > } The comments in mlock() and munlock() (in vmmem.c) in our source say something to the effect "THIS ROUTINE SHOULD TAKE A CMAP STRUCTURE AS AN ARGUMENT". Personally, I think this would be a bad idea for mlock(), as I think it is impossible after a block to guarantee that a page still belongs to the process that requested the mlock(). This bug tends to mainly show up on large transfers when a system is busy. It first turned up on our versatec. It is my theory that many poor innocent devices have been falsely suspected of losing interrupts due to this bug. About a month and a half ago I first reported this bug. Unfortunately the fix I reported then was erroneous. Rather than an occasional hang that fix would cause a guaranteed hang whenever the block in pagein occured. I am very sorry about that. The fix given above has been running for about two weeks without the hang re-occuring. William L. Sebok Princeton Univ. Observatory Peyton Hall, Rm 129 Princeton, N.J. 08544 ..!allegra!princeton!astro