stanonik@nprdc.arpa (Ron Stanonik) (09/12/86)
At least once a day Transcript 2.0 hangs for us. (We're running it on a VAX 780 running 4.2BSD.) The symptoms are that the receiving pscomm exits after complaining "BP is undefined, flushing rest of input", while the sending pscomm loops forever waiting for its remaining child, psrv, to exit.

gcore'ing psrv and pscomm reveals that psrv is blocked writing the last chunk of the prolog, whereas pscomm has already read the trailer. Huh? Shouldn't pscomm be reading what psrv writes?

Our current guess is that psrv sometimes completes reversing before pscomm has closed descriptor zero and dup'ed the pipe from psrv into zero. When psrv signals pscomm, pscomm longjmps back to the setjmp and then skips around the descriptor zero code. pscomm is now reading from the postscript file, rather than from the pipe from psrv. psrv reads the prolog and blocks writing, because the pipe is full and nobody is reading it. pscomm then begins reading WHERE PSRV LEFT OFF! (That's the way unix seems to work when children inherit descriptors from parents: they share the file offset.) The first line after the prolog is "BP", which is indeed undefined, the prolog never having been read by pscomm.

The setjmp seems not only unnecessary, but wrong, so our tentative "fix" is to remove the "setjmp(waitonreverse)" and the "longjmp(waitonreverse)".

Ron Stanonik
stanonik@nprdc.arpa
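
The "where psrv left off" behaviour can be checked on its own with a small test program. What follows is only a sketch for a current POSIX system, not Transcript code: the function name offset_after_child_read and the scratch file are made up for illustration. A forked child stands in for psrv and reads some bytes from an inherited descriptor; the parent, standing in for pscomm, then asks where its own copy of the descriptor is positioned.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

/*
 * Hypothetical helper, not part of Transcript.
 * Returns the parent's file offset after a forked child has read
 * nbytes from the same inherited descriptor, or -1 on error.
 */
long offset_after_child_read(size_t nbytes)
{
    char path[] = "/tmp/offsetXXXXXX";   /* scratch file, name is arbitrary */
    char buf[256];
    int fd, status;
    pid_t pid;
    off_t pos;

    if (nbytes > sizeof buf)
        return -1;

    fd = mkstemp(path);
    if (fd < 0)
        return -1;
    unlink(path);                        /* file disappears once fd is closed */

    /* Fill the file, then rewind so both processes start at offset 0. */
    memset(buf, 'x', sizeof buf);
    if (write(fd, buf, sizeof buf) != (ssize_t)sizeof buf ||
        lseek(fd, 0, SEEK_SET) != 0) {
        close(fd);
        return -1;
    }

    pid = fork();
    if (pid < 0) {
        close(fd);
        return -1;
    }
    if (pid == 0) {
        /* Child (the psrv role): read from the inherited descriptor. */
        ssize_t n = read(fd, buf, nbytes);
        _exit(n == (ssize_t)nbytes ? 0 : 1);
    }

    /* Parent (the pscomm role): wait, then see where the offset is now. */
    if (waitpid(pid, &status, 0) < 0 ||
        !WIFEXITED(status) || WEXITSTATUS(status) != 0) {
        close(fd);
        return -1;
    }
    pos = lseek(fd, 0, SEEK_CUR);
    close(fd);
    return (long)pos;
}
```

Called from a main() as offset_after_child_read(16), this returns 16 rather than 0: the parent never read anything, yet its next read would start after the bytes the child consumed, because descriptors inherited across fork share one file offset.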