schwartz@ncar.ucar.edu (Michael Schwartz) (06/28/89)
I have used Sun's Lightweight Process (LWP) package for a couple of
different research prototypes.  I came across and solved some problems
with the package.  I am reporting the bugs to Sun, but since there
won't be fixes for them until SunOS 4.1 comes out, I thought I'd tell
people about them now.

Summary of problems:

1. A problem with the use of the non-blocking I/O library (libnbio.a)
   that sometimes caused the shell on which the program was running to
   exit.

2. A problem in the LWP library where, after a while, an internal
   thread called the stkreaper gets into a high-priority infinite loop
   inside a locked critical section, starving all the other threads
   and wedging the CPU.  This was by far the most difficult bug to
   track down, and also the one that caused the most trouble.

3. A problem with the nbio implementation of select, which causes
   threads to awaken when there is no I/O available and before a
   timeout has occurred.

4. A problem with the nbio implementation of connect, which returns
   with errno == EINVAL instead of ECONNREFUSED when the connection
   is refused.

5. A dependency bug in the nbio library that sometimes causes it not
   to be linked in, even though you specify -lnbio when linking.

6. Inability to get sequentially numbered thread IDs, for use in
   indexing a global data structure.

Problem 1.

Since the LWP package is implemented as library code instead of in the
kernel, the standard UNIX I/O calls will cause the whole UNIX process
to block if any thread uses one.  The non-blocking I/O package gets
around that by supplying versions of most of the main I/O library
routines that know about LWPs and do the right thing (i.e., block just
the one thread).  The problem I had was that when an error caused the
UNIX process to crash, the shell on which the program was running
would exit, as if it had received an EOF.
It turned out that the problem was due to something to which there was
a passing reference in the BUGS section of the manpage Intro.3l:

    "Killing a process that uses the non-blocking I/O library may
    leave objects (such as its standard input) in a non-blocking
    state.  This could cause confusion to the shell."

When the program died, the shell was left with non-blocking input, and
so it thought it was getting EOFs.  To deal with this problem, you can
catch signals and restore the file descriptors to blocking state, as
follows:

    CleanupHandler(sig)
        int sig;
    {
        int fd;
        int FDState;

        sigsetmask(~0);
        printf("CleanupHandler received signal %d\n", sig);
        for (fd = 0; fd < NOFILE; fd++) {
            FDState = fcntl(fd, F_GETFL, 0);
            if (FDState != -1)
                fcntl(fd, F_SETFL, FDState & ~(FASYNC|FNDELAY));
        }
        pod_exit(0);
    }

Even after I did this, the problem still happened for some reason when
the program wrote to stderr (stdout doesn't give me the problem).  I
don't know why.

Problem 2.

When a user-level thread dies, an internal thread called the stkreaper
runs and reclaims the stack space used by that thread.  The problem I
had was that the stkreaper did not properly clean up its data
structures after a thread died, causing it to get into a high-priority
infinite loop inside a locked critical section.  When this happens,
none of the other threads make any more progress, and the CPU gets
wedged.  The problem only happens when some threads finish executing
before others are started, so it tends to show up only in substantial
LWP applications.

The fix is to use a modified version of one of the LWP source files
(stack.c).  I can't give out the fixed file (since it is copyrighted),
but if you have a source license I could give you a context diff of
the fixed file.  I could also give you .o files for Sun 3's and Sun
4's, so you can link with them (if you trust me!).

Problem 3.
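This problem concerns the usual send/select/recv request-reply pattern
described below.  As a concrete sketch (function and variable names
are hypothetical, and the plain libc calls stand in for the nbio
library's versions of send, select, and recv):

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/select.h>
#include <sys/time.h>
#include <string.h>
#include <unistd.h>

/*
 * Sketch of the send/select/recv request-reply paradigm.  Returns the
 * number of reply bytes received, 0 on timeout, or -1 on a send/recv
 * error.
 */
int
request_reply(int s, char *req, int reqlen, char *ans, int anslen,
              struct timeval *timeout)
{
    fd_set readfds;

    if (send(s, req, reqlen, 0) < 0)
        return (-1);
    /*
     * Wait until a reply is readable; calling recv right away on a
     * non-blocking socket would just fail with EWOULDBLOCK.
     */
    FD_ZERO(&readfds);
    FD_SET(s, &readfds);
    if (select(s + 1, &readfds, (fd_set *)0, (fd_set *)0, timeout) <= 0)
        return (0);             /* timed out (or select error) */
    return (recv(s, ans, anslen, 0));
}
```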
A common application-level paradigm using the non-blocking I/O package
is as follows:

    send a message
    call select() to wait for the response (calling recv() at this
        point wouldn't work -- it would fail with errno == EWOULDBLOCK)
    call recv() to get the response

When the I/O is complete, the UNIX process gets a SIGIO, and the LWP
package must decide which thread(s) to awaken.  But as it is currently
implemented, every thread that is waiting on SIGIO gets reawakened.
The result is that the semantics of select are not correctly
implemented -- a thread can return from select with no I/O available,
and before the timeout has expired.  This breaks existing applications
that you are trying to convert to use LWPs (as I was doing in one
case).

This should be fixed in the LWP package, but in the meantime you can
get around the problem by changing the above code to something like

    send a message
    do {
        call select()
        call recv()
    } while the recv call failed with errno == EWOULDBLOCK

This doesn't quite do it, though, because each time you do the select
you will be using the full original timeout.  I ended up keeping track
of how much time had passed, using something like:

    send a message;
SelectAgain:
    gettimeofday(&starttime, &timezone);
    select(..., &timeout);
    if (recv(s, answer, anslen, 0) <= 0) {
        if (errno == EWOULDBLOCK) {
            gettimeofday(&endtime, &timezone);
            timeout.tv_sec -= (endtime.tv_sec - starttime.tv_sec);
            timeout.tv_usec -= (endtime.tv_usec - starttime.tv_usec);
            if (timeout.tv_usec < 0) {
                timeout.tv_sec--;
                timeout.tv_usec += 1000000;
            } else if (timeout.tv_usec >= 1000000) {
                timeout.tv_sec++;
                timeout.tv_usec -= 1000000;
            }
            if (timeout.tv_sec >= 0)    /* don't retry with a negative timeout */
                goto SelectAgain;
        }
        /* Otherwise it's a recv error */
    }

This is definitely a suboptimal solution, because there is a lot of
overhead in having every thread that's waiting for I/O (usually all of
them) wake up and do this check, including the system calls to get the
time and do a select again.  This should be fixed in the guts of the
LWP library.

Problem 4.
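As a stop-gap for the errno problem described in this section -- the
nbio connect coming back with EINVAL where ECONNREFUSED is meant --
one can wrap connect and translate the errno value (the wrapper name
is hypothetical):

```c
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <string.h>
#include <errno.h>

/*
 * Stop-gap wrapper: call connect(), and map the spurious EINVAL back
 * into the ECONNREFUSED that callers expect when a connection is
 * refused.  (On a system without the bug, connect already sets
 * ECONNREFUSED and the translation is a no-op.)
 */
int
my_connect(int s, struct sockaddr *name, int namelen)
{
    int ret = connect(s, name, namelen);

    if (ret < 0 && errno == EINVAL)
        errno = ECONNREFUSED;
    return (ret);
}
```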
When a connection is refused (because no server is listening at the
port a program tries to connect to), the connect call should return
with errno == ECONNREFUSED.  For some reason, it comes back with
EINVAL instead.  I don't know why this is.  For the time being I just
hacked the connect code to translate EINVAL into ECONNREFUSED, but
that's obviously only a stop-gap hack.

Problem 5.

To use the nbio library, you are supposed to be able to just do

    cc -o prog prog.c -lnbio -llwp

However, I found that this didn't always work -- sometimes the entire
UNIX process would block as soon as one of the threads did an I/O
call.  This problem has to do with the dependency chain between the
routines in libnbio.a and liblwp.a.  You can get around the problem
either by doing

    ar x /lib/libnbio.a nb.o
    cc -o prog prog.c nb.o -lnbio -llwp
    rm nb.o

or by doing

    cc -o prog prog.c -lnbio -llwp -lnbio -llwp

Problem 6.

Sometimes it is useful to build a global data structure shared by all
threads, indexed by thread ID.  The problem is that the thread IDs
returned by the lwp_create call (or the lwp_self call) are not
sequential -- thread and monitor IDs come from a single number space,
so any monitors created behind the scenes by the LWP package cause the
ID numbers for the threads to skip values.  For example, I got back
IDs 2, 5, 7, 8, 10 when creating a set of 5 threads.

One thing you could do to get around this is to allocate more space in
the global data structure that you want to index by thread ID.  A more
space-efficient way is to build a routine that keeps a mapping of Sun
thread IDs to sequential thread IDs, using an array, as follows:

    static int ThreadIDs[5*MAX_LWPS];
            /*
             * The numbering of unique IDs generated by Sun LWPs seems
             * to skip no more than 2 or 3 values between each LWP;
             * use a stride of 5 just to be safe.  This is a hack, but
             * it's safe (since even if more than 5 values are used
             * sometimes, on average fewer than 5 are, so we won't
             * overflow this array) and efficient.
             */
    static int GlobalIDCount = 0;

    int
    MyThreadID()
    {
        thread_t tid;

        if (lwp_self(&tid) < 0) {
            fprintf(stderr, "MyThreadID: lwp_self failed\n");
            pod_exit(1);
        }
        if (ThreadIDs[tid.thread_key] == 0) {
            /*
             * Not yet initialized -- assumes globals are initialized
             * to all 0's.
             */
            ThreadIDs[tid.thread_key] = ++GlobalIDCount;
        }
        return (ThreadIDs[tid.thread_key] - 1);
            /*
             * -1 so the IDs are numbered 0..whatever-1, rather than
             * 1..whatever.
             */
    }

Again, it would be better if Sun provided a way to get sequential IDs
directly, since this kind of mapping is expensive.

 - Mike Schwartz
   Dept. of Computer Science
   U. Colorado - Boulder