perf@efd.lth.se (Per Foreby) (10/16/89)
Facts: Bash 1.03, Sun3 SunOS 4.0.3.

Since we have over 1100 users (yp) username completion takes some
time. If I break while the completion is running, I get an error the
next time I try a completion:

    bash$ ls ~per<M-?>          < takes about 30 sec >
    ~perf  ~perh  ~pers
    bash$ ls ~per<M-?><^C>      < break after a few seconds >
    bash$ ls ~per               < try again >
    free: Called with already freed block argument
    mailing bug report...

If I put all users in my local /etc/passwd, I can't reproduce the
problem.

--
Per Foreby
System manager at EFD, Lund Institute of Technology (Lund University)
Snail: E-huset, Tekniska Hogskolan i Lund, Box 118, S-221 00 LUND, Sweden.
Email: perf@efd.lth.se
Phone: int + 46 46-10 74 92
news@bbn.COM (News system owner ID) (10/18/89)
In article <PERF.89Oct16173540@osiris.efd.lth.se> perf@efd.lth.se (Per Foreby) writes:
< Facts: Bash 1.03, Sun3 SunOS 4.0.3.
<
< Since we have over 1100 users (yp) username completion takes some
< time. If I break while the completion is running, I get an error the
< next time I try a completion:
<
<     bash$ ls ~per<M-?>          < takes about 30 sec >
<     ~perf  ~perh  ~pers
<     bash$ ls ~per<M-?><^C>      < break after a few seconds >
<     bash$ ls ~per               < try again >
<     free: Called with already freed block argument
<     mailing bug report...
<
< If I put all users in my local /etc/passwd, I can't reproduce the
< problem.

Congratulations, you have a slightly different version of what I call
the YP vs. vfork(2) bug.

The problem is that the yp routines (used in the Sun getpw* routines)
close and re-open their socket to the yp server only when they think
they need to, and they judge this correctly only _some_ of the time.
I found it because one of the cases they do detect is when the pid
changes (i.e. after a fork() or vfork()). The trouble is that vfork()
shares the address space between the child and the parent until the
child exec()s, so the child must be very careful not to mess with the
data of the parent gratuitously.

(Actually, I have many flames on the existence of vfork(). It is a
hack that should never have been done. The creators did not think
hard enough about the real problem.)

Bash doesn't have a vfork() (or at least, it didn't the last time I
looked, which was a while ago (if you are thinking of adding it,
DON'T! -- it will get you in too much trouble)). On the other hand,
interrupting things in the middle will have the same effect, as far as
the getpw* routines are concerned.

The fix is (in the end) relatively easy. What Mat Landau ended up
doing to tcsh was to put a call to fix_yp_bugs() before the first and
after the last of a series of calls to getpw*. This seems to have
fixed the bug nicely.

Be careful, though.
Too many calls to this and your performance will drop, as the YP
routines have to make a new socket after each call to fix_yp_bugs().

#ifdef YPBUGS
fix_yp_bugs()
{
    char   *mydomain;

    /*
     * PWP: The previous version assumed that yp domain was the same as the
     * internet name domain.  This isn't always true.
     * (Thanks to Mat Landau <mlandau@bbn.com> for the original version
     * of this.)
     */
    if (yp_get_default_domain(&mydomain) == 0) {  /* if we got a name */
        yp_unbind(mydomain);
    }
}
#endif /* YPBUGS */

For example, the following routine is used in tcsh to get the next
entry from the current list of files or users:

char *
getentry (dir_fd, looking_for_lognames)
DIR *dir_fd;
{
    extern struct passwd *getpwent ();
    register struct passwd *pw;
    register struct direct *dirp;
    /* extern struct direct *readdir(); */

    if (looking_for_lognames) {     /* Is it login names we want? */
#ifdef YPBUGS
        pw = getpwent();
        if (pw == NULL) {
            fix_yp_bugs();
            return (NULL);
        }
#else
        if ((pw = getpwent ()) == NULL)
            return (NULL);
#endif
        return (pw -> pw_name);
    } else {                        /* It's a dir entry we want */
        if (dirp = readdir (dir_fd))
            return (dirp -> d_name);
        return (NULL);
    }
}

Happy hacking,

        -- Paul Placeway <PPlaceway@bbn.com>

Am I a wizard? Are you qualified to judge? Does it really matter in the end?
"What I am is what I am, are you what you are or what?" -- E.B.