dave@lsuc.UUCP (David Sherman) (01/25/85)
I have just added over 1000 accounts for our students, and we now have 1526 entries in our /etc/passwd, on a Perkin-Elmer 3220 running v7. Should I consider sorting them and using a binary search? Does anyone have versions of getpw{nam,ent,uid} and anything else which would need changing, to implement a binary search? (We have a source license.) Dave Sherman The Law Society of Upper Canada (416) 947-3466 -- {utzoo pesnta nrcaero utcs}!lsuc!dave {allegra decvax ihnp4 linus}!utcsrgv!lsuc!dave
Doug Gwyn (VLD/VMB) <gwyn@Brl-Vld.ARPA> (01/29/85)
Doug Kingston implemented a binary-searched passwd index on our PDP-11 UNIX systems and a more comprehensive user database system for 4.2BSD. Contact dpk@brl-tgr.ARPA
alb@Mitre-Bedford (01/31/85)
Got just such an animal here. ~800 users were enough for us to make this change. The general idea: Agree that uid's are unique, define a structure with all the /etc/passwd information on a line in it, read the password file - writing a structure in a database for each line. You now have a binary indexed database, indexed by uid. Since you know the length of the records, lseek is used to skip to the entry desired. (very fast) To get the uid from the name, we hashed the names/uid pairs and put them in a seperate file, but one could also skip throuth the db comparing name (or any field for that matter), to the appropriate field. You can reply directly to me if you have specific questions. len ----------------------------------------------------------------------- A. Leonard Brown alb@mitre-bedford.arpa len%tufts.csnet@csnet-relay
cudcv@wkdaisy.UUCP (Rob McMahon) (02/03/85)
> Message-ID: <330@lsuc.UUCP> > I have just added over 1000 accounts for our students, and we > now have 1526 entries in our /etc/passwd, on a Perkin-Elmer 3220 > running v7. Should I consider sorting them and using a binary > search? Does anyone have versions of getpw{nam,ent,uid} and anything > else which would need changing, to implement a binary search? > (We have a source license.) At Warwick we have an indexed password file. The password file itself has fixed length records, and there are two other files which act as indexes into this by uid and by name. This means that the only programs which *have* to be recompiled with the new library routines are those which modify the password file, or that use the shell field, since this is now terminated by a colon instead of a newline. The fixed length records means that programs which modify the passwd file can simply rewrite the record they are interested in instead of copying the whole file somewhere safe, making changes on the way, and then copying or moving it back over the original (the time taken to do this when you have over 2000 entries is no joke). The getpwuid routine now just indexes into the uid index, gets the password file address from there, and then seeks into the password file. The getpwnam routine does a binary search on the name index to get the password file address. Each of these routines checks to see the indexes are up to date before using them, in case the password file has been modified. passwd, chfn, and chsh (and one of our own called pwent which changes any field interactively) just touch the indexes when they are done since they can modify the password file line in situ. makeuser inserts the new entry at the appropriate point in the indexes. deluser just overwrites the password file entry with a 'free' record which is spotted by the next makeuser, and adjusts the indexes appropriately. The routines would need slight changes to run on your system, since we have added some fields to the password file for expiry date, per-process cpu limit login quota and disk allowance (which is obsolete now we use the 4.2 quotas). makeuser would probably need major changes, since we have our own standards for usercodes, login directories etc. There are also programs to convert from variable-length- to fixed-length-records, and for recreating the indexes should they get out of date. Programs such as ps and ls that create their own in-core password file whilst reading through for the entry they want, need this code written around to gain full speed. Most programs benefit just from recompiling. The difference in speed is impessive. Try an 'ls -l' or 'ps u' or 'f' or even just logging in without indexes, and with 2000 entries in the passwd file ! Watch out for those programs (like lastcomm) that assume you won't have uid's greater than 2000 or so and core dump when you do, as well. If this sounds interesting, mail me, and I'll mail you some source, diffs and so on. I think there's too much to go on the net. Rob McMahon ... mcvax!ukc!ubu!daisy!cudcv
lorne@uokvax.UUCP (02/07/85)
Yes, we do have a hashed passwd file. We have had (and I don't think this is much to brag about) up to 3000 user accounts on an 11/70. We also run about 1000 accounts on a VAX 11/780 and use the same hashing scheme there. This code was originally written back in the V6 days, and has been modified to run on every version we've had since then (V7, 2.Xbsd, 4.Xbsd). As an explanation of how this works, we have a file called u-seek, this is a copy of the password file that allows you to seek into it using the uid. Another file called p-seek, is the same as the u-seek file but for encryted passwords. To complete the set we have two other files, u-hash and sid-hash that allow you to come up with the uid by hashing either the login name or the Student ID number. There is a set of maintence programs that keep these files up to date. We also have our own newusr and mkclass programs that create accounts and a rmusr program that destroys accounts. We have completely re-written all the getpwent(3) routines so that they use the above files. It also requires changing or putting mods into every command that parses the passwd file, a short list of these would include login, passwd, finger, ls. Good points: 1. Since we regularly run 30-40 people at once on the 11/70 it decreases login time dramaticly. It used to take up to 5 minutes to log in. 2. It speeds up all the programs that access the passwd file, ls in particular was horribly slow with a large passwd file. Bad points: 1. A person can only have one account. Specifically one login name per uid. 2. There is alot of overhead in keeping all these files up to date. Basicly it is much more difficult to make random changes to the passwd file. 3. We have never distributed this stuff before. If you wanted it, you would be the first, and there are some problems associated with that. We would be glad to help out anyone who wanted a copy of it. Send mail to me for more information. Lorne Wilson University of Oklahoma ...ctvax!uokvax!lorne
guy@rlgvax.UUCP (Guy Harris) (02/07/85)
> At Warwick we have an indexed password file. The password file itself has > fixed length records, and there are two other files which act as indexes into > this by uid and by name. > > This means that the only programs which *have* to be recompiled with the > new library routines are those which modify the password file, or that use > the shell field, since this is now terminated by a colon instead of a newline. Programs which still rummage through the password file themselves, or which use "getpw", will have to be rewritten (they should have been rewritten *long* ago!) and recompiled. (The System V Release 2 "ls" still does this, so there are probably several other fossils lying around various flavors of UNIX.) Guy Harris {seismo,ihnp4,allegra}!rlgvax!guy