cpj@ENG.SUN.COM (Chuck Jerian) (07/05/89)
I took a large list of names and set up the QUIPU server discussed in the earlier item. This name server used over 1K of memory per name in the list, so that to store the large list of people and machines that I made of everyone at Sun, it used over 16M of virtual memory. It seemed to reference almost 32M while building this list, but freed about half of the total.

The data is organized as a giant linked list within a level of the directory. A search of the directory using the ISO search mechanism, which allows for 'x*y*z', causes the server to thrash violently, as it references every page of this giant list. An answer is sometimes forthcoming in a minute. The QUIPU server is better behaved for small sets of names.

On the other hand, gnu-grep can always scan this same list of data, using arbitrary regular expressions that are more powerful than those in X.500, in less than .3 seconds on a Sun4/260 with the data represented as a text file. This suggests to me that the most important issue in searching name servers is the organization of data and the choice of algorithms; those in QUIPU are terrible, much worse than text files and gnu-grep.

--cpj
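The 'x*y*z' filter Chuck mentions is an X.500-style substring match (literal fragments separated by wildcards), which maps directly onto a regular expression. A minimal Python sketch of the flat-file linear scan he describes (the function names and sample data are illustrative, not QUIPU or gnu-grep code):

```python
import re

def substring_filter_to_regex(pattern: str) -> re.Pattern:
    """Compile an X.500-style substring filter such as 'x*y*z'
    (literal fragments separated by '*' wildcards) into a regex.
    Hypothetical helper for illustration only."""
    parts = pattern.split("*")
    return re.compile(".*".join(re.escape(p) for p in parts))

def scan(entries, pattern):
    """One sequential pass over a flat list of name strings --
    the grep approach: simple, and with good memory locality."""
    rx = substring_filter_to_regex(pattern)
    return [e for e in entries if rx.search(e)]

entries = ["alice jones", "bob smith", "carol smythe"]
print(scan(entries, "s*th"))   # ['bob smith', 'carol smythe']
```

The point of the sketch is that a sequential scan of a contiguous text file touches each page exactly once, in order, which is why grep stays fast even when the data far exceeds real memory.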
steve@CS.UCL.AC.UK (Steve Kille) (07/07/89)
I suggest that we move this discussion to the <quipu@cs.ucl.ac.uk> list, which is focussed on the issue. This is an open list; send to <quipu-request@cs.ucl.ac.uk> if you want to join. If you have problems with installing or using QUIPU, please send reports to <quipu-support@cs.ucl.ac.uk>.

>From: Chuck Jerian <cpj@eng.sun.com>
>To: tcp-ip@sri-nic.arpa
>Subject: in re: QUIPU as X.500 server
>Date: Tue, 4 Jul 89 12:22:54 PDT

>I took a large list of names, and set up the QUIPU server discussed
>in the earlier item. This name server used over 1K of memory per
>name in the list, so that to store the large list of people and machines

Right. There is a lot of structuring info needed. I guess that your entries don't have much data, as we estimate an average of 2K per entry. There is quite a bit of optimisation possible without too much effort, which we expect to do for QUIPU 6.0.

>that I made of everyone at Sun, it used over 16M of virtual memory.
>It seemed to reference almost 32M making this list, but freed
>about half of the total.

Are you sure? This surprises me.

>The data is organized as a giant linked list
>within a level of the directory. A search of the directory using the
>ISO search mechanism which allows for 'x*y*z' causes the server to
>thrash violently, as it references every page of this giant list.
>An answer is sometimes forthcoming in a minute. The QUIPU server
>is better behaved for small sets of names.

This is what I would expect. If you search the entire tree, you are going to touch bits all over the virtual memory used. If this can all fit into real memory, performance is reasonable (about 1000 entries per second on a VAXstation II, quite a bit slower than the machine you quote). If you step off real memory, it thrashes (as you note). In many cases, the X.500 Directory Information Tree hierarchy will be used to control the scope of a search.
We also plan to make some changes so that common searches touch less memory (e.g., by grouping all the phone data into adjacent memory).

>On the other hand gnu-grep can always scan this same list of data
>using arbitrary regular expressions which are more powerful than
>those in X.500 in less than .3 seconds on a Sun4/260 with the
>data represented as a text file.

grep is a good tool, and has its applications. However, supporting a wide-area directory is not one of them.

>This suggests to me that the most important issue in searching name
>servers is the organization of data and the choice of algorithms,

Absolutely. Seems like motherhood to me!

>those in QUIPU are terrible, much worse than text files and gnu-grep.
> --cpj

This does not follow. QUIPU can do a lot of things which gnu-grep can't! Let me explain some of the philosophy behind why QUIPU was done the way it was.

The OSI Directory (X.500) has a very rich (too rich?) framework. One design approach is to choose your database, and then provide X.500 access to it. For example, I could choose gnu-grep plus a single file as my database, and then give X.500 access. This would give stunning performance for the things my database was good at, but would fall to pieces for things keyed the wrong way, or for questions which could not be formulated.

With QUIPU, the internal (memory) structures are aligned very much to those of the OSI Directory. This means that a query will cost roughly in proportion to the complexity of the query in X.500 terms. It will not be stunningly good for any sort of query. However (and more important in an experimental implementation), it will not be stunningly bad for any sort of query, either. One of the things I'd hope to learn from the QUIPU experiment is how one might key a database, and which things need to be optimised under "real" usage.

Steve
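The locality change Steve mentions, grouping one attribute's values into adjacent memory, amounts to storing attributes in columns rather than one record at a time. A small Python sketch of the idea (the sample entries are invented for illustration):

```python
# Row layout: each entry carries all of its attributes together,
# so a phone-only search still walks past names and hosts.
entries = [
    {"name": "chuck", "phone": "555-1000", "host": "eng.sun.com"},
    {"name": "steve", "phone": "555-2000", "host": "cs.ucl.ac.uk"},
]

# Column layout: all phone numbers adjacent in one list, so a
# common search (e.g. by phone) touches only that column's pages.
phones = [e["phone"] for e in entries]

hits = [i for i, p in enumerate(phones) if p.endswith("2000")]
print([entries[i]["name"] for i in hits])   # ['steve']
```

The trade-off is classic: the column copy speeds up the searches it anticipates, at the cost of extra memory and the bookkeeping to keep the column in sync with the entries.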