whm@arizona.UUCP (05/27/87)
The man page for ndbm(3) says:

	The sum of the sizes of a key/content pair must not exceed the
	internal block size (currently 4096 bytes).

However, it appears that the actual limit is 1024, arising from

	#define PBLKSIZ 1024

in <ndbm.h>. We found that changing PBLKSIZ to 4096 seems to work just fine, but it's not clear which is in error, the man page or the code.

	Bill Mitchell
	whm@arizona.edu
	{allegra,cmcl2,ihnp4,noao}!arizona!whm
allyn@sdcsvax.UCSD.EDU (Allyn Fratkin) (05/27/87)
In article <1739@megaron.arizona.edu>, whm@arizona.edu (Bill Mitchell) writes:
> However, it appears that the actual limit is 1024, arising from #define
> PBLKSIZ 1024 in <ndbm.h>. We found that changing PBLKSIZ to 4096 seems
> to work just fine, but it's not clear what's in error, the man page or the
> code.

Some time ago, I was looking at the code for ndbm, and I concluded that the values of the DBLKSIZ and PBLKSIZ defines have somehow been reversed. It makes little sense (to me) to have DBLKSIZ be so large, since most databases don't get that big. Flame me if I'm wrong (somebody will), but isn't the .dir file basically a bitmap (used in the hash calculation) of the available file pages? How many people regularly have databases that are 4096 pages long?

Incidentally, I ported the ndbm package to an IBM PC running PC/IX, and I changed PBLKSIZ to 3072 and DBLKSIZ to 512. It works fine.

--
From the virtual mind of Allyn Fratkin, EMU Project, U.C. San Diego
allyn@sdcsvax.ucsd.edu or {ucbvax, decvax, ihnp4}!sdcsvax!allyn
chris@mimsy.UUCP (05/28/87)
>In article <1739@megaron.arizona.edu> whm@arizona.edu (Bill Mitchell) writes:
>>... it appears that the actual [key+content length restriction] is
>>1024 [bytes], arising from #define PBLKSIZ 1024 in <ndbm.h>.

In article <3231@sdcsvax.UCSD.EDU> allyn@sdcsvax.UCSD.EDU (Allyn Fratkin) writes:
>It makes little sense (to me) to have DBLKSIZ be so large since most
>databases don't get that big. ... isn't the .dir file basically a
>bitmap (used in the hash calculation) of the available file pages?

More precisely, it is a sort of binary tree describing which pages have been split. It takes

	2 ** ceil(log2(n + 1)) - 1

bits to describe an n-page database. Our old hashed host files are 2020352 bytes, or 1973 pages, requiring roughly 2048 bits. A DBLKSIZ of 4096 bytes (32768 bits) means that one can describe a 32767-page, or ~33 Mbyte, database without ever having to swap bitmap blocks. In practise, it takes fewer bits, as only ones are stored. One could have a database grow to 64MB before a second bitmap block was required.

Increasing PBLKSIZ means that larger databases need fewer bits (since each bit describes a larger `page'), but increases the memory-to-memory copying overhead involved in adding and deleting items. On a 4K-block file system, it will improve I/O bandwidth.

In short, it would probably be a win to change DBLKSIZ to 1024 and PBLKSIZ to 4096. Alas, all old databases would have to be reconstructed. (Mdbm does not suffer from this problem: the data and map block sizes are set at database creation, and are stored in the database itself.)

Incidentally, the actual key+content length restriction is PBLKSIZ - 3 * sizeof (short), or 1018 bytes in 4.3BSD.

--
In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 7690)
Domain: chris@mimsy.umd.edu	Path: seismo!mimsy!chris
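
For anyone who wants to check the arithmetic above, here is a minimal sketch in C. It is not taken from the ndbm sources: the function names dir_bits and max_pair_size are made up for illustration, and the only things it assumes are the two formulas quoted in the article (2 ** ceil(log2(n + 1)) - 1 bits for an n-page database, and PBLKSIZ - 3 * sizeof (short) for the key+content limit).

```c
#include <stddef.h>

/*
 * Hypothetical helper, not part of ndbm: number of directory bits
 * needed to describe an n-page database, using the formula from the
 * article, 2 ** ceil(log2(n + 1)) - 1.  The loop finds the smallest
 * power of two that is >= n + 1, which avoids floating point.
 */
static unsigned long dir_bits(unsigned long n)
{
    unsigned long p = 1;

    while (p < n + 1)
        p <<= 1;
    return p - 1;
}

/*
 * Hypothetical helper, not part of ndbm: largest key+content size for
 * a given page size, using the article's PBLKSIZ - 3 * sizeof (short).
 * With a 2-byte short and a 1024-byte page this gives 1018.
 */
static size_t max_pair_size(size_t pblksiz)
{
    return pblksiz - 3 * sizeof(short);
}
```

By this formula a 1973-page database needs dir_bits(1973) = 2047 bits, and dir_bits(32767) = 32767, which just fits in the 32768 bits of one 4096-byte directory block. The next page pushes the requirement to 65535 bits, which is where a second bitmap block would come in if every bit had to be stored.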