johnl@ima.UUCP (John R. Levine) (10/07/86)
I was looking at the dbm routines the other day. If you haven't seen them, they're a neat little set of routines that maintain (key, data) pairs on the disk, with the key and data both being arbitrary byte strings, in such a way that you can fetch the data for a given key usually with only one disk read, even for a file many megabytes long, and the file is usually less than twice the size of the underlying data, which is not bad for such schemes. They were sent out with V7 and appear in the BSD releases and many Sys III and Sys V systems. Anyway, there are routines called firstkey() and nextkey() which let you sequence through all of the keys in the file. It appears to me that nextkey() has never worked for databases larger than one block. In the nextkey routine, it goes through some magic to get a key, and then (for reasonable reasons) checks to make sure that the key is in the data base and not the very first key. Unfortunately, when it looks up the very first key, that uses the same buffer where the key it was looking at lived, and smashes it. Am I missing something? It's not that complicated, and experiments verify that nextkey() fails miserably to dump out the contents of the dbm database for news history. I can fix it, but would just as soon not reinvent the wheel again. -- John R. Levine, Javelin Software Corp., Cambridge MA +1 617 494 1400 { ihnp4 | decvax | cbosgd | harvard | yale }!ima!johnl, Levine@YALE.EDU The opinions expressed herein are solely those of a 12-year-old hacker who has broken into my account and not those of any person or organization.
chris@umcp-cs.UUCP (Chris Torek) (10/08/86)
In article <240@ima.UUCP> johnl@ima.UUCP (John R. Levine) writes: >... It appears to me that nextkey() has never worked for databases >larger than one block. In the nextkey routine, it goes through >some magic to get a key, and then (for reasonable reasons) checks >to make sure that the key is in the data base and not the very >first key. Unfortunately, when it looks up the very first key, >that uses the same buffer where the key it was looking at lived, >and smashes it. The 4.2 nextkey() does not work this way. It uses the key to calculate the proper block, then finds the next object in that block (the one that follows the item given to nextkey). If it reaches the end of the block, it moves to the next block. The 4.3 nextkey() does something entirely different and incompatible, and therefore broken, though certainly more efficient. In particular, the 4.2 nextkey() would give you The Next Key after its argument. The 4.3 nextkey() gives you the next key after the previous call to nextkey(). If all your calls to nextkey() are sequential, this is equivalent. Of all the programs that use dbm or ndbm, I know of none that do not make sequential calls, for whatever that is worth. I submit that your vendor, like Berkeley, sought to `improve' Ken's code, and, like Berkeley, broke it. Unlike Berkeley, they failed to test it. -- In-Real-Life: Chris Torek, Univ of MD Comp Sci Dept (+1 301 454 1516) UUCP: seismo!umcp-cs!chris CSNet: chris@umcp-cs ARPA: chris@mimsy.umd.edu