olson@sax.cs.uiuc.edu (Robert Olson) (02/07/91)
I am using a dbm database (via dbmopen) to create a rather large, sparse database. My data fits nicely in the form

    $db{key1}
    $db{key1, key2}
    $db{key1, key2, key3}

etc., where there are relatively few values of each of the first couple of keys.

(Perhaps an example is in order. The data is information about classes gleaned from the g++ compiler.

    $db{"class"} = "X,Y,Z"
    $db{"class", "X"} = "method,field"
    $db{"class", "X", "field" } = "field1,field2,field3"
    $db{"class", "X", "field", "field1"} = "type,offset"
    $db{"class", "X", "field", "field1", "type" } = "int"

As you can guess, there are a large number of different keys in the second and fourth positions and relatively few in the first and third.)

Some of the values in the db are lists, which I just represent as comma-separated lists of strings. Given the data I create the database from, it is easiest to store the initial value of the list in the database and append succeeding values to the list as I read them from the input.

Problem: I get errors such as

    dbm store returned -1, errno 28, key "class^\InfrastructureManager^\method"
    at db.pl line 42, <> line 810.

I get errno 28 to be No space left on device. This shouldn't have been the case; there were >50M free on the disk. BUT, I guess this is telling:

    % ls -sl class*
      28 -rw-rw-r-- 1 olson     61440 Feb 6 22:25 classinfo.dir
    4840 -rw-rw-r-- 1 olson 491334656 Feb 6 22:25 classinfo.pag

Not much real storage, but lotsa holes.... (Note: this is for the preallocated version (see below). The unpreallocated versions showed similar behavior, but with the apparent size only in the 100M range.)

The first thing I tried was to make each key a constant length, so that instead of $db{"class", "X"} I had $db{"class", "X", '', '', ''}. This didn't seem to change much.

Suspecting that appending to database elements (e.g. $db{"class"} = $db{"class"} . ',' . $elt) was hurting me, I tried preallocating the value via

    $db{"class"} = ' ' x 100;
    $db{"class"} = $firstValue

but that didn't seem to help either.

From what I did it may be obvious that I do not understand the workings of dbm files... Any suggestions?

--bob

PS. Apologies if this gets posted twice; the first one didn't show up on the nntp server...

--
Bob Olson University of Illinois at Urbana/Champaign
Internet: rolson@uiuc.edu UUCP: {uunet|convex|pur-ee}!uiucdcs!olson
UIUC NeXT Campus Consultant NeXT mail: olson@fazer.champaign.il.us
"You can't win a game of chess with an action figure!" AMA #522687 DoD #28
flee@cs.psu.edu (Felix Lee) (02/08/91)
> dbm store returned -1, errno 28, key "class^\InfrastructureManager^\method"
> at db.pl line 42, <> line 810.
>I get errno 28 to be No space left on device. This shouldn't have been

You're running into ndbm limitations. Ndbm returns ENOSPC when your key+data exceeds PBLKSIZ bytes (defined in <ndbm.h>). This may be mentioned in the "BUGS" section of your ndbm(3) man page. (The SunOS man page claims 4096 bytes, but <ndbm.h> says 1024.)

Ndbm also runs into problems when all the data that hashes to the same 32-bit integer overflows a PBLKSIZ block, but I don't think it returns ENOSPC in this case.

--
Felix Lee flee@cs.psu.edu
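
One way to catch the oversized entry before ndbm does is to check the combined length yourself. The following is only a sketch under stated assumptions: the 1024-byte figure is the <ndbm.h> value quoted above and may differ on your system (the usable space may also be somewhat lower, since the block holds bookkeeping data too), the helper name store_checked is hypothetical, and a plain hash stands in for the one opened with dbmopen. It also shows where the ^\ in the error message comes from: a multi-part subscript is joined with $; (default "\034", i.e. control-backslash).

```perl
use strict;
use warnings;

# Assumed limit: PBLKSIZ as quoted from <ndbm.h> above; check your own header.
my $PBLKSIZ = 1024;

my %db;    # stand-in for a hash opened with dbmopen

# Hypothetical helper: store only if key plus value fits within the limit.
# Multi-part keys are joined with $; which is what $db{"a", "b"} does
# internally, so the key built here matches the multi-subscript form.
sub store_checked {
    my ($value, @parts) = @_;
    my $key = join($;, @parts);
    return 0 if length($key) + length($value) > $PBLKSIZ;
    $db{$key} = $value;
    return 1;
}

my $ok_small = store_checked("method,field", "class", "X");
my $ok_big   = store_checked("x" x 2000, "class", "X", "method");
print "small: $ok_small, big: $ok_big\n";
```

A list that keeps growing by appending will eventually trip this check, which matches the symptom in the original post: the store fails only once one comma-separated value has become long enough.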
olson@sax.cs.uiuc.edu (Bob Olson) (02/08/91)
Yep, that seems to be it. Credit also to Randal, who emailed a similar answer.

My solution? Tokenize all strings inserted into the database, using another dbm database for the string->integer conversion and an array for the integer->string conversion. It works quite well, and the resulting databases are a LOT smaller, with a seemingly small performance hit.

The database ends up consisting of entries like

    1^\3^\7^\190 --> 4,2,10,5

--bob
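
The tokenizing scheme described above might look roughly like the sketch below. This is not the poster's actual code: the names tokenize and append_value are hypothetical, plain hashes stand in for both dbm databases, and starting the tokens at 1 is an arbitrary choice.

```perl
use strict;
use warnings;

my %token_of;                # string -> integer (a second dbm database in the post)
my @string_of = (undef);     # integer -> string; slot 0 unused so tokens start at 1
my %db;                      # stand-in for the main dbm database

# Hypothetical helper: return the integer for a string, assigning the
# next free integer the first time the string is seen.
sub tokenize {
    my ($str) = @_;
    unless (exists $token_of{$str}) {
        push @string_of, $str;
        $token_of{$str} = $#string_of;
    }
    return $token_of{$str};
}

# Hypothetical helper: append a tokenized value to the comma-separated
# list stored under the tokenized, $;-joined key.
sub append_value {
    my ($value, @parts) = @_;
    my $key = join($;, map { tokenize($_) } @parts);
    my $tok = tokenize($value);
    $db{$key} = exists $db{$key} ? "$db{$key},$tok" : $tok;
}

append_value("X", "class");
append_value("Y", "class");
append_value("method", "class", "X");

# Decode the list stored under "class" back into strings.
my @classes = map { $string_of[$_] } split /,/, $db{ tokenize("class") };
print "@classes\n";
```

Because each string is replaced by a short integer, both keys and list values shrink, which keeps each key+data pair further from the block limit discussed earlier in the thread.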