gary@dgcad.sv.dg.com (Gary Bridgewater) (12/07/90)
I was intrigued by the dynamic sizing of the .pag file reported to be in the latest DBZ in C-news, so I decided to implement and test it. It has been running for a week now - through three expires - and I am pleased. Not that it is much (if any) faster than the previous version - it seems nearly the same, which is not surprising since B expire is I/O bound. What is very nice is that the history.pag file is now much smaller, i.e.

-rw-r--r--  1 news  news  12506869 Dec  6 16:24 history
-rw-r--r--  1 news  news        75 Dec  4 18:21 history.dir
-rw-r--r--  1 news  news   2099264 Dec  6 16:24 history.pag

(I keep 38 days of history.)

The history.pag file used to be about the same size as history (although it may be sparse, I think the element size on my disk was bigger than any gaps, so it was really that big) and, more to the point, wouldn't fit in memory when using INCORE. This made expire thrash quite badly and caused it to run ~6 hours on an unbusy system. So I had gone back to ~INCORE, which took 4+ hours :-(. Using the newest DBZ and INCORE, my expire time is around 3.5 hours. Also, the newest DBZ has a few performance enhancements that seem to make the system work a bit faster. So, as I said, I am pleased.

It is probably possible to just replace an existing DBZ with the newest one. However, I decided to take advantage of some additional functionality in this DBZ to add the auto-tuning of the history.pag hash value. I don't have diffs, and my line numbers won't match yours anyway, but below are the relevant pieces of expire.c that have been changed for the new DBZ. There are no changes to any other pieces of B news - other than relinking inews and NNTP with the new dbz.o and restarting them after the obligatory expire -R. Again, you probably don't have to do this to just use it as a DBM replacement. And some of this may conflict with patches 18 or 19, which I haven't gotten around to thinking about. If I was into thinking about it - I would just implement C-news expire (probably will).
(and I am experimenting with real C-news on another host so, please, no flames about that).

I have #ifdef'd the DBZ changes using INCORE... Explanations are set off by lines containing "{{{" and "}}}". Actually using incore with INCORE is controllable separately via both a compile-time option and a runtime switch.

//expire.c - patch level 17 + local stuff
...
{{{
around line 39 - near the top - set up the DBZ values I want - yours _probably_ will differ. You must read the dbz documentation (:->) and the comments in dbz.c to figure out what you want if you want the most out of this. Make using DBZ's INCORE switchable via a -DDBZ_INCORE compile definition while still using DBZ. Default is to incore it. Note that my compiler likes ANSI-style function declarations.
}}}
#ifdef INCORE
#define dbz_SIZE	300007L
#define dbz_FIELDSEP	'\t'
#define dbz_CMAP	'='
#define dbz_TAGMASK	0x7f000000
#ifdef DBZ_INCORE
#define dbz_INCORE_VAL	DBZ_INCORE
#else
#define dbz_INCORE_VAL	1
#endif
int dbz_INCORE=dbz_INCORE_VAL;
int dbzincore(int);
int dbzagain(char *, char *);
int dbzfresh(char *, long, int, int, int);
int dbmclose();
#endif
{{{
around line 281 - at the end of switch processing - add a new expire switch "-N" to turn off DBZ INCORE if memory is tight for some reason
}}}
#ifdef INCORE
	case 'N':	/* don't do incore */
		dbz_INCORE = 0;
		break;
#endif
	default:
#ifdef INCORE
		printf("Usage: expire [ -v [level] ] [-e days ] [-i] [-a] [-r] [-h] [-p] [-u] [-f username] [-n newsgroups] [-H] [-N]\n");
#else
		printf("Usage: expire [ -v [level] ] [-e days ] [-i] [-a] [-r] [-h] [-p] [-u] [-f username] [-n newsgroups] [-H]\n");
#endif
{{{
around line 379 - in expire() - let the new dbz routines reopen the database or create it using the magic values defined above. This enables the smart hash sizing. I thought about pushing all this into initdbm() but it didn't seem any easier or cleaner so I just punted it.
}}}
#ifndef INCORE
	(void) close(creat(PAGFILE, 0666));
	(void) close(creat(DIRFILE, 0666));
	initdbm(NARTFILE);
#else
	(void) dbzincore(dbz_INCORE);
	if ( dbzagain( NARTFILE, ARTFILE ) < 0 )
		if ( dbzfresh( NARTFILE, dbz_SIZE, dbz_FIELDSEP, dbz_CMAP, dbz_TAGMASK ) < 0 )
			xerror("Cannot create %s with dbzfresh or dbzagain", NARTFILE);
#endif
{{{
about line 1200 - in rebuilddbm() - do the same thing. Note there are three separate patches shown here for clarity(?). With this patch you don't have to make the initial database using the DBZ utility - just do a "normal" expire -R and you will get it.
}}}
#ifndef INCORE
	(void) sprintf(namebuf, "%s.dir", ARTFILE);
	(void) close(creat(namebuf, 0666));
	(void) sprintf(namebuf, "%s.pag", ARTFILE);
	(void) close(creat(namebuf, 0666));
#endif
	(void) sprintf(namebuf, "%s", ARTFILE);
	fd = fopen(namebuf, "r");
	if (fd == NULL) {
		perror(namebuf);
		xxit(2);
	}
#ifndef INCORE
	initdbm(namebuf);
#else
	(void) dbzincore(dbz_INCORE);
	if ( dbzagain( namebuf, namebuf ) < 0 )
		if ( dbzfresh( namebuf, dbz_SIZE, dbz_FIELDSEP, dbz_CMAP, dbz_TAGMASK ) < 0 )
			xerror("Cannot re-create %s with dbzfresh or dbzagain", namebuf);
#endif
	while (fpos=ftell(fd), fgets(lb, BUFSIZ, fd) != NULL) {
		p = index(lb, '\t');
		if (p)
			*p = 0;
		remember(lb, fpos);
	}
#ifdef INCORE
	(void) dbmclose();
#endif
{{{
And, finally, about line 1326 - in xxit() - at the end of expire - do a dbmclose(), just in case, to save it if it is in core.
}}}
#if defined(DBM) && defined(INCORE)
	(void) dbmclose();
#endif
	rmlock();
	exit(i);
}
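
The rebuild loop in rebuilddbm() above notes the byte offset of each history line before reading it, cuts the line at the first tab to get the key, and hands the key and offset to remember(). For anyone who wants to try that idea in isolation, here is a minimal sketch. It is not the B news or dbz code: the table, remember_offset(), lookup_offset() and index_history() are hypothetical stand-ins that keep a small array in memory, and strchr() is used where the original uses index().

```c
#include <stdio.h>
#include <string.h>

#define MAX_KEYS 16	/* hypothetical limit, just for the sketch */
#define KEY_LEN  128

/* Hypothetical in-memory stand-in for the dbz store: key -> byte offset. */
struct entry {
	char key[KEY_LEN];
	long offset;
};
static struct entry table[MAX_KEYS];
static int nentries = 0;

/* Stand-in for remember(): record where a history line starts. */
static int remember_offset(const char *key, long offset)
{
	if (nentries >= MAX_KEYS)
		return -1;
	strncpy(table[nentries].key, key, KEY_LEN - 1);
	table[nentries].key[KEY_LEN - 1] = '\0';
	table[nentries].offset = offset;
	nentries++;
	return 0;
}

/* Return the recorded offset for key, or -1 if it was never indexed. */
static long lookup_offset(const char *key)
{
	int i;

	for (i = 0; i < nentries; i++)
		if (strcmp(table[i].key, key) == 0)
			return table[i].offset;
	return -1L;
}

/* Index every line of fp; returns the number of lines indexed, or -1. */
static int index_history(FILE *fp)
{
	char lb[BUFSIZ];
	long fpos;
	char *p;

	nentries = 0;
	rewind(fp);
	/* Take the offset first, so it points at the start of the line. */
	while (fpos = ftell(fp), fgets(lb, sizeof lb, fp) != NULL) {
		p = strchr(lb, '\t');	/* key is everything before the first tab */
		if (p)
			*p = '\0';
		if (remember_offset(lb, fpos) < 0)
			return -1;
	}
	return nentries;
}
```

To use it, open a tab-separated history-style file, call index_history() on it, and then fseek() to whatever lookup_offset() returns for a given message ID. As in the original loop, a line with no tab is stored whole, trailing newline included.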