[news.software.b] expire oink

news@m2xenix.psg.com (Randy Bush) (08/20/90)

C News has been running quite happily here for some months.  Unfortunately,
with a 21 day window (which still does not get all the trash), expire is quite
the hog, 11MB and four hours (while, for comparison, a full pathalias build
takes 30 minutes).

The dbz option is taken at build time, and expire.c does seem to include dbz.
As there seem to be Xenix compatibility problems with the fast stdio lib, that
option is not being used.

What am I doing wrongly this time?

Configuration: Xenix/386 2.3.2, 4MB
-- 
..!{uunet,qiclab,intelhf}!m2xenix!news

henry@zoo.toronto.edu (Henry Spencer) (08/21/90)

In article <1990Aug19.202420.140@m2xenix.psg.com> news@m2xenix.psg.com (Randy Bush) writes:
>C News has been running quite happily here for some months.  Unfortunately,
>with a 21 day window (which still does not get all the trash), expire is quite
>the hog, 11MB and four hours (while, for comparison, a full pathalias build
>takes 30 minutes).

That's awfully slow for a dbz expire.  Ours typically takes 30 minutes to
do 28 days' worth (and I am unhappy with this and intend to improve it).
I have no very specific suggestions, though; insufficient data.
-- 
Committees do harm merely by existing. | Henry Spencer at U of Toronto Zoology
                       -Freeman Dyson  |  henry@zoo.toronto.edu   utzoo!henry

drew@lethe.uucp (Drew Sullivan) (08/22/90)

In article <1990Aug21.154310.20719@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>In article <1990Aug19.202420.140@m2xenix.psg.com> news@m2xenix.psg.com (Randy Bush) writes:
>>C News has been running quite happily here for some months.  Unfortunately,
>>with a 21 day window (which still does not get all the trash), expire is quite
>>the hog, 11MB and four hours (while, for comparison, a full pathalias build
>>takes 30 minutes).
>
>That's awfully slow for a dbz expire.  Ours typically takes 30 minutes to
>do 28 days' worth (and I am unhappy with this and intend to improve it).
>I have no very specific suggestions, though; insufficient data.

My 386 running Xenix takes about 10 minutes to do a 22-day expire.
This seems reasonable to me.
-- 
  -- Drew Sullivan, <drew@lethe.uucp>

jerry@olivey.olivetti.com (Jerry Aguirre) (08/30/90)

In article <1990Aug22.165739.6918@lethe.uucp> drew@lethe.uucp (Drew Sullivan) writes:
>In article <1990Aug21.154310.20719@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>>In article <1990Aug19.202420.140@m2xenix.psg.com> news@m2xenix.psg.com (Randy Bush) writes:
>>>the hog, 11MB and four hours (while, for comparison, a full pathalias build
>>That's awfully slow for a dbz expire.  Ours typically takes 30 minutes to
>My 386 running Xenix takes about 10 minutes to do a 22-day expire.

I had it down to 8 minutes for 28 days of news and 60 days of history.
All these comparisons are ignoring one vital factor, the amount of
memory in the system.  Expire runs very fast with the dbz INCORE option
but it uses many megs of memory.  The original poster claimed it was
using 11 meg; he later stated that this was on a system with only 4 meg
of physical memory.  I was running with 16 Meg and expire was using most
of that.

I would assume that dbz has a fairly random access pattern to the incore
memory.  That is going to be a swap nightmare.  Given that dbz has
several ways to configure the incore option it can be difficult to
tell whether it is in use.  The original poster didn't think he was using
incore but that 11 meg size makes me think he is.  (I know that in Bnews
I have often wished for a little program that would compile and when run
would print out what options it, and therefore news, was compiled with.
Wading thru that mess of ifdef, undef, else define can be a real pain
sometimes.)

henry@zoo.toronto.edu (Henry Spencer) (08/30/90)

In article <49312@olivea.atc.olivetti.com> jerry@olivey.olivetti.com (Jerry Aguirre) writes:
>... Expire runs very fast with the dbz INCORE option
>but it uses many megs of memory.  The original poster claimed it was
>using 11 meg...

11MB is just plain wrong unless you've got a colossal history file; the
size of the in-core table for utzoo (28 days of history) is about 800KB,
and other memory demands ought to be relatively minor.  If expire is
growing to multiple megabytes, there is a serious bug somewhere, either
in expire or in your system's memory allocator.  Given that expire used
to run quite happily on a 16-bit machine here, I suspect the latter.

>I would assume that dbz has a fairly random access to the incore
>memory.  That is going to be a swap nightmare...

It's quite random, but for 800KB that should not be a disaster on a
system with a reasonable amount of memory.  If you're running dbz version
3 (the one distributed with C News), you can find out the exact size of
the table:  cat history.dir (it's a text file), take the second number
on the first line, multiply by sizeof(long) (normally 4).
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday|  henry@zoo.toronto.edu   utzoo!henry

jerry@olivey.olivetti.com (Jerry Aguirre) (08/31/90)

In article <1990Aug30.162009.2017@zoo.toronto.edu> henry@zoo.toronto.edu (Henry Spencer) writes:
>It's quite random, but for 800KB that should not be a disaster on a
>system with a reasonable amount of memory.  If you're running dbz version
>3 (the one distributed with C News), you can find out the exact size of
>the table:  cat history.dir (it's a text file), take the second number
>on the first line, multiply by sizeof(long) (normally 4).

My history.dir file looks like:

	dbz 3 999983 9 = 128 127 24 4 0 1 2 3
	265832 0 0 0 0 0 0 0 0 0 0

From which I calculate a 4 Meg size using Henry's formula.  That is for
a history file with 269616 lines.  I confess to raising the initial size
define based on recommendations for a large history file.
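
The arithmetic above can be sketched in a few lines of Python.  This is
only an illustration and not part of dbz or C News: the function name
dbz_table_bytes is made up here, and sizeof(long) is assumed to be 4, as
Henry suggests is normal.

```python
def dbz_table_bytes(dir_text, sizeof_long=4):
    """Estimate the in-core table size from the text of a dbz 3 .dir file.

    Follows the rule of thumb quoted above: take the second number on the
    first line and multiply by sizeof(long).  sizeof_long is an assumption
    about the machine that built the news software.
    """
    first_line = dir_text.splitlines()[0]
    # Keep only the purely numeric fields; "dbz" and "=" are skipped.
    numbers = [int(tok) for tok in first_line.split() if tok.isdigit()]
    if len(numbers) < 2:
        raise ValueError("expected at least two numbers on the first line")
    table_slots = numbers[1]  # the first number is the dbz version (3)
    return table_slots * sizeof_long


# The two lines posted above, used as sample input.
sample = (
    "dbz 3 999983 9 = 128 127 24 4 0 1 2 3\n"
    "265832 0 0 0 0 0 0 0 0 0 0\n"
)
print(dbz_table_bytes(sample))  # prints 3999932, i.e. just under 4 million bytes
```

On a machine where long is not 4 bytes, pass the real size as sizeof_long.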

I seem to remember ps reporting expire as using 11 Meg on a 386 system
though.  (Not currently available for me to verify this.)  I wasn't too
concerned because I had 16 meg of physical memory.

Perhaps there is a sizeof or malloc bug involved.

henry@zoo.toronto.edu (Henry Spencer) (09/01/90)

In article <49326@olivea.atc.olivetti.com> jerry@olivey.olivetti.com (Jerry Aguirre) writes:
>The dbz file looks like:
>
>	dbz 3 999983 9 = 128 127 24 4 0 1 2 3
>	265832 0 0 0 0 0 0 0 0 0 0
>
>From which I calculate a 4 Meg size using Henry's formula.  That is for
>a history file with 269616 lines...

"Houston, we've got a problem."  Something is very wrong here.  For one
thing, all those zeros in the second line are a bad sign:  there is some
problem with expire, by the looks of it.  The second line is basically a
history of recent entry counts, with the first number being the current count.
Those zeros say that the database is being rebuilt from scratch rather
than using the old one as a basis, which is what C expire now tries to do.

For another thing, your table is way too big.  For 269k entries, a table
of about 350k slots is lots.  Dbz 3 handles overflows gracefully; there
is no need to grossly over-size the table to avoid risk of overflow.
If you can find out why expire isn't using the old database as a basis
for the new one, this problem will solve itself, since expire will re-size
the table based on the actual usage.  (That's what the usage history is
for...)  The re-sizing calculation is based on piles of graphs from
simulations of performance tradeoffs, by the way, not just guesswork.
Whole trees died for dbz 3...
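
To put rough numbers on "way too big", here is a small Python sketch
comparing how full the posted table is with how full a 350k-slot table
would be.  The name fill_ratio is invented for this illustration; it is
plain division and does not reproduce dbz's actual re-sizing calculation,
which is not spelled out in this thread.

```python
def fill_ratio(entries, slots):
    """Fraction of table slots occupied, given entry and slot counts."""
    if slots <= 0:
        raise ValueError("slots must be positive")
    return entries / slots


# Posted .dir file: 999983 slots, current entry count 265832.
posted = fill_ratio(265832, 999983)
# The figures mentioned above: about 269k entries in about 350k slots.
suggested = fill_ratio(269000, 350000)

print(f"posted table:    {posted:.2f}")     # prints 0.27
print(f"350k-slot table: {suggested:.2f}")  # prints 0.77
```

By this measure the posted table is only about a quarter full, while the
350k-slot table would be roughly three-quarters full.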

In any case, to get, say, a 10MB table, which is about what you'd need
to push expire up to 11MB in the absence of bugs, you would need to be
keeping about 18 months of history.  I doubt that anyone is doing that!
-- 
TCP/IP: handling tomorrow's loads today| Henry Spencer at U of Toronto Zoology
OSI: handling yesterday's loads someday|  henry@zoo.toronto.edu   utzoo!henry