[net.database] Speed of Un*x DBMS's

rmarti@sun.uucp (Bob Marti) (07/03/86)

In reply to yesterday's posting concerning the DeWitt benchmarks for DBMSs,
I got the following mail from Prof. DeWitt himself:

----- Begin Forwarded Message -----

From uwvax!kraft-slices.wisc.edu!dewitt@seismo.CSS.GOV

Actually, the SIGMOD 84 paper describes the multiuser benchmarks;
there is a 1983 VLDB paper by Bitton, DeWitt, and Turbyfill
that describes the single-user benchmarks.  These are the
ones that have become the de facto standard.  Tapes of these
benchmarks are available from me for a nominal charge, or
from anyone else who will make you a tape for free.  There
is no copyright on them.

I don't read newsgroups, so you might forward this on.

dewitt

------ End Forwarded Message ------

The full reference for this earlier paper is:

Bitton, D., DeWitt, D.J., Turbyfill, C.:  Benchmarking Database Systems -- A
Systematic Approach.  In Proc. 9th Int. Conf. on Very Large Data Bases, 1983,
pp. 8-19.
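
For readers who would rather experiment than wait for a tape: the
single-user benchmarks run against synthetic relations whose attributes
are simple functions of a unique key -- as I recall, the paper calls
them unique1, unique2, two, ten, hundred, thousand, and so on -- so
that a query can select an exact fraction of a relation.  A minimal C
sketch of a generator for such a relation follows; the attribute list
is abridged and the details are from memory, so check the paper before
trusting it:

/* gen.c -- sketch of a generator for a Wisconsin-style test relation.
 * Attribute names and ranges are approximations of those in the
 * Bitton/DeWitt/Turbyfill paper; consult the paper for the real ones.
 */
#include <stdio.h>
#include <stdlib.h>

#define NTUPLES 10000

int main(void)
{
    static long perm[NTUPLES];
    long i, j, tmp;

    /* unique1 is a random permutation of 0..NTUPLES-1, built with a
       Fisher-Yates shuffle; a fixed seed keeps runs repeatable. */
    for (i = 0; i < NTUPLES; i++)
        perm[i] = i;
    srand(1);
    for (i = NTUPLES - 1; i > 0; i--) {
        j = rand() % (i + 1);
        tmp = perm[i]; perm[i] = perm[j]; perm[j] = tmp;
    }

    /* The other attributes are moduli of the key, so a query can
       select an exact fraction of the relation -- e.g. the predicate
       "hundred = 0" retrieves exactly 1% of the tuples. */
    for (i = 0; i < NTUPLES; i++)
        printf("%ld|%ld|%ld|%ld|%ld|%ld\n",
               perm[i],         /* unique1: random, unique     */
               i,               /* unique2: sequential, unique */
               perm[i] % 2,     /* two      */
               perm[i] % 10,    /* ten      */
               perm[i] % 100,   /* hundred  */
               perm[i] % 1000); /* thousand */
    return 0;
}

Piped into each system's bulk loader, this gives identical data
everywhere, which is half of what makes the suite comparable across
systems.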

By the way: the newest issue of Computerworld contains an article, "RTI claims
Oracle DBMS tests were biased," following an earlier article from a couple of
weeks ago which reported Oracle's claim that its new Version 5 outperformed
INGRES Version 4.

Bob Marti, Sun Microsystems, Inc.
 
UUCP:	{cbosgd, decvax, decwrl, ihnp4, hplabs, seismo, ucbvax}!sun!rmarti
ARPA:	rmarti@sun.com

bradbury@oracle.UUCP (Robert Bradbury) (07/21/86)

In article <4664@sun.uucp>, rmarti@sun.uucp (Bob Marti) writes:
> 
> There is a set of benchmarks for DBMSs developed at U of Wisconsin by a guy
> named DeWitt.  The DeWitt benchmarks seem to be a de facto standard for
> relational DBMSs.  I seem to recall that a year or two ago, INGRES performed
> substantially faster than Oracle according to the DeWitt benchmark, although
> Oracle claimed that the benchmarks were unfair, since DeWitt -- then a
> visiting professor at UC Berkeley -- supposedly used a souped-up version of
> INGRES which was not available on the market at the time.

The true story is much worse than that.  The numbers quoted for Oracle in
the DeWitt paper were for a pre-release of Oracle Version 3.1 on a Berkeley
4.1c release of UNIX.  Since Oracle does not normally run on Berkeley
UNIX (owing to the lack of shared memory), a large electronics manufacturer
coerced us into giving them the sources so they could port Oracle to that
environment, ostensibly for the purpose of running some of their own
benchmarks.  The next thing we knew, customers were calling us up quoting
the DeWitt paper.

Because the paper compared Oracle (pre-release 3.1) on 4.1c (an operating
system which could not properly support it) with a well-tuned University
Ingres on 4.1 and with Commercial Ingres (2.0) on VMS [all on different
hardware], we consider it to be a good example of how NOT to do a benchmark.

This is not to say that the benchmark itself is not a reasonable (though
arbitrary) test of RDBMS performance.  One problem with it is that it
doesn't measure concurrency performance or the impact of the RDBMS on other
system applications.
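
For a crude concurrency number of your own, the usual trick is a driver
which forks N client processes, each looping on the same query, and
then divides the queries completed by the elapsed wall-clock time.  A
sketch follows; db_connect() and db_exec() are hypothetical stand-ins
for whatever call interface your system actually provides, and the
relation name is illustrative:

/* driver.c -- crude multiuser throughput driver (a sketch only).
 * db_connect() and db_exec() are hypothetical; replace the stubs
 * with your system's actual call interface.
 */
#include <stdio.h>
#include <sys/wait.h>
#include <time.h>
#include <unistd.h>

#define NCLIENTS 8    /* concurrent client processes   */
#define NQUERIES 100  /* queries issued by each client */

static void *db_connect(const char *db) { (void)db; return (void *)1; }
static int db_exec(void *conn, const char *q)
    { (void)conn; (void)q; return 0; }

int main(void)
{
    time_t start = time(NULL);
    int i, n;

    for (i = 0; i < NCLIENTS; i++) {
        if (fork() == 0) {              /* child: one simulated user */
            void *conn = db_connect("bench");
            for (n = 0; n < NQUERIES; n++)
                db_exec(conn, "select * from tenktup1 where hundred = 0");
            _exit(0);
        }
    }
    while (wait(NULL) > 0)              /* parent: await all clients */
        ;
    printf("%d queries in %ld seconds\n",
           NCLIENTS * NQUERIES, (long)(time(NULL) - start));
    return 0;
}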

In the last test we ran comparing Oracle on VMS to Ingres on VMS, we were
beating Ingres in something like 13 out of 16 tests in the DeWitt benchmark.
I would hesitate to post numbers to the net, since I'm not sure how the
other RDBMS manufacturers would respond.

Our general sense here is that Oracle performs much better than Ingres on
micros (3B2), somewhat better on minis (3B20, VAX), and somewhat poorer on
mainframes (Amdahl/UTS).  We tend to outperform Informix and Unify across the
entire range for medium to complex queries.  The performance of both Ingres
and Oracle is tending to approach its limits, as increased effort leads to
diminishing returns (or good returns only in very narrow areas).

I have a proposed benchmark specification (based on DeWitt) from a company
in Palo Alto, CA (International Data Corporation Technology Laboratories)
which attempts to address the concurrency and system-impact issues.
In theory they intend to run it on all of the RDBMSs.  The only problem
is that it requires the dedication of several machines for several
weeks to run, and that is a lot of time (money) to spend on tests which
are likely to be obsolete in 6 months given the evolution of these systems.

There is also the problem that all of the RDBMSs have different interface
languages, so you have to write a separate benchmark for each system, which
means you are to a degree comparing apples with oranges.  Until everyone
conforms to the ANSI X3H2 SQL standard, it will be very difficult to
realistically compare one RDBMS with another.
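
To make the apples-and-oranges point concrete, here is the same 1%
selection written once for an SQL system and once for a QUEL system.
The relation and attribute names follow the DeWitt conventions; exact
dialect details vary by release, so treat both texts as illustrative.

/* queries.c -- one logical query, one text per query language, which
 * is why a benchmark currently has to be rewritten for every system
 * under test.  Dialects vary by release; illustrative only.
 */
#include <stdio.h>

struct bench_query {
    const char *language;
    const char *text;
};

static const struct bench_query one_pct_selection[] = {
    { "SQL",  "select * from tenktup1 where hundred = 0" },
    { "QUEL", "range of t is tenktup1\n"
              "retrieve (t.all) where t.hundred = 0" },
};

int main(void)
{
    int i;

    for (i = 0; i < 2; i++)
        printf("--- %s version ---\n%s\n\n",
               one_pct_selection[i].language,
               one_pct_selection[i].text);
    return 0;
}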

-- 
Robert Bradbury
Oracle Corporation
(206) 364-1442                            {ihnp4!muuxl,hplabs}!oracle!bradbury

kempf@hplabsc.UUCP (Jim Kempf) (07/27/86)

In article <457@oracle.UUCP>, bradbury@oracle.UUCP (Robert Bradbury) writes:
> I have a proposed benchmark specification (based on DeWitt) from a company
> in Palo Alto, CA (International Data Corporation Technology Laboratories)
> which attempts to address the concurrency and system-impact issues.
> In theory they intend to run it on all of the RDBMSs.  The only problem
> is that it requires the dedication of several machines for several
> weeks to run, and that is a lot of time (money) to spend on tests which
> are likely to be obsolete in 6 months given the evolution of these systems.
> 
> There is also the problem that all of the RDBMSs have different interface
> languages, so you have to write a separate benchmark for each system, which
> means you are to a degree comparing apples with oranges.  Until everyone
> conforms to the ANSI X3H2 SQL standard, it will be very difficult to
> realistically compare one RDBMS with another.

While I agree that the query language could have an impact on the
outcome of a benchmark, I think its contribution would be
minor compared to other factors.  If the RDBMS supports a programmatic
interface (which I feel any serious contender on Un*x should), then
the query language could, to some degree, be factored out,
provided the programmatic interface is not simply an embedding
of the query language.
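
Concretely, what I have in mind is a call-level interface where the
benchmark assembles its request from procedure calls rather than
handing the system query-language text to parse.  Every name in the
sketch below is hypothetical -- no vendor I know of ships exactly this
interface, which is of course the problem -- but the same source could
in principle be compiled against each vendor's library:

/* calls.c -- a hypothetical call-level interface.  The request is
 * built from calls, not from SQL or QUEL text, so the query language
 * is factored out of the benchmark.  All names here are invented;
 * the stubs merely keep the sketch compilable.
 */
#include <stdio.h>

typedef struct db_cursor db_cursor;

/* A vendor library would supply these. */
static db_cursor *db_open(const char *rel) { (void)rel; return NULL; }
static void db_filter_eq(db_cursor *c, const char *attr, long val)
    { (void)c; (void)attr; (void)val; }
static int db_next(db_cursor *c) { (void)c; return 0; }
static void db_close(db_cursor *c) { (void)c; }

int main(void)
{
    db_cursor *c;
    long count = 0;

    /* Equivalent of "select * from tenktup1 where hundred = 0",
       expressed as calls instead of query-language text. */
    c = db_open("tenktup1");
    db_filter_eq(c, "hundred", 0);
    while (db_next(c))
        count++;
    db_close(c);

    printf("%ld tuples retrieved\n", count);
    return 0;
}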

I also disagree that RDBMSs are evolving so fast that any benchmarks
will be obsolete in 6 months.  The underlying RDBMS technology is now
fairly mature -- so mature that various vendors feel comfortable offering
products to the public.  An important (and in some applications *critical*)
factor in a sound technical decision is RDBMS performance.  Without solid
data about how a particular RDBMS performs, how can a sound decision to
buy be made?  A customer who is interested in buying now doesn't want to
be told that (s)he should wait six months, and presumably the company is
also interested in making the sale.

I'm glad to see that a standard is beginning to take shape.  Perhaps
the best solution would be to have some third party (like BYTE or
UNIX WORLD) run the benchmarks and publish the results.  This is
typically how benchmarking has been done for C and Pascal
compilers.


		Jim Kempf	hplabs!kempf


<<<<<**** usual disclaimer ****>>>>>

pavlov@hscfvax.UUCP (840033@G.Pavlov) (08/09/86)

In article <502@hplabsc.UUCP>, kempf@hplabsc.UUCP (Jim Kempf) writes:
> I also disagree that RDBMSs are evolving so fast that any benchmarks
> will be obsolete in 6 months.  The underlying RDBMS technology is now
> fairly mature ...
> A customer who is interested in buying now doesn't want to be told
> that (s)he should wait six months, and presumably the company is
> also interested in making the sale.

  True, a customer must buy if the application is waiting.  But it is also
  true that any set of benchmarks more than 3-4 months old is likely to be
  obsolete.  In the last 6 months, in fact, Oracle issued one new release
  and Ingres issued two.  Each of these releases increased "overall
  performance" by 25-40% or so (depending on who you talk to and which core
  functions you look at) - and both vendors promise more of the same in the
  near future.
  When looking at performance, one aspect that might be worth investigating
  is how each competing vendor achieves that performance (the "underlying
  technology").  This may give a clue to what the vendor may be able to
  achieve in the future regarding further performance enhancements.  While
  the performance of some RDBMSs is now decent, some areas still leave much
  to be desired - for instance, obtaining grouped aggregates (especially
  QUEL's "a by b") is still quite painful, time-wise, in most systems.
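
  A rough picture of why: lacking an index on the grouping attribute,
  most current systems seem to evaluate a grouped aggregate by sorting
  the whole relation on b and then summing each run, so the cost is
  dominated by the sort.  A toy in-memory version of that strategy
  (a sketch, not any vendor's actual code) shows the shape of the work:

/* group.c -- toy sort-then-scan evaluation of QUEL's "sum(a by b)"
 * (in SQL: select b, sum(a) from t group by b).  Real systems do the
 * same with an external sort, which is where the time goes.
 */
#include <stdio.h>
#include <stdlib.h>

struct tuple { long a, b; };

static int by_b(const void *x, const void *y)
{
    long bx = ((const struct tuple *)x)->b;
    long byy = ((const struct tuple *)y)->b;
    return (bx > byy) - (bx < byy);
}

int main(void)
{
    struct tuple t[] = { {5,1}, {7,2}, {3,1}, {1,2}, {4,3} };
    int n = sizeof t / sizeof t[0];
    int i;

    qsort(t, n, sizeof t[0], by_b);     /* the expensive step */

    /* One pass over the sorted tuples, emitting a sum per run of b. */
    for (i = 0; i < n; ) {
        long b = t[i].b, sum = 0;
        for (; i < n && t[i].b == b; i++)
            sum += t[i].a;
        printf("b = %ld  sum(a) = %ld\n", b, sum);
    }
    return 0;
}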

    greg pavlov, fstrf, amherst, ny.