[comp.databases] chest thumping

tony@killer.UUCP (06/04/87)

> I cannot believe that with all those UNIX people out there on the net,
> no one is interested in talking about databases.  I thought that people
> bought more UNIX systems for database applications than for any other
> type of application.  If this is true, where the hell is everybody?
> 
Just a thought.  I would guess that most of the readers of this group are
really only familiar with one database.  If this is true, then any postings
about the merits or demerits of their systems are biased.  The same goes
for postings from employees of the vendors.

I don't see how this group can get beyond the lofty talk that we see.  If
a user has a problem with a certain database that is easy to describe, then
it is probably also easy to fix.  If it is a problem that's a real
head scratcher, then the group would need too much information in order
to give any help.

I know that when I have a problem, I will find a work-around faster than
I could post to the net.


Tony Holden
!ihnp4!killer!tony

garyp@cognos.uucp (Gary Puckering) (06/09/87)

In article <959@killer.UUCP> tony@killer.UUCP writes:
>I know that when I have a problem, I will find a work-around faster than
>I could post to the net.

Great!  How about sharing your knowledge with the rest of us -- could
save me (and many others) loads of time.

Apart from that, identifying problems that are of a generic nature
(such as problems with the degree-3 consistency model in on-line
applications) is of interest to everyone:  vendors (because maybe
their engineers can find a practical solution), researchers (because
maybe new theories, algorithms, etc. are needed), and users (because
they should know what they are getting into beforehand).

It is amazing how many experienced DP professionals have "bought in"
to relational technology because of all the hype, with little real
knowledge or experience with that technology.  Let's make this a forum
where strengths and weaknesses of different database systems can be
discussed openly.  I'm sure we all have something to learn.
-- 
Gary Puckering        3755 Riverside Dr.
Cognos Incorporated   Ottawa, Ontario       decvax!utzoo!dciem!
(613) 738-1440        CANADA  K1G 3N3       nrcaer!cognos!garyp

vassos@utcsri.UUCP (06/12/87)

> [...] identifying problems that are of a generic nature
> (such as problems with the degree-3 consistency model in on-line
> applications) is of interest to everyone [...]

What is the problem with serialisability (which, I believe, is what
degree-3 consistency is) for on-line applications?
-- 
Vassos Hadzilacos
vassos@csri.toronto.edu

garyp@cognos.uucp (Gary Puckering) (06/19/87)

In article <4915@utcsri.UUCP> vassos@utcsri.UUCP writes:
>> [...] identifying problems that are of a generic nature
>> (such as problems with the degree-3 consistency model in on-line
>> applications) is of interest to everyone [...]
>
>What is the problem with serialisability (which, I believe, is what
>degree-3 consistency is) for on-line applications?

I was hoping someone would ask.

Actually, degree-3 consistency is not exactly the same as serializability.
Two transactions are said to be serializable as long as an interleaved
execution of their updates produces the same result as a serial execution.

Degree 3 consistency refers to a level of isolation that is equivalent
to each transaction being the sole user of the database.  In particular,
degree 3 consistency prevents the problem of "phantoms".  To illustrate
a phantom, consider a transaction which executes the following query:

    select all employees with salary > 50,000

If the same query is repeated within that transaction, it is possible
that some other concurrent transaction may have inserted a new employee
with a salary > 50,000 between the time of the first execution of the
query and the second execution.  Thus, the new employee will show up --
a "phantom".  If the two transactions executed serially, this could not
have happened.
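
Sketched in generic SQL (the table, column, and values here are invented
for illustration), the interleaving looks like this:

    -- Session 1, inside a single transaction:
    SELECT NAME FROM EMPLOYEES WHERE SALARY > 50000;   -- say, 3 rows

    -- Session 2, running concurrently:
    INSERT INTO EMPLOYEES (NAME, SALARY) VALUES ('JONES', 55000);
    COMMIT;

    -- Session 1, later in the same transaction:
    SELECT NAME FROM EMPLOYEES WHERE SALARY > 50000;   -- 4 rows now:
                                                       -- JONES is a phantom

Under degree 3 consistency both SELECTs in session 1 must see the same
set of tuples, so session 2's insert would have to wait until session 1
commits.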


Now, as to problems with on-line applications.  Theoretically, there
is nothing wrong that I can think of with degree 3 consistency.
Practically speaking, though, most implementations of relational
systems that I've seen (Rdb/VMS, DG/SQL, HP Allbase, and what I've
read or heard about Oracle, Ingres, DB2 and SQL/DS) have one or more
problems associated with implementation of degree 3 consistency that
cause significant reductions in concurrency.

A typical example:  in Rdb/VMS if several transactions are attempting
to insert tuples into a relation with a unique index, it is likely
that one or more may become deadlocked or go into a long wait state --
even if there is no collision on unique key values!  Why?  Well, to
prevent phantoms, Rdb/VMS must lock indexes for inserts.
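
A rough sketch of the situation (table and key values invented; the
locking detail is the general pattern, not a claim about Rdb/VMS
internals):

    -- PARTS has a unique index on PART_NO.

    -- Session 1 (not yet committed):
    INSERT INTO PARTS (PART_NO, DESCR) VALUES (1001, 'WIDGET');

    -- Session 2, a different key, so no uniqueness conflict:
    INSERT INTO PARTS (PART_NO, DESCR) VALUES (2002, 'GADGET');
    -- ...may block or deadlock anyway, because the index pages (or key
    -- ranges) touched by session 1's insert stay locked until session 1
    -- commits, and session 2's key may fall in the same locked region.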

Another example:  suppose you want to develop a screen that lets the
user browse through the database and, if he/she so desires, update
any tuple that might be seen.  Ok.  This suggests the use of a
read_write transaction.  But, use of a read_write transaction causes
each tuple (probably page, depending on the lock granularity of your
system) to be S locked.  That's so someone else doesn't come along and
change it on you (remember degree 3 consistency).  So what if the user
goes to lunch before committing the transaction?  All those tuples/pages
that he's read are S locked, preventing anyone from updating them.

Now let's suppose the user comes back from lunch.  He continues browsing
and comes across a record he wants to change.  He changes it.  This
tuple/page now gets an X lock.  In most systems, that means other
read_write transactions will have to wait until the lock is released
before they can read that tuple/page.  More delays for concurrent users.
Well, let's say this user is aware that he shouldn't leave uncommitted
updates around too long, so he commits the transaction.  Ok.  That
frees all the locks.  But, it also releases his cursor.  So, if he was
only halfway through browsing along the particular predicate path, he
now has to restate his predicate to eliminate the tuples he's already seen.
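
In outline, using invented names and embedded-SQL-style cursor syntax
(the exact syntax varies from system to system):

    -- One long read_write transaction for the whole browse session.
    DECLARE BROWSE CURSOR FOR
        SELECT EMP_NO, NAME, SALARY FROM EMPLOYEES WHERE SALARY > 50000;
    OPEN BROWSE;
    FETCH BROWSE;     -- each fetch S locks the tuple (or its page)
    -- ...user browses; locks accumulate and are held, lunch or no lunch...
    UPDATE EMPLOYEES SET SALARY = 60000
        WHERE CURRENT OF BROWSE;   -- the S lock is upgraded to an X lock
    COMMIT;                        -- frees every lock, but closes the cursor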

This approach doesn't look too good -- although it's the most obvious
one.  Let's go back to the beginning.  Why not use a read_only
transaction for the browse?  That means no S locks are held.  The
system returns a version of each tuple that was current as at the
start of the transaction.  Ok, so how do we update?  Well, we could
start a read_write transaction each time the user requests an update.
Fetch the tuple that the user wants to update (you'll need a unique
key value or a database key) and check to make sure it isn't different
from the one the user fetched on his read_only transaction (a distinct
possibility) and update it.  Commit the read_write transaction.  Now
the user can continue on the browse transaction (it still hasn't been
committed, so the cursor is still valid).  This approach works but
puts the onus on the application to achieve concurrency.  It also
increases the risk that the user's transaction will fail because some
concurrent transaction has changed data out from under him.
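
Roughly, again with invented names:

    -- Browse in a read_only transaction:  no S locks are held, and reads
    -- see the database as of the start of the transaction.
    SELECT EMP_NO, NAME, SALARY FROM EMPLOYEES WHERE SALARY > 50000;

    -- When the user asks to change employee 1234, start a separate,
    -- short read_write transaction:
    SELECT NAME, SALARY FROM EMPLOYEES WHERE EMP_NO = 1234;
    -- compare with what the user was shown; if it differs, someone else
    -- changed it since the snapshot, so warn the user instead of updating
    UPDATE EMPLOYEES SET SALARY = 60000 WHERE EMP_NO = 1234;
    COMMIT;   -- write locks are held only for this short transaction

    -- The read_only browse transaction was never committed, so its
    -- cursor position is still valid.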

Yet another approach is to lower the isolation level of the
browse/update transaction to degree 2.  Transactions of this sort have
serializable updates, but may see phantoms.  In this application, I
don't consider phantoms to be a problem.  The advantage to degree 2 is
that you don't place S locks on tuples that are read and cursor
stability can be maintained after a commit.  The catch:  most
relational systems don't allow degree 2 consistency.  They enforce
degree 3.  Those that do allow degree 2 still make degree 3 the default, and most
programmers don't know any better.  The result:  poor concurrency.
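
Where a system does expose the choice, the intent is something like the
following (the SET TRANSACTION line is purely hypothetical syntax; any
system that offers degree 2 will spell it its own way):

    SET TRANSACTION READ WRITE ISOLATION DEGREE 2;   -- hypothetical syntax

    SELECT EMP_NO, NAME, SALARY FROM EMPLOYEES WHERE SALARY > 50000;
    -- reads take no long-held S locks; repeating the query may see phantoms

    UPDATE EMPLOYEES SET SALARY = 60000 WHERE EMP_NO = 1234;
    -- the update still takes an X lock held until commit, so updates
    -- remain serializable

    COMMIT;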

In my humble opinion, a large part of the perceived "poor" performance
of relational systems may be related to the problems I've just
described.

-- 
Gary Puckering        3755 Riverside Dr.
Cognos Incorporated   Ottawa, Ontario       decvax!utzoo!dciem!
(613) 738-1440        CANADA  K1G 3N3       nrcaer!cognos!garyp

mjr@well.UUCP (Matthew Rapaport) (06/22/87)

In article <4915@utcsri.UUCP> vassos@utcsri.UUCP writes:
>What is the problem with serialisability (which, I believe, is what
>degree-3 consistency is) for on-line applications?

There is an excellent book on this very subject:
    The Theory of Database Concurrency Control, by Christos Papadimitriou,
    Computer Science Press, Rockville, Md., 1986.
Check it out...
mjr@well

taso@munnari.oz (Taso Hatzi) (06/26/87)

In article <939@cognos.UUCP> garyp@cognos.UUCP (Gary Puckering) writes:
>
>A typical example:  in Rdb/VMS if several transactions are attempting
>to insert tuples into a relation with a unique index, it is likely
>that one or more may become deadlocked or go into a long wait state --
>even if there is no collision on unique key values!  Why?  Well, to
>prevent phantoms, Rdb/VMS must lock indexes for inserts.

This is indeed a very real problem with Rdb.  Can anyone comment on how
Oracle and Ingres measure up in this respect?

>read_write transaction.  But, use of a read_write transaction causes
>each tuple (probably page, depending on the lock granularity of your
>system) to be S locked.  That's so someone else doesn't come along and

Does anyone know for sure whether it's the tuple or the page that gets
locked?

> .... Why not use a read_only
>transaction for the browse?  That means no S locks are held.  The
>system returns a version of each tuple that was current as at the
>start of the transaction.  Ok, so how do we update?  Well, we could
>start a read_write transaction each time the user requests an update.

With Rdb, to have more than one transaction running at the same time, 
one has to code at the DSRI level because the RDO query language 
doesn't permit nested transactions.  Do Oracle or Ingres permit
nested transactions?

>
>In my humble opinion, a large part of the perceived "poor" performance
>of relational systems may be related to the problems I've just
>described.
>

I agree with this statement -- all database systems should come with tools
that let one see who locked what data and when.

On the subject of degree 2 & 3 consistency with respect to Rdb, it is
stated that the DSRI architecture supports degree 2 and 3 consistency. Does
anyone know whether the Rdb implementations, either Rdb/VMS or Rdb/Eln,
support degree 2 consistency?