[comp.databases] BIG, BIG fun w/Informix cursors

demasi@paisano.UUCP (Michael C. De Masi) (07/17/87)

Hello again people,

If you remember my original posting, I was the guy who was having
so much trouble with Informix cursors.  The two problems I was having
were:  1)  There seemed to be no way to back up
a cursor in an active list after a fetch.  2)  Non-
readonly activity seemed to break every cursor declared in the program
at any given time.  I actually got quite a few kind responses from the
net on the first point, but general confusion on the second.  For the
answer to the second point, I had to consult with tech support at 
Informix.  In summary:

1)  No, there is no real way to back up a cursor in an active list.
The command 'fetch' only works in one direction, and there is no
equivalent of an 'unfetch'.  I did, however, receive several strategies
for getting around this problem, mostly involving creating temporary
tables and/or files with the data or rowids from the given database
tables.  This is pretty much what I was already doing, and was looking
to avoid.  One fellow even suggested declaring two cursors, one going
in ascending and the other in descending order, and querying and
redeclaring them as they approached and overlapped each other.
Also, from what I could glean, these problems are solved in 4GL
with what they call a 'scroll cursor', with which one can issue
commands like 'scroll up', 'scroll back', etc.
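
For what it's worth, the temp table strategy most people suggested
looks roughly like this (a sketch only -- the table and column names
are invented, and I'm writing the embedded SQL from memory, so check
it against your ESQL/C manual before believing me):

        /* Save the rowids of the qualifying rows, in order.  To
         * "back up", just re-select the rowid saved for any
         * position already visited.                             */
        EXEC SQL create temp table positions (pos serial, rid integer);
        EXEC SQL insert into positions (rid)
                select rowid from customer order by lname;

        /* Re-read row number :n, moving forward OR backward: */
        EXEC SQL select c.lname into :lname
                from customer c, positions p
                where p.pos = :n and c.rowid = p.rid;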

2)  As to the second point, my statement of the problem caused a
great deal of confusion on the net.  Put as simply as possible:
non-readonly activity (i.e. updates, deletes and insertions) made
any currently declared & opened cursor
in the program inoperative.  The sqlca.sqlcode returned from the next
fetch was -259 'cursor not open'.  Nobody on the net could place the
problem, so again I consulted with Informix.  The problem was not with
the update activity per se, but with the structure placed around it.
To insure data integrity, I had put all update statements within the
context of a 'begin work ... commit work else rollback work' structure.

IMPORTANT:  Both 'commit work' and 'rollback work' have the side effect
of closing every active cursor in the whole damn program!
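
Boiled down, the pattern that bit me looks like this (names invented,
embedded SQL quoted from memory):

        EXEC SQL declare c cursor for select code from items;
        EXEC SQL open c;
        EXEC SQL fetch c into :code;     /* fine, sqlca.sqlcode == 0  */

        EXEC SQL begin work;
        EXEC SQL update items set qty = qty - 1 where code = :code;
        EXEC SQL commit work;            /* side effect: closes c!    */

        EXEC SQL fetch c into :code;     /* sqlca.sqlcode == -259     */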

(FLAME ON!!!)
Now this is not limited to cursors acted upon within the structure, or
indeed to cursors declared for update, but all of them.  The nice person
I spoke to at Informix assured me that this was indeed a documented
feature of both 'commit work' and 'rollback work'.  Documented or not,
I find this a rather bizarre concept.  I thought the purpose of this
structure was to insure transaction integrity, not to do cleanup work
of record locks, etc.  In my humble opinion, this is the kind of thing
best left to the programmer or to the database administrator, and it
should not be hidden within the context of a seemingly unrelated
functionality.
(FLAME OFF!!!)  Lord, do I feel better.

Anyway, to be fair to Informix, with the exception of the above probs
(which I'll admit I probably would have avoided if I'd read the doc a
bit more carefully) ESQL/C is a good tool, and has served me well.

Till next time,
-- 
Michael C. De Masi - AT&T Communications (For whom I work and not speak)
3702 Pender Drive, Fairfax, Virginia 22030   Phone: 703-246-9555
UUCP:   seismo!decuac!grebyn!paisano!demasi
     "Life.  Don't tell me about life." - Marvin, the paranoid android

mjr@well.UUCP (Matthew Rapaport) (07/20/87)

In article <229@paisano.UUCP> demasi@paisano.UUCP (Michael C. De Masi) writes:
>Anyway, to be fair to Informix, with the exception of the above probs
>(which I'll admit I probably would have avoided if I'd read the doc a
>bit more carefully) ESQL/C is a good tool, and has served me well.
>
This also goes to prove a point I've been making in a column... That
even state-of-the-art non-procedural 4GL's are NOT so simple that any
end user can sit down and write complex retrieval and update operations.
This work is best left to specialists, no matter what the level of
language interface...

mjr@well (ptsfa!well!mjr)

jeff@rtech.UUCP (Jeff Lichtman) (07/21/87)

From article <229@paisano.UUCP>, by demasi@paisano.UUCP (Michael C. De Masi):
> To insure data integrity, I had put all update statements within the
> context of a 'begin work ... commit work else rollback work' structure.
> 
> IMPORTANT:  Both 'commit work' and 'rollback work' have the side effect
> of closing every active cursor in the whole damn program!
> 
> (FLAME ON!!!)
> Now this is not limited to cursors acted upon within the structure, or
> indeed to cursors declared for update, but all of them.  The nice person
> I spoke to at Informix assured me that this was indeed a documented
> feature of both 'commit work' and 'rollback work'.  Documented or not,
> I find this a rather bizarre concept.  I thought the purpose of this
> structure was to insure transaction integrity, not to do cleanup work
> of record locks, etc.  In my humble opinion, this is the kind of thing
> best left to the programmer or to the database administrator, and it
> should not be hidden within the context of a seemingly unrelated
> functionality.
> (FLAME OFF!!!)  Lord, do I feel better.
> -- 
> Michael C. De Masi - AT&T Communications (For whom I work and not speak)

Closing all cursors when committing or rolling back a transaction makes
sense, and conforms to the ANSI SQL standard.

It is easier to explain why "rollback work" should close all open cursors,
so let me start with that.  The "rollback work" statement is supposed to
reset the database to the state it was in before the transaction began.
It would be nearly impossible, in the general case, to position cursors
properly after a "rollback work"; for example, suppose you had deleted
a bunch of rows from a table, positioned a cursor just before these
deleted rows, and then did a "rollback work".  The deleted rows have to
come back, so should the next row you fetch be the first deleted row or
the one after the deleted rows?

Now, about releasing locks when committing or rolling back a transaction:
The purpose of a transaction is to provide atomicity.  That
is, transactions assure that you don't see anyone else's updates before
they're complete, and that no one sees your updates before they're complete.
Transactions are also the basic unit of recovery (in case of a
system crash, for example, all incomplete transactions will be backed out
during recovery).  Locking is only a technique for providing atomicity.
Database systems release locks at commit or rollback time because they
are no longer needed: the transaction has either been entirely committed
or entirely backed out, so the locks can be released safely without
jeopardizing consistency.  To hold the locks longer than this wouldn't
hurt consistency, but it would kill concurrency.  It would be possible
to build a system that would allow the programmer to control when the
locks were released, but that would be putting the burden of maintaining
database consistency on the programmer; that's a service that the DBMS
should provide, and is one of the things that distinguishes a DBMS from
a file system.

Now that we've established why the DBMS releases locks at commit or
rollback time, let's consider why cursors must be closed at commit
time.  Suppose you are in the middle of a transaction, and you have
a cursor positioned in the middle of a table.  Now you do a "commit work".
Suppose we want the cursor to remain open; the DBMS would still have
to release its locks, which would mean that the cursor couldn't maintain
whatever locks it had on its current position.  Suppose, before
you tried to fetch the next row from the cursor, that someone destroyed
the underlying table, or deleted the row that the cursor was going to
advance to, or added a row in front of the one it was to advance to.
What should happen?  Not only would it be extremely difficult to
implement the DBMS to account for all possible cases of trying to
use a cursor that has had the locks swept out from under it, it would
be difficult to define what should happen in all cases (and many
definitions would necessarily be arbitrary).

It seems that Mr. De Masi got into this mess by trying to use "commit
work" and "rollback work" to guarantee the success or failure of
individual database statements.  That is not a correct use of transactions.
It appears to me that Mr. De Masi is looking for statement atomicity; that is,
he wants every individual database statement to completely succeed or
be backed out.  This is something that every relational DBMS should provide for
the user.  I'm not familiar enough with Informix to say whether it does
this, but it's a pretty good bet that it does.  Therefore, there should
be no need to test every update and either commit it or roll it back;
the DBMS will leave in the effects of every successful statement, and
back out the effects of every statement that couldn't complete due to
some DBMS error.  In SQL, the effects of all successful statements in the
current transaction are committed at the time of a "commit work" statement, or
backed out at the time of a "rollback work" statement.

It's possible that I misinterpreted Mr. De Masi's statements, and what
he's really looking for is a general-purpose way of getting rid of
unwanted updates (e.g. a program makes a successful update, then
later on in the transaction it decides it shouldn't have made the update
and needs to get rid of it).  Standard SQL doesn't provide any tools
for doing this, and I'm not aware of any DBMS that makes this easy for
you.  If your tables all have unique primary keys (always a good idea),
you could write your program so that it remembers the old row and the
new key; that way, you could put back the old row if you wanted.
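
For instance (a sketch; the table, column, and host variable names
are invented): save the old values under the primary key before you
change them, and write them back if you change your mind.  Within one
transaction your own write lock keeps anyone else from touching the
row in between.

        EXEC SQL select bal into :old_bal from accts
                where acct_no = :key;
        EXEC SQL update accts set bal = :new_bal
                where acct_no = :key;

        /* ... later in the same transaction, to undo: */
        EXEC SQL update accts set bal = :old_bal
                where acct_no = :key;
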
-- 
Jeff Lichtman at rtech (Relational Technology, Inc.)
"Saints should always be judged guilty until they are proved innocent..."
{amdahl, sun}!rtech!jeff
{ucbvax, decvax}!mtxinu!rtech!jeff

demasi@paisano.UUCP (Michael C. De Masi) (07/22/87)

In article <1076@rtech.UUCP>, jeff@rtech.UUCP (Jeff Lichtman) writes:
> From article <229@paisano.UUCP>, by demasi@paisano.UUCP (Michael C. De Masi):
> > To insure data integrity, I had put all update statements within the
> > context of a 'begin work ... commit work else rollback work' structure.
> > 
> > IMPORTANT:  Both 'commit work' and 'rollback work' have the side effect
> > of closing every active cursor in the whole damn program!
> > 
> > (FLAME ON!!!)

    ..... I complain for a while about the above activity .....

> > (FLAME OFF!!!)  Lord, do I feel better.
> > -- 
> > Michael C. De Masi - AT&T Communications (For whom I work and not speak)
> 
> Closing all cursors when committing or rolling back a transaction makes
> sense, and conforms to the ANSI SQL standard.
> 
     .....  Mr Lichtman explains his above statement nicely  .....
> 
> Now that we've established why the DBMS releases locks at commit or
> rollback time, let's consider why cursors must be closed at commit
> time.  Suppose you are in the middle of a transaction, and you have
> a cursor positioned in the middle of a table.  Now you do a "commit work".
> Suppose we want the cursor to remain open; the DBMS would still have
> to release its locks, which would mean that the cursor couldn't maintain
> whatever locks it had on its current position.  Suppose, before
> you tried to fetch the next row from the cursor, that someone destroyed
> the underlying table, or deleted the row that the cursor was going to
> advance to, or added a row in front of the one it was to advance to.
> What should happen?  Not only would it be extremely difficult to
> implement the DBMS to account for all possible cases of trying to
> use a cursor that has had the locks swept out from under it, it would
> be difficult to define what should happen in all cases (and many
> definitions would necessarily be arbitrary).
> 

Now granted, I probably don't have the level of understanding about
database theory that you do, but what about purely readonly queries that
do not cause any locking?  Granted, committing an update does change the
nature of a given table, but can't that happen anyway?  That is, if I
do a readonly query of a group of database records, and one gets changed
by another process or user before I advance to that record, won't I get
incorrect or no longer existing data, or is there some functionality
that takes care of this of which I'm not aware?  If so, couldn't this
same functionality be used to keep readonly cursors open under the
circumstances I've described?


> It seems that Mr. De Masi got into this mess by trying to use "commit
> work" and "rollback work" to guarantee the success or failure of
> individual database statements.  That is not a correct use of transactions.
> It appears to me that Mr. De Masi is looking for statement atomicity; that is,
> he wants every individual database statement to completely succeed or
> be backed out.  This is something that every relational DBMS should provide for
> the user.  I'm not familiar enough with Informix to say whether it does
> this, but it's a pretty good bet that it does.  Therefore, there should
> be no need to test every update and either commit it or roll it back;

No, not unless you consider that one database 'transaction', from the
viewpoint of the user, may actually consist of any number of individual
database statements, any one of which is subject to failure.  This is
especially true in the case of relational databases, where the designer
is encouraged to divide logical records over any number of separate and
distinct tables.  The actual 'update' of which I wrote in fact
consists of at least two 'update' statements, a possible deletion and
a variable number of possible insertions, which in scope spread
themselves over at least six separate tables.  Every one of these
statements must be and is checked for error.  And although you are
correct that Informix does indeed automatically roll back statements
that cause database errors, the 'begin/commit/rollback work' structure
is critical to insure that the _entire_ transaction can be rolled
back, to make the functionality of the user's 'update' command truly
atomic.

> It's possible that I misinterpreted Mr. De Masi's statements,

Yup!  But hey, what the hell, like you said:

> "Saints should always be judged guilty until they are proved innocent..."

                                  ;-)

Anyway, Jeff, thanks a lot for the input.  I really do appreciate
any help/explanations/arguments on this or any subject regarding
making my job easier!

Till the next disaster,
-- 
Michael C. De Masi - AT&T Communications (For whom I work and not speak)
3702 Pender Drive, Fairfax, Virginia 22030   Phone: 703-246-9555
UUCP:   seismo!decuac!grebyn!paisano!demasi
     "Life.  Don't tell me about life." - Marvin, the paranoid android

jeff@rtech.UUCP (Jeff Lichtman) (07/25/87)

From article <230@paisano.UUCP>, by demasi@paisano.UUCP (Michael C. De Masi):
> In article <1076@rtech.UUCP>, jeff@rtech.UUCP (Jeff Lichtman) writes:
>> From article <229@paisano.UUCP>, by demasi@paisano.UUCP (Michael C. De Masi):
>> 
>> Now that we've established why the DBMS releases locks at commit or
>> rollback time, let's consider why cursors must be closed at commit
>> time...
> 
> Now granted, I probably don't have the level of understanding about
> database theory that you do, but what about purely readonly queries that
> do not cause any locking?

Usually, a DBMS will get and hold shared locks (also called "read locks")
until end transaction time.  What most DBMS's guarantee by default is
serializability, also called "level-three consistency".  A set of
concurrent transactions is serializable if there is some way of
doing all of the transactions non-concurrently and getting the same
result.  That is, if you can untangle all of the interleaved operations
in your concurrent transactions, and then can find some way of laying
your transactions end-to-end in such a way that it would produce the
same result, the transactions are called serializable.

This can be a hard concept to absorb at first, so I'll put it a different
way: the job of concurrency control in a DBMS is to ensure that users get
a consistent view of the data.  For a read-only transaction, this requires
two things: that the transaction not be able to see other users' uncommitted
updates, and that reading the same record more than once within the same
transaction will yield the same values every time.  This requires getting
and holding read locks until end transaction time, even if it is a read-only
transaction.

Suppose, for instance, that you have a database containing account information.
There are two programs running: one is using a read-only transaction to
add up all the accounts to make sure things balance.  Another is running
transactions to transfer funds between accounts.  If the read-only transaction
didn't get read locks, it would be possible for it to see some half-done
updates from the transfer transactions, and the accounts wouldn't balance.

(Note that there are other methods of concurrency control besides locking.
I won't discuss any of the other methods here.)

Now suppose that a similar situation exists, except the read-only
transaction has two parts: first, it tallies all accounts with a positive
balance, and then it tallies all accounts with a negative balance.  Again,
assume that there is another set of transactions going on at the same time
that transfer funds between accounts.  If the read-only transaction doesn't
hold its locks until end transaction time, it's possible that an update
to an account will cause it to be seen as a positive-balance account
on the first pass, and a negative-balance account on the second pass.
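
In embedded-SQL terms the anomaly looks like this (a sketch; names
invented):

        EXEC SQL select sum(balance) into :pos from accts
                where balance >= 0;

        /* A transfer transaction commits here, driving account X
         * from a positive balance to a negative one.             */

        EXEC SQL select sum(balance) into :neg from accts
                where balance < 0;

        /* X has now been counted in both tallies -- unless the
         * first pass's read locks were held to end of transaction. */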

Thus, in order to guarantee the user a consistent view of the data,
a DBMS must get and hold locks, even in a read-only transaction.  This
is true in the general case.  However, if the DBMS knew enough about
the user's application, it might be able to figure out when locks were no
longer needed and release them earlier.  For example, a transaction that
does nothing but read from a single table, and never re-reads anything,
doesn't need to hold its read locks until the end of the transaction; it
only needs to hold its locks on each record, page, or whatever, until the
application is done with it.  Please note, however, that a DBMS has no
way of knowing whether this will be true, because it doesn't have the brains
to study the application to see what it's doing.  The programmer does
have the brains, however.  Some DBMS's allow the user to control
locking in various ways; Ingres, for instance, allows the user to
tell it not to use locks when reading.  But all relational DBMS's that I
know of use level-three consistency as the default.  That is, some DBMS's
allow the user to risk seeing inconsistent views of the data, but it
must be done by the explicit action of the user.
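
In Ingres the escape hatch is spelled something like this (quoting
from memory; check the Ingres manual):

        set lockmode session where readlock = nolock;

After that, the session's reads take no locks, and the programmer has
explicitly accepted the risk of an inconsistent view.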

> Granted, commiting an update does change the
> nature of a given table, but can't that happen anyway?  That is, if I
> do a readonly query of a group of database records, and one gets changed
> by another process or user before I advance to that record, won't I get
> incorrect or no longer existing data, or is there some functionality
> that takes care of this of which I'm not aware?  If so, couldn't this
> same fuctioanlity be used to keep readonly cursors open under the
> circumstances I've described?

This could be a problem if read-only transactions didn't get and hold
locks.  But they do, and a consequence is that even read-only transactions
won't see another transaction's incomplete updates.  It is possible for
one transaction to see another's *committed* updates, but that's perfectly
OK.

>> It seems that Mr. De Masi got into this mess by trying to use "commit
>> work" and "rollback work" to guarantee the success or failure of
>> individual database statements...
> 
> No, not unless you consider that one database 'transaction', from the
> viewpoint of the user, may actually consist of any number of individual
> database statements, any one of which is subject to failure...

It sounded to me like the application was written something like this:

	database statement;
	if (success)
		commit work;
	else
		rollback work;

	database statement;
	if (success)
		commit work;
	else
		rollback work;

	database statement;
	if (success)
		commit work;
	else
		rollback work;

	...

	commit work;

When it should be written like this:

	database statement;
	if (failure)
		rollback work;

	database statement;
	if (failure)
		rollback work;

	database statement;
	if (failure)
		rollback work;
	...

	commit work;

I have left out some control structures, but you get the idea.  "Success"
and "failure" here don't necessarily refer to the database statements;
they could be talking about other conditions that arise in the application.
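
Fleshed out, the second pattern might look like this in ESQL/C-
flavored C (a sketch, not gospel -- the statements and names are
invented, and the error handling is reduced to a single goto):

        int
        do_transfer()
        {
                EXEC SQL begin work;

                EXEC SQL update accts set bal = bal - :amt
                        where acct_no = :from_acct;
                if (sqlca.sqlcode != 0)
                        goto abort;

                EXEC SQL update accts set bal = bal + :amt
                        where acct_no = :to_acct;
                if (sqlca.sqlcode != 0)
                        goto abort;

                EXEC SQL commit work;      /* all or ... */
                return (0);
        abort:
                EXEC SQL rollback work;    /* ... nothing */
                return (-1);
        }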

> Michael C. De Masi - AT&T Communications (For whom I work and not speak)
> 3702 Pender Drive, Fairfax, Virginia 22030   Phone: 703-246-9555
> UUCP:   seismo!decuac!grebyn!paisano!demasi
>      "Life.  Don't tell me about life." - Marvin, the paranoid android
-- 
Jeff Lichtman at rtech (Relational Technology, Inc.)
"Saints should always be judged guilty until they are proved innocent..."
{amdahl, sun}!rtech!jeff
{ucbvax, decvax}!mtxinu!rtech!jeff

garyp@cognos.uucp (Gary Puckering) (07/30/87)

In article <1076@rtech.UUCP> jeff@rtech.UUCP (Jeff Lichtman) writes:
>Now, about releasing locks when committing or rolling back a transaction:
> ...
>Database systems release locks at commit or rollback time because they
>are no longer needed: the transaction has either been entirely committed
>or entirely backed out, so the locks can be released safely without
>jeopardizing consistency.  To hold the locks longer than this wouldn't
>hurt consistency, but it would kill concurrency.  
> ...
>Suppose we want the cursor to remain open; the DBMS would still have
>to release its locks, which would mean that the cursor couldn't maintain
>whatever locks it had on its current position.  

Your statements seem inconsistent.  Why would the DBMS *have* to release
its locks?  I thought you said that they were released because it was
safe to do so, and desirable for good concurrency.  You haven't given
a reason why it is necessary.

>Suppose, before
>you tried to fetch the next row from the cursor, that someone destroyed
>the underlying table, or deleted the row that the cursor was going to
>advance to, or added a row in front of the one it was to advance to.

For almost all the applications I can think of which need the cursor
to survive a commit, it would be sufficient to end the first
transaction and begin another immediately, retaining locks on all
objects which the active cursors of the first transaction were addressing.
Locks on other objects (i.e. those which the transaction did not have
addressability to at commit time) could be released.

This approach would provide another way of ending transactions beside
the traditional COMMIT and ROLLBACK.  A good name for it escapes me,
but you are essentially ending a transaction, inheriting its cursor
(and requisite locks), and beginning a new transaction.
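
To make the idea concrete, the statement might be spelled something
like this (pure invention on my part, just to give the thing a name):

        commit work retaining cursors;

The semantics: end the transaction, release every lock except those
protecting rows addressed by currently open cursors, and immediately
begin a new transaction that inherits those cursors and locks.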

>What should happen?  Not only would it be extremely difficult to
>implement the DBMS to account for all possible cases of trying to
>use a cursor that has had the locks swept out from under it, it would
>be difficult to define what should happen in all cases (and many
>definitions would necessarily be arbitrary).

In the scheme I describe, the problem is avoided by not releasing
locks on objects which the cursors are addressing.

>
>It seems that Mr. De Masi got into this mess by trying to use "commit
>work" and "rollback work" to guarantee the success or failure of
>individual database statements.  That is not a correct use of transactions.

Mr. De Masi is not the only person to get himself into these problems.
Lots of other people have gone down the same garden path, led by
tantalizing database systems that do so many things for you
automatically that you can't get it to do what *you* want.

Suppose I write a simple on-line transaction program which allows me
to browse airline seat reservations given a partial name.  Assume
there are several qualifying records.  When I find the one I'm looking
for, I want to cancel the reservation.  If all this is done on a
single transaction, then every record I read will be locked.  Suppose
I decide not to cancel the reservation, and instead go to lunch.  This
leaves a bunch of records locked for a long period.  The other clerks
get very mad at me.

To avoid this, I could write the application so that browsing is done
on a versioned read (a read-only transaction).  This would eliminate
the locks.  But now I have to do the update on another transaction
(one that allows updating)!  But how do I get addressability to the
record I found during browsing?  How do I know that it hasn't changed
between the time I started the versioned read and the time I fetched
it on the update transaction?  Thanks for all the help, database
system, but why couldn't you just preserve my cursor across a COMMIT!
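
What you end up coding by hand is something like this (a sketch; the
'version' column is an invention, added only so the program can tell
whether the record changed behind its back, and browsing several
matches would really use a cursor):

        /* Browse pass: read-only, nothing stays locked. */
        EXEC SQL select resv_no, version into :resv, :ver
                from reservations where name like :pattern;

        /* Update pass: a separate transaction.  Re-fetch with a
         * lock and make sure nothing changed since the browse.  */
        EXEC SQL begin work;
        EXEC SQL declare u cursor for
                select version from reservations
                where resv_no = :resv for update;
        EXEC SQL open u;
        EXEC SQL fetch u into :ver2;
        if (sqlca.sqlcode == 0 && ver2 == ver) {
                EXEC SQL delete from reservations where current of u;
                EXEC SQL commit work;
        } else {
                EXEC SQL rollback work;
                /* tell the clerk the record changed or vanished */
        }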

Actually, as I've mentioned in another article, I think there are lots
of applications where degree 3 consistency is overkill and that the
database system should allow you the option of degree 2 consistency
(with a stable cursor).  Both DB2 and SQL/DS provide this feature
(although the DB2 implementation is rather flakey).
-- 

Gary Puckering        3755 Riverside Dr.
Cognos Incorporated   Ottawa, Ontario    {allegra,decvax,ihnp4,linus,pyramid}
(613) 738-1440        CANADA  K1G 3N3    !utzoo!dciem!nrcaer!cognos!garyp

speegle@im4u.UUCP (Greg Speegle) (08/03/87)

In article <1200@smokey.UUCP>, garyp@cognos.uucp (Gary Puckering) writes:
 
> For almost all the applications I can think of which need the cursor
> to survive a commit, it would be sufficient to end the first
> transaction and begin another immediately, retaining locks on all
> objects which the active cursors of the first transaction were addressing.
> Locks on other objects (i.e. those which the transaction did not have
> addressability to at commit time) could be released.
> This approach would provide another way of ending transactions beside
> the traditional COMMIT and ROLLBACK.  A good name for it escapes me,
> but you are essentially ending a transaction, inheriting its cursor
> (and requisite locks), and beginning a new transaction.
> In the scheme I describe, the problem is avoided by not releasing
> locks on objects which the cursors are addressing.
> Suppose I write a simple on-line transaction program which allows me
> to browse airline seat reservations given a partial name.  Assume
> there are several qualifying records.  When I find the one I'm looking
> for, I want to cancel the reservation.  If all this is done on a
> single transaction, then every record I read will be locked.  Suppose
> I decide not to cancel the reservation, and instead go to lunch.  This
> leaves a bunch of records locked for a long period.  The other clerks
> get very mad at me.
> To avoid this, I could write the application so that browsing is done
> on a versioned read (a read-only transaction).  This would eliminate
> the locks.  But now I have to do the update on another transaction
> (one that allows updating)!  But how do I get addressability to the
> record I found during browsing?  How do I know that it hasn't changed
> between the time I started the versioned read and the time I fetched
> it on the update transaction?  Thanks for all the help, database
> system, but why couldn't you just preserve my cursor across a COMMIT!

The problem you propose here is called a long-duration transaction, which is
a serious problem for design databases, like CAD systems. One solution, which
solves many of the problems posted here, involves nested transactions. A nested
transaction is a tree structured representation of transactions, where a leaf
is a database operation (read/write) and a non-leaf node is a collection of
nodes. At the level above the leaves, a node would be the same as a standard
transaction, while the level above that would have a transaction as an 
operation. Thus, for your problem, one transaction would perform all the reads,
and a second transaction would perform all of the writes, but both of these
transactions would be subtransactions of the SAME TRANSACTION, thus
locks could be preserved across the commit of the subtransaction.
There are protocols which ensure serializability and other stuff as
well.  I have not heard of any
production system which uses nested transactions, and if anyone out there
does know of such a system, I would appreciate hearing about it, as nested
transactions are part of my dissertation research.

More information about nested transactions can be found in work done by
J. Elliot B. Moss, currently of the University of Massachusetts, in his
dissertation, now a book published by MIT Press called, "Nested Transactions".
This work is primarily involved with distributed databases, but it clearly
explains the concepts of nested transactions. Other references can be supplied 
on request.

Apparently, the above is not sufficient text to get this article posted, so
I'll ramble some more about nested transactions. The lock protocol proposed in
Moss allowed lock sharing as well. A transaction could lock an object only if
all holders of incompatible locks are ancestors of the transaction needing
the lock.
Once a transaction finishes, it releases its lock to its parent. Thus, in the 
example problem presented here, the first subtransaction would attempt to get
a read lock on the items needed, while the second transaction would
acquire a write lock on the reservation which would need to be
cancelled.  The cancelled reservation would still be in the database
because the parent of the "write"
subtransaction would still hold the read locks. This is only a brief sketch 
of the concept, but I think it may prove very useful in more difficult 
database applications, such as CAD systems.
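
The acquisition rule is compact enough to sketch in C (my own
rendering of the idea, not code from Moss's book):

        struct txn {
                struct txn *parent;        /* NULL at the root */
        };

        /* Is 'anc' the same node as 't', or an ancestor of it? */
        static int
        is_ancestor(anc, t)
        struct txn *anc, *t;
        {
                for (; t != NULL; t = t->parent)
                        if (t == anc)
                                return (1);
                return (0);
        }

        /* Moss's rule: 't' may take a lock on an object iff every
         * current holder of an incompatible lock on that object is
         * an ancestor of 't'.  On commit, a subtransaction's locks
         * are released to its parent, not dropped -- which is what
         * lets a sibling subtransaction pick them up later.       */
        static int
        may_lock(t, holders, nholders)
        struct txn *t, **holders;
        int nholders;
        {
                int i;

                for (i = 0; i < nholders; i++)
                        if (!is_ancestor(holders[i], t))
                                return (0);
                return (1);
        }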

Greg Speegle
{ihnp4,seismo,pyramid}!ut-sally!speegle    speegle@sally.UTEXAS.EDU