[comp.databases] Wondering about using datetime as unique key...

dshapiro@Apple.COM (David Shapiro) (01/25/91)

I haven't seen anyone talk about using the timestamp as a unique
key to a table. For heavy TP applications, it seems like a great way
to get rid of the contention on the usual NextNumber table (which
holds the next unique integer for a table key). Basically, I was
wondering if anyone has used this approach and what experiences
you've had with it. The positive aspects I see are:
Zero wait time for a new key for a table
Guaranteed monotonically increasing unique number (date)

Drawbacks:
Uniqueness depends on the granularity of your timestamp function.
Possible duplication if someone screws with the system time.
(no one can be using the system during the daylight savings time adjustment)
Portability to a new platform could require some time manipulations.

If the system stays on one platform, I don't see a problem. And in our
Sybase system, the timestamp will give a time value to the millisecond,
and assuming it is impossible to write a row to disk in less than
a millisecond, no two rows will have the same key/timestamp.
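
To make the idea concrete, here is a minimal sketch in Sybase-flavored
SQL (the table and column names are made up for illustration):

    /* Conventional scheme: every insert must first bump a shared
    ** counter row, so concurrent transactions serialize here.     */
    CREATE TABLE next_number
        (tab_name   char(30)  NOT NULL,
         next_key   int       NOT NULL)

    UPDATE next_number
        SET   next_key = next_key + 1
        WHERE tab_name = 'orders'

    /* Timestamp scheme: the key is just the insertion time, so
    ** there is no shared counter to fight over.                   */
    CREATE TABLE orders
        (order_ts   datetime  NOT NULL,
         item_no    int       NOT NULL,
         qty        int       NOT NULL)
    CREATE UNIQUE CLUSTERED INDEX orders_key ON orders (order_ts)

    /* Each insert just stamps the row with the current time.      */
    INSERT INTO orders (order_ts, item_no, qty)
        VALUES (getdate(), 42, 1)

With the timestamp scheme there is no NextNumber row to update and
wait on, which is where the zero wait time comes from.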

Comments?

Dave Shapiro
Apple Computer

ubiquity@cs.utexas.edu (Richard Hoffman) (01/25/91)

In article <48429@apple.Apple.COM> dshapiro@Apple.COM (David Shapiro) writes:

>I haven't seen anyone talk about using the timestamp as a unique
>key to a table. 

This is frequently done on systems where (as you note later) the
timestamp has sufficient granularity.  In SAA relational databases,
timestamp goes into the microseconds, which is more than enough to
guarantee uniqueness.

The value is always available via CURRENT TIMESTAMP, and can be added
to inserted rows automatically if a DEFAULT clause (either the DB2 NOT NULL
WITH DEFAULT or the ANSI/ISO column default) is available.
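
For example, something along these lines (DB2-flavored DDL; the table
and columns are just an illustration) lets the key be filled in
automatically:

    -- NOT NULL WITH DEFAULT on a TIMESTAMP column makes DB2 supply
    -- the current timestamp whenever the column is omitted from an
    -- insert.
    CREATE TABLE ORDERS
        (ORDER_TS  TIMESTAMP  NOT NULL WITH DEFAULT,
         ITEM_NO   INTEGER    NOT NULL,
         QTY       INTEGER    NOT NULL,
         PRIMARY KEY (ORDER_TS))

    -- The key column never appears in application INSERTs:
    INSERT INTO ORDERS (ITEM_NO, QTY) VALUES (42, 1)

(DB2 also wants a unique index on ORDER_TS to back the primary key.)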

One difficulty is with multi-row updates, or inserts from a subselect.
SAA SQL requires that special registers such as CURRENT TIMESTAMP be
evaluated once, at the beginning of the statement, so the same value
would apply to every row of a multi-row operation.  Of course, a unique
index or PRIMARY KEY clause would catch this, but it is inconvenient.
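
A concrete example of the problem, sticking with the hypothetical
ORDERS table above:

    -- Every row produced by this single statement gets the same
    -- CURRENT TIMESTAMP value, so a unique index on ORDER_TS will
    -- reject all but one of them.
    INSERT INTO ORDERS (ORDER_TS, ITEM_NO, QTY)
        SELECT CURRENT TIMESTAMP, ITEM_NO, QTY
        FROM   OLD_ORDERS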

>Possible duplication if someone screws with the system time.
>(no one can be using the system during the daylight savings time adjustment)

Yes, one of the amusing things about IMS is that it must be down for
one hour out of each year while the clocks are being set back.  In a
relational system, there is less of a problem: put a unique index or
PRIMARY KEY clause on the column, and check after each insertion to
make sure that the row is not rejected because of the uniqueness
constraint.  If it is, simply try again.  You'll eventually get a
unique one (probably first or second try, with microsecond granularity).
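
In code that amounts to a small retry loop around the insert.  A
sketch, again using the hypothetical ORDERS table (in DB2 a duplicate
key comes back as SQLCODE -803):

    -- Let the column default supply the timestamp key, then check
    -- for a duplicate-key rejection and retry.
    INSERT INTO ORDERS (ITEM_NO, QTY) VALUES (42, 1)
    -- if SQLCODE = -803 (duplicate key), re-execute the INSERT;
    -- the new CURRENT TIMESTAMP value will almost certainly differ.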

>Portability to new platform could require some time manipulations.

Too true, especially since date/time features are not standard yet.
But the approval of SQL2, expected this year, could ease that.


-- 
Richard Hoffman       IBM Personal Systems Database Development   (512) 823-1822
1529 Ben Crenshaw Way
Austin, TX 78746         "Life is a gamble at terrible odds; 
(512) 327-9232            if it were a bet you wouldn't take it"  (Tom Stoppard)

mao@eden.Berkeley.EDU (Mike Olson) (01/26/91)

In <48429@apple.Apple.COM>, dshapiro@Apple.COM (David Shapiro) writes
> I haven't seen anyone talk about using the timestamp as a unique
> key to a table.

we tried this in postgres for a while (actually, transaction ids were
a function of the system time).  we eventually abandoned the approach
as unworkable.

> The positive aspects I see are:
> Zero wait time for a new key for a table

no, you're guaranteed to have to do a system call, which guarantees
a context switch.

> Guaranteed monotonically increasing unique number (date)

you would hope so.  unfortunately, people *do* adjust their clocks.  recent
schemes for keeping the clocks on a network synchronized mean that every
machine's notion of the current time will be changed with pretty high
frequency.

> and assuming it is impossible to write a row to disk in less than
> a millisecond, no two rows will have the same key/timestamp. 

this is an invalid assumption.  if you're doing a multi-row insert (for
example, copying records from one table to another) you can do lots of
insertions to a single page fast, since that page will generally be in
the buffer cache.

this is another case where the database guys can't let the operating
system guys do their work for them, because the operating system can't
be trusted.
					mike olson
					postgres research group
					uc berkeley
					mao@postgres.berkeley.edu