[comp.databases] Recent postings to comp.databases...

larry@xanadu.uucp (Larry Rowe) (08/01/89)

i haven't been reading this newsgroup for the past couple of months
and i just spent the past 2 hours catching up.  i have several comments
on recent postings.  (for those of you that i haven't met yet :)!, i'm
one of the professors at Berkeley that started rti. mike stonebraker
and i are currently working on next generation dbms's (postgres) and
application development tools (picasso, formerly objfads.)

1) fetch previous
the relational model has nothing to say about application program interfaces
(API) to databases.  cursors are an API abstraction.  mike and i wrote a
paper several years ago that proposed a new API abstraction, called a portal,
that attempted to fix problems that we ran into trying to implement browsing
style interfaces to dbms's.  cursors are essentially sequential access
abstractions to a query result.  what a browsing style interface wants
is the ability to randomly fetch data from the query result.  moreover,
the fetch should be able to return more than 1 record.  if you have a table
on the screen and you want to scroll forward, it seems wasteful to issue
N ``fetch 1 record'' commands instead of one ``fetch N records.''  portals allowed
fetch forward/backward and they also allowed you to say ``fetch N where P''
where P is a predicate that is used to search in the return set.  this
last operation can be used to implement ``scroll forward to the row in
the table with name = "Smith".''
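
to make the contrast concrete, here is a rough sketch.  the cursor half is
ordinary SQL; the portal half uses hypothetical syntax invented for this
posting to show the flavor of multi-row, bidirectional, qualified fetches.
it is not the syntax of any shipping product, and the table and column
names are made up.

    -- a standard cursor: sequential, one row per fetch
    declare c cursor for
        select name, dept, salary from emp;
    open c;
    fetch c;            -- one row; repeat N times to fill an N-row screen

    -- portal-style access (hypothetical syntax): multi-row fetches in
    -- either direction, plus a predicate applied to the query result
    declare p portal for
        select name, dept, salary from emp;
    fetch forward 20 from p;                       -- fill a 20-row screen
    fetch backward 20 from p;                      -- scroll back a screen
    fetch forward 20 from p where name = 'Smith';  -- scroll to "Smith"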

most dbms's don't implement backward fetches because they don't have 
backward pointers on their disk pages.  remember that the previous page
may be an overflow page added between the previous main page and the 
current page.

another serious problem is that the demand for SQL compatibility in
commercial products is stifling innovation, at least for now.  it'll
probably be a couple of years before the vendors start to put truly
creative things into their SQL products.

2) normalization
donovan hsieh had it right when he said that you have to pay attention
to both consistency *and* performance.  i have seen too many people fail
when building applications on relational dbms's because they were too
restrictive in their thinking.  they weren't willing to denormalize their
db designs to meet application performance requirements.

3) referential integrity
every commercial SQL dbms will have implemented simple referential
integrity by this time next year.  sybase already has it implemented
and ingres and oracle will do it in the next release.  the point that
someone mentioned about rti doing more general rules relates to
a point we have been making.  referential integrity is fine, but it
is only one kind of integrity rule that you might want to put on
your data.  general purpose rules in a data manager allow you to 
have the dbms maintain many other kinds of constraints.  the toy
example that we use is to represent the type of desk that an employee
should get.  emps under 40 get a steel desk, emps over 40 get a wood
desk, and the president gets a wood desk irrespective of age.  you
could represent this information by storing the value "wood" or
"steel" in a column, but this representation would get out of date
over time.  by storing the information as a rule, the dbms manages
the information.  of course, what you also want are database alerters
(i.e., asynchronous commands sent to programs by the dbms)
so you can automatically issue the instructions to the facilities
people to change a person's desk on his/her 40th birthday.
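
as a rough illustration of the desk rule, here is a sketch in trigger-style
syntax.  i use that syntax only because it is widely recognizable; a rule
system like the one in postgres expresses this differently, and the table
and column names are invented for the example.

    -- keep emp.desk consistent with the policy: the president gets wood,
    -- everyone else gets steel under 40 and wood from 40 on
    create trigger assign_desk
    before insert or update on emp
    for each row
    begin
        if new.title = 'president' then
            set new.desk = 'wood';
        elseif new.age >= 40 then
            set new.desk = 'wood';
        else
            set new.desk = 'steel';
        end if;
    end;

the point is that the policy lives in the dbms, not in every application
program that happens to touch emp.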

	larry

davidm@cimshop.UUCP (David Masterson) (08/02/89)

>2) normalization
>donovan hsieh had it right when he said that you have to pay attention
>to both consistency *and* performance.  i have seen too many people fail
>when building applications on relational dbms's because they were too
>restrictive in their thinking.  they weren't willing to denormalize their
>db designs to meet application performance requirements.
>
Would you say, though, that it is more appropriate to start with a database
that is well constructed (i.e., normalized) and work toward an efficient design,
or vice versa?  This, of course, assumes that a person is willing to "wreck" his
well built database for the lowly consideration of performance.  ;-)

David Masterson
uunet!cimshop!davidm		or		DMasterson@cup.portal.com

larry@postgres.uucp (Larry Rowe) (08/04/89)

In article <450@cimshop.UUCP> davidm@cimshop.UUCP (David Masterson) writes:
>>2) normalization
>>donovan hsieh had it right when he said that you have to pay attention
>>to both consistency *and* performance.  i have seen too many people fail
	...
>Would you say, though, that it is more appropriate to start with a database
>that is well constructed (i.e., normalized) and work toward an efficient design,
>or vice versa?  This, of course, assumes that a person is willing to "wreck" his
>well built database for the lowly consideration of performance.  ;-)

my advice would be to do a good logical database design.  personally, i
don't do normalization per se.  i use an entity-relationship modelling
technique which ensures a normalized design (3NF).  then, examine your
transactions (xacts) and estimate their performance.  if you can't estimate
them, prototype your database and do a pilot study.  if the performance is
acceptable, you're done.  otherwise, denormalize your design to fix the
performance problems.  usually, the problems arise because you're doing
costly joins in time-critical xacts.  the obvious solution is to store the
join rather than the projections.
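
for example (table and column names invented for illustration), the
normalized design pays for the join at query time; the denormalized design
stores the join result once and reads it directly:

    -- normalized: emp and dept are separate, the xact pays for a join
    create table dept (dno int, dname char(20), floor int);
    create table emp  (ename char(20), dno int, salary float);

    select e.ename, d.dname, d.floor
    from   emp e, dept d
    where  e.dno = d.dno;            -- runs on every screen refresh

    -- denormalized: store the join, accept the redundancy and the extra
    -- work to keep dname/floor consistent when a department changes
    create table emp_dept (ename char(20), dno int, salary float,
                           dname char(20), floor int);

    select ename, dname, floor from emp_dept;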

unfortunately, you may end up with a strange database design.  sometimes this
problem can be solved by adding more hardware or by building snapshots
(i.e., copies) of critical tables and denormalizing them.
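
a snapshot of the join gives you much the same effect without disturbing the
base design; something like the following, refreshed on whatever schedule the
application can tolerate (create-table-as-select syntax and the names here
are only illustrative, and vary by product):

    create table emp_dept_snap as
        select e.ename, d.dname, d.floor
        from   emp e, dept d
        where  e.dno = d.dno;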

the bottom line is that you must build an application that works.  relational
dbms's, like most technology, can help, but they aren't perfect.
	larry

dbruck@ciss.Dayton.NCR.COM (Don Bruck@ciss.Dayton.NCR.COM) (08/05/89)

In article <450@cimshop.UUCP> davidm@cimshop.UUCP (David Masterson) writes:
>>
>Would you say, though, that it is more appropriate to start with a database
>that is well constructed (i.e., normalized) and work toward an efficient design,
>or vice versa?  This, of course, assumes that a person is willing to "wreck" his
>well built database for the lowly consideration of performance.  ;-)
>
This is why many people suggest splitting data design responsibility into
two roles. The first person does analysis of the data: understanding what it
means to the user and what the keys (data, relationships) are. (That to me is
what normalization is all about: defining the base elements of data.)

The second person is responsible for taking the normalized data and making
it useful, i.e. de-normalize it, add indices, foreign keys, etc. This
person must understand the problems of denormalization and explain to users
(including programmers) that the database will now perform better *BUT* it
will not be as flexible.
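
For instance, the second person's pass might add things like the following
(the names are illustrative and the exact DDL varies by product):

    -- speed up the joins and lookups the analysis identified as critical
    create index emp_dno_idx on emp (dno);

    -- record the cross-table dependency the analysis found
    alter table emp
        add constraint emp_dept_fk foreign key (dno) references dept (dno);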

This gets us back to a question related to the original: why go to all the
trouble of normalizing if someone is just going to denormalize? Because
when you go through the whole process you can measure how much you have
"hurt" your original design. If you don't go through the process of finding
perfection first, you can't measure how good the system is. You also
don't know which trade-offs you should really make and which accesses will
have to suffer, since you cannot optimize all accesses.



--------------------------------------------------------------------------
Don Bruck	
NCR Corp.
Corporate Data Planning and Administration	
PCD 6
1700 S. Patterson Blvd.
Dayton OH 45479
My opinions and interests are my own.