gupta@cullsj.UUCP (Yogesh Gupta) (02/17/88)
I find that this group does not have much discussion about either the theory
or the implementation of DBMS issues. Why is that? I know that quite a few
people who are involved in DBMS research as well as product development read
this group, so a lack of people cannot be the reason. Any comments?
--
Yogesh Gupta            | If you think my company will let me
Cullinet Software, Inc. | speak for them, you must be joking.
UH2@PSUVM.BITNET (Lee Sailer) (02/18/88)
In article <232@cullsj.UUCP>, gupta@cullsj.UUCP (Yogesh Gupta) says:
>
>I find that this group does not have much discussion about
>either the theory or the implementation of DBMS issues.
>Why is that?

Because it is hard to draw diagrams of entities with relationships between
them. (semi 8-)
gorry@smidefix.liu.se (Goran Rydquist) (02/23/88)
In article <232@cullsj.UUCP> gupta@cullsj.UUCP (Yogesh Gupta) writes:
>I find that this group does not have much discussion about
>either the theory or the implementation of DBMS issues.

I'd like to start!

The greatest evil in today's database theory is the unquestioned (?)
assumption that the data model must be based on records. This is true for
all three of the accepted and currently used data models, namely the
hierarchical, the network and the relational models.

I quote:

    Models which provide additional file structure around the records
    (eg sequencing, hierarchies, CODASYL networks) overcome some of the
    functional limitations of records. None of them overcome all the
    limitations. Furthermore, by building on top of record structures,
    they retain all the underlying ambiguities. In some cases, they simply
    add more options for representing something which could already be
    represented in several ways in record structure.

[W. Kent, "Limitations of Record-Based Information Models", ACM Transactions
on Database Systems, vol 4, no 1, pp 107-131, March 1979]

I would like to hear comments, historical motivations, views etc.

---
Goran Rydqvist    gorry@majestix.liu.se
---------
dc@gcm (Dave Caswell) (02/28/88)
In article <725@smidefix.liu.se> gorry@smidefix.liu.se (Goran Rydquist) writes:
)
)The greatest evil in today's database theory is the unquestioned (?) assumption
)that the data model must be based on records. This is true for all three of
)the accepted and currently used data models, namely the hierarchical, the
)network and the relational models.
)
)I quote:
) Models which provide additional file structure around the records
) (eg sequencing, hierarchies, CODASYL networks) overcome some of the
) functional limitations of records. None of them overcome all the
) limitations. Furthermore, by building on top of record structures,
) they retain all the underlying ambiguities. In some cases, they simply
) add more options for representing something which could already be
) represented in several ways in record structure.
)[W. Kent, "Limitations of Record-Based Information Models", ACM Transactions on
)Database Systems, vol 4, no 1, pp 107-131, March 1979]
)
)I would like to hear comments, historical motivations, views etc.
I'm not sure how to respond to this. You say that something is the greatest
evil without giving any reasons why. What are the functional limitations of
records? What would you like to order by (use for sequencing) if not a field in
a record? Do records have a natural sequencing apart from their contents?
You say "In some cases, they simply add more options ..". Again I ask what
do you want to represent that can not be represented in records? What does
"adding file structure" or files in general have to do with data models?
What are all the underlying ambiguities?
In twenty-five pages the author must have had some arguments. Why don't you
summarize them?
cy@ashtate (Cy Shuster) (03/02/88)
What is the specific meaning of "records" that causes problems here? Is it
the grouping together of third-normal data (physically and logically), or is
it the problem of "record definitions" with hierarchies, such as a date
composed of month, day, year subfields (i.e., composite domains)?

To my mind, there is an undeniable benefit of data independence in the
relational model (allowing the application and the database to change
independently of each other), at the cost in many cases of lessened
performance, from the underlying engine having to support arbitrary joins.

It's perhaps akin to assembler language vs. a higher-level one: a good
programmer can make impressive performance gains in assembler, at an
increased maintenance cost. By the same token, a good programmer can take
advantage of knowledge of the physical database layout, especially in a
network system, for very good performance. But changing the network means
changing all the code.

You pays your money, and you takes yer choice.

--Cy--
dBASE Mac Development
UUCP:...seismo!scgvaxd!ashtate!cy
marti@ethz.UUCP (Robert Marti) (03/02/88)
In article <725@smidefix.liu.se>, gorry@smidefix.liu.se (Goran Rydquist) writes:
> The greatest evil in today's database theory is the unquestioned (?) assumption
> that the data model must be based on records. [ ... ]
>
> I quote:
> [ ... Quote ... ]
> [W. Kent, "Limitations of Record-Based Information Models", ACM Transactions on
> Database Systems, vol 4, no 1, pp 107-131, March 1979]
>
> I would like to hear comments, historical motivations, views etc.
>
> Goran Rydqvist    gorry@majestix.liu.se
> ---------

The ideas contained in Bill Kent's paper are now more than 10 years old.
Even today, however, there are not many systems which implement them. This
is partly due to the fact that their implementation has turned out to be
decidedly non-trivial: Apart from deciding on how to map the data model onto
secondary storage you also have to consider access path, recovery, and
concurrency control issues.

A few systems have been built as research prototypes -- CCA's work on DAPLEX
(LDM, DDM), Xerox PARC's Cypress, and the TAXIS project at U Toronto come to
mind. But there are certainly no widely used products which support semantic
data models on the market today, although the situation seems to be changing
with all the interest in applying database techniques to engineering and AI
applications: A new generation of object-oriented database systems is
emerging, e.g., Servio Logic's GemStone and Ontologic's VBase products.

Still, users of record-oriented database systems apparently have been able
to build a lot of very useful applications with this technology.

--
Robert Marti                    Phone:      +41 1 256 52 36
Institut fur Informatik
ETH Zentrum/SOT                 CSNET/ARPA: marti%ifi.ethz.ch@relay.cs.net
CH-8092 Zurich, Switzerland     UUCP:       ...uunet!mcvax!ethz!marti
gorry@senilix.liu.se (Goran Rydquist) (03/04/88)
I wrote this some time ago

>)The greatest evil in today's database theory is the unquestioned (?) assumption
>)that the data model must be based on records.

and got answers like

>In twenty-five pages the author must have had some arguments

so ... I'll give you some of my own arguments, much inspired by the original
article of course.

Let's start with a definition [also by W. Kent].

    "By record we mean here a fixed sequence of field values, conforming to
    a static description usually contained in catalogs and/or in programs.
    The description consists mainly of name, length and data type for each
    field."

The static description in the definition is usually referred to as the
schema. The phrase "conforming to" implies that the schema is *extracted*:
that is, the information needed to interpret (or at least process) the data
is stored separately from the record itself.

The major idea of the record is that the schema is extracted. The motivation
is that we save space by avoiding the repetition of the same information.

The opposite view is the distributed schema. By a distributed schema I mean
that the information needed to interpret a data instance is stored explicitly
together with that instance. To someone casually glancing at a record data
model, the schema appears to be distributed.

The extraction is a computational, machine-oriented way of handling large
amounts of data. The space saved by extracting the schema easily becomes
illusory. The resulting rigid system does not handle variation well, and a
user is confronted with the unnatural requirement of predicting the worst
case. This estimate is then allocated in every instance, resulting in much
waste.

A person is a good example of an entity in the real world. What attributes
would be needed to model a person? Consider name, address, social security
number, length, age, sex, maiden name etc.
Not all of these attributes are needed for every person instance: some
people haven't got a social security number, only girls have maiden names
etc.

    Person 1                        Person 2
    ----------------------          ----------------------
    name     "Stan Smith"           name         "Ann Smith"
    address  "Park Avenue 32"       address      "Main street 1"
    length   6'                     length       5'
    age      24                     age          22
                                    maiden-name  "Jones"

To accommodate the variations, we could:

- Define the record format to include the union of all relevant fields,
  where not all the fields are expected to have values in every record.
  These null values naturally lead to storage overhead; a user or
  application programmer is forced to predict every possible field that may
  appear in a person record; and there is no restriction on which fields
  should have values when.

- Allow the same field to have different meanings in different records. The
  meaning of the field would then be interpreted by adding an extra type
  field to the record. Unfortunately the interpretation of this record will
  only be known by the application that conceived it. The database and
  independent applications treat the two conceptually associated fields as
  separate chunks of data, with no known restrictions. Further, space will
  be wasted if the alternatives in the union are not all of equal size.

- Define a new record type for every combination of fields. This approach
  eliminates the storage space overhead, but if the data varies too much,
  the system will be littered with record types. The desired correspondence
  between entity and record disappears completely, and no restrictions exist
  that prevent two records from modeling the same entity at the same time.

Suppose we have a bank account record type. An account can be allocated
either to a corporation or to a person. This relationship is naturally
modeled by having an owned-by field in the account record. The problem
arises because persons are identified by social security number, while
corporations are identified by name (string).
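As an aside on the person example, the three workarounds listed above can be
sketched roughly as follows. This is only an illustration in Python, which
is not part of the original post; every class name, field name and the
null_count helper are hypothetical, chosen just to mirror the Stan/Ann
table.

```python
from dataclasses import dataclass, fields
from typing import Optional


# Option 1: one record format containing the union of all relevant fields.
# Fields that do not apply to a given person are stored as nulls (None).
@dataclass
class PersonUnion:
    name: str
    address: str
    length: str
    age: int
    maiden_name: Optional[str] = None
    ssn: Optional[str] = None


# Option 2: one field whose meaning depends on a separate type-tag field.
# Nothing in the record format ties extra_kind to extra_value; only the
# application that wrote the record knows the convention.
@dataclass
class PersonTagged:
    name: str
    address: str
    length: str
    age: int
    extra_kind: str   # hypothetical tag, e.g. "maiden-name" or "none"
    extra_value: str


# Option 3: a separate record type for every combination of fields.
@dataclass
class PersonBasic:
    name: str
    address: str
    length: str
    age: int


@dataclass
class PersonWithMaidenName:
    name: str
    address: str
    length: str
    age: int
    maiden_name: str


def null_count(record) -> int:
    """Count the fields of a record instance that hold no value."""
    return sum(getattr(record, f.name) is None for f in fields(record))


stan = PersonUnion("Stan Smith", "Park Avenue 32", "6'", 24)
ann = PersonUnion("Ann Smith", "Main street 1", "5'", 22, maiden_name="Jones")

# Slots that option 1 reserves in each record without using them.
print(null_count(stan), null_count(ann))
```

With only two optional attributes, option 3 would already need up to four
record types to cover every combination, which is the "littering" the third
bullet describes.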
The record modeling problems and the possible solutions are similar to the
ones described in the previous example. We could possibly use a generic
pointer type or something like that, but what we really want is for the
value of the owned-by field to be able to assume more than one type.

A record is far from self-describing. Consider the problem of coding a
generic procedure that prints records in a common format. Such a procedure
must minimally know what data it is going to print, and the format of this
data. Programming languages typically use compilation to hard-wire the
schema into the code, which leaves no possibility of querying the record
instance about its composition.

Yeah man! - gry

---
Goran Rydqvist    gorry@majestix.liu.se
---------
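For contrast with the points in the post above, here is a minimal sketch of
what a distributed schema might look like, assuming each instance simply
carries its own field names and typed values. It is written in Python, which
the post itself does not use; print_record, the account numbers, the social
security number and the company name are all made up for illustration.

```python
def print_record(record: dict) -> str:
    """Format any record generically, using only what the instance carries."""
    lines = []
    for field_name, value in record.items():
        # Field name, value type and value all come from the instance itself.
        lines.append(f"{field_name:<12} {type(value).__name__:<4} {value!r}")
    return "\n".join(lines)


# Each instance stores only the attributes it actually has.
stan = {"name": "Stan Smith", "address": "Park Avenue 32",
        "length": "6'", "age": 24}
ann = {"name": "Ann Smith", "address": "Main street 1",
       "length": "5'", "age": 22, "maiden-name": "Jones"}

# The owned-by value assumes more than one type: a number for a person
# (a made-up social security number) or a string for a corporation
# (a made-up company name).
account_1 = {"account": 1, "owned-by": 123456789}
account_2 = {"account": 2, "owned-by": "Example Corp"}

for record in (stan, ann, account_1, account_2):
    print(print_record(record))
    print()
```

The price is the one named in the post: the field names are repeated in
every instance instead of being stored once in an extracted schema.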
allbery@ncoast.UUCP (Brandon Allbery) (03/14/88)
As quoted from <733@senilix.liu.se> by gorry@senilix.liu.se (Goran Rydquist):
+---------------
| I wrote this some time ago
|
| >)The greatest evil in today's database theory is the unquestioned (?) assumption
| >)that the data model must be based on records.
|
| and got answers like
|
| >In twenty-five pages the author must have had some arguments
|
| so ... I'll give you some of my own arguments, much inspired by the original
| article of course.
+---------------

Which he proceeds to do.

One problem. I don't see any incriminating evidence against *records*; I see
incriminating evidence against *static schemas*, a different kettle of fish
entirely. The concept of a *record* (i.e. an object composed of fields)
still remains in the new system.

The proposed system sounds to me like an obvious extension to relational
databases. (Relational purists will probably flame me to death for that!)
--
Brandon S. Allbery, moderator of comp.sources.misc
{well!hoptoad,uunet!hnsurg3,cbosgd,sun!mandrill}!ncoast!allbery