bg0l+@andrew.cmu.edu (Bruce E. Golightly) (06/27/90)
The Ingres extensions I talked about are still consistent with the requirements being specified. What Ingres Corp is providing is a user-defined data type, which might very well be graphical in nature. Along with that comes the ability and the responsibility to define appropriate functions for manipulating the new type. These functions might well include overloading something basic like the plus operator. It sounds like the correct conclusions have already been drawn in the discussion. OO extensions for a DBMS will provide enormous potential for sophisticated developers. That power, however, carries with it weighty responsibilities.
segel@Tellabs.COM (Mike Segel) (06/28/90)
In article <1189@abcom.ATT.COM> brr@abcom.ATT.COM (Rao) writes:

First, let me stress that I am not working for Informix, and that my posting is an educated guess, not necessarily true. (Maybe Dr. Scump aka Col. Panic can verify :-) The functionality you are looking for can be found in the multimedia DB OnLine by Informix. (Multimedia, cute little buzzword, no?) I believe Silicon Graphics introduced a product based on Informix's standard engine as well.

>
> I would like to put forth the following necessary
> functionality:
>
> 1) To be able to update the graphical entry of a tuple.
>

This gets to be tricky. It is not efficient to actually store the image as part of the tuple; instead, store a pointer to the image. In OnLine I believe it is a pointer to the DB space which holds all of the images specified by the table. I think what SG did was to have a field store the FD of the image and actually put locks on the file when that tuple was accessed, regardless of whether the image was shown.

> 2) To be able to change the attributes (i.e. characteristics)
> of the graphical field.
>

This is more of a function of the front end. The database need only consider the image to be a large Binary Blob. Sort of abstracts the data. #2 seems to be application dependent.

> 3) To be able to compare two fields of type "graphical".
> This might be the hardest to define. A Graphics Guru
> can be of help here.
>

This is definitely application dependent. Who knows what you are storing in the BLOB? It could be a voice mail message, or some non-visual binary field.

> 4) To implement all operations that are possible
> on generic fields using SQL,
> i.e. Insert, Update, Make Null, etc.
>
> 5) Additionally, to be able to zoom, reduce size, rotate, etc.
> graphical operations on the graphical field.
>

#4 is a given, 'cept NULL would be difficult to explain. #5 is really a process of the front end.

>Guess an object-oriented database is better for such a
>requirement, where function (operator) overloading can be
>used by polymorphism.
>

I tend to disagree. Most applications which people say require an OODBMS can be done on a relational DBMS. Granted, I don't know much about OODBMS's (I am still reading up on them when I get the time), but there is a lot which can be done in relational tuples, if one takes an abstract look at the data for a long enough period. [Guys, don't flame me for this one. OK?]

>
>Would like to invite suggestions/additions from netters
>and even other approaches.
>

Why reinvent the wheel? I think Oracle might have a BLOB field, and so might SYBASE. What they all lack is a good GUI which takes advantage of the back-end ability. I think a possible reason for this is that such a front end would have to be either too generic to be useful, or too application specific or platform specific. Maybe someone from their companies could verify?

>-bindu rama rao

-Gargoyle
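As an editorial illustration of the "store a pointer, not the image" idea in the post above, here is a minimal sketch. It uses Python's built-in SQLite module as a stand-in for any of the engines mentioned; the `employee` table, the `image_path` column and both helper functions are hypothetical names, not anything from Informix or Silicon Graphics.

```python
import os
import sqlite3
import tempfile


def store_image(conn, image_dir, name, data):
    """Write the image bytes to a file and keep only its path in the row."""
    path = os.path.join(image_dir, name + ".img")
    with open(path, "wb") as f:
        f.write(data)
    conn.execute(
        "INSERT INTO employee (name, image_path) VALUES (?, ?)", (name, path)
    )
    return path


def load_image(conn, name):
    """Follow the stored path; the application must know it is a pointer."""
    row = conn.execute(
        "SELECT image_path FROM employee WHERE name = ?", (name,)
    ).fetchone()
    with open(row[0], "rb") as f:
        return f.read()


conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (name TEXT PRIMARY KEY, image_path TEXT)")
image_dir = tempfile.mkdtemp()
store_image(conn, image_dir, "rao", b"\x00\x01\x02")
print(load_image(conn, "rao"))
```

The row stays small, but note that the file lives outside the database, so the application has to dereference the path itself.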
cameron@kirk.nmg.bu.oz (Cameron Stevenson) (06/28/90)
From article <YaVp8zK00Uh7E0l15z@andrew.cmu.edu>, by bg0l+@andrew.cmu.edu (Bruce E. Golightly):
> We're looking at some similar areas for our next round of development.
> As the providers of voice and data services for the university, we must
> manage a cable plant, which implies that we need to handle maps and plans
> showing the locations of cables, wiring closets and outlets. Given those
> goals, I am starting to look at the kinds of things mentioned.
>
> Carnegie Mellon uses Ingres for administrative data base applications. An
> extension to Ingres has recently been announced that supports user-defined
> data types. I believe that this may be the key to what we wish to do.
>
> More news as it develops.

We do exactly the same thing here at Bond University. However, we are not trying to hold all the information within a relational database. Instead, we run a CAD package (MicroStation, from Intergraph) to hold the graphical information. This package allows for links between the graphical elements and relational elements (rows in a table). The range of link types is supported, i.e. one to one, one to many, many to one, etc. The links are maintained by the CAD package, and run through Intergraph's relational server. This effectively allows the CAD package to talk to 'any' database which conforms to ANSI SQL. Currently there are links to ORACLE, Ingres, and Informix.

Having established the links, it is possible to execute SQL queries through the graphics system against the relational database, and have the results displayed graphically, e.g. highlight in red all data outlets with PCs attached. Intergraph also sells a development package which can handle these capabilities through a forms-based application, complete with screen gadgets (a sort of SuperHyperCard ??).

ALSO ... without sounding too much like an advertisement ... MicroStation runs on PCs (currently links to dBase, with hints of links to ORACLE), Macs (links to ORACLE), and Intergraph's workstations (Sun/Apollo/Silicon competitors) with all the goodies I mentioned earlier (relational server, development package, links to multiple database systems). If that weren't enough, MicroStation supports both raster and vector graphics, so getting floor plan information into the system can be extremely quick.

Send me some mail if you want more information, but I'd suggest you give it a look if you haven't already considered it.

Cameron Stephenson Telephone +61 75 951220 Bond University Gold Coast Australia
snorri@strengur.is (snorri) (06/28/90)
The new Informix OLTP engine (Informix OnLine) has the BYTE and TEXT datatypes. The BYTE and TEXT datatypes are called BLOBs (Binary Large OBjects) and have a theoretical limit of 2 gigabytes. One can store scanned images, voice, video, spreadsheet files, WP files, ordinary text files, etc. in those fields. These BLOBs are selected from the database through standard sql and all the Informix application tools (ISQL, 4GL, ESQL/C etc.) can access them. -- Snorri Bergmann, Strengur Consulting Engineers, Reykjavik Iceland INTERNET: snorri@strengur.is
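To make the "selected through standard SQL" point concrete, here is a small sketch. It uses SQLite's BLOB column type through Python's standard library, not Informix OnLine's BYTE/TEXT types, so the table name `document` and its columns are assumptions for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE document (id INTEGER PRIMARY KEY, kind TEXT, body BLOB)"
)

# Any byte string can go in: a scanned image, voice data, a spreadsheet file.
scan = bytes(range(256)) * 4
conn.execute(
    "INSERT INTO document (kind, body) VALUES (?, ?)", ("scanned image", scan)
)

# The blob comes back through an ordinary SELECT, like any other column.
kind, body = conn.execute(
    "SELECT kind, body FROM document WHERE id = 1"
).fetchone()
print(kind, len(body))
```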
bapat@rm1.UUCP (Bapat) (06/28/90)
In <897@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>brr@abcom.ATT.COM (Rao) writes:
>> I am trying to write a relational database
>> that can be used to store graphical images
>> ...file name as string, reference to graphical file
>Then no atomic, serializable updates. Is this OK?

Why is that? Why can I not update the graphical field atomically? If I'm only doing single-transaction updates, why would serializability be a concern?

>> ...a binary field to hold a large graphical image
>Lacks operations on columns of this type.
>It's unstructured and essentially display-only.

Agreed. One cannot do pattern matches on large binary data fields. For example, one couldn't say "select mug_shot from employee where <the employee wears glasses>."

Could a graphical type field be implemented as a blob? (How do vendors implement blobs anyway?)

--
Subodh Bapat bapat@rm1.uu.net OR ...uunet!rm1!bapat
MS E-204, P.O.Box 407044, Racal-Milgo, Ft Lauderdale, FL 33340
(305) 846-6068
"In the great journey of life, I seem to have lost my boarding pass."
kevinc@sequent.UUCP (Kevin Closson) (06/28/90)
In article <2873@tellab5.tellabs.com> segel@Tellabs.COM (Mike Segel) writes:
>In article <1189@abcom.ATT.COM> brr@abcom.ATT.COM (Rao) writes:
> This is more of a function of the front end. The database
>need only consider the image to be a large Binary Blob. Sort of
                                      ^^^^^^^^^^^^^^^^^^

a large binary (B)inary (l)arge (ob)ject ??????
moiram@tekcae.CAX.TEK.COM (Moira Mallison) (06/29/90)
In article <2873@tellab5.tellabs.com> segel@Tellabs.COM (Mike Segel) writes:
>> 1) To be able to update the graphical entry of a tuple.
>>
> This gets to be tricky. It is not efficient to actually store
>the image as part of the tuple, instead actually store a pointer to
>the image.

The problem with this is that you lose one of the important aspects of the relational model, i.e. all the data resides in relations. Now, some of the attributes hold data, and some hold pointers, and it's up to the application to know what to do with the pointers. I don't see this as a step forward.

>> 3) To be able to compare two fields of type "graphical".
>> This might be the hardest to define. A Graphics Guru
>> can be of help here.
>>
> This is definitely application dependent. Who knows what you
>are storing in the BLOB. It could be a voice mail message, or some
>non-visual binary field.

Advances in database technology will ideally make the DBMS smarter. A BLOB does not. "All I've got is a whole bunch of bytes. I don't know what to do with them. You better figure that out." The more information that can be stored in the DBMS, the less effort will be expended to build applications around it. What is stored in the DBMS can be more easily shared and re-used.

>>Guess an object-oriented database is better for such a
>>requirement, where function (operator) overloading can be
>>used by polymorphism.
>>
> I tend to disagree. Most applications which people say require
>an OODBMS can be done on a relational DBMS. Granted I don't know much about
>OODBMS's (I am still reading up on them when I get the time) but there is
>a lot which can be done in relational tuples, if one takes an abstract
>look at the data for a long enough period.

There is a lot that can be done with the relational model, but that doesn't mean that the relational model can be all things to all people. And some things that can be done with the relational model cannot necessarily be done EASILY with the relational model. One of the attractive aspects of the relational model is its simplicity and its rigor. But you start to lose that when you start storing pointers to other objects (get it?) in your attributes. If that's what you need to do, get a DBMS that more fully supports the operations on the data.

Moira Mallison
CAx Data Management
Tektronix, Inc
jkrueger@dgis.dtic.dla.mil (Jon) (06/29/90)
bapat@rm1.UUCP (Bapat) writes:
>>> ...file name as string, reference to graphical file
>>Then no atomic, serializable updates. Is this OK?
>Why is that? Why can I not update the graphical field atomically?

How do you roll back an update to a file?

>If I'm only doing single-transaction updates, why would serializability
>be a concern?

1) other users doing concurrent updates
2) why give up transactions? you lose atomic updates to structured data

>one couldn't say
> "select mug_shot from employee where <the employee wears glasses>."

In the sense that such operations aren't defined on unstructured BLOBs, indeed one couldn't. But there are ways of defining them on data types; in fact, the data type is defined by the operation. A useful graphical data type is one on which one can select by graphic features.

-- Jon
--
Jonathan Krueger jkrueger@dtic.dla.mil uunet!dgis!jkrueger
Drop in next time you're in the tri-planet area!
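The remark that "the data type is defined by the operation" can be illustrated with a rough sketch. SQLite, through Python's standard `sqlite3.Connection.create_function`, lets an application register an operation that queries can then apply to a binary column. The `picture` table and the `mean_brightness` function are hypothetical, the pixels are assumed to be raw 8-bit grayscale bytes, and brightness is only a toy stand-in for a real graphic feature such as "wears glasses".

```python
import sqlite3


def mean_brightness(pixels):
    """Average of raw 8-bit grayscale pixel bytes; None for empty or NULL."""
    if not pixels:
        return None
    return sum(pixels) / len(pixels)


conn = sqlite3.connect(":memory:")
# Register the operation so that SQL queries can select on it.
conn.create_function("mean_brightness", 1, mean_brightness)
conn.execute("CREATE TABLE picture (name TEXT, pixels BLOB)")
conn.executemany(
    "INSERT INTO picture VALUES (?, ?)",
    [("dark", bytes([10] * 16)), ("bright", bytes([240] * 16))],
)

rows = conn.execute(
    "SELECT name FROM picture WHERE mean_brightness(pixels) > 128"
).fetchall()
print(rows)  # [('bright',)]
```

This covers only part of what an abstract data type means: the column is still a plain blob that other code can read and write without going through the defined operation.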
miket@blia.BLI.COM (Mike Tossy) (06/29/90)
>
> I am trying to write a relational database
> that can be used to store graphical images
> ...file name as string, reference to graphical file
>

The ShareBase II product line includes the ability to store UNIX-like (byte array) files on the relational database server. File operations are governed by the same transaction management system as the relational database, therefore code like this:

    set autocommit off
    select blah, blah ....
    file operation ...
    set autocommit on

does indeed result in the dbms and file operations being atomic. (The name of the file holding the drawing becomes a column in the table.)

This technique has been used for storing both digitized photographs and CAD/CAM drawings in files and data about those drawings in tables.

Mike Tossy ShareBase Corporation miket@blia.bli.com 14600 Wichester Blvd (408) 378-7575 ext2200 Los Gatos, CA 95030 (Formerly: Britton Lee, Inc.)
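For contrast with the behaviour described above, here is a sketch of what happens when the file is NOT under the database's transaction manager. It uses SQLite and an ordinary file system via Python's standard library; the `drawing` table and the file name are made up for the example and have nothing to do with ShareBase itself.

```python
import os
import sqlite3
import tempfile

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE drawing (name TEXT, file_path TEXT, revision INTEGER)")

path = os.path.join(tempfile.mkdtemp(), "plan.dat")
with open(path, "wb") as f:
    f.write(b"revision 1")
conn.execute("INSERT INTO drawing VALUES ('plan', ?, 1)", (path,))
conn.commit()

# One logical update: new file contents plus a new revision number.
with open(path, "wb") as f:
    f.write(b"revision 2")
conn.execute("UPDATE drawing SET revision = 2 WHERE name = 'plan'")

# Rolling back undoes the row change only; the file write stays.
conn.rollback()

revision = conn.execute("SELECT revision FROM drawing").fetchone()[0]
with open(path, "rb") as f:
    contents = f.read()
print(revision, contents)  # 1 b'revision 2'
```

After the rollback the table says revision 1 while the file holds revision 2, which is the inconsistency that a shared transaction manager is meant to rule out.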
jkrueger@dgis.dtic.dla.mil (Jon) (06/29/90)
moiram@tekcae.CAX.TEK.COM (Moira Mallison) writes:
>Advances in database technology will ideally make the DBMS smarter.
>A BLOB does not. "All I've got is a whole bunch of bytes. I don't
>know what to do with them. You better figure that out." The more
>information that can be stored in the DBMS, the less effort will
>be expended to build applications around it. What is stored in
>the DBMS can be more easily shared and re-used.

Strongly agree. I like to ask people if they would tolerate a DBMS which had no date or date/time data type. What's the problem? You can just use seconds since 1970, right? You don't mind coding conversion routines into each application, do you? Writing parsing and query generation into otherwise trivial programs? OK, how about storing dates in a group of text fields? Then you don't have to do that; you just lose integrity (e.g. dates like "9/9/99" or "2/29/81"), programmer productivity (try selecting on a date range) and performance (try executing the above select). Most people begin to agree that ADT's are a Good Thing for DBMS.

For hard core cases I ask if they would tolerate a DBMS without an integer type. What's the problem? Just store them as bitstrings, you can write your own math routines, right? Integers -- who needs 'em. Also floats -- I can write my own IEEE compliant math into every application. I haven't yet had anyone say that's OK.

Then I like to opine that someday we'll feel the same way about a DBMS without support for user-defined data types. It's all a matter of where we draw the line. Today we accept DBMS that support basic data types important to business. Some day we'll want more. We won't need them for every application, but then every application doesn't need database management either. Finding the right tool for the job will be more straightforward someday. Right now it can be a frustrating exercise in weighing tradeoffs.
-- Jon -- Jonathan Krueger jkrueger@dtic.dla.mil uunet!dgis!jkrueger Drop in next time you're in the tri-planet area!
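The integrity and range-selection points in the post above can be shown with a short sketch. It uses Python's standard `datetime.date` as a stand-in for a DBMS date type; the helper `parse_us_date` and its assumption that two-digit years fall in the 1900s are illustrative only.

```python
from datetime import date


def parse_us_date(text, century=1900):
    """Parse 'M/D/YY' into a date; raises ValueError for impossible dates."""
    month, day, year = (int(part) for part in text.split("/"))
    return date(century + year, month, day)


# Integrity: a real date type refuses a day that does not exist.
for text in ("2/28/81", "2/29/81"):
    try:
        print(text, "->", parse_us_date(text))
    except ValueError as err:
        print(text, "rejected:", err)

# Range selection: text compares character by character, dates by time.
print("10/1/81" < "9/1/81")                                # True (wrong order)
print(parse_us_date("10/1/81") < parse_us_date("9/1/81"))  # False
```

With text fields, every application would have to repeat this parsing and validation before it could select on a date range.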
nigelc@cognos.UUCP (Nigel Campbell) (06/29/90)
In response to D.L. asking about anything else: try Starbase from Cognos or Interbase from Interbase. It has a blob datatype which can also be supported by filters which are described to the database kernel. This allows you to filter one blob to another type, e.g. a compressed blob to an uncompressed blob. This is a DSRI compliant database and is supported by PowerHouse 4GL from us.

-- Nigel Campbell Voice: (613) 783-6828 P.O. Box 9707 Cognos Incorporated FAX: (613) 738-0002 3755 Riverside Dr. uucp: nigelc@cognos.uucp || uunet!mitel!sce!cognos!nigelc Ottawa, Ontario CANADA K1G 3Z4
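The blob filter idea can be sketched roughly as a table of conversions keyed by (from type, to type). This is not the Starbase or Interbase filter API; the `FILTERS` registry, the type names and `apply_filter` are hypothetical, and Python's standard `zlib` module stands in for whatever compression a real filter would use.

```python
import zlib

# Hypothetical registry: (from_type, to_type) -> conversion function.
FILTERS = {
    ("plain", "compressed"): zlib.compress,
    ("compressed", "plain"): zlib.decompress,
}


def apply_filter(blob, from_type, to_type):
    """Convert a blob between two declared subtypes using a registered filter."""
    if from_type == to_type:
        return blob
    return FILTERS[(from_type, to_type)](blob)


original = b"floor plan " * 100
stored = apply_filter(original, "plain", "compressed")
restored = apply_filter(stored, "compressed", "plain")
print(len(stored) < len(original), restored == original)
```

In the product described above the filters are declared to the database kernel, so the conversion would happen inside the engine and not in application code as it does here.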
llojd@rivm.UUCP (Jan Diesel) (06/29/90)
In the discussion concerning the storage of graphical data in rdbms's the (new) Ingres user-defined datatype has been mentioned a few times. As far as I know the maximum size of *any* Ingres datatype/column is 2Kb. In my opinion this is a serious drawback for storing graphical data. ------------------------------------------------------------------- Jan Diesel llojd@rivm.UUCP National Institute for Public Health and Environmental Protection Laboratory for Air Research P.O.Box 1, 3720 BA BILTHOVEN, The Netherlands. -------------------------------------------------------------------
segel@tellabs.com (Mike Segel) (06/30/90)
In article <6207@tekgen.BV.TEK.COM> moiram@tekcae.CAX.TEK.COM (Moira Mallison) writes:
>In article <2873@tellab5.tellabs.com> segel@Tellabs.COM (Mike Segel) writes:
>>> 1) To be able to update the graphical entry of a tuple.
>>>
>> This gets to be tricky. It is not efficient to actually store
>>the image as part of the tuple, instead actually store a pointer to
>>the image.
>
>The problem with this is that you lose one of the important aspects
>of the relational model, i.e. all the data resides in relations.
>Now, some of the attributes hold data, and some hold pointers, and
>it's up to the application to know what to do with the pointers.
>I don't see this as a step forward.
>

Well, the problem arises when you now have each tuple being 1 - 2 Meg in size. So now how do you efficiently sort on non graphic data? Or keep a multituple view in memory? What I am saying is that the database stores a pointer to where the graphical information is stored. For example, I BELIEVE Informix's OnLine stores a pointer to the Blob area inside the database. When the end user / applications programmer goes to fetch the blob, he/she does not see this; he/she will fetch it as if it were part of the tuple. (Again I am assuming this; perhaps someone from Informix will elaborate.) My point is that you are decreasing the performance of your engine if you actually try to store the image as part of the tuple. Whether you let the application see the pointer, or keep it internal to the engine, is up to you.

>>> 3) To be able to compare two fields of type "graphical".
>>> This might be the hardest to define. A Graphics Guru
>>> can be of help here.
>>>
>> This is definitely application dependent. Who knows what you
>>are storing in the BLOB. It could be a voice mail message, or some
>>non-visual binary field.
>
>Advances in database technology will ideally make the DBMS smarter.
>A BLOB does not. "All I've got is a whole bunch of bytes. I don't
>know what to do with them. You better figure that out." The more
>information that can be stored in the DBMS, the less effort will
>be expended to build applications around it. What is stored in
>the DBMS can be more easily shared and re-used.
>

What sort of information is contained in the blob is application dependent. I prefer to take the minimalist approach in designing back-ends. The less intelligent the back-end, the greater the ability to treat data in an abstract fashion. Yes, it puts more emphasis on the 4GL or front-end, but it allows for a wider range of potential applications to be developed. Now, not being an object oriented guru, I thought that if I were designing an OODBMS, I would keep the back-end as simple as possible and let the front end do all the work.

>and its rigor. But you start to lose that when you start storing
>pointers to other objects (get it?) in your attributes.

Not really. How the back-end stores the information should be a black box to the front-end, as long as the front end can get the information in a timely fashion.

>
>Moira Mallison

-Mike Segel
--
Mike Segel         | uunet!balr.com                    | Std.disclaimer
BALR Corporation   | segel@quanta.eng.ohio-state.edu   | implied and
Oakbrook, Illinois | uunet!tellabs.com!segel           | understood
-------------------^-----------------------------------^----------------
ghm@ccadfa.adfa.oz.au (Geoff Miller) (07/02/90)
This discussion started, as I recall, around the problem of a database to store ID images, so maybe someone can answer this question - can acceptable ID images be made up of standard components, as in the "Identikit" system used by the police? If so, then that would provide both simplified storage and a method for comparison of images, but it only works if the resulting images are accurate enough (and looking at the Identikit pictures the police release I would have to wonder about that). Geoff Miller ghm@cc.adfa.oz.au
brianc@labmed.ucsf.edu (Brian Colfer) (07/03/90)
In article <6207@tekgen.BV.TEK.COM> moiram@tekcae.CAX.TEK.COM (Moira Mallison) writes:
>In article <2873@tellab5.tellabs.com> segel@Tellabs.COM (Mike Segel) writes:
>>> 1) To be able to update the graphical entry of a tuple.
>>>
>> This gets to be tricky. It is not efficient to actually store
>>the image as part of the tuple, instead actually store a pointer to
>>the image.
>
>The problem with this is that you lose one of the important aspects
>of the relational model, i.e. all the data resides in relations.
>Now, some of the attributes hold data, and some hold pointers, and
>it's up to the application to know what to do with the pointers.
>I don't see this as a step forward.

This is not true with all BLOB type systems. For example, with Informix-OnLine the engine treats it as if it were stored as part of the tuple, so to the application it doesn't matter... I don't know the actual algorithms in OnLine ... I only know that it doesn't really matter since I can almost treat the engine as a black data box.

>
>>> 3) To be able to compare two fields of type "graphical".
>>> This might be the hardest to define. A Graphics Guru
>>> can be of help here.
>>>
>> This is definitely application dependent. Who knows what you
>>are storing in the BLOB. It could be a voice mail message, or some
>>non-visual binary field.
>
>Advances in database technology will ideally make the DBMS smarter.
>A BLOB does not. "All I've got is a whole bunch of bytes. I don't
>know what to do with them. You better figure that out." The more
>information that can be stored in the DBMS, the less effort will
>be expended to build applications around it. What is stored in
>the DBMS can be more easily shared and re-used.

Non-numeric, non-date/time, and non-ASCII data is not very well defined. It would be interesting to have a system which knew about all the common graphic types, GIF, TIFF, XBM, XWD, CGM, PostScript etc.... and all the important spreadsheet types ... and all the sound storage types ... and ... I think this is better handled in the front end ... there are some free utilities to deal with most of these types which can be integrated into the front end.

>
>> .... Most applications which people say require
>>an OODBMS can be done on a relational DBMS.
>There is a lot that can be done with the relational model, but that
>doesn't mean that the relational model can be all things to all
>people. And some things that can be done with the relational model
>cannot necessarily be done EASILY with the relational model. One
>of the attractive aspects of the relational model is its simplicity
>and its rigor.

OODBMS may be the solution to some of the limitations of current relational database products. But I am not yet convinced. I don't see what advantage OODBMS has over relational databases. For example, one advantage proposed by some OODBMS advocates is the ability to specify methods (data transform procedures) as a part of an object's definition. Less abstractly, but certainly more clearly, one can define such transforms through a system table. The ability of the relational model to accommodate self reference, e.g. system tables, coupled with its inherent algebraic character, makes it the most powerful strategy to deal with data.

--
Brian Colfer           | UC San Francisco        |------------------------|
                       | Dept. of Lab. Medicine  | System Administrator,  |
brianc@labmed.ucsf.edu | S.F. CA, 94143-0134 USA | Programmer/Analyst     |
BRIANC@UCSFCCA.BITNET  | PH. (415) 476-2325      |------------------------|
davek@infmx.UUCP (David Kosenko) (07/03/90)
In article <6207@tekgen.BV.TEK.COM> moiram@tekcae.CAX.TEK.COM (Moira Mallison) writes:
>In article <2873@tellab5.tellabs.com> segel@Tellabs.COM (Mike Segel) writes:
>>> 1) To be able to update the graphical entry of a tuple.
>>>
>> This gets to be tricky. It is not efficient to actually store
>>the image as part of the tuple, instead actually store a pointer to
>>the image.
>
>The problem with this is that you lose one of the important aspects
>of the relational model, i.e. all the data resides in relations.
>Now, some of the attributes hold data, and some hold pointers, and
>it's up to the application to know what to do with the pointers.
>I don't see this as a step forward.
>

Mr. Segel was somewhat incorrect in his description of INFORMIX-OnLine's treatment of BLOB data types in this respect. There is the ability, internal to the database engine, to store actual BLOB data in a physical space separate from the rest of the data. This is done for various reasons, most relating to efficiency of access. What is stored in the actual tuple data, again internal to the database, is a pointer to the location of this BLOB data; however, to the end user, this data is still held in the relation itself. You do not need to know about this internal structure in your applications, and the relational model is maintained. I hope you can see this implementation as a step forward.

>>> 3) To be able to compare two fields of type "graphical".
>>> This might be the hardest to define. A Graphics Guru
>>> can be of help here.
>>>
>> This is definitely application dependent. Who knows what you
>>are storing in the BLOB. It could be a voice mail message, or some
>>non-visual binary field.
>
>Advances in database technology will ideally make the DBMS smarter.
>A BLOB does not. "All I've got is a whole bunch of bytes. I don't
>know what to do with them. You better figure that out." The more
>information that can be stored in the DBMS, the less effort will
>be expended to build applications around it. What is stored in
>the DBMS can be more easily shared and re-used.
>

Yes, and BLOBs do fit this criterion. The ability to store arbitrarily large data items in a relation, and have them accessible to all who can query that relation does, in many ways, "make the DBMS smarter."

Dave Kosenko

n e w s f o d d e r
--
Disclaimer: The opinions expressed herein  | There's more than one answer
are by no means those of Informix Software | to these questions pointing me
(though they make you wonder about the     | in a crooked line...
strange people they hire).                 |
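The distinction drawn above, a pointer that exists internally but is never seen by the application, can be imitated in a small sketch. This says nothing about how INFORMIX-OnLine actually lays out its storage; it uses SQLite through Python's standard library, with a view playing the role of the relation the user sees, and all table, view and column names are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- "Internal" storage: small rows plus a separate blob area.
    CREATE TABLE employee_row (id INTEGER PRIMARY KEY, name TEXT, photo_ref INTEGER);
    CREATE TABLE blob_space (ref INTEGER PRIMARY KEY, data BLOB);

    -- What the user queries: the photo appears to be an ordinary column.
    CREATE VIEW employee AS
        SELECT r.id, r.name, b.data AS photo
        FROM employee_row r LEFT JOIN blob_space b ON b.ref = r.photo_ref;
""")
conn.execute("INSERT INTO blob_space (ref, data) VALUES (1, ?)", (b"photo-bytes",))
conn.execute("INSERT INTO employee_row (id, name, photo_ref) VALUES (1, 'alice', 1)")
conn.execute("INSERT INTO employee_row (id, name, photo_ref) VALUES (2, 'bob', NULL)")

rows = conn.execute("SELECT name, photo FROM employee ORDER BY name").fetchall()
print(rows)  # [('alice', b'photo-bytes'), ('bob', None)]
```

Queries against `employee` never mention `photo_ref`, and a row without a photo costs only a NULL reference, not space reserved for a full image.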
jkrueger@dgis.dtic.dla.mil (Jon) (07/05/90)
segel@tellabs.com (Mike Segel) writes:
>Well, the problem arises when you now have each tuple being 1 - 2 Meg
>in size. So now how do you efficiently sort on non graphic data?

By using appropriate storage structures, access methods, sort algorithms, query decomposition (might reduce rows to be sorted based on other query terms, even eliminate the need for sort if 0 or 1 returned row).

>Or keep a multituple view in memory?

Not a problem. Perhaps a solution to some other, unstated, problem. Besides "which memory", etc., e.g. how to distribute over net.

>You are decreasing the performance of your engine if you actually try
>to store the image as part of the tuple.

Neither true nor false. Which usually indicates you aren't honing in on the right issues. "ADT's don't kill performance, people who don't know how to use them kill performance" :-)

>I prefer to take the minimalist approach in designing
>back-ends. The less intelligent the backend, the greater ability
>to treat data in an abstract fashion.

What would be an example of this? To weigh against all the counter-examples.

-- Jon
--
Jonathan Krueger jkrueger@dtic.dla.mil uunet!dgis!jkrueger
Drop in next time you're in the tri-planet area!
segel@tellabs.com (Mike Segel) (07/09/90)
[As a point of clarification, I am talking about DB internal mechanisms for handling Blobs. Anyone confused by this discussion should sit back and think for a while. Please do not use my discussion as an opinion on the performance of any database or database company. HAPPY DAVE? ;-]

In article <913@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>segel@tellabs.com (Mike Segel) writes:
>
>>Well, the problem arises when you now have each tuple being 1 - 2 Meg
>>in size. So now how do you efficiently sort on non graphic data?
>
>By using appropriate storage structures, access methods, sort
>algorithms, query decomposition (might reduce rows to be sorted based
>on other query terms, even eliminate the need for sort if 0 or 1
>returned row).
>

Jon, you are missing the point. By keeping the Blob as part of the tuple, you have now a tuple of 2Meg (+- rest of tuple) in width. So each of your records is now 2Meg in width. How many tuples of this size can you sort/query etc., and how efficiently? How much memory is now required? (How much swap?) Let's say you build up a cursor and it contains 3 rows. That could be around 6 Meg of memory/swap. Not too efficient, since 2 Meg of that data per row is not required for any join or relational function. (Yet...) As well as the fact that not all tuples will have a blob attached, but will have to have space allocated for a blob.

All of these problems are reduced when you just have a pointer as part of the tuple. The pointer can point to the Blob storage area of a raw disk (Informix OnLine), or to a file or directory in Unix.

At the end of last year, Silicon Graphics announced a database product which allowed for the storage of Blobs based on Informix's Standard Engine. How could they do this? Having not seen the package, or source code, I can only conjecture on the following.... The front end I believe was in X, or some other windowing environment. So it could have been written in ESQL/C. The storage of the blobs could be that they store all the blobs as a single file, or a directory of multiple files. (I would think that multiple files is better.) Then the only question would be: how do they perform locking? This is fairly straightforward and has been published in several books. Then the tuple need only contain an FD to the graphic blob. Of course there are some other potential problems which are irrelevant to this discussion.

The point is, they (SG) and Informix are providing the ability of ADT's by allowing for Blobs. I think back that the discussion evolved from trying to allow for ADT's like graphical images, sounds, text, or various other fields. This can all be accomplished within the relational model.

>
>Neither true nor false. Which usually indicates you aren't honing
>in on the right issues. "ADT's don't kill performance, people
>who don't know how to use them kill performance" :-)
>

Yeah. It's like tuning the back-end to gain performance when a series of code reviews and a rethink of the specs would do more good ;-)

>>I prefer to take the minimalist approach in designing
>>back-ends. The less intelligent the backend, the greater ability
>>to treat data in an abstract fashion.
>
>What would be an example of this? To weigh against all the
>counter-examples.
>

Simple. Take the idea of a blob. In Informix, it is a byte stream of up to 2 Meg. Now, with this simple type, you can allow for a database to contain images, voice/sound, or any other data which is, in its simplest form, digital information. Now Informix also allows for a varchar and I believe another type of stream. (I need to go back and check, so don't flame me if I am wrong.) What they have done is to have the back end define the basic building blocks which will allow for other ADT's. This is great for certain applications.

One example is the real estate demo Informix uses for OnLine. They have this demo done twice: once using Sunview on a Sun workstation and a CD-ROM device, and the other on a Mac running Wingz. (Another fine product from Informix ;-) Now, both show you a raster of the house, and different views. How is it stored? How can you take a raster/GIF picture which is required for two or three different machines, and store it in the DB? You could create an ADT for each raster/image format, but that means storing the photograph in the DB several times. Or you could separate the header information from the blob, then have the front end application, based on the machine, reassemble the image in the correct format from the blob and the header information. So now your front-end application needs to be a little smarter, yet your back-end is capable of supporting various front ends without having to be modified.

My point is that the DB backend should be storing the data in its simplest components rather than trying to handle data in its more complex forms.

>-- Jon

- Mike (" I am no expert. No one pays me for my opinions" ;-)
--
Mike Segel         | uunet!balr.com                    | Std.disclaimer
BALR Corporation   | segel@quanta.eng.ohio-state.edu   | implied and
Oakbrook, Illinois | uunet!tellabs.com!segel           | understood
-------------------^-----------------------------------^----------------
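The "separate the header from the blob and let the front end reassemble it" approach might look roughly like the sketch below. It assumes the database holds only width, height and raw 8-bit grayscale pixels, and uses the binary (P5) and plain (P2) PGM formats as two example target formats; the function names are hypothetical and the real demo's formats are not known from the post.

```python
def to_pgm(width, height, pixels):
    """Assemble a binary PGM (P5) image from stored dimensions and raw pixels."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match dimensions")
    return b"P5\n%d %d\n255\n" % (width, height) + pixels


def to_ascii_pgm(width, height, pixels):
    """Assemble a plain PGM (P2) image from the same stored components."""
    if len(pixels) != width * height:
        raise ValueError("pixel count does not match dimensions")
    rows = [
        " ".join(str(p) for p in pixels[r * width:(r + 1) * width])
        for r in range(height)
    ]
    header = "P2\n%d %d\n255\n" % (width, height)
    return (header + "\n".join(rows) + "\n").encode("ascii")


# What the back end would hand over: format-neutral components, stored once.
stored = {"width": 2, "height": 2, "pixels": bytes([0, 255, 255, 0])}
print(to_pgm(**stored))
print(to_ascii_pgm(**stored))
```

Each front end picks the assembler that suits its platform, so the photograph itself is stored only once.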
jkrueger@dgis.dtic.dla.mil (Jon) (07/10/90)
segel@tellabs.com (Mike Segel) writes:

>Jon, you are missing the point. By keeping the Blob as part of the tuple,
>you have now a tuple of 2Meg (+- rest of tuple) in width.

No. It appears that way to queries that don't project fewer columns, though. The old virtual/transparent distinction. How the engine manages resources like disk storage isn't visible to folks that send queries to the engine. Nor is how the engine selects or sorts on the large columns -- could be that it trims trailing whitedots, uses G3 compression, uses sparse matrix algorithms, etc. And of course avoiding exhaustive scans of large columns isn't any different in principle from avoiding exhaustive scans of many rows. It is harder to implement, however; witness that no commercial product of which I'm aware provides lazy fetching of columns.

>As well as the fact that not all tuples will have a blob attached, but will
>have to have space allocated for a blob.

No. Look at how VM algorithms work. Empty cols can cost small fixed allocations. Instances of BLOBs can cost in proportion to their contents. This even presumes that tables and cols still appear to be fixed length, which isn't a hard requirement; they could expand arbitrarily to fit their contents, too. But even without that they can appear as fixed length while being implemented in cheaper ways. Trailing whitespace compression for text cols works this way now.

>All of these problems are reduced when you just have a pointer as part of
>the tuple. The pointer can point to the Blob storage area of a raw disk
>(Informix Online), or to a file or directory in Unix.

If the pointer appears different from ordinary objects to the user, you lose the simplicity and safety of the data model. If not, why call it a pointer? Also it's unlikely the overhead of the UNIX filesystem is going to be your bottleneck. Work smarter!
Don't expect your image database to get its best performance from the highest bandwidth; some engines will use the processor to avoid some of those bit copies.

>The point is, they (SG) and Informix are providing the ability of ADT's
>by allowing for Blobs.

But ADT's have nothing to do with BLOBs. Nothing. Consider bignums, arbitrary precision floats, etc. ADT's have everything to do with defining operations on objects of that type, and preventing access to and manipulation of objects of that type via other means than the defined operations.

>I think back that the discussion evolved from
>trying to allow for ADT's like graphical images, sounds, text, or various
>other fields. This can all be accomplished within the relational model.

Correct. But these are just the sexy data (lit. and fig., in some cases :-) Consider what it would mean to scientific and engineering folk to have a database with a numeric type that doesn't overflow.

>...Sunview on a Sun workstation and a CD-ROM device, and the other on a Mac
>running Wingz. Now, both show you
>a raster of the house, and different views. How is it stored? How can you
>take a raster/gif/picture which is required for two or three different
>machines, and store it in the DB? You could create an ADT for each
>raster/image format, but that means storing the photograph in the DB
>several times. Or you could separate the header information from the blob,
>then have the front end application, based on the machine, reassemble the
>image in the correct format and the header information.

You're confusing two, no three issues here. One is remote data access, one is defining families of image data types, and one is database design.

In times of old this was known as the incompatible subroutine library problem -- different floating point formats at different precisions, and overlapping sets of routines for each format, but worse yet specific to the language, operating system, and processor you call them from.
Calling sequence, don't you know. It took twenty years to standardize on IEEE floats, write mostly portable libraries, and arrange for common (or at least sane) linkage schemes. It is now possible to write programs that use floats and expect them to behave the same way across a wide variety of machines. It is *still* not possible to share binary floats between programs running on different machines in a net -- without conversion, which is the point. That's a separate problem. And yet a third problem is designing good databases to get shared.

In the example you cite, the right solution is to design an image type appropriate for the pictures of houses you have, define this type to your database engine, populate the database, provide common access to your network of (different) machines, and decide on a common model for painting the pictures on different display hardware. Notice how little of this is a database problem. Also notice that we still want the engine to understand the image data type, not each front end.

>My point is that the DB backend should be storing the data
>in its simplest components rather than trying to handle data in its more
>complex forms.

But *floats* are complex. Very. Interpretation of bitmapped images should learn from this. We have and can examine engines that share floats among different hardware architectures. They do *not* do this by keeping the engine ignorant of what the bits in a float mean. They do not do this by letting each front end decide what floats mean to it. Access to had better not equal subversion of.

-- Jon
--
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Drop in next time you're in the tri-planet area!
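
The point about sharing floats only "with conversion" can be sketched in a few lines of Python. This is an illustration under stated assumptions, not a description of any particular engine: the wire form chosen here (an 8-byte IEEE 754 double in big-endian order) and the names encode_float and decode_float are hypothetical. The sketch shows one party that knows what the bits mean doing the conversion, and what a reader gets when it guesses at the byte order instead.

```python
import struct


def encode_float(value: float) -> bytes:
    """Engine side: convert a native float to one agreed wire form.

    Assumption for this sketch: 8-byte IEEE 754 double, big-endian.
    """
    return struct.pack(">d", value)


def decode_float(wire: bytes) -> float:
    """Receiving side: convert the agreed wire form back to a native float."""
    (value,) = struct.unpack(">d", wire)
    return value


wire = encode_float(1.0)
round_trip = decode_float(wire)

# A reader that guesses the byte order instead of using the agreed form
# gets a different number from the very same bits.
(misread,) = struct.unpack("<d", wire)
```

Here round_trip equals 1.0, while misread does not, although both were read from the same eight bytes. The same argument carries over to image types: the bits are only shareable when their interpretation is defined in one place.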