[comp.databases] INGRES/EQC and SQL/C ...and general notes...

berlin@bu-cs.BU.EDU (David K. Fickes) (08/25/88)

We're running INGRES 5.1 on our suns and I'm doing a bit 
of programming using EQUEL/C ... Someone has suggested that our
applications be written in SQL/C for "portability." Is this
simply a pipe dream or might it actually be possible to transfer
something written with the Ingres C / SQL preprocessor and move
it to another system using say.. Informix/C ?? 

One other question. What's so hard about supporting NESTED EQUEL
statements in the EQUEL/C processor???  I've got some code that looks like 
this:

## ingres "einstein"
## range of d is dukasfin3
## retrieve (c_uid=d.uid) where d.date=" " and
##      squeeze(d.doc_date)="00/00/??";
##{
##      retrieve (c_year=right(squeeze(d.doc_date),2)) where d.uid=c_uid;
##      replace d(date=date(01/01/c_year)) where d.uid=c_uid;
##}


Besides the basic point of .. "this could be done a WHOLE LOT BETTER" this
is a nice painless way of stating for each record that qualifies
change it to X...

But without the nesting you have to jump a few more loops.. it seems
to me to be a rather OBVIOUS feature thats been left out?  Why.. is
it a technically difficult thing to do .. or is RTI just not doing
the obvious?  

- david-- 
==============================================================================
David K. Fickes     Center for Einstein Studies/Einstein Papers Project
UUCP: ...harvard!bu-it!berlin			Boston University 
OTHERWISE: berlin@bu-it.bu.edu			745 Commonwealth Avenue
PHONE:	(617) 353-9249	(617) 277-9741		Boston, MA 02215      
				 

wong@llama.rtech.UUCP (J. Wong) (08/26/88)

In article <24546@bu-cs.BU.EDU> berlin@buita.bu.edu (David Fickes- Einstein Project) writes:
>
>One other question. What's so hard about supporting NESTED EQUEL
>statements in the EQUEL/C processor???  I've got some code that looks like 
>this:
>
>## ingres "einstein"
>## range of d is dukasfin3
>## retrieve (c_uid=d.uid) where d.date=" " and
>##      squeeze(d.doc_date)="00/00/??";
>##{
>##      retrieve (c_year=right(squeeze(d.doc_date),2)) where d.uid=c_uid;
>##      replace d(date=date(01/01/c_year)) where d.uid=c_uid;
>##}
>

Actually, this can be written as a single QUEL REPLACE statement!

##	replace d (date = date("01/01/"+right(squeeze(d.doc_date),2)))
##		where d.date=" " and squeeze(d.doc_date)="00/00/??";
J. Wong		wong@rtech.com
****************************************************************
S-s-s-ay!

wong@rtech.rtech.com (J. Wong) (08/26/88)

In article <24546@bu-cs.BU.EDU>, berlin@bu-cs.BU.EDU (David K. Fickes) writes:
> 
> One other question. What's so hard about supporting NESTED EQUEL
> statements in the EQUEL/C processor???  I've got some code that looks like 
> this:
> 
> ## ingres "einstein"
> ## range of d is dukasfin3
> ## retrieve (c_uid=d.uid) where d.date=" " and
> ##      squeeze(d.doc_date)="00/00/??";
> ##{
> ##      retrieve (c_year=right(squeeze(d.doc_date),2)) where d.uid=c_uid;
> ##      replace d(date=date(01/01/c_year)) where d.uid=c_uid;
> ##}
> 

Actually, a this can be done using a single QUEL REPLACE statement.

##	replace d (date = date("01/01/"+right(squeeze(d.doc_date),2)))
##		where d.date=" " and squeeze(d.doc_date)="00/00/??";

berlin@bu-cs.BU.EDU (David K. Fickes) (08/27/88)

Yes a simpler QUEL statement will do it .. BUT ONLY IF ALL OF THE RESULTS
ARE A CORRECT DATE... take the string "00/00-02/89?" and put it through our two
statements as one of 40,000 records ...  
(if INGRES would allow the nesting...) 
and the one transaction would fail... not a problem unless you have 39,999
other records that would be changed.

If I use my statement 39,999 would change and I could 
easily find the one other...

In yours... the transaction fails on an UNKNOWN record and NO 
RECORDS ARE CHANGED leaving me to write more code to TRAP the one failure...

The only other real way is to work with the TID... or the UNIQUE
ID NUMBERS (d.uid) which I maintain with our records...  Now back 
to the original question

WHY CAN'T WE NEST THE DARN QUEL STATEMENTS?


In article <24546@bu-cs.BU.EDU>, berlin@bu-cs.BU.EDU (David K. Fickes) writes:
> 
> One other question. What's so hard about supporting NESTED EQUEL
> statements in the EQUEL/C processor???  I've got some code that looks like 
> this:
> 
> ## ingres "einstein"
> ## range of d is dukasfin3
> ## retrieve (c_uid=d.uid) where d.date=" " and
> ##      squeeze(d.doc_date)="00/00/??";
> ##{
> ##      retrieve (c_year=right(squeeze(d.doc_date),2)) where d.uid=c_uid;
> ##      replace d(date=date(01/01/c_year)) where d.uid=c_uid;
> ##}
> 

Actually, a this can be done using a single QUEL REPLACE statement.

##	replace d (date = date("01/01/"+right(squeeze(d.doc_date),2)))
##		where d.date=" " and squeeze(d.doc_date)="00/00/??";
	

-- 
==============================================================================
David K. Fickes     Center for Einstein Studies/Einstein Papers Project
UUCP: ...harvard!bu-it!berlin			Boston University 
OTHERWISE: berlin@bu-it.bu.edu			745 Commonwealth Avenue
PHONE:	(617) 353-9249	(617) 277-9741		Boston, MA 02215      
				 

jkrueger@daitc.daitc.mil (Jonathan Krueger) (08/28/88)

In article <24585@bu-cs.BU.EDU> berlin@buita.bu.edu
(David Fickes, Einstein Project) writes (some material trimmed):
>Yes a simpler QUEL statement will do it .. BUT ONLY IF ALL OF THE RESULTS
>ARE A CORRECT DATE... take the string "00/00-02/89?" and put it through our two
>statements as one of 40,000 records ...  
>WHY CAN'T WE NEST THE DARN QUEL STATEMENTS?

You're confusing three issues:

[1]	Why can't I tell INGRES to relax atomicity of single statement
	transactions?

[2]	Why can't type conversion functions return an error value flagging
	problem arguments, rather than or as well as aborting the
	transaction?

[3]	Why can't EQUEL provide more convenient iteration constructs,
	in particular beyond those offered by QUEL?  This would
	allow the programmer to work around problems such as the above,
	and would also be useful for their own sake.

Issues [1] and [2] have been discussed before in this forum and
elsewhere.  I can't think of any way for vendors to allow [1] without
also allowing you to subvert the relational model.  Then when our
programs did strange and difficult to predict things, Ed and Chris
would say "We told you so" and there would be great wailing and
gnashing of teeth throughout the land.

But I think [2] could be implemented cleanly.  However,

Major thesis: we don't want special values in each data type to
indicate illegal, similar to IEEE "not a number" values for floating
point types.  The reason is that they can't avoid the extra complexity
and we can.  If a column is of type date, we want to know that all
values are legal dates.  Neither do we want a generic untyped BAD flag
similar to MISSING.  Instead of flagging a mistake they would store it
in the database, in effect gradually corrupting it.  MISSING values,
while also adding complexity, at least can be traced back through an
audit trail.  They aren't derived from functions of other values,
they're just set or cleared.  Type conversion failures can be high
order side effects very difficult to trace.  If we find a broken window
MISSING allows us to find the fingerprints on the rock, BAD just tells
us the time and proximate cause.

Minor thesis: type conversion functions can't flag conversion failures,
since they can only return successes.  Type conversion functions return
objects of the target type.  By the major thesis, there are no
predefined "bad" values for each type.  Of course, the functions could
be redesigned to also return status codes, but the assignment syntax
for multiple return values would require great changes to the language,
might not even permit upward compatibility of current programs.  Even
if this could be done, you're still calling the function in different
way, the status code becomes an argument in the where clause instead of
the object being a source value in the target list.

Therefore: to handle type conversion failure cleanly, you would have
to perform separate tests using different functions (or the same
functions called in different ways).  If you don't perform the tests
to avoid "falling into" the type conversion failure, we still abort
the entire transaction.  So for bulletproof applications, you write
extra code.  E.g., for an interactive application:

	/* find the guys that won't convert */
	retrieve (d.doc_date)
		where d.date=" " and squeeze(d.doc_date)="00/00/??"
		and isdate("01/01/"+right(squeeze(d.doc_date),2)) = 0

For an automated application:

	/* don't do possible updates without flagging impossible ones */
	begin transaction
	/* do the possible */
	replace d (date = date("01/01/"+right(squeeze(d.doc_date),2)))
		where d.date=" " and squeeze(d.doc_date)="00/00/??";
		and isdate("01/01/"+right(squeeze(d.doc_date),2)) = 1
	/* flag the impossible */
	retrieve into problem_d (d.all)
		where d.date=" " and squeeze(d.doc_date)="00/00/??"
		and isdate("01/01/"+right(squeeze(d.doc_date),2)) = 0
	end transaction

Now, if we agree that relaxing atomicity [1] is a bad idea, then
providing better iteration constructs [3] is neither necessary nor
sufficient to answer David's original question.  He still needs some
way of deciding what to do inside his nested loop when an individual
replace fails.  He has to be able to distinguish type conversion
failures from any other update failure, e.g. violating an integrity
constraint on d.date.  But solving problem [2] is necessary and
sufficient, as shown above.  Therefore, while [3] would be great for
other reasons, it's not appropriate here.

Finally, looking toward the day that commercial dbms support user
defined abstract data types, their type definition facility must
provide a way to define conversion functions and also functions to
check for conversion failure.  E.g. if I define a domain "color"
drawing from {red,orange,yellow,gree,blue,indigo,violet}, ordered low
to high as shown left to right, I should be able to query:

	/* print the observations by color where it's known */
	retrieve (colorful_obs = color(right(squeeze(d.string),2)))
		where iscolor(right(squeeze(d.string),2)) = 1
		sort by colorful_obs

-- Jon
-- 
Jonathan Krueger  uunet!daitc!jkrueger  jkrueger@daitc.arpa  (703) 998-4777

Inspected by: No. 15

davek@rtech.rtech.com (Dave Kellogg) (08/31/88)

In article [...] berlin@buita.bu.edu (David Fickes Einstein Project) writes:
[quoting himself from an earlier posting]
>> 
>> One other question. What's so hard about supporting NESTED EQUEL
>> statements in the EQUEL/C processor???  

This really isn't the pre-processor's choice.  It's giving you a warning 
when you nest a statement within a retrieve loop only because it was told
to.  And it was told to simply because retrieve loops weren't designed
to support that construct.

What, then, are they designed to do?  The answer is that the EQUEL retrieve
loop was designed (and this is more of a DBMS issue than an pre-processor
issue) as a high-speed "portal" through which you can extract multiple
rows from the database.  The "high-speed" part is true because retrieve 
loops are implemented to return lots of rows in user-tuneable "pipeblocks"
from the back-end to the front-end.  In most cases, even if your front-end
code isn't ready to accept another row, the back-end (or DBMS) will be 
finding rows, and filling up pipeblocks full of data to send to the front-end.

Thus, the back-end won't be looking for (or expecting) another DBMS 
statement until you either get all the rows, or issue a ## endretrieve
to stop the processing.  It *can* (architecturally, no flames please)
get data back faster than other approaches
(discussed below) because it's job is to get data and send it back to
the front-end.  Thus, given it's fairly narrow duty, it does it pretty
fast.

All this springs from the issue that QUEL and SQL are 'set-oriented' 
languages.  QUEL and SQL commands generally affect sets of tuples.  
RETRIEVE/SELECT statements can return 0, 1, or a whole bunch of rows.
Similarly, a single UPDATE statement can change 0, 1 or a whole bunch
of rows.

Programming languages, on the other hand, tend to be more 'record-
oriented.' Thus the problem becomes interfacing a record-oriented
language with a set-oriented one.  Two approaches are:

	1. RETRIEVE-loops
	2. CURSORs

Retrieve loops were invented in QUEL and a code fragment best shows the 
idea.

	## retrieve ( name = emp.name )
	## {

		printf("Name is now:  %s\n", name);

	## }

That is, once for every row that comes back from the DBMS, the code
between the ## { and the ## } will be executed.

Cursors came with SQL and the idea is different.  With cursors you 
'declare' a named cursor which will run a query.  You then 'open' 
the cursor which conceptually sets the cursor to be pointing right 
before the first row to be returned.  Finally, you issue [multiple]
'fetch'es from the cursor and after each fetch a row is returned and
the conceptual pointer is moved forward to the next qualifying row.
Since you can have multiple cursors open at one time, and you can 
update and delete the current row of a cursor, you get the nesting
effect that David desired.

In pseudo ESQL (because this is getting long)

	EXEC SQL DECLARE FRED CURSOR FOR [.. select statement ..]
	EXEC SQL OPEN FRED

	while ( rows_keep_coming_back )
	{
		EXEC SQL FETCH [.. next row from the cursor ..]
	}

Thus, *depending on the implementation* (no flames please), retrieve-
loops can get data faster than cursors because retrieve-loops
send back 'chunks' of data and not rows.  Cursors, *conceptually*,
send back rows, and if implemented "literally" will cause more inter-process
communication for DBMS's (like most) where the front-end and back-end
are separate processes. 

In any case, with INGRES 6 QUEL users will get the best of SQL (cursors
will be supported in QUEL) and SQL users will get best of QUEL because
select-loops will be supported in SQL.  In version 5, each construct
is found only in its native habitat (or language).



Sorry this was so long.

David Kellogg
Relational Technology (INGRES) New York City
davek@rtech.rtech.com

"Opinions above are mine and not necessarily those of my employer"

DMasterson@cup.portal.com (09/01/88)

In message <2407@rtech.rtech.com>, davek@rtech.rtech.com writes:
>In article [...] berlin@buita.bu.edu (David Fickes Einstein Project) writes:
>[quoting himself from an earlier posting]
>>> 
>>> One other question. What's so hard about supporting NESTED EQUEL
>>> statements in the EQUEL/C processor???  
>
>This really isn't the pre-processor's choice.  It's giving you a warning 
>when you nest a statement within a retrieve loop only because it was told
>to.  And it was told to simply because retrieve loops weren't designed
>to support that construct.
>
Baloney!  Perhaps Ingres's retrieve loops were designed that way, but that
need not be a restriction.  Britton-Lee had nested retrieve loops in their IDL
preprocessor for the IDM 4 years ago.  This language looks very much like Quel
for Ingres (it was designed by Dr. Epstein who did much of the work on UCB's
Ingres).

>What, then, are they designed to do?  The answer is that the EQUEL retrieve
>loop was designed (and this is more of a DBMS issue than an pre-processor
>issue) as a high-speed "portal" through which you can extract multiple
>rows from the database.  The "high-speed" part is true because retrieve 
>loops are implemented to return lots of rows in user-tuneable "pipeblocks"
>from the back-end to the front-end.  In most cases, even if your front-end
>code isn't ready to accept another row, the back-end (or DBMS) will be 
>finding rows, and filling up pipeblocks full of data to send to the front-end.
>
In a proper configuration, there is nothing to say that nested retrieve loops
couldn't use multiple "pipeblocks".  Only allowing one retrieve loop to be
active prevents users of applications from requesting more info about
something (retrieve details where detail.key = user input) on the fly.  This
causes the application programmer to do ineffecient things to allow this to
happen (join and retrieve ahead of time).

>[lines with bad premise deleted]

You said some pretty flame worthy things there and admitted it yourself.  It
sounded very bias and not a very good answer to the question.

>David Kellogg
>Relational Technology (INGRES) New York City
>davek@rtech.rtech.com

David Masterson
DMasterson@cup.portal.com