[comp.databases] data base client/server communications

jkrueger@dgis.dtic.dla.mil (Jon) (02/06/90)

dlw@odi.com (Dan Weinreb) writes:

>Your reply and Mike Olson's both explain why it's a good idea to keep
>the network communication code distinct from the rest of the code, and
>that all makes a lot of sense.  But I don't see why they need to run
>in different Unix processes.

Need to?  If one has separation mechanisms one is wise to use them.
We don't need to run software at all, or have online databases.  If
we choose to, and also aim at reliable systems and correct data, why
refuse to use available mechanisms?  Plus, anyone who is serious about
security needs to prevent direct access to data of any sort.

>It seems that the network communications
>code should be a library, and there would be various different versions
>for the different environments; you'd just link together the pieces
>that you need into a single executable and run it in one process.

This option is always available.  The option to separate raw access
to data into a separate trusted process is only available if the
operating system supports it.  Fortunately, most do.

>Perhaps your answer to this is "because we want to avoid the need to
>re-QA, etc".  But it seems to me that when you run the basic database
>engine with a different network protocol, this should be QA'ed whether
>or not the new network code is linked into the same executable, or in
>a separate executable and a separate process.  That is, whether or not
>you use a distinct process would seem to have no bearing on the need
>for QA.

QA is not a magic wand that somehow spots bugs or ferrets out badly
designed or implemented software.  QA is not a philosopher's stone that
transforms less reliable code into more reliable.  You're using "QA" as
if it allowed one to attain equally reliable software regardless of
tools used.  On the contrary, QA consists in large part of finding out
what the relevant tools are and insisting that people use them.  The QA
analyst would be very interested in why one would throw out the
protection afforded by separate processes.

QA is not application of tests: that's QC.  Your QC folk will happily
apply the same tests to the function call model as to the two-process
model.  If they get the same results, they will either pass or fail
both.  When they behave differently in the field the QC staff will
(quite validly) point out that it's not their fault: they did their
job.  It's QA's job to point out that one thing we've learned about
software is that absence of bugs isn't presence of quality.  (The
bad news is you still have to test: it's necessary but not sufficient).

Also, QA is not a verb: one does not "QA network protocols" nor does
one "re-QA" anything.  You're hardly alone in this usage, but it's
wrong.  You're turning an acronym into a transitive verb, forgetting
that it already has a verb: assure and its object: quality.  One can
assure quality: one cannot "QA" anything.  My personal belief is that
this usage tends to mislead the hearer into thinking QA is magic.

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
The Philip Morris Companies, Inc: without question the strongest
and best argument for an anti-flag-waving amendment.

paulf@lamont.ldgo.columbia.edu (paul friberg) (02/08/90)

Thanks to all who have replied to the questions regarding
client/server data base communications. Many of the responses have
been quite helpful.

One issue that still bothers me is the creation of intermediary
communications servers:

Jon Krueger (jkrueger@dtic.dla.mil) writes:
>> dlw@odi.com (Dan Weinreb) writes:
>> Your reply and Mike Olson's both explain why it's a good idea to keep
>> the network communication code distinct from the rest of the code, and
>> that all makes a lot of sense.  But I don't see why they need to run
>> in different Unix processes.

> Need to?  If one has separation mechanisms one is wise to use them.
> We don't need to run software at all, or have online databases.  If
> we choose to, and also aim at reliable systems and correct data, why
> refuse to use available mechanisms?  Plus, anyone who is serious about
> security needs to prevent direct access to data of any sort.

Some questions about your reply.  Aside from the point made by Mike Olson
(about separating the protocol code from the application for compiling
reasons, i.e. his 6 x 50 example), why use a separation mechanism if
you are going to give up performance? Why not use a library for the
network code? What does security have to do with a separation mechanism
anyway? Can't that be implemented straight up through a library just
the same?  I personally feel that efficiency should rule on the
side of performance.

Paul Friberg
 @ Lamont-Doherty Geological Observatory of Columbia Univ.
----------------------------------------------------------------------
Inet:   paulf@lamont.ldgo.columbia.edu
Analog: (914) 359-2900 x620

mao@eden (Mike Olson) (02/09/90)

In message <2097@lamont.ldgo.columbia.edu>,
Paul Friberg (paulf@lamont.ldgo.columbia.edu) writes

> One issue that still bothers me is the creation of intermediary
> communications servers:
>
> Jon Krueger (jkrueger@dtic.dla.mil) writes:
>> Need to?  If one has separation mechanisms one is wise to use them.
>> We don't need to run software at all, or have online databases.  If
>> we choose to, and also aim at reliable systems and correct data, why
>> refuse to use available mechanisms?  Plus, anyone who is serious about
>> security needs to prevent direct access to data of any sort.
>
> why use a separation mechanism if you are going to give up performance?
> Why not use a library for the network code? What does security have to
> do with a separation mechanism anyway?

i expect you could argue that separate processes get some security benefits
from the kernel in that they don't share an address space.  as a practical
matter, i don't personally know of any vendors who implement comm protocol
servers for security reasons.

one reason that it does happen is for sharing.  arbitrarily many client
processes can open up connections to a single comm server process; they
don't all need to link identical versions of the library.  the comm server
talks over the net to its counterpart, which talks directly to the database
server.

with shared libraries, this is becoming a less compelling reason for the
design, but there are still lots of systems that don't support shared libs.

performance is sort of a funny metric; it's not always the case that a
design that appears slow really will be.  using a comm server is going to
involve at least one extra copy of data coming back from the database server,
but then, a library might, as well.  if you can shovel data over sockets (or
whatever they're called on other systems) fast enough, the server may not be
all that expensive.  in any case, the bottleneck almost always appears on the
wire, and not in the machine at either end of it.  in that respect, delays
imposed by using a separate process for communications are going to get lost
in the noise.
					mike olson
					postgres research group
					uc berkeley
					mao@postgres.Berkeley.EDU
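
The extra copy described above can be sketched in a few lines.  This is
only an illustration under stated assumptions: it is written in Python,
the relay() function and the socket names are hypothetical, and a thread
stands in for the separate comm server process so that the example stays
self-contained.  Each chunk is received once and sent once by the relay,
which is the additional copy the comm server design pays for.

```python
import socket
import threading

def relay(src, dst, bufsize=4096):
    """Hypothetical comm-server loop: copy bytes from src to dst until src closes."""
    while True:
        chunk = src.recv(bufsize)   # the one extra copy made by the comm server
        if not chunk:
            break
        dst.sendall(chunk)
    dst.close()                     # propagate end-of-stream to the far side

# client <-> comm server, and comm server <-> database server
client_end, server_in = socket.socketpair()
server_out, db_end = socket.socketpair()

worker = threading.Thread(target=relay, args=(server_in, server_out))
worker.start()

client_end.sendall(b"retrieve (emp.name)")
client_end.close()                  # lets the relay see end-of-stream

received = b""
while True:
    chunk = db_end.recv(4096)
    if not chunk:
        break
    received += chunk

worker.join()
server_in.close()
db_end.close()
```

Whether that copy matters in practice depends on how it compares with
the time spent on the wire, which this sketch does not measure.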

jkrueger@dgis.dtic.dla.mil (Jon) (02/10/90)

paulf@lamont.ldgo.columbia.edu (paul friberg) writes:

>why use a separation mechanism if you are going to give up performance?
>...I personally feel that efficiency should rule on the 
>side of performance.

If performance is more important to you than correctness, safety,
robustness, security, then skip any and all protection mechanisms
you like.  Go ahead.  Make your day :-)

>Why not use a library for the network code?

You use a library either way.  The only thing separate processes
add (as pertains to this discussion) is isolation of errors.  If
you don't need that, skip it.
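
A minimal sketch of that isolation, in Python rather than anything used
in this thread.  The names are assumptions made for illustration:
CRASHING_COMM_CODE stands in for buggy network code, and
run_in_separate_process() is a hypothetical helper, not part of any
database product.  The child process aborts; the parent sees only a
nonzero exit status and keeps running.

```python
import subprocess
import sys

# Hypothetical stand-in for buggy network code: it dies abruptly.
CRASHING_COMM_CODE = "import os; os.abort()"

def run_in_separate_process(source):
    """Run `source` in a child interpreter and return its exit status."""
    result = subprocess.run(
        [sys.executable, "-c", source],
        stderr=subprocess.DEVNULL,   # hide the child's crash message
    )
    return result.returncode

status = run_in_separate_process(CRASHING_COMM_CODE)

# This line is reached only because the failure was confined to the child.
survived = True
```

Had the same code been linked in as a library and called directly, the
abort would have taken the calling process down with it.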

>What does security have to do with a separation mechanism anyway?

Consider how you will assure limited shared access to data in a hostile
environment.  Examine the alternate mechanisms of file permissions.  If
still confused, see Stonebraker et al., The INGRES Papers.
 
>Can't that be implemented straight up through a library just 
>the same?

Depends entirely on the semantics available to the functions.
Consider why you can't write a library to provide time services
on a machine without an addressable clock.

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
The Philip Morris Companies, Inc: without question the strongest
and best argument for an anti-flag-waving amendment.

rbp@well.sf.ca.us (Bob Pasker) (02/11/90)

paulf@lamont.ldgo.columbia.edu (paul friberg) writes:

>One issue that still bothers me is the creation of intermediary
>communications servers:
>why use a separation mechanism if
>you are going to give up performance? Why not use a library for the
>network code? What does security have to do with a separation mechanism
>anyway? Can't that be implemented straight up through a library just
>the same?  I personally feel that efficiency should rule on the
>side of performance.

The other reason to use a separate process is so that many users can
share the same com-server-type process.  The com server can achieve
many efficiencies that would be difficult to achieve if the
communications code lived in each db process.

	- session sharing (multiplexing) - many fe/be connections
	(associations) from node A to node B can use a single virtual
	circuit. This, in turn, allows concatenation of messages
	across connections for larger transfers when appropriate.

	- single point of analysis - if you wanted some kind of dynamic
	instrumentation of all the sessions, you would otherwise have to
	interrogate each of the db processes; with a single com server you
	just interrogate a single process.  E.g., if you wanted a list of all
	the associations, you'd otherwise have to interrogate each process,
	compile a list and then print it out.

	- single point of control - same idea as previous example, but for
	starting/shutting, enabling/disabling & changing other parameters.

	- communications prioritization - a single com server can manipulate
	the priorities/resources used by each connection independent of
	the db process.

	- information sharing - a single process would have information
	about things like address mapping, condition of the lines, etc.
	and would be able to change these things with less worry about
	multi-process synchronization & locking.
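
The session sharing item above can be sketched as follows.  This is an
illustration only: it is written in Python, and the frame layout (a
2-byte association id followed by a 4-byte payload length) and the
mux()/demux() names are assumptions, not any vendor's protocol.  It
shows messages from several fe/be associations concatenated onto one
byte stream, as they would be on a shared virtual circuit, and then
separated again at the other end.

```python
import struct
from collections import defaultdict

# Hypothetical frame header: association id (2 bytes), payload length (4 bytes),
# both in network byte order.
HEADER = struct.Struct("!HI")

def mux(messages):
    """Concatenate (association_id, payload) pairs into one byte stream."""
    return b"".join(
        HEADER.pack(assoc_id, len(payload)) + payload
        for assoc_id, payload in messages
    )

def demux(stream):
    """Split the byte stream back into per-association lists of payloads."""
    by_assoc = defaultdict(list)
    offset = 0
    while offset < len(stream):
        assoc_id, length = HEADER.unpack_from(stream, offset)
        offset += HEADER.size
        by_assoc[assoc_id].append(stream[offset:offset + length])
        offset += length
    return dict(by_assoc)

# Two associations share one circuit; their messages travel as one transfer.
wire = mux([(1, b"select * from emp"), (2, b"commit"), (1, b"abort")])
by_association = demux(wire)
```

Because every frame passes through the one process, the keys of
by_association also give the list of live associations that the single
point of analysis item asks for.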

More will come to mind...

-- 
- bob
;-----------------------------------------------------------------
; Bob Pasker, San Francisco, CA	 	| rbp@well.sf.ca.us
; +1 415-695-8741			| {apple|pacbell}!well!rbp