[comp.databases] Client/server model in popular portable relational databases

clh@tacitus.tfic.bc.ca (Chris Hermansen) (11/10/89)

In article <1205@unify.UUCP> nico@unify.UUCP (Nico Nierenberg) writes:
...
>my question is why can a user process do a better job of multi-tasking
>than the OS can.  Please consider this a devil's advocate position, and
>I am really curious what you all think.

Yeah, me too!  The usual discussion along these lines seems to be "we know
UN*X can't support more than 196 processes so we had to roll our own, and
we're ever so happy we've done so".

And where do things like Mach or Sun's multithreaded routine support fit in?

Chris Hermansen                         Timberline Forest Inventory Consultants
Voice: 1 604 733 0731                   302 - 958 West 8th Avenue
FAX:   1 604 733 0634                   Vancouver B.C. CANADA
uunet!ubc-cs!van-bc!tacitus!clh         V5Z 1E5

eric@pyramid.pyramid.com (Eric Bergan) (11/10/89)

In article <1989Nov7.223725.20327@odi.com> dlw@odi.com writes:
>
>(1) Yes, the homebrew thread packages may have to acquire new features
>that will slow them down some amount; but how much?

	Unclear, but remember that none of the OS developers set out
to "slow down" the context switch time - it has happened as additional
functionality was added. There is no reason to expect threads
packages (even home brew ones) not to suffer the same as more functionality
is added.

>(2) Yes, operating systems can be written with efficient thread
>packages; but how efficient, and when will we see these operating
>systems on so many platforms that we can throw away our homebrew
>thread packages?
>
>Much of your posting makes claims (pretty good claims, I think) about
>what an operating system *could* do.  However, for someone who has to
>produce a database product for sale in the next N months, it's still
>important to consider what current operating systems, the ones we'll
>really be seeing at customer sites, *currently* do.  So the real
>upshot of your points is that developers who are currently using
>homebrew thread systems should modularize their software so that when
>they port to an OS that provides a really good thread system, they can
>use the OS's threads instead of their own.  (That's what I'm doing.)

	As director of the group within Pyramid responsible for working
with the database companies, I obviously have a biased view, but the
point is we are trying to better fit the database and operating system
features and functionality to each other, and hopefully N is a small-ish
number.

	One major problem with homebrew thread implementations is the
lack of general purpose tools such as debuggers, profilers, code coverage
suites, etc. Each thread implementer either has to burn resources developing
(and maintaining) their own, or do without.

	One other clarification - I'm to a certain extent playing devil's
advocate in this discussion. I think there are some advantages to 
user-level threads, both as a programming model and for potential 
application-specific performance tuning. But I do think that one should
examine all the performance issues, and really understand why one feels
user-level context switching is occurring more quickly. Many times, it is
because functionality has been dropped, and some of that may come creeping
back. (Honest, we really don't slow the clock every time we enter kernel
mode...)
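
A minimal sketch of why a user-level switch can look so cheap, assuming
Python generators as a stand-in for a real package's stack switching
(`worker` and `run_round_robin` are hypothetical names, not anyone's
product). The "context switch" here is only a function return: there is
no preemption, no signal handling, no per-thread accounting, and nothing
that copes with a task blocking in the kernel. Those omissions are the
dropped functionality in question.

```python
from collections import deque


def worker(name, steps, log):
    """A 'thread' that does `steps` units of work, yielding after each one."""
    for i in range(steps):
        log.append((name, i))
        yield  # voluntary switch back to the scheduler


def run_round_robin(tasks):
    """Run generator-based tasks until all finish; return the switch count."""
    ready = deque(tasks)
    switches = 0
    while ready:
        task = ready.popleft()
        try:
            next(task)  # resume the task until its next yield
        except StopIteration:
            continue  # task finished; drop it from the ready queue
        ready.append(task)
        switches += 1
    return switches


log = []
switches = run_round_robin([worker("a", 2, log), worker("b", 3, log)])
```

A task that never yields starves every other task in this scheme, which
is one of the first features a package like this tends to have to add
back.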

-- 

					eric
					...!pyramid!eric

eric@pyramid.pyramid.com (Eric Bergan) (11/10/89)

In article <3573@dev.dtic.dla.mil> jkrueger@dev.dtic.dla.mil (Jonathan Krueger) writes:
>
>Multiple multithreaded servers are simply a choice of what
>functionality to put in the operating system and what functionality to
>put in code executing at less exalted levels.  They represent one
>tradeoff of growth of memory versus loss of concurrency with increased
>concurrent users.  They're the right tradeoff for the current cost of
>memory, the current and reasonably prospective tools available, the
>lessons learned in 30 years of software engineering, and the state of
>the art in multiprocessors available for general purpose computing.
>They're the wrong tradeoff for other conditions.  Try relaxing the
>general purpose constraint, for instance.

	Very true. Another change in constraint would be considering
non-homogeneous memory or cpu architectures. Very unclear how a database
should work on an MP machine where some memory access is very cheap, some
is "medium" and some is expensive. DBMS vendors are starting to think about
this, but often come back to thinking the machine is just a very tightly
coupled distributed environment, and relying on their already existing
distributed strategy, running each processor as an independent node in
the network. Worse is where some processors are "fast" and others "slow",
although I think the OS in most cases will hide that by reserving the slow
processors for I/O. But if you had a specialized vector processor that
could be useful on sorts, for instance, you might want to try and
migrate sorts to that processor. 

	I think we still have some more to learn about all this and that,
as Jonathan states, multiple multithreaded servers are a match for
the current state of computing, not necessarily forever.

-- 

					eric
					...!pyramid!eric

dlw@odi.com (Dan Weinreb) (11/10/89)

In article <24456@sequent.UUCP> sweiger@sequent.UUCP (Mark Sweiger) writes:

		Now, if individual backend server threads could actually run
   on different CPUs of the multiprocessor *concurrently*, there would be no
   need for multiple server processes.  Since there is no easy way to achieve
   this functionality in a portable fashion,...

First, please temporarily stipulate that portability is not an issue;
pretend I'm designing a DBMS intended to run only on Sequent machines
under the Sequent operating system.  Sequent's version of Unix *does*
have a way to let me run multiple truly concurrent (i.e. can really
use separate CPUs at the same time) threads within a single Unix
process, right?

Second, let's return to reality; of course portability is an issue.
The state of the world might be one of two things:

(1) Every serious multiprocessor Unix vendor provides some way to run
truly concurrent threads within a single Unix process; however, the
details of how this works differ from one vendor to another.

(2) There are serious multiprocessor Unix vendors who do not provide
any way to run truly concurrent threads within a single Unix process.

If (1) is the case, it seems that it would not be very hard to provide
a small software layer that would hide the differences between the
various operating systems.  Then porting would require writing a new
version of this layer, sort of like writing a new device driver for a
new kind of disk, but it would be pretty easy.

I could be wrong about the "pretty easy" if the different vendors used
very different paradigms for their threads; but I'm having a hard time
seeing how they could do threads so differently that the port would
really be a problem.  I've seen a lot of thread packages, and they all
look pretty similar to me.
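
A rough sketch of such a layer under case (1), assuming two vendors whose
packages differ in shape but not in substance. `VendorA`, `VendorB`,
`spawn`, `join` and `scan_partitions` are hypothetical names; Python's
`threading.Thread` and `ThreadPoolExecutor` merely stand in for two
different native thread interfaces.

```python
import threading
from concurrent.futures import ThreadPoolExecutor


class VendorA:
    """Stand-in for one vendor's package: explicit thread objects."""

    def spawn(self, fn, *args):
        result = {}

        def body():
            result["value"] = fn(*args)

        t = threading.Thread(target=body)
        t.start()
        return (t, result)

    def join(self, handle):
        t, result = handle
        t.join()
        return result["value"]


class VendorB:
    """Stand-in for another vendor's package: a pool that returns futures."""

    def __init__(self, workers=4):
        self._pool = ThreadPoolExecutor(max_workers=workers)

    def spawn(self, fn, *args):
        return self._pool.submit(fn, *args)

    def join(self, handle):
        return handle.result()


def scan_partitions(layer, partitions):
    """'DBMS' code: written once, against spawn/join only."""
    handles = [layer.spawn(sum, part) for part in partitions]
    return sum(layer.join(h) for h in handles)


partitions = [[1, 2, 3], [4, 5], [6]]
total_a = scan_partitions(VendorA(), partitions)
total_b = scan_partitions(VendorB(), partitions)
```

Porting then means writing one more small class. The sketch says nothing
about the harder parts of a real port, such as synchronization
primitives, error propagation, or scheduling guarantees.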

So are you saying that (2) is the case?  Or is there some other
mistake in my reasoning?  Thank you.

Dan Weinreb		Object Design, Inc.		dlw@odi.com

dlw@odi.com (Dan Weinreb) (11/10/89)

In article <90594@pyramid.pyramid.com> eric@pyramid.pyramid.com (Eric Bergan) writes:

   In article <1989Nov7.223725.20327@odi.com> dlw@odi.com writes:
   >
   >(1) Yes, the homebrew thread packages may have to acquire new features
   >that will slow them down some amount; but how much?

	   Unclear, but remember that none of the OS developers set out
   to "slow down" the context switch time - it has happened as additional
   functionality was added. There is no reason to expect threads
   packages (even home brew ones) not to suffer the same as more functionality
   is added.

If we are comparing a homebrew lightweight process package to a
builtin lightweight process package, the main reason I could see that
the homebrew one might be faster is that it can be specialized for the
database system.  The builtin one might have accreted lots of features
because it is trying to be a general-purpose utility.  If I use the
builtin one, therefore, I may pay for features that I do not need.
I understand what you've been saying about features "creeping back",
but since the homebrew system is being used in only one situation,
probably not all the features would creep back; some might have only
been needed by fundamentally different applications.

(Obviously the situation is very different when we're talking about
large-grain multiprocessors: in that case, the builtin one wins hands
down since it can use the true concurrency.)

I'm talking "could" and "might", of course.  I make no claims about
any real system.  Only metering will prove anything, one way or the
other.  I'm certainly not trying to impugn your software or anyone
else's.

	   One major problem with homebrew thread implementations is the
   lack of general purpose tools such as debuggers, profilers, code coverage
   suites, etc. Each thread implementer either has to burn resources developing
   (and maintaining) their own, or do without.

Yes, that's quite true.

	   One other clarification - I'm to a certain extent playing devil's
   advocate in this discussion. I think there are some advantages to 
   user-level threads, both as a programming model and for potential 
   application-specific performance tuning. But I do think that one should
   examine all the performance issues, and really understand why one feels
   user-level context switching is occurring more quickly. Many times, it is
   because functionality has been dropped, and some of that may come creeping
   back. (Honest, we really don't slow the clock every time we enter kernel
   mode...)

I agree 100%.  It's fun to talk about what causes might have what
effects, but nobody should confuse such talk with evidence of actual
metering.  As I said, I am *not* really a fan of homebrew threads.  I
would ideally like to see threads provided by the operating system, in
a portable way, with reasonable efficiency.  It's interesting to note
that in the OS/2 world, threads are part of OS/2, and work the same
way on every OS/2 machine.  I hope that threads get standardized
between all the different Unix versions.

Dan Weinreb		Object Design, Inc.		dlw@odi.com

eric@pyramid.pyramid.com (Eric Bergan) (11/11/89)

In article <1989Nov10.061329.16565@odi.com> dlw@odi.com writes:
>In article <24456@sequent.UUCP> sweiger@sequent.UUCP (Mark Sweiger) writes:
>
>		Now, if individual backend server threads could actually run
>   on different CPUs of the multiprocessor *concurrently*, there would be no
>   need for multiple server processes.  Since there is no easy way to achieve
>   this functionality in a portable fashion,...
>
>Second, let's return to reality; of course portability is an issue.
>The state of the world might be one of two things:
>
>(1) Every serious multiprocessor Unix vendor provides some way to run
>truly concurrent threads within a single Unix process; however, the
>details of how this works differ from one vendor to another.
>
>(2) There are serious multiprocessor Unix vendors who do not provide
>any way to run truly concurrent threads within a single Unix process.

	Actually, there are a number of standards efforts underway to
try and define a standard UNIX thread API. POSIX, UI, and OSF all
have groups meeting on this, and they actually are trying to converge
on a single standard. Further, since there are several reasonable models
to look at, it's reasonable to assume that we can probably define a
useful standard. Note that defining the API does not necessarily restrict
whether the implementation causes the threads to be controlled in user
or kernel space, but in theory, as long as the API is complete, a developer
should not specifically care how the functionality is implemented. In 
particular, the IEEE POSIX Real Time group had a reasonable draft of a
definition of threads, called "p-threads", I believe. I do not know the
current status of the proposal, however.

-- 

					eric
					...!pyramid!eric

jkrueger@dgis.dtic.dla.mil (Jon) (11/11/89)

dlw@odi.com (Dan Weinreb) writes:

>It's interesting to note
>that in the OS/2 world, threads are part of OS/2, and work the same
>way on every OS/2 machine.  

Good news.  How many architectures did you say OS/2 runs on?

>I hope that threads get standardized
>between all the different Unix versions.

Not, as we see, an apples-to-apples comparison.

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

tim@binky.sybase.com (Tim Wood) (11/14/89)

In article <123@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:
>In article <1205@unify.UUCP> nico@unify.UUCP (Nico Nierenberg) writes:
>...
>>my question is why can a user process do a better job of multi-tasking
>>than the OS can.  Please consider this a devil's advocate position, and
>>I am really curious what you all think.
>
>Yeah, me too!...
>
>And where do things like Mach or Sun's multithreaded routine support fit in?

In Sybase's case, that feature of Mach and SunOS did not exist when we
architected our system.  Even if it had, we might very well not have
used it because: a) it is not available on all platforms we would want
to run on; b) there's no guarantee that their lightweight process scheduling 
would be optimal for our RDBMS, whereas we fully control that by writing
our own; and c) our system behaves more uniformly on different platforms
because in all cases, the same policies for scheduling DBMS tasks are
in effect.  All we ask is to occupy as much memory and CPU time as possible! :-)

The aim of our design is to keep memory and non-DBMS CPU time per user
small, by spending that memory and CPU only on our own overhead instead
of getting the OS into the act for each DBMS task.
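
To illustrate points (b) and (c), here is a small sketch of a scheduler
whose policy lives entirely in the application, so it behaves the same on
every platform. This is not Sybase's actual design; the strict-priority
policy and the names `task` and `run_by_priority` are assumptions made
for the example.

```python
import heapq
from itertools import count


def task(name, steps, trace):
    """A DBMS task that records its name once per unit of work."""
    for _ in range(steps):
        trace.append(name)
        yield


def run_by_priority(tasks):
    """tasks: list of (priority, generator); the lowest number runs first."""
    order = count()  # tie-breaker so generators are never compared
    heap = [(prio, next(order), gen) for prio, gen in tasks]
    heapq.heapify(heap)
    while heap:
        prio, _, gen = heapq.heappop(heap)
        try:
            next(gen)  # run one unit of work
        except StopIteration:
            continue  # task finished
        heapq.heappush(heap, (prio, next(order), gen))


trace = []
run_by_priority([
    (1, task("report", 2, trace)),  # long-running, low priority
    (0, task("update", 2, trace)),  # short transaction, high priority
])
```

Strict priority lets the short update finish before the report starts. A
general-purpose OS scheduler has no way to know that this is the
preference, which is the control argument made above.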

>
>Chris Hermansen                         Timberline Forest Inventory Consultants

-TW
Sybase, Inc. / 6475 Christie Ave. / Emeryville, CA / 94608	  415-596-3500
tim@sybase.com          {pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
		This message is solely my personal opinion.
		It is not a representation of Sybase, Inc.  OK.

dhepner@hpisod2.HP.COM (Dan Hepner) (11/15/89)

From: eric@pyramid.pyramid.com (Eric Bergan)
> 
> 	Actually, there are a number of standards efforts underway to
> try and define a standard UNIX thread API. POSIX, UI, and OSF all
> have groups meeting on this, and they actually are trying to converge
> on a single standard. Further, since there are several reasonable models
> to look at, it's reasonable to assume that we can probably define a
> useful standard. []

Do any of these standards imply any given multiple processor shared
memory data cache behavior?

Thanks,
Dan Hepner

eric@pyramid.pyramid.com (Eric Bergan) (11/24/89)

In article <13520002@hpisod2.HP.COM> dhepner@hpisod2.HP.COM (Dan Hepner) writes:
>From: eric@pyramid.pyramid.com (Eric Bergan)
>> 
>> 	Actually, there are a number of standards efforts underway to
>> try and define a standard UNIX thread API. POSIX, UI, and OSF all
>> have groups meeting on this, and they actually are trying to converge
>> on a single standard. Further, since there are several reasonable models
>> to look at, it's reasonable to assume that we can probably define a
>> useful standard. []
>
>Do any of these standards imply any given multiple processor shared
>memory data cache behavior?

	This is probably the source of the most active discussions within
the various standards committees - what type of architecture to assume.

	The most conservative approaches tend to assume homogeneous processors,
shared memory system (almost certainly using a bus) with a moderate (1-30)
number of processors.

	The other end of the spectrum is heterogeneous processors, no
assumptions about memory access (read "message passing"), tons of processors
using a variety of interconnects.

	From what I have seen, most of the committees have been trying to
define an interface and a model for threads that is independent of
implementation issues. But there are almost always some kind of 
underlying architectural assumptions which influence the specifications.
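
As a rough illustration of how the two ends of that spectrum shape an
interface, the sketch below does the same job twice: once assuming
coherent shared memory (a lock around one variable) and once assuming
only message passing (workers share nothing and send results to an
owner). `count_shared` and `count_messages` are hypothetical names, and
Python's `threading` and `queue` modules stand in for whatever a real
standard would define.

```python
import queue
import threading


def count_shared(n_workers, n_incr):
    """Shared-memory model: every thread updates one variable under a lock."""
    total = 0
    lock = threading.Lock()

    def body():
        nonlocal total
        for _ in range(n_incr):
            with lock:
                total += 1

    threads = [threading.Thread(target=body) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return total


def count_messages(n_workers, n_incr):
    """Message-passing model: workers keep local state and send one result."""
    inbox = queue.Queue()

    def body():
        local = 0
        for _ in range(n_incr):
            local += 1
        inbox.put(local)

    threads = [threading.Thread(target=body) for _ in range(n_workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sum(inbox.get() for _ in range(n_workers))
```

An API built around locks on shared variables quietly assumes the first
model. One built around queues can be implemented on either kind of
machine, at some cost where shared memory is actually available.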

-- 

					eric
					...!pyramid!eric