[comp.databases] Client/server processes and implementations

clh@tacitus.tfic.bc.ca (Chris Hermansen) (11/10/89)

In article <6895@sybase.sybase.com> forrest@phobos.UUCP (Jon Forrest) writes:
>In article <2184@kodak.UUCP> deal@kodak.UUCP () writes:
...
>>I know little about Sybase but I am sure that some of their people who
>>contribute to this newsgroup can add some information here (please?).
>>
>
>One of the strong points of the Sybase architecture is that we
>require only one operating system process per Server. Internal
>to this process is our own multitasking kernel which handles
>many of the facilities that other systems rely on the operating
>system to provide. Another benefit of this approach is that our

I apologize in advance if this sounds like a flame; it's not meant to be :-)

So (I believe) you are claiming that Sybase is `better' at implementing
the elements of the `multitasking kernel' you hint at above?  Better than
what?  If I may guess at what the original poster was hinting at, s/he is
of the opinion that your software is more efficient than others because it
uses fewer processes; your answer appears to be that you use your own
rather than the operating system's.  In order to substantiate your claim of
improved efficiency, you might want to present some more solid info as to
how *your* kernel is better than the O/S.

Imagine if I were to advertise a car that had a 25 hp gasoline engine,
got 75 mpg, and went 300 mph.  To anyone surprised enough to enquire,
I would grudgingly admit that it ALSO had a 4000 hp kerosene-burning turbojet
in the back that you needed to use to get to anything over 12 mph, and then
your mileage went down to 2 mpg.  This is probably a poor analogy, but that's
how the above statement strikes me.

>system to provide. Another benefit of this approach is that our
>Server needs no special privileges to run. The operating system
>sees our Server as just another non-privileged user process.

Maybe I'm being a little dense here; I don't have a strong feeling as to
why any other architecture needs `special privileges' to run.

>Therefore, we can't crash the machine running our Server.

Haw.  I've heard THIS before.

Sorry about the above if I sound like I'm overreacting; I certainly have
nothing against Sybase or M. Forrest.  This reply just seems a little too
vague (relative to the original question, at least) to let it pass into
folklore.

Chris Hermansen                         Timberline Forest Inventory Consultants
Voice: 1 604 733 0731                   302 - 958 West 8th Avenue
FAX:   1 604 733 0634                   Vancouver B.C. CANADA
uunet!ubc-cs!van-bc!tacitus!clh         V5Z 1E5

forrest@phobos.sybase.com (Jon Forrest) (11/12/89)

I said :

One of the strong points of the Sybase architecture is that we
require only one operating system process per Server. Internal
to this process is our own multitasking kernel which handles
many of the facilities that other systems rely on the operating
system to provide.

Chris Hermansen writes in response :

   I apologize in advance if this sounds like a flame; it's not meant to be :-)
   
   So (I believe) you are claiming that Sybase is `better' at implementing
   the elements of the `multitasking kernel' you hint at above?  Better than
   what?  If I may guess at what the original poster was hinting at, s/he is
   of the opinion that your software is more efficient than others because it
   uses fewer processes; your answer appears to be that you use your own
   rather than the operating system's.  In order to substantiate your claim for
   improved efficiency, you might want to present some more solid info as to
   how *your* kernel is better than the O/S.

I retort :

  Although in my original posting I never said that Sybase is better at
  implementing the elements of a multitasking kernel than a native O.S., I'll
  accept your challenge and give you a couple of examples of how
  certain features of our multitasking kernel are better than one
  O.S., VMS. Note that I don't hold this against VMS. One
  operating system can't be everything to all people.

  The first example is process creation. Process creation on VMS
  is expensive because a process requires a lot of context and
  resources. People who have ported certain Unix programs to VMS
  soon learn about this. While I have no real problem with this
  when using VMS interactively, such behavior was unacceptable
  in our server. Our database processes are much simpler than VMS processes.
  Thus, why should we carry the weight of a VMS process on our
  shoulders when we don't need most of what it has to offer?

  The second example has to do with locks. The VMS lock manager is
  a strange and wonderful thing that was designed to provide lock
  manager services within a cluster. Providing such services is again
  much more complicated than providing locking within one process.
  Also, the model of locking provided by the VMS lock manager is
  different than what our server requires.
  In this case by using a single VMS process we were able to provide
  all the services our server requires without the overhead and
  complexity of the VMS lock manager. Indeed I believe that
  Oracle's problems on a Vaxcluster are due to VMS lock manager
  overhead and complexity. (An Oracle person told me this yesterday
  on the plane home from DECUS.)

  I could probably go on but I hope you see my point. Our server,
  as well as servers from our competition, doesn't need all the services
  provided by a native O.S. As the O.S. becomes more complicated
  (VMS vs. Unix), this becomes more obvious.

I also wrote :

  Another benefit of this approach is that our
  Server needs no special privileges to run. The operating system
  sees our Server as just another non-privileged user process.
  Therefore, we can't crash the machine running our Server.

Chris Hermansen continues :

  Maybe I'm being a little dense here; I don't have a strong feeling as to
  why any other architecture needs `special privileges' to run.
  
and in response to my claim that we can't crash the machine running
our Server :
  
  Haw.  I've heard THIS before.

I reply :
  Dense might not be the right word. In any case, I don't know if
  our competition requires special privileges. This wasn't the point
  I was trying to make. Any program of any kind that doesn't require
  special privileges shouldn't be able to crash the machine on which
  it is running unless the machine is very badly tuned or else the
  O.S. isn't intended for production environments. I know that I feel
  good knowing that in the only case I know of where a machine running
  Sybase crashed, it was because of a bug in VMS.

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

jkrueger@dgis.dtic.dla.mil (Jon) (11/13/89)

forrest@phobos.sybase.com (Jon Forrest) writes:

[that Sybase saves overhead and complexity by preferring
multiprogramming provided exclusively by their own server to similar
services provided by operating systems such as VMS]

This is a straightforward claim and easily granted.

Now, who will claim that these advantages outweigh the disadvantages?

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

dhepner@hpisod2.HP.COM (Dan Hepner) (11/15/89)

From: forrest@phobos.sybase.com (Jon Forrest)
> 
> I said :
> 
> One of the strong points of the Sybase architecture is that we
> require only one operating system process per Server.

Could you comment on if or how this model affects Sybase's
ability to exploit multiple processor architectures?

Thanks,
Dan Hepner

forrest@phobos.sybase.com (Jon Forrest) (11/16/89)

In article <13520001@hpisod2.HP.COM> dhepner@hpisod2.HP.COM (Dan Hepner) writes:
>From: forrest@phobos.sybase.com (Jon Forrest)
>> 
>> I said :
>> 
>> One of the strong points of the Sybase architecture is that we
>> require only one operating system process per Server.
>
>Could you comment on if or how this model affects Sybase's
>ability to exploit multiple processor architectures?
>
>Thanks,
>Dan Hepner

Instead of replying to this perfectly reasonable question, I'll
defer to descriptions you'll see in the future about our MP product.
(Sorry to be so evasive but I'm not sure how much I can say about
this and I'd rather be safe than sorry. I do think that you won't
be disappointed when all the information becomes public.)

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

harrism@aquila.DG.COM (Mike Harris) (11/17/89)

Although the Sybase approach does have its merits, there are several problems
with it:

1) A single server architecture cannot take DIRECT advantage of multiprocessor
hardware. By this I mean that the server threads cannot take advantage of an
idle processor since the server threads are bound to the process. Workarounds
are either 1) OS threads which can be dispatched across cpus, or 2) improving
the server architecture such that multiple (one per cpu, usually) Sybase-style
servers can service their threads. Shouldn't be too difficult. A bit of SHM,
a few semaphore locks & there you are! Option (2) would be a good alternative
for them until proper threads are available from the kernel folk.

2) A single server process may not be able to compete successfully for cpu
resources in a cpu bound environment. Cpu cycles are allocated & scheduled for
processes, not application threads. This is a more difficult problem than (1)
above. For this instance, one would want each thread to be considered equally
in competition for the system's cpu cycles. Priority scheduling won't help,
although some systems allow scheduling parameters (quanta, etc) to be
associated with a process. This helps, but it is still hard to be fair unless
the load is consistent. Again, properly implemented threads & multi server
architectures help here.



Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

clh@tacitus.tfic.bc.ca (Chris Hermansen) (11/17/89)

In article <7037@sybase.sybase.com> forrest@sybase.com writes:
>...
>I retort :
>  Although in my original posting I never said that Sybase is better at
>  implementing the elements of a multitasking kernel than a native O.S., I'll
>  accept your challenge and give you a couple of examples of how
>  certain features of our multitasking kernel are better than one
>  O.S., VMS.

From our (rather limited) experience with VMS, I don't blame you for
implementing your own multitasking here; it probably was easier than sorting
out all the SYS$HOOEY that you would have needed to make things work under
VMS.

Also, all the yack about overhead (in other responses to Jon's original
posting) of process creation brought out at least one interesting question:
by the time the DBMS vendor's thread handling stuff is jazzed up to
include accounting, etc, etc, might the differences between the standard
O/S process creation and the DBMS stuff not be all that great?

>I reply :
>  Dense might not be the right word. In any case, I don't know if
>  our competition requires special privileges. This wasn't the point
>  I was trying to make. Any program of any kind that doesn't require
>  special privileges shouldn't be able to crash the machine on which
>  it is running unless the machine is very badly tuned or else the
>  O.S. isn't intended for production environments. I know that I feel
>  good knowing that in the only case I know of where a machine running
>  Sybase crashed, it was because of a bug in VMS.

Fair enough.  I guess the point I was trying to make is that your competition
probably doesn't require these privileges either.  I can certainly appreciate
why a software vendor feels more comfortable being able to tell customers
that no system crashes are due to bugs in the vendor's software.

As I recall, the original post was basically asking for information as to
why Sybase appeared to be so much more efficient, in terms of its use of
system resources.  Your response sounded to me like "because we have
implemented our own, more efficient, routines for accomplishing the same
tasks".  I would not dispute that your company has good reasons for
approaching the problem this way; however, I feel it is reasonable to
wonder exactly how much measurable benefit there is to your theoretically
better approach.

I am not trying to dump on you (or Sybase) at all, sir.  In particular, I
support any company that tries to implement better software through design.

Chris Hermansen                         Timberline Forest Inventory Consultants
Voice: 1 604 733 0731                   302 - 958 West 8th Avenue
FAX:   1 604 733 0634                   Vancouver B.C. CANADA
uunet!ubc-cs!van-bc!tacitus!clh         V5Z 1E5

forrest@phobos.sybase.com (Jon Forrest) (11/18/89)

In article <375@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
>Although the Sybase approach does have it's merits, there are several problems 
>with it:
>
>1) A single server architecture cannot take DIRECT advantage of multiprocessor
>hardware. [Other comments edited out...]

	No comment. Wait and see what our MP product does when it is released.

>2) A single server process may not be able to compete successfully for cpu 
>resources in a cpu bound environment. [Other comments edited out ...]

	This is probably true but a server of any kind, single process
	or multi process, that runs in a cpu bound environment is not
	running in an environment in which a server should be run.
	Servers are meant to be run on machines with as little else
	going on as possible. This is one of the rudiments of a
	client/server architecture.

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

dg@sisyphus.sybase.com (David Gould) (11/19/89)

In article <125@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:
>
>From our (rather limited) experience with VMS, I don't blame you for
>implementing your own multitasking here; it probably was easier than sorting
>out all the SYS$HOOEY that you would have needed to make things work under
>VMS.
> ...
I agree that VMS can be complex, but it provides better facilities than
most (you may insert 'unix' here if you like) platforms we run on, even
though we did not design with VMS specifically in mind.  Nonetheless we ask
VMS or any OS for as few services as we can.

> ... more about Sybase tasking vs. OS processes ...
>tasks".  I would not dispute that your company has good reasons for
>approaching the problem this way; however, I feel it is reasonable to
>wonder exactly how much measurable benefit there is to your theoretically
>better approach.
> ...
It is reasonable to wonder, but not at all reasonable to expect an answer to
'exactly how much measurable benefit ...'. This is of course proprietary
information, and of course, it varies with the workload.  I can say that my
experience with VMS is that on a loaded system typically 40% to 60% of the
cpu is used in system modes (intr stack, kernel mode, exec mode), but on
a maxed-out VMS system running only the Sybase server, quite a bit less
than 10% of the cpu is used in system state.  I will also say that we
don't spend that time in our tasking services either.  As always, these
sorts of figures depend on almost everything, so your mileage may vary.

>Chris Hermansen                         Timberline Forest Inventory Consultants

- dg

------  All opinions are mine and may or may not represent Sybase Inc.  ------
David Gould       dg@sybase.com        {sun,lll-tis,pyramid,pacbell}!sybase!dg
                  (415) 596-3414      6475 Christie Ave.  Emeryville, CA 94608

jkrueger@dgis.dtic.dla.mil (Jon) (11/20/89)

dg@sisyphus.sybase.com (David Gould) writes:

>'exactly how much measurable benefit ...'. This is of course proprietary
>information

For you, perhaps.  Some of your competitors can substantiate their
performance claims with DATA.

Then there are vendors with both claims and data, but the two have
nothing to do with each other.  Oracle, for instance.

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

forrest@phobos.sybase.com (Jon Forrest) (11/20/89)

In article <125@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:
>From our (rather limited) experience with VMS, I don't blame you for
>implementing your own multitasking here; it probably was easier than sorting
>out all the SYS$HOOEY that you would have needed to make things work under
>VMS.
>
>Also, all the yack about overhead (in other responses to Jon's original
>posting) of process creation brought out at least one interesting question:
>by the time the DBMS vendor's thread handling stuff is jazzed up to
>include accounting, etc, etc, might the differences between the standard
>O/S process creation and the DBMS stuff not be all that great?
>

I don't know what you are expecting to see in a multi-threaded
environment but I think that you would find that a process in our
kernel is much simpler than you might think. In fact, it's my
personal belief (that I have no facts to back up) that process
creation time in our kernel is negligible. It's certainly less
than on BSD Unix. But again I don't hold this against BSD Unix.
Our kernel is a special purpose kernel that can leave a lot of
the hard stuff to the operating system kernel.

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

clh@tacitus.tfic.bc.ca (Chris Hermansen) (11/21/89)

In article <666@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>dg@sisyphus.sybase.com (David Gould) writes:
>
>>'exactly how much measurable benefit ...'. This is of course proprietary
>>information
>
>For you, perhaps.  Some of your competitors can substantiate their
>performance claims with DATA.

Thanks, Jon.  I was beginning to worry about becoming some kind of tight-ass
in my old age :-).

As I tried to emphasize by my `measurable benefit' question: design is one
thing, performance is another.  I had a car with four-wheel independent
suspension not too long ago; the car I have now has a beam rear axle.  Guess
which one handles better and is more fun to drive - surprise! not the
`theoretically better' one.  And honest! I'm not trying to accuse Sybase of
having an inferior product; I just don't like unsubstantiated performance
claims.

Never go to a snake-oil salesman to study herpetology.

Chris Hermansen                         Timberline Forest Inventory Consultants
Voice: 1 604 733 0731                   302 - 958 West 8th Avenue
FAX:   1 604 733 0634                   Vancouver B.C. CANADA
uunet!ubc-cs!van-bc!tacitus!clh         V5Z 1E5

clh@tacitus.tfic.bc.ca (Chris Hermansen) (11/21/89)

In article <7139@sybase.sybase.com> forrest@sybase.com writes:
...
>kernel is much simpler than you might think. In fact, it's my
>personal belief (that I have no facts to back up) that process
>creation time in our kernel is negligible. It's certainly less
>than on BSD Unix. But again I don't hold this against BSD Unix.
>Our kernel is a special purpose kernel that can leave a lot of
>the hard stuff to the operating system kernel.

OK; let's suppose that I'm managing a computing center, and a bunch of
users want to get a database server.  In order to pay for the server, I
decide to implement per-user accounting on the server.  Now let's assume
that I buy Sybase's product :-)

Is it not the case that, in order to satisfy my desire for accounting, 
you are going to have to open an accounting file, write a record, and
close the file for each process created?  And will this not require locks
or process synchronization or some such thing?

Furthermore, let's suppose that user Joe Bloggs screws up a select statement
and puts a != into the join condition instead of an =, causing the server
to go off and start retrieving zillions of lines of stuff from the database.
What mechanism do you provide to stop his processes in the back end?  Does
this not imply some kind of `process registration' in your kernel?

Now, I certainly wouldn't claim to be a systems hacker, so I could be way
off base on the above.  I don't dispute the utility of lightweight processes
in general; I'm just curious about how much unnecessary activity in normal
process creation you can eliminate in your server kernel.  It seems to me
that the *only* thing you don't require is a separate address space for each
process (since the processes presumably don't do the kind of things normal
user programs are prone to do in ordinary operating systems, like trample
all over memory they don't own).

Can you give us some specifics as to what significant issues you feel you
can safely ignore?  Without revealing proprietary secrets, of course :-)

Regards,
Chris Hermansen                         Timberline Forest Inventory Consultants
Voice: 1 604 733 0731                   302 - 958 West 8th Avenue
FAX:   1 604 733 0634                   Vancouver B.C. CANADA
uunet!ubc-cs!van-bc!tacitus!clh         V5Z 1E5

harrism@aquila.DG.COM (Mike Harris) (11/21/89)

In article <7114@sybase.sybase.com>, forrest@phobos.sybase.com (Jon Forrest) writes:
> 
> >2) A single server process may not be able to compete successfully for cpu 
> >resources in a cpu bound environment. [Other comments edited out ...]
> 
> 	This is probably true but a server of any kind, single process
> 	or multi process, that runs in a cpu bound environment is not
> 	running in an environment in which a server should be run.
> 	Servers are meant to be run on machines with as little else
> 	going on as possible. This is one of the rudiments of a
> 	client/server architecture.

The previous text is the text, in whole, that I am responding to.

First, this is an important point & worth more consideration:

> 	This is probably true but a server of any kind, single process
> 	or multi process, [...edited]


An example: suppose there are M processes on the system besides the Sybase
server, all competing equally for cpu resources. With a single server, the
distribution to Sybase will be 1/(M+1) *100 percent. If the Sybase server
could instead run N servers, the distribution would be 1/(M+N) *100 percent
per process, and since all N servers work for the same application (Sybase),
the total distribution to Sybase would be N/(M+N) *100.

Grant me that M=4 and that there was enough Sybase work to allow N=2. With
a single server architecture, the distribution to Sybase would be 1/5 *100, or,
20% of the CPU. Given the multi server architecture, the distribution would
change to 2/(4+2) *100, or 33.3%. A significant change indeed!

Also, the most convenient way for an administrator to tune a system is to
change the number of server processes. It is easy to do this if more processes
can be created. It is difficult to do this if they can't. In the previous
example, the only easy way to give "Sybase" more cpu was to create more servers.
If, for example, 2 of the 5 processes were for another service, I could give 
Sybase's single server more cpu by reducing the other service's servers to
1. Note that I couldn't have done this if the other service didn't already
have multiple servers.

Jon mentions (elsewhere in the net) a new MP architecture. Is Sybase working
on this solely to take advantage of multiple processors? Wouldn't this
architecture allow multiple servers on a single architecture?

Second:
> 	[edited....          ] but a server of any kind, single process
> 	or multi process, that runs in a cpu bound environment is not
> 	running in an environment in which a server should be run.
> 	Servers are meant to be run on machines with as little else
> 	going on as possible. [edited.....]

Perhaps for a simple PC or workstation server. Consider, also, the small
[medium?] size business that has a mix of order entry, order processing,
payroll, accounting, etc, applications. They are very likely to purchase
a medium size "server", probably unix, for their business. This "server"
will be handling ALL of their applications. I would say that the "accounting"
servers have as much right to CPU as the "payroll", or (classic) "order
entry" servers. 

I believe that people, all too often, think of a database server as the only 
type of server. Airline reservation systems and bank teller applications are
other examples of services that can (and do) benefit from the client-server
architecture. They (not seen by you at the terminal) also will act as clients
to a database server.

Even forgetting, for now, about the mixed-use departmental server, consider
our server machine hosting a heterogeneous mix of server applications. They
are all server processes, hosted on my "server" machine. How do I tune them
now? The only way the Sybase (or single process) server will get more cpu is
when the other processes wait on it (demand scheduling comes into play). If
some of the other services aren't using Sybase, they won't be forced to
give up cpu to Sybase.

Thirdly:
> 	[edited....]This is one of the rudiments of a
> 	client/server architecture.

This has nothing to do with the client/server architecture. It may happen
to be an ideal environment to run it in, but that is beside the point.

No flames intended.


Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

forrest@phobos.sybase.com (Jon Forrest) (11/22/89)

In article <127@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:
>OK; let's suppose that I'm managing a computing center, and a bunch of
>users want to get a database server.  In order to pay for the server, I
>decide to implement per-user accounting on the server.  Now let's assume
>that I buy Sybase's product :-)
>
>Is it not the case that, in order to satisfy my desire for accounting, 
>you are going to have to open an accounting file, write a record, and
>close the file for each process created?  And will this not require locks
>or process synchronization or some such thing?
>

Funny you should ask this question because this is very similar to
something I'm looking at now. My current approach doesn't require
opening an accounting file. Instead, we can just put the accounting
data in a row in a table. After all, we're talking about a relational
database. Then, an accounting program need only perform the relevant
queries to get the accounting data from the table to the system accounting
file. (This is not official. We may do something entirely different or
not do anything at all.)

>Furthermore, let's suppose that user Joe Bloggs screws up a select statement
>and puts a != into the join condition unstead of an =, causing the server
>to go off and start retrieving zillions of lines of stuff from the database.
>What mechanism do you provide to stop his processes in the back end?  Does
>this not imply some kind of `process registration' in your kernel?
>

Sybase and, as far as I know, all our competition provide some way
of interrupting a command while it is executing. This is not a problem.

>Now, I certainly wouldn't claim to be a systems hacker, so I could be way
>off base on the above.  I don't dispute the utility of lightweight processes
>in general; I'm just curious about how much unnecessary activity in normal
>process creation you can eliminate in your server kernel.  It seems to me
>that the *only* thing you don't require is a separate address space for each
>process (since the processes presumably don't do the kind of things normal
>user programs are prone to do in ordinary operating systems, like trample
>all over memory they don't own).
>
>Can you give us some specifics as to what significant issues you feel you
>can safely ignore?  Without revealing proprietary secrets, of course :-)
>

I don't know enough about the internals of other operating systems
to give a full description of how we differ from each system.
So, instead I'll tell you what we do when we create a process. It's
really quite simple. We find a free process data structure, fill
in a few fields having to do with scheduling, assign a stack to
the process, and add the process to a queue. All this takes less
than 20 lines of code. If you look at the VMS Internals and Data
Structures manual you can get some idea of how complex process
creation can get.

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

john@anasaz.UUCP (John Moore) (11/22/89)

In article <510@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
]An example: suppose there are M processes on the system besides the Sybase
]server, all competing equally for cpu resources. With a single server, the
]distribution to Sybase will be 1/(M+1) *100 percent. If the Sybase server
]could instead run N servers, since all N servers work for the same
]application (Sybase), the total distribution to Sybase would be N/(M+N) *100.
]
]Grant me that M=4 and that there was enough Sybase work to allow N=2. With
]a single server architecture, the distribution to Sybase would be 1/5 *100,
]or, 20% of the CPU. Given the multi server architecture, the distribution
]would change to 2/(4+2) *100, or 33.3%. A significant change indeed!

What all of this points out is not so much the problem with a single
server as with the normal Unix operating system. I grant that a single
server won't take full advantage of multi-cpu systems. However, if
Unix is to be used seriously as a database server, the scheduler
should get smarter - you need to be able to tune the cpu usage by
process class. Likewise, you need a better I/O system, which has:
asynchronous I/O, ordered I/O (FIFO disk I/O on request), and
disk mirroring. Sybase has dealt with this by writing their own
device driver. Unfortunately, this has limited their portability.
The lack of portability, and the lack of ability to effectively utilize
more than one CPU are two of the reasons that we reluctantly
disqualified Sybase from our recent high performance RDBMS selection
process. By the way, the Pyramid OS has all the features I described
above, and we are using it. I wish it was standard in the industry!

Anyhow,... the fact is that with proper cooperation from the OS
scheduler and I/O system, architectures that allow requests from
many processes to be served by one server can do a better job than
process per server. We just benchmarked a realistic application
mix where the Informix back-end memory usage alone exceeded 350 Mbytes!
The advantage of one process per multiple users is that the
"threads" can be cheap - low memory usage, low CPU usage because
of precisely targeted scheduling algorithms. In addition, it
is cheaper for them to cooperate for optimized scheduling. Finally,
we have found that lock contention consumes lots of CPU when
you have a large number of very active processes. The single
server (or few servers) is a good way to reduce the lock overhead.
-- 
John Moore (NJ7E)           mcdphx!anasaz!john asuvax!anasaz!john
(602) 861-7607 (day or eve) long palladium, short petroleum
7525 Clearwater Pkwy, Scottsdale, AZ 85253
The 2nd amendment is about military weapons, NOT JUST hunting weapons!

forrest@phobos.sybase.com (Jon Forrest) (11/23/89)

In article <510@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
[a lot of good stuff about processes, servers, and CPU usage]

As I understand the issue, the main reason why we don't use multiple
processes is because of locking and synchronizations issues, and because
of the lack of portability inherent in any solutions to these problems.
We have no religious disagreements with using multiple processes.
But, first of all, by controlling locking and synchronization in
a portable single O.S. process environment, I think we can do a better
job than trying to understand the effects of each O.S.'s methods (if any)
of doing locking and synchronization. 

Don't forget that our limitation is only that one database can't
be accessed by more than one server at a time. You can run any number
of servers on one machine, as long as you have the resources to support
the servers. Plus, given the fact that servers can talk to other servers
by using RPC's, the end effect isn't that much different than having
multiple servers talking to the same database.

>Jon mentions (elsewhere in the net) a new MP architecture. Is Sybase working
>on this soley to take advantage of multiple processors? Wouldn't this
>architecture allow multiple servers on a single architecture?

I think as more MP servers come on the market from us and our competitors
there will be some discussion about just what constitutes a single server.
I also think this will be true for any kind of server that runs in
an MP environment, and not just a database server.
Other than this I have no comment about MP servers.

>
[a lot of good comments about dedicating resources to servers]

There's no doubt that people will use servers of various kinds on
machines that must perform other work. A dedicated machine for each
server is an ideal to strive for, but isn't always practical.
So, the performance of any server will deteriorate as a function
of what else is being done on the machine. When the server performance
becomes unacceptable, either the competing work or the server itself
will get moved off to a different machine. This is normal and although
I grant that this might not be an axiom of client/server architecture
it is certainly a corollary.


----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

hargrove@harlie.sgi.com (Mark Hargrove) (11/23/89)

In article <510@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:

>In article <7114@sybase.sybase.com>, forrest@phobos.sybase.com (Jon Forrest) writes:
>Second:
>> 	[edited....          ] but a server of any kind, single process
>> 	or multi process, that runs in a cpu bound environment is not
>> 	running in an environment in which a server should be run.
>> 	Servers are meant to be run on machines with as little else
>> 	going on as possible. [edited.....]
>
>Perhaps for a simple PC or workstation server. Consider, also, the small
>[medium?] size business that has a mix of order entry, order processing,
>payroll, accounting, etc, applications. They are very likely to purchase
>a medium size "server", probably unix, for their business. This "server"
>will be handling ALL of their applications. I would say that the "accounting"
>servers have as much right to CPU as the "payroll", or (classic) "order
>entry" servers. 

Accounting, payroll, order entry, etc. aren't servers, they're *clients* of
a database server!

>
>I believe that people, all too often, think of a database server as the only 
>type of server. Airline reservation systems, and Bank Teller applications are 
>other examples of services that can (and do) benefit from the client-server
>architecture. They (not seen by you at the terminal) also will act as clients
>to a database server.

What are you trying to get across here?  Your first sentence puts forth a 
proposition which is completely unconnected to the following sentences.

>
>Even forgetting, for now, about the mixed use departmental server, consider
>our server machine hosting a heterogeneous mix of server applications. They
>are all server processes. Hosted on my "server" machine. How do I tune them
>now? The only way the Sybase (or single process server) will get more cpu is
>when the other processes wait on it (demand scheduling comes into play). If 
>some of the other services aren't using Sybase, they wont be forced to
>give up cpu to Sybase.
>

Huh?  A server is just a process.  Don't make the problem more difficult
than it is.  There are facilities in all real OS's to allow an administrator
to schedule/prioritize a process.

>Thirdly:
>> 	[edited....]This is one of the rudiments of a
>> 	client/server architecture.
>
>This has nothing to do with the client/server architecture. It may happen
>to be and ideal environment to run it in, but that is beside the point.

It has *everything* to do with client/server architecture in the context
that concerns this newsgroup, and I think *you* miss the point.

The future is here, yet ye refuse to see it.

The day of the fully general purpose commercial processor is fading
rapidly.  Mike Harris's concerns over "mixed application loads" are a
symptom of living in this fading day, and entirely miss the vision
embodied in the notion of client-server architectures.  Here's the idea
Mike, since you seem to have missed it:

The client and server don't have to run on the same machine.  In fact,
as Jon Forrest (correctly) points out, in the general case, you don't
*want* them to run on the same machine.

When CPU demand is low relative to capacity, you can run both clients
and servers on the same box.  As demand grows, you balance things for a
while using whatever scheduling/prioritization facilities the OS has
available.  As demand continues to grow, you *distribute* the processing
load across multiple machines.  The *first* action to take is to
separate clients from servers!  If demand still continues to grow, you
separate the servers onto their own machines (and in the extreme (and
not at all impractical) case, you run each client and each server on its
own machine).  This model is simple, elegant, and fundamentally right.
The client-server model is what *enables* this controlled evolution of a
commercial application environment.

Silicon Graphics has VAXen, 500+ Macs, 100+ PC's and over 1000
workstations sitting on our network.  I would have to be severely
retarded not to *want* to take advantage of all that processing power;
the client-server model gives me a practical way to *plan* to take
advantage of it.

Folks have talked about "cooperative computing" and "distributed
processing" for a long time, while still worrying about whether they can
fit just one more report into their nightly batch.  The client-server
model, as embodied in the actual *products* available from Sybase, Ingres, 
et al., makes it possible to *do* something today!

*Some restrictions may apply.  Severe penalties for early withdrawal. :-)



--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Mark Hargrove                                       Silicon Graphics, Inc.
email: hargrove@harlie.corp.sgi.com                 2011 N.Shoreline Drive
voice: 415-962-3642                                 Mt.View, CA 94039

coop@phobos.sybase.com (John Cooper) (11/23/89)

In article <7185@sybase.sybase.com> forrest@sybase.com writes:
>Don't forget that our limitiation is only that one database can't
>be accessed by more than one server at a time. You can run any number
>of servers on one machine, as long as you have the resources to support
>the servers. Plus, given the fact that servers can talk to other servers
>by  using RPC's, the end effect isn't that much different than having
           ^^^^^
>multiple servers talking to the same database.

My colleague Jon Forrest neglected to mention what RPC's are.  For those not
familiar with Sybase this stands for Remote Procedure Calls.  As of version
4.0 of our SQL-Server, we offer the ability for a user in one server to call a
stored procedure remotely in another server.

I believe Sybase is the only RDBMS that has stored procedures.  For those who
aren't familiar with them in general:
   Stored procedures are batches of SQL code which can be created and stored in
a database.  Like a 3GL program, they can contain variables, have sophisticated
capabilities such as loops and recursion, accept input arguments which can be
altered for return, and return status codes upon completion.  They are also
compiled so that their execution plans can be resident in cache; the compiled
code is stored in system tables as is the SQL source.
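For readers who have never seen the mechanism, it can be mimicked in
miniature. The sketch below is mine, not Sybase's Transact-SQL: it uses
SQLite (which has no real stored procedures), and the names
`sysprocedures` and `call_proc` are illustrative. The procedure's source
lives as a row in a system table and is looked up and executed on each
call; real servers would cache the compiled plan at the commented point.

```python
# Toy "stored procedure" registry: SQL source kept in a system table.
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE sysprocedures (name TEXT PRIMARY KEY, body TEXT)")
db.execute("CREATE TABLE accounts (id INTEGER, balance INTEGER)")
db.execute("INSERT INTO accounts VALUES (1, 100), (2, 250)")

# "Create" a stored procedure: its source becomes a system-table row.
db.execute("INSERT INTO sysprocedures VALUES (?, ?)",
           ("get_balance", "SELECT balance FROM accounts WHERE id = ?"))

def call_proc(name, *args):
    (body,) = db.execute(
        "SELECT body FROM sysprocedures WHERE name = ?", (name,)).fetchone()
    return db.execute(body, args).fetchall()   # a real server caches the plan

print(call_proc("get_balance", 2))
```

The win the posters are debating is exactly the cached-plan step: repeated
short queries skip parsing and optimization entirely.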

John Cooper                               | My opinions, may not be employer's.
sybase!coop@sun.com                       |
{sun,lll-tis,pyramid,pacbell}!sybase!coop | "If you don't like the news you can
6475 Christie Av.  Emeryville, CA 94608   | always go out and make some of your
415 596-3500                              | own."

clh@tacitus.tfic.bc.ca (Chris Hermansen) (11/23/89)

In article <7167@sybase.sybase.com> forrest@phobos.UUCP (Jon Forrest) writes:
>In article <127@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:
...
>>Is it not the case that, in order to satisfy my desire for accounting, 
>>you are going to have to open an accounting file, write a record, and
>>close the file for each process created?  And will this not require locks
>>or process synchronization or some such thing?
...
>...                         Instead, we can just put the accounting
>data in a row in a table. After all, we're talking about a relational

OK, but that's not quite what I was asking; whether you put it in an
accounting file or a table, some process somewhere has to collect the info,
open the thing (or at least lock it), emit the info, and close (or at least
unlock it) - IE THERE'S A MORE-OR-LESS EQUIVALENT AMOUNT OF OVERHEAD IN EACH
CASE.

>Sybase and, as far as I know, all our competition provide some way
>of interrupting a command while it is executing. This is not a problem.

OK again; but what I was trying to get at is, using your approach or the
vanilla O/S process approach, a bunch of info related to the process has
to be kept, in order to, among other things, monitor and cancel it - IE
THERE'S A MORE-OR-LESS EQUIVALENT AMOUNT OF OVERHEAD IN EACH CASE HERE, TOO.

>>....        I'm just curious about how much unnecessary activity in normal
>>process creation you can eliminate in your server kernel.  It seems to me
...
>So, instead I'll tell you what we do when we create a process. It's
>really quite simple. We find a free process data structure, fill
>in a few fields having to do with scheduling, assign a stack to
>the process, and add the process to a queue. All this takes less
>than 20 lines of code.

OK yet again; but this is just setting a process up to be executed.  If
you're comparing it to the standard UN*X fork/exec, you haven't begun
the exec yet; I don't even see that you've completed the equivalent of a
fork, as you haven't made a data space for your process.  So, what about
the control program that actually executes the process?  Presumably, it's
doing some kind of time-slicing in order to ensure response time; it must
also be doing the data-space allocation, collection of accounting info,
opening/closing/locking/unlocking tables, etc. etc.  This doesn't seem to
be a whole lot less than what a regular O/S does for a living, especially
when shared memory is available for interprocess communication, and when
forks (these days) don't commonly involve quite as much overhead as they
did back in the old days.

What about the extra overhead incurred by your (non-kernel) time slicer
operating beneath the UN*X time slicer?  Are you sure that you don't pay
some kind of performance penalty by having one large user process operating,
rather than a bunch of small ones, when paging is an issue?

I can understand that you might want to do all this yourself in some systems;
for instance, Software AG has certainly followed this approach in dealing
with IBM mainframe OLTP applications.  Even bad old CICS looks like this.
I would also suppose that an MS-DOS (network?) version would require the
roll-your-own approach.  And, as I said before, it's probably easier to
do it yourself under VMS than figure out how to get VMS to do it for you.
All that being the case, it makes sense that your approach is more self
reliant.  This is a different issue than the one we have been discussing,
though (ie, performance).

I'm getting a little nervous about the amount of net we're using here; should
we be carrying on this discussion over e-mail, instead?

Regards,

Chris Hermansen                         Timberline Forest Inventory Consultants
Voice: 1 604 733 0731                   302 - 958 West 8th Avenue
FAX:   1 604 733 0634                   Vancouver B.C. CANADA
uunet!ubc-cs!van-bc!tacitus!clh         V5Z 1E5

jkrueger@dgis.dtic.dla.mil (Jon) (11/23/89)

coop@phobos.sybase.com (John Cooper) writes:

>I believe Sybase is the only RDBMS that has stored procedures.

You believe wrong, mister :-)
INGRES has them now.  I wouldn't be surprised if others did too.

Is this a good thing?  It's a key to getting good TP benchmarks.
Vendors respond to them, because we (generic customers) respond to
them.  Should we?  If one's applications consist mostly of repeated
execution of the same short, simple queries, then yes.  Otherwise,
TP numbers predict relative performance poorly, or not at all, or
even opposite to fact.

>   Stored procedures are batches of SQL code which can be created and stored in
>a database.  Like a 3GL program, they can contain variables, have sophisticated
>capabilities such as loops and recursion, accept input arguments which can be
                                ^^^^^^^^^

Tell us more.  What would the Sybase procedure look like that maintains
the integrity in this table

	people
	+--------+-------+
	| parent | child |
	+--------+-------+
	|  Jim   |  Jon  |
	| Jesse  |  Jim  |
	+--------+-------+

that no person can be his or her own ancestor or descendant?
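Jon's constraint amounts to a cycle check on the parent/child graph: a
new (parent, child) row is legal only if the child is not already an
ancestor of the parent. A sketch of the test an insert-time procedure
would have to make (mine, in Python rather than any vendor's procedure
language; "Ann" is an invented name):

```python
# Integrity check: nobody may become his or her own ancestor.
def ancestors(person, edges):
    """All ancestors of person, given (parent, child) pairs."""
    found = set()
    frontier = {p for p, c in edges if c == person}
    while frontier:
        found |= frontier
        frontier = {p for p, c in edges if c in frontier} - found
    return found

def may_insert(parent, child, edges):
    # The new row creates a cycle only if the child is already an
    # ancestor of the parent (or the two are the same person).
    return parent != child and child not in ancestors(parent, edges)

edges = [("Jim", "Jon"), ("Jesse", "Jim")]          # the table above
print(may_insert("Jon", "Jesse", edges))             # Jesse is Jon's ancestor
print(may_insert("Jesse", "Ann", edges))             # a new person is fine
```

The hard part of Jon's challenge is expressing this transitive traversal
in the server's procedure language, which is presumably why he asks.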

-- Jon (not my own grandpa) Krueger
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

tim@binky.sybase.com (Tim Wood) (11/25/89)

In article <936@anasaz.UUCP> john@anasaz.UUCP (John Moore) writes:
> ... However, if
>Unix is to be used seriously as a database server, ...
>... you need better I/O system, which has:
>asynchronous I/O, ordered I/O (FIFO disk I/O on request), and
>disk mirroring. Sybase has dealt with this by writing their own
>device driver. Unfortunately, this has limited their portability.

I'd like to clarify: on Sun platforms, users license, via MtXinu, a
version of SunOS 3.x which we modified to support async I/O to raw
disks.  So the user would actually install a different version of
SunOS.  This was our only choice at the time to get the throughput
advance of async I/O on Sun (and avoid using the UNIX file
system and its attendant inability to verify I/O completion).

Now that SunOS 4.x has this feature, Sybase 4.0 supports their 
implementation & the Pyramid implementation (I just looked at the
release code :-).  On VMS (which has always offered async I/O),
we have always taken advantage of the feature.

Sybase 4.0 has its own (portable) mirrored-disk facility and solves
the ordered-write problem independently of the OS (which is all I 
can say about it :-).

Sybase, Inc. / 6475 Christie Ave. / Emeryville, CA / 94608	  415-596-3500
tim@sybase.com          {pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
		This message is solely my personal opinion.
		It is not a representation of Sybase, Inc.  OK.

forrest@phobos.sybase.com (Jon Forrest) (11/28/89)

(I'm not sure how far to continue this thread of discussion. I'll
be glad to stop, by popular demand, at any time.)

In article <128@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:

>OK, but that's not quite what I was asking; whether you put it in an
>accounting file or a table, some process somewhere has to collect the info,
>open the thing (or at least lock it), emit the info, and close (or at least
>unlock it) - IE THERE'S A MORE-OR-LESS EQUIVALENT AMOUNT OF OVERHEAD IN EACH
>CASE.
>

Trust me. The overhead of doing this won't be significant.

>
>OK again; but what I was trying to get at is, using your approach or the
>vanilla O/S process approach, a bunch of info related to the process has
>to be kept, in order to, among other things, monitor and cancel it - IE
>THERE'S A MORE-OR-LESS EQUIVALENT AMOUNT OF OVERHEAD IN EACH CASE HERE, TOO.
>

Again, I think that you're assuming that our kernel is more complicated
than it really is. There's no doubt that providing asynchronous cancelling
of a process is difficult because a process can be in any state when
the cancel occurs. Yet, there must be a way to change to a known
state in a graceful and predictable way. I'm not sure I understand
your concerns but again, I think that what we've done is the best
that could be done, given the usual constraints of portability and
speed.

>OK yet again; but this is just setting a process up to be executed.  If
>you're comparing it to the standard UN*X fork/exec, you haven't begun
>the exec yet; I don't even see that you've completed the equivalent of a
>fork, as you haven't made a data space for your process.  So, what about
>the control program that actually executes the process?  Presumably, it's
>doing some kind of time-slicing in order to ensure response time; it must
>also be doing the data-space allocation, collection of accounting info,
>opening/closing/locking/unlocking tables, etc. etc.  This doesn't seem to
>be a whole lot less than what a regular O/S does for a living, especially
>when shared memory is available for interprocess communication, and when
>forks (these days) don't commonly involve quite as much overhead as they
>did back in the old days.
>

I think you misunderstand the architecture of our kernel. It's not
one that allows arbitrary code to be run, like a Unix or VMS kernel.
All the "processes" that can be run are actually part of the server
executable image (with the exception of Remote Procedure Calls).
A process that runs under our kernel must behave according to various
rules imposed by the kernel. These include rules for giving up the
processor, how to start new processes, etc. By the way, we don't do
timeslicing. Our processes run either to completion or until they
voluntarily give up the CPU. 
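Jon's earlier description of process creation (find a free process data
structure, fill in a few scheduling fields, assign a stack, add the
process to a queue) can be paraphrased as a sketch. The structure and
field names below are my guesses for illustration, not Sybase's code:

```python
# Toy version of cheap process creation from a preallocated table.
from collections import deque

STACK_SIZE = 4096

class Proc:
    __slots__ = ("in_use", "priority", "state", "stack", "entry")
    def __init__(self):
        self.in_use = False

class Kernel:
    def __init__(self, nprocs=8):
        self.table = [Proc() for _ in range(nprocs)]   # preallocated slots
        self.runq = deque()

    def create(self, entry, priority=0):
        proc = next(p for p in self.table if not p.in_use)  # free slot
        proc.in_use = True
        proc.priority = priority          # fill in scheduling fields
        proc.state = "runnable"
        proc.stack = bytearray(STACK_SIZE)  # assign a stack
        proc.entry = entry
        self.runq.append(proc)            # add to the run queue
        return proc

k = Kernel()
p = k.create(entry=lambda: None, priority=1)
assert p.state == "runnable" and len(k.runq) == 1
```

Note what is absent: no fork, no exec, no address-space setup. That
omission, not any clever code, is where the speed comes from, which is
also Chris's point that it is not equivalent to a full OS process.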

>What about the extra overhead incurred by your (non-kernel) time slicer
>operating beneath the UN*X time slicer?  Are you sure that you don't pay
>some kind of performance penalty by having one large user process operating,
>rather than a bunch of small ones, when paging is an issue?
>

First of all, since we don't do time slicing in our kernel I don't
see any issue here. Second of all, even if we did do time slicing,
since this would be handled by our kernel, and paging would be handled
by the O.S. kernel I again don't see what the issue is. Our kernel
would not be (and is not) aware of paging taking place. Third of all,
having a bunch of small O.S. processes would bring up all those nasty
problems relating to locking and synchronization that caused us to
use the single process model in the first place.

>I can understand that you might want to do all this yourself in some systems;
>for instance, Software AG has certainly followed this approach in dealing
>with IBM mainframe OLTP applications.  Even bad old CICS looks like this.
>I would also suppose that an MS-DOS (network?) version would require the
>roll-your-own approach.

I'm happy to profess complete ignorance of CICS. I'll probably live
longer as a result.

>
>I'm getting a little nervous about the amount of net we're using here; should
>we be carrying on this discussion over e-mail, instead?
>

Me too. Since we're starting to repeat ourselves, why don't we just call
it quits.

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

monty@delphi.uchicago.edu (Monty Mullig) (11/28/89)

jkrueger@dgis.dtic.dla.mil (Jon) writes :

>coop@phobos.sybase.com (John Cooper) writes:
>>I believe Sybase is the only RDBMS that has stored procedures.

>You believe wrong, mister :-)
>INGRES has them now.  I wouldn't be surprised if others did too.

you mean I can get a version of ingres installed on my sun4
this afternoon that has stored procedures ?  or do you mean that
you've announced them for release 7?  if you aren't shipping them
yet (today!), then i think you may be wrong, mister :-).

--monty

ben@hobbes.sybase.com (ben ullrich) (11/28/89)

In article <128@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:
>What about the extra overhead incurred by your (non-kernel) time slicer
>operating beneath the UN*X time slicer?  Are you sure that you don't pay
>some kind of performance penalty by having one large user process operating,
>rather than a bunch of small ones, when paging is an issue?

i'm not sure which side of the fence you're on here -- i would think that the
context switching and process & memory overhead would be more significant for
numerous small processes than for one large one.  Paging is more of an issue
when there are more processes around, each wanting pieces of their own
executable and data, at unpredictable times.  Operating under the UNIX time
slicer is unavoidable as long as there is an OS, but living with this is
simpler with one process.  I'm not saying that multiple servers are not a good
idea per se, but wrt time slicing, they are worse off than a single-process 
architecture.

..ben
----
ben ullrich	       consider my words disclaimed,if you consider them at all
sybase, inc., emeryville, ca	"When you deal with human beings, a certain
+1 (415) 596 - 3500	        amount of nonsense is inevitable." - mike trout
ben@sybase.com			       {pyramid,pacbell,sun,lll-tis}!sybase!ben

harrism@aquila.DG.COM (Mike Harris) (11/28/89)

In article <7139@sybase.sybase.com>, forrest@phobos.sybase.com (Jon Forrest) writes:
> 
> I don't know what you are expecting to see in a multi-threaded
> environment but I think that you would find that a process in our
> kernel is much simpler than you might think. In fact, it's my
> personal belief (that I have no facts to back up) that process
> creation time in our kernel is negligible. It's certainly less
> than on BSD Unix. But again I don't hold this against BSD Unix.
> Our kernel is a special purpose kernel that can leave a lot of
> the hard stuff to the operating system kernel.
> 
> ----

I'm sure that it is much quicker now. And simpler. And it has its limitations.
Your company is working on the MP version now. I doubt that it will be
as quick. Or as simple. Or have all of its previous limitations.
It can't be (as quick, simple). It has processes & threads to coordinate; data
structures to protect, etc.  To some extent your scheduler is converging 
towards the same stuff that the OS kernel is doing. 

Don't get me wrong. The scheduler your people wrote is pretty slick,  and 
it was written to fulfill an unmet need & allowed your product to run 
effectively on PCs. And it does. And it fulfills the needs of a very nice
slice of the marketplace. The other slices, however, are also very nice.
And their requirements are much more stringent in terms of performance,
versatility, tunability, and most importantly, scalability of performance.
Your non-MP version lacks the tunability & scalability aspects to a 
significant degree.

I'm getting off the point some. The primary point is that a database 
developer's resources are most effectively utilized when directed against
developing database managers - NOT developing operating systems.

I will also agree that VMS (& yes, even DG's AOS/VS) have significant 
overheads WRT  process creation and context switching. I will also state
that DG's DG/UX implementation can context switch in 100 instructions
(not lines of C). I can't compete with that.

I also argue that process creation time shouldn't be a primary
consideration. That is, unless the servers don't stay up, or are created
on the fly for clients. I'm not arguing for a server per client either.

As a case study, I'll discuss a DG platform.

We have a product, TPMS, which runs on AOS/VS. It is a "Transaction Processing
Management System". It supports multiple copies of servers to support load
balancing. It supports multiple server types. It load balances servers against
the load by bringing up & destroying servers. Servers aren't bound to clients.
It even goes as far as having a server "exec" another process type to avoid
the process creation overhead. It knows how to work with DG/SQL SQL servers,
with INFOS servers (another data manager), with DG/DBMS (codasyl), and also
with your code. Its largest customer has an installation with a machine hosting
TPMS along with our data managers & their servers (for their application, 
running under the control of TPMS) with 750 concurrent users.

I could go on and talk about the SPERRY/UNISYS 1100 and the TIP subsystem also.
Perhaps later.

The point is that while there is some penalty for OS context switching
overhead, the gains that can be achieved by taking advantage of an OS's
prime resource, processes, more than outweigh the costs of its use. This
holds for any MP directly, and indirectly for non-MPs - mostly as a result
of the tuning that becomes available.

regards,

Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

jkrueger@dgis.dtic.dla.mil (Jon) (11/28/89)

monty@delphi.uchicago.edu (Monty Mullig) writes:

>you mean I can get a version of ingres installed on my sun4
>this afternoon that has stored procedures ?

Yes.  Release 6.

>or do you mean that
>you've announced them for release 7.  if you aren't shipping them
>yet (today!), then i think you may be wrong, mister :-).

I agree.  If it ain't shipping, it doesn't exist.  For instance,
neither the Sybase nor the Oracle secure DBMS exists, based on that
simple rule.

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

harrism@aquila.rtp.dg.com (Mike Harris) (11/28/89)

>In article <510@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
>[a lot of good stuff about processes, servers, and CPU usage]
>
>As I understand the issue, the main reason why we don't use multiple
>processes is because of locking and synchronizations issues, and because
>of the lack of portability inherent in any solutions to these problems.
>We have no religious disagreements with using multiple processes.
>But, first of all, by controlling locking and synchronization in
>a portable single O.S. process environment, I think we can do a better
>job than trying to understand the effects of each O.S.'s methods (if any)
>of doing locking and synchronization. 

The locking and synchronization issues (implementation thereof) aren't trivial.
I wrote the kernel portion of our Server Manager product and these issues
were the most difficult - especially when performance is required. Timing 
problems are murder.  Please understand that I'm not slamming the Sybase 
product. I just believe that an MP style architecture is required for larger, 
faster machines, for cpu utilization, for I/O bandwidth, and for tuning.

I do take issue with "lack of portability." By implementing your own locking
and sync routines, some portability is lost. True. But a port of that subsystem
shouldn't take more than a day or two. As an example, we have a "Server Manager"
product which manages multiple servers, etc. The "os/lck" subsystem code is 
about 300 lines of C, and only about 50 required modification to port it. Yes, it
had to be ported, but the effort was minimal.

As far as the methods for locking & sync, that is part of the porting effort.
For hitting the highest numbers of platforms the fastest, with a quick 
product, their (Sybase) approach was very effective. Now that that has been 
accomplished, it's time to make it run faster on the bigger machines.

>
>Don't forget that our limitiation is only that one database can't
>be accessed by more than one server at a time. You can run any number
>of servers on one machine, as long as you have the resources to support
>the servers. Plus, given the fact that servers can talk to other servers
>by  using RPC's, the end effect isn't that much different than having
>multiple servers talking to the same database.
>
This is a VERY significant limitation. The single server against the one
database will be a bottleneck in any high volume application. Take DG's
DG/SQL product. It runs twenty times as fast as oracle or ingres. And this
is with journaling, synchronous commits, etc. We can't compete with it where
4gl's are required, but all of our large customers choose it for their high
volume applications because nobody else can keep up. Not Sybase. Not Oracle.
Not Ingres. It is a multi server architecture by the way.

>>Jon mentions (elsewhere in the net) a new MP architecture. Is Sybase working
>>on this soley to take advantage of multiple processors? Wouldn't this
>>architecture allow multiple servers on a single architecture?
>
>I think as more MP servers come on the market from us and our competitors
>there will be some discussion about just what constitutes a single server.

Please clarify this question.

>I also think this will be true for any kind of server that runs in
>an MP environment, and not just a database server.
>Other than this I have no comment about MP servers.
>
>>
>[a lot of good comments about dedicating resources to servers]
>
>There's no doubt that people will use servers of various kinds on
>machines that must perform other work. A dedicated machine for each
>server is a ideal to strive for, but isn't always practical.

Very often the case. It's not realistic either when, for some applications
(I can give examples), the application as a whole must run on the one
machine. Otherwise they must communicate over the net to collect the 
required information. This slows the application down too much.

I would state that the goal would be to have the application be
self-contained on one machine. This would yield the highest performance for
the application. It does, however, require proper tools & support for tuning
from the application & its servers to extract the highest performance
from the system. An MP (as in multi-processed - not multi-processor)
is one of these requirements.

>So, the performance of any server will deteriorate as a function
>of what else is being done on the machine. When the server performance
>become unacceptable either the competing work or the server itself
>will get moved off to a different machine. This is normal and although
>I grant that this might not be an axiom of client/server architecture
>it is certainly a corolary.

When the application gets too large, they will buy a bigger machine or
faster components to support it.


Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

harrism@aquila.DG.COM (Mike Harris) (11/28/89)

In article <7189@sybase.sybase.com>, coop@phobos.sybase.com (John Cooper) writes:
> My colleague Jon Forrest neglected to mention what RPC's are.  For those not
> familiar with Sybase this stands for Remote Procedure Calls.  As of version
> 4.0 of our SQL-Server, we offer the ability for a user in one server to call a
> stored procedure remotely in another server.
> 
It should be made clear that these are database RPCs. "RPC" is a generic
term for any Remote Procedure Call. The target can be a database, as with
stored procedures, or remote language subroutine code (such as C), etc. By
default, though, the term suggests the calling of a remote language procedure,
such as a C routine, not a database stored procedure.
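To make the distinction concrete, a language RPC can be sketched as a client-side stub that marshals a call to an ordinary remote subroutine, while a database RPC would instead name a stored procedure for the remote server to execute. A minimal single-process sketch (all names hypothetical; a real RPC package would generate the stub and carry the message over the network):

```c
#include <string.h>

/* Generic-RPC sketch: a client-side stub marshals the call into a
 * message; a server-side dispatcher unmarshals it and invokes an
 * ordinary C subroutine - not a database stored procedure. */

typedef struct { char proc[16]; int arg; } rpc_msg;

/* The remote subroutine itself: plain C code on the server. */
static int add_tax(int cents) { return cents + cents / 10; }

/* Server side: look up the named procedure and call it. */
static int dispatch(const rpc_msg *m) {
    if (strcmp(m->proc, "add_tax") == 0)
        return add_tax(m->arg);
    return -1;                     /* unknown procedure */
}

/* Client-side stub: in real life the message crosses the network;
 * here it is handed straight to the dispatcher. */
int rpc_call(const char *proc, int arg) {
    rpc_msg m;
    strncpy(m.proc, proc, sizeof m.proc - 1);
    m.proc[sizeof m.proc - 1] = '\0';
    m.arg = arg;
    return dispatch(&m);
}
```

The stub/dispatcher pair is exactly the part a real RPC package generates for you; for a database RPC, `proc` would name a stored procedure and the dispatcher would be the remote SQL server.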

> I believe Sybase is the only RDBMS that has stored procedures.  
> 
Several other companies' products, DG's DG/SQL for instance, provide
precompilation, statement caching, database-resident SQL code, etc. This is
an area where apples & oranges are in the same basket, & the gains received
from them are governed by the product's overall architecture & how it is
complemented.
> John Cooper                               | My opinions, may not be employer's.
regards,

Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

harrism@aquila.DG.COM (Mike Harris) (11/28/89)

In article <936@anasaz.UUCP>, john@anasaz.UUCP (John Moore) writes:
> In article <510@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
    [stuff about cpu allocation]
> 
> [much good stuff about Unix needing to grow up & provide a configurable
>  scheduler, better I/O facilities, etc]

I fully agree. However, many businesses, which are run by people who aren't
as technical as your group, spec UNIX for "portability" and "open systems".
It is pure spec-manship. Our AOS/VS operating system has all of the OS process
scheduling and I/O support you mentioned, including mirroring. We even have
hardware to support it (mirroring & intelligent controllers). We can't sell
it as well now, because managers, etc. are spec'ing UNIX. If people won't
buy our hardware because it doesn't run the OS they want (even if ours is
technically superior, or meets the requirements for reliable, high-volume
TP), then we can't sell the machine. So, what do we do? We sell UNIX boxes to
those who want them & we enhance our UNIX kernel so that it meets our needs.

In an effort to support high-volume TP, & driven by DG/SQL (which takes
advantage of all of these features), we have been adding these to DG/UX.
> 
> [good stuff about proper OS scheduler, etc leading to threads]
> of precisely targeted scheduling algorithms. In addition, it

I heartily agree with the threads discussion & look forward to the time when
they are readily available [soon, I believe, from our kernel].

> we have found that lock contention consumes lots of CPU when
> you have a large number of very active processes. The single
> server (or few servers) is a good way to reduce the lock overhead.

Your threads will have to cooperate also. If they can be scheduled
concurrently against multiple CPUs, then that is the way to go.

Overhead is inescapable when processes or threads must cooperate. Typically
the price is worth the gain.

My interpretation of (and responses to) some of the discussions were biased 
more towards what is widely commercially available.

> John Moore (NJ7E)           mcdphx!anasaz!john asuvax!anasaz!john

regards,
Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

harrism@aquila.DG.COM (Mike Harris) (11/28/89)

In article <HARGROVE.89Nov22184755@harlie.sgi.com>, hargrove@harlie.sgi.com (Mark Hargrove) writes:
> In-reply-to: harrism@aquila.DG.COM's message of 20 Nov 89 18:36:07 GMT
> In article <510@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
> >In article <7114@sybase.sybase.com>, forrest@phobos.sybase.com (Jon Forrest) writes:
	[edited....          ] 
> 
> Accounting, payroll, order entry, etc. aren't servers, they're *clients* of
> a database server!
> 
> > 	[harris's stuff about Airline reservation systems, Bank Tellers, etc]
> >	[Stuff about these being both servers [to users] [& clients of dbs ]
> >I believe that people, all too often, think of a database server as the only 
> >type of server. Airline reservation systems, and Bank Teller applications are 
> >other examples of services that can (and do) benefit from the client-server
> >architecture. They (not seen to you at the terminal) also will act as clients
> >to a database server.
> 
> What are you trying to get across here?  Your first sentence puts forth a 
> proposition which is completely unconnected to the following sentences.
> 

Open your eyes and look at the marketplace! My point is EXACTLY what I said
and I will quote again: "I believe that people, all too often, think of a 
database server as the only type of server."

Consider how an airline reservation system operates (a CLASSIC high-volume
TP application). Bank card & credit card validation systems are other similar
examples.

There are about 50,000 terminals out there. They are connected via complex
comm networks into the computing complex. The terminals are CLIENTS of [a
heterogeneous mix of] forms-handling SERVERS. These packetize the requests and
forward them to the reservation SERVERS' suite of processes, which are managed
by a TP manager. These, in turn, are CLIENTS of the database, and also CLIENTS
of the TICKET printers, etc. These systems provide not only database integrity,
but that of ticket printing, screen recovery, etc. The database is but ONE of
MANY parties to the commit. They are but ONE of MANY resources managed &
coordinated.
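The layering described above can be sketched as nested client/server calls, with each layer a SERVER to the one above it and a CLIENT of the ones below (a single-process sketch; all names are illustrative, not any vendor's API):

```c
#include <stdio.h>

/* Nested client/server sketch: the forms layer is a SERVER to the
 * terminal but a CLIENT of the reservation layer, which is itself a
 * CLIENT of the database AND the ticket printer - the database is
 * only one of the resources coordinated. */

static void db_server(const char *query, char *reply, size_t n) {
    snprintf(reply, n, "row:%s", query);        /* pretend lookup */
}

static void ticket_server(const char *req, char *reply, size_t n) {
    snprintf(reply, n, "printed:%s", req);      /* pretend ticket print */
}

/* SERVER to the forms layer, CLIENT of two resources below it. */
static void reservation_server(const char *req, char *reply, size_t n) {
    char row[64], tkt[64];
    db_server(req, row, sizeof row);
    ticket_server(req, tkt, sizeof tkt);
    snprintf(reply, n, "%s;%s", row, tkt);
}

/* SERVER to the terminal, CLIENT of the reservation layer. */
void forms_server(const char *terminal_input, char *reply, size_t n) {
    reservation_server(terminal_input, reply, n);
}
```

In a real system each layer would be a separate process (or suite of processes) under a TP manager; the call nesting is the same.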

Consider teller applications. The banks are VERY serious about being sure that
you don't get $ from the machine unless they debit your account. The primary
parties here are 1) the database, and 2) a process or service which guarantees
delivery of the cash dispensed. These two participate in a JOINT commit
protocol.
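A joint commit of this kind is usually a two-phase protocol: every party votes in a prepare phase, and the coordinator orders a commit only if every vote is yes. A minimal sketch (hypothetical participants, not any particular bank's protocol):

```c
#include <stdbool.h>

/* Two-phase joint commit sketch: nobody commits until everybody has
 * voted yes in the prepare phase, so the cash dispenser never pays
 * out without the matching account debit. */

typedef struct {
    bool (*prepare)(void);      /* phase 1: can you commit?     */
    void (*commit)(void);       /* phase 2a: make it permanent  */
    void (*abort)(void);        /* phase 2b: undo everything    */
} participant;

bool joint_commit(participant *p, int n) {
    for (int i = 0; i < n; i++)
        if (!p[i].prepare()) {              /* one "no" aborts all */
            for (int j = 0; j <= i; j++)
                p[j].abort();
            return false;
        }
    for (int i = 0; i < n; i++)             /* unanimous yes */
        p[i].commit();
    return true;
}

/* Demo participants: the dispenser jams, so the debit is rolled back. */
static int db_state;                        /* 1 = committed, -1 = aborted */
static bool db_prepare(void)  { return true; }
static void db_commit(void)   { db_state = 1; }
static void db_abort(void)    { db_state = -1; }
static bool atm_prepare(void) { return false; }     /* vote "no" */
static void atm_commit(void)  { }
static void atm_abort(void)   { }

bool demo_failed_withdrawal(void) {
    participant ps[2] = {
        { db_prepare,  db_commit,  db_abort  },
        { atm_prepare, atm_commit, atm_abort },
    };
    return joint_commit(ps, 2);
}
```

The key property is visible in the demo: when the dispenser refuses to prepare, the database's tentative debit is aborted too.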

Yes, often a database is ONE of the SERVERS consulted. Ever consider a nested
hierarchy? We have a client/server manager which doesn't force pure clients
and pure servers. The applications which run under it can act as both clients
and servers as required to get the job done. A database server is typically
just one of many servers participating in a unit of work.
The servers in front of the data managers run very well, and they take
advantage of our automatic load balancing. The application often needs more
front-end servers running than database servers.
> >
> > [Harris's comments about heterogenous server process mix]
> 
> Huh?  A server is just a process.  Don't make the problem more difficult
> than it is.  There are facilities in all real OS's to allow an administrator
> to schedule/prioritize a process.

I agree that real OSs allow an administrator to do this. I have worked on 
several "real" OSs, & now, unfortunately, I am being paid to make some of the
same stuff run on UNIX. I also heartily support threads (OS threads
especially). Unfortunately, we have applications that we want to sell on the
UNIX platform. There is a considerable amount of $$ there & people want to
spend it. If you want to sell into a marketplace you must meet its immediate
needs, or at worst, its needs in the immediate future. The market isn't
waiting for threads or powerful schedulers in UNIX. IT WANTS UNIX. This pains
me also! So, for my target environment, I must do multi-servers, etc.

A second point is that even if you have a powerful scheduler, it can only
help on MP hardware if the application mix is sufficiently balanced that the
scheduler can be told to run servers on different CPUs, etc. Otherwise it
commonly happens that a processor is idle (often >10% of the time, even on 
a busy system) because there wasn't a second server to schedule against that
CPU. You can't play the game if you don't have the equipment.

	[edited....; stuff from everybody about "This is one of the rudiments 
	of a client/server architecture."]
     &	[a comment about my vision relative to the future, etc]

Touchy, Touchy, Touchy! I heartily agree with distributed processing, 
threads, RPCs, multis, PCs, heterogeneous environments. I am not a stodgy
old fart who wants to buy a bigg'n from IBM! I just take issue with people
who aren't open enough, or don't (but claim to) know enough about the
marketplace, to consider applications other than their own! Learn something
about the common high-volume TP applications. Learn something about other
people's applications. Think about multi-level CLIENT/SERVER applications.
Imagine a world where everything was a PC. Somebody would say, "You know,
there are situations where a bigger machine which could hold more of
(or all of) my application would be faster & more cost effective." And he
would buy it from the people who could supply it. The proper machine
environment depends upon the specific APPLICATION, not some general
theories! "Money talks & bullshit walks."

> hargrove

regards,

Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

forrest@phobos.sybase.com (Jon Forrest) (11/29/89)

In article <684@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
>Your company is working on the MP version now. I doubt that it will be
>as quick. Or as simple. Or have all of its previous limitations.
>It can't be (as quick, simple). It has processes & threads to coordinate; data
>structures to protect, etc.  To some extent your scheduler is converging 
>towards the same stuff that the OS kernel is doing. 
>

Although you are welcome to talk theoretically, until you can buy our
(or anyone else's) MP product it isn't fair to talk about what such
a product can or can't do. I have been careful to avoid discussing our
MP product partially because it isn't right to talk about unreleased
products on comp.databases.

[a bunch of good stuff edited out]
>Your non-MP version lacks the tunability & scalability aspects to a 
>significant degree.
>

Please elaborate. How does our current (Release 4.0) product lack
tunability and scalability?

>I'm getting off the point some. The primary point is that a database 
>developers resources are most effectively utilized when directed against
>developing database managers - NOT developing operating systems.
>

Since a database management system is intimately tied to many
capabilities provided by an operating system, it is difficult for
database developers to ignore operating systems. There is also the
matter of being able to provide replicable performance on a number
of different platforms. By using a portable kernel we can better
understand how our database system will perform. Some companies (not
just DBMS companies) that sell a product that runs on a number of
different platforms are selling essentially a different product for
each platform because the implementation on each platform is based
on platform specific code. Sybase has made the effort to make most of
our code portable.

>
>I also argue that the process creation time shouldn't be a primary
>consideration.

I agree. I only mentioned this in response to a previous posting.

>
>As a case study, I'll discuss a DG platform.
>
[a good description of a very interesting DG DBMS]
>
>
>The point is that while there is some penalty for OS context-switching 
>overhead, the gains that can be achieved by taking advantage of an OS's
>prime resource, processes, more than outweigh the costs of its use. This
>holds for any MP directly, and indirectly for non-MPs - mostly as a result
>of the tuning that becomes available.
>

Again, having multiple servers accessing the same database requires
all sorts of additional concurrency control using O.S. services.
I don't argue that in theory being able to do this is a bad idea.
However, I believe that in practice, especially when a product has
to run on many platforms and under many O.S.'s, the penalty in doing
this can eat up much of the gain. When you combine this with all the
platform specific code you'd have to have I'm not convinced that
the net result is a win. To me, given where things are going, I'm
much more interested in seeing a "server" run on a 6 CPU MP machine
than on 6 separate machines.

----
Anything you read here is my opinion and in no way represents Sybase, Inc.

Jon Forrest WB6EDM
forrest@sybase.com
{pacbell,sun,{uunet,ucbvax}!mtxinu}!sybase!forrest
415-596-3422

wong@rtech.rtech.com (J. Wong) (11/29/89)

In article <6434@tank.uchicago.edu> monty@delphi.UUCP (Monty Mullig) writes:
>jkrueger@dgis.dtic.dla.mil (Jon) writes :
>
>>coop@phobos.sybase.com (John Cooper) writes:
>>>I believe Sybase is the only RDBMS that has stored procedures.
>
>>You believe wrong, mister :-)
>>INGRES has them now.  I wouldn't be surprised if others did too.
>
>you mean I can get a version of ingres installed on my sun4
>this afternoon that has stored procedures ?  or do you mean that
>you've announced them for release 7.  if you aren't shipping them
>yet (today!), then i think you may be wrong, mister :-).

Yes, INGRES 6.1 is available today on the Sun 4 (SPARCstation)
and does include stored procedures as a feature.  Talk to your
Sales Rep!
-- 

J. Wong		wong@rtech.com		Ingres Corporation.
****************************************************************
S-s-s-ay!

jkrueger@dgis.dtic.dla.mil (Jon) (11/29/89)

harrism@aquila.rtp.dg.com (Mike Harris) writes:

> ... DG/SQL ... runs twenty times as fast as oracle or ingres. And this
>is with journaling, synchronous commits, etc.

Can you tell us where this number (20x) comes from?

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

monty@delphi.uchicago.edu (Monty Mullig) (11/29/89)

In article <4178@rtech.rtech.com> wong@rtech.rtech.com (J. Wong) writes:
>Yes, INGRES 6.1 is available today on the Sun 4 (SPARCstation)
>and does include stored procedures as a feature.  Talk to your
>Sales Rep!
>
>J. Wong		wong@rtech.com		Ingres Corporation.

yes, mr. wong, i talked to my sales rep and he told me that ingres 6.x
was available for my suns, but my pc clients would need to run ingres
5.x since the pc ingres conversion to 6.x hadn't been made yet.  he
also told me that a "gateway" between 5.x on the pcs and 6.x on the
suns was available, but that he "couldn't guarantee its reliability".
moreover, people from our business school who ran 6.1 ingres were very
sceptical about the stability of the backend, so much so that they
wouldn't recommend it to us and that they felt that they had to keep
their production systems in 5.x.

i don't want a buggy, unstable backend.  i want a mature, stable
backend that can support my pcs frontends without awkward,
unguaranteeable bridges.   semi-solutions don't work with my users.
and your product hasn't shown itself to be as stable or mature in my
environment (sun servers and pc frontends) as sybase.

--monty

dlw@odi.com (Dan Weinreb) (11/30/89)

In article <713@xyzzy.UUCP> harrism@aquila.rtp.dg.com (Mike Harris) writes:

   The locking and synchronization issues (implementation thereof) aren't trivial.
   I wrote the kernel portion of our Server Manager product and these issues
   were the most difficult - especially when performance is required. Timing 
   problems are murder.  Please understand that I'm not slamming the Sybase 
   product. I just believe that an MP style architecture is required for larger, 
   faster machines, for cpu utilization, for I/O bandwidth, and for tuning.

Excuse me, but with all these postings, I'm not clear about whether
we're all using the same terminology.  When you say "an MP style
architecture is required", I'm not sure precisely what you mean.

Is it an MP style architecture if you have a LWP package running
inside a single O/S process, and the LWP is the conventional kind
such as SunOS currently provides, and that you can easily write yourself?
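As a point of reference for this first case, a conventional write-it-yourself LWP package can be sketched with the System V ucontext(3) routines: several threads of control inside one O/S process, invisible to the kernel. This is purely illustrative; the function names are made up and a real package would add a scheduler, not hand-wired yields:

```c
#include <string.h>
#include <ucontext.h>

/* Minimal cooperative LWP sketch: two lightweight threads inside one
 * O/S process.  The kernel sees a single process; all switching is
 * done in user space with swapcontext(). */

static ucontext_t main_ctx, a_ctx, b_ctx;
static char a_stk[16384], b_stk[16384];
static char trace[16];                  /* records the interleaving */

static void record(char c) { trace[strlen(trace)] = c; }

static void task_a(void) {
    record('a');
    swapcontext(&a_ctx, &b_ctx);        /* yield to LWP b */
    record('A');                        /* resumed later; exits via uc_link */
}

static void task_b(void) {
    record('b');
    swapcontext(&b_ctx, &a_ctx);        /* yield back to LWP a */
    record('B');
}

const char *run_two_lwps(void) {
    getcontext(&a_ctx);
    a_ctx.uc_stack.ss_sp = a_stk;
    a_ctx.uc_stack.ss_size = sizeof a_stk;
    a_ctx.uc_link = &main_ctx;          /* where to go when task_a returns */
    makecontext(&a_ctx, task_a, 0);

    getcontext(&b_ctx);
    b_ctx.uc_stack.ss_sp = b_stk;
    b_ctx.uc_stack.ss_size = sizeof b_stk;
    b_ctx.uc_link = &main_ctx;
    makecontext(&b_ctx, task_b, 0);

    swapcontext(&main_ctx, &a_ctx);     /* run a until it finishes */
    swapcontext(&main_ctx, &b_ctx);     /* let b run to completion */
    return trace;                       /* interleaving: "abAB" */
}
```

This is the "conventional kind" of LWP: cooperative, single processor, no kernel involvement. The other cases in the question differ precisely in whether the O/S can run such threads on multiple CPUs at once.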

Is it an MP style architecture if you have a LWP package running
inside a single O/S process, and the LWP is the kind provided by the
operating system that is capable of running many of the LWPs at the
same time on distinct processors of a coarse-grained multiprocessor
system, such as Sequent provides?

Or do you mean an architecture in which there are many different
operating system processes, each with its own address space, running
on the same database?

Or do you mean it is required that many different machines (not a
multiprocessor computer but many machines, connected by a LAN or
something like that) be able to all directly acts as servers for the
same database?

I apologize if I'm sounding pedantic, but I am honestly having trouble
following some of the interesting ideas on comp.databases because
industry terminology simply isn't uniformly standard.  Thanks.

Dan Weinreb		Object Design		dlw@odi.com

hargrove@harlie.sgi.com (Mark Hargrove) (11/30/89)

In article <724@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:

   Open your eyes and look at the marketplace! My point is EXACTLY what I said
   and I will quote again: "I believe that people, all too often, think of a 
   database server as the only type of server."
   [comments about nested, multi-level client-server architectures deleted]

Whoops, I find that we're in violent agreement on this issue, I just didn't
follow your point the first time.
  
  
   [re: my complaints about Mike's lack of vision]
   Touchy, Touchy, Touchy! I heartily agree with distributed processing, 
   threads, RPCs, multis, PCs, heterogeneous environments. I am not a stodgy
   old fart who wants to buy a bigg'n from IBM! I just take issue with people
   who aren't open enough, or don't (but claim to) know enough about the
   marketplace, to consider applications other than their own! Learn something
   about the common high-volume TP applications. Learn something about other
   people's applications. Think about multi-level CLIENT/SERVER applications.
   Imagine a world where everything was a PC. Somebody would say, "You know,
   there are situations where a bigger machine which could hold more of
   (or all of) my application would be faster & more cost effective." And he
   would buy it from the people who could supply it. The proper machine
   environment depends upon the specific APPLICATION, not some general
   theories! "Money talks & bullshit walks."

Again, we're in violent agreement.  In fact, I do know quite a bit about
reasonably high volume TP applications, and what kind of iron it takes
to support them.  I certainly didn't mean to imply that you can (or even
want to) do everything with PC's.  From a user's perspective, a PC will
frequently host client applications, but you are CLEARLY correct about 
two things:  all sorts of iron will be involved, and many participating
computers will act as servers for some functions and will be clients to
other servers as well.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Mark Hargrove                                       Silicon Graphics, Inc.
email: hargrove@harlie.corp.sgi.com                 2011 N.Shoreline Drive
voice: 415-962-3642                                 Mt.View, CA 94039

john@anasaz.UUCP (John Moore) (11/30/89)

In article <718@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
]In article <936@anasaz.UUCP>, john@anasaz.UUCP (John Moore) writes:
]> In article <510@xyzzy.UUCP> harrism@aquila.DG.COM (Mike Harris) writes:
]    [stuff about cpu allocation]
]> 
]> [much good stuff about Unix needing to grow up & provide a configurable
]>  scheduler, better I/O facilities, etc]
]
]I fully agree. However, many businesses, which are run by people who aren't
]as technical as your group, spec UNIX for "portability" and "open systems".
]It is pure spec-manship. Our AOS/VS operating system has all of the OS process
]scheduling and I/O support you mentioned, including mirroring. We even have 

After having been stung by having our software run on a proprietary platform
(IBM S/1 EDX - yeach!), our company is very serious about portability. This
is why I would like to see the configurable scheduler, better I/O, etc
in UNIX (as a standard), and have difficulty taking advantage of it when
it is a one-off customization in one manufacturer's kernel. Of course, if
the RDBMS manufacturer does it, I don't have to worry about it as much, but
I still contend that life would be better for OLTP customers if Unix
were to develop better OLTP performance related features.
]
]> we have found that lock contention consumes lots of CPU when
]> you have a large number of very active processes. The single
]> server (or few servers) is a good way to reduce the lock overhead.
]
]Your threads will have to cooperate also. If they can be scheduled, 
]concurrently, against multiple CPUs then that is the way to go.
]
Yes, that's true. However, if you use a few multi-threaded processes,
and can make the locking routines non-pre-emptable, the contention is
minimized.
]Overhead is inescapable when processes or threads must cooperate. Typically
]the price is worth the gain.
Not in our benchmarks (which were, I admit, rather extreme!).

]Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
]Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
]Research Triangle Park, NC



-- 
John Moore (NJ7E)           mcdphx!anasaz!john asuvax!anasaz!john
(602) 861-7607 (day or eve) long palladium, short petroleum
7525 Clearwater Pkwy, Scottsdale, AZ 85253
The 2nd amendment is about military weapons, NOT JUST hunting weapons!

harrism@aquila.Berkeley.EDU (Mike Harris) (12/02/89)

In article <674@dgis.dtic.dla.mil>, jkrueger@dgis.dtic.dla.mil (Jon) writes:
 [edited]
> Can you tell us where this number (20x) comes from?

From ET-1 benchmarks. These have been verified fairly closely (really!) by
actual (non-ET-1) customer benchmarks, run both by us (for regression and
performance testing) and as part of bid evaluations.

> -- Jon

regards,
Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

harrism@aquila.Berkeley.EDU (Mike Harris) (12/02/89)

[forrest writes:]
> Although you are welcome to talk theoretically, until you can buy our
> (or anyone else's) MP product it isn't fair to talk about what such
> a product can or can't do. I have been careful to avoid discussing our
> MP product partially because it isn't right to talk about unreleased
> products on comp.databases.

You can buy our DG/SQL database manager. You can buy our TPMS transaction
manager. You can buy our DG/DBMS database manager. You can buy the 
Sperry/Unisys data managers & transaction managers...Most of these products
have been available for more than 5 years. The Server Manager that I am working
on will be shipping early next year (you can order it now) with DG/ICOBOL &
DG/Object-Office. Soon thereafter with DG/ISAM. (These latter three on UNIX).
I haven't been discussing academic research which hasn't a base in reality.
> 
> Please elaborate. How does our current (Release 4.0) product lack
> tunability and scalability?

If it lacks MP support (or concurrently schedulable threads) then it lacks 
support for a significant aspect of application performance tuning.
I didn't mean to imply that it didn't have any support for tuning. Sorry.

> [ good stuff about portable code, etc]

I believe that there is a middle line somewhere between your Sybase product
&, for example, our very proprietary DG/SQL. Utilization of commonly
available features such as multiprocessors is a requirement. Use of commonly
available I/O primitives is another (async I/O, etc). For example, I don't
know of any OS that lacks primitives close enough to allow the creation of a
small sync subsystem. The subsystem must be ported to different OSs, but that
isn't difficult if the developer only uses basic functions such as P() & V().
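Such a sync subsystem might amount to little more than this: a P()/V() interface whose implementation file is the only piece rewritten per O/S. A sketch with POSIX unnamed semaphores as the host primitive (an assumption for illustration; a 1989-era port would more likely sit on System V semaphores or native OS calls):

```c
#include <semaphore.h>

/* P()/V() subsystem sketch: the rest of the database code calls only
 * these routines; this file is the only piece rewritten when porting
 * to an O/S with different native primitives.  POSIX unnamed
 * semaphores are assumed as the host primitive here. */

typedef struct { sem_t s; } db_sem;

int db_sem_init(db_sem *m, unsigned initial) {
    return sem_init(&m->s, 0, initial);   /* 0 = private to this process */
}

void P(db_sem *m) { sem_wait(&m->s); }    /* acquire (down) */
void V(db_sem *m) { sem_post(&m->s); }    /* release (up)   */

/* Demo: a critical section guarded by a binary semaphore. */
int guarded_add(db_sem *m, int *counter, int delta) {
    P(m);
    *counter += delta;                    /* the protected data structure */
    V(m);
    return *counter;
}
```

The porting cost is confined to `db_sem_init`, `P`, and `V`; everything above them stays common code.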

Your product is more portable & runs on a wide variety of smaller platforms.
Ours (DG/SQL) isn't as portable, but it is faster & supports large platforms
(one customer has 750 concurrent users). Different ends of the extremes! We
are working to make ours more portable & you are working to make yours take
advantage of larger (read: MP, etc) platforms.
> 
> Again, having multiple servers accessing the same database requires
> all sorts of additional concurrency control using O.S. services.
> I don't argue that in theory being able to do this is a bad idea.

It works in practice. Earlier comments describe such systems.

> ....under many O.S.'s, the penalty in doing ... this can eat up much of the 
> gain. When you combine this with all the platform specific code ...
> <I'm not convinced that the net result is a win.>

If one is careful to use "commonly" available functionality & insulate the
remainder of the code from it with a standard interface, then there is
a win. To me, common at the moment includes primitive semaphores P() and V(),
basic shared memory, possibly message queues (an easy-to-write subsystem using
shm & sems), and some I/O operations. Things that don't count are OS threads
and/or tasks, fancy I/O, etc. Standard RPC packages are coming out, so I'd
soon add those to the list of "common".
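The "easy to write" message-queue subsystem might look like the following bounded buffer built from counting semaphores over a shared region (a single-process sketch with made-up names; a real version would place the struct in shm and mark the semaphores process-shared):

```c
#include <semaphore.h>
#include <string.h>

#define QSLOTS 8
#define MSGSZ  32

/* Message queue from the "common" primitives alone: a shared region
 * plus counting semaphores.  In a real system the struct would live
 * in shared memory and the semaphores would be process-shared. */
typedef struct {
    char  msg[QSLOTS][MSGSZ];
    int   head, tail;
    sem_t slots, items, mutex;
} msgq;

int msgq_init(msgq *q) {
    q->head = q->tail = 0;
    return sem_init(&q->slots, 0, QSLOTS)   /* free slots */
         | sem_init(&q->items, 0, 0)        /* queued messages */
         | sem_init(&q->mutex, 0, 1);       /* guards head/tail */
}

void msgq_send(msgq *q, const char *m) {
    sem_wait(&q->slots);                    /* P(): block if queue full */
    sem_wait(&q->mutex);
    strncpy(q->msg[q->tail], m, MSGSZ - 1);
    q->msg[q->tail][MSGSZ - 1] = '\0';
    q->tail = (q->tail + 1) % QSLOTS;
    sem_post(&q->mutex);
    sem_post(&q->items);                    /* V(): a message is available */
}

void msgq_recv(msgq *q, char *out) {
    sem_wait(&q->items);                    /* P(): block if queue empty */
    sem_wait(&q->mutex);
    strcpy(out, q->msg[q->head]);
    q->head = (q->head + 1) % QSLOTS;
    sem_post(&q->mutex);
    sem_post(&q->slots);
}
```

Nothing here is beyond the "common" list above: two counting semaphores for flow control, one binary semaphore for mutual exclusion, and a shared buffer.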

BTW, it is MUCH easier to put in support for MP (or distribution, or whatever)
from the beginning than it is to retrofit it later. The "MUCH" depends
upon whether or not MP was a consideration (a planned future) when the product
was designed. Your MP work shouldn't be too difficult, as you basically had a
multi-process architecture with your internal threads. A significant head
start, indeed!

> I'm ... much more interested in seeing a "server" run on a 6 CPU MP machine
> than on 6 separate machines.

So am I. The split is usually best based upon the application requirements -
frequently accessed vs. never accessed databases, etc. This is where the big
consulting $$$ come in! The beauty is that there aren't any absolutes - only
guidelines. Therefore, there is always work to do & $$ to earn.
> 
> Jon Forrest WB6EDM

regards,
Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

jkrueger@dgis.dtic.dla.mil (Jon) (12/02/89)

harrism@aquila.Berkeley.EDU (Mike Harris) writes:

>In article <674@dgis.dtic.dla.mil>, jkrueger@dgis.dtic.dla.mil (Jon) writes:
> [edited]
>> Can you tell us where this number (20x) comes from?

>From ET-1 benchmarks. These have been verified fairly closely (really!) by 
>actual (non ET-1) customer benchmarks run both by us (for
>regression/performance testing), and as part of bid evaluations.

Can you tell us more?  The number as stated was a ratio of 20:1;
how did you arrive at that ratio?

-- Jon
-- 
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?

aland@infmx.UUCP (Dr. Scump) (12/05/89)

In article <936@anasaz.UUCP> john@anasaz.UUCP (John Moore) writes:
>...
>Anyhow,... the fact is that with proper cooperation from the OS
>scheduler and I/O system, architectures that allow requests from
>many processes to be served by one server can do a better job than
>process per server. We just benchmarked a realistic application
>mix where the Informix back-end memory usage alone exceeded 350 MBytes!

Um, I would be *really* interested to see how you computed this.
Did you take into account text sharing, memory usage by the kernel,
database front-end processes, and other unrelated processes?
I can't fathom 350 MB of *actual main memory consumption* by the engines
and server(s).  I think I know the benchmark to which you refer, and
it was not even close to being memory-bound.

>The advantage of one process per multiple users is that the
>"threads" can be cheap - low memory usage, low CPU usage because
>of precisely targetted scheduling algorithms. In addition, it
>is cheaper for them to cooperate for optimized scheduling. Finally,
>we have found that lock contention consumes lots of CPU when
>you have a large number of very active processes. The single
>server (or few servers) is a good way to reduce the lock overhead.
>John Moore (NJ7E)           mcdphx!anasaz!john asuvax!anasaz!john

This depends on in what way the processes are "active".  If they are
competing for update access to exactly the same pages, lock contention
can indeed make the database spend much of its time managing locks.
(For example, running multiple insert processes against the same table
with page mode locking in use is considered to be in poor taste :-]).
In a normal transaction mix, however, these lock contentions are
atypical.

--
    Alan S. Denney  @  Informix Software, Inc.    
         {pyramid|uunet}!infmx!aland                 "I want to live!
   --------------------------------------------       as an honest man,
    Disclaimer:  These opinions are mine alone.       to get all I deserve
    If I am caught or killed, the secretary           and to give all I can."
    will disavow any knowledge of my actions.             - S. Vega

john@anasaz.UUCP (John Moore) (12/05/89)

In article <2762@infmx.UUCP> aland@infmx.UUCP (alan denney) writes:
]In article <936@anasaz.UUCP> john@anasaz.UUCP (John Moore) writes:
]>...
]>Anyhow,... the fact is that with proper cooperation from the OS
]>scheduler and I/O system, architectures that allow requests from
]>many processes to be served by one server can do a better job than
]>process per server. We just benchmarked a realistic application
]>mix where the Informix back-end memory usage alone exceeded 350 MBytes!
]
]Um, I would be *really* interested to see how you computed this.
]Did you take into account text sharing, memory usage by the kernel,
]database front-end processes, and other unrelated processes?
]I can't fathom 350 MB of *actual main memory consumption* by the engines
]and server(s).  I think I know the benchmark to which you refer, and
]it was not even close to being memory-bound.
]
It was computed two ways: (1) the amount of data space memory that
each front-end/back-end pair takes up (~0.5 MB); (2) the paging rate
on the system when we ran the number of processes that the above
estimate indicated should take all the memory. Remember, in the
benchmark you are thinking about, the front end and back ends
were in the same machine. By the way, this is not meant to be
a slam at Informix (hey... I'm a stockholder as well as a user :-) ),
just an observation of the pitfalls of using one server per user.

]>The advantage of one process per multiple users is that the
]>"threads" can be cheap - low memory usage, low CPU usage because
]>of precisely targeted scheduling algorithms. In addition, it
]>is cheaper for them to cooperate for optimized scheduling. Finally,
]>we have found that lock contention consumes lots of CPU when
]>you have a large number of very active processes. The single
]>server (or few servers) is a good way to reduce the lock overhead.
]>John Moore (NJ7E)           mcdphx!anasaz!john asuvax!anasaz!john
]
]This depends on in what way the processes are "active".  If they are
]competing for update access to exactly the same pages, lock contention
]can indeed make the database spend much of its time managing locks.
](For example, running multiple insert processes against the same table
]with page mode locking in use is considered to be in poor taste :-]).
]In a normal transaction mix, however, these lock contentions are
]atypical.

In our benchmark, with a very realistic transaction mix, CPU
usage went up non-linearly with the number of processes,
eventually using up all the MIPS on the machine. We were able
to drive a system hard enough that it spent so much time thrashing
locks that it could only do 0.1 transactions per second. At lighter
load, it did 20 TPS.
-- 
John Moore (NJ7E)           mcdphx!anasaz!john asuvax!anasaz!john
(602) 861-7607 (day or eve) long palladium, short petroleum
7525 Clearwater Pkwy, Scottsdale, AZ 85253
The 2nd amendment is about military weapons, NOT JUST hunting weapons!

jwc@unify.uucp (J. William Claypool) (12/06/89)

In article <2762@infmx.UUCP> aland@infmx.UUCP (alan denney) writes:
>This depends on in what way the processes are "active".  If they are
>competing for update access to exactly the same pages, lock contention
>can indeed make the database spend much of its time managing locks.
>(For example, running multiple insert processes against the same table
>with page mode locking in use is considered to be in poor taste :-]).
>In a normal transaction mix, however, these lock contentions are
>atypical.

Unless, for example, you have an application where every transaction
does an insert into a history table.  ;-)
-- 

Bill Claypool    W. (916) 920-9092 |I know what I know if you know what I mean
jwc@unify.UUCP   H. (916) 381-4205 |------------------------------------------
    ...!{csusac,pyramid}!unify!jwc |  SCCA SFR Solo II   74 es  1984 CRX 1.5

harrism@aquila.rtp.dg.com (Mike Harris) (12/07/89)

In article <1989Nov29.224606.19358@odi.com>, dlw@odi.com (Dan Weinreb) writes:
: In article <713@xyzzy.UUCP> harrism@aquila.rtp.dg.com (Mike Harris) writes:
: 
:>    [locking and synchronizations issues ] 

: ....[When you say "an MP style architecture is required", I'm not sure 
: precisely what you mean]
: 
: Is it an MP style architecture if you have a LWP package running
: inside a single O/S process, and the LWP is the conventional kind
: such as SunOS currently provides, and that you can easily write yourself?

I am not familiar with their package. The following should clarify, however:
: 
: Is it an MP style architecture if you have a LWP package running
: inside a single O/S process, and the LWP is the kind provided by the
: operating system that is capable of running many of the LWPs at the
: same time on distinct processors of a coarse-grained multiprocessor
: system, such as Sequent provides?
: 

Yes.

: Or do you mean an architecture in which there are many different
: operating system processes, each with its own address space, running
: on the same database?

Yes, although "the same database" is not necessarily relevant.

: 
: Or do you mean it is required that many different machines (not a
: multiprocessor computer but many machines, connected by a LAN or
: something like that) be able to all directly acts as servers for the
: same database?

No. I was primarily referring to tightly coupled multiprocessing on single
processors or tightly coupled multiprocessors (same memory store, etc.).
The issues of database/application partitioning for effective use of
loosely coupled processors are equally interesting.
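[Editorial illustration, not from the original posts: a minimal sketch of
the first kind of "LWP package running inside a single O/S process" that
Weinreb asks about — a cooperative round-robin scheduler multiplexing many
lightweight tasks inside one process, with no kernel threads at all.]

```python
from collections import deque

def task(name, steps, log):
    """A lightweight 'thread': does one unit of work, then yields control."""
    for i in range(steps):
        log.append((name, i))
        yield                      # voluntarily hand control to the scheduler

def run(tasks):
    """Cooperative round-robin scheduler: one O/S process, many LWPs."""
    log = []
    ready = deque(task(name, steps, log) for name, steps in tasks)
    while ready:
        t = ready.popleft()
        try:
            next(t)                # resume the task for one step
            ready.append(t)        # still alive: back of the ready queue
        except StopIteration:
            pass                   # task finished; drop it
    return log
```

This is the "conventional kind ... that you can easily write yourself":
because only one task runs at a time inside one process, it cannot use
multiple processors, which is exactly what distinguishes it from the
Sequent-style case.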

: Dan Weinreb		Object Design		dlw@odi.com

regards,

Mike Harris - KM4UL                      harrism@dg-rtp.dg.com
Data General Corporation                 {world}!mcnc!rti!dg-rtp!harrism
Research Triangle Park, NC

bapat@rm1.UUCP (Bapat) (12/08/89)

   The recent discussion on client-server architectures has been fascinating;
   unfortunately, I seem to have caught only the tail end of it. Would some
   kind soul who may have saved an archive of some of the salient earlier
   articles please email it to me? It would be much appreciated.

   Also, is the IBM concept of Co-operative Processing (under SAA) derived
   from client-server architectural concepts, or does it involve
   full-function equal peer entities sharing out a distributed processing load?
   Any insight from Co-op Processing gurus will be appreciated.
-- 
S. Bapat        novavax!rm1!bapat@uunet.uu.net     Racal-Milgo, Ft Lauderdale

"Our new RISC machine is so fast it finishes an infinite loop in 3 minutes."

aland@infmx.UUCP (Dr. Scump) (12/11/89)

In article <=LD.=$#@unify.uucp> jwc@unify.UUCP (J. William Claypool) writes:
>In article <2762@infmx.UUCP> aland@infmx.UUCP (alan denney) writes:
>>(For example, running multiple insert processes against the same table
>>with page mode locking in use is considered to be in poor taste :-]).
  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>In a normal transaction mix, however, these lock contentions are
>>atypical.
>
>Unless, for example, you have an application where every transaction
>does an insert into a history table.  ;-)
>-- 
>Bill Claypool    W. (916) 920-9092 >jwc@unify.UUCP  

You mean, you would recommend the use of page mode locking in such 
an application (constant history table inserts) ???

--
    Alan S. Denney  @  Informix Software, Inc.    
         {pyramid|uunet}!infmx!aland                 "I want to live!
   --------------------------------------------       as an honest man,
    Disclaimer:  These opinions are mine alone.       to get all I deserve
    If I am caught or killed, the secretary           and to give all I can."
    will disavow any knowledge of my actions.             - S. Vega

tore@idt.unit.no (Tore Saeter) (12/12/89)

I'm posting this on behalf of a colleague.

-----------------------------------------------------------------

I'd be interested in looking at some home-brewed thread implementations.
Does anyone know of any in the public domain or elsewhere that I could 
get a copy of?

Please answer by mail, thanks in advance
Heidi Bergh-Hoff
bergh-hoff%vax.runit.unit.uninett@nac.no

----------------------------------------------------------------

Tore Saeter
SINTEF / ELAB-RUNIT
N-7034 Trondheim, Norway
saeter@carmen.er.sintef.no

jwc@unify.uucp (J. William Claypool) (12/15/89)

In article <2813@infmx.UUCP> aland@infmx.UUCP (alan denney) writes:
>In article <=LD.=$#@unify.uucp> jwc@unify.UUCP (J. William Claypool) writes:
>>In article <2762@infmx.UUCP> aland@infmx.UUCP (alan denney) writes:
>>>(For example, running multiple insert processes against the same table
>>>with page mode locking in use is considered to be in poor taste :-]).
>  ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
>>>In a normal transaction mix, however, these lock contentions are
>>>atypical.
>>
>>Unless, for example, you have an application where every transaction
>>does an insert into a history table.  ;-)
>
>You mean, you would recommend the use of page mode locking in such 
>an application (constant history table inserts) ???

Quite the opposite.  I was suggesting that such a transaction is not
entirely atypical.  As a result, row level locking IS required and there
is more potential for contention in the lock management routines.

Actually, the big problem with some DBMS systems is that there is a
single insert point when extending a table.  Hence the games some
vendors have had to play with multiple history tables for TP1.

Of course, if there are enough insert points for all of the active
transactions, page locks may be entirely adequate.
-- 

Bill Claypool    W. (916) 920-9092 |I know what I know if you know what I mean
jwc@unify.UUCP   H. (916) 381-4205 |------------------------------------------
    ...!{csusac,pyramid}!unify!jwc |  SCCA SFR Solo II   74 es  1984 CRX 1.5

dhepner@hpisod2.HP.COM (Dan Hepner) (12/16/89)

From: jwc@unify.uucp (J. William Claypool)
>
>Quite the opposite.  I was suggesting that such a transaction is not
>entirely atypical.  As a result, row level locking IS required and there
>is more potential for contention in the lock management routines.
>
>Actually, the big problem with some DBMS systems is that there is a
>single insert point when extending a table.  Hence the games some
>vendors have had to play with multiple history tables for TP1.

The freedom to "play games with multiple history tables" is
being codified in TPC-A.

The rationale seems to be that if the benchmark insisted on a single history
table, then most existing products would immediately become bottlenecked
at precisely that place.  If there were justification for that
bottleneck being taken that seriously, it wasn't discovered.

Dan Hepner

jwc@unify.uucp (J. William Claypool) (12/21/89)

In article <13520008@hpisod2.HP.COM> dhepner@hpisod2.HP.COM (Dan Hepner) writes:
>From: jwc@unify.uucp (J. William Claypool)
>>
>>Quite the opposite.  I was suggesting that such a transaction is not
>>entirely atypical.  As a result, row level locking IS required and there
>>is more potential for contention in the lock management routines.
>>
>>Actually, the big problem with some DBMS systems is that there is a
>>single insert point when extending a table.  Hence the games some
>>vendors have had to play with multiple history tables for TP1.
>
>The freedom to "play games with multiple history tables" is
>being codified in TPC-A.

Yes.  However, TPC-A requires disclosure if the partitioning of the
history table is not transparent to the application.  It seems apparent
from this that there was (and is) some objection to a non-transparent
implementation.

>The rationale seems to be that if the benchmark insisted on a single history
>table, then most existing products would immediately become bottlenecked
>at precisely that place.  If there were justification for that
>bottleneck being taken that seriously, it wasn't discovered.

The TPC comprises the same companies that will be running the benchmark
and that have an interest in the results.  TPC-A was defined by all of
these companies reaching a consensus.  Surely, you can't expect that the
resulting benchmark will have any aspects which pose a serious problem
for any of the major participants.

Yes, Unify Corporation is a TPC member.
No, Unify 2000 does not bottleneck on the history table.
No, this article is not the official opinion of Unify Corporation.
-- 

Bill Claypool    W. (916) 920-9092 |I know what I know if you know what I mean
jwc@unify.UUCP   H. (916) 381-4205 |------------------------------------------
    ...!{csusac,pyramid}!unify!jwc |  SCCA SFR Solo II   74 es  1984 CRX 1.5