tim@binky.sybase.com (Tim Wood) (11/22/89)
In article <126@tacitus.tfic.bc.ca> clh@tacitus.UUCP (Chris Hermansen) writes:
>In article <666@dgis.dtic.dla.mil> jkrueger@dgis.dtic.dla.mil (Jon) writes:
>>dg@sisyphus.sybase.com (David Gould) writes:
>>
>>>'exactly how much measurable benefit ...'. This is of course proprietary
>>>information
>>
>>For you, perhaps. Some of your competitors can substantiate their
>>performance claims with DATA.
>
>As I tried to emphasize by my `measurable benefit' question: design is one
>thing, performance is another. [ ... analogy deleted ... ]
>I'm not trying to accuse Sybase of having an inferior product; I just
>don't like unsubstantiated performance claims.
>
>Chris Hermansen                 Timberline Forest Inventory Consultants

Your question puts us in something of a bind. If we don't give data, folks
can say, "shucks, we want data." If we give data, folks might say, "you're
a vendor, we don't believe your data; besides, it doesn't measure my
application--TP1's are meaningless." And so on.

We have numbers we like from competitive benchmarks against other products,
conducted by our customers & prospects. However, publicizing those numbers
is another issue. IMO, the most beneficial thing would be for a publication
like _Digital Review_ to conduct a benchmark. Their trials of hardware and
software seem to be very well conducted and documented.

Also note that it's quite difficult to design a meaningful DBMS benchmark--
about as hard as designing a DBMS schema for a real application. I'd like to
see more consensus on what a meaningful benchmark is; then someone could
measure our performance against it publicly. But the only benchmarks that
finally matter are people's applications.
-TW

Sybase, Inc. / 6475 Christie Ave. / Emeryville, CA 94608 / 415-596-3500
tim@sybase.com          {pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
This message is solely my personal opinion.
It is not a representation of Sybase, Inc. OK.
sullivan@aqdata.uucp (Michael T. Sullivan) (11/23/89)
From article <7169@sybase.sybase.com>, by tim@binky.sybase.com (Tim Wood):
>
> IMO, the most beneficial thing would be for a publication
> like _Digital Review_ to conduct a benchmark.

_EE Times'_ SPEC is, I believe, working on this. However, it will take some
time to get together.
--
Michael Sullivan                 uunet!jarthur.uucp!aqdata!sullivan
aQdata, Inc.                     aqdata!sullivan@jarthur.claremont.edu
San Dimas, CA
jkrueger@dgis.dtic.dla.mil (Jon) (11/23/89)
tim@binky.sybase.com (Tim Wood) writes:
>Your question puts us in something of a bind. If we don't give data,
>folks can say, "shucks, we want data." If we give data, folks might say,
>"you're a vendor, we don't believe your data; besides, it doesn't measure my
>application--TP1's are meaningless." And so on.

Just provide measurements we can replicate, at least in principle. Cf. the
MIPS Performance Brief.

-- Jon
--
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?
dhepner@hpisod2.HP.COM (Dan Hepner) (11/29/89)
From: hargrove@harlie.sgi.com (Mark Hargrove)
>
>The client and server don't have to run on the same machine. In fact,
>as Jon Forrest (correctly) points out, in the general case, you don't
>*want* them to run on the same machine.

How much this will buy you is directly dependent upon the distribution of
CPU cycle requirements between the clients and the server(s), and the
relative cost of remote vs local communication between the clients and the
server.

1. Is it your experience that more than 10% of the work is done by
   the clients?

2. Is it your experience that remote communication costs don't end
   up chewing into the savings attained by moving the clients
   somewhere else?

>(and in the extreme (and not at all impractical) case, you run each
> client and each server on its own machine). This model is simple,
> elegant, and fundamentally right.

This would require basically a 50-50 split of the workload between the
client and server. A practical assumption?

Dan Hepner
jkrueger@dgis.dtic.dla.mil (Jon) (11/30/89)
dhepner@hpisod2.HP.COM (Dan Hepner) writes:
>1. Is it your experience that more than 10% of the work is done by
>   the clients?

Sometimes. If it's only 10%, we may then assign 10 clients per server, thus
balancing the load. Yes, the server load increases too, but not
proportionately; balance might be 12 or 15 clients per server.

>2. Is it your experience that remote communication costs don't end
>   up chewing into the savings attained by moving the clients
>   somewhere else?

No, the lower bandwidth is more than offset by multiprocessing. When this
isn't true, you probably have a poorly partitioned problem, not
insufficient communications hardware. The same profiling that tells you
who's shouldering more of the processing burden will also reveal if both
sides are waiting for communications.

>>(and in the extreme (and not at all impractical) case, you run each
>> client and each server on its own machine). This model is simple,
>> elegant, and fundamentally right.

This isn't the extreme case. Multiple processors can divide work with
better granularity than client and server processes.

-- Jon
--
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?
hargrove@harlie.sgi.com (Mark Hargrove) (11/30/89)
In article <13520004@hpisod2.HP.COM> dhepner@hpisod2.HP.COM (Dan Hepner) writes:
>From: hargrove@harlie.sgi.com (Mark Hargrove)
>>
>>The client and server don't have to run on the same machine. In fact,
>>as Jon Forrest (correctly) points out, in the general case, you don't
>>*want* them to run on the same machine.
>
>How much this will buy you is directly dependent upon the distribution
>of CPU cycle requirements between the clients and the server(s), and
>the relative cost of remote vs local communication between the clients
>and the server.

Huh? I'm afraid I don't understand what you're getting at wrt
"distribution of CPU cycle requirements". Naturally, you are correct in
assuming that you do have to ponder communication costs when building
client-server models.

>1. Is it your experience that more than 10% of the work is done by
>   the clients?

10% of *what* work? It's my experience that a client does 100% of the work
that's appropriate for the client to perform. The content of this work is
highly variable -- it depends upon your application. Clients make requests
to servers, and then do something with the results. A "typical"
client-server application might have the client application handling
presentation, user interface, and flow of control, while making requests of
database servers (and perhaps of directory servers to locate the DB
servers, and perhaps of Kerberos-type servers to handle authentication,
etc.)

>2. Is it your experience that remote communication costs don't end
>   up chewing into the savings attained by moving the clients
>   somewhere else?

What do you mean by "remote"? What do you mean by "cost"? If my client and
server live on the same ethernet (or FDDI ring, or UltraNet backbone) then
no, I don't see a problem. On the other hand, if I'm communicating over a
300 baud dial-up network, then I'd better be pretty sure my clients aren't
impatient for responses from the servers. Nevertheless, there are clear
cases where even this slow link is perfectly OK. Try rephrasing your
question :-).

>>>(and in the extreme (and not at all impractical) case, you run each
>>> client and each server on its own machine). This model is simple,
>>> elegant, and fundamentally right.
>
>This would require basically a 50-50 split of the workload between
>the client and server. A practical assumption?

No! Not at all. I'm not sure you really understand the notion of
client-server. Have you read Mike Harris' recent postings? He gives good
examples of what client-server is all about. I think *you're* drifting in
the direction of distributed processing, where a single problem is broken
up and shared by several machines. This is NOT what we're talking about
here.

--
=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=
Mark Hargrove                            Silicon Graphics, Inc.
email: hargrove@harlie.corp.sgi.com      2011 N. Shoreline Drive
voice: 415-962-3642                      Mt. View, CA 94039
tim@binky.sybase.com (Tim Wood) (12/01/89)
In article <13520004@hpisod2.HP.COM> dhepner@hpisod2.HP.COM (Dan Hepner) writes:
>From: hargrove@harlie.sgi.com (Mark Hargrove)
>>
>>The client and server don't have to run on the same machine. In fact,
>>as Jon Forrest (correctly) points out, in the general case, you don't
>>*want* them to run on the same machine [...]
>>(and in the extreme (and not at all impractical) case, you run each
>> client and each server on its own machine). This model is simple,
>> elegant, and fundamentally right.
>
>This would require basically a 50-50 split of the workload between
>the client and server. A practical assumption?

This seems a bit simplistic. Database servers and client applications do
fundamentally different kinds of work. Assuming an update-intensive
workload, the server should try to keep the disks and the client
connections busy. That is, as the volume of requests rises, the I/O
activity, and commit rate, should rise linearly up to the saturation level
of the I/O system. To speed up an I/O-saturated system, you add more I/O
bandwidth. Conversely, if the CPU can't keep the disks busy under heavy
workload, you need a faster CPU (or a different DBMS :-).

The client side is concerned with very different issues, like window
refresh strategies, presentation styles and local data analysis. All those
things are very CPU-intensive. So, the different character of the client
and server workloads makes comparing them a case of apples vs. oranges, and
underlines the benefit in client/server: each party in the relationship
does a few things well.
-TW

Sybase, Inc. / 6475 Christie Ave. / Emeryville, CA 94608 / 415-596-3500
tim@sybase.com          {pacbell,pyramid,sun,{uunet,ucbvax}!mtxinu}!sybase!tim
This message is solely my personal opinion.
It is not a representation of Sybase, Inc. OK.
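[Editor's note: Tim's linear-then-flat picture can be sketched as a toy
throughput model. The capacity figure below is invented for illustration;
it is not a Sybase number or a measurement of any real system.]

```python
# Toy model of the saturation argument above: committed transactions
# track the offered request rate until the I/O subsystem saturates,
# after which extra requests only queue up.
IO_CAPACITY = 500  # commits/sec the disk subsystem can sustain (invented)

def commit_rate(request_rate):
    """Commits/sec actually achieved at a given offered load."""
    return min(request_rate, IO_CAPACITY)

# Linear region: doubling the offered load doubles the commit rate.
# Saturated region: more requests buy nothing; the fix is more I/O
# bandwidth (raising IO_CAPACITY), or a faster CPU if the CPU is what
# fails to keep the disks busy.
print(commit_rate(200))  # 200 -- linear region
print(commit_rate(800))  # 500 -- I/O-saturated
```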
dhepner@hpisod2.HP.COM (Dan Hepner) (12/01/89)
From: jkrueger@dgis.dtic.dla.mil (Jon)
>
> >1. Is it your experience that more than 10% of the work is done by
> >   the clients?
>
> Sometimes. If it's only 10%, we may then assign 10 clients per server,
> thus balancing the load. Yes, the server load increases too, but not
> proportionately; balance might be 12 or 15 clients per server.

In the example, if one moved 10 clients taking 10% of a 100% used CPU, we
would simplistically end up with the client CPU 10% used, and the server
CPU still 90%. Adding one more client, we would end up with a saturated
system with 11 clients on an 11% utilized client machine, while the server
was now 99% used. If this were so, it wouldn't seem all that balanced, and
would probably be an economically unjustifiable move: a 100+% increase in
hardware cost yielding a 10% increase in throughput. I don't see where the
12 or 15 came from, but even if true they don't seem on the surface to be
all that good a deal.

> >2. Is it your experience that remote communication costs don't end
> >   up chewing into the savings attained by moving the clients
> >   somewhere else?
>
> No, the lower bandwidth is more than offset by multiprocessing.

Let's assume you have plenty of bandwidth, but not plenty of CPU cycles at
the server. Remote communication, especially reliable remote communication,
is more expensive than local communication. The extreme of my concern would
be illustrated if the remote communication costs at the server end exceeded
the processing/terminal handling done by the client, in which case one
would actually lose by adding a remote machine for the clients.

> >>(and in the extreme (and not at all impractical) case, you run each
> >> client and each server on its own machine). This model is simple,
> >> elegant, and fundamentally right.
>
> This isn't the extreme case. Multiple processors can divide work
> with better granularity than client and server processes.

Maybe you can clarify. The case in question was how frequently it would be
practical to put each client and each server on its own machine, with the
assertion that if the client/server workload split weren't near 50-50, it
wouldn't be practical.

The points of confusion:
 1) "Multiple processors" can be ambiguous as to remoteness, but given
    the context I'll assume remoteness. (right?)
 2) Granularity. Are you postulating a flexible division of the work
    between client and server? A server which is flexibly divisible
    over both machines?

I think all of these questions are facets of the same underlying question:
how much of the typical application can be done at the client?

Dan Hepner
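[Editor's note: Dan's saturation arithmetic can be checked in a few lines.
The numbers (10 clients collectively consuming 10% of one fully-busy CPU)
are the thread's hypothetical example, not measurements.]

```python
# Worked version of the arithmetic above: 10 clients took 10% of the
# original 100%-busy server, i.e. 1% per session client-side and 9% per
# session server-side once the client work is moved off.
SERVER_PCT_PER_SESSION = 9   # server CPU % each active session costs
CLIENT_PCT_PER_SESSION = 1   # client-machine CPU % each session costs

def utilization(n_sessions):
    """(server CPU %, client-machine CPU %) for n concurrent sessions."""
    return (n_sessions * SERVER_PCT_PER_SESSION,
            n_sessions * CLIENT_PCT_PER_SESSION)

print(utilization(10))  # (90, 10): the original workload, offloaded
print(utilization(11))  # (99, 11): one more session nearly saturates
# Doubling the hardware bought roughly one extra client's throughput --
# exactly the "100+% cost for 10% throughput" complaint.
```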
jkrueger@dgis.dtic.dla.mil (Jon) (12/02/89)
dhepner@hpisod2.HP.COM (Dan Hepner) writes:
>From: jkrueger@dgis.dtic.dla.mil (Jon)
>>
>> >1. Is it your experience that more than 10% of the work is done by
>> >   the clients?
>>
>> Sometimes. If it's only 10%, we may then assign 10 clients per server,
>> thus balancing the load. Yes, the server load increases too, but not
>> proportionately; balance might be 12 or 15 clients per server.
>
>In the example, if one moved 10 clients taking 10% of a 100% used CPU,
>we would simplistically end up with the client CPU 10% used, and
>the server CPU still 90%.

Perhaps I'm not making myself clear. That's 10% per client. 10% of the work
is done by the client; this client serves a single user. Each additional
concurrent user gets another client, which consumes another 10%, in this
example.

>Adding one more client, we would end up with a saturated system with 11
>clients on an 11% utilized client machine, while the server was now 99%
>used. If this were so, it wouldn't seem all that balanced, and would
>probably be an economically unjustifiable move.

All you're saying is that a two-process model doesn't scale well if we're
already bottlenecked on either process. This is a tautology.

>100+% increase in hardware cost yielding a 10% increase in throughput.

Indeed, it's worse than that: the interconnects aren't free. One doesn't
win by distributing inherently sequential problems that one doesn't know
how to decompose. Again, a tautology.

>> >2. Is it your experience that remote communication costs don't end
>> >   up chewing into the savings attained by moving the clients
>> >   somewhere else?
>>
>> No, the lower bandwidth is more than offset by multiprocessing.
>
>Let's assume you have plenty of bandwidth, but not plenty of CPU cycles
>at the server. Remote communication, especially reliable remote
>communication, is more expensive than local communication.

In exactly the same way that reading bytes off disks costs more cycles than
referencing memory, yes. But compelling cases for not requiring databases
to reside in main memory can be made, no?

>The extreme of my concern would be illustrated if the remote
>communication costs at the server end exceeded the processing/terminal
>handling done by the client, in which case one would actually lose by
>adding a remote machine for the clients.

A valid concern. Got any data? Measured degradation in latencies?
Throughput? I don't deny it can happen, just asking how often it does.

And again, you're simply saying that sometimes costs of distributing the
load are greater than benefits achieved. How true: sometimes the problem is
intractable, or you don't know enough to decompose it, or your tools are
poor, or the implementation is poor. Then you get the biggest monoprocessor
you can afford, indeed. You've admitted you can't work smarter, so you'd
better work harder.

>> >>(and in the extreme (and not at all impractical) case, you run each
>> >> client and each server on its own machine). This model is simple,
>> >> elegant, and fundamentally right.
>>
>> This isn't the extreme case. Multiple processors can divide work
>> with better granularity than client and server processes.
>
>Maybe you can clarify. The case in question was how frequently it would
>be practical to put each client and each server on its own machine, with
>the assertion that if the client/server workload split weren't near
>50-50, it wouldn't be practical.

The usual assumption is that each client can get its own machine, but the
server has to share a single machine. This makes the server the bottleneck,
in general. It's also a bad assumption: multithreaded servers can use
multiprocessors to scale up, distributed DBMS can use distributed hosts to
execute queries, and parallel servers can apply processors to each
component of each query. The first two animals exist now.

>The points of confusion:
> 1) "Multiple processors" can be ambiguous as to remoteness, but given
>    the context I'll assume remoteness. (right?)

Wrong, as in the previous paragraph.

> 2) Granularity. Are you postulating a flexible division of the work
>    between client and server? A server which is flexibly divisible
>    over both machines?

Nope, a flexible approach to designing database engines. Remember, your
query language can't tell the difference anyway.

>I think all of these questions are facets of the same underlying question:
>how much of the typical application can be done at the client?

Fair question, but needlessly special. The general question is how can we
divide up work, and what tools do we need, and how many of them exist yet?

-- Jon
--
Jonathan Krueger    jkrueger@dtic.dla.mil   uunet!dgis!jkrueger
Isn't it interesting that the first thing you do with your
color bitmapped window system on a network is emulate an ASR33?
dhepner@hpisod2.HP.COM (Dan Hepner) (12/02/89)
From: hargrove@harlie.sgi.com (Mark Hargrove)
>>>(and in the extreme (and not at all impractical) case, you run each
>>> client and each server on its own machine). This model is simple,
>>> elegant, and fundamentally right.
>>
>>This would require basically a 50-50 split of the workload between
>>the client and server. A practical assumption?
>
>Have you read Mike Harris' recent postings? He gives good examples of
>what client-server is all about. I think *you're* drifting in the
>direction of distributed processing, where a single problem is broken
>up and shared by several machines. This is NOT what we're talking
>about here.

Why would "you run each client and each server on its own machine" if not
to break a single problem up to be shared by more than one machine? Indeed,
I'm talking about distributed processing, but I disagree with your protest
that I'm alone. Without a distributed processing goal, what is the point of
a C/S architecture?

[Mark points out miscommunication]

Clearly it's pointless to dispute the definitions of words, so let's get to
some meat. Here's my claim:

1) If I need to do a set of several tasks, this becomes my "problem". Now I
   can probably break this problem easily into Mark's "single problems", or
   tasks which share no common resource. I can place each of those problem
   solutions on a different machine, and establish a communications network
   allowing me access to each, from my terminal, or whatever I have on my
   desk. I may even have intelligence on my desk to translate my high level
   human requests into a complex series of interactions with those solution
   machines. If you'd like me to stop here, and call that the fulfillment
   of a C/S architecture, some might contest how important your definition
   of C/S was. You might also call this distributed processing, but some
   might contest how important your definition of distributed processing
   was.

2) We may not agree on whether we've been discussing distributed
   processing, but maybe we'll agree on the desirability of distributed
   processing, even to the extent of

   Mike Harris> Imagine a world where everything was a PC.

3) The essence of distributed processing is, to use Mark's terminology,
   "breaking single problems into parts which can be worked on by different
   machines", where a "single problem" implies a shared resource, e.g. a
   database.

4) The division of any "single problem" into a C/S allows for distribution,
   by allowing the client and server to be on different machines (serial
   connected, ethernet if you like). This I claim is the non-trivial
   definition of C/S, and indeed the primary purpose of doing so. This
   definition is certainly meaningful across an arbitrary number of levels
   of abstraction.

5) The value of that C/S distribution will be proportional to the
   percentage of the problem cost, as typically measured by CPU cycles,
   which is moved into the client. Dividing a problem into 10% client and
   90% server isn't all that valuable. Doing so across 10 layers of
   abstraction is demonstrably ridiculous. [Yes, one should consider the
   security advantages in some C/S divisions to be of some value.] On the
   other hand, dividing a problem into 90% client and 10% server would
   yield an immediate potential for a 10X improvement using
   equal-technology hardware. And doing so across 10 layers of abstraction
   would achieve the distributed dream: a world of PCs. This dream is what
   drives statements which describe C/S as "simple, elegant, and
   fundamentally right".

6) Unfortunately, the state-of-the-art C/S divisions actually available
   from vendors usually end up without much work being done by the client,
   and many such products IMHO would be better off without having bothered.
   Hopefully this will change with time. If you have any notion of a world
   of only PCs, you gotta hope so too.
Dan Hepner    dhepner@hpda.hp.com
Disclaimer: HP may well disagree with every word of this opinion.
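[Editor's note: the arithmetic behind point 5 of the post above reduces to
a one-line model. This is a sketch of the claim only; the 10%/90% splits
are the thread's hypotheticals, and no real workload is being measured.]

```python
# Sketch of point 5: if each user's client runs on a machine of its own,
# the shared server bears only the server-side fraction of each session,
# so server-bound aggregate throughput improves by roughly
# 1 / server_fraction relative to running everything on the server.
def potential_speedup(client_fraction):
    """Upper-bound throughput gain from moving client_fraction of each
    session's work onto per-user client machines."""
    server_fraction = 1.0 - client_fraction
    return 1.0 / server_fraction

# 10% on the client: barely worth the extra hardware (about 1.1x).
# 90% on the client: the "world of PCs" payoff (about 10x).
print(potential_speedup(0.10))
print(potential_speedup(0.90))
```

This is the same shape of argument as Amdahl's law: the unoffloaded
(server-side) fraction bounds the achievable gain.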